
Sampling from RNN #304

Open
yashmaverick opened this issue Jul 23, 2016 · 5 comments

@yashmaverick

@nicholas-leonard
I am trying to predict the next number in a sequence, where each sequence is of length say 5.

For example: input is {1,2,3,4,5}
target is {2,3,4,5,6}

Training set has 1000 such sequences
Validation set has 100 sequences

Model is as shown below:
SeqLen = 5
rho = 5 -- no. of steps for BPTT
batchSize = 1
hiddenSize = 20
inputSize = 1
outputSize = 1
no_sampling = 10

model = nn.Sequential()
:add(nn.Sequencer(nn.FastLSTM(inputSize,hiddenSize)))
:add(nn.Sequencer(nn.Linear(hiddenSize, outputSize)))
:add(nn.Sequencer(nn.ReLU()))

During inference, how do I sample from the model?

I wish to sample from the model 10 times (say 10 trials).

On the first sampling pass,
the inputs are {t1,t2,t3,t4,t5}, the true output is {t2,t3,t4,t5,t6}, and
suppose the model predicts {t2',t3',t4',t5',t6'}.

The next time I sample, what will my inputs be?

Case1: inputs {t2,t3,t4,t5,t6'} or

Case2: {t2',t3',t4',t5',t6'}

Case3: only {t6'}. If I go on sampling indefinitely like this, is there a chance that the predictions after rho trials (the 5th trial here) are the same?

But in either Case1 or Case2, by the 5th sampling pass my inputs will consist entirely of
predictions, i.e. {t6',t7',t8',t9',t10'}. The issue is only with the first four sampling passes.
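To make Case2 (free-running sampling) concrete, here is a small language-agnostic sketch in Python, with a hypothetical perfect one-step model f(x) = x + 1 standing in for the trained LSTM. It shows how, after rho = 5 passes, the sliding input window contains only the model's own predictions:

```python
# Toy stand-in for the trained model's one-step forecast.
# (Hypothetical: the real setup would call the LSTM instead.)
def model_step(x):
    return x + 1

true_seq = [1, 2, 3, 4, 5]

# Case2 / free-running: feed each prediction back as the newest input.
window = list(true_seq)               # start from the true window {t1..t5}
for trial in range(5):                # rho = 5 sampling passes
    pred = model_step(window[-1])     # predict the next element
    window = window[1:] + [pred]      # slide the window by one

# After rho passes the window holds only model predictions:
print(window)   # [6, 7, 8, 9, 10]
```

Under Case1 the first four slots of each intermediate window would instead be filled from the ground-truth sequence, with only the newest slot coming from the model.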

Also, would it be better to treat this problem as a 'sequence to one' prediction, where during training the inputs are {1,2,3,4,5} and the target is {6}?

@nicholas-leonard
Member

@yashmaverick If you want to sample from the model, you should feed inputs one at a time. So suppose I want to condition the model on t1 and generate a sequence of n samples; I can do something like:

rnn:evaluate()                            -- switch to evaluation mode
local input = t1                          -- condition on the first element
local samples = {}
for i=1,n do
   local output = rnn:forward( {input} )  -- one time-step at a time
   table.insert(samples, output[1])
   input = output[1]                      -- feed the prediction back as the next input
end
print("generated sequence: ", samples)

Does this make sense to you?

Also, will it be good if I treat this problem as 'Sequence to One' prediction, where during training the inputs are {1,2,3,4,5} and target is {6} ??

No, I think for training it is best to have, like you say, inputs = {t1, t2, ...} and targets = {t2, t3, ...}. That way you get gradients from the output at each time-step.
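The shifted input/target pairing described above can be sketched as follows (Python, with a hypothetical helper name): each input element is paired with the next element of the same sequence as its target, so the loss produces a gradient at every time-step.

```python
def make_pairs(seq):
    """Build teacher-forcing style pairs from one sequence.

    For seq = [t1, ..., tn]:
      inputs  = [t1, ..., t(n-1)]
      targets = [t2, ..., tn]
    """
    return seq[:-1], seq[1:]

inputs, targets = make_pairs([1, 2, 3, 4, 5, 6])
print(inputs)   # [1, 2, 3, 4, 5]
print(targets)  # [2, 3, 4, 5, 6]
```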

@yashmaverick
Author

yashmaverick commented Jul 23, 2016

@nicholas-leonard
Thanks..!
In the sample code above, can I use rnn:remember('eval') inside the for loop, i.e. while sampling?
The outputs with and without rnn:remember('eval') are different.

@nicholas-leonard
Member

@yashmaverick Yes use rnn:remember('eval')!

@hashbangCoder

@nicholas-leonard @yashmaverick Should rnn:remember('eval') be used inside the loop? In #247 it is used only once, outside the loop, and it works fine.
When I tried using it inside the loop, the RNN sampled the same word every time; using it outside the loop leads to a different sample at each time step.

@hashbangCoder

hashbangCoder commented Sep 25, 2016

@nicholas-leonard I'm seeing interesting interactions between remember() and forget(). I'm building a captioning model, and during training, every 1000 or so iterations, I sample the RNN on a test image. Here is a high-level overview of my code:

<train code above>
if iter%1000 == 0 then
    evaluate_rnn(rnn,test_image)
end

[test code snippet]
function evaluate_rnn(rnn,image)
    rnn:evaluate()   -- for the dropout modules 
    rnn:forget()
    rnn:remember('eval')
    for i=1,maxSample do
        <sample output and feed it back>
    end
    rnn:forget()
end

If I leave out the first forget() call, my network samples the same word each time. If I move remember('eval') inside the loop, it again samples the same word. Running it as shown above (mostly) produces different samples at each time step, though I haven't trained it long enough to see much variation.

Could you clarify whether the remember call should be placed inside or outside the loop? Am I missing something else? Thanks!
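The general effect being described can be illustrated with a toy stateful cell (a Python analogy, not the rnn library's actual implementation): resetting state once before a free-running loop lets successive steps see evolving state and so produce different outputs, while resetting inside the loop makes every step see identical state and so produce identical outputs.

```python
class ToyCell:
    """Minimal stateful cell: output depends on the input and a hidden counter."""
    def __init__(self):
        self.h = 0

    def reset(self):           # analogous in spirit to rnn:forget()
        self.h = 0

    def forward(self, x):      # state carries over between calls,
        self.h += 1            # as with rnn:remember('eval')
        return x + self.h

cell = ToyCell()
x = 10

# Resetting INSIDE the loop: every step sees identical state -> same output.
same = []
for _ in range(3):
    cell.reset()
    same.append(cell.forward(x))
print(same)     # [11, 11, 11]

# Reset ONCE before the loop, then let state carry over -> outputs differ.
cell.reset()
varied = []
for _ in range(3):
    varied.append(cell.forward(x))
print(varied)   # [11, 12, 13]
```

This mirrors the reported behaviour: a single forget() before sampling clears stale training state, while calling the reset-like operation every iteration collapses the loop to the same sample.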
