how to forward a sequencer module step by step? for sampling #352

Closed

Tae618 opened this issue Oct 15, 2016 · 4 comments

Comments


Tae618 commented Oct 15, 2016

I have built a model as follows

function layer:buildLanguageModel()
  local dec = nn.Sequential()
  -- word embeddings; index 0 is reserved for padding (masked out)
  dec:add(nn.LookupTableMaskZero(self.vocabsize+1, self.hiddensize))
  dec.lstmLayers = {}
  for i = 1, self.nlayer do
    -- each LSTM is wrapped in a Sequencer so one forward() consumes a whole sequence
    dec.lstmLayers[i] = nn.LSTM(self.hiddensize, self.hiddensize):maskZero(1)
    dec:add(nn.Sequencer(dec.lstmLayers[i]))
  end
  -- project to the vocabulary and take log-probabilities, masking padded steps
  dec:add(nn.Sequencer(nn.MaskZero(nn.Linear(self.hiddensize, self.vocabsize+1), 1)))
  dec:add(nn.Sequencer(nn.MaskZero(nn.LogSoftMax(), 1)))

  return dec
end

In the training phase it takes a whole sequence of words as input.
But in the evaluation phase I use a start token as the input at step t=0, and then the input at step t is the output from step t-1. So I need to forward the Sequencer modules step by step, from step 1 up to the sequence length.
Does anyone know how to do this?
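
A rough sketch of the training-time call described above, assuming dec comes from buildLanguageModel and that seqlen, batchsize and vocabsize are defined elsewhere (the names here are only for illustration):

-- forward a whole sequence in one call: input is a seqlen x batchsize tensor of
-- word indices, output is seqlen x batchsize x (vocabsize+1) log-probabilities
local input = torch.LongTensor(seqlen, batchsize):random(1, vocabsize+1)
local logprobs = dec:forward(input)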

Tae618 changed the title from "how" to "how to forward a sequencer module step by step? for sampling" on Oct 15, 2016

Tae618 commented Oct 15, 2016

I wonder whether the Sequencer module can forward a single time-step. Maybe it can only forward a whole sequence and cannot step one time-step at a time.

JoostvDoorn (Contributor) commented

You can use remember with Sequencer. You might find it helpful to have a look at #304.
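
A minimal sketch of that approach, assuming dec comes from buildLanguageModel above and that startToken, maxlen and batchsize are defined elsewhere; remember('both') keeps the Sequencers' hidden state between forward() calls, and forget() resets it:

dec:evaluate()

-- make every Sequencer keep its hidden state across forward() calls
for _, seq in ipairs(dec:findModules('nn.Sequencer')) do
  seq:remember('both')
end

local input = torch.LongTensor(1, batchsize):fill(startToken)
local sampled = {}
for t = 1, maxlen do
  -- forward a "sequence" of length 1: output is 1 x batchsize x (vocabsize+1)
  local logprobs = dec:forward(input)
  -- greedy choice: pick the most likely word at this step
  local _, argmax = logprobs[1]:max(2)   -- batchsize x 1
  sampled[t] = argmax:squeeze(2)         -- batchsize
  -- the chosen words become the next step's input
  input = sampled[t]:view(1, batchsize)
end

-- reset the hidden state before sampling the next sequence/batch
for _, seq in ipairs(dec:findModules('nn.Sequencer')) do
  seq:forget()
end

Because the state is remembered across calls, each forward() can be a length-1 sequence, which gives the step-by-step behaviour the question asks for.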


Tae618 commented Oct 15, 2016

Thank you for your suggestion; I now know how to deal with it.
Doing it the way #247 does, with a few changes, works fine.


sudongqi commented Oct 15, 2016

The solution in #247 doesn't work for me. Maybe it's because I am using SeqLSTM, and SeqLSTM handles forget and remember differently? I will test it on an overfitted example.

Update: this bug was caused by not loading the model properly. See #353.

Tae618 closed this as completed on Oct 22, 2016