I can't train a sequencer (model) with a manually created table, whereas it works with a table made by the tensor:split() function #310

Open
freakeinstein opened this issue Jul 30, 2016 · 2 comments

freakeinstein commented Jul 30, 2016

Hi, I'm very much a beginner with this module, as well as with Torch, and I'm trying to implement sequence-to-sequence transliteration. Here's my code below (I could share the whole Jupyter/iTorch notebook as well, link, and a part of my training data, link):

chunk - 1 : onehot function:

function oneHot(tensor_in,size) -- receives a horizontal tensor of chars (a sequence) and the size of the one-hot vector

    local input = torch.split(tensor_in,1,2)

    for i = 1,#input do
        local temp = torch.zeros(1,size)
        temp[{1,(input[i][1][1])}],input[i] = 1,temp -- [i] => i'th 1x1 tensor from the table; [1][1] => its first row, first column value

    end
    return input -- returns a table of one-hot vectors for the sequence
end
  • example input: oneHot(torch.Tensor({{1,2,3,4,5,6,5,6,7,8,9,10}}),10)
  • output:
    {
      1 : DoubleTensor - size: 1x10
      2 : DoubleTensor - size: 1x10
      3 : DoubleTensor - size: 1x10
      4 : DoubleTensor - size: 1x10
      5 : DoubleTensor - size: 1x10
      6 : DoubleTensor - size: 1x10
      7 : DoubleTensor - size: 1x10
      8 : DoubleTensor - size: 1x10
      9 : DoubleTensor - size: 1x10
      10 : DoubleTensor - size: 1x10
      11 : DoubleTensor - size: 1x10
      12 : DoubleTensor - size: 1x10
    }
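
For reference, the same per-step encoding can also be written with Tensor:scatter. This is only an illustrative sketch, not code from the notebook, and oneHotScatter is a made-up name:

function oneHotScatter(tensor_in,size) -- same contract as oneHot above
    local steps = torch.split(tensor_in,1,2)       -- table of 1x1 tensors, one per time step
    for i = 1,#steps do
        local idx = steps[i][1][1]                 -- character id at step i
        local oh = torch.zeros(1,size)
        oh:scatter(2,torch.LongTensor{{idx}},1)    -- write a 1 at column idx
        steps[i] = oh
    end
    return steps                                   -- table of 1 x size one-hot tensors
end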

chunk - 2 : model definition (rough):

require 'rnn' -- nn.Sequencer, nn.LSTM and nn.SequencerCriterion come from the rnn package

input_dim = 256
output_dim = 256
learning_rate,dout = 0.0001,0.5
epoch,batch = 1,50

model = nn.Sequencer(nn.LSTM(input_dim,output_dim)) --:add(nn.Sequencer(nn.LogSoftMax()))
criterion = nn.SequencerCriterion(nn.MSECriterion())

paramsj,gradj = model:getParameters() -- flat views of all parameters and their gradients
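
As a quick sanity check (not part of the original notebook; the toy sequence {1,2,3} below is made up), the Sequencer can be fed a short table of one-hot steps directly:

check_in = oneHot(torch.Tensor({{1,2,3}}),input_dim) -- table of three 1x256 one-hot tensors
check_out = model:forward(check_in)                  -- Sequencer maps a table of steps to a table of outputs
print(#check_out, check_out[1]:size())               -- expect 3 and 1x256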

chunk - 3 : training (non-batch), ignore 'BATCH COMMENT' for now:

for epoch = 1,5 do --#X do -- to repeat through every training sequence
    features = oneHot(X[epoch],input_dim) -- returns a table of seq_len one-hot tensors (1 x input_dim each)
    targets = oneHot(Y[epoch],output_dim) -- same for the target sequence ( remember() till end and finally forget() )

    --[[features = torch.randn(10,256)  -- BATCH COMMENT, for test 2
    targets = torch.randn(10,256)

    features = features:split(1)
    targets = targets:split(1)]]--

    local m = nn.JoinTable(1)
    features = m:forward(features) -- joins the table of 1x256 steps into a single seq_len x 256 tensor
    targets = m:forward(targets)

    model:training()
    model:zeroGradParameters()

    out = model:forward(features)
    cost = criterion:forward(out,targets)
    print(cost)
    grad = criterion:backward(out,targets)
    model:backward(features,grad)

    model:updateParameters(0.001)
    print(paramsj[10],gradj[190]) -- spot-check one parameter and one gradient to see whether anything changes
end
  • corresponding output (meaning: training is not happening; the sampled parameter paramsj[10] never changes and gradj[190] is almost always zero):
0.038746744386439   
0.02311717159653    0   

0.034582571371645   
0.02311717159653    0   

0.035191604152156   
0.02311717159653    1.7742855159069e-06 

0.039041843895305   
0.02311717159653    0   

0.022075240017417   
0.02311717159653    0

chunk - 3 : training (non-batch), uncomment 'BATCH COMMENT' in the above code and re-run:

  • corresponding output (meaning: training is happening now; the cost decreases and the sampled gradient is non-zero):
10.772214914956 
0.023118052759942   3.0919057734209e-05 

10.506374708426 
0.023117784315262   0.00019368354859442 

10.221337475722 
0.023117571309151   -0.00025967498456485    

9.9043819908171 
0.023117199944637   -5.4534678097802e-05    

9.8754959708459 
0.02311717159653    -0.00037491185671427    
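
One way to narrow down the difference between the two runs (a diagnostic sketch added here for illustration, not code from the issue; X is the training data loaded earlier in the notebook) is to compare what actually reaches model:forward() in each case:

manual  = nn.JoinTable(1):forward(oneHot(X[1],input_dim))       -- path used in the failing run
batched = nn.JoinTable(1):forward(torch.randn(10,256):split(1)) -- path used in the 'BATCH COMMENT' run
print(manual:size(), batched:size())                            -- both end up as seq_len x 256 tensors
print(manual:min(), manual:max())                               -- one-hot rows contain only 0s and 1s
print(batched:min(), batched:max())                             -- randn rows span a much wider value range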

I'd prefer a correction with a nice explanation, thanks.

@JoostvDoorn
Contributor

Maybe you could provide a simple example that isolates your problem?

@freakeinstein
Author

I've added it to the description.
