Please consider adding a sampled softmax loss, in addition to "cross_entropy_sequence_loss".
For tasks with large target vocabularies, the speedup can be significant (with, perhaps, a minor per-step accuracy loss).
Even on the "nmt_large" config with a batch size of 128 and a vocabulary size of 32,000, I am seeing roughly a 1.22x speedup.
It is, however, a little tricky to add. I have a first draft here: https://github.com/okuchaiev/seq2seq/tree/sampled_softmax_first_try
It is not ready to be merged yet.
Let me know if this is of interest - I plan to polish my implementation and would appreciate thoughts on the right way to add it here.
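For reference, here is a minimal sketch (not the linked draft) of what a sampled-softmax sequence loss could look like in TF 1.x, assuming the output-projection weights and biases are accessible; the function name `sampled_softmax_sequence_loss` and its arguments are mine, chosen to mirror the time-major shapes and length masking used by `cross_entropy_sequence_loss`:

```python
import tensorflow as tf

def sampled_softmax_sequence_loss(decoder_outputs, output_weights, output_biases,
                                  targets, sequence_length, num_sampled=512):
  """Hypothetical sampled-softmax analogue of cross_entropy_sequence_loss.

  decoder_outputs: [T, B, dim] pre-projection decoder states (time-major).
  output_weights:  [vocab_size, dim] output projection matrix.
  output_biases:   [vocab_size] output projection bias.
  targets:         [T, B] int target token ids.
  sequence_length: [B] lengths used to mask padding positions.
  """
  vocab_size, dim = output_weights.get_shape().as_list()

  # Flatten time and batch so each target token is one sampled-softmax example.
  flat_outputs = tf.reshape(decoder_outputs, [-1, dim])
  flat_targets = tf.reshape(targets, [-1, 1])

  # Only num_sampled negative classes are scored per step, instead of the
  # full vocabulary, which is where the speedup comes from.
  losses = tf.nn.sampled_softmax_loss(
      weights=output_weights,
      biases=output_biases,
      labels=flat_targets,
      inputs=flat_outputs,
      num_sampled=num_sampled,
      num_classes=vocab_size)

  # Restore the [T, B] shape and mask positions past each sequence's length,
  # mirroring the masking in cross_entropy_sequence_loss.
  losses = tf.reshape(losses, tf.shape(targets))
  mask = tf.sequence_mask(
      tf.to_int32(sequence_length), tf.shape(targets)[0], dtype=tf.float32)
  return losses * tf.transpose(mask, [1, 0])
```

At evaluation and inference time one would still fall back to the full softmax, since sampled softmax is only an approximation of the training objective.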