The previous code unfortunately passed a test case. When all sequences are shorter than the maximum length, the first dimension of `self.noise` has size 1 at the first time step in the TrimZero algorithm. The (lazy) Dropout's `self.noise` is then copied across time steps, presumably by [this](https://github.com/Element-Research/rnn/blob/master/AbstractRecurrent.lua#L30). As a result, the error `incorrect size: only supporting singleton expansion (size=1)` is avoided, since the first dimension of `self.noise` always has size 1. Note that since Bayesian GRU with TrimZero should use monotonic sampling (the same dropout samples across a batch) for dropouts, the performance is the same whenever the distribution of sequence lengths does not trigger the error.
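
To make the size argument concrete, here is a minimal sketch in plain Python (not the Lua/Torch code of this repository). The helper `expand_first_dim`, the hidden size, and the per-step batch sizes are all hypothetical; the helper only mimics the singleton-expansion rule on the first dimension, where a cached noise shape can be expanded only if that dimension has size 1 or already matches.

```python
def expand_first_dim(shape, target_batch):
    """Mimic singleton expansion on the first dimension only (hypothetical helper).

    The first dimension may be expanded only when it has size 1;
    a size that already equals the target is left unchanged.
    """
    first, rest = shape[0], tuple(shape[1:])
    if first != 1 and first != target_batch:
        raise ValueError(
            "incorrect size: only supporting singleton expansion (size=1)"
        )
    return (target_batch,) + rest


# Assumed values for illustration only.
hidden = 4

# Case described in the commit message: the cached noise has first
# dimension 1, so it expands to whatever batch size a later step has.
noise_shape = (1, hidden)
shapes_ok = [expand_first_dim(noise_shape, b) for b in (3, 2, 3)]

# Assumed failing case: the noise was cached with first dimension 2,
# and a later time step presents 3 rows.
try:
    expand_first_dim((2, hidden), 3)
    failed = False
except ValueError:
    failed = True
```

Under these assumptions the first case never raises, which is why a test whose sequences are all shorter than the maximum length could pass without exercising the bug.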