
Adding hard attention when predicting decoder outputs #287

Open
shamanez opened this issue Aug 4, 2017 · 0 comments

Comments


shamanez commented Aug 4, 2017

The paper "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention" describes hard attention. With hard attention we cannot use standard backprop, because sampling a single attention location is non-differentiable. If we add this, will it increase the accuracy of the seq2seq model?
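For context, here is a minimal PyTorch-style sketch (hypothetical, not code from this repository) contrasting the usual soft attention with a sampled hard-attention step, to show where the non-differentiability comes in:

```python
# Hypothetical sketch of soft vs. hard attention over encoder states.
import torch
import torch.nn.functional as F

def soft_attention(scores, encoder_states):
    """Standard soft attention: a differentiable weighted sum.

    scores:         (batch, src_len) unnormalized attention scores
    encoder_states: (batch, src_len, hidden)
    """
    weights = F.softmax(scores, dim=-1)                       # (batch, src_len)
    return torch.bmm(weights.unsqueeze(1), encoder_states).squeeze(1)

def hard_attention(scores, encoder_states):
    """Hard attention: sample one source position per decoding step.

    The sampled index is discrete, so gradients cannot flow through it
    with plain backprop; training needs a REINFORCE-style surrogate loss
    (or a relaxation such as Gumbel-softmax).
    """
    probs = F.softmax(scores, dim=-1)                         # (batch, src_len)
    dist = torch.distributions.Categorical(probs)
    idx = dist.sample()                                       # (batch,)
    batch_idx = torch.arange(encoder_states.size(0))
    context = encoder_states[batch_idx, idx]                  # (batch, hidden)
    log_prob = dist.log_prob(idx)                             # needed for REINFORCE
    return context, log_prob

# During training one would add a surrogate term along the lines of
#   loss = nll_loss - (reward - baseline) * log_prob
# where `reward` could be the log-likelihood of the gold target token,
# similar in spirit to the estimator used in Show, Attend and Tell.
```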
