Multiple object recognition with visual attention #134
@pefi9 Use the recurrent-visual-attention.lua script as a starting point. Build a dataset (without dp) where the input is an image and the output is a sequence of targets (with locations?). You should still be able to use most of the modules from the original script, but you will need to assemble them differently and create a different one for the first time-step. If you need help, fork this repo and make a multiple-object-recognition.lua script. Create a branch and pull request it here.
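The dp-free dataset suggested above could be sketched as follows; all names and tensor shapes here are illustrative assumptions, not code from the thread:

```lua
-- Minimal sketch: each sample pairs one image with a sequence of
-- per-object targets (class ids). Sizes/names are assumptions.
require 'torch'

local MultiObjectDataset = {}
MultiObjectDataset.__index = MultiObjectDataset

function MultiObjectDataset.new(images, targets)
   -- images: N x 1 x H x W tensor; targets[i]: table of class ids, one per object
   return setmetatable({images = images, targets = targets}, MultiObjectDataset)
end

function MultiObjectDataset:get(i)
   return self.images[i], self.targets[i]
end

function MultiObjectDataset:size()
   return self.images:size(1)
end
```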
Thanks @nicholas-leonard . I tried two approaches:
It seems to be learning, but the performance is not excellent. I'm sure I still have some mistakes in there.
@pefi9 I don't think you should need to modify RecurrentAttention. Say you want to detect n objects per image; then formulate the problem as giving rho/n steps per object. So for 2 objects, I could assign a rho of 10 such that an object should be identified every 5 time-steps. You should build a MultiObjectReward criterion for doing https://github.com/pefi9/mo_va/blob/multi_digit_development/4_train.lua#L102-L129 (of course, you will still need a loop over the n objects to update the ConfusionMatrix). Why build a criterion? So you can unit test it. Also, the current implementation only allows one call to reinforce() per batch as a single reward. So yeah, I think you would build a MultiObjectReward criterion and include some unit tests so that it behaves as expected. Also, you should be able to use the original RecurrentAttention without modification, as the output can be split with concat = nn.ConcatTable():add(nn.Select(n)):add(nn.Select(n*2))...:add(nn.Select(-1))
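The output split suggested above could look like the sketch below (assuming the rnn package). Since RecurrentAttention outputs a table of rho step outputs, the table variant nn.SelectTable is used here in place of nn.Select; with a fixed number of glimpses per object, the prediction for object i is read at time-step i * steps:

```lua
-- Sketch: pick one output per object from the table of rho step outputs.
require 'rnn'

local steps, nObjects = 5, 2   -- rho = 10, one object identified every 5 steps
local concat = nn.ConcatTable()
for i = 1, nObjects do
   concat:add(nn.SelectTable(i * steps))
end
-- model = nn.Sequential():add(attention):add(concat):add(classifier)
```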
@nicholas-leonard , I had time to look at it today. I tried to handle the step-wise reward by implementing https://github.com/Element-Research/rnn/blob/master/Sequencer.lua#L144-L146 , but as RecurrentAttention wraps the locator into a Recursor, there is an issue: "Sequencer.lua:37: expecting input table". So I created MOReinforce and MOReinforceNormal, where the first one returns the reward for a specific step and the second one keeps track of the actual step.
@pefi9 Sorry, I had a bad cold these past days. So I think we should modify AbstractSequencer to accept tables of rewards (one per time-step).
@pefi9 I have modified the AbstractRecurrent to handle tables of rewards: 417f8df . Basically, you shouldn't need MOReinforce and MOReinforceNormal anymore. Instead, make sure that your MORewardCriterion calls reinforce() with a table of rewards, one per time-step.
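A hypothetical sketch of such a criterion, loosely following the dpnn VRClassReward pattern (the class name, reward scheme, and target layout here are all assumptions for illustration):

```lua
-- Sketch: compute one reward per time-step and broadcast the table of
-- rewards via module:reinforce() (supported since commit 417f8df).
require 'dpnn'

local MultiObjectReward, parent = torch.class('nn.MultiObjectReward', 'nn.Criterion')

function MultiObjectReward:__init(module, rho, scale)
   parent.__init(self)
   self.module = module   -- module containing the stochastic (Reinforce) units
   self.rho = rho         -- total number of time-steps
   self.scale = scale or 1
end

function MultiObjectReward:updateOutput(input, target)
   -- input: table of per-step class scores; target: table of per-step labels
   self.rewards = {}
   for step = 1, self.rho do
      -- assumption: reward 1 for a correct prediction at this step, else 0
      local _, pred = input[step]:max(2)
      self.rewards[step] = pred:squeeze(2):eq(target[step])
         :typeAs(input[step]):mul(self.scale)
   end
   self.output = 0  -- reward criteria typically report a zero loss
   return self.output
end

function MultiObjectReward:updateGradInput(input, target)
   self.module:reinforce(self.rewards)   -- table of rewards, one per step
   self.gradInput = {}
   for step = 1, self.rho do
      self.gradInput[step] = input[step].new():resizeAs(input[step]):zero()
   end
   return self.gradInput
end
```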
@nicholas-leonard No worries, hope you are well now. I had a couple of errors in the code; I'll update the GitHub version tomorrow. It works fine for a single object, however it takes a lot of time to train for multiple digits. A modification I have not tackled yet is enabling recognition of sequences with variable length. I'm not sure whether it is even possible with the current version of RecurrentAttention?
Thanks for the update. |
@pefi9 For variable length sequences, you could add a
@nicholas-leonard With the new AbstractRecurrent I've got this error:
I assume it's caused by the RecurrentAttention. By changing its parent to nn.Container I got a different error:
@pefi9 I just removed that check in the latest commit. As for the first error, I'm not sure how that is happening.
@nicholas-leonard I've got two findings:
@pefi9 Hi, I'm also going to implement the DRAM model to apply to some real-world images. Have you got the problems solved? Do you think it is possible to use ReinforceNormal, Reinforce and RecurrentAttention without any modifications and just write a new Criterion to get the time-step reward now? Thanks.
Hi @vyouman, yes, it should be possible. However, we did not solve the first point in my previous comment. The workaround I used is to change the parent class of RecurrentAttention from "nn.AbstractSequencer" to "nn.Container".
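As a sketch, the workaround amounts to changing the single class-declaration line in a local copy of RecurrentAttention.lua (the rest of the file stays unchanged), so that reinforce() no longer routes through AbstractSequencer's deprecated getStepModule:

```lua
-- changed line in a local copy of RecurrentAttention.lua:
local RecurrentAttention, parent = torch.class("nn.RecurrentAttention", "nn.Container")
-- original: torch.class("nn.RecurrentAttention", "nn.AbstractSequencer")
```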
@pefi9 Thanks for your patient reply. :p I wonder if you have any idea how to handle sequences of variable length. To be clear, say the longest sequence in the dataset is D; there are samples of different lengths in one batch, but the longest sequence in a single batch may be shorter than D. Does it help to write a terminate class? I'm kind of confused about the solution for variable-length sequences.
@vyouman, I had the same question (Nicholas' answer from Feb 12). It's not possible at the moment. You have to define the maximum number of objects (the length of the sequence) and the number of glimpses taken per object in advance (I did: https://github.com/pefi9/mo_va/blob/multi_digit_development/2_model_VA.lua#L122-L126 , where opt.digits is the max length and opt.steps is the number of glimpses taken per object, i.e. per digit). It would be a nice feature to have, but I can't think of any easy extension of the current code that would enable it.
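The fixed-length setup described above boils down to the following sketch; opt.digits and opt.steps follow the linked script, the rest of the names are illustrative:

```lua
-- Sketch: sequence length and glimpses per object are fixed in advance,
-- and rho is their product.
local opt = {digits = 2, steps = 5}   -- max objects, glimpses per object
local rho = opt.digits * opt.steps    -- total time-steps for RecurrentAttention
-- attention = nn.RecurrentAttention(rnnModule, locator, rho, {hiddenSize})
```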
You could add padding. Specifically, you add dummy classes at the end of the target sequence that mean "END OF SEQUENCE". |
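The padding idea above could be sketched like this; the class numbering (an extra class beyond the real ones meaning "END OF SEQUENCE") is an assumption:

```lua
-- Sketch: pad every target sequence to a fixed length with a dummy
-- END-OF-SEQUENCE class so batches have uniform shape.
local nClasses = 10              -- real classes 1..10
local END = nClasses + 1         -- extra class 11 means "no more objects"

local function pad(targets, maxLen)
   local padded = torch.LongTensor(maxLen):fill(END)
   for i = 1, math.min(#targets, maxLen) do
      padded[i] = targets[i]
   end
   return padded
end

-- pad({3, 7}, 4) -> tensor {3, 7, 11, 11}
```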
@nicholas-leonard I've come across the same problem that @pefi9 faced with RecurrentAttention's error when getStepModule is called. Shall I change the parent class like they did as well? Until now I was using a custom reinforce method for the Recursor module that essentially did the same thing, but I think it'd be better to delete my code and use what's built into this library.
@nicholas-leonard Yeah, I've also encountered the problem @pefi9 and @ssampang came across, because of the deprecated getStepModule of the AbstractSequencer. Changing the parent class just doesn't work. I'm trying to implement the Deep Recurrent Attention model and my reward is a table.
/home/vyouman/torch/install/bin/luajit: ...an/torch/install/share/lua/5.1/rnn/AbstractSequencer.lua:4: DEPRECATED 27 Oct 2015. Wrap your internal modules into a Recursor instead
stack traceback:
  [C]: in function 'error'
  ...an/torch/install/share/lua/5.1/rnn/AbstractSequencer.lua:4: in function 'getStepModule'
  ...an/torch/install/share/lua/5.1/rnn/AbstractRecurrent.lua:177: in function 'reinforce'
  /home/vyouman/torch/install/share/lua/5.1/dpnn/Module.lua:586: in function 'reinforce'
  ...-linux.gtk.x86_64/workspace/DRAM/src/VRCaptionReward.lua:53: in function 'backward'
  ...product-linux.gtk.x86_64/workspace/DRAM/src/testRAEx.lua:171: in main chunk
  [C]: at 0x00406670
Hi,
I am trying to use the recurrent attention model for multiple objects ( http://arxiv.org/pdf/1412.7755v2.pdf ). Do you have any suggestions on how to do it?