Extract features from bounding boxes #665
Conversation
Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please sign up at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need the corporate CLA signed. If you have received this in error or have any questions, please contact us at [email protected]. Thanks!
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!
Hi @TheShadow29 Thanks for the PR.
@botcs sounds good. Actually, #164 does two things at once (if I have understood correctly): (i) during the forward pass, it retains the image proposals as well as the image features. This PR instead (ii) requires the ground-truth boxes to be given first and then uses those boxes to retrieve the image features. There are two advantages to this approach. The only downside is that (ii) would take a bit more time to get the features (I don't have timing comparisons, but I would guess around 1.5x slower). However, this is usually a one-time process, so getting better features might be worth the extra processing time. Let me know what you think. Thank you for your patience.
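Roughly, a minimal sketch of approach (ii) — not the exact utility functions in this PR; `extract_box_features` is an illustrative name, and I assume a `GeneralizedRCNN` model from this repo with boxes already in the transformed image's coordinates:

```python
import torch
from maskrcnn_benchmark.structures.image_list import to_image_list
from maskrcnn_benchmark.structures.bounding_box import BoxList

@torch.no_grad()
def extract_box_features(model, image_tensor, gt_boxes):
    """image_tensor: a transformed CHW image tensor; gt_boxes: [N, 4] xyxy boxes
    in the transformed image's coordinates, on the same device as the model."""
    model.eval()
    images = to_image_list(image_tensor, size_divisible=32)  # pad for FPN strides
    features = model.backbone(images.tensors)                # FPN feature maps
    h, w = images.image_sizes[0]                             # stored as (height, width)
    proposals = [BoxList(gt_boxes, (w, h), mode="xyxy")]     # BoxList expects (w, h)
    # Pool RoI features and run the box head's FC layers
    # (1024-d per box for the R-50 FPN config).
    return model.roi_heads.box.feature_extractor(features, proposals)
```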
Thanks for your effort. But when I extracted the features using bboxes (shape [13, 4]), I got features of shape [15, 1024], so the number of feature rows does not match the number of boxes.
@kangkang59812 Thanks for checking it out. Which network are you using? I think I tested with the ResNet-50 FPN Mask R-CNN architecture. My guess is that some changes were made to the repo (the PR was made quite some time back) and this PR would need to be updated.
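A quick shape check along these lines (reusing the hypothetical `extract_box_features` from the sketch above) would catch this kind of mismatch:

```python
feats = extract_box_features(model, image_tensor, gt_boxes)
assert feats.shape[0] == gt_boxes.shape[0], (
    f"got {feats.shape[0]} feature rows for {gt_boxes.shape[0]} boxes")
```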
@TheShadow29 |
@kangkang59812 Awesome. Thanks for confirming.
Hi @TheShadow29, I used your implementation to do retrieval, but it gets worse performance than using a ResNet-50 pre-trained on ImageNet (without further training) to extract features after detection. Maybe the bbox has to be resized according to the image padding? Edit: it seems it is not necessary to pad the boxes, as said in https://github.com/facebookresearch/maskrcnn-benchmark/issues/965#issuecomment-510926086, so I don't know why the performance is worse.
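For reference, a minimal sketch of the coordinate handling in question: padding does not move box coordinates, but boxes given in original-image coordinates would still need rescaling to the resized input the network sees. `raw_boxes`, `orig_size`, and `new_size` are assumed placeholders, with sizes as (width, height):

```python
from maskrcnn_benchmark.structures.bounding_box import BoxList

boxes = BoxList(raw_boxes, orig_size, mode="xyxy")  # boxes on the original image
boxes = boxes.resize(new_size)  # rescale to the transformed image; padding needs no fix
```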
@simaiden Could you briefly explain your retrieval setup? It doesn't seem to be the same as object detection.
I use Mask R-CNN to detect clothes in an image, and after that I get a feature vector from the cropped region. The region becomes an input to a ResNet-50, and then I do global average pooling on some layer to get the feature vector. With this approach I get good results, but not with the RoI features from your implementation.
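A minimal sketch of that baseline, assuming torchvision's ImageNet-pretrained ResNet-50 and a box in pixel coordinates on the original PIL image; replacing `fc` with an identity leaves the global-average-pooled 2048-d vector:

```python
import torch
import torchvision.transforms as T
from torchvision.models import resnet50

backbone = resnet50(pretrained=True)
backbone.fc = torch.nn.Identity()  # keep the 2048-d global-average-pooled vector
backbone.eval()

preprocess = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def crop_feature(pil_image, box):
    x1, y1, x2, y2 = box
    crop = pil_image.crop((x1, y1, x2, y2))           # cut out the detected region
    return backbone(preprocess(crop).unsqueeze(0)).squeeze(0)  # shape [2048]
```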
@simaiden Could you verify it for object detection on COCO? There might be a few more things happening under the hood in your use case.
Hi. First, thanks for the amazing repository.
Features extracted from a detection network are often used in other tasks (like VQA). The code shows how to extract the features given the bounding boxes. For now, I have just added some utility functions to demo/predictor.py. This possibly solves #164 with minor changes. Currently, I am not sure how to test whether everything is correct. A sanity check I have done is to re-classify the extracted boxes, and the results seem to be consistent.
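A sketch of that sanity check, assuming the features come from the box head's feature extractor under the R-50 FPN configuration; `reclassify` is an illustrative name, not a function in this PR:

```python
import torch

@torch.no_grad()
def reclassify(model, box_features):
    # Run the box head's classifier on the extracted per-box features and
    # return the argmax class per box, to compare against the original labels.
    class_logits, _ = model.roi_heads.box.predictor(box_features)
    return class_logits.softmax(dim=-1).argmax(dim=-1)
```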
Thanks