[Refactor]: Refactor DETR and Deformable DETR (open-mmlab#8763)
* [Fix] Fix UT to be compatible with pytorch 1.6 (open-mmlab#8707)

* Update

* Update

* force reinstall pycocotools

* Fix build_cuda

* docker install git

* Update

* comment out other jobs to speed up the process

* update

* uncomment

* Update

* Update

* Add comments for --force-reinstall

* [Refactor] Refactor anchor head and base head with boxlist (open-mmlab#8625)

* Refactor anchor head

* Update

* Update

* Update

* Add a series of boxes tools

* Fix box type to support n x box_dim boxes

* revert box type changes

* Add docstring

* refactor retina_head

* Update

* Update

* Fix comments

* modify docstring of coder and ioucalculator

* Replace with_boxlist with use_box_type

* fix: fix config of detr-r18

* fix: modified import of MSDeformAttn in PixelDecoder of Mask2Former

* feat: add TransformerDetector as the base detector of DETR-like detectors

* refactor: refactor modules and configs of DETR

* refactor: refactor DETR-related modules in transformer.py

* refactor: refactor DETR-related modules in transformer.py

* fix: add type comments in detr.py

* correct trainloop in detr_r50 config

* fix: modify the parent class of DETRHead to BaseModule

* refactor: refactor modules and configs of Deformable DETR

* fix: modify the usage of num_query

* fix: modify the usage of num_query in configs

* refactor: replace input_proj of detr with ChannelMapper neck

* refactor: delete multi_apply in DETRHead.forward()

* Update detr_r18_8xb2-500e_coco.py

use the ChannelMapper neck for r18

* change the name of detection_transformer.py to base_detr.py

* refactor: modify the binary-mask construction section of forward_pretransformer

* refactor: utilize abstractmethod

* update ABCMeta to make sure the class TransformerDetector can be reloaded

* add some annotations

* add some annotations

* add some annotations

* refactor: delete _init_transformer in detectors

* refactor: modify args of deformable detr

* refactor: modify calls to super().__init__()

* Update detr_head.py

Remove the multi-level feature handling in function 'predict_by_feat'

* Update detr.py

update init_weights

* add some annotations for the head

* to make sure the head args are the same as the detector's

* to make sure the head args are the same as the detector's

* fix some bugs

* fix: fix bugs of num_pred in DeformableDETRHead

* add kwargs to transformer

* support MLP and sine position embedding

* delete positional encoding

* delete useless postnorm

* Revert "add kwargs to transformer"

This reverts commit a265c1a.

* Update detr_head.py

Update type and shape of args

* Update detr_head.py

fix args docstring in predict_by_feat

* Update base_detr.py

Update docstring for forward_pretransformer

* Update deformable_detr.py

Fix docstring

* to support Conditional DETR with an overridable forward_transformer

* fix: update config files of Two-stage and Box-refine

* replace all bs with batch_size in detr-related files

* update deformable.py and transformer.py

* update docstring in base_detr

* update docstring in base_detr, detr

* doc refine

* Revert "doc refine"

This reverts commit b69da4f.

* doc refine

* doc refine

* update docs of base_detr, detr, and layers/transformer

* fix doc in base_detr

* add origin repo link

* add origin repo link

* refine doc

* refine doc

* refine doc

* refine doc

* refine doc

* refine doc

* refine doc

* refine doc

* doc: add doc of the first edition of Deformable DETR

* batch_size to bs

* refine doc

* refine doc

* feat: add config comments of specific module

* refactor: refactor base DETR class TransformerDetector

* fix: fix wrong return typehint of forward_encoder in TransformerDetector

* refactor: refactor DETR

* refactor: refactor Deformable DETR

* refactor: refactor forward_encoder and pre_decoder

* fix: fix bugs of new edition

* refactor: small modifications

* fix: move get_reference_points to deformable_encoder

* refactor: merge init_ and inter_reference into references in Deformable DETR

* modify docstring of get_valid_ratio in Deformable DETR

* add some docstring

* doc: add docstring of deformable_detr.py

* doc: add docstring of deformable_detr_head.py

* doc: modify docstring of deformable detr

* doc: add docstring of deformable_detr_head.py

* doc: modify docstring of deformable detr

* doc: add docstring of base_detr.py

* doc: refine docstring of base_detr.py

* doc: refine docstring of base_detr.py

* a little change of MLP

* a little change of MLP

* a little change of MLP

* a little change of MLP

* refine config

* refine config

* refine config

* refine doc string for detr

* little refine doc string for detr.py

* tiny modification

* doc: refine docstring of detr.py

* tiny modifications to resolve the conversations

* DETRHead.predict() draft

* tiny modifications to resolve conversations

* refactor: modify arg names and forward strategies of bbox_head

* tiny modifications to resolve the conversations

* support MLP

* fix docstring of function pre_decoder

* fix docstring of function pre_decoder

* fix docstring

* modifications for resolving conversations

* refactor: eradicate key_padding_mask args

* refactor: eradicate key_padding_mask args

* fix: fix bug of deformable detr and resolve some conversations

* refactor: rename base class with DetectionTransformer and other modifications

* fix: fix config of detr

* fix the bug of init

* fix: fix init_weight of DETR and Deformable DETR

* resolve conflict

* fix auto-merge bug

* fix pre-commit bug

* refactor: move the position of encoder and decoder

* delete Transformer in ci test

* delete Transformer in ci test

Co-authored-by: jbwang1997 <[email protected]>
Co-authored-by: KeiChiTse <[email protected]>
Co-authored-by: LYMDLUT <[email protected]>
Co-authored-by: lym <[email protected]>
Co-authored-by: Kei-Chi Tse <[email protected]>
6 people authored and jshilong committed Jan 19, 2023
1 parent d2a3cbb commit 4d30934
Showing 19 changed files with 2,635 additions and 1,771 deletions.
65 changes: 25 additions & 40 deletions configs/deformable_detr/deformable-detr_r50_16xb2-50e_coco.py
@@ -3,6 +3,10 @@
 ]
 model = dict(
     type='DeformableDETR',
+    num_query=300,
+    num_feature_levels=4,
+    with_box_refine=False,
+    as_two_stage=False,
     data_preprocessor=dict(
         type='DetDataPreprocessor',
         mean=[123.675, 116.28, 103.53],
@@ -27,50 +31,31 @@
         act_cfg=None,
         norm_cfg=dict(type='GN', num_groups=32),
         num_outs=4),
+    encoder=dict(  # DeformableDetrTransformerEncoder
+        num_layers=6,
+        layer_cfg=dict(  # DeformableDetrTransformerEncoderLayer
+            self_attn_cfg=dict(  # MultiScaleDeformableAttention
+                embed_dims=256),
+            ffn_cfg=dict(
+                embed_dims=256, feedforward_channels=1024, ffn_drop=0.1))),
+    decoder=dict(  # DeformableDetrTransformerDecoder
+        num_layers=6,
+        return_intermediate=True,
+        layer_cfg=dict(  # DeformableDetrTransformerDecoderLayer
+            self_attn_cfg=dict(  # MultiheadAttention
+                embed_dims=256,
+                num_heads=8,
+                dropout=0.1),
+            cross_attn_cfg=dict(  # MultiScaleDeformableAttention
+                embed_dims=256),
+            ffn_cfg=dict(
+                embed_dims=256, feedforward_channels=1024, ffn_drop=0.1)),
+        post_norm_cfg=None),
+    positional_encoding_cfg=dict(num_feats=128, normalize=True, offset=-0.5),
     bbox_head=dict(
         type='DeformableDETRHead',
-        num_query=300,
         num_classes=80,
-        in_channels=2048,
         sync_cls_avg_factor=True,
-        as_two_stage=False,
-        transformer=dict(
-            type='DeformableDetrTransformer',
-            encoder=dict(
-                type='DetrTransformerEncoder',
-                num_layers=6,
-                transformerlayers=dict(
-                    type='BaseTransformerLayer',
-                    attn_cfgs=dict(
-                        type='MultiScaleDeformableAttention', embed_dims=256),
-                    feedforward_channels=1024,
-                    ffn_dropout=0.1,
-                    operation_order=('self_attn', 'norm', 'ffn', 'norm'))),
-            decoder=dict(
-                type='DeformableDetrTransformerDecoder',
-                num_layers=6,
-                return_intermediate=True,
-                transformerlayers=dict(
-                    type='DetrTransformerDecoderLayer',
-                    attn_cfgs=[
-                        dict(
-                            type='MultiheadAttention',
-                            embed_dims=256,
-                            num_heads=8,
-                            dropout=0.1),
-                        dict(
-                            type='MultiScaleDeformableAttention',
-                            embed_dims=256)
-                    ],
-                    feedforward_channels=1024,
-                    ffn_dropout=0.1,
-                    operation_order=('self_attn', 'norm', 'cross_attn', 'norm',
-                                     'ffn', 'norm')))),
-        positional_encoding=dict(
-            type='SinePositionalEncoding',
-            num_feats=128,
-            normalize=True,
-            offset=-0.5),
         loss_cls=dict(
             type='FocalLoss',
             use_sigmoid=True,
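The practical effect of this hunk is that num_query, num_feature_levels, with_box_refine, and as_two_stage now live on the detector itself, and the encoder, decoder, and positional encoding are configured directly on the detector instead of through bbox_head.transformer. As a rough, hypothetical sketch (not a file from this commit) of a downstream override written against the refactored layout, with keys taken from the hunk above and values chosen only for illustration:

# Hypothetical override config; assumes the refactored layout shown above.
_base_ = 'deformable-detr_r50_16xb2-50e_coco.py'

model = dict(
    with_box_refine=True,  # formerly model.bbox_head.with_box_refine
    as_two_stage=True,     # formerly model.bbox_head.as_two_stage
    # Transformer modules can now be tweaked directly on the detector.
    decoder=dict(num_layers=6, return_intermediate=True))

The two one-line diffs that follow show exactly this pattern in the shipped box-refine and two-stage configs.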
@@ -1,2 +1,2 @@
 _base_ = 'deformable-detr_r50_16xb2-50e_coco.py'
-model = dict(bbox_head=dict(with_box_refine=True))
+model = dict(with_box_refine=True)

@@ -1,2 +1,2 @@
 _base_ = 'deformable-detr_refine_r50_16xb2-50e_coco.py'
-model = dict(bbox_head=dict(as_two_stage=True))
+model = dict(as_two_stage=True)
2 changes: 1 addition & 1 deletion configs/detr/detr_r18_8xb2-500e_coco.py
@@ -4,4 +4,4 @@
     backbone=dict(
         depth=18,
         init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet18')),
-    bbox_head=dict(in_channels=512))
+    neck=dict(in_channels=[512]))
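With the head's input_proj replaced by a ChannelMapper neck, the ResNet-18 variant no longer overrides bbox_head.in_channels; it only points the neck at the 512-channel C5 output of ResNet-18. A sketch of the effective merged neck config, assuming every field except in_channels is inherited unchanged from the r50 base config shown in the next diff:

# Sketch of the merged neck config for the ResNet-18 variant (assumption:
# only in_channels differs from the r50 base config).
neck = dict(
    type='ChannelMapper',
    in_channels=[512],  # ResNet-18 C5 channels; ResNet-50 uses [2048]
    kernel_size=1,
    out_channels=256,  # matches the transformer embed_dims
    act_cfg=None,
    norm_cfg=None,
    num_outs=1)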
78 changes: 42 additions & 36 deletions configs/detr/detr_r50_8xb2-150e_coco.py
@@ -3,6 +3,7 @@
 ]
 model = dict(
     type='DETR',
+    num_query=100,
     data_preprocessor=dict(
         type='DetDataPreprocessor',
         mean=[123.675, 116.28, 103.53],
@@ -19,45 +20,50 @@
         norm_eval=True,
         style='pytorch',
         init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')),
+    neck=dict(
+        type='ChannelMapper',
+        in_channels=[2048],
+        kernel_size=1,
+        out_channels=256,
+        act_cfg=None,
+        norm_cfg=None,
+        num_outs=1),
+    encoder=dict(  # DetrTransformerEncoder
+        num_layers=6,
+        layer_cfg=dict(  # DetrTransformerEncoderLayer
+            self_attn_cfg=dict(  # MultiheadAttention
+                embed_dims=256,
+                num_heads=8,
+                dropout=0.1),
+            ffn_cfg=dict(
+                embed_dims=256,
+                feedforward_channels=2048,
+                num_fcs=2,
+                ffn_drop=0.1,
+                act_cfg=dict(type='ReLU', inplace=True)))),
+    decoder=dict(  # DetrTransformerDecoder
+        num_layers=6,
+        layer_cfg=dict(  # DetrTransformerDecoderLayer
+            self_attn_cfg=dict(  # MultiheadAttention
+                embed_dims=256,
+                num_heads=8,
+                dropout=0.1),
+            cross_attn_cfg=dict(  # MultiheadAttention
+                embed_dims=256,
+                num_heads=8,
+                dropout=0.1),
+            ffn_cfg=dict(
+                embed_dims=256,
+                feedforward_channels=2048,
+                num_fcs=2,
+                ffn_drop=0.1,
+                act_cfg=dict(type='ReLU', inplace=True))),
+        return_intermediate=True),
+    positional_encoding_cfg=dict(num_feats=128, normalize=True),
     bbox_head=dict(
         type='DETRHead',
         num_classes=80,
-        in_channels=2048,
-        transformer=dict(
-            type='Transformer',
-            encoder=dict(
-                type='DetrTransformerEncoder',
-                num_layers=6,
-                transformerlayers=dict(
-                    type='BaseTransformerLayer',
-                    attn_cfgs=[
-                        dict(
-                            type='MultiheadAttention',
-                            embed_dims=256,
-                            num_heads=8,
-                            dropout=0.1)
-                    ],
-                    feedforward_channels=2048,
-                    ffn_dropout=0.1,
-                    operation_order=('self_attn', 'norm', 'ffn', 'norm'))),
-            decoder=dict(
-                type='DetrTransformerDecoder',
-                return_intermediate=True,
-                num_layers=6,
-                transformerlayers=dict(
-                    type='DetrTransformerDecoderLayer',
-                    attn_cfgs=dict(
-                        type='MultiheadAttention',
-                        embed_dims=256,
-                        num_heads=8,
-                        dropout=0.1),
-                    feedforward_channels=2048,
-                    ffn_dropout=0.1,
-                    operation_order=('self_attn', 'norm', 'cross_attn', 'norm',
-                                     'ffn', 'norm')),
-            )),
-        positional_encoding=dict(
-            type='SinePositionalEncoding', num_feats=128, normalize=True),
+        embed_dims=256,
         loss_cls=dict(
             type='CrossEntropyLoss',
             bg_cls_weight=0.1,
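For orientation only (not part of this commit), a config refactored this way would typically be consumed through MMEngine's registry. A minimal build sketch, assuming an mmdet 3.x development environment with this refactor applied and a local checkout that provides the config path below:

# Minimal sketch; assumes mmdet 3.x dev with this refactor applied.
from mmengine.config import Config

from mmdet.registry import MODELS
from mmdet.utils import register_all_modules

register_all_modules()  # register mmdet modules in the default registry scope

cfg = Config.fromfile('configs/detr/detr_r50_8xb2-150e_coco.py')
detector = MODELS.build(cfg.model)

# After the refactor, the transformer encoder and decoder are attributes of
# the detector rather than being wrapped inside bbox_head.transformer.
print(type(detector).__name__)                      # expected: DETR
print(hasattr(detector, 'encoder'), hasattr(detector, 'decoder'))
print(hasattr(detector.bbox_head, 'transformer'))   # expected: False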
(Diffs for the remaining changed files are not shown here.)
