This is the official implementation code for Pairwise CNN-Transformer Features for Human-Object Interaction Detection. [paper]
This code is no longer maintained. Please contact [email protected] (email) with any questions. If you are interested in our work, please read the UPT code first; once you are familiar with it, reproducing our work is straightforward.
Refer to launch_template.sh for training and testing commands with different options.
To test the PCT model on HICO-DET, you can use either the Python utilities implemented by UPT or the Matlab utilities provided by Chao et al. For V-COCO, we did not implement evaluation utilities and instead use those provided by Gupta et al. Refer to these instructions for more details.
UPT provides weights for fine-tuned DETR models to facilitate reproducibility. To attempt fine-tuning the DETR model yourself, refer to this repository.
| Model | Dataset | Default Settings | PCT Weights |
|---|---|---|---|
| PCT-R50 | HICO-DET | (33.63, 28.73, 35.10) | weights |
| PCT-R101 | HICO-DET | (33.79, 29.70, 35.00) | weights |

| Model | Dataset | Scenario 1 | Scenario 2 | PCT Weights |
|---|---|---|---|---|
| PCT-R50 | V-COCO | 59.4 | 65.0 | weights |
| PCT-R101 | V-COCO | 61.4 | 67.1 | weights |
Many thanks to Researcher Zhang for the valuable advice.
