Installation | Train | Evaluation | FLOPs
This repository builds upon MMHuman3D, an open-source PyTorch-based codebase for the use of 3D human parametric models in computer vision and computer graphics. MMHuman3D is part of the OpenMMLab project. The main branch works with PyTorch 1.7+.
We have added the following major features on top of MMHuman3D; they will be contributed to MMHuman3D at a later date.
- Benchmarks on 31 datasets
- Benchmarks on 11 dataset combinations
- Benchmarks on 9 backbones and different initialisations
- Benchmarks on 9 augmentation techniques
- Trained models on optimal configurations, provided for inference
- Evaluation on 5 test sets
- FLOPs calculation
Additional:
- Train annotation files for 31 datasets will be provided in the future
- Future works can easily obtain HMR baseline benchmarks on their selected dataset mixes and partitions using our provided pipeline and annotation files.
Supported datasets:
- AGORA (CVPR'2021)
- AI Challenger (ICME'2019)
- COCO (ECCV'2014)
- COCO-WholeBody (ECCV'2020)
- EFT-COCO-Part (3DV'2021)
- EFT-COCO (3DV'2021)
- EFT-LSPET (3DV'2021)
- EFT-OCHuman (3DV'2021)
- EFT-PoseTrack (3DV'2021)
- EFT-MPII (3DV'2021)
- Human3.6M (TPAMI'2014)
- InstaVariety (CVPR'2019)
- LIP (CVPR'2017)
- LSP (BMVC'2010)
- LSP-Extended (CVPR'2011)
- MPI-INF-3DHP (3DV'2017)
- MPII (CVPR'2014)
- MTP (CVPR'2021)
- MuCo-3DHP (3DV'2018)
- MuPoTs-3D (3DV'2018)
- OCHuman (CVPR'2019)
- 3DOH50K (CVPR'2020)
- Penn Action (ICCV'2012)
- 3D-People (ICCV'2019)
- PoseTrack18 (CVPR'2018)
- PROX (ICCV'2019)
- 3DPW (ECCV'2018)
- SURREAL (CVPR'2017)
- UP-3D (CVPR'2017)
- VLOG (CVPR'2019)
- CrowdPose (CVPR'2019)
Please refer to datasets.md for training configs and results.
Benchmarks on different dataset combinations:
- Mix 1: H36M, MI, COCO
- Mix 2: H36M, MI, EFT-COCO
- Mix 3: H36M, MI, EFT-COCO, MPII
- Mix 4: H36M, MuCo, EFT-COCO
- Mix 5: H36M, MI, COCO, LSP, LSPET, MPII
- Mix 6: EFT-[COCO, MPII, LSPET], SPIN-MI, H36M
- Mix 7: EFT-[COCO, MPII, LSPET], MuCo, H36M, PROX
- Mix 8: EFT-[COCO, PT, LSPET], MI, H36M
- Mix 9: EFT-[COCO, PT, LSPET, OCH], MI, H36M
- Mix 10: PROX, MuCo, EFT-[COCO, PT, LSPET, OCH], UP-3D, MTP, Crowdpose
- Mix 11: EFT-[COCO, MPII, LSPET], MuCo, H36M
Please refer to mixed-datasets.md for training configs and results.
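For orientation, a dataset mix is expressed in an MMHuman3D-style training config through a MixedDataset that lists the per-dataset configs and a sampling partition. The snippet below is only a minimal sketch of Mix 1 (H36M, MI, COCO): the dataset types, annotation file names and partition weights are illustrative assumptions, not the values from our released configs (see mixed-datasets.md for those).

```python
# Minimal sketch (not a released config): dataset mixing in MMHuman3D style.
# Dataset types, annotation paths and partition weights are illustrative assumptions.
train_pipeline = []  # data-loading/augmentation pipeline, defined elsewhere in a real config

data = dict(
    samples_per_gpu=64,
    workers_per_gpu=1,
    train=dict(
        type='MixedDataset',
        configs=[
            dict(type='HumanImageDataset', dataset_name='h36m',
                 data_prefix='data', ann_file='h36m_train.npz', pipeline=train_pipeline),
            dict(type='HumanImageDataset', dataset_name='mpi_inf_3dhp',
                 data_prefix='data', ann_file='mpi_inf_3dhp_train.npz', pipeline=train_pipeline),
            dict(type='HumanImageDataset', dataset_name='coco',
                 data_prefix='data', ann_file='coco_2014_train.npz', pipeline=train_pipeline),
        ],
        # Sampling ratio over the three datasets; adjust the partition to change the mix.
        partition=[0.5, 0.2, 0.3],
    ),
)
```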
Supported backbones:
- ResNet-50, -101, -152 (CVPR'2016)
- ResNeXt (CVPR'2017)
- HRNet (CVPR'2019)
- EfficientNet
- ViT
- Swin
- Twins
Please refer to backbone.md for training configs and results.
We find that transferring knowledge from a pose estimation model gives more competitive performance than the default ImageNet initialisation.
Initialised backbones:
- ResNet-50 ImageNet (default)
- ResNet-50 MPII
- ResNet-50 COCO
- HRNet-W32 ImageNet
- HRNet-W32 MPII
- HRNet-W32 COCO
- Twins-SVT ImageNet
- Twins-SVT MPII
- Twins-SVT COCO
Please refer to backbone.md for training configs and results.
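For reference, swapping the backbone initialisation is a small change in the model config. The sketch below assumes MMHuman3D's `init_cfg=dict(type='Pretrained', ...)` mechanism; the checkpoint path is a placeholder for a pose-estimation-pretrained weight file, not a file shipped with this repo.

```python
# Sketch only: initialise the backbone from a 2D pose estimation checkpoint
# (e.g. ResNet-50 trained on MPII/COCO keypoints) instead of the ImageNet default.
# The checkpoint path is a placeholder.
model = dict(
    type='ImageBodyModelEstimator',
    backbone=dict(
        type='ResNet',
        depth=50,
        init_cfg=dict(
            type='Pretrained',
            checkpoint='data/pretrained/resnet50_mpii_pose.pth')),
    # head, body_model and losses stay as in the baseline HMR config
)
```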
New augmentations:
- Coarse dropout
- Grid dropout
- Photometric distortion
- Random crop
- Hard erasing
- Soft erasing
- Self-mixing
- Synthetic occlusion
- Synthetic occlusion over keypoints
Please refer to augmentation.md for training configs and results.
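As a rough illustration, each augmentation is switched on by inserting a transform into the training pipeline of the config. The transform name and arguments for the occlusion step below are placeholders; the actual names and parameters used in our experiments are documented in augmentation.md.

```python
# Sketch only: adding an occlusion-style augmentation to the data pipeline.
# 'SyntheticOcclusion' and its arguments are illustrative placeholders.
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='GetRandomScaleRotation', rot_factor=30, scale_factor=0.25),
    dict(type='MeshAffine', img_res=224),
    dict(type='SyntheticOcclusion', occluders_file='data/occluders/pascal_occluders.npy'),
    # ... normalisation, tensor conversion and key collection as in the baseline config
]
```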
We find that training with L1 loss gives more competitive performance. Please refer to mixed-datasets-l1.md for training configs and results.
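Concretely, the L1 variant only swaps the loss types in the model config; a hedged sketch of the relevant fragment is below. The loss weights are illustrative placeholders, and the tuned values live in the configs referenced by mixed-datasets-l1.md.

```python
# Sketch only: using L1 losses for the keypoint/vertex terms.
# Loss weights are illustrative placeholders.
model = dict(
    loss_keypoints3d=dict(type='L1Loss', loss_weight=100),
    loss_keypoints2d=dict(type='L1Loss', loss_weight=10),
    loss_vertex=dict(type='L1Loss', loss_weight=2),
)
```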
We provide trained models from the optimal configurations for download and inference. Please refer to combine.md for training configs and results.
| Dataset | Backbone | 3DPW PA-MPJPE (mm) | Download |
|---|---|---|---|
| H36M, MI, COCO, LSP, LSPET, MPII | ResNet-50 | 51.66 | model |
| H36M, MI, COCO, LSP, LSPET, MPII | HRNet-W32 | 49.18 | model |
| H36M, MI, COCO, LSP, LSPET, MPII | Twins-SVT | 48.77 | model |
| H36M, MI, COCO, LSP, LSPET, MPII | Twins-SVT | 47.70 | model |
| EFT-[COCO, LSPET, MPII], H36M, SPIN-MI | HRNet-W32 | 47.68 | model |
| EFT-[COCO, LSPET, MPII], H36M, SPIN-MI | Twins-SVT | 47.31 | model |
| H36M, MI, EFT-COCO | HRNet-W32 | 48.08 | model |
| H36M, MI, EFT-COCO | Twins-SVT | 48.27 | model |
| H36M, MuCo, EFT-COCO | Twins-SVT | 47.92 | model |
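To sanity-check a downloaded checkpoint before running the evaluation commands below, it can be loaded through mmcv together with MMHuman3D's model builder. This is only a sketch: the paths are placeholders and the exact builder import may differ across MMHuman3D versions.

```python
# Sketch only: build a model from its config and load a downloaded checkpoint.
# Paths are placeholders; use the config that matches the row in the table above.
# The build_architecture import path is an assumption and may differ by version.
from mmcv import Config
from mmcv.runner import load_checkpoint

from mmhuman3d.models import build_architecture

cfg = Config.fromfile('configs/hmr/resnet50_hmr_pw3d.py')
model = build_architecture(cfg.model)
load_checkpoint(model, 'data/checkpoints/downloaded_model.pth', map_location='cpu')
model.eval()
```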
We benchmarked our major findings on several algorithms and hope to add more in the future. Please refer to algorithms.md for training configs and logs.
Supported algorithms:
- SPIN
- GraphCMR
- PARE
- Mesh Graphormer
General set-up instructions follow those of MMHuman3D. Please refer to install.md for installation.
To train a model, run:
python tools/train.py ${CONFIG_FILE} ${WORK_DIR} --no-validate
Example: using 1 GPU to train HMR.
python tools/train.py ${CONFIG_FILE} ${WORK_DIR} --gpus 1 --no-validate
If you run MMHuman3D on a cluster managed with Slurm, you can use the script slurm_train.sh:
./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${WORK_DIR} ${GPU_NUM} --no-validate
Common optional arguments include:
- --resume-from ${CHECKPOINT_FILE}: Resume training from a previous checkpoint file.
- --no-validate: Do not evaluate the checkpoint during training.
Example: using 8 GPUs to train HMR on a Slurm cluster.
./tools/slurm_train.sh my_partition my_job configs/hmr/resnet50_hmr_pw3d.py work_dirs/hmr 8 --no-validate
You can check slurm_train.sh for full arguments and environment variables.
There are five benchmarks for evaluation:
- 3DPW-test (P2)
- H36M-test (P2)
- EFT-COCO-val
- EFT-LSPET-test
- EFT-OCHuman-test
To evaluate a trained model, run:
python tools/test.py ${CONFIG} --work-dir=${WORK_DIR} ${CHECKPOINT} --metrics=${METRICS}
Example:
python tools/test.py configs/hmr/resnet50_hmr_pw3d.py --work-dir=work_dirs/hmr work_dirs/hmr/latest.pth --metrics pa-mpjpe mpjpe
If you run MMHuman3D on a cluster managed with Slurm, you can use the script slurm_test.sh:
./tools/slurm_test.sh ${PARTITION} ${JOB_NAME} ${CONFIG} ${WORK_DIR} ${CHECKPOINT} --metrics ${METRICS}
Example:
./tools/slurm_test.sh my_partition test_hmr configs/hmr/resnet50_hmr_pw3d.py work_dirs/hmr work_dirs/hmr/latest.pth 8 --metrics pa-mpjpe mpjpe
tools/get_flops.py is a script adapted from flops-counter.pytorch and MMDetection to compute the FLOPs and params of a given model.
python tools/get_flops.py ${CONFIG_FILE} [--shape ${INPUT_SHAPE}]
You will get results like this:
==============================
Input shape: (3, 1280, 800)
Flops: 239.32 GFLOPs
Params: 37.74 M
==============================
Note: This tool is still experimental and we do not guarantee that the number is absolutely correct. You may use the result for simple comparisons, but double-check it before adopting it in technical reports or papers.
- FLOPs are related to the input shape while parameters are not. The default input shape is (1, 3, 224, 224).
- Some operators, such as GN and custom operators, are not counted in FLOPs. Refer to mmcv.cnn.get_model_complexity_info() for details.
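For custom analyses, the underlying counter can also be called directly from Python. The snippet below is a minimal sketch that uses a torchvision ResNet-50 as a stand-in for the mesh-estimation model.

```python
# Minimal sketch: calling mmcv's complexity counter directly.
# A torchvision ResNet-50 stands in for the actual mesh-estimation model.
import torchvision
from mmcv.cnn import get_model_complexity_info

model = torchvision.models.resnet50()
model.eval()
flops, params = get_model_complexity_info(model, (3, 224, 224), print_per_layer_stat=False)
print(f'FLOPs: {flops}, Params: {params}')
```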
If you find our work useful for your research, please consider citing the paper:
@inproceedings{pang2022benchmarking,
title={Benchmarking and Analyzing 3D Human Pose and Shape Estimation Beyond Algorithms},
author={Pang, Hui En and Cai, Zhongang and Yang, Lei and Zhang, Tianwei and Liu, Ziwei},
booktitle={NeurIPS},
year={2022}
}
Distributed under the S-Lab License. See LICENSE for more information.
This study is supported by NTU NAP, MOE AcRF Tier 2 (T2EP20221-0033), and under the RIE2020 Industry Alignment Fund – Industry Collaboration Projects (IAF-ICP) Funding Initiative, as well as cash and in-kind contribution from the industry partner(s).
Explore More SMPLCap Projects
- [arXiv'25] SMPLest-X: An extended version of SMPLer-X with stronger foundation models.
- [ECCV'24] WHAC: World-grounded human pose and camera estimation from monocular videos.
- [CVPR'24] AiOS: An all-in-one-stage pipeline combining detection and 3D human reconstruction.
- [NeurIPS'23] SMPLer-X: Scaling up EHPS towards a family of generalist foundation models.
- [NeurIPS'23] RoboSMPLX: A framework to enhance the robustness of whole-body pose and shape estimation.
- [ICCV'23] Zolly: 3D human mesh reconstruction from perspective-distorted images.
- [arXiv'23] PointHPS: 3D HPS from point clouds captured in real-world settings.
- [NeurIPS'22] HMR-Benchmarks: A comprehensive benchmark of HPS datasets, backbones, and training strategies.