Based on VideoPose3D. First, the detectron2 module detects 2D keypoints in each frame, yielding the position and confidence of each body part. An affinity matrix then measures the similarity between detected keypoints; keypoints that belong to the same person are clustered according to a similarity threshold and connected with line segments to form the body skeleton. Next, the resulting 2D keypoint sequence is fed into a pretrained 3D pose estimation model, which uses a spatio-temporal convolutional network to capture the dynamic information in the video and outputs 3D keypoint coordinates for each frame. Finally, the predicted 3D coordinates are projected back onto the 2D plane and compared with the original 2D keypoints; the error between them is backpropagated to update the model parameters and make the prediction more accurate.
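For reference, a minimal sketch of the 2D detection step with detectron2's COCO keypoint model; the config name, score threshold, and frame path are illustrative assumptions, not the project's actual settings:

```python
# Sketch: per-frame 2D keypoint detection with a detectron2 keypoint R-CNN.
import cv2
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-Keypoints/keypoint_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-Keypoints/keypoint_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7  # detection cutoff (assumption)
predictor = DefaultPredictor(cfg)

frame = cv2.imread("frame_000.jpg")  # hypothetical input frame
outputs = predictor(frame)
# Tensor of shape (num_people, 17, 3): x, y, and a per-keypoint score.
keypoints = outputs["instances"].pred_keypoints
```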
(1) Bottom-up joint detection. Bottom-up human pose estimation dispenses with a prior person-detection stage: body joints are detected directly from the entire image and then assembled algorithmically into complete skeletons. Compared with top-down approaches, this effectively mitigates the errors and distortions caused by detector imperfections and image-cropping operations.
(2) Affinity matrix-based clustering for joint association. An affinity matrix is utilized to represent the likelihood of different joints belonging to the same anatomical structure. Clustering algorithms are then applied to group joints into coherent anatomical segments, which are subsequently connected to form partial skeletal structures.
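To make the association step concrete, here is a minimal sketch that scores candidate joint pairs with a distance-based Gaussian affinity and greedily matches pairs above a threshold; real systems learn these pairwise scores, and every name, the affinity form, and the threshold below are illustrative assumptions:

```python
# Minimal sketch of affinity-based joint association: a Gaussian of
# inter-joint distance stands in for a learned pairwise score, and greedy
# matching stands in for full clustering.
import numpy as np

def associate(joints_a, joints_b, sigma=40.0, threshold=0.5):
    """Pair candidate joints of two types (e.g. shoulders with elbows)
    whose affinity exceeds `threshold`, yielding limb segments."""
    diff = joints_a[:, None, :] - joints_b[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)            # pairwise distances
    affinity = np.exp(-dist**2 / (2 * sigma**2))    # affinity matrix

    limbs, used_a, used_b = [], set(), set()
    # Visit candidate pairs from strongest to weakest affinity.
    for flat in np.argsort(-affinity, axis=None):
        i, j = np.unravel_index(flat, affinity.shape)
        if affinity[i, j] < threshold:
            break
        if i not in used_a and j not in used_b:     # one limb per joint
            limbs.append((int(i), int(j)))
            used_a.add(i)
            used_b.add(j)
    return limbs

# Example: one shoulder candidate, one elbow candidate, 41 px apart.
print(associate(np.array([[100.0, 200.0]]), np.array([[110.0, 240.0]])))
```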
(3) Pretrained model integration for depth inference. A pretrained deep learning model is employed to infer three-dimensional depth coordinates for each anatomical segment, leveraging both the partial skeletal structures and image features [3]. This process enables the estimation of spatial positions in the three-dimensional coordinate system.
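Since the lifting model operates on keypoint sequences, the toy stand-in below illustrates only its shape contract (a 2D keypoint sequence in, a 3D pose per frame out) using dilated temporal convolutions in the spirit of VideoPose3D; it is not the actual pretrained architecture, and all names are assumptions:

```python
# Toy stand-in for the 2D-to-3D lifting step, shown for the shape contract.
import torch
import torch.nn as nn

class TemporalLifter(nn.Module):
    def __init__(self, num_joints=17, channels=256):
        super().__init__()
        self.net = nn.Sequential(
            # Input: (batch, joints * 2, frames), i.e. x/y per joint per frame.
            nn.Conv1d(num_joints * 2, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            # Dilation widens the temporal receptive field, as in VideoPose3D.
            nn.Conv1d(channels, channels, kernel_size=3, dilation=3, padding=3),
            nn.ReLU(),
            nn.Conv1d(channels, num_joints * 3, kernel_size=1),  # x/y/z per joint
        )

    def forward(self, kp2d):                           # (batch, frames, joints, 2)
        b, t, j, _ = kp2d.shape
        x = kp2d.reshape(b, t, j * 2).transpose(1, 2)  # -> (batch, joints*2, frames)
        out = self.net(x).transpose(1, 2)              # -> (batch, frames, joints*3)
        return out.reshape(b, t, j, 3)                 # 3D pose for every frame

poses_3d = TemporalLifter()(torch.randn(1, 243, 17, 2))  # -> (1, 243, 17, 3)
```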
(4) Projection-based iterative refinement with 3D-to-2D optimization. The framework implements:
- An optimization function that updates camera extrinsic parameters to minimize projection loss
- A projection function that maps 3D joint coordinates to 2D image planes
- An iterative optimization loop that computes projection errors between reprojected 2D coordinates and original image annotations, subsequently refining parameters through gradient-based optimization.
This systematic approach enables progressive refinement of the 3D pose estimate through cyclic projection comparison and parameter adjustment; a minimal sketch of the loop follows.
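A minimal sketch of that loop, assuming a bare pinhole projection and a learnable camera translation in place of the full extrinsic model (the focal length, principal point, and all names are illustrative):

```python
# Sketch: refine a camera parameter by minimizing 3D-to-2D reprojection error.
import torch

def project(points_3d, translation, focal=1000.0, center=500.0):
    """Pinhole projection of (..., J, 3) camera-space points to pixels."""
    p = points_3d + translation          # apply the extrinsic translation
    return focal * p[..., :2] / p[..., 2:3] + center

pose_3d = torch.randn(17, 3) + torch.tensor([0.0, 0.0, 5.0])   # dummy pose in front of camera
target_2d = project(pose_3d, torch.tensor([0.1, -0.05, 0.2]))  # dummy 2D annotations

translation = torch.zeros(3, requires_grad=True)  # parameter being refined
optimizer = torch.optim.Adam([translation], lr=1e-2)

for step in range(200):
    optimizer.zero_grad()
    loss = ((project(pose_3d, translation) - target_2d) ** 2).mean()
    loss.backward()   # gradient of the projection error
    optimizer.step()  # adjust the camera parameter
```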
Known bugs: loading the pretrained checkpoint fails with Unexpected key(s) in state_dict: "layers_conv.4.weight", "layers_conv.5.weight", "layers_conv.6.weight", "layers_conv.7.weight", "layers_bn.4.weight", "layers_bn.4.bias", "layers_bn.4.running_mean", "layers_bn.4.running_var", "layers_bn.4.num_batches_tracked", "layers_bn.5.weight", "layers_bn.5.bias", "layers_bn.5.running_mean", "layers_bn.5.running_var", "layers_bn.5.num_batches_tracked", "layers_bn.6.weight", "layers_bn.6.bias", "layers_bn.6.running_mean", "layers_bn.6.running_var", "layers_bn.6.num_batches_tracked", "layers_bn.7.weight", "layers_bn.7.bias", "layers_bn.7.running_mean", "layers_bn.7.running_var", "layers_bn.7.num_batches_tracked".
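The extra layers_conv.4-7 and layers_bn.4-7 entries suggest the checkpoint was exported from a deeper temporal network (more residual blocks, e.g. a 3,3,3,3,3 filter-width architecture) than the model being constructed (e.g. 3,3,3). The clean fix is to instantiate the model with the same architecture the checkpoint was trained with. As a stopgap, the unmatched keys can be filtered out before loading; the sketch below assumes a VideoPose3D-style checkpoint that stores weights under a "model_pos" entry, and the default path is hypothetical:

```python
# Stopgap loader: drop checkpoint keys the current (shallower) model does
# not define. Discarding trained temporal blocks degrades the model, so
# matching the architecture to the checkpoint remains the proper fix.
import torch

def load_pretrained(model, path="checkpoint/pretrained.bin"):
    checkpoint = torch.load(path, map_location="cpu")
    state_dict = checkpoint["model_pos"]  # assumed checkpoint layout
    model_keys = set(model.state_dict().keys())
    filtered = {k: v for k, v in state_dict.items() if k in model_keys}
    model.load_state_dict(filtered, strict=False)  # extra blocks discarded
    return model
```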