ljinshuan/papae

Infant Posture Assessment System

English | 中文

A command-line tool for infant posture assessment powered by YOLOv8-pose + YOLOv8-seg and a multimodal large language model (Qwen-VL).

Supports three assessment modes: gait analysis, motor development screening, and posture correction evaluation. Input a parent-recorded video of your baby, and the tool outputs a structured Markdown assessment report along with a visualized video overlaid with skeleton and segmentation masks.

Quick Start

Requires Python >= 3.13.

# Clone the repository and install dependencies
git clone https://github.com/ljinshuan/papae.git
cd papae
uv sync --extra dev

# Quick assessment (default gait mode, requires LLM API key)
uv run gait-assess --video ./baby_walking.mp4 --output ./results/

# Try without API key — generates annotated video and report skeleton
uv run gait-assess --video ./baby_walking.mp4 --output ./results/ --skip-llm

# Motor development screening (requires age in months)
uv run gait-assess --video ./baby_walking.mp4 --mode developmental --age-months 12 --output ./results/

# Posture correction evaluation
uv run gait-assess --video ./baby_standing.mp4 --mode posture --output ./results/

Environment Setup

LLM configuration is read from environment variables or a .env file:

cat > .env << 'EOF'
QWEN_API_KEY=your-api-key
GAIT_LLM_MODEL=qwen-vl-max
GAIT_LLM_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
EOF
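
Resolving these variables in code amounts to plain environment lookups. A minimal sketch (the defaults shown are illustrative assumptions, not the tool's actual fallbacks):

```python
import os

def llm_settings() -> dict[str, str]:
    """Resolve LLM settings from the environment. A .env loader such as
    python-dotenv would populate os.environ before this runs. The
    defaults below are illustrative assumptions, not the tool's own."""
    return {
        "api_key": os.environ.get("QWEN_API_KEY", ""),
        "model": os.environ.get("GAIT_LLM_MODEL", "qwen-vl-max"),
        "base_url": os.environ.get(
            "GAIT_LLM_BASE_URL",
            "https://dashscope.aliyuncs.com/compatible-mode/v1",
        ),
    }
```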

Features

  • Three Assessment Modes:
    • gait — Gait analysis: based on gait cycle detection, evaluates walking posture abnormalities
    • developmental — Motor development screening: matches motor milestones by age to screen for developmental delay risks
    • posture — Posture correction evaluation: analyzes spinal/shoulder/pelvic symmetry in static standing posture
  • Video Preprocessing: automatic frame splitting, blur filtering, resolution standardization
  • Pose Detection: YOLOv8-pose extracts 17 COCO keypoints
  • Human Segmentation: YOLOv8-seg generates segmentation masks
  • Gait Analysis: detects gait cycles based on ankle trajectory, extracts 4 key phase frames
  • Pose Computation: knee angle, ankle angle, spinal tilt, pelvic tilt, shoulder height difference, etc.
  • LLM Assessment: end-to-end evaluation by a multimodal large model, with dual-channel input (video + structured pose data)
  • Visualized Output: skeleton overlay, mask overlay, key frame marking
  • Interactive Viewer: viewer.html with per-frame skeleton/segmentation playback and key frame gallery
  • Report Generation: structured Markdown report with risk badges, finding cards, suggestion cards, and embedded key frame images
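
The blur filter's exact method isn't documented here; a common implementation, and a plausible reading of the `--blur-threshold` default of 100.0, is the variance-of-Laplacian sharpness score, sketched below with NumPy only (an assumption, not the project's confirmed algorithm):

```python
import numpy as np

def laplacian_variance(gray: np.ndarray) -> float:
    """Sharpness score: variance of the discrete 4-neighbour Laplacian.
    Blurry frames have weak edges, so the variance is low."""
    lap = (
        -4.0 * gray[1:-1, 1:-1]
        + gray[:-2, 1:-1] + gray[2:, 1:-1]
        + gray[1:-1, :-2] + gray[1:-1, 2:]
    )
    return float(lap.var())

def keep_sharp(frames: list[np.ndarray], threshold: float = 100.0) -> list[np.ndarray]:
    """Drop frames whose sharpness score falls below the threshold
    (mirrors the CLI's --blur-threshold default of 100.0).
    Frames are assumed to be grayscale float arrays here."""
    return [f for f in frames if laplacian_variance(f) >= threshold]
```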

User Guide

Assessment Modes

gait — Gait Analysis (default)

Best for: babies who are already walking independently.

The tool detects gait cycles from ankle trajectories, extracts 4 key phase frames (heel strike, mid-stance, toe-off, mid-swing), computes joint angles and symmetry metrics, and feeds them to the LLM for a holistic assessment.

uv run gait-assess --video ./baby_walking.mp4 --mode gait --output ./results/
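
The cycle detector itself lives in gait_analyzer.py; as an illustrative sketch (not the project's actual algorithm), heel strikes can be approximated as local maxima of the ankle's image-space y trajectory, since image y grows downward and the ankle is lowest at ground contact:

```python
import numpy as np

def heel_strike_frames(ankle_y: np.ndarray, min_gap: int = 5) -> list[int]:
    """Estimate heel-strike frames as local maxima of ankle y.
    `min_gap` suppresses near-duplicate detections from jitter."""
    strikes: list[int] = []
    for i in range(1, len(ankle_y) - 1):
        if ankle_y[i] >= ankle_y[i - 1] and ankle_y[i] > ankle_y[i + 1]:
            if not strikes or i - strikes[-1] >= min_gap:
                strikes.append(i)
    return strikes
```

A full gait cycle then spans one detected strike of a foot to the next strike of the same foot, and the four phase frames are sampled inside that interval.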

developmental — Motor Development Screening

Best for: screening developmental delays against age-appropriate motor milestones.

Requires --age-months (e.g., 12 for a 1-year-old). The LLM compares observed pose and movement patterns against standard milestones for that age group.

uv run gait-assess --video ./baby.mp4 --mode developmental --age-months 12 --output ./results/

posture — Posture Correction Evaluation

Best for: static standing posture analysis.

Focuses on spinal alignment, shoulder height symmetry, and pelvic tilt. The subject should stand still facing the camera for 3–5 seconds.

uv run gait-assess --video ./baby_standing.mp4 --mode posture --output ./results/
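
The exact formulas live in pose_utils.py; as a sketch of the kind of metric involved, using the standard COCO-17 keypoint ordering that YOLOv8-pose emits (indices below are the COCO convention, not verified against this repo's code):

```python
import math
import numpy as np

# COCO-17 keypoint indices
L_SHOULDER, R_SHOULDER, L_HIP, R_HIP = 5, 6, 11, 12

def shoulder_height_diff(kpts: np.ndarray) -> float:
    """Vertical offset between the two shoulders, in pixels.
    `kpts` is a (17, 2) array of (x, y) keypoint coordinates."""
    return float(abs(kpts[L_SHOULDER, 1] - kpts[R_SHOULDER, 1]))

def pelvic_tilt_deg(kpts: np.ndarray) -> float:
    """Angle of the hip line relative to horizontal, in degrees."""
    dx = kpts[R_HIP, 0] - kpts[L_HIP, 0]
    dy = kpts[R_HIP, 1] - kpts[L_HIP, 1]
    return math.degrees(math.atan2(abs(dy), abs(dx)))
```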

Recording Tips

For the best assessment results, please follow these recording guidelines:

  • Lighting: choose a well-lit, evenly-lit environment; avoid backlight or strong shadows
  • Distance: place the phone/camera 2–3 meters from the baby, ensuring the whole body is in frame
  • Angle: camera height level with the baby's waist; a frontal or side view works best
  • Background: choose a simple background; avoid multiple people in the frame
  • Duration: record at least 5–10 seconds of continuous walking, with the baby taking 3–5+ steps
  • Clothing: avoid overly loose clothing; short sleeves/shorts are recommended for clear limb visibility

Understanding Your Report

The generated report.md contains three main sections:

Risk Badge — A color-coded indicator at the top:

  • 🟢 Normal — No significant concerns detected
  • 🟡 Mild — Minor deviations, monitor and re-assess in 2–4 weeks
  • 🟠 Moderate — Notable findings, consider a pediatric consultation
  • 🔴 Severe — Significant concerns, seek professional evaluation promptly

Findings — Specific observations from the video and pose analysis, such as asymmetrical arm swing, knee valgus/varus, or delayed heel strike.

Suggestions — Actionable recommendations tailored to the findings, such as targeted exercises, activity suggestions, or follow-up timelines.

⚠️ This tool is for reference only and does not constitute a medical diagnosis. If you have any concerns, please consult a professional pediatrician or rehabilitation therapist.

Developer Guide

Installation

# Install dependencies (including dev)
uv sync --extra dev

# Or install into the current environment
uv pip install -e ".[dev]"

YOLO model weights (.pt files) are stored in the models/ directory. They are downloaded automatically from Ultralytics Hub on first run (internet connection required). You can also download them manually and place them in models/.

Python API

In addition to the CLI, you can directly invoke the assessment pipeline in Python code.

Basic Usage

from pathlib import Path
from gait_assess import assess, AppConfig

config = AppConfig(
    video=Path("./baby.mp4"),
    output=Path("./results"),
    llm_api_key="your-api-key",
)

result = assess("./baby.mp4", config)

print(result["report_path"])       # Path('.../report.md')
print(result["video_path"])        # Path('.../annotated_video.mp4')
print(result["assessment"].risk_level)  # "正常" ("Normal")

Mode-Specific Functions

from gait_assess import assess_gait, assess_developmental, assess_posture

# Gait assessment
gait_result = assess_gait("./baby.mp4", config)

# Motor development screening (age_months can be omitted, inferred from pose)
dev_result = assess_developmental("./baby.mp4", config)

# Posture correction evaluation
posture_result = assess_posture("./baby.mp4", config)

Return Fields

assess() and mode-specific functions return a dictionary containing the following fields:

| Field | Type | Description |
| --- | --- | --- |
| report_path | Path | Markdown assessment report file path |
| video_path | Path | Annotated visualization video path |
| viewer_video_path | Path | Interactive viewer video path |
| viewer_data_path | Path | Per-frame JSON data path |
| viewer_html_path | Path \| None | viewer.html path |
| assessment | AssessmentResult | LLM assessment result (risk level, findings, suggestions) |
| gait_cycle | GaitCycle | Gait cycle analysis result |
| config | AppConfig | Runtime configuration object |
| frames | list[np.ndarray] | Preprocessed frame list |
| fps | float | Video frame rate |
| frame_results | list[FrameResult] | Per-frame pose detection/segmentation results |

Skip LLM (Offline Assessment)

result = assess("./baby.mp4", config, skip_llm=True)
# assessment.risk_level == "未知" ("Unknown")

Error Handling

from gait_assess.api import AssessmentError

try:
    result = assess("./baby.mp4", config)
except AssessmentError as e:
    print(f"Stage: {e.stage}")    # "preprocess" / "llm"
    print(f"Reason: {e.original}") # Original exception

Output Files

results/
├── report.md              # Markdown assessment report (with risk badges, finding/suggestion cards)
├── annotated_video.mp4    # Annotated visualization video
├── viewer.html            # Interactive report viewer (per-frame playback + key frame gallery)
└── key_frames/
    ├── frame_00.jpg       # Key phase frame images
    ├── frame_01.jpg
    ├── frame_02.jpg
    └── frame_03.jpg

The viewer.html file provides an interactive experience:

  • Per-frame playback with skeleton and segmentation overlay
  • Key frame gallery with phase labels
  • Risk level and finding/suggestion cards rendered with styled badges

Architecture

                             Data Pipeline

  Input Video
      │
      ▼
  ┌──────────────┐     ┌──────────────┐     ┌──────────────┐
  │ Preprocessor │────▶│ YOLOv8-pose  │────▶│ YOLOv8-seg   │
  │(frames, blur)│     │(17 keypoints)│     │(segmentation)│
  └──────┬───────┘     └──────┬───────┘     └──────┬───────┘
         │                    │                    │
         └────────────────────┼────────────────────┘
                              ▼
                    ┌───────────────────┐
                    │   Pose Metrics    │
                    │ (angles, symmetry,│
                    │  temporal tracks) │
                    └─────────┬─────────┘
                              │
                 ┌────────────┴────────────┐
                 ▼                         ▼
          ┌──────────────┐        ┌──────────────────┐
          │ Gait Analyzer│        │   LLM Assessor   │
          │(cycle detect,│        │ (video + pose    │
          │ key frames)  │        │  data dual input)│
          └──────┬───────┘        └────────┬─────────┘
                 │                         │
                 └────────────┬────────────┘
                              ▼
                    ┌────────────────────┐
                    │  Report Generator  │
                    │  + Visualizer      │
                    └─────────┬──────────┘
                              │
                    ┌─────────┴──────────┐
                    ▼                    ▼
              report.md        annotated_video.mp4
              viewer.html

Key design decisions:

  • Dual-channel LLM input: The LLM receives both the full video (base64 encoded, preserving temporal continuity) and structured pose data text (key frame coordinates, joint angles, temporal metrics). Video provides context; pose data provides precise coordinates. Video is more token-efficient than 8 separate key frame images.
  • Jinja2 prompt templates: Prompts are externalized as prompts/*.jinja.md files. Each assessment mode has its own template. Adding a new mode only requires a new template file — no code changes.
  • Infant-specific tuning: YOLO confidence threshold lowered from 0.5 to 0.3 (infants have smaller body frames). Only the largest detected person is retained per frame.
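
The dual-channel request can be sketched as one user message with two content parts. The video_url schema below follows DashScope's OpenAI-compatible mode; other providers may name the video part differently, and this sketch only builds the payload without calling any API:

```python
import base64

def build_dual_channel_messages(video_path: str, pose_summary: str) -> list[dict]:
    """Build an OpenAI-style chat payload carrying both channels:
    the full clip as a base64 data URL plus the structured pose text.
    (Content schema assumed from DashScope's OpenAI-compatible mode.)"""
    with open(video_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("ascii")
    return [
        {
            "role": "user",
            "content": [
                {"type": "video_url",
                 "video_url": {"url": f"data:video/mp4;base64,{b64}"}},
                {"type": "text", "text": pose_summary},
            ],
        }
    ]
```

The resulting list would be passed as `messages` to an OpenAI-compatible chat-completions client pointed at GAIT_LLM_BASE_URL.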

Project Structure

src/gait_assess/
  __init__.py          # Package entry
  api.py               # Programmatic API layer (assess() and mode-specific functions)
  cli.py               # CLI entry and pipeline orchestration
  models.py            # Pydantic data models
  preprocessor.py      # Video frame splitting, blur filtering, standardization
  pose_segmentor.py    # YOLO-pose + YOLO-seg inference
  gait_analyzer.py     # Gait cycle detection, key frame extraction
  pose_utils.py        # Joint angles, symmetry, temporal trajectory computation
  llm_assessor.py      # Multimodal LLM call, Jinja2 template rendering
  visualizer.py        # Skeleton/mask overlay, video encoding
  report_generator.py  # Markdown report generation
  prompts/             # Jinja2 prompt templates
    gait.jinja.md
    developmental.jinja.md
    posture.jinja.md
models/                # YOLO model weight files (*.pt, ignored by .gitignore)
tests/
  fixtures/            # Test videos
  test_*.py            # Unit tests for each module

Development

# Use Makefile
make help      # Show all commands
make install   # Install dependencies
make test      # Run tests
make typecheck # Run static type check
make lint      # Run code lint check (ruff)
make e2e       # End-to-end validation (skip LLM)
make e2e-full  # End-to-end validation (with LLM, requires API key)
make viewer    # Run e2e and automatically open viewer.html
make clean     # Clean results

# Or run directly
uv run pytest
uv run basedpyright src/

Full CLI Arguments

uv run gait-assess \
  --video ./baby_walking.mp4 \
  --output ./results/ \
  --mode gait \
  --age-months 12 \
  --conf-threshold 0.3 \
  --blur-threshold 100.0 \
  --target-height 720 \
  --min-duration 3.0 \
  --skip-llm

Disclaimer

This tool is for reference only and does not constitute a medical diagnosis. Assessment results are based on computer vision and large language model analysis and may contain errors. If you have concerns about your baby's walking posture, please consult a professional pediatrician or rehabilitation therapist.

License

MIT
