GitHub - deathlegionteamlk/DeepFaceReal-Physics: Real-time deepfake detection with full body tracking, physics simulation, parallax background, WhatsApp integration, and LLM character responses. Built by Death Legion Team.

Features • Comparison • Quick Start • Architecture • Engine Docs • API v2 • Windows EXE • Credits

🌟 Star on GitHub • 🐛 Report Bug • 📖 Read Docs

🎬 What Is This?

HeyGen costs $24–240/month and runs in the cloud. SadTalker needs a GPU and still only hits 10 FPS. DeepFaceReal-Physics runs on your CPU, is free, and ships everything: 3D face reconstruction, audio-driven head motion, Wav2Lip lip sync, natural eye gaze, conversational gestures, body physics — the full stack.

No subscription. No GPU required. No waiting on cloud queues.

✨ Features

Engine	What It Does
🎯 3D Face	468 MediaPipe landmarks, Delaunay triangulation, 6DoF head pose (pitch/yaw/roll/xyz), expression blendshapes
🗣️ Talking Head	MFCC/pitch/energy extraction, audio-to-head-pose mapping, natural nodding/tilting patterns
👄 Wav2Lip	Phoneme detection, lip shape prediction, temporal smoothing, real-time audio buffering
👁️ Eye & Gaze	Saccades every 200–300ms, natural blinks every 2–4s, gaze target tracking, pupil rendering
👐 Gestures	Speech-rhythm hand movement, shoulder/head micro-shifts, posture variation, 0.0–1.0 intensity knob
🔄 Pipeline	Async multi-stage queue, frame skipping, cached inference — all CPU-optimized
🧠 Body Physics	MediaPipe Holistic (543 landmarks), momentum/inertia, spring dynamics
🖼️ Parallax BG	3-layer depth parallax driven by head position, depth blur
📱 Mobile Camera	IP Webcam integration for Android phone as webcam
💬 Characters	OpenRouter LLM with personality system prompts
🖥️ UI	Streamlit v2 dashboard, HeyGen Mode preset, recording export
🔌 API	FastAPI v2 with REST + WebSocket endpoints
🪟 Windows EXE	PyInstaller standalone build
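
The "momentum/inertia, spring dynamics" entry for Body Physics can be pictured as a damped spring pulling a tracked point toward its latest landmark position. The following is a minimal sketch under that assumption, using semi-implicit Euler integration. The names `spring_step` and `follow`, and the stiffness and damping values, are illustrative and are not taken from `core/physics_engine.py`.

```python
def spring_step(pos, vel, target, dt, stiffness=120.0, damping=18.0):
    """Advance one damped-spring step toward target (semi-implicit Euler)."""
    # Spring force pulls toward the target; damping opposes velocity.
    accel = stiffness * (target - pos) - damping * vel
    vel = vel + accel * dt      # update velocity first ...
    pos = pos + vel * dt        # ... then move with the new velocity
    return pos, vel


def follow(target, steps=120, dt=1.0 / 30.0):
    """Simulate a point starting at rest at 0.0 chasing a fixed target."""
    pos, vel = 0.0, 0.0
    for _ in range(steps):
        pos, vel = spring_step(pos, vel, target, dt)
    return pos
```

With these hypothetical constants at 30 updates per second, the point eases into the target without overshooting, which is the kind of lag that reads as inertia on screen.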

📊 HeyGen vs SadTalker vs DeepFaceReal

Capability	HeyGen ($24–240/mo)	SadTalker	DeepFaceReal-Physics
3D Face Reconstruction	✅	❌	✅
Audio-Driven Head Motion	✅	✅	✅
Wav2Lip Lip Sync	✅	❌	✅
Natural Eye Gaze	✅	❌	✅
Conversational Gestures	✅	❌	✅
Real-Time ≥15 FPS	❌ Cloud	⚠️ 10 FPS GPU	✅ 15–20 FPS CPU
Face Swap	❌	❌	✅
LLM Character AI	⚠️ Limited	❌	✅
Self-Hosted	❌	✅	✅
Open Source	❌	✅	✅
Price	$24–240/month	Free	Free
WhatsApp Integration	❌	❌	✅
Windows EXE	N/A	Manual	✅
API + WebSocket	✅	❌	✅
GPU Required	✅	✅	❌ CPU only

🚀 Quick Start

Prerequisites: Python 3.10+, 4 GB RAM (8 GB recommended), webcam

Install & Run

# Clone
git clone https://github.com/deathlegionteamlk/DeepFaceReal-Physics.git
cd DeepFaceReal-Physics

# Install
pip install -r requirements.txt

# Start the UI (port 8080)
streamlit run app.py --server.port 8080

# In a second terminal — start the API (port 8081)
python api.py

Docker

docker build -t deepfacereal-physics .
docker run -p 8080:8080 -p 8081:8081 deepfacereal-physics

🏗️ Architecture

                    ┌──────────────────────────────┐
                    │        Input Sources          │
                    │  ┌──────┐ ┌────────┐ ┌─────┐ │
                    │  │ USB  │ │   IP   │ │Audio│ │
                    │  │ Cam  │ │ Webcam │ │File │ │
                    │  └──┬───┘ └───┬────┘ └──┬──┘ │
                    └─────┼─────────┼──────────┼───┘
                          │         │          │
                    ┌─────▼─────────▼──────────▼───┐
                    │    Audio Feature Extraction   │
                    │   (MFCC · Pitch · Energy · F0)│
                    └──────────────┬────────────────┘
                                   │
                    ┌──────────────▼────────────────┐
                    │         3D Face Engine         │
                    │  MediaPipe 468 landmarks       │
                    │  Delaunay Triangulation        │
                    │  6DoF Head Pose (solvePnP)     │
                    │  Expression Blendshapes        │
                    └──────────────┬────────────────┘
                                   │
                    ┌──────────────▼────────────────┐
                    │       Talking Head Engine      │
                    │  Audio → head pose             │
                    │  Audio → expression            │
                    │  Natural motion patterns       │
                    └──────────────┬────────────────┘
                                   │
                    ┌──────────────▼────────────────┐
                    │       Lip Sync (Wav2Lip)       │
                    │  Phoneme detection             │
                    │  Lip shape prediction          │
                    │  Wav2Lip inference             │
                    │  Temporal smoothing            │
                    └──────────────┬────────────────┘
                                   │
          ┌────────────────────────┼─────────────────────────┐
          │                        │                         │
 ┌────────▼────────┐    ┌──────────▼──────────┐   ┌─────────▼───────┐
 │  Eye & Gaze     │    │   Gesture Engine     │   │ Physics Engine  │
 │  Saccades       │    │   Hand gestures      │   │ Momentum        │
 │  Blinks         │    │   Shoulder/head      │   │ Spring dynamics │
 │  Gaze tracking  │    │   Posture shifts     │   │ Frame skipping  │
 │  Pupil render   │    │   Intensity config   │   │ Async queues    │
 └────────┬────────┘    └──────────┬──────────┘   └─────────┬───────┘
          │                        │                         │
          └────────────────────────┼─────────────────────────┘
                                   │
                    ┌──────────────▼────────────────┐
                    │      Composite & Render        │
                    │  Face swap · overlays          │
                    │  Background · enhancement      │
                    └──────────────┬────────────────┘
                                   │
                    ┌──────────────▼────────────────┐
                    │            Output              │
                    │  Streamlit :8080               │
                    │  FastAPI   :8081               │
                    │  Virtual Camera                │
                    └───────────────────────────────┘

🔧 Engine Docs

1. 3D Face Engine — `core/face_3d_engine.py`

468 MediaPipe landmarks → Delaunay triangulation → 3D mesh. 6DoF head pose via solvePnP. Expression blendshape extraction.

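from core.face_3d_engine import get_face_3d_engine  # assumed import path, based on the filename above

face_3d = get_face_3d_engine()
mesh = face_3d.process_frame(image)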
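
A 6DoF pose is usually reported as pitch/yaw/roll plus a translation, while solvePnP yields a rotation that first has to be decomposed. The sketch below shows one way to do that decomposition in plain Python. It assumes the convention R = Rz(roll) · Ry(yaw) · Rx(pitch); the engine may use a different axis order, and `euler_to_matrix` and `matrix_to_euler` are hypothetical helper names, not part of `core/face_3d_engine.py`.

```python
import math


def euler_to_matrix(pitch, yaw, roll):
    """Build R = Rz(roll) @ Ry(yaw) @ Rx(pitch); angles in radians."""
    cx, sx = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    cz, sz = math.cos(roll), math.sin(roll)
    return [
        [cy * cz, sx * sy * cz - cx * sz, cx * sy * cz + sx * sz],
        [cy * sz, sx * sy * sz + cx * cz, cx * sy * sz - sx * cz],
        [-sy, sx * cy, cx * cy],
    ]


def matrix_to_euler(R):
    """Recover (pitch, yaw, roll) in radians; valid while |yaw| < 90 degrees."""
    yaw = math.atan2(-R[2][0], math.hypot(R[0][0], R[1][0]))
    pitch = math.atan2(R[2][1], R[2][2])
    roll = math.atan2(R[1][0], R[0][0])
    return pitch, yaw, roll
```

With OpenCV, the 3×3 matrix would typically come from `cv2.Rodrigues(rvec)` applied to the rotation vector that `cv2.solvePnP` returns.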

2. Talking Head — `core/talking_head.py`

13 MFCC coefficients, F0 pitch, RMS energy → head pose prediction. Drives speech-synchronized nodding and tilt.

from core.talking_head import get_talking_head  # assumed import path, based on the filename above

talking_head = get_talking_head()
seq = talking_head.process_audio(audio_data, face_img)
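
Of the three audio features, RMS energy is the simplest to compute by hand. The sketch below computes it per window and maps it to a nod angle. The linear mapping, the 400-sample window (25 ms at an assumed 16 kHz sample rate), and the 6 degree ceiling are illustrative assumptions; the README does not document how `core/talking_head.py` maps energy to pose.

```python
import math


def rms_energy(samples):
    """Root-mean-square of one window of float samples in [-1, 1]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def energy_to_nod(samples, window=400, max_pitch_deg=6.0):
    """Map per-window RMS energy to a head pitch angle in degrees."""
    angles = []
    for start in range(0, len(samples) - window + 1, window):
        energy = rms_energy(samples[start:start + window])
        # A full-scale sine has RMS of about 0.707; clamp so loud input saturates.
        angles.append(max_pitch_deg * min(energy / 0.707, 1.0))
    return angles
```

Silence produces no nod and loud speech saturates at `max_pitch_deg`, so the head never swings beyond a fixed range.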

3. Wav2Lip Lip Sync — `core/lip_sync.py`

Phoneme detection → lip shape parameters → Wav2Lip inference on face region. EMA filter keeps transitions smooth.

from core.lip_sync import create_lip_sync  # assumed import path, based on the filename above

lip_sync = create_lip_sync()
frame = lip_sync.sync_frame(face_img, audio_chunk)
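
The EMA (exponential moving average) filter mentioned above blends each new lip parameter with the previous smoothed value. This is a generic sketch of that filter; the function name `ema_smooth` and the default `alpha` are assumptions, not values read from `core/lip_sync.py`.

```python
def ema_smooth(values, alpha=0.4):
    """Exponential moving average: y[t] = alpha*x[t] + (1-alpha)*y[t-1]."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    prev = None
    for x in values:
        # The first sample passes through unchanged; later ones are blended.
        prev = x if prev is None else alpha * x + (1.0 - alpha) * prev
        smoothed.append(prev)
    return smoothed
```

A smaller `alpha` gives smoother but laggier mouth motion; `alpha=1.0` disables smoothing entirely.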

4. Eye & Gaze Engine — `core/eye_engine.py`

Saccades every 200–300ms, micro-saccades during fixation, blinks every 2–4s (100–400ms duration). Configurable gaze targets.

from core.eye_engine import get_eye_engine  # assumed import path, based on the filename above

eye_engine = get_eye_engine()
state = eye_engine.update()
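
The timing ranges above can be turned into an event schedule by drawing each gap at random from the stated interval. The sketch below assumes uniformly distributed gaps, which the README does not specify; `schedule_events` is a hypothetical helper and is not the engine's API.

```python
import random


def schedule_events(duration_s, lo_s, hi_s, rng):
    """Return event start times with gaps drawn uniformly from [lo_s, hi_s]."""
    times = []
    t = rng.uniform(lo_s, hi_s)
    while t < duration_s:
        times.append(t)
        t += rng.uniform(lo_s, hi_s)
    return times


rng = random.Random(0)  # fixed seed so the sketch is repeatable
blinks = schedule_events(60.0, 2.0, 4.0, rng)    # blinks every 2-4 s
saccades = schedule_events(60.0, 0.2, 0.3, rng)  # saccades every 200-300 ms
```

Drawing the gap rather than the absolute time keeps events from clustering, which is what makes the motion look less mechanical than a fixed timer.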

5. Gesture Engine — `core/gesture_engine.py`

Hand patterns keyed to audio energy, shoulder/head micro-movements, posture variation. Intensity from 0.0 to 1.0.

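from core.gesture_engine import get_gesture_engine  # assumed import path, based on the filename above

gesture = get_gesture_engine()
params = gesture.process_gestures(energy)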

6. Real-Time Pipeline — `core/pipeline.py`

Async queue per stage. Frame skipping for CPU relief. Cached Wav2Lip results for repeated phonemes. Resolution management (downscale detect, upscale output).

from core.pipeline import get_realtime_pipeline  # assumed import path, based on the filename above

pipeline = get_realtime_pipeline()
pipeline.start()
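
"Async queue per stage" can be sketched with `asyncio`: each stage reads from its own inbox and writes to the next stage's inbox, so a slow stage does not block the others. This is an illustration only. The stage names, the `None` sentinel, and the queue size are assumptions, and `core/pipeline.py` may well use threads or a different mechanism.

```python
import asyncio


async def stage(name, inbox, outbox):
    """Pull items from inbox, tag them with this stage's name, pass them on."""
    while True:
        item = await inbox.get()
        if item is None:          # sentinel: shut down and propagate
            await outbox.put(None)
            return
        await outbox.put(item + [name])


async def run_pipeline(frame_ids, stage_names=("face", "lips", "render")):
    """Chain one bounded queue per stage and collect the results in order."""
    queues = [asyncio.Queue(maxsize=4) for _ in range(len(stage_names) + 1)]
    workers = [
        asyncio.create_task(stage(n, queues[i], queues[i + 1]))
        for i, n in enumerate(stage_names)
    ]
    results = []

    async def feed():
        for fid in frame_ids:
            await queues[0].put([fid])
        await queues[0].put(None)

    feeder = asyncio.create_task(feed())
    while (item := await queues[-1].get()) is not None:
        results.append(item)
    await asyncio.gather(feeder, *workers)
    return results
```

Calling `asyncio.run(run_pipeline([1, 2]))` returns each frame id followed by the stages it passed through. The bounded queues provide back-pressure: a fast producer waits instead of piling up frames.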

💻 Professional UI

The Streamlit UI (app.py) runs on port 8080.

Tab	What's Inside
🎯 Avatar Studio	Source photo upload, real-time preview, recording export
📱 Mobile	QR code for IP Webcam, auto-detect, camera source picker
👤 Characters	Gallery, creation, management with face data
💬 Chat	LLM character conversation with message history
🎬 Engines	Live status for all 6 engines, per-stage timing
⚙️ Settings	Engine toggles, sliders, quality controls, HeyGen Mode preset

HeyGen Mode

One click. Turns everything on at max quality:

✅ 3D Face · ✅ Talking Head · ✅ Wav2Lip · ✅ Eye Gaze · ✅ Gestures · ✅ Parallax BG · ✅ Physics · ✅ High Quality Enhancement

🔌 API v2

FastAPI (api.py) on port 8081. Auto-generated docs at /docs.

v2 Endpoints

Method	Path	What It Does
POST	`/generate/talking-head`	Generate talking head video from audio + face image
POST	`/animate/face`	Animate face with expression coefficients + head pose
WS	`/ws/realtime`	Real-time streaming with head pose + eye state
POST	`/config/render`	Configure any engine's render parameters

Legacy Endpoints

Method	Path	What It Does
GET	`/`	API info
GET	`/status`	System status with per-engine FPS
POST	`/swap`	Face swap on uploaded image
POST	`/chat`	Send message to character LLM
GET/POST/DELETE	`/characters`	Character CRUD
POST	`/characters/{name}/activate`	Activate character
POST/GET	`/physics/config`, `/physics/status`	Physics control
POST/GET	`/camera/source`, `/camera/status`	Camera control
WS	`/ws/chat`, `/ws/video`	Streaming chat + video

Talking Head — Example

curl -X POST http://localhost:8081/generate/talking-head \
  -H "Content-Type: application/json" \
  -d '{
    "audio_b64": "BASE64_ENCODED_WAV_AUDIO",
    "face_b64": "BASE64_ENCODED_FACE_IMAGE",
    "fps": 20
  }'
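
The same request can be assembled from Python with only the standard library. The sketch below builds the request without sending it. The field names `audio_b64`, `face_b64`, and `fps` come from the curl example above; the helper name `build_talking_head_request` is hypothetical, and the response format is not documented here, so it is not assumed.

```python
import base64
import json
import urllib.request


def build_talking_head_request(audio_bytes, face_bytes, fps=20,
                               base_url="http://localhost:8081"):
    """Build (but do not send) the POST request for /generate/talking-head."""
    payload = {
        "audio_b64": base64.b64encode(audio_bytes).decode("ascii"),
        "face_b64": base64.b64encode(face_bytes).decode("ascii"),
        "fps": fps,
    }
    return urllib.request.Request(
        base_url + "/generate/talking-head",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen` while the API is running on port 8081.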

Render Config — Example

curl -X POST http://localhost:8081/config/render \
  -H "Content-Type: application/json" \
  -d '{"engine": "eye", "config": {"blink_interval_min": 1.5, "blink_interval_max": 4.0}}'

🪟 Windows EXE

# On Windows
pip install pyinstaller
python build_exe.py

# Output: dist/DeepFaceReal.exe + DeepFaceReal_API.exe

The build bundles all core modules, InsightFace models (buffalo_l, inswapper_128), MediaPipe models, Wav2Lip models, OpenCV/NumPy/Pillow/Streamlit/FastAPI, and a launcher batch file.

📱 Mobile Integration

Android (IP Webcam)

1. Install IP Webcam from Google Play
2. Tap Start Server
3. Note the IP address and port shown (e.g. 192.168.1.100:8080)
4. In Streamlit → 📱 Mobile tab → enter the address or scan the QR code

Desktop Virtual Camera

sudo apt install v4l2loopback-dkms
sudo modprobe v4l2loopback devices=1 video_nr=10

Start the pipeline. The virtual camera then appears as a video device (`/dev/video10` with the `video_nr=10` option above). Select "DeepFakeCam" as the camera in WhatsApp Desktop, Zoom, or Meet.

🛠️ Project Structure

DeepFaceReal-Physics/
├── app.py                    # Streamlit UI v2 (port 8080)
├── api.py                    # FastAPI v2 (port 8081)
├── build_exe.py              # Windows EXE builder
├── start.sh                  # Launch both services
├── core/
│   ├── face_3d_engine.py     # 3D face reconstruction + pose
│   ├── talking_head.py       # Audio-driven talking head
│   ├── lip_sync.py           # Wav2Lip lip sync
│   ├── eye_engine.py         # Eye & gaze
│   ├── gesture_engine.py     # Conversational gestures
│   ├── pipeline.py           # Real-time async pipeline
│   ├── face_swapper.py       # InsightFace swap
│   ├── physics_engine.py     # MediaPipe Holistic + physics
│   ├── background_engine.py  # Parallax background
│   ├── webcam_pipeline.py    # Camera capture
│   ├── character_manager.py  # Character profiles
│   └── llm_character.py      # OpenRouter LLM
├── models/                   # Downloaded models
├── profiles/                 # Character profiles
└── static/                   # Static assets

📊 Performance

Engine	Target FPS	CPU Cores	Resolution
3D Face	30 FPS	1	640×480
Talking Head	30 FPS	1	Audio only
Lip Sync	20 FPS	1	Face region
Eye Gaze	60 FPS	0.5	Overlay
Gestures	30 FPS	0.5	Overlay
Pipeline Total	15–20 FPS	4	640×480

Every heavy stage processes every 2nd–3rd frame. Repeated phonemes hit the Wav2Lip cache. Async queues keep everything non-blocking.
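
Both optimizations are easy to illustrate in isolation. The sketch below reuses the last heavy result on skipped frames and memoizes a per-phoneme computation. `run_with_skipping` and `PhonemeCache` are hypothetical names; the real cache key and skip policy in `core/pipeline.py` are not documented here.

```python
def run_with_skipping(frames, heavy_fn, every=2):
    """Run heavy_fn on every `every`-th frame; reuse the last result between."""
    outputs, last = [], None
    for i, frame in enumerate(frames):
        if i % every == 0:      # frame 0 always runs, so `last` is always set
            last = heavy_fn(frame)
        outputs.append(last)
    return outputs


class PhonemeCache:
    """Memoize an expensive per-phoneme computation and count real calls."""

    def __init__(self, compute):
        self._compute = compute
        self._store = {}
        self.misses = 0

    def get(self, phoneme):
        if phoneme not in self._store:
            self.misses += 1
            self._store[phoneme] = self._compute(phoneme)
        return self._store[phoneme]
```

With `every=2` the heavy function runs on half the frames, and a repeated phoneme costs one dictionary lookup instead of a second inference.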

🤝 Contributing

1. Fork the repo
2. Create a feature branch (`git checkout -b feature/your-feature`)
3. Commit (`git commit -m 'Add your feature'`)
4. Push (`git push origin feature/your-feature`)
5. Open a Pull Request

📄 License

MIT License — see LICENSE for details.

🙏 Credits

Built on:

InsightFace • MediaPipe • Wav2Lip • ONNX Runtime • OpenRouter • Streamlit • FastAPI • SciPy • scikit-image

Inspired by SadTalker, Ditto, and HeyGen.

⭐ Star on GitHub

DeepFaceReal-Physics v2.0.0 · MIT License · DeathLegionTeamLK
