Transform AI-generated text into natural, human-like writing using DoRA fine-tuned Qwen3-0.6B with 32k context support.
HumanTouch solves the problem of robotic, repetitive AI-generated content by converting it into natural, flowing human-like text while preserving the original meaning and context.
Key Features:
- 🔧 DoRA Fine-tuning - Advanced weight decomposition (rank 128) for superior quality
- 📏 32k Context - Handle long documents without truncation
- 🎯 High Quality - No quantization or Flash Attention compromises
- 🚀 Multi-GPU Support - Scale training across 2-8+ GPUs with DeepSpeed
- ⚡ Easy to Use - Simple Python scripts for training and inference
- 📈 Linear Scaling - Near-linear speedup with multiple GPUs
The Problem: AI text often sounds robotic, repetitive, and lacks natural human flow.
The Solution: HumanTouch learns from roughly 500K human and AI text samples (about 181K balanced training pairs after processing) to add natural variation, improve flow, and maintain context coherence.
Why Multi-GPU?
- Faster Training: 2-4x speedup with 4 GPUs, 4-8x speedup with 8 GPUs
- Larger Models: Train bigger models that don't fit on a single GPU
- Better Quality: Larger effective batch sizes improve convergence
- Cost Efficiency: Reduce training time and cloud costs significantly
Automatic Optimizations:
- Learning Rate Scaling: Automatic sqrt scaling for stable multi-GPU training (sketched after this list)
- Gradient Accumulation: Smart adjustment to maintain effective batch size
- DeepSpeed Integration: ZeRO-2/3 optimization based on GPU count
- Communication Efficient: Optimized data transfers between GPUs
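To make the first two optimizations concrete, here is a minimal Python sketch of sqrt learning-rate scaling and the matching gradient-accumulation adjustment. The base learning rate (8e-5) and target effective batch size (16) are illustrative assumptions; train.py owns the real defaults:

import math

def scale_lr(base_lr: float, num_gpus: int) -> float:
    # Square-root LR scaling for multi-GPU data parallelism.
    return base_lr * math.sqrt(num_gpus)

def adjust_grad_accum(target_effective_batch: int, micro_batch: int, num_gpus: int) -> int:
    # Shrink accumulation as GPUs are added so that
    # micro_batch * num_gpus * accum stays constant.
    return max(1, target_effective_batch // (micro_batch * num_gpus))

print(f"{scale_lr(8e-5, 4):.2e}")   # 1.60e-04 on 4 GPUs, matching the training menu shown later
print(adjust_grad_accum(16, 1, 4))  # 4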
Single-GPU Setup:
- GPU: A100 80GB, H100, or RTX 4090 24GB+
- RAM: 32GB+ system memory
- Storage: 100GB+ free space
Multi-GPU Setup (2-8 GPUs):
- GPU: 2-8x A100 80GB, H100, RTX 4090 24GB+, or RTX A6000
- RAM: 64GB+ system memory (scales with GPU count)
- Storage: 200GB+ free space
- Network: High-speed interconnect (NVLink, InfiniBand) recommended for 4+ GPUs
- PSU: Adequate power supply for multiple high-end GPUs
- Python: 3.10 or higher
- CUDA: 11.8+ or 12.1+ (for GPU support)
- UV: Fast Python package manager
Windows:
# PowerShell (as Administrator)
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
# Or via pip
pip install uv
Linux/Mac:
# Via curl
curl -LsSf https://astral.sh/uv/install.sh | sh
# Or via pip
pip install uv
Verify installation:
uv --version
# Clone repository
git clone https://github.com/your-username/HumanTouch.git
cd HumanTouch
# Create virtual environment
uv venv
# Activate environment
# Windows:
.venv\Scripts\activate
# Linux/Mac:
source .venv/bin/activate
Install PyTorch (choose your platform):
# CUDA 11.8
uv pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
# CUDA 12.1
uv pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
# CPU only
uv pip install torch torchvision torchaudio
Install project dependencies:
uv pip install -r requirements.txt
Install DeepSpeed:
uv pip install deepspeed
Verify GPU setup:
python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"
python -c "import torch; print(f'GPU count: {torch.cuda.device_count()}')"
# Windows
mkdir data\raw data\processed data\evaluation models logs
# Linux/Mac
mkdir -p data/{raw,processed,evaluation} models logs
- Go to Kaggle: https://www.kaggle.com/datasets/shanegerami/ai-vs-human-text
- Download the dataset (CSV file)
- Place the CSV file in the data/raw/ folder
- Rename it to AI_Human_Text.csv (if needed)
Alternative - Kaggle API:
# Install Kaggle API
uv pip install kaggle
# Setup API key (see Kaggle docs)
kaggle datasets download -d shanegerami/ai-vs-human-text -p data/raw/
Basic processing:
python data_processing.py --input data/raw/AI_Human_Text.csv --output data/processed
With optional zip package creation:
python data_processing.py --input data/raw/AI_Human_Text.csv --output data/processed --create_zip
Expected output:
================================================================================
HumanTouch Data Processing
================================================================================
Input file: data/raw/AI_Human_Text.csv
Output directory: data/processed
Max sequence length: 32768
================================================================================
✓ Dataset loaded successfully!
Shape: (487235, 2)
Columns: ['text', 'generated']
Generated distribution: {0.0: 305797, 1.0: 181438}
Filtering texts by length...
Original samples: 487,231
After length filtering: 487,182
Average text length: 2263 characters
Average estimated tokens: 566
Balancing dataset...
AI texts: 181,419
Human texts: 305,763
Balanced dataset: 362,838 samples
Final AI texts: 181,419
Final human texts: 181,419
Creating training pairs...
AI texts available: 181,419
Human texts available: 181,419
Creating 181,419 training pairs...
Creating pairs: 100%|██████████| 181419/181419 [00:01<00:00, 109313.90it/s]
✓ Created 181,419 training pairs
Sample processed data (showing 2 examples):
================================================================================
Sample 1:
AI Text: The artificial intelligence system processed...
Human Text: Our AI tool crunched through the data...
Full formatted text length: 5832 characters
Splitting and saving dataset to data/processed...
Train samples: 145,135
Validation samples: 18,142
Test samples: 18,142
Saving JSON files...
Creating Hugging Face dataset...
✓ Saved Hugging Face dataset to data/processed/hf_dataset
✓ Dataset saved to data/processed
Files created:
- train.json, validation.json, test.json
- hf_dataset/ (Hugging Face format)
- dataset_stats.json
🎉 DATA PROCESSING COMPLETE!
================================================================================
✓ Processed 181,419 training pairs
✓ Train: 145,135 samples
✓ Validation: 18,142 samples
✓ Test: 18,142 samples
✓ Files saved to data/processed/
✓ Max sequence length: 32768
================================================================================
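Before moving on to training, you can sanity-check the processed dataset from Python. This assumes data_processing.py saved a standard Hugging Face DatasetDict with train/validation/test splits, as the log above suggests:

from datasets import load_from_disk

ds = load_from_disk("data/processed/hf_dataset")
print(ds)              # expect train/validation/test splits with the counts shown above
print(ds["train"][0])  # inspect one formatted training pair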
🎯 Single-GPU Training (Interactive Mode Selection):
python train.py --dataset_path data/processed/hf_dataset --output_dir models/humantouch
🚀 Multi-GPU Training with DeepSpeed:
# 2 GPUs
deepspeed --num_gpus=2 train.py --dataset_path data/processed/hf_dataset --output_dir models/humantouch --mode full
# 4 GPUs
deepspeed --num_gpus=4 train.py --dataset_path data/processed/hf_dataset --output_dir models/humantouch --mode full
# 8 GPUs (for high-end setups)
deepspeed --num_gpus=8 train.py --dataset_path data/processed/hf_dataset --output_dir models/humantouch --mode full
# Interactive mode selection with multi-GPU
deepspeed --num_gpus=4 train.py --dataset_path data/processed/hf_dataset --output_dir models/humantouch
Training Mode Options: Both single-GPU and multi-GPU training offer 3 optimized configurations:
🎯 CHOOSE TRAINING MODE
================================================================================
🚀 Multi-GPU Mode: 4 GPUs Detected
================================================================================
1. Basic Training
- Fast training for testing (2 epochs, small dataset) (Multi-GPU: 4 GPUs)
- Model: Qwen/Qwen2.5-0.5B
- DoRA Rank: 32
- Max Length: 2048 tokens
- Epochs: 2
- Dataset Size: 5,000 train, 500 val
- Learning Rate (scaled): 1.00e-04
- Grad Accumulation (adjusted): 2
2. Full Training
- Maximum quality training (6 epochs, large dataset) (Multi-GPU: 4 GPUs)
- Model: Qwen/Qwen3-0.6B-Base
- DoRA Rank: 128
- Max Length: 8192 tokens
- Epochs: 6
- Dataset Size: 50,000 train, 5,000 val
- Learning Rate (scaled): 1.60e-04
- Grad Accumulation (adjusted): 4
3. Smaller Gpu Training
- Balanced training for smaller GPUs (4 epochs, medium dataset) (Multi-GPU: 4 GPUs)
- Model: Qwen/Qwen2.5-0.5B
- DoRA Rank: 64
- Max Length: 4096 tokens
- Epochs: 4
- Dataset Size: 20,000 train, 2,000 val
- Learning Rate (scaled): 1.60e-04
- Grad Accumulation (adjusted): 4
Enter your choice (1-3):
📝 Direct Mode Specification:
Single-GPU:
# Basic training (fast, for testing)
python train.py --dataset_path data/processed/hf_dataset --output_dir models/humantouch --mode basic
# Full training (maximum quality)
python train.py --dataset_path data/processed/hf_dataset --output_dir models/humantouch --mode full
# Smaller GPU training (balanced)
python train.py --dataset_path data/processed/hf_dataset --output_dir models/humantouch --mode smaller_gpu
Multi-GPU with DeepSpeed:
# Basic training (2 GPUs)
deepspeed --num_gpus=2 train.py --dataset_path data/processed/hf_dataset --output_dir models/humantouch --mode basic
# Full training (4 GPUs)
deepspeed --num_gpus=4 train.py --dataset_path data/processed/hf_dataset --output_dir models/humantouch --mode full
# Smaller GPU training (2 GPUs)
deepspeed --num_gpus=2 train.py --dataset_path data/processed/hf_dataset --output_dir models/humantouch --mode smaller_gpu
⚙️ Advanced Options with Overrides:
Single-GPU:
# Custom configuration with testing
python train.py \
--dataset_path data/processed/hf_dataset \
--output_dir models/humantouch \
--mode full \
--epochs 8 \
--rank 256 \
--test_model \
--run_name "custom-experiment"
# Training with specific model
python train.py \
--dataset_path data/processed/hf_dataset \
--output_dir models/humantouch \
--mode smaller_gpu \
--model_name "Qwen/Qwen2.5-1.5B" \
--max_length 8192
Multi-GPU:
# Custom multi-GPU configuration
deepspeed --num_gpus=4 train.py \
--dataset_path data/processed/hf_dataset \
--output_dir models/humantouch \
--mode full \
--epochs 8 \
--rank 256 \
--test_model \
--run_name "multi-gpu-experiment"
# High-end 8-GPU setup
deepspeed --num_gpus=8 train.py \
--dataset_path data/processed/hf_dataset \
--output_dir models/humantouch \
--mode full \
--model_name "Qwen/Qwen2.5-1.5B" \
--max_length 8192
🚀 Training Features:
- Multi-GPU Support: Automatic multi-GPU detection and DeepSpeed integration (2-8+ GPUs)
- GPU Detection: Automatic GPU memory and capability detection
- Learning Rate Scaling: Automatic sqrt scaling for multi-GPU setups
- Gradient Accumulation: Smart adjustment to maintain effective batch size
- Progress Monitoring: Real-time training progress with WandB integration
- Memory Optimization: Automatic DeepSpeed configuration based on mode and GPU count (see the ZeRO-2 sketch after this list)
- Communication Optimization: Efficient data transfers for multi-GPU training
- Error Handling: Comprehensive error recovery and debugging
- Model Testing: Built-in model testing after training completion
- Configuration Saving: All training settings saved for reproducibility
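To make the DeepSpeed point concrete, a ZeRO-2 configuration of the kind train.py might generate for a 2-4 GPU run could look like the dict below. All values are illustrative assumptions, not the script's actual output:

# Illustrative ZeRO-2 config; the real one is generated by train.py.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 4,  # adjusted to hold the effective batch size
    "bf16": {"enabled": True},
    "gradient_clipping": 1.0,
    "zero_optimization": {
        "stage": 2,                    # ZeRO-2: shard optimizer state and gradients
        "overlap_comm": True,          # overlap all-reduce with the backward pass
        "contiguous_gradients": True,
    },
}
# A dict like this can be passed straight to transformers.TrainingArguments(deepspeed=ds_config).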
⏱️ Expected Training Times:
Single-GPU:
- Basic Mode: 30-60 minutes (testing/prototyping)
- Smaller GPU Mode: 8-15 hours (RTX 4090, etc.)
- Full Mode: 15-25 hours (A100/H100)
Multi-GPU (with linear scaling):
- Basic Mode: 15-30 minutes (2-4 GPUs)
- Smaller GPU Mode: 2-8 hours (2-4 GPUs)
- Full Mode: 4-12 hours (2-4 GPUs), 2-6 hours (4-8 GPUs)
💡 Training Tips:
Single-GPU:
# Monitor GPU usage
nvidia-smi
# Training with custom WandB project
python train.py --dataset_path data/processed --output_dir models/humantouch --mode full --run_name "production-v1"
# Disable WandB logging
python train.py --dataset_path data/processed --output_dir models/humantouch --mode basic --no_wandb
# Test model after training
python train.py --dataset_path data/processed --output_dir models/humantouch --mode basic --test_model
Multi-GPU:
# Monitor all GPUs
watch -n 1 nvidia-smi
# Multi-GPU training with WandB
deepspeed --num_gpus=4 train.py --dataset_path data/processed --output_dir models/humantouch --mode full --run_name "multi-gpu-v1"
# Check DeepSpeed installation
python -c "import deepspeed; print(f'DeepSpeed version: {deepspeed.__version__}')"
# Multi-GPU test after training
deepspeed --num_gpus=2 train.py --dataset_path data/processed --output_dir models/humantouch --mode basic --test_model
Built-in model testing (during training):
# Test model automatically after training
python train.py --dataset_path data/processed --output_dir models/humantouch --mode basic --test_model
Comprehensive evaluation:
# Full evaluation suite
python evaluate.py \
--model_path models/humantouch \
--test_dataset data/processed/hf_dataset \
--max_samples 500 \
--output_dir evaluation_results
# Quick single text test
python evaluate.py \
--model_path models/humantouch \
--quick_test \
--text "The artificial intelligence system processed the data efficiently and generated comprehensive analytical reports."
Expected test output:
============================================================
MODEL TEST RESULTS
============================================================
Input: The artificial intelligence system processed the data efficiently...
Output: Our AI tool crunched through the data pretty efficiently and came up with some detailed analysis reports.
============================================================
Interactive mode:
python inference.py --model_path models/humantouch --interactive
Single text:
python inference.py \
--model_path models/humantouch \
--text "Artificial intelligence algorithms demonstrate significant capabilities in processing and analyzing large datasets."
Batch processing:
# Create input file
echo '["Text 1 to humanize", "Text 2 to humanize"]' > input_texts.json
# Process batch
python inference.py \
--model_path models/humantouch \
--input_file input_texts.json \
--output_file humanized_results.json
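If you prefer to build the input file from Python, which sidesteps shell-quoting issues, a minimal equivalent is:

import json

texts = [
    "Text 1 to humanize",
    "Text 2 to humanize",
]
with open("input_texts.json", "w", encoding="utf-8") as f:
    json.dump(texts, f, ensure_ascii=False, indent=2)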
HumanTouch/
├── data_processing.py       # Enhanced dataset preparation with progress tracking
├── train.py                 # Interactive DoRA training with 3 modes
├── evaluate.py              # Model evaluation script
├── inference.py             # Text humanization script
├── requirements.txt         # Python dependencies
├── configs/
│   └── dora_config.yaml     # DoRA configuration settings
├── colab_ready/             # Google Colab versions
│   ├── data_processing_colab.py
│   └── train_colab.py
├── data/
│   ├── raw/                 # Original Kaggle dataset
│   ├── processed/           # Processed training data
│   └── evaluation/          # Evaluation results
├── models/                  # Trained model checkpoints
└── logs/                    # Training logs
The new interactive training system provides three optimized configurations:
# Interactive mode selection (recommended)
python train.py --dataset_path data/processed --output_dir models/humantouch
# Direct mode selection
python train.py --dataset_path data/processed --output_dir models/humantouch --mode [basic|full|smaller_gpu]
Mode Comparison:
| Mode | Model | Rank | Max Length (tokens) | Epochs | Time | Quality |
|---|---|---|---|---|---|---|
| Basic | Qwen2.5-0.5B | 32 | 2048 | 2 | 30-60 min | Testing |
| Smaller GPU | Qwen2.5-0.5B | 64 | 4096 | 4 | 8-15 hrs | Good |
| Full | Qwen3-0.6B | 128 | 8192 | 6 | 15-25 hrs | Best |
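The ranks in this table correspond to PEFT's DoRA support (a LoraConfig with use_dora=True). Below is a minimal sketch of the full-mode setup; lora_alpha, dropout, and target_modules are assumptions here, with the real values defined in configs/dora_config.yaml:

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B-Base")
dora_cfg = LoraConfig(
    r=128,                    # "Full" mode rank from the table above
    lora_alpha=256,           # assumed 2x rank
    use_dora=True,            # weight-decomposed low-rank adaptation (DoRA)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
    lora_dropout=0.05,        # assumed
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, dora_cfg)
model.print_trainable_parameters()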
Custom parameter overrides:
# Override specific parameters
python train.py \
--dataset_path data/processed \
--output_dir models/humantouch \
--mode full \
--rank 256 \
--epochs 8 \
--max_length 16384 \
--model_name "Qwen/Qwen2.5-1.5B"
# Memory-optimized training
python train.py \
--dataset_path data/processed \
--output_dir models/humantouch \
--mode smaller_gpu \
--max_length 2048 \
--rank 32
# High quality (slower)
python inference.py --model_path models/humantouch --max_tokens 2048 --temperature 0.7
# Faster generation
python inference.py --model_path models/humantouch --max_tokens 1024 --temperature 0.8
# More creative
python inference.py --model_path models/humantouch --temperature 0.9
# More conservative
python inference.py --model_path models/humantouch --temperature 0.5
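These flags map onto standard Hugging Face generation arguments. A rough end-to-end sketch follows; the prompt format is a placeholder, since inference.py defines the real template:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B-Base")
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-0.6B-Base", torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, "models/humantouch")

inputs = tok("Humanize this text: <your AI text here>", return_tensors="pt").to(model.device)
out = model.generate(
    **inputs,
    max_new_tokens=1024,  # --max_tokens
    temperature=0.7,      # --temperature
    do_sample=True,       # temperature only takes effect when sampling
)
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))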
Single-GPU - CUDA out of memory during training:
# Solution 1: Use Basic mode for testing
python train.py --dataset_path data/processed --output_dir models/humantouch --mode basic
# Solution 2: Reduce parameters manually
python train.py --dataset_path data/processed --output_dir models/humantouch --mode smaller_gpu --max_length 2048 --rank 32
# Solution 3: Check GPU memory
python -c "import torch; print(f'GPU Memory: {torch.cuda.get_device_properties(0).total_memory/1e9:.1f} GB')"
Multi-GPU - Memory or communication issues:
# Solution 1: Check all GPUs
nvidia-smi
# Solution 2: Test with fewer GPUs
deepspeed --num_gpus=2 train.py --dataset_path data/processed --output_dir models/humantouch --mode basic
# Solution 3: Check GPU memory across all devices
python -c "import torch; [print(f'GPU {i}: {torch.cuda.get_device_properties(i).total_memory/1e9:.1f} GB') for i in range(torch.cuda.device_count())]"
# Solution 4: Use smaller model for multi-GPU
deepspeed --num_gpus=4 train.py --dataset_path data/processed --output_dir models/humantouch --mode smaller_gpu
Single-GPU - Training fails or is very slow:
# Check system status
nvidia-smi
python -c "import torch; print(f'CUDA: {torch.cuda.is_available()}, Device: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else "None"}')"
# Use Basic mode for quick testing
python train.py --dataset_path data/processed --output_dir models/test --mode basic --test_model
# Check DeepSpeed installation
python -c "import deepspeed; print('DeepSpeed OK')"
Multi-GPU - Training fails or communication issues:
# Check DeepSpeed installation with multi-GPU support
python -c "import deepspeed; print(f'DeepSpeed version: {deepspeed.__version__}')"
# Test basic multi-GPU functionality
deepspeed --num_gpus=2 train.py --dataset_path data/processed --output_dir models/test --mode basic
# Check GPU topology and communication
nvidia-smi topo -m
# Verify all GPUs are visible
python -c "import torch; print(f'Total GPUs: {torch.cuda.device_count()}')"
# Run with debugging enabled
NCCL_DEBUG=INFO deepspeed --num_gpus=2 train.py --dataset_path data/processed --output_dir models/test --mode basic
Problem: Dataset processing fails
# Check input file
head -5 data/raw/AI_Human_Text.csv
# Process with debugging
python data_processing.py --input data/raw/AI_Human_Text.csv --output data/processed --create_zip
# Verify output
ls -la data/processed/
head -2 data/processed/train.json
Problem: Generated text quality needs improvement
# Use Full mode for best quality
python train.py --dataset_path data/processed --output_dir models/humantouch --mode full
# Increase training with custom parameters
python train.py --dataset_path data/processed --output_dir models/humantouch --mode full --epochs 8 --rank 256
# Test different models
python train.py --dataset_path data/processed --output_dir models/humantouch --mode full --model_name "Qwen/Qwen2.5-1.5B"
Problem: Package installation fails
# Update UV
uv self update
# Clear cache
uv cache clean
# Try pip instead
pip install -r requirements.txt
# Check Python version
python --version # Should be 3.10+
Problem: Cannot load trained model
# Verify model files exist
# Windows: dir models\humantouch
# Linux/Mac: ls -la models/humantouch/
# Check PEFT installation
python -c "from peft import PeftModel; print('PEFT OK')"
# Test base model loading
python -c "from transformers import AutoModelForCausalLM; AutoModelForCausalLM.from_pretrained('Qwen/Qwen3-0.6B-Base')"
Expected training metrics:
- Final Loss: < 1.5
- Eval Loss: < 2.0
- Training Time: 48-120 hours
- Model Size: ~2-5 GB
Expected quality metrics:
- ROUGE-L F1: 0.55-0.65
- BLEU Score: 0.35-0.45
- Human Rating: 8/10+
- Perplexity: 2.0-3.0
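Scores in these ranges can be reproduced with the Hugging Face evaluate library. The snippet below is a generic sketch, not the exact code in evaluate.py:

import evaluate

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")

preds = ["Our AI tool crunched through the data pretty efficiently."]
refs = ["Our AI tool worked through the data quite efficiently."]

print(rouge.compute(predictions=preds, references=refs)["rougeL"])
print(bleu.compute(predictions=preds, references=[[r] for r in refs])["bleu"])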
Input: "The artificial intelligence system processed the data efficiently and generated comprehensive analytical reports."
Output: "Our AI tool crunched through the data pretty efficiently and came up with some detailed analysis reports."
- Content Creation: Humanize AI-generated articles, blogs, marketing copy
- Academic Writing: Improve AI-assisted research papers and reports
- Documentation: Make technical documentation more readable
- Creative Writing: Add human touch to AI-generated stories
- Business: Natural-sounding AI communication and reports
For Google Colab users, we provide specialized scripts:
# Use the Colab-ready scripts
colab_ready/
├── data_processing_colab.py   # Auto-installs packages, Drive integration
└── train_colab.py             # Memory-optimized for Colab
Colab Features:
- 📦 Auto-installation of all dependencies
- 💾 Google Drive integration for dataset storage
- 🔧 Memory optimization for free/pro Colab
- 📊 Progress tracking with download packages
- ⚙️ Same 3 training modes as local version
Input CSV Format:
- text: The text content
- generated: 1 for AI text, 0 for human text
- prompt: (optional) prompt used to generate the text
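A quick way to confirm your CSV matches this schema before running data_processing.py:

import pandas as pd

df = pd.read_csv("data/raw/AI_Human_Text.csv")
print(df.columns.tolist())             # expect ['text', 'generated'] (plus optional 'prompt')
print(df["generated"].value_counts())  # 1.0 = AI-generated, 0.0 = human-written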
Output Structure:
data/processed/
├── train.json                        # Training data (80%)
├── validation.json                   # Validation data (10%)
├── test.json                         # Test data (10%)
├── dataset_stats.json                # Dataset statistics
├── hf_dataset/                       # HuggingFace format
└── humantouch_processed_data.zip     # Complete package
# Setup WandB (optional)
uv pip install wandb
wandb login
# Training automatically logs to WandB
python train.py --dataset_path data/processed --output_dir models/humantouch --mode full
# Disable WandB if needed
python train.py --dataset_path data/processed --output_dir models/humantouch --mode full --no_wandb
# Custom run names
python train.py --dataset_path data/processed --output_dir models/humantouch --mode full --run_name "production-v1"
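Under the hood this presumably reduces to a standard wandb.init call; if you want to log extra metrics alongside training, the same API applies (the project name here is hypothetical):

import wandb

run = wandb.init(project="humantouch", name="production-v1")  # project name is an assumption
run.log({"eval/rougeL": 0.61})                                # example custom metric
run.finish()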
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Make changes and test them
- Format code (optional): black . && isort .
- Commit: git commit -m 'Add amazing feature'
- Push: git push origin feature/amazing-feature
- Open a pull request
MIT License - see LICENSE file for details.
- Qwen Team - Qwen3 base model
- NVIDIA Research - DoRA methodology
- Hugging Face - PEFT and Transformers
- Microsoft - DeepSpeed optimization
- Kaggle - AI vs Human text dataset
- Issues: GitHub Issues for bug reports
- Questions: GitHub Discussions for help
- Documentation: This README + code comments
# 1. Setup environment
git clone https://github.com/your-username/HumanTouch.git
cd HumanTouch
uv venv && source .venv/bin/activate # or .venv\Scripts\activate on Windows
uv pip install -r requirements.txt
# 2. Process your dataset
python data_processing.py --input data/raw/AI_Human_Text.csv --output data/processed
# 3. Train with interactive mode selection
python train.py --dataset_path data/processed --output_dir models/humantouch
# 4. Test your model
python train.py --dataset_path data/processed --output_dir models/humantouch --mode basic --test_model
🚀 Ready to start? The enhanced interactive system will guide you through the entire process!