A comprehensive single-nucleus RNA sequencing analysis pipeline designed for computational biologists and researchers. Built with Seurat, this tool provides both interactive web-based analysis and high-throughput command-line processing for multi-sample studies.
# Clone and setup
git clone https://github.com/VibhavSetlur/scRNA-seq_Data_Validation
cd scRNA-seq_Data_Validation
./launch.sh setup
# Launch web interface
./launch.sh shiny
Access: http://localhost:3838
This pipeline streamlines single-nucleus RNA-seq analysis with:
- Multi-sample processing with parallel computing
- Sample-specific configurations for customized analysis
- Interactive web interface for exploratory analysis
- Command-line tools for batch processing
- Comprehensive QC and visualization
- Publication-ready outputs
Pipeline workflow:
- Data Input: H5 files (10X Genomics) or RDS files (Seurat objects)
- Quality Control: Filtering based on features, counts, and mitochondrial content
- Processing: Normalization, variable feature selection, scaling
- Dimensionality Reduction: PCA and UMAP
- Clustering: Multiple algorithms (Leiden, Louvain, etc.)
- Marker Identification: Differential expression analysis
- Visualization: Automated plot generation
Supported input formats:
- H5 files: 10X Genomics Cell Ranger output
- RDS files: Pre-processed Seurat objects
- Multiple samples: Parallel processing with sample-specific parameters
Data requirements:
- Single-nucleus RNA-seq count matrices
- Gene expression data with cell barcodes
- Optional: souporcell output for doublet detection
Process multiple samples simultaneously with:
- Parallel computing for efficiency
- Sample-specific parameter customization
- Automated sample comparisons
- Integrated results visualization
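As a rough sketch of how a multi-sample run could be driven from a folder of inputs, the function below collects every `.h5` file in a directory and prints the corresponding command. The script path and flags are taken from the command-line examples in this README; the function name `build_multi_sample_cmd` and the assumption that all inputs sit in one directory (with no spaces in their paths) are illustrative only.

```shell
# Sketch: assemble the multi-sample command for all .h5 files in a directory.
# Flags mirror the README's command-line example; the helper name and the
# flat directory layout are assumptions, not part of the pipeline.
build_multi_sample_cmd() {
  data_dir=$1
  project=$2
  cores=$3
  inputs=""
  for f in "$data_dir"/*.h5; do
    [ -e "$f" ] || continue        # skip the literal pattern when nothing matches
    inputs="$inputs $f"
  done
  if [ -z "$inputs" ]; then
    echo "no .h5 files found in $data_dir" >&2
    return 1
  fi
  # Print the command rather than running it, so it can be reviewed first.
  echo "Rscript scripts/run_multi_sample_pipeline.R --h5_inputs$inputs --project_name $project --parallel --n_cores $cores"
}
```

Calling `build_multi_sample_cmd data MultiProject 2` prints the command for review; piping the output to `sh` would execute it.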
Comprehensive filtering options:
- Feature count thresholds
- UMI count filtering
- Mitochondrial content filtering
- Cell and gene quality metrics
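To make the threshold logic concrete, here is a minimal sketch of cell filtering applied to a per-cell metrics table. It is not the pipeline's implementation (which runs inside Seurat); the CSV layout `barcode,n_features,n_counts,percent_mt` and the function name `filter_cells` are hypothetical.

```shell
# Sketch of threshold-based cell filtering on a per-cell metrics CSV.
# Assumed (hypothetical) columns: barcode,n_features,n_counts,percent_mt
# Reads the table on stdin and prints the header plus the cells that pass.
filter_cells() {
  min_features=$1
  min_counts=$2
  max_mt=$3
  awk -F, -v minf="$min_features" -v minc="$min_counts" -v maxmt="$max_mt" '
    NR == 1 { print; next }                    # keep the header row
    $2 >= minf && $3 >= minc && $4 <= maxmt    # print rows passing every cut
  '
}
```

For example, `filter_cells 200 500 20 < metrics.csv` keeps cells with at least 200 detected features, at least 500 UMIs, and at most 20% mitochondrial reads. The 200 and 20 values match the sample configuration shown in this README; the UMI cutoff is an arbitrary illustration.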
- Multiple clustering algorithms
- Differential expression analysis
- Dimensionality reduction
- Marker gene identification
./launch.sh shiny
- Interactive parameter adjustment
- Real-time visualization
- Sample preview and validation
- Progress monitoring
# Single sample
Rscript scripts/run_pipeline_terminal.R \
--h5_input data/sample.h5 \
--project_name MyProject
# Multiple samples
Rscript scripts/run_multi_sample_pipeline.R \
--h5_inputs data/sample1.h5 data/sample2.h5 \
--project_name MultiProject \
--parallel \
--n_cores 2

./launch.sh deploy

ProjectName_outputs/
├── individual_samples/
│   └── sample_name/
│       ├── QC plots and summaries
│       ├── Processing results
│       ├── Clustering analysis
│       └── Marker gene tables
├── comparisons/
│   ├── Sample comparisons
│   └── Cross-sample analyses
├── combined_analysis/
└── reports/
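After a run, a quick check that the top-level folders from the layout above exist can catch a failed or partial run early. The sketch below assumes only the four directory names shown in the tree; the helper name `check_outputs` is made up for illustration.

```shell
# Sketch: verify that the expected top-level output folders exist.
# Directory names come from the output tree in this README; the helper
# itself is illustrative and not shipped with the pipeline.
check_outputs() {
  out_dir=$1
  missing=0
  for sub in individual_samples comparisons combined_analysis reports; do
    if [ ! -d "$out_dir/$sub" ]; then
      echo "missing: $out_dir/$sub" >&2
      missing=$((missing + 1))
    fi
  done
  [ "$missing" -eq 0 ]             # succeed only when nothing is missing
}
```

`check_outputs MultiProject_outputs && echo "layout looks complete"` reports any missing folder on stderr and returns a non-zero status if one is absent.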
Create YAML files to customize analysis for individual samples:
samples:
  Control_1:
    qc:
      min_features: 200
      max_mt_percent: 20
    clustering:
      resolution: 0.5
      algorithm: "leiden"

See docs/sample_configuration.md for detailed examples.
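When many samples share the same starting parameters, the configuration can be generated instead of typed by hand. This sketch emits one block per sample name using the keys and values from the example above. The function name `write_sample_config` and the sample name `Treated_1` are placeholders, and how the resulting file is passed to the pipeline is documented in docs/sample_configuration.md, not here.

```shell
# Sketch: print a samples config with identical defaults for each sample name.
# Keys and values mirror the README's YAML example; edit per sample afterwards.
write_sample_config() {
  echo "samples:"
  for sample in "$@"; do
    cat <<EOF
  $sample:
    qc:
      min_features: 200
      max_mt_percent: 20
    clustering:
      resolution: 0.5
      algorithm: "leiden"
EOF
  done
}
```

For instance, `write_sample_config Control_1 Treated_1 > samples.yaml` writes a two-sample file whose thresholds can then be tuned individually.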
- R 4.0+ with Bioconductor
- 8GB+ RAM recommended
- Multi-core CPU for parallel processing
./launch.sh setup
See docs/installation.md for detailed instructions.
- Installation Guide: Detailed setup instructions
- User Guide: Step-by-step analysis workflow
- Sample Configuration: Customizing parameters
- Advanced Usage: Advanced features and options
- Troubleshooting: Common issues and solutions
- API Reference: Function documentation
- Memory errors: Increase R_MAX_MEM_SIZE
- Port conflicts: Use SHINY_PORT=8080 ./launch.sh shiny
- Parallel processing: Reduce the --n_cores parameter
- Check docs/troubleshooting.md
- Review logs: ./scripts/deployment/deploy.sh logs
- Open an issue on GitHub
- Small datasets (<10K cells): 2-4 cores
- Medium datasets (10K-100K cells): 4-8 cores
- Large datasets (>100K cells): 8+ cores, 16GB+ RAM
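The size tiers above can be turned into a small lookup for scripting. The sketch below simply encodes the three ranges from the list; treating exactly 10K and 100K cells as "medium" is an assumption, and `suggest_cores` is an illustrative name.

```shell
# Sketch: map a cell count to the core range recommended in this README.
# Boundary handling (10K and 100K counted as medium) is an assumption.
suggest_cores() {
  cells=$1
  if [ "$cells" -lt 10000 ]; then
    echo "2-4"
  elif [ "$cells" -le 100000 ]; then
    echo "4-8"
  else
    echo "8+"
  fi
}
```

The printed range is a starting point for `--n_cores`; the machine's actual core count still caps what is usable.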
- ~2GB per 10K cells
- Scale linearly with dataset size
- Adjust R_MAX_MEM_SIZE as needed
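The rule of thumb of roughly 2GB per 10K cells, scaling linearly, works out as a one-line calculation. The sketch below rounds up to whole gigabytes; it is an estimate only, and the function name `estimate_mem_gb` is illustrative.

```shell
# Sketch: estimate memory from cell count using ~2 GB per 10,000 cells,
# rounded up to the next whole GB. A rough guide, not a measured figure.
estimate_mem_gb() {
  cells=$1
  echo $(( (cells * 2 + 9999) / 10000 ))
}
```

By this estimate a 50,000-cell dataset needs about 10GB and a 125,000-cell dataset about 25GB, which is a reasonable basis for choosing a value for R_MAX_MEM_SIZE.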
We welcome contributions from the research community:
- Fork the repository
- Create a feature branch
- Test thoroughly with your data
- Submit a pull request
If you use this pipeline in your research, please cite:
snRNA-seq Pipeline v1.0
Single-nucleus RNA sequencing analysis pipeline
https://github.com/VibhavSetlur/scRNA-seq_Data_Validation
- Documentation: Check the docs/ folder
- Issues: GitHub issue tracker
- Questions: Open a discussion on GitHub
MIT License - see LICENSE for details.
For detailed documentation, examples, and advanced usage, see the docs/ folder.