A lightweight Python tool for parsing and validating Illumina Sample Sheet v2 files. It supports multiple section types, including BCLConvert, Cloud, TSO500L, and TSO500S.
- Parse CSV to JSON: Convert Illumina Sample Sheet v2 CSV format to structured JSON data
- Validate Samplesheets: Ensure samplesheets meet the required format and contain necessary sections
- Retrieve Library Information: Extract library information from samplesheets
- Revert JSON to CSV: Convert structured JSON data back to Illumina Sample Sheet v2 CSV format
- Support for multiple section types:
  - Run Info Sections (Header, Reads, Sequencing)
  - BCLConvert Sections
  - Cloud Sections
  - TSO500L Sections
  - TSO500S Sections
- Data validation using Pydantic models
- Consistent naming conventions with automatic conversion between PascalCase and snake_case
- Comprehensive error handling with detailed error messages
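The PascalCase and snake_case conversion mentioned above can be illustrated with a small standalone sketch. The helper names `pascal_to_snake` and `snake_to_pascal` are hypothetical and are not part of this package's API. The splitting rule (an underscore before any capital that follows a lowercase letter or digit) is an assumption, so the library's own conversion may treat digits and acronyms differently.

```python
import re


def pascal_to_snake(name: str) -> str:
    """Insert '_' before a capital that follows a lowercase letter or digit, then lowercase."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


def snake_to_pascal(name: str) -> str:
    """Capitalize the first character of each '_'-separated part and join the parts."""
    return "".join(part[:1].upper() + part[1:] for part in name.split("_"))


print(pascal_to_snake("FileFormatVersion"))  # file_format_version
print(snake_to_pascal("read1_cycles"))       # Read1Cycles
```

A rule this simple does not round-trip keys that contain acronyms or underscores (for example `Sample_ID` becomes `sample_id`, which converts back to `SampleId`), so a real implementation would likely need explicit aliases for such fields.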
pip install v2-samplesheet-parser
from v2_samplesheet_parser import parse_samplesheet
# Your samplesheet content as a string
samplesheet_content = """
[Header]
FileFormatVersion,2
RunName,my-illumina-sequencing-run
InstrumentPlatform,NovaSeq 6000
[Reads]
Read1Cycles,151
Read2Cycles,151
"""
# Parse the samplesheet
result = parse_samplesheet(samplesheet_content)
print(result)
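To make the CSV to JSON mapping concrete, here is a minimal standard-library sketch of how the key-value sections in the example above could be split into a dictionary and written back out. It is not the library's parser: the function names `split_key_value_sections` and `sections_to_csv` are hypothetical, values are kept as strings, and the structure returned by `parse_samplesheet` may differ (for instance in key naming or value types).

```python
import csv
import io
import json


def split_key_value_sections(text: str) -> dict[str, dict[str, str]]:
    """Group 'Key,Value' rows under the most recent '[Section]' header."""
    sections: dict[str, dict[str, str]] = {}
    current = None
    for row in csv.reader(io.StringIO(text)):
        cells = [cell.strip() for cell in row]
        if not any(cells):
            continue  # skip blank lines and rows made only of commas
        first = cells[0]
        if first.startswith("[") and first.endswith("]"):
            current = sections.setdefault(first[1:-1], {})
        elif current is not None:
            current[first] = cells[1] if len(cells) > 1 else ""
    return sections


def sections_to_csv(sections: dict[str, dict[str, str]]) -> str:
    """Write the sections back out as '[Section]' headers followed by 'Key,Value' rows."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for name, values in sections.items():
        writer.writerow([f"[{name}]"])
        for key, value in values.items():
            writer.writerow([key, value])
    return out.getvalue()


samplesheet_content = """
[Header]
FileFormatVersion,2
RunName,my-illumina-sequencing-run
InstrumentPlatform,NovaSeq 6000
[Reads]
Read1Cycles,151
Read2Cycles,151
"""

sections = split_key_value_sections(samplesheet_content)
print(json.dumps(sections, indent=2))
print(sections_to_csv(sections))
```

This sketch only handles key-value sections such as `[Header]` and `[Reads]`; tabular `_Data` sections need a header-aware reader.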
The parser supports the following section types:
- Run Info Sections
  - [Header]
  - [Reads]
  - [Sequencing]
- BCLConvert Sections
  - [BCLConvert_Settings]
  - [BCLConvert_Data]
- Cloud Sections
  - [Cloud_Settings]
  - [Cloud_Data]
- TSO500L Sections
  - [TSO500L_Settings]
  - [TSO500L_Data]
  - [Cloud_TSO500L_Settings]
  - [Cloud_TSO500L_Data]
- TSO500S Sections
  - [TSO500S_Settings]
  - [TSO500S_Data]
  - [Cloud_TSO500S_Settings]
  - [Cloud_TSO500S_Data]
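In the Sample Sheet v2 format, `_Settings` sections hold key-value pairs while `_Data` sections are tables with a header row. The sketch below shows one way a `_Data` section could be read into row dictionaries using only the standard library. The function `read_data_section` and the sample values are hypothetical, and this is not how the package itself is implemented. It also assumes data rows carry no trailing padding commas.

```python
import csv
import io


def read_data_section(text: str, section: str) -> list[dict[str, str]]:
    """Return the rows of the tabular '[<section>]' block as dictionaries."""
    body = []
    inside = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # Any section header ends the block being collected;
            # trailing padding commas on the header are ignored.
            inside = stripped.rstrip(",") == f"[{section}]"
            continue
        if inside and stripped.strip(","):
            body.append(stripped)
    # The first collected line is the column header row.
    return list(csv.DictReader(io.StringIO("\n".join(body))))


# Illustrative sample sheet fragment; sample IDs and index sequences are made up.
example = """\
[BCLConvert_Settings]
SoftwareVersion,4.2.7
[BCLConvert_Data]
Lane,Sample_ID,Index,Index2
1,sample_a,AAGGTTCC,CCTTAAGG
1,sample_b,GGCCAATT,TTAACCGG
"""

rows = read_data_section(example, "BCLConvert_Data")
for row in rows:
    print(row["Lane"], row["Sample_ID"], row["Index"], row["Index2"])
```

Asking for a section that is absent returns an empty list, which a caller could treat as "section not present" before any validation step.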
- Clone the repository:
git clone https://github.com/umccr/v2-samplesheet-parser.git
cd v2-samplesheet-parser
- Install development dependencies:
pip install -e ".[dev]"
- Run the tests:
pytest tests/
If you encounter any issues or have suggestions for improvements:
- Check if the issue already exists in the Issue Tracker
- If not, create a new issue with:
  - A clear description of the problem
  - Steps to reproduce
  - Expected behavior
  - Actual behavior
  - Sample data (if applicable)
  - Python version and environment details
This project is licensed under the MIT License - see the LICENSE file for details.
- Ray Liu
- Based on the Illumina Sample Sheet v2 format specification
- Built with Pydantic for robust data validation