Python bindings for RToon - Token-Oriented Object Notation
A compact, token-efficient format for structured data in LLM applications
Token-Oriented Object Notation is a compact, human-readable format designed for passing structured data to Large Language Models with significantly reduced token usage. This package provides Python bindings for the rtoon Rust implementation.
Tip
Think of TOON as a translation layer: use JSON programmatically, convert to TOON for LLM input.
Note
This module uses rtoon (Rust implementation) as a dependency via PyO3/maturin.
- Why TOON?
- Why py-rtoon
- Key Features
- Installation
- Quick Start
- Examples
- API Reference
- Format Overview
- Testing
- Contributing
- License
- See Also
AI is becoming cheaper and more accessible, but larger context windows allow for larger data inputs as well. LLM tokens still cost money, and standard JSON is verbose and token-expensive.
JSON (verbose, token-heavy):
{
"users": [
{ "id": 1, "name": "Alice", "role": "admin" },
{ "id": 2, "name": "Bob", "role": "user" }
]
}
TOON (compact, token-efficient):
users[2]{id,name,role}:
1,Alice,admin
2,Bob,user
TOON conveys the same information with 30-60% fewer tokens!
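To make the comparison concrete, here is a minimal pure-Python sketch of the tabular form shown above. It is not the py-rtoon or rtoon implementation: `encode_table` is a hypothetical helper, it handles only flat, uniform rows, it does no quoting, and it assumes two-space row indentation. Character counts are used only as a rough stand-in for tokens; real savings depend on the tokenizer.

```python
import json


def encode_table(key, rows, indent="  "):
    """Encode a uniform list of flat dicts as a TOON-style tabular block.

    Illustration only: no quoting or nesting is handled.
    """
    fields = list(rows[0])
    # The tabular form only applies when every row exposes the same fields.
    if any(list(row) != fields for row in rows):
        raise ValueError("rows are not uniform")
    field_list = ",".join(fields)
    header = f"{key}[{len(rows)}]{{{field_list}}}:"
    lines = [indent + ",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header, *lines])


users = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
]
toon_text = encode_table("users", users)
json_text = json.dumps({"users": users})

print(toon_text)
print(len(json_text), "JSON characters vs", len(toon_text), "TOON characters")
```

The field names appear once in the header instead of once per row, which is where most of the reduction comes from for arrays of similar objects.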
Python is the dominant language for AI/ML development, powering most LLM applications, agent frameworks, and data pipelines. However, when working with LLMs, you need:
- Blazing fast encoding/decoding powered by Rust (via PyO3)
- Zero-copy operations where possible for maximum efficiency
- Production-ready performance for high-throughput applications
- Orders of magnitude faster than pure Python implementations
- Native Python API with proper type hints and docstrings
- Works with the standard json module, so there is no need to change your existing code structure
- Simple integration into existing LLM pipelines (LangChain, LlamaIndex, etc.)
- Familiar patterns for Python developers
When you're building AI applications, token costs add up quickly:
# Before: Sending full JSON to LLM
prompt = f"Analyze this data: {json.dumps(large_dataset)}"
# Cost: ~5000 tokens
# After: Using TOON format
toon_data = py_rtoon.encode_default(json.dumps(large_dataset))
prompt = f"Analyze this data: {toon_data}"
# Cost: ~2000 tokens (60% reduction!)
Real-world savings:
- Processing 1M API calls with 1000-token JSON objects
- JSON cost: ~$15 at GPT-4 rates
- TOON cost: ~$6 (saving $9 per million calls)
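The savings figures follow from simple proportional arithmetic. The sketch below assumes cost scales linearly with token count, which ignores output tokens and any fixed per-request overhead; both helper names are hypothetical. It reuses the token counts and dollar amounts quoted above rather than any real price list.

```python
def token_reduction(json_tokens, toon_tokens):
    """Fraction of tokens saved when TOON replaces JSON."""
    return (json_tokens - toon_tokens) / json_tokens


def projected_cost(json_cost, reduction):
    """Cost after applying a fractional token reduction (linear assumption)."""
    return json_cost * (1.0 - reduction)


# Token counts from the prompt example above.
reduction = token_reduction(5000, 2000)
# Dollar amount from the savings list above.
json_cost = 15.0
toon_cost = projected_cost(json_cost, reduction)

print(f"reduction: {reduction:.0%}")
print(f"TOON cost: ${toon_cost:.2f}, saved: ${json_cost - toon_cost:.2f}")
```

To estimate savings for your own workload, measure both token counts with the tokenizer of your target model and plug them into `token_reduction`.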
Agent frameworks:
# Pass structured data to agents efficiently
agent.run(f"Process: {py_rtoon.encode_default(json.dumps(data))}")
RAG pipelines:
# Encode documents for vector storage with metadata
metadata_toon = py_rtoon.encode_default(json.dumps(metadata))
Prompt engineering:
# Build token-efficient prompts with complex data
prompt = f"""
Given this user profile:
{py_rtoon.encode_default(json.dumps(user_data))}
Provide recommendations.
"""
API response optimization:
# Return compact responses to save bandwidth and tokens
return {"data": py_rtoon.encode_default(json.dumps(results))}
While you could implement TOON in pure Python, py-rtoon gives you:
- 5-50x faster encoding/decoding performance
- Battle-tested Rust implementation with comprehensive test coverage
- Memory efficiency - important for processing large datasets
- Active maintenance - benefits from improvements in the core rtoon library
- Type safety - Rust's guarantees prevent entire classes of bugs
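Speed-up figures like these depend heavily on payload shape and hardware, so it is worth measuring on your own data. The following is a minimal timing sketch using only the standard library; `seconds_per_call` and the sample payload are hypothetical, and `json.dumps` serves as a baseline so the snippet runs without py-rtoon installed.

```python
import json
import timeit


def seconds_per_call(func, payload, repeat=5, number=200):
    """Best-of-`repeat` average time for a single call of func(payload)."""
    timings = timeit.repeat(lambda: func(payload), repeat=repeat, number=number)
    return min(timings) / number


# Hypothetical payload; substitute data shaped like your real workload.
payload = {
    "items": [{"sku": f"A{i}", "qty": i, "price": i * 1.5} for i in range(100)]
}

baseline = seconds_per_call(json.dumps, payload)
print(f"json.dumps: {baseline * 1e6:.1f} microseconds per call")

# With py-rtoon installed, the same helper can time the Rust-backed encoder:
# import py_rtoon
# print(seconds_per_call(py_rtoon.encode_default, payload))
```

Taking the minimum over several repeats reduces noise from other processes, which is the approach the `timeit` documentation recommends.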
- Token-efficient: typically 30-60% fewer tokens than JSON
- LLM-friendly guardrails: explicit lengths and fields enable validation
- Minimal syntax: removes redundant punctuation (braces, brackets, most quotes)
- Indentation-based structure: like YAML, uses whitespace instead of braces
- Tabular arrays: declare keys once, stream data as rows
- Round-trip support: encode and decode with full fidelity
- Fast: powered by Rust via PyO3
- Pythonic: clean API with proper type hints
- Customizable: delimiter (comma/tab/pipe), length markers, and indentation
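The delimiter option matters most when values themselves contain commas, since those values would otherwise need quoting. As a rough sketch, one way to choose a delimiter is to scan the data for a candidate that never occurs in any value. `pick_delimiter` is a hypothetical helper, not part of py-rtoon; its result would then be mapped to `py_rtoon.Delimiter.comma()`, `pipe()`, or `tab()`.

```python
def pick_delimiter(rows, candidates=(",", "|", "\t")):
    """Return the first candidate that appears in no value, else None."""
    values = [str(v) for row in rows for v in row.values()]
    for delim in candidates:
        if not any(delim in value for value in values):
            return delim
    # Every candidate occurs somewhere, so some quoting is unavoidable.
    return None


rows = [
    {"sku": "A1", "name": "Widget, large"},
    {"sku": "B2", "name": "Gadget"},
]
chosen = pick_delimiter(rows)
print(repr(chosen))
```

Here the comma is rejected because one product name contains it, so the pipe is chosen instead.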
# Using uv (recommended)
uv add py-rtoon
# Using pip
pip install py-rtoon
import py_rtoon
# Encode Python dict directly to TOON
data = {
"user": {
"id": 123,
"name": "Ada",
"tags": ["reading", "gaming"],
"active": True
}
}
toon = py_rtoon.encode_default(data)
print(toon)
Output:
user:
active: true
id: 123
name: Ada
tags[2]: reading,gaming
Decode back to Python dict:
# Decode TOON back to dict
decoded = py_rtoon.decode_default(toon)
print(decoded)
# {'user': {'active': True, 'id': 123, 'name': 'Ada', 'tags': ['reading', 'gaming']}}
import py_rtoon
# Encode dict to TOON (new Pythonic API!)
data = {"name": "Alice", "age": 30, "tags": ["python", "rust"]}
toon = py_rtoon.encode_default(data)
print(f"Encoded: {toon}")
# Decode TOON back to dict
decoded = py_rtoon.decode_default(toon)
print(f"Decoded: {decoded}")
print(f"Type: {type(decoded)}")
Output:
Encoded: name: Alice
age: 30
tags[2]: python,rust
Decoded: {'name': 'Alice', 'age': 30, 'tags': ['python', 'rust']}
Type: <class 'dict'>
Backward compatible with JSON strings:
import json
# Still works with JSON strings
json_str = json.dumps(data)
toon = py_rtoon.encode_default(json_str)
Use different delimiters to avoid quoting and save more tokens:
import py_rtoon
import json
data = {
"items": [
{"sku": "A1", "name": "Widget", "qty": 2},
{"sku": "B2", "name": "Gadget", "qty": 1}
]
}
json_str = json.dumps(data)
# Use pipe delimiter
options = py_rtoon.EncodeOptions()
options_with_pipe = options.with_delimiter(py_rtoon.Delimiter.pipe())
toon_pipe = py_rtoon.encode(json_str, options_with_pipe)
print("With pipe delimiter:")
print(toon_pipe)
# Use tab delimiter
options_with_tab = options.with_delimiter(py_rtoon.Delimiter.tab())
toon_tab = py_rtoon.encode(json_str, options_with_tab)
print("\nWith tab delimiter:")
print(toon_tab)
Customize encoding with length markers:
import py_rtoon
import json
data = {
"tags": ["reading", "gaming", "coding"],
"items": [
{"sku": "A1", "qty": 2, "price": 9.99},
{"sku": "B2", "qty": 1, "price": 14.5}
]
}
json_str = json.dumps(data)
# Add length marker '#'
options = py_rtoon.EncodeOptions()
options_with_marker = options.with_length_marker('#')
toon = py_rtoon.encode(json_str, options_with_marker)
print(toon)
Output:
items[#2]{sku,qty,price}:
A1,2,9.99
B2,1,14.5
tags[#3]: reading,gaming,coding
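The declared length and field list in a tabular header let a consumer detect truncated or malformed rows, for example in text an LLM has echoed back. The sketch below shows that check in plain Python. It is not py-rtoon's decoder: `check_table` is a hypothetical helper, it assumes a single tabular block, and it ignores quoted values that contain the delimiter.

```python
import re

# Matches headers such as items[2]{sku,qty}: or items[#2]{sku,qty}:
HEADER = re.compile(r"^(?P<key>\w+)\[#?(?P<count>\d+)\]\{(?P<fields>[^}]*)\}:$")


def check_table(block, delimiter=","):
    """Check the declared row count and field count of one tabular block."""
    header, *rows = block.strip().splitlines()
    match = HEADER.match(header.strip())
    if match is None:
        raise ValueError(f"not a tabular header: {header!r}")
    expected_rows = int(match["count"])
    fields = match["fields"].split(delimiter)
    if len(rows) != expected_rows:
        raise ValueError(f"declared {expected_rows} rows, found {len(rows)}")
    for row in rows:
        if len(row.strip().split(delimiter)) != len(fields):
            raise ValueError(f"row does not match fields {fields}: {row!r}")
    return match["key"], fields, expected_rows


block = "items[#2]{sku,qty,price}:\n  A1,2,9.99\n  B2,1,14.5"
print(check_table(block))
```

Dropping the last row from `block` makes `check_table` raise `ValueError`, which is the kind of guardrail the explicit length enables.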
TOON supports full round-trip encoding and decoding:
import py_rtoon
import json
original_data = {
"product": "Widget",
"price": 29.99,
"stock": 100,
"categories": ["tools", "hardware"]
}
# Convert to JSON string
json_str = json.dumps(original_data)
# Encode to TOON
toon = py_rtoon.encode_default(json_str)
print(f"TOON:\n{toon}\n")
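# Decode back to Python data
decoded = py_rtoon.decode_default(toon)
# The Quick Start shows decode_default returning a dict, while the API Reference
# documents a JSON string, so accept either form here
decoded_data = json.loads(decoded) if isinstance(decoded, str) else decoded
# Verify round-trip
assert original_data == decoded_data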
print("Round-trip successful!")
encode_default(json_str)
Encode a JSON string to TOON format using default options.
Parameters:
- json_str (str): A JSON string to encode
Returns:
- str: A TOON-formatted string
Raises:
- ValueError: If the JSON is invalid or encoding fails
Example:
import py_rtoon
import json
data = {"name": "Alice", "age": 30}
toon = py_rtoon.encode_default(json.dumps(data))
decode_default(toon_str)
Decode a TOON string to JSON format using default options.
Parameters:
- toon_str (str): A TOON-formatted string to decode
Returns:
- str: A JSON string
Raises:
- ValueError: If the TOON string is invalid or decoding fails
Example:
import py_rtoon
toon = "name: Alice\nage: 30"
json_str = py_rtoon.decode_default(toon)
encode(json_str, options)
Encode a JSON string to TOON format with custom options.
Parameters:
- json_str (str): A JSON string to encode
- options (EncodeOptions): Options for customizing the output format
Returns:
- str: A TOON-formatted string
Raises:
- ValueError: If the JSON is invalid or encoding fails
decode(toon_str, options)
Decode a TOON string to JSON format with custom options.
Parameters:
- toon_str (str): A TOON-formatted string to decode
- options (DecodeOptions): Options for customizing the decoding behavior
Returns:
- str: A JSON string
Raises:
- ValueError: If the TOON string is invalid or decoding fails
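Because the functions above raise `ValueError` on bad input, callers that decode untrusted or model-generated TOON usually want to catch it explicitly. The sketch below wraps any decoder callable. `decode_or_none` and `fake_decoder` are hypothetical names used so the example runs without py-rtoon installed; in real use you would pass `py_rtoon.decode_default` as the decoder.

```python
from typing import Any, Callable, Optional


def decode_or_none(toon_str: str, decoder: Callable[[str], Any]) -> Optional[Any]:
    """Call decoder(toon_str); return None instead of raising ValueError."""
    try:
        return decoder(toon_str)
    except ValueError as exc:
        print(f"could not decode TOON input: {exc}")
        return None


def fake_decoder(text: str) -> str:
    """Stand-in for a real decoder: rejects text without a key/value colon."""
    if ":" not in text:
        raise ValueError("missing key/value separator")
    return text


ok = decode_or_none("name: Alice", fake_decoder)
bad = decode_or_none("garbage", fake_decoder)
print(ok, bad)
```

Returning `None` is only one policy; re-raising with more context or retrying the LLM call may suit a production pipeline better.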
Delimiter
Delimiter options for encoding TOON format.
Static Methods:
- comma() -> Delimiter: Comma delimiter (default)
- pipe() -> Delimiter: Pipe delimiter (|)
- tab() -> Delimiter: Tab delimiter (\t)
Example:
import py_rtoon
delimiter = py_rtoon.Delimiter.pipe()
EncodeOptions
Options for encoding to TOON format.
Methods:
- __init__(): Create new encoding options with defaults
- with_delimiter(delimiter: Delimiter) -> EncodeOptions: Set the delimiter for arrays
- with_length_marker(marker: str) -> EncodeOptions: Set the length marker character
Example:
import py_rtoon
options = (py_rtoon.EncodeOptions()
.with_delimiter(py_rtoon.Delimiter.pipe())
.with_length_marker('#'))
DecodeOptions
Options for decoding TOON format.
Methods:
- __init__(): Create new decoding options with defaults
- with_strict(strict: bool) -> DecodeOptions: Enable/disable strict mode (validates array lengths)
- with_coerce_types(coerce: bool) -> DecodeOptions: Enable/disable type coercion
Example:
import py_rtoon
options = (py_rtoon.DecodeOptions()
.with_strict(True)
.with_coerce_types(False))
- Objects: key: value pairs, with 2-space indentation for nesting
- Primitive arrays: inline with count, e.g., tags[3]: a,b,c
- Arrays of objects: tabular header, e.g., items[2]{id,name}:\n ...
- Mixed arrays: list format with a - prefix
- Quoting: only when necessary (special chars, ambiguity, keywords like true, null)
- Root forms: objects (default), arrays, or primitives
For complete format specification, see the TOON Specification.
py-rtoon includes a comprehensive test suite with 86 tests covering all functionality:
# Run all tests
uv run pytest
# Run with verbose output
uv run pytest -v
# Run specific test file
uv run pytest src/tests/test_basic.py
Test Coverage:
- ✅ Basic encoding/decoding (17 tests)
- ✅ Custom delimiters (6 tests)
- ✅ Options configuration (13 tests)
- ✅ Round-trip conversion (10 tests)
- ✅ Edge cases (16 tests)
- ✅ Dict support (24 tests) - NEW!
All tests use Python 3.11+ type hints and follow best practices. See src/tests/README.md for more details.
Contributions are welcome! Please feel free to submit a Pull Request.
How to Contribute
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Run tests to ensure everything works (uv run pytest -v)
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Please ensure all 86 tests pass before submitting your PR.
MIT 2025
- Rust implementation (dependency): rtoon
- Original JavaScript/TypeScript implementation: @byjohann/toon
- TOON Specification: SPEC.md
- Release and index to PyPI <- ✅ Done
- Add compatibility with other Python versions and platforms (now only Python 3.14 on macOS (M3) is tested) <- ✅ Done by GitHub CI
- Add performance benchmarking against other TOON tools <- Need contributors
- Add LLM accuracy benchmarking <- Need contributors
- Add more data type support (Pydantic/ORM/dict)
- Ensure compatibility with frameworks such as LangChain, LangGraph, and CrewAI
- Add a code checker to the CI pipeline
Built with ❤️ using Rust + Python