🐍 py-rtoon 🦀

Python bindings for RToon - Token-Oriented Object Notation

A compact, token-efficient format for structured data in LLM applications

Token-Oriented Object Notation is a compact, human-readable format designed for passing structured data to Large Language Models with significantly reduced token usage. This package provides Python bindings for the rtoon Rust implementation.

Tip

Think of TOON as a translation layer: use JSON programmatically, convert to TOON for LLM input.

Note

This module uses rtoon (Rust implementation) as a dependency via PyO3/maturin.

Why TOON?

AI is becoming cheaper and more accessible, but larger context windows allow for larger data inputs as well. LLM tokens still cost money, and standard JSON is verbose and token-expensive.

JSON vs TOON Comparison


JSON (verbose, token-heavy):

{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob", "role": "user" }
  ]
}

TOON (compact, token-efficient):

users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user

TOON conveys the same information with typically 30-60% fewer tokens!
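To make the comparison concrete, here is a minimal pure-Python sketch of how a uniform list of objects collapses into the tabular form shown above. It is an illustration only, not py-rtoon's implementation: the helper name `to_tabular` is hypothetical, it assumes every row has the same keys and that values are plain strings or integers that need no quoting, and it compares character counts as a rough stand-in for tokens.

```python
import json


def to_tabular(key: str, rows: list[dict]) -> str:
    """Render a uniform list of flat dicts as a TOON-style table (sketch only)."""
    fields = list(rows[0].keys())
    # The header declares the row count and the field names exactly once.
    header = f"{key}[{len(rows)}]{{{','.join(fields)}}}:"
    # Each row then carries values only, indented by two spaces.
    lines = ["  " + ",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header, *lines])


users = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
]

toon_text = to_tabular("users", users)
json_text = json.dumps({"users": users}, indent=2)

print(toon_text)
# Character counts are only a proxy; real token counts depend on the tokenizer.
print(f"JSON: {len(json_text)} chars, TOON sketch: {len(toon_text)} chars")
```

The saving comes from stating the keys once in the header instead of repeating them in every row, so it grows with the number of rows.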

Why py-rtoon?

Python is the dominant language for AI/ML development, powering most LLM applications, agent frameworks, and data pipelines. However, when working with LLMs, you need:

🚀 Performance Without Compromise

Blazing fast encoding/decoding powered by Rust (via PyO3)
Zero-copy operations where possible for maximum efficiency
Production-ready performance for high-throughput applications
Expected to be substantially faster than pure Python implementations (formal benchmarks are still on the TODO list)

🐍 Seamless Python Integration

Native Python API with proper type hints and docstrings
Works with standard json module - no need to change your existing code structure
Simple integration into existing LLM pipelines (LangChain, LlamaIndex, etc.)
Familiar patterns for Python developers

💰 Cost Optimization for LLM Applications

When you're building AI applications, token costs add up quickly:

import json
import py_rtoon

# Before: sending full JSON to the LLM
# (large_dataset stands for any JSON-serializable object)
prompt = f"Analyze this data: {json.dumps(large_dataset)}"
# Cost: ~5000 tokens (illustrative)

# After: using TOON format
toon_data = py_rtoon.encode_default(json.dumps(large_dataset))
prompt = f"Analyze this data: {toon_data}"
# Cost: ~2000 tokens (illustrative; a 60% reduction)

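Real-world savings (illustrative):

Processing 1M API calls with 1000-token JSON objects means about 1 billion input tokens
JSON cost: ~$15,000, assuming an example rate of $15 per million input tokens
TOON cost: ~$6,000 at a 60% token reduction (saving ~$9,000 per million calls)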
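The arithmetic behind an estimate like this is simple enough to script for your own workload. The sketch below reuses the 5000-token and 2000-token prompt sizes from the snippet above; the call count and the price per million tokens are assumptions chosen for illustration, and `prompt_cost` is a hypothetical helper, not part of py-rtoon.

```python
def prompt_cost(calls: int, tokens_per_call: int, usd_per_million_tokens: float) -> float:
    """Input-token cost in USD for a batch of calls."""
    total_tokens = calls * tokens_per_call
    return total_tokens / 1_000_000 * usd_per_million_tokens


# All inputs below are assumptions, not measured values.
CALLS = 10_000
JSON_TOKENS = 5_000   # tokens per prompt when the payload is JSON
TOON_TOKENS = 2_000   # tokens per prompt after a 60% reduction
PRICE = 10.0          # hypothetical USD per million input tokens

json_cost = prompt_cost(CALLS, JSON_TOKENS, PRICE)
toon_cost = prompt_cost(CALLS, TOON_TOKENS, PRICE)
saved = json_cost - toon_cost

print(f"JSON: ${json_cost:,.2f}  TOON: ${toon_cost:,.2f}  saved: ${saved:,.2f}")
```

Substitute your provider's current price and a token reduction measured on your own data before relying on the result.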

🛠️ Perfect for Common Python + LLM Workflows

Agent frameworks:

# Pass structured data to agents efficiently
agent.run(f"Process: {py_rtoon.encode_default(json.dumps(data))}")

RAG pipelines:

# Encode documents for vector storage with metadata
metadata_toon = py_rtoon.encode_default(json.dumps(metadata))

Prompt engineering:

# Build token-efficient prompts with complex data
prompt = f"""
Given this user profile:
{py_rtoon.encode_default(json.dumps(user_data))}

Provide recommendations.
"""

API response optimization:

# Return compact responses to save bandwidth and tokens
return {"data": py_rtoon.encode_default(json.dumps(results))}

✨ Why Not Pure Python?

While you could implement TOON in pure Python, py-rtoon gives you:

5-50x faster encoding/decoding performance
Battle-tested Rust implementation with comprehensive test coverage
Memory efficiency - important for processing large datasets
Active maintenance - benefits from improvements in the core rtoon library
Type safety - Rust's guarantees prevent entire classes of bugs

Key Features

Token-efficient: typically 30-60% fewer tokens than JSON
LLM-friendly guardrails: explicit lengths and fields enable validation
Minimal syntax: removes redundant punctuation (braces, brackets, most quotes)
Indentation-based structure: like YAML, uses whitespace instead of braces
Tabular arrays: declare keys once, stream data as rows
Round-trip support: encode and decode with full fidelity
Fast: powered by Rust via PyO3
Pythonic: clean API with proper type hints
Customizable: delimiter (comma/tab/pipe), length markers, and indentation

Installation

# Using uv (recommended)
uv add py-rtoon

# Using pip
pip install py-rtoon

Quick Start

import py_rtoon

# Encode Python dict directly to TOON
data = {
    "user": {
        "id": 123,
        "name": "Ada",
        "tags": ["reading", "gaming"],
        "active": True
    }
}

toon = py_rtoon.encode_default(data)
print(toon)

Output:

user:
  active: true
  id: 123
  name: Ada
  tags[2]: reading,gaming

Decode back to Python dict:

# Decode TOON back to dict
decoded = py_rtoon.decode_default(toon)
print(decoded)
# {'user': {'active': True, 'id': 123, 'name': 'Ada', 'tags': ['reading', 'gaming']}}

Examples

Basic Encoding and Decoding

import py_rtoon

# Encode dict to TOON (new Pythonic API!)
data = {"name": "Alice", "age": 30, "tags": ["python", "rust"]}
toon = py_rtoon.encode_default(data)
print(f"Encoded: {toon}")

# Decode TOON back to dict
decoded = py_rtoon.decode_default(toon)
print(f"Decoded: {decoded}")
print(f"Type: {type(decoded)}")

Output:

Encoded: name: Alice
age: 30
tags[2]: python,rust

Decoded: {'name': 'Alice', 'age': 30, 'tags': ['python', 'rust']}
Type: <class 'dict'>

Backward compatible with JSON strings:

import json

# Still works with JSON strings
json_str = json.dumps(data)
toon = py_rtoon.encode_default(json_str)

Custom Delimiters

Use different delimiters to avoid quoting and save more tokens:

import py_rtoon
import json

data = {
    "items": [
        {"sku": "A1", "name": "Widget", "qty": 2},
        {"sku": "B2", "name": "Gadget", "qty": 1}
    ]
}

json_str = json.dumps(data)

# Use pipe delimiter
options = py_rtoon.EncodeOptions()
options_with_pipe = options.with_delimiter(py_rtoon.Delimiter.pipe())
toon_pipe = py_rtoon.encode(json_str, options_with_pipe)
print("With pipe delimiter:")
print(toon_pipe)

# Use tab delimiter
options_with_tab = options.with_delimiter(py_rtoon.Delimiter.tab())
toon_tab = py_rtoon.encode(json_str, options_with_tab)
print("\nWith tab delimiter:")
print(toon_tab)
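Why a different delimiter can avoid quoting: a value that contains the active delimiter has to be quoted so a row can still be split unambiguously. The sketch below shows that idea with a hypothetical helper, `render_row`, written in pure Python. It is deliberately simplified and is not py-rtoon's quoting logic, which also handles other cases such as keywords and special characters.

```python
def render_row(values: list[str], delimiter: str) -> str:
    """Join values, quoting any value that contains the active delimiter.

    Simplified on purpose: real quoting rules cover more cases
    (for example keywords such as true or null).
    """
    cells = [f'"{v}"' if delimiter in v else v for v in values]
    return delimiter.join(cells)


row = ["A1", "Widget, large", "2"]

comma_row = render_row(row, ",")   # the comma inside the name forces quotes
pipe_row = render_row(row, "|")    # no pipe in any value, so nothing is quoted

print(comma_row)
print(pipe_row)
```

If your data contains many commas (addresses, free text), a pipe or tab delimiter is likely the cheaper choice; for purely numeric or identifier-like rows the default comma is usually fine.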

Custom Options

Customize encoding with length markers:

import py_rtoon
import json

data = {
    "tags": ["reading", "gaming", "coding"],
    "items": [
        {"sku": "A1", "qty": 2, "price": 9.99},
        {"sku": "B2", "qty": 1, "price": 14.5}
    ]
}

json_str = json.dumps(data)

# Add length marker '#'
options = py_rtoon.EncodeOptions()
options_with_marker = options.with_length_marker('#')
toon = py_rtoon.encode(json_str, options_with_marker)
print(toon)

Output:

items[#2]{sku,qty,price}:
  A1,2,9.99
  B2,1,14.5
tags[#3]: reading,gaming,coding

Round-Trip Conversion

TOON supports full round-trip encoding and decoding:

import py_rtoon
import json

original_data = {
    "product": "Widget",
    "price": 29.99,
    "stock": 100,
    "categories": ["tools", "hardware"]
}

# Convert to JSON string
json_str = json.dumps(original_data)

# Encode to TOON
toon = py_rtoon.encode_default(json_str)
print(f"TOON:\n{toon}\n")

# Decode back; accept either a dict (as in the Quick Start) or a JSON string
decoded = py_rtoon.decode_default(toon)
decoded_data = json.loads(decoded) if isinstance(decoded, str) else decoded

# Verify round-trip
assert original_data == decoded_data
print("✅ Round-trip successful!")

API Reference

Functions

`encode_default(json_str: str) -> str`

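Encode a JSON string (or, as shown in the Quick Start, a Python dict) to TOON format using default options.

Parameters:

json_str (str): A JSON string to encode; a Python dict is also accepted, as in the Quick Start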

Returns:

str: A TOON-formatted string

Raises:

ValueError: If the JSON is invalid or encoding fails

Example:

import py_rtoon
import json

data = {"name": "Alice", "age": 30}
toon = py_rtoon.encode_default(json.dumps(data))

`decode_default(toon_str: str) -> str`

Decode a TOON string to JSON format using default options.

Parameters:

toon_str (str): A TOON-formatted string to decode

Returns:

str: A JSON string

Raises:

ValueError: If the TOON string is invalid or decoding fails

Example:

import py_rtoon

toon = "name: Alice\nage: 30"
json_str = py_rtoon.decode_default(toon)

`encode(json_str: str, options: EncodeOptions) -> str`

Encode a JSON string to TOON format with custom options.

Parameters:

json_str (str): A JSON string to encode
options (EncodeOptions): Options for customizing the output format

Returns:

str: A TOON-formatted string

Raises:

ValueError: If the JSON is invalid or encoding fails

`decode(toon_str: str, options: DecodeOptions) -> str`

Decode a TOON string to JSON format with custom options.

Parameters:

toon_str (str): A TOON-formatted string to decode
options (DecodeOptions): Options for customizing the decoding behavior

Returns:

str: A JSON string

Raises:

ValueError: If the TOON string is invalid or decoding fails

Classes

`Delimiter`

Delimiter options for encoding TOON format.

Static Methods:

comma() -> Delimiter: Comma delimiter (default)
pipe() -> Delimiter: Pipe delimiter (|)
tab() -> Delimiter: Tab delimiter (\t)

Example:

import py_rtoon

delimiter = py_rtoon.Delimiter.pipe()

`EncodeOptions`

Options for encoding to TOON format.

Methods:

__init__(): Create new encoding options with defaults
with_delimiter(delimiter: Delimiter) -> EncodeOptions: Set the delimiter for arrays
with_length_marker(marker: str) -> EncodeOptions: Set the length marker character

Example:

import py_rtoon

options = (py_rtoon.EncodeOptions()
    .with_delimiter(py_rtoon.Delimiter.pipe())
    .with_length_marker('#'))

`DecodeOptions`

Options for decoding TOON format.

Methods:

__init__(): Create new decoding options with defaults
with_strict(strict: bool) -> DecodeOptions: Enable/disable strict mode (validates array lengths)
with_coerce_types(coerce: bool) -> DecodeOptions: Enable/disable type coercion

Example:

import py_rtoon

options = (py_rtoon.DecodeOptions()
    .with_strict(True)
    .with_coerce_types(False))
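To show what validating array lengths can mean in practice, here is a small pure-Python sketch that checks a tabular block against its own header. It is not py-rtoon's decoder: `check_table` and the `HEADER` pattern are hypothetical, the sketch handles only comma-delimited tabular blocks without a length marker, and it leaves every value as a string.

```python
import re

# Matches headers such as: users[2]{id,name,role}:
HEADER = re.compile(r"^(?P<key>\w+)\[(?P<count>\d+)\]\{(?P<fields>[^}]*)\}:$")


def check_table(block: str) -> list[dict]:
    """Parse one tabular block, raising ValueError on length mismatches."""
    header, *rows = block.splitlines()
    match = HEADER.match(header)
    if match is None:
        raise ValueError(f"not a tabular header: {header!r}")
    count = int(match["count"])
    fields = match["fields"].split(",")
    # The declared row count must equal the number of rows actually present.
    if len(rows) != count:
        raise ValueError(f"declared {count} rows, found {len(rows)}")
    parsed = []
    for row in rows:
        cells = row.strip().split(",")
        # Every row must supply exactly one value per declared field.
        if len(cells) != len(fields):
            raise ValueError(f"expected {len(fields)} fields, got {len(cells)}: {row!r}")
        parsed.append(dict(zip(fields, cells)))
    return parsed


good = "users[2]{id,name,role}:\n  1,Alice,admin\n  2,Bob,user"
truncated = "users[2]{id,name,role}:\n  1,Alice,admin"

rows = check_table(good)
try:
    check_table(truncated)
    truncated_error = None
except ValueError as exc:
    truncated_error = str(exc)

print(rows)
print(truncated_error)
```

A check like this is what makes the explicit `[N]` count useful with LLM output: a response that was cut off or that dropped a row fails loudly instead of being silently accepted.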

Format Overview

Objects: key: value with 2-space indentation for nesting
Primitive arrays: inline with count, e.g., tags[3]: a,b,c
Arrays of objects: tabular header, e.g., items[2]{id,name}:\n ...
Mixed arrays: list format with - prefix
Quoting: only when necessary (special chars, ambiguity, keywords like true, null)
Root forms: objects (default), arrays, or primitives

For complete format specification, see the TOON Specification.
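The first two rules (nested objects and inline primitive arrays) can be sketched in a few lines of pure Python. This is an illustration of the layout only, not the rtoon encoder: `encode_sketch` and `scalar` are hypothetical helpers, quoting and tabular or mixed arrays are not handled, and keys are sorted simply to mirror the Quick Start output.

```python
def scalar(value) -> str:
    """Render a primitive the way the examples in this README show it."""
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def encode_sketch(obj: dict, indent: int = 0) -> list[str]:
    """Encode nested dicts and lists of primitives into TOON-like lines."""
    pad = "  " * indent  # 2-space indentation per nesting level
    lines = []
    for key, item in sorted(obj.items()):
        if isinstance(item, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(encode_sketch(item, indent + 1))
        elif isinstance(item, list):
            # Primitive arrays go inline, prefixed with their length.
            cells = ",".join(scalar(x) for x in item)
            lines.append(f"{pad}{key}[{len(item)}]: {cells}")
        else:
            lines.append(f"{pad}{key}: {scalar(item)}")
    return lines


data = {
    "user": {
        "id": 123,
        "name": "Ada",
        "tags": ["reading", "gaming"],
        "active": True,
    }
}

encoded = "\n".join(encode_sketch(data))
print(encoded)
```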

Testing

py-rtoon includes a comprehensive test suite with 86 tests covering all functionality:

# Run all tests
uv run pytest

# Run with verbose output
uv run pytest -v

# Run specific test file
uv run pytest src/tests/test_basic.py

Test Coverage:

✅ Basic encoding/decoding (17 tests)
✅ Custom delimiters (6 tests)
✅ Options configuration (13 tests)
✅ Round-trip conversion (10 tests)
✅ Edge cases (16 tests)
✅ Dict support (24 tests) - NEW!

All tests use Python 3.11+ type hints and follow best practices. See src/tests/README.md for more details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

How to Contribute

Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add some amazing feature')
Run tests to ensure everything works (uv run pytest -v)
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

Please ensure all 86 tests pass before submitting your PR.

License

MIT 2025

TODO-Lists

~~Release and index on PyPI~~ - ✅ Done
~~Add compatibility with other Python versions and platforms (previously only Python 3.14 on macOS (M3) was tested)~~ <- ✅ Done via GitHub CI
Add performance benchmarks against other TOON tools <- Need contributors
Add LLM accuracy benchmarks <- Need contributors
Add support for more data types (Pydantic/ORM/dict)
Ensure compatibility with frameworks (LangChain, LangGraph, CrewAI, etc.)
Add a code checker to the CI pipeline

Built with ❤️ using Rust + Python
