@therrshan therrshan commented Nov 16, 2025

Description

Adds a --stats flag to the CLI that displays token count statistics when encoding JSON to TOON. It uses the existing compare_formats() utility from utils.py to print a comparison table with token counts and the savings percentage.

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)
  • Performance improvement
  • Test coverage improvement

Related Issues

Closes #

Changes Made

  • Added --stats argument to CLI argparse configuration
  • Integrated existing compare_formats() function from utils.py
  • Added error handling for missing tiktoken with helpful installation message
  • Added comprehensive test coverage in test_cli.py
  • Updated README.md to document the new --stats flag
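
For reviewers who want a quick picture of the wiring described above, here is a minimal, self-contained sketch. It is not the code from this PR: build_parser and render_stats are hypothetical names, the compare callable stands in for compare_formats() (whose real signature is not shown here), and the exception types caught for a missing tiktoken are an assumption.

```python
import argparse
import sys
from typing import Callable, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a boolean --stats flag (other CLI options omitted)."""
    parser = argparse.ArgumentParser(prog="toon")
    parser.add_argument("input", help="input file path, or '-' for stdin")
    parser.add_argument(
        "--stats",
        action="store_true",
        help="show token count statistics when encoding",
    )
    return parser


def render_stats(data: object, compare: Callable[[object], str]) -> Optional[str]:
    """Return the comparison table, or None if the tokenizer is unavailable.

    `compare` is a stand-in for compare_formats(); that it takes the decoded
    data and returns a formatted string is an assumption of this sketch.
    """
    try:
        return compare(data)
    except (ImportError, RuntimeError) as exc:
        # Assumed failure mode: the optional tiktoken dependency is missing.
        print(f"Error: {exc}", file=sys.stderr)
        print("Install tiktoken to use --stats: pip install tiktoken", file=sys.stderr)
        return None
```

Keeping the stats rendering behind a function that returns None on failure lets the CLI still emit the encoded TOON output when tiktoken is not installed, which is one reasonable way to keep the flag optional.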

SPEC Compliance

  • This PR implements/fixes spec compliance
  • Spec section(s) affected:
  • Spec version:

Testing

  • All existing tests pass
  • Added new tests for changes
  • Tested on Python 3.8
  • Tested on Python 3.9
  • Tested on Python 3.10
  • Tested on Python 3.11
  • Tested on Python 3.12

Test Output

799 passed, 13 skipped in 3.85s
TOTAL 1123 76 93.23%
Coverage : 93.23%

Code Quality

  • Ran ruff check src/toon_format tests - no issues
  • Ran ruff format src/toon_format tests - code formatted
  • Ran mypy src/toon_format - no critical errors
  • All tests pass: pytest tests/ -v

Checklist

  • My code follows the project's coding standards (PEP 8, line length 100)
  • I have added type hints to new code
  • I have added tests that prove my fix/feature works
  • New and existing tests pass locally
  • I have updated documentation (README.md, CLAUDE.md if needed)
  • My changes do not introduce new dependencies
  • I have maintained Python 3.8+ compatibility
  • I have reviewed the TOON specification for relevant sections

Performance Impact

  • No performance impact
  • Performance improvement (describe below)
  • Potential performance regression (describe and justify below)

Breaking Changes

  • No breaking changes
  • Breaking changes (describe migration path below)

Screenshots / Examples

$ echo '{"users": [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]}' | uv run toon - --stats

Output:

Format Comparison
────────────────────────────────────────────────
Format      Tokens    Size (chars)
JSON           45            117
TOON           19             36
────────────────────────────────────────────────
Savings: 26 tokens (57.8%)
users[2]{id,name}:
  1,Alice
  2,Bob
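
The savings line in the output above follows directly from the two token counts. A small sketch of that arithmetic, using a hypothetical token_savings helper (not a function from utils.py):

```python
from typing import Tuple


def token_savings(json_tokens: int, toon_tokens: int) -> Tuple[int, float]:
    """Return (tokens saved, percentage saved relative to the JSON count)."""
    saved = json_tokens - toon_tokens
    # Guard against division by zero for empty input.
    percent = 100.0 * saved / json_tokens if json_tokens else 0.0
    return saved, percent


saved, percent = token_savings(45, 19)
print(f"Savings: {saved} tokens ({percent:.1f}%)")
# prints "Savings: 26 tokens (57.8%)"
```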

Additional Context

This is my first contribution to toon-python. The feature uses the existing token counting utilities in utils.py, which were already part of the public API but not exposed in the CLI. No new dependencies are required: it relies on the existing tiktoken from the benchmark dependency group.

Checklist for Reviewers

  • Code changes are clear and well-documented
  • Tests adequately cover the changes
  • Documentation is updated
  • No security concerns
  • Follows TOON specification
  • Backward compatible (or breaking changes are justified and documented)

@therrshan therrshan requested review from a team and johannschopplich as code owners November 16, 2025 22:52
Contributor

@johannschopplich johannschopplich left a comment

LGTM (README-wise only).
