feat(mode): add TOON mode for compact token-efficient structured outputs #1976
Conversation
feat(mode): add TOON mode for compact token-efficient structured outputs

TOON (Token-Oriented Object Notation) is a YAML-like format that achieves 30-60% token reduction on LLM outputs compared to JSON. Uses the toon-format library: https://github.com/toon-format/toon-python

Features:
- Add Mode.TOON enum value classified as a JSON-like mode (wasn't sure exactly where it would fit, but it made the most sense within the JSON classification)
- Implement handle_toon/reask_toon request handlers
- Add parse_toon to OpenAISchema for response parsing
- Support partial streaming via line-based parsing
- Add extract_code_block_from_stream utilities
- Recursive structure generation for nested Pydantic models
- Proper TOON array format with [N] count markers
- String quoting hints for numeric-looking string fields

Supported types:
- Simple fields (str, int, float, bool)
- Nested Pydantic models
- Lists of primitives (list[str], list[int])
- Lists of objects (list[Model])

Enabled for OpenAI-compatible providers: OpenAI, OpenRouter, Anyscale, Together, Databricks

NOTES:
- I've tested various response cases using the models listed below, but further testing *may* still be required to ensure complex nested schemas are handled properly in both standard and partial cases:
  - openai/gpt-4o-mini
  - openai/gpt-5.2
  - openrouter/moonshotai/kimi-k2-0905
  - openrouter/google/gemma-2-27b-it
- The dynamic system prompt used to provide the TOON response schema may benefit from optimization through DSPy or a similar framework.
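For illustration, the tabular array shape TOON uses for uniform lists of objects - a `[N]` count marker plus a shared `{fields}` header - can be produced by a tiny hand-rolled encoder. This is a sketch of the format only, not the `toon-format` library's API:

```python
from typing import Any

def toon_encode_table(key: str, rows: list[dict[str, Any]]) -> str:
    """Render a uniform list of objects in TOON's tabular form:
    key[N]{f1,f2}: followed by one comma-separated row per item."""
    fields = list(rows[0].keys())
    lines = [f"{key}[{len(rows)}]{{{','.join(fields)}}}:"]
    for row in rows:
        lines.append("  " + ",".join(str(row[f]) for f in fields))
    return "\n".join(lines)

print(toon_encode_table("users", [
    {"id": 1, "name": "Ada", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "viewer"},
]))
```

which prints:

    users[2]{id,name,role}:
      1,Ada,admin
      2,Bob,viewer

Compared to the equivalent JSON array, the braces, brackets, quotes, and repeated field names are gone, which is where the token savings come from.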
Important
Looks good to me! 👍
Reviewed everything up to 40a6603 in 2 minutes and 57 seconds.
- Reviewed 1010 lines of code in 11 files
- Skipped 0 files when reviewing
- Skipped posting 7 draft comments. View those below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. instructor/core/client.py:809
- Draft comment: Consider centralizing the provider mode lists to avoid duplicating `Mode.TOON` across multiple provider checks.
- Reason this comment was not posted: Confidence changes required: 70% <= threshold 85%
2. instructor/dsl/partial.py:290
- Draft comment: Avoid silently passing exceptions during TOON decoding; log or handle decoding errors to aid debugging in both sync and async chunk parsers.
- Reason this comment was not posted: Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 85%.

The comment is about code quality and error handling. It suggests that the bare `except Exception: pass` blocks are problematic. However, looking at the logic, the exception handling appears intentional - the code is trying to decode partial TOON content as it streams in, and it's expected that some chunks won't be valid until more data arrives. The code keeps track of `last_successful_data` and falls back to it if the final decode fails. This is a streaming parser that needs to handle incomplete data gracefully, and the silent exception handling seems to be by design for this use case.

The comment doesn't provide strong evidence that this is actually a problem - it's more of a general code quality suggestion. Without seeing actual issues or understanding the TOON format better, it's hard to say if logging would be helpful or just noisy. I might be missing that logging could be valuable for debugging streaming issues, and the comment could be valid if users need to debug why TOON parsing is failing. However, adding logging might create noise since exceptions are expected during streaming. The code has a fallback mechanism with `last_successful_data`, which suggests the author thought through the error handling strategy. This is a speculative suggestion about code quality rather than identifying a concrete bug.

This comment should be deleted. The silent exception handling appears intentional for handling partial streaming data, and the comment doesn't demonstrate that this is causing issues or that logging would improve the code.
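The decode-and-fallback pattern described above - decode failures are expected while the buffer is incomplete, with `last_successful_data` as the fallback - can be sketched generically. Here `json.loads` stands in for the TOON decode call, since the actual decoder is library-specific:

```python
import json
from collections.abc import Callable, Iterable
from typing import Any

def stream_decode(chunks: Iterable[str], decode: Callable[[str], Any]) -> Any:
    """Tolerant streaming parse: re-attempt a full decode after each
    chunk, remembering the last result that decoded successfully."""
    buffer = ""
    last_successful_data = None
    for chunk in chunks:
        buffer += chunk
        try:
            last_successful_data = decode(buffer)
        except Exception:
            pass  # expected: the buffer is usually incomplete mid-stream
    try:
        return decode(buffer)
    except Exception:
        return last_successful_data  # fall back to the last good parse

print(stream_decode(['{"a": 1', ', "b": 2}'], json.loads))  # {'a': 1, 'b': 2}
```

With this shape, a truncated or garbage tail still yields the most recent parseable state instead of raising mid-stream.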
3. instructor/providers/openai/utils.py:510
- Draft comment: For type checking in _generate_toon_structure, consider using typing.get_origin and typing.get_args instead of comparing str(origin) with 'typing.Union'.
- Reason this comment was not posted: Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 75% vs. threshold = 85%.

This is a code quality suggestion about using proper typing utilities. The comment is about new code added in the diff (the `_generate_toon_structure` function is entirely new). The suggestion is technically correct - `typing.get_origin` and `typing.get_args` are the proper way to introspect type hints in Python 3.8+, and comparing `str(origin)` is fragile. However, I need to check if this is actionable and clear. The comment does provide a specific alternative approach. The current code uses `str(origin) == "typing.Union"`, which could break with different Python versions or typing implementations. This is a legitimate code quality concern about the new code.

The comment might be considered somewhat obvious to experienced Python developers who work with typing. Also, the current implementation might work fine in practice, so this could be seen as a minor improvement rather than a critical issue, and the comment doesn't explain why the change is needed or what problems the current approach might cause. Still, using `str(origin)` for type comparison is indeed a code smell, and the suggestion to use `typing.get_origin` and `typing.get_args` is a concrete, actionable improvement. This is exactly the kind of code quality refactor that the rules say is good: it is clear about what to change and provides the proper alternative. This aligns with the rule that "Comments that suggest code quality refactors are good! But only if they are actionable and clear."
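For reference, a minimal sketch of the stdlib introspection the draft comment recommends. Note that `X | Y` unions on Python 3.10+ report `types.UnionType` rather than `typing.Union`, which is one more reason string comparison of the origin is fragile:

```python
from typing import Optional, Union, get_args, get_origin

def is_optional(tp: object) -> bool:
    """True for Optional[...] annotations, detected via get_origin/get_args
    instead of comparing str(origin) with "typing.Union"."""
    return get_origin(tp) is Union and type(None) in get_args(tp)

print(get_origin(list[int]))       # <class 'list'>
print(get_args(Optional[str]))     # (<class 'str'>, <class 'NoneType'>)
print(is_optional(Optional[str]))  # True
```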
4. instructor/providers/openai/utils.py:560
- Draft comment: Consider adding a closing triple-backtick in the reask_toon error message for consistent TOON code block formatting.
- Reason this comment was not posted: Confidence changes required: 80% <= threshold 85%
5. instructor/utils/core.py:322
- Draft comment: Clarify the skip_lang_tag logic in extract_code_block_from_stream by adding an inline comment explaining that a newline ends the language tag.
- Reason this comment was not posted: Confidence changes required: 80% <= threshold 85%
6. instructor/processing/function_calls.py:440
- Draft comment: If _extract_toon_from_response is intended for broader use, consider making it public and enhance its documentation.
- Reason this comment was not posted: Confidence changes required: 50% <= threshold 85%
7. instructor/providers/openai/utils.py:579
- Draft comment: Typographical error: In the error message for TOON mode reask, the opening code block marker (```toon) does not have a matching closing marker. Consider updating this to include the proper closing triple backticks to clearly indicate a code block.
- Reason this comment was not posted: Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 85%.

The comment is about line 579, which contains instructional text for the LLM: "Return your corrected response in a ```toon code block." This is not markdown being rendered - it's a plain string that will be sent to the LLM as instructions. The backticks are part of the instruction text itself, telling the LLM what format to use. There's no syntax error here. The comment misunderstands the context - this isn't a markdown code block that needs closing, it's text instructing the LLM to create a code block in its response. This is similar to how you might write "Please format your answer in a **bold** style" - the **bold** is just example text, not actual markdown formatting.

Could the instruction be clearer if it showed both opening and closing markers? Maybe the LLM would better understand if the text included both. However, the current format is consistent with how such instructions are typically written, and there's no actual syntax error in the code. The comment incorrectly identifies intentional instructional text as a "typographical error". This comment should be deleted: the string on line 579 is telling the LLM to format its response in a toon code block - it's not creating a code block itself, so there's no missing closing marker.
can you update docs and specifically figure out which models this supports?
…documentation

- Add enum coercion for TOON parsing (enums are returned as strings from toon-format)
- Support Literal, Union, Optional, and Annotated type annotations
- Add TOON mode documentation to `docs/modes-comparison.md`
- Add unit tests for all supported type annotations

Token savings after testing with various schemas & models: ~17% vs JSON, ~20% vs MD_JSON & ~28% vs TOOLS mode
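Since toon-format hands enum values back as plain strings, a coercion step has to map them onto the annotated Enum before validation. A minimal sketch of that idea follows - the helper name and the recursion into list annotations are illustrative, not the PR's actual implementation:

```python
from enum import Enum
from typing import Any, get_args, get_origin

class Color(Enum):
    RED = "red"
    BLUE = "blue"

def coerce_enums(value: Any, annotation: Any) -> Any:
    """Map decoded strings back onto Enum members, recursing into lists."""
    if get_origin(annotation) is list:
        (item_type,) = get_args(annotation)
        return [coerce_enums(v, item_type) for v in value]
    if isinstance(annotation, type) and issubclass(annotation, Enum):
        return annotation(value)  # e.g. Color("red") -> Color.RED
    return value

print(coerce_enums("red", Color))                  # Color.RED
print(coerce_enums(["red", "blue"], list[Color]))
```

The same dispatch extends naturally to Literal, Union, and Optional by inspecting `get_origin`/`get_args` on each branch.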
@jxnl was doing a bit more experimentation/testing with this last night - there's definitely meaningful savings in output token length compared to TOOLS/JSON/MD_JSON, along with high model support, but there's a bit of internal complexity I had to add in terms of supporting/parsing Enum/Literal/Union types in responses. Hopefully once this PR is merged, it should significantly reduce the complexity I had to add directly to instructor. I've updated the docs with relevant information; however, it's up to you if you'd want to go through with supporting this mode, otherwise I'm happy to close the PR as an experiment for now.
should be fine with this as a mode, since all other models can support this too with the same mode, right? whether it's gemini, claude, or gpt
can you add some runnable scripts in examples/toon/run.py |
Add ANTHROPIC_TOON mode for using TOON format with Claude models:
- Add handle_anthropic_toon and reask_anthropic_toon handlers
- Add parse_anthropic_toon for response parsing
- Register Mode.ANTHROPIC_TOON in mode handlers and reask handlers

Improve TOON structure generation:
- Fix tabular array format for models containing list fields
- Use list format instead of tabular when objects have nested lists

Add example scripts in examples/toon/:
- run.py: Demonstrates OpenAI and Anthropic TOON modes
- Includes token usage comparison between TOON, JSON, MD_JSON, and TOOLS

Supported providers for TOON:
- OpenAI (Mode.TOON)
- OpenRouter, Together, Anyscale, Groq (Mode.TOON)
- Anthropic (Mode.ANTHROPIC_TOON)
Added implementation for Claude models, but it still needs further handling to support gemini/vertex/etc. models directly, the
closed in favor of an updated PR |
Description

Add `Mode.TOON` for TOON (Token-Oriented Object Notation) - a compact data format that achieves 30-60% token reduction compared to JSON while maintaining structured output capabilities. TOON uses a YAML-like syntax that eliminates JSON's redundant braces, brackets, and quotes.

Changes

- `Mode.TOON` enum value classified as a JSON-like mode
- `handle_toon`/`reask_toon` request handlers in OpenAI utils
- `parse_toon` method on `OpenAISchema` for response parsing
- `extract_code_block_from_stream` utilities for streaming
- Handling for when `toon-format` is not installed

Testing

- Partial streaming via `create_partial`

Install with: `pip install 'instructor[toon]'`

Uses the toon-format library: https://github.com/toon-format/toon-python
This PR was written by Cursor
Important

Add `Mode.TOON` for compact structured outputs with new handlers, parsers, and tests.

- `Mode.TOON` for compact token-efficient structured outputs.
- `handle_toon` and `reask_toon` in `utils.py` for TOON request handling.
- `parse_toon` added to `function_calls.py` for TOON response parsing.
- Partial streaming support in `partial.py`.
- `extract_code_block_from_stream` utilities in `core.py`.
- `toon-format` package.
- Tests in `test_toon_mode.py`.
- `toon-format` added to `pyproject.toml` dependencies.

This description was created by Ellipsis for 40a6603. You can customize this summary. It will automatically update as commits are pushed.