fix(code): pin tree-sitter-language-pack <1.0.0 in [code] extra #1219
TUTU244 wants to merge 3 commits into
…roomlabs-ai#1216)

tree-sitter-language-pack >=1.0.0 removed the bundled tree-sitter package and introduced an incompatible node API (.kind instead of .type, callable root_node(), parse_bytes() vs parse(), etc.). The existing Python code in code_compressor.py relies on the 0.x API throughout, so a fresh install of headroom-ai[code] silently fails: is_tree_sitter_available() returns True (tree_sitter_language_pack imports fine), but every parse falls back to passthrough because `from tree_sitter import Parser` fails with ImportError when tree-sitter-language-pack >=1.x is installed.

Fix:
- Pin `tree-sitter-language-pack>=0.10.0,<1.0.0` in pyproject.toml so that pip resolves to 0.13.0 (latest 0.x), which declares tree-sitter>=0.25.2 as a dependency and keeps the existing node-walk code working.
- Also add `from tree_sitter import Parser` to `_check_tree_sitter_available()` so that environments where only the 1.x language-pack is present receive a clear "not available" signal (and log the install hint) instead of silently passing content through unchanged.

Confirmed working: Python 3.14 / tree-sitter 0.25.2 / language-pack 0.13.0. 52% token reduction on a 593-token Python module in manual test.

The pre-existing test_parser_usable_in_thread_pool failure is unrelated to this change (it fails on main before and after the patch).

Closes headroomlabs-ai#1216
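The stricter availability check described in the second bullet can be sketched roughly as follows. This is a minimal illustration, not the actual contents of code_compressor.py: the function name `check_tree_sitter_available_sketch`, the logger, and the wording of the install hint are assumptions made for the example.

```python
import logging

logger = logging.getLogger(__name__)


def check_tree_sitter_available_sketch() -> bool:
    """Report tree-sitter as available only if BOTH imports succeed.

    Hypothetical stand-in for `_check_tree_sitter_available()`; the real
    function in code_compressor.py may be structured differently.
    """
    try:
        # Imports fine on both 0.x and 1.x of the language pack, so on its
        # own it is not a sufficient signal.
        import tree_sitter_language_pack  # noqa: F401

        # Per the commit message, this import fails with ImportError when
        # only language-pack >=1.x is installed.
        from tree_sitter import Parser  # noqa: F401
    except ImportError:
        # Assumed hint text; the project's actual log message may differ.
        logger.warning(
            "tree-sitter is not usable; try reinstalling the "
            "'headroom-ai[code]' extra"
        )
        return False
    return True
```

With a check like this, a caller can decide up front to skip compression and report why, rather than discovering the missing `Parser` on every parse and silently passing content through.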
PR governance
This PR does not yet satisfy the required template fields:
Please update the PR body, or move the PR back to draft while it is still in progress.
JerrettDavis left a comment
The code change is the right direction: pinning tree-sitter-language-pack below the incompatible 1.x API and checking that tree_sitter.Parser is importable prevents the code compressor from silently accepting an unusable tree-sitter install.
Before merge, please refresh the branch because it is currently merge-conflicted (DIRTY), and please replace the empty PR template with the real summary/testing details from the commit message. The implementation itself looks good from my side once those housekeeping issues are fixed.
Maintainer refresh: merged current main and resolved the …
JerrettDavis left a comment
Re-reviewed after the tree-sitter pin refresh. I pushed one test-only follow-up (d2d15f0f) so the thread-safety regression test passes bytes to Parser.parse(), matching the API the production compressor already uses.
Local validation:
- `uv run --extra code --with pytest --with pytest-asyncio python -m pytest tests/test_transforms/test_code_compressor.py tests/test_transforms/test_tree_sitter_thread_safety.py tests/test_code_compressor_thread_safety.py -q` (77 passed)
- `uv run ruff check headroom/transforms/code_compressor.py tests/test_transforms/test_tree_sitter_thread_safety.py pyproject.toml`
Approval stands.
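For readers following the test-only fix mentioned in the re-review, here is a rough sketch of what "passing bytes to Parser.parse()" looks like with the 0.x language pack. The helper names `as_source_bytes` and `parse_python_root_type` are hypothetical and do not come from the PR; the sketch assumes tree-sitter-language-pack 0.x, where `get_parser()` returns a `tree_sitter.Parser` and nodes expose `.type`.

```python
def as_source_bytes(source):
    """Normalize str or bytes input to the bytes that Parser.parse() expects."""
    if isinstance(source, bytes):
        return source
    return source.encode("utf-8")


def parse_python_root_type(source):
    """Parse Python source and return the root node's type name.

    Assumes the 0.x language pack is installed; the import is kept local so
    the module still loads without it.
    """
    from tree_sitter_language_pack import get_parser

    parser = get_parser("python")
    # Encode first: the parser is handed bytes, never a str.
    tree = parser.parse(as_source_bytes(source))
    return tree.root_node.type
```

Normalizing in one place keeps tests and production code on the same input type, which is the mismatch the follow-up commit addressed.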