⚡️ Speed up function get_source_code_files by 359%
#31
📄 359% (3.59x) speedup for `get_source_code_files` in `cognee/tasks/repo_processor/get_repo_file_dependencies.py`
⏱️ Runtime: 275 milliseconds → 60.0 milliseconds (best of 67 runs)
📝 Explanation and details
The optimized code achieves a 358% speedup (275ms → 60ms) and 148% throughput improvement (3645 → 9045 ops/sec) through two key optimizations:
1. Pre-computed Extension-to-Language Lookup Map
- A direct `ext_to_lang[_ext]` dictionary lookup replaces the `_get_language_from_extension()` function calls that were consuming 5.7% of total runtime
2. Directory-Level Exclusion Filtering
- The exclusion checks (`EXCLUDED_DIRS & root_parts` and `excluded_paths` matching) now run outside the file loop
- Fewer `Path.resolve()` calls: `Path(root).resolve()` was previously called for every single file (66.1% of runtime) and is now called only once per directory
Performance Impact by Test Type:
The optimizations are particularly effective for repositories with many files per directory and multiple excluded directories, which are common in real-world codebases with build artifacts, dependencies, and test directories.
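The two optimizations described above can be sketched roughly as follows. This is an illustrative sketch, not the code from this PR: the function name `collect_source_files`, the contents of `EXCLUDED_DIRS` and `EXT_TO_LANG`, and the pruning of `os.walk` are assumptions made for the example. Only the general pattern (a pre-built extension map, plus per-directory resolve and exclusion checks) comes from the description.

```python
import os
from pathlib import Path

# Hypothetical values for illustration; the real sets and mappings live in the PR's module.
EXCLUDED_DIRS = {"node_modules", ".venv", "build"}
EXT_TO_LANG = {".py": "python", ".js": "javascript", ".ts": "typescript"}


def collect_source_files(repo_path, excluded_paths=()):
    """Return (path, language) pairs, doing per-directory work only once."""
    base = Path(repo_path).resolve()
    excluded = [Path(p).resolve() for p in excluded_paths]
    results = []
    for root, dirs, files in os.walk(repo_path):
        root_path = Path(root).resolve()  # resolved once per directory, not per file
        root_parts = set(root_path.relative_to(base).parts)
        # Directory-level exclusion: decided before any file is inspected.
        if EXCLUDED_DIRS & root_parts or any(
            root_path == ex or ex in root_path.parents for ex in excluded
        ):
            dirs[:] = []  # assumption: prune the walk so excluded subtrees are skipped
            continue
        for name in files:
            ext = os.path.splitext(name)[1].lower()
            lang = EXT_TO_LANG.get(ext)  # dict lookup instead of a function call
            if lang is not None:
                results.append((str(root_path / name), lang))
    return results
```

With this layout, the cost of `Path.resolve()` and the exclusion checks grows with the number of directories rather than the number of files, which is consistent with the gains reported for repositories that have many files per directory.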
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, `git checkout codeflash/optimize-get_source_code_files-mh11film` and push.