Improve get_n_words performance #187
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff             @@
##           master     #187      +/-   ##
==========================================
+ Coverage   96.95%   96.97%   +0.02%
==========================================
  Files          64       64
  Lines        4860     4861       +1
==========================================
+ Hits         4712     4714       +2
+ Misses        148      147       -1
Benchmark Results (Julia v1): time benchmarks, memory benchmarks

Note: Benchmark CI shows only a modest micro-benchmark improvement for …
Reworked `get_n_words` in `src/Utils/GeneralUtils.jl` into a single-pass splitter that skips leading whitespace, tracks the last character boundary to avoid `prevind` overhead, and stops splitting after `n-1` tokens so the nth entry still gathers the remainder. The function now returns only the words found (no trailing `#undef`/empty fields) and returns `String[]` for delimiter-only lines. Added a brief inline comment for the index bookkeeping.

Added coverage in `test/Utils/GeneralUtils.jl` for requests with too few words, trailing delimiters, and delimiter-only input to lock in the new behavior.
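For readers skimming the PR, the single-pass approach described above can be sketched roughly as follows. This is an illustrative reconstruction, not the code in `src/Utils/GeneralUtils.jl`: the name `get_n_words_sketch` is hypothetical, whitespace is assumed to be the only delimiter, and trimming trailing whitespace from the final "remainder" entry is an assumption.

```julia
# Hypothetical sketch of a single-pass splitter; NOT the actual implementation.
# Assumptions: whitespace is the delimiter, and the remainder entry is
# right-trimmed.
function get_n_words_sketch(line::AbstractString, n::Integer)
    words = String[]
    n < 1 && return words
    word_start = 0  # index where the current word begins (0 = not inside a word)
    last_idx = 0    # index of the most recently visited character
    for (i, c) in pairs(line)
        if word_start == 0
            if !isspace(c)
                word_start = i
                # After n-1 words, the nth entry takes the rest of the line.
                if length(words) == n - 1
                    push!(words, String(rstrip(SubString(line, i))))
                    return words
                end
            end
        elseif isspace(c)
            # last_idx is already a valid character boundary, so no
            # prevind call is needed to find the end of the word.
            push!(words, String(SubString(line, word_start, last_idx)))
            word_start = 0
        end
        last_idx = i
    end
    # Flush a word that runs to the end of the line.
    word_start != 0 && push!(words, String(SubString(line, word_start, last_idx)))
    return words
end
```

Under these assumptions, `get_n_words_sketch("  alpha beta gamma delta ", 3)` gives `["alpha", "beta", "gamma delta"]`, `get_n_words_sketch("one two", 3)` gives `["one", "two"]`, and a whitespace-only line gives `String[]`.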
Performance (BenchmarkTools):
Baseline (saved before changes) vs. current, measured with `@benchmark get_n_words(line,3)` and `judge(new, base)`.

Profiled 200k calls with `Profile.@profile`; remaining hotspots are the unavoidable string slicing/allocations, confirming that the loop/`prevind` overhead was removed.