Optimize go-to-line using buffer / index #3

@Cameron7195

Description

Summary
Speed up navigation to arbitrary lines (e.g., line 2,000,000) in large JSONL files via efficient file buffering / indexing.

Problem
“Go to line” latency grows with the target line number, which suggests a naive line-by-line traversal from the start of the file. This makes navigating very large JSONL datasets (hundreds of thousands to millions of lines) painful.

Proposed Solution

  • Implement a more efficient strategy, e.g.:
    • Build and cache a sparse line index (byte offsets for every Nth line).
    • Use file pointers / streams to seek near the desired line, then scan only locally.
  • Ensure the index is built lazily and incrementally so initial open is still fast.
  • Consider memory usage trade-offs for gigantic files.
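The sparse-index idea above can be sketched as follows. This is a minimal illustration, not the project's actual code: the class name, the `stride` parameter, and the lazy `_extend_to` helper are all hypothetical. It records the byte offset of every Nth line, extends the index incrementally only when a jump lands past what has been scanned, and then seeks to the nearest recorded offset and scans at most N-1 lines locally.

```python
class SparseLineIndex:
    """Sparse line index for a large line-oriented (e.g. JSONL) file.

    offsets[k] holds the byte offset of line k * stride (0-based).
    The index is built lazily: nothing is scanned until a jump
    requires it, and later jumps resume where scanning stopped.
    All names here are illustrative, not from the real codebase.
    """

    def __init__(self, path, stride=1000):
        self.path = path
        self.stride = stride
        self.offsets = [0]        # byte offset of line 0, stride, 2*stride, ...
        self._scanned_to = 0      # byte position reached by scanning so far
        self._lines_seen = 0      # number of lines fully scanned so far

    def _extend_to(self, line_no):
        """Incrementally scan forward until line_no has been indexed."""
        with open(self.path, "rb") as f:
            f.seek(self._scanned_to)
            while self._lines_seen <= line_no:
                if not f.readline():
                    break  # EOF: fewer lines than requested
                self._lines_seen += 1
                self._scanned_to = f.tell()
                if self._lines_seen % self.stride == 0:
                    self.offsets.append(self._scanned_to)

    def seek_line(self, f, line_no):
        """Position binary file object f at the start of line_no (0-based)."""
        if line_no // self.stride >= len(self.offsets):
            self._extend_to(line_no)
        # Jump to the nearest indexed line at or before the target,
        # then scan only the remaining (< stride) lines.
        anchor = min(line_no // self.stride, len(self.offsets) - 1)
        f.seek(self.offsets[anchor])
        for _ in range(line_no - anchor * self.stride):
            if not f.readline():
                break
        return f.tell()
```

With this shape, memory cost is one integer per `stride` lines (about 8 MB of offsets for a 1-billion-line file at `stride=1000`), and each jump after the index is warm touches at most `stride - 1` lines.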

Acceptance Criteria

  • Measurably reduced latency for:
    • Jumping to line 5,000, 50,000, and 500,000+ in a large JSONL test file.
  • Go-to-line operation feels near-constant time for common dataset sizes used in practice.
  • No excessive memory usage or blocking UI while building/using the index.
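To make "measurably reduced latency" concrete, a baseline can be timed like this. The sketch below only benchmarks the naive scan-from-top approach (the behavior the issue suspects), using a throwaway file smaller than the 500,000+ lines named above so it runs quickly; `naive_goto` is an illustrative helper, not an existing function in the codebase.

```python
import json, os, tempfile, time

def naive_goto(path, line_no):
    """Baseline: scan from the top of the file, one line at a time,
    and return the raw bytes of line_no (0-based)."""
    with open(path, "rb") as f:
        for _ in range(line_no):
            f.readline()
        return f.readline()

# Build a throwaway JSONL file for timing.
path = tempfile.mkstemp(suffix=".jsonl")[1]
with open(path, "w") as f:
    for i in range(200_000):
        f.write(json.dumps({"i": i}) + "\n")

# Latency should grow roughly linearly with the target line,
# which is exactly what an index-based approach should flatten.
for target in (5_000, 50_000, 199_999):
    t0 = time.perf_counter()
    naive_goto(path, target)
    elapsed = time.perf_counter() - t0
    print(f"goto line {target:>7}: {elapsed * 1000:.2f} ms")

os.remove(path)
```

Running the same targets against an indexed implementation, and comparing the two sets of timings, would give a direct pass/fail signal for the first acceptance criterion.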

Metadata

Assignees: No one assigned
Labels: No labels
Type: No type
Projects: No projects
Milestone: No milestone
Relationships: None yet
Development: No branches or pull requests