Skip to content

Conversation

@KyleKing
Copy link
Owner

No description provided.

Break down planned features into 9 phases:
- Phase 1: Foundation (tests & caching)
- Phase 2: Performance benchmarking
- Phase 3: Basic filtering
- Phase 4: Search & highlighting (REVIEW)
- Phase 5: Statistics & metrics
- Phase 6: Timestamp formatting (REVIEW)
- Phase 7: Context lines
- Phase 8: Themes & completions (REVIEW)
- Phase 9: Documentation & CI polish

Review phases (4, 6, 8) need decisions before implementation.
All features from original plan included.
Addresses items 3.4 and 2.2 from the implementation plan:

**3.4 Complete Planned Tasks - Dotted Key Parsing Tests:**
- Add comprehensive parameterized tests for dotted key edge cases
- Test keys with dots, special chars, nested paths
- Test missing keys, partial paths, non-dict intermediates
- Test empty keys, None values, and type conversions
- Test pop_key fallback chain and list value handling
- All tests passing with >80% coverage (96% achieved)

**2.2 Cache Dotted Notation Pattern Parsing:**
- Add _dotted_keys cache to Keys dataclass
- Pre-filter keys containing '.' in __post_init__
- Add get_dotted_keys() method for performance optimization
- Update core.py to use cached dotted keys
- Eliminate per-line '.' checking overhead
- Add test_keys_dotted_cache to verify caching behavior

**Performance improvements:**
- Reduced per-line overhead by pre-filtering dotted keys
- Eliminated string check ('.' in key) on every log line
- Cached keys computed once at Config initialization

**Test coverage:**
- 15 new tests for dotted key parsing edge cases
- 1 new test for caching functionality
- 30 total tests passing
- 96% code coverage (exceeds >80% target)

Resolves TODO in config.py:22
**Phase 3: Filtering (Stern-Aligned)**
- Replace generic --filter with stern's --include/-i (regex whitelist)
- Add --exclude/-e for regex blacklist filtering
- Add --field-selector for field-based filtering with glob support
- Support dotted keys in field selectors (e.g., log.level=debug)
- Support standard extracted fields (timestamp, level, message)
- Pre-compile regex patterns for performance
- Case-insensitive regex by default with toggle
- Apply filters to formatted output for simplicity

**Phase 4: Highlighting (Stern-Aligned)**
- Simplify to focus purely on highlighting (no filtering)
- Use --highlight/-H with stern's behavior
- Support multiple patterns with different colors
- Case-insensitive by default with toggle
- Integrate with Rich's Text styling
- Filter logic moved to Phase 3

**Key Changes:**
- Clear separation: Phase 3 filters, Phase 4 highlights
- Align CLI flags with stern for familiarity
- Both phases work together seamlessly
- No new external dependencies

**Stern Compatibility:**
Similar flags and behavior to stern's:
- -i/--include (whitelist filtering)
- -e/--exclude (blacklist filtering)
- -H/--highlight (visual emphasis)
- --field-selector (field-based filtering)

**Organized phases directory:**
- Move all phase files to phases/ subdirectory
- Remove old phase files from root
Replace whitelist/blacklist with allowlist/blocklist throughout
Phase 3 and 4 for more inclusive terminology.

Changes:
- Phase 3: Update all references to use allowlist/blocklist
- Phase 4: Update all references to use allowlist/blocklist
- Comments and documentation updated
- No functional changes, terminology only
**Phase 6: Timestamp Formatting**
Review decisions applied:
- ✅ Use arrow for parsing and formatting (Option C)
- ✅ Use arrow.humanize() for relative times
- ✅ Allow user to configure timestamp field name
- ✅ Extend existing logic for dotted key timestamp detection
- ✅ CLI argument for timezone conversion (default to computer TZ)
- ✅ Make timestamp formatting opt-in via CLI flag
- Status: REVIEWED

**Phase 8: Themes & Shell Completions**
Review decisions applied:
- ✅ TOML files in package for themes (Option B)
- ✅ Extend existing config pattern
- ✅ Include: Dark (default), Solarized Dark/Light, Catppuccin Dark/Light, None
- ✅ Don't over-extend current functionality unless critical
- ✅ Use invoke's built-in completions (Option A)
- ✅ Fallback to usage if invoke is insufficient
- ✅ Support bash and zsh (required), fish/PowerShell optional
- ✅ Use --completions flag like stern
- Status: REVIEWED

Both phases now have clear implementation guidance based on
review feedback from commit fb9f70a.
**Stern-Aligned Filtering Implementation:**

Added filtering capabilities aligned with stern's behavior:
- `--include` / `-i`: Regex allowlist filtering
- `--exclude` / `-e`: Regex blocklist filtering
- `--field-selector`: Field-based filtering with glob support and dotted key support
- `--case-insensitive`: Case-insensitive regex matching

**Implementation Details:**

1. **Config Extensions** (tail_jsonl/config.py):
   - Added filter fields: include_pattern, exclude_pattern, field_selectors, case_insensitive
   - Pre-compile regex patterns in __post_init__ for performance
   - Cached compiled patterns: _include_re, _exclude_re

2. **Filters Module** (tail_jsonl/_private/filters.py):
   - New should_include_record() function with AND logic
   - Field selector evaluation before formatting (cheaper)
   - Regex pattern evaluation on formatted output
   - Dotted key support using dotted library
   - Helper _get_field_value() for field extraction

3. **Core Integration** (tail_jsonl/_private/core.py):
   - Capture formatted output before printing
   - Apply filters using should_include_record()
   - Only print records that pass all filters
   - Local import to avoid circular dependency

4. **CLI Flags** (tail_jsonl/scripts.py):
   - Added --include/-i for allowlist filtering
   - Added --exclude/-e for blocklist filtering
   - Added --field-selector (repeatable) for field-based filtering
   - Added --case-insensitive flag
   - Parse field selectors from KEY=PATTERN format
   - CLI flags override config file settings

**Testing:**

- 59 comprehensive tests in test_filters.py
- Parameterized tests covering:
  - Include/exclude regex filtering
  - Field selectors (simple, dotted keys, standard fields)
  - Multiple field selectors with AND logic
  - Case sensitivity
  - Edge cases (empty patterns, invalid regex, missing keys)
- All tests passing (89 total)
- 93% code coverage (exceeds >80% target)
- filters.py has 100% coverage

**Performance Optimizations:**

- Pre-compile regex patterns at config initialization
- Cache compiled patterns
- Evaluate field selectors before formatting (O(k) where k = selector count)
- No buffering required (streaming friendly)

**Acceptance Criteria Met:**

✅ --include / -i filters to lines matching regex
✅ --exclude / -e filters out lines matching regex
✅ --field-selector filters by field key=value with glob support
✅ Multiple field selectors work with AND logic
✅ Dotted keys work in field selectors (e.g., log.level=debug)
✅ Standard extracted fields work (timestamp, level, message)
✅ Case-insensitive option works for regex
✅ Invalid patterns handled gracefully with clear errors
✅ Comprehensive unit tests with parameterized cases
✅ Edge cases handled gracefully
✅ >80% test coverage achieved (93%)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants