ReshiAdavan/Scout

scout

A high-performance, recursive multi-keyword search tool that uses Aho-Corasick matching, file-change caching, and an inverted index to return fast, context-rich results across large text datasets and filesystems.
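For readers new to Aho-Corasick, the sketch below shows the general technique: build a trie of the keywords, add failure links with a breadth-first pass, then scan the text once and report every keyword occurrence. It is a minimal illustration, not Scout's engine; the `AhoCorasick` struct, its members, and the `search` signature are hypothetical names chosen for this example.

```cpp
#include <array>
#include <cstddef>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch; names are not taken from Scout's core/ sources.
struct AhoCorasick {
    struct Node {
        std::array<int, 256> next;  // transition per byte value, -1 = missing
        int fail = 0;               // failure link (longest proper suffix state)
        std::vector<int> out;       // indices of keywords that end at this state
        Node() { next.fill(-1); }
    };

    std::vector<Node> nodes;
    std::vector<std::string> keywords;

    explicit AhoCorasick(std::vector<std::string> kws)
        : nodes(1), keywords(std::move(kws)) {
        // Step 1: insert every non-empty keyword into the trie.
        for (int id = 0; id < static_cast<int>(keywords.size()); ++id) {
            if (keywords[id].empty()) continue;
            int cur = 0;
            for (unsigned char c : keywords[id]) {
                if (nodes[cur].next[c] == -1) {
                    nodes[cur].next[c] = static_cast<int>(nodes.size());
                    nodes.emplace_back();
                }
                cur = nodes[cur].next[c];
            }
            nodes[cur].out.push_back(id);
        }
        // Step 2: breadth-first pass that sets failure links and fills in
        // missing transitions, so the scan never has to backtrack.
        std::queue<int> bfs;
        for (int c = 0; c < 256; ++c) {
            int child = nodes[0].next[c];
            if (child == -1) nodes[0].next[c] = 0;
            else bfs.push(child);  // depth-1 states fail to the root
        }
        while (!bfs.empty()) {
            int u = bfs.front();
            bfs.pop();
            // Inherit matches from the failure target (already complete,
            // because it is shallower and was processed earlier).
            const std::vector<int>& inherited = nodes[nodes[u].fail].out;
            nodes[u].out.insert(nodes[u].out.end(), inherited.begin(), inherited.end());
            for (int c = 0; c < 256; ++c) {
                int v = nodes[u].next[c];
                int viaFail = nodes[nodes[u].fail].next[c];
                if (v == -1) nodes[u].next[c] = viaFail;
                else { nodes[v].fail = viaFail; bfs.push(v); }
            }
        }
    }

    // Returns (start offset, keyword index) for every occurrence, overlaps included.
    std::vector<std::pair<std::size_t, int>> search(const std::string& text) const {
        std::vector<std::pair<std::size_t, int>> hits;
        int state = 0;
        for (std::size_t i = 0; i < text.size(); ++i) {
            state = nodes[state].next[static_cast<unsigned char>(text[i])];
            for (int id : nodes[state].out)
                hits.emplace_back(i + 1 - keywords[id].size(), id);
        }
        return hits;
    }
};
```

Searching "ushers" for the keywords he, she, and hers reports three hits from one pass over the text. Scan time depends on the text length plus the number of matches, not on how many keywords are loaded, which is why the approach suits multi-keyword search.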

setup

  • clone the repository
  • populate keywords.txt with the keywords you want to search for
  • configure .scoutignore with the files you want to exclude from the search
  • build and run mfind with
    • clang++ -std=c++17 -pthread mfind.cpp core/fileCache.cpp helpers/ignoreRules.cpp -I. -Icore -Ihelpers -Iqueues -Iwalkers -Iworkers -o mfind
    • ./mfind .

todos

multifind

  • aho-corasick
  • case-insensitive match
  • grouped color-coded output
  • longest match / all match priority toggle
  • keyword tagging or ID support
  • match stats per file/keyword
  • export results (e.g. JSON, CSV)
  • watch mode (live file changes)

general

  • chunking
  • recursive find in directory
  • parallelization
    • file-level threading
    • chunk-level threading
  • extension filters (e.g., .txt, .cpp)
  • ignore rules support (.gitignore)
  • cache file hashes to skip unchanged files
  • in-memory index for very large codebases
  • LSP-compatible output formatting
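The "cache file hashes to skip unchanged files" item can be pictured roughly as follows. This is only a sketch under assumptions: it hashes file contents with 64-bit FNV-1a and keeps the last seen hash per path in memory. The names `fnv1a` and `HashCache` are hypothetical, and Scout's actual cache in core/fileCache.cpp may use a different hash, modification times, or on-disk storage.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// 64-bit FNV-1a over the raw bytes of a buffer.
inline std::uint64_t fnv1a(const std::string& data) {
    std::uint64_t hash = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char byte : data) {
        hash ^= byte;
        hash *= 1099511628211ULL;                  // FNV prime
    }
    return hash;
}

// Hypothetical in-memory cache: remembers the last content hash per path.
class HashCache {
public:
    // Returns true when the file is new or its contents changed since the
    // previous call, and records the new hash either way.
    bool needsScan(const std::string& path, const std::string& contents) {
        const std::uint64_t current = fnv1a(contents);
        auto it = seen_.find(path);
        if (it != seen_.end() && it->second == current) return false;
        seen_[path] = current;
        return true;
    }

private:
    std::unordered_map<std::string, std::uint64_t> seen_;
};
```

A caller would read each file, ask `needsScan`, and run the keyword search only when it returns true. Hashing contents still costs a full read of the file, so a real implementation might check size or modification time first.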

structure

  • walkers/ - Spawns walker threads, traverses dirs, pushes files
  • queues/ - Holds concurrent fileQueue + matchQueue impls
  • workers/ - Thread pool to pop files, run searchFile() using Aho-Corasick
  • core/ - Aho-Corasick engine + file caching + match inverted index
  • helpers/ - Format/context/extension utils
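The walker-to-worker handoff described above follows a common producer/consumer pattern, sketched here under assumptions. `BlockingQueue` and `drainWithWorkers` are hypothetical names, not Scout's fileQueue or worker pool, and the `handle` callback stands in for whatever per-file search the workers run.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <optional>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical thread-safe queue: producers push, consumers block in pop()
// until an item arrives or the queue is closed and drained.
template <typename T>
class BlockingQueue {
public:
    void push(T item) {
        {
            std::lock_guard<std::mutex> lock(mu_);
            items_.push(std::move(item));
        }
        cv_.notify_one();
    }

    // Signals that no more items will be pushed.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mu_);
            closed_ = true;
        }
        cv_.notify_all();
    }

    // Returns the next item, or std::nullopt once closed and empty.
    std::optional<T> pop() {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return closed_ || !items_.empty(); });
        if (items_.empty()) return std::nullopt;
        T item = std::move(items_.front());
        items_.pop();
        return item;
    }

private:
    std::mutex mu_;
    std::condition_variable cv_;
    std::queue<T> items_;
    bool closed_ = false;
};

// Starts workerCount threads that pop paths and call handle on each one.
// handle must be safe to call from several threads at once.
template <typename Fn>
std::size_t drainWithWorkers(BlockingQueue<std::string>& files,
                             unsigned workerCount, Fn handle) {
    std::atomic<std::size_t> processed{0};
    std::vector<std::thread> pool;
    for (unsigned i = 0; i < workerCount; ++i) {
        pool.emplace_back([&] {
            while (auto path = files.pop()) {
                handle(*path);
                ++processed;
            }
        });
    }
    for (auto& t : pool) t.join();
    return processed.load();
}
```

In this sketch a walker thread would call `push` for each file it discovers and `close` when traversal finishes, so the workers exit cleanly once the queue is drained. A second queue of the same shape could carry matches from the workers to the output stage.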
