We're trying to compress tool outputs that can be 500KB+ (large git diffs, test suite output). Loading the entire content into memory, running all 14 stages, then returning the result causes noticeable latency spikes.
Would it be feasible to add a streaming API that processes content in chunks? Something like:
```python
engine = FusionEngine()
for chunk in engine.compress_stream(large_input, chunk_size=8192):
    yield chunk
```
I understand some stages (SemanticDedup, Ionizer) need full context, but stages like RLE, TokenOpt, and Abbrev could work incrementally.
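To make the idea concrete, here is a minimal sketch of what a chunked pipeline for the incremental stages could look like. This is not FusionEngine's API; `compress_stream`, `rle_stage`, and `abbrev_stage` are hypothetical stand-ins that only illustrate applying per-chunk-safe stages while streaming:

```python
import re
from typing import Callable, Iterator, List

# Hypothetical stand-in for an RLE-style stage: collapse runs of 4+
# identical characters into "char{count}" notation.
def rle_stage(text: str) -> str:
    return re.sub(r"(.)\1{3,}", lambda m: f"{m.group(1)}{{{len(m.group(0))}}}", text)

# Hypothetical stand-in for an Abbrev-style stage with a toy substitution table.
def abbrev_stage(text: str) -> str:
    for long_form, short_form in {"function": "fn", "return": "ret"}.items():
        text = text.replace(long_form, short_form)
    return text

def compress_stream(source: str,
                    stages: List[Callable[[str], str]],
                    chunk_size: int = 8192) -> Iterator[str]:
    """Yield compressed chunks, applying only stages that are safe per-chunk."""
    for start in range(0, len(source), chunk_size):
        chunk = source[start:start + chunk_size]
        for stage in stages:
            chunk = stage(chunk)
        yield chunk
```

One caveat this sketch ignores: a run or token split across a chunk boundary won't be compressed, so a real implementation would likely need to carry a small amount of state (e.g. the tail of the previous chunk) between iterations, while stages needing full context would still require a buffered fallback path.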