Parse PDFs, Word docs, HTML, images, and other file formats with Unstructured and index the extracted content directly into Moss for semantic search. The cookbook walks through a full ingestion pipeline, from raw files to a queryable Moss index, covering document chunking, metadata preservation, and incremental upserts.
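To make the chunking, metadata, and incremental-upsert steps concrete, here is a minimal standard-library sketch. It deliberately calls neither Unstructured nor Moss: the `Chunk` class, the `chunk_elements` helper, and the `InMemoryIndex` stand-in are all hypothetical names invented for illustration, and the list of strings stands in for whatever text elements a parser would return. The one idea it shows is deriving each chunk ID from the source path plus the chunk text, so that re-ingesting an unchanged file writes nothing new.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical chunk record: a stable ID, the text, and source metadata."""
    id: str
    text: str
    metadata: dict


def chunk_elements(elements, source, max_chars=500):
    """Greedily pack text elements into chunks of at most max_chars.

    A single element longer than max_chars is kept whole as its own chunk.
    """
    chunks, buf = [], []

    def flush():
        if not buf:
            return
        text = "\n".join(buf)
        # Assumption: the ID is a hash of source + text, so identical
        # content from the same file always maps to the same ID.
        digest = hashlib.sha256(f"{source}\x00{text}".encode("utf-8")).hexdigest()[:16]
        chunks.append(Chunk(digest, text, {"source": source, "position": len(chunks)}))
        buf.clear()

    for el in elements:
        # Joined length would be: existing text + one newline per join + new element.
        if buf and sum(len(b) for b in buf) + len(buf) + len(el) > max_chars:
            flush()
        buf.append(el)
    flush()
    return chunks


class InMemoryIndex:
    """Stand-in for a real index client; NOT the Moss API."""

    def __init__(self):
        self.docs = {}

    def upsert(self, chunks):
        """Insert or overwrite chunks by ID; return how many IDs were new."""
        new = 0
        for c in chunks:
            if c.id not in self.docs:
                new += 1
            self.docs[c.id] = c
        return new


index = InMemoryIndex()
elements = ["Intro paragraph.", "Second paragraph."]
first = index.upsert(chunk_elements(elements, "report.pdf", max_chars=20))
second = index.upsert(chunk_elements(elements, "report.pdf", max_chars=20))
print(first, second)  # 2 0
```

Note that content-derived IDs only avoid duplicate writes. When a file changes, the chunks from its old version stay in the index until they are deleted explicitly, for example by filtering on the `source` metadata field.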