diff --git a/docs/assets/docling_arch.png b/docs/assets/docling_arch.png
new file mode 100644
index 00000000..cecda04b
Binary files /dev/null and b/docs/assets/docling_arch.png differ
diff --git a/docs/assets/docling_arch.pptx b/docs/assets/docling_arch.pptx
new file mode 100644
index 00000000..19c22172
Binary files /dev/null and b/docs/assets/docling_arch.pptx differ
diff --git a/docs/concepts/architecture.md b/docs/concepts/architecture.md
new file mode 100644
index 00000000..07aa1b30
--- /dev/null
+++ b/docs/concepts/architecture.md
@@ -0,0 +1,19 @@
+![docling_architecture](../assets/docling_arch.png)
+
+In a nutshell, Docling's architecture is outlined in the diagram above.
+
+For each document format, the *document converter* knows which format-specific *backend* to employ for parsing the document and which *pipeline* to use for orchestrating the execution, along with any relevant *options*.
+
+!!! tip
+
+    While the document converter holds a default mapping, this configuration is parametrizable, so e.g. for the PDF format, different backends and different pipeline options can be used — see [Usage](../usage.md#adjust-pipeline-features).
+
+The *conversion result* contains the [*Docling document*](./docling_document.md), Docling's fundamental document representation.
+
+Some typical scenarios for using a Docling document include directly calling its *export methods*, such as for markdown, dictionary etc., or having it chunked by a *chunker*.
+
+For more details on Docling's architecture, check out the [Docling Technical Report](https://arxiv.org/abs/2408.09869).
+
+!!! note
+
+    The components illustrated with dashed outline indicate base classes that can be subclassed for specialized implementations.
diff --git a/docs/concepts/index.md b/docs/concepts/index.md
index a29db1af..54f24b64 100644
--- a/docs/concepts/index.md
+++ b/docs/concepts/index.md
@@ -1,3 +1 @@
-In this area you can find guides on the main Docling concepts.
-
-Use the navigation on the left to browse through them.
+Use the navigation on the left to browse some core Docling concepts.
diff --git a/docs/examples/index.md b/docs/examples/index.md
index 5c2d3acd..a0934920 100644
--- a/docs/examples/index.md
+++ b/docs/examples/index.md
@@ -1,3 +1 @@
-In this area you can find examples covering a range of possible workflows and use cases.
-
-Use the navigation on the left to browse through them.
+Use the navigation on the left to browse through examples covering a range of possible workflows and use cases.
diff --git a/docs/integrations/index.md b/docs/integrations/index.md
index c09c917d..3539c2f6 100644
--- a/docs/integrations/index.md
+++ b/docs/integrations/index.md
@@ -1,3 +1 @@
-In this area you can find guides on the Docling integrations with popular frameworks and tools.
-
-Use the navigation on the left to browse through them.
+Use the navigation on the left to browse through Docling integrations with popular frameworks and tools.
diff --git a/mkdocs.yml b/mkdocs.yml
index 73361337..70f75ff0 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -58,6 +58,7 @@ nav:
     - Docling v2: v2.md
   - Concepts:
     - Concepts: concepts/index.md
+    - Architecture: concepts/architecture.md
     - Docling Document: concepts/docling_document.md
     # - Chunking: concepts/chunking.md
   - Examples: