docs: add architecture outline (#341)
Signed-off-by: Panos Vagenas <[email protected]>
vagenas authored Nov 15, 2024
1 parent 835e077 commit 25fd149
Showing 7 changed files with 23 additions and 9 deletions.
Binary file added docs/assets/docling_arch.png
Binary file added docs/assets/docling_arch.pptx
Binary file not shown.
19 changes: 19 additions & 0 deletions docs/concepts/architecture.md
@@ -0,0 +1,19 @@
![docling_architecture](../assets/docling_arch.png)

In a nutshell, Docling's architecture is outlined in the diagram above.

For each document format, the *document converter* knows which format-specific *backend* to employ for parsing the document and which *pipeline* to use for orchestrating the execution, along with any relevant *options*.

!!! tip

    While the document converter holds a default mapping, this configuration is parametrizable, so e.g. for the PDF format, different backends and different pipeline options can be used; see [Usage](../usage.md#adjust-pipeline-features).

The *conversion result* contains the [*Docling document*](./docling_document.md), Docling's fundamental document representation.

Some typical scenarios for using a Docling document include directly calling its *export methods*, such as for markdown, dictionary etc., or having it chunked by a *chunker*.

For more details on Docling's architecture, check out the [Docling Technical Report](https://arxiv.org/abs/2408.09869).

!!! note

    The components illustrated with dashed outline indicate base classes that can be subclassed for specialized implementations.
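
The relationship that `architecture.md` describes (a converter that maps each format to a backend, a pipeline, and options, and a result document with export methods) can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyConverter`, `FormatOption`, `ToyDocument`, `plain_backend`, and `simple_pipeline` are hypothetical names invented for illustration and are not Docling's actual classes or API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

# NOTE: every name below is a hypothetical stand-in, not Docling's real API.


@dataclass
class ToyDocument:
    """Stand-in for a unified document representation."""
    text: str

    def export_to_markdown(self) -> str:
        return self.text

    def export_to_dict(self) -> dict:
        return {"text": self.text}


@dataclass
class FormatOption:
    """Bundles the backend, pipeline, and options chosen for one format."""
    backend: Callable[[bytes], str]                # parses raw input
    pipeline: Callable[[str, dict], ToyDocument]   # orchestrates conversion
    options: dict = field(default_factory=dict)


def plain_backend(raw: bytes) -> str:
    """Format-specific parsing step (here: just decode UTF-8)."""
    return raw.decode("utf-8")


def simple_pipeline(parsed: str, options: dict) -> ToyDocument:
    """Builds the document, honoring an assumed 'strip' option."""
    text = parsed.strip() if options.get("strip", True) else parsed
    return ToyDocument(text=text)


class ToyConverter:
    # Default format -> (backend, pipeline, options) mapping.
    DEFAULTS: Dict[str, FormatOption] = {
        "md": FormatOption(plain_backend, simple_pipeline),
    }

    def __init__(self, format_options: Optional[Dict[str, FormatOption]] = None):
        # Caller-supplied entries override the defaults per format.
        self.format_options = {**self.DEFAULTS, **(format_options or {})}

    def convert(self, raw: bytes, fmt: str) -> ToyDocument:
        opt = self.format_options[fmt]
        return opt.pipeline(opt.backend(raw), opt.options)


if __name__ == "__main__":
    doc = ToyConverter().convert(b"  # Title\n", "md")
    print(doc.export_to_markdown())
```

Passing a different `FormatOption` for a format, for example one with `{"strip": False}`, swaps the behavior for that format only, which mirrors the "default mapping, but parametrizable" idea in the tip above.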
4 changes: 1 addition & 3 deletions docs/concepts/index.md
@@ -1,3 +1 @@
-In this area you can find guides on the main Docling concepts.
-
-Use the navigation on the left to browse through them.
+Use the navigation on the left to browse some core Docling concepts.
4 changes: 1 addition & 3 deletions docs/examples/index.md
@@ -1,3 +1 @@
-In this area you can find examples covering a range of possible workflows and use cases.
-
-Use the navigation on the left to browse through them.
+Use the navigation on the left to browse through examples covering a range of possible workflows and use cases.
4 changes: 1 addition & 3 deletions docs/integrations/index.md
@@ -1,3 +1 @@
-In this area you can find guides on the Docling integrations with popular frameworks and tools.
-
-Use the navigation on the left to browse through them.
+Use the navigation on the left to browse through Docling integrations with popular frameworks and tools.
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -58,6 +58,7 @@ nav:
     - Docling v2: v2.md
   - Concepts:
     - Concepts: concepts/index.md
+    - Architecture: concepts/architecture.md
     - Docling Document: concepts/docling_document.md
     # - Chunking: concepts/chunking.md
   - Examples: