docs: add architecture outline (#341)
Signed-off-by: Panos Vagenas <[email protected]>
Showing 7 changed files with 23 additions and 9 deletions.
@@ -0,0 +1,19 @@
+![docling_architecture](../assets/docling_arch.png)
+
+In a nutshell, Docling's architecture is outlined in the diagram above.
+
+For each document format, the *document converter* knows which format-specific *backend* to employ for parsing the document and which *pipeline* to use for orchestrating the execution, along with any relevant *options*.
+
+!!! tip
+
+    While the document converter holds a default mapping, this configuration is parametrizable, so e.g. for the PDF format, different backends and different pipeline options can be used — see [Usage](../usage.md#adjust-pipeline-features).
+
+The *conversion result* contains the [*Docling document*](./docling_document.md), Docling's fundamental document representation.
+
+Some typical scenarios for using a Docling document include directly calling its *export methods*, such as for markdown, dictionary etc., or having it chunked by a *chunker*.
+
+For more details on Docling's architecture, check out the [Docling Technical Report](https://arxiv.org/abs/2408.09869).
+
+!!! note
+
+    The components illustrated with dashed outline indicate base classes that can be subclassed for specialized implementations.
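
The flow described in the new architecture page (a converter that maps each format to a backend, a pipeline, and options, with defaults that callers can override) can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyConverter`, `FormatOption`, `ConversionResult`, `plain_backend`, `simple_pipeline`, and the `title_prefix` option are hypothetical names invented for illustration and are not Docling's actual classes or signatures.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class FormatOption:
    """Hypothetical bundle: the backend, pipeline, and options for one format."""
    backend: Callable[[str], str]          # parses the raw input
    pipeline: Callable[[str, dict], str]   # orchestrates processing of the parsed input
    options: dict = field(default_factory=dict)


@dataclass
class ConversionResult:
    """Hypothetical result wrapper holding the unified document representation."""
    document: str

    def export_to_markdown(self) -> str:
        # In this toy model the document already is Markdown text.
        return self.document


def plain_backend(raw: str) -> str:
    # Toy "parsing": strip surrounding whitespace.
    return raw.strip()


def simple_pipeline(parsed: str, options: dict) -> str:
    # Toy "pipeline": prepend a configurable heading prefix.
    return f"{options.get('title_prefix', '# ')}{parsed}"


class ToyConverter:
    """Toy converter with a default format mapping that callers may override."""

    def __init__(self, format_options: Optional[Dict[str, FormatOption]] = None):
        self.format_options = {
            "txt": FormatOption(plain_backend, simple_pipeline),
            "pdf": FormatOption(plain_backend, simple_pipeline),
        }
        # Caller-supplied entries replace the defaults on a per-format basis.
        self.format_options.update(format_options or {})

    def convert(self, fmt: str, raw: str) -> ConversionResult:
        opt = self.format_options[fmt]
        parsed = opt.backend(raw)
        return ConversionResult(document=opt.pipeline(parsed, opt.options))


# Default mapping versus a PDF-specific override of the pipeline options.
default_result = ToyConverter().convert("pdf", "  Hello  ")
custom_converter = ToyConverter(
    {"pdf": FormatOption(plain_backend, simple_pipeline, {"title_prefix": "## "})}
)
custom_result = custom_converter.convert("pdf", "  Hello  ")
```

Keeping the per-format choice in a single mapping is what makes the configuration parametrizable in this sketch: overriding one format leaves the defaults for every other format untouched. For the real configuration mechanism, the Usage page linked in the diff is the authoritative source.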
@@ -1,3 +1 @@
-In this area you can find guides on the main Docling concepts.
-
-Use the navigation on the left to browse through them.
+Use the navigation on the left to browse some core Docling concepts.
@@ -1,3 +1 @@
-In this area you can find examples covering a range of possible workflows and use cases.
-
-Use the navigation on the left to browse through them.
+Use the navigation on the left to browse through examples covering a range of possible workflows and use cases.
@@ -1,3 +1 @@
-In this area you can find guides on the Docling integrations with popular frameworks and tools.
-
-Use the navigation on the left to browse through them.
+Use the navigation on the left to browse through Docling integrations with popular frameworks and tools.