Table image extraction for reference #755

Closed
rajuptvs opened this issue Jan 15, 2025 · 5 comments
Labels
enhancement New feature or request

Comments

@rajuptvs

Requested feature

Background

Currently, the system supports referencing images using URIs in markdown formatting, which has proven valuable for many data pipeline implementations. For example:

[image]

Proposed Enhancement

I propose extending this URI reference functionality to table images as well. This would provide more flexibility in document handling, particularly in cases where the generated markdown tables may not be correct.

Technical Implementation

I've already prototyped a similar functionality using the following approach:

  1. Store the image data in item.image and its URI in item.image.uri, using item.get_image()

  2. Implement reference handling through the existing image processing pipeline:

elif image_mode == ImageRefMode.REFERENCED:
    new_doc = self._with_pictures_refs(
        image_dir=artifacts_dir, reference_path=reference_path
    )

I think this would make pipelines built on docling more extensible, and it would be very beneficial for various kinds of post-processing on table images.
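
To make the proposal concrete, here is a minimal sketch of what a referenced table image could look like in the exported markdown. The helper name `table_image_ref` and the `table-N.png` file naming are hypothetical and not part of docling; the sketch only shows the shape of the markdown, mirroring how pictures are referenced today.

```python
from pathlib import PurePosixPath


def table_image_ref(artifacts_dir: str, index: int, alt: str = "Table") -> str:
    """Return a markdown image reference for the index-th table crop.

    Hypothetical helper: the file naming scheme is an assumption,
    not docling's actual convention.
    """
    # Use POSIX separators so the URI looks the same on every platform.
    uri = PurePosixPath(artifacts_dir) / f"table-{index}.png"
    return f"![{alt} {index}]({uri})"


print(table_image_ref("doc_artifacts", 1))
# prints: ![Table 1](doc_artifacts/table-1.png)
```

A downstream step could then open the referenced file and re-extract the table whenever the generated markdown table looks wrong.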

...

Alternatives

...

@rajuptvs added the enhancement (New feature or request) label Jan 15, 2025
@wcool1

wcool1 commented Jan 16, 2025

Hello. I have a question about the claim that referencing images using URIs in markdown has proven valuable for many data pipeline implementations. Why are URIs better than the text or table that OCR writes into the markdown output? Could you give some example cases or evidence?

@jyothisv

This feature would be quite useful for me. Alternatively, is there an easy way to use another model (say an LLM) specifically to convert table images into markdown and inject the output into the document?

@PeterStaar-IBM
Contributor

@rajuptvs Actually, this feature is already supported. Any DocItem from the DoclingDocument can be cropped from the original document, just ensure that you keep the page_images at conversion.

@josippavicic

Can someone give a code example for this supported feature?

@PeterStaar-IBM
Contributor

@josippavicic Just call this get_image (https://github.com/DS4SD/docling-core/blob/b787d53173e9e2325f25f03a7e442d5b4194e5a4/docling_core/types/doc/document.py#L568) on any DocItem in the Document.

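from docling_core.types.doc.document import DocItem

for item, level in true_doc.iterate_items():
    if isinstance(item, DocItem):
        # get_image takes the parent document so it can crop from the stored page image
        pil_image = item.get_image(true_doc)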
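
For anyone looking for an end-to-end version, the two hints in this thread (keep the page images at conversion, then call get_image on the items) can be combined roughly as follows. This is a sketch under assumptions, not a definitive recipe: it targets the docling API as I understand it from early 2025 (`PdfPipelineOptions.generate_page_images`, `get_image(doc)`), which may differ in other versions, and the names `table_image_path` and `export_table_images` plus the file naming scheme are hypothetical.

```python
from pathlib import Path


def table_image_path(out_dir: Path, stem: str, index: int) -> Path:
    """Hypothetical naming scheme for saved table crops."""
    return out_dir / f"{stem}-table-{index}.png"


def export_table_images(source: str, out_dir: Path) -> list:
    """Convert `source` with docling and save one PNG per detected table."""
    # Imported lazily so the naming helper above works without docling installed.
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import PdfPipelineOptions
    from docling.document_converter import DocumentConverter, PdfFormatOption
    from docling_core.types.doc import TableItem

    options = PdfPipelineOptions()
    # Keep the page images; without them there is nothing to crop from.
    options.generate_page_images = True

    converter = DocumentConverter(
        format_options={InputFormat.PDF: PdfFormatOption(pipeline_options=options)}
    )
    doc = converter.convert(source).document

    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for item, _level in doc.iterate_items():
        if not isinstance(item, TableItem):
            continue
        image = item.get_image(doc)  # can be None if no page image is available
        if image is None:
            continue
        path = table_image_path(out_dir, Path(source).stem, len(saved) + 1)
        image.save(path, "PNG")
        saved.append(path)
    return saved
```

Calling `export_table_images("report.pdf", Path("tables"))` would then return the list of PNG paths written, which can be referenced from the markdown or passed to a separate table-recognition step.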
