Document preprocessing refactoring #26

Open
AndreP-git opened this issue Jan 28, 2025 · 1 comment
Comments

@AndreP-git
Collaborator

To what extent do we want to extend the set of supported formats? If we go for a general refactoring, this must be taken into account as well.

@LeoBaro
Collaborator

LeoBaro commented Feb 4, 2025

We must support PDF (standard) and DOCX (use-case related).
We'll need to review table parsing.
We'll need multi-modal capabilities to extract knowledge from diagrams.
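
One way the format requirement above could shape the refactoring is a small registry that dispatches on file extension, so adding a format later means registering one more loader. The sketch below is only an illustration under that assumption: the names `register`, `preprocess`, `supported_formats`, `load_pdf`, and `load_docx` are hypothetical and do not come from this project, and the loader bodies are placeholders, not real parsers.

```python
from pathlib import Path
from typing import Callable, Dict, List

# Hypothetical registry: maps a lowercase file extension to a loader function.
Loader = Callable[[Path], str]
_LOADERS: Dict[str, Loader] = {}


def register(extension: str) -> Callable[[Loader], Loader]:
    """Return a decorator that registers a loader for one file extension."""
    def decorator(func: Loader) -> Loader:
        _LOADERS[extension.lower()] = func
        return func
    return decorator


@register(".pdf")
def load_pdf(path: Path) -> str:
    # Placeholder only: a real loader would call a PDF parsing library here.
    return f"<pdf text from {path.name}>"


@register(".docx")
def load_docx(path: Path) -> str:
    # Placeholder only: a real loader would call a DOCX parsing library here.
    return f"<docx text from {path.name}>"


def supported_formats() -> List[str]:
    """List the registered extensions in sorted order."""
    return sorted(_LOADERS)


def preprocess(path: Path) -> str:
    """Dispatch to the loader registered for the file's extension."""
    loader = _LOADERS.get(path.suffix.lower())
    if loader is None:
        raise ValueError(f"Unsupported format: {path.suffix!r}")
    return loader(path)
```

With this shape, table parsing and diagram extraction could be handled inside each loader or as later pipeline stages; the sketch deliberately leaves that decision open.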
