Investigate DataCurate4LLMs repository for inclusion as a dependency of SDG #509

Open
bbrowning opened this issue Jan 27, 2025 · 0 comments

Investigate the repository at https://github.com/krishnatejakk/DataCurate4LLMs to see if it's suitable to include as a dependency of InstructLab/SDG. Is the license compatible? Is the project maintained? What is its maturity level in terms of releases and automated testing? Are we prepared to contribute fixes or code to that repo as necessary and work with the maintainer(s) to ensure they get merged?

From a purely technical perspective, do we have any conflicting dependencies? What are the hardware expectations, required embedding models, etc., for us to be able to run this within SDG? Does it add any new requirements for our users and their systems?
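One possible first pass at the conflicting-dependencies question is to diff the two projects' requirement lists and flag packages that both declare with different version specifiers. The sketch below is only an illustration: the function names and the sample package names are hypothetical, it assumes both projects list dependencies in plain requirements-style text, and a differing specifier is a candidate for manual review rather than a proven conflict (a real answer needs a resolver run such as an actual install).

```python
import re

# Leading package name of a requirement line (hypothetical helper pattern).
_NAME_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def parse_requirements(text):
    """Map normalized package name -> raw version specifier string."""
    reqs = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = _NAME_RE.match(line)
        if not match:
            continue
        # Normalize names so "Beta.Lib" and "beta_lib" compare equal.
        name = re.sub(r"[-_.]+", "-", match.group(1)).lower()
        spec = line[match.end():].split(";", 1)[0].strip()  # drop env markers
        spec = re.sub(r"^\[[^\]]*\]", "", spec).strip()  # drop extras
        reqs[name] = spec
    return reqs


def differing_requirements(ours, theirs):
    """Return packages declared by both sides with different specifiers."""
    a, b = parse_requirements(ours), parse_requirements(theirs)
    return {
        name: (a[name], b[name])
        for name in sorted(a.keys() & b.keys())
        if a[name] != b[name]
    }


if __name__ == "__main__":
    # Made-up inputs for illustration; not the real dependency lists.
    ours = "alpha-lib>=2.3.0\nbeta_lib<2.0\ngamma\n"
    theirs = "alpha-lib==2.1.0\nBeta.Lib<2.0  # pinned\ndelta[extra]>=1.0\n"
    for name, (mine, other) in differing_requirements(ours, theirs).items():
        print(f"{name}: ours {mine!r} vs theirs {other!r}")
```

Anything this flags still needs a human look, since specifiers such as `>=1.0` and `==1.5` differ textually but are compatible.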
