This repository was archived by the owner on Sep 9, 2025. It is now read-only.

Description
This task automates the 'precheck' stage, in which a human 'triager' currently validates whether the student model already knows the information a user is trying to teach it.
As in a standard RAG workflow, the sentences could be converted to embedding vectors and then compared using a metric such as cosine similarity. Based on the resulting scores, the 'precheck' stage can be marked as either a 'success' (✅) or a 'failure' (❎).
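A minimal sketch of the scoring step, assuming the two texts being compared are the student model's answer and the contributor's expected answer (the issue does not specify which sentences are compared). The names `toy_embed`, `precheck_similarity`, and the `0.8` threshold are hypothetical and chosen only for illustration. `toy_embed` is a bag-of-words stand-in so the example runs without downloads; a real setup would likely pass an `embed` function backed by a sentence-embedding model such as those described in the references below.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an empty/zero vector as "no similarity"
    return dot / (norm_a * norm_b)


def toy_embed(sentences: List[str]) -> List[List[float]]:
    """Hypothetical stand-in for a real embedding model: word-count vectors
    over the vocabulary shared by the input sentences."""
    tokenized = [s.lower().split() for s in sentences]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    counts = [Counter(toks) for toks in tokenized]
    return [[float(c[word]) for word in vocab] for c in counts]


def precheck_similarity(
    text_a: str,
    text_b: str,
    embed: Callable[[List[str]], List[List[float]]] = toy_embed,
    threshold: float = 0.8,  # assumed cut-off, would need tuning on real data
) -> Tuple[float, bool]:
    """Return the similarity score and whether it meets the threshold."""
    vec_a, vec_b = embed([text_a, text_b])
    score = cosine_similarity(vec_a, vec_b)
    return score, score >= threshold


if __name__ == "__main__":
    score, similar = precheck_similarity(
        "The capital of France is Paris",
        "Paris is the capital of France",
    )
    print(f"score={score:.2f} similar={similar}")
```

How a score above or below the threshold maps to ✅ or ❎ is left open here, since that depends on how the bot defines a successful precheck.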
This can be included with the precheck call to the @instructlab-bot GH bot.
References:
- https://huggingface.co/tasks/sentence-similarity
- https://www.sbert.net/docs/usage/semantic_textual_similarity.html