Automate the 'precheck' validation step using semantic similarity scores

This task automates the 'precheck' stage, which currently relies on a human 'triage-er' to validate whether the student model already knows the information that a user is trying to teach it.

As in a standard RAG workflow, the sentences could be converted to vectors with an embedding model and then compared using a metric such as cosine similarity. Based on the resulting scores, the 'precheck' stage can be marked as either a 'success' (✅) or a 'failure' (❎).
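A minimal sketch of the comparison step, under stated assumptions. The names `toy_embed`, `cosine_similarity`, `precheck_score`, and the `0.8` threshold are all hypothetical and not part of the bot. `toy_embed` is a bag-of-words stand-in so the example runs without downloads; a real implementation would presumably swap in a sentence-embedding model such as those described in the references. How a score maps to 'success' or 'failure' is not specified here, so the sketch only reports whether the two texts look similar.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict

Vector = Dict[str, float]


def toy_embed(sentence: str) -> Vector:
    """Hypothetical stand-in for an embedding model: bag-of-words counts."""
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    return {token: float(count) for token, count in Counter(tokens).items()}


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class PrecheckScore:
    score: float
    similar: bool  # True when score >= threshold


def precheck_score(
    model_answer: str,
    expected_answer: str,
    threshold: float = 0.8,  # assumed value; would need tuning on real data
    embed: Callable[[str], Vector] = toy_embed,
) -> PrecheckScore:
    """Compare the student model's answer with the answer the user wants to teach."""
    score = cosine_similarity(embed(model_answer), embed(expected_answer))
    return PrecheckScore(score=score, similar=score >= threshold)


if __name__ == "__main__":
    result = precheck_score(
        "Paris is the capital of France",
        "The capital of France is Paris",
    )
    print(result)
```

Passing the embedding function as a parameter keeps the scoring logic independent of whichever model is eventually chosen.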

This check could run as part of the `precheck` call to the `@instructlab-bot` GitHub bot.

References:
1. https://huggingface.co/tasks/sentence-similarity
2. https://www.sbert.net/docs/usage/semantic_textual_similarity.html

Automate the 'precheck' validation step using semantic similarity scores #354
