Added streaming functionality to the LLM. #35
Please fix the remarks
Almost done - please finish these last fixes and I will approve.
…ng.py) and updated references
…nable_response_streaming).
Approved
Add Streaming Support to RAG Pipeline
Summary
This PR introduces token-level streaming to the RAG pipeline, enhancing interactivity and responsiveness. The new `RagStreamHandler` class manages real-time response streaming and integrates into `RagPipelineWrapper`.
Key Changes
- `RagStreamHandler` for real-time token streaming with methods to start, stop, and yield response chunks.
- `stream_query` in `RagPipelineWrapper` to handle streaming queries.
- `_add_llm`.
- `run` to switch between streaming and non-streaming modes based on the `stream` setting.
- `STREAM_TIMEOUT` to manage streaming timeout.
- `stream` in `settings.py`:
Testing
Impact
How to Use
- `settings.py`:
- `stream_query` method in `RagPipelineWrapper` to enable streaming queries.
- `main.py` script:
Notes for Reviewers
- `RagStreamHandler` for robustness and edge-case handling.
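
For orientation while reviewing, here is a minimal sketch of the start/stop/yield pattern with a timeout that the description attributes to `RagStreamHandler`. It is not the code from this PR. The method names `start`, `stop`, and `stream_chunks`, the `STREAM_TIMEOUT` value, the `fake_llm` stand-in, and the shape of `run` are all assumptions made for illustration.

```python
import queue
import threading
from typing import Callable, Iterable, Iterator, Optional, Union

STREAM_TIMEOUT = 2.0  # seconds; assumed value, the real setting is not shown in the PR text

_SENTINEL = object()  # marks the end of the token stream


class RagStreamHandler:
    """Hypothetical sketch: a producer thread pushes tokens, the caller yields them."""

    def __init__(self, timeout: float = STREAM_TIMEOUT) -> None:
        self._queue: "queue.Queue[object]" = queue.Queue()
        self._timeout = timeout
        self._stopped = threading.Event()
        self._thread: Optional[threading.Thread] = None

    def start(self, produce: Callable[[], Iterable[str]]) -> None:
        """Run the token producer in a background thread."""
        self._stopped.clear()

        def worker() -> None:
            try:
                for token in produce():
                    if self._stopped.is_set():
                        break
                    self._queue.put(token)
            finally:
                # Always signal completion so the consumer never blocks forever.
                self._queue.put(_SENTINEL)

        self._thread = threading.Thread(target=worker, daemon=True)
        self._thread.start()

    def stop(self) -> None:
        """Ask the producer to stop after the current token."""
        self._stopped.set()

    def stream_chunks(self) -> Iterator[str]:
        """Yield tokens until the stream ends or no token arrives within the timeout."""
        while True:
            try:
                item = self._queue.get(timeout=self._timeout)
            except queue.Empty:
                self.stop()
                return
            if item is _SENTINEL:
                return
            yield str(item)


def fake_llm(prompt: str) -> Iterable[str]:
    """Stand-in for a streaming LLM call: echoes the prompt word by word."""
    for word in prompt.split():
        yield word + " "


def run(query: str, stream: bool) -> Union[str, Iterator[str]]:
    """Return an iterator of chunks when streaming, else the full response string."""
    handler = RagStreamHandler()
    handler.start(lambda: fake_llm(query))
    chunks = handler.stream_chunks()
    return chunks if stream else "".join(chunks)
```

In this sketch the caller iterates over `run(query, stream=True)` to print chunks as they arrive, while `run(query, stream=False)` blocks and returns the joined text. The edge cases worth checking in the real handler are the same ones the sketch has to deal with: a producer that raises, a consumer that stops early, and a stream that stalls past the timeout.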