Description
Background
In PR #198, we added support for OpenAI's new Response API. After running benchmarks, we saw a 25% speed-up with no drop in answer quality (details in the Discord thread). Next, we want to roll this out in Glific and run early latency tests using Glific Flows.
Solutioning
This requires two main changes on the Glific side.
- Moving from the Assistant API to the Response API
  For this, we need to pass the parameters below, most of which were previously stored at the assistant level:

Existing NGOs: Create a temporary table mapping assistant IDs to prompts. Glific keeps sending its assistant ID; our platform looks up the prompt info and calls the Response API.
New NGOs: Can use prompts directly once we add prompt management via #206.
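The temporary mapping for existing NGOs could look like the sketch below: a lookup table from legacy assistant IDs to the prompt settings that used to live at the assistant level, which the platform uses to build the Response API call. All names here (`ASSISTANT_PROMPT_MAP`, `build_response_params`, the example settings) are illustrative assumptions, not Glific's actual schema.

```python
# Hypothetical sketch of the assistant-id -> prompt mapping table.
# In practice this would be a DB table, not an in-memory dict.
ASSISTANT_PROMPT_MAP = {
    "asst_abc123": {
        "model": "gpt-4o",
        "instructions": "You are a helpful assistant for the NGO.",
        "temperature": 0.7,
    },
}

def build_response_params(assistant_id: str, user_message: str) -> dict:
    """Resolve a legacy assistant ID sent by Glific into keyword
    arguments for a Response API call."""
    settings = ASSISTANT_PROMPT_MAP.get(assistant_id)
    if settings is None:
        raise KeyError(f"no prompt mapping for assistant {assistant_id}")
    return {
        "model": settings["model"],
        "instructions": settings["instructions"],
        "temperature": settings["temperature"],
        "input": user_message,
    }
```

Glific keeps sending the assistant ID it already has; only the platform-side lookup changes, so existing flows need no edits.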
Additionally, we may need to add an endpoint similar to /threads that works asynchronously: it returns an immediate acknowledgement, then calls back an endpoint once the response is generated.
We should also add a failsafe mechanism that cuts off response generation and retries, since we sporadically see some queries take much longer than usual to generate a response.
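The failsafe could be a simple timeout-and-retry wrapper around the generation call, as in this sketch. `generate_with_failsafe` and its defaults are assumptions; real code would also need to cancel or abandon the underlying API request rather than just the local wait.

```python
import concurrent.futures

def generate_with_failsafe(generate, retries=1, timeout_s=30.0):
    """Run `generate` (a zero-arg callable producing the model response)
    with a hard cutoff; retry if it exceeds the timeout.

    Sketch only: the timed-out call keeps running in its worker thread,
    so a production version should cancel the remote request too.
    """
    with concurrent.futures.ThreadPoolExecutor(max_workers=retries + 1) as pool:
        for attempt in range(retries + 1):
            future = pool.submit(generate)
            try:
                return future.result(timeout=timeout_s)
            except concurrent.futures.TimeoutError:
                future.cancel()  # no-op if already running; see docstring
    raise TimeoutError(
        f"response generation exceeded {timeout_s}s on {retries + 1} attempts"
    )
```

The timeout value should come from the latency data we gather in the Glific Flows tests, so the cutoff sits safely above normal generation times.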
- Using response_id instead of thread_id for conversation memory
This should be pretty straightforward: at the flow level, we get a response from the previous webhook, so we can use the results variable to pass response_id the same way we were passing thread_id.
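The chaining at the flow level can be sketched as follows. `call_response_api` here is a local stub standing in for the real webhook/HTTP call; the Response API itself supports continuing a conversation via a `previous_response_id` parameter, which takes the place thread_id held before.

```python
# Stub for the real Response API webhook call: it returns a response_id
# that the flow stores in its results variable and sends back on the
# next turn, exactly where thread_id used to go.
def call_response_api(user_input, previous_response_id=None):
    # Fabricated ID for illustration; the real API returns its own.
    new_id = f"resp_{hash((user_input, previous_response_id)) & 0xFFFF:04x}"
    return {"response_id": new_id, "output_text": f"echo: {user_input}"}

# Turn 1: no prior ID, a fresh conversation starts.
first = call_response_api("What crops do you support?")

# Turn 2: the flow reads response_id from the previous webhook's
# results and passes it back to continue the same conversation.
second = call_response_api(
    "Tell me more", previous_response_id=first["response_id"]
)
```

Since flows already template variables out of webhook results, swapping the field name from thread_id to response_id should be the only flow-level change.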