
Response API: Production Ready for Glific #214

Closed
@AkhileshNegi


Background

In PR #198, we added support for OpenAI’s new Response API. After running benchmarks, we saw a 25% speed-up without any drop in answer quality (details in the Discord thread). Next, we want to roll this out in Glific and run early latency tests using Glific Flows.

Solutioning
For Glific usage, we need changes in two main areas:

  1. Moving from the Assistant API to the Response API
    For this, we need to pass the parameters below, most of which were previously saved at the assistant level:
    [Image: parameters to pass to the Response API]

Existing NGOs: Create a temporary table mapping assistant IDs to prompts. Glific keeps sending its assistant ID; our platform looks up the prompt info and calls the Response API.

New NGOs: Can use prompts directly once prompt management is added through #206.
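The lookup for existing NGOs could be sketched roughly as follows. This is an illustration only: `PromptConfig`, `ASSISTANT_PROMPTS`, `build_response_request`, the sample IDs, and the chosen fields are all hypothetical, since the actual parameter list lives in the image above and the real mapping would be a database table rather than an in-memory dict. The resulting keyword arguments are shaped so they could be passed to the OpenAI Python SDK's `client.responses.create(**request)`.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class PromptConfig:
    """Hypothetical row of the temporary assistant-to-prompt table."""
    model: str
    instructions: str
    temperature: float = 0.1
    vector_store_id: Optional[str] = None


# Hypothetical stand-in for the temporary mapping table (assumed IDs and values).
ASSISTANT_PROMPTS: Dict[str, PromptConfig] = {
    "asst_demo123": PromptConfig(
        model="gpt-4o",
        instructions="Answer questions for the NGO's beneficiaries.",
        vector_store_id="vs_demo456",
    ),
}


def build_response_request(assistant_id: str, question: str) -> Dict[str, Any]:
    """Translate the assistant ID sent by Glific into Response API arguments."""
    config = ASSISTANT_PROMPTS.get(assistant_id)
    if config is None:
        raise KeyError(f"No prompt mapping found for assistant {assistant_id!r}")

    request: Dict[str, Any] = {
        "model": config.model,
        "instructions": config.instructions,
        "input": question,
        "temperature": config.temperature,
    }
    # Attach file search only when the assistant had a vector store configured.
    if config.vector_store_id:
        request["tools"] = [
            {"type": "file_search", "vector_store_ids": [config.vector_store_id]}
        ]
    return request
```

With this shape, Glific's payload does not change: it keeps sending the assistant ID, and only the platform side needs to know about prompts.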

Additionally, we may need an endpoint similar to /threads that works asynchronously: it returns an immediate acknowledgement first and then calls a callback endpoint once the response is generated.
We can also add a failsafe mechanism that cuts off response generation and retries, since we sporadically see some queries take longer than usual to generate a response.
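A minimal sketch of how the asynchronous flow and the failsafe could fit together, using only `asyncio` from the standard library. The function names, payload fields, timeout, and retry count are assumptions for illustration; `generate` stands in for the actual Response API call and `send_callback` for the HTTP call back to Glific.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Tuple


async def generate_with_cutoff(
    generate: Callable[[], Awaitable[str]],
    timeout_s: float = 30.0,   # assumed cut-off, to be tuned from latency tests
    max_attempts: int = 3,     # assumed retry budget
) -> str:
    """Cancel a generation that exceeds timeout_s and retry it."""
    last_error = None
    for _ in range(max_attempts):
        try:
            # wait_for cancels the pending call when the timeout expires.
            return await asyncio.wait_for(generate(), timeout=timeout_s)
        except asyncio.TimeoutError as exc:
            last_error = exc
    raise TimeoutError(f"No response after {max_attempts} attempts") from last_error


async def accept_request(
    generate: Callable[[], Awaitable[str]],
    send_callback: Callable[[Dict[str, Any]], Awaitable[None]],
    **cutoff_options: Any,
) -> Tuple[Dict[str, str], "asyncio.Task[None]"]:
    """Acknowledge immediately; deliver the answer to the callback later."""

    async def worker() -> None:
        try:
            answer = await generate_with_cutoff(generate, **cutoff_options)
            await send_callback({"status": "success", "message": answer})
        except TimeoutError:
            await send_callback({"status": "error", "message": "generation timed out"})

    # The caller should keep a reference to the task so it is not garbage collected.
    task = asyncio.create_task(worker())
    return {"status": "processing"}, task
```

The immediate acknowledgement keeps the flow's webhook from blocking, while the cut-off bounds how long a slow generation can delay the callback.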

  2. Using response_id instead of thread_id for conversation memory
    This should be fairly straightforward: at the flow level, we get a response from the previous webhook, so we can read response_id from the results variable the same way we were reading thread_id.
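The chaining could look roughly like the sketch below. `previous_response_id` is the Response API parameter that links a call to an earlier response; the helper names, the default model, and the webhook field names (`response_id`, `message`) are assumptions about what the platform would return to the flow.

```python
from typing import Any, Dict, Optional


def build_turn_request(
    question: str,
    previous_response_id: Optional[str] = None,
    model: str = "gpt-4o",  # assumed default for illustration
) -> Dict[str, Any]:
    """Build Response API arguments, linking to the prior turn when one exists."""
    request: Dict[str, Any] = {"model": model, "input": question}
    # First turn of a conversation: no ID yet, so the field is omitted.
    if previous_response_id:
        request["previous_response_id"] = previous_response_id
    return request


def webhook_result(response_id: str, answer: str) -> Dict[str, str]:
    """Hypothetical webhook body; the flow stores response_id in its results variable."""
    return {"response_id": response_id, "message": answer}


# Usage: the ID returned for turn one is sent back with turn two.
first = build_turn_request("What documents do I need?")
result = webhook_result("resp_demo001", "You need an ID card.")
second = build_turn_request("Where do I submit them?", result["response_id"])
```

On the Glific side this mirrors the existing thread_id handling: the flow saves the value from the previous webhook result and includes it in the next request.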

Labels

enhancement (New feature or request)
