Upcoming features from our research team will require Pipelines that contain multiple LLMBlocks, where each LLMBlock may use a different model id deployed on the same inference server, a different model family (e.g. granite vs. mistral), or an entirely different inference endpoint.
Today users specify the inference endpoint by passing an OpenAI client into the PipelineContext. That one client is used for every LLMBlock, with no way to assign a different client to an individual LLMBlock.
What we need is some way to pass in multiple OpenAI clients, and map each to the relevant LLMBlock. One example of what this could look like:
In this example we pass a map of OpenAI clients to PipelineContext instead of a single client. If a single client is passed (i.e. for backwards compatibility), we could turn it internally into a map with that client as the "default" value. Users would be able to control the full range of client parameters here, including SSL cert handling, timeouts, retries, and anything else that can be configured on the OpenAI client or its underlying httpx.Client.
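The backwards-compatible normalization described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `normalize_clients`, `DEFAULT_CLIENT_KEY`, and `StandInClient` are hypothetical names that do not exist in the project today, and the stand-in class takes the place of real `openai.OpenAI` instances so the snippet runs without network access or credentials.

```python
from typing import Any, Dict, Mapping

# Hypothetical constant: the key used for blocks that do not name a client.
DEFAULT_CLIENT_KEY = "default"


class StandInClient:
    """Placeholder for an OpenAI client (assumption, for illustration only).

    In real use each value would be an openai.OpenAI instance configured
    with its own base_url, timeout, max_retries, http_client, and so on.
    """

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url


def normalize_clients(client_or_clients: Any) -> Dict[str, Any]:
    """Return a name -> client map.

    A mapping is copied as-is; a single client is wrapped under the
    "default" key so existing single-client callers keep working.
    """
    if isinstance(client_or_clients, Mapping):
        return dict(client_or_clients)
    return {DEFAULT_CLIENT_KEY: client_or_clients}


# Old style: one client for the whole pipeline.
single = StandInClient("http://localhost:8000/v1")
legacy_map = normalize_clients(single)

# New style: several named clients, one per endpoint or model family.
multi_map = normalize_clients(
    {
        "default": StandInClient("http://localhost:8000/v1"),
        "mistral": StandInClient("http://localhost:8001/v1"),
    }
)
```

With this shape, the rest of the pipeline code would only ever see a map, whichever form the caller passed in.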
Then, in your pipeline.yaml, we map each LLMBlock to a client. Any block that does not specify a client gets the "default" client. Otherwise, the block can name a string key that selects one of the clients passed into the PipelineContext. This lets us create a reusable Pipeline that expects N clients, and gives users running that Pipeline a way to configure those clients for their specific environment without modifying the Pipeline yaml itself.
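One way the per-block lookup could work is sketched below. The block entries are plain dicts standing in for parsed pipeline.yaml content; the `name`/`type`/`config` layout, the block names, and the helper `resolve_client` are assumptions made for illustration, and strings stand in for client objects. Only the `client` key and the "default" fallback come from the proposal itself.

```python
from typing import Any, Mapping


def resolve_client(block: Mapping[str, Any], clients: Mapping[str, Any]) -> Any:
    """Pick the client for one block, falling back to "default".

    Hypothetical helper: a block names its client through an optional
    `client` key in its config; blocks without that key use "default".
    """
    config = block.get("config", {})
    client_name = config.get("client", "default")
    if client_name not in clients:
        raise KeyError(
            f"Block {block.get('name')!r} requests client {client_name!r}, "
            f"but only {sorted(clients)} were provided"
        )
    return clients[client_name]


# Stand-ins for the clients passed into the PipelineContext (assumption).
clients = {"default": "granite-endpoint", "mistral": "mistral-endpoint"}

# Stand-ins for two parsed pipeline.yaml block entries (assumed layout).
blocks = [
    {"name": "gen_questions", "type": "LLMBlock", "config": {}},
    {"name": "gen_answers", "type": "LLMBlock", "config": {"client": "mistral"}},
]

chosen = {block["name"]: resolve_client(block, clients) for block in blocks}
```

Failing loudly on an unknown client name, rather than silently falling back to "default", is a design choice in this sketch: it surfaces a mismatch between what the Pipeline expects and what the user configured.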
@williamcaban Tagging you for visibility here, since this is related to a discussion we had elsewhere recently about preparing for cases where we need separate OpenAI Client endpoints in a single Pipeline. I outlined what I believe the general issue to be above, including some proposed (but not set in stone) backwards-compatible design changes to enable this.
Is config_path: modifying client behavior (like setting temperatures, etc.)? Is that part of existing constructs, or should it be covered in this enhancement?
The only new attribute in the pipeline config I propose above is the client key to choose which OpenAI client to use. I didn't explicitly state that in the text above, so thanks for asking the clarifying questions!