
Support reusing a prebuilt pipeline for multiple API calls #28

Open
ilya-kolchinsky opened this issue Jan 10, 2025 · 11 comments

@ilya-kolchinsky
Collaborator

This should include optionally altering the parameters specified at pipeline construction time.
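A minimal sketch of what this could look like from the caller's side (all names here are hypothetical, not the current Pragmatic API): the pipeline is built once, and per-call keyword arguments shadow the parameters fixed at construction time.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class PipelineConfig:
    top_k: int = 5
    temperature: float = 0.2

class RagPipeline:
    def __init__(self, config: PipelineConfig):
        # Expensive setup (index, converters, model client) happens once here.
        self.config = config

    def run(self, query: str, **overrides) -> str:
        # Per-call overrides shadow the construction-time parameters.
        cfg = replace(self.config, **overrides)
        return f"answer to {query!r} with top_k={cfg.top_k}"  # stand-in result

pipeline = RagPipeline(PipelineConfig())
pipeline.run("What is RAG?")            # first call, construction-time defaults
pipeline.run("What is RAG?", top_k=10)  # reuse the same pipeline, new top_k
```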

@hemajv self-assigned this Jan 13, 2025
@dmartinol

Can we make these APIs compatible with the DocumentStoreRetriever and DocumentStoreIngestore interfaces defined in the instructlab package here?

FYI, @jwm4 @anastasds

@ilya-kolchinsky
Collaborator Author

The ingestion interface (DocumentStoreIngestore) should not be a problem, as it is identical to what we already have available. However, DocumentStoreRetriever deviates significantly from Pragmatic: it only handles retrieval + prompt building, whereas Pragmatic also encapsulates the communication with the model and executes the prompt.
If it is absolutely necessary and there is no workaround, we can discuss adding a "no_execute" flag that would instruct Pragmatic to return the augmented prompt instead of the query result.
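For illustration, a minimal sketch of how such a flag could behave (hypothetical; no such flag exists in Pragmatic today): with no_execute set, the pipeline stops after retrieval and prompt building, which matches the contract of a DocumentStoreRetriever-style interface.

```python
def run(query: str, no_execute: bool = False) -> str:
    # Stand-ins for the real retrieval and generation stages.
    docs = ["retrieved snippet 1", "retrieved snippet 2"]
    prompt = "Context:\n" + "\n".join(docs) + f"\n\nQuestion: {query}"
    if no_execute:
        return prompt  # return the augmented prompt, skip the model call
    return f"<model answer to: {query}>"  # normal path: execute the prompt

augmented_prompt = run("How is ingestion configured?", no_execute=True)
```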

@dmartinol

> If it is absolutely necessary and there is no workaround, we can discuss adding a "no_execute" flag that would instruct Pragmatic to return the augmented prompt instead of the query result.

IMO it's necessary; otherwise we would have to move all model communication to Pragmatic, including the non-RAG cases.

@jwm4

jwm4 commented Jan 20, 2025

Here are some thoughts from me:

  1. I think the full end-to-end run-time pipeline for RAG should live inside PRAGmatic (which, as I understand it, is Ilya's view too). That is very important for unlocking the ability to innovate effectively within PRAGmatic and for achieving some consistency in behavior across different consumers of PRAGmatic.
  2. I don't agree with Daniele that this would require PRAGmatic to handle non-RAG model calls.
  3. As we discussed on Slack a while back, I think the best way for ilab chat to integrate with the RAG pipeline is via the OpenAI API. I think PRAGmatic should provide an OpenAI wrapper that wraps the entire RAG pipeline (see the sketch after this list).
  4. I think we should plan to unwind the changes that Anastas made for RAG in chat in 1.4 and replace them so that the only RAG-specific code in ilab chat is the instantiation of the OpenAI instance that implements the chat: if RAG is on, we instantiate a RAG/PRAGmatic OpenAI object; if RAG is off, we instantiate one that talks directly to the model.
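To make point 3 concrete, here is a minimal sketch of such a wrapper (the class, its methods, and pipeline.build_prompt are all hypothetical; only the openai client calls are real): it exposes the chat.completions.create surface that ilab chat already uses, while running the full RAG pipeline before delegating to the real endpoint.

```python
from types import SimpleNamespace
from openai import OpenAI

class RagOpenAIWrapper:
    """Looks like an OpenAI client to the caller; runs RAG under the hood."""

    def __init__(self, pipeline, client: OpenAI):
        self._pipeline = pipeline  # prebuilt RAG pipeline (retrieval + prompt)
        self._client = client      # real OpenAI-compatible model endpoint
        # Mirror the `client.chat.completions.create(...)` surface.
        self.chat = SimpleNamespace(
            completions=SimpleNamespace(create=self._create)
        )

    def _create(self, *, messages, **kwargs):
        query = messages[-1]["content"]
        augmented = self._pipeline.build_prompt(query)  # hypothetical method
        return self._client.chat.completions.create(
            messages=[{"role": "user", "content": augmented}], **kwargs
        )

# Point 4 then reduces to a single decision in ilab chat:
# client = RagOpenAIWrapper(pipeline, OpenAI()) if rag_enabled else OpenAI()
```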

@anastasds

anastasds commented Jan 20, 2025

I do not have an opinion on direction for PRAGmatic itself and will only speak to an architectural point of view regarding InstructLab.

I struggle to find a coherent product design philosophy behind encapsulating model interactions within a RAG pipeline. Adding a second model interaction paradigm would almost certainly duplicate code, create feature disparity between the two, add configuration complexity, double the implementation cost of any new chat-related functionality, likely double model evaluation costs by calling into question the equivalency of experimental setups, unnecessarily silo domain knowledge within the organization, complicate prompt template management, make the user interface more complex and less understandable, and more.

Regarding the OpenAI-compatible API around PRAGmatic: that would make sense to me if it returned model responses. Otherwise, coercing a retrieval pipeline to return outputs formatted like chat model responses seems like a significant amount of unnecessary complexity that, as far as I can tell, actually subtracts value.

Another way of expressing all of this is from the point of view of modularization: "things that change together should stay together" (see also: shearing layers). Changes to model interactions in the InstructLab product should have to be made in only one place.

This discussion calls into question whether PRAGmatic is the best fit for InstructLab's immediate need: a retrieval pipeline whose output can be incorporated into its existing chat functionality. Perhaps there is an underlying retrieval component that should be a common component of both PRAGmatic and InstructLab?

@jwm4

jwm4 commented Jan 20, 2025

I agree with you that "things that change together should stay together" -- but I feel the entire RAG pipeline (including retrieval and generation) is something that should change together, which is why I think it should stay together. Also, I agree that the OpenAI API makes sense only if it returns a response: that is what I was proposing.

@jwm4

jwm4 commented Jan 20, 2025

We have a lot going on this week, so let's revisit next week once we've all had more time to think about this topic.

@ilya-kolchinsky
Collaborator Author

Two things to mention here:

  1. The issue under which we are commenting is the one to introduce these "OpenAI API wrappers" that would encapsulate the entire RAG pipeline and return model responses. So, if we decide to implement it the way Bill suggested (which I like a lot), it will be supported out of the box.
  2. On the other hand, why don't we go the other way around? How about having ilab provide an OpenAI-API-compatible object to Pragmatic, through which the latter would interact with the model (see the sketch after this list)? It would require Pragmatic to support using a client-supplied Python object for LLM communication (as opposed to creating its own, as it currently does), and it would solve the potential code duplication / work duplication issues mentioned by Anastas. And it's a useful feature anyway.
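A minimal sketch of what point 2 could look like (hypothetical constructor; not how Pragmatic is built today): the pipeline accepts a caller-supplied OpenAI-compatible client and only falls back to creating its own when none is given, so ilab keeps sole ownership of model communication.

```python
from openai import OpenAI

class RagPipeline:
    def __init__(self, llm_client: OpenAI | None = None, model: str = "some-model"):
        # Use the client-supplied object if given; create one only as a fallback.
        self._llm = llm_client if llm_client is not None else OpenAI()
        self._model = model

    def run(self, query: str) -> str:
        augmented = f"Context: ...\n\nQuestion: {query}"  # stand-in for retrieval
        response = self._llm.chat.completions.create(
            model=self._model,
            messages=[{"role": "user", "content": augmented}],
        )
        return response.choices[0].message.content

# ilab side: hand Pragmatic the same client it already uses for non-RAG chat.
# pipeline = RagPipeline(llm_client=existing_ilab_openai_client)
```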

@jwm4

jwm4 commented Jan 20, 2025

Or even do both: have an OpenAI-compatible wrapper which takes as input an OpenAI-compatible response generation object.

@anastasds

anastasds commented Jan 20, 2025

> why don't we go the other way around? How about having ilab provide an OpenAI-API-compatible object to Pragmatic, through which the latter would interact with the model?

I almost left a comment suggesting that one approach to a unified chat paradigm would be "InstructLab as a backend" to PRAGmatic, which is exactly the same idea. However, the naive approach (InstructLab uses PRAGmatic for retrieval, PRAGmatic uses InstructLab for chat) would create a circular dependency between the two codebases. Even if both publish client libraries independent of their core codebases, the dependency graph would end up having a loop (or be at significant risk of developing one in the future).

@dmartinol

> Perhaps there is an underlying retrieval component that should be a common component of both PRAGmatic and InstructLab?

This looks like the implementation of the DocumentStoreRetriever interface from which we started the discussion, doesn't it?

BTW, I apologize for that, but my initial understanding was that PRAGmatic would act as a provider of modular RAG components (ingest, retrieve, and similar functions to further configure the RAG pipeline), rather than serving as the platform that executes the RAG runtime. Even if the final result is the same, the impact on the client platform(s) may be different.
