Add AUGMENT stage to request pipeline #815

jwiegley · 2025-04-29T04:33:08Z

This PR lays the groundwork for implementing many forms of RAG in gptel, including:

Document inclusion, as selected by vector similarity
"Agents" that augment the context when referenced
"Dynamic context" that modifies the context in response to inputs or conditions

karthink

Recording some thoughts/questions about the code, let me know what you think.

I can start making some changes to simplify this patch soon, can I push them to your branch?

karthink · 2025-05-02T07:06:19Z

gptel.el

+asynchronous.
+
+Returns the transformed info at the end."
+  (if (null fns)


This causes the augmentors to run sequentially -- not an issue for synchronous augmentors, but unnecessarily slow if you have async ones.

I thought about why you might have designed it this way instead of using a counting callback and running through gptel-augment-handler-functions with run-hook-with-args -- the only reason I can think of is that alterations to fsm might be noncommutative, and that the ordering of gptel-augment-handler-functions might not be respected when there are async callbacks involved. Is this something you foresee as a problem?

Otherwise we can switch to a counting callback, which can shorten the augment wait considerably.
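
A minimal sketch of the counting-callback idea, under stated assumptions: `my/run-augmentors` and the `(STATE DONE)` augmentor signature are hypothetical and are not gptel's API. Every augmentor is started up front, and the final callback fires once all of them have reported completion.

```elisp
;;; -*- lexical-binding: t; -*-
(require 'cl-lib)

(defun my/run-augmentors (fns state on-done)
  "Start every function in FNS with STATE and a completion callback.
Call ON-DONE with STATE once all of them have called their callback.
All names here are hypothetical; this is not gptel's API."
  (if (null fns)
      (funcall on-done state)
    ;; PENDING is set to the full count before anything is dispatched,
    ;; so a synchronous augmentor cannot trigger ON-DONE early.
    (let ((pending (length fns)))
      (dolist (fn fns)
        (funcall fn state
                 (lambda ()
                   (cl-decf pending)
                   (when (zerop pending)
                     (funcall on-done state))))))))
```

With this shape, asynchronous augmentors overlap instead of waiting on each other. The order in which they finish (and therefore the order of any changes they make to the shared state) is not guaranteed, which is the ordering concern raised above.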

karthink · 2025-05-02T07:06:19Z

gptel.el

+             #'(lambda (fsm) (gptel--augment-info fsm (cdr fns)))
+             fsm)))
+
+(defun gptel--handle-augment (fsm)


If we switch to a counting callback in gptel--augment-info (see comment), we don't need a (recursive) gptel--augment-info or gptel--finish-augmentation any more, and gptel--handle-augment can handle the whole augment process, localizing all the augment code.

karthink · 2025-05-02T07:06:19Z

gptel.el

@@ -2379,53 +2471,83 @@ be used to rerun or continue the request at a later time."
                 gptel-backend (and gptel--num-messages-to-send
                                    (* 2 gptel--num-messages-to-send))))))
            ((consp prompt) (gptel--parse-list gptel-backend prompt)))))
-         (info (list :data (gptel--request-data gptel-backend full-prompt)
+         (info (list :data (list :args t
+                                 :full-prompt full-prompt


The main problem here is that full-prompt is not a backend-agnostic format -- it can be of a different type/schema for each backend. This makes modifying the prompt in an augmentor difficult.

This can be papered over somewhat with gptel--inject-prompt, which can act as a backend-agnostic interface. Even if I add its counterpart gptel--retrieve-prompt and make these setfable, it's not a great interface.

My original idea was that the copied buffer itself is the backend-agnostic prompt list, and another intermediate, universal format for prompts between it and the final messages array is unnecessary. (It can also produce a lot of garbage.)

Now we need a new backend-agnostic prompts list format. Note that it can include many roles: user, llm/assistant, tool-call, tool-result and possibly more in the future.

I still think using the buffer can work because it's a simple data structure and Emacs is great at modifying it. Adding text to the top and bottom of the buffer (corresponding to before the first entry in the messages array and after the last) is very simple, as is deleting or adding text anywhere in the middle. This doesn't have to be done via gptel-prompt-filter-hook -- it can be part of the augment process.
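
To illustrate how little code the buffer-based approach needs, here is a sketch that assumes an augmentor is handed the copied prompt buffer. `my/augment-prompt-buffer` and its arguments are hypothetical and not part of gptel.

```elisp
;;; -*- lexical-binding: t; -*-

(defun my/augment-prompt-buffer (buffer preamble postscript)
  "Insert PREAMBLE at the start and POSTSCRIPT at the end of BUFFER.
BUFFER stands in for the copied prompt buffer; the function name and
arguments are hypothetical, not part of gptel."
  (with-current-buffer buffer
    (save-excursion
      ;; Text added here ends up ahead of the existing prompt text.
      (goto-char (point-min))
      (insert preamble "\n\n")
      ;; Text added here ends up after the existing prompt text.
      (goto-char (point-max))
      (insert "\n\n" postscript))
    buffer))

;; Usage with a throwaway buffer:
(with-temp-buffer
  (insert "What does this function do?")
  (my/augment-prompt-buffer (current-buffer)
                            "Relevant docs: ..."
                            "Answer briefly.")
  (buffer-string))
;; => "Relevant docs: ...\n\nWhat does this function do?\n\nAnswer briefly."
```

No intermediate prompt format is involved in this sketch: the augmentor only edits text, and the backend-specific parsing would run afterwards on the modified buffer.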

karthink · 2025-05-02T07:06:19Z

gptel.el

    (setf (gptel-fsm-info fsm) info))
  (unless dry-run (gptel--fsm-transition fsm)) ;INIT -> WAIT
  fsm)

+(defun gptel--realize-info (info)
+  (let ((data (plist-get info :data)))
+    (if (not (plist-member data :args))


What is the purpose of :args here?

karthink · 2025-05-02T07:06:20Z

gptel.el

@@ -2379,53 +2471,83 @@ be used to rerun or continue the request at a later time."
                 gptel-backend (and gptel--num-messages-to-send
                                    (* 2 gptel--num-messages-to-send))))))
            ((consp prompt) (gptel--parse-list gptel-backend prompt)))))
-         (info (list :data (gptel--request-data gptel-backend full-prompt)
+         (info (list :data (list :args t


It's not clear to me yet how to avoid this step of storing and rehydrating this state in :data. I'll continue to think about this. But at minimum I think capturing many of these keys is unnecessary since you pass info (actually the whole fsm) to the augmentors, and they are available as :keys inside info in the current implementation on master.

:context: Accessible inside info as :context anyway.

:gptel-include-reasoning: Available as :include-reasoning in info.

:callback: Do we expect the augmentors to change the callback?

:in-place: Available as :in-place in info.

karthink · 2025-05-02T07:06:20Z

gptel.el

@@ -2829,7 +2951,13 @@ the response is inserted into the current buffer after point."
                             (kill-buffer buf)))
                         nil t nil)))
      ;; TODO: Add transformer here.
-      (setf (alist-get proc-buf gptel--request-alist) fsm))))
+      (setf (alist-get proc-buf gptel--request-alist)
+            (cons fsm


We don't need to capture proc-buf into the closure, since this information is found in gptel-abort anyway and we can pass it as an argument to the abort-callback. This should make the callbacks "static" functions that can be native-compiled.
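
A sketch of what passing the buffer as an argument could look like. All names below (`my/request-alist`, `my/abort-kill-buffer`, `my/register-request`, `my/abort`) are hypothetical stand-ins, not the functions in this patch. The abort handler is an ordinary top-level function that captures nothing.

```elisp
;;; -*- lexical-binding: t; -*-

(defvar my/request-alist nil
  "Alist of (PROCESS-BUFFER . (STATE . ABORT-FN)).  Hypothetical.")

(defun my/abort-kill-buffer (proc-buf)
  "Abort handler that receives PROC-BUF as an argument.
It closes over nothing, so it is a plain named function."
  (when (buffer-live-p proc-buf)
    (kill-buffer proc-buf)))

(defun my/register-request (proc-buf state)
  "Record STATE and a static abort handler under PROC-BUF."
  (setf (alist-get proc-buf my/request-alist)
        (cons state #'my/abort-kill-buffer)))

(defun my/abort (proc-buf)
  "Abort the request for PROC-BUF, passing the buffer to its handler.
Return non-nil if a request was found and aborted."
  (let* ((entry (alist-get proc-buf my/request-alist))
         (abort-fn (cdr entry)))
    (when abort-fn
      ;; The caller already knows PROC-BUF, so it is handed over here
      ;; instead of being captured when the request was registered.
      (funcall abort-fn proc-buf)
      (setf (alist-get proc-buf my/request-alist nil 'remove) nil)
      t)))
```

Because the handler takes the buffer as a parameter, the same function object can be stored for every request.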

jwiegley · 2025-05-02T07:10:20Z

> Recording some thoughts/questions about the code, let me know what you think.
>
> I can start making some changes to simplify this patch soon, can I push them to your branch?

I will read through your changes in more detail tomorrow; but in the meantime, please push whatever changes you wish!

jwiegley added 12 commits April 28, 2025 21:30

c8da840 Add AUGMENT stage to request pipeline
7e3bbb1 Add AUGMENT stage to FSM
d7b2dc0 Fix syntax errors
a51373c Correction to handling of system messages for augmentation
ef732f8 Allow file collections to be augmented
5757354 A small bit of minor code reorg
1676968 Extend documentation for one function a tiny bit
1ba7eeb Allow augmentations to be asynchronous
0bed2c7 Update documentation
7a3799a Minor change
03b6a5a Make the abort mechanism more general, for other async process types
483a135 Add gptel-sync-fn macro

karthink reviewed May 2, 2025


jwiegley marked this pull request as ready for review May 2, 2025 07:10
