New in v12.0.0 - ✨ Agentic Workflows and much more #877
Replies: 3 comments 3 replies
-
WOW, this looks super cool! I'm also super happy about this. I wonder if I could use this Agentic Workflow for typechecking TypeScript projects like this:
Looks very promising. Thank you!!
-
This is incredible, and I’m certain this was a massive undertaking. Thank you for all your hard work. I have a question about this:
I find this a bit risky, and it would be fantastic if there were some integration with Git. For instance, if there are any unstaged hunks in the buffer, it would be great if the editor tool could automatically add/commit them before it starts overwriting the buffer. Perhaps we should do it manually for now, though :)
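The safety net suggested here can be done by hand today. A minimal sketch of the idea, as a plain Python helper around the git CLI (the function name and hook point are hypothetical, not a CodeCompanion feature):

```python
import subprocess

# Hypothetical pre-edit hook (a sketch, not part of CodeCompanion):
# commit any unstaged changes to a file before an agentic editor tool
# overwrites the buffer, so the edit is trivially reversible.
def snapshot_before_edit(path, repo="."):
    # `git diff --quiet` exits non-zero when the file has unstaged changes
    dirty = subprocess.run(
        ["git", "diff", "--quiet", "--", path], cwd=repo
    ).returncode != 0
    if dirty:
        subprocess.run(["git", "add", "--", path], cwd=repo, check=True)
        subprocess.run(
            ["git", "commit", "-q", "-m",
             f"wip: snapshot before agentic edit of {path}"],
            cwd=repo, check=True,
        )
    return dirty  # True if a snapshot commit was created
```

The same check could equally be wired to a `BufWritePre`-style autocmd; the key design point is that `git diff --quiet` gives a cheap dirty test before anything is overwritten.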
-
This looks great, I'm looking forward to trying it out. Thanks so much for all your hard work on CodeCompanion. I really appreciate what you're doing, not just for adding great new features like this but the frequent incremental bug fixes and updates you put out so often.
-
Well, this started off as a simple PR a week ago and morphed into something much more...
(Btw, no breaking changes in this release, just felt like bumping to v12.0.0)
Agentic Workflows
I've wanted to be able to benchmark different LLMs in CodeCompanion for a while. In truth, you always could, it was just incredibly manual. Combining that desire with my aim to improve workflows (which I suspect many people have never used or configured), we've ended up here:
AgenticWorkflows.mp4
To summarize what's happening:
Edit<->Test Workflow
The model I'm using in the video is gpt-4o-2024-08-06, and I'm leveraging a new global variable to auto-approve all edits and not use diff mode. I must have tested this about 100 times now, and can confirm it works brilliantly with:
- claude 3.5-sonnet
- o3-mini
- o1-mini
- gemini-2.0-flash
- gpt-4o-2024-08-06
Unfortunately, I've had no success with gpt-3.5-turbo, llama3.1 or qwen2.5-coder 7b. It might be my setup, and I should caveat these findings by saying I don't use locally hosted models very often. You can check out how I've implemented Agentic Workflows on the doc site.
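For anyone curious what an edit<->test loop looks like conceptually, here is a deliberately generic sketch — the function names (`ask_llm`, `apply_edit`, `run_tests`) are placeholders, not CodeCompanion's actual API:

```python
# Generic sketch of an edit<->test agentic loop (assumed shape, not
# CodeCompanion's implementation): the LLM proposes an edit, the edit
# is applied, the test suite runs, and any failures are fed back to
# the LLM until the tests pass or we run out of turns.
def edit_test_loop(ask_llm, apply_edit, run_tests, max_turns=5):
    feedback = "Write the code."
    for _ in range(max_turns):
        edit = ask_llm(feedback)     # LLM proposes an edit
        apply_edit(edit)             # editor tool writes the buffer
        ok, output = run_tests()     # e.g. shell out to pytest
        if ok:
            return True              # suite passed; stop looping
        feedback = f"Tests failed:\n{output}"  # loop failures back
    return False                     # gave up after max_turns
```

The important property is that the test runner's output is the only feedback channel — which is why the runner needs to actually print something (see the pytest note in the References below).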
In order to get the plugin to the state where Agentic Workflows could function, I had to add a lot of other functionality and fix a few things along the way...
Other changes in this release
Features
New #buffer parameters:
- #buffer{watch} - to watch a buffer
- #buffer{pin} - to pin a buffer
Tool mentions are now passed through to the LLM. Previously, if you wrote "I'd like to run the @cmd_runner tool", the LLM would be sent "I'd like to run the tool". Now, the LLM will see "I'd like to run the cmd_runner tool".
Fixes
- :CodeCompanionChat
Refactors
- The #buffer:10-25 parameters have been removed. They had been broken for a while and no one noticed
- gemini-2.0-flash
Roadmap
I'm going to focus on improving the inline assistant next. It's been buggy for too long for a lot of users on locally hosted models, and it's too slow.
After that I'll likely streamline workspaces a little bit (move to a new schema version in the process) and then turn my attention to multi-model editing (#735). Regarding the latter...that might just be the answer to getting tools to work with small models.
Closing
As always, I hope you like the direction the plugin is going in. Agentic Workflows won't be for everyone, but the capabilities I've added to enable that functionality should be.
References
- The workflow in the video uses pytest instead of unittest. The latter doesn't write output to stdout, so there would have been no visuals in the video for the user to observe.