3855d31 to 2f8d349
|
Thank you! Great idea :-) The one thing I'd like to see improved is making the argument handling at the CLI a bit more flexible. Perhaps we could: (1) Support positional arguments. My reasoning is that having users write JSON will get kind of fiddly and error-prone. Related: we recently added "centaur mode" to Inspect SWE to let a |
|
OK that makes sense, let me think about improving the CLI API then before opening this PR! :-)
|
|
I've fixed the hanging test, now working on changing the CLI API |
In argparse, an argument is either positional or named. You can't define one argument that accepts both `task tool addition 12 34` and `task tool addition --x 12 --y 34`. This is supported by libraries like `click`, but installing click and its dependencies in the human container doesn't seem worth it.
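To illustrate the positional-versus-named point in the commit message above, here is a minimal sketch. The `addition` tool and its `x` and `y` parameters are taken from the example commands; the two parser objects are hypothetical and are not the code in this PR. Each argparse argument is declared either with a bare name (positional) or with a `--` prefix (named), so supporting both call styles would need two separate definitions.

```python
import argparse

# Named style: each parameter is an option introduced by "--".
named = argparse.ArgumentParser(prog="task tool addition")
named.add_argument("--x", type=int, required=True)
named.add_argument("--y", type=int, required=True)

# Positional style: the same parameters, identified only by their order.
positional = argparse.ArgumentParser(prog="task tool addition")
positional.add_argument("x", type=int)
positional.add_argument("y", type=int)

named_args = named.parse_args(["--x", "12", "--y", "34"])
positional_args = positional.parse_args(["12", "34"])

print(named_args.x + named_args.y)
print(positional_args.x + positional_args.y)
```

Passing `12 34` to the `named` parser exits with a usage error, because `--x` and `--y` are missing and the bare values are not recognized.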
misleading since users must use --raw-json-escape-hatch for everything anyway
… HumanAgentCommands work
|
Here's an attempt using However, it doesn't support positional arguments, all arguments must be passed as named arguments. @jjallaire how much do you care about positional arguments? My impression is that (Separately, I think relying on CLI standards like If at this point you'd prefer to take over the PR and do things differently I'd also be happy with that! |
|
I'm fine with using CLI args (no positional). The YAML was just so that strings like "true" get properly typed (perhaps argparse already does that though?) |
|
Got it! argparse already handles integers properly when defined using The standard way of handling booleans in argparse is by the presence/absence of flags, rather than by passing in True or False as strings. This is done with The only case I can think of where this could be confusing is with a tool that has an optional boolean argument that defaults to True instead of False. Even in that case I would rather just do it the default argparse way, rather than doing a conversion with e.g. I think in any case what we are doing here (automatically generating code for a CLI interface) is always going to be pretty imperfect. |
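For readers following the typing discussion, here is a small sketch of how argparse handles integers and booleans. The option names (`--count`, `--verbose`, `--cache`) are hypothetical and not taken from the PR. `type=int` and `action="store_true"` are standard argparse features; `argparse.BooleanOptionalAction` (Python 3.9+) is shown only as one possible way to handle a boolean that defaults to True, and is an assumption rather than what the PR does.

```python
import argparse

parser = argparse.ArgumentParser(prog="task tool example")

# type=int converts the string "12" from the command line into the integer 12.
parser.add_argument("--count", type=int, default=1)

# store_true: the flag's presence means True, its absence means False.
parser.add_argument("--verbose", action="store_true")

# Assumption: for a boolean that defaults to True, BooleanOptionalAction
# generates a paired "--cache" / "--no-cache" flag.
parser.add_argument("--cache", action=argparse.BooleanOptionalAction, default=True)

args = parser.parse_args(["--count", "12", "--verbose", "--no-cache"])
defaults = parser.parse_args([])

print(args.count, args.verbose, args.cache)
print(defaults.count, defaults.verbose, defaults.cache)
```

With plain `store_true`, a default-True boolean cannot be switched off from the command line, which is the confusing case mentioned in the comment above.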
| escape_hatch_preparse = dedent(""" | ||
| # Pre-parse for --raw-json-escape-hatch (bypasses argparse validation) | ||
| import json | ||
| ESCAPE_HATCH = "--raw-json-escape-hatch" |
What's the motivation behind calling it --raw-json-escape-hatch vs. the shorter/easier to type --json or --raw-json?
The name is verbose on purpose: less likely to conflict with an existing tool argument.
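As a rough illustration of the pre-parse step under review, the sketch below scans the argument list for the escape hatch flag before argparse sees it. Only the `ESCAPE_HATCH` constant and the `json` import come from the diff; the `preparse_escape_hatch` function, its error messages, and the requirement that the payload be a JSON object are assumptions made for this example.

```python
import json

ESCAPE_HATCH = "--raw-json-escape-hatch"


def preparse_escape_hatch(argv):
    """Return the decoded JSON object after the escape hatch flag, or None if absent."""
    if ESCAPE_HATCH not in argv:
        return None  # caller falls through to normal argparse handling
    index = argv.index(ESCAPE_HATCH)
    if index + 1 >= len(argv):
        raise SystemExit(f"{ESCAPE_HATCH} requires a JSON string argument")
    try:
        payload = json.loads(argv[index + 1])
    except json.JSONDecodeError as exc:
        raise SystemExit(f"invalid JSON after {ESCAPE_HATCH}: {exc}")
    if not isinstance(payload, dict):
        # Assumption: tool arguments are passed as a JSON object of name/value pairs.
        raise SystemExit(f"{ESCAPE_HATCH} expects a JSON object")
    return payload


tool_args = preparse_escape_hatch(
    [ESCAPE_HATCH, '{"config": {"nested": true}}']
)
print(tool_args)
```

Because the flag is matched by its exact name, a tool parameter called `json` or `raw_json` would not collide with it, which is consistent with the reasoning for the verbose name.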
This PR adds an optional `tools: list[Tool]` parameter to `human_cli` that allows humans to call tools in the same way LMs can.

Motivation: to give humans and LMs the same tools, without having to maintain them in two different places. (Think of some complex custom tool; this isn't relevant for something like file editing tools.)

Flagging my main design decisions (happy to make different choices on any of these):

(`task tool addition --x 12 --y 34`)
Maps JSON Schema types → argparse for simple types: str, int, float, bool, and arrays of primitives.

(`task tool db_lookup --raw-json-escape-hatch '{"config": {"nested": true}}'`)
Tools with complex parameter types (dicts, nested objects, unions) show a help message directing users to use `--raw-json-escape-hatch`. The JSON schema is displayed so users know what to pass.

Nesting under task keyword (i.e. `task tool db_lookup`, not `tool db_lookup`)
This seemed simplest with the current setup.

Result display
`ToolResult` can be str, int, images, audio, etc. Serialize types to string if possible; raise `NotImplementedError` for image/audio/video.

This PR contains: