Releases: saigontechnology/AgentCrew
Release list
v0.17.6
🚀 New Features
/learncommand — Extract and store reusable adaptive behaviors from the current conversation. The agent analyzes the conversation, proposes behaviors inwhen [condition], do [action steps]format, and lets you confirm each one with global or project scope.- Transfer tool now requires user confirmation — Agent transfers are no longer auto-approved; you’ll be prompted to approve or deny each transfer request.
- Fuzzy search for
/modelcompleter — The model completer now supports ranked fuzzy matching (prefix > substring > fuzzy character match), making it much easier to find models. - Improved loading animation — The console loading spinner now shows rotating tips and a smooth hue gradient that flows over time.
- New model:
sakana/fugu-ultra— Added Sakana Fugu Ultra via CommandCode, a multi-agent orchestration system matching frontier performance on coding, reasoning, and agentic tasks. - Enhanced image description — Consolidated and improved vision preprocessing prompt for more detailed, structured descriptions. Vision provider priority reordered, API endpoints corrected, and document processing timeouts increased for reliability.
- Onboarding improvements — Fixed agent config parsing; now supports multiline agent descriptions and better TOML extraction from LLM responses.
🐛 Bug Fixes
- Removed duplicate exit message on Ctrl+C — The “Confirmed exit. Goodbye!” message was printed twice; now it appears only once.
- Fixed onboarding agent config parsing — The TOML extraction regex now correctly handles both
```tomland```blocks. - Corrected image description API endpoints — Updated Together AI and Google endpoints to the latest URLs.
- Increased document processing timeout — PDF and Word document timeouts raised from 60s to 600s to handle larger files.
🧹 Chores & Refactoring
- Removed deprecated
/copycommand — The/copycommand and its associated clipboard integration (both console and GUI) have been removed. - Renamed internal tags —
<Transfer_Tool>→<Transfer_Request>,<Transfer_Post_Action_Reminder>→<Post_Transfer_Action_Reminder>for consistency. - Thread‑safe adaptive behaviors — Added a threading lock around adaptive behavior file reads/writes to prevent race conditions.
- Centralized command help messages — Moved all command help strings into a shared constant (
COMMAND_HELP_MESSAGES) used by both the welcome message and the loading animation tips. - Updated default models — Onboarding now uses newer model versions (e.g., GLM-5.2, GPT-5.5, kimi-k2.7-code).
🙌 Contributors
v0.17.4
Features
- Async file extraction: Converted file processing to be fully asynchronous to prevent blocking the main thread.
- Added
FileHandler.async_process_file()to offload blocking I/O to a thread pool. - Updated
CodeAnalysisService.get_file_content(), MCP resource formatting, A2A adapter, and ACP filesystem handler to use async variants.
- Added
- GitHub Copilot service – Simplified tool message handling and removed model‑specific workarounds (e.g., special handling for
gpt‑4.1).
Chores & Cleanup
- Removed deprecated fallback code paths in
AgentCrew.app,a2a/task_manager.py,file_commands.py, andgithub_copilot_service.pythat were never executed.
v0.17.2
🚀 Major Changes
- Stateless MCP Service – Reworked the MCP client to be fully stateless: no persistent event loop, no keep-alive workers, no eager server startup. Tool definitions are discovered once (brief connect → list → disconnect) and cached; each tool call creates a temporary session, executes, and tears down immediately. This results in faster runtime and smaller memory footprint.
- Background MCP Discovery – Agent activation now starts MCP discovery in a background thread (non-blocking), reducing activation time.
- Improved Context & Memory Handling – Tool rejection reasons and user feedback (e.g., “ask” tool answers) are now preserved in memory and context, improving multi-turn conversation quality.
- Voyage Embedding Only – Removed OpenAI embedding fallback; only Voyage (model
voyage-4) is used as third-party embedding model.
🐛 Bug Fixes
- Fixed issue where vision preprocessing would reject model IDs containing a slash.
- Fixed memory-related issues (multiple fixes).
- Fixed
asktool pairing across parallel calls in local agent.
🔧 Chores
- Simplified
MCPSessionManager– removed thread/loop management, now purely stateless.
v0.17.1
Changelog
Added
- Vision Support for Document Parsing: Enhanced the Docling integration to support multiple vision-capable LLM providers (Google, OpenAI, Claude, etc.) instead of a single hardcoded option. This allows for more flexible and cost-effective image description in documents.
- Comprehensive File Handler Tests: Added a full test suite (
tests/test_file_handler.py) covering various file types (PDF, images, text, JSON) to ensure robust file processing. - New Dependencies: Added
mail-parserfor email processing capabilities. - Command Code Service: Updated the
CommandCodeServiceto use the DeepSeek V4 Pro model and specific headers.
Changed
- Conversation Forking Logic: Fixed the
fork_conversationmethod to correctly trace the effective parent when forking within inherited content, ensuring forks become siblings rather than nested children. - Console Display: Limited the maximum number of messages rendered in the conversation history to the last 50 messages to improve performance.
- Tool Learning Behavior: Refined the learning logic to strictly enforce the
when <condition>, <action>format for behavior learning. - Dependency Updates: Updated
openai(2.32.0 -> 2.36.0),together(2.10.0 -> 2.16.0),docling(2.91.0 -> 2.103.0),pydantic-settings(2.13.1 -> 2.14.1),docling-core(2.74.0 -> 2.82.0), anddocling-parse(5.6.1 -> 6.2.0). Removedocrmac.
Fixed
- Fork Parent Tracking: Ensured that forking conversations correctly identifies the effective parent when the fork point falls within inherited content from an ancestor.
v0.17.0
Release v0.17.0
🚀 Features
-
Vision Preprocessing for Non-Vision Models
IntroducedVisionPreprocessingUtilsandVisionDescriptionCacheto automatically replace image content with text descriptions generated by an associated vision model when the primary model lacks vision capabilities.- New
vision_modelfield onModeldefinitions to specify a companion vision model per provider. - Caching of image descriptions (SHA‑256 fingerprint, configurable path, optional disable).
- Applied to many models across CommandCode, CrofAI, DeepInfra, Fireworks, OpenCode Go providers.
- New
-
Enhanced Token Usage Tracking
- Changed metadata keys from
total_tokenstototal_input_tokensfor clarity. - Added
cached_tokens,cache_creation_tokens,total_tokens, andcostto task/session metadata. - Token usage is now merged across turns in ACP sessions and persisted to session store.
- Console UI and Qt GUI now display total session tokens and cumulative cost.
process_messagenow acceptsstr | listof messages and an optionalmodel_idparameter, enabling vision model calls.
- Changed metadata keys from
-
ACP Session Improvements
TokenUsageis stored and restored inAcpSessionState.- New
send_usage_updatemethod pushes token usage, cost, context size, and usage to ACP clients. - Session lifecycle persists
token_usageas a dict.
-
New Models Added
- CommandCode:
Kimi K2.7 Code HighSpeed,Nemotron 3 Ultra(free),Qwen3.7 Plus(vision). - OpenCode Go:
Qwen3.7 Plus, removed deprecatedmimo-v2-pro,mimo-v2-omni,minimax-m2.5. - DeepInfra:
Nemotron 3 Super(vision),Qwen 3.5 27B(vision). - Fireworks: Vision model assignments for existing models.
- CommandCode:
-
Model Capability Updates
- Added
visioncapability to many models: GPT‑5.3 Codex, GPT‑5.4 Mini, Qwen3.6 Plus, Qwen3.7 Plus, Step 3.7 Flash, MiMo V2.5, Gemini 3.1 Flash Lite, and more for CommandCode provider. - Added
vision_modelreferences (e.g.,xiaomi/mimo-v2.5,google/gemma-4-31B-it,qwen3.6-27b) for models without native vision.
- Added
🐛 Bug Fixes
-
Stop Interrupt Handling
Fixed double token merge when streaming is stopped early – now returns immediately without processing tools after stop. -
Conversation Fork Token Reset
Forked conversations now reset token usage metadata (input_tokens,output_tokens, `
v0.16.8
Release v0.16.8
✨ New Features
GLM 5.2 Model Support
- Added GLM 5.2 to all major providers: CommandCode, CrofAI, DeepInfra, Fireworks, and OpenCode Go – a next-generation flagship model for agentic engineering with significantly stronger coding capabilities and long-context reasoning.
CrofAI Provider Updates
- Added Greg 2 Ultra and Greg 2 Super models (replacing Greg 1 Normal/Super) with updated pricing.
- Added
visioncapability to Greg 1 Mini. - New GLM 5.2 entry for CrofAI with 262K context window.
Image Generation Overhaul
- OpenAIImageProvider now supports ChatGPT subscription authentication via Codex OAuth tokens, allowing users to generate images with
gpt-image-2using their ChatGPT subscription (no separate API key required). Runagentcrew chatgpt-authto enable. - GeminiImageProvider now takes higher priority over OpenAI image generation.
- DeepInfra image generation model updated from
FLUX-2-protoFLUX-2-klein-9b. - Provider availability checks now correctly verify the client object rather than just the API key string.
Infrastructure
- Added
ChatGPT-Account-IDheader support toOpenAICodexServiceand image generation, matching Codex CLI behavior for subscription routing. - Exposed
account_id,is_available(), andget_auth()class methods onOpenAICodexOAuthfor broader integration.
🔧 Changes & Fixes
- OpenCode Go: Removed
structured_outputfrom Kimi K2.5, K2.6, Qwen3.5/3.6/3.7 models, MiniMax M3/M2.7/M2.5, and DeepSeek V4 Pro/Flash because this capability is not supported. - Setup: Updated image generation warning message to guide users toward ChatGPT subscription auth.
v0.16.7
v0.16.6
Release v0.16.6
Features
-
Added MiniMax M3 model to providers
Integrated MiniMax M3 with support for vision, tool use, thinking, streaming, and structured output across Fireworks, Together AI, and CommandCode providers. -
Stream handler: add upper limit for retry stream
Introduced aMAX_EMPTY_RESPONSE_RETRIESlimit (default 5) to prevent infinite loop when the model returns empty responses. -
Image generation: allow agent to set target output dir
Added an optionaloutput_dirparameter to the image generation tool so agents can specify where to save generated images.
Bug Fixes
-
Reasoning models: make reasoning model work properly with tool calls
Fixed handling ofreasoning_contentfor reasoning-capable models when tool calls are present, ensuring empty reasoning content defaults to a space to avoid API errors. -
Custom LLM: in OpenAI completion API, tool cannot have image type
Cheat by converting image content from tool responses into a separate user message for providers that do not allow images in assistant/tool roles (applies to both CustomLLM and TogetherAI services). -
File reading: fix issue that agent cannot read image file correctly
UpdatedCodeAnalysisServiceto return the image result dict (includingimage_url) when a file is an image, and added.webpsupport to MIME mapping. -
Model: fix bug that can cause loop on reasoning model without reason content
Prevent infinite retries when a reasoning model returns emptyreasoning_contentin_run_stream_response. -
Debug: make the formatting correctly follow source format
Changed_truncate_contentto preserve content list structure (e.g., images, tool calls) instead of flattening it into a single text string, improving console display.
Other Changes
guess_mime_by_extensionpromoted to a static method inFileHandlerfor reuse.
v0.16.3
Release v0.16.3
🚀 New Features
-
Image Generation Tool — Generate images using a structured JSON meta prompt format. Supports three providers with automatic fallback: OpenAI
gpt-image-2→ Google Geminigemini-3.1-flash-image→ DeepInfraFLUX-2-pro. Agents can describe subjects, environments, camera settings, lighting, and style to produce detailed images. The generated image is displayed inline in the conversation. (#cc89733, #f7ba2b00, #42d2dabe) -
Console UI Image Rendering — Inline image display for attached images, generated images, and tool results containing images. Uses
textual-imagerenderable with base64 data URI decoding. (#f7ba2b00) -
Optimized Image Input — Images attached to conversations are automatically converted to WebP format with resizing (max 2048px) and quality compression (80) for more efficient context usage. Includes caching to avoid reprocessing. (#133c16e3)
-
Kimi K2.7 Code Model — Added to CommandCode, CrofAI, Fireworks, and OpenCode Go providers. A coding-focused agentic model with improved long-horizon coding and 30% fewer thinking tokens compared to K2.6. (#758fc67b)
🔄 Improvements
-
Reasoning Content Rework — Thinking blocks are now embedded directly inside assistant messages instead of separate
MessageType.Thinkingmessages. This improves compatibility with OSS models while preserving thinking data for providers that support it (Claude, Gemini, OpenAI reasoning models). TheMessageType.Thinkingenum has been removed. (#d3a34336, #68b297bc) -
File Handling Decoupled from LLM — File processing is now handled independently by
FileHandlerinstead of each LLM service. The deprecated_process_file,process_file_for_message, andhandle_file_commandmethods in provider services now returnNone(marked with@deprecated). (#a12f0467) -
Enhanced Context Messages — Improved context prompts for better agent behavior and transfer handling. (#3dfd099d)
-
Browser Screenshot Optimization — Screenshots go through the same image optimization pipeline before being sent to the model. (#133c16e3)
🐛 Fixes
- Fixed token usage merging in
LocalAgentstreaming — now properly mergeschunk_token_usageeven when no tool uses are present. - Fixed
_finalize_current_turnmethod signature — removed unusedassistant_responseandemit_response_completedparameters. - Fixed GitHub Copilot service thinking block handling for backward compatibility with old conversations.
- Fixed TogetherAI service to output
reasoning_contentinstead of wrapping thinking in<thinking>tags.
🧹 Housekeeping
- Added
textual-image==0.13.2dependency topyproject.toml - Added test script for image rendering
- Updated
.gitignoreto exclude.agentcrew/images/output directory
v0.16.2
Release v0.16.2
✨ New Features
- Added Claude Fable 5 model via CommandCode — the most capable model for demanding reasoning and long-horizon agent tasks.
- Real pricing added for all CommandCode models — previously all models showed zero cost; now accurate per-model pricing is set (input/output/cached).
- Removed deprecated OpenAI Codex models —
gpt-5.2andgpt-5.3-codexmodels removed from the Codex provider list.
🐛 Bug Fixes
- Token usage merge fix (Anthropic models): Fixed
merge()method inTokenUsagethat was resettinginput_tokens,output_tokens, andcached_tokensto0when the other instance hadNonevalues — now properly falls back to self values. - Docling converter null safety: Added null check for
self.converterin file handler to prevent crashes when Docling is enabled but the converter isn't initialized. - Agent evaluation parsing rewrite: Completely reworked the
<agent_evaluation>block parser to support:- Multiple evaluation blocks within the same response text.
- Removal of the requirement that the block must appear at the start of the text.
- Better detection of incomplete tags being typed.
- Agent transfer planning fix: Agent plans generated after a transfer no longer instruct the receiving agent to "consider transferring again" — the evaluation prompt is now only included for user requests, not internal agent-to-agent requests.
- File editing tool enhancement: Added full content mode support — when a single search/replace block has an empty
searchfield, it's treated as a full file overwrite instead of a search/replace operation. - Code analysis tool improvements: Made the
pathparameter optional with default"."anddeep_analysisnow defaults toTrue.
🧹 Chores & Improvements
- Reduced MCP resource token consumption: MCP resources in context now only show
urianddescriptionfields instead of all six fields (uri,name,title,description,mimeType,size), significantly reducing context usage. - Cleaner directory context: Current directory structure now includes the directory name without the full path, making context more readable.
- Removed unused
_build_adaptive_behavior_contextmethod fromLocalAgent.