Skip to content

Releases: saigontechnology/AgentCrew

v0.17.6

Choose a tag to compare

@daltonnyx daltonnyx released this 30 Jun 09:17

🚀 New Features

  • /learn command — Extract and store reusable adaptive behaviors from the current conversation. The agent analyzes the conversation, proposes behaviors in when [condition], do [action steps] format, and lets you confirm each one with global or project scope.
  • Transfer tool now requires user confirmation — Agent transfers are no longer auto-approved; you’ll be prompted to approve or deny each transfer request.
  • Fuzzy search for /model completer — The model completer now supports ranked fuzzy matching (prefix > substring > fuzzy character match), making it much easier to find models.
  • Improved loading animation — The console loading spinner now shows rotating tips and a smooth hue gradient that flows over time.
  • New model: sakana/fugu-ultra — Added Sakana Fugu Ultra via CommandCode, a multi-agent orchestration system matching frontier performance on coding, reasoning, and agentic tasks.
  • Enhanced image description — Consolidated and improved vision preprocessing prompt for more detailed, structured descriptions. Vision provider priority reordered, API endpoints corrected, and document processing timeouts increased for reliability.
  • Onboarding improvements — Fixed agent config parsing; now supports multiline agent descriptions and better TOML extraction from LLM responses.

🐛 Bug Fixes

  • Removed duplicate exit message on Ctrl+C — The “Confirmed exit. Goodbye!” message was printed twice; now it appears only once.
  • Fixed onboarding agent config parsing — The TOML extraction regex now correctly handles both ```toml and ``` blocks.
  • Corrected image description API endpoints — Updated Together AI and Google endpoints to the latest URLs.
  • Increased document processing timeout — PDF and Word document timeouts raised from 60s to 600s to handle larger files.

🧹 Chores & Refactoring

  • Removed deprecated /copy command — The /copy command and its associated clipboard integration (both console and GUI) have been removed.
  • Renamed internal tags<Transfer_Tool><Transfer_Request>, <Transfer_Post_Action_Reminder><Post_Transfer_Action_Reminder> for consistency.
  • Thread‑safe adaptive behaviors — Added a threading lock around adaptive behavior file reads/writes to prevent race conditions.
  • Centralized command help messages — Moved all command help strings into a shared constant (COMMAND_HELP_MESSAGES) used by both the welcome message and the loading animation tips.
  • Updated default models — Onboarding now uses newer model versions (e.g., GLM-5.2, GPT-5.5, kimi-k2.7-code).

🙌 Contributors

  • Tài Nguyễn (@nqat2003) — Fuzzy search for /model completer (#45), removed duplicate exit message on Ctrl+C (#44).
  • Quy Truong — All other features, fixes, and chores.

v0.17.4

Choose a tag to compare

@daltonnyx daltonnyx released this 24 Jun 04:52

Features

  • Async file extraction: Converted file processing to be fully asynchronous to prevent blocking the main thread.
    • Added FileHandler.async_process_file() to offload blocking I/O to a thread pool.
    • Updated CodeAnalysisService.get_file_content(), MCP resource formatting, A2A adapter, and ACP filesystem handler to use async variants.
  • GitHub Copilot service – Simplified tool message handling and removed model‑specific workarounds (e.g., special handling for gpt‑4.1).

Chores & Cleanup

  • Removed deprecated fallback code paths in AgentCrew.app, a2a/task_manager.py, file_commands.py, and github_copilot_service.py that were never executed.

v0.17.2

Choose a tag to compare

@daltonnyx daltonnyx released this 23 Jun 14:05

🚀 Major Changes

  • Stateless MCP Service – Reworked the MCP client to be fully stateless: no persistent event loop, no keep-alive workers, no eager server startup. Tool definitions are discovered once (brief connect → list → disconnect) and cached; each tool call creates a temporary session, executes, and tears down immediately. This results in faster runtime and smaller memory footprint.
  • Background MCP Discovery – Agent activation now starts MCP discovery in a background thread (non-blocking), reducing activation time.
  • Improved Context & Memory Handling – Tool rejection reasons and user feedback (e.g., “ask” tool answers) are now preserved in memory and context, improving multi-turn conversation quality.
  • Voyage Embedding Only – Removed OpenAI embedding fallback; only Voyage (model voyage-4) is used as third-party embedding model.

🐛 Bug Fixes

  • Fixed issue where vision preprocessing would reject model IDs containing a slash.
  • Fixed memory-related issues (multiple fixes).
  • Fixed ask tool pairing across parallel calls in local agent.

🔧 Chores

  • Simplified MCPSessionManager – removed thread/loop management, now purely stateless.

v0.17.1

Choose a tag to compare

@daltonnyx daltonnyx released this 22 Jun 07:39

Changelog

Added

  • Vision Support for Document Parsing: Enhanced the Docling integration to support multiple vision-capable LLM providers (Google, OpenAI, Claude, etc.) instead of a single hardcoded option. This allows for more flexible and cost-effective image description in documents.
  • Comprehensive File Handler Tests: Added a full test suite (tests/test_file_handler.py) covering various file types (PDF, images, text, JSON) to ensure robust file processing.
  • New Dependencies: Added mail-parser for email processing capabilities.
  • Command Code Service: Updated the CommandCodeService to use the DeepSeek V4 Pro model and specific headers.

Changed

  • Conversation Forking Logic: Fixed the fork_conversation method to correctly trace the effective parent when forking within inherited content, ensuring forks become siblings rather than nested children.
  • Console Display: Limited the maximum number of messages rendered in the conversation history to the last 50 messages to improve performance.
  • Tool Learning Behavior: Refined the learning logic to strictly enforce the when <condition>, <action> format for behavior learning.
  • Dependency Updates: Updated openai (2.32.0 -> 2.36.0), together (2.10.0 -> 2.16.0), docling (2.91.0 -> 2.103.0), pydantic-settings (2.13.1 -> 2.14.1), docling-core (2.74.0 -> 2.82.0), and docling-parse (5.6.1 -> 6.2.0). Removed ocrmac.

Fixed

  • Fork Parent Tracking: Ensured that forking conversations correctly identifies the effective parent when the fork point falls within inherited content from an ancestor.

v0.17.0

Choose a tag to compare

@daltonnyx daltonnyx released this 18 Jun 16:41

Release v0.17.0

🚀 Features

  • Vision Preprocessing for Non-Vision Models
    Introduced VisionPreprocessingUtils and VisionDescriptionCache to automatically replace image content with text descriptions generated by an associated vision model when the primary model lacks vision capabilities.

    • New vision_model field on Model definitions to specify a companion vision model per provider.
    • Caching of image descriptions (SHA‑256 fingerprint, configurable path, optional disable).
    • Applied to many models across CommandCode, CrofAI, DeepInfra, Fireworks, OpenCode Go providers.
  • Enhanced Token Usage Tracking

    • Changed metadata keys from total_tokens to total_input_tokens for clarity.
    • Added cached_tokens, cache_creation_tokens, total_tokens, and cost to task/session metadata.
    • Token usage is now merged across turns in ACP sessions and persisted to session store.
    • Console UI and Qt GUI now display total session tokens and cumulative cost.
    • process_message now accepts str | list of messages and an optional model_id parameter, enabling vision model calls.
  • ACP Session Improvements

    • TokenUsage is stored and restored in AcpSessionState.
    • New send_usage_update method pushes token usage, cost, context size, and usage to ACP clients.
    • Session lifecycle persists token_usage as a dict.
  • New Models Added

    • CommandCode: Kimi K2.7 Code HighSpeed, Nemotron 3 Ultra (free), Qwen3.7 Plus (vision).
    • OpenCode Go: Qwen3.7 Plus, removed deprecated mimo-v2-pro, mimo-v2-omni, minimax-m2.5.
    • DeepInfra: Nemotron 3 Super (vision), Qwen 3.5 27B (vision).
    • Fireworks: Vision model assignments for existing models.
  • Model Capability Updates

    • Added vision capability to many models: GPT‑5.3 Codex, GPT‑5.4 Mini, Qwen3.6 Plus, Qwen3.7 Plus, Step 3.7 Flash, MiMo V2.5, Gemini 3.1 Flash Lite, and more for CommandCode provider.
    • Added vision_model references (e.g., xiaomi/mimo-v2.5, google/gemma-4-31B-it, qwen3.6-27b) for models without native vision.

🐛 Bug Fixes

  • Stop Interrupt Handling
    Fixed double token merge when streaming is stopped early – now returns immediately without processing tools after stop.

  • Conversation Fork Token Reset
    Forked conversations now reset token usage metadata (input_tokens, output_tokens, `

v0.16.8

Choose a tag to compare

@daltonnyx daltonnyx released this 17 Jun 15:22

Release v0.16.8

✨ New Features

GLM 5.2 Model Support

  • Added GLM 5.2 to all major providers: CommandCode, CrofAI, DeepInfra, Fireworks, and OpenCode Go – a next-generation flagship model for agentic engineering with significantly stronger coding capabilities and long-context reasoning.

CrofAI Provider Updates

  • Added Greg 2 Ultra and Greg 2 Super models (replacing Greg 1 Normal/Super) with updated pricing.
  • Added vision capability to Greg 1 Mini.
  • New GLM 5.2 entry for CrofAI with 262K context window.

Image Generation Overhaul

  • OpenAIImageProvider now supports ChatGPT subscription authentication via Codex OAuth tokens, allowing users to generate images with gpt-image-2 using their ChatGPT subscription (no separate API key required). Run agentcrew chatgpt-auth to enable.
  • GeminiImageProvider now takes higher priority over OpenAI image generation.
  • DeepInfra image generation model updated from FLUX-2-pro to FLUX-2-klein-9b.
  • Provider availability checks now correctly verify the client object rather than just the API key string.

Infrastructure

  • Added ChatGPT-Account-ID header support to OpenAICodexService and image generation, matching Codex CLI behavior for subscription routing.
  • Exposed account_id, is_available(), and get_auth() class methods on OpenAICodexOAuth for broader integration.

🔧 Changes & Fixes

  • OpenCode Go: Removed structured_output from Kimi K2.5, K2.6, Qwen3.5/3.6/3.7 models, MiniMax M3/M2.7/M2.5, and DeepSeek V4 Pro/Flash because this capability is not supported.
  • Setup: Updated image generation warning message to guide users toward ChatGPT subscription auth.

v0.16.7

Choose a tag to compare

@daltonnyx daltonnyx released this 16 Jun 04:45

hotfix: some error happens when switch from non-reasoning to reasoning model in middle of conversation

v0.16.6

Choose a tag to compare

@daltonnyx daltonnyx released this 15 Jun 15:51

Release v0.16.6

Features

  • Added MiniMax M3 model to providers
    Integrated MiniMax M3 with support for vision, tool use, thinking, streaming, and structured output across Fireworks, Together AI, and CommandCode providers.

  • Stream handler: add upper limit for retry stream
    Introduced a MAX_EMPTY_RESPONSE_RETRIES limit (default 5) to prevent infinite loop when the model returns empty responses.

  • Image generation: allow agent to set target output dir
    Added an optional output_dir parameter to the image generation tool so agents can specify where to save generated images.

Bug Fixes

  • Reasoning models: make reasoning model work properly with tool calls
    Fixed handling of reasoning_content for reasoning-capable models when tool calls are present, ensuring empty reasoning content defaults to a space to avoid API errors.

  • Custom LLM: in OpenAI completion API, tool cannot have image type
    Cheat by converting image content from tool responses into a separate user message for providers that do not allow images in assistant/tool roles (applies to both CustomLLM and TogetherAI services).

  • File reading: fix issue that agent cannot read image file correctly
    Updated CodeAnalysisService to return the image result dict (including image_url) when a file is an image, and added .webp support to MIME mapping.

  • Model: fix bug that can cause loop on reasoning model without reason content
    Prevent infinite retries when a reasoning model returns empty reasoning_content in _run_stream_response.

  • Debug: make the formatting correctly follow source format
    Changed _truncate_content to preserve content list structure (e.g., images, tool calls) instead of flattening it into a single text string, improving console display.

Other Changes

  • guess_mime_by_extension promoted to a static method in FileHandler for reuse.

v0.16.3

Choose a tag to compare

@daltonnyx daltonnyx released this 15 Jun 06:30

Release v0.16.3

🚀 New Features

  • Image Generation Tool — Generate images using a structured JSON meta prompt format. Supports three providers with automatic fallback: OpenAI gpt-image-2 → Google Gemini gemini-3.1-flash-image → DeepInfra FLUX-2-pro. Agents can describe subjects, environments, camera settings, lighting, and style to produce detailed images. The generated image is displayed inline in the conversation. (#cc89733, #f7ba2b00, #42d2dabe)

  • Console UI Image Rendering — Inline image display for attached images, generated images, and tool results containing images. Uses textual-image renderable with base64 data URI decoding. (#f7ba2b00)

  • Optimized Image Input — Images attached to conversations are automatically converted to WebP format with resizing (max 2048px) and quality compression (80) for more efficient context usage. Includes caching to avoid reprocessing. (#133c16e3)

  • Kimi K2.7 Code Model — Added to CommandCode, CrofAI, Fireworks, and OpenCode Go providers. A coding-focused agentic model with improved long-horizon coding and 30% fewer thinking tokens compared to K2.6. (#758fc67b)

🔄 Improvements

  • Reasoning Content Rework — Thinking blocks are now embedded directly inside assistant messages instead of separate MessageType.Thinking messages. This improves compatibility with OSS models while preserving thinking data for providers that support it (Claude, Gemini, OpenAI reasoning models). The MessageType.Thinking enum has been removed. (#d3a34336, #68b297bc)

  • File Handling Decoupled from LLM — File processing is now handled independently by FileHandler instead of each LLM service. The deprecated _process_file, process_file_for_message, and handle_file_command methods in provider services now return None (marked with @deprecated). (#a12f0467)

  • Enhanced Context Messages — Improved context prompts for better agent behavior and transfer handling. (#3dfd099d)

  • Browser Screenshot Optimization — Screenshots go through the same image optimization pipeline before being sent to the model. (#133c16e3)

🐛 Fixes

  • Fixed token usage merging in LocalAgent streaming — now properly merges chunk_token_usage even when no tool uses are present.
  • Fixed _finalize_current_turn method signature — removed unused assistant_response and emit_response_completed parameters.
  • Fixed GitHub Copilot service thinking block handling for backward compatibility with old conversations.
  • Fixed TogetherAI service to output reasoning_content instead of wrapping thinking in <thinking> tags.

🧹 Housekeeping

  • Added textual-image==0.13.2 dependency to pyproject.toml
  • Added test script for image rendering
  • Updated .gitignore to exclude .agentcrew/images/ output directory

v0.16.2

Choose a tag to compare

@daltonnyx daltonnyx released this 12 Jun 06:37

Release v0.16.2

✨ New Features

  • Added Claude Fable 5 model via CommandCode — the most capable model for demanding reasoning and long-horizon agent tasks.
  • Real pricing added for all CommandCode models — previously all models showed zero cost; now accurate per-model pricing is set (input/output/cached).
  • Removed deprecated OpenAI Codex modelsgpt-5.2 and gpt-5.3-codex models removed from the Codex provider list.

🐛 Bug Fixes

  • Token usage merge fix (Anthropic models): Fixed merge() method in TokenUsage that was resetting input_tokens, output_tokens, and cached_tokens to 0 when the other instance had None values — now properly falls back to self values.
  • Docling converter null safety: Added null check for self.converter in file handler to prevent crashes when Docling is enabled but the converter isn't initialized.
  • Agent evaluation parsing rewrite: Completely reworked the <agent_evaluation> block parser to support:
    • Multiple evaluation blocks within the same response text.
    • Removal of the requirement that the block must appear at the start of the text.
    • Better detection of incomplete tags being typed.
  • Agent transfer planning fix: Agent plans generated after a transfer no longer instruct the receiving agent to "consider transferring again" — the evaluation prompt is now only included for user requests, not internal agent-to-agent requests.
  • File editing tool enhancement: Added full content mode support — when a single search/replace block has an empty search field, it's treated as a full file overwrite instead of a search/replace operation.
  • Code analysis tool improvements: Made the path parameter optional with default "." and deep_analysis now defaults to True.

🧹 Chores & Improvements

  • Reduced MCP resource token consumption: MCP resources in context now only show uri and description fields instead of all six fields (uri, name, title, description, mimeType, size), significantly reducing context usage.
  • Cleaner directory context: Current directory structure now includes the directory name without the full path, making context more readable.
  • Removed unused _build_adaptive_behavior_context method from LocalAgent.