diff --git a/assets/README/README_RU.md b/assets/README/README_RU.md index 73b74f12..94bd6864 100644 --- a/assets/README/README_RU.md +++ b/assets/README/README_RU.md @@ -5,18 +5,25 @@ # DeepTutor: Персональный учебный ассистент на базе ИИ [![Python](https://img.shields.io/badge/Python-3.10%2B-3776AB?style=flat-square&logo=python&logoColor=white)](https://www.python.org/downloads/) -[![Next.js](https://img.shields.io/badge/Next.js-16-000000?style=flat-square&logo=next.js&logoColor=white)](https://nextjs.org/) [![FastAPI](https://img.shields.io/badge/FastAPI-0.100+-009688?style=flat-square&logo=fastapi&logoColor=white)](https://fastapi.tiangolo.com/) +[![React](https://img.shields.io/badge/React-19-61DAFB?style=flat-square&logo=react&logoColor=black)](https://react.dev/) +[![Next.js](https://img.shields.io/badge/Next.js-16-000000?style=flat-square&logo=next.js&logoColor=white)](https://nextjs.org/) +[![TailwindCSS](https://img.shields.io/badge/Tailwind-3.4-06B6D4?style=flat-square&logo=tailwindcss&logoColor=white)](https://tailwindcss.com/) [![License](https://img.shields.io/badge/License-AGPL--3.0-blue?style=flat-square)](../../LICENSE) -[![Discord](https://img.shields.io/badge/Discord-Join-7289DA?style=flat&logo=discord&logoColor=white)](https://discord.gg/eRsjPgMU4t) -[![Feishu](https://img.shields.io/badge/Feishu-Group-blue?style=flat)](../../Communication.md) -[![WeChat](https://img.shields.io/badge/WeChat-Group-green?style=flat&logo=wechat)](https://github.com/HKUDS/DeepTutor/issues/78) + +

+ Discord +    + Feishu +    + WeChat +

-[**Быстрый Старт**](#быстрый-старт) · [**Основные Модули**](#основные-модули) · [**Часто Задаваемые Вопросы**](#часто-задаваемые-вопросы) +[**Быстрый старт**](#-быстрый-старт) · [**Основные модули**](#-основные-модули) · [**Часто задаваемые вопросы**](#-часто-задаваемые-вопросы) -[🇬🇧 English](../../README.md) · [🇨🇳 中文](README_CN.md) · [🇯🇵 日本語](README_JA.md) · [🇪🇸 Español](README_ES.md) · [🇫🇷 Français](README_FR.md) · [🇸🇦 العربية](README_AR.md) · [🇵🇹 Português](README_PT.md) · [🇮🇳 हिन्दी](README_HI.md) +[🇬🇧 English](../../README.md) · [🇨🇳 中文](README_CN.md) · [🇯🇵 日本語](README_JA.md) · [🇪🇸 Español](README_ES.md) · [🇫🇷 Français](README_FR.md) · [🇸🇦 العربية](README_AR.md) · [🇮🇳 हिन्दी](README_HI.md) · [🇵🇹 Português](README_PT.md) @@ -28,13 +35,36 @@ --- -> **[2026.1.18]** Релиз [v0.5.2](https://github.com/HKUDS/DeepTutor/releases/tag/v0.5.1) — улучшение RAG-пайплайна (поддержка Docling) и улучшения CI/CD с исправлением нескольких мелких ошибок — спасибо за отзывы! +### 📰 Новости > **[2026.1.1]** С Новым годом! Присоединяйтесь к нашему [Discord-сообществу](https://discord.gg/zpP9cssj), [WeChat-сообществу](https://github.com/HKUDS/DeepTutor/issues/78) или [Discussions](https://github.com/HKUDS/DeepTutor/discussions) — формируйте будущее DeepTutor! 💬 > **[2025.12.30]** Посетите наш [официальный сайт](https://hkuds.github.io/DeepTutor/) для получения дополнительной информации! > **[2025.12.29]** DeepTutor уже в сети! ✨ + +### 📦 Релизы + +> **[2026.1.23]** Релиз [v0.6.0](https://github.com/HKUDS/DeepTutor/releases/tag/v0.6.0) - Сохранение сеансов интерфейса, полная поддержка китайского языка, обновления развертывания Docker и исправления незначительных ошибок -- Спасибо всем за обратную связь! + +
+<summary>История релизов</summary> + +> **[2026.1.18]** Релиз [v0.5.2](https://github.com/HKUDS/DeepTutor/releases/tag/v0.5.2) - Улучшение конвейера RAG с поддержкой Docling и улучшение рабочих процессов CI/CD с исправлением нескольких незначительных ошибок -- Спасибо всем за отзывы! + + +> **[2026.1.15]** Релиз [v0.5.0](https://github.com/HKUDS/DeepTutor/releases/tag/v0.5.0) - Унифицированные службы LLM и встраивания, выбор конвейера RAG и значительные улучшения модулей Home, History, QuestionGen и Settings -- Спасибо всем участникам! + +> **[2026.1.9]** Релиз [v0.4.1](https://github.com/HKUDS/DeepTutor/releases/tag/v0.4.1) с полной переработкой системы провайдеров LLM, улучшением надежности генерации вопросов и очисткой кодовой базы - Спасибо всем участникам! + +> **[2026.1.9]** Релиз [v0.4.0](https://github.com/HKUDS/DeepTutor/releases/tag/v0.4.0) с новой структурой кода, поддержкой нескольких LLM и моделей встраивания - Спасибо всем участникам! + +> **[2026.1.5]** [v0.3.0](https://github.com/HKUDS/DeepTutor/releases/tag/v0.3.0) - Унифицированная архитектура PromptManager, автоматизация CI/CD и предварительно собранные образы Docker на GHCR + +> **[2026.1.2]** [v0.2.0](https://github.com/HKUDS/DeepTutor/releases/tag/v0.2.0) - Развертывание Docker, обновление до Next.js 16 и React 19, исправления безопасности WebSocket и критических уязвимостей + +</details>
+ --- ## Ключевые особенности DeepTutor @@ -216,12 +246,15 @@ • **Система памяти**: Управление состоянием сеанса и отслеживание цитат для контекстной непрерывности. ## 📋 Будущие задачи - > 🌟 Поставьте звезду, чтобы следить за нашими будущими обновлениями! -- [ ] Поддержка локальных LLM-сервисов (например, ollama) -- [ ] Рефакторинг модуля RAG (см. [Обсуждения](https://github.com/HKUDS/DeepTutor/discussions)) -- [ ] Глубокое кодирование из генерации идей -- [ ] Персонализированное взаимодействие с блокнотом +- [x] Поддержка многоязычности +- [x] Сообщество DeepTutor +- [x] Поддержка видео- и аудиофайлов +- [x] Настройка атомарного конвейера RAG +- [ ] Пошаговое редактирование базы знаний +- [ ] Персонализированное рабочее пространство +- [ ] Визуализация базы данных +- [ ] Онлайн-демонстрация ## 🚀 Быстрый старт @@ -299,22 +332,23 @@ cp .env.example .env ### Шаг 2: Выберите метод установки - - - - - -
+#### 🐳 Вариант A: Установка через Docker -

🐳 Развертывание Docker

-

Рекомендуется — Без настройки Python/Node.js

+> Установка Python/Node.js не требуется ---- +**Требования**: [Docker](https://docs.docker.com/get-docker/) & [Docker Compose](https://docs.docker.com/compose/install/) + +**Быстрый старт** — Сборка из исходного кода: -**Требования**: [Docker](https://docs.docker.com/get-docker/) и [Docker Compose](https://docs.docker.com/compose/install/) +```bash +docker compose up # Сборка и запуск (~11 мин при первом запуске на mac mini M4) +docker compose build --no-cache # Очистка кэша и пересборка после обновления репозитория +``` -
-🚀 Вариант A: Предварительно Собранный Образ (Быстрее Всего) +**Или использовать предварительно собранный образ** (быстрее): ```bash -# Работает на всех платформах — Docker автоматически определяет вашу архитектуру +# Работает на всех платформах — Docker автоматически определяет архитектуру docker run -d --name deeptutor \ -p 8001:8001 -p 3782:3782 \ --env-file .env \ @@ -325,67 +359,74 @@ docker run -d --name deeptutor \ # Windows PowerShell: используйте ${PWD} вместо $(pwd) ``` -Или использовать файл `.env`: +**Общие команды**: ```bash -docker run -d --name deeptutor \ - -p 8001:8001 -p 3782:3782 \ - --env-file .env \ - -v $(pwd)/data:/app/data \ - -v $(pwd)/config:/app/config:ro \ - ghcr.io/hkuds/deeptutor:latest +docker compose up -d # Запуск +docker compose down # Остановка +docker compose logs -f # Просмотр логов +docker compose up --build # Пересборка после изменений ``` -
-
-🔨 Вариант B: Собрать из Исходного Кода +📋 Дополнительные параметры Docker (предварительно собранные образы, облачная установка, пользовательские порты) -```bash -# Собрать и запустить (~5-10 мин при первом запуске) -docker compose up --build -d +**Теги предварительно собранных образов:** -# Просмотр логов -docker compose logs -f -``` +| Тег | Архитектуры | Описание | +|:----|:--------------|:------------| +| `:latest` | AMD64 + ARM64 | Последний стабильный выпуск (автоопределение архитектуры) | +| `:v0.5.x` | AMD64 + ARM64 | Конкретная версия (автоопределение архитектуры) | +| `:v0.5.x-amd64` | Только AMD64 | Явный образ AMD64 | +| `:v0.5.x-arm64` | Только ARM64 | Явный образ ARM64 | -
+> 💡 Тег `:latest` является **мультиархитектурным образом** — Docker автоматически загружает правильную версию для вашей системы (Intel/AMD или Apple Silicon/ARM) -**Команды**: +**Облачная установка** — Необходимо установить внешний URL-адрес API: ```bash -docker compose up -d # Запустить -docker compose logs -f # Логи -docker compose down # Остановить -docker compose up --build # Пересобрать -docker pull ghcr.io/hkuds/deeptutor:latest # Обновить образ +docker run -d --name deeptutor \ + -p 8001:8001 -p 3782:3782 \ + -e NEXT_PUBLIC_API_BASE_EXTERNAL=https://your-server.com:8001 \ + --env-file .env \ + -v $(pwd)/data:/app/data \ + ghcr.io/hkuds/deeptutor:latest ``` -> **Режим Разработки**: Добавьте `-f docker-compose.dev.yml` +**Пример пользовательских портов:** -
+```bash +docker run -d --name deeptutor \ + -p 9001:9001 -p 3000:3000 \ + -e BACKEND_PORT=9001 \ + -e FRONTEND_PORT=3000 \ + -e NEXT_PUBLIC_API_BASE_EXTERNAL=https://your-server.com:9001 \ + --env-file .env \ + -v $(pwd)/data:/app/data \ + ghcr.io/hkuds/deeptutor:latest +``` -

💻 Ручная Установка

-

Для разработки или сред без Docker

+ --- +#### 💻 Вариант B: Ручная установка + +> Для разработки или сред без Docker + **Требования**: Python 3.10+, Node.js 18+ -**Настроить Окружение**: +**1. Настройка окружения**: ```bash -# Использовать conda (Рекомендуется) -conda create -n deeptutor python=3.10 -conda activate deeptutor +# Использование conda (Рекомендуется) +conda create -n deeptutor python=3.10 && conda activate deeptutor -# Или использовать venv -python -m venv venv -source venv/bin/activate +# Или использование venv +python -m venv venv && source venv/bin/activate # Windows: venv\Scripts\activate ``` -**Установить Зависимости**: +**2. Установка зависимостей**: ```bash # Установка в один клик (Рекомендуется) @@ -397,21 +438,39 @@ pip install -r requirements.txt npm install --prefix web ``` -**Запустить**: +**3. Запуск**: ```bash -# Запустить веб-интерфейс -python scripts/start_web.py +python scripts/start_web.py # Запуск интерфейса и бэкенда +# Или: python scripts/start.py # Только CLI +# Остановка: Ctrl+C +``` -# Или только CLI -python scripts/start.py +
+🔧 Запуск интерфейса и бэкенда отдельно -# Остановить: Ctrl+C +**Бэкенд** (FastAPI): +```bash +python src/api/run_server.py +# Или: uvicorn src.api.main:app --host 0.0.0.0 --port 8001 --reload ``` -
+**Интерфейс** (Next.js): +```bash +cd web && npm install && npm run dev -- -p 3782 +``` + +**Примечание**: Создайте `web/.env.local`: +```bash +NEXT_PUBLIC_API_BASE=http://localhost:8001 +``` + +| Сервис | Порт по умолчанию | +|:---:|:---:| +| Бэкенд | `8001` | +| Интерфейс | `3782` | + + ### URLs Доступа @@ -735,107 +794,107 @@ data/user/co-writer/ **Основные особенности** -| Feature | Description | +| Особенность | Описание | |:---:|:---| -| Three-Phase Architecture | **Phase 1 (Planning)**: RephraseAgent (topic optimization) + DecomposeAgent (subtopic decomposition)
**Phase 2 (Researching)**: ManagerAgent (queue scheduling) + ResearchAgent (research decisions) + NoteAgent (info compression)
**Phase 3 (Reporting)**: Deduplication → Three-level outline generation → Report writing with citations | -| Dynamic Topic Queue | Core scheduling system with TopicBlock state management: `PENDING → RESEARCHING → COMPLETED/FAILED`. Supports dynamic topic discovery during research | -| Execution Modes | **Series Mode**: Sequential topic processing
**Parallel Mode**: Concurrent multi-topic processing with `AsyncCitationManagerWrapper` for thread-safe operations | -| Multi-Tool Integration | **RAG** (hybrid/naive), **Query Item** (entity lookup), **Paper Search**, **Web Search**, **Code Execution** — dynamically selected by ResearchAgent | -| Unified Citation System | Centralized CitationManager as single source of truth for citation ID generation, ref_number mapping, and deduplication | -| Preset Configurations | **quick**: Fast research (1-2 subtopics, 1-2 iterations)
**medium/standard**: Balanced depth (5 subtopics, 4 iterations)
**deep**: Thorough research (8 subtopics, 7 iterations)
**auto**: Agent autonomously decides depth | +| Трехфазная архитектура | **Фаза 1 (Планирование)**: RephraseAgent (оптимизация темы) + DecomposeAgent (декомпозиция подтем)
**Фаза 2 (Исследование)**: ManagerAgent (планирование очереди) + ResearchAgent (принятие решений об исследованиях) + NoteAgent (сжатие информации)
**Фаза 3 (Отчетность)**: Дедупликация → Генерация структуры из трех уровней → Написание отчета с цитатами | +| Динамическая очередь тем | Основная система планирования с управлением состоянием TopicBlock: `PENDING → RESEARCHING → COMPLETED/FAILED`. Поддерживает динамическое обнаружение тем во время исследования | +| Режимы выполнения | **Последовательный режим**: Последовательная обработка тем
**Параллельный режим**: Одновременная обработка нескольких тем с `AsyncCitationManagerWrapper` для потокобезопасных операций | +| Интеграция нескольких инструментов | **RAG** (гибридный/наивный), **Поиск по запросу** (поиск сущностей), **Поиск статей**, **Веб-поиск**, **Выполнение кода** — динамически выбирается ResearchAgent | +| Единая система цитирования | Централизованный CitationManager как единый источник истины для генерации ID цитирования, сопоставления ref_number и дедупликации | +| Предустановленные конфигурации | **quick**: Быстрое исследование (1-2 подтемы, 1-2 итерации)
**medium/standard**: Сбалансированная глубина (5 подтем, 4 итерации)
**deep**: Тщательное исследование (8 подтем, 7 итераций)
**auto**: Агент самостоятельно решает глубину | -**Citation System Architecture** +**Архитектура системы цитирования** -The citation system follows a centralized design with CitationManager as the single source of truth: +Система цитирования следует централизованному дизайну с CitationManager как единым источником истины: ``` ┌─────────────────────────────────────────────────────────────────┐ │ CitationManager │ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ -│ │ ID Generation │ │ ref_number Map │ │ Deduplication │ │ -│ │ PLAN-XX │ │ citation_id → │ │ (papers only) │ │ +│ │ Генерация ID │ │ Карта ref_number│ │ Дедупликация │ │ +│ │ PLAN-XX │ │ citation_id → │ │ (только статьи)│ │ │ │ CIT-X-XX │ │ ref_number │ │ │ │ │ └────────┬────────┘ └────────┬────────┘ └────────┬────────┘ │ └───────────┼────────────────────┼────────────────────┼───────────┘ │ │ │ ┌──────┴──────┐ ┌──────┴──────┐ ┌──────┴──────┐ - │DecomposeAgent│ │ReportingAgent│ │ References │ - │ ResearchAgent│ │ (inline [N]) │ │ Section │ + │DecomposeAgent│ │ReportingAgent│ │ Раздел │ + │ ResearchAgent│ │ (inline [N]) │ │ Ссылок │ │ NoteAgent │ └─────────────┘ └────────────┘ └─────────────┘ ``` -| Component | Description | +| Компонент | Описание | |:---:|:---| -| ID Format | **PLAN-XX** (planning stage RAG queries) + **CIT-X-XX** (research stage, X=block number) | -| ref_number Mapping | Sequential 1-based numbers built from sorted citation IDs, with paper deduplication | -| Inline Citations | Simple `[N]` format in LLM output, post-processed to clickable `[[N]](#ref-N)` links | -| Citation Table | Clear reference table provided to LLM: `Cite as [1] → (RAG) query preview...` | -| Post-processing | Automatic format conversion + validation to remove invalid citation references | -| Parallel Safety | Thread-safe async methods (`get_next_citation_id_async`, `add_citation_async`) for concurrent execution | +| Формат ID | **PLAN-XX** (запросы RAG на этапе планирования) + **CIT-X-XX** (этап 
исследований, X=номер блока) | +| Сопоставление ref_number | Последовательные номера, начинающиеся с 1, созданные из отсортированных ID цитирования, с дедупликацией статей | +| Встроенные цитаты | Простой формат `[N]` в выводе LLM, пост-обработка в кликабельные ссылки `[[N]](#ref-N)` | +| Таблица цитирования | Четкая таблица ссылок, предоставленная LLM: `Цитировать как [1] → (RAG) предпросмотр запроса...` | +| Пост-обработка | Автоматическое преобразование формата + проверка для удаления недействительных ссылок на цитаты | +| Параллельная безопасность | Потокобезопасные асинхронные методы (`get_next_citation_id_async`, `add_citation_async`) для параллельного выполнения | -**Parallel Execution Architecture** +**Архитектура параллельного выполнения** -When `execution_mode: "parallel"` is enabled, multiple topic blocks are researched concurrently: +Когда включено `execution_mode: "parallel"`, несколько блоков тем исследуются одновременно: ``` ┌─────────────────────────────────────────────────────────────────────────┐ -│ Parallel Research Execution │ +│ Параллельное выполнение исследований │ ├─────────────────────────────────────────────────────────────────────────┤ │ │ │ DynamicTopicQueue AsyncCitationManagerWrapper │ │ ┌─────────────────┐ ┌─────────────────────────┐ │ -│ │ Topic 1 (PENDING)│ ──┐ │ Thread-safe wrapper │ │ -│ │ Topic 2 (PENDING)│ ──┼──→ asyncio │ for CitationManager │ │ -│ │ Topic 3 (PENDING)│ ──┤ Semaphore │ │ │ -│ │ Topic 4 (PENDING)│ ──┤ (max=5) │ • get_next_citation_ │ │ -│ │ Topic 5 (PENDING)│ ──┘ │ id_async() │ │ +│ │ Тема 1 (PENDING)│ ──┐ │ Потокобезопасная │ │ +│ │ Тема 2 (PENDING)│ ──┼──→ asyncio │ обертка для │ │ +│ │ Тема 3 (PENDING)│ ──┤ Semaphore │ │ │ +│ │ Тема 4 (PENDING)│ ──┤ (max=5) │ • get_next_citation_ │ │ +│ │ Тема 5 (PENDING)│ ──┘ │ id_async() │ │ │ └─────────────────┘ │ • add_citation_async() │ │ │ │ └───────────┬─────────────┘ │ │ ▼ │ │ │ ┌─────────────────────────────────────────────────────────────┐ │ -│ │ Concurrent 
ResearchAgent Tasks │ │ +│ │ Задачи параллельных ResearchAgent │ │ │ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │ -│ │ │ Task 1 │ │ Task 2 │ │ Task 3 │ │ Task 4 │ ... │ │ -│ │ │(Topic 1)│ │(Topic 2)│ │(Topic 3)│ │(Topic 4)│ │ │ +│ │ │ Задача 1│ │ Задача 2│ │ Задача 3│ │ Задача 4│ ... │ │ +│ │ │(Тема 1) │ │(Тема 2) │ │(Тема 3) │ │(Тема 4) │ │ │ │ │ └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘ │ │ │ │ │ │ │ │ │ │ │ │ └────────────┴────────────┴────────────┘ │ │ │ │ │ │ │ │ │ ▼ │ │ │ │ AsyncManagerAgentWrapper │ │ -│ │ (Thread-safe queue updates) │ │ +│ │ (Обновления очереди, безопасные для потоков) │ │ │ └─────────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────────────┘ ``` -| Component | Description | +| Компонент | Описание | |:---:|:---| -| `asyncio.Semaphore` | Limits concurrent tasks to `max_parallel_topics` (default: 5) | -| `AsyncCitationManagerWrapper` | Wraps CitationManager with `asyncio.Lock()` for thread-safe ID generation | -| `AsyncManagerAgentWrapper` | Ensures queue state updates are atomic across parallel tasks | -| Real-time Progress | Live display of all active research tasks with status indicators | +| `asyncio.Semaphore` | Ограничивает количество одновременных задач до `max_parallel_topics` (по умолчанию: 5) | +| `AsyncCitationManagerWrapper` | Оборачивает CitationManager с `asyncio.Lock()` для потокобезопасной генерации ID | +| `AsyncManagerAgentWrapper` | Обеспечивает атомарность обновлений состояния очереди в параллельных задачах | +| Отслеживание прогресса в реальном времени | Отображение всех активных задач исследования с индикаторами состояния | -**Agent Responsibilities** +**Обязанности агентов** -| Agent | Phase | Responsibility | +| Агент | Фаза | Обязанности | |:---:|:---:|:---| -| RephraseAgent | Planning | Optimizes user input topic, supports multi-turn user interaction for refinement | -| DecomposeAgent | Planning | Decomposes topic into 
subtopics with RAG context, obtains citation IDs from CitationManager | -| ManagerAgent | Researching | Queue state management, task scheduling, dynamic topic addition | -| ResearchAgent | Researching | Knowledge sufficiency check, query planning, tool selection, requests citation IDs before each tool call | -| NoteAgent | Researching | Compresses raw tool outputs into summaries, creates ToolTraces with pre-assigned citation IDs | -| ReportingAgent | Reporting | Builds citation map, generates three-level outline, writes report sections with citation tables, post-processes citations | +| RephraseAgent | Планирование | Оптимизация входной темы пользователя, поддержка многораундового взаимодействия пользователя для уточнения | +| DecomposeAgent | Планирование | Декомпозиция темы на подтемы с контекстом RAG, получение ID цитирования из CitationManager | +| ManagerAgent | Исследование | Управление состоянием очереди, планирование задач, динамическое добавление тем | +| ResearchAgent | Исследование | Проверка достаточности знаний, планирование запросов, выбор инструментов, запрос ID цитирования перед каждым вызовом инструмента | +| NoteAgent | Исследование | Сжатие необработанных выходных данных инструментов в сводки, создание ToolTraces с заранее назначенными ID цитирования | +| ReportingAgent | Отчетность | Построение карты цитирования, генерация структуры из трех уровней, написание разделов отчета с таблицами цитирования, пост-обработка цитирований | -**Report Generation Pipeline** +**Конвейер генерации отчетов** ``` -1. Build Citation Map → CitationManager.build_ref_number_map() -2. Generate Outline → Three-level headings (H1 → H2 → H3) -3. Write Sections → LLM uses [N] citations with provided citation table -4. Post-process → Convert [N] → [[N]](#ref-N), validate references -5. Generate References → Academic-style entries with collapsible source details +1. Построить карту цитирования → CitationManager.build_ref_number_map() +2. 
Генерация структуры → Трехуровневые заголовки (H1 → H2 → H3) +3. Написание разделов → LLM использует [N] цитирования с предоставленной таблицей цитирования +4. Пост-обработка → Преобразование [N] → [[N]](#ref-N), проверка ссылок +5. Генерация списка литературы → Стилизованные академические записи с раскрывающимися деталями источника ``` **Использование** @@ -1125,7 +1184,7 @@ asyncio.run(main()) -## ❓ ЧаВо +## ❓ Часто задаваемые вопросы
Не удается запустить backend? @@ -1202,6 +1261,29 @@ npm --version # Должно показывать номер версии
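Проверку версий из шага выше можно выполнить и программно — минимальный набросок на Python (имя функции условное, в репозитории такой утилиты может не быть):

```python
import sys

def python_version_ok(version=None, minimum=(3, 10)):
    """True, если версия Python не ниже требуемой (DeepTutor требует 3.10+)."""
    if version is None:
        version = sys.version_info[:3]  # версия текущего интерпретатора
    return tuple(version) >= tuple(minimum)

print(python_version_ok((3, 9, 18)))   # → False: ниже минимума
print(python_version_ok((3, 12, 0)))   # → True
```

Сравнение кортежей в Python лексикографическое, поэтому `(3, 9, 18) >= (3, 10)` корректно даёт `False`.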
+
+Проблемы с длинными именами файлов при установке в Windows? + +**Проблема** + +В Windows вы можете столкнуться с ошибками, связанными с длинными путями файлов во время установки, такими как "Имя файла или расширение слишком длинное" или аналогичные проблемы с длиной пути. + +**Причина** + +Windows имеет ограничение по умолчанию на длину пути (260 символов), которое может быть превышено из-за вложенной структуры каталогов и зависимостей DeepTutor. + +**Решение** + +Включите поддержку длинных путей в системе, выполнив следующую команду в командной строке от имени администратора: + +```cmd +reg add "HKLM\SYSTEM\CurrentControlSet\Control\FileSystem" /v LongPathsEnabled /t REG_DWORD /d 1 /f +``` + +После выполнения этой команды перезапустите терминал, чтобы изменения вступили в силу. + +
+
Frontend не может подключиться к backend? @@ -1219,6 +1301,62 @@ NEXT_PUBLIC_API_BASE=http://localhost:8001
+
+Docker: Frontend не может подключиться при облачном развертывании? + +**Проблема** + +При развертывании на облачном сервере интерфейс показывает ошибки подключения, такие как "Не удалось получить данные" или "NEXT_PUBLIC_API_BASE не настроен". + +**Причина** + +Стандартный URL API - `localhost:8001`, который указывает на локальную машину пользователя в браузере, а не на ваш сервер. + +**Решение** + +Установите переменную окружения `NEXT_PUBLIC_API_BASE_EXTERNAL` на публичный URL вашего сервера: + +```bash +# Использование docker run +docker run -d --name deeptutor \ + -e NEXT_PUBLIC_API_BASE_EXTERNAL=https://your-server.com:8001 \ + ... другие параметры ... + ghcr.io/hkuds/deeptutor:latest + +# Или в файле .env +NEXT_PUBLIC_API_BASE_EXTERNAL=https://your-server.com:8001 +``` + +**Пример пользовательского порта:** +```bash +# Если используется порт бэкенда 9001 +-e BACKEND_PORT=9001 \ +-e NEXT_PUBLIC_API_BASE_EXTERNAL=https://your-server.com:9001 +``` + +
+ +
+Docker: Как использовать пользовательские порты? + +**Решение** + +Установите как переменные окружения портов, так и сопоставления портов: + +```bash +docker run -d --name deeptutor \ + -p 9001:9001 -p 4000:4000 \ + -e BACKEND_PORT=9001 \ + -e FRONTEND_PORT=4000 \ + -e NEXT_PUBLIC_API_BASE_EXTERNAL=http://localhost:9001 \ + ... другие переменные окружения ... + ghcr.io/hkuds/deeptutor:latest +``` + +**Важно**: Сопоставление портов `-p` должно соответствовать значениям `BACKEND_PORT`/`FRONTEND_PORT`. + +
+
Соединение WebSocket не удается? @@ -1233,6 +1371,66 @@ NEXT_PUBLIC_API_BASE=http://localhost:8001
+
+На странице настроек отображается "Ошибка загрузки данных" при использовании HTTPS обратного прокси? + +**Проблема** + +При развертывании за HTTPS обратным прокси (например, nginx), на странице настроек отображается "Ошибка загрузки данных", и инструменты разработчика браузера показывают, что HTTPS-запросы перенаправляются на HTTP (307 редирект). + +**Причина** + +Эта проблема была исправлена в версии v0.5.0+. Если вы используете более старую версию, проблема была вызвана автоматическими перенаправлениями с завершающей косой чертой от FastAPI, которые генерировали HTTP URL-адреса вместо сохранения исходного протокола HTTPS. + +**Решение (для v0.5.0+)** + +Обновитесь до последней версии. Исправление отключает автоматические перенаправления с косой чертой, чтобы предотвратить понижение протокола. + +**Рекомендуемая конфигурация nginx** + +При использовании nginx в качестве HTTPS обратного прокси используйте следующую конфигурацию: + +```nginx +# Фронтенд +location / { + proxy_pass http://localhost:3782; + proxy_set_header Host $host; + proxy_set_header X-Real-IP $remote_addr; + proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; + proxy_set_header X-Forwarded-Proto $scheme; +} + +# API бэкенда +location /api/ { + proxy_pass http://localhost:8001; + proxy_set_header Host $host; + proxy_set_header X-Real-IP $remote_addr; + proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; + proxy_set_header X-Forwarded-Proto $scheme; # Важно: сохраняет исходный протокол +} + +# Поддержка WebSocket +location /api/v1/ { + proxy_pass http://localhost:8001; + proxy_http_version 1.1; + proxy_set_header Upgrade $http_upgrade; + proxy_set_header Connection "upgrade"; + proxy_set_header Host $host; + proxy_set_header X-Forwarded-Proto $scheme; +} +``` + +**Переменная окружения** + +Установите в `.env`: +```bash +NEXT_PUBLIC_API_BASE=https://your-domain.com:port +``` + +См.: [GitHub Issue #112](https://github.com/HKUDS/DeepTutor/issues/112) + +
+
Где хранятся выходные данные модуля? @@ -1308,22 +1506,38 @@ python src/knowledge/extract_numbered_items.py --kb --base-dir ./data/
-## 📄 Лицензия -Этот проект лицензирован по **[Лицензии AGPL-3.0](../../LICENSE)**. + - @@ -1339,18 +1553,13 @@ python src/knowledge/extract_numbered_items.py --kb --base-dir ./data/ [⭐ Поставьте звезду](https://github.com/HKUDS/DeepTutor/stargazers) · [🐛 Сообщить об ошибке](https://github.com/HKUDS/DeepTutor/issues) · [💬 Обсуждения](https://github.com/HKUDS/DeepTutor/discussions) -[![Рейтинг звезд репозитория для @HKUDS/DeepTutor](https://reporoster.com/stars/dark/HKUDS/DeepTutor)](https://github.com/HKUDS/DeepTutor/stargazers) - -[![Форкеры репозитория для @HKUDS/DeepTutor](https://reporoster.com/forks/dark/HKUDS/DeepTutor)](https://github.com/HKUDS/DeepTutor/network/members) - -## История звезд - -[![График истории звезд](https://api.star-history.com/svg?repos=HKUDS/DeepTutor&type=timeline&legend=top-left)](https://www.star-history.com/#HKUDS/DeepTutor&type=timeline&legend=top-left) - --- -*✨ Спасибо за посещение **DeepTutor**!* +Этот проект распространяется под лицензией ***[AGPL-3.0](../../LICENSE)***. -Просмотры +

+ Спасибо, что посетили ✨ DeepTutor!

+ Views +

diff --git a/assets/roster/forkers.svg b/assets/roster/forkers.svg index 0cad1461..2b19d8b0 100644 --- a/assets/roster/forkers.svg +++ b/assets/roster/forkers.svg @@ -10,21 +10,21 @@ Forkers - + - + - + - + - + - -and 1,268 others + +and 1,388 others \ No newline at end of file diff --git a/assets/roster/stargazers.svg b/assets/roster/stargazers.svg index d8184c2d..90fb29b4 100644 --- a/assets/roster/stargazers.svg +++ b/assets/roster/stargazers.svg @@ -1,4 +1,4 @@ - + \ No newline at end of file diff --git a/config/agents.yaml b/config/agents.yaml index f8a0b813..37f030fe 100644 --- a/config/agents.yaml +++ b/config/agents.yaml @@ -55,3 +55,16 @@ co_writer: narrator: temperature: 0.7 max_tokens: 4000 + +# ============================================================================= +# Vision Solver Module - Image analysis and GeoGebra visualization +# ============================================================================= +# Agents: vision_solver_agent (with bbox, analysis, ggbscript, reflection stages) +# This module handles: +# - BBox: Visual element detection with pixel coordinates +# - Analysis: Geometric semantic analysis +# - GGBScript: GeoGebra command generation +# - Reflection: Command validation and correction +vision_solver: + temperature: 0.3 + max_tokens: 12000 diff --git a/config/main.yaml b/config/main.yaml index a5842ef7..e19175f3 100644 --- a/config/main.yaml +++ b/config/main.yaml @@ -1,5 +1,5 @@ system: - language: en + language: zh paths: user_data_dir: ./data/user knowledge_bases_dir: ./data/knowledge_bases @@ -10,6 +10,7 @@ paths: research_output_dir: ./data/user/research/cache research_reports_dir: ./data/user/research/reports solve_output_dir: ./data/user/solve + vision_solver_output_dir: ./data/user/vision_solver tools: rag_tool: kb_base_dir: ./data/knowledge_bases @@ -66,6 +67,23 @@ solve: max_iterations: 3 precision_answer_agent: enabled: true +vision_solver: + enabled: true + max_image_size: 10MB + geogebra: + xmin: -10 + xmax: 10 
+ ymin: -8 + ymax: 8 + stages: + bbox: + temperature: 0.3 + analysis: + temperature: 0.3 + ggbscript: + temperature: 0.3 + reflection: + temperature: 0.3 research: planning: rephrase: diff --git "a/docs/ref/wiki/ GeoGebra\344\275\277\347\224\250\346\214\207\345\215\227.md" "b/docs/ref/wiki/ GeoGebra\344\275\277\347\224\250\346\214\207\345\215\227.md" new file mode 100644 index 00000000..e1129137 --- /dev/null +++ "b/docs/ref/wiki/ GeoGebra\344\275\277\347\224\250\346\214\207\345\215\227.md" @@ -0,0 +1,2408 @@ +# 动态数学软件 + +![](images/98a0b381a893010bf30759b96da4417e1cceae8c23176ace4854e2cce4cac36e.jpg) + +# GeoGebra使用指南 + +北京师范大学 GeoGebra 学院(中国总部) + +GeoGebra Institute of Beijing Normal University + +(BNU), China Network + +# 前言 + +本使用指南基于GeoGebra Help 3.2并根据中国用户的使用习惯对章节和内容做了调整和修改。文中所有的操作和截图均使用GeoGebra 3.2在windows环境下完成。 + +GeoGebra Help 3.2 作者: + +Markus Hohenwarter, markus@geogebra.org + +Judith Hohenwarter, judith@geogebra.org + +《动态数学软件 GeoGebra 使用指南》编译: + +郭衍,guokan.mail@gmail.com + +GeoGebra Institute of Beijing Normal University(BNU), China Network 于 2011 年 5 月 25 日申请成立,作为 GeoGebra 在中国的首席学院,GeoGebra Institute BNU, China Network 将领导协助其他中国其他地方学院的建设与发展,致力于 GeoGebra 相关的数学教学和学习的研究工作,颁发中国 GeoGebra 用户水平认证,提供师范生和一线教师的专业培训,分享数学学习与教学的成功案例和先进经验,促进中国 GeoGebra 各地方学院间的合作。 + +组织成员: + +【主席】 + +曹一鸣,北京师范大学数学科学学院教授,博士生导师,中国数学会教育工作委员会副主任,全国数学教育研究会秘书长。 + +【其他成员】 + +王光明,天津师范大学数学科学学院教授,《数学教育学报》编辑部主任。 + +宁连华,南京师范大学数学科学学院副教授,硕士生导师。 + +马波,北京师范大学数学科学学院副教授,硕士生导师。 + +董连春,北京师范大学研究生,数学教育方向。 + +郭 衍,北京师范大学研究生,数学教育方向。 + +如发现书写有误或内容不当之处请发送邮件至:guokan.mail@gmail.com + +# 目录 + +# 第一章 GEOGEBRA简介 + +1. 什么是 GEOGEBRA? +2. 如何安装 GEOGEBRA? 3 +3. 认识 GEOGEBRA 4 + +# 第二章 GEOGEBRA 的用途 6 + +1. 学习时使用 6 +2. 演示时使用 +3.编辑时使用 9 + +# 第三章 绘图工具. 11 + +1.基本操作 11 +2. 一般工具 ..... 11 +3.点 13 +4. 线 ..... 13 +5. 向量 ..... 14 +6. 圆锥曲线 ..... 15 +7. 圆与多边形 ..... 15 +8. 数值与角度 16 +9. 几何变换 ..... 17 +10. 文字 ..... 18 +11. 图片 ..... 19 + +# 第四章 代数输入 ..... 20 + +1. 基本操作 ..... 20 +2. 数字和角 ..... 21 +3. 点和向量 ..... 22 +4. 直线和坐标轴 ..... 22 +5. 
圆锥曲线 ..... 23 +6. 函数和运算 ..... 23 +7. 对象列表和运算 ..... 24 +8. 矩阵与运算 ..... 25 +9. 复数与运算 ..... 26 + +# 第五章 命令输入 ..... 28 + +1. 一般命令 ..... 28 +2.数值 28 + +3.角 30 +4.点 31 +5. 线 32 +6. 多边形 33 +7. 向量 33 +8. 函数 33 +9. 圆锥曲线 34 +10. 参数曲线 35 +11. 圆弧和扇形 35 +12. 文字 36 +13. 轨迹 37 +14.列表 37 +15. 几何变换 39 +16. 统计 40 +17. 电子表格 42 +18. 逻辑命令 43 + +# 第六章 菜单 44 + +1.文件 44 +2.编辑 45 +3. 查看 45 +4. 选项 ..... 46 +5.工具 47 +6.窗口 47 +7. 帮助 ..... 47 + +# 第七章 GEOGEBRA 的特性 ..... 49 + +1. 动画 ..... 49 +2. 显示条件 49 +3.自定义工具 50 +4.动态颜色 52 +5.JAVAsCRPT 52 +6.对象名称与标签 52 +7. 图层 53 +8. 重新定义 ..... 53 +9. 痕迹与轨迹 ..... 54 + +# 第一章 GeoGebra简介 + +# 1. 什么是 GeoGebra? + +GeoGebra 这款软件的名称拆开来就是“Geo”+“Gebra”,意思是结合了几何(Geometry)与代数(Algebra)。GeoGebra 是一个结合几何、代数、微积分和统计功能的动态数学软件,可应用于多平台(Window、Mac、Linux 等),提供 56 种语言支持,已在欧洲和美国荣获多项教育类软件奖项。 + +这是一款免费的开源软件,旨在帮助老师设计有趣的教学方法,为学校提供充满活力的数学教学。 + +# 1.1. 与“几何画板”的比较 + +概括说来,GeoGebra 几乎具备“几何画板”的全部功能,在绘图界面甚至比“几何画板”更加友好、易操作,同时具备“几何画板”没有的符号计算、微积分、统计等功能。另外,GeoGebra 具备开源的精神,在使用、交流、分享方面都给使用者提供了最大便利。 + +目前大多数主流教学软件是商业化的,这意味软件的可得性是受学校或学生的财务能力影响的。所以,无法买商业软件的一些老师或学生上网查寻免费软件,并根据他们的目的下载并使用这些软件。开源软件是指公开的软件的源代码,使用者可以自由使用、下载、修改与发布软件的可执行程序及程序源代码。这鼓励了开发、更新、维护开源软件的用户及团体和资助开发软件的开发商并形成一个合作的团队,使得软件日臻完善,而这种进步是商业软件远不能及的。教育技术也逐渐加入了这个趋势,做到这种“自由软件”的GeoGebra就是其中的一个。软件作者Hohenwarter说:“GeoGebra是免费软件,因为我相信教育应该是免费的。” + +这款软件很优秀,但 GeoCebra 软件背后的这种“开放与合作”的精神比软件本身更加可贵。 + +# 【软件使用费用】 + +“几何画板”软件是由美国Key Curriculum Press公司制作并出版的优秀教育软件,1996年该公司授权人民教育出版社在中国发行该软件的中文版。在“几何画板”的官方网站给出的几何画板5.0版(The Geometer’s Sketchpad version 5)的售价为:多用户版为69.95美元,订购100套以上为每个用户15美元;教师版为69.95美元;学生版为29.95美元。但是在网上搜索了很久也没有找到“几何画板”在中国大陆地区的售价,在人民教育出版社的网站上也没用找到“几何画板”的相关购买信息。所以也无法得知该软件在大陆地 + +区的售价了。 + +而GeoGebra是开源软件,提供免费在线安装和下载离线安装包。 + +# 【软件运行情况】 + +“几何画板”安装后可直接运行,运行效果流畅。 + +GeoGebra需在java环境下运行,所以先要在电脑中安装java虚拟机,需用户自己下载安装。目前的版本已经会提示安装,在网络环境下可以自动下载java并安装。 + +因为是在java下写的,所以GeoGebra也有着java的优点:跨平台、网页支持好。此外,GeoGebra还支持LaTex语法,可在画面上显示根号、次方及分数,这都是“几何画板”望尘莫及的,因此在网络上要讨论数学问题时,用GeoGebra来绘图是一个很好的选择。 + +# 【国内普及情况】 + +几何画板在我国已流传使用十几年了,深受广大理科教师特别是数学教师的喜爱。相对之下 GeoGebra 
似乎少有人问津,这是百度搜索结果的比较。 + +但倘若使用google的外文搜索,GeoGebra和Geometer's Sketchpad(几何画板)的条目分别是672,000和98,500。这说明,GeoGebra在国际上的影响力是要大于几何画板的。(欧美国家开源软件的使用率是要高于国内的,免费软件很容易普及)加上GeoGebra开源的好处,适应多语言的支持,使用该软件更方便国际间的交流。 + +# 【操作难易程度】 + +学习 GeoGebra 软件并不难,主要是用鼠标来模拟尺规作图,几何画板和 GeoGebra 的使用方法其实大同小异,若是有操作过几何画板的经验,学习 GeoGebra 应很快就可熟悉。 + +若对 GeoGebra 的使用有任何问题,可上 GeoGebra 的讨论区和世界各地的使用者请教解决方法,也可对 GeoGebra 的未来发展提出建议,是要多加些什么功能或指令,程序设计者 Markus Hohenwarter 还会亲自回答。 + +此外,GeoGebra 还比几何画板多了统计和微积分的相关功能。 + +# 1.2. 国际 GeoGebra 学院 + +国际 GeoGebra 学院(International GeoGebra Institute,IGI)提供免费的动态数学软件和专业知识的培训,支持发展面向全体学生和教师的教学资源分享,以改善世界各地的数学、科学和技术教育。它培育和促进一线教师和研究人员之间的协作,努力建立自我维持发展的用户群体。 + +国际 GeoGebra 学院是一个非营利组织,设立了以下三个目标: + +一.培训和支持:提供专业发展机会,服务帮助职前教师和在职教师。 +二.发展和共享:分享研究素材、教学资源,不断改善和发展动态数学软件GeoGebra。 + +三.研究与合作:支持与GeoGebra相关的数学教学和学习的研究工作,促进国际GeoGebra和地方GeoGebra学院间的国际合作。 + +# 2. 如何安装 GeoGebra? + +下面介绍在windows下GeoGebra软件的安装方法。 + +# 2.1. 在线安装 + +首先打开 GeoGebra 的官方网站:http://www.geogebra.org/cms/ + +点击导航栏中的“Download”,按下“Webstart”即可在线安装。 + +由于 GeoGebra 是在 Java 环境下运行的软件,所以若您的电脑没有安装 Java,安装过程中会进行 Java 环境的安装。结束 Java 的安装后,就可以安装 GeoGebra 软件了。 + +# 2.2. 离线安装 + +如果您的电脑没有网络环境,可以使用离线安装包安装 GeoGebra 软件。 + +首先找到一台具备网络条件的电脑,在点击“Download”后点击“Offline Installers”链接,在新打开的页面中选择“Windows”下载离线安装包。 + +双击运行离线安装包,在安装界面中选择“Chinese(Simplified)”即可安装简体中文版的GeoGebra软件。 + +![](images/dc91ffd7f047021fdd4b43ce2c3c348d90d0330155428c6d3bbbe32eeaab4c42.jpg) + +GeoGebra 的安装文件很小,可以方便的放在移动存储设备中复制到没有没有网络环境的电脑上,进行离线安装。 + +# 3. 认识 GeoGebra + +为了配合多种数学功能的实现,GeoGebra 提供三种操作区域:代数区、绘图区和电子表格。这些操作区域分别对应实现不同的数学需求,如代数功能(方程、函数)、几何功能(画图、描点、函数图像)和统计功能。 + +![](images/c8431229ad86f2a4fcc080a4cb200a6f14118fb11241f3ae49d0e375888954a8.jpg) + +# 3.1. 
代数区 + +GeoGebra 支持在命令框中直接输入代数表达式,输入完成后回车,所输入的代数表达式即可在代数区中显示,同时相应的几何图形也会在绘图区出现。GeoGebra 提供了很多命令,可以在命令框右侧的“Command”中选择需要的命令,选择命令后可以按 F1 键获得相应的语法帮助。 + +在代数区中,数学对象被分成“自由对象”和“派生对象”,如果生成的一个对象没有使用任何已有的对象,那么这个对象即称为“自由对象”,相反,如果新生成的对象是依赖已有的对象,则被称为“派生对象”。 + +还有一类对象称为“辅助对象”。在代数区中,选中某个对象,点击右键选择“属性”,在属性对话框的“基本”选项卡中,勾选“辅助对象”,该对象便会成为“辅助对象”,在默认设置下“辅助对象”在代数区中不显示。 + +# 3.2. 绘图区 + +可以使用工具栏中的绘图工具在绘图区中利用鼠标进行几何作图。从工具栏中选择一种绘图工具,可由在工具栏右侧显示的工具说明来了解绘图工具的使用方法。在绘图区内画出的几何对象都会在代数区中产生一个代数表达式。 + +绘图工具栏中的每个图标都是一个工具箱,里面包括了一系列相似的绘图工具,点击图 + +标右下方的箭头就可以打开工具箱,显示该工具箱中的所有绘图工具。 + +GeoGebra 中的数学对象是可以被修改的。在代数区中双击要修改的数学对象,即可修改表达式,修改完成后回车即可;或者选择移动工具,在绘图区中双击要修改的数学对象,弹出“重新定义”对话框,修改表达式后点击“确定”即可。 + +# 3.3. 电子表格 + +在电子表格中,每个单元格都有相应的名称来指定位置。如在第一行第一列的单元格称为A1,这有些类似OfficeExcel。在相关表达式中,可以用单元格的位置名称来代替单元格中的数据。 + +在电子表格中,不但可以输入数值,还可以输入 GeoGebra 可执行的数学对象(如坐标、函数、命令等)。在电子表格汇总输入的数学对象 GeoGebra 会在绘图区画出相应的图像,并用单元格的位置名称为该图像命名。 + +# 第二章 GeoGebra 的用途 + +# 1. 学习时使用 + +# 1.1. 自定义用户界面 + +GeoGebra 的用户可以根据自己的需要在“查看”菜单中自定义软件界面,选择是否显示某些操作区、对象以及更改窗口布局。 + +# u 自定义操作区 + +用户可以选择显示或隐藏绘图区中的对象。在代数区中找到图像所对应的代数表达式,单击表达式前方的小圆点,即可改变几何对象的显示或者隐藏。空心点表示隐藏对象,实心点表示显示对象。 + +为了调整绘图区的可视部分,可以使用移动绘图区工具,然后按住绘图区拖动即可改变绘图区的可视部分。 + +此外,还可以通过以下方法来改变绘图区的显示比例: + +1. 使用工具栏中的放大和缩小工具来放缩绘图区; + +2. 使用鼠标的滚轮来放缩绘图区; +3. 使用快捷键 Ctrl + 放大和 Ctrl - 来缩小; +4. 在绘图区中用鼠标右键拖选矩形区域,松开鼠标后被选中区域即可放大。 + +# u 自定义坐标轴和网格 + +在“显示”菜单中可以选择显示或隐藏绘图区的坐标轴和网格。在绘图区中点击右键,选择“绘图区”可以修改绘图区属性: + +1. 修改坐标轴。在“坐标轴”选项卡中,可以改变坐标线的样式和坐标轴的单位和范围大小。还可以选择“X轴”和“Y轴”选项卡分别修改坐标轴。 +2. 修改网格。在“网格”选项卡中,可以修改网格的颜色、样式及范围大小。 + +# u 自定义工具栏 + +在“工具”菜单中选择“自定义工具栏”来自定义工具栏,在打开的对话框中左侧是工具栏当前的状态,右侧是可供添加的工具,可以使用“插入”或“移除”来修改当前工具箱。 + +如果想恢复原来的设置,可以点击对话框左下角的“恢复工具栏默认设置”按钮。 + +# 1.2. 修改对象属性 + +在要修改的对象上点击右键,选择“属性”对话框,即可修改对象属性(颜色、颜色及是否显示等)。 + +在属性对话框中,对象是分类排列的(如点、直线、圆等),当对象的数量较多时,很容易在某个类别中修改一个或多个对象。如选择类别“直线”,可以对所有此类对象进行修 + +改。 + +修改时使用右侧的选项卡来修改对象的颜色、样式等属性。修改完成后点击对话框右下角的“关闭”按钮即可。 + +# 1.3. 
对象的右键菜单 + +对象的右键菜单可以进一步修改数学对象的行为或属性。 + +![](images/92db8d4c0a30d314a102ddda65781beaa07fc69d3a439c238a37aafd29074b3a.jpg) + +在对象上点击右键,即可显示对象的右键菜单。可以修改对象的代数表达式的形式,可以选择是否显示对象,还可以执行改名、删除等操作。 + +# 2. 演示时使用 + +# 2.1. 作图过程导航条 + +![](images/ffece1f0d184a74d5602e5e20265a9defc4a2ab3f3716647250ddd1ac6b6e009.jpg) + +GeoGebra 提供并运行通过此功能展示 GeoGebra 文件的绘图步骤。只需在“查看”菜单中勾选“组图过程导航条”即可显示该功能,导航条位于绘图区底部。 + +按钮:回到第一步; +按钮:回到前一步; +按钮:前进到下一步; +按钮:前进到最后一步; +播放 按钮:自动播放(播放速度可设定); +暂停 按钮:停止自动播放; +按钮:显示作图过程。 + +# 2.2. 作图过程 + +在“查看”菜单中选择“作图过程”选项,可显示作图过程窗口。该窗口以表格的形式呈现。借助“作图过程”对话框底部的导航条,可以逐步重现已经完成的作图过程。 + +# 在“作图过程”对话框中,可以使用键盘进行操作: + +1. 使用键盘的 $\uparrow$ 键可以移动到前一个步骤; +2. 使用键盘的 $\downarrow$ 键可以移动到下一个步骤; +3. 使用键盘的 Home 键可以移动到第一个步骤; +4. 使用键盘的 End 键可以移动到最后一个步骤; +5. 使用键盘的 Delete 键可以删除所选择的步骤。 + +# 在“作图过程”对话框中,也可以使用鼠标进行操作: + +1. 双击鼠标左键选取想要选择的步骤; +2. 想要移动作图步骤可以使用鼠标进行拖拽,但如果该步骤有依赖的其他步骤,则这个操作不一定可以实现; +3. 在任何一行上点击鼠标右键可以打开该步骤的右键菜单,可以修改相关属性。 + +在“作图过程”对话框中可以在任意位置插入作图步骤:使用鼠标左键选取要插入的步骤的前一步,关闭作图过程对话框后新建对象,这样新的作图步骤就会被插入“作图过程”中了。 + +# “作图过程”导出为网页: + +GeoGebra 可以将“作图过程”导出为网页。首先在“查看”中打开“作图过程”对话框,然后在该对话框的“文件”菜单中选择“导出为网页”选项。 + +![](images/bc8ac77223be3a832cad34cb9243c9d93f3af3bbba5577401c74c2d65d950074.jpg) + +在“导出:作图过程”对话框中,可以输入“标题”“作者”和日期,以及是否输出绘图区的图像,宽度和高度。此外,还可以选择“彩色显示作图过程”,这样每一个作图步骤与数学对象就会用相同的颜色显示。 + +# 2.3. 更改GeoGebra的设定 + +可以在 GeoGebra 的选项菜单中修改并存储用户的偏好设定。 + +例如,可以改变“角的单位”,可以选择“角度”或者“弧度”。还可以更改“点的样式”“复选框大小”和“直角样式”等等。 + +根据使用需要更改好 GeoGebra 的设定后,可以点击“选项”菜单中的“保存设定”。GeoGebra 就会保存用户的个人偏好设定,以后就会以用户偏好作为预设来启动软件。 + +# 3. 编辑时使用 + +# 3.1. 打印 + +# 打印绘图区 + +GeoGebra 可以打印绘图区的图形。在“文件”菜单中点击“打印预览”选项。在弹出的窗口中,可以填写“标题”“作者”和“日期”。还可以设定打印的比例大小以及打印板式(横版或竖版)。 + +# 打印作图过程 + +要打印作图过程,首先要在“显示”菜单中选择显示“作图过程”。然后点击“文件”中的“打印预览”,和打印绘图区类似的,可以输入“标题”“作者”和“日期”以及设定相应选项。 + +# 3.2. 导出绘图区 + +# 生成图像 + +在 GeoGebra 的文件菜单中,可以将绘图区以图片的方式存储在电脑中。 + +在绘图区完成绘图工作后,单击“文件”选择下拉菜单中的“导出”下的“生成图像”。 + +![](images/1c7cacbf498b19306bf622448ec3d395d7856011d7819cb5781311e8646c76e4.jpg) + +在改窗口中,可以选择导出图片的格式,可供选择的格式有png、pdf、eps、svg和emf。 + +# 复制图像 + +有多种方法可以复制绘图区到电脑的剪贴簿: + +1. 
单击“文件”选择下拉菜单中的“导出”下的“将绘图区放入剪贴板” +2. 单击“编辑”菜单,点击“将绘图区放入剪贴板” + +绘图区的截图将以png格式被复制到系统的剪贴板中,该图片可以被粘贴到其他程序之中,如:Office Word或PowerPoint + +# 3.3. 生成动态网页 + +GeoGebra 可以导出成为动态网页,成为“网页形式的动态工作表(html)”。在“文件”菜单中,选择“导出”后点击“网页形式的动态工作表(html)”,打开“导出:动态网页”对话框。 + +1. 在对话框中可以输入网页的“标题”“作者”和“日期” +2. 在“一般”选项卡中,可以在动态图形的前后加入一些文字,也可以将绘图窗口直接放在网页之中(在绘图区中打开程序窗口的按钮) +3. 在“高级”选项卡中,可以设定动态网页的一些功能,如:允许使用鼠标右键、在绘图区上按两下,就可以启动GeoGebra;以及修改用户界面,如:显示“菜单栏”、显示“工具栏”等。 + +说明:导出动态工作表时会生产三种文件: + +1. html 文件,此文件是工作表本身 +2. ggb 文件,此文件包括 GeoGebra 图形 +3. geogebra.jar,此文件包含GeoGebra以保证工作表的互动功能。 + +这些文件必须放在同一个文件夹下,动态网页才能正常运行。 + +导出的html文件可以在网页浏览器下查看(如:Internet Explorer,Chrome,Firefox等)。要求电脑必须安装Java模拟器才能正常查看和使用动态网页。 + +# 第三章 绘图工具 + +当鼠标移动到绘图区中的数学对象(如:点、向量、线段、多边形、曲线、直线及圆锥曲线)上时,该对象会出现选中效果,并出现该对象的说明。 + +下面介绍的绘图工具(工具组)可以由点击工具栏的按钮来启动,点击工具按钮右下角的小箭头可以开启工具组,有类似的工具可供选择。 + +# 1. 基本操作 + +# 1.1. 选取对象 + +要选取对象,先点击移动工具,然后使用鼠标选取对象。 + +加入想要同时选取多个对象,可以使用鼠标拖拽出一个区域:先点击移动工具,然后再要选取的区域左上角按下鼠标左键,不松开直到要选取区域的右下角,松开鼠标左键,所有在该区域内的对象就都被选取了。 + +也可以按住Ctrl键点击多个对象,也可以选取多个对象。 + +# 1.2. 对象改名 + +如果要更改选取对象或新增对象的名称,只需在该对象上点击鼠标右键,选择“改名”打开对话框,输入新的名称,点击“确定”即可。 + +# 2. 
一般工具 + +![](images/141c6a4b0574a479cfb38846d4cb24f9e633d872d60544107801983b61001566.jpg) + +# 复制样式 + +此工具可以复制对象的样式(如:颜色、大小、线宽)。 + +先选取想要复制样式的对象,然后点击其他对象。 + +![](images/bf0a6e0cd5d490b5f1d66b7cc9490eb289915864c6f519c1a8e29cd9bc202db3.jpg) + +# 删除 + +点击要删除的对象。 + +![](images/6473d79f9ff2c2b69398caf4a64f8b488acb30f77f6bb6913696a58804327a65.jpg) + +# 移动 + +使用鼠标拖拽对象。在移动模式下,使用鼠标点击一个对象,可以按 Del 键删除对象,也可以使用方向键移动对象。 + +![](images/02ab2012501d5373139714e64f566043cd64dc261725ea1cccff3cf553e0ea98.jpg) + +# 移动绘图区 + +使用鼠标拖拽移动绘图区来改变可视范围。 + +在此模式下使用鼠标拖拽坐标轴,可以改变坐标轴比例。 + +![](images/2bd4facc2e2a243b1d432fd0b3f487bf1b17f0edde66492a4fad7f51d40fdbde.jpg) + +# 记录到电子表格 + +此工具能够在对象移动的同时,将该对象的变化数值记录在电子表格中。此工具仅对数值、点、向量有效。 + +GeoGebra会使用电子表格的前两个单元格来记录被选取对象的数值。、 + +# a=判断对象关系 + +选取两个对象,在弹出的 Relation 窗口中即可显示二者的关系。(如下图,两点不相等) + +![](images/d754b1e75b29cbb44f44d3e95b93384ceab01c480efa296e8fca58b2c1b1e362.jpg) + +![](images/1bbe99ee41071bfc3a9b61a549814c5b90319c46f1bddcc1fcbdce4ba37fb56d.jpg) + +# 绕点转动 + +先选取旋转的中心点,然后用鼠标拖拽对象以此中心旋转。 + +![](images/a3d0f3d64781097bb112942c70aea04736d69ea3dd373d49a6f68d1589294500.jpg) + +# 显示或隐藏标签 + +点击对象显示或隐藏其标签。 + +![](images/913b31094bb0051888cd6a0fbeecf57a9732a10305357334fe8d579a797350c9.jpg) + +# 显示或隐藏对象 + +点选对象来切换其显示或隐藏状态。点选完后,需要按 Esc 键。 + +![](images/6fcabddf1d60d96a2b4b3fd24a76578baf6c17afd1d74b36d4ae4b7afb54b512.jpg) + +# 放大 + +按一下绘图区即可放大(或使用鼠标滚轮)。 + +![](images/8a724b0720bf024b61469ef843859352c1d872348462bdbe0cf268d6f58f8510.jpg) + +# 缩小 + +按一下绘图区即可缩小(或使用鼠标滚轮)。 + +![](images/8f90e682f10d84a12e4c4bc888d0c9da3583d26e72a3e43eb48146b45ccb5daf.jpg) + +# 对象群组显示隐藏按钮 + +点击绘图区,会生成一个可以显示或隐藏对象群组的选择按钮,可以在窗口中设置哪些对象受该按钮控制。 + +![](images/33e91eef39c56c64d8956e581c77f3384872a3a128c9a351aa30d9485280b5d0.jpg) + +# 轨迹 + +点选一个会随着点A变化的点B,然后点击点A,就会产生B的轨迹。 + +# 3. 
点 + +![](images/f0f0de9195673c6b290fe3db1cd6aa7312be3ae06ab85ee75bb530dc4b7b2639.jpg) + +# 交点 + +分别选择两个相交的对象,或者直接用鼠标点出交点。 + +对呀线段、射线或者弧,可以选择是否要落在外部交点。这可以用于做出落在对象延长线上的点。如:射线或线段延伸就是一条直线。 + +![](images/5669f1193b3cfefb9e2ac2292530454e33a89c7067b789766acb2c414cf1b7f3.jpg) + +# 中点或中心点 + +可以点选两点或一条线段以得到中点,也可以点选一条圆锥曲线以得到该圆锥曲线的中心点。 + +![](images/4d09d48ecf809dcc2ea065a2603994e029d2bf1090ed2cb713d6a664fd02d844.jpg) + +# 新点 + +点击绘图区空白处,或者点击某对象。 + +当点击线段、直线、多边形、圆锥曲线、函数或曲线可以在对象上新增一点。 + +当点击两对象相交处时可建立交点。 + +# 4. 线 + +# 4.1. 直线 + +![](images/b1d54eb7cb883f54e08cc8513ce48b447def2ad4caaca995170506d9e22c5141.jpg) + +# 直线(过两点) + +点选两点A和B,可以建立过点A和B的直线,代数区会显示相关的直线方程。 + +![](images/d5ec7ce61a98016c90d9be5143097598b42684ec5079f4e8de9e5f6958b2d115.jpg) + +# 平行线 + +点选一条直线a和一点A,可以画出一条过点A并平行于a的直线。 + +![](images/cc6cdae42435ad6c8f4ae32c5f179717ad03565d18acd794d166fb978327c303.jpg) + +# 垂线 + +点选一条直线a和一点A,建立一条过点A且垂直于a的直线。 + +![](images/27a321d6626b2aee61b1d2a03439e615cd35090ff55c998baca67ed0a18a2f21.jpg) + +# 中垂线 + +点选一条线段a或两点A和B建立中垂线(垂直平分线)。 + +![](images/424c8eafecc4b9b934ab0d01aef1fdd1bd5227843a31b4016f5ed46f13357fde.jpg) + +# 角平分线 + +有两种方式可以作出角平分线: + +1. 点选三点 A, B, C, 将生成以 B 为顶点, 三点围成的角的角平分线 +2. 点选两条线段,生成其角平分线 + +![](images/1af2a470aeda35ad3d7cb6e42a1648fc14bbd204cd8f841bfd0580fb5cb4fea6.jpg) + +# 切线 + +有三种方式可以作出切线: + +1. 选取一点A和一圆锥曲线a,可以生成过点A切与a相切的切线 +2. 选取一条线 c 和一圆锥曲线 a,可以生成平行于 c 与 a 相切的切线 +3. 选取一点A和一函数f,可以生成f在 $\mathbf{x} = \mathbf{x}$ (A)处的切线 + +![](images/8098771fdcd66ebf060890599220f5c92a540238234a988a8f81b9f8b4870781.jpg) + +# 极线或径线 + +1. 点选一点和圆锥曲线,可以画出极线 +2. 选取一线和圆锥曲线,可以画出径线 + +![](images/1aff0f22a290c5574046e29ae1809623e8497ea1071bb7b3ec3402540db3cbd3.jpg) + +# 回归直线 + +用鼠标拖拽选取一些点可以得到它们的回归直线,代数区会显示回归方程。 + +# 4.2. 
线段 + +![](images/29ec7dca224dbbf89556ad9dce0585ba3337da9990c8faf5498a43e1bc271aea.jpg) + +# 线段(过两点) + +选取点A和点B,建立线段AB。在代数区会显示代数的长度。 + +![](images/d1fa6980d2e4119aa5fd173f9b9543b52e306fe35ce2b485abbcdefba19fb145.jpg) + +# 线段(指定起点、长度) + +点选点 A 作为线段的起点,在出现的窗口中输入长度,点击“确定”。 + +# 4.3. 射线 + +![](images/b221215b6ac3d40bc521e10170c090b06848925c0020dd512653da5a6caceba6.jpg) + +# 射线(过两点) + +点选两点 A 和 B,建立以 A 为起点通过 B 的射线,在代数区会显示相关的方程。 + +# 5. 向量 + +![](images/3a87250b21be7cf1a72fefc39e16c7f1684d832791aed8b5fe70fa0c2dec9ce1.jpg) + +# 向量(过两点) + +点选向量的起点或终点。 + +![](images/59c60f6dfd482385ff4b129cd514326eb87450f539c59fc2de941f112007245b.jpg) + +# 向量(指定起点、向量) + +点选起点 A 和另一个向量 v 以建立点 $\mathrm{B} = \mathrm{A} + \mathrm{v}$ , 可以做出从 A 到 B 的向量。 + +# 6. 圆锥曲线 + +![](images/24c722794edf2fe837de8c826de1ee6ca6605e13f8c01f19464b2131b595fd33.jpg) + +# 圆(指定圆心与一点) + +点选一点 M 和一点 P,可以画出以 M 为圆心过点 P 的圆。 + +![](images/45654885b173f81f48e21f6187c798c7dafd5960e6641deba2c15c97a083ba86.jpg) + +# 圆(指定圆心与半径) + +点选圆心M,在出现的窗口中输入半径,点击“确定”,即可画出指定圆心与半径的圆。 + +![](images/dd87f2320d30a2b101b78a614aa5378ed1bcc4f3724756afe0aea98f127032b8.jpg) + +# 圆(过三点) + +点选三点A,B,C可以画出过此三点的圆。 + +![](images/80514e2c87aa7837decdabb3320e842bdfcf2e2bd25925ae3b2c7f7a7320d0f8.jpg) + +# 圆(半径长、圆心) + +先选择一条作为半径长的半径,再指定圆心画圆。 + +![](images/f26808c2709518db4084c324dafac45e1681701be18ab66c45139ee8aaac4c7d.jpg) + +# 圆锥曲线(过五点) + +点选五个点,生成一个过此五点的圆锥曲线。 + +![](images/f007498656fdba651f1567cdf9df5fe027b22a22e7514eb416153588e496cbfb.jpg) + +# 椭圆 + +点选两点作为椭圆的焦点,然后选定椭圆上一点,画出椭圆。 + +![](images/85ae0accf2115a8f4f40d4425cd2750415e11818ab81de76e07d343136f9d654.jpg) + +# 双曲线 + +点选两点作为双曲线的焦点,然后选定落在双曲线上的一点,画出双曲线。 + +![](images/8552324548e5f3e05b2bc151f6b208214e43db87348b68a5735c614a1a8c0067.jpg) + +# 抛物线 + +点选一点和抛物线的准线,画出抛物线。 + +# 7. 圆与多边形 + +# 7.1. 
圆弧与扇形 + +圆弧的代数值即为其长度,扇形的代数值为其面积。 + +![](images/e38e1516763e5abd7e0a9dce2a11fe900786ba8f8655426ccbb1a0ab7d4ae14d.jpg) + +# 圆弧(指定圆心与两点) + +先选取圆弧的圆心M,然后点选起点A和终点B,画出圆弧AB。 + +![](images/5be419bd7d41b0819c24e67838ad020259d6967257bf0ddb7efb82fb8e7b26bf.jpg) + +# 扇形(指定圆心与两点) + +先选取扇形的圆心M,然后点选起点A和终点B,画出扇形。 + +![](images/c6ad3b43ce32c6c6698ebd819cb2bd543be0585422627afac753babf3998f5f2.jpg) + +# 圆弧(过三点) + +点选三点 A, B 和 C 建立通过三点的圆弧。点 A 为圆弧起点, 点 B 在圆弧上, 点 C 为圆弧的终点。 + +![](images/4a433465988fdcf21c2513e7403443cef8b486088b11bdcf6833f181ced2d998.jpg) + +# 扇形(过三点) + +点选三点A,B和C建立扇形。点A为扇形圆弧起点,点B在扇形圆弧上,点C为扇形圆弧的终点。 + +![](images/1daf11f5ce8ca9e5308e7b6733e6a7d12cec96640ca3971987a57c721442c752.jpg) + +# 半圆(过两点) + +点选两点A和B,建立以线段AB为直径的半圆。 + +# 7.2. 多边形 + +![](images/d643eb4023d4ca2ae44046867e5dc31b4ce3166fd1662ff3ce8fa23faacd640f.jpg) + +# 多边形 + +至少点选三个点作为多边形的顶点,最后再点一下第一个顶点以建立一个封闭的多边形。在代数区会显示多边形的面积。 + +![](images/03ed982650da058a892d1d6e9bce36719d9cc42193875b5389245ac457527ff3.jpg) + +# 正多边形 + +点选两点 A 和 B 并在出现的窗口中输入整数 n,即得到一个有 n 个顶点,线段 AB 为边长的正多边形。 + +# 8. 数值与角度 + +![](images/618496f7324926a52377e377de73df11a4d540a856f733677026a7dc086c4acb.jpg) + +# 测量角度 + +该工具可以测量三点间的角度,两条线段、两条直线、两向量的角度和多边形的内角。 + +![](images/54f43745f0fcc6de23e3a9e1e744e7f7e4e2e0c1e69c4f906a28a0a06b6f0197.jpg) + +# 画指定角 + +点选两点 A 和 B, 在窗口中输入角度大小, 将建立点 C 和角度 $\alpha, \alpha$ 为角 ABC。 + +![](images/1c97a5b2e6781f75d1f3f33e046a91404fc86c431a6ced41f32901b410635f8b.jpg) + +# 测量面积 + +可以测量多边形、圆或者椭圆的面积,将面积值显示在绘图区中。 + +![](images/cdb9afb5b2fcda3161721dc2aef45937d32a7e414e5ef5a475e708e5e1938592.jpg) + +# 测量距离 + +可以测量两点、两直线或者一点与一直线间的距离,将距离值显示在绘图区中。也可以测量线段长度、圆的周长或者多边形周长。 + +![](images/d54d4fe498da9afbf49ebe4e0f684971df9118b7dfecbf67c1447bbd317c03bc.jpg) + +# 斜率 + +可以求出直线的斜率,将斜率值显示在绘图区中。 + +![](images/66a96fc496c949b4b7419cacc62e5f305521640a55f51844c4de5f78d1ed0a9a.jpg) + +# 滑杆 + +点击绘图区的任意位置可以建立数值或角度滑杆,在出现的窗口中,可以指定名称、数值或角度的变化区间,角度或者长度的增量,还可以设定滑杆水平或垂直以及宽度。 + +数值滑杆的位置在绘图区可以是绝对定位的(滑杆固定在绘图区上,不会因为放大缩小而改变或是消散)也可以是相对于坐标系的。 + +# 9. 
几何变换 + +下列几何变换可用于点、直线、圆锥曲线、多边形和图片。 + +![](images/8ac19a3525e0a13193c2c8f295654366484c9b9a06ef3cb04c19e52fb64fc858.jpg) + +# 以某点为中心伸缩对象 + +先选取要进行伸缩变换的对象,在点选缩放中心,在窗口中输入缩放比例。 + +![](images/8e73671cbfb7984fcf22777e1d814db3117499f18542ac1649bf05e76e7fadb2.jpg) + +# 做轴对称 + +先选要做轴对称的对象,再选对称轴。 + +![](images/a3eb2536e236b0b032ca4689219920b4b7a37ffd1fd3c3ed287e48503fb31ae2.jpg) + +# 做点对称 + +先选要做对称的对象,再选对称中心。 + +![](images/c18ac65ffe53e5daec52008752403df6f083bf28dc50e7c551e001d0995cd92a.jpg) + +# 反演 + +该工具可以让一个点对一个圆进行反演。先选取要进行反演的点,再指定圆指定为反演圆。 + +![](images/cae8118cec17fd8cd32f43bb37ee02a91dd5b355cff3b5304d75df0e88809e4a.jpg) + +# 旋转 + +先选取要进行旋转的对象,再点选某点作为旋转中心,在窗口中输入旋转角度。 + +![](images/de324edae4ff7b96f37c2f00431926b514e1fe8ef3da9a8a3ccb20d5b94d427e.jpg) + +# 平移 + +先选取要平移的对象,再点击平移的向量。 + +# 10. 文字 + +# ABC 插入文字 + +此工具可以在绘图区中输入静态文字(纯字符)动态文字(包含变量)或LaTex公式。首先,需要指定文字的位置,然后在窗口中输入文字。 + +动态文字包含对象的数值。可以用键盘输入静态部分(如:Point A=)然后点击要显示其数值的对象。GeoGebra 会在变量文字中自动增加必要的语法结构,文字的静态部分加上双引号再加上+这个字符来连接文字的不同部分。如下例: + +![](images/3e6f8614af54727ebb86fc0e673daa6226ec1b70b5fba7aaf1ad2b058eb74918.jpg) + +
| 输入 | 说明 |
| --- | --- |
| 文字 | 显示“文字”(静态文字) |
| “Point A=”+A | 动态显示“Point A=(1.19,1.47)”(坐标会随着A点的位置而变化) |
| “a=”+a+“cm” | 动态显示“a=1cm” |
+ +如果要在文字中使用某个已有对象的名称本身(而非其数值),需要给该名称加上双引号,否则GeoGebra会将其视为动态文字而显示对象的数值;如果输入的文字不是已有对象的名称,则不需要加双引号。 + +在GeoGebra中可以输入LaTeX公式,在窗口中勾选“LaTeX公式”即可输入符合LaTeX语法的公式。下面说明一些重要的LaTeX语句: + +
| LaTeX 输入 | 输出结果 |
| --- | --- |
| a \cdot b | $a \cdot b$ |
| \frac{a}{b} | $\frac{a}{b}$ |
| \sqrt{x} | $\sqrt{x}$ |
| \sqrt[n]{x} | $\sqrt[n]{x}$ |
| \vec{v} | $\vec{v}$ |
| \overline{AB} | $\overline{AB}$ |
| x^{2} | $x^{2}$ |
| a_{1} | $a_{1}$ |
| \sin\alpha + \cos\beta | $\sin\alpha + \cos\beta$ |
| \int_{a}^{b} x dx | $\int_{a}^{b} x\,dx$ |
| \sum_{i=1}^{n} i^2 | $\sum_{i=1}^{n} i^2$ |
+ +# 11. 图片 + +![](images/7c305967deada55a68f02daf896ccbb9b2c7fa5b0797fc32c723102d1af4d0b6.jpg) + +# 插入图片 + +该工具可以在绘图区插入图片。先在绘图区选定位置,然后出现“打开”窗口,选择要插入的图片,点击“打开”。 + +位置:图片的位置可以是绝对定位也可以是相对于坐标系的,可以在图片的“属性”中设置。也可以在“属性”中指定三个顶点来设置图片位置,这样就可以缩放、选择甚至扭曲图片了。 + +背景图:可以在“属性”中将图片设置为背景图,背景图会放在坐标轴后,且无法被鼠标选取。 + +透明度:为了使图片后面的对象或坐标轴可见,图片可以设定为透明,透明度可以在“属性”中设定,从 $0\%$ 到 $100\%$ 。 + +# 第四章 代数输入 + +数学对象的代数特性(如:数值、坐标、方程式)都会在左侧的“代数区”显示。如果要建立或者修改对象,可以使用GeoGebra底部的“命令框”直接输入代数式。 + +在命令框中输入代数式后,按Enter即可。按Enter键可以在命令框和绘图区之间快速切换,而不需要使用鼠标点击。 + +# 1. 基本操作 + +# 1.1. 对象命名 + +使用命令框建立一个对象时,可以给予对象特定的名称: + +【点】在 GeoGebra 中,点的命名是用大写字母表示的,只要在坐标前面加上名称与符号即可。 + +如: $\mathbf{C} = (2,4)$ , $\mathrm{P} = (1;180^{\circ})$ ,Complex $= 2 + \mathrm{i}$ + +【向量】为了区别点和向量,在GeoGebra中使用小写字母命名向量。同样,在向量坐标前加上名称和等号即可。 + +如: $\mathrm{v} = (1,3)$ , $\mathbf{u} = (3;90^{\circ})$ ,Complex $= 1 - 2\mathrm{i}$ + +【直线、圆、圆锥曲线】这些对象的命名是在方程式前面加上名称与冒号。 + +如:g: $\mathrm{y} = \mathrm{x} + 3$ ,c:(x-1) $\hat{\mathbf{\alpha}} 2+$ (y-2) $\hat{\mathbf{\alpha}} 2 = 4$ ,hyp: $\mathrm{x^2 - y^2} = 2$ + +【函数】在函数式前面加上名称即可。 + +如:f(x)=2x+4,g(x)=x^2,trig(x)=sin(x) + +如果不手动输入对象名称,GeoGebra会按字母顺序自动命名; + +如果对象名称包含下标可以使用“_”来建立下标,如:输入A_1可以得到 $\mathrm{A}_{1}$ + +# 1.2. 修改数值 + +有两种方法可以改变自由对象的数值: + +1. 修改对象的数值,可以在命令框输入名称与新的数值 + +如:如果要修改 $a = 3$ ,可以在命令框输入 $a = 5$ ,然后按 Enter 键即可。 + +2. 编辑代数式,使用移动工具在代数区双击对象,即可编辑对象的数值,编辑完成后按 Enter 即可。 + +如果直接修改自由对象的数值,则相应的派生对象的数值也会跟着改变。 + +# 1.3. 显示命令框的输入记录 + +将鼠标移至命令框,可以使用键盘的上下键,一步一步浏览先前输入的命令。 + +点击命令框左侧的帮助按钮,可以显示命令框的说明。 + +![](images/a5ce12eef5e5317ed19d895f8d53878e1c0eea4f989e9debeca88e7c898ef9a5.jpg) + +# 1.4. 命令框插入名称、数值或对象的定义 + +【插入对象的名称】使用移动工具并点选要插入名称的对象,然后按下键盘的 F5 键。在按下 F5 键之前,对象的名称会被附加到已在命令框输入的方程式。 + +【插入对象的数值】有两种方式可以插入对象的数值: + +1. 在对象上点击鼠标右键,在菜单中选取“复制到命令框”。 +2. 使用移动工具并点选要插入数值的对象,然后按下键盘上的 F4 键。 + +【插入对象的定义】有两种烦死可以插入对象的定义: + +1. 按住 Alt 点击对象来插入对象的定义。 +2. 使用移动工具点选要插入定义的对象,然后再按下键盘上的 F3 键。 + +# 2. 数字和角 + +GeoGebra 可以输入数值、角、点、向量、线段、直线、圆锥曲线、函数和参数曲线。可以在命令框输入这些对象,输入坐标或方程式后按 Enter 即可。 + +# 2.1. 
数字 + +可以利用命令框建立一个数字,如果只输入一个数字,如:2,GeoGebra会将指定一个小写字母作为它的名称。如果要给数字明确的命名,可以先输入名称,再输入等号和数字。可以使用命令框右侧的下拉菜单选用某些常数(如圆周率 $\pi$ )。 + +![](images/d9cc57f5c2b7dcf44a297182a31a4ee8bd6c10cadf7ed153d01b695ac690aca0.jpg) + +![](images/33c4ce96a578f11cfcd71973dba4d825e0d32c24579c1368d89be1bd486a65e5.jpg) + +![](images/c14aed7c8b97a99734376a177079549d41f3eea46baf4033a8cde2de34bd73cf.jpg) + +![](images/540efde5ff71518428a68a681c62d10e3aa7c40fb5f234b7693c328a4560b128.jpg) + +![](images/0415ba00455f4870215f617534c7d7305d202d5692b3523c5306608145a45e4b.jpg) + +![](images/3380729fa10652f7c4c52e6834c34a47fb9c3633f7aa871bce36b48f900a0aad.jpg) + +![](images/b4c6711fcf50c8497664df0872acb66d37951db497cee6a767f19940c89ca8db.jpg) + +![](images/d0884eeed5711f8744ad18c545054dedd2abe75cfad3e85b6690d5ca1e5f5c86.jpg) + +![](images/6a5d5f9f0c1666cd8e8b848894b09e944e1057e921565a0701fe8e4b934cd6ef.jpg) + +# 2.2.角 + +可以输入角度“°”或者弧度“rad”,可以使用pi输入π + +可以使用快捷键输入°和π:Alt+O:可以生成符号“°”;Alt+P:可以生成符号“π”。 + +如:角 $\alpha$ 可以输入角度( $\alpha = 60^{\circ}$ )或弧度( $\alpha = \mathrm{pi} / 3$ ) + +# 【滑杆和方向键】 + +在绘图区可以使用“滑杆”来实现数字和角度的变化,而在代数区则使用方向键来改变数字和角度。 + +# 【限制数值范围】 + +数字与角度的变化可以指定一个区间,使用滑杆的“属性”来设定最大值和最小值。 + +# 3. 点和向量 + +点和向量可以使用直角坐标或极坐标来输入。其中,大写字母表示点,小写字母表示向量。如: + +在直角坐标下,输入点P或向量v,可以输入 $\mathbf{P} = (1,0)$ , $\mathrm{v} = (0,5)$ 。 + +在极坐标下,输入点P或向量,可以输入 $\mathrm{P} = (1;0^{\circ})$ , $\mathbf{v} = (5;90^{\circ})$ 。 + +必须使用分号来分隔极坐标的长度和角度,如果没有输入角度符号,GeoGebra将此角视为弧度。 + +在GeoGebra中,可以使用点和向量进行运算: + +求点A和B的中点M,可以在命令框输入 $\mathbf{M} = (\mathbf{A} + \mathbf{B}) / 2$ + +计算向量 $\mathbf{v}$ 的模长,可以在命令框输入length $=$ sqrt( $\mathbf{v}^{*}\mathbf{v}$ )。 + +# 4. 直线和坐标轴 + +# 4.1. 
直线 + +可以在命令框中输入“方程式”或“参数式”来表示直线,在这两种情况下,被使用的变量必须是已经定义的数字、点或向量。如: + +输入 $\mathbf{g}$ : $3\mathrm{x} + 4\mathrm{y} = 2$ ,可以得到直线 $\mathbf{g}$ 。 + +先定义参数 $\mathrm{t}$ (如 $\mathrm{t} = 3$ )再输入参数式 $\mathbf{g} : \mathbf{X} = (-5, 5) + \mathrm{t}$ (4,-3)。 + +先定义参数 $\mathrm{m} = 2$ 和 $\mathrm{b} = -1$ ,然后输入方程式 $\mathrm{g} : \mathrm{y} = \mathrm{mx} + \mathrm{b}$ 。 + +# 4.2. 坐标轴 + +在命令框中可以使用xAxis或yAxis来当做坐标轴的名称。 + +如:输入命令 Perpendicular[A,xAxis]可以画出过点 A 且垂直于 x 轴的垂直线。 + +# 5. 圆锥曲线 + +可以用 $\mathbf{x}$ 和 $\mathbf{y}$ 的二元二次方程式来输入圆锥曲线,已经定义的变量可以被使用在方程式中。圆锥曲线的名称加冒号后输入方程式即可。如: + +
| 曲线 | 输入 |
| --- | --- |
| 椭圆 | ell: 9x^2 + 16y^2 = 144 |
| 双曲线 | hyp: 9x^2 - 16y^2 = 144 |
| 抛物线 | par: y^2 = 4x |
| 圆 c1 | c1: x^2 + y^2 = 25 |
| 圆 c2 | c2: (x-5)^2 + (y+2)^2 = 25 |
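上表中的方程都可以直接在命令框中输入。下面是一个最小的输入示例(其中对象名 c1、g、A 为本示例假设的命名),先输入一个圆和一条直线,再用第五章介绍的 Intersect 命令求出一个交点:

```
c1: x^2 + y^2 = 25
g: y = x + 1
A = Intersect[g, c1, 1]
```

逐条输入并按 Enter 后,绘图区会依次画出圆、直线以及交点 A,代数区中同时显示它们的代数表达式。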
+ +# 6. 函数和运算 + +可以使用已经定义的变量或函数输入一个新的函数。如: + +函数f:f(x)=3x^3-x^2 + +函数 $\mathrm{g}$ : $\mathrm{g}(\mathbf{x}) = \tan (\mathrm{f}(\mathbf{x}))$ + +未命名的函数: $\sin (3x) + \tan (x)$ + +在GeoGebra中可以使用命令求出函数的积分和微分。如: + +先定义函数 $f(x) = 3x^3 - x^2$ ,然后输入 $g(x) = \cos(f'(x + 2))$ 得到函数 $g$ 。 + +# 6.1. 常用函数和运算 + +建立数字、坐标或方程式时,可以利用下面的常用函数和运算。 + +
| 运算 | 输入 |
| --- | --- |
| 加法 | + |
| 减法 | - |
| 乘法 | * 或空格键 |
| 内积 | * 或空格键 |
| 复数乘法 | * |
| 除法 | / |
| 次方 | ^ |
| 阶乘 | ! |
| Gamma 函数 | gamma() |
| 括号 | () |
| 计算某点的 x 坐标 | x() |
| 计算某点的 y 坐标 | y() |
| 绝对值 | abs() |
| 正负号 | sgn() |
| 平方根 | sqrt() |
| 立方根 | cbrt() |
| 0 到 1 的随机数 | random() |
| 指数函数 | exp() |
| 自然对数 | ln() |
| 以 2 为底的对数 | ld() |
| 常用对数 | lg() |
| 余弦 | cos() |
| 正弦 | sin() |
| 正切 | tan() |
| 反余弦 | acos() |
| 反正弦 | asin() |
| 反正切 | atan() |
| 小于等于的最大整数 | floor() |
| 大于等于的最小整数 | ceil() |
| 近似(四舍五入) | round() |
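上表中的函数和运算可以互相组合,直接在命令框中输入。下面是一个假设的输入示例:

```
a = sqrt(16) + 2^3
b = round(sin(1) * 10)
c = abs(-a)
```

其中第一行得到 $a = 12$;由于 sin() 以弧度计算,第二行先求 sin(1) 再四舍五入。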
+ +# 6.2. 布尔运算 + +在GeoGebra中,可以使用布尔值“true”和“false”。如:在命令框中输入 $a = true$ 或者 $b = false$ 并按下Enter键。 + +在 GeoGebra 中可以使用下列布尔运算与条件,这些命令可以从命令框右侧的下拉菜单中选取,或者使用键盘输入: + +
| 运算 | 菜单选择 | 键盘输入 | 例子 | 类型 |
| --- | --- | --- | --- | --- |
| 等于 | ≟ | == | a ≟ b 或 a == b | 数值、点、直线、圆锥曲线 |
| 不等于 | ≠ | != | a ≠ b 或 a != b | 数值、点、直线、圆锥曲线 |
| 小于 | < | < | a < b | 数值 |
| 大于 | > | > | a > b | 数值 |
| 小于等于 | ≤ | <= | a ≤ b 或 a <= b | 数值 |
| 大于等于 | ≥ | >= | a ≥ b 或 a >= b | 数值 |
| 与 | ∧ | && | a ∧ b | 布尔值 |
| 或 | ∨ | \|\| | a ∨ b | 布尔值 |
| 非 | ¬ | ! | ¬a 或 !a | 布尔值 |
| 平行于 | ∥ | | a ∥ b | 直线 |
| 垂直于 | ⊥ | | a ⊥ b | 直线 |
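布尔运算的结果也可以赋值给对象,在代数区中以 true 或 false 显示。下面是一个假设的输入示例(变量名可自行指定):

```
a = 3
b = 5
t = a < b
s = (a < b) && (b < 10)
```

输入后 t 和 s 的值均为 true;这类布尔值可进一步用于第七章介绍的“显示条件”等功能。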
+ +# 7. 对象列表和运算 + +如果要建立一些对象(如:点、线段、圆)的列表,可以使用大括号。如: + +$\mathrm{L} = \{\mathrm{A},\mathrm{B},\mathrm{C}\}$ 为包含三点A,B,C的列表 + +$\mathrm{L} = \{(0,0), (1,1), (2,2)\}$ 为包含三个未命名点的列表 + +# 【比较对象列表】 + +比较两个对象列表,可以使用下列语法结构: + +list1 = list2: 检查两个列表是否相等并返回 true 或 false + +list1 != list2: 检查两个列表是否不相等并返回 true 或 false + +# 【列表的运算与函数】 + +如果在列表上执行运算与函数运算,可以得到一个新的列表。 + +# 加法与减法: + +- list1 + list2:将两列表中对应的元素相加,但要求两列表的长度相同。 +- list + number: 用列表中的每个元素加上某个数。 +- list1 - list2:第一个列表内的元素减去第二个列表内的相应元素,也要求两列表的长度相同。 +- list-number: 用列表中的每个元素减去某个数。 + +# 乘法与除法: + +- list1 * list2: 将两列表中对应的元素相乘。要求两列表的长度相同,如果列表内元素为矩阵,则要进行矩阵乘法运算。 +- list * number: 用列表内每个元素乘以某个数。 +- list1 / list2:第一个列表内的元素除以第二次列表内对应的元素,也要求两列表的长度相同。 +- list / number: 用列表中每个元素除以某个数。 +- number/list:用此数除以列表内的每个元素。 + +# 其他: + +- list^2:将列表内每个元素平方。 +- sin(list):列表内每个元素去sin函数。 + +# 8. 矩阵与运算 + +GeoGebra 也能使用矩阵,用矩阵的每一行表示。如: + +在 GeoGebra 中, $\{\{1,2,3\}, \{4,5,6\}, \{7,8,9\}\}$ 表示矩阵 $\left\{ \begin{array}{lll} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{array} \right\}$ + +# 【行列式】 + +Determinant[Matrix]: 计算矩阵行列式的值。 + +# 【逆矩阵】 + +Invert[Matrix]: 得到矩阵的逆矩阵。 + +# 【转置矩阵】 + +Transpose[Matrix]: 得到矩阵的转置矩阵。 + +# 【矩阵运算】 + +# 加法和减法: + +- Matrix1 + Matrix2:两个相同大小的矩阵对应位置相加。 +- Matrix1 - Matrix2:两个相同大小的矩阵对应位置相减。 + +乘法: + +- Matrix * Number: 在矩阵的每个元素上乘以某个数。 +- Matrix1 * Matrix2: 使用矩阵乘法求出新的矩阵。第一个矩阵的列数必须与第二个矩阵的行数相等。如: + +输入{1,2,3}, {4,5,6}, {7,8,9}]*{1,2,3}, {4,5,6}, {7,8,9}得到{30,36,42}, {66,81,96}, {102,126,150}。 + +- $2 \times 2$ Matrix * Point(或向量):矩阵乘以某一点或向量,将得到一个新的点。如:输入\{\{1,2\}, \{3,4\}\}*(3,4)得到A=(11,25)。 +- $3 \times 3$ Matrix * Point(或向量):矩阵乘以某一点或向量,将得到一个新的点。如:输入\{\{1,2,3\}, \{4,5,6\}, \{0,0,1\}\}^* (1,2)得到A=(8,20)。 + +其他: + +- Determinant[Matrix]: 计算矩阵的行列式的值。 +- Invert[Matrix]: 给出矩阵的逆矩阵。 +- Transpose[Matrix]: 给出矩阵的转置矩阵。 + +# 9. 
复数与运算 + +GeoGebra不能直接支持复数,但可以使用点来模拟复数运算。如: + +在命令框输入复数 $3 + 4\mathrm{i}$ ,在绘图区可以得到一个点(3,4),该点的坐标在代数区显示为 $\mathrm{z} = 3 + 4\mathrm{i}$ 。在代数区中任何点可以以复数形式显示,在“属性”中的“代数”选项卡中可以选择复数形式。 + +![](images/beb2dd79d59d9e7f78e6a5717d151498e8f66ee462cba4d01caa701068658943.jpg) + +如果复数i未被定义,GeoGebra将被认定为有序数对 $\mathrm{i} = (0,1)$ 或者复数 $0 + 1\mathrm{i}$ 。也就是说,可以在命令框输入i作为复数单位。 + +# 加法和减法: + +$\cdot (2,1) + (1, - 2)$ 等价于 $(2 + 1\mathrm{i}) + (1 - 2\mathrm{i})$ 并得到复数(3,-1)也可以显示为3-1i +- (2,1) - (1, -2) 等价于 $(2 + 1\mathrm{i}) - (1 - 2\mathrm{i})$ 并得到复数(1,3)也可以显示为 1-3i + +# 乘法和除法: + +- (2,1) * (1, -2) 等价于 $(2 + 1\mathrm{i}) * (1 - 2\mathrm{i})$ 并得到复数(4,-3)也可以显示为 4-3i +- (2,1)/(1,-2)等价于(2+1i)/(1-2i)并得到复数(0,1)也可以显示为 $0 + 1\mathrm{i}$ + +# 其他: + +- $3+$ (4,5) 等价于 $3+$ (4+5i) 并得到复数 (7,5) 或 $7+5\mathrm{i}$ +- 3-(4,5) 等价于 3-(4+5i) 并得到复数 (-1,-5) 或 -1-5i +- 3/(0,1)等价于 $3 / (0 + 1\mathrm{i})$ 并得到复数(0,-3)或0-3i + +- $3^{*}$ (1,2)等价于 $3^{*}(1 + 2\mathrm{i})$ 并得到复数(3,6)或3-6i + +# 第五章 命令输入 + +使用命令可以生成新对象或者修改已有对象。 + +当在 GeoGebra 的命令框中输入命令,软件会尝试着自动补齐命令,也就是说,在命令框中只要输入命令的前两个字符,GeoGebra 会显示最相近的命令(类似于输入法的联想功能): + +如果提示的命令刚好是想输入的,只需按下Enter键,即可接受建议将提示的命令输入命令框; + +如果提示的命令并不是想输入的,可以继续输入,GeoGebra会再提示其他相近的命令。 + +# 1. 一般命令 + +# 【绘图步骤】 + +ConstructionStep[]:返回目前的步骤数 + +ConstructionStep[Object]: 返回指定对象目前的步骤数 + +# 【删除】 + +Delete[Object]: 删除某一对象以及所有与之相关的对象 + +# 【关系】 + +Relation[Object a, Object b]: 显示一个窗口让我们得知 Object a 和 Object b 之间的关系。 + +如:该命令可以知道是否一点在一条直线上(或一条圆锥曲线上)、一直线是与一圆锥曲线相切、一直线是否与一圆锥曲线相交或两对象是否相等。 + +# 2. 
数值 + +# 【仿射比】 + +AffineRatio[Point A, Point B, Point C]: 得出共线的三点 A, B, C 的比值 $\lambda$ , 其中 $C = A + \lambda * AB$ 。 + +# 【面积】 + +Area[Point A, Point B, Point C, ...]: 计算点A,B,C,……所围成的多边形面积。 + +Area[Conic c]: 计算圆锥曲线 $c$ 的面积。 + +# 【坐标轴】 + +AxisStepX[]:得到当前X轴单位和数值。 + +AxisStepY[]:得到当前 Y 轴单位和数值。 + +# 【二项式系数】 + +BinomialCoefficient[Number n, Number r]: 计算“n选r”的二项式系数。 + +# 【周长】 + +Circumference[Conic]: 计算圆锥曲线的周长。 + +# 【交比】 + +CrossRatio[Point A, Point B, Point C, Point D]: 计算共线的四点 A, B, C, D 的交比为 $\lambda$ 。 $\lambda = \text{AffineRatio}[\text{B}, \text{C}, \text{D}] / \text{AffineRatio}[\text{A}, \text{C}, \text{D}]$ 。 + +# 【曲率】 + +Curvature[Point, Function]: 计算函数在制定点的曲率。 + +Curvature[Point, Curve]: 计算曲线在指定点的曲率。 + +# 【距离】 + +Distance[Point A, Point B]: 计算两点A和B之间的距离。 + +Distance[Point, Line]: 计算点和线之间的距离。 + +Distance[Line g, Line h]: 计算两直线 $\mathbf{g}$ 和 $\mathbf{h}$ 之间的距离。 + +# 【轴长】 + +FirstAxisLength[Conic]:计算圆锥曲线主轴的长度。如:椭圆中是长轴的一半。 + +SecondAxisLength[Conic]:计算圆锥曲线副轴的长度。如:椭圆中是短轴的一半。 + +# 【最大公因数】 + +GCD[Number a, Number b]: 计算两个数字 a 和 b 的最大公因数。 + +GCD[List of numbers]: 计算数列的最大公因数。 + +# 【最小公倍数】 + +LCM[Number a, Number b]: 计算两个数字 a 和 b 的最小公倍数。 + +LCM[List of numbers]: 计算数列的最小公倍数。 + +# 【整除】 + +Div[Number a, Number b]: 计算数字 a 除 b 的商。 + +# 【积分】 + +Integral[Function, Number a, Number b]: 计算某函数在[a,b]的积分。 + +Integral[Function f, Function g, Number a, Number b]: 计算在区间[a,b]上,f(x)-g(x)的积分。 + +# 【迭代】 + +Iteration[Function, Number x0, Number n]: 给定起始值 x0,重复带入 f 函数 n 次。如:定义 $f(x) = x^{\wedge}2$ ,输入命令 Iteration[f, 3, 2] 就可以得到结果 81。 + +# 【长度】 + +Length[Vector]: 计算向量的长度。 + +Length[Point A]: 计算原点到点 A 的长度。 + +Length[Function, Number x1, Number x2]: 计算函数 f 从 x1 到 x2 之间的长度。 + +Length[Function, Point A, Point B]: 计算函数 f 从点 A 到点 B 之间的长度。如果给定的 + +点不在函数图像上,那么将以点的 $x$ 坐标作为区间。 + +Length[Curve, Number t1, Number t2]: 计算曲线在参数 t1 到 t2 之间的长度。 + +Length[Curve c, Point A, Point B]: 计算曲线c在点A和B之间的曲线长度。 + +Length[List]:计算列表的长度(列表的元素个数)。 + +# 【离心率】 + +LinearEccentricity[Conic]: 计算圆锥曲线的离心率。 + +# 【最大值、最小值】 + 
+Min[Number a, Number b]: 求 a 和 b 中的小者。 + +Max[Number a, Number b]: 求 a 和 b 中的大者。 + +# 【余数】 + +Mod[Integers a, Integers b]: 计算 a 除以 b 的余数。 + +# 【参数】 + +Parameter[Parabola]: 计算抛物线 $p$ 的参数(准线和焦点间的距离)。 + +# 【周长】 + +Perimeter[Polygon]: 计算多边形周长。 + +# 【半径】 + +Radius[Circle]: 计算圆 $c$ 的半径。 + +# 【随机数】 + +RandomBetween[Min integer, Max integer]: 在最大值和最小值直接产生随机数。 + +RandomBinomial[Number n of trials, Probability p]: 使用二项分布产生随机数。 + +RandomNormal[Mean, Standard deviation]: 使用正态分布产生随机数。 + +RandomPoisson[Mean]:使用泊松分布产生随机数。 + +# 【斜率】 + +Slope[Line]: 计算给定直线的斜率。 + +# 3.角 + +Angle[Vector v1, Vector v2]: 计算两向量 v1 和 v2 之间的夹角。 + +Angle[Line g, Line h]: 计算两直线 $\mathbf{g}$ 和 $\mathrm{h}$ 方向向量之间的夹角。 + +Angle[Point A, Point B, Point C]: 以B为顶点,计算BA和BC之间的夹角。 + +Angle[Point A, Point B, Angle $\alpha$ ]:以线段 AB 为始边,画出大小为 $\alpha$ 的角。 + +Angle[Conic]: 计算圆锥曲线 c 主轴的转角。 + +Angle[Vector]: 计算 $\mathbf{X}$ 轴到向量 $\mathbf{v}$ 之间的夹角。 + +Angle[Point]: 计算 $\mathbf{x}$ 轴到点A之间的夹角。 + +Angle[Number]: 将一个数字 n 转换成弧度(介于 0 到 2pi 之间)。 + +Angle[Polygon]: 建立多边形 $p$ 的所有内角。如果多边形是以逆时针方向选取,可以得到内角和。如果多边形是顺时针方向选取,可以得到外角和。 + +# 4. 
点 + +# 【中心点】 + +Center[Conic]: 计算圆锥曲线的中心点。 + +# 【重心】 + +Centroid[Polygon]: 计算多边形 $\mathfrak{p}$ 的重心。 + +# 【极值】 + +Extremum[Polynomial]: 计算多项式函数图像上的局部极值。 + +# 【焦点】 + +Focus[Conic]: 得到圆锥曲线 $c$ 的焦点。 + +# 【拐点】 + +InflectionPoint[Polynomial]: 计算多项式 $f$ 的所有拐点。 + +# 【交点】 + +Intersect[Line g, Line h]: 计算直线 $\mathbf{g}$ 和 $\mathbf{h}$ 的交点。 + +Intersect[Line, Conic]: 计算直线 $\mathbf{g}$ 和圆锥曲线 $\mathbf{c}$ 的所有交点。 + +Intersect[Line, Conic, Number n]: 计算直线 $\mathbf{g}$ 和圆锥曲线 $\mathbf{c}$ 的第 $\mathbf{n}$ 个交点。 + +Intersect[Conic c1, Conic c2]: 计算两圆锥曲线 c 和 d 的所有交点。 + +Intersect[Conic c1, Conic c2, Number n]: 计算两圆锥曲线 c 和 d 的第 n 个交点。 + +Intersect[Polynomial f1, Polynomial f2]: 计算多项式f1和多项式f2的所有交点。 + +Intersect[Polynomial f1, Polynomial f2, Number n]: 计算多项式 f1 和多项式 f2 的第 n 个交点。 + +Intersect[Polynomial, Line]: 计算多项式 $f$ 和直线 $g$ 的所有交点。 + +Intersect[Polynomial, Line, Number n]: 计算多项式 $f$ 和直线 $g$ 的第 $n$ 个交点。 + +Intersect[Function f, Function g, Point A]: 计算函数 $\mathrm{f}$ 和 $\mathrm{g}$ 在起始点 $\mathrm{A}$ 的所有交点。 + +Intersect[Function, Line, Point A]: 计算函数 f 和直线 g 在起始点 A 的所有交点。 + +# 【中点】 + +Midpoint[Point A, Point B]: 计算点A和B的中点。 + +Midpoint[Segment]: 计算线段 s 的中点。 + +# 【点】 + +Point[Line]: 得到直线上一点。 + +Point[Conic]: 得到圆锥曲线上一点。 + +Point[Function]: 得到函数 $f$ 上一点。 + +Point[Polygon]: 得到多边形 $p$ 上一点。 + +Point[Vector]: 得到向量 $\mathbf{v}$ 上一点。 + +Point[Point, Vector]: 得到从 P 点平移向量 v 之后的点。 + +# 【根】 + +Root[Polynomial]: 得到多项式的所有根。 + +Root[Function, Number a]: 得到函数 f 在起始值 a 的一个根。 + +Root[Function, Number a, Number b]: 得到函数 f 在区间[a,b]上的根。 + +# 【顶点】 + +Vertex[Conic]: 生成圆锥曲线的全部顶点。 + +# 5. 
线 + +# 【直线】 + +Line[Point A, Point B]: 建立过点A和B的直线。 + +Line[Point, Line]: 建立过点 A 且平行于直线 g 的直线。 + +Line[Point, Vector v]: 建立过点 A 且方向为 v 的直线。 + +# 【线段】 + +Segment[Point A, Point B]: 建立点A和点B之间的线段。 + +Segment[Point A, Number a]: 建立以点 A 为起点,线段长为 a 的线段。 + +# 【射线】 + +Ray[PointA, PointB]:建立起点为A过B点的射线。 + +Ray[Point, Vector v]: 建立起点为 A 并方向向量为 v 的射线。 + +# 【垂线】 + +Perpendicular[Point, Line]: 建立过点 A 且垂直于直线 g 的直线。 + +Perpendicular[Point, Vector]: 建立过点 A 且垂直于向量 v 的直线。 + +# 【中垂线】 + +PerpendicularBisector[Point A, Point B]: 建立线段 AB 的垂直平分线。 + +PerpendicularBisector[Segment]: 建立线段的垂直平分线。 + +# 【角平分线】 + +AngleBisector[Point A, Point B, Point C]: 生成角 ABC 的角平分线。 + +AngleBisector[Line g, Line h]: 生成直线 $\mathbf{g}$ 和 $\mathrm{h}$ 的角平分线。 + +# 【渐近线】 + +Asymptote[Hyperbola]: 建立双曲线 $h$ 的两条渐近线。 + +# 【对称轴】 + +Axes[Conic]: 建立圆锥曲线 $c$ 的对称轴。 + +# 【径】 + +Diameter[Line, Conic]: 圆锥曲线 c 平行于直线 g 的径。 + +Diameter[Vector, Conic]: 圆锥曲线 $c$ 平行于向量 $v$ 的径。 + +# 【准线】 + +Directrix[Parabola]: 建立抛物线 $p$ 的准线。 + +# 【极线】 + +Polar[Point, Conic]: 建立相对于圆锥曲线 $c$ 的极线。 + +# 【切线】 + +Tangent[Point, Conic]: 建立圆锥曲线 $c$ 过点 A 的所有切线。 + +Tangent[Line, Conic]: 建立圆锥曲线 $c$ 的所有平行于直线 $g$ 的切线。 + +Tangent[Number a, Function]: 建立 $f(x)$ 在 $x = a$ 的切线。 + +Tangent[Point A, Function]: 建立 $f(x)$ 在 $x = x(A)$ 的切线。 + +# 6. 多边形 + +Polygon[Point A, Point B, Point C,...]: 生成由给定点 A, B, C……所围成的多边形。 + +Polygon[Point A, Point B, Number n]: 以线段 AB 为边长,生成由 n 个顶点的正多边形。 + +# 7. 
向量 + +# 【向量】 + +Vector[Point A, Point B]: 从点A到点B的向量。 + +Vector[Point]: 点 A 的位置向量。 + +# 【单位向量】 + +UnitVector[Line]: 直线 $\mathbf{g}$ 的单位方向向量。 + +UnitVector[Vector]: 与向量 v 同方向的单位向量。 + +# 【法向量】 + +PerpendicularVector[Line]:求直线 $\mathbf{g}$ 的法向量。如:直线 $\mathrm{ax + by = c}$ 的法向量为(a,b)。 + +PerpendicularVector[Vector v]: 求向量 $\mathbf{v}$ 的法向量。 + +# 【单位法向量】 + +UnitPerpendicularVector[Line]:求直线 $\mathbf{g}$ 的单位法向量。 + +UnitPerpendicularVector[Vector]: 求向量 $\mathbf{v}$ 的单位法向量 + +# 【曲率向量】 + +CurvatureVector[Point, Function]: 求函数 $f$ 在点 $A$ 的曲率向量。 + +CurvatureVector[Point, Curve]: 求曲线 $c$ 在点 A 的曲率向量。 + +# 8. 函数 + +# 【函数】 + +Function[Function, Number a, Number b]: 生成函数 f 在 [a,b] 区间上函数图像。如: + +$\mathrm{f(x) =}$ Function[x^2, -1, 1]得到函数 $\mathbf{x}^2$ 在区间[-1,1]上的函数图像。如果输入 $\mathrm{g(x)} = 2\mathrm{f(x)}$ 将会得到 $\mathrm{g(x)} = 2\mathrm{x}^2$ ,但这个函数的定义域并不限制在[-1,1]区间上。 + +# 【条件函数】 + +如果要建立条件函数可以使用布朗命令 If。如: + +$$ +\mathrm {f (x) = I f [ x < 3 , \sin (x) , x ^ {\wedge} 2 ] \text {相 当 于 :} f (x) = \left\{ \begin{array}{l l} {\sin (x), \text {当} x < 3} \\ {x ^ {2}, \text {当} x \geq 3} \end{array} \right.} +$$ + +# 【导数(微分)】 + +Derivative[Function]: 求函数 $f(x)$ 的导数(微分)。 + +Derivative[Function, Number n]: 求求函数 $f(x)$ 的 $n$ 阶导数( $n$ 次微分)。 + +# 【积分】 + +Integral[Function]: 求函数 $f(x)$ 的不定积分。 + +# 【多项式】 + +Polynomial[Function]: 求函数 $f$ 的展开式。 + +Polynomial[List of n points]: 建立经过 $\mathbf{n}$ 个点的 $\mathrm{n - 1}$ 次多项式。 + +# 【展开式】 + +Expand[Function]: 将式子按乘法展开。 + +# 【因式】 + +Factor[Polynomial]: 将多项式转换成因式乘法形式。 + +# 【化简】 + +Simplify[Function]: 化简给定的函数。如: + +- Simplify $[\mathrm{x} + \mathrm{x} + \mathrm{x}]$ 得到 $\mathrm{f(x)} = 3\mathrm{x}$ +- Simplify $\left[\sin (\mathrm{x}) / \cos (\mathrm{x})\right]$ 得到 $\mathrm{f(x)} = \tan (\mathrm{x})$ +- Simplify $[-2\sin(x)\cos(x)]$ 得到 $f(x) = \sin(-2x)$ + +# 【泰勒展开式】 + +TaylorPolynomial[Function, Number a, Number n]: 建立函数 $f(x)$ 在点 $x = a$ 的 $n$ 次泰勒展开式。 + +# 9. 
圆锥曲线 + +# 【圆】 + +Circle[Point M, Number r]: 建立圆心为 M 半径为 r 的圆。 + +Circle[Point M, Segment]: 建立圆心为 M 半径为线段 s 长度的圆(参考先确定半径再画圆的绘图工具)。 + +Circle[Point M, Point A]: 建立圆心为 M 且过点 A 的圆。 + +Circle[Point A, Point B, Point C]: 建立过三点 A, B, C 的圆。 + +# 【圆锥曲线】 + +Conic[Point A, Point B, Point C, Point D, Point E]: 得到过五点 A, B, C, D, E 的圆锥曲线。 + +# 【椭圆】 + +Ellipse[Point F, Point G, Number a]: 建立焦点为 F 和 G 且半长轴长为 a 的椭圆。 + +Ellipse[Point F, Point G, Segment]: 建立焦点为 F 和 G 且半长轴长等于线段 s 长度的椭圆。 + +Ellipse[Point A, Point B, Point C]: 建立焦点为 A 和 B 且过点 C 的椭圆。 + +# 【双曲线】 + +Hyperbola[Point F, Point G, Number a]: 建立焦点为 F 和 G 且半长轴长为 a 的双曲线。 + +Hyperbola[Point F, Point G, Segment]: 建立焦点为 F 和 G 且半长轴长等于线段 s 长度的双曲线。 + +Hyperbola[Point A, Point B, Point C]: 建立焦点为 A 和 B 且过点 C 的双曲线。 + +# 【抛物线】 + +Parabola[Point F, Line g]: 建立焦点为 F 准线为 g 的抛物线。 + +# 【密切圆】 + +OsculatingCircle[Point, Function]: 建立函数 f 在点 A 的密切圆。 + +OsculatingCircle[Point, Curve]: 建立曲线 c 在点 A 的密切圆。 + +# 10. 参数曲线 + +# 【曲线】 + +Curve[Expression e1, Expression e2, Parameter t, Number a, Number b]: 生成坐标为 (e1, e2) 的曲线,其中 e1, e2 为关于参数 t 的表达式。如: + +$$ +c = \text{Curve}[2\cos(t), 2\sin(t), t, 0, 2\pi] +$$ + +# 【参数曲线相关】 + +Curvature[Point, Curve]: 计算曲线上一点的曲率。 + +CurvatureVector[Point, Curve]: 生成曲线上一点的曲率向量。 + +Derivative[Function]: 计算函数 f(x) 的微分。 + +Derivative[Function, Number n]: 计算函数 f(x) 的 n 次微分。 + +Length[Curve, Number t1, Number t2]: 计算曲线在参数 t1 和 t2 之间的长度。 + +Length[Curve c, Point A, Point B]: 计算曲线 c 在点 A 和 B 之间的曲线长度。 + +# 11. 
圆弧和扇形 + +# 【弧】 + +Arc[Conic, Point A, Point B]: 得到介于圆锥曲线 c 上两点 A 和 B 之间的弧。 + +Arc[Conic, Number t1, Number t2]: 得到介于圆锥曲线 c 上两参数 t1 和 t2 之间的弧。 + +# 【圆弧】 + +CircularArc[Point M, Point A, Point B]: 建立以 M 点为圆心,起点为 A 终点为 B 的圆弧。 + +CircumcircularArc[Point A, Point B, Point C]: 建立通过 A, B, C 三点的圆弧。 + +# 【扇形】 + +CircularSector[Point M, Point A, Point B]: 建立以 M 点为圆心,起点为 A 终点为 B 的扇形。 + +CircumcircularSector[Point A, Point B, Point C]: 建立通过 A, B, C 三点的扇形。 + +Sector[Conic, Point A, Point B]: 建立介于圆锥曲线 c 上的两点 A 和 B 之间的圆锥曲线扇形区域。 + +Sector[Conic, Number t1, Number t2]: 建立介于圆锥曲线 c 上的两参数 t1 和 t2 之间的圆锥曲线扇形区域。 + +# 【半圆】 + +Semicircle[Point A, Point B]: 建立线段 AB 上的半圆。 + +# 12. 文字 + +# 【分数】 + +FractionText[Number]: 将数值转换成分数形式。 + +# 【LaTeX】 + +LaTeX[Object]: 得到用 LaTeX 表达的对象的文字。如:若 a = 2 且 f(x) = ax^2,那么 LaTeX[f] 得到 2x^2。 + +LaTeX[Object, Boolean]: 通过逻辑判断得到用 LaTeX 表达的对象的文字。如果为真,那么用数值代替变量,否则显示代数式。 + +解释:LaTeX 是一种基于 TeX 的排版系统,利用这种格式,即使使用者没有排版和程序设计的知识也可以充分发挥由 TeX 所提供的强大功能,能在几天,甚至几小时内生成很多具有书籍质量的印刷品。对于生成复杂表格和数学公式,这一点表现得尤为突出。因此它非常适用于生成高印刷质量的科技和数学类文档。 + +# 【Unicode】 + +LetterToUnicode["Letter"]: 将字符转化成 Unicode(统一码)。如:LetterToUnicode["a"] 得到 97。 + +TextToUnicode["Text"]: 将文字转化成 Unicode(统一码)。如:TextToUnicode["Some text"] 得到 {83, 111, 109, 101, 32, 116, 101, 120, 116}。 + +UnicodeToText[List of Integers]: 将 Unicode 转化成文字。如:UnicodeToText[{104, 101, 108, 108, 111}] 会得到文字“hello”。 + +解释:Unicode(统一码、万国码、单一码)是一种在计算机上使用的字符编码。它为每种语言中的每个字符设定了统一并且唯一的二进制编码,以满足跨语言、跨平台进行文本转换、处理的要求。 + +# 【名称】 + +Name[Object]: 得到绘图区中对象的名称。 + +# 【对象】 + +Object[Name of object as text]: 由名称文字得到对应的对象。该命令与“名称”命令相反。 + +# 【文字】 + +Text[Object]: 将对象的方程式转化成文字,显示最终的数值或字符结果。 + +Text[Object, Boolean]: 通过逻辑判断得到对象的文字。参考 LaTeX[Object, Boolean],LaTeX 命令得到的文字是用 LaTeX 语法表达的。 + +# 13. 
轨迹 + +Locus[Point Q, Point P]: 求点 Q 的轨迹(P 为控制点)。 + +# 14. 列表 + +# 【Append】 + +Append[List, Object]: 将对象加入列表中,成为新列表的最后一个元素。 + +Append[Object, List]: 将对象加入列表中,成为新列表的第一个元素。 + +# 【CountIf】 + +CountIf[Condition, List]: 计算符合条件的元素个数。如:CountIf[x < 3, {1, 2, 3, 4, 5}] 得到 2。 + +# 【元素】 + +Element[List, Number n]: 调用列表中的第 n 个元素。 + +# 【First】 + +First[List]: 得到列表中的第一个元素。 + +First[List, Number of elements]: 得到一个包含给定列表中前 n 个元素的新列表。 + +# 【Last】 + +Last[List]: 得到列表中的最后一个元素。 + +Last[List, Number of elements]: 得到一个包含给定列表中后 n 个元素的新列表。 + +# 【插入】 + +Insert[Object, List, Position]: 将对象插入到列表的指定位置。 + +# 【交集】 + +Intersection[List 1, List 2]: 将两个列表的公共部分生成一个新的列表。 + +# 【并集】 + +Union[List 1, List 2]: 合并两列表并去除重复的元素。 + +# 【迭代数列】 + +IterationList[Function, Number x0, Number n]: 可以得到 n + 1 个元素的列表,其中元素是由函数代入 x0 经过多次迭代生成的。如:定义 f(x) = x^2,输入 L = IterationList[f, 3, 2] 可以得到 L = {3, 9, 81}。 + +# 【合并】 + +Join[List 1, List 2, ...]: 将多个列表合并成一个列表。这种合并保留相同元素,不会重新排序。 + +Join[List of lists]: 将子列表合并得到一个更大的列表。新列表包含所有元素,并保留相同元素,不会重新排序。 + +# 【筛选】 + +KeepIf[Condition, List]: 筛选列表中符合条件的元素。如:KeepIf[x<3, {1, 2, 3, 4, 1, 5, 6}] 可得到新列表 {1, 2, 1}。 + +# 【长度】 + +Length[List]: 计算列表的长度,也就是元素个数。 + +# 【最小值】 + +Min[List]: 得到列表中最小的元素。 + +# 【最大值】 + +Max[List]: 得到列表中最大的元素。 + +# 【乘积】 + +Product[List of numbers]: 计算列表中所有数字的乘积。 + +# 【移除未定义对象】 + +RemoveUndefined[List]: 移除列表中没有定义的对象。 + +# 【排序】 + +Sort[List]: 对列表中的数值、文字或点做排序。如:Sort[{3,2,1}] 可得到列表 {1,2,3}。 + +# 【反序】 + +Reverse[List]: 倒序排列列表。 + +# 【求和】 + +Sum[List]: 计算列表中所有元素之和。该命令可以用于数值、点、向量、文字和函数。 + +# 【提取】 + +Take[List, Start position m, End position n]: 提取列表中第 m 个元素到第 n 个元素组成新的列表。 + +# 【序列】 + +Sequence[Expression, Variable i, Number a, Number b]: 生成一个序列,使用给定的表达式及变量 i 从 a 到 b 变化。如:L = Sequence[(i, 1), i, 1, 5] 得到的点序列,x 坐标变化从 1 到 5,如下图所示。 + +![](images/ed649149b5f40a7a0b3a113adec3c616709d5db2cb920691967b00e425649cfe.jpg) + +Sequence[Expression, Variable i, Number a, Number b, Number s]: 生成一个序列,使用给定的表达式及变量 i 从 a 到 b 变化,其中步长为 s。如:L = Sequence[(i, 1), i, 1, 5, 
0.5]得到的点序列,x坐标变化从1到5,步长为0.5。如下图所示。 + +![](images/613023049378af76a706e2f3073a1570187014fc1e03a9015155def461f3bd6a.jpg) + +# 15. 几何变换 + +# 【伸缩】 + +Dilate[Point A, Number, Point S]: 点A关于点S做指定比例的伸缩变换。 + +Dilate[Line, Number, Point S]: 直线关于点S做指定比例的伸缩变换。 + +Dilate[Conic, Number, Point S]: 圆锥曲线关于点S做指定比例的伸缩变换。 + +Dilate[Polygon, Number, Point S]: 多边形关于点S做指定比例的伸缩变换。 + +Dilate[Image, Number, Point S]: 图片关于点S做指定比例的伸缩变换。 + +# 【镜像】 + +Reflect[Point A, Point B]: 点 A 对点 B 做镜像。 + +Reflect[Line, Point]: 直线对指定点做镜像。 + +Reflect[Conic, Point]: 圆锥曲线对指定点做镜像。 + +Reflect[Polygon, Point]: 多边形对指定点做镜像。 + +Reflect[Image, Point]: 图片对指定点做镜像。 + +Reflect[Point, Line]: 点对指定直线做镜像。 + +Reflect[Line g, Line h]: 直线 $\mathbf{g}$ 对指定直线 $\mathrm{h}$ 做镜像。 + +Reflect[Conic, Line]: 圆锥曲线对指定直线做镜像。 + +Reflect[Polygon, Line]: 多边形对指定直线做镜像。 + +Reflect[Image, Line]: 图片对指定直线做镜像。 + +Reflect[Point, Circle]: 点对圆进行反演。 + +# 【旋转】 + +Rotate[Point, Angle]: 点关于原点旋转指定的角度。 + +Rotate[Vector, Angle]: 向量关于原点旋转指定的角度。 + +Rotate[Line, Angle]: 直线关于原点旋转指定的角度。 + +Rotate[Conic, Angle]: 圆锥曲线关于原点旋转指定的角度。 + +Rotate[Polygon, Angle]: 多边形关于原点旋转指定的角度。 + +Rotate[Image, Angle]: 图片关于原点旋转指定的角度。 + +Rotate[Point A, Angle, Point B]: 点A关于点B旋转指定的角度。 + +Rotate[Line, Angle, Point]: 直线关于指定点旋转指定的角度。 + +Rotate[Vector, Angle, Point]: 向量关于指定点旋转指定的角度。 + +Rotate[Conic, Angle, Point]: 圆锥曲线关于指定点旋转指定的角度。 + +Rotate[Polygon, Angle, Point]: 多边形关于指定点旋转指定的角度。 + +Rotate[Image, Angle, Point]: 图片关于指定点旋转指定的角度。 + +# 【平移】 + +Translate[Point, Vector]: 点沿指定向量平移。 + +Translate[Line, Vector]: 直线沿指定向量平移。 + +Translate[Conic, Vector]: 圆锥曲线沿指定向量平移。 + +Translate[Function, Vector]: 函数沿指定向量平移。 + +Translate[Polygon, Vector]: 多边形沿指定向量平移。 + +Translate[Image, Vector]: 图片沿指定向量平移。 + +Translate[Vector, Point]: 向量 v 平移到指定的点。 + +# 16. 
统计 + +# 【求和】 + +SigmaXX[List of numbers]: 计算给定数值的平方和。 + +SigmaXX[List of points]: 计算给定点列的 x 坐标的平方和。 + +SigmaXY[List of x-coordinates, List of y-coordinates]: 计算两点列的 x 与 y 坐标乘积的和。 + +SigmaXY[List of points]: 计算 x 坐标和 y 坐标的乘积和。 + +SigmaYY[List of Points]: 计算给定点列 y 坐标的平方和。 + +# 【中位数】 + +Median[List of numbers]: 得到列表中元素的中位数。 + +# 【众数】 + +Mode[List of numbers]: 得到列表中元素的众数。 + +# 【平均数】 + +Mean[List of numbers]: 计算列表中元素的平均数。 + +MeanX[List of points]: 计算列表中元素 x 坐标的平均数。 + +MeanY[List of points]: 计算列表中元素 y 坐标的平均数。 + +# 【四分位数】 + +Q1[List of numbers]: 得到列表元素中的第一四分位数。 + +Q3[List of numbers]: 得到列表元素中的第三四分位数。 + +# 【方差】 + +Variance[List of numbers]: 计算列表元素的方差。 + +# 【标准差】 + +SD[List of Numbers]: 计算列表中数值的标准差。 + +# 【相关系数】 + +CorrelationCoefficient[List of x-coordinates, List of y-coordinates]: 使用给定的 x 坐标和 y 坐标计算积矩相关系数。 + +CorrelationCoefficient[List of points]: 使用给定点的坐标计算积矩相关系数。 + +# 【协方差】 + +Covariance[List 1 of numbers, List 2 of numbers]: 计算两列表的协方差。 + +Covariance[List of points]: 计算 x 坐标和 y 坐标的协方差。 + +# 【正态分布函数】 + +Normal[Mean, Standard deviation, Variable value]: 计算 $\Phi\left(\frac{x - \text{mean}}{\text{standard deviation}}\right)$,其中 $\Phi$ 是 N(0,1) 分布的累积分布函数。 + +# 【反正态函数】 + +InverseNormal[Mean, Standard deviation, Probability]: 求给定正态分布中累积概率等于给定值的分位点。 + +# 【条形图】 + +BarChart[Start value, End value, List of heights]: 建立指定区间内的条形图,条形个数取决于列表长度,高度取决于元素大小。如:BarChart[10, 20, {1,2,3,4,5}] 得到如下图所示。 + +![](images/224cdd1698ca8fcc2a659466e16993666b441a67bcfba1c3585a72b3013ccb6c.jpg) + +BarChart[Start value a, End value b, Expression, Variable k, From number c, To number d]: 建立 [a,b] 之间的条形图,用含变量 k 的表达式表示条形高度,其中 k 从 c 到 d 变化。 + +BarChart[List of raw data, Width of bars]: 由给定数据生成条形图。可设定条宽。 + +BarChart[List of data, List of frequencies]: 由给定数据及对应频数建立条形图。 + +BarChart[List of data, List of frequencies, Width of bars w]: 由给定数据及对应频数建立条形图,并指定条宽。如:BarChart[{10,11,12,13,14}, {1,2,3,0,3}, 0.5] 得到如下图所示。 + 
+![](images/36fc680018e18c15146b3a80200e45ca049be64ec0e1a1f6851e9011ea15fbd8.jpg) + +# 【直方图】 + +Histogram[List of class boundaries, List of heights]: 由给定的高度建立直方图。 + +Histogram[List of class boundaries, List of raw data]: 使用给定的数据建立直方图。如:Histogram[{1, 2, 3, 4}, {1.0, 1.1, 1.1, 1.2, 1.7, 2.2, 2.5, 4.0}] 得到如下图所示。 + +![](images/c6e2cd20642e4f3fa25c179a451b968e4763a637caa001fa0af0475601f080b0.jpg) + +# 【盒形图】 + +BoxPlot[yOffset, yScale, List of raw data]: 建立给定数据的盒形图;盒形图在坐标系中的垂直位置由 yOffset 控制,高度由 yScale 控制。 + +BoxPlot[yOffset, yScale, Start value, Q1, Median, Q3, End value]: 在区间上建立盒形图。 + +# 【线性回归】 + +FitLine[List of points]: 计算 y 关于 x 的线性回归直线。 + +# 【其他回归】 + +FitExp[List of points]: 计算指数回归曲线。 + +FitLineX[List of points]: 计算 x 关于 y 的线性回归直线。 + +FitLog[List of points]: 计算对数回归曲线。 + +FitLogistic[List of points]: 计算 Logistic 回归曲线。 + +FitPoly[List of points, Degree n of polynomial]: 计算 n 次多项式的回归方程。 + +FitPow[List of points]: 计算 $ax^{b}$ 型的回归曲线。 + +FitSin[List of points]: 计算 $a + b \sin (cx + d)$ 型的回归曲线。 + +# 17. 电子表格 + +# 【行号】 + +Row[Spreadsheet cell]: 得到非空单元格的行号。如:若 B3 非空,那么 Row[B3] 可以得到数值 a = 3,因为 B3 在第三行。 + +# 【列号】 + +Column[Spreadsheet cell]: 得到非空单元格的列号。如:若 B3 非空,那么 Column[B3] 可以得到数值 a = 2,因为 B 是第二列。 + +# 【生成列表】 + +CellRange[Start cell, End cell]: 由指定范围的单元格生成列表。 + +# 18. 逻辑命令 + +# 【If】 + +If[Condition, Object]: 如果条件式为真,可以得到 Object;如果条件式为假,得到未定义对象(undefined)。 + +If[Condition, Object a, Object b]: 如果条件式为真,可以得到 Object a;如果条件式为假,得到 Object b。 + +# 【IsDefined】 + +IsDefined[Object]: 根据对象是否有定义,返回真假值。 + +# 【IsInteger】 + +IsInteger[Number]: 根据对象是否为整数,返回真假值。 + +# 第六章 菜单 + +# 1. 
文件 + +# 【新窗口】 + +点击该项可以新建 GeoGebra 窗口,窗口布局符合之前的预设。 + +# 【新建】 + +点击该选项可以在同一个 GeoGebra 窗口内新建空白界面,在开启新的文件前,程序会询问是否保存当前文件。 + +# 【打开】 + +点击该选项可以用来打开电脑中的 GeoGebra 文件,文件后缀为 .ggb。 + +# 【保存】 + +点击该选项可以将文件存储为 GeoGebra 文件,文件后缀为 .ggb。 + +# 【另存为】 + +点击该选项可以将文件另存为新的 GeoGebra 文件,程序会要求输入新的文件名。 + +# 【导出】 + +点击该选项并选择下一级选项可以将文件保存成其他格式,如:网页、图片。 + +有下列图片格式可供选择: + +PNG——便携式网络图形格式(Portable Network Graphics)是一种位图文件(bitmap file)存储格式。PNG 格式图片因其高保真性、透明性及文件体积较小等特性,被广泛应用于网页设计、平面设计中。网络通讯中因受带宽制约,在保证图片清晰、逼真的前提下,不能大范围地使用文件较大的 bmp、jpg 格式文件,所以 PNG 格式文件自诞生之日起就大行其道。 + +EPS——EPS(Encapsulated PostScript)是处理图像工作中最重要的格式之一,它在 Mac 和 PC 环境下的图形和版面设计中广泛使用。EPS 文件是目前桌面印前系统普遍使用的通用交换格式当中的一种综合格式,大部分排版和图形处理软件都可以使用它。EPS 文件可以应用于排版、设计。 + +PSTricks——方便 LaTeX 使用的图片格式。 + +PGF/TikZ——方便 LaTeX 使用的图片格式。 + +# 【打印预览】 + +点击该选项可以打开“打印预览”窗口,可以输入“标题”“作者”“日期”并设置显示比例。 + +# 【关闭】 + +点击该选项可以关闭 GeoGebra 窗口。 + +# 2. 编辑 + +# 【撤销】 + +点击该选项可以撤销上一个动作。工具栏右侧也有同样功能的按钮。 + +# 【重做】 + +点击该选项可以重做下一个动作。工具栏右侧也有同样功能的按钮。 + +# 【删除】 + +点击该选项可以删除选中的对象以及与之相关的对象。 + +# 【选择全部】 + +点击该选项可以选取所有对象。 + +# 【选择当前层】 + +点击该选项可以选取在同一层的对象。使用该选项前需要先点选一个对象作为图层的依据。 + +# 【选择后继】 + +点击该选项可以选取所有与该对象有关的派生对象。 + +# 【选择祖先】 + +点击该选项可以选取该对象的所有祖先对象。 + +# 【属性】 + +点击该选项可以打开属性对话框,供修改对象属性。 + +# 3. 查看 + +# 【坐标轴】 + +点击该选项可以显示或隐藏绘图区中的坐标轴。 + +# 【网格】 + +点击该选项可以显示或隐藏绘图区中坐标的网格。 + +# 【代数区】 + +勾选该选项可以显示代数区。 + +# 【电子表格】 + +勾选该选项可以显示电子表格。 + +# 【辅助对象】 + +勾选该选项可以显示代数区中的辅助对象。 + +# 【窗口左右并排】 + +勾选该选项则窗口布局为左右排列,否则为上下排列。 + +# 【命令框】 + +勾选该选项可以显示命令框,命令框位于 GeoGebra 窗口底部。 + +# 【命令列表】 + +勾选该选项可以显示命令框中的命令列表。 + +# 【作图过程】 + +点击该选项可以打开“作图过程”窗口。 + +# 【作图过程导航条】 + +勾选该选项可以显示作图过程导航条。 + +# 【刷新视图】 + +点击该选项可以重新整理所有视图,清除所有点或线的痕迹。 + +# 【重新计算所有对象】 + +点击该选项可以重新计算 GeoGebra 文件中的所有对象。 + +# 4. 
选项 + +# 【吸附格点】 + +该选项可以选择开启或关闭点吸附格点的功能。如果选择“自动”,当显示网格时吸附格点功能将开启,不显示网格时则该功能关闭。 + +# 【角的单位】 + +该选项可以选择角的单位是角度或是弧度。 + +# 【数值近似】 + +该选项可以选择保留多少位小数。 + +# 【移动连续性】 + +可以选择开启或关闭移动连续性。 + +# 【点的样式】 + +点击该选项可以选择点的样式,GeoGebra 中有 7 种样式可供选择。 + +# 【坐标】 + +该选项可以选择坐标的显示方式,共有三种形式:A = (x,y);A(x|y);A:(x,y)。 + +# 【复选框大小】 + +点击该选项可以选择复选框的大小。 + +# 【直角样式】 + +该选项可以选择如何来标记直角:矩形、点、或一般。 + +# 【对象标签】 + +该选项可以选择是否显示新增对象的标签。可供选择的有:自动、显示新对象标签、隐藏新对象标签或只显示新点的标签。 + +# 【字体大小】 + +该选项可以调整标签及文字的字体大小。 + +# 【语言】 + +GeoGebra 有多种语言可供选择,可以通过该选项设定语言。 + +# 【绘图区】 + +点击该选项可以打开绘图区属性对话框,可以修改坐标轴、网格以及背景颜色。 + +# 【保存设定】 + +点击该选项可以保存对于 GeoGebra 的偏好设定,下次打开软件时菜单栏、工具栏和绘图区都将根据之前的设定显示。 + +# 【恢复默认设置】 + +点击该选项可以恢复 GeoGebra 的默认设置。 + +# 5. 工具 + +# 【新工具】 + +在 GeoGebra 中可以根据现有的构图,建立自定义工具。在出现的对话框中,可以设定工具输入输出的对象,并且选择工具栏显示的名称和图标。 + +# 【工具管理】 + +点击该选项将会打开“工具管理”对话框,可以删除工具或者修改工具的名称和图标。也可以将选取的工具存储为 .ggt 格式。 + +# 【自定义工具栏】 + +点击该选项,可以自定义工具栏中的图标。可以限制一些工具出现在工具栏中。 + +# 6. 窗口 + +# 【新窗口】 + +点击该选项可以建立新的 GeoGebra 窗口。 + +# 7. 帮助 + +# 【帮助】 + +点击该选项可以打开 GeoGebra 的在线帮助文件。如果是离线安装的 GeoGebra 则可以离线查看该文件,如果是在线安装的 GeoGebra 则需要在线查看。 + +# 【网站】 + +点击该选项可以打开 GeoGebra 的官方网站。(www.geogebra.org) + +# 【论坛】 + +点击该选项可以打开 GeoGebra 的用户论坛。(www.geogebra.org/forum) + +# 【GeoGebra 维基】 + +点击该选项可以打开 GeoGebra 维基。(www.geogebra.org/wiki) + +# 【关于/版权宣告】 + +点击该选项可以打开关于 GeoGebra 版权信息的窗口,以及鸣谢以不同方式支持 GeoGebra 计划的人员。 + +# 第七章 GeoGebra 的特性 + +# 1. 动画 + +# 1.1. 自动动画 + +GeoGebra 可以实现一个或多个自变量或者角度的动画效果,这种动画效果可以在绘图区中通过滑杆来实现。 + +在 GeoGebra 中,如果要实现自动动画效果,需要在滑杆上点击鼠标右键,在右键菜单中勾选“开启动画”。在“属性”中可以设定动画的一些行为: + +![](images/19ffe397175b0dba24727d89e45f483c1638a670f696834f08d9ab186bd2ec05.jpg) + +1. 控制动画的速度。当速度设定为 1 时,表示完成滑杆上的一个区间需要 10 秒。 +2. 改变动画的重复方式。有: + +- <=>:往复振荡 +- =>:递增 +- <=:递减 + +# 1.2. 手动动画 + +所谓“手动”就是使用移动工具来实现动画效果,点选要改变的变量或角度,然后使用键盘上的 + 或 - 键,或者上下方向键,持续按住来实现手动变化效果。 + +# 2. 
显示条件 + +在 GeoGebra 中除了可以直接显示或隐藏某个对象,也可以设定对象的显示隐藏条件。例如,在绘图区中新增一个复选框,当勾选时某对象才会出现;又或者设定当滑杆变化到某一个值时对象才会出现。 + +可以选用绘图工具中的“对象群组隐藏显示按钮”来建立一个复选框,利用勾选复选框来显示一个或多个对象。此外,也可以在命令框里输入一个布尔函数,通过这个函数的逻辑值来改变对象在绘图中的显示或隐藏。 + +在对象上点击鼠标右键,打开“属性”对话框,在“高级”选项卡中可以设定对象的显示条件。在设定对象的显示条件时,可以使用右侧的下拉列表,选用各种逻辑符号。如:$\neq$、$\geq$、$\land$ 等。例子如下: + +- 如果 a 是数值滑杆,在某对象的显示条件中输入 a < 2,那么当 a 滑到小于 2 的时候该对象才会显示在绘图区。如下图所示,当 a 大于 2 时,圆隐藏;当 a 小于 2 时,圆显示。 + +![](images/0723cef93dde6cfe62119524aad193e729e6d06ff99cf18a422118e82908fb11.jpg) + +- 如果 b 是布尔变量,可以进行逻辑运算,当 b 的值为 true 时,对象显示;当 b 的值为 false 时,对象隐藏。 +- 如果 g 和 h 是两条直线,可以使用 $g \parallel h$ 作为条件,当两直线平行时对象才会显示。 + +# 3. 自定义工具 + +在 GeoGebra 中可以根据用户的需要自己建立绘图工具。 + +# 3.1. 建立自定义工具 + +首先建立工具绘制后的图像,然后点击“工具”菜单中的“新工具”,在对话框中填写输入对象、输出对象、名称与图标等信息,建立自定义工具。 + +![](images/147afd8c1e943deffd4fde44e29da4b1b6badf1f445a22f85bc44a5b173afea6.jpg) + +例如:建立一个可以生成单位圆的工具。希望通过在绘图区点选一个点或点击绘图区空白处就可以画出一个单位圆。 + +- 首先画出一个单位圆。建立一个自由点 A,然后选择绘图工具“圆(指定圆心与半径)”画出一个半径为 1 的圆(c)。 + +- 点击“工具”菜单中的“新工具”。 +- 选择输出对象:点击下拉菜单,选择“圆 c”。 +- 选择输入对象:GeoGebra 会自动填充输入对象(此处为点 A)。如果要选择其他,也可以在下拉菜单中选择。 +- 输入名称:输入工具或命令的名称“画单位圆”。 +- 工具说明:如有必要,可以填写工具说明,解释如何使用这个自定义工具。 +- 在工具栏中显示:勾选这个选项,可以让自定义工具在工具栏中出现。 +- 图标:可以选择电脑中的图片作为该工具的图标,图片会自动调整到与其他绘图工具图标类似的按钮大小。 + +![](images/b96a9a5976e42ce3ce4bacf4f1363a2ece55793dc99ba97e7cff113c832931a9.jpg) + +在工具栏中选中“画单位圆”工具,点击绘图区空白处或者点选已有点,就可以画出一个单位圆了。 + +# 3.2. 保存自定义工具 + +自定义工具可以存成单独的文件,分享供其他的 GeoGebra 用户使用。 + +点击“工具”菜单中的“工具管理”,在菜单中选择要保存的工具,点击“另存为”来存储工具。工具将保存为 .ggt 格式,而一般的 GeoGebra 文件为 .ggb 格式。 + +![](images/ef33a4b4f9ed4a95da03c3d31d4755109be81ba39ccfeeb7d194e1cba62ea707.jpg) + +# 3.3. 使用自定义工具 + +点击“文件”菜单中的“新建”,将打开一个新的 GeoGebra 文件,而自定义工具还会出现在工具栏中。但是,如果关闭了 GeoGebra 软件,自定义工具可能将不再出现在工具栏中。有几种方法可以重新调用自定义工具: + +- 在建立自定义工具后,点击“选项”菜单中的“保存设定”,那么自定义工具将会一直出现在工具栏中。 + +- 如果已经将自定义工具另存为了 .ggt 格式,可以点击“文件”菜单中的“打开”选择要用的自定义工具,打开后自定义工具将会出现在工具栏中。打开工具文件不会影响已有的绘图区,只会在工具栏中新增一个工具。 + +# 4. 
动态颜色 + +使用鼠标右键点击对象,打开“属性”对话框,可以设置对象的颜色。在“高级”选项卡中可以对对象设定动态颜色。 + +![](images/56b41731de0942edcab8fcf94a007b1a261badf3189cb08935185a46e4d20f4f.jpg) + +可以在输入框中输入 0 到 1 之间的数值或函数,对象颜色将会由 RGB(红绿蓝)三者的配色决定。例如: + +- 建立三个滑杆 a, b, c,并设置它们的变化范围为 0 到 1。 +- 使用“多边形”工具画出一个多边形。 +- 在“属性”对话框中选择“高级”选项卡,在三个颜色后分别输入 a, b, c。 +- 关闭对话框,滑动滑杆,可以发现多边形的颜色也随着变化。 + +![](images/75eae72b4716b8b6e440600c1475716bcb4078e383afc99a297fb8578163bce0.jpg) + +说明:这里使用的实际是 RGB 配色原理。RGB 色彩模式是一种颜色标准,是通过对红(R)、绿(G)、蓝(B)三个颜色通道的变化以及它们相互之间的叠加来得到各式各样的颜色的,这个标准几乎包括了人类视力所能感知的所有颜色,是目前运用最广的颜色系统之一。 + +# 5. JavaScript + +JavaScript 是一种广泛用于客户端 Web 开发的脚本语言,常用来给 HTML 网页添加动态功能。GeoGebra 提供的 JavaScript 界面对于会使用 HTML 源代码编辑网页的使用者来说是一个非常便利有趣的功能。 + +在 GeoGebra 导出的动态网页中,GeoGebra 提供了一个 JavaScript 界面,可以方便地在网页中添加按钮等等,来实现动态效果。 + +# 6. 对象名称与标签 + +# 6.1. 显示隐藏对象名称 + +在绘图区中有几个方法可以显示或隐藏对象的名称: + +- 点击工具栏中的“显示或隐藏标签”工具,然后点击要显示或隐藏名称的对象。 +- 使用鼠标右键点击对象,勾选或取消“显示标签”。 + +- 使用鼠标右键点击对象,打开“属性”对话框,在“基本”选项卡中勾选或取消“显示标签”。 + +# 6.2. 名称和数值 + +在 GeoGebra 中每个对象都有唯一的名称来标示,可以选择显示对象的名称或数值。使用鼠标右键点击对象,打开“属性”对话框,在“基本”选项卡中,可以选择显示“名称”“名称和值”“数值”和“标签文字”四种模式。 + +![](images/06deee00f7fa17682afe22ce3ea5b9de260f6a358e7118001cf337f9fc421a3c.jpg) + +在 GeoGebra 中,一个点的数值即为其坐标,一个函数的数值为其方程式。 + +# 6.3. 标签文字 + +GeoGebra 对所有的对象提供标签标示,可以附加在对象上。使用鼠标右键点击对象,打开“属性”对话框,在“基本”选项卡中,输入标签文字并选择显示“标签文字”即可。 + +# 7. 图层 + +在绘图区中可能会出现多个对象相覆盖的情况,图层的选定可以决定哪些对象可以被操作。 + +默认情况下,所有在绘图区中绘制的对象都在图层 0,也就是绘图区的背景层。GeoGebra 一共提供了 10 个图层,从 0 到 9,数字越大越在上层。使用鼠标右键点击对象,打开“属性”对话框,在“高级”选项卡中,可以更改对象所处的图层。 + +在 GeoGebra 中可以方便地选取同一层的所有对象,点选某一图层的一个对象,然后在“编辑”菜单中选择“选择当前层”,就可以选择该图层的所有对象了。 + +# 8. 重新定义 + +在绘图区中修改对象时,使用鼠标左键双击对象,可以弹出“重新定义”对话框,由此可以方便地修改对象。 + +![](images/8f4b5cdbbe1197563e9ccac19623ef34b9e47d20df4bc2a1d45efa5643cebca5.jpg) + +例子如下: + +- 如果要让一条通过点 A 和 B 的直线变成线段 AB,可以打开对象的“重新定义”对话框,然后修改表达式为 Segment[A,B] 即可。 + +固定对象是无法重新定义的,如果要重新定义固定对象,可以在“属性”对话框中的“基本”选项卡下取消“固定对象”的勾选。 + +# 9. 
痕迹与轨迹 + +在 GeoGebra 中,可以让对象移动时在绘图区中留下轨迹。在对象上点击右键,在右键菜单中点击“显示移动痕迹”。此后修改对象时,GeoGebra 会追踪对象的位置变化并留下痕迹。 + +此外,还可以让 GeoGebra 自行建立点的运动轨迹。使用工具栏中的“轨迹”工具,或在命令框中输入“Locus”命令。运动轨迹必须依赖另外一点的移动。 + +例子如下: + +- 建立点 A = (-1, -1) 和点 B = (1, -1),并建立线段 AB。 +- 在线段上取一点 C,点 C 只能沿线段移动。 +- 建立一点 P,与 C 相关:P = (x(C), x(C)^2)。 +- 使用工具或命令来建立点 P 的轨迹。 + +![](images/0063264b982a9f8ac8123498ca0901c9eba39ec31dfb5d20edd1baff444c2f60.jpg) diff --git a/docs/ref/wiki/geogrebra-llms.md b/docs/ref/wiki/geogrebra-llms.md new file mode 100644 index 00000000..a9e0f059 --- /dev/null +++ b/docs/ref/wiki/geogrebra-llms.md @@ -0,0 +1,7 @@ +## This page describes the GeoGebra Apps API to interact with GeoGebra apps. Please see GeoGebra Apps Embedding on how to embed our apps into your web pages. + +- https://geogebra.github.io/docs/reference/en/GeoGebra_Apps_API/ + +## This page lists all parameters that can be used to configure GeoGebra Apps. Please see GeoGebra Apps Embedding to learn how to embed GeoGebra apps into your pages and where these parameters can be used. + +- https://geogebra.github.io/docs/reference/en/GeoGebra_App_Parameters/ diff --git a/docs/ref/wiki/langchain-llms.txt b/docs/ref/wiki/langchain-llms.txt new file mode 100644 index 00000000..9100a70d --- /dev/null +++ b/docs/ref/wiki/langchain-llms.txt @@ -0,0 +1,149 @@ + +# Guides + +- [LangGraph Documentation](https://langchain-ai.github.io/langgraph/index/): This page provides an overview of the LangGraph project, including its logo and essential scripts for functionality within MkDocs. It also includes a reference to the README.md file for detailed information about the project. The content is designed to be user-friendly and visually appealing. 
+- [LangGraph Quickstart Guide](https://langchain-ai.github.io/langgraph/agents/agents/): This quickstart guide provides step-by-step instructions for setting up and using LangGraph's prebuilt components to create agentic systems. It covers prerequisites, installation, agent creation, configuration of language models, and advanced features like memory and structured output. Ideal for developers looking to leverage LangGraph for building intelligent agents. +- [Getting Started with LangGraph: Building AI Agents](https://langchain-ai.github.io/langgraph/concepts/why-langgraph/): This page provides an overview of LangGraph, a platform designed for developers to create adaptable AI agents. It highlights key features such as reliability, extensibility, and streaming support, and offers a series of tutorials to help users build a support chatbot with various capabilities. By following the tutorials, developers will learn to implement essential functionalities like conversation state management and human-in-the-loop controls. +- [Building a Basic Chatbot with LangGraph](https://langchain-ai.github.io/langgraph/tutorials/get-started/1-build-basic-chatbot/): This tutorial guides you through the process of creating a basic chatbot using LangGraph. It covers prerequisites, installation of necessary packages, and step-by-step instructions to set up a state machine for the chatbot. By the end of the tutorial, you will have a functional chatbot that can engage in simple conversations. +- [Integrating Web Search Tools into Your Chatbot](https://langchain-ai.github.io/langgraph/tutorials/get-started/2-add-tools/): This tutorial guides you through the process of enhancing your chatbot's capabilities by integrating a web search tool, specifically the Tavily Search Engine. It covers prerequisites, installation, configuration, and the implementation of the search tool within a LangGraph-based chatbot. 
By the end, you'll have a functional chatbot that can retrieve real-time information to answer user queries beyond its training data. +- [Implementing Memory in Chatbots with LangGraph](https://langchain-ai.github.io/langgraph/tutorials/get-started/3-add-memory/): This page provides a comprehensive guide on how to add memory functionality to chatbots using LangGraph's persistent checkpointing feature. It details the steps to create a `MemorySaver` checkpointer, compile the graph, and interact with the chatbot to maintain context across multiple interactions. Additionally, it explains how to inspect the state of the chatbot and highlights the advantages of checkpointing over simple memory solutions. +- [Implementing Human-in-the-Loop Controls in LangGraph](https://langchain-ai.github.io/langgraph/tutorials/get-started/4-human-in-the-loop/): This page provides a comprehensive guide on adding human-in-the-loop controls to LangGraph workflows, enabling agents to pause execution for human input. It details the use of the `interrupt` function to facilitate user feedback and outlines the steps to integrate a `human_assistance` tool into a chatbot. Additionally, the tutorial covers graph compilation, visualization, and resuming execution with human input. +- [Customizing State in LangGraph for Enhanced Chatbot Functionality](https://langchain-ai.github.io/langgraph/tutorials/get-started/5-customize-state/): This tutorial guides you through the process of adding custom fields to the state in LangGraph, enabling complex behaviors in your chatbot without relying solely on message lists. You will learn how to implement human-in-the-loop controls to verify information before it is stored in the state. By the end of this tutorial, you will have a deeper understanding of state management and how to enhance your chatbot's capabilities. 
+- [Implementing Time Travel in LangGraph Chatbots](https://langchain-ai.github.io/langgraph/tutorials/get-started/6-time-travel/): This page provides a comprehensive guide on utilizing the time travel functionality in LangGraph to enhance chatbot interactions. It covers how to rewind, add steps, and replay the state history of a chatbot, allowing users to explore different outcomes and fix mistakes. Additionally, it includes code snippets and practical examples to help developers implement these features effectively. +- [LangGraph Deployment Options](https://langchain-ai.github.io/langgraph/tutorials/deployment/): This page outlines the various options available for deploying LangGraph applications, including local testing and different cloud-based solutions. It details free deployment methods such as Local, as well as production options like Cloud SaaS and self-hosted solutions. Each deployment method is linked to further documentation for in-depth guidance. +- [Agent Development with LangGraph](https://langchain-ai.github.io/langgraph/agents/overview/): This page provides an overview of agent development using LangGraph, highlighting its prebuilt components and capabilities for building agent-based applications. It explains the structure of an agent, key features such as memory integration and human-in-the-loop control, and outlines the package ecosystem available for developers. With LangGraph, users can focus on application logic while leveraging robust infrastructure for state management and feedback. +- [Guide to Running Agents in LangGraph](https://langchain-ai.github.io/langgraph/agents/run_agents/): This page provides a comprehensive overview of how to execute agents in LangGraph, detailing both synchronous and asynchronous methods. It covers input and output formats, streaming capabilities, and how to manage execution limits to prevent infinite loops. Additionally, it includes code examples and links to further resources for deeper understanding. 
+- [Streaming Data in LangGraph](https://langchain-ai.github.io/langgraph/agents/streaming/): This page provides an overview of streaming data types in LangGraph, including agent progress, LLM tokens, and custom updates. It includes code examples for both synchronous and asynchronous streaming methods. Additionally, it covers how to stream multiple modes and disable streaming when necessary. +- [Configuring Chat Models for Agents](https://langchain-ai.github.io/langgraph/agents/models/): This page provides detailed instructions on how to configure various chat models for use with agents in LangChain. It covers model initialization, tool calling support, and how to specify models from different providers such as OpenAI, Anthropic, Azure, Google Gemini, and AWS Bedrock. Additionally, it includes information on disabling streaming, adding model fallbacks, and links to further resources. +- [Using Tools in LangChain](https://langchain-ai.github.io/langgraph/agents/tools/): This page provides an overview of how to define, customize, and manage tools within the LangChain framework. It covers creating simple tools, handling tool errors, and utilizing prebuilt integrations for enhanced functionality. Additionally, it discusses advanced features such as memory management and controlling tool behavior during agent execution. +- [Integrating MCP with LangGraph Agents](https://langchain-ai.github.io/langgraph/agents/mcp/): This page provides a comprehensive guide on how to integrate the Model Context Protocol (MCP) with LangGraph agents using the `langchain-mcp-adapters` library. It includes installation instructions, example code for using MCP tools, and guidance on creating custom MCP servers. Additional resources for further reading on MCP are also provided. 
+- [Understanding Context in LangGraph Agents](https://langchain-ai.github.io/langgraph/agents/context/): This page provides an overview of how to supply context to agents in LangGraph, detailing the three primary types: Config, State, and Long-Term Memory. It explains how to use these context types to enhance agent behavior, customize prompts, and access context in tools. Additionally, it includes code examples for implementing context in various scenarios. +- [Understanding Memory in LangGraph for Conversational Agents](https://langchain-ai.github.io/langgraph/agents/memory/): This documentation page provides an overview of the two types of memory supported by LangGraph: short-term and long-term memory. It explains how to implement these memory types in conversational agents, including code examples and best practices for managing message history. Additionally, it covers the use of persistent storage and tools for enhancing memory functionality. +- [Implementing Human-in-the-Loop in LangGraph](https://langchain-ai.github.io/langgraph/agents/human-in-the-loop/): This documentation page provides a comprehensive guide on how to implement Human-in-the-Loop (HIL) features in LangGraph, allowing for human review and approval of tool calls in agents. It covers the use of the `interrupt()` function to pause execution for human input, along with practical examples and code snippets. Additionally, it explains how to create a wrapper to add HIL capabilities to any tool seamlessly. +- [Building Multi-Agent Systems](https://langchain-ai.github.io/langgraph/agents/multi-agent/): This page provides an overview of multi-agent systems, detailing how to create and manage them using supervisor and swarm architectures. It includes practical examples of implementing a flight and hotel booking assistant using the LangGraph libraries. Additionally, the page explains the concept of handoffs between agents, allowing for seamless communication and task delegation. 
+- [Evaluating Agent Performance with LangSmith](https://langchain-ai.github.io/langgraph/agents/evals/): This page provides a comprehensive guide on how to evaluate the performance of agents using the LangSmith evaluations framework. It includes instructions on defining evaluator functions, utilizing prebuilt evaluators from the AgentEvals package, and running evaluations with specific datasets. Additionally, it covers different evaluation techniques, including trajectory matching and using LLMs as judges.
+- [Deploying Your LangGraph Agent](https://langchain-ai.github.io/langgraph/agents/deployment/): This page provides a comprehensive guide on how to deploy a LangGraph agent, including setting up a LangGraph app for both local development and production. It covers essential features, installation steps, and configuration requirements, along with instructions for launching the local server and utilizing the LangGraph Studio Web UI for debugging. Additionally, it offers links to further resources for deployment options.
+- [Agent Chat UI Documentation](https://langchain-ai.github.io/langgraph/agents/ui/): This page provides comprehensive guidance on using the Agent Chat UI for interacting with LangGraph agents. It covers setup instructions, features like human-in-the-loop workflows, and the integration of generative UI components. Users can find links to relevant resources and tips for customizing their chat experience.
+- [Overview of Agent Architectures in LLM Applications](https://langchain-ai.github.io/langgraph/concepts/agentic_concepts/): This page provides a comprehensive overview of various agent architectures used in large language model (LLM) applications, highlighting their control flows and functionalities. It discusses key concepts such as routers, tool-calling agents, memory management, and planning, along with customization options for specific tasks. Additionally, it covers advanced features like human-in-the-loop, parallelization, subgraphs, and reflection mechanisms to enhance agent performance.
+- [Understanding Workflows and Agents in LangGraph](https://langchain-ai.github.io/langgraph/tutorials/workflows/): This documentation page provides an in-depth overview of workflows and agents within LangGraph, highlighting their differences and use cases. It covers various patterns for building agentic systems, including setup instructions, building blocks, and advanced concepts like prompt chaining, parallelization, and routing. Additionally, it offers practical examples and code snippets to help users implement these workflows effectively.
+- [Understanding LangGraph: Core Concepts and Components](https://langchain-ai.github.io/langgraph/concepts/low_level/): This documentation page provides an in-depth overview of the core concepts of LangGraph, focusing on how agent workflows are modeled as graphs. It covers essential components such as States, Nodes, and Edges, and explains how they interact to create complex workflows. Additionally, it discusses graph compilation, message handling, and configuration options to enhance the functionality of your graphs.
+- [LangGraph Runtime Overview](https://langchain-ai.github.io/langgraph/concepts/pregel/): This page provides a comprehensive overview of the LangGraph runtime, specifically focusing on the Pregel execution model. It details the structure and functionality of actors and channels within the Pregel framework, along with examples of how to implement applications. Additionally, it introduces high-level APIs for creating Pregel applications using StateGraph and Functional API.
+- [Using the LangGraph API: A Comprehensive Guide](https://langchain-ai.github.io/langgraph/how-tos/graph-api/): This documentation provides a detailed overview of how to utilize the LangGraph Graph API, covering essential concepts such as state management, node creation, and control flow. It includes practical examples for building sequences, branches, and loops, as well as advanced features like retry policies and async execution. Additionally, the guide offers insights into visualizing graphs and integrating with external tools.
+- [LangGraph Streaming System](https://langchain-ai.github.io/langgraph/concepts/streaming/): This page provides an overview of the streaming capabilities of LangGraph, enabling real-time updates for enhanced user experiences. It details the types of data that can be streamed, including workflow progress, LLM tokens, and custom updates. Additionally, it outlines various functionalities and modes available for streaming within the LangGraph framework.
+- [Streaming Outputs in LangGraph](https://langchain-ai.github.io/langgraph/how-tos/streaming/): This documentation page provides an overview of how to utilize the streaming capabilities of LangGraph, including synchronous and asynchronous streaming methods. It covers various stream modes, such as updates, values, and custom data, along with examples of how to implement them in your graphs. Additionally, it discusses the integration of Large Language Models (LLMs) and how to handle streaming outputs effectively.
+- [LangGraph Persistence and Checkpointing](https://langchain-ai.github.io/langgraph/concepts/persistence/): This page provides an in-depth overview of the persistence layer in LangGraph, focusing on the use of checkpointers to save graph states at each super-step. It covers key concepts such as threads, checkpoints, state retrieval, and memory management, along with practical examples and code snippets. Additionally, it discusses advanced features like time travel, fault tolerance, and the integration of memory stores for cross-thread information retention.
+- [Understanding Durable Execution in LangGraph](https://langchain-ai.github.io/langgraph/concepts/durable_execution/): This page provides an overview of durable execution, a technique that allows workflows to save their progress and resume from key points. It details the requirements for implementing durable execution in LangGraph, including the use of persistence and tasks to ensure deterministic and consistent replay. Additionally, it covers how to handle pausing, resuming, and recovering workflows effectively.
+- [Implementing Memory in LangGraph for AI Applications](https://langchain-ai.github.io/langgraph/how-tos/persistence/): This documentation page provides a comprehensive guide on adding persistence to AI applications using LangGraph. It covers both short-term and long-term memory implementations, including code examples for managing conversation context and user-specific data. Additionally, it discusses the use of various storage backends and semantic search capabilities for enhanced memory management.
+- [Understanding Memory in AI Agents](https://langchain-ai.github.io/langgraph/concepts/memory/): This documentation page provides an in-depth overview of memory types in AI agents, focusing on short-term and long-term memory. It explains how these memory types can be implemented and managed within applications using LangGraph, including techniques for handling conversation history and storing memories. Additionally, it discusses the importance of memory in enhancing user interactions and the various strategies for writing and updating memories.
+- [Memory Management in LangGraph for AI Applications](https://langchain-ai.github.io/langgraph/how-tos/memory/): This page provides an overview of memory management in LangGraph, focusing on short-term and long-term memory functionalities essential for conversational agents. It includes detailed instructions on how to implement memory strategies such as trimming, summarizing, and deleting messages to optimize conversation tracking without exceeding context limits. Code examples are provided to illustrate the implementation of these memory management techniques.
+- [Human-in-the-Loop Workflows in LangGraph](https://langchain-ai.github.io/langgraph/concepts/human_in_the_loop/): This page provides an overview of the human-in-the-loop (HIL) capabilities within LangGraph, highlighting how human intervention can enhance automated processes. It details key features such as persistent execution state and flexible integration points, along with typical use cases for validating outputs and providing context. Additionally, it outlines the implementation of HIL through specific functions and primitives.
+- [Implementing Human-in-the-Loop Workflows with Interrupts](https://langchain-ai.github.io/langgraph/how-tos/human_in_the_loop/add-human-in-the-loop/): This documentation page provides a comprehensive guide on using the `interrupt` function in LangGraph to facilitate human-in-the-loop workflows. It covers the implementation details, design patterns, and best practices for pausing graph execution to gather human input, as well as how to resume execution with that input. Additionally, it highlights common pitfalls and offers extended examples to illustrate various use cases.
+- [Understanding Breakpoints in LangGraph](https://langchain-ai.github.io/langgraph/concepts/breakpoints/): This page provides an overview of breakpoints in LangGraph, which allow users to pause graph execution at specific points for inspection. It explains how breakpoints utilize the persistence layer to save the graph state and how execution can be resumed after inspection. An illustrative example is included to demonstrate the concept visually.
+- [Using Breakpoints in Graph Execution](https://langchain-ai.github.io/langgraph/how-tos/human_in_the_loop/breakpoints/): This page provides a comprehensive guide on how to implement breakpoints in graph execution for debugging purposes. It covers the requirements for setting breakpoints, the difference between static and dynamic breakpoints, and includes code examples for both compile-time and run-time configurations. Additionally, it explains how to manage breakpoints in subgraphs.
+- [Time Travel Functionality in LangGraph](https://langchain-ai.github.io/langgraph/concepts/time-travel/): This page explains the time travel feature in LangGraph, which allows users to analyze and debug decision-making processes in non-deterministic systems. It outlines how to understand reasoning, debug mistakes, and explore alternative solutions by resuming execution from prior checkpoints. The functionality enables users to create new forks in the execution history for deeper insights.
+- [Using Time-Travel in LangGraph](https://langchain-ai.github.io/langgraph/how-tos/human_in_the_loop/time-travel/): This page provides a comprehensive guide on how to implement time-travel functionality in LangGraph. It outlines the steps to run a graph, identify checkpoints, modify graph states, and resume execution from specific checkpoints. Additionally, an example workflow is included to illustrate the process of generating and modifying jokes using LangGraph.
+- [Integrating Tools with AI Models](https://langchain-ai.github.io/langgraph/concepts/tools/): This page provides an overview of how AI models can interact with external systems using tool calling. It explains the concept of tools, their integration with chat models, and how to create or use prebuilt tools for various applications. Additionally, it highlights the importance of relevance in tool invocation and offers links to further resources and guides.
+- [Using Tools in LangChain](https://langchain-ai.github.io/langgraph/how-tos/tool-calling/): This documentation page provides a comprehensive guide on how to create and utilize tools within the LangChain framework. It covers defining simple and customized tools, managing tool arguments, accessing configuration and state, and integrating tools with chat models and agents. Additionally, it discusses error handling and strategies for managing a large number of tools.
+- [Understanding Subgraphs in LangGraph](https://langchain-ai.github.io/langgraph/concepts/subgraphs/): This page provides an overview of subgraphs in LangGraph, explaining their role as encapsulated nodes within larger graphs. It discusses the benefits of using subgraphs, such as facilitating multi-agent systems and enabling independent team work. Additionally, it outlines the communication methods between parent graphs and subgraphs, detailing scenarios involving shared and different state schemas.
+- [Using Subgraphs in LangGraph](https://langchain-ai.github.io/langgraph/how-tos/subgraph/): This guide provides an overview of how to effectively use subgraphs within LangGraph, including communication methods between parent graphs and subgraphs. It covers shared and different state schemas, setup instructions, and examples for implementing subgraphs in multi-agent systems. Additionally, it discusses persistence, state management, and streaming outputs from subgraphs.
+- [Understanding Multi-Agent Systems](https://langchain-ai.github.io/langgraph/concepts/multi_agent/): This page provides an in-depth overview of multi-agent systems, focusing on the architecture and benefits of using multiple independent agents to manage complex applications. It discusses various multi-agent architectures, including network, supervisor, and hierarchical models, as well as communication strategies and state management techniques for effective agent interaction.
+- [Building Multi-Agent Systems with LangGraph](https://langchain-ai.github.io/langgraph/how-tos/multi_agent/): This guide provides an overview of how to build multi-agent systems using LangGraph, focusing on the implementation of handoffs for agent communication. It covers the creation of independent agents, the use of handoffs to transfer control and data between agents, and examples of prebuilt multi-agent architectures. Additionally, it includes code snippets and best practices for managing agent interactions and state.
+- [Understanding the Functional API in LangGraph](https://langchain-ai.github.io/langgraph/concepts/functional_api/): This documentation page provides an overview of the Functional API in LangGraph, detailing its key features such as persistence, memory, and human-in-the-loop capabilities. It explains how to define workflows using the `@entrypoint` and `@task` decorators, along with examples and best practices for implementing workflows with state management and streaming. Additionally, it compares the Functional API with the Graph API, highlighting their differences and use cases.
+- [Functional API Documentation](https://langchain-ai.github.io/langgraph/how-tos/use-functional-api/): This page provides comprehensive guidance on using the Functional API, including creating workflows, handling parallel execution, and integrating with other APIs. It covers various features such as retry policies, caching, and human-in-the-loop workflows, along with practical examples. Additionally, it discusses memory management strategies for both short-term and long-term use cases.
+- [Overview of LangGraph Platform](https://langchain-ai.github.io/langgraph/concepts/langgraph_platform/): The LangGraph Platform is designed for developing, deploying, and managing long-running agent workflows with ease. This page outlines the platform's features, including streaming support, background runs, and memory management, which enhance the performance and reliability of agent applications. Additionally, it provides links to resources for getting started and deploying agents effectively.
+- [LangGraph Platform Quickstart Guide](https://langchain-ai.github.io/langgraph/tutorials/langgraph-platform/local-server/): This quickstart guide provides step-by-step instructions for running a LangGraph application locally. It covers prerequisites, installation of the LangGraph CLI, app creation, dependency installation, and launching the server. Additionally, it includes testing your application using the LangGraph Studio and API.
+- [LangGraph Platform Deployment Quickstart](https://langchain-ai.github.io/langgraph/cloud/quick_start/): This quickstart guide provides step-by-step instructions for deploying an application on the LangGraph Platform using GitHub. It covers prerequisites, repository creation, deployment procedures, and testing your application and API. Follow these steps to successfully set up and run your application in the LangGraph environment.
+- [Overview of LangGraph Platform Components](https://langchain-ai.github.io/langgraph/concepts/langgraph_components/): This page provides a comprehensive overview of the various components that make up the LangGraph Platform. It details the functionalities of each component, including the LangGraph Server, CLI, Studio, SDKs, and the control and data planes. Users can learn how these components work together to facilitate the development, deployment, and management of LangGraph applications.
+- [LangGraph Server Documentation](https://langchain-ai.github.io/langgraph/concepts/langgraph_server/): This page provides an overview of the LangGraph Server, an API designed for creating and managing agent-based applications. It details the server versions, application structure, deployment components, and the use of assistants, persistence, and task queues. Additionally, it includes links to further resources and guides for effective deployment and usage.
+- [LangGraph Application Structure Guide](https://langchain-ai.github.io/langgraph/concepts/application_structure/): This page provides an overview of the structure of a LangGraph application, detailing the essential components such as the configuration file, dependencies, graphs, and environment variables. It includes examples of directory structures for both Python and JavaScript applications, as well as guidance on how to specify the necessary information for deployment. Additionally, it covers key concepts related to the configuration file and the role of dependencies and environment variables in the application.
+- [Setting Up a LangGraph Application with requirements.txt](https://langchain-ai.github.io/langgraph/cloud/deployment/setup/): This guide provides step-by-step instructions for configuring a LangGraph application for deployment using a requirements.txt file to manage dependencies. It covers essential topics such as specifying dependencies, defining environment variables, and creating the LangGraph configuration file. Additionally, it includes examples and tips for alternative setup methods.
+- [Setting Up a LangGraph Application with pyproject.toml](https://langchain-ai.github.io/langgraph/cloud/deployment/setup_pyproject/): This guide provides step-by-step instructions for configuring a LangGraph application using the `pyproject.toml` file for dependency management. It covers the necessary components, including specifying dependencies, environment variables, and defining graphs, along with examples and best practices. Additionally, it offers tips for alternative setups and links to further resources for deployment.
+- [Setting Up a LangGraph.js Application](https://langchain-ai.github.io/langgraph/cloud/deployment/setup_javascript/): This guide provides step-by-step instructions for configuring a LangGraph.js application for deployment on the LangGraph Platform or for self-hosting. It covers essential topics such as specifying dependencies, environment variables, defining graphs, and creating the necessary configuration file. By following this walkthrough, users will learn how to structure their application and prepare it for deployment.
+- [Customizing Your Dockerfile in LangGraph](https://langchain-ai.github.io/langgraph/cloud/deployment/custom_docker/): This page provides a guide on how to customize your Dockerfile by adding additional commands through the `langgraph.json` configuration file. It explains how to specify the `dockerfile_lines` key to include necessary dependencies, such as installing system packages and Python libraries. An example is provided to illustrate the process of integrating the Pillow library for image processing.
+- [LangGraph CLI Documentation](https://langchain-ai.github.io/langgraph/concepts/langgraph_cli/): This page provides an overview of the LangGraph CLI, a command-line tool for building and running the LangGraph API server locally. It includes installation instructions, a list of core commands, and their descriptions to help users effectively utilize the CLI for development and deployment. For further details, users can refer to the LangGraph CLI Reference.
+- [LangGraph Studio Documentation](https://langchain-ai.github.io/langgraph/concepts/langgraph_studio/): This page provides an overview of LangGraph Studio, an IDE for visualizing, interacting with, and debugging agentic systems that utilize the LangGraph Server API. It outlines the prerequisites for using the studio, key features, and the two operational modes: Graph mode and Chat mode. Additionally, it includes links to further resources for getting started with LangGraph Studio.
+- [Getting Started with LangGraph Studio](https://langchain-ai.github.io/langgraph/cloud/how-tos/studio/quick_start/): This page provides a comprehensive guide on how to connect and use LangGraph Studio with both deployed applications on the LangGraph Platform and local development servers. It includes instructions for installation, running the server, accessing the Studio UI, and debugging options. Additionally, troubleshooting tips and next steps for further exploration of LangGraph Studio features are also provided.
+- [Running Applications: A Comprehensive Guide](https://langchain-ai.github.io/langgraph/cloud/how-tos/invoke_studio/): This page provides a detailed guide on how to submit a run to your application, covering both Graph and Chat modes. It includes instructions on specifying input, managing assistants, enabling streaming, and using breakpoints. Additionally, it offers tips for running applications from specific checkpoints in existing threads.
+- [Managing Assistants in LangGraph Studio](https://langchain-ai.github.io/langgraph/cloud/how-tos/studio/manage_assistants/): This page provides guidance on how to manage assistants within LangGraph Studio, including viewing, editing, and updating assistant configurations. It covers both Graph mode and Chat mode, detailing how to activate assistants and make changes to their settings. Users will learn how to navigate the interface to effectively manage their assistant configurations for graph runs.
+- [Managing Threads in Studio](https://langchain-ai.github.io/langgraph/cloud/how-tos/threads_studio/): This page provides a comprehensive guide on how to view and edit threads within the Studio application. It covers both Graph and Chat modes, detailing the steps to create new threads, view thread history, and edit thread states. Additionally, it includes links to related concepts for further learning.
+- [Modifying Prompts in LangGraph Studio](https://langchain-ai.github.io/langgraph/cloud/how-tos/iterate_graph_studio/): This page provides guidance on how to modify prompts within LangGraph Studio using two methods: direct node editing and the LangSmith Playground interface. It details the configuration options available for nodes, including `langgraph_nodes` and `langgraph_type`, along with examples for both Pydantic models and dataclasses. Additionally, it outlines the steps for editing prompts in the UI and utilizing the LangSmith Playground for testing LLM calls.
+- [Debugging LangSmith Traces in LangGraph Studio](https://langchain-ai.github.io/langgraph/cloud/how-tos/clone_traces_studio/): This guide provides step-by-step instructions for opening and debugging LangSmith traces in LangGraph Studio. It covers how to deploy threads and test local agents with remote traces, ensuring a seamless debugging experience. Additionally, it outlines the requirements for local agents and the process for cloning threads for local testing.
+- [How to Add Nodes to LangSmith Datasets](https://langchain-ai.github.io/langgraph/cloud/how-tos/datasets_studio/): This guide provides step-by-step instructions on how to add examples from nodes in the thread log to LangSmith datasets. It covers selecting threads, choosing nodes, and editing inputs/outputs before adding them to the dataset. Additionally, it includes links to further resources on evaluating intermediate steps.
+- [LangGraph SDK Documentation](https://langchain-ai.github.io/langgraph/concepts/sdk/): This page provides an overview of the LangGraph SDK, including installation instructions for both Python and JavaScript. It details the synchronous and asynchronous client options available for interacting with the LangGraph Server. Additionally, it offers links to further resources and references for the SDK.
+- [Integrating Semantic Search in LangGraph](https://langchain-ai.github.io/langgraph/cloud/deployment/semantic_search/): This guide provides step-by-step instructions on how to implement semantic search in your LangGraph deployment. It covers prerequisites, configuration of the store, and usage examples for searching memories and documents by semantic similarity. Additionally, it includes information on using custom embeddings and querying via the LangGraph SDK.
+- [Configuring Time-to-Live (TTL) in LangGraph Applications](https://langchain-ai.github.io/langgraph/how-tos/ttl/configure_ttl/): This guide provides detailed instructions on how to configure Time-to-Live (TTL) settings for checkpoints and store items in LangGraph applications. It covers the necessary configurations in the `langgraph.json` file, including strategies for managing data lifecycle and memory. Additionally, it explains how to combine TTL configurations and override them at runtime.
+- [LangGraph Authentication & Access Control Overview](https://langchain-ai.github.io/langgraph/concepts/auth/): This page provides a comprehensive guide to the authentication and authorization mechanisms within the LangGraph Platform. It explains the core concepts of authentication versus authorization, outlines default security models, and details the system architecture involved in user identity management. Additionally, it covers implementation examples for authentication and authorization handlers, along with common access patterns and supported resources.
+- [Custom Authentication Setup for LangGraph Platform](https://langchain-ai.github.io/langgraph/how-tos/auth/custom_auth/): This guide provides step-by-step instructions on how to implement custom authentication in your LangGraph Platform application. It covers the necessary prerequisites, implementation details, configuration updates, and client connection methods. The guide is applicable to both managed and Enterprise self-hosted deployments.
+- [Documenting API Authentication in OpenAPI for LangGraph](https://langchain-ai.github.io/langgraph/how-tos/auth/openapi_security/): This guide provides instructions on how to customize the security schema for your LangGraph Platform API documentation using OpenAPI. It covers default security schemes for both LangGraph Platform and self-hosted deployments, as well as how to implement custom authentication. Additionally, it includes examples for OAuth2 and API key authentication, along with testing procedures.
+- [Managing Assistants in LangGraph](https://langchain-ai.github.io/langgraph/concepts/assistants/): This page provides an overview of how to create and manage assistants within the LangGraph Platform, which allows for separate configuration of agents without altering the core graph logic. It covers the prerequisites, configuration options, and versioning of assistants, highlighting their role in optimizing agent performance for different tasks. Additionally, it includes links to relevant API references and how-to guides for further assistance.
+- [Managing Assistants in LangGraph](https://langchain-ai.github.io/langgraph/cloud/how-tos/configuration_cloud/): This documentation page provides a comprehensive guide on how to create, configure, and manage assistants using the LangGraph SDK and Platform UI. It includes code examples in Python and JavaScript, as well as instructions for creating new versions and using previous versions of assistants. Additionally, it covers the process of utilizing assistants in various environments.
+- [Understanding Threads in LangGraph](https://langchain-ai.github.io/langgraph/cloud/concepts/threads/): This page provides an overview of threads in the LangGraph framework, detailing how they accumulate the state of runs and the importance of checkpoints. It explains the process of creating threads and retrieving their current and historical states. Additionally, it offers links to further resources on threads, checkpoints, and the LangGraph API for managing thread states.
+- [Managing Threads in LangGraph](https://langchain-ai.github.io/langgraph/cloud/how-tos/use_threads/): This documentation page provides a comprehensive guide on how to create, view, and inspect threads using the LangGraph SDK. It includes detailed instructions for creating empty threads, copying existing threads, and initializing threads with prepopulated states. Additionally, it covers how to list and inspect threads, including filtering and sorting options.
+- [Understanding Runs in LangGraph Platform](https://langchain-ai.github.io/langgraph/cloud/concepts/runs/): This page provides an overview of what constitutes a run in the LangGraph Platform, including its input, configuration, and metadata. It also highlights the ability to execute runs on threads and offers links to the API reference for managing runs.
+- [Starting Background Runs for Your Agent](https://langchain-ai.github.io/langgraph/cloud/how-tos/background_run/): This guide provides step-by-step instructions on how to initiate background runs for your agent using Python, JavaScript, and CURL. It covers the setup process, checking current runs, starting new runs, and retrieving the final results. By following this documentation, users can efficiently manage long-running jobs within their applications.
+- [Running Multiple Agents on the Same Thread in LangGraph](https://langchain-ai.github.io/langgraph/cloud/how-tos/same-thread/): This documentation page explains how to run multiple agents on the same thread using the LangGraph Platform. It provides step-by-step examples in Python, JavaScript, and CURL to create agents, run them on a thread, and demonstrate how the second agent can utilize the context from the first agent's responses. By following the examples, users can learn to effectively manage multiple agents and their interactions.
+- [Scheduling Cron Jobs with LangGraph](https://langchain-ai.github.io/langgraph/cloud/how-tos/cron_jobs/): This page provides a comprehensive guide on how to schedule cron jobs using the LangGraph Platform. It includes setup instructions for different programming languages, examples of creating and deleting cron jobs, and details on managing stateless cron jobs. Users will learn how to automate tasks such as sending weekly emails without writing custom scripts.
+- [Guide to Stateless Runs in LangGraph](https://langchain-ai.github.io/langgraph/cloud/how-tos/stateless_runs/): This page provides a comprehensive guide on how to implement stateless runs using the LangGraph Platform. It includes setup instructions for various programming languages, examples of streaming results, and methods for waiting for stateless results. Users will learn how to execute runs without maintaining persistent state, making their applications more efficient.
+- [Configurable Headers in LangGraph](https://langchain-ai.github.io/langgraph/cloud/how-tos/configurable_headers/): This page provides guidance on how to configure headers dynamically in the LangGraph platform to modify agent behavior and permissions. It details how to include or exclude specific headers in the runtime configuration using the `langgraph.json` file. Additionally, it explains how to access these headers within your graph and offers an option to opt-out of configurable headers.
+- [Streaming in LangGraph Platform](https://langchain-ai.github.io/langgraph/cloud/concepts/streaming/): This page provides an overview of streaming capabilities within the LangGraph Platform, detailing the various streaming modes available for LLM applications. It includes instructions for creating streaming runs, handling stateless runs, and joining active background runs. Additionally, code examples in Python, JavaScript, and cURL are provided to illustrate the implementation of these features.
+- [Streaming Outputs with LangGraph SDK](https://langchain-ai.github.io/langgraph/cloud/how-tos/streaming/): This documentation page provides detailed instructions on how to stream outputs from the LangGraph API server using the LangGraph SDK in Python, JavaScript, and cURL. It covers various streaming modes, including updates, values, and custom data, along with examples for each mode. Additionally, it explains how to handle subgraphs, debug information, and LLM tokens during streaming.
+- [Human-in-the-Loop Workflows in LangGraph](https://langchain-ai.github.io/langgraph/cloud/how-tos/add-human-in-the-loop/): This page provides an overview of the human-in-the-loop (HIL) capabilities in LangGraph, allowing for human intervention in automated processes. It details the `interrupt` function, which pauses execution for human input, and includes examples in Python, JavaScript, and cURL for implementing HIL workflows. Additionally, it links to further resources for understanding and utilizing HIL features effectively.
+- [Using Breakpoints in LangGraph](https://langchain-ai.github.io/langgraph/cloud/how-tos/human_in_the_loop_breakpoint/): This page provides an overview of how to set and use breakpoints in LangGraph to pause graph execution for inspection. It includes examples for setting breakpoints at compile time and run time in Python, JavaScript, and cURL. Additionally, it offers guidance on resuming execution after hitting a breakpoint.
+- [Using Time Travel in LangGraph](https://langchain-ai.github.io/langgraph/cloud/how-tos/human_in_the_loop_time_travel/): This page provides a comprehensive guide on how to utilize the time travel functionality in LangGraph, allowing users to resume execution from previous checkpoints. It outlines the steps to run a graph, identify checkpoints, modify graph states, and resume execution. Additionally, the page includes code examples in Python, JavaScript, and cURL for practical implementation.
+- [Model Context Protocol (MCP) Endpoint Documentation](https://langchain-ai.github.io/langgraph/concepts/server-mcp/): This page provides comprehensive documentation on the Model Context Protocol (MCP) endpoint available in LangGraph Server. It covers the requirements for using MCP, how to expose agents as MCP tools, and includes examples for connecting with MCP-compliant clients in various programming languages. Additionally, it outlines session behavior, authentication, and instructions for disabling the MCP endpoint.
+- [Managing Double Texting in LangGraph](https://langchain-ai.github.io/langgraph/concepts/double_texting/): This page provides an overview of how to handle double texting scenarios in LangGraph, where users may send multiple messages before the first has completed. It outlines four strategies: Reject, Enqueue, Interrupt, and Rollback, each with links to detailed configuration guides. Prerequisites for implementing these strategies include having the LangGraph Server set up.
+- [Using the Interrupt Option in Double Texting](https://langchain-ai.github.io/langgraph/cloud/how-tos/interrupt_concurrent/): This guide provides detailed instructions on how to utilize the `interrupt` option for double texting, allowing users to interrupt a prior run of a graph and start a new one. It includes setup instructions, code examples in Python, JavaScript, and cURL, as well as guidance on viewing run results and verifying the status of interrupted runs. Familiarity with double texting is assumed, and a link to a conceptual guide is provided for further understanding.
+- [Using the Rollback Option in Double Texting](https://langchain-ai.github.io/langgraph/cloud/how-tos/rollback_concurrent/): This guide provides detailed instructions on how to utilize the `rollback` option in double texting, which allows users to interrupt a previous run and start a new one while permanently deleting the prior run from the database. 
It includes setup instructions, code examples in Python, JavaScript, and cURL, and demonstrates how to view run results and verify the deletion of the original run. Familiarity with double texting is assumed, and a link to a conceptual guide is provided for further reading.
+- [Using the Reject Option in Double Texting](https://langchain-ai.github.io/langgraph/cloud/how-tos/reject_concurrent/): This guide provides an overview of the `reject` option in double texting, which prevents new runs of a graph from starting while an original run is still in progress. It includes setup instructions, code examples in Python, JavaScript, and cURL, and demonstrates how to handle errors when attempting to create concurrent runs. Additionally, it shows how to view the results of the original run after the rejection.
+- [Using the Enqueue Option for Double Texting](https://langchain-ai.github.io/langgraph/cloud/how-tos/enqueue_concurrent/): This guide provides an overview of the `enqueue` option for double texting, which allows interruptions to be queued and executed in the order they are received. It includes setup instructions, code examples in Python, JavaScript, and cURL for creating runs, and methods for viewing run results. Familiarity with double texting concepts is assumed, and a helper function for output formatting is also provided.
+- [Understanding Webhooks in LangGraph Platform](https://langchain-ai.github.io/langgraph/cloud/concepts/webhooks/): This page provides an overview of webhooks and their role in enabling event-driven communication between LangGraph Platform applications and external services. It explains how to use the `webhook` parameter in various endpoints to trigger requests upon the completion of API calls. For further details, a link to a comprehensive how-to guide is also included.
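The four strategies covered by the double-texting guides above reduce to a simple dispatch decision; the sketch below is illustrative only (LangGraph Server implements these internally, and `handle_double_text` is a hypothetical stand-in):

```python
def handle_double_text(active_run, new_run, strategy, queue, deleted):
    """Illustrative dispatcher for the four double-texting strategies
    (schematic only; not LangGraph Server internals)."""
    if strategy == "reject":
        # the new run is refused while the first is still in progress
        raise RuntimeError("a run is already in progress")
    if strategy == "enqueue":
        queue.append(new_run)                 # new run waits its turn
        return active_run
    if strategy == "interrupt":
        active_run["status"] = "interrupted"  # first run stopped, kept in history
        return new_run
    if strategy == "rollback":
        deleted.append(active_run)            # first run stopped and deleted entirely
        return new_run
    raise ValueError(f"unknown strategy: {strategy}")

queue, deleted = [], []
current = handle_double_text({"id": "run-1"}, {"id": "run-2"},
                             "enqueue", queue, deleted)
print(current["id"], [r["id"] for r in queue])  # run-1 ['run-2']
```

The practical difference between `interrupt` and `rollback` is exactly the one the guides describe: both stop the first run, but only `rollback` removes it from the database.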
+- [Using Webhooks with LangGraph Platform](https://langchain-ai.github.io/langgraph/cloud/how-tos/webhooks/): This documentation page provides a comprehensive guide on how to implement webhooks in the LangGraph Platform to receive updates after API calls. It includes details on supported endpoints, setup instructions for different programming languages, and examples of how to specify webhook parameters in API requests. Additionally, it covers security measures and testing tools for verifying webhook functionality. +- [Scheduling Tasks with Cron Jobs on LangGraph Platform](https://langchain-ai.github.io/langgraph/cloud/concepts/cron_jobs/): This page provides an overview of how to use cron jobs on the LangGraph Platform to run assistants on a defined schedule. It explains the process of setting up a cron job, including specifying the schedule, assistant, and input. Additionally, it includes links to a how-to guide and API reference for further details. +- [Scheduling Cron Jobs with LangGraph](https://langchain-ai.github.io/langgraph/cloud/how-tos/cron_jobs/): This page provides a comprehensive guide on how to use cron jobs with the LangGraph Platform to automate graph executions on a schedule. It includes setup instructions for various programming languages, examples of creating and deleting cron jobs, and tips for managing stateless cron jobs. Users will learn how to efficiently schedule tasks without manual intervention, ensuring timely execution of automated processes. +- [Adding Custom Lifespan Events in LangGraph](https://langchain-ai.github.io/langgraph/how-tos/http/custom_lifespan/): This page provides a guide on how to implement custom lifespan events in your LangGraph Platform applications, specifically for Python deployments. It covers the initialization and cleanup of resources during server startup and shutdown using FastAPI. Additionally, it includes code examples and configuration steps to help you integrate these events into your application. 
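The startup/shutdown hooks covered by the custom-lifespan entry above follow the standard async-context-manager lifespan pattern. A stdlib-only sketch of that pattern (the real guide wires an equivalent context manager into FastAPI, which is omitted here; `serve_once` is a hypothetical driver):

```python
import asyncio
from contextlib import asynccontextmanager

@asynccontextmanager
async def lifespan(state):
    state["db_pool"] = "connected"   # startup: acquire shared resources once
    try:
        yield                        # the server handles requests while suspended here
    finally:
        state.pop("db_pool", None)   # shutdown: release resources

async def serve_once():
    state = {}
    async with lifespan(state):
        # a request handler would see the initialized resources here
        assert state["db_pool"] == "connected"
    return state

print(asyncio.run(serve_once()))  # {}  (pool released after shutdown)
```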
+- [Adding Custom Middleware to LangGraph Platform](https://langchain-ai.github.io/langgraph/how-tos/http/custom_middleware/): This page provides a step-by-step guide on how to add custom middleware to your server when deploying agents to the LangGraph Platform. It covers the necessary code implementation using FastAPI, configuration settings in `langgraph.json`, and instructions for testing and deploying your application. Additionally, it offers links to related topics such as custom routes and lifespan events for further customization. +- [Adding Custom Routes in LangGraph](https://langchain-ai.github.io/langgraph/how-tos/http/custom_routes/): This page provides a step-by-step guide on how to add custom routes to your LangGraph platform application using a Starlette or FastAPI app. It includes instructions for creating a new app, configuring the `langgraph.json` file, and testing the server locally. Additionally, it explains how custom routes can override default endpoints and offers suggestions for further customization. +- [LangGraph Deployment Options](https://langchain-ai.github.io/langgraph/concepts/deployment_options/): This page outlines the various deployment options available for the LangGraph Platform, including Cloud SaaS, Self-Hosted Data Plane, Self-Hosted Control Plane, and Standalone Container. Each option is described in detail, highlighting key features, management responsibilities, and compatibility. A comparison table is also provided to help users choose the best deployment strategy for their needs. +- [LangGraph Data Plane Overview](https://langchain-ai.github.io/langgraph/concepts/langgraph_data_plane/): This page provides a comprehensive overview of the LangGraph Data Plane, detailing its components including the server infrastructure, listener application, and data management systems like Postgres and Redis. It also covers key features such as autoscaling, static IP addresses, and custom configurations for Postgres and Redis. 
Additionally, the page outlines telemetry, licensing, and tracing functionalities relevant to different deployment options. +- [LangGraph Control Plane Overview](https://langchain-ai.github.io/langgraph/concepts/langgraph_control_plane/): This page provides a comprehensive overview of the LangGraph Control Plane, detailing its UI and API functionalities for managing LangGraph Servers. It covers deployment types, environment variables, database provisioning, and asynchronous deployment processes. Additionally, it highlights the integration with LangSmith for tracing projects. +- [Cloud SaaS Deployment Guide](https://langchain-ai.github.io/langgraph/concepts/langgraph_cloud/): This page provides a comprehensive guide on deploying the LangGraph Server using the Cloud SaaS model. It outlines the roles of the control plane and data plane, detailing their functionalities and management. Additionally, it includes an architectural diagram to illustrate the deployment structure. +- [Deployment Guide for LangGraph Platform](https://langchain-ai.github.io/langgraph/cloud/deployment/cloud/): This page provides a comprehensive guide on how to deploy applications to the LangGraph Platform using GitHub repositories. It covers prerequisites, steps for creating new deployments and revisions, managing deployment settings, and viewing logs. Additionally, it includes instructions for whitelisting IP addresses and modifying GitHub repository access. +- [Self-Hosted Data Plane Deployment Guide](https://langchain-ai.github.io/langgraph/concepts/langgraph_self_hosted_data_plane/): This page provides an overview of the Self-Hosted Data Plane deployment option, which allows users to manage their data plane infrastructure while offloading control plane management to LangChain. It outlines the requirements, architecture, and supported compute platforms for deployment. Additionally, it includes important information regarding the beta status of this deployment option. 
+- [Deploying a Self-Hosted Data Plane](https://langchain-ai.github.io/langgraph/cloud/deployment/self_hosted_data_plane/): This page provides a comprehensive guide on deploying a Self-Hosted Data Plane using Kubernetes and Amazon ECS. It outlines the prerequisites, setup steps, and configuration details necessary for a successful deployment. Additionally, it highlights the current beta status of this deployment option. +- [Self-Hosted Control Plane Deployment Guide](https://langchain-ai.github.io/langgraph/concepts/langgraph_self_hosted_control_plane/): This page provides an overview of the Self-Hosted Control Plane deployment option, currently in beta. It outlines the requirements, architecture, and compute platforms supported for deploying the control and data planes in your cloud environment. Additionally, it includes important links and resources for managing your self-hosted infrastructure. +- [Deploying a Self-Hosted Control Plane](https://langchain-ai.github.io/langgraph/cloud/deployment/self_hosted_control_plane/): This page provides a comprehensive guide on deploying a Self-Hosted Control Plane using Kubernetes. It outlines the prerequisites, setup steps, and configuration details necessary for a successful deployment. Additionally, it highlights the beta status of this deployment option and includes links to relevant resources for further assistance. +- [Deploying LangGraph Server with Standalone Container](https://langchain-ai.github.io/langgraph/concepts/langgraph_standalone_container/): This page provides a comprehensive guide on deploying a LangGraph Server using the Standalone Container option. It outlines the architecture, supported compute platforms, and Enterprise server version features. Users will find essential information on managing the data plane infrastructure without a control plane. 
+- [Deploying a Standalone Container with LangGraph](https://langchain-ai.github.io/langgraph/cloud/deployment/standalone_container/): This documentation provides a comprehensive guide on deploying a standalone container for the LangGraph application. It covers prerequisites, environment variable configurations, and deployment methods using Docker and Docker Compose. Additionally, it includes instructions for deploying on Kubernetes using Helm.
+- [Scalability and Resilience of LangGraph Platform](https://langchain-ai.github.io/langgraph/concepts/scalability_and_resilience/): This page provides an overview of the scalability and resilience features of the LangGraph Platform. It details how the platform handles server and queue scalability, as well as the mechanisms in place for ensuring resilience during both graceful and hard shutdowns. Additionally, it covers the resilience strategies employed for Postgres and Redis to maintain service availability.
+- [LangGraph Platform Plans Overview](https://langchain-ai.github.io/langgraph/concepts/plans/): This page provides an overview of the different plans available for the LangGraph Platform, including Developer, Plus, and Enterprise options. Each plan offers varying deployment options, usage limits, and features tailored to different user needs. For detailed pricing and related resources, links to additional documentation are also included.
+
+# Examples
+
+- [Building an Agentic RAG System](https://langchain-ai.github.io/langgraph/tutorials/rag/langgraph_agentic_rag/): This tutorial guides you through the process of creating an agentic retrieval-augmented generation (RAG) system using LangChain and LangGraph. You will learn how to fetch and preprocess documents, create a retriever tool, and build an agentic RAG system that intelligently decides when to retrieve information or respond directly to user queries. By the end, you'll have a functional system capable of semantic search and context-aware responses.
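The retrieve-or-respond decision at the heart of the agentic RAG tutorial above can be sketched without any framework. The corpus, retriever, and routing rule below are toy stand-ins, not the tutorial's code (which uses a real vector store and an LLM to make the decision):

```python
DOCS = {
    "langgraph": "LangGraph is an orchestration framework for agentic systems.",
    "checkpointer": "A checkpointer persists graph state between runs.",
}

def retrieve(query):
    """Toy keyword retriever standing in for a real vector store."""
    return [text for key, text in DOCS.items() if key in query.lower()]

def agentic_rag(query):
    hits = retrieve(query)
    if hits:                                   # agent decides retrieval is useful
        return f"Grounded answer: {hits[0]}"
    return "Direct answer from the model."     # no relevant docs, respond directly

print(agentic_rag("What is LangGraph?"))
```

The point of the "agentic" variant is that this routing is itself a model decision rather than a fixed pipeline step; the `if hits:` rule here just makes the branch visible.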
+- [Building a Multi-Agent Supervisor System](https://langchain-ai.github.io/langgraph/tutorials/multi_agent/agent_supervisor/): This tutorial guides you through the process of creating a multi-agent supervisor system using specialized agents for research and math tasks. You will learn how to set up the environment, create individual worker agents, and implement a supervisor that orchestrates their interactions. By the end, you'll have a fully functional multi-agent architecture capable of handling complex queries. +- [Building a SQL Agent with LangChain](https://langchain-ai.github.io/langgraph/tutorials/sql-agent/): This tutorial provides a step-by-step guide on how to create a SQL agent capable of answering questions about a SQL database. It covers the setup of necessary dependencies, configuration of a SQLite database, and the implementation of a prebuilt agent that interacts with the database to generate and execute queries. Additionally, it discusses customizing the agent for more control over its behavior. +- [Custom Run ID, Tags, and Metadata for LangSmith Graph Runs](https://langchain-ai.github.io/langgraph/how-tos/run-id-langsmith/): This guide provides instructions on how to pass a custom run ID and set tags and metadata for graph runs in LangSmith. It covers prerequisites, configuration options, and includes code examples for setting up and running a graph with LangGraph. Additionally, it explains how to view and filter traces in the LangSmith platform. +- [Custom Authentication Setup for Chatbots](https://langchain-ai.github.io/langgraph/tutorials/auth/getting_started/): This tutorial guides you through the process of setting up custom authentication for a chatbot using the LangGraph platform. You will learn how to implement token-based security to control user access, starting with a basic example and preparing for more advanced authentication methods in future tutorials. 
By the end, you'll have a functional chatbot that restricts access to authenticated users. +- [Implementing Private Conversations in Chatbots](https://langchain-ai.github.io/langgraph/tutorials/auth/resource_auth/): This tutorial guides you through extending a chatbot to enable private conversations for each user by implementing resource-level access control. You'll learn how to add authorization handlers to ensure users can only access their own threads and test the functionality to confirm proper access restrictions. Additionally, the tutorial covers scoped authorization handlers for more granular control over resource access. +- [Integrating OAuth2 Authentication with Supabase](https://langchain-ai.github.io/langgraph/tutorials/auth/add_auth_server/): This tutorial guides you through replacing hard-coded tokens with real user accounts using OAuth2 for secure authentication in your LangGraph application. You'll learn how to set up Supabase as your identity provider, implement token validation, and ensure proper user authorization. By the end, you'll have a production-ready authentication system that allows users to securely access their own data. +- [Rebuilding Graphs at Runtime in LangGraph](https://langchain-ai.github.io/langgraph/cloud/deployment/graph_rebuild/): This guide explains how to rebuild your graph at runtime with different configurations in LangGraph. It covers the necessary prerequisites, how to define graphs, and the steps to modify your graph-making function for dynamic behavior based on user input. Additionally, it provides examples of both static and dynamic graph configurations. +- [Interacting with RemoteGraph in LangGraph](https://langchain-ai.github.io/langgraph/how-tos/use-remote-graph/): This documentation page provides a comprehensive guide on how to interact with a LangGraph Platform deployment using the RemoteGraph interface. 
It covers the initialization of RemoteGraph, invoking the graph both asynchronously and synchronously, and utilizing it as a subgraph. Additionally, it includes code examples in Python and JavaScript to facilitate understanding and implementation. +- [Deploying Agents on LangGraph Platform](https://langchain-ai.github.io/langgraph/how-tos/autogen-langgraph-platform/): This page provides a comprehensive guide on how to deploy agents like AutoGen and CrewAI using the LangGraph Platform. It covers the necessary setup, agent definition, and wrapping the agent in a LangGraph node for deployment. Additionally, it highlights the benefits of using LangGraph for scalable infrastructure and memory support. +- [Integrating LangGraph with React: A Comprehensive Guide](https://langchain-ai.github.io/langgraph/cloud/how-tos/use_stream_react/): This documentation provides a detailed guide on how to integrate the LangGraph platform into your React applications using the `useStream()` hook. It covers installation, key features, example implementations, and customization options for building chat experiences. Additionally, it includes advanced topics such as event handling, TypeScript support, and managing conversation threads. +- [Implementing Generative User Interfaces with LangGraph](https://langchain-ai.github.io/langgraph/cloud/how-tos/generative_ui_react/): This documentation provides a comprehensive guide on how to implement Generative User Interfaces (Generative UI) using the LangGraph platform. It covers prerequisites, step-by-step tutorials for defining UI components, sending them in graphs, and handling them in React applications. Additionally, it includes how-to guides for customizing components and managing UI state effectively. + +# Resources + +- [LangGraph FAQ](https://langchain-ai.github.io/langgraph/concepts/faq/): This FAQ page provides answers to common questions about LangGraph, an orchestration framework for complex agentic systems. 
It covers topics such as the differences between LangGraph and LangChain, performance impacts, open-source status, and compatibility with various LLMs. Additionally, it outlines the distinctions between LangGraph and LangGraph Platform, including features and deployment options. +- [Getting Started with LangGraph Templates](https://langchain-ai.github.io/langgraph/concepts/template_applications/): This page provides an overview of open source reference applications known as templates, designed to help users quickly build applications with LangGraph. It includes installation instructions for the LangGraph CLI, a list of available templates with their descriptions, and guidance on creating and deploying a new LangGraph app. Users can find links to repositories for each template and next steps for customizing their applications. +- [Guide to Using llms.txt and llms-full.txt for LLMs](https://langchain-ai.github.io/langgraph/llms-txt-overview/): This page provides an overview of the `llms.txt` and `llms-full.txt` formats, which facilitate access to programming documentation for large language models (LLMs) and agents. It outlines the differences between the two formats, usage instructions via an MCP server, and best practices for integrating these files into integrated development environments (IDEs). Additionally, it highlights considerations for managing large documentation files effectively. +- [Community Agents for LangGraph](https://langchain-ai.github.io/langgraph/agents/prebuilt/): This page provides a list of community-built libraries that extend the functionality of LangGraph. Each entry includes the library name, GitHub URL, a brief description, and additional metrics like weekly downloads and stars. Additionally, it outlines how to contribute your own library to the LangGraph documentation. 
+- [LangGraph Error Reference Guide](https://langchain-ai.github.io/langgraph/troubleshooting/errors/index/): This page serves as a comprehensive reference for resolving common errors encountered while using the LangGraph platform. It includes a list of error codes and links to detailed guides for troubleshooting specific issues. Users can find solutions for errors related to graph recursion, concurrent updates, node return values, and more. +- [Handling Recursion Limits in LangGraph](https://langchain-ai.github.io/langgraph/troubleshooting/errors/GRAPH_RECURSION_LIMIT/): This page provides guidance on managing recursion limits in LangGraph's StateGraph. It explains how to identify potential infinite loops in your graph and offers solutions for increasing the recursion limit when working with complex graphs. Additionally, it includes code examples to illustrate the concepts discussed. +- [Handling INVALID_CONCURRENT_GRAPH_UPDATE in LangGraph](https://langchain-ai.github.io/langgraph/troubleshooting/errors/INVALID_CONCURRENT_GRAPH_UPDATE/): This page explains the INVALID_CONCURRENT_GRAPH_UPDATE error that occurs in LangGraph when multiple nodes attempt to update the same state property concurrently. It provides an example of how this error can arise and offers a solution by using a reducer to combine values from parallel node executions. Additionally, troubleshooting tips are included to help resolve this issue. +- [Handling Invalid Node Return Values in LangGraph](https://langchain-ai.github.io/langgraph/troubleshooting/errors/INVALID_GRAPH_NODE_RETURN_VALUE/): This page provides guidance on the error encountered when a LangGraph node returns a non-dict value. It includes an example of incorrect node implementation and the resulting error message. Additionally, troubleshooting tips are offered to ensure that all nodes return the expected dictionary format. 
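The reducer fix for INVALID_CONCURRENT_GRAPH_UPDATE described above boils down to merging parallel writes to the same key instead of overwriting them. A minimal sketch (schematic, not LangGraph's channel implementation; `apply_parallel_updates` is a hypothetical name):

```python
import operator

def apply_parallel_updates(state, updates, reducers):
    """Merge updates from parallel branches; keys with a reducer are combined."""
    for update in updates:
        for key, value in update.items():
            if key in reducers:
                # reducer combines the existing value with the new write
                state[key] = reducers[key](state.get(key, []), value)
            else:
                # without a reducer the write simply replaces the value; two
                # parallel branches doing this is what triggers the error
                state[key] = value
    return state

# Two branches both append to "messages"; operator.add concatenates the lists.
merged = apply_parallel_updates({}, [{"messages": ["from branch A"]},
                                     {"messages": ["from branch B"]}],
                                reducers={"messages": operator.add})
print(merged)  # {'messages': ['from branch A', 'from branch B']}
```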
+- [Handling Multiple Subgraphs in LangGraph](https://langchain-ai.github.io/langgraph/troubleshooting/errors/MULTIPLE_SUBGRAPHS/): This page discusses the limitations of calling multiple subgraphs within a single LangGraph node when checkpointing is enabled. It provides troubleshooting tips to resolve related errors, including suggestions for compiling subgraphs without checkpointing and using the Send API for graph calls. +- [Handling INVALID_CHAT_HISTORY Error in create_react_agent](https://langchain-ai.github.io/langgraph/troubleshooting/errors/INVALID_CHAT_HISTORY/): This page provides an overview of the INVALID_CHAT_HISTORY error encountered in the create_react_agent function when a malformed list of messages is passed. It outlines the potential causes of the error and offers troubleshooting steps to resolve it. Users can learn how to properly invoke the graph and manage tool calls to avoid this issue. +- [Handling INVALID_LICENSE Error in LangGraph Platform](https://langchain-ai.github.io/langgraph/troubleshooting/errors/INVALID_LICENSE/): This page provides guidance on troubleshooting the INVALID_LICENSE error encountered when starting a self-hosted LangGraph Platform server. It outlines the scenarios in which this error may occur and offers solutions based on different deployment types. Additionally, it includes steps to verify the necessary credentials for successful deployment. +- [LangGraph Studio Troubleshooting Guide](https://langchain-ai.github.io/langgraph/troubleshooting/studio/): This page provides troubleshooting solutions for common connection issues encountered in LangGraph Studio, particularly with Safari and Brave browsers. It also addresses potential graph edge issues and offers methods to define routing paths for conditional edges. Users can find step-by-step instructions for resolving these issues using Cloudflare Tunnel and browser settings. 
+- [LangGraph Case Studies](https://langchain-ai.github.io/langgraph/adopters/): This page provides a comprehensive list of companies that have successfully implemented LangGraph, showcasing their unique use cases and the benefits they have achieved. Each entry includes links to detailed case studies or blog posts for further reading. If your company uses LangGraph, you are encouraged to share your success story to contribute to this growing collection. diff --git a/scripts/install_all.sh b/scripts/install_all.sh index ae6afe22..232a3a14 100644 --- a/scripts/install_all.sh +++ b/scripts/install_all.sh @@ -78,7 +78,7 @@ if [ ! -f "$REQUIREMENTS_FILE" ]; then exit 1 fi -print_info "Using Python: $(which python)" +print_info "Using Python: $(which python3)" print_info "Requirements file: $REQUIREMENTS_FILE" # Function to install with uv (faster and better resolver) @@ -86,7 +86,7 @@ install_with_uv() { # Check if uv is available if ! command -v uv &> /dev/null; then print_info "Installing uv for better dependency resolution..." - if python -m pip install uv; then + if python3 -m pip install uv; then print_success "uv installed successfully" else return 1 @@ -94,7 +94,7 @@ install_with_uv() { fi print_info "Using uv for faster dependency resolution..." - if python -m uv pip install -r "$REQUIREMENTS_FILE"; then + if python3 -m uv pip install -r "$REQUIREMENTS_FILE"; then return 0 else return 1 @@ -107,7 +107,7 @@ install_with_pip_staged() { # Stage 1: Core dependencies print_info "Stage 1/3: Installing core dependencies..." - if python -m pip install \ + if python3 -m pip install \ "python-dotenv>=1.0.0" \ "PyYAML>=6.0" \ "tiktoken>=0.5.0" \ @@ -137,7 +137,7 @@ install_with_pip_staged() { print_info "Stage 3/4: Installing raganything (includes lightrag-hku, this may take a while)..." if ! python -m pip install "raganything>=0.1.0"; then print_warning "Standard install failed, trying with --no-deps..." 
- python -m pip install "raganything>=0.1.0" --no-deps || print_warning "raganything installation had issues" + python3 -m pip install "raganything>=0.1.0" --no-deps || print_warning "raganything installation had issues" fi # Stage 4: docling (alternative parser for Office/HTML documents) @@ -145,7 +145,7 @@ install_with_pip_staged() { python -m pip install "docling>=2.31.0" || print_warning "docling installation had issues (optional, can be skipped)" # Optional deps - python -m pip install "perplexityai>=0.1.0" "dashscope>=1.14.0" 2>/dev/null || true + python3 -m pip install "perplexityai>=0.1.0" "dashscope>=1.14.0" 2>/dev/null || true return 0 } @@ -165,7 +165,7 @@ else else # Strategy 3: Direct pip as last resort print_warning "Staged installation had issues, trying direct pip install..." - if python -m pip install -r "$REQUIREMENTS_FILE"; then + if python3 -m pip install -r "$REQUIREMENTS_FILE"; then print_success "Backend dependencies installed successfully" else print_error "Backend dependencies installation failed" @@ -318,7 +318,7 @@ ALL_OK=true print_info "Checking backend key packages..." 
check_python_package() { - if python -c "import $1" 2>/dev/null; then + if python3 -c "import $1" 2>/dev/null; then print_success " ✓ $1" return 0 else @@ -332,7 +332,7 @@ check_python_package uvicorn || ALL_OK=false check_python_package openai || ALL_OK=false # Check lightrag_hku (import name is lightrag) -if python -c "import lightrag" 2>/dev/null; then +if python3 -c "import lightrag" 2>/dev/null; then print_success " ✓ lightrag_hku" else print_error " ✗ lightrag_hku not installed" @@ -340,7 +340,7 @@ else fi # Check raganything -if python -c "import raganything" 2>/dev/null; then +if python3 -c "import raganything" 2>/dev/null; then print_success " ✓ raganything" else print_error " ✗ raganything not installed" diff --git a/src/agents/vision_solver/__init__.py b/src/agents/vision_solver/__init__.py new file mode 100644 index 00000000..0e032ee4 --- /dev/null +++ b/src/agents/vision_solver/__init__.py @@ -0,0 +1,27 @@ +"""Vision Solver Agent Module. + +This module implements image analysis and GeoGebra visualization for math problems. +It follows a four-stage pipeline: +1. BBox - Visual element detection with pixel coordinates +2. Analysis - Geometric semantic analysis +3. GGBScript - Generate GeoGebra drawing commands +4. Reflection - Validate and fix commands +""" + +from src.agents.vision_solver.models import ( + AnalysisOutput, + BBoxOutput, + GGBScriptOutput, + ImageAnalysisState, + ReflectionOutput, +) +from src.agents.vision_solver.vision_solver_agent import VisionSolverAgent + +__all__ = [ + "VisionSolverAgent", + "BBoxOutput", + "AnalysisOutput", + "GGBScriptOutput", + "ReflectionOutput", + "ImageAnalysisState", +] diff --git a/src/agents/vision_solver/models.py b/src/agents/vision_solver/models.py new file mode 100644 index 00000000..1a43bf54 --- /dev/null +++ b/src/agents/vision_solver/models.py @@ -0,0 +1,326 @@ +"""Data models for Vision Solver image analysis pipeline. + +Defines input/output structures for the four-stage analysis: +1. 
BBox - Coordinate detection +2. Analysis - Geometric semantic analysis +3. GGBScript - Drawing command generation +4. Reflection - Validation and correction +""" + +from dataclasses import dataclass, field +from enum import Enum +from typing import TypedDict + +# ==================== BBox Stage Models ==================== + + +class ImageDimensions(TypedDict): + """Image dimensions.""" + + width: int + height: int + + +class Coordinate(TypedDict): + """Pixel coordinate.""" + + x: int + y: int + + +class CoordElement(TypedDict, total=False): + """Coordinate element with different fields for different types. + + Types: + - point: element_id, type, label, position + - segment: element_id, type, label, start, end + - polygon: element_id, type, label, vertices + - circle: element_id, type, label, center, radius + """ + + element_id: str + type: str # point/segment/polygon/circle/arc/angle + label: str + position: Coordinate # For points + start: Coordinate # For segments + end: Coordinate # For segments + vertices: list[dict] # For polygons + center: Coordinate # For circles/arcs + radius: int # Pixel radius for circles + + +class BBoxOutput(TypedDict): + """BBox stage output: pure visual recognition results. + + Responsibility: Extract pixel coordinates of all geometric elements + Coordinate system: Origin at top-left, Y-axis downward + """ + + image_dimensions: ImageDimensions + elements: list[CoordElement] + + +# ==================== Analysis Stage Models ==================== + + +class PointDefinition(TypedDict, total=False): + """Point definition with three types. + + 1. has_coordinate=True: Point with coordinates from problem text + 2. use_bbox=True: Visible in image but no coordinates in text + 3. 
type="derived": Derived point (only when explicitly stated) + """ + + label: str + type: str # "free" | "derived" + has_coordinate: bool + coordinate: Coordinate | None + use_bbox: bool + bbox_position: Coordinate | None + estimated_ggb_coordinate: Coordinate | None + estimation_method: str + anchor_points: list[str] + derivation_method: str + derivation_params: list[str] + source: str + + +class SegmentDefinition(TypedDict): + """Segment definition.""" + + label: str + endpoints: list[str] + is_auxiliary: bool + + +class ShapeDefinition(TypedDict, total=False): + """Shape definition.""" + + label: str + type: str + vertices: list[str] + + +class CircleDefinition(TypedDict, total=False): + """Circle definition.""" + + label: str + center: str + radius: float | None + radius_segment: str | None + through_point: str | None + + +class KeyElements(TypedDict, total=False): + """Key geometric elements.""" + + points: list[PointDefinition] + segments: list[SegmentDefinition] + shapes: list[ShapeDefinition] + circles: list[CircleDefinition] + special_points: list[PointDefinition] + + +class GeometricRelationType(str, Enum): + """Types of geometric relations.""" + + PARALLEL = "parallel" + PERPENDICULAR = "perpendicular" + EQUAL_LENGTH = "equal_length" + MIDPOINT = "midpoint" + INTERSECTION = "intersection" + TANGENT = "tangent" + CONGRUENT = "congruent" + SIMILAR = "similar" + BISECTOR = "bisector" + ON_LINE = "on_line" + ON_CIRCLE = "on_circle" + + +class GeometricRelation(TypedDict): + """Geometric relation.""" + + type: str + objects: list[str] + description: str + + +class RelativePositionAnalysis(TypedDict, total=False): + """Relative position analysis.""" + + point: str + observations: list[str] + conclusions: dict + + +class AnalysisOutput(TypedDict, total=False): + """Analysis stage output: geometric semantic analysis results. 
+ + Responsibility: Extract geometric relations, constraints, special point definitions + """ + + image_reference_detected: bool + image_reference_keywords: list[str] + key_elements: KeyElements + constraints: list + geometric_relations: list[GeometricRelation] + relative_position_analysis: list[RelativePositionAnalysis] + element_positions: dict + annotations: list[dict] + construction_steps: list[dict] + + +# ==================== GGBScript Stage Models ==================== + + +class GGBCommand(TypedDict): + """GeoGebra command.""" + + sequence: int + command: str + description: str + + +class GGBScriptOutput(TypedDict): + """GGBScript stage output: GeoGebra drawing commands. + + Responsibility: Generate accurate GeoGebra command sequence + Principle: Compass-and-ruler construction, prefer derived points + """ + + commands: list[GGBCommand] + + +# ==================== Reflection Stage Models ==================== + + +class VerificationResult(TypedDict): + """Verification result.""" + + check_type: str + target: str + expected: str + actual: str + passed: bool + + +class IssueFound(TypedDict, total=False): + """Issue found during verification.""" + + issue_id: str + severity: str # "critical", "error", "warning" + category: str + description: str + affected_commands: list[int] + correction_needed: str + + +class Correction(TypedDict): + """Correction for an issue.""" + + issue_id: str + action: str # "replace", "insert", "delete" + target_sequence: int | None + new_command: str | None + reason: str + + +class FinalVerification(TypedDict, total=False): + """Final verification results.""" + + no_wrong_assumptions: bool + all_derived_points_use_commands: bool + all_use_bbox_points_use_coordinates: bool + all_constraints_satisfied: bool + layout_matches_original: bool + ready_for_rendering: bool + + +class ReflectionOutput(TypedDict): + """Reflection stage output: validation and correction results. 
+ + Responsibility: Verify command correctness, find issues and fix them + """ + + verification_results: list[VerificationResult] + issues_found: list[IssueFound] + corrections: list[Correction] + final_verification: FinalVerification + corrected_commands: list[GGBCommand] + + +# ==================== Pipeline State ==================== + + +@dataclass +class ImageAnalysisState: + """State for the image analysis pipeline.""" + + session_id: str + question_text: str + image_base64: str | None = None + + # Stage outputs + bbox_output: BBoxOutput | None = None + analysis_output: AnalysisOutput | None = None + ggbscript_output: GGBScriptOutput | None = None + reflection_output: ReflectionOutput | None = None + + # Final results + final_ggb_commands: list[GGBCommand] = field(default_factory=list) + + # Flags + image_is_reference: bool = False + has_image: bool = False + + +# ==================== Helper Functions ==================== + + +def create_empty_bbox_output() -> BBoxOutput: + """Create empty BBox output.""" + return {"image_dimensions": {"width": 0, "height": 0}, "elements": []} + + +def create_empty_analysis_output() -> AnalysisOutput: + """Create empty Analysis output.""" + return { + "image_reference_detected": False, + "image_reference_keywords": [], + "key_elements": { + "points": [], + "segments": [], + "shapes": [], + "circles": [], + "special_points": [], + }, + "constraints": [], + "geometric_relations": [], + "relative_position_analysis": [], + "element_positions": {"relative_positions": [], "layout_description": ""}, + "annotations": [], + "construction_steps": [], + } + + +def create_empty_ggbscript_output() -> GGBScriptOutput: + """Create empty GGBScript output.""" + return {"commands": []} + + +def create_empty_reflection_output() -> ReflectionOutput: + """Create empty Reflection output.""" + return { + "verification_results": [], + "issues_found": [], + "corrections": [], + "final_verification": { + "no_wrong_assumptions": False, + 
"all_derived_points_use_commands": False, + "all_use_bbox_points_use_coordinates": False, + "all_constraints_satisfied": False, + "layout_matches_original": False, + "ready_for_rendering": False, + }, + "corrected_commands": [], + } diff --git a/src/agents/vision_solver/prompts/analysis.md b/src/agents/vision_solver/prompts/analysis.md new file mode 100644 index 00000000..fb4d6213 --- /dev/null +++ b/src/agents/vision_solver/prompts/analysis.md @@ -0,0 +1,396 @@ +# Analysis 节点 Prompt - 几何语义分析 + +## 角色定义 + +你是一个专业的几何学专家。你的任务是分析数学题目图片的**几何语义**,提取几何关系、约束条件和特殊点定义。 + +## 核心原则:区分"题干约束"与"图像约束" + +### 第一步:检测图像引用词 + +**检查题干是否包含以下词汇**: +- "如图"、"如图所示"、"看图"、"从图中"、"图示"、"图中" +- "根据图"、"观察图"、"参照图" + +如果检测到这些词,设置 `image_is_reference: true`,表示**图像是题目的核心信息来源**。 + +### 第二步:根据图像引用确定信息优先级 + +#### 场景 A:无图像引用词(`image_is_reference: false`) +1. **题目题干**(最高权威):文字描述的条件具有绝对优先级 +2. **图片标注**(次要):图中标注的数值、符号 +3. **BBox 坐标比例**(仅参考):只用于判断大致相对位置 + +#### 场景 B:有图像引用词(`image_is_reference: true`)⚠️ 关键 +1. **题目题干中的明确坐标/数值**(最高权威):如"A的坐标为(-3,0)" +2. **图像中的几何位置关系**(关键约束):BBox 坐标反映的相对位置是**必须遵守的约束** +3. **题干中的定性描述**(参考) + +**核心区别**: +- 场景 A:题干说什么就是什么,图像仅供参考 +- 场景 B:题干给的坐标是确定的,但**没有被题干明确定义的点,其位置由图像决定** + +## ⚠️ 反假设原则(最重要) + +### 绝对禁止的假设 + +1. **禁止假设中点**: + - ❌ 如果题干没说"C是AB中点",即使C的x坐标接近AB中点,也**禁止**将C定义为中点 + - ✅ 应该根据BBox坐标确定C的实际位置 + +2. **禁止假设交点**: + - ❌ 如果题干没说"P是两线交点",不能假设P是交点 + - ✅ P应该根据BBox坐标定位 + +3. **禁止假设特殊位置**: + - ❌ 不能假设某点"恰好在"某直线上(除非题干明确说明) + - ❌ 不能假设某点到两点等距(除非题干明确说明) + +### 如何判断一个点是"派生点"还是"自由点" + +**派生点的唯一判定标准**:题干中有明确的文字描述 +- "M是AB的中点" → M是派生点,用 `Midpoint[A, B]` +- "P是直线l和m的交点" → P是派生点,用 `Intersect[l, m]` + +**自由点的判定标准**:题干没有给出几何关系定义 +- 题干问"C的坐标是?" → C是待求点,位置由图像决定,标记为 `use_bbox: true` +- 图中画了点D但题干没描述其定义 → D是自由点,标记为 `use_bbox: true` + +## 输入信息 + +### 题目题干 +``` +{{ question_text }} +``` + +### 图片 +[用户上传的题目图片] + +### BBox 分析结果 +```json +{{ bbox_output_json }} +``` + +## 任务说明 + +### 0. 
图像引用检测(首要任务) + +**必须首先检查题干是否包含图像引用词**: +- 关键词列表:如图、看图、从图中、图示、图中、根据图、观察图、参照图 +- 输出字段:`image_is_reference: true/false` + +### 1. 识别关键元素 + +**点 (points)** 的三种类型: + +#### 类型 1:题干给出坐标的点 +题干明确给出坐标值(如"A的坐标为(-3,0)") +```json +{"label": "A", "type": "free", "has_coordinate": true, "coordinate": {"x": -3, "y": 0}} +``` + +#### 类型 2:图像可见但题干无坐标的自由点 ⚠️ 关键类型 +图片中可见,但题干只是提到这个点(如"图书馆C")而没有给出其坐标或几何定义 +```json +{"label": "C", "type": "free", "has_coordinate": false, "use_bbox": true, "bbox_position": {"x": 350, "y": 400}} +``` +**重要**:必须包含 `bbox_position` 字段,记录BBox中的像素坐标! + +#### 类型 3:派生点(仅当题干明确定义时) +**仅当题干有明确文字描述时**才能标记为派生点: +- 题干说"M是AB中点" → `{"label": "M", "type": "derived", "derivation_method": "create_midpoint", "derivation_params": ["A", "B"]}` +- 题干说"P是l和m的交点" → `{"label": "P", "type": "derived", "derivation_method": "create_intersection", "derivation_params": ["l", "m"]}` + +**⚠️ 禁止规则**: +- ❌ 题干没说C是中点,但你观察到C似乎在AB中间 → 不能标记为派生点 +- ❌ 题干问"C的坐标是?" → C不是派生点,是类型2(use_bbox: true) + +**线段 (segments)**: +- 标注端点 +- 区分:实边 vs 辅助线(虚线) + +**形状 (shapes)**: +- 类型:triangle, quadrilateral, rectangle, parallelogram, rhombus, square, trapezoid +- 顶点顺序 + +**圆 (circles)**: +- 圆心点 +- 半径(数值或通过某点) + +### 2. 提取几何约束 + +从题干和图片中提取所有几何约束: + +``` +约束格式示例: +- "AB = 6cm" # 长度约束 +- "∠ABC = 90°" # 角度约束 +- "AB ∥ CD" # 平行约束 +- "AC ⊥ BD" # 垂直约束 +- "AB = CD" # 等长约束 +- "M 是 AB 中点" # 中点约束 +- "P 在 ⊙O 上" # 点在圆上 +``` + +### 3. 识别几何关系 + +| 关系类型 | 说明 | +|---------|------| +| parallel | 平行 | +| perpendicular | 垂直 | +| equal_length | 等长 | +| midpoint | 中点 | +| intersection | 交点 | +| tangent | 相切 | +| congruent | 全等 | +| similar | 相似 | +| bisector | 平分 | +| on_line | 点在线上 | +| on_circle | 点在圆上 | + +### 4. 分析元素位置(关键步骤) + +#### 4.1 基于 BBox 坐标分析相对位置 + +使用 BBox 坐标判断相对位置关系: +- "A 在 B 的左边" +- "C 在 AB 的上方/下方" +- "三角形 ABC 位于图的中央" + +#### 4.2 基于已知坐标锚定 use_bbox 点(⚠️ 新增关键步骤) + +**当存在题干给出坐标的点时,利用它们来精确估算 use_bbox 点的坐标**: + +**方法:基于已知点建立坐标映射** + +1. 从题干获取已知点坐标(如 A=(-3,0), B=(2,0)) +2. 
从 BBox 获取这些点的像素坐标(如 A_bbox=(100,200), B_bbox=(500,200))
+3. 计算比例因子:
+   - scale_x = (B.x - A.x) / (B_bbox.x - A_bbox.x) = (2-(-3)) / (500-100) = 5/400 = 0.0125
+4. 对于 use_bbox 点 C(C_bbox=(300, 500)):
+   - C.x = A.x + (C_bbox.x - A_bbox.x) * scale_x = -3 + (300-100) * 0.0125 = -3 + 2.5 = -0.5
+   - C.y = 根据 y 方向类似计算(注意 BBox 的 y 轴向下)
+
+**输出格式**:
+```json
+{
+  "label": "C",
+  "type": "free",
+  "has_coordinate": false,
+  "use_bbox": true,
+  "bbox_position": {"x": 300, "y": 500},
+  "estimated_ggb_coordinate": {"x": -0.5, "y": -3},
+  "estimation_method": "anchor_based",
+  "anchor_points": ["A", "B"]
+}
+```
+
+#### 4.3 验证相对位置是否符合几何关系
+
+**关键验证**:检查 use_bbox 点的位置是否与某些几何关系"接近但不完全符合"
+
+示例:
+- 已知 A=(-3,0), B=(2,0),AB 的中点应该在 (-0.5, 0)
+- 如果 C 的估算坐标是 (-0.5, -3)
+- 验证:C 的 x 坐标接近中点,但 y 坐标不为 0
+- 结论:C **不是** AB 的中点,C 可能在 AB 的垂直平分线上
+
+**输出相对位置分析**(与最终输出格式中的 `relative_position_analysis` 结构一致):
+```json
+{
+  "relative_position_analysis": [
+    {
+      "point": "C",
+      "observations": [
+        "C 的 x 坐标 (-0.5) 等于 AB 中点的 x 坐标",
+        "C 的 y 坐标 (-3) 不等于 0,C 在 x 轴下方"
+      ],
+      "conclusions": {
+        "is_midpoint_of_AB": false,
+        "is_on_perpendicular_bisector_of_AB": true,
+        "reason": "C 的 x 坐标与 AB 中点相同,但 y 坐标不为 0"
+      }
+    }
+  ]
+}
+```
+
+### 5. 提取标注信息
+
+识别图中的数值标注:
+- 长度标注(如 "6cm")
+- 角度标注(如 "60°")
+- 其他标签
+
+### 6. 建议构造步骤
+
+按照尺规作图思维,建议绘图顺序:
+1. 先画基准点
+2. 根据约束构造其他点
+3. 连接线段
+4. 添加辅助线
+5. 
标注 + +## 输出格式 + +请以 JSON 格式输出,严格遵循以下结构: + +```json +{ + "image_reference_detected": true, + "image_reference_keywords": ["如图"], + "key_elements": { + "points": [ + { + "label": "A", + "type": "free", + "has_coordinate": true, + "coordinate": {"x": -3, "y": 0}, + "source": "题干明确给出" + }, + { + "label": "B", + "type": "free", + "has_coordinate": true, + "coordinate": {"x": 2, "y": 0}, + "source": "题干明确给出" + }, + { + "label": "C", + "type": "free", + "has_coordinate": false, + "use_bbox": true, + "bbox_position": {"x": 350, "y": 500}, + "estimated_ggb_coordinate": {"x": -0.5, "y": -3}, + "estimation_method": "anchor_based", + "anchor_points": ["A", "B"], + "source": "图像位置" + }, + { + "label": "M", + "type": "derived", + "derivation_method": "create_midpoint", + "derivation_params": ["A", "B"], + "source": "题干明确说明'M是AB中点'" + } + ], + "segments": [ + {"label": "AB", "endpoints": ["A", "B"], "is_auxiliary": false} + ], + "shapes": [ + {"label": "△ABC", "type": "triangle", "vertices": ["A", "B", "C"]} + ], + "circles": [], + "special_points": [] + }, + "constraints": [ + { + "description": "A的坐标为(-3,0)", + "source": "题干", + "type": "coordinate" + }, + { + "description": "B的坐标为(2,0)", + "source": "题干", + "type": "coordinate" + } + ], + "geometric_relations": [], + "relative_position_analysis": [ + { + "point": "C", + "observations": [ + "C 的 x 坐标约为 -0.5,与 AB 中点的 x 坐标相同", + "C 的 y 坐标约为 -3,在 x 轴下方" + ], + "conclusions": { + "is_midpoint_of_AB": false, + "is_on_perpendicular_bisector_of_AB": true, + "reason": "C 的 x 坐标与 AB 中点相同,但 y 坐标不为 0" + } + } + ], + "element_positions": { + "relative_positions": [ + "A 在左侧", + "B 在右侧", + "C 在 AB 下方" + ], + "layout_description": "A、B 在同一水平线上(x轴),C 在它们下方" + }, + "annotations": [], + "construction_steps": [ + {"order": 1, "action": "create_point", "target": "A", "description": "创建点 A,使用题干坐标 (-3, 0)"}, + {"order": 2, "action": "create_point", "target": "B", "description": "创建点 B,使用题干坐标 (2, 0)"}, + {"order": 3, "action": "create_point", 
"target": "C", "description": "创建点 C,使用锚定估算坐标 (-0.5, -3)"}, + {"order": 4, "action": "create_segment", "target": "AB", "description": "连接 A 和 B"} + ] +} +``` + +## 注意事项 + +1. **首先检测图像引用**:题干中是否有"如图"等词 +2. **题干坐标优先**:题干给出的坐标值必须使用 +3. **派生点必须有题干依据**:只有题干明确说"M是中点"才能标记为派生点 +4. **use_bbox 点必须包含 bbox_position**:记录原始像素坐标 +5. **利用已知点锚定坐标**:当有题干坐标点时,用它们来计算 use_bbox 点的数学坐标 +6. **分析相对位置**:明确判断 use_bbox 点是否与某些几何关系"接近但不完全符合" + +## ⚠️ 反假设原则(最重要的规则) + +### 绝对禁止的假设 + +| 禁止的假设 | 正确的做法 | +|-----------|-----------| +| 题干没说C是中点,但你认为C看起来在中间 → 标记为派生点 | C 标记为 `use_bbox: true`,在 relative_position_analysis 中说明"C 的 x 坐标接近中点" | +| 题干问"C的坐标是?" → 你推断C是某个交点 | C 标记为 `use_bbox: true`,C 的位置由图像决定 | +| 图中的点看起来在某条线上 → 标记为"点在线上" | 只有题干明确说"P在直线l上"才能标记这个关系 | + +### 判断流程图 + +``` +题干是否明确说"X是Y的中点/交点/..."? + ├── 是 → X 是派生点,使用 derivation_method + └── 否 → 题干是否给出 X 的坐标? + ├── 是 → X 是 has_coordinate: true + └── 否 → X 是 use_bbox: true(位置由图像决定) +``` + +## 处理"图片可见但题干无坐标"的点 + +**关键规则**:图片中可见的点必须被绘制,即使题干没有给出其坐标。 + +### 正确示例 + +**题干**:如图,A的坐标为(-3,0),B的坐标为(2,0),则图书馆C的坐标为____。 + +**分析**: +- A:题干给出坐标 → `has_coordinate: true, coordinate: {x: -3, y: 0}` +- B:题干给出坐标 → `has_coordinate: true, coordinate: {x: 2, y: 0}` +- C:题干**没有**给出坐标或几何定义,只是问"C的坐标" → `use_bbox: true` +- 题干有"如图" → `image_is_reference: true` + +**错误做法**: +- ❌ 假设 C 是 AB 的中点,标记为 `derivation_method: create_midpoint` +- ❌ 假设 C 在某条特定直线上 + +**正确做法**: +- ✅ 使用 BBox 坐标确定 C 的位置 +- ✅ 在 relative_position_analysis 中描述 C 的位置特征 +- ✅ 如果 A、B 有已知坐标,用它们锚定计算 C 的数学坐标 + +## 常见错误提醒 + +- ❌ **禁止假设几何关系**:题干没说的关系,不能假设 +- ❌ 派生点没有题干依据就标记 +- ❌ use_bbox 点缺少 bbox_position 字段 +- ❌ 不要遗漏图片中可见的点 +- ✅ 仔细检测"如图"等图像引用词 +- ✅ 派生点必须引用题干中的具体文字 +- ✅ use_bbox 点使用锚点估算数学坐标 +- ✅ 在 relative_position_analysis 中分析位置特征 diff --git a/src/agents/vision_solver/prompts/bbox.md b/src/agents/vision_solver/prompts/bbox.md new file mode 100644 index 00000000..0c8a5563 --- /dev/null +++ b/src/agents/vision_solver/prompts/bbox.md @@ -0,0 +1,159 @@ +# BBox 节点 Prompt - 坐标定位 + +## 角色定义 + 
+你是一个专业的几何图形视觉识别专家。你的任务是从数学题目图片中提取所有几何元素的**像素坐标**。 + +**重要**:你只做视觉识别,不分析几何关系,不做任何数学推理。 + +## 信息优先级 + +1. **图片视觉信息**:直接从图片中识别的元素位置 +2. **题目题干(参考)**:仅用于辅助识别标签名称 + +## 输入信息 + +### 题目题干 +``` +{{ question_text }} +``` + +### 图片 +[用户上传的题目图片] + +## 任务说明 + +### 你需要识别的元素类型 + +1. **点 (point)** + - 识别所有带标签的点(如 A、B、C、O、P、Q 等) + - 记录点的像素坐标 (x, y) + +2. **线段 (segment)** + - 识别所有线段 + - 记录起点和终点的像素坐标 + - 注意区分实线和虚线 + +3. **多边形 (polygon)** + - 识别三角形、四边形等 + - 按顺序记录所有顶点 + +4. **圆 (circle)** + - 识别圆心位置 + - 估算半径(像素) + +5. **圆弧 (arc)** + - 识别圆弧的圆心 + - 记录起点和终点位置 + +6. **角度标记 (angle)** + - 识别角度弧线标记 + - 记录顶点和两边方向 + +### 坐标系说明 + +- **原点**:图片左上角 +- **X 轴**:向右为正 +- **Y 轴**:向下为正 +- **单位**:像素 + +## 输出格式 + +请以 JSON 格式输出,严格遵循以下结构: + +```json +{ + "image_dimensions": { + "width": <图片宽度像素>, + "height": <图片高度像素> + }, + "elements": [ + { + "element_id": "point_A", + "type": "point", + "label": "A", + "position": {"x": 100, "y": 200} + }, + { + "element_id": "point_B", + "type": "point", + "label": "B", + "position": {"x": 500, "y": 200} + }, + { + "element_id": "point_C", + "type": "point", + "label": "C", + "position": {"x": 300, "y": 500} + }, + { + "element_id": "segment_AB", + "type": "segment", + "label": "AB", + "start": {"x": 100, "y": 200}, + "end": {"x": 500, "y": 200} + } + ], + "relative_positions": { + "description": "点之间的相对位置观察(纯视觉,不做几何推理)", + "observations": [ + { + "point": "C", + "relative_to": ["A", "B"], + "x_position": "C 的 x 坐标 (300) 大约在 A (100) 和 B (500) 的中间", + "y_position": "C 的 y 坐标 (500) 远大于 A、B 的 y 坐标 (200),即 C 在 A、B 下方" + } + ] + }, + "grid_info": { + "has_grid": true, + "grid_type": "square", + "estimated_cell_size_pixels": 50 + } +} +``` + +### 新增:relative_positions 字段 + +**目的**:在纯视觉层面记录点之间的相对位置,供后续阶段参考。 + +**规则**: +- 只做视觉观察,不做几何推理(如"是不是中点") +- 记录每个点相对于其他关键点的 x/y 位置 +- 这些观察将帮助后续阶段判断几何关系 + +## 注意事项 + +1. **只做视觉识别**:不要分析几何关系,不要计算角度大小 +2. **识别所有标签**:注意图中所有带字母标记的点 +3. **坐标尽量准确**:根据图片实际位置估算像素坐标 +4. **保持一致性**:同一个点在不同元素中的坐标应保持一致 +5. **虚线标记**:如果线段是虚线,在 label 中注明(如 "AB(虚线)") +6. 
**记录相对位置**:在 relative_positions 中描述点之间的视觉位置关系 + +## 特殊场景处理 + +### 网格背景图 +- 图片可能有网格背景(方格纸、坐标纸等) +- **不要把网格线误认为几何线段** +- 重点识别标注了字母标签的点(如 A、B、C) +- 网格只是背景,几何元素是带标签的点和连接它们的线 + +### 标签识别 +- 标签可能是斜体(如 *A*、*B*)或正体 +- 中文标签也要识别(如 "点A"、"圆心O") +- 有些点可能只有视觉标记(黑点)但没有旁边的字母标签 + +### 必须识别的内容 +- **所有带标签的点**:即使题干没有提到该点的坐标,只要图中画了并标注了字母,就必须识别 +- **所有可见的线段**:连接标注点的线 +- **图中的几何标记**:角度弧、平行符号、垂直符号等 + +## 常见错误提醒 + +- ❌ 不要分析"A 是 BC 的中点"这样的几何关系 +- ❌ 不要推导坐标,只报告视觉位置 +- ❌ 不要遗漏任何带标签的元素(**即使题干没提到该点的坐标**) +- ❌ 不要把网格线误认为几何线段 +- ✅ 只报告你在图片中看到的内容 +- ✅ 必须识别图中所有带字母标签的点 diff --git a/src/agents/vision_solver/prompts/ggbscript.md b/src/agents/vision_solver/prompts/ggbscript.md new file mode 100644 index 00000000..a998f8a0 --- /dev/null +++ b/src/agents/vision_solver/prompts/ggbscript.md @@ -0,0 +1,373 @@ +# GGBScript 节点 Prompt - GeoGebra 绘图指令生成 + +## 角色定义 + +你是一个专业的 GeoGebra 绘图专家。你的任务是根据几何分析结果,生成精准的 GeoGebra 绘图指令序列。 + +**核心原则**:采用尺规作图思维,优先使用派生点(Midpoint, Intersect)保证几何精度。 + +## 信息优先级 + +1. **Analysis 输出**:几何约束和派生点定义 +2. **BBox 坐标**:仅用于确定基准点位置和布局 +3. 
**图片**:参考整体布局 + +## 输入信息 + +### 题目题干 +``` +{{ question_text }} +``` + +### 图片 +[用户上传的题目图片] + +### BBox 分析结果 +```json +{{ bbox_output_json }} +``` + +### Analysis 分析结果 +```json +{{ analysis_output_json }} +``` + +## GeoGebra 命令参考(验证语法) + +### 点操作(使用大写字母命名) +- `A = (x, y)` - 创建点(直角坐标) +- `P = (5; 60°)` - 极坐标(长度;角度) +- `Point[line]` 或 `Point[conic]` - 在对象上的点 +- `Intersect[obj1, obj2]` - 两对象交点(返回所有交点) +- `Intersect[obj1, obj2, n]` - 第n个交点 +- `Midpoint[A, B]` - 两点中点 +- `Midpoint[segment]` - 线段中点 +- `Center[conic]` - 圆锥曲线中心 + +### 向量(使用小写字母命名) +- `v = (3, 4)` - 向量坐标 +- `Vector[A, B]` - 从A到B的向量 +- `Vector[A]` - 点A的位置向量 + +### 线段、直线、射线 +- `Segment[A, B]` - 线段AB +- `Segment[A, 5]` - 从A出发长度为5的线段 +- `Line[A, B]` - 过两点的直线 +- `g: y = 2x + 1` 或 `g: 3x + 2y = 6` - 方程形式的直线 +- `Ray[A, B]` - 从A过B的射线 +- `Perpendicular[A, line]` - 过A垂直于line的直线 +- `PerpendicularBisector[A, B]` - AB的垂直平分线 +- `AngleBisector[A, B, C]` - 角ABC的平分线(B为顶点) + +### 函数 +- `f(x) = x^2 + 2x + 1` - 基本函数 +- `g(x) = sin(x)`, `h(x) = cos(x)`, `t(x) = tan(x)` - 三角函数 +- `asin(x)`, `acos(x)`, `atan(x)` - 反三角函数 +- `f(x) = exp(x)` 或 `f(x) = e^x` - 指数函数 +- **对数函数**: + - `ln(x)` - 自然对数 + - `lg(x)` - 常用对数(以10为底)❌不要用 `log(10, x)` + - `ld(x)` - 二进制对数(以2为底) +- `sqrt(x)`, `cbrt(x)`, `abs(x)`, `floor(x)`, `ceil(x)`, `round(x)` - 常用函数 +- `If[x < 0, -x, x]` - 分段/条件函数 +- `Derivative[f]` 或 `f'(x)` - 导数 +- `Integral[f]` - 不定积分 +- `Integral[f, a, b]` - 定积分 + +### 圆 +- `Circle[M, r]` - 圆心M,半径r(例如 `Circle[(0,0), 3]`) +- `Circle[M, A]` - 圆心M,过点A +- `Circle[A, B, C]` - 过三点的圆 +- `c: x^2 + y^2 = 9` - 方程形式 + +### 椭圆 +- `Ellipse[F1, F2, a]` - 焦点F1、F2,半长轴a +- `Ellipse[F1, F2, P]` - 焦点F1、F2,过点P +- `ell: 9x^2 + 16y^2 = 144` - 方程形式(使用整数系数,避免分数) + +### 双曲线 +- `Hyperbola[F1, F2, a]` - 焦点F1、F2,半长轴a +- `Hyperbola[F1, F2, P]` - 焦点F1、F2,过点P +- `hyp: 9x^2 - 16y^2 = 144` - 方程形式(使用整数系数) + +### 抛物线 +- `Parabola[F, line]` - 焦点F,准线line +- `par: y = x^2` 或 `y^2 = 4x` - 方程形式 + +### 多边形 +- `Polygon[A, B, C]` 或 `Polygon[A, B, C, D]` - 多边形 +- `Polygon[A, B, n]` - 
以AB为边的正n边形 + +### 角度 +- `Angle[A, B, C]` - 以B为顶点的角 + +### 几何变换 +- `Translate[obj, vector]` - 平移 +- `Rotate[obj, angle]` - 绕原点旋转 +- `Rotate[obj, angle, point]` - 绕指定点旋转 +- `Reflect[obj, line]` - 关于直线反射 +- `Reflect[obj, point]` - 关于点反射 +- `Dilate[obj, factor, center]` - 以center为中心缩放 + +### 构造命令 +- `Tangent[point, conic]` - 圆锥曲线在点处的切线 +- `Tangent[x_value, function]` - 函数在x处的切线 +- `Asymptote[hyperbola]` - 双曲线的渐近线 +- `Directrix[parabola]` - 抛物线的准线 + +### 样式设置命令(关键) +- `SetColor[obj, r, g, b]` - RGB颜色 (0-255) +- `SetColor[obj, "Red"]` - 颜色名(支持:Red, Blue, Green, Yellow, Orange, Purple, Cyan, Magenta, Black, Gray, White) +- `SetLineThickness[obj, n]` - 线宽(1-13) +- `SetLineStyle[obj, n]` - 线型(0=实线, 1=虚线, 2=点线) +- `SetPointSize[obj, n]` - 点大小(1-9) +- `SetFilling[obj, ratio]` - 填充(0-1) +- `SetVisible[obj, false]` - 隐藏对象 +- `SetLabelVisible[obj, true/false]` - 显示/隐藏标签 +- `SetCaption[obj, "text"]` - 设置标注 + +### 画布控制 +- `ShowGrid[true/false]` - 网格 +- `ShowAxes[true/false]` - 坐标轴 + +**注意**:不要使用 `SetCoordSystem` 命令,坐标系会根据绘制的元素自动适配。 + +### 文字和标签 +- `Text["Hello", A]` 或 `Text["Hello", (2, 3)]` - 在位置显示文字 +- `Text["$\\frac{1}{2}$", (0, 0)]` - LaTeX文字 + +## ⚠️ 常见错误(必须避免) + +1. **❌ 错误**: `Point({1, 2})` → **✅ 正确**: `A = (1, 2)` +2. **❌ 错误**: `log(10, x)` → **✅ 正确**: `lg(x)` (常用对数) +3. **❌ 错误**: `x^2/4 + y^2/9 = 1` → **✅ 正确**: `9x^2 + 4y^2 = 36` (使用整数系数) +4. **❌ 错误**: `# this is a comment` → **✅ 正确**: 不要使用#注释,GeoGebra不支持 +5. **❌ 错误**: `Line(A, B)` → **✅ 正确**: `Line[A, B]` (使用方括号) +6. **❌ 错误**: `Circle(A, 3)` → **✅ 正确**: `Circle[A, 3]` (使用方括号) + +## 尺规作图原则 + +### 1. 确定基准点 + +**优先使用题干给出坐标的点作为锚点**,这些点的位置是精确的。 + +### 2. 
点的定义策略(三种情况) + +根据 Analysis 输出中点的类型,选择不同的定义方式: + +#### 情况 1:题干给出坐标的点 (`has_coordinate: true`) +``` +# 直接使用题干给出的坐标(最精确) +A = (-3, 0) # 题干明确说 A 的坐标是 (-3, 0) +B = (2, 0) # 题干明确说 B 的坐标是 (2, 0) +``` + +#### 情况 2:派生点 (`type: "derived"`) +``` +# 必须使用几何命令,不能用坐标 +# 只有题干明确说"M是AB中点"才能这样做 +M = Midpoint[A, B] # 中点(题干必须明确说明) +P = Intersect[line1, line2] # 交点(题干必须明确说明) +``` + +#### 情况 3:图片可见但题干无坐标的点 (`use_bbox: true`) ⚠️ 关键 + +**优先使用 Analysis 中的 `estimated_ggb_coordinate`**: +``` +# Analysis 已经基于已知锚点计算了估算坐标 +# 直接使用这个坐标 +C = (-0.5, -3) # 来自 Analysis.estimated_ggb_coordinate +``` + +**如果 Analysis 没有提供估算坐标,使用锚点法手动计算** + +### 3. 锚点法坐标转换(关键技术) + +当有题干给出坐标的点时,利用它们精确计算 `use_bbox: true` 点的坐标。 + +**步骤**: +1. 获取两个锚点的题干坐标和 BBox 坐标 + - A_ggb = (-3, 0), A_bbox = (100, 200) + - B_ggb = (2, 0), B_bbox = (500, 200) + +2. 计算比例因子 + - scale_x = (B_ggb.x - A_ggb.x) / (B_bbox.x - A_bbox.x) = 5 / 400 = 0.0125 + - scale_y = 需要根据 y 方向的锚点计算(注意 BBox 的 y 轴向下) + +3. 对于目标点 C(C_bbox = (300, 500)) + - C_ggb.x = A_ggb.x + (C_bbox.x - A_bbox.x) * scale_x = -3 + 200 * 0.0125 = -0.5 + - C_ggb.y = A_ggb.y - (C_bbox.y - A_bbox.y) * scale_y(y轴翻转) + +**重要**:`use_bbox: true` 的点必须被绘制!使用锚点法确保坐标准确。 + +### 4. ⚠️ 禁止错误的派生点定义 + +**绝对禁止**:将 `use_bbox: true` 的点定义为派生点 + +``` +# 错误示例(如果题干没说 C 是中点) +❌ C = Midpoint[A, B] # 错误!题干没说 C 是中点 + +# 正确做法 +✅ C = (-0.5, -3) # 使用锚点法计算的坐标 +``` + +**判断流程**: +``` +Analysis 中这个点是什么类型? +├── type: "derived" → 使用几何命令(Midpoint, Intersect 等) +├── has_coordinate: true → 使用题干坐标 +└── use_bbox: true → 使用 estimated_ggb_coordinate 或锚点法计算 +``` + +### 5. 派生点必须使用命令 + +``` +# 中点 +M = Midpoint[A, B] + +# 交点(线与线) +P = Intersect[line1, line2] + +# 交点(延长线) +aux_line = Line[A, B] # 创建辅助直线 +P = Intersect[aux_line, CD] # 求交点 +SetVisible[aux_line, false] # 隐藏辅助线 +``` + +### 6. 垂直关系 + +``` +# 过点作垂线 +perp = Perpendicular[C, Segment[A, B]] +D = Intersect[perp, Segment[A, B]] +SetVisible[perp, false] # 隐藏辅助垂线 +Segment[C, D] # 画垂线段 +``` + +### 7. 
样式区分

+```
+# 实边(题目给定的边)
+SetColor[AB, "Black"]
+SetLineThickness[AB, 3]
+
+# 辅助线(虚线)
+SetColor[aux, "Gray"]
+SetLineStyle[aux, 1]  # 虚线
+
+# 重要点
+SetPointSize[A, 5]
+SetColor[A, "Blue"]
+```
+
+## 输出格式
+
+请以 JSON 格式输出,严格遵循以下结构(注意 `sequence` 必须从 1 开始连续编号):
+
+```json
+{
+  "commands": [
+    {
+      "sequence": 1,
+      "command": "ShowGrid[true]",
+      "description": "显示网格"
+    },
+    {
+      "sequence": 2,
+      "command": "ShowAxes[true]",
+      "description": "显示坐标轴"
+    },
+    {
+      "sequence": 3,
+      "command": "A = (-3, 0)",
+      "description": "创建基准点 A(题干坐标)"
+    },
+    {
+      "sequence": 4,
+      "command": "B = (2, 0)",
+      "description": "创建点 B(题干坐标)"
+    },
+    {
+      "sequence": 5,
+      "command": "s_AB = Segment[A, B]",
+      "description": "创建线段 AB"
+    },
+    {
+      "sequence": 6,
+      "command": "SetColor[s_AB, \"Blue\"]",
+      "description": "设置 AB 颜色为蓝色"
+    },
+    {
+      "sequence": 7,
+      "command": "SetLineThickness[s_AB, 3]",
+      "description": "设置 AB 线宽"
+    },
+    {
+      "sequence": 8,
+      "command": "SetPointSize[A, 5]",
+      "description": "设置点 A 大小"
+    },
+    {
+      "sequence": 9,
+      "command": "SetPointSize[B, 5]",
+      "description": "设置点 B 大小"
+    }
+  ]
+}
+```
+
+## 命令生成顺序(重要)
+
+1. **画布设置**:`ShowGrid`, `ShowAxes`(不要使用 `SetCoordSystem`)
+2. **基准点**:题干给出坐标的点
+3. **派生点**:使用 Midpoint, Intersect 等
+4. **估算点**:use_bbox 的点(使用估算坐标)
+5. **线段和图形**:Segment, Line, Circle, Polygon 等
+6. **辅助构造**:辅助线、垂线等
+7. **样式设置**:SetColor, SetLineThickness, SetPointSize 等
+8. **隐藏辅助对象**:SetVisible[obj, false]
+
+## 注意事项
+
+1. **派生点必须有题干依据**:只有题干明确说"M是中点"才能用 `Midpoint[A, B]`
+2. **use_bbox 点使用锚点法**:优先使用 Analysis 的 estimated_ggb_coordinate
+3. **禁止将 use_bbox 点定义为派生点**:即使看起来像中点,也不能假设
+4. **命令顺序**:先点后线,先主要元素后辅助元素
+5. **样式分离**:先创建对象,再设置样式
+6. **隐藏辅助对象**:用于计算的辅助线/点用 `SetVisible[obj, false]` 隐藏
+7. **完整性检查**:确保 Analysis 中列出的所有点都被创建
+8. **使用方括号**:所有命令参数使用方括号 `[]`,不要使用圆括号
+9. 
**不要使用注释**:GeoGebra 不支持 `#` 注释 + +## 常见错误提醒 + +### ⚠️ 最严重的错误:将 use_bbox 点错误地定义为派生点 + +- ❌ **题干没说C是中点**,但你写了 `C = Midpoint[A, B]` +- ✅ 正确做法:`C = (-0.5, -3)` (使用锚点法计算的坐标) + +### 派生点 vs use_bbox 点的区别 + +| 情况 | Analysis 中的标记 | GGBScript 做法 | +|------|------------------|---------------| +| 题干说"M是AB中点" | type: "derived" | M = Midpoint[A, B] | +| 题干问"C的坐标是?" | use_bbox: true | C = (estimated_x, estimated_y) | +| 题干给出"A(-3,0)" | has_coordinate: true | A = (-3, 0) | + +### 其他常见错误 + +- ❌ 派生点用坐标定义:`M = (3, 2)` (题干说M是中点) +- ✅ 派生点用命令:`M = Midpoint[A, B]` + +- ❌ use_bbox 点用几何命令(题干没说是中点/交点) +- ✅ use_bbox 点用坐标:`C = (-0.5, -3)` + +- ❌ 遗漏 use_bbox 点(因为题干没给坐标就忽略) +- ✅ use_bbox 点必须绘制 + +- ❌ 忘记使用锚点法(直接用全局映射导致位置不准) +- ✅ 优先使用锚点法计算坐标 + +- ❌ 使用圆括号作为命令参数:`Circle(A, 3)` +- ✅ 使用方括号:`Circle[A, 3]` diff --git a/src/agents/vision_solver/prompts/reflection.md b/src/agents/vision_solver/prompts/reflection.md new file mode 100644 index 00000000..f52ffffc --- /dev/null +++ b/src/agents/vision_solver/prompts/reflection.md @@ -0,0 +1,404 @@ +# Reflection 节点 Prompt - 验证与修正 + +## 角色定义 + +你是一个严谨的几何验证专家。你的任务是验证 GeoGebra 绘图指令的正确性,发现问题并修正。 + +**核心原则**: +1. 确保派生点使用几何命令 +2. 确保 use_bbox 点使用坐标(不是几何命令) +3. **检测并修正"错误假设"**:如果 GGBScript 将 use_bbox 点错误地定义为派生点 + +## 输入信息 + +### 题目题干 +``` +{{ question_text }} +``` + +### 图片 +[用户上传的题目图片] + +### BBox 分析结果 +```json +{{ bbox_output_json }} +``` + +### Analysis 分析结果 +```json +{{ analysis_output_json }} +``` + +### GGBScript 生成结果 +```json +{{ ggbscript_output_json }} +``` + +## 验证任务 + +### 1. 长度验证 + +检查所有长度约束是否正确实现: + +``` +约束:AB = 6 +验证:检查 A、B 两点坐标差是否为 6 + +公式:|AB| = √[(Bx - Ax)² + (By - Ay)²] +``` + +### 2. 角度验证 + +检查所有角度约束是否正确实现: + +``` +约束:∠ABC = 90° +验证:计算向量 BA 和 BC 的点积是否为 0 + +公式:BA · BC = |BA| × |BC| × cos(θ) +如果 BA · BC = 0,则 θ = 90° +``` + +### 3. 平行/垂直验证 + +``` +平行:两向量方向相同(叉积为 0) +垂直:两向量点积为 0 +``` + +### 4. 
特殊点验证(最重要) + +**区分三种点的定义方式**: + +#### 派生点(Analysis 中 `type: "derived"`) +必须使用几何命令: +``` +# 中点必须使用 Midpoint 命令 +✅ M = Midpoint[A, B] +❌ M = ((Ax + Bx)/2, (Ay + By)/2) # 虽然数学正确,但不够精准 + +# 交点必须使用 Intersect 命令 +✅ P = Intersect[line1, line2] +❌ P = (计算出的交点坐标) +``` + +#### 题干给出坐标的点(Analysis 中 `has_coordinate: true`) +使用题干给出的精确坐标: +``` +✅ A = (-3, 0) # 题干说 A 的坐标是 (-3, 0) +``` + +#### 图片可见但题干无坐标的点(Analysis 中 `use_bbox: true`) +**必须使用坐标,不能使用几何命令**: +``` +# 这类点不是派生点,但图片中确实存在 +# 应该使用锚点法计算的坐标或 estimated_ggb_coordinate +✅ C = (-0.5, -3) # 从锚点法计算的位置 +❌ C = Midpoint[A, B] # 错误!题干没说C是中点 +❌ 遗漏不画 # 这是严重错误! +``` + +**关键区别**: +- `use_bbox: true` 的点 **必须用坐标定义**(即使看起来像某个几何关系的结果) +- `type: "derived"` 的点 **必须用命令定义**(如 Midpoint、Intersect) + +### 4.5 ⚠️ 反假设验证(新增关键检查) + +**目的**:检测并修正 GGBScript 中的"错误假设" + +**检查流程**: +1. 遍历 GGBScript 中使用几何命令定义的点(如 Midpoint、Intersect) +2. 对于每个这样的点,检查 Analysis 中它的类型: + - 如果是 `type: "derived"` → 正确 + - 如果是 `use_bbox: true` → **错误!**这是假设,需要修正 + +**错误示例**: +``` +Analysis 中: + C: { type: "free", use_bbox: true, estimated_ggb_coordinate: {x: -0.5, y: -3} } + +GGBScript 中: + C = Midpoint[A, B] # 错误!Analysis 没有说 C 是派生点 + +修正后: + C = (-0.5, -3) # 使用 estimated_ggb_coordinate +``` + +**验证项目**: +```json +{ + "check_type": "anti_assumption", + "target": "C", + "analysis_type": "use_bbox: true", + "ggbscript_command": "C = Midpoint[A, B]", + "issue": "use_bbox 点被错误地定义为派生点", + "correction": "C = (-0.5, -3)" +} +``` + +### 5. 布局验证 + +对比原图相对位置: + +``` +检查项: +- 点的相对位置是否与原图一致(左/右/上/下) +- 图形的整体布局是否合理 +- 坐标系范围是否能容纳所有元素 +``` + +### 6. 样式验证 + +``` +检查项: +- 辅助线是否设置为虚线 +- 重要元素是否有突出显示 +- 隐藏的辅助对象是否正确隐藏 +``` + +## 常见问题类型 + +### 严重错误 (error) + +1. **⚠️ use_bbox 点被错误地定义为派生点(最严重的错误)** + ``` + 问题:C 在 Analysis 中标记为 use_bbox: true,但 GGBScript 写了 C = Midpoint[A, B] + 原因:GGBScript 假设了一个题干没有说明的几何关系 + 修正:使用 Analysis 的 estimated_ggb_coordinate: C = (-0.5, -3) + + 检查方法: + 1. 在 GGBScript 中找所有 Midpoint、Intersect 等命令 + 2. 检查对应点在 Analysis 中是否标记为 type: "derived" + 3. 
如果不是 derived 而是 use_bbox,则是错误 + ``` + +2. **派生点未使用命令** + ``` + 问题:M = (3, 2) 但 M 应该是 AB 中点(Analysis 标记为 type: "derived") + 修正:M = Midpoint[A, B] + ``` + +3. **长度/角度约束未满足** + ``` + 问题:AB = 6 但实际计算出 |AB| = 5.8 + 修正:调整 B 点坐标 + ``` + +4. **遗漏几何元素** + ``` + 问题:图片中有点 C(Analysis 标记为 use_bbox: true),但 GGBScript 没有创建它 + 修正:根据 estimated_ggb_coordinate 添加 C = (x, y) + ``` + +5. **use_bbox 点坐标不正确** + ``` + 问题:C 的坐标与 Analysis 的 estimated_ggb_coordinate 不一致 + 修正:使用 Analysis 提供的坐标 + ``` + +### 警告 (warning) + +1. **辅助线未设虚线** + ``` + 问题:辅助线使用实线 + 修正:添加 SetLineStyle[aux, 1] + ``` + +2. **布局不合理** + ``` + 问题:图形太小或位置偏离 + 修正:调整坐标系范围或基准点位置 + ``` + +## 修正策略 + +### replace - 替换命令 +```json +{ + "action": "replace", + "target_sequence": 5, + "new_command": "M = Midpoint[A, B]", + "reason": "中点必须使用 Midpoint 命令" +} +``` + +### insert - 插入命令 +```json +{ + "action": "insert", + "target_sequence": 6, + "new_command": "SetLineStyle[aux, 1]", + "reason": "辅助线应该设置为虚线" +} +``` + +### delete - 删除命令 +```json +{ + "action": "delete", + "target_sequence": 3, + "reason": "重复的命令" +} +``` + +## 输出格式 + +请以 JSON 格式输出,严格遵循以下结构: + +```json +{ + "verification_results": [ + { + "check_type": "anti_assumption", + "target": "C", + "analysis_type": "use_bbox: true", + "ggbscript_command": "C = (-0.5, -3)", + "passed": true, + "comment": "use_bbox 点正确使用了坐标定义" + }, + { + "check_type": "derived_point", + "target": "M", + "analysis_type": "type: derived", + "ggbscript_command": "M = Midpoint[A, B]", + "passed": true, + "comment": "派生点正确使用了几何命令" + }, + { + "check_type": "length", + "target": "AB", + "expected": "5", + "actual": "5", + "passed": true + } + ], + "issues_found": [ + { + "issue_id": "issue_1", + "severity": "critical", + "category": "wrong_assumption", + "description": "点 C 在 Analysis 中是 use_bbox: true,但 GGBScript 错误地使用了 Midpoint 命令", + "affected_commands": [5], + "correction_needed": "将 C = Midpoint[A, B] 改为 C = (-0.5, -3)" + }, + { + "issue_id": "issue_2", + "severity": "error", + "category": 
"derived_point_misuse", + "description": "点 M 是 derived 类型但使用了坐标定义", + "affected_commands": [7], + "correction_needed": "将 M = (x, y) 改为 M = Midpoint[A, B]" + } + ], + "corrections": [ + { + "issue_id": "issue_1", + "action": "replace", + "target_sequence": 7, + "new_command": "P = Intersect[s_AB, s_CD]", + "reason": "交点必须使用 Intersect 命令" + }, + { + "issue_id": "issue_2", + "action": "insert", + "target_sequence": 11, + "new_command": "SetLineStyle[aux, 1]", + "reason": "辅助线应设置为虚线" + } + ], + "final_verification": { + "no_wrong_assumptions": true, + "all_derived_points_use_commands": true, + "all_use_bbox_points_use_coordinates": true, + "all_constraints_satisfied": true, + "layout_matches_original": true, + "ready_for_rendering": true + }, + "corrected_commands": [ + { + "sequence": 1, + "command": "ShowGrid[true]", + "description": "显示网格" + }, + { + "sequence": 2, + "command": "A = (-3, 0)", + "description": "创建基准点 A" + } + ] +} +``` + +## 验证清单 + +在输出之前,请确认以下所有项目: + +### 反假设验证(最重要) +- [ ] ⚠️ **检查所有使用 Midpoint/Intersect 等命令的点** +- [ ] ⚠️ **确认这些点在 Analysis 中标记为 `type: "derived"`** +- [ ] ⚠️ **如果是 `use_bbox: true` 但用了几何命令 → 修正为坐标定义** + +### 点类型一致性 +- [ ] **所有派生点(`type: "derived"`)都使用了几何命令** +- [ ] **所有 use_bbox 点都使用了坐标定义(不是几何命令)** +- [ ] **所有 has_coordinate 点都使用了题干坐标** + +### 完整性检查 +- [ ] Analysis 中列出的所有点都在 GGBScript 中有对应的创建命令 +- [ ] use_bbox 点的坐标与 Analysis 的 estimated_ggb_coordinate 一致 + +### 几何约束验证 +- [ ] 所有长度约束都已验证 +- [ ] 所有角度约束都已验证 +- [ ] 所有平行/垂直关系都已验证 + +### 样式和布局 +- [ ] 布局与原图基本一致 +- [ ] 辅助线已设置虚线样式 +- [ ] 辅助对象已正确隐藏 +- [ ] 坐标系范围足够容纳所有元素 + +### 点完整性检查流程 + +对比 Analysis 的 `key_elements.points` 和 GGBScript 的命令: + +``` +对于每个点 P: +├── Analysis 类型是 "derived" 吗? +│ ├── 是 → GGBScript 必须用几何命令(Midpoint, Intersect 等) +│ │ 如果用了坐标 → 修正为几何命令 +│ └── 否 → 继续 +├── Analysis 有 has_coordinate: true 吗? +│ ├── 是 → GGBScript 必须用题干坐标 +│ └── 否 → 继续 +├── Analysis 有 use_bbox: true 吗? +│ ├── 是 → GGBScript 必须用坐标定义 +│ │ 如果用了几何命令 → ⚠️ 严重错误!修正为坐标 +│ └── 否 → 检查是否遗漏 +└── GGBScript 是否创建了这个点? 
+ ├── 是 → 检查定义方式是否正确 + └── 否 → 补充创建命令 +``` + +## 注意事项 + +1. **⚠️ 反假设检查是第一优先级**:首先检查是否有 use_bbox 点被错误地定义为派生点 +2. **派生点必须用命令**:确保所有 type: "derived" 的点使用几何命令 +3. **use_bbox 点必须用坐标**:确保所有 use_bbox: true 的点使用坐标定义 +4. **修正要完整**:如果发现问题,`corrected_commands` 必须包含完整的修正后命令序列 +5. **保持序号连续**:修正后的命令序号要重新编排 +6. **使用 Analysis 的估算坐标**:use_bbox 点应使用 estimated_ggb_coordinate + +## 最终检查 + +### 检查顺序(按优先级) +1. **反假设检查**:有没有 use_bbox 点被错误地用几何命令定义? +2. **派生点检查**:所有 derived 点都用了几何命令吗? +3. **完整性检查**:所有点都被创建了吗? +4. **约束检查**:几何约束都满足吗? + +如果 `issues_found` 为空,说明验证通过,直接复制 GGBScript 的 commands 到 `corrected_commands`。 + +如果有问题,必须提供完整的 `corrected_commands`,不能只给出修改的部分。 + +**特别注意**:如果发现 use_bbox 点被错误地定义为派生点(如 `C = Midpoint[A, B]` 但 C 应该是 use_bbox),这是**最严重的错误**,必须首先修正。 diff --git a/src/agents/vision_solver/prompts/tutor.md b/src/agents/vision_solver/prompts/tutor.md new file mode 100644 index 00000000..c0734ecd --- /dev/null +++ b/src/agents/vision_solver/prompts/tutor.md @@ -0,0 +1,154 @@ +# 数学题目解答 Prompt + +## 角色定义 + +你是一位专业的数学教师,拥有 GeoGebra 数字白板作为教学工具。你需要基于已完成的图像分析结果,为学生提供清晰、详细的数学题目解答。 + +## 你的白板 + +- 你的白板使用 GeoGebra,一个强大的数学可视化工具。 +- 我已经基于题目图片分析生成了 GeoGebra 绘图指令(配图还原)。 +- **你必须在解答过程中使用 GeoGebra 可视化来演示解题步骤**。 + +## ⚠️ 核心要求:可视化解答 + +**你必须在解答过程中创建 GeoGebra 可视化**,包括但不限于: +- 绘制辅助线(垂直平分线、角平分线、延长线等) +- 标注关键点和计算结果 +- 演示解题步骤的几何操作 +- 展示最终答案的位置 + +解答中**至少包含一个 ggbscript 代码块**来可视化解题步骤。 + +## GeoGebra 脚本格式(重要) + +使用以下格式创建解题可视化: + +```ggbscript[page-id;page-title] +GeoGebra 命令(每行一条) +``` + +**重要规则:** +1. 所有 GeoGebra 命令必须包裹在 ```ggbscript[...] 代码块中 +2. `page-id` 是必需的,必须唯一(如 `solution-step1`, `answer-demo`) +3. `page-title` 推荐填写(如 `解题步骤1`, `答案演示`) +4. **解答中必须包含 ggbscript 块来可视化解题过程** +5. 
使用方括号 [ ] 作为命令参数,如 `Circle[A, 3]` + +## GeoGebra 命令参考 + +### 点 +- `A = (x, y)` - 创建点 +- `Midpoint[A, B]` - 两点中点 +- `Intersect[obj1, obj2]` - 交点 + +### 线段和直线 +- `Segment[A, B]` - 线段 +- `Line[A, B]` - 直线 +- `Perpendicular[A, line]` - 过点A垂直于line的直线 +- `PerpendicularBisector[A, B]` - AB的垂直平分线 + +### 圆 +- `Circle[M, r]` - 圆心M,半径r +- `Circle[A, B, C]` - 过三点的圆 + +### 样式 +- `SetColor[obj, "Red"]` - 设置颜色 +- `SetLineThickness[obj, n]` - 线宽 (1-13) +- `SetLineStyle[obj, n]` - 线型 (0=实线, 1=虚线) +- `SetPointSize[A, n]` - 点大小 (1-9) + +### 视图 +- `ShowAxes[true/false]` - 显示/隐藏坐标轴 +- `ShowGrid[true/false]` - 显示/隐藏网格 + +**注意**:不要使用 `SetCoordSystem` 命令,坐标系会自动适配。 + +## LaTeX 数学表达式 + +在解答中使用 LaTeX 书写数学表达式: +- 行内公式:`$...$`,例如 `$x^2$` +- 独立公式:`$$...$$`,例如 `$$y = x^2 + 2x + 1$$` + +## 解答工作流 + +### 阶段 1 — 理解题目 +- 基于图片分析结果,准确理解题目要求 +- 确定需要求解的内容 +- 识别已知条件和未知量 + +### 阶段 2 — 分析与规划 +- 确定解题思路 +- 规划解题步骤 +- **规划需要的辅助可视化(必须)** + +### 阶段 3 — 逐步解答(配合可视化) +- 按步骤展开解答过程 +- 每一步给出清晰的推理说明 +- 使用 LaTeX 书写所有数学表达式 +- **创建 ggbscript 代码块来可视化关键步骤** + +#### 可视化解答示例 + +比如,在求解过程中需要作垂直平分线: + +```ggbscript[solution-step1;绘制垂直平分线] +M = Midpoint[A, B] +perp = PerpendicularBisector[A, B] +SetColor[perp, "Red"] +SetLineThickness[perp, 2] +SetLineStyle[perp, 1] +``` + +然后继续解答,标注答案点: + +```ggbscript[solution-final;标注答案] +C = (-0.5, -3) +SetColor[C, "Green"] +SetPointSize[C, 6] +``` + +### 阶段 4 — 总结答案 +- 明确给出最终答案 +- 可以适当总结解题关键点 +- **最后用 ggbscript 展示完整的解答图形(推荐)** + +## 输入信息 + +### 题目题干 +``` +{{ question_text }} +``` + +### 图片分析结果 + +题目图片已经过分析,生成了以下 GeoGebra 绘图指令用于还原配图: + +```ggbscript[image-analysis;题目配图] +{{ ggb_commands }} +``` + +分析摘要: +- 检测到的元素数量:{{ elements_count }} +- 几何约束数量:{{ constraints_count }} +- 图像是否为题目参考:{{ image_is_reference }} + +## 输出要求 + +1. 提供完整、详细的解题过程 +2. 数学表达式必须使用 LaTeX 格式 +3. 解答要逻辑清晰,步骤完整 +4. 最终给出明确的答案 +5. **必须包含 ggbscript 代码块来可视化解题过程** +6. 可视化内容应包括:辅助线绘制、关键点标注、答案演示等 + +## 可视化注意事项 + +1. **继承已有元素**:配图还原已创建了题目中的基本点和图形,你的解答可视化应该基于这些元素进行扩展 +2. **使用不同颜色**:辅助线建议使用红色或蓝色虚线,答案点使用绿色高亮 +3. 
**page-id 唯一性**:每个 ggbscript 块的 page-id 必须唯一,如 `solution-step1`, `solution-step2`, `answer-final` +4. **命令正确性**:确保使用正确的 GeoGebra 命令语法 + +## 语言要求 + +请使用中文进行解答,数学术语使用标准中文表达。 diff --git a/src/agents/vision_solver/vision_solver_agent.py b/src/agents/vision_solver/vision_solver_agent.py new file mode 100644 index 00000000..8b5561e0 --- /dev/null +++ b/src/agents/vision_solver/vision_solver_agent.py @@ -0,0 +1,732 @@ +"""Vision Solver Agent - Main orchestrator for image analysis pipeline. + +Implements a four-stage analysis workflow: +1. BBox - Visual element detection +2. Analysis - Geometric semantic analysis +3. GGBScript - Generate GeoGebra commands +4. Reflection - Validate and fix commands +""" + +import json +from pathlib import Path +import re +import traceback +from typing import Any, AsyncGenerator + +from src.agents.base_agent import BaseAgent +from src.agents.vision_solver.models import ( + AnalysisOutput, + BBoxOutput, + GGBCommand, + GGBScriptOutput, + ImageAnalysisState, + ReflectionOutput, + create_empty_analysis_output, + create_empty_bbox_output, + create_empty_ggbscript_output, + create_empty_reflection_output, +) + + +class VisionSolverAgent(BaseAgent): + """Agent for analyzing math problem images and generating GeoGebra visualizations.""" + + def __init__( + self, + api_key: str | None = None, + base_url: str | None = None, + model: str | None = None, + vision_model: str | None = None, + language: str = "zh", + **kwargs, + ): + """Initialize the Vision Solver Agent. 
+ + Args: + api_key: API key for LLM provider + base_url: Base URL for LLM API + model: Model name for text generation + vision_model: Model name for vision tasks (defaults to model) + language: Language setting ('zh' or 'en') + **kwargs: Additional arguments passed to BaseAgent + """ + super().__init__( + module_name="vision_solver", + agent_name="vision_solver_agent", + api_key=api_key, + base_url=base_url, + model=model, + language=language, + **kwargs, + ) + + self.vision_model = vision_model or model + self._load_prompts() + + def _load_prompts(self): + """Load prompt templates from files.""" + prompts_dir = Path(__file__).parent / "prompts" + + self.prompt_templates = {} + for prompt_name in ["bbox", "analysis", "ggbscript", "reflection", "tutor"]: + prompt_file = prompts_dir / f"{prompt_name}.md" + if prompt_file.exists(): + self.prompt_templates[prompt_name] = prompt_file.read_text(encoding="utf-8") + else: + self.logger.warning(f"Prompt file not found: {prompt_file}") + self.prompt_templates[prompt_name] = "" + + def _render_prompt(self, template_name: str, context: dict[str, Any]) -> str: + """Render a prompt template with context variables. + + Args: + template_name: Name of the template (bbox, analysis, etc.) + context: Dictionary of variables to substitute + + Returns: + Rendered prompt string + """ + template = self.prompt_templates.get(template_name, "") + + # Simple Jinja2-like variable substitution + for key, value in context.items(): + placeholder = "{{ " + key + " }}" + if isinstance(value, (dict, list)): + template = template.replace( + placeholder, json.dumps(value, ensure_ascii=False, indent=2) + ) + else: + template = template.replace(placeholder, str(value)) + + return template + + def _extract_json_from_response(self, response: str) -> dict: + """Extract JSON from LLM response, handling markdown code blocks. 
+ + Args: + response: Raw LLM response text + + Returns: + Parsed JSON dictionary + + Raises: + json.JSONDecodeError: If JSON parsing fails + """ + # Try to extract from markdown code block + json_pattern = r"```(?:json)?\s*([\s\S]*?)\s*```" + matches = re.findall(json_pattern, response) + + if matches: + json_str = matches[0] + else: + json_str = response + + # Remove JSON comments + json_str = re.sub(r"//.*?$", "", json_str, flags=re.MULTILINE) + json_str = re.sub(r"/\*.*?\*/", "", json_str, flags=re.DOTALL) + + try: + return json.loads(json_str) + except json.JSONDecodeError: + # Try fixing common issues + json_str = re.sub(r",\s*([}\]])", r"\1", json_str) # Remove trailing commas + try: + return json.loads(json_str) + except json.JSONDecodeError: + self.logger.error(f"JSON parsing failed, response: {response[:500]}...") + raise + + async def _call_vision_llm( + self, + prompt: str, + image_base64: str, + temperature: float = 0.3, + ) -> str: + """Call vision LLM with image input. + + Args: + prompt: Text prompt + image_base64: Base64 encoded image (data:image/...;base64,...) + temperature: Temperature for generation + + Returns: + LLM response text + """ + # Build multimodal message + messages = [ + { + "role": "user", + "content": [ + {"type": "text", "text": prompt}, + {"type": "image_url", "image_url": {"url": image_base64}}, + ], + } + ] + + response = await self.call_llm( + user_prompt="", + system_prompt="", + messages=messages, + temperature=temperature, + model=self.vision_model or self.get_model(), + verbose=False, + ) + + return response + + # ==================== Stage Processors ==================== + + async def _process_bbox(self, state: ImageAnalysisState) -> BBoxOutput: + """BBox stage: Extract pixel coordinates of geometric elements. 
+ + Args: + state: Current pipeline state + + Returns: + BBox output with element coordinates + """ + self.logger.info(f"BBox stage - session: {state.session_id}") + + try: + prompt = self._render_prompt( + "bbox", + {"question_text": state.question_text}, + ) + + response = await self._call_vision_llm( + prompt=prompt, + image_base64=state.image_base64, + temperature=0.3, + ) + + bbox_output = self._extract_json_from_response(response) + elements_count = len(bbox_output.get("elements", [])) + self.logger.info(f"BBox completed - elements: {elements_count}") + + return bbox_output + + except Exception as e: + self.logger.error(f"BBox stage error: {e}") + self.logger.error(traceback.format_exc()) + return create_empty_bbox_output() + + async def _process_analysis(self, state: ImageAnalysisState) -> tuple[AnalysisOutput, bool]: + """Analysis stage: Extract geometric semantics. + + Args: + state: Current pipeline state + + Returns: + Tuple of (analysis output, image_is_reference flag) + """ + self.logger.info(f"Analysis stage - session: {state.session_id}") + + try: + prompt = self._render_prompt( + "analysis", + { + "question_text": state.question_text, + "bbox_output_json": state.bbox_output, + }, + ) + + response = await self._call_vision_llm( + prompt=prompt, + image_base64=state.image_base64, + temperature=0.3, + ) + + analysis_output = self._extract_json_from_response(response) + image_is_reference = analysis_output.get("image_reference_detected", False) + + if image_is_reference: + keywords = analysis_output.get("image_reference_keywords", []) + self.logger.info(f"Image reference detected - keywords: {keywords}") + + return analysis_output, image_is_reference + + except Exception as e: + self.logger.error(f"Analysis stage error: {e}") + self.logger.error(traceback.format_exc()) + return create_empty_analysis_output(), False + + async def _process_ggbscript(self, state: ImageAnalysisState) -> GGBScriptOutput: + """GGBScript stage: Generate GeoGebra commands. 
+ + Args: + state: Current pipeline state + + Returns: + GGBScript output with command list + """ + self.logger.info(f"GGBScript stage - session: {state.session_id}") + + try: + prompt = self._render_prompt( + "ggbscript", + { + "question_text": state.question_text, + "bbox_output_json": state.bbox_output, + "analysis_output_json": state.analysis_output, + }, + ) + + response = await self._call_vision_llm( + prompt=prompt, + image_base64=state.image_base64, + temperature=0.3, + ) + + ggbscript_output = self._extract_json_from_response(response) + commands_count = len(ggbscript_output.get("commands", [])) + self.logger.info(f"GGBScript completed - commands: {commands_count}") + + return ggbscript_output + + except Exception as e: + self.logger.error(f"GGBScript stage error: {e}") + self.logger.error(traceback.format_exc()) + return create_empty_ggbscript_output() + + async def _process_reflection( + self, state: ImageAnalysisState + ) -> tuple[ReflectionOutput, list[GGBCommand]]: + """Reflection stage: Validate and fix commands. 
+ + Args: + state: Current pipeline state + + Returns: + Tuple of (reflection output, final commands) + """ + self.logger.info(f"Reflection stage - session: {state.session_id}") + + try: + prompt = self._render_prompt( + "reflection", + { + "question_text": state.question_text, + "bbox_output_json": state.bbox_output, + "analysis_output_json": state.analysis_output, + "ggbscript_output_json": state.ggbscript_output, + }, + ) + + response = await self._call_vision_llm( + prompt=prompt, + image_base64=state.image_base64, + temperature=0.3, + ) + + reflection_output = self._extract_json_from_response(response) + + # Extract final commands + final_commands = reflection_output.get("corrected_commands", []) + if not final_commands: + # If no corrections, use original commands + final_commands = state.ggbscript_output.get("commands", []) + + issues_count = len(reflection_output.get("issues_found", [])) + self.logger.info( + f"Reflection completed - issues: {issues_count}, final commands: {len(final_commands)}" + ) + + return reflection_output, final_commands + + except Exception as e: + self.logger.error(f"Reflection stage error: {e}") + self.logger.error(traceback.format_exc()) + return create_empty_reflection_output(), state.ggbscript_output.get("commands", []) + + # ==================== Main Pipeline ==================== + + async def process( + self, + question_text: str, + image_base64: str | None = None, + session_id: str = "default", + ) -> dict[str, Any]: + """Process a math problem with optional image. 
+ + Args: + question_text: The problem text + image_base64: Optional base64 encoded image + session_id: Session identifier + + Returns: + Dictionary with analysis results and final GGB commands + """ + state = ImageAnalysisState( + session_id=session_id, + question_text=question_text, + image_base64=image_base64, + has_image=bool(image_base64), + ) + + if not state.has_image: + return { + "has_image": False, + "final_ggb_commands": [], + } + + # Run pipeline + state.bbox_output = await self._process_bbox(state) + state.analysis_output, state.image_is_reference = await self._process_analysis(state) + state.ggbscript_output = await self._process_ggbscript(state) + state.reflection_output, state.final_ggb_commands = await self._process_reflection(state) + + return { + "has_image": True, + "bbox_output": state.bbox_output, + "analysis_output": state.analysis_output, + "ggbscript_output": state.ggbscript_output, + "reflection_output": state.reflection_output, + "final_ggb_commands": state.final_ggb_commands, + "image_is_reference": state.image_is_reference, + } + + async def stream_process( + self, + question_text: str, + image_base64: str | None = None, + session_id: str = "default", + ) -> AsyncGenerator[dict[str, Any], None]: + """Stream the analysis process with stage-by-stage events. 
+ + Args: + question_text: The problem text + image_base64: Optional base64 encoded image + session_id: Session identifier + + Yields: + Event dictionaries for each stage completion + """ + state = ImageAnalysisState( + session_id=session_id, + question_text=question_text, + image_base64=image_base64, + has_image=bool(image_base64), + ) + + if not state.has_image: + yield {"event": "no_image", "data": {}} + return + + yield {"event": "analysis_start", "data": {"session_id": session_id}} + + # BBox stage + state.bbox_output = await self._process_bbox(state) + elements = state.bbox_output.get("elements", []) + yield { + "event": "bbox_complete", + "data": { + "stage": "bbox", + "elements_count": len(elements), + "elements": [ + {"type": e.get("type", "unknown"), "label": e.get("label", "")} + for e in elements[:10] + ], + }, + } + + # Analysis stage + state.analysis_output, state.image_is_reference = await self._process_analysis(state) + constraints = state.analysis_output.get("constraints", []) + relations = state.analysis_output.get("geometric_relations", []) + yield { + "event": "analysis_complete", + "data": { + "stage": "analysis", + "constraints_count": len(constraints), + "relations_count": len(relations), + "image_is_reference": state.image_is_reference, + "constraints": constraints[:10] if isinstance(constraints, list) else [], + }, + } + + # GGBScript stage + state.ggbscript_output = await self._process_ggbscript(state) + commands = state.ggbscript_output.get("commands", []) + yield { + "event": "ggbscript_complete", + "data": { + "stage": "ggbscript", + "commands_count": len(commands), + "commands": [ + {"command": c.get("command", ""), "description": c.get("description", "")} + for c in commands[:10] + ], + }, + } + + # Reflection stage + state.reflection_output, state.final_ggb_commands = await self._process_reflection(state) + issues = state.reflection_output.get("issues_found", []) + yield { + "event": "reflection_complete", + "data": { + "stage": 
"reflection", + "issues_count": len(issues), + "commands_count": len(state.final_ggb_commands), + "final_commands": state.final_ggb_commands, + }, + } + + # Final analysis message + ggb_script_content = self._format_ggb_commands(state.final_ggb_commands) + yield { + "event": "analysis_message_complete", + "data": { + "ggb_block": { + "page_id": "image-analysis-restore", + "title": "题目配图还原", + "content": ggb_script_content, + } + if ggb_script_content + else None, + "analysis_summary": { + "constraints": constraints[:10] if isinstance(constraints, list) else [], + "relations": [ + r.get("description", str(r)) if isinstance(r, dict) else str(r) + for r in relations[:10] + ], + }, + }, + } + + def _format_ggb_commands(self, commands: list[GGBCommand]) -> str: + """Format GGB commands into script content. + + Args: + commands: List of GGB command dictionaries + + Returns: + Formatted script string + """ + if not commands: + return "" + + lines = [] + for cmd in commands: + if isinstance(cmd, dict): + command = cmd.get("command", "") + description = cmd.get("description", "") + if description: + lines.append(f"# {description}") + lines.append(command) + else: + lines.append(str(cmd)) + + return "\n".join(lines) + + def format_ggb_block( + self, commands: list[GGBCommand], page_id: str = "main", title: str = "题目图形" + ) -> str: + """Format commands into a ggbscript block. 
+ + Args: + commands: List of GGB commands + page_id: Page identifier + title: Block title + + Returns: + Formatted ggbscript block string + """ + content = self._format_ggb_commands(commands) + if not content: + return "" + + return f"```ggbscript[{page_id};{title}]\n{content}\n```" + + # ==================== Tutor Response ==================== + + async def stream_tutor_response( + self, + question_text: str, + final_ggb_commands: list[GGBCommand], + analysis_output: dict | None = None, + session_id: str = "default", + ) -> AsyncGenerator[str, None]: + """Stream the tutor's solution response based on image analysis. + + Args: + question_text: The problem text + final_ggb_commands: GeoGebra commands from image analysis + analysis_output: Analysis stage output (optional) + session_id: Session identifier + + Yields: + Text chunks of the tutor's response + """ + self.logger.info(f"[{session_id}] Starting tutor response stream") + + # Prepare context for tutor prompt + ggb_commands_str = self._format_ggb_commands(final_ggb_commands) + + # Get analysis metrics + elements_count = 0 + constraints_count = 0 + image_is_reference = False + + if analysis_output: + constraints = analysis_output.get("constraints", []) + constraints_count = len(constraints) if isinstance(constraints, list) else 0 + image_is_reference = analysis_output.get("image_reference_detected", False) + + # Render tutor prompt + tutor_prompt = self._render_prompt( + "tutor", + { + "question_text": question_text, + "ggb_commands": ggb_commands_str, + "elements_count": len(final_ggb_commands), + "constraints_count": constraints_count, + "image_is_reference": "是" if image_is_reference else "否", + }, + ) + + # Stream response from LLM + try: + async for chunk in self.stream_llm( + user_prompt=tutor_prompt, + system_prompt="你是一位专业的数学教师,善于使用可视化方式解释数学问题。请基于图像分析结果,为学生提供清晰、详细的解题过程。", + temperature=0.7, + ): + yield chunk + + except Exception as e: + self.logger.error(f"[{session_id}] Tutor response error: {e}") + 
yield f"\n\n抱歉,解题过程生成出现错误:{e}" + + async def stream_process_with_tutor( + self, + question_text: str, + image_base64: str | None = None, + session_id: str = "default", + ) -> AsyncGenerator[dict[str, Any], None]: + """Stream the full analysis and tutor response process. + + Args: + question_text: The problem text + image_base64: Optional base64 encoded image + session_id: Session identifier + + Yields: + Event dictionaries for each stage and tutor response chunks + """ + state = ImageAnalysisState( + session_id=session_id, + question_text=question_text, + image_base64=image_base64, + has_image=bool(image_base64), + ) + + if not state.has_image: + yield {"event": "no_image", "data": {}} + # Even without image, we can still provide a solution + yield {"event": "answer_start", "data": {"has_image_analysis": False}} + async for chunk in self.stream_tutor_response( + question_text=question_text, + final_ggb_commands=[], + analysis_output=None, + session_id=session_id, + ): + yield {"event": "text", "data": {"content": chunk}} + yield {"event": "done", "data": {}} + return + + yield {"event": "analysis_start", "data": {"session_id": session_id}} + + # BBox stage + state.bbox_output = await self._process_bbox(state) + elements = state.bbox_output.get("elements", []) + yield { + "event": "bbox_complete", + "data": { + "stage": "bbox", + "elements_count": len(elements), + "elements": [ + {"type": e.get("type", "unknown"), "label": e.get("label", "")} + for e in elements[:10] + ], + }, + } + + # Analysis stage + state.analysis_output, state.image_is_reference = await self._process_analysis(state) + constraints = state.analysis_output.get("constraints", []) + relations = state.analysis_output.get("geometric_relations", []) + yield { + "event": "analysis_complete", + "data": { + "stage": "analysis", + "constraints_count": len(constraints), + "relations_count": len(relations), + "image_is_reference": state.image_is_reference, + "constraints": constraints[:10] if 
isinstance(constraints, list) else [], + }, + } + + # GGBScript stage + state.ggbscript_output = await self._process_ggbscript(state) + commands = state.ggbscript_output.get("commands", []) + yield { + "event": "ggbscript_complete", + "data": { + "stage": "ggbscript", + "commands_count": len(commands), + "commands": [ + {"command": c.get("command", ""), "description": c.get("description", "")} + for c in commands[:10] + ], + }, + } + + # Reflection stage + state.reflection_output, state.final_ggb_commands = await self._process_reflection(state) + issues = state.reflection_output.get("issues_found", []) + yield { + "event": "reflection_complete", + "data": { + "stage": "reflection", + "issues_count": len(issues), + "commands_count": len(state.final_ggb_commands), + "final_commands": state.final_ggb_commands, + }, + } + + # Analysis message with GGB block + ggb_script_content = self._format_ggb_commands(state.final_ggb_commands) + yield { + "event": "analysis_message_complete", + "data": { + "ggb_block": { + "page_id": "image-analysis-restore", + "title": "题目配图还原", + "content": ggb_script_content, + } + if ggb_script_content + else None, + "analysis_summary": { + "constraints": constraints[:10] if isinstance(constraints, list) else [], + "relations": [ + r.get("description", str(r)) if isinstance(r, dict) else str(r) + for r in relations[:10] + ], + }, + }, + } + + # Start tutor response + yield {"event": "answer_start", "data": {"has_image_analysis": True}} + + # Stream tutor response + async for chunk in self.stream_tutor_response( + question_text=question_text, + final_ggb_commands=state.final_ggb_commands, + analysis_output=state.analysis_output, + session_id=session_id, + ): + yield {"event": "text", "data": {"content": chunk}} + + yield {"event": "done", "data": {}} diff --git a/src/api/main.py b/src/api/main.py index b22c8ba3..1d32cbac 100644 --- a/src/api/main.py +++ b/src/api/main.py @@ -20,6 +20,7 @@ settings, solve, system, + vision_solver, ) from 
src.logging import get_logger from src.services.path_service import get_path_service @@ -243,6 +244,7 @@ async def lifespan(app: FastAPI): app.include_router(system.router, prefix="/api/v1/system", tags=["system"]) app.include_router(config.router, prefix="/api/v1/config", tags=["config"]) app.include_router(agent_config.router, prefix="/api/v1/agent-config", tags=["agent-config"]) +app.include_router(vision_solver.router, prefix="/api/v1", tags=["vision-solver"]) @app.get("/") diff --git a/src/api/routers/vision_solver.py b/src/api/routers/vision_solver.py new file mode 100644 index 00000000..1178c48f --- /dev/null +++ b/src/api/routers/vision_solver.py @@ -0,0 +1,249 @@ +"""Vision Solver API Router. + +WebSocket endpoint for real-time image analysis with GeoGebra visualization. +""" + +import asyncio +from typing import Any + +from fastapi import APIRouter, HTTPException, WebSocket, WebSocketDisconnect +from pydantic import BaseModel + +from src.agents.vision_solver import VisionSolverAgent +from src.logging import get_logger +from src.services.llm import get_llm_config +from src.services.settings.interface_settings import get_ui_language +from src.tools.vision import ImageError, resolve_image_input + +logger = get_logger("VisionSolverAPI", level="INFO") + +router = APIRouter() + + +# ==================== Request/Response Models ==================== + + +class VisionAnalyzeRequest(BaseModel): + """Request for image analysis.""" + + question: str + image_base64: str | None = None + image_url: str | None = None + session_id: str | None = None + + +class VisionAnalyzeResponse(BaseModel): + """Response from image analysis.""" + + session_id: str + has_image: bool + final_ggb_commands: list[dict] = [] + ggb_script: str | None = None + analysis_summary: dict = {} + + +# ==================== REST Endpoints ==================== + + +@router.post("/vision/analyze") +async def analyze_image(request: VisionAnalyzeRequest) -> VisionAnalyzeResponse: + """Analyze a math 
problem image and return GeoGebra commands. + + Args: + request: Analysis request with question and image + + Returns: + Analysis response with GGB commands + """ + session_id = request.session_id or f"vision_{id(request)}" + + try: + # Resolve image input + image_base64 = await resolve_image_input( + image_base64=request.image_base64, + image_url=request.image_url, + ) + + if not image_base64: + return VisionAnalyzeResponse( + session_id=session_id, + has_image=False, + ) + + # Get LLM config + try: + llm_config = get_llm_config() + api_key = llm_config.api_key + base_url = llm_config.base_url + except Exception as e: + logger.error(f"Failed to get LLM config: {e}") + raise HTTPException(status_code=500, detail=f"LLM configuration error: {e}") + + # Initialize agent + language = get_ui_language(default="zh") + agent = VisionSolverAgent( + api_key=api_key, + base_url=base_url, + language=language, + ) + + # Process image + result = await agent.process( + question_text=request.question, + image_base64=image_base64, + session_id=session_id, + ) + + # Format GGB script + ggb_script = None + if result.get("final_ggb_commands"): + ggb_script = agent.format_ggb_block( + result["final_ggb_commands"], + page_id="analysis", + title="题目图形", + ) + + return VisionAnalyzeResponse( + session_id=session_id, + has_image=result.get("has_image", False), + final_ggb_commands=result.get("final_ggb_commands", []), + ggb_script=ggb_script, + analysis_summary={ + "image_is_reference": result.get("image_is_reference", False), + "elements_count": len(result.get("bbox_output", {}).get("elements", [])), + "commands_count": len(result.get("final_ggb_commands", [])), + }, + ) + + except ImageError as e: + logger.error(f"Image error: {e}") + raise HTTPException(status_code=400, detail=str(e)) + except Exception as e: + logger.error(f"Analysis failed: {e}", exc_info=True) + raise HTTPException(status_code=500, detail=str(e)) + + +# ==================== WebSocket Endpoint ==================== + + 
+@router.websocket("/vision/solve") +async def websocket_vision_solve(websocket: WebSocket): + """WebSocket endpoint for streaming image analysis. + + Protocol: + 1. Client sends: {"question": "...", "image_base64": "...", "session_id": "..."} + 2. Server streams: + - {"type": "session", "session_id": "..."} + - {"type": "analysis_start", "data": {...}} + - {"type": "bbox_complete", "data": {...}} + - {"type": "analysis_complete", "data": {...}} + - {"type": "ggbscript_complete", "data": {...}} + - {"type": "reflection_complete", "data": {...}} + - {"type": "analysis_message_complete", "data": {...}} + - {"type": "answer_start", "data": {...}} + - {"type": "text", "content": "..."} + - {"type": "done"} + """ + await websocket.accept() + + connection_closed = asyncio.Event() + + async def safe_send_json(data: dict[str, Any]) -> bool: + """Safely send JSON, checking if connection is closed.""" + if connection_closed.is_set(): + return False + try: + await websocket.send_json(data) + return True + except (WebSocketDisconnect, RuntimeError, ConnectionError) as e: + logger.debug(f"WebSocket connection closed: {e}") + connection_closed.set() + return False + except Exception as e: + logger.debug(f"Error sending WebSocket message: {e}") + return False + + session_id = None + + try: + # 1. Receive initial message + data = await websocket.receive_json() + question = data.get("question") + image_base64 = data.get("image_base64") + image_url = data.get("image_url") + session_id = data.get("session_id", f"vision_{id(data)}") + + if not question: + await safe_send_json({"type": "error", "content": "Question is required"}) + return + + # Send session ID + await safe_send_json({"type": "session", "session_id": session_id}) + + # 2. 
Resolve image input + try: + resolved_image = await resolve_image_input( + image_base64=image_base64, + image_url=image_url, + ) + except ImageError as e: + await safe_send_json({"type": "error", "content": str(e)}) + return + + if not resolved_image: + await safe_send_json({"type": "no_image", "data": {}}) + await safe_send_json({"type": "done"}) + return + + # 3. Initialize agent + try: + llm_config = get_llm_config() + api_key = llm_config.api_key + base_url = llm_config.base_url + except Exception as e: + logger.error(f"Failed to get LLM config: {e}") + await safe_send_json({"type": "error", "content": f"LLM configuration error: {e}"}) + return + + language = get_ui_language(default="zh") + agent = VisionSolverAgent( + api_key=api_key, + base_url=base_url, + language=language, + ) + + logger.info(f"[{session_id}] Starting vision analysis: {question[:50]}...") + + # 4. Stream analysis and tutor response + async for event in agent.stream_process_with_tutor( + question_text=question, + image_base64=resolved_image, + session_id=session_id, + ): + event_type = event.get("event", "unknown") + event_data = event.get("data", {}) + + if not await safe_send_json({"type": event_type, "data": event_data}): + break + + logger.info(f"[{session_id}] Vision analysis and tutor response completed") + + except WebSocketDisconnect: + logger.info(f"[{session_id}] WebSocket disconnected") + except Exception as e: + connection_closed.set() + await safe_send_json({"type": "error", "content": str(e)}) + logger.error(f"[{session_id}] Vision solve failed: {e}", exc_info=True) + finally: + connection_closed.set() + try: + if hasattr(websocket, "client_state"): + state = websocket.client_state + if hasattr(state, "name") and state.name != "DISCONNECTED": + await websocket.close() + else: + await websocket.close() + except (WebSocketDisconnect, RuntimeError, ConnectionError): + pass + except Exception as e: + logger.debug(f"Error closing WebSocket: {e}") diff --git 
a/src/tools/vision/__init__.py b/src/tools/vision/__init__.py new file mode 100644 index 00000000..85aac443 --- /dev/null +++ b/src/tools/vision/__init__.py @@ -0,0 +1,79 @@ +"""Vision tools for image processing and GeoGebra support.""" + +from src.tools.vision.block_parser import ( + BlockType, + GGBBlock, + ParsedContent, + StreamingBlockParser, + parse_ggb_blocks, +) +from src.tools.vision.coord_transform import ( + DEFAULT_GGB_COORD, + GGBCoordSystem, + ImageDimensions, + Point, + bbox_to_ggb, + calculate_distance, + calculate_midpoint, + convert_bbox_elements_to_ggb, + format_ggb_point, + format_set_coord_system, + ggb_to_bbox, + is_parallel, + is_perpendicular, + suggest_coord_system, + validate_point_in_bounds, +) +from src.tools.vision.ggb_validator import ( + ValidationResult, + get_command_help, + validate_command, + validate_ggbscript, +) +from src.tools.vision.image_utils import ( + ImageError, + fetch_image_from_url, + image_bytes_to_base64, + is_base64_image, + is_valid_image_url, + resolve_image_input, + url_to_base64, +) + +__all__ = [ + # Image utils + "ImageError", + "fetch_image_from_url", + "image_bytes_to_base64", + "is_base64_image", + "is_valid_image_url", + "resolve_image_input", + "url_to_base64", + # Coord transform + "DEFAULT_GGB_COORD", + "GGBCoordSystem", + "ImageDimensions", + "Point", + "bbox_to_ggb", + "calculate_distance", + "calculate_midpoint", + "convert_bbox_elements_to_ggb", + "format_ggb_point", + "format_set_coord_system", + "ggb_to_bbox", + "is_parallel", + "is_perpendicular", + "suggest_coord_system", + "validate_point_in_bounds", + # GGB validator + "ValidationResult", + "get_command_help", + "validate_command", + "validate_ggbscript", + # Block parser + "BlockType", + "GGBBlock", + "ParsedContent", + "StreamingBlockParser", + "parse_ggb_blocks", +] diff --git a/src/tools/vision/block_parser.py b/src/tools/vision/block_parser.py new file mode 100644 index 00000000..167f7d30 --- /dev/null +++ 
b/src/tools/vision/block_parser.py @@ -0,0 +1,251 @@ +"""Parser for GGBScript code blocks in LLM output.""" + +from dataclasses import dataclass, field +from enum import Enum +import re + +from src.tools.vision.ggb_validator import validate_ggbscript + + +class BlockType(str, Enum): + """Types of special blocks in the response.""" + + GGBSCRIPT = "ggbscript" + GEOGEBRA = "geogebra" + + +@dataclass +class GGBBlock: + """Represents a parsed GeoGebra script block.""" + + page_id: str + title: str + content: str + original_content: str = "" # Original content before validation/fixing + validation_warnings: list[str] = field(default_factory=list) + block_type: BlockType = BlockType.GGBSCRIPT + + +@dataclass +class ParsedContent: + """Result of parsing LLM output.""" + + text_segments: list[str] = field(default_factory=list) + ggb_blocks: list[GGBBlock] = field(default_factory=list) + + +# Regex pattern for matching GGBScript blocks +# Matches: ```ggbscript[page-id;optional-title] or ```geogebra[page-id;optional-title] +BLOCK_START_PATTERN = re.compile( + r"```\s*(ggbscript|geogebra)\s*\[([^\]\s;]+)(?:;([^\]]*))?\]\s*\n?", + re.IGNORECASE, +) + +BLOCK_END_PATTERN = re.compile(r"```\s*(?:\n|$)") + + +def parse_ggb_blocks(text: str) -> ParsedContent: + """Parse text to extract GeoGebra script blocks and regular text. 
+ + Args: + text: The full text response from the LLM + + Returns: + ParsedContent containing text segments and GGB blocks + """ + result = ParsedContent() + current_pos = 0 + + while current_pos < len(text): + # Find the next block start + start_match = BLOCK_START_PATTERN.search(text, current_pos) + + if not start_match: + # No more blocks, add remaining text + remaining = text[current_pos:].strip() + if remaining: + result.text_segments.append(remaining) + break + + # Add text before the block + text_before = text[current_pos : start_match.start()].strip() + if text_before: + result.text_segments.append(text_before) + + # Extract block metadata + block_type_str = start_match.group(1).lower() + page_id = start_match.group(2) + title = start_match.group(3) or "Untitled" + + # Find the block end + content_start = start_match.end() + end_match = BLOCK_END_PATTERN.search(text, content_start) + + if not end_match: + # No closing ```, treat rest as content + content = text[content_start:].strip() + current_pos = len(text) + else: + content = text[content_start : end_match.start()].strip() + current_pos = end_match.end() + + # Validate and fix the content + fixed_content, warnings, errors = validate_ggbscript(content) + + # Create the block with validated content + block = GGBBlock( + page_id=page_id, + title=title.strip(), + content=fixed_content, + original_content=content if content != fixed_content else "", + validation_warnings=warnings, + block_type=BlockType(block_type_str), + ) + result.ggb_blocks.append(block) + + return result + + +class StreamingBlockParser: + """Stateful parser for streaming LLM output. + + Handles incremental parsing of text chunks. + """ + + def __init__(self): + self.buffer = "" + self.state = "idle" # idle, await_block, in_block + self.current_block: dict | None = None + self.pending_text = "" + + def feed(self, chunk: str) -> list[dict]: + """Feed a chunk of text and return any complete events. 
+ + Args: + chunk: New text chunk from the stream + + Returns: + List of events: {"type": "text", "content": "..."} or + {"type": "ggb_block", "page_id": "...", "title": "...", "content": "..."} + """ + self.buffer += chunk + events = [] + + while True: + if self.state == "idle": + # Look for block start + start_match = BLOCK_START_PATTERN.search(self.buffer) + + if start_match: + # Emit text before block + text_before = self.buffer[: start_match.start()] + if text_before: + events.append({"type": "text", "content": text_before}) + + # Start collecting block + self.current_block = { + "type": start_match.group(1).lower(), + "page_id": start_match.group(2), + "title": (start_match.group(3) or "Untitled").strip(), + "content": "", + } + self.buffer = self.buffer[start_match.end() :] + self.state = "in_block" + continue + else: + # Check if we might be at the start of a block pattern + if "```" in self.buffer: + idx = self.buffer.rfind("```") + # Check if this could be a block start + potential = self.buffer[idx:] + if len(potential) < 50: # Reasonable max length for block header + # Keep it in buffer, emit text before + text_before = self.buffer[:idx] + if text_before: + events.append({"type": "text", "content": text_before}) + self.buffer = potential + break + + # No block pattern, emit all as text + if self.buffer: + events.append({"type": "text", "content": self.buffer}) + self.buffer = "" + break + + elif self.state == "in_block": + # Look for block end + end_match = BLOCK_END_PATTERN.search(self.buffer) + + if end_match: + # Complete the block + original_content = self.buffer[: end_match.start()].strip() + + # Validate and fix the content + fixed_content, warnings, errors = validate_ggbscript(original_content) + + self.current_block["content"] = fixed_content + self.current_block["original_content"] = ( + original_content if original_content != fixed_content else "" + ) + self.current_block["validation_warnings"] = warnings + + events.append( + { + "type": 
"ggb_block", + "page_id": self.current_block["page_id"], + "title": self.current_block["title"], + "content": self.current_block["content"], + "original_content": self.current_block["original_content"], + "validation_warnings": self.current_block["validation_warnings"], + } + ) + self.buffer = self.buffer[end_match.end() :] + self.current_block = None + self.state = "idle" + continue + else: + # Keep collecting block content + break + + return events + + def flush(self) -> list[dict]: + """Flush any remaining content when stream ends. + + Returns: + List of final events + """ + events = [] + + if self.state == "in_block" and self.current_block: + # Incomplete block, emit as block anyway + original_content = self.buffer.strip() + + # Validate and fix the content + fixed_content, warnings, errors = validate_ggbscript(original_content) + + self.current_block["content"] = fixed_content + self.current_block["original_content"] = ( + original_content if original_content != fixed_content else "" + ) + self.current_block["validation_warnings"] = warnings + + events.append( + { + "type": "ggb_block", + "page_id": self.current_block["page_id"], + "title": self.current_block["title"], + "content": self.current_block["content"], + "original_content": self.current_block["original_content"], + "validation_warnings": self.current_block["validation_warnings"], + } + ) + elif self.buffer: + # Remaining text + events.append({"type": "text", "content": self.buffer}) + + self.buffer = "" + self.state = "idle" + self.current_block = None + + return events diff --git a/src/tools/vision/coord_transform.py b/src/tools/vision/coord_transform.py new file mode 100644 index 00000000..f65e477b --- /dev/null +++ b/src/tools/vision/coord_transform.py @@ -0,0 +1,436 @@ +"""Coordinate transformation utilities. + +Converts between BBox pixel coordinates and GeoGebra math coordinates. 
+ +BBox coordinate system: +- Origin at top-left +- X-axis: right is positive +- Y-axis: down is positive + +GeoGebra coordinate system: +- Origin at center (or user-specified) +- X-axis: right is positive +- Y-axis: up is positive +""" + +from dataclasses import dataclass +import math + + +@dataclass +class ImageDimensions: + """Image dimensions.""" + + width: int + height: int + + +@dataclass +class GGBCoordSystem: + """GeoGebra coordinate system range.""" + + x_min: float + x_max: float + y_min: float + y_max: float + + @property + def width(self) -> float: + """Coordinate system width.""" + return self.x_max - self.x_min + + @property + def height(self) -> float: + """Coordinate system height.""" + return self.y_max - self.y_min + + @property + def center(self) -> tuple[float, float]: + """Coordinate system center.""" + return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2) + + +@dataclass +class Point: + """2D point.""" + + x: float + y: float + + def __repr__(self) -> str: + return f"({self.x:.2f}, {self.y:.2f})" + + +# Default configuration +DEFAULT_GGB_COORD = GGBCoordSystem(x_min=-10, x_max=10, y_min=-8, y_max=8) + + +def bbox_to_ggb( + bbox_x: float, + bbox_y: float, + img_dimensions: ImageDimensions, + ggb_coord: GGBCoordSystem | None = None, +) -> Point: + """Convert BBox pixel coordinates to GeoGebra math coordinates. 
+ + Args: + bbox_x: BBox X coordinate (pixels) + bbox_y: BBox Y coordinate (pixels) + img_dimensions: Image dimensions + ggb_coord: GeoGebra coordinate range, default [-10, 10] x [-8, 8] + + Returns: + Point in GeoGebra coordinate system + """ + if ggb_coord is None: + ggb_coord = DEFAULT_GGB_COORD + + # Normalize to [0, 1] + norm_x = bbox_x / img_dimensions.width + norm_y = bbox_y / img_dimensions.height + + # Map to GeoGebra coordinates + # X: direct linear mapping + ggb_x = ggb_coord.x_min + norm_x * ggb_coord.width + + # Y: need to flip (BBox Y down, GeoGebra Y up) + ggb_y = ggb_coord.y_max - norm_y * ggb_coord.height + + return Point(x=ggb_x, y=ggb_y) + + +def ggb_to_bbox( + ggb_x: float, + ggb_y: float, + img_dimensions: ImageDimensions, + ggb_coord: GGBCoordSystem | None = None, +) -> Point: + """Convert GeoGebra math coordinates to BBox pixel coordinates. + + Args: + ggb_x: GeoGebra X coordinate + ggb_y: GeoGebra Y coordinate + img_dimensions: Image dimensions + ggb_coord: GeoGebra coordinate range, default [-10, 10] x [-8, 8] + + Returns: + Point in BBox pixel coordinate system + """ + if ggb_coord is None: + ggb_coord = DEFAULT_GGB_COORD + + # Normalize to [0, 1] + norm_x = (ggb_x - ggb_coord.x_min) / ggb_coord.width + norm_y = (ggb_coord.y_max - ggb_y) / ggb_coord.height # Y-axis flip + + # Map to pixel coordinates + bbox_x = norm_x * img_dimensions.width + bbox_y = norm_y * img_dimensions.height + + return Point(x=bbox_x, y=bbox_y) + + +def convert_bbox_elements_to_ggb( + bbox_output: dict, + ggb_coord: GGBCoordSystem | None = None, +) -> dict: + """Batch convert all element coordinates in BBox output. 
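The two mappings above are exact inverses, differing only in the Y-axis flip. A minimal standalone sketch of the same arithmetic (plain tuples instead of the dataclasses above; the defaults mirror DEFAULT_GGB_COORD and an assumed 800x600 image):

```python
# Pixel -> math: normalize to [0, 1], then scale into the GeoGebra range.
def to_ggb(px, py, w=800, h=600, x_min=-10.0, x_max=10.0, y_min=-8.0, y_max=8.0):
    nx, ny = px / w, py / h
    return (x_min + nx * (x_max - x_min),  # X maps directly
            y_max - ny * (y_max - y_min))  # Y flips: pixel Y grows downward

# Math -> pixel: the same normalization run in reverse.
def to_pixels(gx, gy, w=800, h=600, x_min=-10.0, x_max=10.0, y_min=-8.0, y_max=8.0):
    nx = (gx - x_min) / (x_max - x_min)
    ny = (y_max - gy) / (y_max - y_min)
    return (nx * w, ny * h)

print(to_ggb(400, 300))             # (0.0, 0.0) -- the image centre maps to the origin
print(to_pixels(*to_ggb(160, 90)))  # round-trips back to ~(160.0, 90.0)
```

Because both ranges are symmetric around zero, the image centre always lands on the GeoGebra origin, which is why DEFAULT_GGB_COORD is centred on (0, 0).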
+ + Args: + bbox_output: BBox node output + ggb_coord: GeoGebra coordinate range + + Returns: + Converted BBox output (with ggb_position fields) + """ + if ggb_coord is None: + ggb_coord = DEFAULT_GGB_COORD + + # Get image dimensions + img_dims_data = bbox_output.get("image_dimensions", {}) + img_dimensions = ImageDimensions( + width=img_dims_data.get("width", 800), + height=img_dims_data.get("height", 600), + ) + + # Convert each element + result = bbox_output.copy() + converted_elements = [] + + for element in bbox_output.get("elements", []): + converted = element.copy() + + # Convert point coordinates + if "position" in element and element["position"]: + pos = element["position"] + ggb_point = bbox_to_ggb( + pos.get("x", 0), + pos.get("y", 0), + img_dimensions, + ggb_coord, + ) + converted["ggb_position"] = {"x": ggb_point.x, "y": ggb_point.y} + + # Convert segment start and end + if "start" in element and element["start"]: + start = element["start"] + ggb_start = bbox_to_ggb( + start.get("x", 0), + start.get("y", 0), + img_dimensions, + ggb_coord, + ) + converted["ggb_start"] = {"x": ggb_start.x, "y": ggb_start.y} + + if "end" in element and element["end"]: + end = element["end"] + ggb_end = bbox_to_ggb( + end.get("x", 0), + end.get("y", 0), + img_dimensions, + ggb_coord, + ) + converted["ggb_end"] = {"x": ggb_end.x, "y": ggb_end.y} + + # Convert polygon vertices + if "vertices" in element and element["vertices"]: + ggb_vertices = [] + for vertex in element["vertices"]: + ggb_v = bbox_to_ggb( + vertex.get("x", 0), + vertex.get("y", 0), + img_dimensions, + ggb_coord, + ) + ggb_vertices.append({"label": vertex.get("label", ""), "x": ggb_v.x, "y": ggb_v.y}) + converted["ggb_vertices"] = ggb_vertices + + # Convert circle center + if "center" in element and element["center"]: + center = element["center"] + ggb_center = bbox_to_ggb( + center.get("x", 0), + center.get("y", 0), + img_dimensions, + ggb_coord, + ) + converted["ggb_center"] = {"x": ggb_center.x, "y": 
ggb_center.y} + + # Convert radius (scale proportionally) + if "radius" in element: + pixel_radius = element["radius"] + scale_x = ggb_coord.width / img_dimensions.width + converted["ggb_radius"] = pixel_radius * scale_x + + converted_elements.append(converted) + + result["elements"] = converted_elements + return result + + +def validate_point_in_bounds( + point: Point, + ggb_coord: GGBCoordSystem | None = None, + tolerance: float = 0.1, +) -> tuple[bool, str]: + """Validate if point is within GeoGebra coordinate bounds. + + Args: + point: Point to validate + ggb_coord: Coordinate range + tolerance: Boundary tolerance + + Returns: + (is_valid, error_message) + """ + if ggb_coord is None: + ggb_coord = DEFAULT_GGB_COORD + + x_valid = ggb_coord.x_min - tolerance <= point.x <= ggb_coord.x_max + tolerance + y_valid = ggb_coord.y_min - tolerance <= point.y <= ggb_coord.y_max + tolerance + + if not x_valid: + return ( + False, + f"X coordinate {point.x:.2f} out of range [{ggb_coord.x_min}, {ggb_coord.x_max}]", + ) + if not y_valid: + return ( + False, + f"Y coordinate {point.y:.2f} out of range [{ggb_coord.y_min}, {ggb_coord.y_max}]", + ) + + return True, "" + + +def calculate_distance(p1: Point, p2: Point) -> float: + """Calculate distance between two points.""" + return math.sqrt((p2.x - p1.x) ** 2 + (p2.y - p1.y) ** 2) + + +def calculate_midpoint(p1: Point, p2: Point) -> Point: + """Calculate midpoint of two points.""" + return Point(x=(p1.x + p2.x) / 2, y=(p1.y + p2.y) / 2) + + +def is_perpendicular( + p1: Point, + p2: Point, + p3: Point, + p4: Point, + tolerance: float = 0.01, +) -> bool: + """Check if two segments are perpendicular. 
+ + Segment 1: p1 -> p2 + Segment 2: p3 -> p4 + """ + # Direction vectors + v1 = (p2.x - p1.x, p2.y - p1.y) + v2 = (p4.x - p3.x, p4.y - p3.y) + + # Dot product + dot_product = v1[0] * v2[0] + v1[1] * v2[1] + + return abs(dot_product) < tolerance + + +def is_parallel( + p1: Point, + p2: Point, + p3: Point, + p4: Point, + tolerance: float = 0.01, +) -> bool: + """Check if two segments are parallel. + + Segment 1: p1 -> p2 + Segment 2: p3 -> p4 + """ + # Direction vectors + v1 = (p2.x - p1.x, p2.y - p1.y) + v2 = (p4.x - p3.x, p4.y - p3.y) + + # Cross product (parallel when 0) + cross_product = v1[0] * v2[1] - v1[1] * v2[0] + + # Normalize + len1 = math.sqrt(v1[0] ** 2 + v1[1] ** 2) + len2 = math.sqrt(v2[0] ** 2 + v2[1] ** 2) + + if len1 < 1e-10 or len2 < 1e-10: + return False # Degenerate case + + normalized_cross = abs(cross_product) / (len1 * len2) + + return normalized_cross < tolerance + + +def suggest_coord_system( + bbox_output: dict, + padding_ratio: float = 0.2, +) -> GGBCoordSystem: + """Suggest appropriate GeoGebra coordinate range based on BBox output. 
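The perpendicularity and parallelism predicates above reduce to a dot product and a normalized cross product, respectively. A standalone sketch with plain coordinate tuples (function names here are illustrative):

```python
import math

def seg_vec(p, q):
    """Direction vector of segment p -> q."""
    return (q[0] - p[0], q[1] - p[1])

def perpendicular(p1, p2, p3, p4, tol=0.01):
    v1, v2 = seg_vec(p1, p2), seg_vec(p3, p4)
    return abs(v1[0] * v2[0] + v1[1] * v2[1]) < tol  # dot product ~ 0

def parallel(p1, p2, p3, p4, tol=0.01):
    v1, v2 = seg_vec(p1, p2), seg_vec(p3, p4)
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 < 1e-10 or n2 < 1e-10:
        return False  # degenerate (zero-length) segment
    return abs(cross) / (n1 * n2) < tol

print(perpendicular((0, 0), (1, 0), (0, 0), (0, 1)))  # True
print(parallel((0, 0), (2, 1), (1, 1), (5, 3)))       # True
```

One caveat worth noting: the parallel test normalizes by the segment lengths, but the perpendicular test compares the raw dot product against the tolerance, so its sensitivity depends on the scale of the input segments.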
+
+    Args:
+        bbox_output: BBox node output
+        padding_ratio: Boundary padding ratio
+
+    Returns:
+        Suggested coordinate range
+    """
+    # Collect all coordinate points
+    all_x: list[float] = []
+    all_y: list[float] = []
+
+    img_dims_data = bbox_output.get("image_dimensions", {})
+    img_dimensions = ImageDimensions(
+        width=img_dims_data.get("width", 800),
+        height=img_dims_data.get("height", 600),
+    )
+
+    for element in bbox_output.get("elements", []):
+        if "position" in element and element["position"]:
+            all_x.append(element["position"].get("x", 0))
+            all_y.append(element["position"].get("y", 0))
+
+        if "start" in element and element["start"]:
+            all_x.append(element["start"].get("x", 0))
+            all_y.append(element["start"].get("y", 0))
+
+        if "end" in element and element["end"]:
+            all_x.append(element["end"].get("x", 0))
+            all_y.append(element["end"].get("y", 0))
+
+        if "vertices" in element:
+            for v in element["vertices"]:
+                all_x.append(v.get("x", 0))
+                all_y.append(v.get("y", 0))
+
+        if "center" in element and element["center"]:
+            all_x.append(element["center"].get("x", 0))
+            all_y.append(element["center"].get("y", 0))
+
+    if not all_x or not all_y:
+        return DEFAULT_GGB_COORD
+
+    # Calculate bounds
+    min_x, max_x = min(all_x), max(all_x)
+    min_y, max_y = min(all_y), max(all_y)
+
+    # Calculate range
+    range_x = max_x - min_x if max_x > min_x else img_dimensions.width
+    range_y = max_y - min_y if max_y > min_y else img_dimensions.height
+
+    # Maintain aspect ratio
+    aspect_ratio = img_dimensions.width / img_dimensions.height
+
+    # Estimate appropriate coordinate range
+    ggb_range_x = range_x / img_dimensions.width * 20
+    ggb_range_y = range_y / img_dimensions.height * 16
+
+    # Add padding
+    ggb_range_x *= 1 + padding_ratio
+    ggb_range_y *= 1 + padding_ratio
+
+    # Use larger range to ensure complete display; the y-range is converted
+    # into x-units because half_y is derived as half_x / aspect_ratio below
+    max_range = max(ggb_range_x, ggb_range_y * aspect_ratio)
+
+    # Ensure minimum range
+    max_range = max(max_range, 10)
+
+    # Center
+    half_x = max_range
/ 2 + half_y = half_x / aspect_ratio + + return GGBCoordSystem(x_min=-half_x, x_max=half_x, y_min=-half_y, y_max=half_y) + + +def format_ggb_point(point: Point, name: str = "", decimals: int = 2) -> str: + """Format as GeoGebra point definition command. + + Args: + point: Point coordinates + name: Point name (optional) + decimals: Decimal places + + Returns: + GeoGebra command string + """ + x_str = f"{point.x:.{decimals}f}" + y_str = f"{point.y:.{decimals}f}" + + if name: + return f"{name} = ({x_str}, {y_str})" + else: + return f"({x_str}, {y_str})" + + +def format_set_coord_system(ggb_coord: GGBCoordSystem, decimals: int = 0) -> str: + """Format as SetCoordSystem command.""" + return ( + f"SetCoordSystem[{ggb_coord.x_min:.{decimals}f}, " + f"{ggb_coord.x_max:.{decimals}f}, " + f"{ggb_coord.y_min:.{decimals}f}, " + f"{ggb_coord.y_max:.{decimals}f}]" + ) diff --git a/src/tools/vision/ggb_validator.py b/src/tools/vision/ggb_validator.py new file mode 100644 index 00000000..bbd5f9c0 --- /dev/null +++ b/src/tools/vision/ggb_validator.py @@ -0,0 +1,283 @@ +"""GeoGebra command validator and fixer. + +This module validates GeoGebra commands and attempts to fix common mistakes +that LLMs might make when generating GeoGebra scripts. 
+""" + +from dataclasses import dataclass, field +import re + + +@dataclass +class ValidationResult: + """Result of validating a GeoGebra command.""" + + original: str + fixed: str + is_valid: bool + warnings: list[str] = field(default_factory=list) + errors: list[str] = field(default_factory=list) + + +# Commands that should use square brackets (GeoGebra standard syntax) +COMMANDS_WITH_BRACKETS = { + # Points + "Point", + "Midpoint", + "Intersect", + "Center", + "Focus", + "Vertex", + # Lines + "Line", + "Segment", + "Ray", + "Perpendicular", + "PerpendicularBisector", + "AngleBisector", + "Tangent", + "Asymptote", + "Directrix", + # Vectors + "Vector", + "UnitVector", + "PerpendicularVector", + # Circles and Conics + "Circle", + "Ellipse", + "Hyperbola", + "Parabola", + "Conic", + # Polygons + "Polygon", + # Angles + "Angle", + # Transformations + "Translate", + "Rotate", + "Reflect", + "Dilate", + # Functions + "Derivative", + "Integral", + "If", + "Function", + # Styling + "SetColor", + "SetLineThickness", + "SetPointSize", + "SetFilling", + "SetLabelVisible", + "SetCaption", + "SetVisible", + "SetLineStyle", + # View + "SetCoordSystem", + "ShowAxes", + "ShowGrid", + "ZoomIn", + "ZoomOut", + # Text + "Text", + # Other + "Locus", + "Sequence", + "Element", + "Length", +} + +# Common mistakes patterns and their fixes +COMMON_MISTAKES = [ + # Point({x, y}) -> (x, y) + (r"Point\s*\(\s*\{\s*([^}]+)\s*\}\s*\)", r"(\1)"), + # log(10, x) -> lg(x) + (r"\blog\s*\(\s*10\s*,\s*([^)]+)\s*\)", r"lg(\1)"), + # Remove # comments (GeoGebra doesn't support them) + (r"^\s*#.*$", ""), +] + +# Patterns for detecting parentheses that should be brackets +PAREN_TO_BRACKET_PATTERN = re.compile( + r"\b(" + "|".join(COMMANDS_WITH_BRACKETS) + r")\s*\(([^()]*(?:\([^()]*\)[^()]*)*)\)", + re.IGNORECASE, +) + + +def fix_brackets(command: str) -> tuple[str, list[str]]: + """Fix commands that use parentheses instead of square brackets. 
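The paren-to-bracket repair is the workhorse fix: LLMs habitually emit `Circle((0, 0), 5)` where classic GeoGebra syntax expects `Circle[(0, 0), 5]`. A standalone sketch of the same substitution over a reduced command set (the inner group tolerates one level of nested parentheses, which is enough for point literals):

```python
import re

# Reduced command set for illustration; the real module lists ~60 commands.
commands = ("Circle", "Segment", "Midpoint")
pattern = re.compile(
    r"\b(" + "|".join(commands) + r")\s*\(([^()]*(?:\([^()]*\)[^()]*)*)\)",
    re.IGNORECASE,
)

def fix(cmd):
    # Rewrite CommandName(args) as CommandName[args], keeping args verbatim.
    return pattern.sub(lambda m: f"{m.group(1)}[{m.group(2)}]", cmd)

print(fix("c = Circle((0, 0), 5)"))  # c = Circle[(0, 0), 5]
print(fix("M = Midpoint(A, B)"))     # M = Midpoint[A, B]
```

Plain function calls such as `sin(1)` pass through untouched because only listed command names match the alternation.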
+ + Args: + command: A single GeoGebra command + + Returns: + Tuple of (fixed command, list of warnings) + """ + warnings = [] + fixed = command + + def replace_with_brackets(match): + cmd_name = match.group(1) + args = match.group(2) + warnings.append(f"Changed {cmd_name}(...) to {cmd_name}[...]") + return f"{cmd_name}[{args}]" + + fixed = PAREN_TO_BRACKET_PATTERN.sub(replace_with_brackets, fixed) + + return fixed, warnings + + +def fix_common_mistakes(command: str) -> tuple[str, list[str]]: + """Fix common LLM mistakes in GeoGebra commands. + + Args: + command: A single GeoGebra command + + Returns: + Tuple of (fixed command, list of warnings) + """ + warnings = [] + fixed = command + + for pattern, replacement in COMMON_MISTAKES: + if re.search(pattern, fixed, re.MULTILINE): + old = fixed + fixed = re.sub(pattern, replacement, fixed, flags=re.MULTILINE) + if old != fixed: + warnings.append(f"Fixed pattern: {pattern}") + + return fixed, warnings + + +def validate_equation_format(command: str) -> tuple[str, list[str]]: + """Validate and fix equation formats. + + Args: + command: A single GeoGebra command + + Returns: + Tuple of (fixed command, list of warnings) + """ + warnings = [] + + # Check for fractional coefficients in conic equations + if re.search(r"[xy]\s*\^\s*2\s*/\s*\d+", command): + warnings.append( + "Equation contains fractional coefficients. " + "Consider using integer form (e.g., '9x^2 + 4y^2 = 36' instead of 'x^2/4 + y^2/9 = 1')" + ) + + return command, warnings + + +def validate_command(command: str) -> ValidationResult: + """Validate and fix a single GeoGebra command. 
+ + Args: + command: A single GeoGebra command + + Returns: + ValidationResult with the fixed command and any warnings/errors + """ + result = ValidationResult(original=command, fixed=command, is_valid=True) + + # Skip empty lines + if not command.strip(): + return result + + # Skip comment lines + if command.strip().startswith("#"): + result.fixed = "" + result.warnings.append("Removed comment line (GeoGebra doesn't support # comments)") + return result + + # Fix common mistakes + fixed, warnings = fix_common_mistakes(command) + result.fixed = fixed + result.warnings.extend(warnings) + + # Fix brackets + fixed, warnings = fix_brackets(result.fixed) + result.fixed = fixed + result.warnings.extend(warnings) + + # Check equation format + _, warnings = validate_equation_format(result.fixed) + result.warnings.extend(warnings) + + # Check if anything was fixed + if result.original != result.fixed: + result.is_valid = False + + return result + + +def validate_ggbscript(script: str) -> tuple[str, list[str], list[str]]: + """Validate and fix a complete GGBScript. 
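validate_ggbscript applies this per command line, collecting line-numbered warnings. A reduced standalone sketch showing just the comment-stripping pass (names are illustrative; the real function also runs the bracket and pattern fixes and preserves indentation):

```python
def strip_comments(script: str):
    """Drop '#' comment lines, which GeoGebra does not support."""
    fixed, warnings = [], []
    for i, line in enumerate(script.split("\n"), 1):
        if line.strip().startswith("#"):
            warnings.append(f"Line {i}: removed unsupported comment")
            continue  # comment line is dropped entirely
        fixed.append(line)
    return "\n".join(fixed), warnings

out, warns = strip_comments("# draw a circle\nCircle[(0,0), 2]")
print(out)    # Circle[(0,0), 2]
print(warns)  # ['Line 1: removed unsupported comment']
```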
+ + Args: + script: The full GeoGebra script content + + Returns: + Tuple of (fixed script, list of all warnings, list of all errors) + """ + lines = script.split("\n") + fixed_lines = [] + all_warnings = [] + all_errors = [] + + for i, line in enumerate(lines, 1): + stripped = line.strip() + + # Skip empty lines + if not stripped: + fixed_lines.append(line) + continue + + result = validate_command(stripped) + + if result.warnings: + for warning in result.warnings: + all_warnings.append(f"Line {i}: {warning}") + + if result.errors: + for error in result.errors: + all_errors.append(f"Line {i}: {error}") + + # Add the fixed line (preserve original indentation) + if result.fixed: + indent = len(line) - len(line.lstrip()) + fixed_lines.append(" " * indent + result.fixed) + # If fixed is empty (e.g., removed comment), skip the line + + return "\n".join(fixed_lines), all_warnings, all_errors + + +def get_command_help(command_name: str) -> str | None: + """Get help text for a GeoGebra command. + + Args: + command_name: The name of the command + + Returns: + Help text or None if command not found + """ + help_texts = { + "Circle": "Circle[center, radius] or Circle[center, point] or Circle[A, B, C]", + "Ellipse": "Ellipse[F1, F2, a] (foci and semi-major axis) or Ellipse[F1, F2, P] (foci and point)", + "Hyperbola": "Hyperbola[F1, F2, a] or Hyperbola[F1, F2, P]", + "Parabola": "Parabola[focus, directrix_line]", + "Line": "Line[A, B] (through two points) or Line[point, parallel_line]", + "Segment": "Segment[A, B] or Segment[A, length]", + "Ray": "Ray[A, B] (from A through B)", + "Perpendicular": "Perpendicular[point, line] (perpendicular line through point)", + "Midpoint": "Midpoint[A, B] or Midpoint[segment]", + "Intersect": "Intersect[obj1, obj2] (all intersections) or Intersect[obj1, obj2, n] (nth intersection)", + "Polygon": "Polygon[A, B, C, ...] 
or Polygon[A, B, n] (regular n-gon)", + "SetColor": 'SetColor[obj, r, g, b] (RGB 0-255) or SetColor[obj, "Red"]', + "SetCoordSystem": "SetCoordSystem[xMin, xMax, yMin, yMax]", + "If": "If[condition, then_value, else_value]", + "Derivative": "Derivative[f] or Derivative[f, n] (nth derivative)", + "Integral": "Integral[f] (indefinite) or Integral[f, a, b] (definite)", + } + + return help_texts.get(command_name) diff --git a/src/tools/vision/image_utils.py b/src/tools/vision/image_utils.py new file mode 100644 index 00000000..27496771 --- /dev/null +++ b/src/tools/vision/image_utils.py @@ -0,0 +1,211 @@ +"""Image processing utilities - URL download and format conversion.""" + +import base64 +from urllib.parse import urlparse + +import httpx + +from src.logging import get_logger + +logger = get_logger(__name__) + +# Supported image MIME types +SUPPORTED_IMAGE_TYPES = { + "image/jpeg": "jpeg", + "image/jpg": "jpg", + "image/png": "png", + "image/gif": "gif", + "image/webp": "webp", +} + +# Maximum image size (10MB) +MAX_IMAGE_SIZE = 10 * 1024 * 1024 + +# Request timeout (seconds) +REQUEST_TIMEOUT = 30 + + +class ImageError(Exception): + """Image processing error.""" + + pass + + +def is_valid_image_url(url: str) -> bool: + """Check if a URL is valid for images. + + Args: + url: URL to check + + Returns: + True if valid URL format + """ + try: + parsed = urlparse(url) + return parsed.scheme in ("http", "https") and bool(parsed.netloc) + except Exception: + return False + + +def is_base64_image(data: str) -> bool: + """Check if data is base64 encoded image. + + Args: + data: Data to check + + Returns: + True if data:image/...;base64,... format + """ + return data.startswith("data:image/") and ";base64," in data + + +async def fetch_image_from_url(url: str) -> tuple[bytes, str]: + """Download image from URL. 
+ + Args: + url: Image URL + + Returns: + (image bytes, MIME type) + + Raises: + ImageError: If download fails or format unsupported + """ + if not is_valid_image_url(url): + raise ImageError(f"Invalid image URL: {url}") + + logger.info(f"Fetching image from URL: {url[:100]}...") + + try: + async with httpx.AsyncClient(timeout=REQUEST_TIMEOUT, follow_redirects=True) as client: + response = await client.get(url) + response.raise_for_status() + + # Check Content-Type + content_type = response.headers.get("content-type", "").split(";")[0].strip().lower() + + # If no Content-Type, infer from URL + if not content_type or content_type == "application/octet-stream": + content_type = guess_image_type_from_url(url) + + if content_type not in SUPPORTED_IMAGE_TYPES: + raise ImageError(f"Unsupported image format: {content_type}") + + # Check size + content = response.content + if len(content) > MAX_IMAGE_SIZE: + raise ImageError( + f"Image too large: {len(content) / 1024 / 1024:.1f}MB " + f"(max {MAX_IMAGE_SIZE / 1024 / 1024:.0f}MB)" + ) + + logger.info(f"Image fetched successfully: {len(content)} bytes, type: {content_type}") + + return content, content_type + + except httpx.HTTPStatusError as e: + raise ImageError(f"Failed to download image: HTTP {e.response.status_code}") + except httpx.TimeoutException: + raise ImageError(f"Image download timeout ({REQUEST_TIMEOUT}s)") + except httpx.RequestError as e: + raise ImageError(f"Failed to download image: {e!s}") + + +def guess_image_type_from_url(url: str) -> str: + """Infer image type from URL. 
+ + Args: + url: Image URL + + Returns: + Inferred MIME type + """ + url_lower = url.lower() + + if ".png" in url_lower: + return "image/png" + elif ".jpg" in url_lower or ".jpeg" in url_lower: + return "image/jpeg" + elif ".gif" in url_lower: + return "image/gif" + elif ".webp" in url_lower: + return "image/webp" + else: + # Default to JPEG + return "image/jpeg" + + +def image_bytes_to_base64(content: bytes, mime_type: str) -> str: + """Convert image bytes to base64 data URL. + + Args: + content: Image binary data + mime_type: MIME type + + Returns: + data:image/...;base64,... format string + """ + b64_data = base64.b64encode(content).decode("utf-8") + result = f"data:{mime_type};base64,{b64_data}" + + logger.debug(f"image_bytes_to_base64: input={len(content)} bytes, output={len(b64_data)} chars") + + return result + + +async def url_to_base64(url: str) -> str: + """Convert image URL to base64 data URL. + + Args: + url: Image URL + + Returns: + data:image/...;base64,... format string + + Raises: + ImageError: If download or conversion fails + """ + content, mime_type = await fetch_image_from_url(url) + return image_bytes_to_base64(content, mime_type) + + +async def resolve_image_input( + image_base64: str | None = None, + image_url: str | None = None, +) -> str | None: + """Resolve image input to base64 format. + + Prioritizes image_base64, falls back to downloading from image_url. 
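The conversion target throughout this module is a self-describing data URL, which the frontend can feed straight into an `<img src=...>` or a vision-model payload. A standalone sketch of the encoding and its inverse (the six sample bytes are the start of the PNG signature):

```python
import base64

def to_data_url(content: bytes, mime_type: str) -> str:
    # Same construction as image_bytes_to_base64 above.
    payload = base64.b64encode(content).decode("utf-8")
    return f"data:{mime_type};base64,{payload}"

url = to_data_url(b"\x89PNG\r\n", "image/png")
print(url)  # data:image/png;base64,iVBORw0K

# The reverse: split the header off and decode the payload.
header, payload = url.split(";base64,", 1)
assert base64.b64decode(payload) == b"\x89PNG\r\n"
```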
+ + Args: + image_base64: Base64 format image data + image_url: Image URL + + Returns: + Base64 format image data, or None if no image + + Raises: + ImageError: If URL download or conversion fails + """ + logger.debug( + f"resolve_image_input: base64={'yes' if image_base64 else 'no'}, url={image_url or 'none'}" + ) + + # Prefer base64 + if image_base64: + if is_base64_image(image_base64): + logger.debug("Using provided base64 image") + return image_base64 + else: + logger.error("Invalid base64 image format") + raise ImageError("Invalid base64 image format, should be data:image/...;base64,...") + + # Try to download from URL + if image_url: + logger.debug("Downloading image from URL") + result = await url_to_base64(image_url) + logger.debug(f"Download complete, base64 length: {len(result)}") + return result + + logger.debug("No image provided") + return None diff --git a/web/app/vision-solver/components/AnalysisProgress.tsx b/web/app/vision-solver/components/AnalysisProgress.tsx new file mode 100644 index 00000000..a89884a4 --- /dev/null +++ b/web/app/vision-solver/components/AnalysisProgress.tsx @@ -0,0 +1,153 @@ +"use client"; + +import { CheckCircle2, Circle, Loader2 } from "lucide-react"; +import { useTranslation } from "react-i18next"; +import { VisionSolverState } from "../hooks/useVisionSolver"; + +interface AnalysisProgressProps { + state: VisionSolverState; +} + +const STAGES = [ + { key: "bbox", label: "Element Detection" }, + { key: "analysis", label: "Geometric Analysis" }, + { key: "ggbscript", label: "Command Generation" }, + { key: "reflection", label: "Validation" }, + { key: "answering", label: "Solving Problem" }, +] as const; + +export function AnalysisProgress({ state }: AnalysisProgressProps) { + const { t } = useTranslation(); + const { currentStage, analysisStages } = state; + + const getStageStatus = ( + stageKey: string, + ): "pending" | "active" | "complete" => { + const stageOrder = STAGES.map((s) => s.key); + const currentIndex = 
stageOrder.indexOf(currentStage || "");
+    const stageIndex = stageOrder.indexOf(stageKey);
+
+    // Check if stage has data (completed) - for analysis stages
+    if (
+      stageKey !== "answering" &&
+      analysisStages[stageKey as keyof typeof analysisStages]
+    ) {
+      return "complete";
+    }
+
+    // Check if this is current stage
+    if (currentStage === stageKey) {
+      return "active";
+    }
+
+    // If current stage is after this stage, it's complete
+    if (currentIndex > stageIndex && currentIndex !== -1) {
+      return "complete";
+    }
+
+    // If we're not processing and all previous stages are done, answering is complete
+    if (
+      stageKey === "answering" &&
+      !state.isProcessing &&
+      analysisStages.reflection
+    ) {
+      return "complete";
+    }
+
+    return "pending";
+  };
+
+  return (
+    <div>
+      <div>
+        <div>{t("Analyzing Image")}</div>
+        <div>
+          {t("Extracting geometric elements and generating visualization...")}
+        </div>
+      </div>
+
+      <div>
+        {STAGES.map((stage, index) => {
+          const status = getStageStatus(stage.key);
+
+          return (
+            <div key={stage.key}>
+              {/* Status Icon */}
+              <div>
+                {status === "complete" ? (
+                  <CheckCircle2 />
+                ) : status === "active" ? (
+                  <Loader2 />
+                ) : (
+                  <Circle />
+                )}
+              </div>
+
+              {/* Label */}
+              <div>
+                <span>{t(stage.label)}</span>
+
+                {/* Stage Details */}
+                {status === "complete" &&
+                  stage.key === "bbox" &&
+                  analysisStages.bbox && (
+                    <div>
+                      {t("Found")} {analysisStages.bbox.elements_count}{" "}
+                      {t("elements")}
+                    </div>
+                  )}
+                {status === "complete" &&
+                  stage.key === "analysis" &&
+                  analysisStages.analysis && (
+                    <div>
+                      {analysisStages.analysis.constraints_count}{" "}
+                      {t("constraints")},{" "}
+                      {analysisStages.analysis.relations_count} {t("relations")}
+                    </div>
+                  )}
+                {status === "complete" &&
+                  stage.key === "ggbscript" &&
+                  analysisStages.ggbscript && (
+                    <div>
+                      {analysisStages.ggbscript.commands_count} {t("commands")}
+                    </div>
+                  )}
+                {status === "complete" &&
+                  stage.key === "reflection" &&
+                  analysisStages.reflection && (
+                    <div>
+                      {analysisStages.reflection.issues_count === 0
+                        ? t("No issues found")
+                        : `${analysisStages.reflection.issues_count} ${t("issues fixed")}`}
+                    </div>
+                  )}
+              </div>
+
+              {/* Progress Line */}
+              {index < STAGES.length - 1 && <div />}
+            </div>
+          );
+        })}
+      </div>
+    </div>
+ ); +} diff --git a/web/app/vision-solver/components/GeoGebraCanvas.tsx b/web/app/vision-solver/components/GeoGebraCanvas.tsx new file mode 100644 index 00000000..ecc93062 --- /dev/null +++ b/web/app/vision-solver/components/GeoGebraCanvas.tsx @@ -0,0 +1,358 @@ +"use client"; + +import { useEffect, useRef, useState, useCallback } from "react"; +import { Maximize2, Minimize2, RefreshCw } from "lucide-react"; +import { useTranslation } from "react-i18next"; + +interface GGBCommand { + sequence: number; + command: string; + description: string; +} + +interface GeoGebraCanvasProps { + /** GeoGebra commands to execute (parsed format) */ + commands: GGBCommand[]; + /** Unique identifier for this canvas */ + pageId: string; + /** Title of the canvas */ + title?: string; + /** Whether the canvas is embedded (compact mode) */ + embedded?: boolean; + /** Custom height for embedded mode */ + height?: number; + /** Whether to show the header */ + showHeader?: boolean; +} + +// Declare GeoGebra global types +declare global { + interface Window { + GGBApplet: any; + } +} + +// Helper function to execute GeoGebra commands +function runGGBCommands(api: any, cmds: GGBCommand[]) { + if (!api) return; + + try { + // Sort by sequence and execute + const sorted = [...cmds].sort((a, b) => a.sequence - b.sequence); + + for (const cmd of sorted) { + if (cmd.command && cmd.command.trim()) { + try { + api.evalCommand(cmd.command); + } catch (e) { + console.warn(`GeoGebra command failed: ${cmd.command}`, e); + } + } + } + } catch (error) { + console.error("Error executing GeoGebra commands:", error); + } +} + +export function GeoGebraCanvas({ + commands, + pageId, + title, + embedded = false, + height = 300, + showHeader = true, +}: GeoGebraCanvasProps) { + const { t } = useTranslation(); + const containerRef = useRef(null); + const ggbContainerRef = useRef(null); + const appletRef = useRef(null); + const [isExpanded, setIsExpanded] = useState(false); + const [isLoaded, setIsLoaded] = 
useState(false);
+  const [isLoading, setIsLoading] = useState(true);
+  const [error, setError] = useState<string | null>(null);
+  const initializedPageRef = useRef<string | null>(null);
+  const executedCommandsRef = useRef<Set<string>>(new Set());
+
+  // Load GeoGebra script
+  useEffect(() => {
+    if (typeof window === "undefined") return;
+
+    // If GGBApplet is already available, defer state update to avoid cascading renders
+    if (typeof window.GGBApplet !== "undefined") {
+      const timeoutId = setTimeout(() => setIsLoaded(true), 0);
+      return () => clearTimeout(timeoutId);
+    }
+
+    // Check if script is already being loaded
+    const existingScript = document.querySelector(
+      'script[src*="geogebra.org/apps/deployggb.js"]',
+    );
+    if (existingScript) {
+      existingScript.addEventListener("load", () => setIsLoaded(true));
+      return;
+    }
+
+    const script = document.createElement("script");
+    script.src = "https://www.geogebra.org/apps/deployggb.js";
+    script.async = true;
+    script.onload = () => {
+      setIsLoaded(true);
+    };
+    script.onerror = () => {
+      console.error("Failed to load GeoGebra script");
+      setError("Failed to load GeoGebra library");
+      setIsLoading(false);
+    };
+    document.head.appendChild(script);
+
+    return () => {
+      // Don't remove script as it might be used by other instances
+    };
+  }, []);
+
+  // Cleanup applet function
+  const cleanupApplet = useCallback(() => {
+    if (appletRef.current) {
+      try {
+        appletRef.current.remove();
+      } catch {
+        // Ignore cleanup errors
+      }
+      appletRef.current = null;
+    }
+    executedCommandsRef.current.clear();
+  }, []);
+
+  // Initialize applet when script is loaded
+  useEffect(() => {
+    if (!isLoaded || !containerRef.current || typeof window === "undefined")
+      return;
+
+    const container = containerRef.current;
+    const appletId = `ggb-${pageId}`;
+
+    // If already initialized for this pageId, skip
+    if (initializedPageRef.current === pageId && appletRef.current) {
+      return;
+    }
+
+    // Cleanup previous applet if exists
+    if (initializedPageRef.current &&
initializedPageRef.current !== pageId) { + cleanupApplet(); + } + + // Use ref to track initialization state to avoid setState in effect body + initializedPageRef.current = pageId; + + // Create a new div for GeoGebra injection + const ggbDiv = document.createElement("div"); + ggbDiv.id = `ggb-container-${appletId}`; + + // Clear previous GeoGebra container if exists + if ( + ggbContainerRef.current && + container.contains(ggbContainerRef.current) + ) { + container.removeChild(ggbContainerRef.current); + } + + container.appendChild(ggbDiv); + ggbContainerRef.current = ggbDiv; + + // Store commands in a variable for the callback + const cmds = commands; + const commandsKey = JSON.stringify(cmds); + + const params = { + id: appletId, + appName: "classic", + width: container.offsetWidth || 600, + height: isExpanded ? 500 : height, + showToolBar: false, + showMenuBar: false, + showAlgebraInput: false, + showResetIcon: false, + enableLabelDrags: false, + enableShiftDragZoom: true, + enableRightClick: false, + showFullscreenButton: false, + showZoomButtons: true, + capturingThreshold: null, + showAnimationButton: false, + preventFocus: true, + borderColor: "#ffffff", + appletOnLoad: (api: any) => { + appletRef.current = api; + setIsLoading(false); + + // Execute commands only if not already executed + if (!executedCommandsRef.current.has(commandsKey)) { + runGGBCommands(api, cmds); + executedCommandsRef.current.add(commandsKey); + } + }, + }; + + let errorTimeoutId: NodeJS.Timeout | undefined; + try { + const applet = new window.GGBApplet(params, true); + applet.inject(ggbDiv); + } catch (err) { + console.error("Failed to initialize GeoGebra applet:", err); + // Defer state updates to avoid cascading renders + errorTimeoutId = setTimeout(() => { + setError("Failed to initialize GeoGebra"); + setIsLoading(false); + }, 0); + } + + return () => { + // Don't try to remove DOM elements here - let React handle cleanup + if (errorTimeoutId) clearTimeout(errorTimeoutId); + }; + 
}, [isLoaded, pageId, isExpanded, commands, height, cleanupApplet]); + + // Execute new commands when they change + useEffect(() => { + if (!commands || commands.length === 0 || !appletRef.current) return; + + const commandsKey = JSON.stringify(commands); + + // Only execute if these commands haven't been executed before + if (!executedCommandsRef.current.has(commandsKey)) { + runGGBCommands(appletRef.current, commands); + executedCommandsRef.current.add(commandsKey); + } + }, [commands]); + + // Handle resize + useEffect(() => { + if (!containerRef.current || !appletRef.current) return; + + const resizeObserver = new ResizeObserver(() => { + appletRef.current?.recalculateEnvironments?.(); + }); + + resizeObserver.observe(containerRef.current); + + return () => { + resizeObserver.disconnect(); + }; + }, [isLoaded]); + + // Cleanup on unmount + useEffect(() => { + return () => { + cleanupApplet(); + }; + }, [cleanupApplet]); + + // Refresh/re-run commands + const handleRefresh = useCallback(() => { + if (appletRef.current) { + try { + appletRef.current.reset(); + executedCommandsRef.current.clear(); + runGGBCommands(appletRef.current, commands); + executedCommandsRef.current.add(JSON.stringify(commands)); + } catch { + // If reset fails, recreate by changing pageId + initializedPageRef.current = null; + setIsLoaded(false); + setTimeout(() => setIsLoaded(true), 100); + } + } + }, [commands]); + + if (commands.length === 0) { + return null; + } + + // Container styling based on mode + const containerClassName = embedded + ? "flex flex-col bg-white dark:bg-slate-800 rounded-lg border border-slate-200 dark:border-slate-700 overflow-hidden" + : `bg-white dark:bg-slate-800 rounded-xl border border-slate-200 dark:border-slate-700 overflow-hidden transition-all ${ + isExpanded ? "fixed inset-4 z-50" : "" + }`; + + const canvasHeight = isExpanded ? "calc(100% - 40px)" : `${height}px`; + + return ( +
+ {/* Header */} + {showHeader && ( +
+ + {title || t("GeoGebra Visualization")} + +
+ {isLoading && ( + + {t("Loading...")} + + )} + + {!embedded && ( + + )} +
+
+ )} + + {/* Canvas */} +
+ {error ? ( +
+ {error} +
+ ) : ( + <> + {isLoading && ( +
+
+
+ + {t("Loading GeoGebra...")} + +
+
+ )} + + )} +
+ + {/* Backdrop for expanded mode */} + {isExpanded && !embedded && ( +
setIsExpanded(false)} + /> + )} +
+ ); +} + +export default GeoGebraCanvas; diff --git a/web/app/vision-solver/components/ImageUpload.tsx b/web/app/vision-solver/components/ImageUpload.tsx new file mode 100644 index 00000000..8dca72a8 --- /dev/null +++ b/web/app/vision-solver/components/ImageUpload.tsx @@ -0,0 +1,172 @@ +"use client"; + +import { useCallback, useRef, useState } from "react"; +import { Image as ImageIcon, Upload, Link as LinkIcon } from "lucide-react"; +import { useTranslation } from "react-i18next"; + +interface ImageUploadProps { + onImageUpload: (base64: string) => void; + disabled?: boolean; +} + +export function ImageUpload({ onImageUpload, disabled }: ImageUploadProps) { + const { t } = useTranslation(); + const fileInputRef = useRef(null); + const [showUrlInput, setShowUrlInput] = useState(false); + const [urlValue, setUrlValue] = useState(""); + const [isLoading, setIsLoading] = useState(false); + + const handleFileSelect = useCallback( + async (file: File) => { + if (!file.type.startsWith("image/")) { + alert(t("Please select an image file")); + return; + } + + // Check file size (max 10MB) + if (file.size > 10 * 1024 * 1024) { + alert(t("Image size must be less than 10MB")); + return; + } + + // Convert to base64 + const reader = new FileReader(); + reader.onload = (e) => { + const base64 = e.target?.result as string; + onImageUpload(base64); + }; + reader.readAsDataURL(file); + }, + [onImageUpload, t], + ); + + const handleFileInputChange = useCallback( + (e: React.ChangeEvent) => { + const file = e.target.files?.[0]; + if (file) { + handleFileSelect(file); + } + // Reset input + e.target.value = ""; + }, + [handleFileSelect], + ); + + const handleDrop = useCallback( + (e: React.DragEvent) => { + e.preventDefault(); + const file = e.dataTransfer.files[0]; + if (file) { + handleFileSelect(file); + } + }, + [handleFileSelect], + ); + + const handlePaste = useCallback( + (e: React.ClipboardEvent) => { + const items = e.clipboardData.items; + for (const item of items) { + if 
(item.type.startsWith("image/")) {
+          const file = item.getAsFile();
+          if (file) {
+            handleFileSelect(file);
+          }
+          break;
+        }
+      }
+    },
+    [handleFileSelect],
+  );
+
+  const handleUrlSubmit = useCallback(async () => {
+    if (!urlValue.trim()) return;
+
+    setIsLoading(true);
+    try {
+      const response = await fetch(urlValue);
+      // fetch() only rejects on network failure, so surface HTTP errors too
+      if (!response.ok) {
+        throw new Error(`HTTP ${response.status}`);
+      }
+      const blob = await response.blob();
+      const reader = new FileReader();
+      reader.onload = (e) => {
+        const base64 = e.target?.result as string;
+        onImageUpload(base64);
+        setShowUrlInput(false);
+        setUrlValue("");
+      };
+      reader.readAsDataURL(blob);
+    } catch {
+      alert(t("Failed to load image from URL"));
+    } finally {
+      setIsLoading(false);
+    }
+  }, [urlValue, onImageUpload, t]);
+
+  return (
+ + + {/* File Upload Button */} + + + {/* URL Input Toggle Button */} + + + {/* URL Input Popover */} + {showUrlInput && ( +
+
+ setUrlValue(e.target.value)} + placeholder={t("Enter image URL")} + className="flex-1 px-3 py-2 text-sm bg-slate-50 dark:bg-slate-700 border border-slate-200 dark:border-slate-600 rounded-lg focus:outline-none focus:ring-2 focus:ring-indigo-500/20 focus:border-indigo-500" + onKeyDown={(e) => e.key === "Enter" && handleUrlSubmit()} + autoFocus + /> + +
+
+ )} +
+ ); +} diff --git a/web/app/vision-solver/components/index.ts b/web/app/vision-solver/components/index.ts new file mode 100644 index 00000000..a6dc1fbe --- /dev/null +++ b/web/app/vision-solver/components/index.ts @@ -0,0 +1,3 @@ +export { ImageUpload } from "./ImageUpload"; +export { GeoGebraCanvas } from "./GeoGebraCanvas"; +export { AnalysisProgress } from "./AnalysisProgress"; diff --git a/web/app/vision-solver/hooks/useVisionSolver.ts b/web/app/vision-solver/hooks/useVisionSolver.ts new file mode 100644 index 00000000..9caf5930 --- /dev/null +++ b/web/app/vision-solver/hooks/useVisionSolver.ts @@ -0,0 +1,402 @@ +"use client"; + +import { useState, useCallback, useRef, useEffect } from "react"; +import { wsUrl } from "@/lib/api"; + +export interface GGBCommand { + sequence: number; + command: string; + description: string; +} + +/** GeoGebra block parsed from content */ +export interface GGBBlock { + pageId: string; + title: string; + content: string; + commands: GGBCommand[]; +} + +export interface Message { + role: "user" | "assistant"; + content: string; + image?: string; + /** GGB commands for image reconstruction (analysis message) */ + ggbCommands?: GGBCommand[]; + /** GGB blocks parsed from answer content (for solution visualization) */ + ggbBlocks?: GGBBlock[]; + /** Message type: analysis or answer */ + messageType?: "analysis" | "answer"; +} + +export interface AnalysisStages { + bbox?: { + elements_count: number; + elements?: Array<{ type: string; label: string }>; + }; + analysis?: { + constraints_count: number; + relations_count: number; + image_is_reference: boolean; + constraints?: string[]; + }; + ggbscript?: { + commands_count: number; + commands?: Array<{ command: string; description: string }>; + }; + reflection?: { + issues_count: number; + commands_count: number; + final_commands?: GGBCommand[]; + }; +} + +export interface VisionSolverState { + messages: Message[]; + isProcessing: boolean; + currentStage: string | null; + sessionId: string 
| null; + analysisStages: AnalysisStages; + error: string | null; + /** Base GGB commands from image reconstruction (used as environment for solution blocks) */ + baseGGBCommands: GGBCommand[]; +} + +const initialState: VisionSolverState = { + messages: [], + isProcessing: false, + currentStage: null, + sessionId: null, + analysisStages: {}, + error: null, + baseGGBCommands: [], +}; + +export function useVisionSolver() { + const [state, setState] = useState(initialState); + const wsRef = useRef(null); + const pendingMessageRef = useRef<{ question: string; image?: string } | null>( + null, + ); + + // Cleanup WebSocket on unmount + useEffect(() => { + return () => { + if (wsRef.current) { + wsRef.current.close(); + } + }; + }, []); + + // Handle WebSocket messages - defined before sendMessage to avoid access before declaration + const handleWebSocketMessage = useCallback((data: any) => { + const { type, data: eventData } = data; + + switch (type) { + case "session": + setState((prev) => ({ + ...prev, + sessionId: eventData?.session_id || data.session_id, + })); + break; + + case "analysis_start": + setState((prev) => ({ + ...prev, + currentStage: "bbox", + })); + break; + + case "bbox_complete": + setState((prev) => ({ + ...prev, + currentStage: "analysis", + analysisStages: { + ...prev.analysisStages, + bbox: eventData, + }, + })); + break; + + case "analysis_complete": + setState((prev) => ({ + ...prev, + currentStage: "ggbscript", + analysisStages: { + ...prev.analysisStages, + analysis: eventData, + }, + })); + break; + + case "ggbscript_complete": + setState((prev) => ({ + ...prev, + currentStage: "reflection", + analysisStages: { + ...prev.analysisStages, + ggbscript: eventData, + }, + })); + break; + + case "reflection_complete": + setState((prev) => ({ + ...prev, + currentStage: "complete", + analysisStages: { + ...prev.analysisStages, + reflection: eventData, + }, + })); + break; + + case "analysis_message_complete": + // Add assistant message with GGB 
commands (for image reconstruction) + // Also save these commands as base environment for solution blocks + if (eventData?.ggb_block) { + const ggbCommands = parseGGBCommands(eventData.ggb_block.content); + setState((prev) => ({ + ...prev, + // Save base commands for solution blocks to inherit + baseGGBCommands: ggbCommands, + messages: [ + ...prev.messages, + { + role: "assistant", + content: "", + ggbCommands, + messageType: "analysis", + }, + ], + })); + } + break; + + case "answer_start": + // Start the tutor response phase - create a new answer message + setState((prev) => { + const messages = [...prev.messages]; + // Always create a new answer message separate from analysis message + messages.push({ + role: "assistant", + content: "", + messageType: "answer", + ggbBlocks: [], + }); + + return { + ...prev, + currentStage: "answering", + messages, + }; + }); + break; + + case "text": + // Append text content to last assistant message + setState((prev) => { + const messages = [...prev.messages]; + const lastMsg = messages[messages.length - 1]; + if (lastMsg && lastMsg.role === "assistant") { + messages[messages.length - 1] = { + ...lastMsg, + content: + lastMsg.content + (eventData?.content || data.content || ""), + }; + } else { + // Create new assistant message if none exists + messages.push({ + role: "assistant", + content: eventData?.content || data.content || "", + }); + } + return { ...prev, messages }; + }); + break; + + case "done": + // Parse GGB blocks from answer message content when done + setState((prev) => { + const messages = prev.messages.map((msg) => { + // Only parse GGB blocks for answer messages + if ( + msg.role === "assistant" && + msg.messageType === "answer" && + msg.content + ) { + const ggbBlocks = parseGGBBlocks(msg.content); + return { + ...msg, + ggbBlocks, + }; + } + return msg; + }); + + return { + ...prev, + messages, + isProcessing: false, + currentStage: null, + }; + }); + break; + + case "error": + setState((prev) => ({ + 
...prev, + isProcessing: false, + error: eventData?.content || data.content || "Unknown error", + })); + break; + + case "no_image": + // No image provided, just continue without analysis + break; + + default: + console.log("Unknown message type:", type, data); + } + }, []); + + const sendMessage = useCallback( + (question: string, image_base64?: string) => { + // Add user message immediately + const userMessage: Message = { + role: "user", + content: question, + image: image_base64, + }; + + setState((prev) => ({ + ...prev, + messages: [...prev.messages, userMessage], + isProcessing: true, + currentStage: "connecting", + analysisStages: {}, + error: null, + })); + + // Store pending message for WebSocket + pendingMessageRef.current = { question, image: image_base64 }; + + // Create WebSocket connection + const ws = new WebSocket(wsUrl("/api/v1/vision/solve")); + wsRef.current = ws; + + ws.onopen = () => { + console.log("Vision solver WebSocket connected"); + if (pendingMessageRef.current) { + ws.send( + JSON.stringify({ + question: pendingMessageRef.current.question, + image_base64: pendingMessageRef.current.image, + }), + ); + pendingMessageRef.current = null; + } + }; + + ws.onmessage = (event) => { + try { + const data = JSON.parse(event.data); + handleWebSocketMessage(data); + } catch (e) { + console.error("Failed to parse WebSocket message:", e); + } + }; + + ws.onerror = (error) => { + console.error("WebSocket error:", error); + setState((prev) => ({ + ...prev, + isProcessing: false, + error: "Connection error", + })); + }; + + ws.onclose = () => { + console.log("Vision solver WebSocket closed"); + wsRef.current = null; + }; + }, + [handleWebSocketMessage], + ); + + const reset = useCallback(() => { + if (wsRef.current) { + wsRef.current.close(); + } + setState(initialState); + }, []); + + return { + state, + sendMessage, + reset, + }; +} + +// Parse GGB commands from content string +export function parseGGBCommands(content: string): GGBCommand[] { + if 
(!content) return []; + + const lines = content.split("\n").filter((line) => line.trim()); + const commands: GGBCommand[] = []; + let sequence = 0; + let currentDescription = ""; + + for (const line of lines) { + const trimmed = line.trim(); + if (trimmed.startsWith("#")) { + // Comment line, use as description for next command + currentDescription = trimmed.slice(1).trim(); + } else if (trimmed) { + sequence++; + commands.push({ + sequence, + command: trimmed, + description: currentDescription, + }); + currentDescription = ""; + } + } + + return commands; +} + +/** + * Parse GGBScript blocks from message content + * Matches: ```ggbscript[page-id;optional-title] or ```geogebra[page-id;optional-title] + */ +export function parseGGBBlocks(content: string): GGBBlock[] { + if (!content) return []; + + const blocks: GGBBlock[] = []; + const blockPattern = + /```\s*(ggbscript|geogebra)\s*\[([^\]\s;]+)(?:;([^\]]*))?\]\s*\n?([\s\S]*?)```/gi; + + let match; + while ((match = blockPattern.exec(content)) !== null) { + const blockContent = match[4].trim(); + blocks.push({ + pageId: match[2], + title: (match[3] || "GeoGebra").trim(), + content: blockContent, + commands: parseGGBCommands(blockContent), + }); + } + + return blocks; +} + +/** + * Remove GGBScript blocks from content to get plain text + */ +export function stripGGBBlocks(content: string): string { + if (!content) return ""; + return content + .replace(/```\s*(ggbscript|geogebra)\s*\[[^\]]*\]\s*\n?[\s\S]*?```/gi, "") + .trim(); +} diff --git a/web/app/vision-solver/page.tsx b/web/app/vision-solver/page.tsx new file mode 100644 index 00000000..df9e609b --- /dev/null +++ b/web/app/vision-solver/page.tsx @@ -0,0 +1,475 @@ +"use client"; + +import { useState, useRef, useCallback, useEffect, useMemo } from "react"; +import { + Send, + Loader2, + Bot, + User, + Image as ImageIcon, + CheckCircle2, + Trash2, + X, +} from "lucide-react"; +import ReactMarkdown from "react-markdown"; +import remarkGfm from "remark-gfm"; 
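A quick standalone check of the GGBScript block grammar that `parseGGBBlocks` and `parseGGBCommands` in useVisionSolver.ts accept (blocks fenced as `ggbscript[page-id;optional-title]`). The parser bodies below are copied from the diff; the sample message and its commands are hypothetical, and the fence string is built with `repeat()` so it nests safely here.

````typescript
// Standalone sketch of the GGB parsing helpers from useVisionSolver.ts.
interface GGBCommand { sequence: number; command: string; description: string; }
interface GGBBlock { pageId: string; title: string; content: string; commands: GGBCommand[]; }

function parseGGBCommands(content: string): GGBCommand[] {
  if (!content) return [];
  const commands: GGBCommand[] = [];
  let sequence = 0;
  let currentDescription = "";
  for (const line of content.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed) continue;
    if (trimmed.startsWith("#")) {
      // A comment line becomes the description of the next command
      currentDescription = trimmed.slice(1).trim();
    } else {
      sequence++;
      commands.push({ sequence, command: trimmed, description: currentDescription });
      currentDescription = "";
    }
  }
  return commands;
}

function parseGGBBlocks(content: string): GGBBlock[] {
  if (!content) return [];
  const blocks: GGBBlock[] = [];
  const blockPattern =
    /```\s*(ggbscript|geogebra)\s*\[([^\]\s;]+)(?:;([^\]]*))?\]\s*\n?([\s\S]*?)```/gi;
  let match: RegExpExecArray | null;
  while ((match = blockPattern.exec(content)) !== null) {
    const blockContent = match[4].trim();
    blocks.push({
      pageId: match[2],
      title: (match[3] || "GeoGebra").trim(),
      content: blockContent,
      commands: parseGGBCommands(blockContent),
    });
  }
  return blocks;
}

// Hypothetical assistant message containing one fenced GGB block.
const fence = "`".repeat(3);
const message = [
  "Here is the construction:",
  `${fence}ggbscript[step-1;Base triangle]`,
  "# Place the vertices",
  "A=(0,0)",
  "B=(4,0)",
  "Polygon(A,B,(2,3))",
  fence,
].join("\n");

const blocks = parseGGBBlocks(message);
console.log(blocks[0].pageId, blocks[0].title); // step-1 Base triangle
console.log(blocks[0].commands.length);         // 3
````

Note that the page-id capture `[^\]\s;]+` stops at the first `;`, so the optional title may contain spaces but never `]`.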
+import remarkMath from "remark-math"; +import rehypeKatex from "rehype-katex"; +import "katex/dist/katex.min.css"; +import { useTranslation } from "react-i18next"; + +import { ImageUpload } from "./components/ImageUpload"; +import { GeoGebraCanvas } from "./components/GeoGebraCanvas"; +import { AnalysisProgress } from "./components/AnalysisProgress"; +import { + useVisionSolver, + VisionSolverState, + parseGGBBlocks, + stripGGBBlocks, + GGBBlock, +} from "./hooks/useVisionSolver"; + +/** + * Answer message content component with real-time GGB block filtering and rendering. + * This component filters out GGB code blocks from the content and renders them as GeoGebra canvases. + */ +interface AnswerMessageContentProps { + content: string; + baseGGBCommands: Array<{ + sequence: number; + command: string; + description: string; + }>; + messageIdx: number; + isProcessing: boolean; + t: (key: string) => string; +} + +function AnswerMessageContent({ + content, + baseGGBCommands, + messageIdx, + isProcessing, + t, +}: AnswerMessageContentProps) { + // Real-time parsing: filter out GGB blocks and parse them for rendering + const { strippedContent, ggbBlocks } = useMemo(() => { + return { + strippedContent: stripGGBBlocks(content), + ggbBlocks: parseGGBBlocks(content), + }; + }, [content]); + + return ( + <> + {/* Render filtered text content (without GGB code blocks) */} + {strippedContent && ( +
+
+ + {strippedContent} + +
+ {/* Show verification badge only when processing is complete */} + {!isProcessing && ( +
+ + {t("Verified by DeepTutor Vision Engine")} +
+ )} +
+ )} + + {/* Real-time render GGB blocks as GeoGebra canvases */} + {ggbBlocks.length > 0 && ( +
+ {ggbBlocks.map((block, blockIdx) => { + // Merge base commands (from image reconstruction) with solution block commands + // This ensures points A, B etc. from reconstruction are available + const mergedCommands = [ + ...baseGGBCommands.map((cmd, i) => ({ + ...cmd, + sequence: i + 1, + })), + ...block.commands.map((cmd, i) => ({ + ...cmd, + sequence: baseGGBCommands.length + i + 1, + })), + ]; + + return ( +
+
+ + {blockIdx + 1} + + {block.title} +
+ +
+ ); + })} +
+ )} + + ); +} + +export default function VisionSolverPage() { + const { t } = useTranslation(); + const [inputQuestion, setInputQuestion] = useState(""); + const [imageData, setImageData] = useState(null); + const chatEndRef = useRef(null); + const chatContainerRef = useRef(null); + + const { state, sendMessage, reset } = useVisionSolver(); + + // Auto-scroll chat - use scrollTop instead of scrollIntoView to prevent page scroll + useEffect(() => { + if (state.messages.length > 0 && chatContainerRef.current) { + const container = chatContainerRef.current; + // Smooth scroll to the bottom of the chat container + container.scrollTo({ + top: container.scrollHeight, + behavior: "smooth", + }); + } + }, [state.messages, state.isProcessing, state.currentStage]); + + const handleImageUpload = useCallback((base64: string) => { + setImageData(base64); + }, []); + + const handleRemoveImage = useCallback(() => { + setImageData(null); + }, []); + + const handleSubmit = useCallback(() => { + if (!inputQuestion.trim()) return; + sendMessage(inputQuestion, imageData || undefined); + setInputQuestion(""); + setImageData(null); + }, [inputQuestion, imageData, sendMessage]); + + return ( +
+ {/* Left Panel: Chat Interface */} +
+ {/* Header */} +
+
+
+ {t("Vision Solver")} +
+
+ {state.messages.length > 0 && ( + + )} +
+
+ + {/* Chat Area */} +
+ {/* Initial State */} + {state.messages.length === 0 && !state.isProcessing && ( +
+
+ +
+

+ {t("Upload a Math Problem Image")} +

+

+ {t( + "Upload an image of a geometry problem and I'll analyze it, recreate the figure in GeoGebra, and help you solve it.", + )} +

+
+ )} + + {/* Messages */} + {state.messages.map((msg, idx) => ( +
+ {msg.role === "user" ? ( + <> +
+ +
+
+
+ {msg.content} +
+ {msg.image && ( +
+ {/* eslint-disable-next-line @next/next/no-img-element */} + Uploaded problem +
+ )} +
+ + ) : ( + <> +
+ +
+
+ {/* Analysis Message: GeoGebra Canvas for figure reconstruction */} + {msg.messageType === "analysis" && + msg.ggbCommands && + msg.ggbCommands.length > 0 && ( +
+
+ {t("Figure Reconstruction")} +
+ +
+ )} + + {/* Legacy support: ggbCommands without messageType */} + {!msg.messageType && + msg.ggbCommands && + msg.ggbCommands.length > 0 && ( + + )} + + {/* Answer Message: Text Content with real-time GGB block filtering */} + {msg.messageType === "answer" && msg.content && ( + + )} + + {/* Non-answer messages: render content as-is */} + {msg.messageType !== "answer" && msg.content && ( +
+
+ + {msg.content} + +
+
+ )} +
+ + )} +
+ ))} + + {/* Processing State */} + {state.isProcessing && ( +
+
+ +
+
+ +
+
+ )} + +
+
+ + {/* Input Area */} +
+ {/* Image Preview */} + {imageData && ( +
+
+ {/* eslint-disable-next-line @next/next/no-img-element */} + Preview + +
+
+ )} + +
+
+ +
+ setInputQuestion(e.target.value)} + onKeyDown={(e) => e.key === "Enter" && handleSubmit()} + disabled={state.isProcessing} + /> + +
+
+
+
+ {t( + "Upload a geometry problem image for visual analysis and GeoGebra visualization.", + )} +
+
+
+ + {/* Right Panel: Analysis Details */} +
+ {/* Header */} +
+
+ + {t("Analysis Details")} +
+
+ + {/* Analysis Content */} +
+ {state.analysisStages.bbox && ( +
+
+

+ {t("Detected Elements")} +

+
+

+ {t("Elements")}: {state.analysisStages.bbox.elements_count} +

+ {state.analysisStages.bbox.elements?.map((el, i) => ( +
+ {el.label} ({el.type}) +
+ ))} +
+
+ + {state.analysisStages.analysis && ( +
+

+ {t("Geometric Analysis")} +

+
+

+ {t("Constraints")}:{" "} + {state.analysisStages.analysis.constraints_count} +

+

+ {t("Relations")}:{" "} + {state.analysisStages.analysis.relations_count} +

+ {state.analysisStages.analysis.image_is_reference && ( +

+ {t("Image is reference for problem")} +

+ )} +
+
+ )} + + {state.analysisStages.ggbscript && ( +
+

+ {t("GeoGebra Commands")} +

+
+

+ {t("Commands generated")}:{" "} + {state.analysisStages.ggbscript.commands_count} +

+
+
+ )} + + {state.analysisStages.reflection && ( +
+

+ {t("Validation")} +

+
+

+ {t("Issues found")}:{" "} + {state.analysisStages.reflection.issues_count} +

+

+ {t("Final commands")}:{" "} + {state.analysisStages.reflection.commands_count} +

+
+
+ )} +
+ )} + + {!state.analysisStages.bbox && !state.isProcessing && ( +
+ +

+ {t("Upload an image to see analysis details")} +

+
+ )} +
+
+
+ ); +} diff --git a/web/components/CoWriterEditor.tsx b/web/components/CoWriterEditor.tsx index 287c831e..b29b57f7 100644 --- a/web/components/CoWriterEditor.tsx +++ b/web/components/CoWriterEditor.tsx @@ -1522,7 +1522,7 @@ export default function CoWriterEditor({
{/* Selected Text Preview */} -
+
"{selection?.text}"
diff --git a/web/components/Sidebar.tsx b/web/components/Sidebar.tsx index 0fd8fe4a..86b7c444 100644 --- a/web/components/Sidebar.tsx +++ b/web/components/Sidebar.tsx @@ -25,6 +25,7 @@ import { Check, X, LucideIcon, + Eye, } from "lucide-react"; import { useGlobal } from "@/context/GlobalContext"; @@ -46,6 +47,7 @@ const ALL_NAV_ITEMS: Record = { "/notebook": { icon: Book, nameKey: "Notebooks" }, "/question": { icon: PenTool, nameKey: "Question Generator" }, "/solver": { icon: Calculator, nameKey: "Smart Solver" }, + "/vision-solver": { icon: Eye, nameKey: "Vision Solver" }, "/guide": { icon: GraduationCap, nameKey: "Guided Learning" }, "/ideagen": { icon: Lightbulb, nameKey: "IdeaGen" }, "/research": { icon: Microscope, nameKey: "Deep Research" }, diff --git a/web/context/GlobalContext.tsx b/web/context/GlobalContext.tsx index a18840d8..dabe9b16 100644 --- a/web/context/GlobalContext.tsx +++ b/web/context/GlobalContext.tsx @@ -613,6 +613,7 @@ export function GlobalProvider({ children }: { children: React.ReactNode }) { learnResearch: [ "/question", "/solver", + "/vision-solver", "/guide", "/ideagen", "/research", @@ -636,7 +637,25 @@ export function GlobalProvider({ children }: { children: React.ReactNode }) { setSidebarDescriptionState(data.description); } if (data.nav_order) { - setSidebarNavOrderState(data.nav_order); + // Merge saved nav_order with DEFAULT_NAV_ORDER to include new items + const mergedOrder: SidebarNavOrder = { + start: [...data.nav_order.start], + learnResearch: [...data.nav_order.learnResearch], + }; + + // Add any new items from DEFAULT_NAV_ORDER that are not in saved order + DEFAULT_NAV_ORDER.start.forEach((item) => { + if (!mergedOrder.start.includes(item)) { + mergedOrder.start.push(item); + } + }); + DEFAULT_NAV_ORDER.learnResearch.forEach((item) => { + if (!mergedOrder.learnResearch.includes(item)) { + mergedOrder.learnResearch.push(item); + } + }); + + setSidebarNavOrderState(mergedOrder); } } } catch (e) { diff --git 
a/web/context/settings/SidebarContext.tsx b/web/context/settings/SidebarContext.tsx index 1222aead..9527bb2e 100644 --- a/web/context/settings/SidebarContext.tsx +++ b/web/context/settings/SidebarContext.tsx @@ -83,7 +83,25 @@ export function SidebarProvider({ children }: { children: React.ReactNode }) { setSidebarDescriptionState(data.description); } if (data.nav_order) { - setSidebarNavOrderState(data.nav_order); + // Merge saved nav_order with DEFAULT_NAV_ORDER to include new items + const mergedOrder: SidebarNavOrder = { + start: [...data.nav_order.start], + learnResearch: [...data.nav_order.learnResearch], + }; + + // Add any new items from DEFAULT_NAV_ORDER that are not in saved order + DEFAULT_NAV_ORDER.start.forEach((item) => { + if (!mergedOrder.start.includes(item)) { + mergedOrder.start.push(item); + } + }); + DEFAULT_NAV_ORDER.learnResearch.forEach((item) => { + if (!mergedOrder.learnResearch.includes(item)) { + mergedOrder.learnResearch.push(item); + } + }); + + setSidebarNavOrderState(mergedOrder); } } } catch (e) { diff --git a/web/locales/en/app.json b/web/locales/en/app.json index 681fd765..3c47e098 100644 --- a/web/locales/en/app.json +++ b/web/locales/en/app.json @@ -1,7 +1,6 @@ { "language.english": "English", "language.chinese": "中文", - "Start": "Start", "Learn": "Learn", "Research": "Research", @@ -33,6 +32,48 @@ "View progress in the Logs panel": "View progress in the Logs panel", "Select a question to view details": "Select a question to view details", "Smart Solver": "Smart Solver", + "Vision Solver": "Vision Solver", + "Upload a Math Problem Image": "Upload a Math Problem Image", + "Upload an image of a geometry problem and I'll analyze it, recreate the figure in GeoGebra, and help you solve it.": "Upload an image of a geometry problem and I'll analyze it, recreate the figure in GeoGebra, and help you solve it.", + "Upload a geometry problem image for visual analysis and GeoGebra visualization.": "Upload a geometry problem image for visual 
analysis and GeoGebra visualization.", + "Describe the problem or ask a question...": "Describe the problem or ask a question...", + "Figure Reconstruction": "Figure Reconstruction", + "Verified by DeepTutor Vision Engine": "Verified by DeepTutor Vision Engine", + "Analysis Details": "Analysis Details", + "Detected Elements": "Detected Elements", + "Elements": "Elements", + "Geometric Analysis": "Geometric Analysis", + "Constraints": "Constraints", + "Relations": "Relations", + "Image is reference for problem": "Image is reference for problem", + "GeoGebra Commands": "GeoGebra Commands", + "Commands generated": "Commands generated", + "Validation": "Validation", + "Issues found": "Issues found", + "Final commands": "Final commands", + "Upload an image to see analysis details": "Upload an image to see analysis details", + "Analyzing Image": "Analyzing Image", + "Extracting geometric elements and generating visualization...": "Extracting geometric elements and generating visualization...", + "Element Detection": "Element Detection", + "Command Generation": "Command Generation", + "Found": "Found", + "elements": "elements", + "constraints": "constraints", + "relations": "relations", + "commands": "commands", + "No issues found": "No issues found", + "issues fixed": "issues fixed", + "GeoGebra Visualization": "GeoGebra Visualization", + "Loading GeoGebra...": "Loading GeoGebra...", + "Refresh": "Refresh", + "Maximize": "Maximize", + "Minimize": "Minimize", + "Please select an image file": "Please select an image file", + "Image size must be less than 10MB": "Image size must be less than 10MB", + "Enter image URL": "Enter image URL", + "Load": "Load", + "Failed to load image from URL": "Failed to load image from URL", + "Upload image (drag, paste, or click)": "Upload image (drag, paste, or click)", "IdeaGen": "IdeaGen", "Deep Research": "Deep Research", "Welcome to Deep Research Lab. 
\n\nPlease configure your settings above, then enter a research topic below.": "Welcome to Deep Research Lab. \n\nPlease configure your settings above, then enter a research topic below.", @@ -58,7 +99,6 @@ "Knowledge Base Name": "Knowledge Base Name", "Notebooks": "Notebooks", "Settings": "Settings", - "Loading": "Loading...", "Save": "Save", "Cancel": "Cancel", @@ -67,9 +107,7 @@ "Unknown error": "Unknown error", "tokens": "tokens", "View All": "View All", - "Refresh": "Refresh", "Create": "Create", - "Overview": "Overview", "LLM": "LLM", "Embedding": "Embedding", @@ -170,7 +208,6 @@ "Error loading data": "Error loading data", "Failed to load settings": "Failed to load settings", "Failed to connect to backend": "Failed to connect to backend", - "Overview of your recent learning activities": "Overview of your recent learning activities", "Recent Activity": "Recent Activity", "Loading activities...": "Loading activities...", @@ -191,7 +228,6 @@ "Quick Actions": "Quick Actions", "Ask a Question": "Ask a Question", "Generate Quiz": "Generate Quiz", - "Home": "Home", "History": "History", "Welcome to DeepTutor": "Welcome to DeepTutor", @@ -256,7 +292,6 @@ "Save Selected": "Save Selected", "Generate Ideas ({n} items)": "Generate Ideas ({n} items)", "Generate Ideas (Text Only)": "Generate Ideas (Text Only)", - "Chat History": "Chat History", "Solver History": "Solver History", "All Activities": "All Activities", @@ -274,7 +309,6 @@ "Close": "Close", "messages": "messages", "Failed to load session": "Failed to load session", - "Add to Notebook": "Add to Notebook", "Added Successfully!": "Added Successfully!", "Record Preview": "Record Preview", @@ -290,7 +324,6 @@ "{n} notebooks selected": "{n} notebooks selected", "Select at least one notebook": "Select at least one notebook", "Saving...": "Saving...", - "Import from Notebooks": "Import from Notebooks", "Select content from your notebooks to import": "Select content from your notebooks to import", "No notebooks found": 
"No notebooks found", @@ -298,7 +331,6 @@ "No records": "No records", "Selected {n} items": "Selected {n} items", "Import Selected": "Import Selected", - "Backend Service": "Backend Service", "LLM Model": "LLM Model", "Embeddings": "Embeddings", @@ -311,7 +343,6 @@ "Unknown": "Unknown", "Testing...": "Testing...", "Test failed": "Test failed", - "Link Folder": "Link Folder", "Link Local Folder": "Link Local Folder", "Folder Path": "Folder Path", @@ -320,7 +351,6 @@ "New and modified files will be automatically detected when you sync.": "New and modified files will be automatically detected when you sync.", "Folder linked successfully!": "Folder linked successfully!", "Failed to link folder": "Failed to link folder", - "Activity Details": "Activity Details", "Type": "Type", "Final Answer": "Final Answer", @@ -336,7 +366,6 @@ "Report Preview": "Report Preview", "N/A": "N/A", "No question content": "No question content", - "Workspace": "WORKSPACE", "Learn & Research": "Learn & Research", "✨ Your description here": "✨ Your description here", @@ -345,7 +374,6 @@ "View on GitHub": "View on GitHub", "KB": "KB", "Solution image": "Solution image", - "Count:": "Count:", "Initializing": "Initializing", "Generating Search Queries": "Generating Search Queries", @@ -363,7 +391,6 @@ "Waiting for logs...": "Waiting for logs...", "Mimic Exam Mode": "Mimic Exam Mode", "Generating questions based on reference exam paper": "Generating questions based on reference exam paper", - "Learning Assistant": "Learning Assistant", "Have any questions? Feel free to ask...": "Have any questions? Feel free to ask...", "Learning Summary": "Learning Summary", @@ -388,7 +415,6 @@ "Switch to wide sidebar (3:1)": "Switch to wide sidebar (3:1)", "Select a notebook, and the system will generate a personalized learning plan. Through interactive pages and intelligent Q&A, you'll gradually master all the content.": "Select a notebook, and the system will generate a personalized learning plan. 
Through interactive pages and intelligent Q&A, you'll gradually master all the content.", "Select notebook records or describe your research topic": "Select notebook records or describe your research topic", - "Manage and explore your educational content repositories.": "Manage and explore your educational content repositories.", "RAG-Anything": "RAG-Anything", "RAG-Anything (Docling)": "RAG-Anything (Docling)", @@ -447,6 +473,22 @@ "Report Outline": "Report Outline", "View Full Report": "View Full Report", "Parallel Mode": "Parallel Mode", + "AI Edit Assistant": "AI Edit Assistant", + "Close dialog": "Close dialog", + "Instruction (Optional)": "Instruction (Optional)", + "e.g. Make it more formal...": "e.g. Make it more formal...", + "Context Source (Optional)": "Context Source (Optional)", + "Web": "Web", + "Rewrite": "Rewrite", + "Shorten": "Shorten", + "Expand": "Expand", + "AI Mark": "AI Mark", + "Processing...": "Processing...", + "Apply": "Apply", + "Live Preview · Synced Scroll": "Live Preview · Synced Scroll", + "Show AI Marks": "Show AI Marks", + "Hide AI Marks": "Hide AI Marks", + "Preview": "Preview", "Process": "Process", "Report": "Report", "Ready to Research": "Ready to Research", @@ -503,5 +545,20 @@ "Extended Aspects": "Extended Aspects", "Reasoning": "Reasoning", "e.g. 2211asm1": "e.g. 
2211asm1", - "Please upload a PDF exam paper": "Please upload a PDF exam paper" + "Please upload a PDF exam paper": "Please upload a PDF exam paper", + "Podcast Narration": "Podcast Narration", + "Click to expand": "Click to expand", + "Script only (TTS not configured)": "Script only (TTS not configured)", + "Current note is empty, cannot generate narration.": "Current note is empty, cannot generate narration.", + "Failed to generate narration, please try again.": "Failed to generate narration, please try again.", + "Generating Podcast": "Generating Podcast", + "Generate Podcast": "Generate Podcast", + "Save Podcast to Notebook": "Save Podcast to Notebook", + "After generation, the narration script will appear here.": "After generation, the narration script will appear here.", + "Key Points": "Key Points", + "After generation, 3-5 key points will be listed here.": "After generation, 3-5 key points will be listed here.", + "Podcast Audio": "Podcast Audio", + "Your browser does not support the audio element.": "Your browser does not support the audio element.", + "TTS not configured, script generation only.": "TTS not configured, script generation only.", + "After generation, you can play the podcast audio here.": "After generation, you can play the podcast audio here." 
} diff --git a/web/locales/zh/app.json b/web/locales/zh/app.json index ad0ab052..286a36ad 100644 --- a/web/locales/zh/app.json +++ b/web/locales/zh/app.json @@ -1,7 +1,6 @@ { "language.english": "English", "language.chinese": "中文", - "Start": "开始", "Learn": "学习", "Research": "研究", @@ -33,6 +32,48 @@ "View progress in the Logs panel": "可在日志面板查看进度", "Select a question to view details": "请选择一道题查看详情", "Smart Solver": "智能解题", + "Vision Solver": "视觉解题", + "Upload a Math Problem Image": "上传数学题目图片", + "Upload an image of a geometry problem and I'll analyze it, recreate the figure in GeoGebra, and help you solve it.": "上传几何题目图片,我会分析它、用 GeoGebra 重绘配图,并帮助你解答。", + "Upload a geometry problem image for visual analysis and GeoGebra visualization.": "上传几何题目图片进行视觉分析和 GeoGebra 可视化。", + "Describe the problem or ask a question...": "描述问题或提问...", + "Figure Reconstruction": "配图还原", + "Verified by DeepTutor Vision Engine": "已通过 DeepTutor 视觉引擎验证", + "Analysis Details": "分析详情", + "Detected Elements": "检测到的元素", + "Elements": "元素", + "Geometric Analysis": "几何分析", + "Constraints": "约束条件", + "Relations": "几何关系", + "Image is reference for problem": "图片是题目的参考", + "GeoGebra Commands": "GeoGebra 命令", + "Commands generated": "生成的命令", + "Validation": "验证", + "Issues found": "发现的问题", + "Final commands": "最终命令", + "Upload an image to see analysis details": "上传图片以查看分析详情", + "Analyzing Image": "正在分析图片", + "Extracting geometric elements and generating visualization...": "正在提取几何元素并生成可视化...", + "Element Detection": "元素检测", + "Command Generation": "命令生成", + "Found": "发现", + "elements": "个元素", + "constraints": "个约束", + "relations": "个关系", + "commands": "条命令", + "No issues found": "未发现问题", + "issues fixed": "个问题已修复", + "GeoGebra Visualization": "GeoGebra 可视化", + "Loading GeoGebra...": "正在加载 GeoGebra...", + "Refresh": "刷新", + "Maximize": "最大化", + "Minimize": "最小化", + "Please select an image file": "请选择图片文件", + "Image size must be less than 10MB": "图片大小不能超过 10MB", + "Enter image URL": "输入图片 URL", + "Load": 
"加载", + "Failed to load image from URL": "从 URL 加载图片失败", + "Upload image (drag, paste, or click)": "上传图片(拖拽、粘贴或点击)", "IdeaGen": "创意生成", "Deep Research": "深度研究", "Welcome to Deep Research Lab. \n\nPlease configure your settings above, then enter a research topic below.": "欢迎来到深度研究实验室。\n\n请先在上方配置设置,然后在下方输入研究主题。", @@ -58,7 +99,6 @@ "Knowledge Base Name": "知识库名称", "Notebooks": "笔记本", "Settings": "设置", - "Loading": "加载中...", "Save": "保存", "Cancel": "取消", @@ -67,9 +107,7 @@ "Unknown error": "未知错误", "tokens": "tokens", "View All": "查看全部", - "Refresh": "刷新", "Create": "创建", - "Overview": "概览", "LLM": "LLM", "Embedding": "Embedding", @@ -170,7 +208,6 @@ "Error loading data": "加载数据出错", "Failed to load settings": "加载设置失败", "Failed to connect to backend": "连接后端失败", - "Overview of your recent learning activities": "您最近的学习活动概览", "Recent Activity": "最近活动", "Loading activities...": "加载活动中...", @@ -191,7 +228,6 @@ "Quick Actions": "快捷操作", "Ask a Question": "提问问题", "Generate Quiz": "生成测验", - "Home": "首页", "History": "历史记录", "Welcome to DeepTutor": "欢迎使用 DeepTutor", @@ -256,7 +292,6 @@ "Save Selected": "保存已选", "Generate Ideas ({n} items)": "生成创意({n} 条)", "Generate Ideas (Text Only)": "生成创意(仅文字)", - "Chat History": "聊天历史", "Solver History": "解题历史", "All Activities": "所有活动", @@ -274,7 +309,6 @@ "Close": "关闭", "messages": "条消息", "Failed to load session": "加载会话失败", - "Add to Notebook": "保存到笔记本", "Added Successfully!": "保存成功!", "Record Preview": "记录预览", @@ -290,7 +324,6 @@ "{n} notebooks selected": "已选择 {n} 个笔记本", "Select at least one notebook": "请至少选择一个笔记本", "Saving...": "保存中…", - "Import from Notebooks": "从笔记本导入", "Select content from your notebooks to import": "从你的笔记本中选择内容导入", "No notebooks found": "未找到笔记本", @@ -298,7 +331,6 @@ "No records": "暂无记录", "Selected {n} items": "已选择 {n} 项", "Import Selected": "导入已选", - "Backend Service": "后端服务", "LLM Model": "LLM 模型", "Embeddings": "向量模型", @@ -311,7 +343,6 @@ "Unknown": "未知", "Testing...": "测试中…", "Test failed": "测试失败", - "Link Folder": 
"关联文件夹", "Link Local Folder": "关联本地文件夹", "Folder Path": "文件夹路径", @@ -320,7 +351,6 @@ "New and modified files will be automatically detected when you sync.": "同步时将自动检测新增与修改的文件。", "Folder linked successfully!": "文件夹关联成功!", "Failed to link folder": "关联文件夹失败", - "Activity Details": "活动详情", "Type": "类型", "Final Answer": "最终答案", @@ -336,7 +366,6 @@ "Report Preview": "报告预览", "N/A": "无", "No question content": "无题目内容", - "Workspace": "工作区", "Learn & Research": "学习与研究", "✨ Your description here": "✨ 在此输入你的描述", @@ -345,7 +374,6 @@ "View on GitHub": "在 GitHub 查看", "KB": "知识库", "Solution image": "解题图片", - "Count:": "数量:", "Initializing": "初始化中", "Generating Search Queries": "生成搜索查询", @@ -363,7 +391,6 @@ "Waiting for logs...": "等待日志…", "Mimic Exam Mode": "仿真试卷模式", "Generating questions based on reference exam paper": "正在基于参考试卷生成题目", - "Learning Assistant": "学习助手", "Have any questions? Feel free to ask...": "有问题吗?欢迎随时提问…", "Learning Summary": "学习总结", @@ -388,7 +415,6 @@ "Switch to wide sidebar (3:1)": "切换为宽侧栏(3:1)", "Select a notebook, and the system will generate a personalized learning plan. Through interactive pages and intelligent Q&A, you'll gradually master all the content.": "请选择一个笔记本,系统将生成个性化学习计划。通过交互页面与智能问答,你将逐步掌握全部内容。", "Select notebook records or describe your research topic": "选择笔记记录或直接描述你的研究主题", - "Manage and explore your educational content repositories.": "管理与探索你的教育内容仓库。", "RAG-Anything": "RAG-Anything", "RAG-Anything (Docling)": "RAG-Anything (Docling)", @@ -447,6 +473,22 @@ "Report Outline": "报告大纲", "View Full Report": "查看完整报告", "Parallel Mode": "并行模式", + "AI Edit Assistant": "AI 编辑助手", + "Close dialog": "关闭对话框", + "Instruction (Optional)": "指令(可选)", + "e.g. 
Make it more formal...": "例如:使其更正式...", + "Context Source (Optional)": "上下文来源(可选)", + "Web": "网络", + "Rewrite": "重写", + "Shorten": "缩短", + "Expand": "扩展", + "AI Mark": "AI 批改", + "Processing...": "处理中...", + "Apply": "应用", + "Live Preview · Synced Scroll": "实时预览 · 同步滚动", + "Show AI Marks": "显示 AI 批注", + "Hide AI Marks": "隐藏 AI 批注", + "Preview": "预览", "Process": "过程", "Report": "报告", "Ready to Research": "准备开始研究", @@ -503,5 +545,20 @@ "Extended Aspects": "扩展维度", "Reasoning": "推理", "e.g. 2211asm1": "例如:2211asm1", - "Please upload a PDF exam paper": "请上传 PDF 试卷" + "Please upload a PDF exam paper": "请上传 PDF 试卷", + "Podcast Narration": "播客/旁白", + "Click to expand": "点击展开", + "Script only (TTS not configured)": "仅脚本(未配置 TTS)", + "Current note is empty, cannot generate narration.": "当前笔记为空,无法生成旁白。", + "Failed to generate narration, please try again.": "生成旁白失败,请重试。", + "Generating Podcast": "正在生成播客", + "Generate Podcast": "生成播客", + "Save Podcast to Notebook": "保存播客到笔记本", + "After generation, the narration script will appear here.": "生成后,旁白脚本将显示在这里。", + "Key Points": "关键点", + "After generation, 3-5 key points will be listed here.": "生成后,3-5 个关键点将列在这里。", + "Podcast Audio": "播客音频", + "Your browser does not support the audio element.": "您的浏览器不支持音频元素。", + "TTS not configured, script generation only.": "TTS 未配置,仅生成脚本。", + "After generation, you can play the podcast audio here.": "生成后,您可以在这里播放播客音频。" } diff --git a/web/package-lock.json b/web/package-lock.json index 29ea9e2b..ff908d71 100644 --- a/web/package-lock.json +++ b/web/package-lock.json @@ -4758,7 +4758,6 @@ "version": "2.3.2", "resolved": "https://registry.npmjs.org/fsevents/-/fsevents-2.3.2.tgz", "integrity": "sha512-xiqMQR4xAeHTuB9uWm+fFRcIOgKBMiOBP+eXiyT7jsgVCq1bkVygt00oASowB7EdtpOHaaPgKt812P9ab+DDKA==", - "dev": true, "hasInstallScript": true, "license": "MIT", "optional": true, diff --git a/web/package.json b/web/package.json index 96d0781d..b3ef0562 100644 --- a/web/package.json +++ b/web/package.json @@ -3,7 +3,7 
@@ "version": "0.2.0", "private": true, "scripts": { - "dev": "next dev", + "dev": "next dev --turbopack", "build": "next build", "start": "next start", "lint": "eslint .", diff --git a/web/types/sidebar.ts b/web/types/sidebar.ts index 51b97281..bdf50bd4 100644 --- a/web/types/sidebar.ts +++ b/web/types/sidebar.ts @@ -29,6 +29,7 @@ export const DEFAULT_NAV_ORDER: SidebarNavOrder = { learnResearch: [ "/question", "/solver", + "/vision-solver", "/guide", "/ideagen", "/research",
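The `SidebarContext.tsx` hunk above merges a user's persisted sidebar order with `DEFAULT_NAV_ORDER`, so routes added in this PR (such as `/vision-solver`) appear for existing users without discarding their saved ordering. A minimal standalone sketch of that merge follows; the concrete `start` entries and the exact default `learnResearch` list here are illustrative (only `/question`, `/solver`, `/vision-solver`, and `/guide` are visible in the diff), and `mergeNavOrder` is a hypothetical helper name, not a function in the codebase:

```typescript
// Shape matching web/types/sidebar.ts
type SidebarNavOrder = { start: string[]; learnResearch: string[] };

// Illustrative defaults; real values live in DEFAULT_NAV_ORDER in sidebar.ts
const DEFAULT_NAV_ORDER: SidebarNavOrder = {
  start: ["/home"],
  learnResearch: ["/question", "/solver", "/vision-solver", "/guide"],
};

// Keep the user's saved ordering, then append any defaults they don't
// have yet (e.g. routes introduced after their order was saved).
function mergeNavOrder(saved: SidebarNavOrder): SidebarNavOrder {
  const merged: SidebarNavOrder = {
    start: [...saved.start],
    learnResearch: [...saved.learnResearch],
  };
  for (const item of DEFAULT_NAV_ORDER.start) {
    if (!merged.start.includes(item)) merged.start.push(item);
  }
  for (const item of DEFAULT_NAV_ORDER.learnResearch) {
    if (!merged.learnResearch.includes(item)) merged.learnResearch.push(item);
  }
  return merged;
}

// A saved order that predates /vision-solver gets it appended at the end:
const saved: SidebarNavOrder = {
  start: ["/home"],
  learnResearch: ["/solver", "/question", "/guide"],
};
console.log(mergeNavOrder(saved).learnResearch);
// → ["/solver", "/question", "/guide", "/vision-solver"]
```

Note the append-only design: new defaults are added at the tail rather than at their default position, which preserves the user's relative ordering at the cost of placing new items last; stale saved entries (routes later removed from the defaults) are also retained.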