59 changes: 59 additions & 0 deletions math-agent/README.md
@@ -0,0 +1,59 @@
# MathAgent — AI Math Reasoning Engine

LLM-as-Translator + SymPy-as-Solver architecture for verified mathematical computation.

## Architecture

```
User Question → LLM (Translator) → Python/SymPy Code → Python Subprocess → Computed Answer
Computed Answers (N paths) → Majority Voting → Final Answer
```

**Core Principle:** The LLM never "solves" the math itself; it only *translates* the question into SymPy code. Python/SymPy performs the actual computation, so each answer is computed by a CAS rather than guessed by the model.

## Key Components

| File | Purpose |
|------|---------|
| `server/sympyExecutor.ts` | Python subprocess executor with 15s timeout and basic sandboxing |
| `server/mathSolver.ts` | Main pipeline: code generation → execution → majority voting |
| `server/vllmClient.ts` | LLM client: uses vLLM (hosted on Vast.ai) when available and falls back to the Forge API |
| `server/routers.ts` | tRPC API endpoints (solve, history, getResult, modelStatus) |
| `server/db.ts` | Database helpers (Drizzle ORM + MySQL) |
| `drizzle/schema.ts` | Database schema (problems, solution_paths, users) |
| `client/src/pages/Home.tsx` | Main solver UI with code display and path visualization |
| `client/src/pages/Architecture.tsx` | Architecture explanation page |
| `client/src/pages/ProblemDetail.tsx` | Detailed view of a solved problem |
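
As a rough illustration of the executor row above, the sketch below runs a snippet in a child process with a 15-second timeout and maps the outcome to `success`, `error`, or `timeout`. It is not the code in `server/sympyExecutor.ts`: the names `runSnippet` and `ExecutionResult` are hypothetical, no sandboxing is attempted, treating empty stdout as an error is an assumption, and the blocking `spawnSync` is used only for brevity where a server would more likely use an asynchronous spawn.

```typescript
import { spawnSync } from "node:child_process";

type ExecutionStatus = "success" | "error" | "timeout";

// Hypothetical result shape; the real executor may return more detail.
interface ExecutionResult {
  status: ExecutionStatus;
  output: string;
}

// Hypothetical helper: pipes `source` to the interpreter's stdin and waits
// at most `timeoutMs` for it to finish.
function runSnippet(
  source: string,
  interpreter: string = "python3",
  timeoutMs: number = 15_000,
): ExecutionResult {
  // "-" asks the interpreter to read the program from stdin.
  const proc = spawnSync(interpreter, ["-"], {
    input: source,
    encoding: "utf8",
    timeout: timeoutMs,
  });

  if (proc.error) {
    // Node reports an expired timeout as an error with code ETIMEDOUT;
    // anything else (e.g. a missing binary) is a plain error.
    const errno = (proc.error as Error & { code?: string }).code;
    return { status: errno === "ETIMEDOUT" ? "timeout" : "error", output: "" };
  }
  if (proc.status !== 0) {
    return { status: "error", output: (proc.stderr ?? "").trim() };
  }

  const stdout = (proc.stdout ?? "").trim();
  // Assumption: a snippet that prints nothing has produced no answer.
  if (stdout === "") {
    return { status: "error", output: "" };
  }
  return { status: "success", output: stdout };
}
```

With Python and SymPy installed, `runSnippet("from sympy import factorial\nprint(factorial(5))")` would be a typical call.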

## Pipeline Flow

1. **Translate** — LLM generates N independent Python/SymPy code snippets (temperature variation)
2. **Execute** — Each snippet runs in a Python subprocess with SymPy, capturing stdout
3. **Vote** — Majority voting on execution outputs determines the final answer
4. **Confidence** — Agreement ratio = (paths with majority answer) / (total paths)
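
The voting and confidence steps above can be sketched as a small pure function. This is a minimal illustration, not the logic in `server/mathSolver.ts`: the names `PathResult`, `normalize`, and `majorityVote` are hypothetical, the whitespace-only normalization is an assumption, and ties are broken here by whichever answer appeared first.

```typescript
type ExecutionStatus = "success" | "error" | "timeout";

// Hypothetical shape of one executed path.
interface PathResult {
  status: ExecutionStatus;
  output: string; // captured stdout
}

interface VoteResult {
  answer: string | null; // null when no path produced a usable answer
  confidence: number;    // (paths with majority answer) / (total paths)
}

// Assumed normalization: trim and collapse runs of whitespace.
function normalize(output: string): string {
  return output.trim().replace(/\s+/g, " ");
}

function majorityVote(paths: PathResult[]): VoteResult {
  const counts = new Map<string, number>();
  for (const path of paths) {
    // Failed or timed-out paths never vote.
    if (path.status !== "success") continue;
    const key = normalize(path.output);
    if (key === "") continue;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }

  let answer: string | null = null;
  let best = 0;
  // Map iterates in insertion order, so the earliest answer wins a tie.
  counts.forEach((count, key) => {
    if (count > best) {
      answer = key;
      best = count;
    }
  });

  // Failed paths still count in the denominator, which lowers confidence.
  const confidence = paths.length === 0 ? 0 : best / paths.length;
  return { answer, confidence };
}
```

Dividing by the total number of paths rather than by the successful ones follows the agreement ratio as stated in step 4, so a run where most snippets crash reports low confidence even if the survivors agree.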

## Database Schema

### `solution_paths` table (new fields)

- `generatedCode` — The Python/SymPy code generated by the LLM
- `executionOutput` — Captured stdout from Python execution
- `executionStatus` — `success` | `error` | `timeout`

Legacy fields (`reasoningSteps`, `verificationCode`, `verificationOutput`) are kept for backward compatibility.
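
For orientation, the three new fields could be modeled in application code roughly as follows. This is a sketch under assumptions, not the definition in `drizzle/schema.ts`: the type and function names are hypothetical, and the column types and nullability in the real schema may differ.

```typescript
type ExecutionStatus = "success" | "error" | "timeout";

// Hypothetical view of the new `solution_paths` columns.
interface SolutionPathExecution {
  generatedCode: string;            // Python/SymPy code produced by the LLM
  executionOutput: string;          // captured stdout
  executionStatus: ExecutionStatus; // outcome of the subprocess run
}

// Narrows an arbitrary string (e.g. a raw DB value) to ExecutionStatus.
function isExecutionStatus(value: string): value is ExecutionStatus {
  return value === "success" || value === "error" || value === "timeout";
}

// Counts paths per status, e.g. for a per-problem summary in the UI.
function countByStatus(
  paths: SolutionPathExecution[],
): Record<ExecutionStatus, number> {
  const counts: Record<ExecutionStatus, number> = {
    success: 0,
    error: 0,
    timeout: 0,
  };
  for (const path of paths) {
    counts[path.executionStatus] += 1;
  }
  return counts;
}
```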

## Tests

31 passing Vitest tests covering:
- Router input validation (solve, history, getResult)
- SymPy executor (real Python execution: solve, factorial, error handling, no-output detection)
- `extractAnswer` utility
- Majority voting logic (unanimous, tie, error exclusion, normalization)

## Tech Stack

- **Frontend:** React 19 + Tailwind CSS 4 + shadcn/ui
- **Backend:** Express + tRPC + Drizzle ORM + MySQL
- **LLM:** Forge API (free) with optional vLLM on Vast.ai
- **Compute:** SymPy (Python CAS) via subprocess
41 changes: 41 additions & 0 deletions math-agent/client/src/App.tsx
@@ -0,0 +1,41 @@
import { Toaster } from "@/components/ui/sonner";
import { TooltipProvider } from "@/components/ui/tooltip";
import NotFound from "@/pages/NotFound";
import { Route, Switch } from "wouter";
import ErrorBoundary from "./components/ErrorBoundary";
import { ThemeProvider } from "./contexts/ThemeContext";
import DashboardLayout from "./components/DashboardLayout";
import Home from "./pages/Home";
import History from "./pages/History";
import Architecture from "./pages/Architecture";
import ProblemDetail from "./pages/ProblemDetail";

function Router() {
  return (
    <DashboardLayout>
      <Switch>
        <Route path={"/"} component={Home} />
        <Route path={"/history"} component={History} />
        <Route path={"/architecture"} component={Architecture} />
        <Route path={"/problem/:id"} component={ProblemDetail} />
        <Route path={"/404"} component={NotFound} />
        <Route component={NotFound} />
      </Switch>
    </DashboardLayout>
  );
}

function App() {
  return (
    <ErrorBoundary>
      <ThemeProvider defaultTheme="dark">
        <TooltipProvider>
          <Toaster />
          <Router />
        </TooltipProvider>
      </ThemeProvider>
    </ErrorBoundary>
  );
}

export default App;