A production-ready demonstration of reasoning-first, memory-aware AI agents
This Next.js application showcases advanced AI agent patterns including semantic memory, pronoun resolution, and transparent reasoning. Unlike typical chatbots, this agent thinks before answering by explicitly planning, deciding whether past context is needed, resolving pronouns correctly, and reasoning transparently in real time.
This project solves four critical problems in AI agent development:
Problem: Users ask similar questions and the agent acts like it's never seen them before.
Solution: Two-step memory process that distinguishes between:
- Memory Existence (UX) - "Have I seen this question before?"
- Memory Dependency (Reasoning) - "Do I need prior context to answer?"
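The separation can be sketched as two independent booleans. This is an illustrative sketch only; `MemoryCheck` and `decideMemoryUse` are hypothetical names, not the app's actual API:

```typescript
// Hypothetical sketch: the two memory decisions kept separate.
interface MemoryCheck {
  exists: boolean   // UX: "Have I seen this question before?"
  required: boolean // Reasoning: "Do I need prior context to answer?"
}

function decideMemoryUse(seenBefore: boolean, hasUnresolvedReference: boolean): MemoryCheck {
  // Existence is a lookup result; dependency is a separate judgment.
  // A repeated question ("What is 2 + 2?") exists in memory but needs no context.
  return { exists: seenBefore, required: hasUnresolvedReference }
}

console.log(decideMemoryUse(true, false)) // repeated math question
console.log(decideMemoryUse(true, true))  // follow-up like "How old is he?"
```

The point of keeping the two flags apart is that the UI can say "you've asked this before" without forcing the reasoning step to load context it doesn't need.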
Problem: User asks "Who is Tim Cook?" then follows with "How old is he?" and agent says "I don't know who 'he' is."
Solution: Mandatory pronoun resolution that automatically resolves references using conversation history, ensuring pronouns always resolve to the most recent relevant entity.
Problem: "Who is George Washington's wife?" doesn't match "Who is George Washington's spouse?" because text search can't understand synonyms.
Solution: Vector database (pgvector) with OpenAI embeddings for semantic search. Questions are converted to 1,536-dimensional vectors where "wife" and "spouse" are mathematically close. Achieves 87.7% similarity match where text search would fail completely.
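A toy illustration of the underlying math, using 3-dimensional stand-ins for the real 1,536-dimensional embeddings (the vectors below are made up for demonstration):

```typescript
// Why "wife" and "spouse" can match: embeddings are compared with cosine
// similarity, and semantically close texts get nearby vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

const wife = [0.82, 0.41, 0.12]    // toy embedding for a "...wife?" question
const spouse = [0.79, 0.47, 0.15]  // toy embedding for a "...spouse?" question
const capital = [0.05, 0.12, 0.91] // toy embedding for an unrelated question

console.log(cosineSimilarity(wife, spouse) > cosineSimilarity(wife, capital)) // → true
```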
Problem: Users don't understand what the agent is doing or why it's slow.
Solution: Streaming reasoning timeline with user-facing steps showing exactly what the agent is thinking.
The system implements a single-agent design with explicit steps:
Planning → Memory Check → Dependency Decision → Pronoun Resolution → Reasoning Loop → Answer
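One way to picture the pipeline is as a fold over explicit step functions. The signatures and step bodies below are illustrative assumptions, not the real orchestrator's API:

```typescript
// Sketch: each step reads the accumulated state and adds its own result.
type Step = (state: Record<string, unknown>) => Record<string, unknown>

const pipeline: Step[] = [
  s => ({ ...s, plan: "answer factual question" }),       // Planning
  s => ({ ...s, memoryExists: false }),                   // Memory Check
  s => ({ ...s, memoryRequired: false }),                 // Dependency Decision
  s => ({ ...s, resolvedQuestion: s.question }),          // Pronoun Resolution
  s => ({ ...s, thoughts: ["no prior context needed"] }), // Reasoning Loop
  s => ({ ...s, answer: "4" }),                           // Answer
]

const result = pipeline.reduce(
  (state, step) => step(state),
  { question: "What is 2 + 2?" } as Record<string, unknown>,
)
console.log(Object.keys(result))
```

Making each step an explicit function is also what allows the reasoning timeline to stream: every step boundary is a natural point to emit a status update.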
- Server-first architecture with Server Components and Server Actions
- Streaming architecture for real-time updates
- Vector database (pgvector) with OpenAI embeddings for semantic search
- Hybrid search strategy - vector similarity (primary) + text search (fallback)
- Intelligent caching - 90% cost reduction on duplicate questions via semantic matching
- Validation everywhere using Zod (env vars, inputs, outputs, LLM responses)
- PostgreSQL-only database access via Drizzle ORM
- Better Auth for authentication (Google + GitHub OAuth)
- Transparent reasoning visible to users in real-time with search analytics
- Optional session password protection for demo deployments to prevent API abuse
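The "validation everywhere" idea above can be sketched as a fail-fast env loader. The app uses Zod for this (in `@/lib/env`); the hand-rolled check below illustrates the same pattern without dependencies, and the field rules are assumptions based on the setup section:

```typescript
// Sketch of fail-fast env validation: parse once at startup, export a typed object.
interface Env {
  DATABASE_URL: string
  AUTH_SECRET: string
  OPENAI_API_KEY: string
}

function loadEnv(source: Record<string, string | undefined>): Env {
  const { DATABASE_URL, AUTH_SECRET, OPENAI_API_KEY } = source
  if (!DATABASE_URL || !DATABASE_URL.startsWith("postgresql://")) throw new Error("Invalid DATABASE_URL")
  if (!AUTH_SECRET || AUTH_SECRET.length < 32) throw new Error("AUTH_SECRET must be >= 32 chars")
  if (!OPENAI_API_KEY || !OPENAI_API_KEY.startsWith("sk-")) throw new Error("Invalid OPENAI_API_KEY")
  return { DATABASE_URL, AUTH_SECRET, OPENAI_API_KEY }
}
```

App code then imports the validated result instead of reading `process.env` directly, so a misconfigured deployment fails at boot rather than mid-request.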
| Layer | Technology | Version |
|---|---|---|
| Framework | Next.js | 16.1.3+ |
| Runtime | React | 19.2.3 |
| Database | Supabase PostgreSQL + pgvector | - |
| ORM | Drizzle | 0.45.1+ |
| Auth | Better Auth | 1.4.15+ |
| AI | OpenAI (gpt-4o + embeddings) | 6.16.0+ |
| Vector Search | pgvector | Latest |
| Validation | Zod | 4.3.5+ |
| UI | shadcn/ui | Latest |
| Styling | Tailwind CSS | v4 |
| Markdown | react-markdown | 10.1.0+ |
- Node.js 18+
- pnpm
- Supabase account
- OpenAI API key
- Google OAuth credentials (optional)
- GitHub OAuth credentials (optional)
1. Clone the repository

   ```bash
   git clone <your-repo-url>
   cd ai-agent-demo
   ```

2. Install dependencies

   ```bash
   pnpm install
   ```

3. Set up environment variables

   Create a `.env` file in the root directory:

   ```bash
   # Database Configuration
   DATABASE_URL=postgresql://postgres:[YOUR-PASSWORD]@db.[YOUR-PROJECT-ID].supabase.co:6543/postgres
   DATABASE_SCHEMA=ai_agent

   # Authentication (min 32 characters)
   AUTH_SECRET=generate-a-secure-random-string-at-least-32-chars
   BETTER_AUTH_URL=http://localhost:3000

   # Google OAuth (https://console.developers.google.com/)
   GOOGLE_CLIENT_ID=your-real-google-client-id
   GOOGLE_CLIENT_SECRET=your-real-google-client-secret

   # GitHub OAuth (https://github.com/settings/developers)
   GITHUB_CLIENT_ID=your-real-github-client-id
   GITHUB_CLIENT_SECRET=your-real-github-client-secret

   # OpenAI
   OPENAI_API_KEY=sk-your-real-openai-api-key

   # Session Password Protection (Optional - for demo deployment)
   REQUIRE_SESSION_PASSWORD=false
   SESSION_PASSWORD=whoami
   ```

4. Set up the database

   Run the schema creation script in your Supabase SQL Editor:

   ```sql
   CREATE SCHEMA IF NOT EXISTS ai_agent;
   GRANT USAGE ON SCHEMA ai_agent TO authenticated;
   GRANT CREATE ON SCHEMA ai_agent TO authenticated;
   ALTER ROLE authenticated SET search_path TO ai_agent, public;
   ```

5. Enable the pgvector extension (for semantic search)

   Run in the Supabase SQL Editor:

   ```sql
   CREATE EXTENSION IF NOT EXISTS vector;
   ```

6. Run database migrations

   ```bash
   pnpm db:generate
   pnpm db:migrate
   ```

7. (Optional) Backfill existing questions with embeddings

   ```bash
   pnpm backfill:embeddings
   ```

8. Configure OAuth callback URLs

   For local development:

   - Google: add `http://localhost:3000/api/auth/callback/google` to the authorized redirect URIs
   - GitHub: set the callback URL to `http://localhost:3000/api/auth/callback/github`

9. Start the development server

   ```bash
   pnpm dev
   ```

10. Visit http://localhost:3000
```bash
pnpm dev                  # Start dev server
pnpm build                # Build for production
pnpm start                # Start production server
pnpm db:generate          # Generate Drizzle migrations
pnpm db:migrate           # Apply database migrations
pnpm db:test              # Test database connection
pnpm db:reset             # Reset database (DEV ONLY - destructive!)
pnpm backfill:embeddings  # Generate embeddings for existing questions
```

The repository includes 20 comprehensive test scenarios in `/docs/002.app.docs/sample-questions.md`. Access them in the UI by clicking the "Sample Questions" link above the input box.
Basic Pronoun Resolution:
- "Who is Tim Cook?"
- "How old is he?"
- "Where was he born?"
Vector Search - Semantic Similarity:
- "Who is Bill Clinton's wife?"
- "Who is Bill Clinton's spouse?" (should show 87%+ similarity)
- "Who is he married to?" (pronoun resolution + semantic search)
Vector Search - Synonym Detection:
- "How big is Yosemite National Park?"
- "How large is Yosemite National Park?" (should show 95%+ similarity)
- "What is the size of Yosemite?" (paraphrase detection)
Memory Exists But Not Required:
- "What is 2 + 2?"
- "What is 2 + 2?" (recognizes repetition but doesn't fetch memory)
See the full test suite in `/docs/002.app.docs/sample-questions.md`.
```
├── src/
│   ├── agent/                       # Agent implementation
│   │   ├── orchestrator.ts          # Main agent orchestration
│   │   ├── steps/                   # Agent execution steps
│   │   │   ├── memory-existence-check.ts    # Vector + text hybrid search
│   │   │   ├── memory-dependency-decision.ts
│   │   │   ├── reasoning.ts
│   │   │   └── answer.ts
│   │   ├── tools/                   # Agent tools
│   │   │   └── memory-retrieval.ts  # Semantic memory search
│   │   └── validation.ts            # Memory decision validation
│   ├── db/                          # Database layer
│   │   ├── schema/                  # Drizzle schemas (with vector columns)
│   │   └── queries/                 # Database queries
│   │       ├── agent-runs.ts        # Auto-generate embeddings
│   │       └── vector-search.ts     # Vector similarity queries
│   ├── lib/                         # Utilities
│   │   ├── env.ts                   # Validated environment variables
│   │   └── embeddings.ts            # OpenAI embedding generation
│   ├── app/                         # Next.js app directory
│   └── components/                  # React components
│       └── memory/
│           └── search-analytics.tsx # Vector search UI indicator
├── docs/                            # Documentation
│   ├── 001.ai.rules/                # AI governance & development rules
│   ├── 002.app.docs/                # Application documentation
│   │   ├── features/                # Feature documentation
│   │   │   ├── QUICK_START_VECTOR.md
│   │   │   ├── VECTOR_IMPLEMENTATION_SUMMARY.md
│   │   │   └── VECTOR_SEARCH_TEST_SCENARIOS.md (27 test cases)
│   │   └── deployment/              # Deployment guides
│   │       └── ENABLE_PGVECTOR.md
│   ├── 003.screenshots/             # UI screenshots
│   └── 010.sql.scripts/             # Database scripts
├── scripts/
│   └── backfill-embeddings.ts       # Backfill script for existing data
└── drizzle/                         # Generated migrations
```
The system mandatorily resolves pronouns and propagates resolved context through both reasoning and answer generation:

```typescript
// Build resolved context from pronoun resolution
let resolvedContext: string | undefined
if (dependencyDecision.pronounResolution?.resolved) {
  const entities = dependencyDecision.pronounResolution.resolvedEntities
    .map(e => `"${e.pronoun}" refers to "${e.resolvedTo}"`)
    .join(", ")
  resolvedContext = `The following pronouns have been resolved: ${entities}...`
}

// CRITICAL: Pass to BOTH reasoning and answer
const reasoningResult = await performReasoningLoop(
  question, plan, memories, resolvedContext // ← HERE
)
const answerStream = await generateAnswerStream(
  question, plan, memories, thoughts, resolvedContext // ← AND HERE
)
```

The system uses OpenAI embeddings + pgvector for semantic similarity:
- Embedding Generation - each question is converted to a 1,536-dimensional vector using `text-embedding-3-small` ($0.02/1M tokens)
- Vector Search - pgvector finds similar questions using cosine distance
- Hybrid Strategy - Vector similarity (primary) + text search (fallback for records without embeddings)
- Smart Caching - Questions with >= 85% similarity reuse cached answers
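The hybrid strategy can be sketched as a simple fallback chain. `searchByVector` and `searchByText` here are hypothetical stand-ins for the app's real query helpers, passed in as parameters to keep the sketch self-contained:

```typescript
// Sketch: vector search is primary; text search covers records without embeddings.
interface Hit { question: string; similarity?: number; via: "vector" | "text" }

async function hybridSearch(
  query: string,
  searchByVector: (q: string) => Promise<Hit[]>,
  searchByText: (q: string) => Promise<Hit[]>,
): Promise<Hit[]> {
  const vectorHits = await searchByVector(query) // primary: semantic matches
  if (vectorHits.length > 0) return vectorHits
  return searchByText(query)                     // fallback: plain text search
}
```

The fallback matters during migration: rows inserted before pgvector was enabled have no embeddings, so without the text path they would be invisible to memory checks.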
Example: "Who is his wife?" matches "Who is his spouse?" at 87.7% similarity, saving $0.03 by reusing the cached answer instead of calling GPT-4o.
Cost Savings: 90% reduction on duplicate questions (embedding cost: $0.00002 vs GPT-4o cost: $0.01-0.05)
- Memory existence and dependency are separate decisions
- Pronoun resolution is MANDATORY if pronouns detected
- Resolved context MUST propagate to reasoning and answer generation
- All rules are enforced in `/src/agent/validation.ts`
The system implements production-grade semantic search with pgvector:
```typescript
// 1. Embedding generation (automatic on insert)
const embeddingResult = await generateEmbedding(question)
await db.insert(agentRuns).values({
  ...data,
  questionEmbedding: embeddingResult.embedding, // 1536 dimensions
  embeddingModel: "text-embedding-3-small"
})

// 2. Vector similarity search (cosine distance)
const vectorResults = await searchByVector(userId, question, {
  limit: 5,
  similarityThreshold: 0.75 // Adjustable threshold
})

// 3. Answer reuse logic (4 scenarios)
// Scenario 4: Trust high vector similarity (>= 85%)
if (existenceCheck.vectorSimilarityScore >= 0.85) {
  answerToReuse = existenceCheck.existingAnswer // Save $0.03 per reuse!
}
```

UI Integration:
- Purple "Semantic Search" badge for vector results
- Gray "Text Search" badge for fallback
- Similarity scores displayed (e.g., "87.7% match")
- Visual indicators in reasoning timeline
Performance:
- Vector search: ~10-30ms
- Text search fallback: ~50-100ms
- Answer reuse: ~100ms total vs 2-5s for new generation
- 20-50x speedup on duplicate questions
If you're an AI assistant helping with this codebase, please read these documents first:
- `/docs/001.ai.rules/claude_rules.md` - How to interact with this codebase
- `/docs/001.ai.rules/react_performance_contract.md` - React patterns to follow
- `/docs/001.ai.rules/performance_checklist.md` - Performance requirements
- `/docs/001.ai.rules/prompts/master.prompt.v1.md` - Complete implementation reference
- Server Components by default; `"use client"` only when needed
- All mutations via Server Actions
- Zod validation everywhere (no unvalidated boundaries)
- No `process.env` in app code (use `@/lib/env`)
- Better Auth IDs are `text`, not `uuid`
- Never use the `public` schema (use `DATABASE_SCHEMA`)
- ❌ Forgetting to pass `resolvedContext` to reasoning AND answer
- ❌ Using simple substring search instead of multi-strategy fallback
- ❌ Conflating memory existence with memory dependency
- ❌ Not exporting new functions from `index.ts` files
- ❌ State updates during render (use `useEffect`)
This is an interview-level demonstration showcasing:
- Senior Engineering: Clean code, proper patterns, thoughtful architecture
- Director-Level Thinking: Separation of UX concerns from computational needs
- Production Readiness: Validation, error handling, security best practices
- Transparency: User-facing reasoning, not hidden magic
- Vector Database Expertise: pgvector implementation with semantic search
- RAG Implementation: Hybrid search strategy with embeddings
- LLM Cost Optimization: 90% reduction via intelligent caching
The system is designed to be trustworthy, predictable, and reviewable.
- ✅ Hands-on experience with agentic AI frameworks - explicit planning, memory, and reasoning
- ✅ Familiarity with vector databases and RAG - pgvector with OpenAI embeddings, cosine similarity search
- ✅ Deploying and maintaining LLM-integrated systems - production deployment, cost optimization, monitoring
- Overview - Complete system overview
- Setup Guide - Detailed setup instructions
- Sample Questions - 20 test scenarios
- App Summary - High-level summary
- Quick Start Guide - 5-minute setup
- Implementation Summary - Complete overview
- Test Scenarios - 27 test cases
- Enable pgvector - Deployment guide
- AI Rules - AI governance documentation
- All environment variables validated with Zod schemas
- Server-first architecture minimizes client-side attack surface
- OAuth authentication via Better Auth
- PostgreSQL parameterized queries prevent SQL injection
- No sensitive data in logs or client-side code
For public demo deployments, the app supports optional session-based password protection to prevent abuse and excessive API costs:
How it works:
- First-time users are prompted to enter a password before submitting questions
- Password is verified server-side against the `SESSION_PASSWORD` environment variable
- Once authenticated, the session is marked and no further prompts appear
- Authentication persists for the entire user session
Configuration:
For development (no password protection):
REQUIRE_SESSION_PASSWORD=false
# or simply omit both variablesFor production/demo deployment (Vercel, etc.):
REQUIRE_SESSION_PASSWORD=true
SESSION_PASSWORD=your-secure-password-hereSecurity features:
- Password verification happens server-side only (never exposed to client)
- Authentication state stored in database session table (tied to Better Auth session lifecycle)
- One-time verification per session (no repeated prompts)
- When user logs out and logs back in, they must enter password again (new session)
- Feature can be completely disabled via environment variable
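Under those rules, the server-side check might look like the sketch below. The function name, session shape, and env shape are assumptions for illustration, not the app's actual code:

```typescript
// Sketch: one-time, server-side password verification tied to a session flag.
interface Session { passwordVerified: boolean }

function verifySessionPassword(
  session: Session,
  submitted: string,
  env: { REQUIRE_SESSION_PASSWORD: boolean; SESSION_PASSWORD?: string },
): boolean {
  if (!env.REQUIRE_SESSION_PASSWORD) return true // feature disabled entirely
  if (session.passwordVerified) return true      // already verified this session
  if (submitted === env.SESSION_PASSWORD) {
    session.passwordVerified = true              // persist for the session lifetime
    return true
  }
  return false
}
```

Because the check runs only on the server and the flag lives in the session store, the password never reaches the client, and a new session (e.g. after logout) starts unverified again.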
This is a demonstration project. If you're using this as a reference:
- Review the AI governance documents in `/docs/001.ai.rules/`
- Follow the React performance contract
- Maintain the validation-everywhere pattern
- Keep the agent steps explicit and transparent
[Your License Here]
Built with:
- Next.js 16 (App Router)
- OpenAI GPT-4o + text-embedding-3-small
- Supabase PostgreSQL + pgvector
- Better Auth
- shadcn/ui
- Drizzle ORM
Think. Remember. Decide. Search Semantically.
