📈 FinSight AI — Investment RAG Analyzer

Resume-worthy project: an AI-powered investment research chatbot that combines
Retrieval-Augmented Generation (RAG) with NLP, built entirely on free APIs and open-source tools.

🏗️ Architecture Overview

User Query
    │
    ▼
[ FastAPI Backend — main.py ]
    │
    ├─► [ NLP Ticker Extraction — llm_handler.py ]
    │         Uses Groq LLM to identify stocks from natural language
    │
    ├─► [ Financial Data Fetcher — financial_data.py ]
    │         yfinance → income statements, balance sheets, cash flow,
    │         price history, news (100% free, no API key)
    │
    ├─► [ Embedding Manager — embeddings.py ]
    │         sentence-transformers (all-MiniLM-L6-v2) → text embeddings
    │         ChromaDB → local vector storage + semantic retrieval
    │
    └─► [ LLM Handler — llm_handler.py ]
              Groq API (llama3-8b-8192) → generates structured analysis
              with RAG context injected into the prompt

[ FastAPI serves frontend ]
[ Frontend — index.html / style.css / script.js ]
    └─► Chart.js price charts + marked.js markdown rendering

🆓 Free Resources Used (All Free for Personal Use)

| Tool | Purpose | Cost |
| --- | --- | --- |
| yfinance | Live stock data, financials, news | Free, no key |
| Groq API | LLM inference (llama3-8b) | Free (14,400 req/day) |
| sentence-transformers | Local text embeddings | Free, open-source |
| ChromaDB | Local vector database | Free, open-source |
| FastAPI + Uvicorn | Backend server | Free, open-source |
| Chart.js | Price charts (CDN) | Free |
| marked.js | Markdown rendering (CDN) | Free |

📁 File Structure

investment-rag-chatbot/
├── backend/
│   ├── main.py            ← FastAPI app + API routes + frontend serving
│   ├── rag_engine.py      ← RAG orchestrator (the brain)
│   ├── financial_data.py  ← yfinance data fetcher
│   ├── embeddings.py      ← ChromaDB + sentence-transformers
│   └── llm_handler.py     ← Groq LLM + NLP ticker extraction
├── frontend/
│   ├── index.html         ← UI layout
│   ├── style.css          ← Dark investment dashboard theme
│   └── script.js          ← Chat logic, charts, market ticker
├── chroma_db/             ← Auto-created by ChromaDB
├── requirements.txt
├── .env.example
└── README.md

⚙️ Step-by-Step Setup

Step 1 — Prerequisites

Make sure you have:

Python 3.10 or 3.11 (recommended)
pip
Git (optional)

python --version   # Should show 3.10+
pip --version

Step 2 — Get a Free Groq API Key

Go to https://console.groq.com
Sign up (no credit card needed)
Click "API Keys" → "Create API Key"
Copy your key (looks like: gsk_xxxxxxxxxxxxx)

Step 3 — Clone / Download the Project

# If you have the zip, extract it first, then:
cd investment-rag-chatbot

Step 4 — Create a Virtual Environment

# Windows
python -m venv venv
venv\Scripts\activate

# Mac / Linux
python -m venv venv
source venv/bin/activate

Step 5 — Install Dependencies

pip install -r requirements.txt

⏳ The first time the application starts, it downloads the sentence-transformers model (~80 MB). This happens only once.

Step 6 — Configure Environment Variables

# Copy the example file
cp .env.example .env

# Open .env and paste your Groq API key
# .env should look like:
# GROQ_API_KEY=gsk_your_actual_key_here

On Windows (if cp doesn't work):

copy .env.example .env
notepad .env
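
For reference, the values in .env could be read along these lines. This is a minimal sketch, assuming the variable names GROQ_API_KEY, APP_PORT, and CHROMA_PERSIST_DIR that appear elsewhere in this README. The `Settings` class, the `load_settings` function, and the default values are hypothetical and not taken from the project's code, which lists python-dotenv as the way the file gets loaded into the environment.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the values defined in .env."""
    groq_api_key: str
    app_port: int = 8000                     # assumed default, matches the port shown in Step 7
    chroma_persist_dir: str = "./chroma_db"  # assumed default


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping of environment variables.

    Defaults to os.environ; a plain dict can be passed in for testing.
    """
    env = os.environ if env is None else env
    api_key = env.get("GROQ_API_KEY", "").strip()
    if not api_key:
        # Fail early with a clear message instead of at the first LLM call.
        raise RuntimeError("GROQ_API_KEY not found; add it to your .env file")
    return Settings(
        groq_api_key=api_key,
        app_port=int(env.get("APP_PORT", "8000")),
        chroma_persist_dir=env.get("CHROMA_PERSIST_DIR", "./chroma_db"),
    )
```

Validating the key at startup means a missing or empty .env shows up immediately in the terminal rather than as an error on the first chat request.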

Step 7 — Run the Application

cd backend
python main.py

You should see:

INFO:     Started server process [xxxxx]
INFO:     Uvicorn running on http://0.0.0.0:8000

Step 8 — Open the App

Open your browser and go to:
http://localhost:8000

You'll see the FinSight AI dashboard. Try asking:

"Analyze Apple stock"
"Compare Microsoft and Google"
"What is Tesla's P/E ratio and is it overvalued?"

🔗 How the Files Connect (Integration Map)

main.py
 └── imports RAGEngine from rag_engine.py
 └── imports FinancialDataFetcher from financial_data.py
 └── serves ../frontend/ as static files

rag_engine.py
 └── imports FinancialDataFetcher from financial_data.py
 └── imports EmbeddingManager from embeddings.py
 └── imports LLMHandler from llm_handler.py
 └── Orchestrates: fetch → chunk → embed → retrieve → generate

financial_data.py
 └── No internal imports (standalone)
 └── Uses: yfinance, pandas

embeddings.py
 └── No internal imports (standalone)
 └── Uses: sentence-transformers, chromadb

llm_handler.py
 └── No internal imports (standalone)
 └── Uses: groq, python-dotenv

frontend/index.html
 └── loads /static/style.css
 └── loads /static/script.js
 └── calls /api/chat, /api/stock/{ticker}, /api/market, /api/news/{ticker}
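
The "chunk" step in the fetch → chunk → embed → retrieve → generate flow can be pictured as follows. This is a sketch under assumptions: the function name `chunk_financials`, the input shape, and the chunk format are illustrative and not the project's actual code.

```python
def chunk_financials(ticker: str, statements: dict[str, dict[str, float]]) -> list[dict]:
    """Turn nested financial figures into small text chunks ready for embedding.

    `statements` maps a statement name (e.g. "income_statement") to a mapping
    of line item -> value. One chunk is produced per statement so that each
    chunk stays focused on a single topic.
    """
    chunks = []
    for statement_name, items in statements.items():
        lines = [f"{item}: {value:,.0f}" for item, value in items.items()]
        text = f"{ticker} {statement_name.replace('_', ' ')}\n" + "\n".join(lines)
        chunks.append({
            "id": f"{ticker}-{statement_name}",
            "text": text,
            "metadata": {"ticker": ticker, "section": statement_name},
        })
    return chunks


# Made-up numbers, for illustration only.
sample = {
    "income_statement": {"Total Revenue": 1000.0, "Net Income": 250.0},
    "balance_sheet": {"Total Assets": 5000.0},
}
for chunk in chunk_financials("DEMO", sample):
    print(chunk["id"])
    print(chunk["text"])
```

Keeping the ticker and section in each chunk's metadata makes it possible to restrict retrieval to one company when a query names several.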

🌐 Deployment (Free — Render.com)

Deploy to Render (Free Tier — Unlimited hobby projects)

Create account at https://render.com (free, no credit card)

Push to GitHub:

git init
git add .
git commit -m "Initial commit: FinSight AI"
# Create a repo on github.com, then:
git remote add origin https://github.com/YOUR_USERNAME/finsight-ai.git
git push -u origin main

Create a render.yaml in the project root:

services:
  - type: web
    name: finsight-ai
    runtime: python
    rootDir: backend
    buildCommand: pip install -r ../requirements.txt
    startCommand: python main.py
    envVars:
      - key: GROQ_API_KEY
        sync: false   # You'll enter this in Render dashboard
      - key: APP_PORT
        value: "10000"
      - key: CHROMA_PERSIST_DIR
        value: "./chroma_db"

On the Render dashboard:
- Click "New Web Service"
- Connect your GitHub repo
- Set GROQ_API_KEY in Environment Variables
- Click Deploy

Your app will then be live at a URL such as https://finsight-ai.onrender.com (the exact subdomain depends on the service name Render assigns).

⚠️ Note: Render's free tier spins down after 15 minutes of inactivity and wakes on the next request (expect a delay of roughly 30 seconds or more). The filesystem is ephemeral, so ChromaDB data is lost on every deploy and restart, including spin-downs. This is fine for demo purposes.

Alternative: Deploy to Railway

Go to https://railway.app
"New Project" → "Deploy from GitHub"
Add env var: GROQ_API_KEY=your_key
Set start command: cd backend && python main.py
Done. Check Railway's current pricing first, since free usage may be limited to trial credits. ChromaDB data survives restarts and redeploys only if you attach a volume and point CHROMA_PERSIST_DIR at it.

🎯 Features for Your Resume

RAG Pipeline — Fetches real-time financial data, converts to embeddings, retrieves semantically relevant chunks per query
NLP Ticker Extraction — LLM identifies stock symbols from natural language ("Tell me about Apple" → AAPL)
Multi-turn Conversation — Maintains conversation history for contextual follow-up questions
Live Financial Data — Income statements, balance sheets, cash flow, price history, analyst ratings, news
Vector Search — ChromaDB cosine similarity search over financial document chunks
Professional UI — Dark dashboard, animated market ticker, Chart.js price charts, markdown rendering
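
As an illustration of the ticker-extraction feature, one common approach is to ask the LLM for a comma-separated list of symbols and then validate the reply before using it. The sketch below covers only the validation half and makes no LLM call. `parse_tickers`, the reply format, and the "NONE" sentinel are assumptions, not the project's code.

```python
import re

# 1-5 uppercase letters, optionally followed by a class suffix such as ".B".
_TICKER_RE = re.compile(r"[A-Z]{1,5}(?:\.[A-Z])?")


def parse_tickers(llm_reply: str, max_tickers: int = 5) -> list[str]:
    """Extract unique ticker-shaped tokens from a reply such as "AAPL, MSFT".

    Tokens that do not look like a ticker are dropped. This is a shape check
    only: it cannot tell a real symbol from any other short uppercase word.
    """
    tickers = []
    for token in re.split(r"[,\s]+", llm_reply.strip()):
        if token == "NONE":  # assumed sentinel for "no stock mentioned"
            continue
        if _TICKER_RE.fullmatch(token) and token not in tickers:
            tickers.append(token)
    return tickers[:max_tickers]


print(parse_tickers("AAPL, MSFT"))  # ['AAPL', 'MSFT']
print(parse_tickers("NONE"))        # []
```

Because the check is purely syntactic, a production version would likely also confirm each symbol against the data source before fetching financials.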

🧠 Key Concepts Explained

What is RAG?

RAG (Retrieval-Augmented Generation) = fetch relevant data first, then ask the LLM.
Without RAG, the LLM only knows its training data.
With RAG, we fetch live data (for example, Apple's latest earnings), embed it, find the parts most relevant to your question, and send those as context to the LLM. The LLM then answers based on real data rather than on memory alone.
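
The final step, sending retrieved text as context, amounts to assembling a prompt. Here is a minimal sketch, assuming retrieval has already produced a list of text chunks ordered by relevance. `build_rag_prompt`, the character budget, and the prompt wording are hypothetical.

```python
def build_rag_prompt(question: str, chunks: list[str], max_chars: int = 6000) -> str:
    """Combine retrieved chunks and the user's question into one prompt.

    Chunks are added in order (most relevant first) until the character
    budget is reached, a rough guard against overflowing the context window.
    """
    selected, used = [], 0
    for chunk in chunks:
        if used + len(chunk) > max_chars:
            break
        selected.append(chunk)
        used += len(chunk)

    # Number the chunks so the model (and the reader) can refer to them.
    context = "\n\n".join(f"[{i}] {c}" for i, c in enumerate(selected, start=1))
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


print(build_rag_prompt(
    "How did revenue change?",
    ["DEMO income statement: revenue rose year over year.",
     "DEMO news: new product launched."],
))
```

A character budget is only a crude proxy for tokens; with an 8,192-token window it leaves headroom for the conversation history and the model's answer.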

What are Embeddings?

Text converted to numerical vectors where semantically similar text is mathematically close.
Example: "Apple revenue" and "AAPL income" will have similar embeddings, so a question about "Apple's earnings" retrieves both.
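
To make "mathematically close" concrete: closeness is commonly measured with cosine similarity. The sketch below uses hand-made 3-dimensional vectors as stand-ins. Real all-MiniLM-L6-v2 embeddings have 384 dimensions, and the numbers here are invented for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention for zero vectors; avoids dividing by zero
    return dot / (norm_a * norm_b)


# Invented toy vectors, not real model output.
apple_revenue = [0.9, 0.1, 0.0]
aapl_income = [0.8, 0.2, 0.1]
weather_report = [0.0, 0.1, 0.9]

related = cosine_similarity(apple_revenue, aapl_income)
unrelated = cosine_similarity(apple_revenue, weather_report)
print(related > unrelated)  # True
```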

What is ChromaDB?

A local vector database. Stores (text + embedding) pairs and lets you do similarity search in milliseconds.
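
Conceptually, a vector database does something like the toy class below: it keeps (id, text, embedding) records and returns the ones closest to a query vector. This is a pure-Python stand-in written for illustration. It is not ChromaDB's API, it uses Euclidean distance for brevity, and it scans every record, whereas ChromaDB uses an approximate nearest-neighbour index.

```python
import math


class ToyVectorStore:
    """In-memory stand-in for a vector database (illustration only)."""

    def __init__(self) -> None:
        self._records: dict[str, tuple[str, list[float]]] = {}

    def add(self, doc_id: str, text: str, embedding: list[float]) -> None:
        # Re-adding an existing id overwrites the old record.
        self._records[doc_id] = (text, embedding)

    def query(self, embedding: list[float], k: int = 3) -> list[tuple[str, float]]:
        """Return up to k (text, distance) pairs, nearest first."""
        scored = [
            (text, math.dist(embedding, stored))
            for text, stored in self._records.values()
        ]
        scored.sort(key=lambda pair: pair[1])
        return scored[:k]


store = ToyVectorStore()
store.add("rev", "DEMO revenue summary", [0.9, 0.1])
store.add("news", "DEMO product news", [0.1, 0.9])
print(store.query([1.0, 0.0], k=1)[0][0])  # DEMO revenue summary
```

The linear scan is fine for a handful of records; an index is what keeps lookups fast as the collection grows.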

Why Groq?

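Groq serves LLMs at roughly 300 tokens/second, which is fast enough for interactive chat. The llama3-8b-8192 model has an 8,192-token context window, and Groq's free tier allows 14,400 requests/day for it. Model IDs and rate limits change over time, so confirm the current values in the Groq console.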

🛠️ Troubleshooting

| Problem | Solution |
| --- | --- |
| `GROQ_API_KEY not found` | Make sure `.env` exists in the project root and contains your key |
| `No module named 'groq'` | Run `pip install -r requirements.txt` inside your venv |
| Backend shows 500 error | Check the terminal logs; the most likely cause is a yfinance network timeout |
| Chart doesn't appear | Some tickers have limited yfinance data; try AAPL, MSFT, or GOOGL |
| Model download stuck | Wait ~2 min on first run; `all-MiniLM-L6-v2` is ~80 MB |
| Port 8000 in use | Set `APP_PORT=8001` in `.env` |

📝 License

MIT — Free for personal and commercial use.
