originaljayeshsharma/FinSight-AI


📈 FinSight AI: Investment RAG Analyzer

Resume-worthy project: an AI-powered investment research chatbot using
Retrieval-Augmented Generation (RAG) + NLP, backed entirely by free APIs and open-source tools.


🏗️ Architecture Overview

User Query
    │
    ▼
[ FastAPI Backend: main.py ]
    │
    ├─► [ NLP Ticker Extraction: llm_handler.py ]
    │         Uses Groq LLM to identify stocks from natural language
    │
    ├─► [ Financial Data Fetcher: financial_data.py ]
    │         yfinance → income statements, balance sheets, cash flow,
    │         price history, news (100% free, no API key)
    │
    ├─► [ Embedding Manager: embeddings.py ]
    │         sentence-transformers (all-MiniLM-L6-v2) → text embeddings
    │         ChromaDB → local vector storage + semantic retrieval
    │
    └─► [ LLM Handler: llm_handler.py ]
              Groq API (llama3-8b-8192) → generates structured analysis
              with RAG context injected into the prompt

[ FastAPI serves frontend ]
[ Frontend: index.html / style.css / script.js ]
    └─► Chart.js price charts + marked.js markdown rendering

🆓 Free Resources Used (No Cost for Personal Use)

| Tool | Purpose | Cost |
|---|---|---|
| yfinance | Live stock data, financials, news | Free, no key |
| Groq API | LLM inference (llama3-8b) | Free tier, 14,400 req/day |
| sentence-transformers | Local text embeddings | Free, open-source |
| ChromaDB | Local vector database | Free, open-source |
| FastAPI + Uvicorn | Backend server | Free, open-source |
| Chart.js | Price charts (CDN) | Free |
| marked.js | Markdown rendering (CDN) | Free |

📁 File Structure

investment-rag-chatbot/
├── backend/
│   ├── main.py            ← FastAPI app + API routes + frontend serving
│   ├── rag_engine.py      ← RAG orchestrator (the brain)
│   ├── financial_data.py  ← yfinance data fetcher
│   ├── embeddings.py      ← ChromaDB + sentence-transformers
│   └── llm_handler.py     ← Groq LLM + NLP ticker extraction
├── frontend/
│   ├── index.html         ← UI layout
│   ├── style.css          ← Dark investment dashboard theme
│   └── script.js          ← Chat logic, charts, market ticker
├── chroma_db/             ← Auto-created by ChromaDB
├── requirements.txt
├── .env.example
└── README.md

⚙️ Step-by-Step Setup

Step 1: Prerequisites

Make sure you have:

  • Python 3.10 or 3.11 (recommended)
  • pip
  • Git (optional)
python --version   # Should show 3.10+
pip --version

Step 2: Get a Free Groq API Key

  1. Go to https://console.groq.com
  2. Sign up (no credit card needed)
  3. Click "API Keys" → "Create API Key"
  4. Copy your key (looks like: gsk_xxxxxxxxxxxxx)

Step 3: Clone / Download the Project

# If you have the zip, extract it first, then:
cd investment-rag-chatbot

Step 4: Create a Virtual Environment

# Windows
python -m venv venv
venv\Scripts\activate

# Mac / Linux
python -m venv venv
source venv/bin/activate

Step 5: Install Dependencies

pip install -r requirements.txt

⏳ First run downloads the sentence-transformers model (~80 MB). This happens once.


Step 6: Configure Environment Variables

# Copy the example file
cp .env.example .env

# Open .env and paste your Groq API key
# .env should look like:
# GROQ_API_KEY=gsk_your_actual_key_here

On Windows (if cp doesn't work):

copy .env.example .env
notepad .env
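
As a rough illustration of how the backend might consume these variables, here is a minimal standard-library sketch. The function name `load_settings` and the default values are assumptions for illustration, not the project's actual code; the project lists python-dotenv as a dependency, so in practice the `.env` file would likely be loaded into the environment before a function like this runs.

```python
import os


def load_settings(env=None):
    """Read app settings from a mapping of environment variables.

    Hypothetical helper: the real backend may structure this differently.
    """
    env = os.environ if env is None else env
    api_key = env.get("GROQ_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError(
            "GROQ_API_KEY not found: create .env in the project root and add your key"
        )
    return {
        "groq_api_key": api_key,
        # APP_PORT and CHROMA_PERSIST_DIR are optional; these defaults are assumptions.
        "port": int(env.get("APP_PORT", "8000")),
        "chroma_dir": env.get("CHROMA_PERSIST_DIR", "./chroma_db"),
    }
```

Failing fast with a clear message when the key is missing makes a misplaced `.env` file obvious at startup rather than at the first LLM call.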

Step 7: Run the Application

cd backend
python main.py

You should see:

INFO:     Started server process [xxxxx]
INFO:     Uvicorn running on http://0.0.0.0:8000

Step 8: Open the App

Open your browser and go to:
http://localhost:8000

You'll see the FinSight AI dashboard. Try asking:

  • "Analyze Apple stock"
  • "Compare Microsoft and Google"
  • "What is Tesla's P/E ratio and is it overvalued?"
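
The same questions can also be sent to the backend from a script. The sketch below only builds the HTTP request; it assumes that `/api/chat` accepts a POST with a JSON body containing a `message` field, which this README does not confirm, so check main.py for the real schema before relying on it.

```python
import json
import urllib.request


def build_chat_request(message, base_url="http://localhost:8000"):
    """Build a POST request for the chat endpoint.

    Assumption: the endpoint expects JSON shaped like {"message": "..."}.
    """
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request to a running server and return the decoded response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the server from Step 7 running, `send(build_chat_request("Analyze Apple stock"))` would return the raw response text.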

πŸ”— How the Files Connect (Integration Map)

main.py
 └── imports RAGEngine from rag_engine.py
 └── imports FinancialDataFetcher from financial_data.py
 └── serves ../frontend/ as static files

rag_engine.py
 └── imports FinancialDataFetcher from financial_data.py
 └── imports EmbeddingManager from embeddings.py
 └── imports LLMHandler from llm_handler.py
 └── Orchestrates: fetch → chunk → embed → retrieve → generate

financial_data.py
 └── No internal imports (standalone)
 └── Uses: yfinance, pandas

embeddings.py
 └── No internal imports (standalone)
 └── Uses: sentence-transformers, chromadb

llm_handler.py
 └── No internal imports (standalone)
 └── Uses: groq, python-dotenv

frontend/index.html
 └── loads /static/style.css
 └── loads /static/script.js
 └── calls /api/chat, /api/stock/{ticker}, /api/market, /api/news/{ticker}
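
The fetch, chunk, embed, retrieve, generate sequence that rag_engine.py orchestrates can be sketched with plain functions. This is an illustrative outline only: `chunk_text`, `answer`, and the injected callables are hypothetical names, and the real classes (FinancialDataFetcher, EmbeddingManager, LLMHandler) may expose different interfaces.

```python
def chunk_text(text, max_words=40):
    """Split text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def answer(question, ticker, fetch, embed, retrieve, generate, top_k=3):
    """Run fetch -> chunk -> embed -> retrieve -> generate with injected steps.

    Each step is passed in as a callable so the flow can be tested with stubs.
    """
    document = fetch(ticker)                    # raw financial text for the ticker
    chunks = chunk_text(document)               # split into retrievable pieces
    vectors = [embed(chunk) for chunk in chunks]
    context = retrieve(embed(question), chunks, vectors, top_k)
    return generate(question, context)          # LLM call with context injected
```

Passing the steps in as arguments keeps the orchestration logic independent of yfinance, ChromaDB, and Groq, which mirrors the "standalone modules" layout in the map above.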

🌐 Deployment (Free: Render.com)

Deploy to Render (Free Tier: unlimited hobby projects)

  1. Create account at https://render.com (free, no credit card)

  2. Push to GitHub:

    git init
    git add .
    git commit -m "Initial commit: FinSight AI"
    # Create a repo on github.com, then:
    git remote add origin https://github.com/YOUR_USERNAME/finsight-ai.git
    git push -u origin main
  3. Create a render.yaml in the project root:

    services:
      - type: web
        name: finsight-ai
        runtime: python
        rootDir: backend
        buildCommand: pip install -r ../requirements.txt
        startCommand: python main.py
        envVars:
          - key: GROQ_API_KEY
            sync: false   # You'll enter this in Render dashboard
          - key: APP_PORT
            value: "10000"
          - key: CHROMA_PERSIST_DIR
            value: "./chroma_db"
  4. On Render dashboard:

    • Click "New Web Service"
    • Connect your GitHub repo
    • Set GROQ_API_KEY in Environment Variables
    • Click Deploy
  5. Your app will be live at: https://finsight-ai.onrender.com

⚠️ Note: Render free tier spins down after 15 min of inactivity and wakes up on the next request (~30s delay). ChromaDB data resets on each deploy (ephemeral filesystem). This is fine for demo purposes.


Alternative: Deploy to Railway (also free)

  1. Go to https://railway.app
  2. "New Project" → "Deploy from GitHub"
  3. Add env var: GROQ_API_KEY=your_key
  4. Set start command: cd backend && python main.py
  5. Done. To keep ChromaDB data across restarts and redeploys, attach a Railway volume to the service and point CHROMA_PERSIST_DIR at its mount path.

🎯 Features for Your Resume

  • RAG Pipeline: Fetches real-time financial data, converts it to embeddings, and retrieves semantically relevant chunks per query
  • NLP Ticker Extraction: LLM identifies stock symbols from natural language ("Tell me about Apple" → AAPL)
  • Multi-turn Conversation: Maintains conversation history for contextual follow-up questions
  • Live Financial Data: Income statements, balance sheets, cash flow, price history, analyst ratings, news
  • Vector Search: ChromaDB cosine similarity search over financial document chunks
  • Professional UI: Dark dashboard, animated market ticker, Chart.js price charts, markdown rendering
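
For the ticker extraction feature, the LLM's reply still has to be turned into clean symbols. The sketch below shows one way that post-processing could look, assuming the prompt asks the model to answer with comma-separated uppercase symbols or the word NONE. Both that prompt convention and the `parse_tickers` name are assumptions, not the project's confirmed behavior.

```python
import re

# Up to five letters, optionally followed by a class suffix such as ".B".
TICKER_RE = re.compile(r"^[A-Z]{1,5}(?:[.-][A-Z]{1,2})?$")


def parse_tickers(reply):
    """Extract unique ticker symbols, in order, from an LLM reply."""
    found = []
    for token in re.split(r"[,\s]+", reply.strip()):
        symbol = token.strip("\"'[]().")  # drop stray quotes and brackets
        if symbol == "NONE":
            continue
        if TICKER_RE.match(symbol) and symbol not in found:
            found.append(symbol)
    return found
```

Validating the reply this way guards against the model adding prose around the symbols, at the cost of rejecting lowercase or unusually long tickers.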

🧠 Key Concepts Explained

What is RAG?

RAG (Retrieval-Augmented Generation) = fetch relevant data first, then ask the LLM.
Without RAG, the LLM only knows its training data.
With RAG, we fetch live Apple earnings → embed them → find the most relevant parts for your question → send those as context to the LLM. The LLM then answers based on real data.
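
The "send that as context" step amounts to building a prompt that places the retrieved chunks ahead of the question. Here is a minimal sketch; the function name `build_rag_prompt` and the instruction wording are illustrative assumptions, and the prompt used in llm_handler.py may differ.

```python
def build_rag_prompt(question, chunks, max_chunks=4):
    """Assemble a prompt that places retrieved chunks ahead of the question."""
    context = "\n\n".join(
        f"[{i}] {chunk.strip()}"
        for i, chunk in enumerate(chunks[:max_chunks], start=1)
    )
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

Numbering the chunks makes it easier to ask the model to cite which piece of context supports each claim.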

What are Embeddings?

Text converted to numerical vectors where semantically similar text is mathematically close.
Example: "Apple revenue" and "AAPL income" will have similar embeddings, so a question about "Apple's earnings" retrieves both.
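
"Mathematically close" is usually measured with cosine similarity. The sketch below computes it for small hand-written vectors; real all-MiniLM-L6-v2 embeddings have 384 dimensions, and the toy values in the usage note are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no direction
    return dot / (norm_a * norm_b)
```

With toy vectors such as `[0.9, 0.1]` and `[0.8, 0.2]` for two related phrases and `[0.1, 0.9]` for an unrelated one, the related pair scores higher than either does against the unrelated vector.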

What is ChromaDB?

A local vector database. Stores (text + embedding) pairs and lets you do similarity search in milliseconds.
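
To make the idea concrete, here is a tiny in-memory stand-in for a vector store. It is not ChromaDB's API: the class name `TinyVectorStore` and its methods are hypothetical, and it scans every item rather than using an index, so it only illustrates the store-then-search pattern.

```python
import math


class TinyVectorStore:
    """In-memory stand-in for a vector database (illustrative, not ChromaDB)."""

    def __init__(self):
        self._items = []  # list of (text, unit-length vector) pairs

    @staticmethod
    def _normalize(vec):
        norm = math.sqrt(sum(x * x for x in vec))
        if norm == 0:
            raise ValueError("zero-length embedding")
        return [x / norm for x in vec]

    def add(self, text, embedding):
        """Store a text together with its normalized embedding."""
        self._items.append((text, self._normalize(embedding)))

    def query(self, embedding, n_results=2):
        """Return the n_results texts most similar to the query embedding."""
        q = self._normalize(embedding)
        # For unit vectors, the dot product equals cosine similarity.
        scored = [
            (sum(a * b for a, b in zip(q, vec)), text) for text, vec in self._items
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:n_results]]
```

A real vector database adds persistence and approximate nearest-neighbor indexing on top of this basic pattern, which is what keeps lookups fast as the collection grows.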

Why Groq?

Groq runs LLMs at ~300 tokens/second, far faster than OpenAI's free tier. The llama3-8b-8192 model on Groq gives you 14,400 free requests/day with an 8,192 token context window.
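
Two practical consequences of these limits: 14,400 requests/day averages out to 10 requests per minute (14,400 / 1,440 minutes), and the prompt plus conversation history must fit in 8,192 tokens. The sketch below trims history to a token budget. The four-characters-per-token estimate is a rough heuristic rather than Groq's tokenizer, and `trim_history` is a hypothetical helper, not part of the project.

```python
def estimate_tokens(text):
    """Rough token estimate (about 4 characters per token); a heuristic only."""
    return max(1, len(text) // 4)


def trim_history(messages, context_window=8192, reserve_for_reply=1024):
    """Keep the most recent messages that fit within the prompt budget.

    messages: list of {"role": ..., "content": ...} dicts, oldest first.
    """
    budget = context_window - reserve_for_reply
    kept = []
    for message in reversed(messages):  # walk from newest to oldest
        cost = estimate_tokens(message["content"])
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    return list(reversed(kept))  # restore chronological order
```

Dropping the oldest turns first keeps follow-up questions coherent while leaving room for the retrieved context and the model's reply.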


🛠️ Troubleshooting

| Problem | Solution |
|---|---|
| GROQ_API_KEY not found | Make sure .env exists in the project root with your key |
| No module named 'groq' | Run pip install -r requirements.txt inside your venv |
| Backend shows 500 error | Check terminal logs; most likely a yfinance network timeout |
| Chart doesn't appear | Some tickers have limited yfinance data; try AAPL, MSFT, GOOGL |
| Model download stuck | Wait ~2 min on first run; all-MiniLM-L6-v2 is ~80 MB |
| Port 8000 in use | Change APP_PORT=8001 in .env |
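
Transient network timeouts, like the yfinance case in the table, are often handled by retrying with exponential backoff. The helper below is a generic sketch; `with_retries` is a hypothetical name, and nothing in this README indicates that the project already retries failed fetches.

```python
import time


def with_retries(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying with exponential backoff if it raises.

    The sleep function is injectable so the helper can be tested without waiting.
    """
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * (2 ** attempt))
```

A data fetch could then be wrapped as `with_retries(lambda: yf.Ticker("AAPL").history(period="1mo"))`, assuming yfinance is imported as `yf`.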

📝 License

MIT. Free for personal and commercial use.

About

🚀 AI-powered investment research chatbot using RAG (Retrieval-Augmented Generation) + NLP | Live stock analysis via yfinance | Groq LLM (llama3) | ChromaDB vector search | FastAPI backend | Dockerized & production-ready
