
AliHamzaAzam/financial-sentiment-analysis

Financial Sentiment Analysis: FinBERT, Local LLM, and RAG

A comparative study of three sentiment classification approaches on financial text using the FinancialPhraseBank dataset.

Features

  • FinBERT: Domain-specific transformer pre-trained on financial text (93.37% accuracy)
  • Local LLM: Zero-shot Phi-3-mini via MLX for Apple Silicon (67.42% accuracy)
  • RAG-Enhanced LLM: FAISS retrieval-augmented generation (91.35% accuracy)
  • Topic Modeling: LDA with 5 topics for thematic analysis

Project Structure

.
├── Financial_Sentiment_Analysis.ipynb   # Main notebook
├── README.md
├── requirements.txt
├── results/                              # Output visualizations and predictions
└── report/
    ├── Financial_Sentiment_Analysis_Report.tex
    └── Financial_Sentiment_Analysis_Report.pdf

Dataset

This project uses the FinancialPhraseBank dataset. Download and place it in data/FinancialPhraseBank-v1.0/ before running the notebook.
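Once the files are in place, loading them could look roughly like the sketch below. It assumes the v1.0 distribution's plain-text layout of one `sentence@label` pair per line in ISO-8859-1 encoding. The helper names `parse_phrasebank_lines` and `load_phrasebank` are hypothetical, the sample lines are made up for illustration, and which agreement-level file the notebook actually reads is not stated here.

```python
from collections import Counter

VALID_LABELS = {"positive", "neutral", "negative"}


def parse_phrasebank_lines(lines):
    """Parse 'sentence@label' lines into (sentence, label) pairs."""
    examples = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        # Split on the LAST '@' so a sentence containing '@' stays intact.
        sentence, _, label = line.rpartition("@")
        label = label.strip().lower()
        if sentence and label in VALID_LABELS:
            examples.append((sentence.strip(), label))
    return examples


def load_phrasebank(path):
    """Read one dataset file (hypothetical helper; the path is the caller's choice)."""
    # Assumption: the v1.0 text files are ISO-8859-1 rather than UTF-8.
    with open(path, encoding="iso-8859-1") as handle:
        return parse_phrasebank_lines(handle)


# Made-up lines in the same format, so the parser can be tried without the dataset.
sample = [
    "Operating profit rose to EUR 13.1 mn from EUR 8.7 mn .@positive\n",
    "The company has no plans to move production abroad .@neutral\n",
    "Sales fell by 12 % compared with the previous year .@negative\n",
]
examples = parse_phrasebank_lines(sample)
label_counts = Counter(label for _, label in examples)
```

With the real data, `load_phrasebank` would be pointed at one of the text files under data/FinancialPhraseBank-v1.0/.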

Installation

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Usage

jupyter notebook Financial_Sentiment_Analysis.ipynb

Execute cells sequentially. The notebook covers:

  1. Setup: Device detection (MPS/CUDA/CPU), dependency imports
  2. Data Loading: FinancialPhraseBank (4,840 sentences), train/val/test split (70/10/20)
  3. Topic Modeling: LDA with 5 topics, coherence scoring
  4. FinBERT Evaluation: Pre-trained model inference
  5. Local LLM: Zero-shot Phi-3-mini prompting via MLX
  6. RAG-Enhanced LLM: FAISS index construction, k=7 retrieval
  7. Comparative Analysis: Confusion matrices, per-class metrics, agreement statistics
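The retrieval step in stage 6 can be illustrated with a small, dependency-free sketch. This is not the notebook's code: brute-force cosine similarity over toy bag-of-words vectors stands in for the FAISS index and for whatever embedding model the notebook uses, and the names `embed`, `retrieve`, and `build_prompt`, the prompt wording, and the example sentences are all assumptions. The idea is to find the k labeled training sentences nearest to the query and place them in the prompt as in-context examples.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real sentence embedding."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, corpus, k=7):
    """Return the k labeled examples most similar to the query (exhaustive search)."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(text)), text, label) for text, label in corpus]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(text, label) for _, text, label in scored[:k]]


def build_prompt(query, neighbours):
    """Format retrieved examples followed by the query (hypothetical template)."""
    lines = [
        "Classify the sentiment of the final sentence as positive, neutral, or negative.",
        "",
    ]
    for text, label in neighbours:
        lines.append(f"Sentence: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    lines.append(f"Sentence: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)


# Made-up training examples; the notebook would draw these from the training split.
train = [
    ("net profit rose sharply in the third quarter", "positive"),
    ("operating profit fell due to weak demand", "negative"),
    ("the company will publish its report on friday", "neutral"),
    ("sales rose and profit improved", "positive"),
]
query = "quarterly profit rose"
neighbours = retrieve(query, train, k=2)  # the notebook outline uses k=7
prompt = build_prompt(query, neighbours)
```

In a real pipeline, the exhaustive loop in `retrieve` is what an index such as FAISS replaces, and the resulting prompt would be passed to the local model for generation.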

Results

Method        Accuracy  Precision  Recall   F1-Score  Latency
FinBERT       93.37%    93.58%     93.37%   93.42%    8.7 ms
RAG-Enhanced  91.35%    91.58%     91.35%   91.40%    424 ms
Local LLM     67.42%    70.12%     67.42%   68.01%    387 ms
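Recall equals accuracy in every row, which is what support-weighted averaging produces: weighting each class's recall by its share of the true labels always recovers overall accuracy. Assuming the table was computed that way (the averaging mode is not stated here), the sketch below shows the calculation on made-up labels. The function name `weighted_metrics` is hypothetical.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)
    total = len(y_true)
    precision = recall = f1 = 0.0
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        actual = support[label]
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / actual if actual else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        # Each class contributes in proportion to its number of true examples.
        weight = actual / total
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / total
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels, only to exercise the function.
y_true = ["positive", "positive", "neutral", "neutral", "neutral", "negative"]
y_pred = ["positive", "neutral", "neutral", "neutral", "negative", "negative"]
metrics = weighted_metrics(y_true, y_pred)
```

On these toy labels, weighted recall and accuracy agree, as they do in the table.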

Output Files

  • results/comprehensive_comparison.png - Performance visualization
  • results/three_method_confusion_matrices.png - Confusion matrices
  • results/predictions_all_methods.csv - All predictions with labels
  • results/summary_stats.json - Summary statistics
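The predictions file lends itself to the kind of agreement statistics mentioned in the comparative analysis. The sketch below computes pairwise agreement between methods from CSV rows. The column names (`finbert`, `rag`, `llm`, `true_label`) and the inline sample are assumptions for illustration only; the actual header of results/predictions_all_methods.csv may differ.

```python
import csv
import io
from itertools import combinations


def pairwise_agreement(rows, columns):
    """Fraction of rows on which each pair of columns holds the same label."""
    result = {}
    for a, b in combinations(columns, 2):
        same = sum(1 for row in rows if row[a] == row[b])
        result[(a, b)] = same / len(rows) if rows else 0.0
    return result


# Hypothetical schema and made-up rows; replace with the real file's columns.
csv_text = """text,true_label,finbert,rag,llm
profit rose,positive,positive,positive,neutral
sales fell,negative,negative,negative,negative
report due friday,neutral,neutral,positive,neutral
orders grew,positive,positive,positive,positive
"""
rows = list(csv.DictReader(io.StringIO(csv_text)))
agreement = pairwise_agreement(rows, ["finbert", "rag", "llm"])
```

To use the real output, the `io.StringIO` source would be swapped for the opened CSV file and the column list adjusted to match its header.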

Report

A detailed technical report is included in report/. To rebuild the PDF from the LaTeX source:

cd report
pdflatex Financial_Sentiment_Analysis_Report.tex

License

This project is licensed under the MIT License.

Author

Ali Hamza Azam
