A comparative study of three sentiment classification approaches on financial text using the FinancialPhraseBank dataset.
- FinBERT: Domain-specific transformer pre-trained on financial text (93.37% accuracy)
- Local LLM: Zero-shot Phi-3-mini via MLX for Apple Silicon (67.42% accuracy)
- RAG-Enhanced LLM: FAISS retrieval-augmented generation (91.35% accuracy)
- Topic Modeling: LDA with 5 topics for thematic analysis
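The retrieval step behind the RAG-Enhanced LLM can be illustrated with a minimal sketch. It is not the notebook's code: brute-force cosine search in plain Python stands in for the FAISS index, and the function names `top_k` and `build_prompt`, the prompt wording, and the toy 2-d vectors are all hypothetical. The idea is to find the labeled training sentences closest to the query and place them in the prompt as examples.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def top_k(query, index_vectors, k=7):
    """Return indices of the k stored vectors most similar to the query.

    Exhaustive search; a FAISS index would replace this loop in practice.
    """
    q = normalize(query)
    scores = [sum(a * b for a, b in zip(q, normalize(v))) for v in index_vectors]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def build_prompt(sentence, examples):
    """Assemble a prompt from retrieved (text, label) pairs. Wording is hypothetical."""
    lines = ["Classify the sentiment as positive, negative, or neutral.", ""]
    for text, label in examples:
        lines.append(f"Sentence: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    lines.append(f"Sentence: {sentence}")
    lines.append("Sentiment:")
    return "\n".join(lines)


# Toy data: hand-made 2-d "embeddings", not real model output.
train = [
    ("Profit rose sharply.", "positive"),
    ("Sales fell.", "negative"),
    ("The company is based in Helsinki.", "neutral"),
]
vectors = [[1.0, 0.1], [-1.0, 0.2], [0.0, 1.0]]

hits = top_k([0.9, 0.2], vectors, k=2)
prompt = build_prompt("Earnings grew.", [train[i] for i in hits])
print(prompt)
```

With real sentence embeddings, the same flow applies with k=7 neighbors, the value the notebook reports using.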
.
├── Financial_Sentiment_Analysis.ipynb # Main notebook
├── README.md
├── requirements.txt
├── results/ # Output visualizations and predictions
└── report/
    ├── Financial_Sentiment_Analysis_Report.tex
    └── Financial_Sentiment_Analysis_Report.pdf
This project uses the FinancialPhraseBank dataset. Download and place it in data/FinancialPhraseBank-v1.0/ before running the notebook.
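As a rough sketch of loading the data, the snippet below assumes the distribution format in which each line is a sentence followed by `@` and one of three labels, stored in a Latin-1 encoded text file. The file name in the usage comment and the helper names are assumptions, not taken from the notebook.

```python
from pathlib import Path

LABELS = {"negative", "neutral", "positive"}


def parse_phrasebank_lines(lines):
    """Split 'sentence@label' lines into (sentence, label) pairs."""
    pairs = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last '@' so an '@' inside the sentence is preserved.
        sentence, sep, label = line.rpartition("@")
        if not sep or label not in LABELS:
            raise ValueError(f"unexpected line format: {line!r}")
        pairs.append((sentence.strip(), label))
    return pairs


def load_phrasebank(path):
    """Read one annotation file; Latin-1 encoding is an assumption."""
    text = Path(path).read_text(encoding="latin-1")
    return parse_phrasebank_lines(text.splitlines())


# Usage (assumed file name):
# data = load_phrasebank("data/FinancialPhraseBank-v1.0/Sentences_50Agree.txt")
```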
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
jupyter notebook Financial_Sentiment_Analysis.ipynb

Execute cells sequentially. The notebook covers:
- Setup: Device detection (MPS/CUDA/CPU), dependency imports
- Data Loading: FinancialPhraseBank (4,840 sentences), train/val/test split (70/10/20)
- Topic Modeling: LDA with 5 topics, coherence scoring
- FinBERT Evaluation: Pre-trained model inference
- Local LLM: Zero-shot Phi-3-mini with few-shot prompting
- RAG-Enhanced LLM: FAISS index construction, k=7 retrieval
- Comparative Analysis: Confusion matrices, per-class metrics, agreement statistics
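The 70/10/20 train/val/test split in the Data Loading step could look something like the sketch below. Whether the notebook stratifies by label or which seed it uses is not stated, so both are assumptions here, and `stratified_split` is a hypothetical helper written in plain Python.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.7, 0.1, 0.2), seed=42):
    """Return (train, val, test) index lists with per-class proportions preserved.

    The test set takes whatever remains after train and val, so rounding
    never drops a sample.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    train, val, test = [], [], []
    for label in sorted(by_label):  # sorted for reproducible ordering
        idxs = by_label[label]
        rng.shuffle(idxs)
        n_train = round(fractions[0] * len(idxs))
        n_val = round(fractions[1] * len(idxs))
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]
    return train, val, test


# Toy label list, not the real class distribution.
labels = ["positive"] * 50 + ["neutral"] * 30 + ["negative"] * 20
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))
```

Stratifying keeps the class balance similar across the three subsets, which matters when one class dominates the data.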
| Method | Accuracy | Precision | Recall | F1-Score | Latency |
|---|---|---|---|---|---|
| FinBERT | 93.37% | 93.58% | 93.37% | 93.42% | 8.7 ms |
| RAG-Enhanced | 91.35% | 91.58% | 91.35% | 91.40% | 424 ms |
| Local LLM | 67.42% | 70.12% | 67.42% | 68.01% | 387 ms |
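In every row the Recall column equals Accuracy, which is what support-weighted averaging over classes produces. Assuming the table reports weighted averages (the source does not say so explicitly), the metrics can be derived from a confusion matrix roughly as follows. The matrix values are made up for illustration.

```python
def weighted_metrics(confusion):
    """Compute accuracy and support-weighted precision, recall, and F1.

    confusion[i][j] is the number of samples with true class i
    that were predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    accuracy = sum(confusion[i][i] for i in range(n)) / total

    precision = recall = f1 = 0.0
    for i in range(n):
        tp = confusion[i][i]
        support = sum(confusion[i])                      # true members of class i
        predicted = sum(confusion[r][i] for r in range(n))  # predicted as class i
        p = tp / predicted if predicted else 0.0
        r = tp / support if support else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = support / total
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical 3-class confusion matrix (rows: true, columns: predicted).
example = [
    [8, 1, 1],
    [2, 16, 2],
    [0, 2, 8],
]
scores = weighted_metrics(example)
print(scores)
```

Weighted recall reduces to the sum of true positives divided by the total sample count, so it always coincides with accuracy; precision and F1 generally differ.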
- results/comprehensive_comparison.png: Performance visualization
- results/three_method_confusion_matrices.png: Confusion matrices
- results/predictions_all_methods.csv: All predictions with labels
- results/summary_stats.json: Summary statistics
A detailed technical report is included in report/:
cd report
pdflatex Financial_Sentiment_Analysis_Report.tex

This project is licensed under the MIT License.
Ali Hamza Azam