Warren Buffett Letters Analysis

Overview

The goal of this project is to use NLP techniques such as question answering, sentiment analysis, word clouds, and document similarity to extract meaningful insights from Warren Buffett's annual letters to the Berkshire Hathaway shareholders.

Installation

Create a virtual environment named warren_venv.

On Linux and macOS:
$ python3 -m venv warren_venv
On Windows:
$ python -m venv warren_venv

Then activate the Python virtual environment.

On Linux and macOS:
$ source warren_venv/bin/activate
On Windows:
$ warren_venv\Scripts\activate

Install the requirements

$ pip install -r requirements.txt

Running

Running the QA notebook

To run the notebook, first download the letters after 2000 from https://www.berkshirehathaway.com/letters/letters.html. Then change the argument passed to the get_letters_corpus_dict function so that it points to the directory containing the letters. After that, run the desired cells of the notebook.
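
As a rough illustration of the kind of structure the notebook works with, the sketch below builds a dictionary that maps each year to the text of its letter. It is not the repository's get_letters_corpus_dict. The helper name load_letters_by_year, the assumption that the letters were already converted to plain-text files, and the file naming with a four-digit year (for example 2005.txt) are all hypothetical.

```python
import re
from pathlib import Path


def load_letters_by_year(letters_dir):
    """Return {year: text} for every .txt file whose name contains a 4-digit year.

    Hypothetical helper: assumes the letters are already saved as plain text.
    """
    letters = {}
    for path in sorted(Path(letters_dir).glob("*.txt")):
        match = re.search(r"(19|20)\d{2}", path.stem)
        if match is None:
            continue  # skip files with no recognizable year in the name
        letters[int(match.group(0))] = path.read_text(encoding="utf-8")
    return letters
```

Calling load_letters_by_year("path/to/letters") would then give a dictionary keyed by year, which is a convenient shape for per-letter analysis.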

Running the document similarity

To get the letters most similar to the letter of a specific year, run doc_sim_main.py:

$ python doc_sim_main.py --algorithm <algorithm> --distance <distance> --path <path> --target <target> --number <number> --pretrained <pretrained>

Where:

algorithm: One of tfidf, word2vec, doc2vect, or transformer
distance: Either cosine or euclidean
path: Path to the pickle file containing the letters dict
target: The year of the target letter
number: The number of similar letters to return
pretrained: The pretrained model to use with the transformer algorithm
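
To make these options concrete, here is a minimal sketch of the tfidf algorithm combined with either distance, using only the standard library. It is not the code in document_similarity.py. The function names, the simple alphabetic tokenizer, the smoothed IDF formula, and the toy letters are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Map each key in docs to a {term: tf-idf weight} dictionary."""
    counts = {key: Counter(tokenize(text)) for key, text in docs.items()}
    n_docs = len(docs)
    doc_freq = Counter(term for c in counts.values() for term in c)
    # Smoothed IDF, so terms present in every document keep a small weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    vectors = {}
    for key, c in counts.items():
        total = sum(c.values()) or 1
        vectors[key] = {t: (n / total) * idf[t] for t, n in c.items()}
    return vectors


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return 1.0 - dot / (norm_a * norm_b) if norm_a and norm_b else 1.0


def euclidean_distance(a, b):
    """Return the Euclidean distance between two sparse vectors."""
    terms = set(a) | set(b)
    return math.sqrt(sum((a.get(t, 0.0) - b.get(t, 0.0)) ** 2 for t in terms))


def most_similar(docs, target, number, distance="cosine"):
    """Return the `number` keys closest to `target`, nearest first."""
    dist = cosine_distance if distance == "cosine" else euclidean_distance
    vectors = tfidf_vectors(docs)
    others = [k for k in docs if k != target]
    return sorted(others, key=lambda k: dist(vectors[target], vectors[k]))[:number]


# Toy stand-ins for real letters, keyed by year (illustrative only).
letters = {
    2001: "insurance float and underwriting losses",
    2002: "insurance float grew while underwriting improved",
    2003: "acquisitions of manufacturing businesses",
}
print(most_similar(letters, target=2001, number=1))  # prints [2002]
```

The 2001 and 2002 toy letters share three terms, while 2001 and 2003 share none, so 2002 ranks first under both distances. On real letters the two distances can disagree, because Euclidean distance is sensitive to vector length and cosine distance is not.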

Final Considerations and acknowledgments

For the full analysis behind this code, see my Medium posts:
https://medium.com/analytics-vidhya/using-nlp-to-get-inside-warren-buffet-mind-part-i-666d717d0c2e
https://medium.com/analytics-vidhya/using-nlp-to-get-inside-warren-buffet-mind-part-2-8e3557810a39
https://medium.com/analytics-vidhya/best-nlp-algorithms-to-get-document-similarity-a5559244b23b

