This repository contains Jupyter notebooks illustrating a few basics of machine learning, primarily the fundamentals of artificial neural networks. I created it to explain the working principles behind machine learning to technical leadership at the company I work for.
The learning journey starts with the simple perceptron introduced in 1957, progresses through the development of multi-layer perceptrons and recurrent neural networks (RNNs) for sequence-to-sequence tasks, and finishes with the modern transformer architecture and large language models (LLMs). For explanations of sequence-to-sequence models, RNNs, and transformers, I primarily reference the excellent NLP course by Lena Voita.
The following table of contents is helpful if you want to navigate the material in a logical order.
- Perceptron
- Multi-layer Perceptron (MLP)
- Sequence-to-sequence tasks
- Encoder-decoder framework
- Recurrent Neural Networks (RNNs)
- Attention
- Self-Attention
- 🚧 Positional Encoding
- 🚧 Normalization
- Transformer architecture
- Foundation Models
- 🚧 Generative Pre-trained Transformer
- Use cases for generative AI
- Hands-on
- 🚧 Hands-on: GitHub Copilot
- 🚧 Hands-on: Using LLMs
- Business impact
The `requirements.txt` file in the root of this repository lists all Python packages and their corresponding versions installed in my Python virtual environment (the file was generated with `pip freeze > requirements.txt`).
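
If you want to recreate that environment, a minimal sketch looks like the following (assuming Python 3 and a Unix-like shell; the `.venv` directory name is my choice, and I'm assuming Jupyter itself is among the pinned packages):

```bash
# Create an isolated virtual environment in .venv (directory name is arbitrary)
python3 -m venv .venv

# Activate it (bash/zsh syntax; on Windows use .venv\Scripts\activate)
source .venv/bin/activate

# Install the pinned package versions from the repository root
pip install -r requirements.txt

# Start Jupyter to open the notebooks (assumes Jupyter is in requirements.txt)
jupyter notebook
```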