Author: Wenbin Guo
Last Update: 2023 Spring
This intermediate workshop offers an in-depth introduction to machine learning concepts and algorithms, emphasizing both theoretical understanding and practical skills. Participants will explore supervised and unsupervised learning, learn essential principles of model training and evaluation, and gain hands-on experience with the popular scikit-learn package (https://scikit-learn.org/stable/). By the end of the workshop, attendees will understand the rationale behind key machine learning algorithms and will have practiced building, evaluating, and applying models to real-world data. For registration information, please visit this link.
The workshop is taught every quarter (3-day workshop, 3 hours per day).
Day 1: Introduction to Machine Learning
- Key concepts and applications of machine learning
- Workflow for training a machine learning model
- Essential tools: Jupyter Notebook, scikit-learn, numpy, matplotlib
- Setting up a basic machine learning workflow
- Hands-on practice with a supervised learning example
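As a rough preview of the Day 1 workflow, the sketch below walks through one possible sequence of steps: load data, hold out a test set, fit a model, and evaluate it. The dataset (iris), the model (logistic regression), and the split ratio are illustrative assumptions, not necessarily what the workshop notebooks use.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# 1. Load a small labeled dataset (assumed here: the built-in iris data).
X, y = load_iris(return_X_y=True)

# 2. Hold out 25% of the samples so evaluation uses data the model never saw.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# 3. Chain preprocessing and the classifier so both are fit on training data only.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# 4. Evaluate on the held-out test set.
test_accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.3f}")
```

Wrapping the scaler and classifier in a pipeline is one way to keep test data from leaking into preprocessing; other estimators can be swapped in without changing the surrounding steps.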
Day 2: Supervised Learning
- Classification algorithms: Logistic Regression, KNN, Naive Bayes, SVM, Decision Tree, Random Forest, AdaBoost, XGBoost, Neural Network
- Performance metrics: Accuracy, Confusion Matrix, Precision & Recall, ROC curve, AUC, Precision-Recall curve
- Overfitting, underfitting, and regularization methods for model generalization
- Practical exercises on model training and evaluation
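To make the Day 2 metrics concrete, here is a minimal sketch that trains one of the listed classifiers and computes several of the listed scores with scikit-learn. The dataset (breast cancer), the choice of Random Forest, and all hyperparameters are assumptions made for illustration only.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    accuracy_score,
    average_precision_score,
    confusion_matrix,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split

# Binary classification data; in this dataset label 1 means "benign".
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Hard labels feed the threshold-based metrics;
# probabilities of class 1 feed the ranking-based metrics.
y_pred = clf.predict(X_test)
y_score = clf.predict_proba(X_test)[:, 1]

cm = confusion_matrix(y_test, y_pred)  # rows: true class, columns: predicted class
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
roc_auc = roc_auc_score(y_test, y_score)  # area under the ROC curve
pr_auc = average_precision_score(y_test, y_score)  # summary of the precision-recall curve

print(cm)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
print(f"ROC AUC={roc_auc:.3f} average precision={pr_auc:.3f}")
```

Comparing the same scores on the training split against these test-set values is a simple way to spot overfitting: a large gap suggests the model is memorizing rather than generalizing.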
Day 3: Regression and Unsupervised Learning
- Regression techniques: Linear and Polynomial Regression, SVR, Tree-based models, GBM, Neural Network
- Dimensionality reduction techniques: PCA, t-SNE, Autoencoder
- Clustering methods: K-means, Hierarchical, DBSCAN
- Hands-on exercises for regression, dimensionality reduction, and clustering
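The unsupervised topics from Day 3 can be combined in a short sketch: reduce the data with PCA, then cluster the reduced points with K-means. The synthetic data, the cluster centers, and the parameter values below are assumptions chosen so the example is easy to verify; real data is rarely this well separated.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_rand_score

# Synthetic 5-dimensional data with three well-separated groups (assumed centers).
centers = [
    [0, 0, 0, 0, 0],
    [10, 10, 10, 10, 10],
    [-10, 10, -10, 10, -10],
]
X, true_labels = make_blobs(
    n_samples=300, centers=centers, cluster_std=1.0, random_state=0
)

# Project onto the two directions of largest variance.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)
explained = pca.explained_variance_ratio_.sum()

# Cluster in the reduced space; no labels are used during fitting.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(X_2d)

# The true labels are used only to check how well the clusters match the groups.
ari = adjusted_rand_score(true_labels, cluster_labels)
print(f"Variance kept by 2 components: {explained:.3f}")
print(f"Adjusted Rand index: {ari:.3f}")
```

The adjusted Rand index is only available here because the synthetic data comes with known group labels; on unlabeled data, an internal measure such as the silhouette score would be a more typical check.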
Python and Jupyter Notebook installation
Attendees should have basic programming experience in Python. Familiarity with basic statistics will be helpful.
- slides: slides for each day of the workshop
- dayN: example code and exercises for each day’s topics