Course material for iXperience Data Science 2018. Explanatory notes and code for work in deep learning, machine learning and data science.

| | Monday | Tuesday | Wednesday | Thursday | Friday |
|---|---|---|---|---|---|
| Topic Summary | Introduction to Data Science | Introduction to Python | Fundamentals of data manipulation in Python | Data visualization | Collaborative work and version control |
| Class structure | The pipeline from data to models in production. Deep learning and the data scientist's skillset. | Syntax, data structures: lists, dictionaries, functions, classes. | NumPy and pandas. | Matplotlib deep dive. | Git and GitHub. |
| Homework Assignments | Vim, Tmux, navigating the terminal. | Python programming exercises. | Data structures in Python and view construction in pandas. | Plotting figures with Matplotlib. | Collaborative project extracting features from cryptocurrency trading and order book data. |

| | Monday | Tuesday | Wednesday | Thursday | Friday |
|---|---|---|---|---|---|
| Topic Summary | Introduction to Machine Learning | Machine Learning algorithms | Evaluation of classifiers | Essential SQL for data scientists | Spark and Big Data |
| Class structure | Quality of fit, bias-variance trade-off, decision boundaries. | Tree-based methods, support vector machines, hyperparameter optimization. | Class imbalance, ROC, precision and recall, confusion matrices, boosting. | Declarative languages, SQL syntax, selecting, grouping, joining, indices and optimization. | RDDs, big data pipelines and the PySpark API. |
| Homework Assignments | Plotting decision boundaries, evaluating model complexity, bias and variance. Cross-validation. | Hyperparameter optimization: grid search vs random search. | Modelling with class imbalance, rigorous model evaluation. | SQL exercises. | Spark pipeline for feature extraction. |

| | Monday | Tuesday | Wednesday | Thursday | Friday |
|---|---|---|---|---|---|
| Topic Summary | Dimensionality reduction | Clustering | GPU Server Setup | Introduction to neural networks | Convolutional networks |
| Class structure | Linear vs non-linear dimensionality reduction. PCA, t-SNE. | Density-based clustering, DBSCAN, hierarchical clustering. | GPU acceleration, NVIDIA CUDA and cuDNN. | Feedforward networks motivation and development, introduction to the Keras API. | Why convolutions, genesis and building blocks of convolutional models, transfer learning. |
| Homework Assignments | t-SNE, density and preserved quantities. | Assessing clustering quality. | Setting up a GPU server for deep learning with Google Cloud Compute. | Feedforward networks with Keras. | Convolutional networks and transfer learning. |

| | Monday | Tuesday | Wednesday | Thursday | Friday |
|---|---|---|---|---|---|
| Topic Summary | Recurrent models | Recurrent models | Autoencoders | Model productionization | Putting it all together |
| Class structure | Simple RNN cells, memory and vanishing gradients. | Generators, LSTMs and implementation. | Foundations of autoencoders and unsupervised learning. | Model serving and APIs with Flask and Celery. | Integrating model design and productionization. |
| Homework Assignments | Recurrent model intuitions. | Temperature prediction and generative sequence modelling. | Generative adversarial network design. | Creating a web server to host a trained model. | Start-to-finish modelling pipeline. |