Skip to content

The most common machine learning models and algorithms programmed from first principles. The aim is to implement these models and algorithms to the same standard as widely used libraries such as scikit-learn with clean, efficient code, clear modular design, and rigorous adherence to the underlying mathematics.

License

Notifications You must be signed in to change notification settings

prithvi-ramrucha/Machine-Learning-From-Scratch

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

34 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Machine Learning From Scratch

The most common machine learning models and algorithms programmed from first principles. The aim is to implement these models and algorithms to the same standard as widely used libraries such as scikit-learn with clean, efficient code, clear modular design, and rigorous adherence to the underlying mathematics.

Model / Algorithm Type Completed? Notes
Linear Regression Regression TBR
Logistic Regression Classification TBR
K-Nearest Neighbours Classification TBR
K-Means Clustering TBR
Stochastic Gradient Descent Optimisation TBR
Naive Bayes Classifiers Classification Categorical ✅, Multinomial ❌, Gaussian ❌ TBR
Decision Trees Classification/Regression Classifier ✅, Regressor ✅, CCP Pruning ✅, Feature Importance ✅ Speed and efficiently need to be further optimised for production-level.
Random Forests Classification/Regression Bootstrap aggregated ✅, Rotation forest ❌, Extremely Randomised Trees (ERT) ❌ TBR
Support Vector Machine Classification/Regression Hard-margin SVC ✅, Soft-margin SVC ✅ Kernel Trick OvO ❌ OvR ✅ CURRENTLY WORKING ON
Principal Component Analysis Dimensionality Reduction First principle derivation of the eigenvalue equation need to be added.
DBSCAN Clustering Pseudo-code needs to be added for queuing algorithm used for cluster growth.
Gaussian Mixture Models Clustering (Probabilistic) ⏳🚧 In progress (put on hold). Need to formally explore Maximum Likelihood Estimation (MLE) theory.
Linear Discriminant Analysis Dimensionality Reduction Not started.
Gradient Boosting Classification/Regression ⏳🚧 Theory section written, but the model has not been implemented.

Current working on: Support Vector Machines

  • Implement the kernel trick for SVCs.
  • Implement a soft-margin SVC using the dual formulation.
  • Implement a soft-margin SVC using SGD and Hinge loss.
  • Combine multiple SVCs (OvO and OvR) to handle classification with C>2 number of classes.

About

The most common machine learning models and algorithms programmed from first principles. The aim is to implement these models and algorithms to the same standard as widely used libraries such as scikit-learn with clean, efficient code, clear modular design, and rigorous adherence to the underlying mathematics.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published