ML-From-Scratch
Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.
A Python project that rebuilds popular machine learning algorithms step by step using only NumPy, so you can see exactly how each algorithm works under the hood instead of treating a library like scikit-learn or PyTorch as a black box.
ML From Scratch is a collection of Python implementations of machine learning algorithms written from first principles using only NumPy, the fundamental numerical computing library. Its goal is education: rather than providing optimized, production-ready code, it prioritizes showing exactly how each algorithm works step by step, making the underlying math and logic visible and approachable.
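As an illustration of what "from first principles using only NumPy" can look like, here is a minimal sketch of linear regression fitted by batch gradient descent. It is not the project's own code: the function name `fit_linear_regression`, its parameters, and the synthetic data are assumptions made for this example.

```python
import numpy as np


def fit_linear_regression(X, y, learning_rate=0.1, n_iterations=2000):
    """Fit y approximately equal to X @ w + b by batch gradient descent on MSE.

    Hypothetical helper for illustration; not taken from the repository.
    """
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iterations):
        y_pred = X @ w + b
        error = y_pred - y
        # Gradients of mean(error**2) with respect to w and b
        grad_w = 2.0 * (X.T @ error) / n_samples
        grad_b = 2.0 * error.mean()
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Synthetic data (an assumption for the demo): y = 3x + 2 plus small noise
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(scale=0.05, size=200)

w, b = fit_linear_regression(X, y)
print("slope:", w[0], "intercept:", b)
```

Every step of the update is visible here: the prediction, the error, the two gradients, and the parameter update. A library call would hide exactly these steps.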
The project covers a broad range of machine learning techniques organized into four categories. Supervised learning includes algorithms like linear regression, decision trees, support vector machines, and neural networks. Unsupervised learning includes clustering methods like k-means and DBSCAN, dimensionality reduction methods like PCA, and generative models like variational autoencoders and generative adversarial networks. Reinforcement learning includes deep Q-networks. The deep learning section covers building neural network layers from scratch, including convolutional layers, recurrent layers, batch normalization, and attention mechanisms.
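To make one of the unsupervised methods named above concrete, the following is a minimal sketch of k-means clustering (Lloyd's algorithm) in NumPy. The function `k_means`, its signature, and the two-blob test data are assumptions for illustration and do not reproduce the repository's implementation.

```python
import numpy as np


def k_means(X, k, n_iterations=100, seed=0):
    """Cluster the rows of X into k groups with Lloyd's algorithm.

    Hypothetical helper for illustration; not taken from the repository.
    """
    rng = np.random.default_rng(seed)
    # Initialise centroids as k distinct samples drawn from the data
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iterations):
        # Pairwise distances between samples and centroids, shape (n_samples, k)
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Move each centroid to the mean of its assigned samples;
        # keep the old centroid if a cluster ends up empty
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels


# Two well-separated synthetic blobs (an assumption for the demo)
rng = np.random.default_rng(1)
blob_a = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=10.0, scale=0.5, size=(50, 2))
X = np.vstack([blob_a, blob_b])

centroids, labels = k_means(X, k=2)
print(centroids)
```

The loop alternates between the two steps that define the algorithm: assign each sample to its nearest centroid, then recompute each centroid as the mean of its samples.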
Each implementation is accompanied by runnable example scripts that produce visualizations, such as an animated GIF of a GAN learning to generate handwritten digits or a graph of a regression model fitting temperature data. This makes abstract concepts concrete by letting learners run and observe the algorithms directly.
Use this repository when you are studying machine learning and want to understand what actually happens inside a model, rather than treating a high-level library like scikit-learn or PyTorch as a black box. It is also useful when preparing for technical interviews where implementation knowledge matters.
The tech stack is Python, with NumPy as the only significant dependency; some examples additionally use scikit-learn for datasets and Matplotlib for plotting. The project is designed to be read and run locally rather than deployed.
Where it fits
- Study how neural networks, decision trees, and support vector machines work by reading and running clean Python implementations
- Prepare for technical machine learning interviews by implementing classic algorithms from scratch with no shortcuts
- Visualize how a GAN learns to generate handwritten digits or how a regression model fits data through included runnable example scripts
- Understand deep learning building blocks like convolutional layers, batch normalization, and attention mechanisms without any black-box library hiding the math
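
As a taste of the building blocks mentioned in the last item, here is a minimal sketch of the training-mode forward pass of batch normalization in NumPy. It is an illustration under stated assumptions, not the repository's layer: the name `batch_norm_forward` is hypothetical, and the sketch omits the running statistics used at inference time and the backward pass.

```python
import numpy as np


def batch_norm_forward(X, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then scale and shift.

    X has shape (batch_size, n_features); gamma and beta have shape (n_features,).
    Hypothetical helper for illustration; not taken from the repository.
    """
    mean = X.mean(axis=0)
    var = X.var(axis=0)
    # eps guards against division by zero for near-constant features
    X_hat = (X - mean) / np.sqrt(var + eps)
    return gamma * X_hat + beta


# Synthetic batch with non-zero mean and non-unit variance (demo assumption)
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=3.0, size=(64, 4))

out = batch_norm_forward(X, gamma=np.ones(4), beta=np.zeros(4))
print(out.mean(axis=0), out.std(axis=0))
```

With `gamma` set to ones and `beta` set to zeros, each output feature has mean close to 0 and standard deviation close to 1, which is the normalization the layer is named for.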