MLP Digits Classifier — NumPy from Scratch

A Multi-Layer Perceptron (MLP) implemented from scratch with NumPy, trained to classify handwritten digits (0–9) using scikit-learn's built-in digits dataset (1797 samples of 8×8 images).

No deep learning frameworks — just NumPy and a clean implementation of forward pass, backpropagation, and gradient descent.

Follows on from LinearRegressionGradientDescent, extending gradient descent from a single-output linear model to a multi-class neural network.

Network Architectures

Single-Layer Perceptron (SLP)

A minimal baseline — one linear mapping from pixels to class logits. See np_slp_digits.py.

Input (64)  →  Output Softmax (10)

Layer	Size	Activation
Input	64 (8×8 pixel values)	—
Output	10 neurons (digits 0–9)	Softmax

Multi-Layer Perceptron (MLP)

Adds a hidden ReLU layer, enabling the network to learn non-linear features.

Input (64)  →  Hidden ReLU (32)  →  Output Softmax (10)

Layer	Size	Activation
Input	64 (8×8 pixel values)	—
Hidden	32 neurons	ReLU
Output	10 neurons (digits 0–9)	Softmax

How It Works

Data — Loads sklearn.datasets.load_digits (1797 samples, 64 features each); applies per-feature z-score standardisation (mean=0, std=1) with a small epsilon to avoid division by zero on constant pixels.
One-hot encoding — Converts integer labels to a (1797, 10) target matrix via np.eye(10)[y].
Forward pass — Computes ReLU activations in the hidden layer, then numerically stable softmax probabilities at the output.
Backward pass — Uses the combined cross-entropy + softmax gradient (predictions - targets) / N, then propagates through the ReLU gate using a binary mask (layer1 > 0).
Update — Applies full-batch gradient descent for 1000 epochs, printing accuracy every 100 epochs.

Key Implementation Details

Detail	Description
Numerically stable softmax	Subtracts per-row max before `exp` to prevent overflow
Combined CE + softmax gradient	`(probs - targets) / N` — avoids computing loss explicitly
ReLU gate	`layer1 > 0` binary mask zeros gradients for inactive units
Weight initialisation	`randn * 0.1` — small random weights; biases initialised to zero
Reproducibility	`np.random.seed(42)`

Hyperparameters

Parameter	Value
Learning rate	0.1
Epochs	1000
Batch size	Full batch (1797 samples)
Hidden units	32

Files

File	Description
`np_mlp_digits.py`	Clean Python script — full training loop
`np_mlp_digits.ipynb`	Jupyter Notebook version with cell-by-cell walkthrough
`np_slp_digits.py`	Even simpler single-layer (no hidden layer) softmax classifier
`np_slp_digits.ipynb`	Jupyter Notebook version of the single-layer classifier

Requirements

numpy
scikit-learn

Install with:

pip install numpy scikit-learn

Usage

python np_slp_digits.py   # single-layer
python np_mlp_digits.py   # multi-layer

Epoch	SLP (`np_slp_digits.py`)	MLP (`np_mlp_digits.py`)
0	13%	10%
100	94%	72%
200	96%	88%
300	97%	94%
400	97%	96%
500	98%	97%
600	98%	98%
700	98%	98%
800	98%	99%
900	98%	99%

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 31 Commits
LICENSE		LICENSE
README.md		README.md
np_mlp_digits.ipynb		np_mlp_digits.ipynb
np_mlp_digits.py		np_mlp_digits.py
np_slp_digits.ipynb		np_slp_digits.ipynb
np_slp_digits.py		np_slp_digits.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MLP Digits Classifier — NumPy from Scratch

Network Architectures

Single-Layer Perceptron (SLP)

Multi-Layer Perceptron (MLP)

How It Works

Key Implementation Details

Hyperparameters

Files

Requirements

Usage

License

About

Uh oh!

Contributors

Uh oh!

Languages

Epoch	SLP (`np_slp_digits.py`)	MLP (`np_mlp_digits.py`)
0	13%	10%
100	94%	72%
200	96%	88%
300	97%	94%
400	97%	96%
500	98%	97%
600	98%	98%
700	98%	98%
800	98%	99%
900	98%	99%

Epoch	SLP (`np_slp_digits.py`)	MLP (`np_mlp_digits.py`)
0	13%	10%
100	94%	72%
200	96%	88%
300	97%	94%
400	97%	96%
500	98%	97%
600	98%	98%
700	98%	98%
800	98%	99%
900	98%	99%

Folders and files

Latest commit

History

Repository files navigation

MLP Digits Classifier — NumPy from Scratch

Network Architectures

Single-Layer Perceptron (SLP)

Multi-Layer Perceptron (MLP)

How It Works

Key Implementation Details

Hyperparameters

Files

Requirements

Usage

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Contributors

Uh oh!

Languages

Epoch	SLP (`np_slp_digits.py`)	MLP (`np_mlp_digits.py`)
0	13%	10%
100	94%	72%
200	96%	88%
300	97%	94%
400	97%	96%
500	98%	97%
600	98%	98%
700	98%	98%
800	98%	99%
900	98%	99%