Handwritten-Digit-Classification-Using-K-Means-Clustering

Reimplemented the K-means algorithm from scratch to classify MNIST digits, with K-means++ initialization, Euclidean-distance assignment, centroid updates, and an outlier-detection system. Achieved 78% accuracy and identified high-variance misclassified digits.

Project Report

Read the full report here: Clustering and Classification of Handwritten Digits Using the K-Means Algorithm (PDF)

Overview

This project implements and tunes the K-means clustering algorithm to classify handwritten digits from the Modified National Institute of Standards and Technology (MNIST) database. Each 28×28-pixel image is treated as a 784-dimensional vector; the algorithm groups these vectors into clusters and computes a representative centroid for each cluster. We changed the centroid initialization to use the K-means++ method for improved performance and set a distance-based statistical threshold for outlier detection. The resulting algorithm achieved 78% classification accuracy on the test set and flagged 14 outliers.
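
The clustering loop described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the repository's MATLAB code: the function names (`kmeans_pp_init`, `assign`, `update`, `kmeans`), the `max_iters` and `seed` parameters, and the toy 2-D data are all assumptions made for the example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, rng):
    """Pick k initial centroids with K-means++ seeding."""
    centroids = [list(rng.choice(points))]
    while len(centroids) < k:
        # Squared distance from each point to its nearest chosen centroid.
        d2 = [min(sq_dist(p, c) for c in centroids) for p in points]
        total = sum(d2)
        if total == 0:
            # Every point coincides with a centroid; fall back to a uniform pick.
            centroids.append(list(rng.choice(points)))
            continue
        # Sample the next centroid with probability proportional to d2.
        r = rng.random() * total
        cumulative = 0.0
        for p, d in zip(points, d2):
            cumulative += d
            if cumulative > r:
                centroids.append(list(p))
                break
    return centroids


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]


def update(points, labels, centroids):
    """Move each centroid to the mean of the points assigned to it."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if not members:
            new_centroids.append(old)  # an empty cluster keeps its centroid
            continue
        new_centroids.append(
            [sum(m[i] for m in members) / len(members) for i in range(len(old))]
        )
    return new_centroids


def kmeans(points, k, max_iters=20, seed=0):
    """Run K-means; return centroids, labels, and the sum-of-squares cost."""
    rng = random.Random(seed)
    centroids = kmeans_pp_init(points, k, rng)
    labels = assign(points, centroids)
    for _ in range(max_iters):
        centroids = update(points, labels, centroids)
        new_labels = assign(points, centroids)
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    cost = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, cost


# Toy data: two well-separated groups of 2-D points.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centroids, labels, cost = kmeans(points, k=2)
print(labels, round(cost, 3))
```

For MNIST, each entry of `points` would be a 784-element pixel vector instead of a 2-D point. Running `kmeans` for several values of `k` and `max_iters` and comparing the returned `cost` is one plausible way to carry out the kind of parameter tuning the project describes.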

Features

Core K-means Implementation: Developed a full K-means algorithm to classify high-dimensional MNIST image vectors
Centroid Initialization Optimization: Employed the K-means++ method to select initial centroids and improve overall clustering performance
Statistical Outlier Detection: Implemented a distance-based system to identify data anomalies using a statistical threshold
Parameter Tuning: Optimized the algorithm by running tests to determine the best number of clusters and iterations to minimize the cost function
Performance and Analysis: Achieved a classification accuracy of 78% and analyzed sources of error, particularly for digits lacking closed borders
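
The classification and outlier-detection features can be illustrated with a short sketch. It is written in plain Python rather than the repository's MATLAB, and it rests on assumptions the README does not confirm: each cluster is labeled with the most common true digit among its members (majority vote), and the outlier threshold is taken as the mean distance plus three standard deviations. The function names and sample values are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def label_clusters(assignments, true_labels, k):
    """Map each cluster index to the most common true label among its members."""
    mapping = {}
    for j in range(k):
        members = [t for a, t in zip(assignments, true_labels) if a == j]
        if members:
            mapping[j] = Counter(members).most_common(1)[0][0]
    return mapping


def accuracy(assignments, true_labels, mapping):
    """Fraction of points whose cluster's label matches their true label."""
    correct = sum(
        1 for a, t in zip(assignments, true_labels) if mapping.get(a) == t
    )
    return correct / len(true_labels)


def flag_outliers(distances, n_std=3.0):
    """Indices whose distance exceeds mean + n_std * standard deviation.

    The 3-sigma default is an assumption for illustration only.
    """
    threshold = mean(distances) + n_std * pstdev(distances)
    return [i for i, d in enumerate(distances) if d > threshold]


# Hypothetical cluster assignments and true digit labels for six images.
assignments = [0, 0, 0, 1, 1, 1]
true_labels = [3, 3, 8, 7, 7, 7]
mapping = label_clusters(assignments, true_labels, k=2)
acc = accuracy(assignments, true_labels, mapping)

# Hypothetical distances from each image to its assigned centroid.
distances = [1.0] * 20 + [10.0]
outliers = flag_outliers(distances)
print(mapping, acc, outliers)
```

Here `distances` stands for each image's Euclidean distance to its assigned centroid. A stricter or looser `n_std` changes how many images are flagged, so the value would need to be chosen against the data.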

Authors

Nick Regas
Lucas Selvik

