Format
I built this project as part of one of my machine-learning classes at the University of Maryland. To comply with academic integrity policies, I have not posted the code anywhere publicly; instead, I have provided screenshots of it, which can be viewed via any of the three "View Code" buttons below. This prevents web scraping or indexing of the code.
For this project, I implemented various neural networks trained with gradient descent. I built simple models for tasks like regression, classification, and dimensionality reduction (via PCA). The only code provided was for generating the training data for each model; everything else was up to me. The project combines implementation with a theoretical understanding of neural-network concepts such as loss functions, activations, and optimization techniques.
This Python code trains a single-layer perceptron for a linear regression task. It includes functions to generate training data, calculate mean squared error loss, train the perceptron using gradient descent, and test the trained model. The perceptron is trained to approximate a linear function with randomly generated data points. The training process iteratively updates the weights and bias to minimize the loss, and the final model is tested on a set of inputs to verify its performance.
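Since the actual code is only available as screenshots, here is a minimal sketch of the approach rather than my submitted code; the target function (y = 3x + 2), hyperparameters, and function names below are illustrative stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Generate noisy samples from a linear target function
# (illustrative values; the assignment's actual generator differs).
def generate_data(n=100):
    x = rng.uniform(-1, 1, size=n)
    y = 3.0 * x + 2.0 + rng.normal(0, 0.05, size=n)  # target: y = 3x + 2
    return x, y

def mse_loss(y_pred, y_true):
    return np.mean((y_pred - y_true) ** 2)

def train(x, y, lr=0.1, epochs=500):
    w, b = rng.normal(), rng.normal()    # random initialization
    for _ in range(epochs):
        y_pred = w * x + b               # forward pass
        error = y_pred - y
        dw = 2 * np.mean(error * x)      # gradient of MSE w.r.t. w
        db = 2 * np.mean(error)          # gradient of MSE w.r.t. b
        w -= lr * dw                     # gradient descent update
        b -= lr * db
    return w, b

x, y = generate_data()
w, b = train(x, y)
print(f"learned w={w:.3f}, b={b:.3f}")   # should approach 3 and 2
print(f"final loss: {mse_loss(w * x + b, y):.4f}")
```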
This code implements a simple neural network with a single input and output layer using sigmoid activation, trained for binary classification. Weights and bias are randomly initialized and updated with gradient descent. Two datasets are generated: one that is linearly separable and one that is not; the non-separable dataset represents the XOR problem. The model performs well on the linearly separable dataset but struggles with XOR, an inherent limitation of linear classifiers on non-linearly separable data.
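As another illustrative sketch (not the submitted code): this version assumes a binary cross-entropy loss, which pairs cleanly with sigmoid output; the datasets and hyperparameters are also stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, lr=0.5, epochs=2000):
    w = rng.normal(size=X.shape[1])      # random initialization
    b = rng.normal()
    for _ in range(epochs):
        p = sigmoid(X @ w + b)           # forward pass
        grad = p - y                     # dL/dz for sigmoid + cross-entropy
        w -= lr * (X.T @ grad) / len(y)  # gradient descent updates
        b -= lr * grad.mean()
    return w, b

# Linearly separable dataset: class 1 when x0 + x1 > 1
X_sep = rng.uniform(0, 1, size=(200, 2))
y_sep = (X_sep.sum(axis=1) > 1).astype(float)

# XOR dataset: not linearly separable
X_xor = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_xor = np.array([0.0, 1.0, 1.0, 0.0])

for name, (X, y) in {"separable": (X_sep, y_sep), "XOR": (X_xor, y_xor)}.items():
    w, b = train(X, y)
    acc = ((sigmoid(X @ w + b) > 0.5) == y).mean()
    print(f"{name}: accuracy {acc:.2f}")
```

A single linear decision boundary can classify at most three of the four XOR points correctly, so the XOR accuracy plateaus no matter how long the model trains.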
This code implements a basic autoencoder neural network without activation functions, designed to perform a form of Principal Component Analysis (PCA) on 2D data points. The goal is for the network to learn a low-dimensional representation of the input data by compressing it through the hidden layer and reconstructing it at the output layer. The loss function minimizes the difference between input and output values, training the network to reconstruct its input.
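Again as a sketch rather than the submitted code: a linear 2→1→2 autoencoder trained with manual gradient descent. The data generation here is my own stand-in; the course provided the real one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Correlated 2D data along the direction [2, 1] (illustrative stand-in)
X = rng.normal(size=(500, 1)) @ np.array([[2.0, 1.0]]) + rng.normal(0, 0.1, (500, 2))
X -= X.mean(axis=0)                       # PCA assumes centered data

# Linear autoencoder: 2 -> 1 -> 2, no activation functions
W_enc = rng.normal(size=(2, 1)) * 0.1
W_dec = rng.normal(size=(1, 2)) * 0.1

lr = 0.01
for _ in range(3000):
    H = X @ W_enc                         # encode to 1D bottleneck
    X_hat = H @ W_dec                     # decode back to 2D
    err = X_hat - X                       # reconstruction error
    # Gradients of the mean squared reconstruction loss
    dW_dec = H.T @ err / len(X)
    dW_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * dW_dec
    W_enc -= lr * dW_enc

# The learned decoder aligns with the top principal component (up to sign)
print("decoder direction:", (W_dec / np.linalg.norm(W_dec)).ravel())
```

With no nonlinearities, the best the network can do is project onto the top principal subspace of the data, which is exactly why a linear autoencoder recovers PCA (up to rotation and sign).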
Experience & Process
I enjoyed this project because it let me dive deep into something that is often abstracted away. Neural networks underpin most modern machine learning and artificial intelligence, so understanding them at this level is crucial.
Languages & Tools Used
- Python (PyTorch)
- NumPy
- scikit-learn