V2.1 optimisation #5

Merged
merged 12 commits into from
Jan 1, 2025
1 change: 1 addition & 0 deletions .gitattributes
@@ -0,0 +1 @@
*.pth filter=lfs diff=lfs merge=lfs -text
16 changes: 11 additions & 5 deletions README.md
@@ -1,4 +1,5 @@
# Winter-Mini-Project

Welcome to my image classification project, which I built to learn ML better.
Deep learning can be very CPU and memory intensive, so a PC with a GPU and/or a good amount of memory is strongly recommended for running this project's main code.
It also has a version that uses the CUDA Deep Neural Network (cuDNN) library, which will need to be installed for optimal training speed of the model.
@@ -7,9 +8,10 @@ Another option in situations where a good PC may not be available is the use of
Session Log:

1. 20/12/2024:
-Research on convolutional neural networks. Laid out design of CNN and how it can be used to create an image classifier. Installed libraries and selected a big cat database from Kaggle
-Research on convolutional neural networks: Laid out design of CNN and how it can be used to create an image classifier
-Installed libraries and selected a big cat database from Kaggle

**Worked Part-Time job between these dates**
*Worked Part-Time job between these dates*

2. 23/12/2024:
-Ran into library issues and corruption with PyTorch (corrupted version and file path too long to be recognised by the terminal). Debugged the problem as follows:
@@ -18,7 +20,7 @@ Session Log:
-Committed code to convert image to tensor
-Created this file

4. 24/12/2024 (Christmas Eve):
3. 24/12/2024 (Christmas Eve):
- Huge progress: Custom Dataset and CNN Model creation. Learned a lot today
- Added example code using simple greyscale matrix for convolution and pooling
- CNN model class:
@@ -31,12 +33,13 @@ Session Log:
- Instantiates column values in the CSV as variables to access
- Combined the model and the dataset to iterate over epochs of training data, training the model with backpropagation (using cross-entropy loss)

5. 28/12/2024:
4. 28/12/2024:
- Debugged code: replaced the string labels with unique numeric labels, as PyTorch requires
- Looked for quicker training methods such as GPU use with NVIDIA CUDA and cuDNN. Both were unsuccessful. Still searching for a better option; in the meantime, training on the CPU of my higher-performance PC instead of my work laptop
- Script officially runnable but haven't saved the created model

6. 30/12/2024:
5. 30/12/2024:

- Researched cloud computing solutions for training DL models quicker. Found Google Colab but was unsuccessful in integrating it due to compatibility issues. If I wanted to use this as a solution, I should have started with Google Colab. Continued with slower, non-parallelised CPU training
- Saved state dictionary of the model (saved the model) as a .pth file and uploaded it to GitHub using Git LFS
- Created testing script using the created model
@@ -54,6 +57,7 @@ References used:
- https://stackoverflow.com/questions/59097657/in-pytorch-how-to-test-simple-image-with-my-loaded-model
- https://stackoverflow.com/questions/58779759/how-can-i-display-an-image-on-vs-code-and-pycharm


NOTE: ChatGPT was used to help with understanding CNN concepts and the function of sections of code used in the referenced resources

Libraries required for running (copy and paste into terminal):
@@ -65,3 +69,5 @@ pip install pandas

(If you want to use CUDA to speed up the training process, run this, then install CUDA and cuDNN from the internet)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install torch torchvision pillow pandas
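
After running the install commands above, a quick check that the libraries can be found may save a failed training run later. The following is a minimal sketch using only the standard library; `missing_modules` and `REQUIRED_MODULES` are hypothetical names introduced for illustration, and the list assumes the four packages named in this README (Pillow is imported as `PIL`).

```python
from importlib.util import find_spec

# Import names for the packages listed above; Pillow is imported as "PIL".
# This list is an assumption based on the README, not part of the project.
REQUIRED_MODULES = ["torch", "torchvision", "PIL", "pandas"]


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required libraries are installed.")
```

This only confirms that the modules are importable; it does not check versions or whether a CUDA build of PyTorch was installed.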

37 changes: 37 additions & 0 deletions Test and Example Code/tester proposed addition.py
@@ -0,0 +1,37 @@
"""
class WildcatDataset(Dataset):
    def __init__(self, csv_file, root_dir, dataset_type, transform=None):

        self.data_frame = pd.read_csv(csv_file) #gets csv and reads it
        self.root_dir = root_dir
        self.transform = transform
        self.dataset_type = dataset_type # 'train', 'valid', or 'test'

        # Filter the data based on the 'data set' column (.copy() avoids pandas' SettingWithCopyWarning when a column is added below)
        self.data_frame = self.data_frame[self.data_frame['data set'] == self.dataset_type].copy()

        # Converts the labels that determine what something is (AFRICAN LEOPARD, etc) into a unique number
        # Create a mapping from labels to numeric class indices
        self.label_to_idx = {label: idx for idx, label in enumerate(self.data_frame['labels'].unique())}

        # Add a new column with numeric labels
        self.data_frame['numeric_label'] = self.data_frame['labels'].map(self.label_to_idx)

    def __len__(self):
        return len(self.data_frame) #returns the number of rows left after filtering

    def __getitem__(self, idx):
        img_name = os.path.join(self.root_dir, self.data_frame.iloc[idx, 1]) # grabs the filepath column from the csv
        image = Image.open(img_name).convert('RGB') # Load image and converts into RGB values from grabbed filepath
        label = torch.tensor(self.data_frame.iloc[idx]['numeric_label'], dtype=torch.long) # gets the numeric version of the label from the label column of the csv

        if self.transform:
            image = self.transform(image)

        return image, label


test_dataset = WildcatDataset(csv_file=r'cats-in-the-wild-image-classification\versions\1\WILDCATS.csv', root_dir=r'cats-in-the-wild-image-classification\versions\1', dataset_type='test', transform=transform)
test_loader = DataLoader(test_dataset, batch_size=1, shuffle=True)

"""
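
The dataset class above numbers the labels in the order that `unique()` first meets them, so the numbering depends on the row order of whichever split was loaded. A minimal sketch of one alternative follows, assuming every split contains the same set of label strings: sorting the labels first makes the mapping independent of row order. `build_label_index` is a hypothetical helper, and the label lists are made-up examples rather than rows from the real CSV.

```python
def build_label_index(labels):
    """Map each distinct label to an index, using sorted order."""
    return {label: idx for idx, label in enumerate(sorted(set(labels)))}


# Made-up label columns for two splits, deliberately in different row orders.
train_labels = ["TIGER", "CARACAL", "TIGER", "AFRICAN LEOPARD"]
test_labels = ["AFRICAN LEOPARD", "TIGER", "CARACAL"]

train_index = build_label_index(train_labels)
test_index = build_label_index(test_labels)

print(train_index)                # {'AFRICAN LEOPARD': 0, 'CARACAL': 1, 'TIGER': 2}
print(train_index == test_index)  # True
```

With a pandas column, the same idea would be `build_label_index(data_frame['labels'])`. Whether the existing order-of-appearance mapping already agrees across splits depends on how the CSV is ordered, which this diff does not show.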
3 changes: 3 additions & 0 deletions bigcat_model_state.pth
Git LFS file not shown
39 changes: 24 additions & 15 deletions main.py
@@ -8,8 +8,8 @@
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms

##IMPORTANT TASK: CREATING CUSTOM PYTORCH DATASET FOR BIG CAT CLASSIF.

##KEY DETAIL: The wildcat custom dataset
class WildcatDataset(Dataset):
def __init__(self, csv_file, root_dir, dataset_type, transform=None):
"""
@@ -20,32 +20,37 @@ def __init__(self, csv_file, root_dir, dataset_type, transform=None):
transform (callable, optional): Optional transform to be applied on a sample.
"""
self.data_frame = pd.read_csv(csv_file) #gets csv and reads it
self.root_dir = root_dir
self.transform = transform
self.root_dir = root_dir #Stores root directory
self.transform = transform #Stores the optional transform (e.g. resizing and tensor conversion) applied to each image
self.dataset_type = dataset_type # 'train', 'valid', or 'test'

# Filter the data based on the 'dataset' column
self.data_frame = self.data_frame[self.data_frame['data set'] == self.dataset_type]
self.data_frame = self.data_frame[self.data_frame['data set'] == self.dataset_type].copy() #Keeps only the rows for this split, e.g. when training, only rows whose 'data set' column is "train". The .copy() avoids pandas' SettingWithCopyWarning when the numeric_label column is added below

#Converts the labels that determine what something is (AFRICAN LEOPARD, etc) into a unique number to perform
# Create a mapping from labels to numeric class indices
self.label_to_idx = {label: idx for idx, label in enumerate(self.data_frame['labels'].unique())}
#Converts the labels that determine what something is (AFRICAN LEOPARD, etc) into unique numbers, which supervised learning with cross entropy loss requires
self.label_to_idx = {label: idx for idx, label in enumerate(self.data_frame['labels'].unique())}

# Add a new column with numeric labels
# Add a new column "numeric_label" to store the newly created numeric representation of each label (AFRICAN LEOPARD --> 0, CARACAL --> 1, ...)
self.data_frame['numeric_label'] = self.data_frame['labels'].map(self.label_to_idx)

def __len__(self):
return len(self.data_frame) #returns the number of rows in the csv
return len(self.data_frame) #returns the number of rows left after filtering; DataLoader relies on this to know the dataset size

def __getitem__(self, idx):
img_name = os.path.join(self.root_dir, self.data_frame.iloc[idx, 1]) # grabs the filepath column from the csv
image = Image.open(img_name).convert('RGB') # Load image and converts into RGB values from grabbed filepath
label = torch.tensor(self.data_frame.iloc[idx]['numeric_label'], dtype=torch.long) # gets the numeric version of the label from the label column of the csv

#For debugging and measuring progress in an epoch in a human readable way
length = self.__len__()
if (idx+1) % 16 == 0:
print(f"Batch: {int((idx+1)/16)} / {length // 16}") #Displays progress in number of batches per epoch completed

if self.transform:
#Apply the transform (if one was given) so the image has the size and format the model expects
if self.transform:
image = self.transform(image)

return image, label
return image, label #Returns the (transformed) image and its numeric label


class BigCatModel(nn.Module):
@@ -100,9 +105,9 @@ def forward(self, x):
valid_dataset = WildcatDataset(csv_file=r'cats-in-the-wild-image-classification\versions\1\WILDCATS.csv', root_dir=r'cats-in-the-wild-image-classification\versions\1', dataset_type='valid', transform=transform)
test_dataset = WildcatDataset(csv_file=r'cats-in-the-wild-image-classification\versions\1\WILDCATS.csv', root_dir=r'cats-in-the-wild-image-classification\versions\1', dataset_type='test', transform=transform)

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
valid_loader = DataLoader(valid_dataset, batch_size=32, shuffle=False)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)
train_loader = DataLoader(train_dataset, batch_size=16, shuffle=False)
valid_loader = DataLoader(valid_dataset, batch_size=16, shuffle=False)
test_loader = DataLoader(test_dataset, batch_size=16, shuffle=False)

#Creates the model
model = BigCatModel()
@@ -121,4 +126,8 @@ def forward(self, x):
loss.backward()
optimizer.step()
print(f"Epoch {epoch+1}/{epochs}, Loss: {loss.item()}")
torch.save(model, 'bigcat_model.pth')
print("Training Complete!")

# Save the state dictionary
torch.save(model.state_dict(), 'bigcat_model_state.pth')
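
The progress message in `__getitem__` reports the batch total as `length // 16`. Assuming the loaders keep the final partial batch (the `DataLoader` default, `drop_last=False`), floor division undercounts by one whenever the dataset size is not a multiple of the batch size. A small sketch of the arithmetic, where `batches_per_epoch` is a hypothetical helper and the sizes are made up:

```python
import math


def batches_per_epoch(num_samples, batch_size):
    """Number of batches yielded when the last partial batch is kept."""
    return math.ceil(num_samples / batch_size)


# Hypothetical sizes chosen only to show the difference.
print(batches_per_epoch(50, 16))  # 4: three full batches plus one of 2 samples
print(50 // 16)                   # 3: floor division drops the partial batch
```

In the training script itself, `len(train_loader)` returns the same count without any manual arithmetic.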

94 changes: 94 additions & 0 deletions tester.py
@@ -0,0 +1,94 @@
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import os
import pandas as pd
from PIL import Image
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms

import matplotlib.pyplot as plt
import matplotlib.image as mpimg



class BigCatModel(nn.Module):
    """
    CNN image classifier for big cats (subclasses nn.Module).
    """
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=(3, 3), stride=1, padding=1)
        self.act1 = nn.ReLU()
        self.drop1 = nn.Dropout(0.3)

        self.conv2 = nn.Conv2d(32, 64, kernel_size=(3, 3), stride=1, padding=1)
        self.act2 = nn.ReLU()
        self.pool2 = nn.MaxPool2d(kernel_size=(2, 2))

        self.conv3 = nn.Conv2d(64, 128, kernel_size=(3, 3), stride=1, padding=1)
        self.act3 = nn.ReLU()
        self.pool3 = nn.MaxPool2d(kernel_size=(2, 2))

        self.flat = nn.Flatten()

        self.fc1 = nn.Linear(128 * 56 * 56, 512)  # 128 channels x 56 x 56 for a 224x224 input; adjust for other input sizes
        self.act4 = nn.ReLU()
        self.drop4 = nn.Dropout(0.5)

        self.fc2 = nn.Linear(512, 10)  # 10 output classes, one per big cat label

    def forward(self, x):
        x = self.act1(self.conv1(x))
        x = self.drop1(x)
        x = self.act2(self.conv2(x))
        x = self.pool2(x)
        x = self.act3(self.conv3(x))
        x = self.pool3(x)
        x = self.flat(x)
        x = self.act4(self.fc1(x))
        x = self.drop4(x)
        x = self.fc2(x)
        return x
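
The `128 * 56 * 56` input size of `fc1` follows from the layer settings above. The sketch below walks through that arithmetic, assuming a 224x224 input (the size used by `Resize` in this script) and the standard output-size formula for convolution and pooling layers with dilation 1. `conv_out` is a hypothetical helper, not a PyTorch function.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution or pooling layer (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


side = 224                                  # input side length after Resize((224, 224))
side = conv_out(side, kernel=3, padding=1)  # conv1 keeps 224
side = conv_out(side, kernel=3, padding=1)  # conv2 keeps 224
side = conv_out(side, kernel=2, stride=2)   # pool2 halves to 112
side = conv_out(side, kernel=3, padding=1)  # conv3 keeps 112
side = conv_out(side, kernel=2, stride=2)   # pool3 halves to 56

flat_features = 128 * side * side           # 128 channels leave conv3
print(side, flat_features)                  # 56 401408
```

If the input size or the pooling layers change, rerunning this calculation gives the new first argument for `fc1`.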

BigCatLabel = {
    0 : "African Leopard",
    1 : "Caracal",
    2 : "Cheetah",
    3 : "Clouded Leopard",
    4 : "Jaguar",
    5 : "Lions",
    6 : "Ocelot",
    7 : "Puma",
    8 : "Snow Leopard",
    9 : "Tiger"
}
path = r'cats-in-the-wild-image-classification\versions\1\test\SNOW LEOPARD\2.jpg'
# Recreate the model
model = BigCatModel()

# Load the state dictionary
model.load_state_dict(torch.load('bigcat_model_state.pth'))

# Set to evaluation mode
model.eval()

# Display the test image
img = mpimg.imread(path)
plt.imshow(img)
plt.show()

# Clean the input
# Define the transformations (resize, normalisation) in case an image is not in the expected format
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])  # ImageNet channel statistics, as commonly used with pre-trained models
])
image = Image.open(path).convert('RGB')
test_input = transform(image).unsqueeze(0)  # Add batch dimension: [1, 3, 224, 224]
with torch.no_grad():  # Gradients are not needed for inference
    output = model(test_input)
predicted_class = torch.argmax(output, dim=1).item()
print(f"Predicted class: {BigCatLabel[predicted_class]}")
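
The script above reports only the highest-scoring class. If a rough confidence value is also wanted, the raw scores (logits) can be passed through softmax. The sketch below shows the calculation in plain Python; the logits are made-up numbers rather than output from the trained model, and `softmax` here is a hand-written helper for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three classes, not output from the real model.
logits = [2.0, 0.5, -1.0]
probs = softmax(logits)
best = max(range(len(probs)), key=probs.__getitem__)  # index of the largest probability
print(f"class {best} with probability {probs[best]:.3f}")
```

In the tester script, `torch.softmax(output, dim=1)` would perform the same calculation on the model's output tensor.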