6006CEM Machine Learning and Related Applications (2223SEPJAN)
This repository contains the implementation and demonstration files for 6006CEM, delivered in semester 1, which starts in September 2022 and is to be completed by December 2022.
Chosen machine learning dataset: Rain In Australia - https://www.kaggle.com/datasets/jsphyg/weather-dataset-rattle-package
Directory Structure and Instructions
This repository contains the following files and folders:
- data/: This directory contains the dataset weatherAUS.csv.
- tuning/: A folder containing tuning logs for the XGB and SVM models.
- figs/: A directory which holds some correlation figures (only the ones in the report).
- code.ipynb: The main notebook used to implement the machine learning algorithms.
- code_no_markdown.py: A Python script containing all the notebook code, with the markdown cells commented out. Note: this script has never been run. It was created only to be pasted into the appendix section of the report.
- requirements.txt: A list of dependencies for this project, to be installed via pip.
- module_check.ipynb: An interactive Python notebook used to check whether the required modules are installed.
To install all the requirements automatically, run the command pip install -r requirements.txt. You can use the supplied module_check.ipynb notebook to verify that the requirements have been met.
Python and Module Versions
This code has been tested with the following modules and versions:
- Python version: 3.10.8 | packaged by conda-forge | (main, Nov 24 2022, 14:07:00) [MSC v.1916 64 bit (AMD64)]
- pandas version: 1.5.2
- matplotlib version: 3.6.2
- NumPy version: 1.23.5
- SciPy version: 1.9.3
- IPython version: 8.7.0 (part of jupyter notebook / jupyterlab)
- scikit-learn version: 1.1.3
- seaborn version: 0.12.1
- XGBoost version: 1.7.1
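As a rough illustration of how the versions listed above could be compared against a local environment, the following is a minimal sketch. It is not the content of module_check.ipynb. The names TESTED_VERSIONS and check_versions are hypothetical, and the dictionary keys are assumed to be the PyPI distribution names of the listed modules.

```python
from importlib import metadata

# Versions copied from the list above.
# Keys are assumed to be the PyPI distribution names (hypothetical mapping).
TESTED_VERSIONS = {
    "pandas": "1.5.2",
    "matplotlib": "3.6.2",
    "numpy": "1.23.5",
    "scipy": "1.9.3",
    "ipython": "8.7.0",
    "scikit-learn": "1.1.3",
    "seaborn": "0.12.1",
    "xgboost": "1.7.1",
}


def check_versions(expected):
    """Return {name: (installed_version_or_None, expected_version)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            installed = None
        report[name] = (installed, wanted)
    return report


if __name__ == "__main__":
    for name, (installed, wanted) in check_versions(TESTED_VERSIONS).items():
        if installed is None:
            status = "MISSING"
        elif installed == wanted:
            status = "ok"
        else:
            status = "differs"
        print(f"{name:<14} installed={installed} tested={wanted} [{status}]")
```

A version marked "differs" does not necessarily mean the notebook will fail. It only indicates that the installed version is not the one the code was tested with.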