Heart-Disease-Classification

Project Overview

This project uses machine learning to predict cardiovascular disease from key risk factors. Cardiovascular disease is a leading global cause of preventable death, causing significant suffering and straining healthcare systems. It claims about 17.7 million lives annually, or 44% of deaths from non-communicable diseases.

The risk factors considered include blood pressure, obesity, age, gender, diet, exercise, smoking, insurance, mental and physical health, alcohol use, sleep, and health check-ups. Our aim is to apply machine learning techniques for accurate disease prediction, contributing to research and prevention efforts. The project's key components and features are outlined below:

Data Preprocessing

Before diving into machine learning models, it's essential to preprocess the data to ensure its quality and suitability for analysis. The following steps have been taken:

Resampling Techniques:
- Repeated Edited Nearest Neighbours (Undersampling)
- Random Over Sampler (Oversampling)
Standardization
Feature Selection:
- Recursive Feature Elimination

Machine Learning Models

Here are the models implemented:

Decision Tree Classifier
Logistic Regression
Support Vector Machine
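
As a rough sketch of how these three models might be trained side by side, the example below fits each one with scikit-learn on synthetic data. The dataset, the train/test split, and the mostly default hyperparameters are assumptions for illustration and are not taken from the project.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Synthetic binary-classification data standing in for the preprocessed features.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# One estimator per model listed above (hyperparameters are illustrative).
models = {
    "Decision Tree Classifier": DecisionTreeClassifier(random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Support Vector Machine": SVC(),
}

# Fit each model on the training split and predict on the held-out split.
predictions = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions[name] = model.predict(X_test)
    print(name, "predicted positives:", int(predictions[name].sum()))
```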

Model Evaluation

To assess the performance of the machine learning models, the following evaluation metrics have been utilized:

Precision
Recall
F1-Score
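
To make the three metrics concrete, here is a small worked example using scikit-learn's metric functions on hand-made labels (the labels are invented for illustration). Precision is TP / (TP + FP), recall is TP / (TP + FN), and the F1-score is the harmonic mean of the two.

```python
from sklearn.metrics import precision_score, recall_score, f1_score

# Toy labels: 4 actual positives, 6 actual negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
# The model finds 2 of the 4 positives and raises 1 false alarm.
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

# Counts by hand: TP = 2, FP = 1, FN = 2.
precision = precision_score(y_true, y_pred)  # 2 / (2 + 1) = 2/3
recall = recall_score(y_true, y_pred)        # 2 / (2 + 2) = 1/2
f1 = f1_score(y_true, y_pred)                # 2*TP / (2*TP + FP + FN) = 4/7

print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

On imbalanced medical data these metrics are usually more informative than plain accuracy, because a model that always predicts "no disease" can score high accuracy while having zero recall.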

Hyperparameter Tuning

Fine-tuning the model hyperparameters is crucial for achieving optimal performance. The following search method has been used:

Grid Search
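
A grid search of this kind could look like the sketch below, which uses scikit-learn's `GridSearchCV`. The choice of estimator, the parameter grid, the F1 scoring, and the 5-fold cross-validation are all assumptions for illustration; the project's actual grids are not documented here.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic data standing in for the preprocessed training set.
X, y = make_classification(n_samples=600, n_features=10, random_state=0)

# Hypothetical grid: every combination of these values is evaluated.
param_grid = {
    "max_depth": [3, 5, None],
    "min_samples_leaf": [1, 5, 20],
}

# 5-fold cross-validation, ranking candidates by F1-score.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    scoring="f1",
    cv=5,
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best cross-validated F1:", round(search.best_score_, 3))
```

Grid search is exhaustive, so its cost grows with the product of the grid sizes (here 3 x 3 = 9 candidates, each fitted 5 times).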