# 6006CEM_Machine-learning_Coursework
A stroke occurs when a blood vessel in the brain bursts or its blood supply is blocked, preventing blood and oxygen from reaching brain tissue. It is the second leading cause of death worldwide and places a significant burden on individuals and national health care systems. Because stroke has a wide range of underlying causes, the main goal of this project is to apply machine learning techniques to an existing dataset to predict stroke from a range of relevant features and to make some recommendations. The dataset selected for this project is the "Stroke Prediction Dataset" from Kaggle.

The dataset has 12 columns:
1. id: unique identifier
2. gender: gender of the patient
3. age: age of the patient
4. hypertension: 1 if the patient has hypertension, 0 otherwise
5. heart_disease: 1 if the patient has heart disease, 0 otherwise
6. ever_married: "Yes" or "No"
7. work_type: "children", "Govt_job", "Never_worked", "Private" or "Self-employed"
8. Residence_type: "Rural" or "Urban"
9. avg_glucose_level: average glucose level in the blood
10. bmi: body mass index
11. smoking_status: "formerly smoked", "never smoked", "smokes" or "Unknown"
12. stroke: 1 if the patient had a stroke, 0 otherwise

Dataset URL: https://www.kaggle.com/datasets/fedesoriano/stroke-prediction-dataset
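
Before any modelling, it can help to confirm that a downloaded copy of the dataset really has the 12 columns described above. The following is a minimal sketch using only the Python standard library. The helper name `check_header` is hypothetical and not part of this project's code, and the demo reads a header-only in-memory sample rather than the real file.

```python
import csv
import io

# The 12 columns described above, in the listed order.
EXPECTED_COLUMNS = [
    "id", "gender", "age", "hypertension", "heart_disease", "ever_married",
    "work_type", "Residence_type", "avg_glucose_level", "bmi",
    "smoking_status", "stroke",
]


def check_header(csv_file):
    """Return (missing, unexpected) column names for an open CSV file.

    Hypothetical helper for illustration: it reads only the first row.
    """
    header = next(csv.reader(csv_file))
    missing = [c for c in EXPECTED_COLUMNS if c not in header]
    unexpected = [c for c in header if c not in EXPECTED_COLUMNS]
    return missing, unexpected


# Demo on an in-memory, header-only sample (no real patient data).
sample = io.StringIO(",".join(EXPECTED_COLUMNS) + "\n")
missing, unexpected = check_header(sample)
print("missing:", missing, "unexpected:", unexpected)
```

With the real CSV, the same helper could be called on a file opened with `open(path, newline="")`, where `path` is wherever the Kaggle download was saved.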
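
This project requires the following libraries: pandas, numpy, matplotlib, seaborn, sklearn (installed as scikit-learn), imblearn (installed as imbalanced-learn) and xgboost. It also uses warnings, which is part of the Python standard library and needs no installation. The third-party libraries must be installed before the program is run.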
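
One way to confirm that the environment is ready is to check that each library can be located before running the program. This is a rough sketch, not part of the project's code: the helper name `find_missing` is hypothetical, and the list uses import names (for example `sklearn`), which may differ from the package names used at install time.

```python
import importlib.util

# Import names of the third-party libraries listed above.
# `warnings` is left out because it ships with Python itself.
REQUIRED_MODULES = [
    "pandas", "numpy", "matplotlib", "seaborn",
    "sklearn", "imblearn", "xgboost",
]


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing_modules = find_missing(REQUIRED_MODULES)
if missing_modules:
    print("Missing libraries:", ", ".join(missing_modules))
else:
    print("All required libraries are installed.")
```

The check only locates the modules without importing them, so it stays fast even for heavy libraries.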