From 43e502caab44e02bd17657c6cae8f7aad38368f4 Mon Sep 17 00:00:00 2001
From: "Lucas Lopes Oliveira (lopesoll)"
Date: Sat, 9 Sep 2023 14:32:25 +0100
Subject: [PATCH] Update README.md

---
 README.md | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 573d0e2..bd4ccaa 100644
--- a/README.md
+++ b/README.md
@@ -10,7 +10,7 @@ The Bayesian networks in our analysis incorporate several key risk factors as no
 
 ### Data Preprocessing
 
-Before diving into machine learning models, it's essential to preprocess the data to ensure its quality and suitability for analysis. The following steps have been taken:
+Before diving into the predictions, it's essential to preprocess the data to ensure its quality and suitability for analysis. The following steps have been taken:
 
 1. **Resampling Techniques:**
    - Random Under Sampler (Undersampling)
@@ -40,7 +40,7 @@ Here are the models implemented:
 
 ### Model Evaluation
 
-To assess the performance of the machine learning models, the following evaluation metrics have been utilized:
+To assess the performance of the models, the following evaluation metrics have been utilized:
 
 1. **Precision**
 
@@ -50,13 +50,35 @@ To assess the performance of the machine learning models, the following evaluati
 
 4. **Specificity**
 
+
 ## Files
 
 1. Read_data.R
+   - Read original data
+   - Outlier detection
+   - Discretization
+   - Write to CSV for use in Python
 2. Preprocessing.R
+   - Check for null values
+   - Plot the correlation of each feature with the target variable
 3. Resampling_and_RFE.ipynb
+   - Resampling
+   - RFECV (recursive feature elimination with cross-validation)
+   - Write to CSV for use in R
 4. read_new_data.R
+   - Read resampled data
+   - Drop discarded variables
 5. PC_and_HC_for_BN.R
+   - PC algorithm to create the DAG
+   - Hill-climbing algorithm to create the DAG
 6. BN_final.R
+   - Create two manual DAGs (DAGs 3 and 4 in the report)
+   - Plot the DAGs
+   - Fit the models on the training data
+   - Predict on the testing set
+   - Evaluate the predictions
 7. RandomForest.R
+   - Fit the Random Forest model on the training set
+   - Predict on the test set
+   - Evaluate the results
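
Reviewer note on the "Evaluate the predictions" and "Evaluate the results" steps: the scripts themselves are not part of this patch, so the following is only a minimal sketch of how two of the metrics named in the README (precision and specificity) could be computed from binary predictions. The function names, the sample labels, and the choice of Python rather than R are assumptions made for illustration; the other two metrics are not visible in this diff and are left out.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if truth == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return tp, fp, tn, fn


def precision(tp, fp):
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def specificity(tn, fp):
    """Share of actual negatives that are predicted negative."""
    return tn / (tn + fp) if (tn + fp) else 0.0


# Hypothetical labels, purely for illustration.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print("precision:", precision(tp, fp))
print("specificity:", specificity(tn, fp))
```

Returning 0.0 when a denominator is zero is a design choice of this sketch; the project's scripts may handle that case differently.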