WQD7001 PRINCIPLES OF DATA SCIENCE Prediction of stroke risk

    Ho Wei Yan (S2116489)

    Charroogesinee (17060405)

    Choong Che Wei (S2106183)

    Hii Yew Han (S2037987)

    Tan Shi Ling (S2115562)

Introduction

Problem Statement:

According to the World Stroke Organization, 1 in 4 adults over the age of 25 will have a stroke in their lifetime, and 5.5 million people die from stroke each year. Without appropriate action, annual deaths are estimated to rise to 6.7 million. Most stroke risk factors are lifestyle related, including high blood pressure, smoking, diabetes, high blood cholesterol, heavy drinking, a diet high in salt and fat, and lack of exercise. Someone who has already experienced a stroke is at increased risk of having another.

Beneficiary of project:

Patients with existing health conditions, and family members who are concerned about them.

Objective:

This data product aims to provide an early indicator that helps people, especially those with pre-existing health conditions, assess their risk of stroke.

Research Questions:

(1) What are the attributes related to stroke?

(2) What is the correlation between pre-existing health conditions and stroke risks?

(3) How can stroke risk be predicted?

Data Science Process

1. Asking the right question

The primary goal was determined by asking questions such as “What are the common health risks and diseases among adults?” The idea was to predict the risk of developing a stroke based on certain criteria.

2. Finding and collecting data

The healthcare dataset by the World Health Organization was obtained from Kaggle. It is generic rather than tied to one population, so it does not restrict users to a specific region. https://www.kaggle.com/fedesoriano/stroke-prediction-dataset

3. Data preprocessing

Mean imputation is applied to the missing values in the dataset. The attributes in the data are identified and classified accordingly.
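The preprocessing step can be illustrated with a minimal sketch. It assumes the missing values sit in a numeric field and that attributes are classified as numeric or categorical; the records, the field names (age, bmi, smoking_status, stroke) and the helper functions impute_mean and classify_attributes are hypothetical and not taken from the project code. Only the Python standard library is used.

```python
from statistics import mean

# Hypothetical toy records; field names and values are assumptions for illustration.
records = [
    {"age": 67, "bmi": 36.6, "smoking_status": "formerly smoked", "stroke": 1},
    {"age": 61, "bmi": None, "smoking_status": "never smoked", "stroke": 1},
    {"age": 80, "bmi": 32.5, "smoking_status": "never smoked", "stroke": 1},
    {"age": 49, "bmi": None, "smoking_status": "smokes", "stroke": 0},
]


def impute_mean(rows, field):
    """Return copies of rows with missing `field` values replaced by the observed mean."""
    observed = [r[field] for r in rows if r[field] is not None]
    fill = mean(observed)  # the mean is computed from non-missing values only
    return [{**r, field: fill if r[field] is None else r[field]} for r in rows]


def classify_attributes(rows):
    """Split field names into numeric and categorical by inspecting their values."""
    numeric, categorical = [], []
    for name in rows[0]:
        values = [r[name] for r in rows if r[name] is not None]
        if all(isinstance(v, (int, float)) for v in values):
            numeric.append(name)
        else:
            categorical.append(name)
    return numeric, categorical


clean = impute_mean(records, "bmi")
numeric_cols, categorical_cols = classify_attributes(clean)
```

Mean imputation keeps every record and leaves the field's average unchanged, but it reduces the field's variance, so it is best suited to fields with few missing values.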

4. Data analysis

Exploratory data analysis and predictive analysis.
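As a rough illustration of the two kinds of analysis, the sketch below compares stroke rates between groups (exploratory) and then fits a small logistic regression by gradient descent (predictive). The report does not name the model that was used, so logistic regression is an assumption here, as are the toy data, the choice of age and hypertension as predictors, and every function name.

```python
import math

# Hypothetical toy data: (age, hypertension flag, stroke label). Not from the dataset.
rows = [
    (25, 0, 0), (34, 0, 0), (41, 0, 0), (48, 1, 0),
    (52, 0, 0), (59, 1, 1), (63, 0, 0), (68, 1, 1),
    (74, 1, 1), (81, 0, 1),
]


def stroke_rate_by(data, flag_index):
    """Exploratory step: share of stroke cases within each group of a binary flag."""
    totals, cases = {}, {}
    for row in data:
        key = row[flag_index]
        totals[key] = totals.get(key, 0) + 1
        cases[key] = cases.get(key, 0) + row[-1]
    return {k: cases[k] / totals[k] for k in totals}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_risk(weights, bias, x):
    """Predicted probability of stroke for one feature vector."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def fit_logistic(features, labels, lr=0.1, epochs=2000):
    """Predictive step: logistic regression fitted by batch gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = predict_risk(weights, bias, x) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


rates = stroke_rate_by(rows, 1)  # stroke share without / with hypertension

# Age is divided by 100 so both features are on a similar scale.
features = [(age / 100.0, htn) for age, htn, _ in rows]
labels = [stroke for _, _, stroke in rows]
weights, bias = fit_logistic(features, labels)

low = predict_risk(weights, bias, (0.30, 0))   # 30 years old, no hypertension
high = predict_risk(weights, bias, (0.75, 1))  # 75 years old, with hypertension
```

The group comparison speaks to the question about pre-existing conditions, while the fitted model turns the same attributes into a single risk estimate between 0 and 1.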

References

https://www.kaggle.com/mirichoi0218/insurance

https://www.kaggle.com/yassinehamdaoui1/cardiovascular-disease

Shiny Apps :

Github :