Filtering and selecting data
For this study, our clean data should contain only patients aged 60
and over, have no missing values, and be sorted by age in ascending
order.
## Selecting the relevant columns and keeping patients aged 60 and over.
data_clean <- data %>%
  select(ID, M.F, Age, Educ, SES, MMSE, CDR) %>%
  filter(Age >= 60) %>%
  ## Removing rows with a missing value (NA) in any column, then sorting by age.
  ## na.omit() ignores extra arguments, so naming SES here would not restrict it to that column.
  na.omit() %>%
  arrange(Age)
## Taking a look at our cleaned data.
glimpse(data_clean)
## Rows: 180
## Columns: 7
## $ ID <chr> "OAS1_0072_MR1", "OAS1_0200_MR1", "OAS1_0109_MR1", "OAS1_0455_MR1…
## $ M.F <chr> "F", "F", "F", "F", "M", "M", "F", "F", "F", "M", "M", "F", "F", …
## $ Age <int> 60, 60, 61, 61, 61, 62, 62, 63, 64, 64, 64, 64, 65, 65, 65, 65, 6…
## $ Educ <int> 5, 2, 4, 2, 5, 2, 3, 3, 3, 2, 5, 4, 2, 5, 3, 3, 1, 2, 2, 5, 3, 2,…
## $ SES <int> 1, 4, 3, 4, 2, 4, 3, 2, 2, 4, 2, 2, 3, 2, 4, 3, 4, 3, 4, 2, 4, 4,…
## $ MMSE <int> 30, 30, 30, 28, 30, 30, 26, 30, 30, 29, 22, 30, 29, 30, 29, 29, 2…
## $ CDR <dbl> 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, …
str(data_clean)
## 'data.frame': 180 obs. of 7 variables:
## $ ID : chr "OAS1_0072_MR1" "OAS1_0200_MR1" "OAS1_0109_MR1" "OAS1_0455_MR1" ...
## $ M.F : chr "F" "F" "F" "F" ...
## $ Age : int 60 60 61 61 61 62 62 63 64 64 ...
## $ Educ: int 5 2 4 2 5 2 3 3 3 2 ...
## $ SES : int 1 4 3 4 2 4 3 2 2 4 ...
## $ MMSE: int 30 30 30 28 30 30 26 30 30 29 ...
## $ CDR : num 0 0 0 0 0 0 0 0 0 0 ...
## - attr(*, "na.action")= 'omit' Named int [1:18] 5 18 20 24 38 41 52 63 69 75 ...
## ..- attr(*, "names")= chr [1:18] "5" "18" "20" "24" ...
- The code above keeps only patients aged 60 and over, which is the
population our case study focuses on. After cleaning, 180 of the 216
observations remain; na.omit() accounts for 18 of the dropped rows, as
the na.action attribute in the str() output shows. The cleaned
observations will be used in the analysis that follows.
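To see how many rows each cleaning step removes, the pipeline can be split and the row counts compared. The following is a minimal sketch, not the code used in this study: it assumes dplyr is installed, and the `toy` data frame and its values are hypothetical stand-ins for `data`.

```r
library(dplyr)

# Hypothetical toy data standing in for `data`; the values are made up.
toy <- data.frame(
  ID   = c("A", "B", "C", "D", "E"),
  Age  = c(58L, 60L, 72L, 65L, 81L),
  SES  = c(2L, NA, 3L, 1L, 4L),
  MMSE = c(30L, 29L, NA, 27L, 22L)
)

# Step 1: keep patients aged 60 and over.
toy_60 <- toy %>% filter(Age >= 60)

# Missing values per column before na.omit(), to see which columns drive the loss.
na_per_column <- colSums(is.na(toy_60))
print(na_per_column)

# Step 2: drop rows with an NA in any column, then sort by age.
toy_clean <- toy_60 %>%
  na.omit() %>%
  arrange(Age)

# Row counts after each step.
step_counts <- c(
  original   = nrow(toy),
  age_filter = nrow(toy_60),
  no_missing = nrow(toy_clean)
)
print(step_counts)
```

Running the same checks on `data` would show how the drop from 216 to 180 rows splits between the age filter and the removal of missing values.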