Analysis Of Patient Data

Objective

The objective of this exercise is to learn R Markdown (RMD) by analysing a patient data set: the data is read into an RMD file, cleaned in R, summarised, and finally published.

Problem Definition

1. Clean the data set so that it can be read and analysed reliably.
2. Summarise and report the data.

Code & Output

knitr Global Options

# for development
knitr::opts_chunk$set(echo=TRUE, eval=TRUE, error=TRUE, warning=TRUE, message=TRUE, cache=FALSE, tidy=FALSE, fig.path='figures/')
# for production
#knitr::opts_chunk$set(echo=TRUE, eval=TRUE, error=FALSE, warning=FALSE, message=FALSE, cache=FALSE, tidy=FALSE, fig.path='figures/')  

Load Libraries

library(dplyr)
## Warning: package 'dplyr' was built under R version 3.3.3
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

Read Data

# set the working directory and read the patient file, keeping strings as character
setwd("E:/R-BA/scripts/data")
dfPatient <- read.csv("./patient-data.csv", header=TRUE, stringsAsFactors=FALSE)
intRowCount <- nrow(dfPatient)
head(dfPatient)
##          ID      Name  Race Gender Smokes HeightInCms WeightInKgs
## 1 AC/AH/001 Demetrius White   Male  False      182.87       76.57
## 2 AC/AH/017   Rosario White   Male  False      179.12       80.43
## 3 AC/AH/020     Julio Black   Male  False      169.15       75.48
## 4 AC/AH/022      Lupe White   Male  False      175.66       94.54
## 5 AC/AH/029    Lavern White Female  False      164.47       71.78
## 6 AC/AH/033    Bernie   Dog Female   True      158.27       69.90
##    BirthDate        State  Pet HealthGrade  Died RecordDate
## 1 31-01-1972  Georgia,xxx  Dog           2 False 25-11-2015
## 2 09-06-1972     Missouri  Dog           2 False 25-11-2015
## 3 03-07-1972 Pennsylvania None           2 False 25-11-2015
## 4 11-08-1972      Florida  Cat           1 False 25-11-2015
## 5 06-06-1973         Iowa NULL           2  True 25-11-2015
## 6 25-06-1973     Maryland  Dog           2 False 25-11-2015

Total Rows Of Patient File: 100
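
The raw file marks missing values in more than one way (for example NULL in the Pet column). As a minimal sketch, and assuming it is acceptable to treat such markers as missing at load time, read.csv() accepts an na.strings argument. The inline sample text and the name dfSample are hypothetical stand-ins, not the real file.

```r
# Hypothetical three-row sample standing in for patient-data.csv
csvText <- "ID,Name,Pet
AC/AH/001,Demetrius,Dog
AC/AH/029,Lavern,NULL
AC/AH/020,Julio,None"

# na.strings lists the markers that should be read as NA
dfSample <- read.csv(text = csvText, header = TRUE,
                     stringsAsFactors = FALSE,
                     na.strings = c("NA", "NULL"))

# Only the NULL marker is converted; "None" is kept as text
is.na(dfSample$Pet)
```

Markers that are not listed in na.strings, such as None here, still need to be handled after loading.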

Add column BMI-Value

# BMI = weight in kg / (height in m)^2; height is stored in cm, so divide by 100
dfPatient <- mutate(dfPatient, BMIValue=(WeightInKgs/(HeightInCms/100)^2))
head(dfPatient)
##          ID      Name  Race Gender Smokes HeightInCms WeightInKgs
## 1 AC/AH/001 Demetrius White   Male  False      182.87       76.57
## 2 AC/AH/017   Rosario White   Male  False      179.12       80.43
## 3 AC/AH/020     Julio Black   Male  False      169.15       75.48
## 4 AC/AH/022      Lupe White   Male  False      175.66       94.54
## 5 AC/AH/029    Lavern White Female  False      164.47       71.78
## 6 AC/AH/033    Bernie   Dog Female   True      158.27       69.90
##    BirthDate        State  Pet HealthGrade  Died RecordDate BMIValue
## 1 31-01-1972  Georgia,xxx  Dog           2 False 25-11-2015 22.89674
## 2 09-06-1972     Missouri  Dog           2 False 25-11-2015 25.06859
## 3 03-07-1972 Pennsylvania None           2 False 25-11-2015 26.38080
## 4 11-08-1972      Florida  Cat           1 False 25-11-2015 30.63867
## 5 06-06-1973         Iowa NULL           2  True 25-11-2015 26.53567
## 6 25-06-1973     Maryland  Dog           2 False 25-11-2015 27.90487
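
As a quick check of the formula, the calculation can be repeated by hand for the first patient (height 182.87 cm, weight 76.57 kg). The variable names below are illustrative only.

```r
# Values taken from row 1 of the output above
weightKg <- 76.57
heightCm <- 182.87

# Convert height to metres, then apply weight / height^2
bmi <- weightKg / (heightCm / 100)^2
round(bmi, 5)
```

The result agrees with the BMIValue of 22.89674 shown for that row.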

Add column BMI-Label

# label each BMI value; lower bounds are inclusive so 18.50, 25.00 and 30.00 do not fall through to NA
dfPatient <- mutate(dfPatient, BMILabel=NA)
dfPatient$BMILabel <- ifelse(dfPatient$BMIValue < 18.50,"UNDERWEIGHT",
                       ifelse(dfPatient$BMIValue >= 18.50 & dfPatient$BMIValue < 25.00, "NORMAL",
                       ifelse(dfPatient$BMIValue >= 25.00 & dfPatient$BMIValue < 30.00, "OVERWEIGHT",
                       ifelse(dfPatient$BMIValue >= 30.00,"OBESE", NA))))
head(dfPatient)
##          ID      Name  Race Gender Smokes HeightInCms WeightInKgs
## 1 AC/AH/001 Demetrius White   Male  False      182.87       76.57
## 2 AC/AH/017   Rosario White   Male  False      179.12       80.43
## 3 AC/AH/020     Julio Black   Male  False      169.15       75.48
## 4 AC/AH/022      Lupe White   Male  False      175.66       94.54
## 5 AC/AH/029    Lavern White Female  False      164.47       71.78
## 6 AC/AH/033    Bernie   Dog Female   True      158.27       69.90
##    BirthDate        State  Pet HealthGrade  Died RecordDate BMIValue
## 1 31-01-1972  Georgia,xxx  Dog           2 False 25-11-2015 22.89674
## 2 09-06-1972     Missouri  Dog           2 False 25-11-2015 25.06859
## 3 03-07-1972 Pennsylvania None           2 False 25-11-2015 26.38080
## 4 11-08-1972      Florida  Cat           1 False 25-11-2015 30.63867
## 5 06-06-1973         Iowa NULL           2  True 25-11-2015 26.53567
## 6 25-06-1973     Maryland  Dog           2 False 25-11-2015 27.90487
##     BMILabel
## 1     NORMAL
## 2 OVERWEIGHT
## 3 OVERWEIGHT
## 4      OBESE
## 5 OVERWEIGHT
## 6 OVERWEIGHT
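
Nested ifelse() calls make it easy to leave a boundary value such as exactly 25.00 uncovered. One alternative, sketched here on made-up BMI values rather than the patient data, is base R's cut(), which assigns every value to exactly one interval. The break points are the same thresholds used above. Treating each lower bound as inclusive is an assumption.

```r
# Made-up BMI values, including the exact boundaries 18.5, 25 and 30
bmiValues <- c(17.2, 18.5, 22.9, 25.0, 27.9, 30.0, 31.7)

# right = FALSE closes each interval on the left: [18.5, 25), [25, 30), [30, Inf)
bmiLabels <- cut(bmiValues,
                 breaks = c(-Inf, 18.5, 25, 30, Inf),
                 labels = c("UNDERWEIGHT", "NORMAL", "OVERWEIGHT", "OBESE"),
                 right = FALSE)

# cut() returns a factor; convert to character to match the BMILabel column
as.character(bmiLabels)
```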

Recode Health-Grade column

# replace the numeric health grade codes with labels; any other code becomes NA
dfPatient$HealthGrade <- with(dfPatient, ifelse(HealthGrade == 1,"Good Health",
                          ifelse(HealthGrade == 2,"Normal",
                          ifelse(HealthGrade == 3,"Bad Health",NA))))
summarise(group_by(dfPatient, HealthGrade), n())
## # A tibble: 4 × 2
##   HealthGrade `n()`
##         <chr> <int>
## 1  Bad Health    34
## 2 Good Health    29
## 3      Normal    30
## 4        <NA>     7
head(dfPatient)
##          ID      Name  Race Gender Smokes HeightInCms WeightInKgs
## 1 AC/AH/001 Demetrius White   Male  False      182.87       76.57
## 2 AC/AH/017   Rosario White   Male  False      179.12       80.43
## 3 AC/AH/020     Julio Black   Male  False      169.15       75.48
## 4 AC/AH/022      Lupe White   Male  False      175.66       94.54
## 5 AC/AH/029    Lavern White Female  False      164.47       71.78
## 6 AC/AH/033    Bernie   Dog Female   True      158.27       69.90
##    BirthDate        State  Pet HealthGrade  Died RecordDate BMIValue
## 1 31-01-1972  Georgia,xxx  Dog      Normal False 25-11-2015 22.89674
## 2 09-06-1972     Missouri  Dog      Normal False 25-11-2015 25.06859
## 3 03-07-1972 Pennsylvania None      Normal False 25-11-2015 26.38080
## 4 11-08-1972      Florida  Cat Good Health False 25-11-2015 30.63867
## 5 06-06-1973         Iowa NULL      Normal  True 25-11-2015 26.53567
## 6 25-06-1973     Maryland  Dog      Normal False 25-11-2015 27.90487
##     BMILabel
## 1     NORMAL
## 2 OVERWEIGHT
## 3 OVERWEIGHT
## 4      OBESE
## 5 OVERWEIGHT
## 6 OVERWEIGHT
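
The same recoding can also be written as a lookup with a named vector, which keeps the code-to-label mapping in one place. This is a sketch on hypothetical codes. The value 99 is only an assumed example of a code with no label.

```r
# Named vector acting as a lookup table: code -> label
gradeLabels <- c("1" = "Good Health", "2" = "Normal", "3" = "Bad Health")

# Hypothetical raw codes; 99 stands in for a code with no label
rawGrades <- c(2, 1, 3, 99, 2)

# Codes without an entry in the lookup come back as NA
recoded <- unname(gradeLabels[as.character(rawGrades)])
recoded
```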

Error Handling

Error handling in Gender

summarise(group_by(dfPatient, Gender), n())
## # A tibble: 6 × 2
##    Gender `n()`
##     <chr> <int>
## 1  Female     6
## 2    Male     3
## 3  Female    45
## 4 Female      4
## 5    Male    40
## 6   Male      2
dfPatient$Gender <- trimws(toupper(dfPatient$Gender))
summarise(group_by(dfPatient, Gender), n())
## # A tibble: 2 × 2
##   Gender `n()`
##    <chr> <int>
## 1 FEMALE    55
## 2   MALE    45
head(dfPatient)
##          ID      Name  Race Gender Smokes HeightInCms WeightInKgs
## 1 AC/AH/001 Demetrius White   MALE  False      182.87       76.57
## 2 AC/AH/017   Rosario White   MALE  False      179.12       80.43
## 3 AC/AH/020     Julio Black   MALE  False      169.15       75.48
## 4 AC/AH/022      Lupe White   MALE  False      175.66       94.54
## 5 AC/AH/029    Lavern White FEMALE  False      164.47       71.78
## 6 AC/AH/033    Bernie   Dog FEMALE   True      158.27       69.90
##    BirthDate        State  Pet HealthGrade  Died RecordDate BMIValue
## 1 31-01-1972  Georgia,xxx  Dog      Normal False 25-11-2015 22.89674
## 2 09-06-1972     Missouri  Dog      Normal False 25-11-2015 25.06859
## 3 03-07-1972 Pennsylvania None      Normal False 25-11-2015 26.38080
## 4 11-08-1972      Florida  Cat Good Health False 25-11-2015 30.63867
## 5 06-06-1973         Iowa NULL      Normal  True 25-11-2015 26.53567
## 6 25-06-1973     Maryland  Dog      Normal False 25-11-2015 27.90487
##     BMILabel
## 1     NORMAL
## 2 OVERWEIGHT
## 3 OVERWEIGHT
## 4      OBESE
## 5 OVERWEIGHT
## 6 OVERWEIGHT
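
The six groups in the first count collapse to two because the raw values differ only in letter case and surrounding spaces. A small sketch on made-up strings shows what each function contributes.

```r
# Made-up values that differ only in case and surrounding whitespace
genderRaw <- c("Male", " Female", "female ", "MALE", "Female")

# Before cleaning every spelling counts as its own group
length(unique(genderRaw))

# toupper() unifies the case, trimws() removes leading and trailing spaces
genderClean <- trimws(toupper(genderRaw))
table(genderClean)
```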

Error handling in Race

summarise(group_by(dfPatient, Race), n())
## # A tibble: 6 × 2
##        Race `n()`
##       <chr> <int>
## 1     Asian     5
## 2 Bi-Racial     1
## 3     Black     8
## 4       Dog     1
## 5  Hispanic    17
## 6     White    68
dfPatient$Race <- trimws(toupper(dfPatient$Race))
dfPatient$Race[dfPatient$Race=="DOG"] <- NA
dfPatient$Race[dfPatient$Race=="BI-RACIAL"] <- NA
summarise(group_by(dfPatient, Race), n())
## # A tibble: 5 × 2
##       Race `n()`
##      <chr> <int>
## 1    ASIAN     5
## 2    BLACK     8
## 3 HISPANIC    17
## 4    WHITE    68
## 5     <NA>     2

Error handling in Died

summarise(group_by(dfPatient, Died), n())
## # A tibble: 2 × 2
##    Died `n()`
##   <chr> <int>
## 1 False    46
## 2  True    54
class(dfPatient$Died)
## [1] "character"
dfPatient$Died <- as.logical(dfPatient$Died)
class(dfPatient$Died)
## [1] "logical"
summarise(group_by(dfPatient, Died), n())
## # A tibble: 2 × 2
##    Died `n()`
##   <lgl> <int>
## 1 FALSE    46
## 2  TRUE    54

Error handling in State

summarise(group_by(dfPatient, State), n())
## # A tibble: 34 × 2
##          State `n()`
##          <chr> <int>
## 1      Alabama     2
## 2      Arizona     2
## 3   California    13
## 4     Colorado     1
## 5  Connecticut     1
## 6      Florida     8
## 7      Georgia     3
## 8  Georgia,xxx     1
## 9       Hawaii     2
## 10    Illinois     4
## # ... with 24 more rows
dfPatient$State[dfPatient$State=="Georgia,xxx"] <- "Georgia"
summarise(group_by(dfPatient, State), n())
## # A tibble: 33 × 2
##          State `n()`
##          <chr> <int>
## 1      Alabama     2
## 2      Arizona     2
## 3   California    13
## 4     Colorado     1
## 5  Connecticut     1
## 6      Florida     8
## 7      Georgia     4
## 8       Hawaii     2
## 9     Illinois     4
## 10     Indiana     4
## # ... with 23 more rows
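
The fix above targets one known bad value. If more values of the same shape were expected, a pattern-based cleanup could be used instead. The sketch below assumes that no valid state name contains a comma, and it runs on made-up values.

```r
# Made-up values; one carries trailing junk after a comma
stateRaw <- c("Georgia,xxx", "Missouri", "North Carolina", "Georgia")

# Drop the first comma and everything after it, then trim stray spaces
stateClean <- trimws(sub(",.*$", "", stateRaw))
stateClean
```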

Error handling in Pet

summarise(group_by(dfPatient, Pet), n())
## # A tibble: 10 × 2
##      Pet `n()`
##    <chr> <int>
## 1   Bird     9
## 2    Cat    24
## 3    CAT     5
## 4    Dog    28
## 5    DOG     4
## 6  Horse     1
## 7   None    23
## 8   NONE     1
## 9   NULL     3
## 10  <NA>     2
dfPatient$Pet <- trimws(toupper(dfPatient$Pet))
dfPatient$Pet[dfPatient$Pet=="NONE"] <- NA
dfPatient$Pet[dfPatient$Pet=="NULL"] <- NA
summarise(group_by(dfPatient, Pet), n())
## # A tibble: 5 × 2
##     Pet `n()`
##   <chr> <int>
## 1  BIRD     9
## 2   CAT    29
## 3   DOG    32
## 4 HORSE     1
## 5  <NA>    29

Error handling in Smokes

summarise(group_by(dfPatient, Smokes), n())
## # A tibble: 4 × 2
##   Smokes `n()`
##    <chr> <int>
## 1  False    72
## 2     No     6
## 3   True    18
## 4    Yes     4
class(dfPatient$Smokes)
## [1] "character"
dfPatient$Smokes <- as.logical(dfPatient$Smokes)
class(dfPatient$Smokes)
## [1] "logical"
summarise(group_by(dfPatient, Smokes), n())
## # A tibble: 3 × 2
##   Smokes `n()`
##    <lgl> <int>
## 1  FALSE    72
## 2   TRUE    18
## 3     NA    10
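
as.logical() recognises strings such as "True" and "False" but returns NA for "Yes" and "No", which is why the 6 "No" and 4 "Yes" records appear as 10 NA values above. A possible alternative, sketched on made-up values, maps the spellings explicitly before converting. Applying it to the real data would change the complete-case counts reported later, so it is shown only as an option.

```r
# Made-up values covering the spellings seen in the Smokes column
smokesRaw <- c("False", "No", "True", "Yes", "no ")

# Normalise case and whitespace first
smokesKey <- toupper(trimws(smokesRaw))

# Map known spellings to TRUE/FALSE; anything else stays NA
smokesClean <- ifelse(smokesKey %in% c("TRUE", "YES"), TRUE,
               ifelse(smokesKey %in% c("FALSE", "NO"), FALSE, NA))
smokesClean
```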

Complete Cases

vclComplete <- complete.cases(dfPatient)
# count the rows that have no missing values
sum(vclComplete)
## [1] 60
dfPatient <- dfPatient[vclComplete, ]
head(dfPatient)
##           ID      Name  Race Gender Smokes HeightInCms WeightInKgs
## 1  AC/AH/001 Demetrius WHITE   MALE  FALSE      182.87       76.57
## 2  AC/AH/017   Rosario WHITE   MALE  FALSE      179.12       80.43
## 4  AC/AH/022      Lupe WHITE   MALE  FALSE      175.66       94.54
## 9  AC/AH/045   Shirley WHITE   MALE  FALSE      181.32       76.90
## 11 AC/AH/049    Martin WHITE FEMALE  FALSE      160.06       72.37
## 13 AC/AH/052  Courtney WHITE   MALE   TRUE      175.39       92.22
##     BirthDate      State   Pet HealthGrade  Died RecordDate BMIValue
## 1  31-01-1972    Georgia   DOG      Normal FALSE 25-11-2015 22.89674
## 2  09-06-1972   Missouri   DOG      Normal FALSE 25-11-2015 25.06859
## 4  11-08-1972    Florida   CAT Good Health FALSE 25-11-2015 30.63867
## 9  25-12-1971  Louisiana   DOG Good Health FALSE 25-11-2015 23.39025
## 11 28-04-1972 California HORSE      Normal  TRUE 25-12-2015 28.24834
## 13 16-03-1972    Indiana  BIRD  Bad Health FALSE 25-12-2015 29.97888
##      BMILabel
## 1      NORMAL
## 2  OVERWEIGHT
## 4       OBESE
## 9      NORMAL
## 11 OVERWEIGHT
## 13 OVERWEIGHT
nrow(dfPatient)
## [1] 60
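
complete.cases() drops a row if any column is NA, so it can help to see which columns contribute the missing values before filtering. The sketch below uses a small made-up data frame (dfDemo is a hypothetical name) rather than the patient data.

```r
# Made-up data frame with missing values in different columns
dfDemo <- data.frame(
  Race   = c("WHITE", NA, "BLACK", "ASIAN"),
  Pet    = c("DOG", "CAT", NA, "BIRD"),
  Smokes = c(FALSE, TRUE, NA, NA),
  stringsAsFactors = FALSE
)

# Number of NA values in each column
naPerColumn <- colSums(is.na(dfDemo))
naPerColumn

# A row is complete only if every column is non-missing
completeRows <- complete.cases(dfDemo)
sum(completeRows)
```

Here only the first row is complete, and Smokes accounts for the most missing values.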

Summarizing Columns

summarise(group_by(dfPatient, BMILabel), n())
## # A tibble: 3 × 2
##     BMILabel `n()`
##        <chr> <int>
## 1     NORMAL    12
## 2      OBESE     5
## 3 OVERWEIGHT    43
summarise(group_by(dfPatient, Gender), n())
## # A tibble: 2 × 2
##   Gender `n()`
##    <chr> <int>
## 1 FEMALE    34
## 2   MALE    26
summarise(group_by(dfPatient, Race), n())
## # A tibble: 4 × 2
##       Race `n()`
##      <chr> <int>
## 1    ASIAN     4
## 2    BLACK     3
## 3 HISPANIC     9
## 4    WHITE    44
summarise(group_by(dfPatient, Died), n())
## # A tibble: 2 × 2
##    Died `n()`
##   <lgl> <int>
## 1 FALSE    27
## 2  TRUE    33
summarise(group_by(dfPatient, Pet), n())
## # A tibble: 4 × 2
##     Pet `n()`
##   <chr> <int>
## 1  BIRD     8
## 2   CAT    28
## 3   DOG    23
## 4 HORSE     1
summarise(group_by(dfPatient, Smokes), n())
## # A tibble: 2 × 2
##   Smokes `n()`
##    <lgl> <int>
## 1  FALSE    50
## 2   TRUE    10
summarise(group_by(dfPatient, HealthGrade), n())
## # A tibble: 3 × 2
##   HealthGrade `n()`
##         <chr> <int>
## 1  Bad Health    24
## 2 Good Health    19
## 3      Normal    17
summarise(group_by(dfPatient, State), n())
## # A tibble: 28 × 2
##         State `n()`
##         <chr> <int>
## 1     Alabama     1
## 2     Arizona     2
## 3  California     7
## 4     Florida     5
## 5     Georgia     3
## 6      Hawaii     1
## 7    Illinois     3
## 8     Indiana     3
## 9        Iowa     1
## 10     Kansas     1
## # ... with 18 more rows

Reporting

Display top 10 records based on BMI-Value.

head(arrange(dfPatient, desc(BMIValue)), 10)
##           ID     Name     Race Gender Smokes HeightInCms WeightInKgs
## 1  AC/SG/009    Sammy    WHITE   MALE  FALSE      166.84       88.25
## 2  AC/SG/064      Jon    WHITE   MALE  FALSE      169.16       90.08
## 3  AC/AH/076   Albert    WHITE   MALE  FALSE      176.22       97.67
## 4  AC/AH/022     Lupe    WHITE   MALE  FALSE      175.66       94.54
## 5  AC/AH/248   Andrea    WHITE   MALE  FALSE      178.64       97.05
## 6  AC/SG/067   Thomas    WHITE   MALE  FALSE      167.51       84.15
## 7  AC/AH/052 Courtney    WHITE   MALE   TRUE      175.39       92.22
## 8  AC/AH/127     Jame    WHITE   MALE  FALSE      167.75       82.06
## 9  AC/SG/055     Evan    WHITE   MALE  FALSE      166.75       79.06
## 10 AC/SG/181    Terry HISPANIC   MALE  FALSE      177.14       88.70
##     BirthDate        State  Pet HealthGrade  Died RecordDate BMIValue
## 1  04-03-1972      Vermont  DOG Good Health FALSE 25-06-2016 31.70402
## 2  04-10-1972     Illinois  CAT      Normal  TRUE 25-07-2016 31.47988
## 3  08-04-1973    Louisiana  CAT      Normal FALSE 25-12-2015 31.45218
## 4  11-08-1972      Florida  CAT Good Health FALSE 25-11-2015 30.63867
## 5  12-01-1973      Indiana  CAT Good Health  TRUE 25-05-2016 30.41152
## 6  19-07-1972 Pennsylvania BIRD      Normal  TRUE 25-07-2016 29.98974
## 7  16-03-1972      Indiana BIRD  Bad Health FALSE 25-12-2015 29.97888
## 8  29-10-1972        Texas  DOG Good Health  TRUE 25-01-2016 29.16127
## 9  24-02-1972     Illinois BIRD  Bad Health  TRUE 25-07-2016 28.43316
## 10 24-11-1971      Indiana  CAT  Bad Health  TRUE 25-09-2016 28.26769
##      BMILabel
## 1       OBESE
## 2       OBESE
## 3       OBESE
## 4       OBESE
## 5       OBESE
## 6  OVERWEIGHT
## 7  OVERWEIGHT
## 8  OVERWEIGHT
## 9  OVERWEIGHT
## 10 OVERWEIGHT

Display bottom 10 records based on BMI-Value.

head(arrange(dfPatient, BMIValue), 10)
##           ID      Name     Race Gender Smokes HeightInCms WeightInKgs
## 1  AC/SG/193    Ronnie    WHITE   MALE   TRUE      185.43       73.63
## 2  AC/SG/099    Leslie    ASIAN   MALE  FALSE      172.72       67.62
## 3  AC/AH/001 Demetrius    WHITE   MALE  FALSE      182.87       76.57
## 4  AC/AH/086      Kyle    BLACK   MALE   TRUE      180.11       75.72
## 5  AC/AH/045   Shirley    WHITE   MALE  FALSE      181.32       76.90
## 6  AC/AH/114      Kris HISPANIC   MALE  FALSE      177.75       74.84
## 7  AC/AH/077     Tommy    BLACK   MALE  FALSE      174.09       72.20
## 8  AC/AH/150     Brett    WHITE   MALE   TRUE      181.56       79.54
## 9  AC/AH/057    Vernon    WHITE FEMALE   TRUE      163.79       65.76
## 10 AC/AH/207    Bobbie    WHITE FEMALE  FALSE      163.01       65.19
##     BirthDate        State  Pet HealthGrade  Died RecordDate BMIValue
## 1  05-06-1973         Iowa  DOG  Bad Health FALSE 25-09-2016 21.41385
## 2  04-02-1972         Ohio  CAT Good Health FALSE 25-07-2016 22.66678
## 3  31-01-1972      Georgia  DOG      Normal FALSE 25-11-2015 22.89674
## 4  12-05-1973      Georgia  CAT  Bad Health FALSE 25-12-2015 23.34183
## 5  25-12-1971    Louisiana  DOG Good Health FALSE 25-11-2015 23.39025
## 6  19-11-1972 Pennsylvania BIRD  Bad Health FALSE 25-01-2016 23.68725
## 7  01-02-1973   Washington  CAT  Bad Health FALSE 25-12-2015 23.82262
## 8  03-05-1972     Kentucky  DOG Good Health  TRUE 25-02-2016 24.12933
## 9  06-01-1972     Illinois  CAT  Bad Health FALSE 25-12-2015 24.51247
## 10 17-05-1973      Florida  DOG      Normal FALSE 25-03-2016 24.53310
##    BMILabel
## 1    NORMAL
## 2    NORMAL
## 3    NORMAL
## 4    NORMAL
## 5    NORMAL
## 6    NORMAL
## 7    NORMAL
## 8    NORMAL
## 9    NORMAL
## 10   NORMAL

Gender > Race - frequency / counts

summarise(group_by(dfPatient, Gender, Race), n())
## Source: local data frame [8 x 3]
## Groups: Gender [?]
## 
##   Gender     Race `n()`
##    <chr>    <chr> <int>
## 1 FEMALE    ASIAN     2
## 2 FEMALE    BLACK     1
## 3 FEMALE HISPANIC     4
## 4 FEMALE    WHITE    27
## 5   MALE    ASIAN     2
## 6   MALE    BLACK     2
## 7   MALE HISPANIC     5
## 8   MALE    WHITE    17
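
The same kind of two-way count can be laid out as a cross-tabulation with base R's table(), which puts one variable in the rows and the other in the columns. This is a sketch on made-up rows, not the patient data.

```r
# Made-up rows with the same two columns
dfCounts <- data.frame(
  Gender = c("FEMALE", "MALE", "FEMALE", "MALE", "FEMALE"),
  Race   = c("WHITE", "WHITE", "ASIAN", "BLACK", "WHITE"),
  stringsAsFactors = FALSE
)

# Rows are Gender, columns are Race; each cell is a count
crossTab <- table(dfCounts$Gender, dfCounts$Race)
crossTab
```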

Race > Gender - max, min and average values for BMI-Values

summarise(group_by(dfPatient, Race, Gender), min(BMIValue), mean(BMIValue), max(BMIValue))
## Source: local data frame [8 x 5]
## Groups: Race [?]
## 
##       Race Gender `min(BMIValue)` `mean(BMIValue)` `max(BMIValue)`
##      <chr>  <chr>           <dbl>            <dbl>           <dbl>
## 1    ASIAN FEMALE        25.57631         26.88531        28.19431
## 2    ASIAN   MALE        22.66678         24.95782        27.24885
## 3    BLACK FEMALE        26.71407         26.71407        26.71407
## 4    BLACK   MALE        23.34183         23.58223        23.82262
## 5 HISPANIC FEMALE        25.03916         26.29513        26.89942
## 6 HISPANIC   MALE        23.68725         26.39844        28.26769
## 7    WHITE FEMALE        24.51247         26.60612        28.24834
## 8    WHITE   MALE        21.41385         27.53445        31.70402

All deceased patients

filter(dfPatient, Died==TRUE)
##           ID        Name     Race Gender Smokes HeightInCms WeightInKgs
## 1  AC/AH/049      Martin    WHITE FEMALE  FALSE      160.06       72.37
## 2  AC/AH/127        Jame    WHITE   MALE  FALSE      167.75       82.06
## 3  AC/AH/133       Clyde HISPANIC   MALE  FALSE      181.15       83.93
## 4  AC/AH/150       Brett    WHITE   MALE   TRUE      181.56       79.54
## 5  AC/AH/154        Tony    WHITE FEMALE  FALSE      160.03       64.30
## 6  AC/AH/156      George    WHITE   MALE  FALSE      165.62       76.72
## 7  AC/AH/160        Rory    ASIAN FEMALE  FALSE      159.67       71.88
## 8  AC/AH/176       Jerry    ASIAN   MALE  FALSE      175.21       83.65
## 9  AC/AH/180        Drew    WHITE FEMALE  FALSE      160.80       64.77
## 10 AC/AH/186 Christopher    WHITE FEMALE  FALSE      157.95       67.41
## 11 AC/AH/211         Son    WHITE FEMALE  FALSE      157.16       69.64
## 12 AC/AH/219         Jay    WHITE FEMALE  FALSE      163.47       72.89
## 13 AC/AH/233      Marion    WHITE FEMALE  FALSE      163.97       66.71
## 14 AC/AH/248      Andrea    WHITE   MALE  FALSE      178.64       97.05
## 15 AC/AH/249       Jesus HISPANIC FEMALE   TRUE      159.78       68.31
## 16 AC/SG/010        Theo    ASIAN FEMALE  FALSE      159.32       64.92
## 17 AC/SG/016      Jimmie    BLACK FEMALE  FALSE      161.84       69.97
## 18 AC/SG/046        Carl HISPANIC   MALE  FALSE      171.41       81.70
## 19 AC/SG/055        Evan    WHITE   MALE  FALSE      166.75       79.06
## 20 AC/SG/064         Jon    WHITE   MALE  FALSE      169.16       90.08
## 21 AC/SG/065      Shayne    WHITE FEMALE  FALSE      157.01       66.56
## 22 AC/SG/067      Thomas    WHITE   MALE  FALSE      167.51       84.15
## 23 AC/SG/068   Valentine HISPANIC FEMALE  FALSE      160.47       68.20
## 24 AC/SG/084       Brian HISPANIC   MALE  FALSE      174.25       80.93
## 25 AC/SG/101       Jason    WHITE FEMALE  FALSE      159.23       69.96
## 26 AC/SG/123     Darnell    WHITE FEMALE   TRUE      162.32       72.72
## 27 AC/SG/134       Daryl    WHITE FEMALE   TRUE      162.59       69.76
## 28 AC/SG/155     Raymond    WHITE FEMALE  FALSE      158.35       69.72
## 29 AC/SG/165       Elmer    WHITE FEMALE  FALSE      162.18       67.81
## 30 AC/SG/179       Logan    WHITE   MALE  FALSE      183.10       82.47
## 31 AC/SG/181       Terry HISPANIC   MALE  FALSE      177.14       88.70
## 32 AC/SG/197       Stacy    WHITE FEMALE  FALSE      159.44       66.21
## 33 AC/SG/234        Luis HISPANIC FEMALE  FALSE      164.88       68.07
##     BirthDate          State   Pet HealthGrade Died RecordDate BMIValue
## 1  28-04-1972     California HORSE      Normal TRUE 25-12-2015 28.24834
## 2  29-10-1972          Texas   DOG Good Health TRUE 25-01-2016 29.16127
## 3  13-10-1973     Washington   CAT  Bad Health TRUE 25-02-2016 25.57647
## 4  03-05-1972       Kentucky   DOG Good Health TRUE 25-02-2016 24.12933
## 5  30-08-1973     California   DOG Good Health TRUE 25-02-2016 25.10777
## 6  09-07-1972     California   DOG Good Health TRUE 25-02-2016 27.96939
## 7  22-09-1973        Florida   CAT      Normal TRUE 25-02-2016 28.19431
## 8  01-05-1973       Virginia   DOG  Bad Health TRUE 25-03-2016 27.24885
## 9  18-02-1973         Oregon   CAT Good Health TRUE 25-03-2016 25.04966
## 10 06-05-1972     New Jersey   DOG  Bad Health TRUE 25-03-2016 27.01998
## 11 14-07-1973     California   CAT      Normal TRUE 25-04-2016 28.19517
## 12 07-04-1972 North Carolina  BIRD Good Health TRUE 25-04-2016 27.27670
## 13 23-12-1971           Ohio   CAT  Bad Health TRUE 25-04-2016 24.81202
## 14 12-01-1973        Indiana   CAT Good Health TRUE 25-05-2016 30.41152
## 15 23-04-1972        Alabama   CAT      Normal TRUE 25-05-2016 26.75713
## 16 29-01-1973       New York   CAT      Normal TRUE 25-06-2016 25.57631
## 17 03-04-1972        Arizona   CAT  Bad Health TRUE 25-06-2016 26.71407
## 18 05-08-1973    Mississippi  BIRD      Normal TRUE 25-06-2016 27.80672
## 19 24-02-1972       Illinois  BIRD  Bad Health TRUE 25-07-2016 28.43316
## 20 04-10-1972       Illinois   CAT      Normal TRUE 25-07-2016 31.47988
## 21 05-04-1972     California   DOG  Bad Health TRUE 25-07-2016 26.99968
## 22 19-07-1972   Pennsylvania  BIRD      Normal TRUE 25-07-2016 29.98974
## 23 15-04-1972      Tennessee   CAT  Bad Health TRUE 25-07-2016 26.48480
## 24 06-03-1972       Virginia   DOG      Normal TRUE 25-07-2016 26.65410
## 25 28-09-1973       Michigan   DOG      Normal TRUE 25-07-2016 27.59307
## 26 03-09-1972 North Carolina  BIRD Good Health TRUE 25-08-2016 27.60005
## 27 28-05-1972          Texas   CAT      Normal TRUE 25-08-2016 26.38875
## 28 02-06-1972     California   CAT  Bad Health TRUE 25-08-2016 27.80489
## 29 25-03-1972     Washington  BIRD Good Health TRUE 25-08-2016 25.78096
## 30 24-10-1972           Ohio   DOG  Bad Health TRUE 25-09-2016 24.59910
## 31 24-11-1971        Indiana   CAT  Bad Health TRUE 25-09-2016 28.26769
## 32 08-11-1972       New York   CAT Good Health TRUE 25-10-2016 26.04528
## 33 10-11-1971   Pennsylvania   CAT  Bad Health TRUE 25-10-2016 25.03916
##      BMILabel
## 1  OVERWEIGHT
## 2  OVERWEIGHT
## 3  OVERWEIGHT
## 4      NORMAL
## 5  OVERWEIGHT
## 6  OVERWEIGHT
## 7  OVERWEIGHT
## 8  OVERWEIGHT
## 9  OVERWEIGHT
## 10 OVERWEIGHT
## 11 OVERWEIGHT
## 12 OVERWEIGHT
## 13     NORMAL
## 14      OBESE
## 15 OVERWEIGHT
## 16 OVERWEIGHT
## 17 OVERWEIGHT
## 18 OVERWEIGHT
## 19 OVERWEIGHT
## 20      OBESE
## 21 OVERWEIGHT
## 22 OVERWEIGHT
## 23 OVERWEIGHT
## 24 OVERWEIGHT
## 25 OVERWEIGHT
## 26 OVERWEIGHT
## 27 OVERWEIGHT
## 28 OVERWEIGHT
## 29 OVERWEIGHT
## 30     NORMAL
## 31 OVERWEIGHT
## 32 OVERWEIGHT
## 33 OVERWEIGHT
nrow(filter(dfPatient, Died==TRUE))
## [1] 33

Hispanic Females

filter(dfPatient, Race=="HISPANIC" & Gender=="FEMALE")
##          ID      Name     Race Gender Smokes HeightInCms WeightInKgs
## 1 AC/AH/249     Jesus HISPANIC FEMALE   TRUE      159.78       68.31
## 2 AC/SG/068 Valentine HISPANIC FEMALE  FALSE      160.47       68.20
## 3 AC/SG/122    Michal HISPANIC FEMALE  FALSE      160.09       68.94
## 4 AC/SG/234      Luis HISPANIC FEMALE  FALSE      164.88       68.07
##    BirthDate          State Pet HealthGrade  Died RecordDate BMIValue
## 1 23-04-1972        Alabama CAT      Normal  TRUE 25-05-2016 26.75713
## 2 15-04-1972      Tennessee CAT  Bad Health  TRUE 25-07-2016 26.48480
## 3 16-12-1971 South Carolina DOG Good Health FALSE 25-08-2016 26.89942
## 4 10-11-1971   Pennsylvania CAT  Bad Health  TRUE 25-10-2016 25.03916
##     BMILabel
## 1 OVERWEIGHT
## 2 OVERWEIGHT
## 3 OVERWEIGHT
## 4 OVERWEIGHT
nrow(filter(dfPatient, Race=="HISPANIC" & Gender=="FEMALE"))
## [1] 4

10 sample records from the dataset using set.seed(707)

set.seed(707)
sample_n(dfPatient, 10)
##           ID     Name     Race Gender Smokes HeightInCms WeightInKgs
## 13 AC/AH/052 Courtney    WHITE   MALE   TRUE      175.39       92.22
## 48 AC/AH/219      Jay    WHITE FEMALE  FALSE      163.47       72.89
## 30 AC/AH/150    Brett    WHITE   MALE   TRUE      181.56       79.54
## 55 AC/AH/248   Andrea    WHITE   MALE  FALSE      178.64       97.05
## 73 AC/SG/084    Brian HISPANIC   MALE  FALSE      174.25       80.93
## 67 AC/SG/064      Jon    WHITE   MALE  FALSE      169.16       90.08
## 80 AC/SG/122   Michal HISPANIC FEMALE  FALSE      160.09       68.94
## 9  AC/AH/045  Shirley    WHITE   MALE  FALSE      181.32       76.90
## 20 AC/AH/086     Kyle    BLACK   MALE   TRUE      180.11       75.72
## 57 AC/SG/002      Jan    WHITE FEMALE   TRUE      161.57       67.92
##     BirthDate          State  Pet HealthGrade  Died RecordDate BMIValue
## 13 16-03-1972        Indiana BIRD  Bad Health FALSE 25-12-2015 29.97888
## 48 07-04-1972 North Carolina BIRD Good Health  TRUE 25-04-2016 27.27670
## 30 03-05-1972       Kentucky  DOG Good Health  TRUE 25-02-2016 24.12933
## 55 12-01-1973        Indiana  CAT Good Health  TRUE 25-05-2016 30.41152
## 73 06-03-1972       Virginia  DOG      Normal  TRUE 25-07-2016 26.65410
## 67 04-10-1972       Illinois  CAT      Normal  TRUE 25-07-2016 31.47988
## 80 16-12-1971 South Carolina  DOG Good Health FALSE 25-08-2016 26.89942
## 9  25-12-1971      Louisiana  DOG Good Health FALSE 25-11-2015 23.39025
## 20 12-05-1973        Georgia  CAT  Bad Health FALSE 25-12-2015 23.34183
## 57 03-07-1973        Arizona  DOG  Bad Health FALSE 25-05-2016 26.01814
##      BMILabel
## 13 OVERWEIGHT
## 48 OVERWEIGHT
## 30     NORMAL
## 55      OBESE
## 73 OVERWEIGHT
## 67      OBESE
## 80 OVERWEIGHT
## 9      NORMAL
## 20     NORMAL
## 57 OVERWEIGHT
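
Setting the seed makes the random draw repeatable within one R installation. The sketch below uses base sample() on row indices to show that resetting the same seed reproduces the same draw. The specific rows drawn can differ between R versions, because the default sampling algorithm changed in R 3.6.0.

```r
# Draw 10 row indices out of 60 after fixing the seed
set.seed(707)
firstDraw <- sample(1:60, 10)

# Resetting the same seed reproduces exactly the same draw
set.seed(707)
secondDraw <- sample(1:60, 10)

identical(firstDraw, secondDraw)
```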

SUMMARY

The data set contained 45 males and 55 females (100 patients), of which only 60 were analysed because the remaining records were incomplete after cleaning.
The majority of the analysed patients are overweight or obese (43 overweight and 5 obese out of 60).
The majority of the analysed patients are White (44 of 60).
Most of the patients with the highest BMI values are White males, and most of them are non-smokers (9 of the top 10 in each case).

OBJECTIVE

The objective of learning RMD by analysing the patient data has therefore been met.