For this exercise, please upload the following dataset:
Download data_example1_Clean.xlsx here
Import the dataset “data_example1_Clean.xlsx” and name it data1
library(readxl)
## Warning: package 'readxl' was built under R version 3.6.1
data1=read_excel("data_example1_Clean.xlsx")
Notes:
First, we need to install the package “gmodels” and load it using the function ‘library’.
install.packages("gmodels")
library(gmodels)
## Warning: package 'gmodels' was built under R version 3.6.1
To generate a frequency table for the variable gender
freq.gender=CrossTable(data1$gender,format="SPSS")
##
##    Cell Contents
## |-------------------------|
## |                   Count |
## |             Row Percent |
## |-------------------------|
##
## Total Observations in Table:  28
##
##           |    Female  |      Male  |
##           |-----------|-----------|
##           |       11  |       17  |
##           |   39.286% |   60.714% |
##           |-----------|-----------|
##
##
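As a cross-check that does not depend on gmodels, the same counts and percentages can be sketched in base R. The `gender` vector below is a made-up stand-in built to match the 11 Female and 17 Male counts shown above; it is not the actual data1, and the name `freq.check` is only illustrative.

```r
# Hypothetical stand-in for data1$gender, built to match the counts above
gender <- c(rep("Female", 11), rep("Male", 17))

counts <- table(gender)               # frequency of each category
percent <- prop.table(counts) * 100   # share of the total, in percent

# Combine counts and rounded percentages into one small matrix
freq.check <- cbind(Count = counts, Percent = round(percent, 3))
freq.check
```

With the real data, replacing `gender` by `data1$gender` should give the same layout.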
To generate a frequency table for the variable race by gender
Notes: prop.r=TRUE requests row percentages.
freq.gender.race=CrossTable(data1$gender,data1$race,expected=FALSE, prop.r=TRUE, prop.c=FALSE,prop.t=FALSE, prop.chisq=FALSE, chisq = FALSE, fisher=FALSE, mcnemar=FALSE, format="SPSS")
##
##    Cell Contents
## |-------------------------|
## |                   Count |
## |             Row Percent |
## |-------------------------|
##
## Total Observations in Table:  28
##
##              | data1$race
## data1$gender |   Indian  |    Malay  | Row Total |
## -------------|-----------|-----------|-----------|
##       Female |        3  |        8  |       11  |
##              |   27.273% |   72.727% |   39.286% |
## -------------|-----------|-----------|-----------|
##         Male |        5  |       12  |       17  |
##              |   29.412% |   70.588% |   60.714% |
## -------------|-----------|-----------|-----------|
## Column Total |        8  |       20  |       28  |
## -------------|-----------|-----------|-----------|
##
##
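To see what a row percentage means, the table can be rebuilt in base R with `prop.table` and `margin = 1`, which divides each cell by its row total. This is only a sketch: the `gender` and `race` vectors are hypothetical stand-ins constructed to reproduce the cell counts above, not the real data1.

```r
# Hypothetical stand-in data matching the cell counts in the table above
gender <- c(rep("Female", 11), rep("Male", 17))
race <- c(rep("Indian", 3), rep("Malay", 8),    # the 11 Female rows
          rep("Indian", 5), rep("Malay", 12))   # the 17 Male rows

tab <- table(gender, race)

# margin = 1 divides each cell by its row total, so each row sums to 100
row.pct <- round(prop.table(tab, margin = 1) * 100, 3)
row.pct
```

Using `margin = 2` instead would give column percentages, the counterpart of prop.c=TRUE in CrossTable.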
First, we need to install the package “psych” and load it using the function ‘library’.
install.packages("psych")
library(psych)
## Warning: package 'psych' was built under R version 3.6.1
To describe the variable ptage
desc.age=describe(data1$ptage,IQR = TRUE)
desc.age
##    vars  n  mean   sd median trimmed  mad min max range  skew kurtosis
## X1    1 27 41.78 5.63     44   41.91 5.93  34  48    14 -0.23    -1.68
##      se IQR
## X1 1.08  13
Notes:
The distribution of numerical data can be checked using the skewness (skew) and kurtosis values.
The data are considered normally distributed if the skewness lies between -1 and +1¹ and the kurtosis lies between -3 and +3².
If the data are normally distributed, the mean and standard deviation (sd) should be reported.
Otherwise, the median and interquartile range (IQR) should be reported.
Thus, for ptage (skew = -0.23, kurtosis = -1.68), the mean and sd should be reported, as the data are normally distributed.
¹ Bulmer, M. G. (1979). Principles of Statistics. NY: Dover Books on Mathematics.
² Balanda, K. P. and MacGillivray, H. L. (1988). “Kurtosis: A Critical Review”. The American Statistician 42(2), pp. 111-119.
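The rule of thumb in the notes can be written as a small check. The function `is.approx.normal` below is a hypothetical helper, not part of psych, and it assumes the limits are inclusive; the skew and kurtosis values are copied from the describe output for ptage.

```r
# Hypothetical helper (not part of psych): applies the rule of thumb above,
# assuming the limits -1..+1 (skew) and -3..+3 (kurtosis) are inclusive
is.approx.normal <- function(skew, kurtosis) {
  abs(skew) <= 1 && abs(kurtosis) <= 3
}

# Values copied from the describe() output for ptage
ptage.normal <- is.approx.normal(skew = -0.23, kurtosis = -1.68)

# Choose which summary statistics to report
if (ptage.normal) {
  summary.to.report <- "mean and SD"
} else {
  summary.to.report <- "median and IQR"
}
summary.to.report
```

With a describe result at hand, the same check could be called as `is.approx.normal(desc.age$skew, desc.age$kurtosis)`.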
Notes: ‘cbind’ is a function that combines values as columns.
age.meansd=cbind("Mean"=desc.age$mean,"SD"=desc.age$sd)
age.meansd
##          Mean       SD
## [1,] 41.77778 5.625036
Exercise
Describe height, weight, bmi, sysbp and diasbp.
To describe the variable ptage by gender
Notes: ‘mat=TRUE’ returns the output in matrix format.
desc.age.gender=describeBy(data1$ptage,data1$gender,IQR=TRUE,mat = TRUE)
desc.age.gender
##     item group1 vars  n     mean       sd median  trimmed    mad min max
## X11    1 Female    1 11 42.81818 6.096199   45.0 43.22222 4.4478  34  48
## X12    2   Male    1 16 41.06250 5.359960   41.5 41.07143 7.4130  34  48
##     range        skew  kurtosis       se IQR
## X11    14 -0.48040234 -1.705179 1.838073  11
## X12    14 -0.06648471 -1.720123 1.339990  10
Notes: ‘cbind’ is a function that combines values as columns. The describeBy output lists Female first and Male second, so the row names must be given in that order.
age.gender.meansd=cbind(desc.age.gender$mean,desc.age.gender$sd)
rownames(age.gender.meansd)=c("Female","Male")
colnames(age.gender.meansd)=c("Mean","SD")
age.gender.meansd
##            Mean       SD
## Female 42.81818 6.096199
## Male   41.06250 5.359960
Exercise:
Describe height, weight, bmi, sysbp and diasbp among male and female.
To export results from R to an Excel file
Notes:
First, we need to install the package “writexl” and load it using the function ‘library’.
install.packages("writexl")
library(writexl)
## Warning: package 'writexl' was built under R version 3.6.1
Export the ‘age.gender.meansd’ results to an Excel file
write_xlsx(as.data.frame(age.gender.meansd), path="age.gender.meansd.xlsx")
The Excel file will then appear in your project folder.
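One thing to watch when exporting: as far as we know, write_xlsx writes only the columns of a data frame, so the row names that hold the group labels may not reach the Excel file. The sketch below moves them into an ordinary column first. The matrix is rebuilt here from the describeBy output so the example runs on its own, the names `export.df` and `Gender` are illustrative, and the file is written to a temporary folder rather than the project folder.

```r
# Matrix with the same shape as age.gender.meansd; values and group order
# are copied from the describeBy output (Female first, Male second)
age.gender.meansd <- matrix(c(42.81818, 41.06250, 6.096199, 5.359960),
                            ncol = 2,
                            dimnames = list(c("Female", "Male"),
                                            c("Mean", "SD")))

# Move the row names into an ordinary column so they are kept on export
export.df <- data.frame(Gender = rownames(age.gender.meansd),
                        age.gender.meansd,
                        row.names = NULL)

# Write the file only if writexl is installed; tempdir() is used here
# instead of the project folder so the sketch leaves no files behind
if (requireNamespace("writexl", quietly = TRUE)) {
  writexl::write_xlsx(export.df,
                      path = file.path(tempdir(), "age.gender.meansd.xlsx"))
}
export.df
```

Replacing the `path` argument with "age.gender.meansd.xlsx" would write the file to the project folder, as in the original call.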