The following data has been obtained from a survey of how stressed employees in a particular city are feeling on a Wednesday.
| Age (years) | 18 | 55 | 24 | 61 | 42 | 36 | 52 | 19 |
|---|---|---|---|---|---|---|---|---|
| Stress level | Low | Medium | Low | High | High | High | Low | Medium |
Copy and paste your R code for a and c, and enter your answer for b:
Input this table into a data frame, making use of the factor() function to describe the categorical data of stress level.
age <- c(18,55,24,61,42,36,52,19)
stress <- c("Low","Medium","Low","High","High","High","Low","Medium")
survey <- data.frame(age,stress)
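# Give the levels by label: levels=1:3 matches none of the strings and turns every value into NA
survey_factors <- factor(stress,levels=c("Low","Medium","High"))
# Store the factor in the data frame so the stress column is categorical
survey$stress <- survey_factors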
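One thing to watch for with factor() is how the levels argument is matched against the data. The sketch below compares levels given as labels with levels given as 1:3. The names `by_label` and `by_number` are illustrative only and are not part of the original answer.

```r
stress <- c("Low", "Medium", "Low", "High", "High", "High", "Low", "Medium")

# Levels given as the labels that actually occur in the data
by_label <- factor(stress, levels = c("Low", "Medium", "High"))

# Levels given as 1:3: none of the strings match "1", "2" or "3",
# so every element becomes NA
by_number <- factor(stress, levels = 1:3)

print(table(by_label))        # counts per stress level
print(all(is.na(by_number)))  # TRUE
```

Renaming the levels afterwards does not recover the lost values, so the labels are best supplied when the factor is created.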
Using indexing with [,], find the average age of the people with high stress level.
# Creating new DF with filtered records
high_stress_age <- survey[survey$stress=="High",]
# extracting vector containing only age numbers
hs_age_only <- high_stress_age$age
# calculating average age of High Stress group
avg_high_stress_age <- mean(hs_age_only)
Average age = 46.3333333 years
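The same result can be reached in one step by putting the row condition and the column name inside a single [,] index. This is a minimal sketch that rebuilds the data so it runs on its own; `avg_high` is an illustrative name.

```r
age <- c(18, 55, 24, 61, 42, 36, 52, 19)
stress <- c("Low", "Medium", "Low", "High", "High", "High", "Low", "Medium")
survey <- data.frame(age, stress)

# Rows: people with high stress; column: age
avg_high <- mean(survey[survey$stress == "High", "age"])
print(avg_high)  # (61 + 42 + 36) / 3
```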
Change the labels of the stress levels from ‘low’, ‘medium’, ‘high’ to ‘negligible’, ‘moderate’ and ‘serious’ by changing the factor properties using levels().
levels(survey_factors)
## [1] "Low" "Medium" "High"
levels(survey_factors) <- c("negligible","moderate","serious")
levels(survey_factors)
## [1] "negligible" "moderate" "serious"
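To confirm that assigning to levels() relabels every observation and not just the level names, the values can be inspected after the change. The sketch below is self-contained, and `relabelled` is an illustrative name rather than one from the original answer.

```r
stress <- c("Low", "Medium", "Low", "High", "High", "High", "Low", "Medium")
relabelled <- factor(stress, levels = c("Low", "Medium", "High"))

# New labels are assigned by position: Low -> negligible, Medium -> moderate, High -> serious
levels(relabelled) <- c("negligible", "moderate", "serious")

print(as.character(relabelled))
print(table(relabelled))
```

Because the replacement is positional, the new labels must be listed in the same order as the existing levels.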