The ABC is the heart of development and learning excellence for the Syldavian Public Service. As such, ABC is dedicated to ensuring that public agencies and officers are future-ready: it supports initiatives that bring about change, learning and collaboration across the Public Service.
Technological disruptions over the years have led to significant strategic shifts within the Syldavian Public Service. Amidst this evolving landscape, Senior Management has asked whether ABC is still meeting public officers’ expectations as a central learning institution for the Public Service.
You have been asked to mine a dataset to address this query, and to generate additional insights which you think would be beneficial for the Senior Management. The dataset comprises course feedback from all participants who have attended at least one course at the College between 2016 and 2018 inclusive, and cuts across different course domains and course types.
# Loading Libraries
library(dplyr)
## Warning: package 'dplyr' was built under R version 3.6.3
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
library(ggpubr)
## Warning: package 'ggpubr' was built under R version 3.6.3
## Loading required package: magrittr
We will separate Questions 9 and 10 from the other questions in the Course Feedback because they use different rating scales: Question 9 is rated from 1 to 10 and Question 10 is answered Yes or No, whereas the remaining questions are rated from 1 to 5.
# Loading Data
data <- read.csv('./LFG_Analyst_Interview_Data_R.csv', header=TRUE)
data8 <- subset(data, Question != "Q9" & Question != "Q10")
data9 <- subset(data, Question == "Q9")
data10 <- subset(data, Question == "Q10")
head(data8)
head(data9)
head(data10)
# Summary of Data
summary(data8, digits = 1)
## ID Year Type
## Min. : 1 Min. :2016 Inhouse:436068
## 1st Qu.:23443 1st Qu.:2016 Public :407871
## Median :46886 Median :2017
## Mean :46886 Mean :2017
## 3rd Qu.:70329 3rd Qu.:2017
## Max. :93771 Max. :2018
##
## Domain Question Response
## Data Analytics : 5868 Q1 : 93771 4 :453234
## Effective Communication:581931 Q11 : 93771 5 :240603
## Governance : 477 Q2 : 93771 3 :140234
## Personal Development :243936 Q3 : 93771 2 : 9638
## Service Excellence : 11727 Q4 : 93771 1 : 230
## Q5 : 93771 10 : 0
## (Other):281313 (Other): 0
summary(data9, digits = 1)
## ID Year Type Domain
## Min. : 1 Min. :2016 Inhouse:48452 Data Analytics : 652
## 1st Qu.:23444 1st Qu.:2016 Public :45319 Effective Communication:64659
## Median :46886 Median :2017 Governance : 53
## Mean :46886 Mean :2017 Personal Development :27104
## 3rd Qu.:70329 3rd Qu.:2017 Service Excellence : 1303
## Max. :93771 Max. :2018
##
## Question Response
## Q9 :93771 9 :32447
## Q1 : 0 8 :27345
## Q10 : 0 10 :21305
## Q11 : 0 7 :10709
## Q2 : 0 6 : 1821
## Q3 : 0 5 : 141
## (Other): 0 (Other): 3
summary(data10, digits = 1)
## ID Year Type Domain
## Min. : 1 Min. :2016 Inhouse:48452 Data Analytics : 652
## 1st Qu.:23444 1st Qu.:2016 Public :45319 Effective Communication:64659
## Median :46886 Median :2017 Governance : 53
## Mean :46886 Mean :2017 Personal Development :27104
## 3rd Qu.:70329 3rd Qu.:2017 Service Excellence : 1303
## Max. :93771 Max. :2018
##
## Question Response
## Q10 :93771 Yes :83650
## Q1 : 0 No :10121
## Q11 : 0 1 : 0
## Q2 : 0 10 : 0
## Q3 : 0 2 : 0
## Q4 : 0 3 : 0
## (Other): 0 (Other): 0
# Setting Up a Data Frame for Response to Question 9
Q9Response <- as.data.frame(table(data9$Response))
Q9Response <- Q9Response[-c(11, 12), ]
Q9Response <- Q9Response[c(1, 3, 4, 5, 6, 7, 8, 9, 10, 2), ]
print(Q9Response)
## Var1 Freq
## 1 1 0
## 3 2 0
## 4 3 0
## 5 4 3
## 6 5 141
## 7 6 1821
## 8 7 10709
## 9 8 27345
## 10 9 32447
## 2 10 21305
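The row selection and reordering above rely on the position of each factor level. As an alternative sketch, the levels can be declared explicitly so that the table comes out with exactly the ratings 1 to 10, in numeric order. The vector `responses` below is made-up toy data standing in for `data9$Response`, and the names `rating` and `q9_counts` are hypothetical.

```r
# Toy stand-in for data9$Response (hypothetical values, not the real dataset)
responses <- c("9", "10", "8", "9", "7", "10", "4")

# Declaring the levels up front fixes both the set of categories and their order,
# so ratings with no responses still appear with a count of zero
rating <- factor(responses, levels = as.character(1:10))

q9_counts <- as.data.frame(table(rating))
print(q9_counts)
```

With this approach no rows need to be dropped or reordered by index, which avoids silent errors if the level order in the input file changes.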
# Visualisation of Simple Contingency Table for Question 9
ggbarplot(Q9Response, x = "Var1", y = "Freq",
title = "How likely are you to recommend this course to your colleagues?",
xlab = "Response", ylab = "Frequency", order = c(1:10))
From the Bar Plot shown above, the majority of the responses had a rating of 8 to 10. This suggests that the majority of the participants are likely to recommend their course to their colleagues.
# Computation of Relative Frequencies for Question 9
Q9_prop <- Q9Response$Freq/sum(Q9Response$Freq)
print(Q9_prop)
## [1] 0.000000e+00 0.000000e+00 0.000000e+00 3.199283e-05 1.503663e-03
## [6] 1.941965e-02 1.142038e-01 2.916147e-01 3.460238e-01 2.272024e-01
# Computation of Relative Percentages for Question 9
Q9_percent <- round(Q9_prop, 2)*100
print(Q9_percent)
## [1] 0 0 0 0 0 2 11 29 35 23
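For Question 9, ratings of 8 to 10 make up the majority of responses (about 86%; the individually rounded percentages of 29%, 35% and 23% add up to 87% only because of rounding). This suggests that the majority of the participants are likely to recommend their course to their colleagues.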
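The combined share of ratings 8 to 10 is best computed from the unrounded proportions, because adding percentages that have already been rounded can drift by a point. The sketch below uses the counts printed in the `Q9Response` table; the names `freq`, `prop`, `top_share` and `rounded_sum` are introduced here for illustration only.

```r
# Counts for ratings 1 to 10, copied from the printed Q9Response table
freq <- c(0, 0, 0, 3, 141, 1821, 10709, 27345, 32447, 21305)
names(freq) <- 1:10

prop <- freq / sum(freq)

# Share of ratings 8 to 10, computed before any rounding
top_share <- sum(prop[c("8", "9", "10")])
print(round(top_share * 100, 1))

# Sum of the individually rounded percentages, for comparison
rounded_sum <- sum(round(prop[c("8", "9", "10")], 2) * 100)
print(rounded_sum)
```

The first figure is about 86.5 and the second is 87, so the difference is purely a rounding effect.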
# Setting Up a Data Frame for Response to Question 10
Q10Response <- as.data.frame(table(data10$Response))
Q10Response <- Q10Response[-c(1:10), ]
print(Q10Response)
## Var1 Freq
## 11 No 10121
## 12 Yes 83650
# Visualisation of Simple Contingency Table for Question 10
ggbarplot(Q10Response, x = "Var1", y = "Freq",
title = "Would you like to be updated on ABC's events and courses?",
xlab = "Response", ylab = "Frequency")
From the Bar Plot shown above, the majority of the responses were “Yes”. This suggests that the majority of the participants would like to be updated on ABC’s events and courses.
# Computation of Relative Frequencies for Question 10
Q10_prop <- Q10Response$Freq/sum(Q10Response$Freq)
print(Q10_prop)
## [1] 0.1079332 0.8920668
# Computation of Relative Percentages for Question 10
Q10_percent <- round(Q10_prop, 2)*100
print(Q10_percent)
## [1] 11 89
For Question 10, “Yes” accounts for the majority of responses (89%). This suggests that the majority of the participants would like to be updated on ABC’s events and courses.
# Setting Up Variables
ID <- data8$ID
Year <- data8$Year
Type <- data8$Type
Domain <- data8$Domain
Question <- data8$Question
Response <- data8$Response
# Setting Up Data Frames
# Keep only the 1-to-5 ratings; other factor levels (6 to 10, Yes, No) have zero counts here
ratings <- as.character(1:5)
ID_df <- as.data.frame(table(ID, Response))
ID_df <- subset(ID_df, Response %in% ratings)
Year_df <- as.data.frame(table(Year, Response))
Year_df <- subset(Year_df, Response %in% ratings)
Type_df <- as.data.frame(table(Type, Response))
Type_df <- subset(Type_df, Response %in% ratings)
Domain_df <- as.data.frame(table(Domain, Response))
Domain_df <- subset(Domain_df, Response %in% ratings)
Question_df <- as.data.frame(table(Question, Response))
Question_df <- subset(Question_df, Response %in% ratings)
# Q9 and Q10 remain as empty factor levels in data8, so drop their rows
Question_df <- subset(Question_df, Question != "Q9" & Question != "Q10")
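The five data frames above follow the same pattern: cross-tabulate one variable against `Response`, then keep the ratings 1 to 5. As a sketch of how that pattern could be factored out, the hypothetical helper `make_freq_df` below does both steps in one call. It is not part of the original analysis, and `toy` is made-up data standing in for `data8`.

```r
# Hypothetical helper (not part of the original analysis): cross-tabulate one
# grouping variable against Response and keep only the 1-to-5 ratings
make_freq_df <- function(df, group_col, keep = as.character(1:5)) {
  tab <- table(df[[group_col]], df$Response, dnn = c(group_col, "Response"))
  out <- as.data.frame(tab)
  out[out$Response %in% keep, ]
}

# Toy data standing in for data8 (values are made up)
toy <- data.frame(
  Year = c(2016, 2016, 2017, 2018, 2018),
  Response = factor(c("4", "5", "4", "3", "Yes"))
)

year_df <- make_freq_df(toy, "Year")
print(year_df)
```

The result has the same three columns as `Year_df` (the grouping variable, `Response` and `Freq`), so it could be passed to `ggbarplot` in the same way.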
# Summary of Data Frames
summary(ID_df)
## ID Response Freq
## 1 : 5 1 :93771 Min. :0.0
## 2 : 5 2 :93771 1st Qu.:0.0
## 3 : 5 3 :93771 Median :0.0
## 4 : 5 4 :93771 Mean :1.8
## 5 : 5 5 :93771 3rd Qu.:3.0
## 6 : 5 10 : 0 Max. :9.0
## (Other):468825 (Other): 0
summary(Year_df)
## Year Response Freq
## 2016:5 1 :3 Min. : 47
## 2017:5 2 :3 1st Qu.: 2466
## 2018:5 3 :3 Median : 35914
## 4 :3 Mean : 56263
## 5 :3 3rd Qu.: 88418
## 10 :0 Max. :225313
## (Other):0
summary(Type_df)
## Type Response Freq
## Inhouse:5 1 :2 Min. : 98
## Public :5 2 :2 1st Qu.: 4593
## 3 :2 Median : 70117
## 4 :2 Mean : 84394
## 5 :2 3rd Qu.:125447
## 10 :0 Max. :232515
## (Other):0
summary(Domain_df)
## Domain Response Freq
## Data Analytics :5 1 :5 Min. : 0
## Effective Communication:5 2 :5 1st Qu.: 61
## Governance :5 3 :5 Median : 1450
## Personal Development :5 4 :5 Mean : 33758
## Service Excellence :5 5 :5 3rd Qu.: 6793
## 10 :0 Max. :312359
## (Other):0
summary(Question_df)
## Question Response Freq
## Q1 : 5 1 :9 Min. : 0
## Q11 : 5 2 :9 1st Qu.: 422
## Q2 : 5 3 :9 Median :14601
## Q3 : 5 4 :9 Mean :18754
## Q4 : 5 5 :9 3rd Qu.:32749
## Q5 : 5 10 :0 Max. :74840
## (Other):15 (Other):0
We will skip the visualisation of the two-way contingency table for ID against Response as there are too many unique values for ID.
# Visualisation of Year against Response
ggbarplot(Year_df, x = "Year", y = "Freq", title = "General Response over the Years",
color = "black", fill = "Response", xlab = "Year", ylab = "Frequency",
position = position_dodge())
From the Bar Plot shown above, the general response in each year is a rating of 4 to 5. This suggests that, over the years, the majority of the participants agree or strongly agree with the objectives set out for their courses.
# Visualisation of Type against Response
ggbarplot(Type_df, x = "Type", y = "Freq",
title = "General Response based on Type of Course",
color = "black", fill = "Response", xlab = "Type of Course",
ylab = "Frequency", position = position_dodge())
From the Bar Plot shown above, the general response for both types of course is a rating of 4 to 5. This suggests that the majority of the participants rate in-house and public courses similarly, with no particular preference for either type.
# Visualisation of Domain against Response
ggbarplot(Domain_df, x = "Domain", y = "Freq",
title = "General Response based on Course Domain",
color = "black", fill = "Response", xlab = "Course Domain",
ylab = "Frequency", position = position_dodge()) +
scale_x_discrete(limits = c("Data Analytics", "Effective Communication", "Governance",
"Personal Development", "Service Excellence"),
labels = c("Data\nAnalytics", "Effective\nCommunication", "Governance",
"Personal\nDevelopment", "Service\nExcellence"))
From the Bar Plot shown above, ratings of 4 and 5 dominate overall, and the absolute frequencies are by far the highest for Effective Communication and Personal Development. This suggests that the majority of the participants took a course in one of these two domains. They may have a preference for these courses from ABC.
# Visualisation of Question against Response
ggbarplot(Question_df, x = "Question", y = "Freq",
title = "General Response based on Questions in Course Feedback",
color = "black", fill = "Response",
xlab = "Questions in Course Feedback", ylab = "Frequency",
order = c("Q1", "Q2", "Q3", "Q4", "Q5", "Q6", "Q7", "Q8", "Q11"),
position = position_dodge())
From the Bar Plot shown above, the general response is a rating of 4 to 5 for all the questions, excluding Questions 9 and 10. This suggests that the majority of the participants agree or strongly agree with the objectives set out for their courses.
# Computation of Relative Frequencies for Year against Response
Year_prop <- table(Year, Response)
Year_prop <- Year_prop[, -c(2, 7, 8, 9, 10, 11, 12)]
Year_prop <- prop.table(Year_prop, 1)
print(Year_prop)
## Response
## Year 1 2 3 4 5
## 2016 0.0001911362 0.0091704691 0.1460524282 0.5261693873 0.3184165792
## 2017 0.0002519526 0.0112898775 0.1674837131 0.5406495579 0.2803248989
## 2018 0.0004302356 0.0147714235 0.1904178802 0.5435144736 0.2508659871
# Computation of Relative Percentages for Year against Response
Year_percent <- round(Year_prop, 2)*100
print(Year_percent)
## Response
## Year 1 2 3 4 5
## 2016 0 1 15 53 32
## 2017 0 1 17 54 28
## 2018 0 1 19 54 25
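From 2016 to 2018, a rating of 4 to 5 is the general response in every year (84%, 82% and 79% respectively, computed from the unrounded proportions). This suggests that the majority of the participants agree or strongly agree with the objectives set out for their courses, although the share of such ratings declined each year.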
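To make the year-on-year movement explicit, the share of ratings 4 and 5 can be summed per year from the unrounded proportions and then differenced. The sketch below re-enters the printed `Year_prop` values by hand; the names `year_prop`, `top_two` and `change` are introduced for illustration.

```r
# Row proportions copied from the printed Year_prop table
year_prop <- matrix(
  c(0.0001911362, 0.0091704691, 0.1460524282, 0.5261693873, 0.3184165792,
    0.0002519526, 0.0112898775, 0.1674837131, 0.5406495579, 0.2803248989,
    0.0004302356, 0.0147714235, 0.1904178802, 0.5435144736, 0.2508659871),
  nrow = 3, byrow = TRUE,
  dimnames = list(Year = c("2016", "2017", "2018"),
                  Response = as.character(1:5))
)

# Share of ratings 4 and 5 per year
top_two <- rowSums(year_prop[, c("4", "5")])
print(round(top_two * 100, 1))

# Year-on-year change in percentage points
change <- diff(top_two) * 100
print(round(change, 1))
```

Both differences are negative, which is consistent with the gradual shift from rating 5 towards rating 3 visible in the table.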
# Computation of Relative Frequencies for Type against Response
Type_prop <- table(Type, Response)
Type_prop <- Type_prop[, -c(2, 7, 8, 9, 10, 11, 12)]
Type_prop <- prop.table(Type_prop, 1)
print(Type_prop)
## Response
## Type 1 2 3 4 5
## Inhouse 0.0002247356 0.0100144932 0.1570741261 0.5332081235 0.2994785217
## Public 0.0003236317 0.0129232037 0.1758864935 0.5411490398 0.2697176313
# Computation of Relative Percentages for Type against Response
Type_percent <- round(Type_prop, 2)*100
print(Type_percent)
## Response
## Type 1 2 3 4 5
## Inhouse 0 1 16 53 30
## Public 0 1 18 54 27
For the type of course, a rating of 4 to 5 is the general response (83% for in-house and 81% for public courses). This suggests that the majority of the participants rate both types similarly and do not have a particular preference for in-house or public courses.
# Computation of Relative Frequencies for Domain against Response
Domain_prop <- table(Domain, Response)
Domain_prop <- Domain_prop[, -c(2, 7, 8, 9, 10, 11, 12)]
Domain_prop <- prop.table(Domain_prop, 1)
print(Domain_prop)
## Response
## Domain 1 2 3 4
## Data Analytics 0.0000000000 0.0103953647 0.2471029312 0.4645535106
## Effective Communication 0.0001374733 0.0086814416 0.1400939287 0.5367629496
## Governance 0.0230607966 0.4633123690 0.4675052411 0.0461215933
## Personal Development 0.0004550374 0.0174021055 0.2327823691 0.5474058770
## Service Excellence 0.0023876524 0.0050311248 0.0214888718 0.3918308178
## Response
## Domain 5
## Data Analytics 0.2779481936
## Effective Communication 0.3143242068
## Governance 0.0000000000
## Personal Development 0.2019546110
## Service Excellence 0.5792615332
# Computation of Relative Percentages for Domain against Response
Domain_percent <- round(Domain_prop, 2)*100
print(Domain_percent)
## Response
## Domain 1 2 3 4 5
## Data Analytics 0 1 25 46 28
## Effective Communication 0 1 14 54 31
## Governance 2 46 47 5 0
## Personal Development 0 2 23 55 20
## Service Excellence 0 1 2 39 58
For the course domain, a rating of 4 to 5 is the general response in Data Analytics (74%), Effective Communication (85%), Personal Development (75%) and Service Excellence (97%). Governance is the exception: only about 5% of its responses are a 4 or 5 and most are a 2 or 3 (46% and 47%), although this domain accounts for just 477 responses. As in the Bar Plot for course domain, absolute frequencies are highest for Effective Communication and Personal Development. This suggests that the majority of the participants have taken a course in these two areas. They may have a preference for these courses from ABC.
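Because the domains differ greatly in size, the overall share of ratings 4 and 5 is a weighted average that is dominated by the largest domains. The sketch below combines the response counts per domain from `summary(data8)` with the printed `Domain_prop` values; the names `n`, `top_two`, `weights` and `overall` are introduced for illustration.

```r
# Number of responses per domain, from summary(data8)
n <- c("Data Analytics" = 5868, "Effective Communication" = 581931,
       "Governance" = 477, "Personal Development" = 243936,
       "Service Excellence" = 11727)

# Share of ratings 4 and 5 per domain, from the printed Domain_prop table
top_two <- c(0.4645535106 + 0.2779481936,
             0.5367629496 + 0.3143242068,
             0.0461215933 + 0.0000000000,
             0.5474058770 + 0.2019546110,
             0.3918308178 + 0.5792615332)
names(top_two) <- names(n)

# Each domain's weight in the overall figure
weights <- n / sum(n)
print(round(weights * 100, 1))

# Overall share of ratings 4 and 5 as a weighted average of the domain shares
overall <- sum(weights * top_two)
print(round(overall * 100, 1))
```

Effective Communication and Personal Development together carry almost all of the weight, so the low Governance ratings have very little effect on the overall figure.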
# Computation of Relative Frequencies for Question against Response
Question_prop <- table(Question, Response)
Question_prop <- Question_prop[, -c(2, 7, 8, 9, 10, 11, 12)]
Question_prop <- Question_prop[-c(2, 11), ]
Question_prop <- Question_prop[c(1, 3, 4, 5, 6, 7, 8, 9, 2), ]
Question_prop <- prop.table(Question_prop, 1)
print(Question_prop)
## Response
## Question 1 2 3 4 5
## Q1 4.265711e-05 9.053972e-03 1.765045e-01 5.267087e-01 2.876902e-01
## Q2 1.397020e-03 3.377377e-02 2.111527e-01 4.249821e-01 3.286944e-01
## Q3 6.291924e-04 2.220303e-02 2.075908e-01 4.659330e-01 3.036440e-01
## Q4 7.464995e-05 9.544529e-03 1.498118e-01 4.809163e-01 3.596528e-01
## Q5 1.599642e-04 1.033369e-02 1.557091e-01 4.845528e-01 3.492444e-01
## Q6 1.386356e-04 9.203272e-03 1.465378e-01 4.789007e-01 3.652195e-01
## Q7 0.000000e+00 4.084418e-03 1.679304e-01 5.979461e-01 2.300391e-01
## Q8 1.066428e-05 4.500325e-03 1.610626e-01 5.753591e-01 2.590673e-01
## Q11 0.000000e+00 8.531422e-05 1.191946e-01 7.981146e-01 8.260550e-02
# Computation of Relative Percentages for Question against Response
Question_percent <- round(Question_prop, 2)*100
print(Question_percent)
## Response
## Question 1 2 3 4 5
## Q1 0 1 18 53 29
## Q2 0 3 21 42 33
## Q3 0 2 21 47 30
## Q4 0 1 15 48 36
## Q5 0 1 16 48 35
## Q6 0 1 15 48 37
## Q7 0 0 17 60 23
## Q8 0 0 16 58 26
## Q11 0 0 12 80 8
For the questions in the course feedback, a rating of 4 to 5 is the general response, excluding Questions 9 and 10. Computed from the unrounded proportions, the relative percentages are 81% for Question 1, 75% for Question 2, 77% for Question 3, 84% for Question 4, 83% for Question 5, 84% for Question 6, 83% for Question 7, 83% for Question 8, and 88% for Question 11. This suggests that the majority of the participants agree or strongly agree with the objectives set out for their courses.
The ABC is said to meet public service officers’ expectations as a central learning institution for the Public Service if at least 75% of all public service officers either agree or strongly agree with the following statement: Does ABC meet your expectations as a central learning institution for the public service? Otherwise, the College is said not to meet public service officers’ expectations.
Essentially, this statement is reflected in Question 11, and 88% of all participants who attended at least one course at the College between 2016 and 2018 inclusive gave it a rating of 4 or 5, across different course domains and course types. Therefore, it can be said that the ABC has met the expectations of public service officers as a central learning institution for the Public Service.
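The decision rule above can be written out as a short check. This is a minimal sketch that assumes ratings 4 and 5 correspond to "agree" and "strongly agree"; it uses the Question 11 row of the printed `Question_prop` table, and the names `threshold`, `q11`, `agree_share` and `meets_expectations` are introduced for illustration.

```r
# Threshold stated in the text: at least 75% agree or strongly agree
threshold <- 0.75

# Question 11 row proportions for ratings 1 to 5, from the printed Question_prop table
q11 <- c("1" = 0, "2" = 8.531422e-05, "3" = 1.191946e-01,
         "4" = 7.981146e-01, "5" = 8.260550e-02)

# Share of respondents who gave a rating of 4 or 5
agree_share <- sum(q11[c("4", "5")])
print(round(agree_share * 100, 1))

# TRUE if the share reaches the threshold
meets_expectations <- agree_share >= threshold
print(meets_expectations)
```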