During the Fall 2018 semester, North Shore Community College revamped its website. The new site featured a redesigned homepage, more user-friendly access for current students and faculty, and clearer tabs and links to make applying easier for prospective students. However, once the website went live, users discovered bugs that still had to be worked out. Many students and faculty members also voiced their disapproval of the new website and their preference for the old one. Some believed that if there was nothing wrong with the old North Shore website, it should have been left as is. This analysis examines whether the sampled data show a significant difference in overall satisfaction between the two websites.
Is there a difference in how satisfied people at North Shore Community College are with the new website versus the old website?
Because of the many complaints I had heard from other students, as well as from some of my professors, I decided to conduct a small, independent survey to analyze whether there is a significant difference in how people at North Shore Community College viewed each website. I surveyed students of various majors, faculty from different departments, and staff members from different areas of the college, on both the Danvers and Lynn campuses. The survey asked each person to rate their satisfaction with several categories of each website on a scale of 1 to 3. A rating of 1 is ‘Not Satisfied’, 2 is ‘Somewhat Satisfied’, and 3 is ‘Very Satisfied’. Components that did not apply to a person’s experience were rated ‘NA’. Six categories were included in the data for each website: accessing the personal portal, viewing Blackboard, accessing the library database, overall load times, overall aesthetics, and accessing the website through personal handheld devices. The same categories were used for both websites.
To explore this data thoroughly, we need to look at it through several functions. First, we load the data into our environment. A composite score was then created for the new and the old website by averaging each respondent's ratings across the six categories, ignoring NA responses. The composite score describes overall satisfaction across all categories, which lets us analyze the data as a whole.
# Load and store surveyed dataset
Website_survey <- read.csv("C:/Users/Guard/Desktop/final_project_cleaned_data.csv")
# Create and store as a dataframe
df <- as.data.frame(Website_survey)
# Create composite scores for the new website data
New_Comp_Score <- rowMeans(df[,4:9], na.rm = TRUE)
# Create composite scores for the old website data
Old_Comp_Score <- rowMeans(df[,10:15], na.rm = TRUE)
# Load libraries
library(reshape2)
## Warning: package 'reshape2' was built under R version 3.5.3
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.5.3
# Reshape data frame as variable and value
mdf <- melt(df, id=c("ID","Status", "Department"))
## Warning: attributes are not identical across measure variables; they will
## be dropped
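To illustrate how the composite scores treat missing ratings, here is a minimal sketch using made-up toy ratings (the `toy` data frame and its values are hypothetical, not survey responses). It shows that `rowMeans(..., na.rm = TRUE)` averages only the categories a respondent actually rated, and that a respondent who rated nothing gets `NaN`, which may be one source of the missing composite values that appear later in the analysis.

```r
# Toy ratings (hypothetical, not survey data): four respondents, three categories
toy <- data.frame(
  Portal     = c(1, 2, 3, NA),
  Blackboard = c(2, NA, 3, NA),
  Library    = c(NA, NA, 3, NA)
)

# With na.rm = TRUE, each row's mean uses only the categories that were rated
comp <- rowMeans(toy, na.rm = TRUE)
print(comp)

# Number of categories each respondent actually rated
rated <- rowSums(!is.na(toy))
print(rated)

# A row with no ratings at all yields NaN rather than a number
print(is.nan(comp))
```

Because the number of rated categories differs between respondents, two equal composite scores can rest on different amounts of information.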
By exploring the data with several functions, we can see the variables and observations contained in the survey. The summary, structure, table, and ggplot functions give us a better look at how the data is set up.
# View the structure of the dataframe
str(df)
## 'data.frame': 19 obs. of 19 variables:
## $ ID : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Status : Factor w/ 4 levels "","Faculty","Staff",..: 2 4 4 4 4 4 4 4 2 4 ...
## $ Department : Factor w/ 17 levels "","Biology","Business Admin",..: 11 8 9 16 9 6 9 3 14 2 ...
## $ New_Personal_Portal: int 1 2 2 3 3 3 2 2 2 3 ...
## $ New_Blackboard : int 2 2 3 3 2 2 2 NA 2 1 ...
## $ New_Library : int NA 1 2 1 2 NA 2 1 2 2 ...
## $ New_Load_Time : int 1 2 3 3 3 3 2 2 3 2 ...
## $ New_Aesthetics : int 1 2 3 3 3 3 1 1 1 1 ...
## $ New_Devices : int NA 1 3 2 2 3 2 1 1 2 ...
## $ OId_Personal_Portal: int 3 1 3 1 1 3 2 2 3 3 ...
## $ Old_Blackboard : int 2 1 3 3 1 3 3 NA 3 3 ...
## $ Old_Library : int NA 1 2 3 2 NA 3 2 3 NA ...
## $ Old_Load_Time : int 2 1 1 3 2 2 1 2 3 3 ...
## $ Old_Aesthetics : int 2 1 2 2 2 3 3 2 1 2 ...
## $ Old_Devices : int NA 1 2 2 2 3 3 1 2 2 ...
## $ Times : Factor w/ 4 levels "","4-Jan","7-May",..: 4 4 3 4 3 4 3 4 4 2 ...
## $ Platform : Factor w/ 4 levels "","Campus","Personal",..: 3 3 3 3 3 2 4 3 2 2 ...
## $ New_Comp_Score : num 1.25 1.67 2.67 2.5 2.5 ...
## $ Old_Comp_Score : num 2.25 1 2.17 2.33 1.67 ...
# Determine if there are any NA's in the data
table(is.na(df))
##
## FALSE TRUE
## 320 41
# View the summary of people surveyed
summary(df$Status)
## Faculty Staff Student
## 1 4 3 11
There are 19 observations of 19 variables, with 41 NA values. The survey includes 11 students, 4 faculty, and 3 staff members; one respondent's status was left blank.
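The overall NA count does not show where the missing ratings are concentrated. As a rough sketch, assuming a toy data frame with hypothetical values (not the survey data), the NA count can be broken down by column. Note that blank strings, such as the empty `Status` level above, are not counted as NA by `is.na()`.

```r
# Toy data frame (hypothetical values) with a few missing ratings
toy <- data.frame(
  New_Library = c(NA, 1, 2, NA),
  Old_Library = c(NA, 1, 2, 3),
  New_Devices = c(1, 2, 3, 2)
)

# Count NA entries in each column
na_per_column <- colSums(is.na(toy))
print(na_per_column)

# Total NA count, the same number as the TRUE cell of table(is.na(toy))
total_na <- sum(is.na(toy))
print(total_na)
```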
# Subset Personal Portal responses
personal_portal <- subset(mdf, variable == "New_Personal_Portal" | variable == "OId_Personal_Portal")
# Create a bar graph showing satisfaction of accessing the Personal Portal of each website
(pp <- ggplot(data = personal_portal, aes(value)) + geom_bar(aes(fill = variable)))
# Create a bar graph showing satisfaction of viewing Blackboard of each website
blackboard <- subset(mdf, variable == "New_Blackboard" | variable == "Old_Blackboard")
(pp <- ggplot(data = blackboard, aes(value)) + geom_bar(aes(fill = variable)))
# Create a bar graph showing satisfaction of accessing the Library webpage of each website
# Named library_access to avoid masking the name of the base library() function
library_access <- subset(mdf, variable == "New_Library" | variable == "Old_Library")
(pp <- ggplot(data = library_access, aes(value)) + geom_bar(aes(fill = variable)))
# Create a bar graph showing satisfaction of main load times of each website
loadtime <- subset(mdf, variable == "New_Load_Time" | variable == "Old_Load_Time")
(pp <- ggplot(data = loadtime, aes(value)) + geom_bar(aes(fill = variable)))
# Create a bar graph showing satisfaction of the aesthetics of each website
aesthetics <- subset(mdf, variable == "New_Aesthetics" | variable == "Old_Aesthetics")
(pp <- ggplot(data = aesthetics, aes(value)) + geom_bar(aes(fill = variable)))
# Create a bar graph showing satisfaction of accessing each website through other devices
devices <- subset(mdf, variable == "New_Devices" | variable == "Old_Devices")
(pp <- ggplot(data = devices, aes(value)) + geom_bar(aes(fill = variable)))
# Subset the composite scores for each website
composite_score <- subset(mdf, variable == "New_Comp_Score" | variable == "Old_Comp_Score")
composite_score$value <- as.numeric(composite_score$value)
# Create a boxplot comparing the overall satisfaction of each website
pp <- ggplot(data = composite_score, aes(variable, value)) + geom_boxplot()
pp
## Warning: Removed 2 rows containing non-finite values (stat_boxplot).
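The bar graphs can be complemented by a table of counts per rating. The following is a minimal base R sketch on toy data (the `toy` data frame and its ratings are hypothetical, not the survey responses); it uses `stack()` rather than `melt()` only to keep the example free of package dependencies.

```r
# Toy wide data (hypothetical ratings, not the survey responses)
toy <- data.frame(
  New_Aesthetics = c(3, 3, 2, 3, NA),
  Old_Aesthetics = c(2, 1, 2, 2, 3)
)

# stack() turns the two columns into long format: 'values' and 'ind'
long <- stack(toy)

# Rows are website versions, columns are ratings 1 to 3; NAs are dropped
counts <- table(long$ind, factor(long$values, levels = 1:3))
print(counts)

# Row-wise proportions keep the versions comparable when NA counts differ
print(round(prop.table(counts, margin = 1), 2))
```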
We can conduct a paired t-test to determine whether there is a significant difference in overall satisfaction between the two website versions.
\(H_0: \mu_{New} - \mu_{Old} = 0\)
\(H_A: \mu_{New} - \mu_{Old} \neq 0\)
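For reference, a paired t-test works on the per-respondent differences \(d_i = New_i - Old_i\). The symbols \(d_i\), \(\bar{d}\), \(s_d\), and \(n\) are notation introduced here for illustration. Under the null hypothesis, the test statistic is

\[ t = \frac{\bar{d}}{s_d / \sqrt{n}}, \qquad df = n - 1 \]

where \(\bar{d}\) is the mean difference, \(s_d\) is the standard deviation of the differences, and \(n\) is the number of respondents with both composite scores. The matching confidence interval for the mean difference is \(\bar{d} \pm t^{*} \cdot s_d / \sqrt{n}\), with \(t^{*}\) the critical value for \(n - 1\) degrees of freedom.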
# Find the difference in the composite scores
df$compdiff <- (New_Comp_Score) - (Old_Comp_Score)
# Find the mean difference in composite scores
(meandiff <- mean(df$compdiff, na.rm = TRUE))
## [1] -0.1185185
# Find the standard deviation difference in composite scores
(sddiff <- sd(df$compdiff, na.rm = TRUE))
## [1] 0.6319616
# Find sample size of composite scores
table(is.na(df$compdiff))
##
## FALSE TRUE
## 18 1
# Calculate the t-critical value for a 95% confidence interval
(tcv <- abs(qt(p = 0.025, df = 17)))
## [1] 2.109816
# Find the standard error of the mean difference (multiplied by tcv below, it gives the margin of error)
(merror <- sddiff/sqrt(18))
## [1] 0.1489548
# Calculate lower bound of a 95% confidence interval
meandiff - tcv * merror
## [1] -0.4327856
# Calculate upper bound of a 95% confidence interval
meandiff + tcv * merror
## [1] 0.1957486
Because the 95% confidence interval (-0.43, 0.20) contains 0, we fail to reject the null hypothesis.
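The interval steps above can be collected into a small function. This is only a sketch: `paired_ci` is a hypothetical helper name, and `d_toy` holds made-up differences rather than the survey data. The result is checked against the interval that a one-sample `t.test` on the differences reports.

```r
# Hypothetical helper: t-based confidence interval for the mean of paired differences
paired_ci <- function(d, conf = 0.95) {
  d <- d[!is.na(d)]                            # keep complete pairs only
  n <- length(d)
  se <- sd(d) / sqrt(n)                        # standard error of the mean difference
  tcrit <- qt(1 - (1 - conf) / 2, df = n - 1)  # two-sided critical value
  mean(d) + c(-1, 1) * tcrit * se              # lower and upper bounds
}

# Toy differences (made-up values, not the survey data)
d_toy <- c(-1.0, 0.5, 0.2, -0.3, NA, 0.1, -0.6)
ci <- paired_ci(d_toy)
print(ci)

# The interval should agree with the one from a one-sample t.test
print(t.test(d_toy)$conf.int)
```

If the interval contains 0, the data are consistent with no mean difference at the chosen confidence level.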
# Use the t.test function to determine the p-value
# Note: without paired = TRUE, t.test runs Welch's two-sample test rather than a paired test
t.test(New_Comp_Score, Old_Comp_Score)$p.value
## [1] 0.4977981
Using a significance level of 0.05, we fail to reject the null hypothesis, since the p-value of 0.50 is greater than 0.05.
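The `t.test` call above does not set `paired = TRUE`, so R treats the two sets of composite scores as independent samples (Welch's test). The following sketch, using hypothetical toy vectors rather than the survey data, shows how a paired call could be written and that it matches a one-sample test on the differences. No output values are shown because they depend on the data supplied.

```r
# Toy paired ratings (hypothetical values, not the survey data)
new_toy <- c(1.5, 2.0, 2.7, 2.5, 1.8, 2.2, NA, 2.0)
old_toy <- c(2.2, 1.0, 2.2, 2.3, 1.7, 2.5, 2.0, 2.4)

# Default call: Welch two-sample test, treats the groups as independent
p_welch <- t.test(new_toy, old_toy)$p.value

# Paired call: tests the mean of the per-respondent differences
p_paired <- t.test(new_toy, old_toy, paired = TRUE)$p.value

# Equivalent one-sample test on the differences (incomplete pairs are dropped)
p_diff <- t.test(new_toy - old_toy)$p.value

print(c(welch = p_welch, paired = p_paired, one_sample = p_diff))
```

Applied to the survey, the paired form would be `t.test(New_Comp_Score, Old_Comp_Score, paired = TRUE)`; its p-value may differ from the one reported above.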
Reviewing both tests, there is not enough evidence to suggest a difference in the level of satisfaction between the new and old North Shore Community College websites. We can be 95% confident that the mean difference in composite scores (new minus old) lies between -0.43 and 0.20. The t-test returned a p-value of 0.50, which is greater than the 0.05 significance level required to reject the null hypothesis.
Even though there is not enough evidence to suggest a difference in overall satisfaction, the graphs give an in-depth look at satisfaction with each aspect of each website. People were generally more satisfied with accessing the personal portal and Blackboard on the old website, while the new website was the clear favorite for aesthetics. Satisfaction with load times and library access was about the same for both. Finally, the boxplot suggests slightly higher overall satisfaction with the old website than with the new one.
Because I conducted this survey independently, the most obvious limitation was the small sample size, which makes it harder to capture the true overall satisfaction of people at North Shore Community College. Beyond the small number of surveys collected, there were far more responses from students than from faculty and staff, so the sample was not as diverse as it should have been. Another limitation is how the survey was conducted. Some people had to ask what the different categories in the survey referred to and were therefore unsure how to rate them. I feel many of the NA responses resulted from this. The survey could have been clearer, and it could have been revised after a colleague had looked it over. The data also required a bit of cleaning and transformation after entry before it could be analyzed as a whole. I did not knowingly introduce bias when drafting the survey, conducting it, or analyzing the data, as I wanted an accurate analysis based solely on the survey responses collected. Overall, the data and the analysis suggest a different outcome than the one I was expecting.
This document was produced as a final project for MAT 143H - Introduction to Statistics (Honors) at North Shore Community College.
The course was led by Professor Billy Jackson.
Student Name: Jamie Perry Semester: Spring 2019