Is the Use of Artificial Intelligence Affecting the Learning Tendencies of Higher Education Students?
- Use of Artificial Intelligence in higher education is associated with declining active learning behaviors among students.
- Use of Artificial Intelligence in higher education is associated with passive learning tendencies among students.
- Use of Artificial Intelligence in higher education is associated with laziness among students.
Active Learning: avrgActive (AP1, AP2, AP3, AP4, AP5, AP6, AP7, AP8, AP10)
Passive Learning: AP9
Laziness: avrgLaziness (L1, L2, L3, L4, L5, L6)
Predictors: AI_use, Degree, AI_tool
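# Packages used later in the analysis
library(dplyr)     # %>% and filter()
library(ggplot2)   # ggplot()
library(plotly)    # plot_ly(), layout(), ggplotly()
library(corrplot)  # corrplot()
library(car)       # vif(), durbinWatsonTest()
data <- read.csv("C:/Users/billy/OneDrive/Documents/ANLY 699/AIdata.csv")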
str(data)
## 'data.frame': 64 obs. of 20 variables:
## $ Timestamp : chr "2024/04/16 9:25:37 PM AST" "2024/04/16 10:53:10 PM AST" "2024/04/19 1:57:47 PM AST" "2024/04/20 2:59:21 PM AST" ...
## $ I.am.a.student.pursuing.a.n.. : chr "Undergraduate Degree" "Undergraduate Degree" "Graduate Degree" "Graduate Degree" ...
## $ I.frequently.use.AI.based.tools.or.systems.in.my.coursework. : int 3 3 3 4 5 2 1 1 5 3 ...
## $ I.commonly.use.the.following.types.of.AI.tools.in.my.studies. : chr "Grammarly" "Photoshop ai generator, ChatGPT " "ChatGPT" "Chatgpt, QuilBot" ...
## $ Made.a.class.or.online.presentation... : int 2 1 2 4 5 1 2 5 1 1 ...
## $ Participated.in.a.community.based.project..e.g...volunteering..as.part.of.your.study. : int 1 1 3 1 1 1 5 5 2 2 ...
## $ Discussed.ideas.from.your.readings.or.classes.with.others.outside.class..e.g...students..family.members..co.workers..: int 3 3 5 4 4 1 5 5 4 3 ...
## $ Tutored.or.taught.other.university.students..paid.or.voluntary.. : int 1 1 1 1 4 1 1 5 2 1 ...
## $ I.communicated.or.worked.online.with.other.students.rather.than.use.AI. : int 4 5 2 3 4 5 5 5 2 5 ...
## $ Asked.questions.or.contributed.to.discussions.in.class.or.online.without.AI.assistance : int 5 5 4 4 5 5 5 5 1 5 ...
## $ Instead.of.using.AI.tools..I.searched.online.for.resources.relevant.to.my.studies. : int 5 4 5 3 4 3 5 5 1 4 ...
## $ I.persisted.with.challenging.learning.activities.despite.initial.setbacks.without.using.AI. : int 3 5 3 2 3 3 5 5 4 3 ...
## $ I.feel.less.engaged.with.the.course.material.when.I.use.AI. : int 4 2 4 5 5 1 1 5 1 2 ...
## $ I.tend.to.come.up.with.ideas.independently.prior.to.using.AI. : int 2 5 4 3 2 5 3 1 5 4 ...
## $ Do.you.feel.a.lack.of.ability.contributes.to.your.tendency.to.delay.tasks.. : int 3 4 2 2 2 1 5 5 4 5 ...
## $ How.much.does.a.lack.of.interest.or.enthusiasm.impact.your.motivation.to.complete.tasks.promptly.. : int 5 2 4 5 5 5 4 5 5 4 ...
## $ To.what.degree.do.you.find.yourself.intentionally.delaying.tasks.without.any.external.pressure.or.influence.. : int 4 2 4 4 4 3 2 5 5 5 ...
## $ How.often.does.AI.use.contribute.to.your.tendency.to.procrastinate.on.tasks.. : int 4 2 3 4 3 1 5 3 4 3 ...
## $ I.feel.motivated.to.complete.my.school.work.without.using.AI.tools.. : int 4 4 3 3 4 5 5 1 1 2 ...
## $ The.availability.of.AI.increases.my.laziness.tendencies. : int 3 2 4 5 2 1 1 5 3 4 ...
data <- data[,-1]
colnames(data) <- c("Degree", "AI_use", "AI_tool", "AP1", "AP2", "AP3", "AP4", "AP5", "AP6", "AP7", "AP8", "AP9", "AP10", "L1", "L2", "L3", "L4", "L5", "L6")
exclusionCriteria <- data$AI_tool %in% c("almost never", "google scholar", "Google Scholar")
data <- data[!exclusionCriteria, ]
summary(data)
## Degree AI_use AI_tool AP1
## Length:63 Min. :1.000 Length:63 Min. :1.000
## Class :character 1st Qu.:3.000 Class :character 1st Qu.:1.000
## Mode :character Median :4.000 Mode :character Median :2.000
## Mean :3.667 Mean :2.619
## 3rd Qu.:5.000 3rd Qu.:4.000
## Max. :5.000 Max. :5.000
## AP2 AP3 AP4 AP5
## Min. :1.000 Min. :1.000 Min. :1.000 Min. :2.000
## 1st Qu.:2.500 1st Qu.:4.000 1st Qu.:1.000 1st Qu.:3.000
## Median :4.000 Median :4.000 Median :1.000 Median :4.000
## Mean :3.651 Mean :4.127 Mean :2.016 Mean :4.063
## 3rd Qu.:5.000 3rd Qu.:5.000 3rd Qu.:3.000 3rd Qu.:5.000
## Max. :5.000 Max. :5.000 Max. :5.000 Max. :5.000
## AP6 AP7 AP8 AP9
## Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000
## 1st Qu.:4.000 1st Qu.:3.000 1st Qu.:3.000 1st Qu.:2.000
## Median :5.000 Median :4.000 Median :4.000 Median :4.000
## Mean :4.286 Mean :3.778 Mean :3.825 Mean :3.286
## 3rd Qu.:5.000 3rd Qu.:5.000 3rd Qu.:5.000 3rd Qu.:5.000
## Max. :5.000 Max. :5.000 Max. :5.000 Max. :5.000
## AP10 L1 L2 L3
## Min. :1.000 Min. :1.000 Min. :2.000 Min. :1.000
## 1st Qu.:2.000 1st Qu.:3.000 1st Qu.:4.000 1st Qu.:2.000
## Median :4.000 Median :4.000 Median :5.000 Median :3.000
## Mean :3.444 Mean :3.984 Mean :4.317 Mean :3.317
## 3rd Qu.:5.000 3rd Qu.:5.000 3rd Qu.:5.000 3rd Qu.:5.000
## Max. :5.000 Max. :5.000 Max. :5.000 Max. :5.000
## L4 L5 L6
## Min. :1.000 Min. :1.000 Min. :1.000
## 1st Qu.:3.000 1st Qu.:2.000 1st Qu.:2.000
## Median :4.000 Median :4.000 Median :4.000
## Mean :3.857 Mean :3.429 Mean :3.571
## 3rd Qu.:5.000 3rd Qu.:5.000 3rd Qu.:5.000
## Max. :5.000 Max. :5.000 Max. :5.000
# Preprocess responses to lowercase and remove whitespace
data$AI_tool <- tolower(trimws(data$AI_tool))
# Define custom levels based on categories
custom_levels <- c("chatgpt/openai", "grammarly", "photoshop ai generator", "quillbot", "tutorai", "google bard/gemini", "prowritingaid", "trinka", "consensus", "scite", "microsoft copilot", "cognii", "mathly", "unschooler", "duolingo", "claude.ai", "perplexity.ai", "pi.ai", "kiwi", "ivy.ai", "cramify.ai", "mindgrasp.ai", "teach anything", "soofy.io ai")
# Create a vector to store categorized responses
categorized_responses <- character(length(data$AI_tool))
# Loop through each processed response and categorize them
for (i in seq_along(data$AI_tool)) {
# Compare the whole response against known tool names and categorize accordingly
# (%in% tests exact equality, so responses listing several tools stay uncategorized)
if ("chatgpt" %in% data$AI_tool[i] | " chatgpt " %in% data$AI_tool[i] | "openai" %in% data$AI_tool[i] | "chat got " %in% data$AI_tool[i] | "GPT-4" %in% data$AI_tool[i] | "llms - gpt4" %in% data$AI_tool[i]) {
categorized_responses[i] <- "chatgpt/openai"
} else if ("grammarly" %in% data$AI_tool[i] | "www.Grammarly.com" %in% data$AI_tool[i]){
categorized_responses[i] <- "grammarly"
} else if ("photoshop ai generator" %in% data$AI_tool[i]) {
categorized_responses[i] <- "photoshop ai generator"
} else if ("quillbot" %in% data$AI_tool[i] | " quilbot" %in% data$AI_tool[i]) {
categorized_responses[i] <- "quillbot"
} else if ("tutorai" %in% data$AI_tool[i] | "tutor ai" %in% data$AI_tool[i]) {
categorized_responses[i] <- "tutorai"
} else if ("google bard" %in% data$AI_tool[i] | "google gemini" %in% data$AI_tool[i] | "bard/gemini" %in% data$AI_tool[i]) {
categorized_responses[i] <- "google bard/gemini"
} else if ("prowritingaid" %in% data$AI_tool[i]) {
categorized_responses[i] <- "prowritingaid"
}else if ("trinka" %in% data$AI_tool[i]) {
categorized_responses[i] <- "trinka"
}else if ("consensus" %in% data$AI_tool[i]) {
categorized_responses[i] <- "consensus"
}else if ("scite" %in% data$AI_tool[i]) {
categorized_responses[i] <- "scite"
}else if ("microsoft copilot" %in% data$AI_tool[i] | "copilot" %in% data$AI_tool[i]) {
categorized_responses[i] <- "microsoft copilot"
}else if ("cognii" %in% data$AI_tool[i] | "Cognii.com" %in% data$AI_tool[i] ) {
categorized_responses[i] <- "cognii"
}else if ("mathly" %in% data$AI_tool[i]) {
categorized_responses[i] <- "mathly"
}else if ("unschooler" %in% data$AI_tool[i]) {
categorized_responses[i] <- "unschooler"
}else if ("duolingo" %in% data$AI_tool[i]) {
categorized_responses[i] <- "duolingo"
}else if ("claude.ai" %in% data$AI_tool[i]) {
categorized_responses[i] <- "claude.ai"
}else if ("perplexity.ai" %in% data$AI_tool[i]) {
categorized_responses[i] <- "perplexity.ai"
}else if ("pi.ai" %in% data$AI_tool[i] | "pi" %in% data$AI_tool[i] ) {
categorized_responses[i] <- "pi.ai"
}else if ("kiwi" %in% data$AI_tool[i]) {
categorized_responses[i] <- "kiwi"
}else if ("ivy.ai" %in% data$AI_tool[i]) {
categorized_responses[i] <- "ivy.ai"
}else if ("cramify.ai" %in% data$AI_tool[i]) {
categorized_responses[i] <- "cramify.ai"
}else if ("mindgrasp.ai" %in% data$AI_tool[i]) {
categorized_responses[i] <- "mindgrasp.ai"
}else if ("teach anything" %in% data$AI_tool[i]) {
categorized_responses[i] <- "teach anything"
}else if ("soofy.io ai" %in% data$AI_tool[i] | "soofy.io AI writing tools" %in% data$AI_tool[i] ) {
categorized_responses[i] <- "soofy.io ai"
}
}
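Because `%in%` tests whole-element equality, a response such as "chatgpt, quilbot" does not match "chatgpt". A minimal sketch of substring-based matching follows. The `tool_keywords` table and the `categorize_tool()` helper are hypothetical names introduced for illustration, only three categories are shown, and a response naming several tools is assigned to the first category that matches.

```r
# Hypothetical keyword table: category name -> substrings that signal it
tool_keywords <- list(
  "chatgpt/openai" = c("chatgpt", "openai", "chat got", "gpt-4", "gpt4"),
  "grammarly"      = c("grammarly"),
  "quillbot"       = c("quillbot", "quilbot")
)

# Return the first category whose keyword occurs anywhere in the response
categorize_tool <- function(response, keywords = tool_keywords) {
  response <- tolower(trimws(response))
  for (category in names(keywords)) {
    hit <- vapply(keywords[[category]],
                  function(k) grepl(k, response, fixed = TRUE),
                  logical(1))
    if (any(hit)) return(category)
  }
  NA_character_  # nothing matched: leave the response uncategorized
}

responses <- c("Grammarly", "Photoshop ai generator, ChatGPT ", "Chatgpt, QuilBot")
categorized <- vapply(responses, categorize_tool, character(1), USE.NAMES = FALSE)
categorized
```

Lowercasing inside the helper means keywords only need to be listed in lowercase; mixed-case patterns such as "GPT-4" can never match text that has already been passed through `tolower()`.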
str(data)
## 'data.frame': 63 obs. of 19 variables:
## $ Degree : chr "Undergraduate Degree" "Undergraduate Degree" "Graduate Degree" "Graduate Degree" ...
## $ AI_use : int 3 3 3 4 5 2 1 1 5 3 ...
## $ AI_tool: chr "grammarly" "photoshop ai generator, chatgpt" "chatgpt" "chatgpt, quilbot" ...
## $ AP1 : int 2 1 2 4 5 1 2 5 1 1 ...
## $ AP2 : int 1 1 3 1 1 1 5 5 2 2 ...
## $ AP3 : int 3 3 5 4 4 1 5 5 4 3 ...
## $ AP4 : int 1 1 1 1 4 1 1 5 2 1 ...
## $ AP5 : int 4 5 2 3 4 5 5 5 2 5 ...
## $ AP6 : int 5 5 4 4 5 5 5 5 1 5 ...
## $ AP7 : int 5 4 5 3 4 3 5 5 1 4 ...
## $ AP8 : int 3 5 3 2 3 3 5 5 4 3 ...
## $ AP9 : int 4 2 4 5 5 1 1 5 1 2 ...
## $ AP10 : int 2 5 4 3 2 5 3 1 5 4 ...
## $ L1 : int 3 4 2 2 2 1 5 5 4 5 ...
## $ L2 : int 5 2 4 5 5 5 4 5 5 4 ...
## $ L3 : int 4 2 4 4 4 3 2 5 5 5 ...
## $ L4 : int 4 2 3 4 3 1 5 3 4 3 ...
## $ L5 : int 4 4 3 3 4 5 5 1 1 2 ...
## $ L6 : int 3 2 4 5 2 1 1 5 3 4 ...
summary(data)
## Degree AI_use AI_tool AP1
## Length:63 Min. :1.000 Length:63 Min. :1.000
## Class :character 1st Qu.:3.000 Class :character 1st Qu.:1.000
## Mode :character Median :4.000 Mode :character Median :2.000
## Mean :3.667 Mean :2.619
## 3rd Qu.:5.000 3rd Qu.:4.000
## Max. :5.000 Max. :5.000
## AP2 AP3 AP4 AP5
## Min. :1.000 Min. :1.000 Min. :1.000 Min. :2.000
## 1st Qu.:2.500 1st Qu.:4.000 1st Qu.:1.000 1st Qu.:3.000
## Median :4.000 Median :4.000 Median :1.000 Median :4.000
## Mean :3.651 Mean :4.127 Mean :2.016 Mean :4.063
## 3rd Qu.:5.000 3rd Qu.:5.000 3rd Qu.:3.000 3rd Qu.:5.000
## Max. :5.000 Max. :5.000 Max. :5.000 Max. :5.000
## AP6 AP7 AP8 AP9
## Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000
## 1st Qu.:4.000 1st Qu.:3.000 1st Qu.:3.000 1st Qu.:2.000
## Median :5.000 Median :4.000 Median :4.000 Median :4.000
## Mean :4.286 Mean :3.778 Mean :3.825 Mean :3.286
## 3rd Qu.:5.000 3rd Qu.:5.000 3rd Qu.:5.000 3rd Qu.:5.000
## Max. :5.000 Max. :5.000 Max. :5.000 Max. :5.000
## AP10 L1 L2 L3
## Min. :1.000 Min. :1.000 Min. :2.000 Min. :1.000
## 1st Qu.:2.000 1st Qu.:3.000 1st Qu.:4.000 1st Qu.:2.000
## Median :4.000 Median :4.000 Median :5.000 Median :3.000
## Mean :3.444 Mean :3.984 Mean :4.317 Mean :3.317
## 3rd Qu.:5.000 3rd Qu.:5.000 3rd Qu.:5.000 3rd Qu.:5.000
## Max. :5.000 Max. :5.000 Max. :5.000 Max. :5.000
## L4 L5 L6
## Min. :1.000 Min. :1.000 Min. :1.000
## 1st Qu.:3.000 1st Qu.:2.000 1st Qu.:2.000
## Median :4.000 Median :4.000 Median :4.000
## Mean :3.857 Mean :3.429 Mean :3.571
## 3rd Qu.:5.000 3rd Qu.:5.000 3rd Qu.:5.000
## Max. :5.000 Max. :5.000 Max. :5.000
data$Degree <- factor(data$Degree,
labels = c("Undergraduate", "Graduate"))
data$AI_tool <- factor(categorized_responses,
labels = custom_levels)
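In `factor()`, `labels` renames the sorted unique values by position, whereas `levels` matches values by name. The sketch below, on made-up values, shows how the two arguments behave; it is an illustration of base R semantics, not a rerun of the analysis.

```r
# labels: applied positionally to the sorted unique values ("a", then "b")
by_labels <- factor(c("b", "a", "b"), labels = c("first", "second"))
as.character(by_labels)

# levels: each value is matched by name, so the order of `wanted` cannot relabel anything
responses <- c("grammarly", "chatgpt/openai", "grammarly")
wanted    <- c("chatgpt/openai", "grammarly")
by_levels <- factor(responses, levels = wanted)
as.character(by_levels)
```

With `labels`, "a" becomes "first" only because it sorts first; if the sorted values and the label vector are not in the same order, categories are silently renamed.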
str(data)
## 'data.frame': 63 obs. of 19 variables:
## $ Degree : Factor w/ 2 levels "Undergraduate",..: 2 2 1 1 2 1 1 2 1 2 ...
## $ AI_use : int 3 3 3 4 5 2 1 1 5 3 ...
## $ AI_tool: Factor w/ 24 levels "chatgpt/openai",..: 9 1 2 1 1 2 2 2 1 18 ...
## $ AP1 : int 2 1 2 4 5 1 2 5 1 1 ...
## $ AP2 : int 1 1 3 1 1 1 5 5 2 2 ...
## $ AP3 : int 3 3 5 4 4 1 5 5 4 3 ...
## $ AP4 : int 1 1 1 1 4 1 1 5 2 1 ...
## $ AP5 : int 4 5 2 3 4 5 5 5 2 5 ...
## $ AP6 : int 5 5 4 4 5 5 5 5 1 5 ...
## $ AP7 : int 5 4 5 3 4 3 5 5 1 4 ...
## $ AP8 : int 3 5 3 2 3 3 5 5 4 3 ...
## $ AP9 : int 4 2 4 5 5 1 1 5 1 2 ...
## $ AP10 : int 2 5 4 3 2 5 3 1 5 4 ...
## $ L1 : int 3 4 2 2 2 1 5 5 4 5 ...
## $ L2 : int 5 2 4 5 5 5 4 5 5 4 ...
## $ L3 : int 4 2 4 4 4 3 2 5 5 5 ...
## $ L4 : int 4 2 3 4 3 1 5 3 4 3 ...
## $ L5 : int 4 4 3 3 4 5 5 1 1 2 ...
## $ L6 : int 3 2 4 5 2 1 1 5 3 4 ...
# Missing Data
sum(is.na(data)) > 0
## [1] FALSE
## No missing data present
# Duplicate Data
nrow(data[duplicated(data), ]) > 0
## [1] FALSE
## No duplicate data present
# Errors (values outside the 1-5 response scale)
sum(data[, -c(1,3)] < 1 | data[, -c(1,3)] > 5, na.rm = TRUE) > 0
## [1] FALSE
## No errors present
# Outliers
q1 <- quantile(data[, -c(1,3)], 0.25, na.rm = TRUE)
q3 <- quantile(data[, -c(1,3)], 0.75, na.rm = TRUE)
iqr <- q3 - q1
lower_bound <- q1 - 1.5 * iqr
upper_bound <- q3 + 1.5 * iqr
outliers <- data[, -c(1,3)] < lower_bound | data[, -c(1,3)] > upper_bound
sum(outliers, na.rm = TRUE) > 0
## [1] FALSE
## no outliers present
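The check above pools every item into one set of quartiles. A per-item version of the same 1.5 x IQR rule can be sketched as follows; `count_iqr_outliers()` is a hypothetical helper and the `toy` data frame holds made-up values.

```r
# Count values outside the 1.5 * IQR fences for a single numeric column
count_iqr_outliers <- function(x) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  fence <- 1.5 * (q[2] - q[1])
  sum(x < q[1] - fence | x > q[2] + fence, na.rm = TRUE)
}

# Made-up item scores, not the survey data
toy <- data.frame(item_a = c(4, 4, 5, 5, 5, 1),
                  item_b = c(1, 2, 3, 4, 5, 3))
outlier_counts <- vapply(toy, count_iqr_outliers, integer(1))
outlier_counts
```

Applied per column, an item whose responses cluster at one end of the scale can flag a value that the pooled quartiles would treat as ordinary.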
data$avrgActive <- as.integer(rowMeans(data[c("AP1", "AP2", "AP3", "AP4", "AP5", "AP6", "AP7", "AP8", "AP10")]))
data$avrgLaziness <- as.integer(rowMeans(data[c("L1", "L2", "L3", "L4", "L5", "L6")]))
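`as.integer()` drops the fractional part rather than rounding, so a row mean of 2.9 is stored as 2. The sketch below, on made-up means, compares truncation with `round()`; keeping the unrounded mean is a third option that preserves more variation in the outcome.

```r
# Made-up row means, not taken from the survey
item_means <- c(2.9, 3.5, 4.2)

truncated <- as.integer(item_means)  # fractional part discarded
rounded   <- round(item_means)       # nearest whole number

truncated
rounded
```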
str(data)
## 'data.frame': 63 obs. of 21 variables:
## $ Degree : Factor w/ 2 levels "Undergraduate",..: 2 2 1 1 2 1 1 2 1 2 ...
## $ AI_use : int 3 3 3 4 5 2 1 1 5 3 ...
## $ AI_tool : Factor w/ 24 levels "chatgpt/openai",..: 9 1 2 1 1 2 2 2 1 18 ...
## $ AP1 : int 2 1 2 4 5 1 2 5 1 1 ...
## $ AP2 : int 1 1 3 1 1 1 5 5 2 2 ...
## $ AP3 : int 3 3 5 4 4 1 5 5 4 3 ...
## $ AP4 : int 1 1 1 1 4 1 1 5 2 1 ...
## $ AP5 : int 4 5 2 3 4 5 5 5 2 5 ...
## $ AP6 : int 5 5 4 4 5 5 5 5 1 5 ...
## $ AP7 : int 5 4 5 3 4 3 5 5 1 4 ...
## $ AP8 : int 3 5 3 2 3 3 5 5 4 3 ...
## $ AP9 : int 4 2 4 5 5 1 1 5 1 2 ...
## $ AP10 : int 2 5 4 3 2 5 3 1 5 4 ...
## $ L1 : int 3 4 2 2 2 1 5 5 4 5 ...
## $ L2 : int 5 2 4 5 5 5 4 5 5 4 ...
## $ L3 : int 4 2 4 4 4 3 2 5 5 5 ...
## $ L4 : int 4 2 3 4 3 1 5 3 4 3 ...
## $ L5 : int 4 4 3 3 4 5 5 1 1 2 ...
## $ L6 : int 3 2 4 5 2 1 1 5 3 4 ...
## $ avrgActive : int 2 3 3 2 3 2 4 4 2 3 ...
## $ avrgLaziness: int 3 2 3 3 3 2 3 4 3 3 ...
dim(data)
## [1] 63 21
names(data)
## [1] "Degree" "AI_use" "AI_tool" "AP1" "AP2"
## [6] "AP3" "AP4" "AP5" "AP6" "AP7"
## [11] "AP8" "AP9" "AP10" "L1" "L2"
## [16] "L3" "L4" "L5" "L6" "avrgActive"
## [21] "avrgLaziness"
summary(data)
## Degree AI_use AI_tool AP1
## Undergraduate:34 Min. :1.000 chatgpt/openai :12 Min. :1.000
## Graduate :29 1st Qu.:3.000 grammarly :10 1st Qu.:1.000
## Median :4.000 teach anything : 5 Median :2.000
## Mean :3.667 consensus : 3 Mean :2.619
## 3rd Qu.:5.000 photoshop ai generator: 2 3rd Qu.:4.000
## Max. :5.000 quillbot : 2 Max. :5.000
## (Other) :29
## AP2 AP3 AP4 AP5
## Min. :1.000 Min. :1.000 Min. :1.000 Min. :2.000
## 1st Qu.:2.500 1st Qu.:4.000 1st Qu.:1.000 1st Qu.:3.000
## Median :4.000 Median :4.000 Median :1.000 Median :4.000
## Mean :3.651 Mean :4.127 Mean :2.016 Mean :4.063
## 3rd Qu.:5.000 3rd Qu.:5.000 3rd Qu.:3.000 3rd Qu.:5.000
## Max. :5.000 Max. :5.000 Max. :5.000 Max. :5.000
##
## AP6 AP7 AP8 AP9
## Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000
## 1st Qu.:4.000 1st Qu.:3.000 1st Qu.:3.000 1st Qu.:2.000
## Median :5.000 Median :4.000 Median :4.000 Median :4.000
## Mean :4.286 Mean :3.778 Mean :3.825 Mean :3.286
## 3rd Qu.:5.000 3rd Qu.:5.000 3rd Qu.:5.000 3rd Qu.:5.000
## Max. :5.000 Max. :5.000 Max. :5.000 Max. :5.000
##
## AP10 L1 L2 L3
## Min. :1.000 Min. :1.000 Min. :2.000 Min. :1.000
## 1st Qu.:2.000 1st Qu.:3.000 1st Qu.:4.000 1st Qu.:2.000
## Median :4.000 Median :4.000 Median :5.000 Median :3.000
## Mean :3.444 Mean :3.984 Mean :4.317 Mean :3.317
## 3rd Qu.:5.000 3rd Qu.:5.000 3rd Qu.:5.000 3rd Qu.:5.000
## Max. :5.000 Max. :5.000 Max. :5.000 Max. :5.000
##
## L4 L5 L6 avrgActive
## Min. :1.000 Min. :1.000 Min. :1.000 Min. :2.000
## 1st Qu.:3.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:3.000
## Median :4.000 Median :4.000 Median :4.000 Median :3.000
## Mean :3.857 Mean :3.429 Mean :3.571 Mean :3.111
## 3rd Qu.:5.000 3rd Qu.:5.000 3rd Qu.:5.000 3rd Qu.:4.000
## Max. :5.000 Max. :5.000 Max. :5.000 Max. :5.000
##
## avrgLaziness
## Min. :2.000
## 1st Qu.:3.000
## Median :3.000
## Mean :3.365
## 3rd Qu.:4.000
## Max. :5.000
##
# Frequency of each variable - find mean of each question
## Active learning
hist(data$avrgActive, main = "Average Active Learning Frequency Plot", xlab = "Active Learning", ylab = "Frequency")
## Passive learning
hist(data$AP9, main = "Passive Learning Frequency Plot", xlab = "Passive Learning", ylab = "Frequency")
## Laziness
hist(data$avrgLaziness, main = "Average Laziness Frequency Plot", xlab = "Laziness", ylab = "Frequency")
# Pie Chart of which AI tools were used the most
aitoolPlot <- plot_ly(data, labels = ~AI_tool, values = ~AI_use, type = 'pie',width = 800, height = 550)
aitoolPlot <- aitoolPlot %>%
layout(title = 'Percentage of AI Tools used',
xaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE),
yaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE))
aitoolPlot
## By share of summed AI_use scores, ChatGPT (26.2%), Mindgrasp (8.93%), and Grammarly (8.33%) are the top 3 AI tools.
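With `values = ~AI_use`, each slice is the sum of usage-frequency scores for a tool rather than the number of respondents naming it. If the share of respondents is what is wanted, a count-based version can be sketched in base R; the `tool` vector here is made up for illustration.

```r
# Made-up tool categories, one per respondent
tool <- c("chatgpt/openai", "grammarly", "chatgpt/openai", "quillbot")

counts <- table(tool)
# Percentage of respondents per tool, as a plain named numeric vector
share <- setNames(as.numeric(round(100 * counts / sum(counts), 1)), names(counts))
share
```

The resulting counts could be passed to a pie chart as the values, so that each respondent contributes equally regardless of their AI_use score.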
# Bar plot of which AI tools were used by Degree
topAItools <- data %>%
filter(AI_tool %in% c("chatgpt/openai", "grammarly", "teach anything", "consensus", "duolingo", "claude.ai", "scite", "trinka", "prowritingaid", "microsoft copilot"))
aiToolDegreePlot <- ggplot(topAItools, aes(x = AI_tool, y = AI_use, fill = Degree)) +
stat_summary(fun = base::mean,
geom = "bar",
position = "dodge") +
theme_classic() +
coord_cartesian(ylim = c(0.5, 5.5)) +
labs(x = "AI Tools",
y = "AI Use",
title = "Top 10 AI tools used by degree",
fill = "Degree") +
theme(legend.position = "right", axis.text.x = element_text(angle = 45, vjust = 0.5, hjust=1))
ggplotly(aiToolDegreePlot)
str(data)
## 'data.frame': 63 obs. of 21 variables:
## $ Degree : Factor w/ 2 levels "Undergraduate",..: 2 2 1 1 2 1 1 2 1 2 ...
## $ AI_use : int 3 3 3 4 5 2 1 1 5 3 ...
## $ AI_tool : Factor w/ 24 levels "chatgpt/openai",..: 9 1 2 1 1 2 2 2 1 18 ...
## $ AP1 : int 2 1 2 4 5 1 2 5 1 1 ...
## $ AP2 : int 1 1 3 1 1 1 5 5 2 2 ...
## $ AP3 : int 3 3 5 4 4 1 5 5 4 3 ...
## $ AP4 : int 1 1 1 1 4 1 1 5 2 1 ...
## $ AP5 : int 4 5 2 3 4 5 5 5 2 5 ...
## $ AP6 : int 5 5 4 4 5 5 5 5 1 5 ...
## $ AP7 : int 5 4 5 3 4 3 5 5 1 4 ...
## $ AP8 : int 3 5 3 2 3 3 5 5 4 3 ...
## $ AP9 : int 4 2 4 5 5 1 1 5 1 2 ...
## $ AP10 : int 2 5 4 3 2 5 3 1 5 4 ...
## $ L1 : int 3 4 2 2 2 1 5 5 4 5 ...
## $ L2 : int 5 2 4 5 5 5 4 5 5 4 ...
## $ L3 : int 4 2 4 4 4 3 2 5 5 5 ...
## $ L4 : int 4 2 3 4 3 1 5 3 4 3 ...
## $ L5 : int 4 4 3 3 4 5 5 1 1 2 ...
## $ L6 : int 3 2 4 5 2 1 1 5 3 4 ...
## $ avrgActive : int 2 3 3 2 3 2 4 4 2 3 ...
## $ avrgLaziness: int 3 2 3 3 3 2 3 4 3 3 ...
# Correlation
cor_matrix <- cor(data[,c(2, 12, 20:21)])
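# Label order follows the selected columns: 2 (AI_use), 12 (AP9), 20 (avrgActive), 21 (avrgLaziness)
new_names <- c("AI Use", "Passive Learning", "Active Learning", "Laziness")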
colnames(cor_matrix) <- new_names
rownames(cor_matrix) <- new_names
corrplot(cor_matrix, tl.col = "black", tl.srt = 15, tl.cex = 0.8, cl.cex = 0.8, number.cex = 0.8)
mtext("Correlation Matrix", side = 2, line = 1, cex = 1.2)
#corrplot(cor(data[,c(2, 12, 20:21)]))
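Selecting columns by position makes it easy for display labels to drift out of step with the columns they describe (column 12 is AP9, the passive-learning item, and column 20 is avrgActive). A minimal sketch of name-based selection follows; `cor_vars` is a hypothetical lookup introduced here and `toy` holds made-up scores.

```r
# Made-up scores standing in for the survey columns
toy <- data.frame(
  AI_use       = c(1, 3, 4, 5, 2, 4),
  AP9          = c(2, 3, 5, 4, 1, 4),
  avrgActive   = c(4, 3, 2, 3, 4, 2),
  avrgLaziness = c(2, 3, 4, 4, 3, 3)
)

# Display label = column name, so each label stays paired with its column
cor_vars <- c("AI Use" = "AI_use", "Passive Learning" = "AP9",
              "Active Learning" = "avrgActive", "Laziness" = "avrgLaziness")

toy_cor <- cor(toy[, cor_vars])
dimnames(toy_cor) <- list(names(cor_vars), names(cor_vars))
toy_cor
```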
# Normality
shapiro.test(data$AI_use)
##
## Shapiro-Wilk normality test
##
## data: data$AI_use
## W = 0.85063, p-value = 1.999e-06
shapiro.test(data$avrgActive)
##
## Shapiro-Wilk normality test
##
## data: data$avrgActive
## W = 0.84677, p-value = 1.531e-06
shapiro.test(data$AP9)
##
## Shapiro-Wilk normality test
##
## data: data$AP9
## W = 0.78694, p-value = 3.743e-08
shapiro.test(data$avrgLaziness)
##
## Shapiro-Wilk normality test
##
## data: data$avrgLaziness
## W = 0.84312, p-value = 1.194e-06
# Homoscedasticity
bartlett.test(data$avrgActive, data$AI_use)
##
## Bartlett test of homogeneity of variances
##
## data: data$avrgActive and data$AI_use
## Bartlett's K-squared = 8.0748, df = 4, p-value = 0.08887
bartlett.test(data$AP9, data$AI_use)
##
## Bartlett test of homogeneity of variances
##
## data: data$AP9 and data$AI_use
## Bartlett's K-squared = 0.65198, df = 4, p-value = 0.9571
bartlett.test(data$avrgLaziness, data$AI_use)
##
## Bartlett test of homogeneity of variances
##
## data: data$avrgLaziness and data$AI_use
## Bartlett's K-squared = 7.1794, df = 4, p-value = 0.1267
# Multicollinearity
multicollinearModel1 <- lm(AI_use ~ avrgActive + AP9 + avrgLaziness, data = data)
vif(multicollinearModel1)
## avrgActive AP9 avrgLaziness
## 1.578639 1.491100 2.174997
## Low multicollinearity: all VIF values are below 5
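For a model with only numeric predictors, the variance inflation factor of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on the others. The sketch below computes it by hand on simulated data; `vif_manual()` is a hypothetical helper and `toy` is generated, not the survey data.

```r
set.seed(1)
# Simulated predictors: x3 is built from x1, so the two are correlated
toy <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
toy$x3 <- toy$x1 + rnorm(50, sd = 0.5)

# VIF for one column: regress it on all other columns and use 1 / (1 - R^2)
vif_manual <- function(data, target) {
  others <- setdiff(names(data), target)
  fit <- lm(reformulate(others, response = target), data = data)
  1 / (1 - summary(fit)$r.squared)
}

vifs <- vapply(names(toy), function(v) vif_manual(toy, v), numeric(1))
vifs
```

A value of 1 means a predictor is uncorrelated with the rest; the correlated column x3 comes out noticeably higher than the independent column x2.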
model1 <- lm(avrgActive ~ AI_use + Degree + AI_tool, data = data)
model2 <- lm(AP9 ~ AI_use + Degree + AI_tool, data = data)
model3 <- lm(avrgLaziness ~ AI_use + Degree + AI_tool, data = data)
# Independence of Errors:
durbinWatsonTest(model1)
## lag Autocorrelation D-W Statistic p-value
## 1 -0.1608455 2.267462 0.718
## Alternative hypothesis: rho != 0
durbinWatsonTest(model2)
## lag Autocorrelation D-W Statistic p-value
## 1 -0.08433773 2.14565 0.476
## Alternative hypothesis: rho != 0
durbinWatsonTest(model3)
## lag Autocorrelation D-W Statistic p-value
## 1 -0.1316651 2.261465 0.772
## Alternative hypothesis: rho != 0
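The D-W statistic reported above is the sum of squared differences between consecutive residuals divided by the sum of squared residuals; values near 2 indicate little first-order autocorrelation. A minimal sketch of that formula, using R's built-in `cars` data rather than the survey and a hypothetical `dw_statistic()` helper:

```r
# Durbin-Watson statistic computed directly from a fitted model's residuals
dw_statistic <- function(model) {
  e <- residuals(model)
  sum(diff(e)^2) / sum(e^2)
}

fit <- lm(dist ~ speed, data = cars)  # built-in example data
dw <- dw_statistic(fit)
dw
```

Because the statistic depends on row order, it is most informative when rows have a meaningful sequence, such as submission time.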
# Linearity
ggplot(data, aes(x = fitted(model1), y = residuals(model1))) +
geom_point() +
geom_hline(yintercept = 0, linetype = "dashed", color = "red") +
labs(x = "Fitted values", y = "Residuals") +
ggtitle("Residuals vs. Fitted Plot for Active Learning")
ggplot(data, aes(x = fitted(model2), y = residuals(model2))) +
geom_point() +
geom_hline(yintercept = 0, linetype = "dashed", color = "red") +
labs(x = "Fitted values", y = "Residuals") +
ggtitle("Residuals vs. Fitted Plot for Passive Learning")
ggplot(data, aes(x = fitted(model3), y = residuals(model3))) +
geom_point() +
geom_hline(yintercept = 0, linetype = "dashed", color = "red") +
labs(x = "Fitted values", y = "Residuals") +
ggtitle("Residuals vs. Fitted Plot for Laziness")
# No Outliers or Influential Observations:
## Assumption met when checking for outliers in preprocessing steps
agreeActiveSubset <- subset(data[, -c(4:19, 21)], avrgActive > 3)
disagreeActiveSubset <- subset(data[, -c(4:19, 21)], avrgActive < 3)
neutralActiveSubset <- subset(data[, -c(4:19, 21)], avrgActive == 3)
str(neutralActiveSubset)
## 'data.frame': 31 obs. of 4 variables:
## $ Degree : Factor w/ 2 levels "Undergraduate",..: 2 1 2 2 1 1 2 1 2 2 ...
## $ AI_use : int 3 3 5 3 5 5 4 3 4 3 ...
## $ AI_tool : Factor w/ 24 levels "chatgpt/openai",..: 1 2 1 18 23 5 19 13 8 2 ...
## $ avrgActive: int 3 3 3 3 3 3 3 3 3 3 ...
agreeLazinessSubset <- subset(data[, -c(4:20)], avrgLaziness > 3)
disagreeLazinessSubset <- subset(data[, -c(4:20)], avrgLaziness < 3)
neutralLazinessSubset <- subset(data[, -c(4:20)], avrgLaziness == 3)
agreePassiveSubset <- subset(data[, -c(4:11,13:21)], AP9 > 3)
disagreePassiveSubset <- subset(data[, -c(4:11,13:21)], AP9 < 3)
neutralPassiveSubset <- subset(data[, -c(4:11,13:21)], AP9 == 3)
str(neutralPassiveSubset)
## 'data.frame': 3 obs. of 4 variables:
## $ Degree : Factor w/ 2 levels "Undergraduate",..: 1 2 2
## $ AI_use : int 1 1 2
## $ AI_tool: Factor w/ 24 levels "chatgpt/openai",..: 4 4 23
## $ AP9 : int 3 3 3
#ACTIVE LEARNING
activeLearningModel1 <- lm(avrgActive ~ AI_use, data = agreeActiveSubset)
summary(activeLearningModel1)
##
## Call:
## lm(formula = avrgActive ~ AI_use, data = agreeActiveSubset)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.3925 -0.3925 -0.1536 0.6075 0.6075
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.79522 0.33821 11.222 1.07e-08 ***
## AI_use 0.11945 0.07665 1.559 0.14
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.45 on 15 degrees of freedom
## Multiple R-squared: 0.1394, Adjusted R-squared: 0.08199
## F-statistic: 2.429 on 1 and 15 DF, p-value: 0.14
# Among respondents with above-neutral active learning scores, AI use does not significantly predict active learning
## Not significant: F-statistic: 2.429 on 1 and 15 DF, p-value: 0.14
activeLearningModel3 <- lm(avrgActive ~ AI_use, data = disagreeActiveSubset)
summary(activeLearningModel3)
## Warning in summary.lm(activeLearningModel3): essentially perfect fit: summary
## may be unreliable
##
## Call:
## lm(formula = avrgActive ~ AI_use, data = disagreeActiveSubset)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.593e-15 6.370e-17 9.555e-17 1.592e-16 1.592e-16
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.000e+00 3.561e-16 5.616e+15 <2e-16 ***
## AI_use 3.185e-17 9.877e-17 3.220e-01 0.752
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.59e-16 on 13 degrees of freedom
## Multiple R-squared: 0.561, Adjusted R-squared: 0.5273
## F-statistic: 16.61 on 1 and 13 DF, p-value: 0.001311
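# avrgActive is constant (2) in this subset, so the fit is degenerate: the slope is numerically zero (p = 0.752) and the F-statistic reflects floating-point noise, as the warning indicates
## Not interpretable: F-statistic: 16.61 on 1 and 13 DF, p-value: 0.001311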
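Near-zero residuals and the "essentially perfect fit" warning are what an outcome with no variation produces: after integer truncation, every row with avrgActive < 3 carries the same value. A minimal sketch on made-up values shows the check, alongside a fit on the unsplit sample as one possible alternative.

```r
# Made-up values, not the survey data
toy <- data.frame(
  AI_use     = c(1, 2, 3, 4, 5, 3, 4, 2),
  avrgActive = c(2L, 2L, 3L, 4L, 2L, 3L, 5L, 2L)
)

# Subsetting on the outcome leaves a single distinct value to explain
low_active <- subset(toy, avrgActive < 3)
unique(low_active$avrgActive)
var(low_active$avrgActive)

# Fitting on all rows keeps the variation in the outcome
full_fit <- lm(avrgActive ~ AI_use, data = toy)
coef(full_fit)
```

Checking `var()` of the outcome before fitting makes it clear whether a subset can support a regression at all.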
activeLearningModel2 <- lm(avrgActive ~ AI_use, data = neutralActiveSubset)
summary(activeLearningModel2)
## Warning in summary.lm(activeLearningModel2): essentially perfect fit: summary
## may be unreliable
##
## Call:
## lm(formula = avrgActive ~ AI_use, data = neutralActiveSubset)
##
## Residuals:
## Min 1Q Median 3Q Max
## 0 0 0 0 0
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3 0 Inf <2e-16 ***
## AI_use 0 0 NaN NaN
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0 on 29 degrees of freedom
## Multiple R-squared: NaN, Adjusted R-squared: NaN
## F-statistic: NaN on 1 and 29 DF, p-value: NA
# Not estimable: avrgActive is constant (3) in this subset
#PASSIVE LEARNING
passiveLearningModel1 <- lm(AP9 ~ AI_use, data = agreePassiveSubset)
summary(passiveLearningModel1)
##
## Call:
## lm(formula = AP9 ~ AI_use, data = agreePassiveSubset)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.91484 0.08516 0.08516 0.20839 0.57806
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.29871 0.23804 18.059 <2e-16 ***
## AI_use 0.12323 0.05743 2.146 0.0398 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3936 on 31 degrees of freedom
## Multiple R-squared: 0.1293, Adjusted R-squared: 0.1012
## F-statistic: 4.604 on 1 and 31 DF, p-value: 0.03983
# Significant at the 0.05 level: F-statistic: 4.604 on 1 and 31 DF, p-value: 0.03983
passiveLearningModel3 <- lm(AP9 ~ AI_use, data = disagreePassiveSubset)
summary(passiveLearningModel3)
##
## Call:
## lm(formula = AP9 ~ AI_use, data = disagreePassiveSubset)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.6226 -0.4760 -0.2318 0.4751 0.5728
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.13410 0.30458 3.724 0.001 **
## AI_use 0.09770 0.08119 1.203 0.240
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5048 on 25 degrees of freedom
## Multiple R-squared: 0.05476, Adjusted R-squared: 0.01695
## F-statistic: 1.448 on 1 and 25 DF, p-value: 0.2401
# Not significant: F-statistic: 1.448 on 1 and 25 DF, p-value: 0.2401
passiveLearningModel2 <- lm(AP9 ~ AI_use, data = neutralPassiveSubset)
summary(passiveLearningModel2)
## Warning in summary.lm(passiveLearningModel2): essentially perfect fit: summary
## may be unreliable
##
## Call:
## lm(formula = AP9 ~ AI_use, data = neutralPassiveSubset)
##
## Residuals:
## 26 27 36
## 3.846e-16 -3.846e-16 -4.930e-32
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.000e+00 9.421e-16 3.185e+15 <2e-16 ***
## AI_use -3.846e-16 6.661e-16 -5.770e-01 0.667
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.439e-16 on 1 degrees of freedom
## Multiple R-squared: 0.5714, Adjusted R-squared: 0.1429
## F-statistic: 1.333 on 1 and 1 DF, p-value: 0.4544
# Not interpretable: AP9 is constant (3) across the 3 rows of this subset. F-statistic: 1.333 on 1 and 1 DF, p-value: 0.4544
#LAZINESS
lazinessModel1 <- lm(avrgLaziness ~ AI_use, data = agreeLazinessSubset)
summary(lazinessModel1)
##
## Call:
## lm(formula = avrgLaziness ~ AI_use, data = agreeLazinessSubset)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.4308 -0.4308 -0.1077 0.5692 0.5692
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.6231 0.3283 11.04 3.35e-10 ***
## AI_use 0.1615 0.0748 2.16 0.0425 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4356 on 21 degrees of freedom
## Multiple R-squared: 0.1817, Adjusted R-squared: 0.1428
## F-statistic: 4.664 on 1 and 21 DF, p-value: 0.04252
# Significant at the 0.05 level: F-statistic: 4.664 on 1 and 21 DF, p-value: 0.04252
lazinessModel3 <- lm(avrgLaziness ~ AI_use, data = disagreeLazinessSubset)
summary(lazinessModel3)
## Warning in summary.lm(lazinessModel3): essentially perfect fit: summary may be
## unreliable
##
## Call:
## lm(formula = avrgLaziness ~ AI_use, data = disagreeLazinessSubset)
##
## Residuals:
## 2 6 14 19 32 48 58
## 0 0 0 0 0 0 0
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2 0 Inf <2e-16 ***
## AI_use 0 0 NaN NaN
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0 on 5 degrees of freedom
## Multiple R-squared: NaN, Adjusted R-squared: NaN
## F-statistic: NaN on 1 and 5 DF, p-value: NA
# Not estimable: avrgLaziness is constant (2) in this subset
lazinessModel2 <- lm(avrgLaziness ~ AI_use, data = neutralLazinessSubset)
summary(lazinessModel2)
## Warning in summary.lm(lazinessModel2): essentially perfect fit: summary may be
## unreliable
##
## Call:
## lm(formula = avrgLaziness ~ AI_use, data = neutralLazinessSubset)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.356e-16 -2.756e-16 -1.956e-16 -1.156e-16 7.378e-15
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.000e+00 7.028e-16 4.268e+15 <2e-16 ***
## AI_use -8.000e-17 1.868e-16 -4.280e-01 0.671
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.35e-15 on 31 degrees of freedom
## Multiple R-squared: 0.5133, Adjusted R-squared: 0.4976
## F-statistic: 32.7 on 1 and 31 DF, p-value: 2.741e-06
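# Not interpretable: avrgLaziness is constant (3) in this subset, so the slope is numerically zero (p = 0.671) and the F-statistic reflects floating-point noise. F-statistic: 32.7 on 1 and 31 DF, p-value: 2.741e-06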
# I aim to study whether AI use influences active learning, passive learning tendencies, and laziness. A regression analysis was conducted in R. I found that there is a direct relationship between .