Start Date: 7 Dec 2024
Report Date: 31 December 2024
Source: Psi Chi
R
Welcome to the contest for December! The prompts can be found here,
and the data is available here. The contest will run until the last day
of December (December 31st), latest time zone on earth.
knitr::opts_chunk$set(echo = T,message = F,warning = F)
setwd("C:/Users/alanh/Documents/R/Psi_Chi_R")
library(tidyverse)
data=read_csv('data.csv')
#access CSV directly through URL; local copy will be overwritten with a different month's data
#data= read.csv('https://osf.io/download/7uc9x/')
Skim and clean
## EDA
names(data) = make.names(colnames(data))
SmartEDA::ExpData(data,type=2)
Level 1:
Write a script that will remove participants who are missing the
‘age’ variable. Then, write a script that will show the distribution of
the ‘rating’ variable. Is ‘Review Rating’ skewed?
data1 = data %>%
filter(!is.na(Age))
hist(data1$Review.Rating)

boxplot(data1$Review.Rating)

shapiro.test(data1$Review.Rating)
##
## Shapiro-Wilk normality test
##
## data: data1$Review.Rating
## W = 0.95552, p-value < 2.2e-16
#yes, Review Rating is skewed.
Level 2:
Find the min, median, max, average, and standard deviation of ‘Review
Rating’
desc_stats = function(x) {
c(min = min(x,na.rm = T),
median = median(x,na.rm=T),
max = max(x,na.rm = T),
average = mean(x,na.rm=T),
standard_dev = sd(x,na.rm=T))
}
print(desc_stats(data1$Review.Rating))
## min median max average standard_dev
## 2.500000 3.700000 5.000000 3.749910 0.716266
Level 3:
Create a table that shows the average ‘Review Rating’ for each
‘Category.’
review_rating_table = data1 %>%
group_by(Category) %>%
summarise(Average_Review_Rating = mean(Review.Rating,na.rm = T))
print(review_rating_table)
## # A tibble: 4 × 2
## Category Average_Review_Rating
## <chr> <dbl>
## 1 Accessories 3.77
## 2 Clothing 3.72
## 3 Footwear 3.79
## 4 Outerwear 3.74
Level 4:
Create a visualization showing the relationship between ‘Purchase
Amount (USD)’ and ‘Review Rating.’
data1 %>%
ggplot(aes(x=Review.Rating,y=Purchase.Amount..USD.))+
geom_point()+
geom_smooth(method = "lm", se = T)

Test the correlation between the variables. What is the rvalue? What
is the p value?
shapiro.test(data1$Purchase.Amount..USD.)
##
## Shapiro-Wilk normality test
##
## data: data1$Purchase.Amount..USD.
## W = 0.95023, p-value < 2.2e-16
shapiro.test(data1$Review.Rating)
##
## Shapiro-Wilk normality test
##
## data: data1$Review.Rating
## W = 0.95552, p-value < 2.2e-16
cor.test(data1$Purchase.Amount..USD.,data1$Review.Rating,method = 'spearman')
##
## Spearman's rank correlation rho
##
## data: data1$Purchase.Amount..USD. and data1$Review.Rating
## S = 9436398249, p-value = 0.06224
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## 0.02994088
#p-value = 0.06224 and rho = 0.02994088
LS0tDQp0aXRsZTogIlBzaSBDaGkgUiAtIERlY2VtYmVyIDIwMjQiDQphdXRob3I6ICJieSBBbGFuIExhbSINCiNkYXRlOiAiYHIgU3lzLkRhdGUoKWAiDQojZGF0ZTogIkRhdGU6IGByIGZvcm1hdChTeXMuRGF0ZSgpLCAnJWQgJUIgJVknKWAiIA0Kb3V0cHV0Og0KICBodG1sX2RvY3VtZW50Og0KICAgIHRoZW1lOiByZWFkYWJsZQ0KICAgIGFsd2F5c19hbGxvd19odG1sOiB5ZXMNCiAgICBkZl9wcmludDogcGFnZWQNCiAgICB0b2M6IHllcw0KICAgIHRvY19mbG9hdDogeWVzDQogICAgbnVtYmVyX3NlY3Rpb25zOiBubw0KICAgIGFuY2hvcl9zZWN0aW9uczogVFJVRQ0KICAgIGNvZGVfZm9sZGluZzogaGlkZQ0KICAgIGNvZGVfZG93bmxvYWQ6IHRydWUNCi0tLQ0KDQpTdGFydCBEYXRlOiA3IERlYyAyMDI0DQoNClJlcG9ydCBEYXRlOiBgciBmb3JtYXQoU3lzLkRhdGUoKSwgJyVkICVCICVZJylgDQoNClsqKlNvdXJjZSoqOiBQc2kgQ2hpIFJdKGh0dHBzOi8vb3NmLmlvL2Via2NqLykNCg0KV2VsY29tZSB0byB0aGUgY29udGVzdCBmb3IgRGVjZW1iZXIhIFRoZSBwcm9tcHRzIGNhbiBiZSBmb3VuZCBoZXJlLCBhbmQgdGhlIGRhdGEgaXMgYXZhaWxhYmxlIGhlcmUuIFRoZSBjb250ZXN0IHdpbGwgcnVuIHVudGlsIHRoZSBsYXN0IGRheSBvZiBEZWNlbWJlciAoRGVjZW1iZXIgMzFzdCksIGxhdGVzdCB0aW1lIHpvbmUgb24gZWFydGguDQoNCmBgYHtyIHNldHVwLCB3YXJuaW5nPUYsbWVzc2FnZT1GfQ0Ka25pdHI6Om9wdHNfY2h1bmskc2V0KGVjaG8gPSBULG1lc3NhZ2UgPSBGLHdhcm5pbmcgPSBGKQ0KDQpzZXR3ZCgiQzovVXNlcnMvYWxhbmgvRG9jdW1lbnRzL1IvUHNpX0NoaV9SIikNCg0KbGlicmFyeSh0aWR5dmVyc2UpDQoNCmRhdGE9cmVhZF9jc3YoJ2RhdGEuY3N2JykNCg0KI2FjY2VzcyBDU1YgZGlyZWN0bHkgdGhyb3VnaCBVUkw7IGxvY2FsIGNvcHkgd2lsbCBiZSBvdmVyd3JpdHRlbiB3aXRoIGEgZGlmZmVyZW50IG1vbnRoJ3MgZGF0YQ0KDQojZGF0YT0gcmVhZC5jc3YoJ2h0dHBzOi8vb3NmLmlvL2Rvd25sb2FkLzd1Yzl4LycpDQpgYGANCg0KIyMgU2tpbSBhbmQgY2xlYW4NCmBgYHtyfQ0KIyMgRURBDQoNCm5hbWVzKGRhdGEpID0gbWFrZS5uYW1lcyhjb2xuYW1lcyhkYXRhKSkNCg0KU21hcnRFREE6OkV4cERhdGEoZGF0YSx0eXBlPTIpDQoNCmBgYA0KDQoNCiMjIExldmVsIDE6IA0KV3JpdGUgYSBzY3JpcHQgdGhhdCB3aWxsIHJlbW92ZSBwYXJ0aWNpcGFudHMgd2hvIGFyZSBtaXNzaW5nIHRoZSDigJhhZ2XigJkgdmFyaWFibGUuIFRoZW4sIHdyaXRlIGEgc2NyaXB0IHRoYXQgd2lsbCBzaG93IHRoZSBkaXN0cmlidXRpb24gb2YgdGhlIOKAmHJhdGluZ+KAmSB2YXJpYWJsZS4gSXMg4oCYUmV2aWV3IFJhdGluZ+KAmSBza2V3ZWQ/DQoNCmBgYHtyfQ0KZGF0YTEgPSBkYXRhICU+JSANCiAgZmlsdGVyKCFpcy5uYShBZ2UpKQ0KDQpoaXN0KGRhdGExJFJldmlldy5SYXRpbmcpDQpib3hwbG90KGRhdGExJFJldmlldy5SYXRpbmcpDQpzaGFwaXJvLnRlc3QoZGF0YTEkUmV2aWV3LlJhdGluZykNCg0KI3llcywgUmV2aWV3IFJhdGluZyBpcyBza2V3ZWQuDQpgYGANCg0KIyMgTGV2ZWwgMjogDQpGaW5kIHRoZSBtaW4sIG1lZGlhbiwgbWF4LCBhdmVyYWdlLCBhbmQgc3RhbmRhcmQgZGV2aWF0aW9uIG9mIOKAmFJldmlldyBSYXRpbmfigJkNCg0KYGBge3J9DQpkZXNjX3N0YXRzID0gZnVuY3Rpb24oeCkgew0KICAgICAgICAgICAgYyhtaW4gPSBtaW4oeCxuYS5ybSA9IFQpLA0KICAgICAgICAgICAgICBtZWRpYW4gPSBtZWRpYW4oeCxuYS5ybT1UKSwNCiAgICAgICAgICAgICAgbWF4ID0gbWF4KHgsbmEucm0gPSBUKSwNCiAgICAgICAgICAgICAgYXZlcmFnZSA9IG1lYW4oeCxuYS5ybT1UKSwNCiAgICAgICAgICAgICAgc3RhbmRhcmRfZGV2ID0gc2QoeCxuYS5ybT1UKSkNCn0NCg0KcHJpbnQoZGVzY19zdGF0cyhkYXRhMSRSZXZpZXcuUmF0aW5nKSkNCmBgYA0KDQojIyBMZXZlbCAzOiANCkNyZWF0ZSBhIHRhYmxlIHRoYXQgc2hvd3MgdGhlIGF2ZXJhZ2Ug4oCYUmV2aWV3IFJhdGluZ+KAmSBmb3IgZWFjaCDigJhDYXRlZ29yeS4nDQoNCmBgYHtyfQ0KcmV2aWV3X3JhdGluZ190YWJsZSA9IGRhdGExICU+JSANCiAgZ3JvdXBfYnkoQ2F0ZWdvcnkpICU+JSANCiAgc3VtbWFyaXNlKEF2ZXJhZ2VfUmV2aWV3X1JhdGluZyA9IG1lYW4oUmV2aWV3LlJhdGluZyxuYS5ybSA9IFQpKQ0KDQpwcmludChyZXZpZXdfcmF0aW5nX3RhYmxlKQ0KYGBgDQoNCiMjIExldmVsIDQ6IA0KQ3JlYXRlIGEgdmlzdWFsaXphdGlvbiBzaG93aW5nIHRoZSByZWxhdGlvbnNoaXAgYmV0d2VlbiDigJhQdXJjaGFzZSBBbW91bnQgKFVTRCnigJkgYW5kIOKAmFJldmlldyBSYXRpbmcu4oCZIA0KDQpgYGB7cn0NCg0KZGF0YTEgJT4lIA0KICBnZ3Bsb3QoYWVzKHg9UmV2aWV3LlJhdGluZyx5PVB1cmNoYXNlLkFtb3VudC4uVVNELikpKw0KICBnZW9tX3BvaW50KCkrDQogIGdlb21fc21vb3RoKG1ldGhvZCA9ICJsbSIsIHNlID0gVCkgDQoNCg0KYGBgDQoNClRlc3QgdGhlIGNvcnJlbGF0aW9uIGJldHdlZW4gdGhlIHZhcmlhYmxlcy4gV2hhdCBpcyB0aGUgcnZhbHVlPyBXaGF0IGlzIHRoZSBwIHZhbHVlPw0KYGBge3J9DQpzaGFwaXJvLnRlc3QoZGF0YTEkUHVyY2hhc2UuQW1vdW50Li5VU0QuKQ0KDQpzaGFwaXJvLnRlc3QoZGF0YTEkUmV2aWV3LlJhdGluZykNCg0KY29yLnRlc3QoZGF0YTEkUHVyY2hhc2UuQW1vdW50Li5VU0QuLGRhdGExJFJldmlldy5SYXRpbmcsbWV0aG9kID0gJ3NwZWFybWFuJykNCiNwLXZhbHVlID0gMC4wNjIyNCBhbmQgcmhvID0gMC4wMjk5NDA4OA0KYGBgDQoNCg0KDQpgYGB7ciBpbmNsdWRlPUZBTFNFfQ0KI2JlZXAgd2hlbiBkb25lDQppZiAocmVxdWlyZSgiYmVlcHIiLHF1aWV0bHkgPSBUKSkNCiAgYmVlcHI6OmJlZXAoMikNCmBgYA0KDQo=