available at: http://rpubs.com/kirstenz/24661
With Kay and Louise, Friday 8 August
Students who report looking at lecture recordings - do they actually?
Student plans don't come to fruition (ref)
-> so if students didn't say they use lecture recordings and now plan to (in response to a weakness or the Mid-Sem), does access change? For how long?
=> by eye, lectures were Tuesday afternoon and Wednesday morning, and access peaks Tue, Wed and Thu (can visualise with the calendarHeat figures)
How do students prepare for classes => do students look at the prac videos?
=> by eye, Thu/Fri for the Friday prac
How many of the 5 pracs with videos do they watch (i.e. how many times/weeks they access)?
Are there changes over time, i.e. do students stop looking once they realise the videos are not so useful?
Actually, the figures show first how many students access and how often, but the calculations do determine which students look, and when.
data in “Folder access across semester.xls” moved to “LectAccess.csv”
clean - remove name column, remove empty rows (233-678)
move total column and total row to new vectors, and remove
clean - keep only consenting students; report the dimensions (number of students by number of variables)
## [1] 230 116
## [1] 99 116
clean - de-ID students so the output can be pushed to HTML
basic structure of data
## StudentID X4.03.14 X5.03.14 X6.03.14 X7.03.14
## 1 S6089847 1 1 0 0
## 3 S8117889 0 1 4 0
## 4 S8118323 2 0 0 0
## 5 S8152093 0 0 0 0
## 6 S8239113 1 1 0 0
## 7 S8283571 0 2 0 0
## 9 S8395419 1 0 0 0
## 12 S8407099 3 8 4 0
## 13 S8408815 0 0 0 0
## 14 S8465063 0 0 1 0
dimensions (rows by columns)
## [1] 99 116
Number of students by the number of times they looked at lecture recordings.
Clipped the x axis at 100 access clicks to zoom in on the lower end.
Converted the x axis to log to spread the clumped data into a roughly normal curve (a sketch of the plot call follows the legend below).
NB log scale (approximate back-transform to counts): 0 = 1, 1 ~ 3, 2 ~ 7, 3 ~ 20, 4 ~ 55, 5 ~ 150, 6 ~ 403
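A quick sketch of that log histogram (a minimal example, assuming la is the cleaned access data frame with StudentID in column 1 and one count column per day):
hist(log(rowSums(la[ , -1])), breaks = 20,
     xlab = "log(folder accesses per student)",
     main = "Lecture recording access per student")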
Working out viewings by day - number of times folder accessed per day (access.day), number of students who access each day (stud.day)…
Working out number of times (access.stud) and number of days (days.stud) each student accessed…
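A sketch of how those four summaries could be computed (same assumption about la as above):
counts = la[ , -1]                       # daily access counts, one column per day
access.day  = colSums(counts)            # folder openings per day, summed over students
stud.day    = colSums(counts > 0)        # number of students accessing each day
access.stud = rowSums(counts)            # folder openings per student over the semester
days.stud   = rowSums(counts > 0)        # number of days each student accessed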
Loading ‘describe’ function to get descriptive stats…
Descriptive stats for viewings by day and by student:
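The describe and sem helpers aren't echoed; a minimal sketch that would give the same columns (min, max, median, mean, SD, SEM, n, NAs, sum) is below - an assumption about their shape, not the actual source:
sem = function(x) sd(x, na.rm = TRUE) / sqrt(sum(!is.na(x)))   # standard error of the mean
describe = function(x) {
  ok = x[!is.na(x)]
  round(c(min = min(ok), max = max(ok), median = median(ok), mean = mean(ok),
          SD = sd(ok), SEM = sem(x), n = length(ok), NAs = sum(is.na(x)), sum = sum(ok)), 1)
}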
Number of times lecture recording folder was accessed per day
## min max median mean SD SEM n NAs sum
## 0.0 162.0 30.0 40.1 31.3 2.9 114.0 0.0 4570.0
Number of students who accessed lecture recordings each day
## min max median mean SD SEM n NAs sum
## 0.0 43.0 13.0 14.9 9.9 0.9 114.0 0.0 1697.0
Number of times each student accessed the lecture recordings
## min max median mean SD SEM n NAs sum
## 4.0 194.0 35.0 46.2 35.0 3.5 99.0 0.0 4570.0
Number of days each student accessed the lecture recordings
## min max median mean SD SEM n NAs sum
## 3.0 41.0 16.0 17.1 8.5 0.9 99.0 0.0 1697.0
Useful conclusions: 114 days (16 weeks, 2 days) of data for 99 consenting students (from a cohort of 231).
There is a large range in the number of access hits (0-22) recorded for each student each day. Overall, the number of access hits per day is 2-3x the number of students who access per day, and the number of access hits per student is also 2-3x the number of days a student accesses the folder.
Since we don't really know how the number of folder openings is tracked by Blackboard (it could include page refreshes), the number of students is probably a better way of looking at the data than the number of times the folder is 'opened'.
On average 15 +/- 1 (mean +/- SEM) students accessed each day, with a max of 43 students on one day (1/5/14).
On average students accessed lecture recordings on 17 days, with a max of 41 days and a minimum of 3 days - so no consenting student failed to access the lecture recordings at all.
## days.stud
## 3 4 5 6 7 8 9 10 12 13 14 15 16 17 18 19 20 21 22 23 25 26 27 28 29
## 1 1 3 3 6 1 5 8 5 5 5 6 1 6 6 1 5 3 4 3 3 4 1 1 1
## 30 31 32 33 34 35 41
## 2 1 1 1 3 2 1
So only 1 student looked on as few as 3 days and only 1 on as many as 41 days; the majority looked on 7-26 days. Or, as a histogram:
Transposed the data (lat) so we can get real dates (a sketch of this step follows the printout below)…
## [1] 114 100
## Date S8530605 S8636955 S8475915 S8645607
## 1 2014-03-04 0 1 3 2
## 2 2014-03-05 1 2 8 1
## 3 2014-03-06 0 4 0 0
## 4 2014-03-07 0 2 0 0
## 5 2014-03-08 0 2 0 0
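A sketch of that transpose and date conversion (assuming la with X-prefixed day columns like X4.03.14):
lat = as.data.frame(t(la[ , -1]))                               # days become rows, students become columns
names(lat) = as.character(la$StudentID)
lat = data.frame(Date = as.Date(sub("^X", "", rownames(lat)), format = "%d.%m.%y"),
                 lat, check.names = FALSE)                      # parse 4.03.14 etc. as real dates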
Summed numbers of times the lecture recording folder was accessed and number of students who accessed lecture recording folder per day…
Loaded calendarHeat function…
## Loading required package: chron
Calendar of number of times lecture recording folder was accessed each day
Calendar of number of students who accessed lecture recordings each day
Built a data frame with student ID and a 0/1 indicator for access each day… NB created 2 data frames: la.norm (0 = not accessed, 1 = accessed) and la.norm2 (1 = accessed, NA = not accessed, i.e. missing value). Cluster analysis errors with NAs, so don't use the version with missing values for clustering.
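A sketch of the two binarised data frames described above:
la.norm = la
la.norm[ , -1] = ifelse(la[ , -1] > 0, 1, 0)      # 1 = accessed that day, 0 = not
la.norm2 = la
la.norm2[ , -1] = ifelse(la[ , -1] > 0, 1, NA)    # NA = not accessed; dist()/hclust() fail on NA, so not used for clustering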
Use la.norm to cluster lecture recording access:
distances = dist(la.norm[2:115], method = "euclidean")  # Euclidean distances between students over the 114 daily 0/1 columns
clusterLA = hclust(distances, method = "ward.D")         # hierarchical clustering, Ward's method
plot(clusterLA)                                          # dendrogram
clusterGroups3 = cutree(clusterLA, k = 3)                # cut the tree into 3 clusters
la.norm$cluster3 = clusterGroups3                        # attach cluster membership to la.norm
dim(la.norm)
## [1] 99 116
la.norm[1:5,1:5]
## StudentID X4.03.14 X5.03.14 X6.03.14 X7.03.14
## 1 S6089847 1 1 0 0
## 3 S8117889 0 1 1 0
## 4 S8118323 1 0 0 0
## 5 S8152093 0 0 0 0
## 6 S8239113 1 1 0 0
la.norm[1:5,110:116]
## X20.06.14 X21.06.14 X22.06.14 X23.06.14 X24.06.14 X25.06.14 cluster3
## 1 0 0 0 0 0 0 1
## 3 1 1 1 1 1 0 2
## 4 1 0 0 0 0 0 1
## 5 0 1 0 0 0 0 3
## 6 0 0 0 0 0 0 1
## [1] "The number of students in each cluster by the the number of variables"
## [1] 26 116
## [1] 36 116
## [1] 37 116
## [1] 0 116
## [1] 0 116
So what are the characteristics of the clusters - how often do students view lecture recordings and when:
## min max median mean SD SEM n NAs sum
## 15.0 41.0 25.5 25.7 7.4 1.5 26.0 0.0 668.0
## min max median mean SD SEM n NAs sum
## 12.0 30.0 18.0 19.1 5.3 0.9 36.0 0.0 687.0
## min max median mean SD SEM n NAs sum
## 3.0 17.0 9.0 9.2 3.5 0.6 37.0 0.0 342.0
## Warning in min(x, na.rm = T): no non-missing arguments to min; returning
## Inf
## Warning in max(x, na.rm = T): no non-missing arguments to max; returning -
## Inf
## min max median mean SD SEM n NAs sum
## Inf -Inf NA NaN NA NA 0 0 0
## Warning in min(x, na.rm = T): no non-missing arguments to min; returning
## Inf
## Warning in max(x, na.rm = T): no non-missing arguments to max; returning -
## Inf
## min max median mean SD SEM n NAs sum
## Inf -Inf NA NaN NA NA 0 0 0
To see ‘when’, we need to get the cluster groups into lat (the transposed version).
Then run calendarHeat for all 3 clusters…
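A sketch for one cluster (assuming the sourced calendarHeat has the usual calendarHeat(dates, values, varname = ...) signature, and lat as built above):
c1.ids   = la.norm$StudentID[la.norm$cluster3 == 1]              # students in cluster 1
c1.daily = rowSums(lat[ , as.character(c1.ids)] > 0)             # how many of them accessed each day
calendarHeat(lat$Date, c1.daily, varname = "Cluster 1: students accessing per day")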
Load in qualitative coding “pattern of lecture recording use ML” -> “qual.csv”
clean - de-identify
clean - fixed capitalisation; converted "no info", "deferred" and "" to NA (i.e. missing)
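A sketch of that cleaning step (qual is my assumed name for the loaded data frame; the column names are as shown below):
qual = read.csv("qual.csv", stringsAsFactors = FALSE)
ml.cols = c("ML1.previous", "ML2.planMS", "ML3.usedMS", "ML4.planEOS")
clean.resp = function(x) {
  x = trimws(tolower(as.character(x)))
  x[x %in% c("no info", "deferred", "")] = NA                    # treat as missing
  x = ifelse(is.na(x), NA, paste0(toupper(substring(x, 1, 1)), substring(x, 2)))
  factor(x, levels = c("Maybe", "No", "Yes"))                    # tidy capitalisation -> Yes/No/Maybe
}
qual[ml.cols] = lapply(qual[ml.cols], clean.resp)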
Data structure
## [1] 99 10
## StudentID ML1.previous ML2.planMS ML3.usedMS ML4.planEOS total.no
## 1 S6089847 No Yes No No 3
## 2 S8117889 No No No No 4
## 3 S8118323 Yes No No No 3
## 4 S8152093 Maybe No Maybe No 2
## 5 S8239113 Yes No Yes No 2
## total.yes total.maybe total.noinfo access
## 1 1 0 0 21
## 2 0 0 0 75
## 3 1 0 0 122
## 4 0 2 0 13
## 5 2 0 0 89
## ML1.previous ML2.planMS ML3.usedMS ML4.planEOS
## Maybe:18 Maybe: 5 Maybe: 4 Maybe: 3
## No :52 No :77 No :69 No :71
## Yes :27 Yes :15 Yes :23 Yes :17
## NA's : 2 NA's : 2 NA's : 3 NA's : 8
Patterns of self-reported lecture recording use
##
## Maybe No Yes
## 18 52 27
## [1] "ML1.previous"
##
## Maybe No Yes
## 5 77 15
## [1] "ML2.planMS"
##
## Maybe No Yes
## 4 69 23
## [1] "ML3.usedMS"
##
## Maybe No Yes
## 3 71 17
## [1] "ML4.planEOS"
## ML2.planMS
## ML1.previous Maybe No Yes Sum
## Maybe 1 16 0 17
## No 2 40 9 51
## Yes 2 19 6 27
## Sum 5 75 15 95
## ML3.usedMS
## ML2.planMS Maybe No Yes Sum
## Maybe 0 4 0 4
## No 4 57 14 75
## Yes 0 6 9 15
## Sum 4 67 23 94
Conclusions:
Most students (52/99, i.e. 53%) report that they don't usually use lecture recordings. Even more didn't plan to use lecture recordings for the mid-semester exam (77/99), a similar number reported not using lecture recordings for the mid-semester exam (69/99), and much the same held for the end-of-semester exam (71/99).
This seems inconsistent with the access data: all 99 students used lecture recordings at some point, and the majority used them on 7-26 days, which is half to twice the number of weeks in semester, i.e. roughly once a fortnight to twice a week.
What are the patterns of responses (No, No, No, No etc.), similar to what Kay calculated as the number of No's, Yes's and Maybe's? (In the tables below, the header row gives the count 0-4 and the second row gives the frequency, i.e. number of students.)
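A sketch of how those per-student counts could be built (an assumption about the calculation, using the qual frame from above):
resp = qual[ , c("ML1.previous", "ML2.planMS", "ML3.usedMS", "ML4.planEOS")]
qual$total.no     = rowSums(resp == "No",    na.rm = TRUE)
qual$total.yes    = rowSums(resp == "Yes",   na.rm = TRUE)
qual$total.maybe  = rowSums(resp == "Maybe", na.rm = TRUE)
qual$total.noinfo = rowSums(is.na(resp))
table(qual$total.no)                                             # counts of 0-4 No's, as tabulated below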
##
## 0 1 2 3 4
## 2 15 21 32 29
## [1] "total.no"
##
## 0 1 2 3 4
## 52 27 7 11 2
## [1] "total.yes"
##
## 0 1 2
## 72 24 3
## [1] "total.maybe"
Most frequent patterns of response:
##
## No Yes Yes Yes Yes No No Yes Yes Yes Yes No No Yes No No Yes No Yes Yes
## 3 3 3 4 4
## Yes No No No Maybe No No No No No No No
## 8 9 29
Everything else was reported by 2 or fewer students.
So there is definitely a group of 29 students who never report using lecture recordings (LR). There are 27 students who report that they usually used LR.
Of these, 6 plan to use LR for mid-sem, 2 maybes, and 19 don’t mention LR for mid-sem prep. There are 52 (51?) students who don’t report usually using LR. Of these, 9 plan to use LR for mid-sem, 2 maybes, and 40 don’t mention LR for mid-sem prep.
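A sketch of how the response patterns could be tabulated (the pattern column turns up in the merged data further down):
qual$pattern = with(qual, paste(ML1.previous, ML2.planMS, ML3.usedMS, ML4.planEOS))
sort(table(qual$pattern), decreasing = TRUE)                     # most frequent patterns first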
## [1] 99 116
## StudentID X4.03.14 X5.03.14 X6.03.14 X7.03.14
## 1 S6089847 1 1 0 0
## 3 S8117889 0 1 1 0
## 4 S8118323 1 0 0 0
## 5 S8152093 0 0 0 0
## 6 S8239113 1 1 0 0
## X20.06.14 X21.06.14 X22.06.14 X23.06.14 X24.06.14 X25.06.14 cluster3
## 1 0 0 0 0 0 0 1
## 3 1 1 1 1 1 0 2
## 4 1 0 0 0 0 0 1
## 5 0 1 0 0 0 0 3
## 6 0 0 0 0 0 0 1
## [1] 99 126
## StudentID X4.03.14 X5.03.14 X6.03.14 X7.03.14
## 1 S6089847 1 1 0 0
## 2 S8117889 0 1 1 0
## 3 S8118323 1 0 0 0
## 4 S8152093 0 0 0 0
## 5 S8239113 1 1 0 0
## X25.06.14 cluster3 ML1.previous ML2.planMS ML3.usedMS ML4.planEOS
## 1 0 1 No Yes No No
## 2 0 2 No No No No
## 3 0 1 Yes No No No
## 4 0 3 Maybe No Maybe No
## 5 0 1 Yes No Yes No
## total.no total.yes total.maybe total.noinfo access pattern
## 1 3 1 0 0 21 No Yes No No
## 2 4 0 0 0 75 No No No No
## 3 3 1 0 0 122 Yes No No No
## 4 2 0 2 0 13 Maybe No Maybe No
## 5 2 2 0 0 89 Yes No Yes No
## [1] 99 128
## X25.06.14 cluster3 ML1.previous ML2.planMS ML3.usedMS ML4.planEOS
## 1 0 1 No Yes No No
## 2 0 2 No No No No
## 3 0 1 Yes No No No
## 4 0 3 Maybe No Maybe No
## 5 0 1 Yes No Yes No
## total.no total.yes total.maybe total.noinfo access pattern
## 1 3 1 0 0 21 No Yes No No
## 2 4 0 0 0 75 No No No No
## 3 3 1 0 0 122 Yes No No No
## 4 2 0 2 0 13 Maybe No Maybe No
## 5 2 2 0 0 89 Yes No Yes No
## prevLR access.days
## 1 No 15
## 2 No 26
## 3 Yes 33
## 4 Yes 10
## 5 Yes 28
Statistical tests:
Wilcoxon rank-sum test (the non-parametric analogue of an unpaired t-test, suitable for non-normal data)
Do students who report usually using LR access LR more? First as the number of folder openings, then as the number of days. (Order of output: test, means, SEMs.)
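The calls aren't echoed; from the output labels they were presumably of this form (a reconstruction, not the echoed source):
wilcox.test(access ~ prevLR, data = la.norm.qual)                # folder openings by self-reported previous use
with(la.norm.qual, tapply(access, prevLR, mean))
with(la.norm.qual, tapply(access, prevLR, sem))
wilcox.test(access.days ~ prevLR, data = la.norm.qual)           # days accessed by self-reported previous use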
##
## Wilcoxon rank sum test with continuity correction
##
## data: access by prevLR
## W = 739, p-value = 0.00184
## alternative hypothesis: true location shift is not equal to 0
## No Yes
## 37.9 56.2
## [1] 4.7961
## [1] 5.032717
## No Yes
## 4.8 5.0
##
## Wilcoxon rank sum test with continuity correction
##
## data: access.days by prevLR
## W = 790, p-value = 0.005989
## alternative hypothesis: true location shift is not equal to 0
## No Yes
## 14.96154 19.82222
## [1] 1.077664
## [1] 1.311676
## No Yes
## 1.077664 1.311676
Do students who report usually using LR fall into different clusters? (Order of output: test, table, means, SEMs.)
##
## Wilcoxon rank sum test with continuity correction
##
## data: cluster3 by prevLR
## W = 1473.5, p-value = 0.01951
## alternative hypothesis: true location shift is not equal to 0
## cluster3
## prevLR 1 2 3 Sum
## No 9 19 24 52
## Yes 16 17 12 45
## Sum 25 36 36 97
## No Yes
## 2.29 1.91
## [1] 0.1039801
## [1] 0.1181602
## No Yes
## 0.10 0.12
Kay's pattern groups (y/n/m for ML1-ML4):
1 = Previous and did (>3 y): yyyy, ynyy, yyny, yyyn, ynyn
2 = Previous/intended, but did not: ynnn, yynn, ynny -> 3
3 = No previous use, but then did or intended: nnyy, nyyy, nnny -> 2
4 = No previous use, intention but did not: nyny, nynn
5 = No report: nnnn
0 = Don't fit?
Load “Kay.gp.index.csv”, fixed for the paper version of the group names (i.e. 2 and 3 swapped); clean - de-identified
merge into la.norm.qual
## X25.06.14 cluster3 ML1.previous ML2.planMS ML3.usedMS ML4.planEOS
## 1 0 1 No Yes No No
## 2 0 2 No No No No
## 3 0 1 Yes No No No
## 4 0 3 Maybe No Maybe No
## 5 0 1 Yes No Yes No
## total.no total.yes total.maybe total.noinfo access pattern
## 1 3 1 0 0 21 No Yes No No
## 2 4 0 0 0 75 No No No No
## 3 3 1 0 0 122 Yes No No No
## 4 2 0 2 0 13 Maybe No Maybe No
## 5 2 2 0 0 89 Yes No Yes No
## prevLR access.days Kay.pattern
## 1 No 15 4
## 2 No 26 5
## 3 Yes 33 3
## 4 Yes 10 0
## 5 Yes 28 1
##
## Maybe Maybe No No Maybe NA No No Maybe No Maybe No Maybe No NA No
## 0 1 1 2 1
## 1 0 0 0 0
## 2 0 0 0 0
## 3 0 0 0 0
## 4 0 0 0 0
## 5 0 0 0 0
##
## Maybe No No NA Maybe No No No Maybe No Yes NA Maybe No Yes No
## 0 1 0 0 0
## 1 0 0 1 2
## 2 0 0 0 0
## 3 0 0 0 0
## 4 0 0 0 0
## 5 0 9 0 0
##
## NA No No No NA No No Yes No Maybe No No No NA No No No No Maybe No
## 0 1 1 2 0 0
## 1 0 0 0 0 0
## 2 0 0 0 0 0
## 3 0 0 0 0 0
## 4 0 0 0 0 0
## 5 0 0 0 1 1
##
## No No Maybe Yes No No NA No No No No NA No No No No No No No Yes
## 0 0 1 0 0 0
## 1 0 0 0 0 0
## 2 1 0 0 0 1
## 3 0 0 0 0 0
## 4 0 0 0 0 0
## 5 0 0 2 29 0
##
## No No Yes NA No No Yes No No No Yes Yes No Yes No Maybe No Yes No No
## 0 0 0 0 0 0
## 1 0 0 0 0 0
## 2 2 2 1 0 0
## 3 0 0 0 0 0
## 4 0 0 0 1 4
## 5 0 0 0 0 0
##
## No Yes Yes Maybe No Yes Yes Yes Yes Maybe NA No Yes Maybe No No
## 0 0 0 1 1
## 1 0 0 0 0
## 2 1 3 0 0
## 3 0 0 0 0
## 4 0 0 0 0
## 5 0 0 0 0
##
## Yes No No Maybe Yes No No NA Yes No No No Yes No No Yes Yes No Yes NA
## 0 0 1 0 0 1
## 1 0 0 0 0 0
## 2 0 0 0 0 0
## 3 1 0 8 3 0
## 4 0 0 0 0 0
## 5 0 0 0 0 0
##
## Yes No Yes No Yes No Yes Yes Yes Yes No Yes Yes Yes Yes No
## 0 0 0 0 0
## 1 1 4 1 3
## 2 0 0 0 0
## 3 0 0 0 0
## 4 0 0 0 0
## 5 0 0 0 0
##
## Yes Yes Yes Yes
## 0 0
## 1 2
## 2 0
## 3 0
## 4 0
## 5 0
Alignment of Kay's groups (Kay.pattern) with cluster3
## Kay.pattern
## cluster3 0 1 2 3 4 5
## 1 3 5 3 5 1 9
## 2 2 8 6 4 4 12
## 3 10 1 2 3 0 21
What do the 3 clusters look like?
##
## 1 2 3
## 26 36 37
Alignment between 3 clusters and self report
## ML1.previous
## cluster3 Maybe No Yes
## 1 6 9 10
## 2 5 19 12
## 3 7 24 5
## ML2.planMS
## cluster3 Maybe No Yes
## 1 0 20 6
## 2 1 28 7
## 3 4 29 2
## ML3.usedMS
## cluster3 Maybe No Yes
## 1 0 18 8
## 2 2 21 13
## 3 2 30 2
## ML4.planEOS
## cluster3 Maybe No Yes
## 1 2 14 6
## 2 1 24 8
## 3 0 33 3
## total.no
## cluster3 0 1 2 3 4 Sum
## 1 0.50000000 0.16666667 0.11904762 0.15625000 0.06896552 0.13131313
## 2 0.00000000 0.23333333 0.23809524 0.17187500 0.13793103 0.18181818
## 3 0.00000000 0.10000000 0.14285714 0.17187500 0.29310345 0.18686869
## Sum 0.50000000 0.50000000 0.50000000 0.50000000 0.50000000 0.50000000
##
## cluster3 FALSE TRUE Sum
## 1 12 14 26
## 2 17 19 36
## 3 9 28 37
## Sum 38 61 99
## total.yes
## cluster3 0 1 2 3 4
## 1 10 9 2 3 2
## 2 12 14 4 6 0
## 3 30 4 1 2 0
Kay’s rules for 3 groups:
1 = 3-4 yes
2 = any 2 yes + 2 no, 2 no + yes + maybe, 2 yes + no + maybe
3 = 3-4 no
0 = no info
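A sketch of that rule applied to the total.* counts (my reading of the rule, so an assumption; it reproduces the 13/23/61 split below, with 2 left blank as the no-info cases):
qual$Kay3 = with(qual, ifelse(total.yes >= 3, "1",
                       ifelse(total.no  >= 3, "3",
                       ifelse(total.noinfo >= 3, "", "2"))))
addmargins(table(qual$Kay3))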
## [1] "3" "3" "3" "" "" "3" "3" "1" "3" "3" "3" "" "1" "" "3"
##
## 1 2 3 Sum
## 2 13 23 61 99
## [1] 23 130
## ML1.previous ML2.planMS ML3.usedMS ML4.planEOS total.no total.yes
## 4 Maybe No Maybe No 2 0
## 5 Yes No Yes No 2 2
## 12 Yes No Yes <NA> 1 2
## 14 Maybe Maybe No No 2 0
## 16 No Yes Yes Maybe 1 2
## 18 Maybe No Yes No 2 1
## 19 Yes No No Yes 2 2
## 23 Maybe No Yes No 2 1
## 25 Maybe No Maybe No 2 0
## 33 Maybe <NA> No No 2 0
## 42 No Yes No Maybe 2 1
## 45 Yes No No Yes 2 2
## 46 No No Yes <NA> 2 1
## 53 <NA> No No Yes 2 1
## 59 Maybe No No <NA> 2 0
## 60 Yes No No <NA> 2 1
## 63 Yes No No Maybe 2 1
## 70 No No Maybe Yes 2 1
## 71 Yes Maybe No No 2 1
## 74 No No Yes <NA> 2 1
## 79 Yes No No Yes 2 2
## 82 Maybe No <NA> No 2 0
## 99 No No Yes Yes 2 2
## [1] 8636869
## [1] 60
## StudentID ML1.previous ML2.planMS ML3.usedMS ML4.planEOS
## 60 S8636869 Yes No No <NA>
## ML1.previous ML2.planMS ML3.usedMS ML4.planEOS
## 1 No Yes No No
## 2 No No No No
## 3 Yes No No No
## 4 Maybe No Maybe No
## 5 Yes No Yes No
## 6 No No No No
## 7 No No No No
## 8 No Yes Yes Yes
## 9 Yes No No No
## 10 No Yes No No
## 11 Maybe No No No
## 12 Yes No Yes <NA>
## 13 Yes Yes Yes Yes
## 14 Maybe Maybe No No
## 15 No No No No
## 16 No Yes Yes Maybe
## 17 Yes No No No
## 18 Maybe No Yes No
## 19 Yes No No Yes
## 20 Yes No No No
##
## FALSE TRUE
## 381 15
##
## FALSE TRUE
## 351 30
## cluster3
## Kay3 1 2 3 Sum
## 1 0 1 2
## 1 5 6 2 13
## 2 6 11 6 23
## 3 14 19 28 61
## Sum 26 36 37 99
Trying a 2-cluster solution. Use la.norm to cluster lecture recording access:
distances = dist(la.norm[2:115], method = "euclidean")  # same distances as for the 3-cluster solution
clusterLA = hclust(distances, method = "ward.D")
plot(clusterLA)
clusterGroups2 = cutree(clusterLA, k = 2)                # cut the same tree into 2 clusters
la.norm$cluster2 = clusterGroups2
dim(la.norm)
## [1] 99 117
la.norm[1:5,1:5]
## StudentID X4.03.14 X5.03.14 X6.03.14 X7.03.14
## 1 S6089847 1 1 0 0
## 3 S8117889 0 1 1 0
## 4 S8118323 1 0 0 0
## 5 S8152093 0 0 0 0
## 6 S8239113 1 1 0 0
la.norm[1:5,ncol(la.norm)]
## [1] 1 1 1 2 1
la.norm[1:5,115:117]
## X25.06.14 cluster3 cluster2
## 1 0 1 1
## 3 0 2 1
## 4 0 1 1
## 5 0 3 2
## 6 0 1 1
addmargins(with(la.norm, table(cluster3, cluster2)))
## cluster2
## cluster3 1 2 Sum
## 1 26 0 26
## 2 36 0 36
## 3 0 37 37
## Sum 62 37 99
moving cluster2 over to la.norm.qual
Kay.office = la.norm
df = Kay.office
dim(df)
## [1] 99 117
df = cbind(df$StudentID, df[117])
dim(df)
## [1] 99 2
df[1:5,]
## df$StudentID cluster2
## 1 S6089847 1
## 3 S8117889 1
## 4 S8118323 1
## 5 S8152093 2
## 6 S8239113 1
Kay.office = df
dim(Kay.office)
## [1] 99 2
Kay.office[1:5,]
## df$StudentID cluster2
## 1 S6089847 1
## 3 S8117889 1
## 4 S8118323 1
## 5 S8152093 2
## 6 S8239113 1
names(Kay.office) = c("StudentID", "cluster2")
la.norm.qual = merge(la.norm.qual, Kay.office, by="StudentID")
dim(la.norm.qual)
## [1] 99 131
la.norm.qual[1:5,1:5]
## StudentID X4.03.14 X5.03.14 X6.03.14 X7.03.14
## 1 S6089847 1 1 0 0
## 2 S8117889 0 1 1 0
## 3 S8118323 1 0 0 0
## 4 S8152093 0 0 0 0
## 5 S8239113 1 1 0 0
la.norm.qual[1:5,125:131]
## access pattern prevLR access.days Kay.pattern Kay3 cluster2
## 1 21 No Yes No No No 15 4 3 1
## 2 75 No No No No No 26 5 3 1
## 3 122 Yes No No No Yes 33 3 3 1
## 4 13 Maybe No Maybe No Yes 10 0 2 2
## 5 89 Yes No Yes No Yes 28 1 2 1
with(la.norm.qual, table(Kay3, cluster2))
## cluster2
## Kay3 1 2
## 1 1
## 1 11 2
## 2 17 6
## 3 33 28
addmargins(with(la.norm.qual, table(total.yes, cluster2)))
## cluster2
## total.yes 1 2 Sum
## 0 22 30 52
## 1 23 4 27
## 2 6 1 7
## 3 9 2 11
## 4 2 0 2
## Sum 62 37 99
with(la.norm.qual, tapply(access, cluster2, mean))
## 1 2
## 62.50000 18.78378
with(la.norm.qual, tapply(access, cluster2, sem))
## [1] 4.276394
## [1] 2.204669
## 1 2
## 4.276394 2.204669
with(la.norm.qual, tapply(access.days, cluster2, mean))
## 1 2
## 21.854839 9.243243
with(la.norm.qual, tapply(access.days, cluster2, sem))
## [1] 0.89148
## [1] 0.570029
## 1 2
## 0.891480 0.570029
la.norm.qual[1:5,115:131]
## X25.06.14 cluster3 ML1.previous ML2.planMS ML3.usedMS ML4.planEOS
## 1 0 1 No Yes No No
## 2 0 2 No No No No
## 3 0 1 Yes No No No
## 4 0 3 Maybe No Maybe No
## 5 0 1 Yes No Yes No
## total.no total.yes total.maybe total.noinfo access pattern
## 1 3 1 0 0 21 No Yes No No
## 2 4 0 0 0 75 No No No No
## 3 3 1 0 0 122 Yes No No No
## 4 2 0 2 0 13 Maybe No Maybe No
## 5 2 2 0 0 89 Yes No Yes No
## prevLR access.days Kay.pattern Kay3 cluster2
## 1 No 15 4 3 1
## 2 No 26 5 3 1
## 3 Yes 33 3 3 1
## 4 Yes 10 0 2 2
## 5 Yes 28 1 2 1
addmargins(with(la.norm.qual, table(ML1.previous, cluster2)))
## cluster2
## ML1.previous 1 2 Sum
## Maybe 11 7 18
## No 28 24 52
## Yes 22 5 27
## Sum 61 36 97
addmargins(with(la.norm.qual, table(ML2.planMS, cluster2)))
## cluster2
## ML2.planMS 1 2 Sum
## Maybe 1 4 5
## No 48 29 77
## Yes 13 2 15
## Sum 62 35 97
addmargins(with(la.norm.qual, table(ML3.usedMS, cluster2)))
## cluster2
## ML3.usedMS 1 2 Sum
## Maybe 2 2 4
## No 39 30 69
## Yes 21 2 23
## Sum 62 34 96
addmargins(with(la.norm.qual, table(ML4.planEOS, cluster2)))
## cluster2
## ML4.planEOS 1 2 Sum
## Maybe 3 0 3
## No 38 33 71
## Yes 14 3 17
## Sum 55 36 91
addmargins(with(la.norm.qual, table(ML1.previous == "Yes", cluster2)))
## cluster2
## 1 2 Sum
## FALSE 39 31 70
## TRUE 22 5 27
## Sum 61 36 97
addmargins(with(la.norm.qual, table(ML2.planMS == "Yes", cluster2)))
## cluster2
## 1 2 Sum
## FALSE 49 33 82
## TRUE 13 2 15
## Sum 62 35 97
addmargins(with(la.norm.qual, table(ML3.usedMS == "Yes", cluster2)))
## cluster2
## 1 2 Sum
## FALSE 41 32 73
## TRUE 21 2 23
## Sum 62 34 96
addmargins(with(la.norm.qual, table(ML4.planEOS == "Yes", cluster2)))
## cluster2
## 1 2 Sum
## FALSE 41 33 74
## TRUE 14 3 17
## Sum 55 36 91
Then run calendarHeat for both clusters…
## [1] 62 131
## [1] 37 131
## [1] 62 115
## StudentID X4.03.14 X5.03.14 X6.03.14 X7.03.14
## 1 S6089847 1 1 0 0
## 2 S8117889 0 1 1 0
## 3 S8118323 1 0 0 0
## 5 S8239113 1 1 0 0
## 6 S8283571 0 1 0 0
## X20.06.14 X21.06.14 X22.06.14 X23.06.14 X24.06.14 X25.06.14
## 1 0 0 0 0 0 0
## 2 1 1 1 1 1 0
## 3 1 0 0 0 0 0
## 5 0 0 0 0 0 0
## 6 0 0 0 0 1 0
## [1] "matrix"
## [1] 115 62
## 1 2 3 5 6
## StudentID "S6089847" "S8117889" "S8118323" "S8239113" "S8283571"
## X4.03.14 "1" "0" "1" "1" "0"
## X5.03.14 "1" "1" "0" "1" "1"
## X6.03.14 "0" "1" "0" "0" "0"
## X7.03.14 "0" "0" "0" "0" "0"
## 79 81 86 88 90
## StudentID "S8643917" "S8644267" "S8646161" "S8646489" "S8647069"
## X4.03.14 "0" "0" "1" "1" "1"
## X5.03.14 "1" "0" "1" "1" "1"
## X6.03.14 "1" "1" "1" "0" "0"
## X7.03.14 "0" "0" "0" "0" "0"
## 95 98 99
## StudentID "S8648397" "S8651655" "S8651793"
## X4.03.14 "0" "0" "1"
## X5.03.14 "0" "0" "0"
## X6.03.14 "0" "1" "1"
## X7.03.14 "0" "0" "0"
## [1] "data.frame"
## [1] 115 62
## 1 2 3 5 6
## StudentID S6089847 S8117889 S8118323 S8239113 S8283571
## X4.03.14 1 0 1 1 0
## X5.03.14 1 1 0 1 1
## X6.03.14 0 1 0 0 0
## X7.03.14 0 0 0 0 0
## 79 81 86 88 90 95 98
## StudentID S8643917 S8644267 S8646161 S8646489 S8647069 S8648397 S8651655
## X4.03.14 0 0 1 1 1 0 0
## X5.03.14 1 0 1 1 1 0 0
## X6.03.14 1 1 1 0 0 0 1
## X7.03.14 0 0 0 0 0 0 0
## 99
## StudentID S8651793
## X4.03.14 1
## X5.03.14 0
## X6.03.14 1
## X7.03.14 0
## 79 81 86 88 90 95 98 99 Dates
## X4.03.14 0 0 1 1 1 0 0 1 X4.03.14
## X5.03.14 1 0 1 1 1 0 0 0 X5.03.14
## X6.03.14 1 1 1 0 0 0 1 1 X6.03.14
## X7.03.14 0 0 0 0 0 0 0 0 X7.03.14
## [1] "data.frame"
## [1] 114 63
## 1 2 3 5 6
## X4.03.14 1 0 1 1 0
## X5.03.14 1 1 0 1 1
## X6.03.14 0 1 0 0 0
## X7.03.14 0 0 0 0 0
## X8.03.14 0 0 0 1 0
## 79 81 86 88 90 95 98 99 Dates
## X4.03.14 0 0 1 1 1 0 0 1 X4.03.14
## X5.03.14 1 0 1 1 1 0 0 0 X5.03.14
## X6.03.14 1 1 1 0 0 0 1 1 X6.03.14
## X7.03.14 0 0 0 0 0 0 0 0 X7.03.14
## X8.03.14 0 0 0 0 0 0 0 0 X8.03.14
## X9.03.14 0 0 0 0 0 0 0 0 X9.03.14
## X10.03.14 0 1 0 0 1 0 0 0 X10.03.14
## X11.03.14 0 1 0 1 0 0 0 0 X11.03.14
## X12.03.14 1 0 0 0 0 0 0 0 X12.03.14
## X13.03.14 0 0 0 0 0 0 1 0 X13.03.14
## X14.03.14 1 0 0 0 0 0 0 0 X14.03.14
## X15.03.14 0 0 0 0 0 0 0 0 X15.03.14
## X16.03.14 0 0 0 0 0 0 0 0 X16.03.14
## X17.03.14 0 0 0 0 0 0 0 0 X17.03.14
## X18.03.14 1 0 0 0 0 1 0 0 X18.03.14
## 79 81 86 88 90 95 98 99 Dates Dates2
## X4.03.14 0 0 1 1 1 0 0 1 X4.03.14 4.03.14
## X5.03.14 1 0 1 1 1 0 0 0 X5.03.14 5.03.14
## X6.03.14 1 1 1 0 0 0 1 1 X6.03.14 6.03.14
## X7.03.14 0 0 0 0 0 0 0 0 X7.03.14 7.03.14
## X8.03.14 0 0 0 0 0 0 0 0 X8.03.14 8.03.14
## X9.03.14 0 0 0 0 0 0 0 0 X9.03.14 9.03.14
## X10.03.14 0 1 0 0 1 0 0 0 X10.03.14 10.03.14
## X11.03.14 0 1 0 1 0 0 0 0 X11.03.14 11.03.14
## X12.03.14 1 0 0 0 0 0 0 0 X12.03.14 12.03.14
## X13.03.14 0 0 0 0 0 0 1 0 X13.03.14 13.03.14
## X14.03.14 1 0 0 0 0 0 0 0 X14.03.14 14.03.14
## X15.03.14 0 0 0 0 0 0 0 0 X15.03.14 15.03.14
## X16.03.14 0 0 0 0 0 0 0 0 X16.03.14 16.03.14
## X17.03.14 0 0 0 0 0 0 0 0 X17.03.14 17.03.14
## X18.03.14 1 0 0 0 0 1 0 0 X18.03.14 18.03.14
## chr [1:114] "4.03.14" "5.03.14" "6.03.14" "7.03.14" ...
## 79 81 86 88 90 95 98 99 Dates Dates2
## X4.03.14 0 0 1 1 1 0 0 1 X4.03.14 2014-03-04
## X5.03.14 1 0 1 1 1 0 0 0 X5.03.14 2014-03-05
## X6.03.14 1 1 1 0 0 0 1 1 X6.03.14 2014-03-06
## X7.03.14 0 0 0 0 0 0 0 0 X7.03.14 2014-03-07
## X8.03.14 0 0 0 0 0 0 0 0 X8.03.14 2014-03-08
## X9.03.14 0 0 0 0 0 0 0 0 X9.03.14 2014-03-09
## X10.03.14 0 1 0 0 1 0 0 0 X10.03.14 2014-03-10
## X11.03.14 0 1 0 1 0 0 0 0 X11.03.14 2014-03-11
## X12.03.14 1 0 0 0 0 0 0 0 X12.03.14 2014-03-12
## X13.03.14 0 0 0 0 0 0 1 0 X13.03.14 2014-03-13
## X14.03.14 1 0 0 0 0 0 0 0 X14.03.14 2014-03-14
## X15.03.14 0 0 0 0 0 0 0 0 X15.03.14 2014-03-15
## X16.03.14 0 0 0 0 0 0 0 0 X16.03.14 2014-03-16
## X17.03.14 0 0 0 0 0 0 0 0 X17.03.14 2014-03-17
## X18.03.14 1 0 0 0 0 1 0 0 X18.03.14 2014-03-18
## 79 81 86 88 90 95 98 99 Dates Dates2
## X4.03.14 0 0 1 1 1 0 0 1 X4.03.14 2014-03-04
## X5.03.14 1 0 1 1 1 0 0 0 X5.03.14 2014-03-05
## X6.03.14 1 1 1 0 0 0 1 1 X6.03.14 2014-03-06
## X7.03.14 0 0 0 0 0 0 0 0 X7.03.14 2014-03-07
## X8.03.14 0 0 0 0 0 0 0 0 X8.03.14 2014-03-08
## X9.03.14 0 0 0 0 0 0 0 0 X9.03.14 2014-03-09
## X10.03.14 0 1 0 0 1 0 0 0 X10.03.14 2014-03-10
## X11.03.14 0 1 0 1 0 0 0 0 X11.03.14 2014-03-11
## X12.03.14 1 0 0 0 0 0 0 0 X12.03.14 2014-03-12
## X13.03.14 0 0 0 0 0 0 1 0 X13.03.14 2014-03-13
## X14.03.14 1 0 0 0 0 0 0 0 X14.03.14 2014-03-14
## X15.03.14 0 0 0 0 0 0 0 0 X15.03.14 2014-03-15
## X16.03.14 0 0 0 0 0 0 0 0 X16.03.14 2014-03-16
## X17.03.14 0 0 0 0 0 0 0 0 X17.03.14 2014-03-17
## X18.03.14 1 0 0 0 0 1 0 0 X18.03.14 2014-03-18
## 'data.frame': 114 obs. of 5 variables:
## $ 1: chr "1" "1" "0" "0" ...
## $ 2: chr "0" "1" "1" "0" ...
## $ 3: chr "1" "0" "0" "0" ...
## $ 5: chr "1" "1" "0" "0" ...
## $ 6: chr "0" "1" "0" "0" ...
## 79 81 86 88 90 95 98 99 Dates Dates2
## X4.03.14 0 0 1 1 1 0 0 1 X4.03.14 2014-03-04
## X5.03.14 1 0 1 1 1 0 0 0 X5.03.14 2014-03-05
## X6.03.14 1 1 1 0 0 0 1 1 X6.03.14 2014-03-06
## X7.03.14 0 0 0 0 0 0 0 0 X7.03.14 2014-03-07
## X8.03.14 0 0 0 0 0 0 0 0 X8.03.14 2014-03-08
## X9.03.14 0 0 0 0 0 0 0 0 X9.03.14 2014-03-09
## X10.03.14 0 1 0 0 1 0 0 0 X10.03.14 2014-03-10
## X11.03.14 0 1 0 1 0 0 0 0 X11.03.14 2014-03-11
## X12.03.14 1 0 0 0 0 0 0 0 X12.03.14 2014-03-12
## X13.03.14 0 0 0 0 0 0 1 0 X13.03.14 2014-03-13
## X14.03.14 1 0 0 0 0 0 0 0 X14.03.14 2014-03-14
## X15.03.14 0 0 0 0 0 0 0 0 X15.03.14 2014-03-15
## X16.03.14 0 0 0 0 0 0 0 0 X16.03.14 2014-03-16
## X17.03.14 0 0 0 0 0 0 0 0 X17.03.14 2014-03-17
## X18.03.14 1 0 0 0 0 1 0 0 X18.03.14 2014-03-18
## 'data.frame': 114 obs. of 5 variables:
## $ 1: num 1 1 0 0 0 0 0 0 0 1 ...
## $ 2: num 0 1 1 0 0 1 0 0 1 0 ...
## $ 3: num 1 0 0 0 0 1 1 1 0 1 ...
## $ 5: num 1 1 0 0 1 0 0 0 0 0 ...
## $ 6: num 0 1 0 0 0 0 0 0 0 0 ...
## [1] 114 65
## 79 81 86 88 90 95 98 99 Dates Dates2 Total
## X4.03.14 0 0 1 1 1 0 0 1 X4.03.14 2014-03-04 29
## X5.03.14 1 0 1 1 1 0 0 0 X5.03.14 2014-03-05 32
## X6.03.14 1 1 1 0 0 0 1 1 X6.03.14 2014-03-06 17
## X7.03.14 0 0 0 0 0 0 0 0 X7.03.14 2014-03-07 6
## X8.03.14 0 0 0 0 0 0 0 0 X8.03.14 2014-03-08 10
## X9.03.14 0 0 0 0 0 0 0 0 X9.03.14 2014-03-09 11
## X10.03.14 0 1 0 0 1 0 0 0 X10.03.14 2014-03-10 15
## X11.03.14 0 1 0 1 0 0 0 0 X11.03.14 2014-03-11 21
## X12.03.14 1 0 0 0 0 0 0 0 X12.03.14 2014-03-12 16
## X13.03.14 0 0 0 0 0 0 1 0 X13.03.14 2014-03-13 7
## X14.03.14 1 0 0 0 0 0 0 0 X14.03.14 2014-03-14 7
## X15.03.14 0 0 0 0 0 0 0 0 X15.03.14 2014-03-15 6
## X16.03.14 0 0 0 0 0 0 0 0 X16.03.14 2014-03-16 10
## X17.03.14 0 0 0 0 0 0 0 0 X17.03.14 2014-03-17 17
## X18.03.14 1 0 0 0 0 1 0 0 X18.03.14 2014-03-18 19
## [1] 114 39
## 94 96 97 Dates Dates2
## X4.03.14 1 0 1 X4.03.14 2014-03-04
## X5.03.14 0 0 0 X5.03.14 2014-03-05
## X6.03.14 0 0 0 X6.03.14 2014-03-06
## X7.03.14 0 0 0 X7.03.14 2014-03-07
## X8.03.14 0 0 0 X8.03.14 2014-03-08
## [1] 114 40
## 94 96 97 Dates Dates2 Total
## X4.03.14 1 0 1 X4.03.14 2014-03-04 10
## X5.03.14 0 0 0 X5.03.14 2014-03-05 8
## X6.03.14 0 0 0 X6.03.14 2014-03-06 2
## X7.03.14 0 0 0 X7.03.14 2014-03-07 2
## X8.03.14 0 0 0 X8.03.14 2014-03-08 1
## X9.03.14 0 0 0 X9.03.14 2014-03-09 2
## X10.03.14 0 0 0 X10.03.14 2014-03-10 3
## X11.03.14 0 0 0 X11.03.14 2014-03-11 3
## X12.03.14 1 0 0 X12.03.14 2014-03-12 9
## X13.03.14 0 0 1 X13.03.14 2014-03-13 3
## X14.03.14 0 0 0 X14.03.14 2014-03-14 1
## X15.03.14 0 0 0 X15.03.14 2014-03-15 0
## X16.03.14 0 0 0 X16.03.14 2014-03-16 1
## X17.03.14 0 0 0 X17.03.14 2014-03-17 4
## X18.03.14 0 0 0 X18.03.14 2014-03-18 7
Kay’s email Wed 13 Aug 2014:
Low cluster: n = 37, access 18.8 +/- 2.2, access days 9.2 +/- 0.57
Meta-learning response (did and/or intended to access), n and mean access:
yes: n = 47, 66 +/- 5.6
no (0 yes): n = 52, 28.23 +/- 2.5**
dim(la.norm.qual)
## [1] 99 131
with(la.norm.qual, tapply(access, total.yes == 0, mean))
## FALSE TRUE
## 66.00000 28.23077
with(la.norm.qual, tapply(access, total.yes == 0, sem))
## [1] 5.590563
## [1] 2.54175
## FALSE TRUE
## 5.590563 2.541750
with(la.norm.qual, tapply(access.days, total.yes == 0, mean))
## FALSE TRUE
## 21.29787 13.38462
with(la.norm.qual, tapply(access.days, total.yes == 0, sem))
## [1] 1.272681
## [1] 0.8848285
## FALSE TRUE
## 1.2726813 0.8848285
with(la.norm.qual, table(total.yes == 0, cluster2))
## cluster2
## 1 2
## FALSE 40 7
## TRUE 22 30
Days before the deadline for ML1-4 and the Assignment, plus AcP for the course (and the Assignment), plus qual categories 3 and 5 (up to 4 instances of each, so ordinal data). Using MLsub, which has the ML1-4 submission and due dates and the time differences (it should already be loaded into the global environment - if not, there are some indications of the code in markup v1).
clean - consent
dim(ci)
## [1] 231 2
MLsub = NULL
MLsub = read.csv("MLsub.csv")
dim(MLsub)
## [1] 876 11
MLsub[,11] = NULL
MLsub[,1] = NULL
str(MLsub)
## 'data.frame': 876 obs. of 9 variables:
## $ StudentID: Factor w/ 230 levels "s3044923","s361850",..: 56 89 31 148 19 198 136 206 137 68 ...
## $ Date : Factor w/ 33 levels "1/06/14","10/04/14",..: 19 20 20 19 21 18 14 16 22 16 ...
## $ Submitted: Factor w/ 866 levels "0:01:00","0:03:09",..: 586 304 427 490 144 244 571 260 218 114 ...
## $ Duration : Factor w/ 785 levels "","0:00:42","0:01:07",..: 73 220 718 697 207 684 541 280 611 435 ...
## $ MLtask : Factor w/ 4 levels "ML1","ML2","ML3",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ Open : Factor w/ 4 levels "19/03/14","27/05/14",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ Due : Factor w/ 4 levels "14/05/14","16/04/14",..: 3 3 3 3 3 3 3 3 3 3 ...
## $ SubDT : Factor w/ 850 levels "1/06/14 0:28",..: 476 491 494 468 510 441 338 386 543 378 ...
## $ DueDT : Factor w/ 4 levels "14/05/14 17:00",..: 3 3 3 3 3 3 3 3 3 3 ...
MLsub$SubDT = as.character(MLsub$SubDT)
MLsub$DueDT = as.character(MLsub$DueDT)
str(MLsub)
## 'data.frame': 876 obs. of 9 variables:
## $ StudentID: Factor w/ 230 levels "s3044923","s361850",..: 56 89 31 148 19 198 136 206 137 68 ...
## $ Date : Factor w/ 33 levels "1/06/14","10/04/14",..: 19 20 20 19 21 18 14 16 22 16 ...
## $ Submitted: Factor w/ 866 levels "0:01:00","0:03:09",..: 586 304 427 490 144 244 571 260 218 114 ...
## $ Duration : Factor w/ 785 levels "","0:00:42","0:01:07",..: 73 220 718 697 207 684 541 280 611 435 ...
## $ MLtask : Factor w/ 4 levels "ML1","ML2","ML3",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ Open : Factor w/ 4 levels "19/03/14","27/05/14",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ Due : Factor w/ 4 levels "14/05/14","16/04/14",..: 3 3 3 3 3 3 3 3 3 3 ...
## $ SubDT : chr "23/03/14 20:43" "24/03/14 14:31" "24/03/14 17:06" "23/03/14 18:28" ...
## $ DueDT : chr "26/03/14 17:00" "26/03/14 17:00" "26/03/14 17:00" "26/03/14 17:00" ...
#MLsub[1:3,]
#dtp = as.POSIXct(dt, format = "%d/%m/%Y %H:%M:%S", tz="UTC")
MLsub$SubDT = as.POSIXct(MLsub$SubDT, "%d/%m/%y %H:%M", tz="UTC")
MLsub$DueDT = as.POSIXct(MLsub$DueDT, "%d/%m/%y %H:%M", tz="UTC")
#MLsub[1:3,]
MLsub$Earliness = difftime(MLsub$DueDT, MLsub$SubDT)
#MLsub[1:3,]
#MLsub[1:5,1:5]
MLsub.names = names(MLsub)
MLsub.names
## [1] "StudentID" "Date" "Submitted" "Duration" "MLtask"
## [6] "Open" "Due" "SubDT" "DueDT" "Earliness"
MLsub.names[1] = "StudentID"
names(MLsub) = MLsub.names
#MLsub[1:5,1:5]
clean - De-ID
## [1] 876 10
## StudentID Date Submitted Duration MLtask Open Due
## 1 S8579275 23/03/14 20:43:42 0:09:16 ML1 19/03/14 26/03/14
## 2 S8587419 24/03/14 14:31:37 0:16:02 ML1 19/03/14 26/03/14
## 3 S8530605 24/03/14 17:06:10 47:58:06 ML1 19/03/14 26/03/14
## 4 S8636955 23/03/14 18:28:04 4:40:22 ML1 19/03/14 26/03/14
## 5 S8475915 25/03/14 12:00:45 0:15:28 ML1 19/03/14 26/03/14
## SubDT DueDT Earliness
## 1 2014-03-23 20:43:00 2014-03-26 17:00:00 4097 mins
## 2 2014-03-24 14:31:00 2014-03-26 17:00:00 3029 mins
## 3 2014-03-24 17:06:00 2014-03-26 17:00:00 2874 mins
## 4 2014-03-23 18:28:00 2014-03-26 17:00:00 4232 mins
## 5 2014-03-25 12:00:00 2014-03-26 17:00:00 1740 mins
str(MLsub)
## 'data.frame': 876 obs. of 10 variables:
## $ StudentID: chr "S8579275" "S8587419" "S8530605" "S8636955" ...
## $ Date : Factor w/ 33 levels "1/06/14","10/04/14",..: 19 20 20 19 21 18 14 16 22 16 ...
## $ Submitted: Factor w/ 866 levels "0:01:00","0:03:09",..: 586 304 427 490 144 244 571 260 218 114 ...
## $ Duration : Factor w/ 785 levels "","0:00:42","0:01:07",..: 73 220 718 697 207 684 541 280 611 435 ...
## $ MLtask : Factor w/ 4 levels "ML1","ML2","ML3",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ Open : Factor w/ 4 levels "19/03/14","27/05/14",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ Due : Factor w/ 4 levels "14/05/14","16/04/14",..: 3 3 3 3 3 3 3 3 3 3 ...
## $ SubDT : POSIXct, format: "2014-03-23 20:43:00" "2014-03-24 14:31:00" ...
## $ DueDT : POSIXct, format: "2014-03-26 17:00:00" "2014-03-26 17:00:00" ...
## $ Earliness:Class 'difftime' atomic [1:876] 4097 3029 2874 4232 1740 ...
## .. ..- attr(*, "units")= chr "mins"
dim(MLsub)
## [1] 876 10
#install.packages("lubridate")
require(lubridate)
## Loading required package: lubridate
##
## Attaching package: 'lubridate'
## The following objects are masked from 'package:chron':
##
## days, hours, minutes, seconds, years
## The following object is masked from 'package:base':
##
## date
mean(MLsub$Earliness)
## Time difference of 5746.612 mins
mean(difftime(MLsub$DueDT, MLsub$SubDT, units = "hours"))
## Time difference of 95.77686 hours
sem(difftime(MLsub$DueDT, MLsub$SubDT, units = "hours"))
## [1] 2.028385
mean(difftime(MLsub$DueDT, MLsub$SubDT, units = "days"))
## Time difference of 3.990703 days
sem(difftime(MLsub$DueDT, MLsub$SubDT, units = "days"))
## [1] 0.08451603
MLsub[1:5,1:3]
## StudentID Date Submitted
## 1 S8579275 23/03/14 20:43:42
## 2 S8587419 24/03/14 14:31:37
## 3 S8530605 24/03/14 17:06:10
## 4 S8636955 23/03/14 18:28:04
## 5 S8475915 25/03/14 12:00:45
Correlations within the ML submissions, as a check: want ML1 vs ML2 vs ML3 vs ML4 for Earliness.
ML1…4 would need to be columns with StudentID as rows -> too hard to transform the data here (one possible sketch is below), so try a boxplot for consistency instead.
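For the record, a sketch of how the wide ML1-ML4 earliness table could be built directly (hedged: duplicate submissions per student and task would need handling first; the notes below take the boxplot route and later build these columns by merging):
MLsub.1 = MLsub[!duplicated(MLsub[ , c("StudentID", "MLtask")]), ]       # keep the first recorded submission per student and task
MLsub.1$Early = as.numeric(MLsub.1$Earliness)                            # minutes before the deadline, as plain numbers
early.wide = reshape(MLsub.1[ , c("StudentID", "MLtask", "Early")],
                     idvar = "StudentID", timevar = "MLtask", direction = "wide")
head(early.wide)                                                          # columns Early.ML1 ... Early.ML4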
str(MLsub)
## 'data.frame': 876 obs. of 10 variables:
## $ StudentID: chr "S8579275" "S8587419" "S8530605" "S8636955" ...
## $ Date : Factor w/ 33 levels "1/06/14","10/04/14",..: 19 20 20 19 21 18 14 16 22 16 ...
## $ Submitted: Factor w/ 866 levels "0:01:00","0:03:09",..: 586 304 427 490 144 244 571 260 218 114 ...
## $ Duration : Factor w/ 785 levels "","0:00:42","0:01:07",..: 73 220 718 697 207 684 541 280 611 435 ...
## $ MLtask : Factor w/ 4 levels "ML1","ML2","ML3",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ Open : Factor w/ 4 levels "19/03/14","27/05/14",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ Due : Factor w/ 4 levels "14/05/14","16/04/14",..: 3 3 3 3 3 3 3 3 3 3 ...
## $ SubDT : POSIXct, format: "2014-03-23 20:43:00" "2014-03-24 14:31:00" ...
## $ DueDT : POSIXct, format: "2014-03-26 17:00:00" "2014-03-26 17:00:00" ...
## $ Earliness:Class 'difftime' atomic [1:876] 4097 3029 2874 4232 1740 ...
## .. ..- attr(*, "units")= chr "mins"
MLsub$Early.hr = difftime(MLsub$DueDT, MLsub$SubDT, units = "hours")
MLsub$Early.hr[1:5]
## Time differences in hours
## [1] 68.28333 50.48333 47.90000 70.53333 29.00000
MLsub$Early.hr.num = as.numeric(MLsub$Early.hr)
boxplot(Early.hr.num ~ MLtask, data=MLsub)
#check if there is a difference in earliness between ML tasks...
aov.out = NULL
aov.out = aov(Early.hr.num ~ MLtask * StudentID + Error(StudentID), data=MLsub)
summary(aov.out)
##
## Error: StudentID
## Df Sum Sq Mean Sq
## MLtask 3 3420 1140
## StudentID 226 1885492 8343
##
## Error: Within
## Df Sum Sq Mean Sq F value Pr(>F)
## MLtask 3 88890 29630 10.786 0.0127 *
## MLtask:StudentID 638 1162107 1821 0.663 0.8149
## Residuals 5 13736 2747
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#now sig difference...
#install.packages("car")
library(car)
with(MLsub, pairwise.t.test(Early.hr.num, MLtask, p.adjust.method = "bonferroni"))
##
## Pairwise comparisons using t tests with pooled SD
##
## data: Early.hr.num and MLtask
##
## ML1 ML2 ML3
## ML2 0.241 - -
## ML3 1.000 0.148 -
## ML4 0.027 8.5e-06 0.053
##
## P value adjustment method: bonferroni
with(MLsub, tapply(Early.hr.num, MLtask, mean))
## ML1 ML2 ML3 ML4
## 94.34444 82.71682 95.53326 110.44762
with(MLsub, tapply(Early.hr.num, MLtask, sem))
## [1] 3.686183
## [1] 3.857621
## [1] 3.887297
## [1] 4.569826
## ML1 ML2 ML3 ML4
## 3.686183 3.857621 3.887297 4.569826
kw = kruskal.test(Early.hr.num ~ MLtask, data=MLsub)
kw
##
## Kruskal-Wallis rank sum test
##
## data: Early.hr.num by MLtask
## Kruskal-Wallis chi-squared = 28.324, df = 3, p-value = 3.106e-06
#Testing For Homogeneity of Variance
bartlett.test(Early.hr.num ~ MLtask, data=MLsub)
##
## Bartlett test of homogeneity of variances
##
## data: Early.hr.num by MLtask
## Bartlett's K-squared = 10.966, df = 3, p-value = 0.01191
MLsub$Early.hr.log = log(MLsub$Early.hr.num)
bartlett.test(Early.hr.log ~ MLtask, data=MLsub)
##
## Bartlett test of homogeneity of variances
##
## data: Early.hr.log by MLtask
## Bartlett's K-squared = 15.65, df = 3, p-value = 0.001338
ML1 = subset(MLsub, MLtask =="ML1")
ML2 = subset(MLsub, MLtask =="ML2")
ML3 = subset(MLsub, MLtask =="ML3")
ML4 = subset(MLsub, MLtask =="ML4")
m <- list(ML1, ML2, ML3, ML4)
for (i in 1:4)
hist(m[[i]][,12])
#install.packages("ggplot2")
require(ggplot2)
df = MLsub
#install.packages("devtools")
library(devtools)
#install_github("easyGgplot2", "kassambara") #got warning, maybe need install_github("kassambara/easyGgplot2")
library(easyGgplot2)
ggplot2.histogram(data=df, xName='Early.hr.num', groupName='MLtask', alpha=0.5, xtitle="Before deadline (hr)")
add MLsub to la.norm.qual
dim(la.norm.qual)
## [1] 99 131
la.norm.qual[1:5,125:131]
## access pattern prevLR access.days Kay.pattern Kay3 cluster2
## 1 21 No Yes No No No 15 4 3 1
## 2 75 No No No No No 26 5 3 1
## 3 122 Yes No No No Yes 33 3 3 1
## 4 13 Maybe No Maybe No Yes 10 0 2 2
## 5 89 Yes No Yes No Yes 28 1 2 1
all = NULL
dim(ML1)
## [1] 225 13
ML1[1:5,]
## StudentID Date Submitted Duration MLtask Open Due
## 1 S8579275 23/03/14 20:43:42 0:09:16 ML1 19/03/14 26/03/14
## 2 S8587419 24/03/14 14:31:37 0:16:02 ML1 19/03/14 26/03/14
## 3 S8530605 24/03/14 17:06:10 47:58:06 ML1 19/03/14 26/03/14
## 4 S8636955 23/03/14 18:28:04 4:40:22 ML1 19/03/14 26/03/14
## 5 S8475915 25/03/14 12:00:45 0:15:28 ML1 19/03/14 26/03/14
## SubDT DueDT Earliness Early.hr
## 1 2014-03-23 20:43:00 2014-03-26 17:00:00 4097 mins 68.28333 hours
## 2 2014-03-24 14:31:00 2014-03-26 17:00:00 3029 mins 50.48333 hours
## 3 2014-03-24 17:06:00 2014-03-26 17:00:00 2874 mins 47.90000 hours
## 4 2014-03-23 18:28:00 2014-03-26 17:00:00 4232 mins 70.53333 hours
## 5 2014-03-25 12:00:00 2014-03-26 17:00:00 1740 mins 29.00000 hours
## Early.hr.num Early.hr.log
## 1 68.28333 4.223666
## 2 50.48333 3.921643
## 3 47.90000 3.869116
## 4 70.53333 4.256085
## 5 29.00000 3.367296
all = merge(la.norm.qual, ML1[, c(1, 12)], by="StudentID")
dim(all)
## [1] 99 132
all[1:5, 125:132]
## access pattern prevLR access.days Kay.pattern Kay3 cluster2
## 1 21 No Yes No No No 15 4 3 1
## 2 21 No Yes No No No 15 4 3 1
## 3 75 No No No No No 26 5 3 1
## 4 122 Yes No No No Yes 33 3 3 1
## 5 13 Maybe No Maybe No Yes 10 0 2 2
## Early.hr.num
## 1 139.866667
## 2 8.733333
## 3 19.866667
## 4 43.200000
## 5 117.333333
all.names = names(all)
all.names[132] = "ML1earliness"
names(all) = all.names
all[1:5,130:132]
## Kay3 cluster2 ML1earliness
## 1 3 1 139.866667
## 2 3 1 8.733333
## 3 3 1 19.866667
## 4 3 1 43.200000
## 5 2 2 117.333333
all = merge(all, ML2[, c(1, 12)], by="StudentID")
all = merge(all, ML3[, c(1, 12)], by="StudentID")
all = merge(all, ML4[, c(1, 12)], by="StudentID")
dim(all)
## [1] 97 135
all[1:5,130:135]
## Kay3 cluster2 ML1earliness Early.hr.num.x Early.hr.num.y Early.hr.num
## 1 3 1 139.866667 163.900000 140.2667 115.9000
## 2 3 1 8.733333 163.900000 140.2667 115.9000
## 3 3 1 19.866667 47.366667 162.9833 186.3167
## 4 3 1 43.200000 66.950000 19.2000 186.7333
## 5 2 2 117.333333 5.466667 0.5000 168.6000
all.names = names(all)
all.names[133] = "ML2earliness"
all.names[134] = "ML3earliness"
all.names[135] = "ML4earliness"
names(all) = all.names
all[1:5,130:135]
## Kay3 cluster2 ML1earliness ML2earliness ML3earliness ML4earliness
## 1 3 1 139.866667 163.900000 140.2667 115.9000
## 2 3 1 8.733333 163.900000 140.2667 115.9000
## 3 3 1 19.866667 47.366667 162.9833 186.3167
## 4 3 1 43.200000 66.950000 19.2000 186.7333
## 5 2 2 117.333333 5.466667 0.5000 168.6000
cor(all[,132:135])
## ML1earliness ML2earliness ML3earliness ML4earliness
## ML1earliness 1.0000000 0.4466270 0.4804985 0.4446415
## ML2earliness 0.4466270 1.0000000 0.5419437 0.3660182
## ML3earliness 0.4804985 0.5419437 1.0000000 0.5333472
## ML4earliness 0.4446415 0.3660182 0.5333472 1.0000000
add in assignment submission
ass = read.csv("Ass.csv")
dim(ass)
## [1] 220 4
str(ass)
## 'data.frame': 220 obs. of 4 variables:
## $ StudentID: Factor w/ 220 levels "s3044923","s361850",..: 144 105 56 90 197 87 213 19 65 52 ...
## $ Ass.mark : int 82 75 86 83 92 94 81 93 83 78 ...
## $ Sub.Date : Factor w/ 12 levels "12/05/2014","13/05/2014",..: 1 2 3 4 4 4 5 5 5 6 ...
## $ Sub.Time : Factor w/ 220 levels "0:07:45","0:21:45",..: 42 88 16 211 103 137 197 80 101 219 ...
clean - consent
dim(ci)
## [1] 231 2
df = merge(ass, ci, by ="StudentID")
dim(df) #drops from 231 to 230 because uqlipitt removed
## [1] 219 5
#df[1:10,1:10]
#df[1:5,110:116]
df = subset(df, Consent == "Yes")
dim(df)
## [1] 96 5
ass = df
dim(ass)
## [1] 96 5
str(ass)
## 'data.frame': 96 obs. of 5 variables:
## $ StudentID: Factor w/ 220 levels "s3044923","s361850",..: 1 3 4 5 6 7 9 10 11 12 ...
## $ Ass.mark : int 86 75 94 82 91 83 77 88 83 87 ...
## $ Sub.Date : Factor w/ 12 levels "12/05/2014","13/05/2014",..: 8 10 8 10 10 9 7 8 8 9 ...
## $ Sub.Time : Factor w/ 220 levels "0:07:45","0:21:45",..: 125 210 149 64 21 107 81 86 190 202 ...
## $ Consent : Factor w/ 2 levels "No","Yes": 2 2 2 2 2 2 2 2 2 2 ...
clean - De-ID
## [1] 96 5
## StudentID Ass.mark Sub.Date Sub.Time Consent
## 1 S6089847 86 19/05/2014 20:49:26 Yes
## 3 S8117889 75 21/05/2014 9:35:45 Yes
## 4 S8118323 94 19/05/2014 22:45:19 Yes
## 5 S8152093 82 21/05/2014 11:54:37 Yes
## 6 S8239113 91 21/05/2014 10:17:41 Yes
Merging ass into all. The Assignment was due at 12 noon on 21/05/14.
ass[1:5,]
## StudentID Ass.mark Sub.Date Sub.Time Consent
## 1 S6089847 86 19/05/2014 20:49:26 Yes
## 3 S8117889 75 21/05/2014 9:35:45 Yes
## 4 S8118323 94 19/05/2014 22:45:19 Yes
## 5 S8152093 82 21/05/2014 11:54:37 Yes
## 6 S8239113 91 21/05/2014 10:17:41 Yes
str(ass)
## 'data.frame': 96 obs. of 5 variables:
## $ StudentID: chr "S6089847" "S8117889" "S8118323" "S8152093" ...
## $ Ass.mark : int 86 75 94 82 91 83 77 88 83 87 ...
## $ Sub.Date : Factor w/ 12 levels "12/05/2014","13/05/2014",..: 8 10 8 10 10 9 7 8 8 9 ...
## $ Sub.Time : Factor w/ 220 levels "0:07:45","0:21:45",..: 125 210 149 64 21 107 81 86 190 202 ...
## $ Consent : Factor w/ 2 levels "No","Yes": 2 2 2 2 2 2 2 2 2 2 ...
ass$Sub.Date = as.character(ass$Sub.Date)
ass$Sub.Time = as.character(ass$Sub.Time)
str(ass)
## 'data.frame': 96 obs. of 5 variables:
## $ StudentID: chr "S6089847" "S8117889" "S8118323" "S8152093" ...
## $ Ass.mark : int 86 75 94 82 91 83 77 88 83 87 ...
## $ Sub.Date : chr "19/05/2014" "21/05/2014" "19/05/2014" "21/05/2014" ...
## $ Sub.Time : chr "20:49:26" "9:35:45" "22:45:19" "11:54:37" ...
## $ Consent : Factor w/ 2 levels "No","Yes": 2 2 2 2 2 2 2 2 2 2 ...
ass$Sub.Ass = paste(ass$Sub.Date, ass$Sub.Time)
dim(ass)
## [1] 96 6
ass[1:5,]
## StudentID Ass.mark Sub.Date Sub.Time Consent Sub.Ass
## 1 S6089847 86 19/05/2014 20:49:26 Yes 19/05/2014 20:49:26
## 3 S8117889 75 21/05/2014 9:35:45 Yes 21/05/2014 9:35:45
## 4 S8118323 94 19/05/2014 22:45:19 Yes 19/05/2014 22:45:19
## 5 S8152093 82 21/05/2014 11:54:37 Yes 21/05/2014 11:54:37
## 6 S8239113 91 21/05/2014 10:17:41 Yes 21/05/2014 10:17:41
str(ass[,6])
## chr [1:96] "19/05/2014 20:49:26" "21/05/2014 9:35:45" ...
ass$Sub.Ass = as.POSIXct(ass$Sub.Ass, format = "%d/%m/%Y %H:%M:%S")
str(ass[,6])
## POSIXct[1:96], format: "2014-05-19 20:49:26" "2014-05-21 09:35:45" ...
ass$Due = as.POSIXct("2014-05-21 12:00:00", tz="UCT")
tz(ass$Sub.Ass)
## [1] ""
ass$Sub.Ass = force_tz(ass$Sub.Ass, "UTC")
tz(ass$Sub.Ass)
## [1] "UTC"
dim(ass)
## [1] 96 7
ass[1:5,]
## StudentID Ass.mark Sub.Date Sub.Time Consent Sub.Ass
## 1 S6089847 86 19/05/2014 20:49:26 Yes 2014-05-19 20:49:26
## 3 S8117889 75 21/05/2014 9:35:45 Yes 2014-05-21 09:35:45
## 4 S8118323 94 19/05/2014 22:45:19 Yes 2014-05-19 22:45:19
## 5 S8152093 82 21/05/2014 11:54:37 Yes 2014-05-21 11:54:37
## 6 S8239113 91 21/05/2014 10:17:41 Yes 2014-05-21 10:17:41
## Due
## 1 2014-05-21 12:00:00
## 3 2014-05-21 12:00:00
## 4 2014-05-21 12:00:00
## 5 2014-05-21 12:00:00
## 6 2014-05-21 12:00:00
str(ass)
## 'data.frame': 96 obs. of 7 variables:
## $ StudentID: chr "S6089847" "S8117889" "S8118323" "S8152093" ...
## $ Ass.mark : int 86 75 94 82 91 83 77 88 83 87 ...
## $ Sub.Date : chr "19/05/2014" "21/05/2014" "19/05/2014" "21/05/2014" ...
## $ Sub.Time : chr "20:49:26" "9:35:45" "22:45:19" "11:54:37" ...
## $ Consent : Factor w/ 2 levels "No","Yes": 2 2 2 2 2 2 2 2 2 2 ...
## $ Sub.Ass : POSIXct, format: "2014-05-19 20:49:26" "2014-05-21 09:35:45" ...
## $ Due : POSIXct, format: "2014-05-21 12:00:00" "2014-05-21 12:00:00" ...
ass$Ass.earliness = difftime(ass$Due, ass$Sub.Ass, units="hours")
str(ass)
## 'data.frame': 96 obs. of 8 variables:
## $ StudentID : chr "S6089847" "S8117889" "S8118323" "S8152093" ...
## $ Ass.mark : int 86 75 94 82 91 83 77 88 83 87 ...
## $ Sub.Date : chr "19/05/2014" "21/05/2014" "19/05/2014" "21/05/2014" ...
## $ Sub.Time : chr "20:49:26" "9:35:45" "22:45:19" "11:54:37" ...
## $ Consent : Factor w/ 2 levels "No","Yes": 2 2 2 2 2 2 2 2 2 2 ...
## $ Sub.Ass : POSIXct, format: "2014-05-19 20:49:26" "2014-05-21 09:35:45" ...
## $ Due : POSIXct, format: "2014-05-21 12:00:00" "2014-05-21 12:00:00" ...
## $ Ass.earliness:Class 'difftime' atomic [1:96] 39.1761 2.4042 37.2447 0.0897 1.7053 ...
## .. ..- attr(*, "units")= chr "hours"
ass[1:5,]
## StudentID Ass.mark Sub.Date Sub.Time Consent Sub.Ass
## 1 S6089847 86 19/05/2014 20:49:26 Yes 2014-05-19 20:49:26
## 3 S8117889 75 21/05/2014 9:35:45 Yes 2014-05-21 09:35:45
## 4 S8118323 94 19/05/2014 22:45:19 Yes 2014-05-19 22:45:19
## 5 S8152093 82 21/05/2014 11:54:37 Yes 2014-05-21 11:54:37
## 6 S8239113 91 21/05/2014 10:17:41 Yes 2014-05-21 10:17:41
## Due Ass.earliness
## 1 2014-05-21 12:00:00 39.17611111 hours
## 3 2014-05-21 12:00:00 2.40416667 hours
## 4 2014-05-21 12:00:00 37.24472222 hours
## 5 2014-05-21 12:00:00 0.08972222 hours
## 6 2014-05-21 12:00:00 1.70527778 hours
dim(ass)
## [1] 96 8
dim(all)
## [1] 97 135
all = merge(all, ass[, c(1:2, 8)], by="StudentID")
dim(all)
## [1] 94 137
all[1:5,131:ncol(all)]
## cluster2 ML1earliness ML2earliness ML3earliness ML4earliness Ass.mark
## 1 1 8.733333 163.900000 140.2667 115.9000 86
## 2 1 139.866667 163.900000 140.2667 115.9000 86
## 3 1 19.866667 47.366667 162.9833 186.3167 75
## 4 1 43.200000 66.950000 19.2000 186.7333 94
## 5 2 117.333333 5.466667 0.5000 168.6000 82
## Ass.earliness
## 1 39.17611111 hours
## 2 39.17611111 hours
## 3 2.40416667 hours
## 4 37.24472222 hours
## 5 0.08972222 hours
Academic performance as course grade (access vs performance -> AcP.csv; a sketch of the merge follows the printout below)
## StudentID Course.grade
## 1 S8529183 40.5
## 2 S8636687 47.2
## 3 S8624451 47.8
## 4 S8633919 51.9
## 5 S8583807 52.5
## [1] 94 138
## cluster2 ML1earliness ML2earliness ML3earliness ML4earliness Ass.mark
## 1 1 8.733333 163.900000 140.2667 115.9000 86
## 2 1 139.866667 163.900000 140.2667 115.9000 86
## 3 1 19.866667 47.366667 162.9833 186.3167 75
## 4 1 43.200000 66.950000 19.2000 186.7333 94
## 5 2 117.333333 5.466667 0.5000 168.6000 82
## Ass.earliness Course.grade
## 1 39.17611111 hours 66.6
## 2 39.17611111 hours 66.6
## 3 2.40416667 hours 64.7
## 4 37.24472222 hours 78.0
## 5 0.08972222 hours 80.7
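The load and merge step isn't echoed; a sketch consistent with the output above (assuming AcP.csv supplies StudentID and Course.grade):
AcP = read.csv("AcP.csv")
all = merge(all, AcP[ , c("StudentID", "Course.grade")], by = "StudentID")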
correlations (ass.early.num = hours before the Assignment due date, 12 noon)
str(all[131:ncol(all)])
## 'data.frame': 94 obs. of 8 variables:
## $ cluster2 : int 1 1 1 1 2 1 1 1 1 1 ...
## $ ML1earliness : num 8.73 139.87 19.87 43.2 117.33 ...
## $ ML2earliness : num 163.9 163.9 47.37 66.95 5.47 ...
## $ ML3earliness : num 140.3 140.3 163 19.2 0.5 ...
## $ ML4earliness : num 116 116 186 187 169 ...
## $ Ass.mark : int 86 86 75 94 82 91 83 77 88 83 ...
## $ Ass.earliness:Class 'difftime' atomic [1:94] 39.1761 39.1761 2.4042 37.2447 0.0897 ...
## .. ..- attr(*, "units")= chr "hours"
## $ Course.grade : num 66.6 66.6 64.7 78 80.7 68.4 82.3 92.2 84.9 81.7 ...
all$ass.early.num = as.numeric(all$Ass.earliness)
dim(all)
## [1] 94 139
all[1:5,131:ncol(all)]
## cluster2 ML1earliness ML2earliness ML3earliness ML4earliness Ass.mark
## 1 1 8.733333 163.900000 140.2667 115.9000 86
## 2 1 139.866667 163.900000 140.2667 115.9000 86
## 3 1 19.866667 47.366667 162.9833 186.3167 75
## 4 1 43.200000 66.950000 19.2000 186.7333 94
## 5 2 117.333333 5.466667 0.5000 168.6000 82
## Ass.earliness Course.grade ass.early.num
## 1 39.17611111 hours 66.6 39.17611111
## 2 39.17611111 hours 66.6 39.17611111
## 3 2.40416667 hours 64.7 2.40416667
## 4 37.24472222 hours 78.0 37.24472222
## 5 0.08972222 hours 80.7 0.08972222
#cor(all[c(132:134, 136:ncol(all))])
cor(all[c(131:136, 138:ncol(all))])
## cluster2 ML1earliness ML2earliness ML3earliness
## cluster2 1.00000000 0.27704850 0.06318748 0.23328512
## ML1earliness 0.27704850 1.00000000 0.44635213 0.49048355
## ML2earliness 0.06318748 0.44635213 1.00000000 0.53738345
## ML3earliness 0.23328512 0.49048355 0.53738345 1.00000000
## ML4earliness 0.20727516 0.45386072 0.35783384 0.52623056
## Ass.mark 0.06074930 0.04823572 0.08963771 0.06346325
## Course.grade 0.10995817 0.18798308 0.24792821 0.18772086
## ass.early.num -0.01724803 0.27721408 0.16754510 0.18931786
## ML4earliness Ass.mark Course.grade ass.early.num
## cluster2 0.20727516 0.06074930 0.1099582 -0.01724803
## ML1earliness 0.45386072 0.04823572 0.1879831 0.27721408
## ML2earliness 0.35783384 0.08963771 0.2479282 0.16754510
## ML3earliness 0.52623056 0.06346325 0.1877209 0.18931786
## ML4earliness 1.00000000 0.02611241 0.1471807 0.17576185
## Ass.mark 0.02611241 1.00000000 0.2400318 0.06852016
## Course.grade 0.14718068 0.24003185 1.0000000 0.28558545
## ass.early.num 0.17576185 0.06852016 0.2855855 1.00000000
Organisation qual coded as categories 3 and 5
## [1] 99 5
## StudentID Cat3 Cat5 Cat3or5 Sum.Cat3and5
## 1 S8646489 4 1 2 5
## 2 S8283571 3 1 2 4
## 3 S8586369 4 0 1 4
## 4 S8641669 3 1 2 4
## 5 S8152093 2 1 2 3
## [1] 94 143
## cluster2 ML1earliness ML2earliness ML3earliness ML4earliness Ass.mark
## 1 1 8.733333 163.9 140.2667 115.9 86
## 2 1 139.866667 163.9 140.2667 115.9 86
## Course.grade Cat5 Cat3or5 Sum.Cat3and5
## 1 66.6 0 0 0
## 2 66.6 0 0 0
## 'data.frame': 2 obs. of 10 variables:
## $ cluster2 : int 1 1
## $ ML1earliness: num 8.73 139.87
## $ ML2earliness: num 164 164
## $ ML3earliness: num 140 140
## $ ML4earliness: num 116 116
## $ Ass.mark : int 86 86
## $ Course.grade: num 66.6 66.6
## $ Cat5 : int 0 0
## $ Cat3or5 : num 0 0
## $ Sum.Cat3and5: int 0 0
## cluster2 ML1earliness ML2earliness ML3earliness
## cluster2 1.00000000 0.27704850 0.063187482 0.23328512
## ML1earliness 0.27704850 1.00000000 0.446352129 0.49048355
## ML2earliness 0.06318748 0.44635213 1.000000000 0.53738345
## ML3earliness 0.23328512 0.49048355 0.537383445 1.00000000
## ML4earliness 0.20727516 0.45386072 0.357833836 0.52623056
## Ass.mark 0.06074930 0.04823572 0.089637706 0.06346325
## Course.grade 0.10995817 0.18798308 0.247928207 0.18772086
## Cat5 0.14578738 0.07512043 -0.006346801 -0.02653697
## Cat3or5 0.01898149 -0.07668516 -0.160959046 -0.11521145
## Sum.Cat3and5 0.01351256 0.01142279 -0.179925085 -0.07817110
## ML4earliness Ass.mark Course.grade Cat5
## cluster2 0.20727516 0.06074930 0.10995817 0.145787375
## ML1earliness 0.45386072 0.04823572 0.18798308 0.075120429
## ML2earliness 0.35783384 0.08963771 0.24792821 -0.006346801
## ML3earliness 0.52623056 0.06346325 0.18772086 -0.026536971
## ML4earliness 1.00000000 0.02611241 0.14718068 0.107219094
## Ass.mark 0.02611241 1.00000000 0.24003185 -0.031986505
## Course.grade 0.14718068 0.24003185 1.00000000 0.103259521
## Cat5 0.10721909 -0.03198650 0.10325952 1.000000000
## Cat3or5 0.04154490 -0.08725225 -0.07779212 0.640939203
## Sum.Cat3and5 0.01498710 -0.10658540 -0.10408124 0.540546730
## Cat3or5 Sum.Cat3and5
## cluster2 0.01898149 0.01351256
## ML1earliness -0.07668516 0.01142279
## ML2earliness -0.16095905 -0.17992509
## ML3earliness -0.11521145 -0.07817110
## ML4earliness 0.04154490 0.01498710
## Ass.mark -0.08725225 -0.10658540
## Course.grade -0.07779212 -0.10408124
## Cat5 0.64093920 0.54054673
## Cat3or5 1.00000000 0.86549100
## Sum.Cat3and5 0.86549100 1.00000000
## [1] 94 144
## cluster2 ML1earliness ML2earliness ML3earliness ML4earliness Ass.mark
## 1 1 8.733333 163.900000 140.2667 115.9000 86
## 2 1 139.866667 163.900000 140.2667 115.9000 86
## 3 1 19.866667 47.366667 162.9833 186.3167 75
## 4 1 43.200000 66.950000 19.2000 186.7333 94
## 5 2 117.333333 5.466667 0.5000 168.6000 82
## Ass.earliness Course.grade ass.early.num Cat3 Cat5 Cat3or5
## 1 39.17611111 hours 66.6 39.17611111 0 0 0
## 2 39.17611111 hours 66.6 39.17611111 0 0 0
## 3 2.40416667 hours 64.7 2.40416667 2 0 1
## 4 37.24472222 hours 78.0 37.24472222 0 1 1
## 5 0.08972222 hours 80.7 0.08972222 2 1 2
## Sum.Cat3and5 MLearliness
## 1 0 107.20000
## 2 0 139.98333
## 3 2 104.13333
## 4 1 79.02083
## 5 3 72.97500
## cluster2 ML1earliness ML2earliness ML3earliness
## cluster2 1.00000000 0.27704850 0.063187482 0.23328512
## ML1earliness 0.27704850 1.00000000 0.446352129 0.49048355
## ML2earliness 0.06318748 0.44635213 1.000000000 0.53738345
## ML3earliness 0.23328512 0.49048355 0.537383445 1.00000000
## ML4earliness 0.20727516 0.45386072 0.357833836 0.52623056
## Ass.mark 0.06074930 0.04823572 0.089637706 0.06346325
## Course.grade 0.10995817 0.18798308 0.247928207 0.18772086
## Cat3 -0.06553437 -0.02848423 -0.209608252 -0.07776687
## Cat5 0.14578738 0.07512043 -0.006346801 -0.02653697
## Cat3or5 0.01898149 -0.07668516 -0.160959046 -0.11521145
## Sum.Cat3and5 0.01351256 0.01142279 -0.179925085 -0.07817110
## MLearliness 0.25007838 0.75390591 0.747976825 0.82090373
## ML4earliness Ass.mark Course.grade Cat3
## cluster2 0.20727516 0.06074930 0.10995817 -0.06553437
## ML1earliness 0.45386072 0.04823572 0.18798308 -0.02848423
## ML2earliness 0.35783384 0.08963771 0.24792821 -0.20960825
## ML3earliness 0.52623056 0.06346325 0.18772086 -0.07776687
## ML4earliness 1.00000000 0.02611241 0.14718068 -0.04221520
## Ass.mark 0.02611241 1.00000000 0.24003185 -0.10838137
## Course.grade 0.14718068 0.24003185 1.00000000 -0.18106140
## Cat3 -0.04221520 -0.10838137 -0.18106140 1.00000000
## Cat5 0.10721909 -0.03198650 0.10325952 0.08106184
## Cat3or5 0.04154490 -0.08725225 -0.07779212 0.66685730
## Sum.Cat3and5 0.01498710 -0.10658540 -0.10408124 0.88236300
## MLearliness 0.77708920 0.07208286 0.24651552 -0.11483597
## Cat5 Cat3or5 Sum.Cat3and5 MLearliness
## cluster2 0.145787375 0.01898149 0.01351256 0.25007838
## ML1earliness 0.075120429 -0.07668516 0.01142279 0.75390591
## ML2earliness -0.006346801 -0.16095905 -0.17992509 0.74797682
## ML3earliness -0.026536971 -0.11521145 -0.07817110 0.82090373
## ML4earliness 0.107219094 0.04154490 0.01498710 0.77708920
## Ass.mark -0.031986505 -0.08725225 -0.10658540 0.07208286
## Course.grade 0.103259521 -0.07779212 -0.10408124 0.24651552
## Cat3 0.081061839 0.66685730 0.88236300 -0.11483597
## Cat5 1.000000000 0.64093920 0.54054673 0.05072051
## Cat3or5 0.640939203 1.00000000 0.86549100 -0.09463110
## Sum.Cat3and5 0.540546730 0.86549100 1.00000000 -0.07298578
## MLearliness 0.050720510 -0.09463110 -0.07298578 1.00000000
Tests for the 2 clusters (Wilcoxon rank-sum tests rather than t-tests):
dim(all)
## [1] 94 144
wilcox.test(MLearliness ~ cluster2, data=all)
##
## Wilcoxon rank sum test with continuity correction
##
## data: MLearliness by cluster2
## W = 706, p-value = 0.01748
## alternative hypothesis: true location shift is not equal to 0
round(with(all, calc(mean, MLearliness, cluster2)),1)
## 1 2
## 94.2 117.3
round(with(all, calc.sem(sem, MLearliness, cluster2)),1)
## [1] 5.813671
## [1] 6.657489
## 1 2
## 5.8 6.7
wilcox.test(ass.early.num ~ cluster2, data=all)
##
## Wilcoxon rank sum test with continuity correction
##
## data: ass.early.num by cluster2
## W = 998, p-value = 0.9495
## alternative hypothesis: true location shift is not equal to 0
round(with(all, calc(mean, ass.early.num, cluster2)),1)
## 1 2
## 29.9 28.1
round(with(all, calc.sem(sem, ass.early.num, cluster2)),1)
## [1] 6.842439
## [1] 6.427785
## 1 2
## 6.8 6.4
wilcox.test(Course.grade ~ cluster2, data=all)
##
## Wilcoxon rank sum test with continuity correction
##
## data: Course.grade by cluster2
## W = 889, p-value = 0.354
## alternative hypothesis: true location shift is not equal to 0
round(with(all, calc(mean, Course.grade, cluster2)),1)
## 1 2
## 77.8 79.8
round(with(all, calc.sem(sem, Course.grade, cluster2)),1)
## [1] 1.236442
## [1] 1.226599
## 1 2
## 1.2 1.2
wilcox.test(Sum.Cat3and5 ~ cluster2, data=all)
##
## Wilcoxon rank sum test with continuity correction
##
## data: Sum.Cat3and5 by cluster2
## W = 1001.5, p-value = 0.9701
## alternative hypothesis: true location shift is not equal to 0
round(with(all, calc(mean, Sum.Cat3and5, cluster2)),1)
## 1 2
## 1.2 1.2
round(with(all, calc.sem(sem, Sum.Cat3and5, cluster2)),1)
## [1] 0.1450611
## [1] 0.1982766
## 1 2
## 0.1 0.2
ANOVAs for Sum.Cat3and5
aov.cat = aov(Course.grade ~ Sum.Cat3and5, data=all)
summary(aov.cat)
## Df Sum Sq Mean Sq F value Pr(>F)
## Sum.Cat3and5 1 79 78.78 1.008 0.318
## Residuals 92 7193 78.19
aov.cat2 = aov(MLearliness ~ Sum.Cat3and5, data=all)
summary(aov.cat2)
## Df Sum Sq Mean Sq F value Pr(>F)
## Sum.Cat3and5 1 969 968.9 0.493 0.484
## Residuals 92 180914 1966.5
aov.cat3 = aov(ass.early.num ~ Sum.Cat3and5, data=all)
summary(aov.cat3)
## Df Sum Sq Mean Sq F value Pr(>F)
## Sum.Cat3and5 1 23 23.4 0.01 0.921
## Residuals 92 215028 2337.3
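In the ANOVAs above Sum.Cat3and5 enters as a numeric predictor, so each test is a 1-df linear-trend test. If differences between the individual 0/1/2/3 counts are of interest instead, the same call can be run with the predictor as a factor; a minimal sketch (not part of the original analysis):
aov.cat.f = aov(Course.grade ~ factor(Sum.Cat3and5), data = all)  # categorical version of aov.cat
summary(aov.cat.f)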
Using more variables for the clustering: cluster.all includes the day-by-day access columns (2:115), cluster.all2 uses only the summary variables
distances.all = dist(all[c(2:115, 117:120, 125, 132:136, 137, 141:143)], method = "euclidean")
cluster.all = hclust(distances.all, method = "ward.D2")
plot(cluster.all)
distances.all2 = dist(all[c(117:120, 125, 132:136, 137, 141:143)], method = "euclidean")
cluster.all2 = hclust(distances.all2, method = "ward.D2")
plot(cluster.all2)
cluster.all2.groups = cutree(cluster.all2, k = 2)
all$cluster.all2 = cluster.all2.groups
with(all, table(cluster2, cluster.all2))
## cluster.all2
## cluster2 1 2
## 1 50 11
## 2 31 2
cluster.all.groups = cutree(cluster.all, k = 2)
all$cluster.all = cluster.all.groups
with(all, table(cluster2, cluster.all))
## cluster.all
## cluster2 1 2
## 1 50 11
## 2 31 2
with(all, table(cluster.all, cluster.all2))
## cluster.all2
## cluster.all 1 2
## 1 81 0
## 2 0 13
dim(all)
## [1] 94 146
all[1:2,115:146]
## X25.06.14 cluster3 ML1.previous ML2.planMS ML3.usedMS ML4.planEOS
## 1 0 1 No Yes No No
## 2 0 1 No Yes No No
## total.no total.yes total.maybe total.noinfo access pattern prevLR
## 1 3 1 0 0 21 No Yes No No No
## 2 3 1 0 0 21 No Yes No No No
## access.days Kay.pattern Kay3 cluster2 ML1earliness ML2earliness
## 1 15 4 3 1 8.733333 163.9
## 2 15 4 3 1 139.866667 163.9
## ML3earliness ML4earliness Ass.mark Ass.earliness Course.grade
## 1 140.2667 115.9 86 39.17611 hours 66.6
## 2 140.2667 115.9 86 39.17611 hours 66.6
## ass.early.num Cat3 Cat5 Cat3or5 Sum.Cat3and5 MLearliness cluster.all2
## 1 39.17611 0 0 0 0 107.2000 1
## 2 39.17611 0 0 0 0 139.9833 1
## cluster.all
## 1 1
## 2 1
all$total.yes.gp = ifelse(all$total.yes == 0, 2, 1)
table(all$total.yes.gp)
##
## 1 2
## 47 47
wilcox.test(access.days ~ total.yes.gp, data=all)
## Warning in wilcox.test.default(x = c(15, 15, 33, 28, 41, 31, 13, 17, 25, :
## cannot compute exact p-value with ties
##
## Wilcoxon rank sum test with continuity correction
##
## data: access.days by total.yes.gp
## W = 1660, p-value = 2.654e-05
## alternative hypothesis: true location shift is not equal to 0
#ML3 submission date
dim(MLsub)
## [1] 876 13
MLsub[1:5,]
## StudentID Date Submitted Duration MLtask Open Due
## 1 S8579275 23/03/14 20:43:42 0:09:16 ML1 19/03/14 26/03/14
## 2 S8587419 24/03/14 14:31:37 0:16:02 ML1 19/03/14 26/03/14
## 3 S8530605 24/03/14 17:06:10 47:58:06 ML1 19/03/14 26/03/14
## 4 S8636955 23/03/14 18:28:04 4:40:22 ML1 19/03/14 26/03/14
## 5 S8475915 25/03/14 12:00:45 0:15:28 ML1 19/03/14 26/03/14
## SubDT DueDT Earliness Early.hr
## 1 2014-03-23 20:43:00 2014-03-26 17:00:00 4097 mins 68.28333 hours
## 2 2014-03-24 14:31:00 2014-03-26 17:00:00 3029 mins 50.48333 hours
## 3 2014-03-24 17:06:00 2014-03-26 17:00:00 2874 mins 47.90000 hours
## 4 2014-03-23 18:28:00 2014-03-26 17:00:00 4232 mins 70.53333 hours
## 5 2014-03-25 12:00:00 2014-03-26 17:00:00 1740 mins 29.00000 hours
## Early.hr.num Early.hr.log
## 1 68.28333 4.223666
## 2 50.48333 3.921643
## 3 47.90000 3.869116
## 4 70.53333 4.256085
## 5 29.00000 3.367296
ML3[1:5,]
## StudentID Date Submitted Duration MLtask Open Due
## 441 S8587419 10/05/14 18:10:01 0:22:41 ML3 7/05/14 14/05/14
## 442 S8530605 12/05/14 15:09:04 0:19:05 ML3 7/05/14 14/05/14
## 443 S8636955 12/05/14 21:42:34 0:45:45 ML3 7/05/14 14/05/14
## 444 S8475915 12/05/14 11:57:22 0:21:43 ML3 7/05/14 14/05/14
## 445 S8645607 7/05/14 17:30:54 0:16:33 ML3 7/05/14 14/05/14
## SubDT DueDT Earliness Early.hr
## 441 2014-05-10 18:10:00 2014-05-14 17:00:00 5690 mins 94.83333 hours
## 442 2014-05-12 15:09:00 2014-05-14 17:00:00 2991 mins 49.85000 hours
## 443 2014-05-12 21:42:00 2014-05-14 17:00:00 2598 mins 43.30000 hours
## 444 2014-05-12 11:57:00 2014-05-14 17:00:00 3183 mins 53.05000 hours
## 445 2014-05-07 17:30:00 2014-05-14 17:00:00 10050 mins 167.50000 hours
## Early.hr.num Early.hr.log
## 441 94.83333 4.552121
## 442 49.85000 3.909018
## 443 43.30000 3.768153
## 444 53.05000 3.971235
## 445 167.50000 5.120983
table(ML3$Date)
##
## 1/06/14 10/04/14 10/05/14 11/04/14 11/05/14 12/04/14 12/05/14 13/04/14
## 0 0 11 0 26 0 28 0
## 13/05/14 14/04/14 14/05/14 15/04/14 16/04/14 19/03/14 2/06/14 20/03/14
## 29 0 20 0 0 0 0 0
## 21/03/14 22/03/14 23/03/14 24/03/14 25/03/14 26/03/14 27/05/14 28/05/14
## 0 0 0 0 0 0 0 0
## 29/05/14 3/06/14 30/05/14 31/05/14 4/06/14 7/05/14 8/05/14 9/04/14
## 0 0 0 0 0 44 37 0
## 9/05/14
## 24
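The zero counts and text-style ordering above come from ML3$Date keeping the factor levels of the full MLsub table and sorting them as character strings. A small sketch of a cleaner tabulation, assuming the d/m/y format shown in the printout:
ML3.dates = as.Date(as.character(ML3$Date), format = "%d/%m/%y")  # drops unused levels, sorts chronologically
table(ML3.dates)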
#check submission time clusters against lecture recording clusters
all[1:2,117:ncol(all)]
## ML1.previous ML2.planMS ML3.usedMS ML4.planEOS total.no total.yes
## 1 No Yes No No 3 1
## 2 No Yes No No 3 1
## total.maybe total.noinfo access pattern prevLR access.days
## 1 0 0 21 No Yes No No No 15
## 2 0 0 21 No Yes No No No 15
## Kay.pattern Kay3 cluster2 ML1earliness ML2earliness ML3earliness
## 1 4 3 1 8.733333 163.9 140.2667
## 2 4 3 1 139.866667 163.9 140.2667
## ML4earliness Ass.mark Ass.earliness Course.grade ass.early.num Cat3
## 1 115.9 86 39.17611 hours 66.6 39.17611 0
## 2 115.9 86 39.17611 hours 66.6 39.17611 0
## Cat5 Cat3or5 Sum.Cat3and5 MLearliness cluster.all2 cluster.all
## 1 0 0 0 107.2000 1 1
## 2 0 0 0 139.9833 1 1
## total.yes.gp
## 1 1
## 2 1
with(all, table(cluster2, cluster.all))
## cluster.all
## cluster2 1 2
## 1 50 11
## 2 31 2
with(all, table(cluster2, cluster.all2))
## cluster.all2
## cluster2 1 2
## 1 50 11
## 2 31 2
with(all, tapply(access, cluster2, mean))
## 1 2
## 63.44262 18.93939
with(all, tapply(access, cluster.all, mean))
## 1 2
## 45.58025 61.76923
with(all, tapply(access, cluster.all2, mean))
## 1 2
## 45.58025 61.76923
with(all, tapply(access.days, cluster.all, mean))
## 1 2
## 17.04938 20.07692
with(all, tapply(MLearliness, cluster.all, mean))
## 1 2
## 115.35874 20.92981
with(all, tapply(ass.early.num, cluster.all, mean))
## 1 2
## 32.20305 10.89310
names(all[c(117:120, 125, 132:136, 137, 141:143)])
## [1] "ML1.previous" "ML2.planMS" "ML3.usedMS" "ML4.planEOS"
## [5] "access" "ML1earliness" "ML2earliness" "ML3earliness"
## [9] "ML4earliness" "Ass.mark" "Ass.earliness" "Cat5"
## [13] "Cat3or5" "Sum.Cat3and5"
with(all, tapply(Ass.mark, cluster.all, mean))
## 1 2
## 84.61728 80.53846
with(all, tapply(Course.grade, cluster.all, mean))
## 1 2
## 80.17160 68.26154
CONCLUSIONS:
Clusters based on ML responses, lecture recording access, earliness, assignment mark and Cat3/5 show: a huge difference in ML earliness; a substantial difference in assignment earliness; no difference in assignment mark; and a very large difference in course grade.
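The same comparison template (group means plus a Wilcoxon test) is repeated for each outcome here and later in the document, so it could be run in one pass; a minimal sketch using plain tapply in place of the calc/calc.sem helpers:
outcomes = c("MLearliness", "ass.early.num", "Ass.mark", "Course.grade")
sapply(outcomes, function(v) {
  p = wilcox.test(all[[v]] ~ all$cluster.all)$p.value          # same test as below, per outcome
  means = tapply(all[[v]], all$cluster.all, mean, na.rm = TRUE)
  round(c(means, p = p), 3)
})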
wilcox.test(Course.grade ~ cluster.all, data=all)
##
## Wilcoxon rank sum test with continuity correction
##
## data: Course.grade by cluster.all
## W = 931, p-value = 9.634e-06
## alternative hypothesis: true location shift is not equal to 0
Checking the impact of the qualitative variables (Cat3/5 counts and ML responses) on the cluster.all determination
with(all, tapply(Cat3or5, cluster.all, mean))
## 1 2
## 0.8271605 1.0769231
with(all, tapply(Sum.Cat3and5, cluster.all, mean))
## 1 2
## 1.148148 1.461538
addmargins(with(all, table(ML1.previous, cluster.all)))
## cluster.all
## ML1.previous 1 2 Sum
## Maybe 14 2 16
## No 44 6 50
## Yes 23 5 28
## Sum 81 13 94
with(all, table(ML2.planMS, cluster.all))
## cluster.all
## ML2.planMS 1 2
## Maybe 4 1
## No 64 9
## Yes 12 3
with(all, table(ML3.usedMS, cluster.all))
## cluster.all
## ML3.usedMS 1 2
## Maybe 4 0
## No 61 3
## Yes 14 10
with(all, table(ML4.planEOS, cluster.all))
## cluster.all
## ML4.planEOS 1 2
## Maybe 2 0
## No 60 7
## Yes 13 4
Qualitative strategy categories and phases (forethought, performance, evaluation)
## StudentID strat1 strat2 strat3 strat4 strat5 strat6 strat7 strat8 strat9
## 1 S8152093 0 0 2 0 1 1 1 0 1
## 2 S8469547 0 0 2 1 0 1 1 0 0
## 3 S8522577 0 0 1 0 2 1 1 0 0
## 4 S8533121 0 0 2 0 0 1 0 0 1
## 5 S8575195 0 0 1 1 2 1 2 0 2
## strat10 foretht perf eval phases
## 1 0 0 4 1 2
## 2 0 0 4 0 1
## 3 0 0 4 0 1
## 4 0 0 2 1 2
## 5 0 0 5 1 2
## [1] 94 161
## cluster.all
## foretht 1 2 Sum
## 0 74 13 87
## 1 6 0 6
## 2 1 0 1
## Sum 81 13 94
## cluster2
## foretht 1 2 Sum
## 0 56 31 87
## 1 4 2 6
## 2 1 0 1
## Sum 61 33 94
## cluster.all
## perf 1 2 Sum
## 1 5 0 5
## 2 18 2 20
## 3 29 4 33
## 4 21 4 25
## 5 8 1 9
## 6 0 2 2
## Sum 81 13 94
## cluster2
## perf 1 2 Sum
## 1 3 2 5
## 2 14 6 20
## 3 20 13 33
## 4 16 9 25
## 5 6 3 9
## 6 2 0 2
## Sum 61 33 94
## cluster.all
## eval 1 2 Sum
## 0 27 4 31
## 1 40 8 48
## 2 14 1 15
## Sum 81 13 94
## cluster2
## eval 1 2 Sum
## 0 18 13 31
## 1 34 14 48
## 2 9 6 15
## Sum 61 33 94
## cluster.all
## phases 1 2 Sum
## 1 25 4 29
## 2 51 9 60
## 3 5 0 5
## Sum 81 13 94
## cluster2
## phases 1 2 Sum
## 1 17 12 29
## 2 40 20 60
## 3 4 1 5
## Sum 61 33 94
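These cross-tabs are only inspected by eye; a Fisher's exact test would attach a p-value to any association between the qualitative phases and the clusters (a sketch on the same columns, not run in the original):
fisher.test(with(all, table(phases, cluster.all)))   # phases vs high/low performing cluster
fisher.test(with(all, table(phases, cluster2)))      # phases vs the lecture-recording clusters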
Checking 5-cluster and 3-cluster solutions for cluster.all2
cluster.all2.groups.k5 = cutree(cluster.all2, k = 5)
all$cluster.all2k5 = cluster.all2.groups.k5
with(all, tapply(Course.grade, cluster.all2k5, mean))
## 1 2 3 4 5
## 80.70000 78.41500 81.62857 80.10588 68.26154
cluster.all2.groups.k3 = cutree(cluster.all2, k = 3)
all$cluster.all2k3 = cluster.all2.groups.k3
with(all, tapply(Course.grade, cluster.all2k3, mean))
## 1 2 3
## 80.18906 80.10588 68.26154
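Rather than repeating cutree() for each k, the grade means can be pulled out for a range of cluster counts in one call; a minimal sketch on the same dendrogram:
lapply(2:5, function(k) {
  grp = cutree(cluster.all2, k = k)                  # k-group cut of the same tree
  round(tapply(all$Course.grade, grp, mean), 1)
})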
with(all, tapply(Course.grade, cluster.all2, mean))
## 1 2
## 80.17160 68.26154
with(all, tapply(Course.grade, cluster.all2, sem))
## [1] 0.8932916
## [1] 1.812051
## 1 2
## 0.8932916 1.8120515
wilcox.test(Course.grade ~ cluster.all2, data=all)
##
## Wilcoxon rank sum test with continuity correction
##
## data: Course.grade by cluster.all2
## W = 931, p-value = 9.634e-06
## alternative hypothesis: true location shift is not equal to 0
with(all, tapply(access, cluster.all2, mean))
## 1 2
## 45.58025 61.76923
with(all, tapply(access, cluster.all2, sem))
## [1] 4.045759
## [1] 6.870742
## 1 2
## 4.045759 6.870742
wilcox.test(access ~ cluster.all2, data=all)
##
## Wilcoxon rank sum test with continuity correction
##
## data: access by cluster.all2
## W = 330, p-value = 0.03178
## alternative hypothesis: true location shift is not equal to 0
with(all, tapply(access.days, cluster.all2, mean))
## 1 2
## 17.04938 20.07692
with(all, tapply(access.days, cluster.all2, sem))
## [1] 0.967034
## [1] 2.076923
## 1 2
## 0.967034 2.076923
wilcox.test(access.days ~ cluster.all2, data=all)
##
## Wilcoxon rank sum test with continuity correction
##
## data: access.days by cluster.all2
## W = 401.5, p-value = 0.1722
## alternative hypothesis: true location shift is not equal to 0
all$ML1.previous[1:5]
## [1] No No No Yes Maybe
## Levels: Maybe No Yes
#with(all, tapply(ML1.previous, cluster.all2, mean))
#with(all, tapply(ML1.previous, cluster.all2, sem))
#wilcox.test(ML1.previous ~ cluster.all2, data=all)
#not numerical
with(all, tapply(total.yes, cluster.all2, mean))
## 1 2
## 0.7654321 1.6923077
with(all, tapply(total.yes, cluster.all2, sem))
## [1] 0.1182063
## [1] 0.3468654
## 1 2
## 0.1182063 0.3468654
wilcox.test(total.yes ~ cluster.all2, data=all)
##
## Wilcoxon rank sum test with continuity correction
##
## data: total.yes by cluster.all2
## W = 304, p-value = 0.008412
## alternative hypothesis: true location shift is not equal to 0
addmargins(with(all, table(ML1.previous, cluster.all2)))
## cluster.all2
## ML1.previous 1 2 Sum
## Maybe 14 2 16
## No 44 6 50
## Yes 23 5 28
## Sum 81 13 94
addmargins(with(all, table(ML2.planMS, cluster.all2)))
## cluster.all2
## ML2.planMS 1 2 Sum
## Maybe 4 1 5
## No 64 9 73
## Yes 12 3 15
## Sum 80 13 93
addmargins(with(all, table(ML3.usedMS, cluster.all2)))
## cluster.all2
## ML3.usedMS 1 2 Sum
## Maybe 4 0 4
## No 61 3 64
## Yes 14 10 24
## Sum 79 13 92
addmargins(with(all, table(ML4.planEOS, cluster.all2)))
## cluster.all2
## ML4.planEOS 1 2 Sum
## Maybe 2 0 2
## No 60 7 67
## Yes 13 4 17
## Sum 75 11 86
str(all$ML1earliness)
## num [1:94] 8.73 139.87 19.87 43.2 117.33 ...
with(all, tapply(ML1earliness, cluster.all2, mean))
## 1 2
## 107.94156 32.46923
with(all, tapply(ML1earliness, cluster.all2, sem))
## [1] 5.19381
## [1] 7.732229
## 1 2
## 5.193810 7.732229
wilcox.test(ML1earliness ~ cluster.all2, data=all)
##
## Wilcoxon rank sum test with continuity correction
##
## data: ML1earliness by cluster.all2
## W = 947, p-value = 4.223e-06
## alternative hypothesis: true location shift is not equal to 0
with(all, tapply(ML2earliness, cluster.all2, mean))
## 1 2
## 101.3214 16.4000
with(all, tapply(ML2earliness, cluster.all2, sem))
## [1] 5.696691
## [1] 4.361642
## 1 2
## 5.696691 4.361642
wilcox.test(ML2earliness ~ cluster.all2, data=all)
##
## Wilcoxon rank sum test with continuity correction
##
## data: ML2earliness by cluster.all2
## W = 973, p-value = 1.035e-06
## alternative hypothesis: true location shift is not equal to 0
with(all, tapply(ML3earliness, cluster.all2, mean))
## 1 2
## 118.88745 19.32692
with(all, tapply(ML3earliness, cluster.all2, sem))
## [1] 5.166531
## [1] 5.197067
## 1 2
## 5.166531 5.197067
wilcox.test(ML3earliness ~ cluster.all2, data=all)
##
## Wilcoxon rank sum test with continuity correction
##
## data: ML3earliness by cluster.all2
## W = 1020, p-value = 6.678e-08
## alternative hypothesis: true location shift is not equal to 0
with(all, tapply(ML4earliness, cluster.all2, mean))
## 1 2
## 133.28457 15.52308
with(all, tapply(ML4earliness, cluster.all2, sem))
## [1] 5.957776
## [1] 4.306585
## 1 2
## 5.957776 4.306585
wilcox.test(ML4earliness ~ cluster.all2, data=all)
##
## Wilcoxon rank sum test with continuity correction
##
## data: ML4earliness by cluster.all2
## W = 1018, p-value = 7.541e-08
## alternative hypothesis: true location shift is not equal to 0
with(all, tapply(MLearliness, cluster.all2, mean))
## 1 2
## 115.35874 20.92981
with(all, tapply(MLearliness, cluster.all2, sem))
## [1] 3.520184
## [1] 3.298145
## 1 2
## 3.520184 3.298145
wilcox.test(MLearliness ~ cluster.all2, data=all)
##
## Wilcoxon rank sum test with continuity correction
##
## data: MLearliness by cluster.all2
## W = 1053, p-value = 8.36e-09
## alternative hypothesis: true location shift is not equal to 0
115/24
## [1] 4.791667
.791667*24
## [1] 19.00001
with(all, table(Ass.earliness, cluster.all2))
## cluster.all2
## Ass.earliness 1 2
## -143.96 0 1
## -0.0652777777777778 1 0
## 0.0736111111111111 1 0
## 0.0897222222222222 1 0
## 0.143055555555556 1 0
## 0.147222222222222 1 0
## 0.203611111111111 1 0
## 0.265277777777778 1 0
## 0.303888888888889 0 1
## 0.516111111111111 0 1
## 0.546111111111111 0 1
## 0.555277777777778 0 1
## 0.681944444444444 1 0
## 0.687777777777778 1 0
## 0.713055555555556 1 0
## 0.771111111111111 1 0
## 0.895555555555556 1 0
## 0.977222222222222 0 1
## 1.51083333333333 0 1
## 1.70527777777778 1 0
## 1.89444444444444 1 0
## 1.91222222222222 1 0
## 2.05 1 0
## 2.40416666666667 1 0
## 2.57666666666667 0 1
## 2.65722222222222 1 0
## 2.66472222222222 1 0
## 2.79722222222222 1 0
## 2.81888888888889 1 0
## 2.9925 1 0
## 3.26722222222222 1 0
## 3.47472222222222 1 0
## 4.72 1 0
## 6.31416666666667 0 1
## 9.57305555555556 1 0
## 9.71305555555556 1 0
## 10.8986111111111 1 0
## 11.6347222222222 1 0
## 11.6375 1 0
## 11.8708333333333 1 0
## 12.3691666666667 1 0
## 12.7441666666667 0 1
## 12.7788888888889 1 0
## 12.9455555555556 1 0
## 13.0108333333333 1 0
## 13.0341666666667 1 0
## 13.2358333333333 1 0
## 13.64 1 0
## 13.7522222222222 1 0
## 14.5788888888889 1 0
## 14.8811111111111 1 0
## 15.5013888888889 1 0
## 15.5069444444444 2 0
## 16.6544444444444 1 0
## 19.9622222222222 0 1
## 21.5005555555556 1 0
## 22.4719444444444 1 0
## 22.9075 0 1
## 23.2275 1 0
## 23.2405555555556 1 0
## 23.6263888888889 1 0
## 24.3558333333333 1 0
## 24.59 1 0
## 26.6661111111111 1 0
## 27.5688888888889 1 0
## 28.4233333333333 1 0
## 37.2447222222222 1 0
## 37.8005555555556 1 0
## 39.1761111111111 2 0
## 40.1525 1 0
## 40.8602777777778 1 0
## 42.7508333333333 1 0
## 43.4602777777778 1 0
## 46.4233333333333 1 0
## 47.0469444444444 1 0
## 47.6402777777778 1 0
## 51.3877777777778 1 0
## 51.4061111111111 1 0
## 71.1552777777778 1 0
## 71.3391666666667 1 0
## 73.3827777777778 1 0
## 75.9794444444444 1 0
## 84.4688888888889 1 0
## 90.9058333333333 1 0
## 119.182222222222 1 0
## 123.219722222222 1 0
## 134.078888888889 1 0
## 137.369722222222 1 0
## 146.389722222222 1 0
## 178.101388888889 1 0
## 189.9875 1 0
## 216.656111111111 0 1
#figuring out why Ass.earliness errors (it is a difftime, not a numeric)
str(all$Ass.earliness)
## Class 'difftime' atomic [1:94] 39.1761 39.1761 2.4042 37.2447 0.0897 ...
## ..- attr(*, "units")= chr "hours"
all$Ass.earliness.num = as.numeric(all$Ass.earliness)
with(all, tapply(Ass.earliness, cluster.all2, mean))
## 1 2
## 32.20305 10.89310
with(all, tapply(Ass.earliness, cluster.all2, sem))
## [1] 4.692823
## [1] 20.76396
## 1 2
## 4.692823 20.763958
wilcox.test(Ass.earliness.num ~ cluster.all2, data=all)
##
## Wilcoxon rank sum test with continuity correction
##
## data: Ass.earliness.num by cluster.all2
## W = 754, p-value = 0.01291
## alternative hypothesis: true location shift is not equal to 0
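A note on the conversion above: as.numeric() on a difftime simply drops the class and keeps whatever units are stored in the object (hours here). Being explicit about the units guards against the attribute being something else; a one-line sketch equivalent to the conversion already done:
all$Ass.earliness.num = as.numeric(all$Ass.earliness, units = "hours")  # explicit units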
with(all, tapply(Ass.earliness, cluster.all2, median))
## 1 2
## 14.881111 1.510833
with(all, boxplot(Ass.earliness.num ~ cluster.all2))
range(all$Ass.earliness.num)
## [1] -143.9600 216.6561
sort(all$Ass.earliness.num)
## [1] -143.96000000 -0.06527778 0.07361111 0.08972222 0.14305556
## [6] 0.14722222 0.20361111 0.26527778 0.30388889 0.51611111
## [11] 0.54611111 0.55527778 0.68194444 0.68777778 0.71305556
## [16] 0.77111111 0.89555556 0.97722222 1.51083333 1.70527778
## [21] 1.89444444 1.91222222 2.05000000 2.40416667 2.57666667
## [26] 2.65722222 2.66472222 2.79722222 2.81888889 2.99250000
## [31] 3.26722222 3.47472222 4.72000000 6.31416667 9.57305556
## [36] 9.71305556 10.89861111 11.63472222 11.63750000 11.87083333
## [41] 12.36916667 12.74416667 12.77888889 12.94555556 13.01083333
## [46] 13.03416667 13.23583333 13.64000000 13.75222222 14.57888889
## [51] 14.88111111 15.50138889 15.50694444 15.50694444 16.65444444
## [56] 19.96222222 21.50055556 22.47194444 22.90750000 23.22750000
## [61] 23.24055556 23.62638889 24.35583333 24.59000000 26.66611111
## [66] 27.56888889 28.42333333 37.24472222 37.80055556 39.17611111
## [71] 39.17611111 40.15250000 40.86027778 42.75083333 43.46027778
## [76] 46.42333333 47.04694444 47.64027778 51.38777778 51.40611111
## [81] 71.15527778 71.33916667 73.38277778 75.97944444 84.46888889
## [86] 90.90583333 119.18222222 123.21972222 134.07888889 137.36972222
## [91] 146.38972222 178.10138889 189.98750000 216.65611111
#with(all, sort(table(cluster.all2, Ass.earliness.num)))
with(all, boxplot(log(Ass.earliness.num) ~ cluster.all2))
## Warning in log(Ass.earliness.num): NaNs produced
which.min(all$Ass.earliness.num)
## [1] 19
dim(all)
## [1] 94 164
all[19,c(143:148, 164)]
## Sum.Cat3and5 MLearliness cluster.all2 cluster.all total.yes.gp strat1
## 19 2 1.020833 2 2 1 0
## Ass.earliness.num
## 19 -143.96
with(all, table(cluster.all2))
## cluster.all2
## 1 2
## 81 13
143.9600/24
## [1] 5.998333
.998333*24
## [1] 23.95999
.95999*60
## [1] 57.5994
.5994*60
## [1] 35.964
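The repeated by-hand conversions of decimal hours into days, hours and minutes (here, at 115/24 above and 32/24 below) could be wrapped in a small helper; a sketch, with the function name hours.to.dhm being ours rather than part of the original analysis:
hours.to.dhm = function(h) {
  d  = floor(h / 24)                      # whole days
  hr = floor(h - d * 24)                  # remaining whole hours
  m  = round((h - d * 24 - hr) * 60)      # remaining minutes
  c(days = d, hours = hr, minutes = m)
}
hours.to.dhm(143.96)   # the extension student's gap: 5 days 23 h 58 min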
#removing value for student who had an extension
#Ass.earliness.num
#19 -143.96
all[19,164] = ""
all[15:20, 160:164]
## eval phases cluster.all2k5 cluster.all2k3 Ass.earliness.num
## 15 0 1 4 2 119.182222222222
## 16 2 2 3 1 73.3827777777778
## 17 2 2 4 2 21.5005555555556
## 18 1 2 3 1 -0.0652777777777778
## 19 1 2 5 3
## 20 2 3 1 1 14.8811111111111
str(all$Ass.earliness.num)
## chr [1:94] "39.1761111111111" "39.1761111111111" ...
all$Ass.earliness.num = as.numeric(all$Ass.earliness.num)
str(all$Ass.earliness.num)
## num [1:94] 39.1761 39.1761 2.4042 37.2447 0.0897 ...
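The detour through a character column above (the chr in the first str()) happens because assigning "" to a numeric column coerces the whole column to character; assigning NA directly keeps it numeric and skips the extra as.numeric() step. A one-line sketch of the simpler route:
all$Ass.earliness.num[19] = NA   # stays numeric, no re-conversion needed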
wilcox.test(Ass.earliness.num ~ cluster.all2, data=all)
##
## Wilcoxon rank sum test with continuity correction
##
## data: Ass.earliness.num by cluster.all2
## W = 673, p-value = 0.03257
## alternative hypothesis: true location shift is not equal to 0
with(all, tapply(Ass.earliness.num, cluster.all2, mean, na.rm=T))
## 1 2
## 32.20305 23.79752
#with(all, tapply(Ass.earliness.num, cluster.all2, sem, na.rm=T))
with(all, tapply(Ass.earliness.num, cluster.all2, sd, na.rm=T))
## 1 2
## 42.23540 61.25978
61.25978/sqrt(12)
## [1] 17.68418
32/24
## [1] 1.333333
.3*24
## [1] 7.2
#Emailed to Kay
#With that student removed:
#p-value = 0.03257
#High performing cluster: 32.20305 +/- 4.692823 hours
#Low performing cluster: 23.79752 +/- 17.68418 hours
#but actually should just have converted it to 0.04 h (about 2.4 min) before the new deadline for their extension
all$Ass.earliness.num[19]
## [1] NA
6*24 -143.96
## [1] 0.04
all$Ass.earliness.num[19] = 0.04
all$Ass.earliness.num[18:20]
## [1] -0.06527778 0.04000000 14.88111111
str(all$Ass.earliness.num)
## num [1:94] 39.1761 39.1761 2.4042 37.2447 0.0897 ...
wilcox.test(Ass.earliness.num ~ cluster.all2, data=all)
##
## Wilcoxon rank sum test with continuity correction
##
## data: Ass.earliness.num by cluster.all2
## W = 753, p-value = 0.01331
## alternative hypothesis: true location shift is not equal to 0
with(all, tapply(Ass.earliness.num, cluster.all2, mean))
## 1 2
## 32.20305 21.97002
with(all, tapply(Ass.earliness.num, cluster.all2, sem))
## [1] 4.692823
## [1] 16.36941
## 1 2
## 4.692823 16.369409
#checking if the organisation-related qualitative codes (Cat3, Cat5) differ between the high and low performing clusters
with(all, tapply(Cat3, cluster.all2, mean))
## 1 2
## 0.8271605 1.1538462
with(all, tapply(Cat3, cluster.all2, sem))
## [1] 0.1065597
## [1] 0.2492593
## 1 2
## 0.1065597 0.2492593
wilcox.test(Cat3 ~ cluster.all2, data=all)
##
## Wilcoxon rank sum test with continuity correction
##
## data: Cat3 by cluster.all2
## W = 405, p-value = 0.1565
## alternative hypothesis: true location shift is not equal to 0
with(all, tapply(Cat5, cluster.all2, mean))
## 1 2
## 0.3209877 0.3076923
with(all, tapply(Cat5, cluster.all2, sem))
## [1] 0.06041819
## [1] 0.1332347
## 1 2
## 0.06041819 0.13323468
wilcox.test(Cat5 ~ cluster.all2, data=all)
##
## Wilcoxon rank sum test with continuity correction
##
## data: Cat5 by cluster.all2
## W = 520, p-value = 0.9336
## alternative hypothesis: true location shift is not equal to 0
Check of the variables used to build cluster.all2
names(all[c(117:120, 125, 132:136, 137, 141:143)])
## [1] "ML1.previous" "ML2.planMS" "ML3.usedMS" "ML4.planEOS"
## [5] "access" "ML1earliness" "ML2earliness" "ML3earliness"
## [9] "ML4earliness" "Ass.mark" "Ass.earliness" "Cat5"
## [13] "Cat3or5" "Sum.Cat3and5"
cluster.all check - does including the day-by-day access (timing) data change the clusters?
with(all, table(cluster.all, cluster.all2))
## cluster.all2
## cluster.all 1 2
## 1 81 0
## 2 0 13
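The cross-tab shows the two solutions place every student in the same group, so the day-by-day access columns make no difference to the split. A quick programmatic check (sketch; adjustedRandIndex needs the mclust package, which the original analysis does not load):
mean(all$cluster.all == all$cluster.all2)   # 1 = identical memberships
# label-swap-proof alternative: mclust::adjustedRandIndex(all$cluster.all, all$cluster.all2)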
Is lecture recording access related to academic performance?
dim(all)
## [1] 94 164
names(all[120:163])
## [1] "ML4.planEOS" "total.no" "total.yes" "total.maybe"
## [5] "total.noinfo" "access" "pattern" "prevLR"
## [9] "access.days" "Kay.pattern" "Kay3" "cluster2"
## [13] "ML1earliness" "ML2earliness" "ML3earliness" "ML4earliness"
## [17] "Ass.mark" "Ass.earliness" "Course.grade" "ass.early.num"
## [21] "Cat3" "Cat5" "Cat3or5" "Sum.Cat3and5"
## [25] "MLearliness" "cluster.all2" "cluster.all" "total.yes.gp"
## [29] "strat1" "strat2" "strat3" "strat4"
## [33] "strat5" "strat6" "strat7" "strat8"
## [37] "strat9" "strat10" "foretht" "perf"
## [41] "eval" "phases" "cluster.all2k5" "cluster.all2k3"
all$total.yes.gp[1:10]
## [1] 1 1 2 1 2 1 2 2 1 1
with(all, table(total.yes, total.yes.gp))
## total.yes.gp
## total.yes 1 2
## 0 0 47
## 1 26 0
## 2 7 0
## 3 12 0
## 4 2 0
aov.acp.lr = aov(Course.grade ~ total.yes, data=all)
summary(aov.acp.lr)
## Df Sum Sq Mean Sq F value Pr(>F)
## total.yes 1 7 7.10 0.09 0.765
## Residuals 92 7265 78.97
wilcox.test(Course.grade ~ total.yes.gp, data=all)
##
## Wilcoxon rank sum test with continuity correction
##
## data: Course.grade by total.yes.gp
## W = 962.5, p-value = 0.2846
## alternative hypothesis: true location shift is not equal to 0
wilcox.test(access.days ~ total.yes.gp, data=all)
##
## Wilcoxon rank sum test with continuity correction
##
## data: access.days by total.yes.gp
## W = 1660, p-value = 2.654e-05
## alternative hypothesis: true location shift is not equal to 0
Post-review: relating earliness of ML submission to the newly coded (post inter-rater reliability) organisation strategies (Strategies 2 and 5)
## StudentID ML1earliness ML2earliness ML3earliness ML4earliness Ass.mark
## 20 S8533573 72.95000 100.50000 139.66667 117.4333 86
## 21 S8565657 15.96667 75.10000 114.50000 17.7500 97
## 30 S8578997 68.48333 151.21667 166.26667 164.4333 83
## 40 S8596773 119.93333 44.08333 43.51667 153.7167 83
## 90 S8648397 52.30000 140.26667 144.03333 128.6500 84
## Ass.earliness Course.grade MLearliness Strat2 Strat5 Strat2.5
## 20 14.88111 hours 83.2 107.63750 1 0 Strat2
## 21 75.97944 hours 91.4 55.82917 2 0 Strat2
## 30 13.03417 hours 69.9 137.60000 1 1 both
## 40 146.38972 hours 96.4 90.31250 1 2 both
## 90 12.94556 hours 81.0 116.31250 1 1 both
## StudentID ML1earliness ML2earliness ML3earliness ML4earliness Ass.mark
## 1 S6089847 8.733333 163.900000 140.26667 115.900000 86
## 3 S8117889 19.866667 47.366667 162.98333 186.316667 75
## 4 S8118323 43.200000 66.950000 19.20000 186.733333 94
## 5 S8152093 117.333333 5.466667 0.50000 168.600000 82
## 6 S8239113 164.033333 66.500000 124.88333 166.466667 91
## 21 S8565657 15.966667 75.100000 114.50000 17.750000 97
## 22 S8575195 99.216667 119.050000 146.56667 113.750000 88
## 23 S8575571 4.366667 18.166667 29.71667 4.283333 86
## 24 S8576691 150.233333 151.266667 90.40000 150.550000 96
## Ass.earliness Course.grade MLearliness Strat2 Strat5 Strat2.5
## 1 39.17611111 hours 66.6 107.20000 0 0 none
## 3 2.40416667 hours 64.7 104.13333 0 0 none
## 4 37.24472222 hours 78.0 79.02083 0 1 Strat5
## 5 0.08972222 hours 80.7 72.97500 0 1 Strat5
## 6 1.70527778 hours 68.4 130.47083 0 0 none
## 21 75.97944444 hours 91.4 55.82917 2 0 Strat2
## 22 2.81888889 hours 90.7 119.64583 0 2 Strat5
## 23 1.51083333 hours 68.4 14.13333 0 0 none
## 24 13.23583333 hours 89.3 135.61250 0 0 none
##
## both none Strat2 Strat5
## 3 63 2 24
##
## Not Organisers
## 63 29
##
## Welch Two Sample t-test
##
## data: Course.grade by org
## t = -1.1785, df = 61.502, p-value = 0.2431
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -5.943255 1.535155
## sample estimates:
## mean in group Not mean in group Organisers
## 77.82698 80.03103
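The code behind this comparison is not echoed above; a plausible reconstruction (a sketch only; dat stands in for the merged data frame printed above, whose real name is not shown, and org collapses Strat2.5 into organisers vs not):
dat$org = ifelse(dat$Strat2.5 == "none", "Not", "Organisers")   # 63 vs 29, matching the table above
table(dat$org)
t.test(Course.grade ~ org, data = dat)   # Welch test, as in the output above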
all strategies recoded
all.strat = read.csv("All new strategies coded.csv", header=T, stringsAsFactors=F)
cor(all.strat[,2:15])
## course.mark X1 X2 X3
## course.mark 1.000000000 0.03275715 0.19154943 -0.2018816967
## X1 0.032757148 1.00000000 0.31999507 0.1255206442
## X2 0.191549431 0.31999507 1.00000000 -0.0370540279
## X3 -0.201881697 0.12552064 -0.03705403 1.0000000000
## X4 -0.006170415 -0.08858909 -0.02444770 -0.0836891433
## X5 0.051955886 0.10443246 0.13615500 0.1116975269
## X6 0.064109286 -0.12306464 -0.09031165 -0.3442794733
## X7 0.101618505 -0.08781616 0.10169791 -0.1637664344
## X8 -0.061660518 -0.10063949 -0.08178829 -0.1161892905
## X9 0.091629805 0.08076773 0.03750570 -0.0733046862
## X10 0.121862666 -0.05201565 -0.03817196 -0.1948123689
## Forethought 0.152014462 0.66487864 0.88228155 0.0003859795
## Perf -0.074095021 -0.09719023 0.01295356 0.1282175287
## Self.reflect 0.127133507 0.01807959 -0.06563776 -0.1832439851
## X4 X5 X6 X7
## course.mark -0.006170415 0.051955886 0.06410929 0.101618505
## X1 -0.088589094 0.104432460 -0.12306464 -0.087816158
## X2 -0.024447700 0.136155003 -0.09031165 0.101697908
## X3 -0.083689143 0.111697527 -0.34427947 -0.163766434
## X4 1.000000000 0.046777419 0.09820317 0.244027344
## X5 0.046777419 1.000000000 -0.25727484 0.005412309
## X6 0.098203173 -0.257274838 1.00000000 0.130611091
## X7 0.244027344 0.005412309 0.13061109 1.000000000
## X8 0.013296562 -0.264383446 0.12908879 0.061323326
## X9 -0.137310715 0.058654311 -0.16509993 -0.100806052
## X10 -0.031530616 -0.183374195 0.30941262 0.017227186
## Forethought -0.024447700 0.192837558 -0.13105690 -0.002141009
## Perf 0.603128803 0.221803167 0.23073862 0.505264464
## Self.reflect -0.077028115 -0.233662427 0.13825486 -0.111591227
## X8 X9 X10 Forethought Perf
## course.mark -0.06166052 0.09162980 0.12186267 0.1520144620 -0.07409502
## X1 -0.10063949 0.08076773 -0.05201565 0.6648786449 -0.09719023
## X2 -0.08178829 0.03750570 -0.03817196 0.8822815534 0.01295356
## X3 -0.11618929 -0.07330469 -0.19481237 0.0003859795 0.12821753
## X4 0.01329656 -0.13731072 -0.03153062 -0.0244477000 0.60312880
## X5 -0.26438345 0.05865431 -0.18337419 0.1928375576 0.22180317
## X6 0.12908879 -0.16509993 0.30941262 -0.1310569034 0.23073862
## X7 0.06132333 -0.10080605 0.01722719 -0.0021410086 0.50526446
## X8 1.00000000 -0.11377992 0.23252609 -0.1346780586 0.30022220
## X9 -0.11377992 1.00000000 0.03939345 0.0996946457 -0.17733845
## X10 0.23252609 0.03939345 1.00000000 -0.0381719626 -0.01455959
## Forethought -0.13467806 0.09969465 -0.03817196 1.0000000000 -0.04833890
## Perf 0.30022220 -0.17733845 -0.01455959 -0.0483388991 1.00000000
## Self.reflect 0.10132603 0.52102121 0.68736266 -0.0112201297 -0.18548301
## Self.reflect
## course.mark 0.12713351
## X1 0.01807959
## X2 -0.06563776
## X3 -0.18324399
## X4 -0.07702811
## X5 -0.23366243
## X6 0.13825486
## X7 -0.11159123
## X8 0.10132603
## X9 0.52102121
## X10 0.68736266
## Forethought -0.01122013
## Perf -0.18548301
## Self.reflect 1.00000000
with(all.strat, cor.test(course.mark, X3))
##
## Pearson's product-moment correlation
##
## data: course.mark and X3
## t = -2.0091, df = 95, p-value = 0.04737
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.385793314 -0.002538571
## sample estimates:
## cor
## -0.2018817
sapply(all.strat[3:12], sum)
## X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
## 1 7 83 47 36 100 56 63 108 20
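Only X3 is singled out for a cor.test above; to screen every strategy against course.mark in one pass (with a multiple-testing correction, since ten correlations are being inspected), a sketch:
strat.cols = paste0("X", 1:10)
res = sapply(strat.cols, function(v) {
  ct = cor.test(all.strat$course.mark, all.strat[[v]])
  c(r = unname(ct$estimate), p = ct$p.value)
})
round(rbind(res, p.holm = p.adjust(res["p", ], method = "holm")), 3)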
grouping based on ML submission earliness
names(all.e)
## [1] "StudentID" "ML1earliness" "ML2earliness" "ML3earliness"
## [5] "ML4earliness" "Ass.mark" "Ass.earliness" "Course.grade"
## [9] "MLearliness"
head(all.e)
## StudentID ML1earliness ML2earliness ML3earliness ML4earliness Ass.mark
## 1 S6089847 8.733333 163.900000 140.2667 115.9000 86
## 2 S6089847 139.866667 163.900000 140.2667 115.9000 86
## 3 S8117889 19.866667 47.366667 162.9833 186.3167 75
## 4 S8118323 43.200000 66.950000 19.2000 186.7333 94
## 5 S8152093 117.333333 5.466667 0.5000 168.6000 82
## 6 S8239113 164.033333 66.500000 124.8833 166.4667 91
## Ass.earliness Course.grade MLearliness
## 1 39.17611111 hours 66.6 107.20000
## 2 39.17611111 hours 66.6 139.98333
## 3 2.40416667 hours 64.7 104.13333
## 4 37.24472222 hours 78.0 79.02083
## 5 0.08972222 hours 80.7 72.97500
## 6 1.70527778 hours 68.4 130.47083
write.csv(all.e, file = "ReviewData.csv")
MLearliness.cor = cor(all.e[,2:5])
write.csv(MLearliness.cor, file = "MLearlinessCorrelations.csv")
require(reshape2)
## Loading required package: reshape2
all.melt = melt(all.e[,1:5])
## Using StudentID as id variables
all.melt[seq(1, 376, 10),]
## StudentID variable value
## 1 S6089847 ML1earliness 8.733333
## 11 S8465063 ML1earliness 102.500000
## 21 S8565657 ML1earliness 15.966667
## 31 S8582165 ML1earliness 149.850000
## 41 S8599407 ML1earliness 43.200000
## 51 S8635621 ML1earliness 144.550000
## 61 S8639321 ML1earliness 168.566667
## 71 S8643279 ML1earliness 28.666667
## 81 S8646161 ML1earliness 147.750000
## 91 S8650953 ML1earliness 167.933333
## 101 S8283571 ML2earliness 51.433333
## 111 S8526677 ML2earliness 164.583333
## 121 S8577829 ML2earliness 125.283333
## 131 S8587857 ML2earliness 28.616667
## 141 S8634295 ML2earliness 143.050000
## 151 S8638219 ML2earliness 124.966667
## 161 S8641369 ML2earliness 137.550000
## 171 S8644267 ML2earliness 115.500000
## 181 S8647667 ML2earliness 162.166667
## 191 S8117889 ML3earliness 162.983333
## 201 S8471971 ML3earliness 4.233333
## 211 S8575571 ML3earliness 29.716667
## 221 S8582465 ML3earliness 171.900000
## 231 S8603333 ML3earliness 70.000000
## 241 S8635995 ML3earliness 51.383333
## 251 S8639711 ML3earliness 19.466667
## 261 S8643377 ML3earliness 128.600000
## 271 S8646489 ML3earliness 97.666667
## 281 S8651655 ML3earliness 161.516667
## 291 S8407099 ML4earliness 190.900000
## 301 S8533121 ML4earliness 1.483333
## 311 S8578719 ML4earliness 141.483333
## 321 S8593947 ML4earliness 164.550000
## 331 S8635555 ML4earliness 162.850000
## 341 S8638579 ML4earliness 13.683333
## 351 S8641669 ML4earliness 22.400000
## 361 S8645607 ML4earliness 195.666667
## 371 S8648209 ML4earliness 160.366667
str(all.melt)
## 'data.frame': 376 obs. of 3 variables:
## $ StudentID: chr "S6089847" "S6089847" "S8117889" "S8118323" ...
## $ variable : Factor w/ 4 levels "ML1earliness",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ value : num 8.73 139.87 19.87 43.2 117.33 ...
library(lattice)
histogram(~ value | variable, data = all.melt)
library(ggplot2)
ggplot(all.melt, aes(value, fill = variable)) + geom_histogram(binwidth = 24) + facet_grid(variable ~ ., margins = TRUE, scales = "free") + xlim(0, 225)