This report presents a data analysis of the General Education course taught during the summer term (2016-3) of the academic year 2016-2017. This course is administered to non-science majors, most of whom are freshman and sophomore students, in a web-enhanced format through Blackboard Learn™ at Miami Dade College. This data analysis, which was performed just after exam 1 was administered (the first of three exams in the course), addresses the following questions:
#---datasets-----------------------------
# load packages
library(tidyverse) # packages for data manipulation
library(psych) # package for calculating correlation
library(pastecs) # package for descriptive statistics
library(readxl) # package for loading excel files
library(lubridate) # package to manipulate date and time data
# load datasets
## set working directory
setwd("C:/Users/Felix/Dropbox/Teaching/1_Miami-Dade College/Statistical assessments")
## Dataset from Blackboard
bsc1005 <- read_csv(paste0("C:/Users/Felix/Dropbox/Teaching/1_Miami-Dade College/Statistical assessments/",
"gc_BSC1005-2175-1501_fullgc_2017-06-01-16-29-09.csv"))
## Reef polling Dataset
reef1 <- read_excel(paste0("C:/Users/Felix/Dropbox/Teaching/1_Miami-Dade College/7_2_Reef_Polling/",
"BSC1005_2016-3_1501_Session 2 - 05-11-17.xlsx"),
range = cell_rows(8:48))
reef2 <- read_excel(paste0("C:/Users/Felix/Dropbox/Teaching/1_Miami-Dade College/7_2_Reef_Polling/",
"BSC1005_2016-3_1501_Session 3 - 05-16-17.xlsx"),
range = cell_rows(8:44))
reef3 <- read_excel(paste0("C:/Users/Felix/Dropbox/Teaching/1_Miami-Dade College/7_2_Reef_Polling/",
"BSC1005_2016-3_1501_Session 4 - 05-18-17.xlsx"),
range = cell_rows(8:47))
reef4 <- read_excel(paste0("C:/Users/Felix/Dropbox/Teaching/1_Miami-Dade College/7_2_Reef_Polling/",
"BSC1005_2016-3_1501_Session 5 - 05-23-17.xlsx"),
range = cell_rows(8:43))
reef5 <- read_excel(paste0("C:/Users/Felix/Dropbox/Teaching/1_Miami-Dade College/7_2_Reef_Polling/",
"BSC1005_2016-3_1501_Session 6 - 05-25-17.xlsx"),
range = cell_rows(8:31))
reef6 <- read_excel(paste0("C:/Users/Felix/Dropbox/Teaching/1_Miami-Dade College/7_2_Reef_Polling/",
"BSC1005_2016-3_1501_Session 7 - 05-30-17.xlsx"),
range = cell_rows(8:39))
bsc_inclass <- read_excel(paste0("C:/Users/Felix/Dropbox/Teaching/1_Miami-Dade College/Statistical assessments/",
"BSC1005_2016-3_1501_InClassExercises.xlsx"))
reefinf <- read_excel(paste0("C:/Users/Felix/Dropbox/Teaching/1_Miami-Dade College/Statistical assessments/",
"BSC1005_2016-3_1501_reefpolling.xlsx"))
ra1_attempt <- read_tsv(paste0("C:/Users/Felix/Dropbox/Teaching/1_Miami-Dade College/Statistical assessments/",
"BSC1005_2016-3_1501_ra1_attempt.txt"))
ra2_attempt <- read_tsv(paste0("C:/Users/Felix/Dropbox/Teaching/1_Miami-Dade College/Statistical assessments/",
"BSC1005_2016-3_1501_ra2_attempt.txt"))
ra1 <- read_csv(paste0("C:/Users/Felix/Dropbox/Teaching/1_Miami-Dade College/Statistical assessments/",
"BSC1005_2016-3_1501_ra1_results.csv"))
ra2 <- read_csv(paste0("C:/Users/Felix/Dropbox/Teaching/1_Miami-Dade College/Statistical assessments/",
"BSC1005_2016-3_1501_ra2_results.csv"))
qz1 <- read_excel(paste0("C:/Users/Felix/Dropbox/Teaching/1_Miami-Dade College/Statistical assessments/",
"BSC1005_2016-3_1501_quizzes.xlsx"), sheet=1)
qz2 <- read_excel(paste0("C:/Users/Felix/Dropbox/Teaching/1_Miami-Dade College/Statistical assessments/",
"BSC1005_2016-3_1501_quizzes.xlsx"), sheet=2)
qz3 <- read_excel(paste0("C:/Users/Felix/Dropbox/Teaching/1_Miami-Dade College/Statistical assessments/",
"BSC1005_2016-3_1501_quizzes.xlsx"), sheet=3)
ex1 <- read_excel(paste0("C:/Users/Felix/Dropbox/Teaching/1_Miami-Dade College/Statistical assessments/",
"BSC1005_2016-3_1501_exams.xlsx"), sheet=1)
How do exam 1 scores compare to scores in formative assessments (quizzes, polling questions, and reading assignments)?
To address this first question, I took advantage of one of the many utilities of Blackboard Learn™: downloading grade files locally (click here for more information on this utility). After downloading and cleaning the current-grades dataset, the assessments were first compared through their distributions of scores, followed by descriptive statistics (mean, median, and standard deviation).
# take a look at the grades variable downloaded from Blackboard learn
glimpse(bsc1005)
names(bsc1005) <- make.names(names(bsc1005)) # remove spaces in variables' names
bsc1005_1 <- bsc1005[c(1, 9, 11,12, 14:23, 29)] # select variables of interest
### rename selected variables
names(bsc1005_1) <- c("lastname", "pretest", "ra", "attdnc", "ra1", "ra2",
"ra3", "rf1", "rf2", "rf3", "rf4", "qz1", "qz2", "qz3", "exam1")
glimpse(bsc1005_1) # take a look at the dataset
### create variables as percentage score
bsc1005_2 <- bsc1005_1 %>% select(pretest:exam1) %>%
mutate (pretest1=(pretest/10)*100, ra1_1=(ra1/10)*100, ra2_1=(ra2/10)*100, rf1_1=(rf1/3.5)*100,
rf2_1=(rf2/9.5)*100, rf3_1 = (rf3/9)*100, rf4_1 = (rf4/5.5)*100, qz1_1=(qz1/10)*100, qz2_1=(qz2/10)*100,
qz3_1=(qz3/10)*100) %>%
gather(assmnt, score, exam1:qz3_1)
### create categories based on assessment
bsc1005_2 <- mutate(bsc1005_2, asmnt_tp = ifelse(assmnt %in% c("ra1_1", "ra2_1", "ra3_1"), "RA",
ifelse(assmnt %in% c("qz1_1","qz2_1", "qz3_1"), "QZ",
ifelse(assmnt %in% c("rf1_1", "rf2_1", "rf3_1", "rf4_1"), "REEFP", "EXAM"))))
### select assmnt, score, and asmnt_tp
bsc1005_3 <- bsc1005_2[-c(1:13)]
### generate box plot to compare distribution of categories
boxplot(score ~ asmnt_tp, bsc1005_3, xlab="Assessment Category", ylab="Scores",
main="Comparison by Assessment Category", col=c("blue", "green", "orange", "red"),
las =1)
### generate descriptive statistics based on assessment categories
bsc1005_3 %>% group_by(asmnt_tp) %>% summarize(mean_asmnt = mean(score, na.rm=TRUE),
mdn_asmnt = median(score, na.rm=TRUE))
The boxplot above compares the distribution of exam 1 scores with those of all formative assessments (quizzes [QZ], reading assignments [RA], and polling questions [REEFP]) after exam 1 was administered and graded. In exam 1, 75% of the scores were below the medians of students' mean scores in quizzes, reading assignments, and polling questions, and 75% of the exam 1 scores were below 60. The 25th percentile for exam 1 is comparable to that of students' mean scores in quizzes and reef polling, but the 75th percentiles for quizzes, reading assignments, and polling questions were above 80. This suggests that all other assessments had higher scores than exam 1.
To further evaluate the assessment categories against exam 1, the mean, median, and standard deviation were calculated and compared, and the magnitude of the differences between exam 1 scores and the other assessments was determined.
### generate descriptive statistics based on assessment categories
bsc1005_3 %>% group_by(asmnt_tp) %>% summarize(mean_asmnt = mean(score, na.rm=TRUE),
mdn_asmnt = median(score, na.rm=TRUE),
std_asmnt = sd(score, na.rm = TRUE),
mean_mag = mean(score, na.rm=TRUE)/49.94444,
mdn_mag = median(score, na.rm=TRUE)/50,
sd_mag = sd(score)/17.60753)
## # A tibble: 4 x 7
## asmnt_tp mean_asmnt mdn_asmnt std_asmnt mean_mag mdn_mag sd_mag
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 EXAM 49.94444 50.00000 17.60753 1.000000 1.000000 NaN
## 2 QZ 60.30702 62.50000 31.90337 1.207482 1.250000 1.811916
## 3 RA 57.76316 70.00000 31.43665 1.156548 1.400000 1.785410
## 4 REEFP 61.69468 66.66667 30.72915 1.235266 1.333333 1.745228
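The EXAM baseline values in the chunk above (mean = 49.94, median = 50, SD = 17.61) are hard-coded. A minimal sketch of computing the same magnitudes programmatically, so the chunk would not break if the data changed:
### compute the EXAM baseline instead of hard-coding it
exam_base <- bsc1005_3 %>% filter(asmnt_tp == "EXAM") %>%
summarize(m = mean(score, na.rm=TRUE), mdn = median(score, na.rm=TRUE), s = sd(score, na.rm=TRUE))
bsc1005_3 %>% group_by(asmnt_tp) %>%
summarize(mean_mag = mean(score, na.rm=TRUE)/exam_base$m,
mdn_mag = median(score, na.rm=TRUE)/exam_base$mdn,
sd_mag = sd(score, na.rm=TRUE)/exam_base$s) # na.rm = TRUE also avoids the missing sd_mag in the EXAM row above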
Further adding to the finding in the boxplot, the median as well as the mean in exam 1 were, as expected, lower than in the other three assessment categories. Compared to exam 1, the means were 16 to 24% higher and the medians 25 to 40% higher for quizzes, reading assignments, and reef polling questions. The standard deviation, however, was lower in exam 1: the other assessments' standard deviations were 75 to 81% higher. Taken together, this further supports that exam scores were lower than students' mean quizzes, reading assignments, and reef polling scores, and that exam 1 had a narrower range of scores.
To determine whether the differences seen earlier in this report are statistically significant, an ANOVA was first employed. Any statistically significant result would then be further evaluated with a pairwise comparison using Holm's adjustment.
### anova of the assessments
bsc1005_3sub <- filter(bsc1005_3, score >20)
fit_asmnt_tp <-aov(score ~ asmnt_tp, bsc1005_3sub)
summary(fit_asmnt_tp)
## Df Sum Sq Mean Sq F value Pr(>F)
## asmnt_tp 3 18363 6121 13.65 1.89e-08 ***
## Residuals 356 159614 448
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
### pairwise comparison with Holm adjustment
pairwise.t.test(bsc1005_3sub$score, bsc1005_3sub$asmnt_tp, p.adj="holm")
##
## Pairwise comparisons using t tests with pooled SD
##
## data: bsc1005_3sub$score and bsc1005_3sub$asmnt_tp
##
## EXAM QZ RA
## QZ 3.8e-07 - -
## RA 3.8e-07 0.67 -
## REEFP 3.2e-06 0.67 0.40
##
## P value adjustment method: holm
After filtering out scores of 20 and below in the bsc1005_3 dataset, which were outliers from the reading assignments, an ANOVA was employed to identify inter-group differences. There were statistically significant inter-group differences (p = 1.89e-08), and the mean in exam 1 was statistically significantly lower than the means of all other assessment categories (p = 3.8e-07 to 3.2e-06). Therefore, the scores reported in exam 1 are statistically significantly lower than the students' mean scores reported in quizzes, reading assignments, and reef polling questions.
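One caveat not checked above is the ANOVA assumption of homogeneous variances across groups. A quick check, as a sketch, could use Levene's test from the car package (loaded later in this report):
library(car) # provides leveneTest
leveneTest(score ~ factor(asmnt_tp), data = bsc1005_3sub) # homogeneity of variance across assessment categories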
How many students had a higher score in exam 1 compared to their mean quizzes scores?
Given that exam 1 reported a statistically significant lower mean score, it is worthwhile to identify how many students had higher scores in exam 1 than their mean in quizzes.
# 2) How many students had a higher score in Exam 1 compared to their mean Quizzes scores?
bsc1005_4 <- bsc1005_1 %>% select(exam1, qz1, qz2, qz3) %>%
mutate (ex_v_qz = exam1 - (((qz1 + qz2 + qz3)/30)*100))
## how many students had higher exam than mean quizzes?
paste(sum(bsc1005_4$ex_v_qz > 0, na.rm=TRUE), "students had exam 1 scores higher than their mean in quizzes")
## [1] "17 students had exam 1 scores higher than their mean in quizzes"
## how many students had lower exam than mean quizzes?
paste(sum(bsc1005_4$ex_v_qz < 0, na.rm = TRUE), "students had exam 1 scores lower than their mean in quizzes")
## [1] "18 students had exam 1 scores lower than their mean in quizzes"
In total, exam 1 was administered to 35 students. From this cohort, 17 students had exam 1 scores higher than their mean in quizzes, while 18 students had lower exam 1 scores. This suggests that although the exam 1 mean was statistically significantly lower than that of quizzes, about half of the classroom performed better in exam 1 compared to the three quizzes.
Which topic, Bloom's taxonomy level, knowledge dimension, and question format had the lowest proportion of correct answers?
To evaluate possible explanations for the differences in scores between exam 1 and the formative assessments, the proportion of correct answers was evaluated based on each question's topic, Bloom's taxonomy level (remembering, understanding, applying, analyzing, evaluating, creating), knowledge dimension (factual, conceptual, procedural, meta-cognitive), and format (i.e., multiple choice, true/false, short answer, fill-in-the-blank).
Below, the proportion of correct answers is first evaluated for exam 1.
# 3) Which question, topic, bt, kd, and format had the lowest proportion of correct answers?
## tidy the exam1 dataset
ex1_long <- gather(ex1, student, answer, Akter:Zuluaga)
## Function to generate proportions by topic, bt, kd, format
## and logistic regression models
prop_correct_logit <- function(x){
require(knitr) # package to use the table function kable
require(pander) # package to use the table function for the logistic regression model
by_topic <- prop.table(table(x$answer, x$topic), 2)*100 # evaluate proportions by topic
by_bt <- prop.table(table(x$answer, x$bt), 2)*100 # evaluate proportions by Bloom's taxonomy
by_kd <- prop.table(table(x$answer, x$kd), 2)*100 # evaluate proportions by knowledge dimension
by_format <- prop.table(table(x$answer, x$format), 2)*100 # evaluate proportions by question format
glm_model <- glm(answer ~ topic + bt + kd + format, x, family = binomial()) # logistic regression model
print(kable(by_topic, caption = "Proportions by Topic")) # print table of proportions by topic
print(kable(by_bt, caption="Proportions by Bloom's Taxonomy")) # print table of proportions by Bloom's taxonomy
print(kable(by_kd, caption="Proportions by Knowledge Dimension")) # print table of proportions by knowledge dimension
print(kable(by_format, caption="Proportions by Question Format")) # print table of proportions by questions format
print(summary(glm_model)) # print out of the logistic regression model
}
## apply the function to the tidy exam 1 dataset
prop_correct_logit(ex1_long)
##
##
## Table: Proportions by Topic
##
## BioHierch BioMol ChemBio Metric Micrcpy Organelles ScieMeth
## --- ---------- --------- --------- --------- --------- ----------- ---------
## 0 44.86486 54.05405 34.68468 64.86486 68.91892 67.56757 36.48649
## 1 55.13514 45.94595 65.31532 35.13514 31.08108 32.43243 63.51351
##
##
## Table: Proportions by Bloom's Taxonomy
##
## ANLZ APPL RMB UND
## --- --------- --------- --------- ---------
## 0 50.45045 52.36486 42.22973 51.89189
## 1 49.54955 47.63514 57.77027 48.10811
##
##
## Table: Proportions by Knowledge Dimension
##
## CNCP FACT PROC
## --- --------- -------- ---------
## 0 47.66585 48.1982 54.05405
## 1 52.33415 51.8018 45.94595
##
##
## Table: Proportions by Question Format
##
## MC SA T/F
## --- --------- --------- ---------
## 0 48.34835 56.75676 25.67568
## 1 51.65165 43.24324 74.32432
##
## Call:
## glm(formula = answer ~ topic + bt + kd + format, family = binomial(),
## data = x)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.8455 -1.0984 0.6341 1.0891 1.7390
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.03646 0.18248 0.200 0.84164
## topicBioMol -0.78034 0.24605 -3.171 0.00152 **
## topicChemBio 0.63733 0.23160 2.752 0.00593 **
## topicMetric -1.07959 0.42093 -2.565 0.01032 *
## topicMicrcpy -0.43705 0.31271 -1.398 0.16223
## topicOrganelles -1.62141 0.29880 -5.426 5.75e-08 ***
## topicScieMeth 0.24159 0.24867 0.972 0.33128
## btAPPL 0.79746 0.28363 2.812 0.00493 **
## btRMB 2.07221 0.66501 3.116 0.00183 **
## btUND 1.10517 0.63698 1.735 0.08274 .
## kdFACT -1.11705 0.60705 -1.840 0.06575 .
## kdPROC 0.03067 0.36640 0.084 0.93329
## formatSA -0.85057 0.18991 -4.479 7.51e-06 ***
## formatT/F -0.89497 0.45099 -1.984 0.04720 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1384.4 on 998 degrees of freedom
## Residual deviance: 1279.8 on 985 degrees of freedom
## (27 observations deleted due to missingness)
## AIC: 1307.8
##
## Number of Fisher Scoring iterations: 4
In table 1 above, in which the correct answers in exam 1 were evaluated by topic, biological hierarchy (55%, BioHierch), chemical biology (65%, ChemBio), and the scientific method (63%, ScieMeth) were the topics in which more than 50% of answers were correct among all students. On the contrary, biological molecules (46%, BioMol), the metric system (35%, Metric), and organelles (32%, Organelles) were below 50%.
Table 2 above describes the proportion of correct answers based on the Bloom's taxonomy level of each question. With the exception of remembering (RMB), all other levels (analyzing [ANLZ], applying [APPL], and understanding [UND]) had less than a 50% proportion of correct answers.
With regards to knowledge dimension, which includes factual, conceptual, procedural, and meta-cognitive, both conceptual (52.3%, CNCP) and factual (51.8%, FACT) had above a 50% proportion of correct answers, while procedural (45.9%, PROC) had less than 50%.
Lastly, for question format, among multiple choice, short answer, and true/false, short-answer questions had the lowest proportion of correct answers (43.2%) while true/false had the highest (74.3%).
To evaluate the influence of topic, Bloom's taxonomy, knowledge dimension, and question format on the odds of answering correctly, a logistic regression model was generated with these four variables. From the model summary above, the topics BioMol, ChemBio, Metric, and Organelles, the Bloom's taxonomy levels APPL and RMB, and the question formats SA and T/F had statistically significant estimates.
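Because the estimates above are on the log-odds scale, exponentiating them yields odds ratios that are easier to interpret. A minimal sketch, refitting the same model outside the function (glm_ex1 is a hypothetical name):
glm_ex1 <- glm(answer ~ topic + bt + kd + format, ex1_long, family = binomial())
exp(cbind(OR = coef(glm_ex1), confint(glm_ex1))) # odds ratios with profiled 95% CIs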
Is there a relationship between formative assessments (quizzes, reading assignments, reef polling) and exam 1?
To evaluate whether students' performance in the formative assessments had any relationship with exam 1 scores, the Pearson correlation coefficient was determined between these variables.
# 4) Is there a relationship between formative (quizzes, ra, reef) and exam 1?
## generate columns of quizzes, ra, rf averages and attendance categories
bsc1005_5 <- bsc1005_1 %>% select(-lastname) %>%
mutate (rf_avg = ((rf1 + rf2 + rf3 + rf4)/27.5)*100,
qz_avg = ((qz1 + qz2 + qz3)/30)*100,
ra_avg = ((ra1 + ra2 + ra3)/29)*100,
att_cat = ifelse(attdnc == 20, "none",
ifelse(attdnc == 19.5, "one", "too_many")))
## generate scatterplot to explore possible relationships
library(car)
scatterplotMatrix(bsc1005_5[c(14:17)])
## evaluate correlation between variables
stat.desc(bsc1005_5[c(14:17)], basic=FALSE, norm=TRUE) ## normality and descriptive stat
## exam1 rf_avg qz_avg ra_avg
## median 56.0000000 60.0000000 60.8333333 55.17241379
## mean 58.5405405 60.0956938 60.3070175 54.80943739
## SE.mean 2.7417894 2.2230546 3.7203362 3.93999686
## CI.mean.0.95 5.5606067 4.5043365 7.5381171 7.98319194
## var 278.1441441 187.7949285 525.9542437 589.89585994
## std.dev 16.6776540 13.7038290 22.9336923 24.28777182
## coef.var 0.2848907 0.2280335 0.3802823 0.44313120
## skewness -0.1691228 0.1353452 -0.1189107 -0.65206663
## skew.2SE -0.2181727 0.1767747 -0.1553096 -0.85166574
## kurtosis -0.5066755 -0.3225522 -0.4760786 -0.20672308
## kurt.2SE -0.3339020 -0.2151208 -0.3175126 -0.13787047
## normtest.W 0.9687559 0.9767697 0.9878712 0.94516029
## normtest.p 0.3756522 0.6031681 0.9479124 0.06172383
corr.test(bsc1005_5[c(14:17)], method="pearson")
## Call:corr.test(x = bsc1005_5[c(14:17)], method = "pearson")
## Correlation matrix
## exam1 rf_avg qz_avg ra_avg
## exam1 1.00 0.45 0.53 0.42
## rf_avg 0.45 1.00 0.41 0.16
## qz_avg 0.53 0.41 1.00 0.53
## ra_avg 0.42 0.16 0.53 1.00
## Sample Size
## exam1 rf_avg qz_avg ra_avg
## exam1 37 37 37 37
## rf_avg 37 38 38 38
## qz_avg 37 38 38 38
## ra_avg 37 38 38 38
## Probability values (Entries above the diagonal are adjusted for multiple tests.)
## exam1 rf_avg qz_avg ra_avg
## exam1 0.00 0.02 0.00 0.03
## rf_avg 0.01 0.00 0.03 0.35
## qz_avg 0.00 0.01 0.00 0.00
## ra_avg 0.01 0.35 0.00 0.00
##
## To see confidence intervals of the correlations, print with the short=FALSE option
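As the last line of the output indicates, the confidence intervals of the correlations can be displayed by printing with short = FALSE:
print(corr.test(bsc1005_5[c(14:17)], method="pearson"), short=FALSE) # show CIs of the correlations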
From the scatterplot matrix above, there seems to be an increasing pattern between each student's mean reef polling (rf_avg), quizzes (qz_avg), and reading assignment (ra_avg) scores and exam 1 scores. After determining with a Shapiro-Wilk test that all four variables were approximately normally distributed (ra_avg was borderline, p = 0.06), the Pearson coefficient was determined. All three assessments were positively correlated with exam 1, and all three correlations were statistically significant.
If there was a relationship, which formative assessment had the most predictive power for exam 1 grades?
To evaluate which assessment had the most influence on the outcome of exam 1, three different linear regression models were evaluated: 1) with all three assessments, 2) with quizzes and reef polling only, and 3) with all three assessments plus attendance.
## model 1
fit1 <- lm(exam1 ~ rf_avg + qz_avg + ra_avg, bsc1005_5)
summary(fit1)
##
## Call:
## lm(formula = exam1 ~ rf_avg + qz_avg + ra_avg, data = bsc1005_5)
##
## Residuals:
## Min 1Q Median 3Q Max
## -27.706 -7.804 -0.369 8.177 29.380
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 16.5751 10.7693 1.539 0.1333
## rf_avg 0.3505 0.1809 1.938 0.0613 .
## qz_avg 0.2152 0.1263 1.703 0.0979 .
## ra_avg 0.1438 0.1102 1.306 0.2007
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 13.73 on 33 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.3787, Adjusted R-squared: 0.3223
## F-statistic: 6.706 on 3 and 33 DF, p-value: 0.00117
## model 2 (without reading assignments)
fit2 <- lm(exam1 ~ rf_avg + qz_avg, bsc1005_5)
summary(fit2)
##
## Call:
## lm(formula = exam1 ~ rf_avg + qz_avg, data = bsc1005_5)
##
## Residuals:
## Min 1Q Median 3Q Max
## -35.417 -7.919 0.929 11.007 26.759
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 20.3668 10.4771 1.944 0.06022 .
## rf_avg 0.3323 0.1822 1.823 0.07704 .
## qz_avg 0.3009 0.1090 2.760 0.00924 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 13.87 on 34 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.3467, Adjusted R-squared: 0.3082
## F-statistic: 9.02 on 2 and 34 DF, p-value: 0.0007201
## model 3 (including attendance)
fit3 <- lm(exam1 ~ rf_avg + qz_avg + ra_avg + attdnc, bsc1005_5)
summary(fit3)
##
## Call:
## lm(formula = exam1 ~ rf_avg + qz_avg + ra_avg + attdnc, data = bsc1005_5)
##
## Residuals:
## Min 1Q Median 3Q Max
## -28.7093 -7.6135 0.8462 6.8679 30.5751
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 93.2228 153.6000 0.607 0.5482
## rf_avg 0.3677 0.1862 1.975 0.0569 .
## qz_avg 0.2444 0.1405 1.739 0.0916 .
## ra_avg 0.1381 0.1120 1.233 0.2265
## attdnc -4.0001 7.9959 -0.500 0.6203
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 13.89 on 32 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.3836, Adjusted R-squared: 0.3065
## F-statistic: 4.978 on 4 and 32 DF, p-value: 0.003102
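Before weighing the predictors, the three models can also be compared directly. A sketch using a nested F test and AIC (all three models were fit on the same 37 complete cases):
anova(fit2, fit1) # does adding ra_avg significantly improve model 2?
AIC(fit1, fit2, fit3) # lower AIC indicates a better fit-complexity trade-off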
## function to determine relative weights of predictors in regression models
relweights_graph <- function(fit, ...){
  R <- cor(fit$model)                  # correlation matrix of response and predictors
  nvar <- ncol(R)
  rxx <- R[2:nvar, 2:nvar]             # predictor intercorrelations
  rxy <- R[2:nvar, 1]                  # predictor-response correlations
  svd <- eigen(rxx)
  evec <- svd$vectors
  ev <- svd$values
  delta <- diag(sqrt(ev))
  lambda <- evec %*% delta %*% t(evec) # correlations between predictors and their orthogonal counterparts
  lambdasq <- lambda^2
  beta <- solve(lambda) %*% rxy
  rsquare <- colSums(beta^2)
  rawwgt <- lambdasq %*% beta^2
  import <- (rawwgt / rsquare) * 100   # relative weights as percentage of R-square
  import <- as.data.frame(import)
  row.names(import) <- names(fit$model[2:nvar])
  names(import) <- "Weights"
  import <- import[order(import), 1, drop = FALSE]
  dotchart(import$Weights, labels = row.names(import), xlab = "% of R-Square",
           pch = 19,
           sub = paste("Total R-Square=", round(rsquare, digits = 3)), ...)
}
par(mfrow=c(3,1))
relweights_graph(fit1, main="RW of Formative Assessments: Model 1", col="blue")
relweights_graph(fit2, main="RW of Formative Assessments: Model 2", col="blue")
relweights_graph(fit3, main="RW of Formative Assessments: Model 3", col="blue")
par(mfrow=c(1,1))
The model that explained the highest percentage of variability in exam scores was the first model, which included all three formative assessments and excluded attendance. Nevertheless, in all three models, students' mean in quizzes had the highest relative weight, followed by reef polling means, reading assignment means, and lastly attendance. These findings provide evidence that quizzes have the most predictive power with regards to exam scores.
Is there a relationship between attendance and exam 1?
Attendance is an important part of course success. To evaluate whether attendance played a role in exam 1 scores, three categories were first generated for attendance: no absence, one absence, and too many (two or more absences). A boxplot of exam scores against attendance category was generated, and any differences were further evaluated for statistical significance.
# 6) Is there a relationship between attendance and exam 1?
## generate boxplot between attendance and exams
boxplot(exam1 ~ att_cat, bsc1005_5, col=c("green", "orange", "red"),
main="Relationship between Attendance and Exam",
xlab="Attendance Categories",
ylab="Grade in Exam 1 (%)",
las=1)
## mean and median by attendance category
bsc1005_5 %>% select (exam1, att_cat) %>%
group_by(att_cat) %>%
summarize(mean_ex = mean(exam1, na.rm=TRUE),
medn_ex = median(exam1, na.rm=TRUE))
## # A tibble: 3 x 3
## att_cat mean_ex medn_ex
## <chr> <dbl> <dbl>
## 1 none 61.60 60
## 2 one 51.75 51
## 3 too_many 53.00 55
## anova with attendance and exam1
fit_att <- aov(exam1 ~ att_cat, bsc1005_5)
summary(fit_att)
## Df Sum Sq Mean Sq F value Pr(>F)
## att_cat 2 726 362.8 1.328 0.278
## Residuals 34 9288 273.2
## 1 observation deleted due to missingness
In the boxplot above, although the medians in all three categories fell between 50 and 60, the 75th quartile was higher when there was no absence (~80), followed by one absence (~64), and lastly too many absences (below 60). Nevertheless, there was no statistically significant inter-group difference between the three categories. This suggests that attendance did not have a strong influence on students' performance in exam 1.
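Given the small group sizes, a nonparametric alternative could corroborate the ANOVA result above; a sketch:
kruskal.test(exam1 ~ att_cat, data = bsc1005_5) # rank-based comparison of exam 1 scores across attendance categories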
Is there a relationship between attendance and quizzes?
Given that quizzes were the variable with the highest predictive value in the linear regression of exam 1 scores, it is worthwhile to evaluate whether there is a relationship between attendance and quiz scores.
# 7) Is there a relationship between attendance and quizzes?
## boxplot between attendance and quizzes
boxplot(qz_avg ~ att_cat, bsc1005_5, col=c("green", "orange", "red"),
main="Relationship between Attendance and Quizzes",
xlab="Attendance Categories",
ylab="Mean Quizzes Grades (%)",
las=1)
## mean and median by attendance category
bsc1005_5 %>% select (att_cat, qz_avg) %>%
group_by(att_cat) %>%
summarize(mean_ex = mean(qz_avg, na.rm=TRUE),
medn_ex = median(qz_avg, na.rm=TRUE))
## # A tibble: 3 x 3
## att_cat mean_ex medn_ex
## <chr> <dbl> <dbl>
## 1 none 68.06667 68.33333
## 2 one 50.18519 53.33333
## 3 too_many 34.58333 31.66667
## anova of quizzes and attendance categories
fit_att <- aov(qz_avg ~ att_cat, bsc1005_5)
summary(fit_att)
## Df Sum Sq Mean Sq F value Pr(>F)
## att_cat 2 5074 2537 6.173 0.00506 **
## Residuals 35 14386 411
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## pairwise t test with holm adjustment
pairwise.t.test(bsc1005_5$qz_avg, bsc1005_5$att_cat, p.adj="holm")
##
## Pairwise comparisons using t tests with pooled SD
##
## data: bsc1005_5$qz_avg and bsc1005_5$att_cat
##
## none one
## one 0.059 -
## too_many 0.012 0.209
##
## P value adjustment method: holm
From the boxplot above, it is evident that there are differences in the medians and in the 25th and 75th quartiles, with a higher-to-lower pattern from no absences to too many absences. The median for no absences is higher than the 75th quartile of both the one-absence and too_many categories, and the same difference was observed when comparing the one-absence to the too_many category.
After performing an ANOVA, there was evidence of statistically significant differences between the means of the three attendance categories. From a pairwise t-test with Holm's adjustment, the statistical difference was between no absences and too many absences. This suggests that attendance does have a relationship with how students perform in the quizzes.
How do the proportion of correct answers, as well as topics, Bloom's taxonomy, knowledge dimension, and question format, differ between assessments?
Finally, to explore possible explanations for the differences between exam 1 scores and students' means in quizzes, reading assignments, and reef polling questions, the proportions of topics, Bloom's taxonomy levels, knowledge dimensions, and question formats were compared across assessments.
## reef polling datasets
### evaluate reef polling datasets
glimpse(reef1)
glimpse(reef2)
glimpse(reef3)
glimpse(reef4)
glimpse(reef5)
glimpse(reef6)
### rename columns
names(reef1) <- c("lastname", "totalpt", "perfm", "part", paste0(rep("q", 5), seq(1,5, by=1)))
names(reef2) <- c("lastname", "totalpt", "perfm", "part", paste0(rep("q", 7), seq(1,7, by=1)))
names(reef3) <- c("lastname", "totalpt", "perfm", "part", paste0(rep("q", 4), seq(1,4, by=1)))
names(reef4) <- c("lastname", "totalpt", "perfm", "part", paste0(rep("q", 10), seq(1,10, by=1)))
names(reef5) <- c("lastname", "totalpt", "perfm", "part", paste0(rep("q", 3), seq(1,3, by=1)))
names(reef6) <- c("lastname", "totalpt", "perfm", "part", paste0(rep("q", 7), seq(1, 7, by=1)))
### function to convert variable class
convert.magic <- function(obj, type){
FUN1 <- switch(type,
character = as.character,
numeric = as.numeric,
factor = as.factor)
out <- lapply(obj, FUN1)
as.data.frame(out)
}
### convert character variables to numeric in all reef polling datasets
reef1[, paste0(rep("q", 5),seq(1,5,1))] <- convert.magic(reef1[, paste0(rep("q", 5),seq(1,5, 1))], "numeric")
reef2[, paste0(rep("q", 7),seq(1,7,1))] <- convert.magic(reef2[, paste0(rep("q", 7),seq(1,7, 1))], "numeric")
reef3[, paste0(rep("q", 4),seq(1,4,1))] <- convert.magic(reef3[, paste0(rep("q", 4),seq(1,4, 1))], "numeric")
reef4[, paste0(rep("q", 10),seq(1,10,1))] <- convert.magic(reef4[, paste0(rep("q", 10),seq(1,10, 1))], "numeric")
reef5[, paste0(rep("q", 3),seq(1,3,1))] <- convert.magic(reef5[, paste0(rep("q", 3),seq(1,3, 1))], "numeric")
reef6[, paste0(rep("q", 7),seq(1,7,1))] <- convert.magic(reef6[, paste0(rep("q", 7),seq(1,7, 1))], "numeric")
### add reef polling session to reef datasets
reef1$session <- rep("rf1", 40)
reef2$session <- rep("rf2", 36)
reef3$session <- rep("rf3", 39)
reef4$session <- rep("rf4", 35)
reef5$session <- rep("rf5", 23)
reef6$session <- rep("rf6", 31)
### tidy all reef polling datasets
stdnames <- strsplit((reef1$lastname), " ") # extract names and separate last names and first names
lastname <- sapply(stdnames, "[", 2) # vector of last names
firstname <- sapply(stdnames, "[", 1) # vector of first names
reef1_1 <- data.frame(lastname, reef1[-1])
reef1_long <- gather(reef1_1[-c(2:6)], qstn, answer, q3:q5) %>% arrange (lastname)
reef1_long <- mutate (reef1_long, topic = rep(c("BioHierch", "SciMeth", "BioHierch"), 40),
bt=rep(c("UND", "ANLZ", "ANLZ"), 40),
kd=rep("CNCP", 120),
format=rep("MC", 120))
stdnames2 <- strsplit((reef2$lastname), " ") # extract names and separate last names and first names
lastname2 <- sapply(stdnames2, "[", 2) # vector of last names
firstname2 <- sapply(stdnames2, "[", 1) # vector of first names
reef2_1 <- data.frame(lastname2, reef2[-1])
reef2_long <- gather(reef2_1[-c(2:6)], qstn, answer, q3:q7) %>% arrange (lastname2)
reef2_long <- mutate(reef2_long, topic=rep(c("BioHierch", "BioHierch", "SciMeth", "BioHierch", "ChemBio"), 36),
bt=rep(c("UND", "ANLZ", "ANLZ", "APPL", "APPL"), 36),
kd=rep(c("FACT", "CNCP", "CNCP", "PROC", "PROC"), 36),
format=rep(c("MC", "MC", "MC", "SA", "SA"), 36))
stdnames3 <- strsplit((reef3$lastname), " ") # extract names and separate last names and first names
lastname3 <- sapply(stdnames3, "[", 2) # vector of last names
firstname3 <- sapply(stdnames3, "[", 1) # vector of first names
reef3_1 <- data.frame(lastname3, reef3[-1])
reef3_long <- gather(reef3_1[-c(2:5)], qstn, answer, q2:q4) %>% arrange (lastname3)
reef3_long <- mutate(reef3_long, topic=rep("ChemBio", 117),
bt=rep(c("APPL", "APPL", "UND"), 39),
kd=rep(c("PROC", "PROC", "CNCP"), 39),
format=rep("SA", 117))
stdnames4 <- strsplit((reef4$lastname), " ") # extract names and separate last names and first names
lastname4 <- sapply(stdnames4, "[", 2) # vector of last names
firstname4 <- sapply(stdnames4, "[", 1) # vector of first names
reef4_1 <- data.frame(lastname4, reef4[-1])
reef4_long <- gather(reef4_1[-c(2:6, 8, 11, 13)], qstn, answer, q3, q5, q6, q8, q10) %>% arrange (lastname4)
reef4_long <- mutate(reef4_long, topic=rep(c("ChemBio", "BioMol", "BioMol", "BioMol", "BioMol"), 35),
bt=rep(c("UND", "RMB", "UND", "UND", "UND"), 35),
kd=rep(c("FACT", "FACT", "CNCP", "FACT", "FACT"), 35),
format=rep(c("SA", "SA", "T/F", "MC", "SA"), 35))
stdnames5 <- strsplit((reef5$lastname), " ") # extract names and separate last names and first names
lastname5 <- sapply(stdnames5, "[", 2) # vector of last names
firstname5 <- sapply(stdnames5, "[", 1) # vector of first names
reef5_1 <- data.frame(lastname5, reef5[-1])
reef5_long <- gather(reef5_1[-c(2:4)], qstn, answer, q1:q3) %>% arrange (lastname5)
reef5_long <- mutate(reef5_long, topic=rep("BioMol", 69),
bt=rep("APPL", 69),
kd=rep(c("PROC", "CNCP", "CNCP"), 23),
format=rep(c("SA", "MC", "MC"), 23))
stdnames6 <- strsplit((reef6$lastname), " ") # extract names and separate last names and first names
lastname6 <- sapply(stdnames6, "[", 2) # vector of last names
firstname6 <- sapply(stdnames6, "[", 1) # vector of first names
reef6_1 <- data.frame(lastname6, reef6[-1])
reef6_long <- gather(reef6_1[-c(2:6)], qstn, answer, q3:q7) %>% arrange (lastname6)
reef6_long <- mutate(reef6_long, topic=rep(c("Micrcpy", "Organelles", "Metric", "Micrcpy", "Micrcpy"), 31),
bt=rep(c("ANLZ", "UND", "RMB", "UND", "UND"), 31),
kd=rep(c("CNCP", "CNCP", "FACT", "CNCP", "UND"), 31),
format=rep(c("MC", "MC", "MC", "MC", "MC"), 31))
colnames(reef2_long)[1] <- "lastname"
colnames(reef3_long)[1] <- "lastname"
colnames(reef4_long)[1] <- "lastname"
colnames(reef5_long)[1] <- "lastname"
colnames(reef6_long)[1] <- "lastname"
reef_all_long <- rbind(reef1_long, reef2_long, reef3_long, reef4_long, reef5_long, reef6_long)
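The six per-session blocks above repeat the same split-and-keep-the-last-name step; a sketch of a hypothetical helper (extract_lastname is not part of the original analysis) that could replace that boilerplate, while the per-session question metadata would still be added individually:
## hypothetical helper: keep only the last name from "First Last" strings
extract_lastname <- function(df) {
  parts <- strsplit(df$lastname, " ")
  df$lastname <- sapply(parts, "[", 2)
  df
}
## example: equivalent to the reef1 name cleaning above
reef1_1 <- extract_lastname(reef1)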
## clean reading assignment dataset
names(ra1) <- make.names(names(ra1))
ra1[, c("Auto.Score", "Manual.Score")] <- convert.magic(ra1[, c("Auto.Score", "Manual.Score")], "numeric")
ra1[is.na(ra1)] <- 0
ra1_1 <- ra1 %>% rename (lastname = Last.Name, qstn = Question.ID) %>%
mutate (answer = Auto.Score + Manual.Score, session = rep("ra1", 252)) %>%
select (lastname, session, qstn, answer, topic, bt, kd, format)
names(ra2) <- make.names(names(ra2))
ra2[, c("Auto.Score", "Manual.Score")] <- convert.magic(ra2[, c("Auto.Score", "Manual.Score")], "numeric")
ra2[is.na(ra2)] <- 0
ra2_1 <- ra2 %>% rename (lastname = Last.Name, qstn = Question.ID) %>%
mutate (answer = Auto.Score + Manual.Score, session = rep("ra2", 320)) %>%
select (lastname, session, qstn, answer, topic, bt, kd, format)
## clean quizzes
qz1_1 <- qz1 %>% gather(lastname, answer, Akter:Willis) %>%
rename(qstn = question) %>%
mutate(session = rep("quiz1", 222)) %>%
select(lastname, session, qstn, answer, topic, bt, kd, format)
qz2_1 <- qz2 %>% gather(lastname, answer, Akter:Zuluga) %>%
rename(qstn = question) %>%
mutate(session = rep("quiz2", 190)) %>%
select(lastname, session, qstn, answer, topic, bt, kd, format)
qz3_1 <- qz3 %>% gather(lastname, answer, Akter:Zuluga) %>%
rename(qstn = question) %>%
mutate(session = rep("quiz3", 228)) %>%
select(lastname, session, qstn, answer, topic, bt, kd, format)
## add session to exam
ex1_long1 <- ex1 %>% gather(lastname, answer, Akter:Zuluaga) %>%
rename(qstn = question) %>%
mutate(session = rep("exam1", 1026)) %>%
select(lastname, session, qstn, answer, topic, bt, kd, format)
## assessments
all_asmnt <- rbind(reef_all_long, ra1_1, ra2_1, qz1_1, qz2_1, qz3_1, ex1_long1)
all_asmnt <- all_asmnt %>% mutate(asmnt = ifelse(session %in% c("ra1", "ra2"), "RA",
ifelse(session %in% c("quiz1","quiz2", "quiz3"), "QZ",
ifelse(session %in% c("rf1", "rf2", "rf3", "rf4", "rf5", "rf6"), "REEFP", "EXAM"))))
## evaluate proportions
### evaluate difference in proportion of question's format
prop.table(table(all_asmnt$format, all_asmnt$asmnt), 2)*100
### evaluate differences in proportion of question's bt
prop.table(table(all_asmnt$bt, all_asmnt$asmnt), 2)*100
### evaluate differences in proportion of question's kd
prop.table(table(all_asmnt$kd, all_asmnt$asmnt), 2)*100
### evaluate differences in proportion of question's topic
prop.table(table(all_asmnt$topic, all_asmnt$asmnt), 2)*100
### evaluate differences in proportion of correct answers
prop.table(table(all_asmnt$answer, all_asmnt$asmnt), 2)
##
## EXAM QZ RA REEFP
## BLANK 0.000000 5.937500 0.000000 0.000000
## MC 66.666667 11.562500 40.559441 56.862745
## SA 25.925926 76.718750 27.272727 38.848039
## T/F 7.407407 5.781250 32.167832 4.289216
##
## EXAM QZ RA REEFP
## ANLZ 22.222222 17.500000 9.790210 22.426471
## APPL 29.629630 17.812500 0.000000 26.838235
## RMB 29.629630 35.312500 42.657343 8.088235
## UND 18.518519 29.375000 47.552448 42.647059
##
## EXAM QZ RA REEFP
## CNCP 40.74074 35.31250 35.66434 49.63235
## FACT 44.44444 58.75000 64.33566 25.36765
## PROC 14.81481 5.93750 0.00000 21.20098
## UND 0.00000 0.00000 0.00000 3.79902
##
## EXAM QZ RA REEFP
## BioHierch 18.518519 23.281250 0.000000 23.039216
## BioMol 18.518519 11.875000 55.944056 25.612745
## ChemBio 22.222222 29.687500 44.055944 23.039216
## LivProp 0.000000 5.937500 0.000000 0.000000
## Metric 7.407407 0.000000 0.000000 3.799020
## Micrcpy 7.407407 0.000000 0.000000 11.397059
## Organelles 11.111111 0.000000 0.000000 3.799020
## ScieMeth 14.814815 0.000000 0.000000 0.000000
## SciMeth 0.000000 29.218750 0.000000 9.313725
##
## EXAM QZ RA REEFP
## 0 0.4884885 0.4698206 0.2360140 0.2761905
## 1 0.5115115 0.5301794 0.7639860 0.7238095
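Beyond eyeballing the proportions, whether question format and assessment type are independent could be tested formally. A sketch using a chi-square test of independence (some expected counts may be small, so any warning should be heeded):
chisq.test(table(all_asmnt$format, all_asmnt$asmnt)) # are format and assessment type independent?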
For question format, exam 1, followed by reef polling questions, had the highest proportion of multiple-choice questions; quizzes had the highest proportion of short-answer and fill-in-the-blank questions, while reading assignments had the highest proportion of true/false questions.
With regards to Bloom's taxonomy, exam 1 had a fairly even distribution of questions and the highest percentage of applying questions. Quizzes were somewhat evenly distributed as well, but their highest proportion of questions was in the remembering category. Reading assignments had the highest proportions of remembering and understanding questions, and no applying questions. On the other hand, reef polling had most of its questions at the understanding level, and the fewest at remembering.
When considering knowledge dimension, conceptual and factual questions made up the highest proportions in all four assessments (at least 75% combined), with reef polling having the highest proportion of conceptual questions and reading assignments the highest of factual questions.
When comparing topics covered, all four assessments had questions on biological molecules and chemical biology. With the exception of reading assignments, exam 1, quizzes, and polling questions also covered biological hierarchy and the scientific method; these topics were covered on the first day of class and for that reason were not included in reading assignments. The metric system, microscopy, and organelles were covered only in exam 1 and polling questions.
Lastly, reading assignments and reef polling questions had comparable proportions of correct answers (76% and 72%), while exam 1 and quizzes were also comparable to each other (51% and 53%).
Taken together, the four assessment types had comparable and differing proportions depending on the area being evaluated. For example, most multiple-choice questions were among the exam, reading assignments, and reef polling questions. For Bloom's taxonomy levels, each assessment had a distinctive proportion at each level, but for knowledge dimension all four assessments mainly had conceptual and factual questions. Lastly, quizzes and the exam had comparable proportions of correct answers, as did reading assignments and polling questions.
The main purpose of this data analysis report is to compare students' performance in exam 1 to the formative assessments (i.e. quizzes, reading assignments, polling questions) implemented in a General Education Biology course.
It was found that scores in exam 1 were statistically significantly lower than students' means in quizzes, reading assignments, and polling questions. Nevertheless, there were students (about half of the classroom) who performed better in exam 1, reporting higher scores than their quizzes means. This could be explained by some students putting more effort into exam 1 while others did the opposite. It could also be explained by possible over-confidence going into exam 1. Another factor to consider is that most of the incorrect answers were in the bottom half of the exam, which covered topics that students had not yet practiced in quizzes and thus had not received feedback on.
A possible action would be to include in future exams only topics that students have had enough prior practice on in quizzes, reef polling questions, and reading assignments (with the exception of topics covered on the first day of class) and on which they have received feedback.
As expected, there was a relationship between attendance and quizzes: the more absences, the lower the quizzes mean. Nevertheless, although there were no statistically significant differences in exam score means and medians between attendance categories, students with no absences had a higher 75th quartile.
A possible action would be to present students with graphical evidence of the relationship between attendance and scores in exams and quizzes.
There was a relationship between the formative assessments and scores in exam 1. Furthermore, students' quizzes mean was the formative assessment with the highest predictive value, explaining the highest percentage of the variability in exam 1 scores.
Possible actions could include emphasizing quizzes as preparation for exams, given their predictive value.
sessionInfo()
## R version 3.4.0 (2017-04-21)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 10586)
##
## Matrix products: default
##
## locale:
## [1] LC_COLLATE=English_United States.1252
## [2] LC_CTYPE=English_United States.1252
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.1252
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] car_2.1-4 pander_0.6.0 knitr_1.16
## [4] lubridate_1.6.0 readxl_1.0.0 pastecs_1.3-18
## [7] boot_1.3-19 psych_1.7.5 dplyr_0.5.0
## [10] purrr_0.2.2.2 readr_1.1.1 tidyr_0.6.3
## [13] tibble_1.3.3 ggplot2_2.2.1.9000 tidyverse_1.1.1
##
## loaded via a namespace (and not attached):
## [1] reshape2_1.4.2 splines_3.4.0 haven_1.0.0
## [4] lattice_0.20-35 colorspace_1.3-2 htmltools_0.3.6
## [7] mgcv_1.8-17 yaml_2.1.14 rlang_0.1.1
## [10] nloptr_1.0.4 foreign_0.8-68 DBI_0.6-1
## [13] modelr_0.1.0 plyr_1.8.4 stringr_1.2.0
## [16] MatrixModels_0.4-1 munsell_0.4.3 gtable_0.2.0
## [19] cellranger_1.1.0 rvest_0.3.2 evaluate_0.10
## [22] forcats_0.2.0 SparseM_1.77 quantreg_5.33
## [25] pbkrtest_0.4-7 parallel_3.4.0 highr_0.6
## [28] broom_0.4.2 Rcpp_0.12.11 scales_0.4.1
## [31] backports_1.1.0 jsonlite_1.5 lme4_1.1-13
## [34] mnormt_1.5-5 hms_0.3 digest_0.6.12
## [37] stringi_1.1.5 grid_3.4.0 rprojroot_1.2
## [40] tools_3.4.0 magrittr_1.5 lazyeval_0.2.0
## [43] Matrix_1.2-10 MASS_7.3-47 xml2_1.1.1
## [46] minqa_1.2.4 assertthat_0.2.0 rmarkdown_1.5
## [49] httr_1.2.1 R6_2.2.1 nnet_7.3-12
## [52] nlme_3.1-131 compiler_3.4.0