Cassandra Bayer
5/8/2017
Initial Load and Screen
Prior to loading in the data, I read through the surveys to make sure I understood the levels depicted in the data. I then load my libraries, set my directory, and load in the data.
#library(stargazer)
library(tidyverse)
library(shiny)
#set working directory and load in data
setwd("/Users/cassandrabayer/Downloads")
pre <- read.csv("Fall Pre_data.csv")
post <- read.csv("Spring Post_data.csv")
Merge
I join the two datasets with an inner join, which keeps every row of X (pre) that has a matching ID in Y (post) and returns all columns from both X and Y (pre and post, respectively).
I check that the merge succeeded by first adding up the columns: I should have all 64 from pre and all 40 from post, which gives a dataframe of 103 columns because the merge shares the ID column.
prepost_inner <- inner_join(pre,post, by="ID")
#str(prepost_inner) #check variable types-- all integers or numeric, which is good for now
#browse the dataframe using View() and make sure there are not overwhelming numbers of NAs where they shouldn't be
#check for consistency (values should be 1 through 4 with the exception of a couple, like year or month)
#stargazer(prepost_inner, type="html", summary.logical=FALSE)
| Statistic | N | Mean | St. Dev. | Min | Max |
| ID | 31 | 17.419 | 10.388 | 1 | 34 |
| Gender.x | 30 | 1.467 | 0.507 | 1 | 2 |
| Grade.x | 31 | 4.613 | 0.667 | 4 | 6 |
| AmericanIndian.x | 2 | 1.000 | 0.000 | 1 | 1 |
| Asian.x | 2 | 2.000 | 0.000 | 2 | 2 |
| BlackorAfricanAmerican.x | 7 | 3.000 | 0.000 | 3 | 3 |
| Hispanic.x | 18 | 4.000 | 0.000 | 4 | 4 |
| NativeHawaiian.x | 1 | 5.000 | | 5 | 5 |
| WhitenonHispanic.x | 4 | 6.000 | 0.000 | 6 | 6 |
| Other.x | 3 | 7.000 | 0.000 | 7 | 7 |
| Science_Excited.x | 30 | 2.467 | 0.860 | 1 | 4 |
| TakeApart_Learn.x | 31 | 2.871 | 0.619 | 1 | 4 |
| Like_Participate.x | 31 | 2.968 | 0.836 | 1 | 4 |
| Science_Gift.x | 31 | 2.452 | 1.091 | 1 | 4 |
| Like_Made.x | 31 | 3.355 | 0.608 | 2 | 4 |
| Like_NatureTV.x | 31 | 2.355 | 0.985 | 1 | 4 |
| Curious_Science.x | 31 | 2.806 | 0.873 | 1 | 4 |
| Science_Outsideschool.x | 31 | 2.323 | 1.013 | 1 | 4 |
| Growup.x | 31 | 2.194 | 0.910 | 1 | 4 |
| Science_Job.x | 31 | 2.323 | 0.909 | 1 | 4 |
| Understand_Science.x | 31 | 2.903 | 0.790 | 1 | 4 |
| Enjoy_Museums.x | 30 | 3.233 | 0.679 | 1 | 4 |
| Science_Boring.x | 30 | 2.633 | 0.999 | 1 | 4 |
| Excited_Inventions.x | 31 | 2.710 | 0.783 | 1 | 4 |
| Dontlike_Reading.x | 30 | 2.500 | 0.974 | 1 | 4 |
| Pay_Attention.x | 31 | 2.806 | 0.749 | 1 | 4 |
| Curious_Cars.x | 30 | 2.533 | 1.137 | 1 | 4 |
| Excited_Activity.x | 30 | 2.600 | 0.770 | 1 | 4 |
| Enjoy_SciFi.x | 29 | 1.966 | 0.906 | 1 | 4 |
| ScienceNotForSchool.x | 30 | 2.533 | 0.776 | 1 | 4 |
| Like_Scienceatschool.x | 31 | 2.613 | 0.803 | 1 | 4 |
| Like_Scienceoutsideschool.x | 30 | 2.433 | 0.935 | 1 | 4 |
| Favorite_Subject.x | 31 | 2.226 | 0.990 | 1 | 4 |
| Afterschool_Subject.x | 30 | 2.300 | 0.952 | 1 | 4 |
| HavetoTake.x | 31 | 2.742 | 0.930 | 1 | 4 |
| HavetoTake_After.x | 31 | 2.129 | 0.991 | 1 | 4 |
| AtSchool_Future.x | 31 | 2.968 | 0.948 | 1 | 4 |
| AfterSchool_Future.x | 31 | 2.677 | 1.013 | 1 | 4 |
| WeeksDoingScience.x | 29 | 1.724 | 1.251 | 1 | 4 |
| HoursperWeek.x | 28 | 1.643 | 0.911 | 1 | 4 |
| Gender.y | 26 | 1.423 | 0.504 | 1 | 2 |
| Grade.y | 30 | 4.167 | 0.461 | 4 | 6 |
| AmericanIndian.y | 1 | 1.000 | | 1 | 1 |
| BlackorAfricanAmerican.y | 8 | 3.000 | 0.000 | 3 | 3 |
| Hispanic.y | 17 | 4.000 | 0.000 | 4 | 4 |
| WhitenonHispanic.y | 6 | 6.000 | 0.000 | 6 | 6 |
| Other.y | 3 | 7.000 | 0.000 | 7 | 7 |
| Science_Excited.y | 30 | 2.800 | 1.064 | 1 | 4 |
| TakeApart_Learn.y | 31 | 2.613 | 1.054 | 1 | 4 |
| Like_Participate.y | 31 | 3.129 | 0.846 | 1 | 4 |
| Science_Gift.y | 31 | 3.000 | 1.000 | 1 | 4 |
| Like_Made.y | 31 | 3.129 | 1.056 | 1 | 4 |
| Like_NatureTV.y | 30 | 2.633 | 1.159 | 1 | 4 |
| Curious_Science.y | 31 | 2.968 | 1.080 | 1 | 4 |
| Science_Outsideschool.y | 31 | 2.935 | 0.964 | 1 | 4 |
| Growup.y | 31 | 3.000 | 1.000 | 1 | 4 |
| Science_Job.y | 31 | 2.258 | 1.094 | 1 | 4 |
| Understand_Science.y | 30 | 3.100 | 0.995 | 1 | 4 |
| Enjoy_Museums.y | 28 | 3.143 | 0.932 | 1 | 4 |
| Science_Boring.y | 29 | 2.069 | 1.132 | 1 | 4 |
| Excited_Inventions.y | 31 | 2.806 | 0.910 | 1 | 4 |
| Dontlike_Reading.y | 30 | 2.300 | 0.952 | 1 | 4 |
| Pay_Attention.y | 31 | 2.839 | 1.003 | 1 | 4 |
| Curious_Cars.y | 31 | 2.581 | 0.992 | 1 | 4 |
| Excited_Activity.y | 31 | 3.000 | 0.931 | 1 | 4 |
| Enjoy_SciFi.y | 31 | 2.645 | 1.050 | 1 | 4 |
| ScienceNotForSchool.y | 29 | 2.448 | 1.055 | 1 | 4 |
| Like_Scienceatschool.y | 28 | 3.071 | 0.858 | 1 | 4 |
| Like_Scienceoutsideschool.y | 28 | 3.107 | 0.786 | 1 | 4 |
| Favorite_Subject.y | 28 | 3.036 | 0.962 | 1 | 4 |
| Afterschool_Subject.y | 28 | 2.786 | 0.995 | 1 | 4 |
| HavetoTake.y | 29 | 2.414 | 0.983 | 1 | 4 |
| HavetoTake_After.y | 29 | 1.931 | 0.961 | 1 | 4 |
| AtSchool_Future.y | 29 | 2.724 | 1.099 | 1 | 4 |
| AfterSchool_Future.y | 27 | 2.333 | 1.177 | 1 | 4 |
| MonthBorn | 25 | 6.800 | 2.517 | 1 | 10 |
| DayBorn | 25 | 14.120 | 9.479 | 1 | 31 |
| YearBorn | 12 | 1,997.833 | 0.577 | 1,996 | 1,998 |
| WeeksDoingScience.y | 27 | 3.000 | 1.074 | 1 | 4 |
| HoursperWeek.y | 26 | 2.154 | 0.967 | 1 | 4 |
| ASConservation | 6 | 1.000 | 0.000 | 1 | 1 |
| AS_SciencePlus | 4 | 2.000 | 0.000 | 2 | 2 |
| Mixing | 4 | 3.000 | 0.000 | 3 | 3 |
| Wonderwise | 2 | 4.000 | 0.000 | 4 | 4 |
| TechAS | 2 | 5.000 | 0.000 | 5 | 5 |
| NASA | 6 | 6.000 | 0.000 | 6 | 6 |
| More_Interesting | 26 | 1.923 | 1.017 | 1 | 4 |
| More_Excited | 25 | 2.120 | 1.130 | 1 | 4 |
| interested_sciencejob | 25 | 2.280 | 1.173 | 1 | 4 |
| Science_Fun | 25 | 1.920 | 1.038 | 1 | 4 |
| Sure_Job | 24 | 2.500 | 1.216 | 1 | 4 |
| More_Relaxed | 23 | 1.957 | 1.186 | 1 | 4 |
| Confident_Ability | 24 | 2.042 | 1.160 | 1 | 4 |
| Better_Student | 24 | 1.792 | 1.103 | 1 | 4 |
| Job_possible | 24 | 2.125 | 1.035 | 1 | 4 |
| HighSchool | 24 | 1.958 | 1.122 | 1 | 4 |
| Improved_Understanding | 24 | 1.792 | 0.977 | 1 | 4 |
| Learn | 24 | 1.792 | 0.932 | 1 | 4 |
| Help_Future | 25 | 1.720 | 0.891 | 1 | 4 |
| Help_Know_Science | 25 | 2.000 | 1.118 | 1 | 4 |
| Science_Questions | 25 | 1.800 | 0.913 | 1 | 4 |
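The column-count check described in the Merge section can be sketched on toy data. This is only an illustration: `pre_toy`, `post_toy`, and their values are made up, and base R's `merge()` stands in for `inner_join()` so the snippet runs without extra packages. Both functions add `.x` and `.y` suffixes to non-key columns that share a name.

```r
# Toy stand-ins for the pre and post surveys (hypothetical values).
pre_toy <- data.frame(
  ID = 1:5,
  Gender = c(1, 2, 1, 2, 1),
  Science_Excited = c(2, 3, 4, 1, 3)
)
post_toy <- data.frame(
  ID = c(2, 3, 5, 6),
  Gender = c(2, 1, 1, 2),
  Science_Excited = c(3, 4, 2, 4),
  More_Excited = c(1, 2, 2, 3)
)

# Inner join on ID: only IDs present in both tables are kept.
merged_toy <- merge(pre_toy, post_toy, by = "ID")

# Expected width: every column from both tables, counting the shared key once.
expected_cols <- ncol(pre_toy) + ncol(post_toy) - 1

ncol(merged_toy) == expected_cols  # TRUE when no column was lost
nrow(merged_toy)                   # number of IDs found in both surveys
names(merged_toy)                  # shared names carry .x (pre) and .y (post)
```

With the real data the same arithmetic is 64 + 40 - 1 = 103.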
Quick Clean
Most questions are on an ordinal scale from 1-4, with the exception of questions regarding grade or birth year/month. It’s easy to spot (using the summary function) where the oddball numbers are (there are a couple). I replace stray high values with NAs, especially for the Likert questions (1-4), rather than drop them from the set.
*Another alternative would be to get complete cases, but we’d lose too much data in this dataframe.
*In cases like this data set where there are many NAs, we could also impute using the mean, but the data set is so small that I’m not sure I trust the mean.
prepost_inner[ prepost_inner >=1999 ] <- NA #the highest intentional value that I saw
#stargazer(prepost_inner, type = "html") #do another check, looks good but I have some questions
#Some queries: why is NativeHawaiian.y logical? Are there really no respondents with November or December birthdays?
| Statistic | N | Mean | St. Dev. | Min | Max |
| ID | 31 | 17.419 | 10.388 | 1 | 34 |
| Gender.x | 30 | 1.467 | 0.507 | 1 | 2 |
| Grade.x | 31 | 4.613 | 0.667 | 4 | 6 |
| AmericanIndian.x | 2 | 1.000 | 0.000 | 1 | 1 |
| Asian.x | 2 | 2.000 | 0.000 | 2 | 2 |
| BlackorAfricanAmerican.x | 7 | 3.000 | 0.000 | 3 | 3 |
| Hispanic.x | 18 | 4.000 | 0.000 | 4 | 4 |
| NativeHawaiian.x | 1 | 5.000 | | 5 | 5 |
| WhitenonHispanic.x | 4 | 6.000 | 0.000 | 6 | 6 |
| Other.x | 3 | 7.000 | 0.000 | 7 | 7 |
| Science_Excited.x | 30 | 2.467 | 0.860 | 1 | 4 |
| TakeApart_Learn.x | 31 | 2.871 | 0.619 | 1 | 4 |
| Like_Participate.x | 31 | 2.968 | 0.836 | 1 | 4 |
| Science_Gift.x | 31 | 2.452 | 1.091 | 1 | 4 |
| Like_Made.x | 31 | 3.355 | 0.608 | 2 | 4 |
| Like_NatureTV.x | 31 | 2.355 | 0.985 | 1 | 4 |
| Curious_Science.x | 31 | 2.806 | 0.873 | 1 | 4 |
| Science_Outsideschool.x | 31 | 2.323 | 1.013 | 1 | 4 |
| Growup.x | 31 | 2.194 | 0.910 | 1 | 4 |
| Science_Job.x | 31 | 2.323 | 0.909 | 1 | 4 |
| Understand_Science.x | 31 | 2.903 | 0.790 | 1 | 4 |
| Enjoy_Museums.x | 30 | 3.233 | 0.679 | 1 | 4 |
| Science_Boring.x | 30 | 2.633 | 0.999 | 1 | 4 |
| Excited_Inventions.x | 31 | 2.710 | 0.783 | 1 | 4 |
| Dontlike_Reading.x | 30 | 2.500 | 0.974 | 1 | 4 |
| Pay_Attention.x | 31 | 2.806 | 0.749 | 1 | 4 |
| Curious_Cars.x | 30 | 2.533 | 1.137 | 1 | 4 |
| Excited_Activity.x | 30 | 2.600 | 0.770 | 1 | 4 |
| Enjoy_SciFi.x | 29 | 1.966 | 0.906 | 1 | 4 |
| ScienceNotForSchool.x | 30 | 2.533 | 0.776 | 1 | 4 |
| Like_Scienceatschool.x | 31 | 2.613 | 0.803 | 1 | 4 |
| Like_Scienceoutsideschool.x | 30 | 2.433 | 0.935 | 1 | 4 |
| Favorite_Subject.x | 31 | 2.226 | 0.990 | 1 | 4 |
| Afterschool_Subject.x | 30 | 2.300 | 0.952 | 1 | 4 |
| HavetoTake.x | 31 | 2.742 | 0.930 | 1 | 4 |
| HavetoTake_After.x | 31 | 2.129 | 0.991 | 1 | 4 |
| AtSchool_Future.x | 31 | 2.968 | 0.948 | 1 | 4 |
| AfterSchool_Future.x | 31 | 2.677 | 1.013 | 1 | 4 |
| WeeksDoingScience.x | 29 | 1.724 | 1.251 | 1 | 4 |
| HoursperWeek.x | 28 | 1.643 | 0.911 | 1 | 4 |
| Gender.y | 26 | 1.423 | 0.504 | 1 | 2 |
| Grade.y | 30 | 4.167 | 0.461 | 4 | 6 |
| AmericanIndian.y | 1 | 1.000 | | 1 | 1 |
| BlackorAfricanAmerican.y | 8 | 3.000 | 0.000 | 3 | 3 |
| Hispanic.y | 17 | 4.000 | 0.000 | 4 | 4 |
| WhitenonHispanic.y | 6 | 6.000 | 0.000 | 6 | 6 |
| Other.y | 3 | 7.000 | 0.000 | 7 | 7 |
| Science_Excited.y | 30 | 2.800 | 1.064 | 1 | 4 |
| TakeApart_Learn.y | 31 | 2.613 | 1.054 | 1 | 4 |
| Like_Participate.y | 31 | 3.129 | 0.846 | 1 | 4 |
| Science_Gift.y | 31 | 3.000 | 1.000 | 1 | 4 |
| Like_Made.y | 31 | 3.129 | 1.056 | 1 | 4 |
| Like_NatureTV.y | 30 | 2.633 | 1.159 | 1 | 4 |
| Curious_Science.y | 31 | 2.968 | 1.080 | 1 | 4 |
| Science_Outsideschool.y | 31 | 2.935 | 0.964 | 1 | 4 |
| Growup.y | 31 | 3.000 | 1.000 | 1 | 4 |
| Science_Job.y | 31 | 2.258 | 1.094 | 1 | 4 |
| Understand_Science.y | 30 | 3.100 | 0.995 | 1 | 4 |
| Enjoy_Museums.y | 28 | 3.143 | 0.932 | 1 | 4 |
| Science_Boring.y | 29 | 2.069 | 1.132 | 1 | 4 |
| Excited_Inventions.y | 31 | 2.806 | 0.910 | 1 | 4 |
| Dontlike_Reading.y | 30 | 2.300 | 0.952 | 1 | 4 |
| Pay_Attention.y | 31 | 2.839 | 1.003 | 1 | 4 |
| Curious_Cars.y | 31 | 2.581 | 0.992 | 1 | 4 |
| Excited_Activity.y | 31 | 3.000 | 0.931 | 1 | 4 |
| Enjoy_SciFi.y | 31 | 2.645 | 1.050 | 1 | 4 |
| ScienceNotForSchool.y | 29 | 2.448 | 1.055 | 1 | 4 |
| Like_Scienceatschool.y | 28 | 3.071 | 0.858 | 1 | 4 |
| Like_Scienceoutsideschool.y | 28 | 3.107 | 0.786 | 1 | 4 |
| Favorite_Subject.y | 28 | 3.036 | 0.962 | 1 | 4 |
| Afterschool_Subject.y | 28 | 2.786 | 0.995 | 1 | 4 |
| HavetoTake.y | 29 | 2.414 | 0.983 | 1 | 4 |
| HavetoTake_After.y | 29 | 1.931 | 0.961 | 1 | 4 |
| AtSchool_Future.y | 29 | 2.724 | 1.099 | 1 | 4 |
| AfterSchool_Future.y | 27 | 2.333 | 1.177 | 1 | 4 |
| MonthBorn | 25 | 6.800 | 2.517 | 1 | 10 |
| DayBorn | 25 | 14.120 | 9.479 | 1 | 31 |
| YearBorn | 12 | 1,997.833 | 0.577 | 1,996 | 1,998 |
| WeeksDoingScience.y | 27 | 3.000 | 1.074 | 1 | 4 |
| HoursperWeek.y | 26 | 2.154 | 0.967 | 1 | 4 |
| ASConservation | 6 | 1.000 | 0.000 | 1 | 1 |
| AS_SciencePlus | 4 | 2.000 | 0.000 | 2 | 2 |
| Mixing | 4 | 3.000 | 0.000 | 3 | 3 |
| Wonderwise | 2 | 4.000 | 0.000 | 4 | 4 |
| TechAS | 2 | 5.000 | 0.000 | 5 | 5 |
| NASA | 6 | 6.000 | 0.000 | 6 | 6 |
| More_Interesting | 26 | 1.923 | 1.017 | 1 | 4 |
| More_Excited | 25 | 2.120 | 1.130 | 1 | 4 |
| interested_sciencejob | 25 | 2.280 | 1.173 | 1 | 4 |
| Science_Fun | 25 | 1.920 | 1.038 | 1 | 4 |
| Sure_Job | 24 | 2.500 | 1.216 | 1 | 4 |
| More_Relaxed | 23 | 1.957 | 1.186 | 1 | 4 |
| Confident_Ability | 24 | 2.042 | 1.160 | 1 | 4 |
| Better_Student | 24 | 1.792 | 1.103 | 1 | 4 |
| Job_possible | 24 | 2.125 | 1.035 | 1 | 4 |
| HighSchool | 24 | 1.958 | 1.122 | 1 | 4 |
| Improved_Understanding | 24 | 1.792 | 0.977 | 1 | 4 |
| Learn | 24 | 1.792 | 0.932 | 1 | 4 |
| Help_Future | 25 | 1.720 | 0.891 | 1 | 4 |
| Help_Know_Science | 25 | 2.000 | 1.118 | 1 | 4 |
| Science_Questions | 25 | 1.800 | 0.913 | 1 | 4 |
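To illustrate the replacement step from Quick Clean, here is a small sketch on a made-up data frame. The global threshold is the approach used in the report; the per-column range check is an alternative I am assuming for comparison, and `likert_cols` is a hypothetical list of column names.

```r
# Toy data: two Likert items (valid range 1-4) and a birth year.
toy <- data.frame(
  Science_Excited = c(1, 4, 9999, 3),
  Science_Boring  = c(2, 7, 3, NA),
  YearBorn        = c(1996, 1998, 1997, 9999)
)

# Approach used in the report: one global threshold for the whole frame.
global <- toy
global[global >= 1999] <- NA

# Alternative (assumption): check each Likert column against its own range.
likert_cols <- c("Science_Excited", "Science_Boring")  # hypothetical list
ranged <- toy
for (col in likert_cols) {
  out_of_range <- !is.na(ranged[[col]]) & !(ranged[[col]] %in% 1:4)
  ranged[[col]][out_of_range] <- NA
}

global$Science_Boring        # the stray 7 survives the global threshold
ranged$Science_Boring        # the stray 7 is now NA
sum(complete.cases(ranged))  # rows that would remain under complete cases
```

The last line shows why complete cases is costly here: a single missing answer removes the whole row.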
Quick Overview
Only a small subset of students took both the pre and post surveys, and even fewer completed them in their entirety. We find that only 31 students took both surveys, 19 of whom were excited about science before starting (using the Science_Excited.x variable) and 18 of whom were excited about it after the program (using Science_Excited.y); that’s 61% and 58% of the sample, respectively.
#31 students took both pre and post survey
length(prepost_inner$ID)
## [1] 31
length(which(prepost_inner$Science_Excited.x >= 3))
## [1] 19
length(which(prepost_inner$Science_Excited.y >= 3))
## [1] 18
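The percentages follow directly from the counts above. The sketch below redoes the arithmetic and then uses a made-up vector (`toy_excited`, not survey data) to show how the choice of denominator matters when some answers are missing; the summary table lists N = 30 for both Science_Excited columns, so one response is missing in each.

```r
n_students   <- 31  # students with both surveys
excited_pre  <- 19  # Science_Excited.x >= 3
excited_post <- 18  # Science_Excited.y >= 3

round(100 * excited_pre / n_students)   # 61
round(100 * excited_post / n_students)  # 58

# Toy vector with one missing answer, to compare denominators.
toy_excited <- c(3, 4, NA, 2, 1, 4)

# Share among students who answered the question.
share_of_answered <- mean(toy_excited >= 3, na.rm = TRUE)

# Share of the full sample, treating a missing answer as "not excited".
share_of_all <- sum(toy_excited >= 3, na.rm = TRUE) / length(toy_excited)

share_of_answered  # 0.6
share_of_all       # 0.5
```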
Sixth grade males
I made a separate subsetted dataframe just for ease of calculation. Here, we see that there are only 2 sixth grade males in the sample, only one of whom thought science was boring prior to the program (50%, using a simple Boolean count). While we can be certain that one did not find science boring in the post period, the other respondent’s answer was missing. If you drop the missing value (na.rm=TRUE), there were no sixth grade males who found science boring after the program.
sixth_male <- prepost_inner %>% filter(Grade.x==6 & Gender.x==2)
sum(sixth_male$Science_Boring.x>=3)
## [1] 1
sum(sixth_male$Science_Boring.y>=3)
## [1] NA
sum(sixth_male$Science_Boring.y>=3, na.rm= TRUE)
## [1] 0
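The NA result above is standard R behavior rather than a data problem. A minimal sketch with a made-up two-element vector (`boring_post`, standing in for the two students' post answers) shows the three quantities worth reporting together:

```r
# Toy post-survey answers: one below 3, one missing.
boring_post <- c(2, NA)

unknown_total <- sum(boring_post >= 3)                # NA: one answer is unknown
observed_total <- sum(boring_post >= 3, na.rm = TRUE) # counts observed answers only
n_missing <- sum(is.na(boring_post))                  # how many answers are missing

unknown_total
observed_total
n_missing
```

Reporting the missing count next to the na.rm total makes it clear that a zero here means "none among those who answered".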
Significance of Excitement on Hours Studying
To find whether students’ level of agreement that science is something to be excited about had a significant bearing on how long they did science each week, I performed a linear regression. I wanted to see what proportion, if any, of the number of hours devoted to science was explained by excitement for science. I found that the relationship was not significant.
hours_fit <- lm(HoursperWeek.y ~ Science_Excited.y, data=prepost_inner)
#Not statistically significant
| | Dependent variable: |
| | HoursperWeek.y |
| Science_Excited.y | -0.087 |
| | t = -0.514 |
| Constant | 2.323 |
| | t = 4.582*** |
| Observations | 25 |
| R2 | 0.011 |
| Adjusted R2 | -0.032 |
| Residual Std. Error | 0.923 (df = 23) |
| F Statistic | 0.264 (df = 1; 23) |
| Note: | *p<0.1; **p<0.05; ***p<0.01 |
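For readers who want to pull the slope, its t statistic, and its p-value out of a fitted model instead of reading them off a table, here is a sketch on simulated data. The `toy` data frame and its column names are invented; nothing here reproduces the survey results. It also checks a relationship visible in the table above: with a single predictor the F statistic is the square of the slope's t statistic (0.514 squared is about 0.264).

```r
set.seed(1)
# Simulated stand-ins for the two survey items (not the real data).
toy <- data.frame(excited = sample(1:4, 25, replace = TRUE))
toy$hours <- sample(1:4, 25, replace = TRUE)

fit <- lm(hours ~ excited, data = toy)
fit_summary <- summary(fit)
coefs <- fit_summary$coefficients

slope_t <- coefs["excited", "t value"]   # t statistic for the slope
slope_p <- coefs["excited", "Pr(>|t|)"]  # two-sided p-value for the slope
f_stat  <- fit_summary$fstatistic["value"]
r2      <- fit_summary$r.squared         # share of variance explained

# With one predictor, F equals t squared.
all.equal(unname(slope_t^2), unname(f_stat))
```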
Double Checking
As a gut check, I broke the dataframe into a smaller subset with just the variables of interest (vars). I looked at the simple average difference in excitement and in hours studied between pre and post (diffexcite and diffhours); neither difference was large (about a 1/3-point increase in excitement post-program and about a 0.35-point increase on the hours-per-week scale, roughly 20 minutes if each scale point corresponds to one hour). I then used an ANOVA to compare mean hours studied across levels of excitement, and again found no statistically significant relationship. As a final point of interest, I grouped the smaller science dataframe by level of excitement and computed each level’s mean hours of studying.
vars <- c("ID", "Gender.x", "Grade.x", "Science_Excited.x", "Science_Excited.y", "HoursperWeek.x", "HoursperWeek.y")
science <- prepost_inner[,vars]
science$diffexcite <- science$Science_Excited.y-science$Science_Excited.x
science$diffhours <- science$HoursperWeek.y-science$HoursperWeek.x
mean(science$diffexcite, na.rm=TRUE)
## [1] 0.3333333
mean(science$diffhours, na.rm=TRUE)
## [1] 0.3478261
#test for difference using ANOVA where H0: L1 = L2 = L3 = L4 (Ls represent each different level of excitement). The null hypothesis claims that these are all the same and that the excitement about science effectively has no bearing on the hours per week spent studying.
#note: Science_Excited.y is numeric here, so aov() fits a single slope (Df = 1) and reproduces the regression F statistic; wrapping it in factor() would compare the four level means
hours_aov <- aov(science$HoursperWeek.y~science$Science_Excited.y)
summary(hours_aov)
##                           Df Sum Sq Mean Sq F value Pr(>F)
## science$Science_Excited.y  1  0.225  0.2253   0.264  0.612
## Residuals                 23 19.615  0.8528
## 6 observations deleted due to missingness
#again not a significant relationship. The F statistic is small and the p-value (0.612) is large, so results like these would be unsurprising if excitement had no bearing on hours. We fail to reject the null hypothesis: we have no evidence that hours spent studying differ across kids with varying levels of excitement.
Triple Checking
Get mean for each level of excitement in the post period for a gut check. Oddly enough, the students most excited about science studied the least, which could either be showing that students with a proclivity for science study less or that the data is insufficient/noisy.
science_excite <- group_by(science, Science_Excited.y) %>%
  summarise(mean_hours_study=mean(HoursperWeek.y,na.rm=TRUE),count=n())
science_excite <- filter(science_excite, !is.na(Science_Excited.y)) #drop the NA group
head(science_excite)
## # A tibble: 4 × 3
##   Science_Excited.y mean_hours_study count
##               <int>            <dbl> <int>
## 1                 1         2.000000     4
## 2                 2         2.166667     8
## 3                 3         2.500000     8
## 4                 4         1.777778    10
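The ANOVA in Double Checking reports Df = 1, which is what `aov()` gives when the grouping variable is numeric. The sketch below, on simulated data with invented column names, contrasts that with the four-level comparison obtained by converting the variable to a factor, and computes per-level means in base R. It is an illustration of the difference, not a re-analysis of the survey.

```r
set.seed(2)
# Simulated data; group sizes are illustrative only.
toy <- data.frame(
  excited = rep(1:4, times = c(4, 8, 8, 10)),
  hours   = sample(1:4, 30, replace = TRUE)
)

# Numeric predictor: a single linear trend, so 1 degree of freedom.
numeric_fit <- aov(hours ~ excited, data = toy)

# Factor predictor: one mean per level, so 4 - 1 = 3 degrees of freedom.
factor_fit <- aov(hours ~ factor(excited), data = toy)

numeric_df <- summary(numeric_fit)[[1]][["Df"]]
factor_df  <- summary(factor_fit)[[1]][["Df"]]
numeric_df
factor_df

# Mean hours within each excitement level.
level_means <- tapply(toy$hours, toy$excited, mean)
level_means
```

The factor version is the one that tests H0: L1 = L2 = L3 = L4 directly; the numeric version only tests for a straight-line trend across levels.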
Boys versus Girls
I selected a handful of variables to compare pre and post means between boys and girls; I use the mean given the small sample size and lack of outliers in the data. I chose a blend of variables that directly target liking or understanding science and questions that are more cloaked, something closer to a proxy. I would have loved to use the Confidence metric, but it is not a proper basis for comparison because it was not in the pre-survey. The greatest difference between boys and girls was in the level of excitement: there was a mean increase of 1 whole point for boys, but a decrease of about 1/3 point for girls; enjoyment of museums moved in the same directions (up slightly for boys, down for girls).
However, there was a proportionately larger effect in girls for a desire to understand science: boys’ responses remained essentially flat, but girls’ desire to understand science rose by about 1/3 point on average. It seems that the program piqued boys’ excitement (about the subject and about museums), while it increased girls’ interest in learning more despite not seeming to make them feel more successful or excited about it.
gender_diff <- group_by(prepost_inner, Gender.x) %>%
  summarise(diff_hours=mean(HoursperWeek.y,na.rm=TRUE)- #na.rms remove the missing values so they don't influence the mean
              mean(HoursperWeek.x,na.rm=TRUE),
            diff_excite=mean(Science_Excited.y,na.rm=TRUE)-
              mean(Science_Excited.x,na.rm=TRUE),
            diff_take_apart=mean(TakeApart_Learn.y,na.rm=TRUE)-
              mean(TakeApart_Learn.x,na.rm=TRUE),
            diff_boring=mean(Science_Boring.y,na.rm=TRUE)-
              mean(Science_Boring.x,na.rm=TRUE),
            diff_museum=mean(Enjoy_Museums.y,na.rm=TRUE)-
              mean(Enjoy_Museums.x,na.rm=TRUE),
            diff_understand=mean(Understand_Science.y,na.rm=TRUE)-
              mean(Understand_Science.x,na.rm=TRUE),
            diff_like_school=mean(Like_Scienceatschool.y,na.rm=TRUE)-
              mean(Like_Scienceatschool.x,na.rm=TRUE),
            diff_like_after=mean(Like_Scienceoutsideschool.y,na.rm=TRUE)-
              mean(Like_Scienceoutsideschool.x,na.rm=TRUE),
            count=n())
gender_diff <- gender_diff[-c(3),]
#gender_diff
#drop NA category that comes from some not stating their gender either in the pre or post
| Gender.x | diff_hours | diff_excite | diff_take_apart | diff_boring | diff_museum | diff_understand | diff_like_school | diff_like_after | count |
| 1 | 0.527472527472527 | -0.3125 | -0.3125 | -0.4375 | -0.333333333333333 | 0.3125 | 0.5 | 0.666666666666667 | 16 |
| 2 | 0.376623376623377 | 1 | -0.142857142857143 | -0.538461538461538 | 0.107142857142857 | 0.00549450549450547 | 0.318181818181818 | 0.523809523809524 | 14 |
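One detail of the differences above is that each mean is taken over whoever answered that particular survey, so the pre and post means can rest on different students. A sketch on made-up data (the `toy` frame and its values are hypothetical) compares that with the mean of paired changes, which uses only students who answered both times. The paired version is an alternative I am assuming for comparison, not the calculation used in the report.

```r
# Toy data: three students per gender, with some answers missing.
toy <- data.frame(
  gender = c(1, 1, 1, 2, 2, 2),
  pre    = c(2, 4, NA, 1, 2, 4),
  post   = c(3, NA, 4, 2, 3, NA)
)

# Difference of group means: each mean drops its own missing values.
diff_of_means <- tapply(toy$post, toy$gender, mean, na.rm = TRUE) -
  tapply(toy$pre, toy$gender, mean, na.rm = TRUE)

# Mean of paired differences: only students with both answers contribute.
toy$change <- toy$post - toy$pre
mean_of_diffs <- tapply(toy$change, toy$gender, mean, na.rm = TRUE)

diff_of_means  # gender 1: 3.5 - 3 = 0.5
mean_of_diffs  # gender 1: one complete pair, change of 1
```

With many missing answers and groups of 14 to 16 students, the two calculations can disagree noticeably, so it helps to state which one a table reports.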
Post-Program Questions
Out of curiosity, I wanted to see just the post-survey questions for boys and girls separately. It seems that the program made science more fun and interesting for girls than for boys, but made boys feel like they were better students. I wonder if the improvement in boys’ attitudes regarding their performance is because boys disproportionately feel successful in science and the program allocated more time to a subject they feel comfortable with.
gender_1 <- prepost_inner %>% filter(Gender.x==1)
gender_1 <- gender_1[,c(89:103)]
#summary(gender_1)
gender_2 <- prepost_inner %>% filter(Gender.x==2)
gender_2 <- gender_2[,c(89:103)]
#summary(gender_2)
| Statistic | N | Mean | St. Dev. | Min | Max |
| More_Interesting | 14 | 2.000 | 0.961 | 1 | 4 |
| More_Excited | 13 | 2.385 | 1.193 | 1 | 4 |
| interested_sciencejob | 13 | 2.308 | 1.251 | 1 | 4 |
| Science_Fun | 13 | 2.077 | 1.038 | 1 | 3 |
| Sure_Job | 12 | 2.667 | 1.231 | 1 | 4 |
| More_Relaxed | 12 | 2.083 | 1.165 | 1 | 4 |
| Confident_Ability | 12 | 2.083 | 1.165 | 1 | 4 |
| Better_Student | 12 | 1.667 | 0.985 | 1 | 4 |
| Job_possible | 12 | 1.917 | 0.996 | 1 | 4 |
| HighSchool | 12 | 1.750 | 1.055 | 1 | 4 |
| Improved_Understanding | 12 | 1.583 | 0.793 | 1 | 3 |
| Learn | 12 | 1.833 | 0.835 | 1 | 3 |
| Help_Future | 13 | 1.692 | 0.751 | 1 | 3 |
| Help_Know_Science | 13 | 2.154 | 1.214 | 1 | 4 |
| Science_Questions | 13 | 1.846 | 0.801 | 1 | 3 |
| Statistic | N | Mean | St. Dev. | Min | Max |
| More_Interesting | 11 | 1.909 | 1.136 | 1 | 4 |
| More_Excited | 11 | 1.909 | 1.044 | 1 | 4 |
| interested_sciencejob | 11 | 2.364 | 1.120 | 1 | 4 |
| Science_Fun | 11 | 1.818 | 1.079 | 1 | 4 |
| Sure_Job | 11 | 2.455 | 1.214 | 1 | 4 |
| More_Relaxed | 10 | 1.900 | 1.287 | 1 | 4 |
| Confident_Ability | 11 | 2.091 | 1.221 | 1 | 4 |
| Better_Student | 11 | 2.000 | 1.265 | 1 | 4 |
| Job_possible | 11 | 2.364 | 1.120 | 1 | 4 |
| HighSchool | 11 | 2.273 | 1.191 | 1 | 4 |
| Improved_Understanding | 11 | 2.091 | 1.136 | 1 | 4 |
| Learn | 11 | 1.818 | 1.079 | 1 | 4 |
| Help_Future | 11 | 1.818 | 1.079 | 1 | 4 |
| Help_Know_Science | 11 | 1.909 | 1.044 | 1 | 4 |
| Science_Questions | 11 | 1.818 | 1.079 | 1 | 4 |
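The subsets above pick the post-only questions by position (89:103). A possible alternative, sketched here on a made-up data frame, is to locate the block by its first and last column names so the selection survives if columns are added or reordered. The `toy` frame is hypothetical; the two boundary names are taken from the summary table.

```r
# Toy frame with the same boundary column names as the post-only block.
toy <- data.frame(
  ID = 1:3,
  Gender.x = c(1, 2, 1),
  More_Interesting = c(2, 1, 3),
  Science_Fun = c(1, 2, 2),
  Science_Questions = c(2, 2, 1)
)

# Find the block by name instead of by hard-coded position.
first_col <- match("More_Interesting", names(toy))
last_col  <- match("Science_Questions", names(toy))

# which() drops rows where Gender.x is NA, as filter() does.
post_only_g1 <- toy[which(toy$Gender.x == 1), first_col:last_col]

names(post_only_g1)
nrow(post_only_g1)
```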
Questions for Future Inspection
*With unlimited time, I would have liked to explore the relationships within grades and/or race using a fixed-effect model or a two-way fixed effects model. I believe that there may be important systematic differences between these groups that may be lost when all ages and races are being blended together.
*I would also be interested to see how the program differentially affects students who identify themselves as enjoying or disliking science. While the program may not affect students who already like science, it could be disparately affecting students who do not. I’d like to explore correlations between post-performance and existing involvement in after-school programs. It would fascinate me to know whether that negatively or positively impacts their perspective on science.
*Lastly, I would like to look closely at students that are systematically not responding to questions in the survey by looking at both pre and post data. I’d like to investigate why they are not answering questions, whether there is a way to engage them more in the survey process, or if there is a common misunderstanding by all students of particular questions.
*The data as it stands could use bolstering, so I’d like to look closely at this sample to challenge whether the next data collection effort could be more holistic.
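As a rough idea of what the fixed-effect model mentioned in the first bullet could look like, here is a sketch on simulated data. Everything in it is an assumption for illustration: the `toy` frame, the column names, and the choice to enter grade as dummy variables in `lm()`. It compares the excitement slope with and without grade effects.

```r
set.seed(3)
# Simulated data: ten students in each of grades 4, 5, and 6.
toy <- data.frame(
  grade   = rep(4:6, each = 10),
  excited = sample(1:4, 30, replace = TRUE)
)
toy$hours <- sample(1:4, 30, replace = TRUE)

# Pooled model: one intercept for everyone.
pooled <- lm(hours ~ excited, data = toy)

# Grade fixed effects: a separate intercept per grade via factor().
grade_fe <- lm(hours ~ excited + factor(grade), data = toy)

coef(pooled)["excited"]    # slope ignoring grade
coef(grade_fe)["excited"]  # slope within grades
anova(pooled, grade_fe)    # do the grade intercepts improve the fit?
```

A two-way version would add a second factor, for example race, in the same way; with 31 students the cells would be very small.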
Visual
One interesting finding was how differently the program was received by boys and girls. On one hand, boys who were interested in science prior to the program had increased interest following the program. Yet girls who had more interest prior to the program were less interested after the program.
prepost_inner <- prepost_inner %>%
  mutate(gender_factor = as.factor(Gender.x))
scatter_plot <- prepost_inner %>%
  ggplot(aes(Curious_Science.x, Curious_Science.y, color = gender_factor)) +
  geom_jitter(size = 2, alpha = 1) + #jitter separates overlapping 1-4 responses
  xlab("Pre-Interest in Science") +
  ylab("Post-Interest in Science") +
  ggtitle("Interest in Science Pre-Program Has Opposite Effects on Boys and Girls") +
  scale_color_manual(values = c("#FF0000", "#00A08A", "#F2AD00", "#5BBCD6"),
                     guide = guide_legend(title = 'Gender')) +
  scale_x_continuous(breaks = 1:4) + #responses are on a 1-4 scale
  theme_minimal() + geom_smooth(se = FALSE)
scatter_plot
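The plot suggests slopes of opposite sign for the two genders. One way to put numbers on that, sketched here on simulated data, is a regression with a pre-by-gender interaction. This is an assumed follow-up rather than something the report ran, and the `toy` frame and its columns are invented.

```r
set.seed(4)
# Simulated data: 15 students per gender, 1-4 responses pre and post.
toy <- data.frame(
  gender = factor(rep(c(1, 2), each = 15)),
  pre    = sample(1:4, 30, replace = TRUE)
)
toy$post <- sample(1:4, 30, replace = TRUE)

# The interaction term lets the pre-to-post slope differ by gender.
fit <- lm(post ~ pre * gender, data = toy)
b <- coef(fit)

slope_gender1 <- b["pre"]                     # slope for the reference level
slope_gender2 <- b["pre"] + b["pre:gender2"]  # slope for the other level

# The same slope from a regression on the gender 2 subset alone.
subset_slope <- coef(lm(post ~ pre, data = subset(toy, gender == 2)))["pre"]

unname(slope_gender1)
unname(slope_gender2)
```

The coefficient on `pre:gender2` is the difference between the two slopes, and its standard error in `summary(fit)` indicates whether the apparent reversal is distinguishable from noise.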