NOTE: This example paper demonstrates the following: 1. Key elements and structure of an analytic paper 2. Code for basic ggplot visualizations, including some basic formats/aesthetics
This example does NOT demonstrate how to build a focused analysis on a given subject or unit of analysis. Remember that for this class it is best to choose one focus (oftentimes a continuous/numeric measure works best) and then explore it in myriad ways through the paper.
Also, your job in this class is not only to learn the basics of ggplot and other visualization techniques, but also to expand on them, using your creativity and resources like StackExchange, to make more refined and sophisticated visualizations.
Cobra University’s Big Rabbits program, in collaboration with its partnership school districts, was awarded a grant to develop a comprehensive, standards-aligned, fully field-based residency program to prepare prospective Hyenas for Pennsylvania’s high-need zoos. The Big Rabbits program sought to improve self-efficacy using three major components. These components are:
Big Rabbits aims to establish a new paradigm for selecting Hyenas who have an increased probability of improving static and/or declining hunting achievement, and to encourage the development of additional innovative field-based approaches that are high quality and affordable for districts and zoos alike.
The evaluation uses survey data to track the cohort of prospective Hyenas during their time in the project. So far, the Big Rabbits cohort has completed three surveys since program inception that track program hyenas’ self-reported hunting self-efficacy as a future Hyena, their cultural sensitivity and their perceived future challenges as Hyenas. The first two of these measures are peer-reviewed scales from Fullan (2015) and Kostelnik (2009), two real-world researchers who have nothing to do with hyenas or zoos. The goal of the evaluation is to examine growth in Hyenas’ self-efficacy, cultural awareness, and competency in six focus areas. This report focuses on efficacy, culture and challenges candidates foresee.
The self-efficacy scale is an 18-item question bank using a 1 to 9 rating scale anchored at “none at all” and “a great deal,” respectively. Respondents rate themselves on various tasks, such as whether they can “handle the time demands of the job” of being a Hyena or “create a positive learning environment.”
The cultural sensitivity scale is a 21-item question bank using a 1 to 6 rating scale anchored at “Strongly Disagree” to “Strongly Agree” respectively. In this item bank, respondents self-rate their agreement with statements such as “I know the cultural values and religious beliefs of other cultures,” or “I enjoy living in cultures that are unfamiliar to me.”
Finally, prospective Hyenas self-rated the challenges they foresee if and when they assume a Hyena position. Data for this item-bank are scaled from 1 to 7 with the anchors being “Not at all challenging” to “A great challenge,” respectively.
Each of the aforementioned scales was computed as a composite variable, so that a single average could serve as a respondent’s “score” on each measure. The descriptive statistics of these composite variables are shown in the Results section below.
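To make the composite scoring concrete, here is a minimal sketch using toy data. The item names (`q1` to `q4`) and the responses are hypothetical and do not come from the survey; the sketch only illustrates averaging a respondent's answered items into one score.

```r
# Toy data: three hypothetical respondents answering four hypothetical items
# on a 1 to 9 scale. NA marks an item the respondent skipped.
toy <- data.frame(
  q1 = c(7, 5, 9),
  q2 = c(8, NA, 9),
  q3 = c(6, 4, 8),
  q4 = c(7, 6, NA)
)

# Composite score: the mean of each respondent's items.
# na.rm = TRUE averages over the items that were actually answered.
toy$composite <- rowMeans(toy[, c("q1", "q2", "q3", "q4")], na.rm = TRUE)

print(toy)
```

Note that with `na.rm = TRUE`, a respondent who skips items is scored on fewer items, which is one design choice among several for handling missing responses.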
This report focuses on the three aforementioned outcomes that are known correlates of successful Hyenas. These three outcomes were measured in January 2017 (Baseline), May 2017, October 2017 and June 2018.
Data were collected from several school districts. The number of survey respondents from each district by time of measurement is shown below. Several respondents in June 2018 did not have district identifiers and did not participate in earlier surveys, thus their count by district is missing.
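library(knitr); library(kableExtra); library(dplyr)  # kable(), kable_styling(), %>%
# Count respondents by district and survey administration time
qw <- table(fake$district, fake$TimeFac)
qw %>% kable("html") %>% kable_styling(bootstrap_options = "striped", full_width = FALSE)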
District | Baseline | June 2017 | October 2017 | June 2018 |
---|---|---|---|---|
Giraffe Coolness SD | 8 | 8 | 8 | 6 |
Lizards Rule SD | 4 | 4 | 3 | 4 |
Waterstreet SD | 2 | 4 | 3 | 5 |
Willow SD | 9 | 8 | 7 | 7 |
# Keep only the baseline responses
temp <- fake[fake$Time == "Baseline", ]
At baseline, the average Challmean value was 4.36 across all districts. Details for each district are shown below.
library(expss) # see documentation: https://cran.r-project.org/web/packages/expss/vignettes/tables-with-labels.html#cross-tablulation-examples
temp %>% tab_cells(Challmean)%>%
tab_cols(district)%>%
tab_stat_mean_sd_n()%>%
tab_pivot()
Challmean | Giraffe Coolness SD | Lizards Rule SD | Waterstreet SD | Willow SD |
---|---|---|---|---|
Mean | 4.9 | 3.4 | 4.3 | 4.4 |
Std. dev. | 1.9 | 1.4 | 1.8 | 1.2 |
Unw. valid N | 8.0 | 4.0 | 2.0 | 9.0 |
It is essential that some measure of basic program components is completed so that stakeholders can assess whether the activities they planned (i.e., their “program”) were actually implemented. To this end, the October 2017 survey collected information about how often candidates met with their hyenas, collaborated with them, and whether they felt comfortable with their hyenas’ communication and reflective practices, among other measures. The figures below provide a count of how often candidates reported these activities, with data from this Fall and the previous summer.
There was an increase in hyena-rabbit interactions between October 2017 and June 2018. Importantly, the reference period varied depending on when the question was asked: in October 2017 respondents were given a 60-day window to reference, while in June 2018 they were given a 150-day window. The bar chart in Figure 1 accounts for the differing durations by dividing each count by the number of days in its reference window.
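The adjustment matters because raw counts from windows of different lengths are not directly comparable. The following sketch uses made-up counts (not the survey data) to show how a per-day rate can tell a different story than the raw totals; the column names are illustrative.

```r
# Hypothetical totals of reported meetings in each reference window (not real data)
counts <- data.frame(
  wave = c("October 2017", "June 2018"),
  met  = c(30, 60),    # raw number of meetings reported
  days = c(60, 150)    # length of the reference window in days
)

# Per-day rate: raw count divided by the window length
counts$rate_per_day <- counts$met / counts$days

print(counts)
```

In this toy case the raw count doubles from one wave to the next, yet the per-day rate falls from 0.5 to 0.4, which is why comparing unadjusted counts across the two surveys would be misleading.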
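library(dplyr)
library(reshape2)  # provides melt()
# Total reported interactions of each type per survey administration.
# Note: temp is assumed here to hold the October 2017 and June 2018 responses.
t <- temp %>% group_by(Time) %>%
  summarise(Met = sum(f1), SubDis = sum(f2), Share = sum(f3), Reach = sum(f4))
# Reference window in days: 150 for June 2018, 60 for October 2017
t$days <- ifelse(t$Time == "June_2018", 150, 60)
# Convert totals to per-day rates so the two windows are comparable
t2 <- t %>% group_by(Time) %>%
  summarise(Met = Met/days, SubDis = SubDis/days, Share = Share/days, Reach = Reach/days)
# Reshape to long format: one row per survey time and interaction type
t3 <- melt(t2, id.vars = c("Time"), measure.vars = c("Met", "SubDis", "Share", "Reach"))
t3$Time <- factor(t3$Time,
  levels = c("Oct_2017", "June_2018"), labels = c("October 2017", "June 2018"))
t3$variable <- factor(t3$variable,
  levels = c("Met", "SubDis", "Share", "Reach"),
  labels = c("FTF Meetings", "Substantial Discussion", "Share Resources", "Reached Out to Hyena"))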
library(ggplot2)
library(Hmisc)  # provides label()
label(t3$variable) <- "Type of Interaction"
# t3 already holds one value per bar, so plot the values as they are (stat = "identity")
ggplot(t3, aes(x = Time, y = value, fill = variable)) +
  geom_bar(position = "dodge", stat = "identity") +
  ggtitle("Figure 1: Frequency of Hyena Interaction") +
  xlab("Time of Measurement") + ylab("Frequency (weighted)") +
  scale_fill_manual(values = c("#999999", "#E69F00", "#56B4E9", "#a64dff")) +
  theme_bw() +
  guides(fill = guide_legend(title = "Type of Interaction"))
This section reports on the average level of self-efficacy, cultural awareness and perceived challenges, as reported by candidates in the Big Rabbits program. The tables shown below provide basic descriptive statistics for each of these outcomes, respectively. Candidates’ reported Hyena self-efficacy was on average 6.93 (on a scale from 1 to 9), the mean cultural awareness score was 4.69 (on a scale of 1 to 6), and the mean rating of how challenging various Hyena tasks appear to candidates was 5.43 (on a scale of 1 to 6). In relative terms, these scores
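# Composite scores: each respondent's mean across the items of a scale.
# Items are selected by column position; na.rm = TRUE skips unanswered items.
fake$PEFFmean <- rowMeans(subset(fake, select = c(15:32)), na.rm = TRUE)     # self-efficacy items
fake$Challmean <- rowMeans(subset(fake, select = c(34:39)), na.rm = TRUE)    # perceived-challenge items
fake$Culturemean <- rowMeans(subset(fake, select = c(42:61)), na.rm = TRUE)  # cultural sensitivity items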
The data show small increases in prospective Hyena self-efficacy over the three time points. Figure 2a shows that between baseline and June 2018 there are small fluctuations in the composite measure of Hyena self-efficacy. This pattern is most apparent in ….. When change in self-efficacy is examined with respect to age range of the respondent, we see a similar pattern; respondents who were youngest and oldest appeared to have larger increases between January and October 2017. Statistical tests of the differences will be provided in forthcoming reports.
fake$district <- as.factor(fake$district)
# Bar height is the mean of PEFFmean within each survey administration.
# fun replaces the deprecated fun.y argument (ggplot2 3.3.0 and later).
g2 <- ggplot(fake, aes(x = TimeFac, y = PEFFmean, fill = TimeFac)) +
  geom_bar(stat = "summary", fun = mean, position = position_dodge()) +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  ylab("Mean Efficacy Score") + xlab("Survey Administration Time") +
  #geom_text(stat='identity',aes(label=nlevels(Q45_AgeRange)),vjust=-1) +
  scale_fill_manual(values = c("#999999", "#E69F00", "#56B4E9", "#a64dff"))
# ylim() sets the y scale; an earlier scale_y_continuous() would simply be replaced by it
g2 + ggtitle("Figure 2a: Efficacy Levels by Time") + ylim(0, 12) +
  guides(fill = guide_legend(title = "Survey Admin. Time"))
When we disaggregate the efficacy analysis by district (Figure 2b), we see that results are generally similar. However, one district has lower self-efficacy reports in October 2017 and June 2018.
# Facet with a formula (~district) so the variable is looked up in the plot data
g2 + facet_wrap(~district, scales = "free_x") +
  ggtitle("Figure 2b: Efficacy Levels by Time and District") + ylim(0, 12) +
  guides(fill = guide_legend(title = "Survey Admin. Time"))
Participants’ self-ratings of their sensitivity to different cultural values, backgrounds and the like grew consistently between baseline and June 2018.
# Mean cultural sensitivity composite at each survey administration
g3 <- ggplot(fake, aes(x = TimeFac, y = Culturemean, fill = TimeFac)) +
  geom_bar(stat = "summary", fun = mean, position = position_dodge()) +
  ylab("Mean Cultural Sensitivity Score") + xlab("Survey Administration Time") +
  #geom_text(stat='identity',aes(label=nlevels(Q45_AgeRange)),vjust=-1) +
  theme_bw() +
  scale_fill_manual(values = c("#999999", "#E69F00", "#56B4E9", "#a64dff")) +
  ggtitle("Figure 3") + ylim(0, 7)
# theme_dark() replaces theme_bw(); the theme() call afterwards tweaks the axis text
g3 + guides(fill = guide_legend(title = "Survey Admin. Time")) +
  theme_dark() + theme(axis.text.x = element_text(angle = 60, vjust = -.1))
# Mean perceived-challenge composite at each survey administration
g4 <- ggplot(fake, aes(x = TimeFac, y = Challmean, fill = TimeFac)) +
  geom_bar(stat = "summary", fun = mean, position = position_dodge(), na.rm = TRUE) +
  theme_bw() +
  ylab("Mean Challenge Score") + xlab("Survey Administration Time") +
  #geom_text(stat='identity',aes(label=nlevels(Q45_AgeRange)),vjust=-1) +
  scale_fill_manual(values = c("#999999", "#E69F00", "#56B4E9", "#a64dff")) +
  ggtitle("Figure 4") + ylim(0, 10) +
  guides(fill = guide_legend(title = "Survey Admin. Time"))
g4
The next set of analyses examines whether candidates’ self-efficacy is related to their perspectives on culture and the challenges they will later face as Hyenas. Figures 5 and 6 below show scatter plots of the average efficacy score on the vertical axis and the corresponding covariate (either cultural awareness or perceived challenges) on the horizontal axis. The plots are made for each data collection completed so far.
Efficacy and Culture. Figure 5 shows that the relationship between the composite culture score and self-efficacy is typically small (blue regression lines), although the June 2017 survey shows a greater correlation. On average, the correlation between these two measures was 0.13.
Efficacy and Challenge. Somewhat larger relationships between self-efficacy and perceived future challenges (Figure 6) are apparent for data collected earlier in the program. The most recent data (from October 2017) show a smaller relationship. The October 2017 data also show a relationship that is intuitive: candidates who perceive more challenges ahead tend to report lower self-efficacy about being a Hyena.
Future analyses will examine whether these three variables interact with one another, as well as how they relate to other data on candidate performance in coursework. ….etc….
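As a rough illustration of how a per-survey correlation and its average across surveys might be computed, here is a minimal base R sketch. The data are simulated and the column names (`wave`, `efficacy`, `culture`) are hypothetical; this is not necessarily how the correlation reported above was obtained.

```r
# Simulated scores for two hypothetical survey waves (not the survey data)
set.seed(1)
toy <- data.frame(
  wave     = rep(c("Baseline", "October 2017"), each = 20),
  efficacy = runif(40, min = 1, max = 9),
  culture  = runif(40, min = 1, max = 6)
)

# Pearson correlation between the two composites within each wave
r_by_wave <- sapply(split(toy, toy$wave), function(d) {
  cor(d$efficacy, d$culture, use = "complete.obs")
})

# Simple average of the per-wave correlations (a rough summary only)
mean_r <- mean(r_by_wave)

print(r_by_wave)
print(mean_r)
```

Averaging correlation coefficients directly is a coarse summary; it ignores differences in sample size between waves, so it is best read alongside the per-wave values.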
library(ggpmisc)
# Replace missing values in columns 33 and 62 with that column's mean (simple mean imputation).
# The original loops ran once per column of fake but always touched the same column.
for (j in c(33, 62)) {
  fake[[j]][is.na(fake[[j]])] <- mean(fake[[j]], na.rm = TRUE)
}
# Formula for the fitted line; y and x refer to the aesthetics mapped in aes()
my.formula <- y ~ x
# Map columns by bare name inside aes(); using fake$... bypasses the plot data and breaks faceting
ggplot(fake, aes(y = PEFFmean, x = Culturemean)) + geom_point() +
  geom_smooth(method = "lm", formula = my.formula, se = FALSE) + facet_grid(. ~ TimeFac) +
  ggtitle("Figure 5: Self-efficacy and Perceived Culture Awareness") +
  xlab("Average Perceived Culture Score") + ylab("Average Hyena Self-Efficacy Score") +
  theme_bw()
#ggplotly(s2)
ggplot(fake, aes(y = PEFFmean, x = Challmean)) + geom_point() +
  geom_smooth(method = "lm", formula = my.formula, se = FALSE) + facet_grid(. ~ TimeFac) +
  ggtitle("Figure 6: Self-efficacy and Perceived Challenges") +
  xlab("Average Perceived Challenge Score") + ylab("Average Hyena Self-Efficacy Score") +
  theme_bw()
#ggplotly(s3)
The Big Rabbits evaluation also used a measure of octopus collaboration as an outcome, collecting these data for the first time in Fall 2017. The table below shows results from regression analyses that …
Trends in Hyena self-efficacy and cultural awareness show small gains between program inception and Fall 2017. Combined with other data from Big Rabbits these results lend support to how the program may be associated with….
Limitations of these data include that the measures are self-rated and prospective. Direct observational studies that collect information from the field may be more valid, although these were beyond the budget and scope of the present project.
Fullan, M. & Quinn, J. (2015). Coherence: The Right Drivers in Action for Schools, Districts, and Systems. 1st edition. Thousand Oaks, California: Corwin.
Kostelnik, M. J. & Grady, M. L. (2009). Getting It Right from the Start: The Principal’s Guide to Early Childhood Education. Thousand Oaks, CA: Corwin Press and NAESP.