Students had a wide variety of ideas for the data and areas their project would include.

par(mar=c(8.1,4.1,4.1,2.1))
# Number of students proposing each topic
items <- rep(c("exercise", "diet", "weight", "mood", "sleep", "mindfulness", "productivity"),
             times = c(9, 4, 6, 6, 6, 1, 3))
barplot(table(items),las=2, main="Student Project Topics", col="steelblue")

I need to clarify one thing, though. The idea of the project is that you will start an experiment on day 15. For the first 14 days, you’ll collect data from whatever source you choose. This establishes a baseline. Then, once you start your experiment (giving up coffee, meditating, eating a jalapeño every morning), you can compare your data from before and after and see whether the change was large.

plot(1, type="n", xlab="Days", ylab="", xlim=c(0, 29), ylim=c(0, 10), bty="n", axes=F)
axis(1, at=c(1, 15, 28))
abline(v=15, col="gray60", lty=3)
text(1, 2, "Start Data", cex=.75)
text(15, 2, "Start Experiment", cex=.75)
text(28, 2, "End Data", cex=.75)
text(7, 3, "Baseline Data", cex=.75)
text(21, 3, "Treatment Data", cex=.75)

Does the thing you’re interested in (your outcome) go up or down relative to the baseline data after you start your experiment? That’s the big question you want to answer.

It might go up consistently. If I decide to work an extra hour every day, I might see a consistent increase in my productivity. More work means more done.

# The title must be passed as main=; left unnamed, it is read as the y argument and triggers a coercion warning
plot(1, type="n", xlab="Days", ylab="", xlim=c(0, 29), ylim=c(0, 10), bty="n", axes=F, main="More Work, More Done")
axis(1, at=c(1, 15, 28))
abline(v=15, col="gray60", lty=3)
text(1, 2, "Start Data", cex=.75)
text(15, 2, "Start Experiment", cex=.75)
text(28, 2, "End Data", cex=.75)
text(7, 3, "Baseline Data", cex=.75)
text(21, 3, "Treatment Data", cex=.75)
segments(1, 5, 15, 5, col="red", lwd=2)
segments(15, 7, 28, 7, col="blue", lwd=2)

Or I might see an initial change, and then a return to normal. If I wake up an hour earlier to get my extra hour of work in, the first day might be great. Over time, though, I might miss that extra sleep, and even though I work longer I might not get more done. So there would be an initial bump, but a long-term return to the average.

plot(1, type="n", xlab="Days", ylab="", xlim=c(0, 29), ylim=c(0, 10), bty="n", axes=F)
axis(1, at=c(1, 15, 28))
abline(v=15, col="gray60", lty=3)
text(1, 2, "Start Data", cex=.75)
text(15, 2, "Start Experiment", cex=.75)
text(28, 2, "End Data", cex=.75)
text(7, 3, "Baseline Data", cex=.75)
text(21, 3, "Treatment Data", cex=.75)
segments(1, 5, 15, 5, col="red", lwd=2)
segments(15, 7, 28, 5, col="blue", lwd=2)

If I decide to learn something new, I might see steady improvements in my work. It’s possible that learning a new skill like R won’t have a huge initial impact, but over time I would become more and more efficient and productive working with data.

# The title must be passed as main=; left unnamed, it is read as the y argument and triggers a coercion warning
plot(1, type="n", xlab="Days", ylab="", xlim=c(0, 29), ylim=c(0, 10), bty="n", axes=F, main="Consistent Improvements")
axis(1, at=c(1, 15, 28))
abline(v=15, col="gray60", lty=3)
text(1, 2, "Start Data", cex=.75)
text(15, 2, "Start Experiment", cex=.75)
text(28, 2, "End Data", cex=.75)
text(7, 3, "Baseline Data", cex=.75)
text(21, 3, "Treatment Data", cex=.75)
segments(1, 5, 15, 5, col="red", lwd=2)
segments(15, 5, 28, 7, col="blue", lwd=2)
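If you want to put a number on a gradual change like this one, one option is to fit a straight line to the treatment days and read off its slope. This is only a rough sketch: the data are simulated, not real, and the names `d`, `expected`, `outcome`, and `slope` are placeholders made up for the illustration.

```r
# Hypothetical data: a flat baseline at 5, then a gradual rise to 7 by day 28.
set.seed(7)
d <- data.frame(day = 1:28)
d$expected <- ifelse(d$day < 15, 5, 5 + (d$day - 15) * 2 / 13)
d$outcome <- d$expected + rnorm(nrow(d), sd = 0.5)  # add day-to-day noise

# Fit a line to the treatment days only (day 15 onward).
# The slope estimates how much the outcome changes per day.
fit <- lm(outcome ~ day, data = d, subset = day >= 15)
slope <- unname(coef(fit)["day"])
print(slope)
```

A slope near zero would look more like a one-time jump (or no change at all), while a clearly positive slope suggests steady improvement.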

It’s also possible that learning something new might have real costs. I’m sure you’re feeling that if you think you can already do some of what I’ll teach you in Excel. There might be an initial learning curve and a decline in productivity, but given enough time you can more than make up for it with bigger gains.

# The title must be passed as main=; left unnamed, it is read as the y argument and triggers a coercion warning
plot(1, type="n", xlab="Days", ylab="", xlim=c(0, 29), ylim=c(0, 10), bty="n", axes=F, main="Consistent Improvements after decline")
axis(1, at=c(1, 15, 28))
abline(v=15, col="gray60", lty=3)
text(1, 2, "Start Data", cex=.75)
text(15, 2, "Start Experiment", cex=.75)
text(28, 2, "End Data", cex=.75)
text(7, 3, "Baseline Data", cex=.75)
text(21, 3, "Treatment Data", cex=.75)
segments(1, 5, 15, 5, col="red", lwd=2)
segments(15, 3, 28, 7, col="blue", lwd=2)

Those are just some of the shapes your data may show. Obviously, your data will be messier, but the big thing we want to capture is how our experiment changed what happened afterward.
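Here is a rough sketch of what the before-and-after comparison could look like with messy data. The numbers are simulated rather than real, and the names `days`, `period`, `outcome`, and `change` are placeholders made up for the illustration. It assumes the same setup as above: 14 baseline days, with the experiment starting on day 15.

```r
# Hypothetical data: 28 days of a made-up outcome score (not real data).
set.seed(42)
days <- 1:28
period <- ifelse(days < 15, "baseline", "treatment")
# Baseline centered at 5, treatment centered at 7, plus day-to-day noise.
outcome <- ifelse(period == "baseline", 5, 7) + rnorm(length(days), sd = 1)

# Average outcome in each period, and the difference between the two.
period_means <- tapply(outcome, period, mean)
change <- unname(period_means["treatment"] - period_means["baseline"])
print(period_means)
print(change)

# Plot the noisy daily values with each period's mean drawn over them.
plot(days, outcome, pch=16, col="gray40", xlab="Days", ylab="Outcome", bty="n")
abline(v=15, col="gray60", lty=3)
segments(1, period_means["baseline"], 14, period_means["baseline"], col="red", lwd=2)
segments(15, period_means["treatment"], 28, period_means["treatment"], col="blue", lwd=2)
```

The red and blue lines play the same role as in the sketches above, except that here they are computed from the daily values instead of drawn by hand.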

A more complicated design has less of a focus on the day the experiment began. If your experiment or input has short-term effects, you might be able to mix it in throughout the month. Say I want to meditate, and I think that’ll improve my mood, but it won’t improve my mood beyond that day. If I track both, I might see differences over time. That would require more precise data collection, though.

plot(1, type="n", xlab="Days", ylab="", xlim=c(0, 29), ylim=c(0, 10), bty="n", axes=F, main="A More Difficult Model")
axis(1, at=c(1, 15, 28))
abline(v=15, col="gray60", lty=3)
text(1, 2, "Start Data", cex=.75)
#text(15, 2, "Start Experiment", cex=.75)
text(28, 2, "End Data", cex=.75)
#text(7, 3, "Baseline Data", cex=.75)
#text(21, 3, "Treatment Data", cex=.75)
# segments() is vectorized, so each set of alternating bars can be drawn in one call
segments(seq(1, 27, by=2), 5, seq(2, 28, by=2), 5, col="red", lwd=2)   # days starting at 1, 3, ..., 27
segments(seq(2, 26, by=2), 7, seq(3, 27, by=2), 7, col="blue", lwd=2)  # days starting at 2, 4, ..., 26
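With a design like this, the comparison is between days with and without the input rather than before and after a start date. Below is a minimal sketch using simulated data. The `diary` data frame, its column names, and the choice to meditate on even-numbered days are all assumptions made up for the illustration.

```r
# Hypothetical data: made-up mood scores over 28 days (not real data).
set.seed(1)
days <- 1:28
meditated <- days %% 2 == 0   # assumption: meditate on even-numbered days
mood <- ifelse(meditated, 7, 5) + rnorm(length(days), sd = 1)

# One row per day: this is the "more precise data collection" the design needs.
diary <- data.frame(day = days, meditated = meditated, mood = mood)

# Mean mood on days without (FALSE) and with (TRUE) meditation.
mood_by_day_type <- aggregate(mood ~ meditated, data = diary, FUN = mean)
print(mood_by_day_type)
```

Notice that this only works if you record, for every single day, both the outcome and whether you did the thing you’re testing.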

All of this is central to program evaluation. If you’re evaluating what happens to school grades after the introduction of free lunches, you’re doing something similar. You’d want to collect data on the schools before the program starts, then look at the impact after the experiment/introduction of lunches. If we can understand these ideas and methods in micro, we can scale them up later as we add in more complex data.

Is this research?

Yes! What goes into any research project?

  1. Identification of a question
  2. Clear description of how that question will be answered
  3. Consistent data collection
  4. Analysis of results

We’re doing all of that in miniature. That keeps the project manageable, because you won’t even learn the methods you’ll use until there’s only one week left. You can still complete a really well-done project in this class, though, and the ideas underlying it can be scaled up in the future.