It’s your first year at Swarthmore. The classes turned out to be hard. For the first time in your life, you have to do homework. A lot of it. But what about all the super exciting extracurricular activities, like doing debate and playing rugby? There are so many great people around, too. Spending time with them is also a must. You are overwhelmed.

Your RA, a senior, has this peculiar look in her eyes. The look of experience. She looks like she has been through a lot and, ultimately, risen from the ashes. Just like a real Swarthmore Phoenix. The key to success, she says, is this: “You need to have at least 8 hours of sleep every night. Otherwise, you’ll feel more and more miserable.” You sleep six these days, and it does feel worse and worse. But you are a stubborn, intellectually free spirit. So you don’t believe her. Your bullet-proof conviction is that sleep is for losers.

The RA responds: “Look, what if I told you that if you start sleeping 8 hours a night, you will feel much happier? Like, on a scale from one to ten, how do you feel now?” “Hm… around 5, to be honest,” you say. “I think that if you start getting a full night of sleep, you’ll feel a solid 7 or even 8. Just go and ask the students who’ve been there, and they’ll tell you it’s the golden rule.”

You, the young free spirit, don’t believe her but decide to conduct an experiment nevertheless. Your assumption going in is that the difference between 6 and 8 hours of sleep is really not that huge. So, numerically, bumping it up to 8 hours will not raise your day-to-day happiness score at all. In statistics-speak, this starting assumption is your null hypothesis.

And you know there is a chance you’ll be proven wrong. If that happens, you will have to ditch your initial assumption and start going to bed earlier. But what will the criterion be? Well, if, on average, people at large consistently report an improvement in well-being, then you think you should reconsider your assumption.

With that in mind, imagine sleep studies are being conducted on college campuses around the country. So you decide to ask the people running them to survey college students who have been through exactly this scenario: at some point in their college career they regularly slept less than 8 hours a night, but nowadays they have bumped their sleep up to the suggested amount. You want to know how they felt before and what’s up with their happiness levels now. You make the right calls, and people start sending you random samples of such students.

There is this rather trippy phenomenon in statistics: if you randomly draw a bunch of samples of the same variable and look at the average of each one, those averages together will resemble the bell-shaped curve. Like, the majority of the sample averages will fall in the middle, and the less likely ones will land on the two sides of the bulk in the center. So it goes. The sleep samples are being collected and sent to you. The bell-shaped thing is starting to emerge. Now, remember the initial assumption: better sleep does not lead to a higher level of subjective well-being. If that assumption is correct, the mean change in happiness is supposed to be zero. As always, some crazy and lucky people would feel better, and some worse. Overall, however, most of the samples will fall in the center, around zero.
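If you want to watch the bell-shaped hill emerge yourself, here is a minimal Python sketch. Everything in it is made up for illustration: the 30 students per sample, the 2000 samples, and the idea that under the initial assumption each student's change in happiness is pure noise between -5 and +5. Notice that the individual answers are flat, not bell-shaped, and yet the averages pile up around zero.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch gives the same picture every run

# Hypothetical setup (none of these numbers come from the story):
STUDENTS_PER_SAMPLE = 30   # students answering in each random sample
NUM_SAMPLES = 2000         # how many random samples get sent to you

def one_sample_average():
    """Average change in happiness for one random group of students.

    Under the initial assumption each student's change is pure noise:
    any whole number from -5 to +5, all equally likely, centered on 0.
    """
    changes = [random.randint(-5, 5) for _ in range(STUDENTS_PER_SAMPLE)]
    return statistics.mean(changes)

sample_averages = [one_sample_average() for _ in range(NUM_SAMPLES)]

# Crude text histogram: round each average to the nearest half point and count.
counts = {}
for average in sample_averages:
    bucket = round(average * 2) / 2
    counts[bucket] = counts.get(bucket, 0) + 1

for bucket in sorted(counts):
    print(f"{bucket:+5.1f} | {'#' * (counts[bucket] // 20)}")

print("mean of all sample averages:", round(statistics.mean(sample_averages), 2))
```

The rows of `#` should be longest near 0.0 and thin out toward both ends, which is the little hill the story keeps talking about.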

But here’s what’s happened. As your samples drip bit by bit into a coherent picture, you start to realize something. It seems like the average increase in subjective well-being that they report is not zero, as you thought early on, but a whole two. Two points! Some people felt like a 5, went to bed at 11 for the next couple of weeks, and now they are at a full 7. What a life! And it looks like you might be wrong! Yet, how do we settle it exactly? What is the best way to do it?

Here is where our P-value comes in. We got all of this data. It looks like the average bump is two points. But maybe it was just luck or something, so let’s ask the following question:

Given that the initial mean is 0, how likely or unlikely would it be to get a result of +2 or more on the happiness scale?
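One way to put a number on that question is a simple z-test style calculation, sketched below in Python. The 0 and the +2 come from the story. The spread of individual answers (3 points) and the number of students (30) are pure assumptions, since the story gives neither, so treat the result as an illustration and not as a finding.

```python
import math
from statistics import NormalDist

# From the story: the initial assumption says the average change is 0,
# and the samples show an average change of +2.
NULL_MEAN = 0.0
OBSERVED_MEAN = 2.0

# Assumptions for illustration only (the story gives neither number):
SPREAD = 3.0        # hypothetical standard deviation of individual changes
NUM_STUDENTS = 30   # hypothetical number of students behind the average

# How much an average of NUM_STUDENTS answers typically wobbles by chance.
standard_error = SPREAD / math.sqrt(NUM_STUDENTS)

# How many "wobbles" away from the assumed mean the observed average sits.
z_score = (OBSERVED_MEAN - NULL_MEAN) / standard_error

# One-sided probability: the chance of an average at least this high
# if the initial assumption (mean of 0) were true.
p_value = 1.0 - NormalDist().cdf(z_score)

print(f"z = {z_score:.2f}, probability = {p_value:.5f}")
```

With these made-up inputs the probability comes out far below one in a thousand. A bigger spread or fewer students would make the same +2 much less surprising.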

We have our bell-shaped little hill of stacked data. What we can do is calculate how likely such a result would be if the null hypothesis were true. That likelihood is the P-value. But to have a final say, we need some sort of cutoff, because, honestly, anything can happen. There could be a random sample from a med school in Oregon, or something, where the kids got 5 points happier. When people say “a P-value of 0.05,” that cutoff is what they mean!

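The cutoff is a sort of threshold, which means something like this: if a result like ours would show up by pure chance less often than the cutoff allows (say, less than 5% of the time), we stop blaming luck and reject the null hypothesis.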

It’s important to mention that it’s up to you to set that cutoff. The cutoff (strictly speaking, the significance level that the P-value gets compared against) is 0.05 in most cases, but sometimes it is 0.1, or 0.01 or lower. The lower the cutoff, the stricter you’ll get in terms of deciding what is enough to reject your null hypothesis.
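To see how much the choice of cutoff matters, here is a tiny Python sketch. The function name `verdict` and the probability of 0.03 are both hypothetical, picked only so that the answer flips between cutoffs.

```python
def verdict(p_value, cutoff):
    """Reject the null hypothesis only when the probability falls below the cutoff."""
    return "reject the null" if p_value < cutoff else "keep the null"

# Hypothetical probability of seeing our result by luck alone,
# chosen only to show how the cutoff changes the decision.
p_value = 0.03

# A smaller cutoff demands stronger evidence before rejecting.
for cutoff in (0.1, 0.05, 0.01):
    print(f"cutoff {cutoff}: {verdict(p_value, cutoff)}")
```

The same data gets rejected at 0.1 and 0.05 but survives at 0.01, which is why the cutoff should be picked before looking at the results.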

That’s pretty much it. With 0.05 as your threshold, you get your sleep data and realize that getting +2 out of so many random samples is just too unlikely to be a coincidence. In this case, being an enlightened-in-the-making Swattie, you reject your null hypothesis, along with all the night parties in the next few days, and decide to go to bed at 10:30 pm. Congratulations, you are now on the right track to self-discovery and an intuitive understanding of statistics!