My research question is whether Emory students from different family income levels differ in how much risk they are willing to take. To answer it, I will use the estimated combined family income of working adults to divide students into income groups.
To analyze this question, I look at several additional survey questions: how strictly students practice social distancing, how fast they drive, what type of career they see themselves going into, and whether they consider themselves risk averse or risk lovers.
How strictly students follow social distancing guidelines shows how much they are willing to risk their own health and the health of those around them, and how knowledgeable they are about the effects of their actions. The speed at which students drive shows how much risk they are willing to take in terms of potential monetary fines. The type of career students choose could show whether money is a factor in their decision and whether students differ in choosing a safe option over something they are passionate about that may not pay as well. Lastly, whether students consider themselves risk averse or risk lovers answers the research question directly, but I am curious to see if this answer matches their responses to the other questions that give insight into how much risk they are willing to take.
As someone who is blessed to come from a family with a higher income level, I am curious to see whether it influences my daily behavior and the decisions I make about my future career. I do not worry much about my future career earnings; however, I do care about my future, and I want to make a positive impact on society regardless of the field I go into. I personally follow COVID protocol very strictly because I consider myself a compassionate and empathetic person, and I would never want to be the reason someone is negatively affected. I think remaining healthy and living a healthy lifestyle is really important, a view that might be influenced by my parents spending money throughout my childhood to make sure that I ate healthy meals and got plenty of exercise on sports teams. I also believe my parents have educated me well on the seriousness of COVID and other illnesses, which has helped me understand how important it is to keep myself and those around me healthy and safe. Overall, however, I am not sure whether to expect students from higher or lower family income levels to be more open to not following COVID protocol. On one hand, students from higher family income levels may not be as concerned about paying hospital or medical fees if they were to get COVID, so they might engage in riskier behavior. On the other hand, students from lower family income levels might consciously engage in riskier behavior if it means earning money for their family or making a living, even if they are exposing themselves to potential harm.
Null Hypothesis: The proportion of Emory students who consider themselves to be risk-averse, in the middle, or risk loving is equal across all students from all different family income levels (the estimated combined family income of working adults).
Alternative Hypothesis: Emory students from lower family income levels, which will be defined by an estimated combined family income of working adults of $100K or less for purposes of the research question, consider themselves to be more risk-averse.
Null Hypothesis: The proportion of Emory students who go into each job sector is equal across all students from all family income levels (the estimated combined family income of working adults).
Alternative Hypothesis: Emory students from lower family income levels, which will be defined by an estimated combined family income of working adults of $100K or less for purposes of the research question, go into the fields of medicine and education more and the field of business less.
Null Hypothesis: The proportion of Emory students who speed and do not speed while driving is equal across all students from all family income levels (the estimated combined family income of working adults).
Alternative Hypothesis: Emory students from higher family income levels, which will be defined by an estimated combined family income of working adults of greater than $100K for purposes of the research question, drive over the speed limit more.
According to the Joseph Rowntree Foundation (JRF), an organization that works to tackle poverty within the United Kingdom, multiple studies have shown that there might be a positive relationship between socioeconomic status and risk-taking behavior. In other words, these studies have shown that the lower people’s socioeconomic status is, the less risk they are likely to take (or the more risk-averse they are).
One of the studies that JRF mentions took place in Turkey and used data from a college entrance exam. The study found a relationship between income and career choice. Regardless of the age and gender of students, the lower their income, the more likely they were to choose safer career options such as jobs within the health or education sector as opposed to less safe careers like business.
The study states that within Turkey, people entering the health or education sector are least likely to lose their jobs and have the most job security. It also states that the income of students’ parents and whether a student’s father is self-employed play a large role in determining what a college student will major in. Students who come from higher income backgrounds and/or have self-employed fathers are more likely to pick a field and major that may not have the most job security or earn as much, such as business, whereas students who come from lower income backgrounds and/or do not have self-employed fathers are more likely to pick a major that is known to have good job security, such as health or education.
Here in this table, the unemployment rates for different occupations are shown.
As you can see, teachers (the education sector) and medical professionals (the health sector) have lower unemployment rates than accountants and managers of retail or wholesale (the business sector).
Here in this table, students’ choices of majors are shown based on family income, the employment status of their fathers, and more.
As you can see, students who choose majors within business have a much higher mean income, and a much higher proportion of them have self-employed fathers, compared to students who choose to major in education or health.
Another study in Ghana on the factors that play into risky driving showed that people of higher socioeconomic status tend to engage in risky behaviors more while driving.
Here in this table, the mean risky driving scores by socioeconomic status are shown alongside many other factors that contribute to risky driving.
As you can see, the mean for risky driving rises with the socioeconomic status of the driver.
Another study in Germany measured the effects of socioeconomic status on the amount of risk a person is willing to take.
Here in this table, the probability of high risk and no risk is shown separated by household income and other factors.
As you can see, the probability of no risk is higher for people from lower household incomes, and the probability of high risk is higher for people from higher household incomes.
Lastly, according to research done at the University of Bergen, people from a higher socioeconomic status are more likely to follow health recommendations than people from a lower socioeconomic status. These health recommendations are for general healthy living tips, but I am curious to see whether this would carry over to COVID safety and regulations as well.
Links to External Data:
https://www.tandfonline.com/doi/full/10.1080/23311908.2017.1376424?scroll=top&needAccess=true
The data that I will be using to answer my research question comes from a survey conducted by Dr. Paloma Lopez de mesa Moyano for her Spring 2021 Economics 220 class at Emory University. The sample population consists of Emory undergraduate students associated with the Economics department in some way (major, minor, concentration, etc). I believe that this data will be mostly adequate to answer my research question because the external data that I have used as motivation for my research question also studied college aged students. However, there might be some potential bias since the sample may be more similar than we would want in that all of the students attend Emory and are associated with the Economics department in some way.
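# Loading the packages used below: dplyr (%>%, select, filter), forcats (fct_rev), ggplot2 and plotly (plots), knitr and kableExtra (tables)
library(dplyr)
library(forcats)
library(ggplot2)
library(plotly)
library(knitr)
library(kableExtra)
# Loading the dataset
load("/Users/jaiarora/Downloads/Econ220DataS21_ano.Rdata")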
# Taking the dataset, selecting the variables I will be using for my research question, and storing it into an object called 'myresearchdata'
myresearchdata <- Econ220DataS21_ano %>%
select(Income, q30, q54, q55, q58) %>%
na.omit()
myresearchdata$Income
# Explaining what the variables I selected represent
# q30 = how strict students are with social distancing on a scale of 1-5
# q54 = future career/occupation interest
# q55 = whether students speed or follow the speed limit and to what extent
# q58 = whether students are risk averse, in the middle, or risk lovers
# Cleaning the variable 'Income' by turning the categories to numerical values
myresearchdata$Income[myresearchdata$Income=='Under 50,000']<-1
myresearchdata$Income[myresearchdata$Income=='$50,001 - $100,000']<-1
myresearchdata$Income[myresearchdata$Income=='$100,001 - $200,000']<-2
myresearchdata$Income[myresearchdata$Income=='$200,001 - $400,000']<-2
myresearchdata$Income[myresearchdata$Income=='$400,001 - $600,000']<-2
myresearchdata$Income[myresearchdata$Income=='$600,000+']<-2
# Continuing to clean the variable 'Income' by turning the numerical values into factors and setting the levels.
myresearchdata$Income<- factor(as.numeric(myresearchdata$Income), labels=c("$0K-100K",">$100K"))
# Turning the 'Income' variable into a dataframe
data.frame(table(myresearchdata$Income))
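The two-step recoding above (labels to numbers, then numbers to a factor) can also be done in one pass. The following is only a minimal sketch in base R: the vector `income_raw` is made-up illustration data that reuses the category labels from the cleaning step, and `low_labels`, `income_label`, and `income_group` are hypothetical names that are not part of the survey dataset.

```r
# Hypothetical responses; the labels mirror the survey categories used above.
income_raw <- c("Under 50,000", "$50,001 - $100,000", "$100,001 - $200,000",
                "$200,001 - $400,000", "$600,000+", NA)

# Labels that belong to the lower income group (combined family income <= $100K).
low_labels <- c("Under 50,000", "$50,001 - $100,000")

# Assign each response to one of the two groups.
income_label <- ifelse(income_raw %in% low_labels, "$0K-100K", ">$100K")

# %in% returns FALSE for NA, so missing responses are reset to NA explicitly
# instead of being silently counted in the higher income group.
income_label[is.na(income_raw)] <- NA

# Store the result as a factor with the lower income group as the first level.
income_group <- factor(income_label, levels = c("$0K-100K", ">$100K"))

table(income_group, useNA = "ifany")
```

Listing the lower-income labels in one place makes the $100K cutoff easy to see and to change, which matters because the hypotheses depend on that definition.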
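# Checking the variable q30 for 'NA' values. Rows with NA were already dropped by na.omit() when 'myresearchdata' was created; the na.omit() call below only prints the non-missing values.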
myresearchdata$q30
na.omit(myresearchdata$q30)
# Changing the observation "1" to "1 (Least Strict)" and "5" to "5 (Most Strict)" to be more clear and descriptive as to what the numbers mean.
myresearchdata$q30[myresearchdata$q30=='1']<- "1 (Least Strict)"
myresearchdata$q30[myresearchdata$q30=='5']<- "5 (Most Strict)"
# Turning the 'q30' variable into a dataframe
data.frame(table(myresearchdata$q30))
# Turning the "" observation within the variable 'q54' to NA. The na.omit() call below only prints the non-missing values; it does not change 'myresearchdata'.
myresearchdata$q54[myresearchdata$q54==""]<-NA
myresearchdata$q54
na.omit(myresearchdata$q54)
# combined the two law observations together
myresearchdata$q54[myresearchdata$q54=="Law – Private"]<-"Law"
myresearchdata$q54[myresearchdata$q54=="Law – Public"]<-"Law"
# Turning the 'q54' variable into a dataframe.
data.frame(table(myresearchdata$q54))
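One detail of the cleaning steps is worth illustrating: calling `na.omit()` on a column without assigning the result prints the non-missing values but leaves the data unchanged, and `table()` leaves out `NA` values by default. The sketch below uses a made-up vector named `career`, which is a hypothetical stand-in for a survey column and not the real data.

```r
# Hypothetical career responses, including an empty string and a missing value.
career <- c("Education", "", "Law", NA)

# Recode empty strings as NA; %in% avoids problems with existing NA values.
career[career %in% ""] <- NA

# na.omit() returns a new vector; 'career' itself is not modified.
dropped <- na.omit(career)
length(career)   # still counts the NA entries
length(dropped)  # counts only the non-missing entries

# table() skips NA by default, so frequency tables already exclude them.
table(career)

# Ask for NA explicitly to see how many responses are missing.
table(career, useNA = "ifany")
```

This is why the frequency tables in this section exclude missing responses even though the rows themselves stay in the dataset.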
# Turning the "" observation within the variable 'q55' to NA. Also, combining the "always UNDER the speed limit" observations with the "always under or at the speed limit" observations since the former is included in the latter. The na.omit() call below only prints the non-missing values; it does not change 'myresearchdata'.
myresearchdata$q55
myresearchdata$q55[myresearchdata$q55=="always UNDER the speed limit"]<- "always under or at the speed limit"
myresearchdata$q55[myresearchdata$q55==""]<-NA
myresearchdata$q55
na.omit(myresearchdata$q55)
# Turning the 'q55' variable into a dataframe.
data.frame(table(myresearchdata$q55))
# Changing the responses of these two observations to NA because these students do not drive, so no information about whether they speed can be collected. table() leaves NA values out by default, so they are not included in the counts below; the na.omit() call only prints the non-missing values.
myresearchdata$q55[myresearchdata$q55=='I do not drive']<-NA
myresearchdata$q55[myresearchdata$q55=='I get driven']<-NA
na.omit(myresearchdata$q55)
# Turning the 'q55' variable into a dataframe.
data.frame(table(myresearchdata$q55))
# Turning the "" observation within the variable 'q58' to NA. The na.omit() call below only prints the non-missing values; it does not change 'myresearchdata'.
myresearchdata$q58[myresearchdata$q58==""]<-NA
myresearchdata$q58
na.omit(myresearchdata$q58)
# Turning the 'q58' variable into a dataframe.
data.frame(table(myresearchdata$q58))
# Creating a new object and renaming the variables I am using to something that is more indicative of what they represent.
myresearchdata2<-myresearchdata
Income<-as.character(myresearchdata$Income)
Socialdistancing<-as.numeric(myresearchdata$q30)
Career<-as.character(myresearchdata$q54)
Driving<-as.character(myresearchdata$q55)
Risktaking<-as.character(myresearchdata$q58)
names(myresearchdata2) <- c("Income", "Social Distancing","Career", "Driving", "Risk Taking")
# Printing the summary statistics of my dataset that I am working with that include only the variables I am working with
summary(myresearchdata2)
myresearchdata2
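Because the renamed columns "Social Distancing" and "Risk Taking" contain a space, they have to be referenced with backticks or double brackets. The sketch below shows this on a small made-up data frame; `survey` and its values are hypothetical and only mirror the column names used above.

```r
# Hypothetical three-row data frame with the same kind of columns as above.
survey <- data.frame(Income = c("$0K-100K", ">$100K", ">$100K"),
                     q58 = c("Risk Averse", "Risk Lover", "In the middle"))

# Rename the columns; the second name contains a space.
names(survey) <- c("Income", "Risk Taking")

# Two equivalent ways to reach a column whose name contains a space.
risk_a <- survey$`Risk Taking`
risk_b <- survey[["Risk Taking"]]

# A name without the space does not match any column and returns NULL.
survey$Risktaking
```

Referring to the column inside the data frame, rather than to a separate vector with a similar name, keeps plots and tests tied to the same rows.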
# Creating proportion tables of family income levels of students using kable function and prop.table function
kable(prop.table(table(myresearchdata2$Income)), digits=3,
col.names=c("Income", "Prop")) %>%
kable_styling(bootstrap_options = "striped", full_width = T)
| Income | Prop |
|---|---|
| $0K-100K | 0.326 |
| >$100K | 0.674 |
# Creating proportion tables of social distancing strictness of students using kable function and prop.table function
kable(prop.table(table(myresearchdata2$`Social Distancing`)), digits=3,
col.names=c("Social Distancing", "Prop")) %>%
kable_styling(bootstrap_options = "striped", full_width = T)
| Social Distancing | Prop |
|---|---|
| 1 (Least Strict) | 0.021 |
| 2 | 0.126 |
| 3 | 0.316 |
| 4 | 0.400 |
| 5 (Most Strict) | 0.137 |
# Creating proportion tables of the careers students' are pursuing using kable function and prop.table function
kable(prop.table(table(myresearchdata2$Career)), digits=3,
col.names=c("Career", "Prop")) %>%
kable_styling(bootstrap_options = "striped", full_width = T)
| Career | Prop |
|---|---|
| Education | 0.053 |
| Fine Arts/Creative/Performance | 0.011 |
| Law | 0.074 |
| Medical – Private | 0.095 |
| NGO; Medical – Public | 0.074 |
| Other | 0.042 |
| Private sector/business | 0.568 |
| Public sector/government | 0.084 |
# Creating proportion tables of the driving speeds of students using kable function and prop.table function
kable(prop.table(table(myresearchdata2$Driving)), digits=3,
col.names=c("Driving", "Prop")) %>%
kable_styling(bootstrap_options = "striped", full_width = T)
| Driving | Prop |
|---|---|
| always under or at the speed limit | 0.139 |
| within 10 MPH of the speed limit | 0.625 |
| within 20 MPH of the speed limit | 0.236 |
# Creating proportion tables based of how risky students would describe themselves to be using kable function and prop.table function
kable(prop.table(table(myresearchdata2$`Risk Taking`)), digits=3,
col.names=c("Risk Taking", "Prop")) %>%
kable_styling(bootstrap_options = "striped", full_width = T)
| Risk Taking | Prop |
|---|---|
| In the middle | 0.400 |
| Risk Averse | 0.463 |
| Risk Lover | 0.137 |
# Creating a barplot using ggplot to graph the "Risk Taking" and "Income" variables within myresearchdata2 against each other to see the share of each income bracket (100K or less or 100K+) among risk-averse, in the middle, and risk-lover students, with the x and y axis flipped, and storing it into myresearchdata3
# The column name contains a space, so it is wrapped in backticks; theme() comes after theme_minimal() so that the centered title is not overridden
myresearchdata3 <-
ggplot(myresearchdata2,
aes(y = `Risk Taking`,
fill = Income)) +
geom_bar(position = "fill") +
labs(x = "Proportion", y="") +
ggtitle("Levels of Risk By Income Level") +
theme_minimal() +
theme(plot.title = element_text(hjust = 0.5))
# Turning the ggplot into a plotly using the ggplotly function
ggplotly(myresearchdata3)
Looking at the bar plot above, about 75% of the risk-loving students in the sample come from family incomes of greater than 100,000 dollars, and about 65% of the students who are either risk-averse or in the middle come from family incomes of greater than 100,000 dollars.
Put differently, students from family income levels of 100,000 dollars or less make up about 35% of the risk-averse and in-the-middle students but only about 25% of the risk lovers. Since this group makes up 32.6% of the sample overall, it is roughly proportionally represented among risk-averse and in-the-middle students and somewhat underrepresented among risk lovers. This might indicate that, to some extent, students from lower family income levels are less likely to take risks because their financial situation may not allow it.
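The percentages read off a filled bar plot are shares of each income group within a risk category, which is not the same as the share of each risk category within an income group (the quantity the hypotheses are about). The sketch below shows both with `prop.table()`. The counts are an assumption: they are reconstructed from the group means reported in the t-test output later in this report, taking the group sizes to be 31 and 64, and they are not read directly from the dataset.

```r
# Counts reconstructed from the reported group means (an assumption, see above).
# Rows are risk categories, columns are income groups.
counts <- matrix(c(15, 13, 3,     # $0K-100K
                   29, 25, 10),   # >$100K
                 nrow = 3,
                 dimnames = list(Risk = c("Risk Averse", "In the middle", "Risk Lover"),
                                 Income = c("$0K-100K", ">$100K")))

# Share of each income group within a risk category (each row sums to 1).
# This is what the filled bar plot displays.
within_risk <- prop.table(counts, margin = 1)

# Share of each risk category within an income group (each column sums to 1).
# This is what the hypotheses compare.
within_income <- prop.table(counts, margin = 2)

round(within_risk, 3)
round(within_income, 3)
```

Reading both tables side by side helps separate the effect of the sample being mostly higher income from any actual difference in risk attitudes between the groups.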
# Creating a barplot using ggplot to graph the "Career" and "Income" variables within myresearchdata2 against each other to see the share of each income bracket (100K or less or 100K+) among students going into each job sector, with the x and y axis flipped, and storing it into myresearchdata4
# theme() comes after theme_minimal() so that the centered title is not overridden
myresearchdata4 <-
ggplot(myresearchdata2,
aes(y = Career,
fill = Income)) +
geom_bar(position = "fill") +
labs(x = "Proportion", y="") +
ggtitle("Career Interests By Income Level") +
theme_minimal() +
theme(plot.title = element_text(hjust = 0.5))
# Turning the ggplot into a plotly using the ggplotly function
ggplotly(myresearchdata4)
Looking at the bar plot above, it can be seen that 100% of students within the sample that are pursuing the fine arts or a job that is not listed come from family incomes of greater than 100,000 dollars. Additionally, for all other job sectors except for law, the majority of students pursuing those jobs come from family income levels of greater than 100,000 dollars. For law, about 57% of students pursuing law come from family income levels of 100,000 dollars or less.
Other than the categories of other and fine arts, the career that is mostly made up of students from family income levels of greater than 100,000 dollars is the medical-private sector (~78%), which could be because of how expensive the medical school route is. Medical school costs a lot of money, and students from lower family incomes may have to take out student loans and take on more student debt, which might not be such an appealing choice.
# Creating a barplot using ggplot to graph the "Driving" and "Income" variables within myresearchdata2 against each other to see the share of each income bracket (100K or less or 100K+) among students at each driving speed, with the x and y axis flipped, and storing it into myresearchdata5
# Filtering out the NA values within the variable "Driving" using dplyr's filter function and the !is.na() function within the filter function
# theme() comes after theme_minimal() so that the centered title is not overridden
myresearchdata5 <-
myresearchdata2 %>%
filter(!is.na(Driving)) %>%
ggplot(aes(y = Driving,
fill = Income)) +
geom_bar(position = "fill") +
labs(x = "Proportion", y="") +
ggtitle("Driving Speed By Income Level") +
theme_minimal() +
theme(plot.title = element_text(hjust = 0.5))
# Turning the ggplot into a plotly using the ggplotly function
ggplotly(myresearchdata5)
Looking at the bar plot above, it can be seen that the majority of students who drive within 10 miles per hour (~78%) or 20 miles per hour of the speed limit (~59%) are students with family income levels greater than 100,000 dollars. On the opposite end, it can be seen that the majority of students who drive under or at the speed limit (~70%) are students with family income levels of 100,000 dollars or less.
This difference in driving speeds between students with family incomes of 100,000 dollars or less and students with family incomes of more than 100,000 dollars could be attributed to the cost of speeding tickets, car repairs, and similar expenses. Students coming from lower family incomes may be more careful when they drive in order to avoid unnecessary costs, whereas students from higher family incomes may not be as concerned about money and may therefore drive in a more carefree way.
# Changing values of "1 (Least Strict)" and "5 (Most Strict)" in the variable q30 to "1" and "5" respectively for purposes of the graph to come out correct
myresearchdata$q30[myresearchdata$q30=='1 (Least Strict)']<- "1"
myresearchdata$q30[myresearchdata$q30=='5 (Most Strict)']<- "5"
# Turning the variable "Socialdistancing" into a numerical variable using the as.numeric function
Socialdistancing<-as.numeric(myresearchdata$q30)
# Creating a barplot using ggplot to graph the "Socialdistancing" and "Income" variables against each other to see the share of each income bracket (100K or less or 100K+) among students at each social distancing rating, with the x and y axis flipped, and storing it into myresearchdata6
# "Socialdistancing" is the standalone numeric vector created above, not a column of myresearchdata2; it lines up with the rows of myresearchdata2 because both come from myresearchdata in the same order
# Using the fct_rev function from the forcats package to reorder the levels of social distancing ratings on the y-axis from 5-1 to 1-5
# Using the as.factor to turn the variable "Socialdistancing" into a factor
# The fill legend shows income, so it is labeled "Income"; theme() comes after theme_minimal() so that the centered title is not overridden
myresearchdata6 <-
ggplot(myresearchdata2,
aes(y = fct_rev(as.factor(Socialdistancing)),
fill = Income)) +
geom_bar(position = "fill") +
labs(x = "Proportion", y = "Social Distancing Strictness (1 = Least Strict, 5 = Most Strict)",
fill = "Income") +
ggtitle("Social Distancing Practices By Income Level") +
theme_minimal() +
theme(plot.title = element_text(hjust = 0.5))
# Turning the ggplot into a plotly using the ggplotly function
ggplotly(myresearchdata6)
Looking at the bar plot above, 100% of students in the sample who rate their social distancing strictness at a 1 (least strict) come from family incomes of greater than 100,000 dollars. Additionally, the majority of students rating their social distancing strictness at a 3 (~73%), 4 (~74%), or 5 (most strict) (~54%) come from family income levels of greater than 100,000 dollars as well. However, about 58% of students who rate their social distancing strictness at a 2 come from family income levels of 100,000 dollars or less.
Although the majority of students who rated their social distancing strictness at a 5 (most strict) have a family income higher than 100,000 dollars, it is barely a majority at about 54%. The students who rate themselves a 5 are therefore split almost evenly between the two income groups, even though students with family incomes of 100,000 dollars or less make up only 32.6% of the sample.
# Creating a new variable named "riskaverse" within the myresearchdata2 dataset
# It is TRUE for students who describe themselves as risk averse, so its mean within an income group is that group's proportion of risk-averse students
myresearchdata2$riskaverse <- (myresearchdata2$`Risk Taking`=="Risk Averse")
# Showing the results through a proportion table
prop.table(table(myresearchdata2$riskaverse))
##
## FALSE TRUE
## 0.5368421 0.4631579
# Running the t-test to compare the proportion of students who are risk averse between the two income groups
# Income has only these two levels, so the full dataset is used without subsetting
t.test(riskaverse ~ Income, data = myresearchdata2)
##
## Welch Two Sample t-test
##
## data: riskaverse by Income
## t = 0.2777, df = 58.797, p-value = 0.7822
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.1908123 0.2523043
## sample estimates:
## mean in group $0K-100K mean in group >$100K
## 0.483871 0.453125
Out of the sample, 46.3% of students consider themselves to be risk averse and 53.7% of students consider themselves to be either risk lovers or in the middle.
Looking at the results of the t-test, it can be seen that 48.4% of students that have an estimated combined family income of working adults of 100,000 dollars or less consider themselves to be risk averse, whereas 45.3% of students that have an estimated combined family income of working adults of greater than 100,000 dollars consider themselves to be risk averse.
This difference is not statistically significant because the p-value is 0.7822, which is much greater than 0.05 (the alpha level). According to the t-test, it can be said with 95% confidence that the difference between the two groups is somewhere between -0.19 and 0.25, an interval that includes 0. Therefore, we fail to reject the null hypothesis that the proportion of Emory students who consider themselves to be risk-averse, in the middle, or risk loving is equal across all family income levels (the estimated combined family income of working adults).
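Since the outcome here is a yes/no indicator, a two-sample test of proportions is a common alternative to the t-test and can serve as a cross-check. The sketch below uses `prop.test()` from base R. The counts are an assumption: 15 of 31 and 29 of 64 are reconstructed from the group means reported above (0.483871 and 0.453125), and the names `risk_averse`, `group_size`, and `result` are hypothetical.

```r
# Number of risk-averse students in each income group
# (reconstructed from the reported group means; an assumption).
risk_averse <- c(low = 15, high = 29)

# Number of students in each income group (also reconstructed).
group_size <- c(low = 31, high = 64)

# Two-sided test that the two group proportions are equal.
result <- prop.test(x = risk_averse, n = group_size)

result$estimate   # proportion of risk-averse students in each group
result$conf.int   # 95% confidence interval for the difference in proportions
result$p.value    # p-value for the null hypothesis of equal proportions
```

With groups of this size, the t-test and the proportion test would be expected to lead to the same conclusion, but the proportion test matches the way the hypotheses are stated.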
# Creating a new variable named "risklover" within the myresearchdata2 dataset
# It is TRUE for students who describe themselves as risk lovers, so its mean within an income group is that group's proportion of risk lovers
myresearchdata2$risklover <- (myresearchdata2$`Risk Taking`=="Risk Lover")
# Showing the results through a proportion table
prop.table(table(myresearchdata2$risklover))
##
## FALSE TRUE
## 0.8631579 0.1368421
# Running the t-test to compare the proportion of students who are risk lovers between the two income groups
# Income has only these two levels, so the full dataset is used without subsetting
t.test(risklover ~ Income, data = myresearchdata2)
##
## Welch Two Sample t-test
##
## data: risklover by Income
## t = -0.84059, df = 71.103, p-value = 0.4034
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.20055367 0.08160205
## sample estimates:
## mean in group $0K-100K mean in group >$100K
## 0.09677419 0.15625000
Out of the sample, 13.7% of students consider themselves to be risk lovers and 86.3% of students consider themselves to be either risk averse or in the middle.
Looking at the results of the t-test, it can be seen that 9.7% of students that have an estimated combined family income of working adults of 100,000 dollars or less consider themselves to be risk lovers, whereas 15.6% of students that have an estimated combined family income of working adults of greater than 100,000 dollars consider themselves to be risk lovers.
This difference is not statistically significant because the p-value is 0.4034, which is much greater than 0.05 (the alpha level), and according to the t-test, it can be said with 95% confidence that the difference between the two groups is somewhere between -0.20 and 0.08 (with 0 difference included). Therefore, we fail to reject the null hypothesis of the proportion of Emory students who consider themselves to be risk-averse, in the middle, or risk loving being equal across all students from all family income levels (the estimated combined family income of working adults).
# Creating a new variable named "middle" within the myresearchdata2 dataset
# It is TRUE for students who describe themselves as in the middle, so its mean within an income group is that group's proportion of in-the-middle students
myresearchdata2$middle <- (myresearchdata2$`Risk Taking`=="In the middle")
# Showing the results through a proportion table
prop.table(table(myresearchdata2$middle))
##
## FALSE TRUE
## 0.6 0.4
# Running the t-test to compare the proportion of students who are in the middle between the two income groups
# Income has only these two levels, so the full dataset is used without subsetting
t.test(middle ~ Income, data = myresearchdata2)
##
## Welch Two Sample t-test
##
## data: middle by Income
## t = 0.26342, df = 58.405, p-value = 0.7932
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.1895528 0.2470125
## sample estimates:
## mean in group $0K-100K mean in group >$100K
## 0.4193548 0.3906250
Out of the sample, 40% of students consider themselves to be in the middle and 60% of students consider themselves to be either risk lovers or risk averse.
Looking at the results of the t-test, it can be seen that about 41.9% of students that have an estimated combined family income of working adults of 100,000 dollars or less consider themselves to be in the middle, whereas 39.1% of students that have an estimated combined family income of working adults of greater than 100,000 dollars consider themselves to be in the middle.
This difference is not statistically significant because the p-value is 0.7932, which is much greater than 0.05 (the alpha level). According to the t-test, it can be said with 95% confidence that the difference between the two groups is somewhere between -0.19 and 0.25, an interval that includes 0. Therefore, we fail to reject the null hypothesis that the proportion of Emory students who consider themselves to be risk-averse, in the middle, or risk loving is equal across all family income levels (the estimated combined family income of working adults).
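The null hypothesis for risk attitude covers all three categories at once, so a single test on the full income-by-risk table is a natural complement to the three separate t-tests. The sketch below is only an illustration under stated assumptions: the counts are reconstructed from the group means reported in the three t-test outputs (taking the group sizes to be 31 and 64), and `risk_by_income`, `expected`, and `fisher_result` are hypothetical names. Fisher's exact test is used because at least one expected count is below 5, which makes the chi-squared approximation less reliable.

```r
# Income group by risk attitude, with counts reconstructed from the
# reported group means (an assumption, not read from the dataset).
risk_by_income <- matrix(c(15, 13, 3,
                           29, 25, 10),
                         nrow = 2, byrow = TRUE,
                         dimnames = list(Income = c("$0K-100K", ">$100K"),
                                         Risk = c("Risk Averse", "In the middle", "Risk Lover")))

# Expected counts if risk attitude were independent of income group.
expected <- outer(rowSums(risk_by_income), colSums(risk_by_income)) / sum(risk_by_income)
round(expected, 2)

# Fisher's exact test of independence for the 2 x 3 table.
fisher_result <- fisher.test(risk_by_income)
fisher_result$p.value
```

A single test on the whole table also avoids having to interpret three overlapping comparisons of the same survey question.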
# Creating a new variable named "government" within the myresearchdata2 dataset
# It is TRUE for students who are pursuing careers in the public sector/government, so its mean within an income group is that group's proportion of such students
myresearchdata2$government <- (myresearchdata2$Career=="Public sector/government")
# Showing the results through a proportion table
prop.table(table(myresearchdata2$government))
##
## FALSE TRUE
## 0.91578947 0.08421053
# Running the t-test to compare the proportion of students who are pursuing careers in the public sector/government between the two income groups
# Income has only these two levels, so the full dataset is used without subsetting
t.test(government ~ Income, data = myresearchdata2)
##
## Welch Two Sample t-test
##
## data: government by Income
## t = 0.2928, df = 54.188, p-value = 0.7708
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.1090379 0.1463363
## sample estimates:
## mean in group $0K-100K mean in group >$100K
## 0.09677419 0.07812500
Out of the sample, 8.4% of students want to pursue work in the public sector/government and 91.6% of students want to pursue work in other job sectors.
Looking at the results of the t-test, it can be seen that 9.7% of students that have an estimated combined family income of working adults of 100,000 dollars or less want to pursue work in the public sector/government, whereas 7.8% of students that have an estimated combined family income of working adults of greater than 100,000 dollars want to pursue work in the public sector/government.
This difference is not statistically significant because the p-value is 0.7708, which is much greater than 0.05 (the alpha level). According to the t-test, it can be said with 95% confidence that the difference between the two groups is somewhere between -0.11 and 0.15, an interval that includes 0. Therefore, we fail to reject the null hypothesis that the proportion of Emory students who go into each job sector is equal across all family income levels (the estimated combined family income of working adults).
# Creating a new variable named "business" within the myresearchdata2 dataset
# It is TRUE for students who are pursuing careers in the private sector/business, so its mean within an income group is that group's proportion of such students
myresearchdata2$business <- (myresearchdata2$Career=="Private sector/business")
# Showing the results through a proportion table
prop.table(table(myresearchdata2$business))
##
## FALSE TRUE
## 0.4315789 0.5684211
# Running the t-test to compare the proportion of students who are pursuing careers in the private sector/business between the two income groups
# Income has only these two levels, so the full dataset is used without subsetting
t.test(business ~ Income, data = myresearchdata2)
##
## Welch Two Sample t-test
##
## data: business by Income
## t = -0.27005, df = 58.598, p-value = 0.7881
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.2501214 0.1906456
## sample estimates:
## mean in group $0K-100K mean in group >$100K
## 0.5483871 0.5781250
Out of the sample, 56.8% of students want to pursue work in the private sector/business and 43.2% of students want to pursue work in other job sectors.
Looking at the results of the t-test, it can be seen that 54.8% of students that have an estimated combined family income of working adults of 100,000 dollars or less want to pursue work in the private sector/business; whereas, 57.8% of students that have an estimated combined family income of working adults of greater than 100,000 dollars want to pursue work in the private sector/business.
This difference is not statistically significant because the p-value is 0.7881, which is much greater than 0.05 (the alpha level), and according to the t-test, it can be said with 95% confidence that the difference between the two groups is somewhere between -0.25 and 0.19 (with 0 difference included). Therefore, we fail to reject the null hypothesis of the proportion of Emory students who go into each job sector being equal across all students from all family income levels (the estimated combined family income of working adults).
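The Welch t statistic and confidence interval reported above can be rebuilt by hand from the two group proportions, which may make it clearer where the numbers come from. The sketch below is only an illustration: the group sizes (31 and 64) and counts (17 and 37) are assumptions inferred from the printed means, and the variable names are made up for this example.

```r
# ASSUMPTION: counts inferred from the printed means
# (17/31 = 0.5483871 and 37/64 = 0.578125); the report does not state them.
low  <- c(rep(1, 17), rep(0, 14))   # income $0K-100K, 1 = private sector/business
high <- c(rep(1, 37), rep(0, 27))   # income >$100K

# Each group's contribution to the variance of the difference in means
v_low  <- var(low) / length(low)
v_high <- var(high) / length(high)

# Welch standard error and t statistic
se     <- sqrt(v_low + v_high)
diff   <- mean(low) - mean(high)
t_stat <- diff / se

# Welch-Satterthwaite degrees of freedom
df <- (v_low + v_high)^2 /
  (v_low^2 / (length(low) - 1) + v_high^2 / (length(high) - 1))

# Two-sided p-value and 95% confidence interval for the difference
p_value <- 2 * pt(-abs(t_stat), df)
ci      <- diff + c(-1, 1) * qt(0.975, df) * se

# Cross-check against R's built-in Welch test on the same vectors
check <- t.test(low, high)
```

If the inferred counts are right, `t_stat`, `df`, `p_value`, and `ci` should agree with the `t.test()` output shown above.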
# Creating a new variable named "other" within the myresearchdata2 dataset
# It will give the percentage of students who are pursuing careers in something other than what is listed based on family income
myresearchdata2$other <- (myresearchdata2$Career=="Other")
# Showing the results through a proportion table
prop.table(table(myresearchdata2$other))
##
## FALSE TRUE
## 0.95789474 0.04210526
# Running the t-test to compare the mean proportion of students who are pursuing careers in something other than what is listed based on family income
# Subsetting myresearchdata2 by income level
t.test(other ~ Income, data = subset(myresearchdata2, ( (Income=="$0K-100K" | Income==">$100K"))))
##
## Welch Two Sample t-test
##
## data: other by Income
## t = -2.0494, df = 63, p-value = 0.04459
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.123443146 -0.001556854
## sample estimates:
## mean in group $0K-100K mean in group >$100K
## 0.0000 0.0625
Out of the sample, 4.2% of students want to pursue work in another field that is not listed and 95.8% of students want to pursue a career in one of the job sectors listed.
Looking at the results of the t-test, it can be seen that 0% of students that have an estimated combined family income of working adults of 100,000 dollars or less want to pursue work in a field that is not listed; whereas, 6.3% of students that have an estimated combined family income of working adults of greater than 100,000 dollars want to pursue work in a field that is not listed.
This difference is statistically significant because the p-value is 0.04, which is less than 0.05 (the alpha level), and according to the t-test, it can be said with 95% confidence that the difference between the two groups is somewhere between -0.123 and -0.002 (with 0 difference not included). Therefore, we can reject the null hypothesis of the proportion of Emory students who go into each job sector being equal across all students from all family income levels (the estimated combined family income of working adults) since students who have an estimated combined family income of working adults of greater than 100,000 dollars go into other job sectors that are not listed more often than students who have an estimated combined family income of working adults of 100,000 dollars or less. In fact, no student who has an estimated combined family income of working adults of 100,000 dollars or less is pursuing another field other than the ones listed.
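Because no student in the lower income group chose "Other", that group's sample variance is zero, which strains the approximation behind the t-test. One possible cross-check, not part of the original analysis, is Fisher's exact test on the underlying counts. The counts below are assumptions inferred from the printed group means, so this is a sketch rather than a replacement for the result above.

```r
# ASSUMPTION: counts inferred from the printed means
# (0/31 = 0 and 4/64 = 0.0625); the report does not state them directly.
other_counts <- matrix(
  c(0, 31,
    4, 60),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    Income = c("$0K-100K", ">$100K"),
    Career = c("Other", "Listed sector")
  )
)

# Exact test of whether choosing "Other" is independent of income group
exact_check <- fisher.test(other_counts)
exact_check$p.value
```

If the exact p-value comes out well above 0.05, the significant t-test result for this category should be read with caution, since it rests on only four students.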
# Creating a new variable named "medical" within the myresearchdata2 dataset
# It will give the percentage of students who are pursuing careers in the medical-private sector based on family income
myresearchdata2$medical <- (myresearchdata2$Career=="Medical – Private")
# Showing the results through a proportion table
prop.table(table(myresearchdata2$medical))
##
## FALSE TRUE
## 0.90526316 0.09473684
# Running the t-test to compare the mean proportion of students who are pursuing careers in the medical-private sector based on family income
# Subsetting myresearchdata2 by income level
t.test(medical ~ Income, data = subset(myresearchdata2, ( (Income=="$0K-100K" | Income==">$100K"))))
##
## Welch Two Sample t-test
##
## data: medical by Income
## t = -0.75205, df = 73.236, p-value = 0.4544
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.16373295 0.07401521
## sample estimates:
## mean in group $0K-100K mean in group >$100K
## 0.06451613 0.10937500
Out of the sample, 9.5% of students want to pursue work in the medical-private sector and 90.5% of students want to pursue work in other job sectors.
Looking at the results of the t-test, it can be seen that 6.5% of students that have an estimated combined family income of working adults of 100,000 dollars or less want to pursue work in the medical-private sector; whereas, 10.9% of students that have an estimated combined family income of working adults of greater than 100,000 dollars want to pursue work in the medical-private sector.
This difference is not statistically significant because the p-value is 0.4544, which is much greater than 0.05 (the alpha level), and according to the t-test, it can be said with 95% confidence that the difference between the two groups is somewhere between -0.16 and 0.07 (with 0 difference included). Therefore, we fail to reject the null hypothesis of the proportion of Emory students who go into each job sector being equal across all students from all family income levels (the estimated combined family income of working adults).
# Creating a new variable named "NGO" within the myresearchdata2 dataset
# It will give the percentage of students who are pursuing careers in the NGO/medical-public sector based on family income
myresearchdata2$NGO <- (myresearchdata2$Career=="NGO; Medical – Public")
# Showing the results through a proportion table
prop.table(table(myresearchdata2$NGO))
##
## FALSE TRUE
## 0.92631579 0.07368421
# Running the t-test to compare the mean proportion of students who are pursuing careers in nonprofit organizations/medical - public sector based on family income
# Subsetting myresearchdata2 by income level
t.test(NGO ~ Income, data = subset(myresearchdata2, ( (Income=="$0K-100K" | Income==">$100K"))))
##
## Welch Two Sample t-test
##
## data: NGO by Income
## t = 0.55283, df = 49.793, p-value = 0.5829
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.09026427 0.15881266
## sample estimates:
## mean in group $0K-100K mean in group >$100K
## 0.09677419 0.06250000
Out of the sample, 7.4% of students want to pursue work in the NGO/medical-public sector and 92.6% of students want to pursue work in other job sectors.
Looking at the results of the t-test, it can be seen that 9.7% of students that have an estimated combined family income of working adults of 100,000 dollars or less want to pursue work in the NGO/medical-public sector; whereas, 6.3% of students that have an estimated combined family income of working adults of greater than 100,000 dollars want to pursue work in the NGO/medical-public sector.
This difference is not statistically significant because the p-value is 0.5829, which is much greater than 0.05 (the alpha level), and according to the t-test, it can be said with 95% confidence that the difference between the two groups is somewhere between -0.09 and 0.16 (with 0 difference included). Therefore, we fail to reject the null hypothesis of the proportion of Emory students who go into each job sector being equal across all students from all family income levels (the estimated combined family income of working adults).
# Creating a new variable named "law" within the myresearchdata2 dataset
# It will give the percentage of students who are pursuing careers in law based on family income
myresearchdata2$law<- (myresearchdata2$Career=="Law")
# Showing the results through a proportion table
prop.table(table(myresearchdata2$law))
##
## FALSE TRUE
## 0.92631579 0.07368421
# Running the t-test to compare the mean proportion of students who are pursuing careers in law based on family income
# Subsetting myresearchdata2 by income level
t.test(law ~ Income, data = subset(myresearchdata2, ( (Income=="$0K-100K" | Income==">$100K"))))
##
## Welch Two Sample t-test
##
## data: law by Income
## t = 1.2309, df = 41.722, p-value = 0.2253
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.05257187 0.21688639
## sample estimates:
## mean in group $0K-100K mean in group >$100K
## 0.1290323 0.0468750
Out of the sample, 7.4% of students want to pursue work in law and 92.6% of students want to pursue work in other job sectors.
Looking at the results of the t-test, it can be seen that 12.9% of students that have an estimated combined family income of working adults of 100,000 dollars or less want to pursue work in law; whereas, 4.7% of students that have an estimated combined family income of working adults of greater than 100,000 dollars want to pursue work in law.
This difference is not statistically significant because the p-value is 0.2253, which is much greater than 0.05 (the alpha level), and according to the t-test, it can be said with 95% confidence that the difference between the two groups is somewhere between -0.05 and 0.22 (with 0 difference included). Therefore, we fail to reject the null hypothesis of the proportion of Emory students who go into each job sector being equal across all students from all family income levels (the estimated combined family income of working adults).
# Creating a new variable named "finearts" within the myresearchdata2 dataset
# It will give the percentage of students who are pursuing careers in the fine arts based on family income
myresearchdata2$finearts <- (myresearchdata2$Career=="Fine Arts/Creative/Performance")
# Showing the results through a proportion table
prop.table(table(myresearchdata2$finearts))
##
## FALSE TRUE
## 0.98947368 0.01052632
# Running the t-test to compare the mean proportion of students who are pursuing careers in the fine arts based on family income
# Subsetting myresearchdata2 by income level
t.test(finearts ~ Income, data = subset(myresearchdata2, ( (Income=="$0K-100K" | Income==">$100K"))))
##
## Welch Two Sample t-test
##
## data: finearts by Income
## t = -1, df = 63, p-value = 0.3211
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.04684907 0.01559907
## sample estimates:
## mean in group $0K-100K mean in group >$100K
## 0.000000 0.015625
Out of the sample, 1.1% of students want to pursue fine arts/creative/performance and 98.9% of students want to pursue work in other job sectors.
Looking at the results of the t-test, it can be seen that 0% of students that have an estimated combined family income of working adults of 100,000 dollars or less want to pursue the fine arts/creative/performance; whereas, 1.6% of students that have an estimated combined family income of working adults of greater than 100,000 dollars want to pursue the fine arts/creative/performance.
This difference is not statistically significant because the p-value is 0.3211, which is much greater than 0.05 (the alpha level), and according to the t-test, it can be said with 95% confidence that the difference between the two groups is somewhere between -0.047 and 0.016 (with 0 difference included). Therefore, we fail to reject the null hypothesis of the proportion of Emory students who go into each job sector being equal across all students from all family income levels (the estimated combined family income of working adults).
# Creating a new variable named "education" within the myresearchdata2 dataset
# It will give the percentage of students who are pursuing careers in education based on family income
myresearchdata2$education <- (myresearchdata2$Career=="Education")
# Showing the results through a proportion table
prop.table(table(myresearchdata2$education))
##
## FALSE TRUE
## 0.94736842 0.05263158
# Running the t-test to compare the mean proportion of students who are pursuing careers in education based on family income
# Subsetting myresearchdata2 by income level
t.test(education ~ Income, data = subset(myresearchdata2, ( (Income=="$0K-100K" | Income==">$100K"))))
##
## Welch Two Sample t-test
##
## data: education by Income
## t = 0.33819, df = 51.812, p-value = 0.7366
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.08704032 0.12232258
## sample estimates:
## mean in group $0K-100K mean in group >$100K
## 0.06451613 0.04687500
Out of the sample, 5.3% of students want to pursue work in education and 94.7% of students want to pursue work in other job sectors.
Looking at the results of the t-test, it can be seen that 6.5% of students that have an estimated combined family income of working adults of 100,000 dollars or less want to pursue work in education; whereas, 4.7% of students that have an estimated combined family income of working adults of greater than 100,000 dollars want to pursue work in education.
This difference is not statistically significant because the p-value is 0.7366, which is much greater than 0.05 (the alpha level), and according to the t-test, it can be said with 95% confidence that the difference between the two groups is somewhere between -0.09 and 0.12 (with 0 difference included). Therefore, we fail to reject the null hypothesis of the proportion of Emory students who go into each job sector being equal across all students from all family income levels (the estimated combined family income of working adults).
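The career comparisons above repeat the same three steps for every job sector: build a TRUE/FALSE indicator, subset to the two income groups, and run a Welch t-test. A small helper can run all of them in one pass. This is a minimal sketch, assuming the same `Income` labels as above; `compare_level()` and the `toy` data frame are hypothetical names, and `toy` contains made-up rows rather than the survey data.

```r
# Hypothetical stand-in for myresearchdata2: made-up rows, NOT the survey data
toy <- data.frame(
  Income = rep(c("$0K-100K", ">$100K"), times = c(40, 80)),
  Career = rep(c("Law", "Education", "Private sector/business"),
               length.out = 120)
)

# compare_level() is a hypothetical helper, not part of the original analysis.
# It compares the share of one answer (level) between the two income groups.
compare_level <- function(data, column, level) {
  groups <- c("$0K-100K", ">$100K")
  two <- data[data$Income %in% groups, ]
  two$Income <- factor(two$Income, levels = groups)   # fix the group order
  two$indicator <- as.numeric(two[[column]] == level) # 1 = chose this level
  res <- t.test(indicator ~ Income, data = two)       # Welch test by default
  data.frame(
    level     = level,
    prop_low  = mean(two$indicator[two$Income == groups[1]]),
    prop_high = mean(two$indicator[two$Income == groups[2]]),
    p_value   = res$p.value,
    ci_lower  = res$conf.int[1],
    ci_upper  = res$conf.int[2]
  )
}

# One row of results per career category
career_results <- do.call(
  rbind,
  lapply(unique(toy$Career), function(lv) compare_level(toy, "Career", lv))
)
print(career_results)
```

With the real data, the same call with `myresearchdata2` in place of `toy` would give one row per job sector, and passing `"Driving"` as the column would cover the driving categories as well.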
# Creating a new variable named "under" within the myresearchdata2 dataset
# It will give the percentage of students who drive under or at the speed limit based on family income
myresearchdata2$under <- (myresearchdata2$Driving=="always under or at the speed limit")
# Showing the results through a proportion table
prop.table(table(myresearchdata2$under))
##
## FALSE TRUE
## 0.8611111 0.1388889
# Running the t-test to compare the mean proportion of students who drive under or at the speed limit based on family income
# Subsetting myresearchdata2 by income level
t.test(under ~ Income, data = subset(myresearchdata2, ( (Income=="$0K-100K" | Income==">$100K"))))
##
## Welch Two Sample t-test
##
## data: under by Income
## t = 2.2659, df = 29.549, p-value = 0.03095
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.02248027 0.43585306
## sample estimates:
## mean in group $0K-100K mean in group >$100K
## 0.2916667 0.0625000
Out of the sample, 13.9% of students always drive either under or at the speed limit and 86.1% of students either drive within 10 MPH of the speed limit or 20 MPH of the speed limit.
Looking at the results of the t-test, it can be seen that 29.2% of students that have an estimated combined family income of working adults of 100,000 dollars or less always drive either under or at the speed limit; whereas, 6.3% of students that have an estimated combined family income of working adults of greater than 100,000 dollars always drive either under or at the speed limit.
This difference is statistically significant because the p-value is 0.03, which is less than 0.05 (the alpha level), and according to the t-test, it can be said with 95% confidence that the difference between the two groups is somewhere between 0.02 and 0.44 (with 0 difference not included). Therefore, we can reject the null hypothesis of the proportion of Emory students who speed and do not speed while driving being equal across all students from all family income levels (the estimated combined family income of working adults) since students who have an estimated combined family income of working adults of 100,000 dollars or less drive under or at the speed limit more.
# Creating a new variable named "withinten" within the myresearchdata2 dataset
# It will give the percentage of students who drive within 10 miles per hour of the speed limit based on family income
myresearchdata2$withinten<- (myresearchdata2$Driving=="within 10 MPH of the speed limit")
# Showing the results through a proportion table
prop.table(table(myresearchdata2$withinten))
##
## FALSE TRUE
## 0.375 0.625
# Running the t-test to compare the mean proportion of students who drive within 10 MPH of the speed limit based on family income
# Subsetting myresearchdata2 by income level
t.test(withinten ~ Income, data = subset(myresearchdata2, ( (Income=="$0K-100K" | Income==">$100K"))))
##
## Welch Two Sample t-test
##
## data: withinten by Income
## t = -2.5714, df = 41.7, p-value = 0.01379
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.55780852 -0.06719148
## sample estimates:
## mean in group $0K-100K mean in group >$100K
## 0.4166667 0.7291667
Out of the sample, 62.5% of students drive within 10 MPH of the speed limit and 37.5% of students either drive under or at the speed limit or within 20 MPH of the speed limit.
Looking at the results of the t-test, it can be seen that 41.7% of students that have an estimated combined family income of working adults of 100,000 dollars or less drive within 10 MPH of the speed limit; whereas, 72.9% of students that have an estimated combined family income of working adults of greater than 100,000 dollars drive within 10 MPH of the speed limit.
This difference is statistically significant because the p-value is 0.01, which is less than 0.05 (the alpha level), and according to the t-test, it can be said with 95% confidence that the difference between the two groups is somewhere between -0.56 and -0.07 (with 0 difference not included). Therefore, we can reject the null hypothesis of the proportion of Emory students who speed and do not speed while driving being equal across all students from all family income levels (the estimated combined family income of working adults) since students who have an estimated combined family income of working adults of more than 100,000 dollars drive within 10 MPH of the speed limit more.
# Creating a new variable named "withintwenty" within the myresearchdata2 dataset
# It will give the percentage of students who drive within 20 miles per hour of the speed limit based on family income
myresearchdata2$withintwenty<- (myresearchdata2$Driving=="within 20 MPH of the speed limit")
# Showing the results through a proportion table
prop.table(table(myresearchdata2$withintwenty))
##
## FALSE TRUE
## 0.7638889 0.2361111
# Running the t-test to compare the mean proportion of students who drive within 20 MPH of the speed limit based on family income
# Subsetting myresearchdata2 by income level
t.test(withintwenty ~ Income, data = subset(myresearchdata2, ( (Income=="$0K-100K" | Income==">$100K"))))
##
## Welch Two Sample t-test
##
## data: withintwenty by Income
## t = 0.74561, df = 41.39, p-value = 0.4601
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.1423183 0.3089849
## sample estimates:
## mean in group $0K-100K mean in group >$100K
## 0.2916667 0.2083333
Out of the sample, 23.6% of students drive within 20 MPH of the speed limit and 76.4% of students either drive within 10 MPH of the speed limit or always under or at the speed limit.
Looking at the results of the t-test, it can be seen that 29.2% of students that have an estimated combined family income of working adults of 100,000 dollars or less drive within 20 MPH of the speed limit; whereas, 20.8% of students that have an estimated combined family income of working adults of greater than 100,000 dollars drive within 20 MPH of the speed limit.
This difference is not statistically significant because the p-value is 0.46, which is much greater than 0.05 (the alpha level), and according to the t-test, it can be said with 95% confidence that the difference between the two groups is somewhere between -0.14 and 0.31 (with 0 difference included). Therefore, we fail to reject the null hypothesis of the proportion of Emory students who speed and do not speed while driving being equal across all students from all family income levels (the estimated combined family income of working adults).
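The three driving answers are mutually exclusive, so the three t-tests above are not independent of one another. As a possible complement, the whole income-by-driving table can be tested at once. The sketch below uses Fisher's exact test because some cells are small. The counts are assumptions inferred from the printed group means (for example, 7/24 = 0.2916667 and 3/48 = 0.0625), not values stated in the report.

```r
# ASSUMPTION: counts inferred from the printed group means; 24 students in
# the $0K-100K group and 48 in the >$100K group answered the driving question.
driving_counts <- matrix(
  c(7, 10,  7,
    3, 35, 10),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    Income  = c("$0K-100K", ">$100K"),
    Driving = c("under or at limit", "within 10 MPH", "within 20 MPH")
  )
)

# Row proportions should match the group means printed by the t-tests above
round(prop.table(driving_counts, margin = 1), 3)

# A single exact test of independence between income group and driving answer
driving_check <- fisher.test(driving_counts)
driving_check$p.value
```

A small p-value here would point in the same direction as the two significant t-tests, but it comes from one test instead of three overlapping ones.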
There was no evidence found to support any difference in the proportion of Emory students that consider themselves to be risk-averse, in the middle, or risk-loving based on the estimated combined family income of working adults for Emory students. Therefore, regardless of the family income levels of Emory students within the sample, there was no statistically significant difference in the proportion of students who consider themselves to be risk-averse, in the middle, or risk-loving. In conclusion, there was no evidence found against the null hypothesis.
Because the sample consists of Emory students pursuing a degree related to economics, I believe these students typically lean toward the risk-averse side since economics is such a stable major. Also, looking over the proportion of students within the sample that are risk-averse, in the middle, and risk-loving, less than 15% of the students indicated that they were risk-loving, which means most students within the sample are risk-averse (46%) or in the middle (40%). This could be because students who choose to pursue a rigorous major are more inclined to avoid risk, since if they do a good job, they will most likely graduate with a good, steady job. I also think that in general, most people are risk-averse because we as humans are naturally wired to avoid dangerous situations and protect ourselves, which is why it is less common for people to be risk-loving. I also think risk-loving might be a little too strong of a word since I do not believe most people love to put themselves at risk at all times, which could have dissuaded students from picking that option on the survey. Lastly, I believe some people may not have answered this question accurately, both because of potential inaccuracies in how they view themselves and because of the wording of the answer choices.
There was no evidence found to support any difference in the proportion of Emory students who go into each job sector based on the estimated combined family income of working adults except in the category of other. For all of the job sectors/industries listed, regardless of the family income levels of Emory students within the sample, there was no statistically significant difference in the proportion of students who go into each job sector; however, for the “other” category, the proportion of Emory students with an estimated combined family income of working adults of greater than 100,000 dollars going into a job sector not listed was higher and the proportion of Emory students with an estimated combined family income of working adults of 100,000 dollars or less was actually zero. The proportion of Emory students with family income levels of 100,000 dollars or less pursuing the fine arts was also zero. Therefore, other than the students going into a job sector not listed, there was no statistically significant difference found in the proportion of students who are going into each job sector. In conclusion, there was only evidence found against the null hypothesis within the proportion of students going into a job sector not listed, but other than that, there was no evidence found against the null hypothesis.
Because the sample consists of Emory students who are associated with the Economics department in some way, I believe there might be some bias in the results of comparing income to career choice. There may not have been a significant difference in the proportion of Emory students who go into fields like private sector/business based on family income level since all students within the sample are taking economics, which is a subject heavily used within the field of business. Most, if not all, of the students within the sample will most likely apply their economics knowledge to their future career in some way, and it is very likely that they will enter the private sector/business at some point. Economics can also be applied to fields like medicine for statistical testing and to fields like law and public sector/government to make economic laws. Additionally, students who come to Emory are typically pre-professional, and the student body is heavily concentrated in pre-health, pre-business, or pre-law tracks, which is why there may not be as many students pursuing the fine arts. However, in the small proportion of students within the sample pursuing the arts, all of them come from family income levels of higher than 100,000 dollars, which I believe makes sense since the fine arts is one of the riskier job sectors in terms of income. Students with family incomes of 100,000 dollars or less may be looking for more lucrative and stable careers to help their family income in the future. Also, only students with family income levels of greater than 100,000 dollars indicated pursuing something other than the job sectors listed, which I also believe makes sense since students from lower family income levels may want to choose a career that is well known for being stable and lucrative.
There was some evidence found to support a difference in the proportion of Emory students who speed and who do not speed while driving based on the estimated combined family income of working adults for Emory students. Emory students who have an estimated family income of working adults of 100,000 dollars or less tend to drive under or at the speed limit at a higher proportion, and Emory students who have an estimated family income of working adults more than 100,000 dollars tend to drive within 10 MPH of the speed limit at a higher proportion. However, there was no evidence found of a statistically significant difference in the proportion of Emory students who drive within 20 MPH of the speed limit based on the family income levels of the Emory students within the sample. Therefore, there was a statistically significant difference found in the proportion of students who drive at or under the speed limit and who drive within 10 MPH of the speed limit but not for those who drive within 20 MPH of the speed limit. In conclusion, there was some evidence found against the null hypothesis but not completely.
The results found between driving speed and income are what I was expecting. It was found that Emory students from family incomes of 100,000 dollars or less tend to drive either under or at the speed limit more often than Emory students from family incomes of more than 100,000 dollars. I believe this could be because speeding tickets are usually expensive, and Emory students from lower family income levels may have more trouble paying an expensive ticket. I also think that Emory students from lower family income levels may drive at a safer speed to avoid damaging their cars since cars are expensive and repairs also cost a lot of money. It was also found that Emory students from family incomes of more than 100,000 dollars tend to drive within 10 miles per hour of the speed limit more often than Emory students from family incomes of 100,000 dollars or less. I believe this could be because students from higher family income levels worry less about payments of speeding tickets, car repair costs, etc.
I took one main explanatory variable (Income) and four response variables (Risktaking, Career, Driving, Socialdistancing) to see how students’ family income levels relate to whether they consider themselves risk-loving or risk-averse, their future career plans, their driving speed, and how strictly they practice social distancing. There was a significant result found when statistical tests were run to see the impact of income on students’ driving speeds; however, there was no significant result found for students’ self-reported risk taking or social distancing tendencies, and for career plans the only significant difference was in the “other” category.
I believe that the results were heavily influenced by the sample size being small and the sample only consisting of Emory students taking ECON 220 during a particular semester (Spring 2021) with a specific professor, Dr. Paloma Lopez de mesa Moyano. I believe that this research question should be analyzed again but with a larger sample size and a more diverse pool of college students. I believe that it is highly probable that the results would have turned out much different had the sample been college students from multiple campuses and multiple areas of interest instead of being limited to Emory University Economics majors, Economics concentrators, and Economics minors. I expected college students from lower family income levels to be more risk-averse than college students from higher family income levels, and the driving speed results were consistent with that expectation; however, I was surprised to see that there were no significant findings in the social distancing, career, and risk-taking variables. I believe that if the research question was expanded to include a more diverse sample, the results would be more significant and college students from lower family income levels would be shown to be more risk-averse than college students from higher family income levels.
I also believe that some questions of the survey could have been phrased a bit differently and the answer choices could have included less absolute language. For example, because the word “always” was used, “always under or at the speed limit” may have discouraged students from picking that answer even if they stay at or under the speed limit a majority of the time. Also, risk-loving might be too strong of a word since I think there are very few people that love risk, but there may be people that are more okay with taking risks. This could have dissuaded students from picking the option of risk-loving on the survey. In the future, the survey could use less absolute language in certain areas.
Atombo, Charles, et al. “Personality, socioeconomic status, attitude, intention and risky driving behavior.” Cogent Psychology 4.1 (2017): 1376424.
Caner, Asena, and Cagla Okten. “Risk and career choice: Evidence from Turkey.” Economics of Education Review 29.6 (2010): 1060-1075.
Iversen, Anette Christine, and Pål Kraft. “Does socio-economic status and health consciousness influence how women respond to health related messages in media?.” Health education research 21.5 (2006): 601-610.
Schurer, Stefanie. “Lifecycle patterns in the socioeconomic gradient of risk preferences.” Journal of Economic Behavior & Organization 119 (2015): 482-495.
Sheehy-Skeffington, Jennifer, and Jessica Rea. How poverty affects People’s decision-making processes. York: Joseph Rowntree Foundation, 2017.
Social Distancing Practices T-tests
Out of the sample, 2.1% of students rate their social distancing strictness at a 1 (least strict) and 97.9% of students rate their social distancing strictness at a 2, 3, 4, or 5.
Looking at the results of the t-test, it can be seen that 0% of students that have an estimated combined family income of working adults of 100,000 dollars or less rate their social distancing strictness at a 1 (least strict); whereas, 3.1% of students that have an estimated combined family income of working adults of greater than 100,000 dollars rate their social distancing strictness at a 1 (least strict).
This difference is not statistically significant because the p-value is 0.16, which is much greater than 0.05 (the alpha level), and according to the t-test, it can be said with 95% confidence that the difference between the two groups is somewhere between -0.075 and 0.013 (with 0 difference included). Therefore, we fail to reject the null hypothesis of the proportion of Emory students who are strict and who are not strict about social distancing being equal across all students from all family income levels (the estimated combined family income of working adults).
Out of the sample, 12.6% of students rate their social distancing strictness at a 2, and 87.4% of students rate it at a 1, 3, 4, or 5.
The t-test shows that 22.6% of students whose estimated combined family income of working adults is $100,000 or less rate their social distancing strictness at a 2, whereas 7.8% of students whose estimated combined family income of working adults is greater than $100,000 rate it at a 2.
This difference is not statistically significant because the p-value is 0.08, which is greater than the alpha level of 0.05. According to the t-test, it can be said with 95% confidence that the difference between the two groups is somewhere between -0.02 and 0.32, an interval that includes 0. Therefore, we fail to reject the null hypothesis that the proportions of Emory students who are and who are not strict about social distancing are equal across family income levels (the estimated combined family income of working adults).
Out of the sample, 31.6% of students rate their social distancing strictness at a 3, and 68.4% of students rate it at a 1, 2, 4, or 5.
The t-test shows that 25.8% of students whose estimated combined family income of working adults is $100,000 or less rate their social distancing strictness at a 3, whereas 34.4% of students whose estimated combined family income of working adults is greater than $100,000 rate it at a 3.
This difference is not statistically significant because the p-value is 0.39, which is much greater than the alpha level of 0.05. According to the t-test, it can be said with 95% confidence that the difference between the two groups is somewhere between -0.29 and 0.11, an interval that includes 0. Therefore, we fail to reject the null hypothesis that the proportions of Emory students who are and who are not strict about social distancing are equal across family income levels (the estimated combined family income of working adults).
Out of the sample, 40% of students rate their social distancing strictness at a 4, and 60% of students rate it at a 1, 2, 3, or 5.
The t-test shows that 32.3% of students whose estimated combined family income of working adults is $100,000 or less rate their social distancing strictness at a 4, whereas 43.8% of students whose estimated combined family income of working adults is greater than $100,000 rate it at a 4.
This difference is not statistically significant because the p-value is 0.28, which is much greater than the alpha level of 0.05. According to the t-test, it can be said with 95% confidence that the difference between the two groups is somewhere between -0.33 and 0.10, an interval that includes 0. Therefore, we fail to reject the null hypothesis that the proportions of Emory students who are and who are not strict about social distancing are equal across family income levels (the estimated combined family income of working adults).
Out of the sample, 13.7% of students rate their social distancing strictness at a 5, and 86.3% of students rate it at a 1, 2, 3, or 4.
The t-test shows that 19.4% of students whose estimated combined family income of working adults is $100,000 or less rate their social distancing strictness at a 5, whereas 10.9% of students whose estimated combined family income of working adults is greater than $100,000 rate it at a 5.
This difference is not statistically significant because the p-value is 0.31, which is much greater than the alpha level of 0.05. According to the t-test, it can be said with 95% confidence that the difference between the two groups is somewhere between -0.08 and 0.25, an interval that includes 0. Therefore, we fail to reject the null hypothesis that the proportions of Emory students who are and who are not strict about social distancing are equal across family income levels (the estimated combined family income of working adults).
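Each of the five tests above compares one rating against all of the other ratings combined. The sketch below shows how those one-versus-rest percentages could be tabulated for all five ratings at once. The counts are assumptions back-calculated from the reported percentages (31 students at $100,000 or less and 64 students above $100,000); they are not given in the text, and the function name is hypothetical.

```python
# ASSUMED counts of students at ratings 1 through 5 in each income group,
# back-calculated from the reported percentages (not stated in the text).
low_counts = [0, 7, 8, 10, 6]     # family income of $100,000 or less
high_counts = [2, 5, 22, 28, 7]   # family income greater than $100,000


def one_vs_rest_shares(low, high):
    """For each rating, return the overall and per-group percentages."""
    n_low, n_high = sum(low), sum(high)
    rows = []
    for rating, (l, h) in enumerate(zip(low, high), start=1):
        rows.append({
            "rating": rating,
            "overall_pct": 100 * (l + h) / (n_low + n_high),
            "low_pct": 100 * l / n_low,
            "high_pct": 100 * h / n_high,
            # Difference in proportions, lower-income minus higher-income
            "diff": l / n_low - h / n_high,
        })
    return rows


rows = one_vs_rest_shares(low_counts, high_counts)
for r in rows:
    print(f"rating {r['rating']}: overall {r['overall_pct']:.2f}%, "
          f"<= $100k {r['low_pct']:.2f}%, > $100k {r['high_pct']:.2f}%, "
          f"difference {r['diff']:+.3f}")
```

Each difference printed here is the midpoint of the corresponding confidence interval reported above, which is a quick way to check that the intervals are stated as the lower-income group minus the higher-income group.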