Midterm Project

Basic Rules

Each team gets a unique simulated dataset.
Each student writes his/her own code.
While help is allowed within and across teams, I expect each student to write and understand every line of his/her code.
- To ensure the above, I reserve the right to ask any student to explain lines of code after answers are submitted
  - If a student is unable to explain, the student will not get credit for their submission.
- Copying and pasting exact code from another team or team member is prohbitied and will result in an academic integrity violation for both parties. Students who copy or let others copy from them will be reported to the Dean’s office.
Unless specified, each task is worth 1 point.
This activity is due on Thursday, March 14th by 11:59 pm. Every student will submit an html file and an Rmd file.

Tips:

READ EVERYTHING CAREFULLY
To solve each task, refer to html files posted on Moodle
- Copy, paste, and modify code to solve each task. (You rarely have to come up with code from scratch)
If you encounter an error, Google it
Don’t forget to set your working directory by hand
- add the following code to your Rmd file to avoid wd problems (don’t forget to change your working directory):

Preliminaries:

Don’t forget to knit the Rmd file as you go. Knitting is different than running the R code, as it generates the html file right away.

Libraries

Make sure you install them in your console (never in the Rmd file!)

library(ggplot2)
library(MASS)
library(stargazer)
library(AER)

## Warning: package 'AER' was built under R version 3.5.2

library(plm)

## Warning: package 'plm' was built under R version 3.5.2

library(tidyverse)

## Warning: package 'stringr' was built under R version 3.5.2

library(lmtest)
library(sandwich)
library(dplyr)

1) Before you inspect the data, you must load it. Each team received an .Rda file through Slack. Save the .Rda file within your working directory and load the data by modifying the following code (Suppose the file is named: jimena.Rda):

load("Matt_Paul.Rda")
names(Matt.Paul)

##  [1] "age"             "gender"          "income"         
##  [4] "college"         "private.school"  "race"           
##  [7] "marital"         "kids"            "region"         
## [10] "urban"           "green.donations"

Suppose you are hired by an Environmental NGO that wants to understand its donors for climate change initiatives. You are given a dataset with donors’ information that includes the amount each donor contributed per year as well as various demographic and socio-economic characteristics. The variables are described below:

age = the age of an individual
gender = gender of individual (either female or male)
income = annual income of each individual
college = is a dummy variable that takes value of 1 if the individual has at least a college degree and 0 otherwise
private.school = is a dummy variable that takes value of 1 if the individual studied at a private college and 0 otherwise
race = individual’s race/ethnicity (white, black, latino, asian, or other)
marital = is the marital status of the individual (single, married, separated, or widow)
kids = number of children the individual has
region = region where individual lives (midwest, mountain, north east, south, or west coast)
urban = a dummy variable that takes the value of 1 if individual lives in an urban area with population above 500,000
green.donations = individual’s annual donation to the environmental NGO towards climate change initiatives.

Learning Objectives

Learning Objective 1: Students will inspect and summarize the data
Learning Objective 2: Students will create new variables using tidyverse
Learning Objective 3: Students will visualize the data using ggplot2
Learning Objective 4: Students will run several OLS models,using robust standard errors, to find the best fit for the data
Learning Objective 5: Students will use stargazer package to generate tables summarizing the regression results
Learning Objective 6: Students will assess the goodness of fit of the regressions
Learning Objective 7: Students will interpret the coefficients from the models
Learning Objective 8: Students will comment about potential bias in their estimation strategy
Learning Objective 9: Students will communicate findings and provide recommentaions based on analysis

Learning Objective 1

Students will inspect and summarize the data

2) Rename your data as data and convert it to a tibble data by using as_tibble()

data<-as_tibble(Matt.Paul)

3) Write code to find if there are NAs in the dataset. You may use tidyverse or a for loop or some other strategy. Hint: find examples online and modify them.

cat("Are there any NA's?  (T/F).",any(is.na(data)))

## Are there any NA's?  (T/F). FALSE

4) Display the top 10 observations within your data

head(data,10)

## # A tibble: 10 x 11
##      age gender income college private.school race  marital  kids region
##    <int> <fct>   <dbl>   <dbl>          <dbl> <fct> <fct>   <dbl> <fct> 
##  1    42 male    79186       1              1 black married     0 west.…
##  2    55 male    69082       0              1 white single      0 north…
##  3    20 female  63642       0              1 white married     0 west.…
##  4    28 male    68250       0              1 lati… single      0 north…
##  5    38 male    69375       0              0 white married     0 west.…
##  6    42 male    73457       0              1 white single      0 north…
##  7    51 male    69405       0              0 black married     0 west.…
##  8    35 male    67153       0              0 white single      0 north…
##  9    39 male    74032       1              0 white married     0 west.…
## 10    43 female  66327       1              0 white single      0 north…
## # ... with 2 more variables: urban <dbl>, green.donations <I(dbl)>

5) Write code to display the number observations in your data

obs<-nrow(data)
obs

## [1] 1800

6) Summarize the variable green.donations

attach(data)
summary(green.donations)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    20.0   540.0   710.0   696.8   860.0  1260.0

7) Summarize the variable age

summary(age)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    19.0    27.0    37.0    36.5    46.0    55.0

8) Write code to find the share (proportion) of each race within the dataset:

race.table<-table(race)
race.table

## race
##  asian  black latino  other  white 
##    115    111    367     65   1142

cat(race.table[1]/obs*100,"% of people are Asian")

## 6.388889 % of people are Asian

cat(race.table[2]/obs*100,"% of people are Black" )

## 6.166667 % of people are Black

cat(race.table[3]/obs*100,"% of people are Latino" )

## 20.38889 % of people are Latino

cat(race.table[4]/obs*100,"% of people are 'Other'" )

## 3.611111 % of people are 'Other'

cat(race.table[5]/obs*100,"% of people are White" )

## 63.44444 % of people are White

Learning Objective 2

Students will create new variables using tidyverse

9) [5 points] Use tidyverse and the function mutate to create a dummy variable for each race named white, latino, black, asian, and other, respectively:

data.1 <- data %>% #rename dataset to lower case
  mutate(white = ifelse(race == "white",1,0)) %>% 
  mutate(black = ifelse(race == "black",1,0)) %>% # creating dummy for deny.y
  mutate(asian = ifelse(race == "asian",1,0)) %>% 
  mutate(latino = ifelse(race == "latino",1,0)) %>% 
  mutate(other = ifelse(race == "other",1,0))

To check (Remove the # once you get to this part)

attach(data.1)

## The following objects are masked from data:
## 
##     age, college, gender, green.donations, income, kids, marital,
##     private.school, race, region, urban

names(data.1)

##  [1] "age"             "gender"          "income"         
##  [4] "college"         "private.school"  "race"           
##  [7] "marital"         "kids"            "region"         
## [10] "urban"           "green.donations" "white"          
## [13] "black"           "asian"           "latino"         
## [16] "other"

race.table

## race
##  asian  black latino  other  white 
##    115    111    367     65   1142

white.tab<-table(white)
latino.tab<-table(latino)
black.tab<-table(black)
other.tab<-table(other)
asian.tab<-table(asian)

cat(ifelse(white.tab[2] == race.table[5],"YES","NO"), ",the variable 'white' was created correctly")

## YES ,the variable 'white' was created correctly

cat(ifelse(latino.tab[2] == race.table[3],"YES","NO"), ",the variable 'latino' was created correctly")

## YES ,the variable 'latino' was created correctly

cat(ifelse(black.tab[2] == race.table[2],"YES","NO"), ",the variable 'black' was created correctly")

## YES ,the variable 'black' was created correctly

cat(ifelse(other.tab[2] == race.table[4],"YES","NO"), ",the variable 'other' was created correctly")

## YES ,the variable 'other' was created correctly

cat(ifelse(asian.tab[2] == race.table[1],"YES","NO"), ",the variable 'asian' was created correctly")

## YES ,the variable 'asian' was created correctly

The number of 1’s per dummy variable should match the number of observations per race within each category.

10) Write code to find the share (proportion) of each gender within the dataset:

gender.table <- table(gender)
gender.table

## gender
## female   male 
##    622   1178

cat(gender.table[1]/obs*100,"% of people are female")

## 34.55556 % of people are female

cat(gender.table[2]/obs*100,"% of people are male")

## 65.44444 % of people are male

11) [2 points] Use tidyverse and the function mutate to create a dummy variable for each gender named female and male:

data.2<- data.1 %>% 
  mutate(gender.m = ifelse(gender == "male",1,0)) %>% 
  mutate(gender.f = ifelse(gender == "female",1,0))

12) Write code to check if you created the dummy variable correctly

detach(data.1)
attach(data.2)

## The following objects are masked from data:
## 
##     age, college, gender, green.donations, income, kids, marital,
##     private.school, race, region, urban

gender.table

## gender
## female   male 
##    622   1178

gender.tab.m<-table(gender.m)
gender.tab.f<-table(gender.f)

cat(ifelse(gender.tab.m[2] == gender.table[2],"YES","NO"), "the variable 'gender.m' was created correctly")

## YES the variable 'gender.m' was created correctly

cat(ifelse(gender.tab.f[2] == gender.table[1],"YES","NO"), "the variable 'gender.f' was created correctly")

## YES the variable 'gender.f' was created correctly

16) Write code to find the share (proportion) of each region within the dataset:

table.region<-table(region)
table.region

## region
##    midwest   mountain north.east      south west.coast 
##         23         40        846         35        856

cat(table.region[1]/obs*100,"% of people are from the Midwest")

## 1.277778 % of people are from the Midwest

cat(table.region[2]/obs*100,"% of people are from the Mountain")

## 2.222222 % of people are from the Mountain

cat(table.region[3]/obs*100,"% of people are from the Northeast")

## 47 % of people are from the Northeast

cat(table.region[4]/obs*100,"% of people are from the South")

## 1.944444 % of people are from the South

cat(table.region[5]/obs*100,"% of people are from the West Coast")

## 47.55556 % of people are from the West Coast

17) [5 points] Use tidyverse and the function mutate to create a dummy variable for each region named north.east, west.coast, south, mountain, and midwest:

data.3 <- data.2 %>% #rename dataset to lower case
  mutate(midwest.m = ifelse(region == "midwest",1,0)) %>% 
  mutate(mountain.m = ifelse(region == "mountain",1,0)) %>% # creating dummy for deny.y
  mutate(north.east.m = ifelse(region == "north.east",1,0)) %>% 
  mutate(south.m = ifelse(region == "south",1,0)) %>% 
  mutate(west.coast.m = ifelse(region == "west.coast",1,0))

18) Write code to check if you created the dummy variable correctly

detach(data.2)
attach(data.3)

## The following objects are masked from data:
## 
##     age, college, gender, green.donations, income, kids, marital,
##     private.school, race, region, urban

table.region

## region
##    midwest   mountain north.east      south west.coast 
##         23         40        846         35        856

midwest.tab<-table(midwest.m)
mountain.tab<-table(mountain.m)
NE.tab<-table(north.east.m)
south.tab<-table(south.m)
westcoast.tab<-table(west.coast.m)

cat(ifelse(midwest.tab[2] == table.region[1],"YES","NO"), "the variable 'midwest.m' was created correctly")

## YES the variable 'midwest.m' was created correctly

cat(ifelse(mountain.tab[2] == table.region[2],"YES","NO"), "the variable 'mountain.m' was created correctly")

## YES the variable 'mountain.m' was created correctly

cat(ifelse(NE.tab[2] == table.region[3],"YES","NO"), "the variable 'north.east.m' was created correctly")

## YES the variable 'north.east.m' was created correctly

cat(ifelse(south.tab[2] == table.region[4],"YES","NO"), "the variable 'south' was created correctly")

## YES the variable 'south' was created correctly

cat(ifelse(westcoast.tab[2] == table.region[5],"YES","NO"), "the variable 'west.coast.m' was created correctly")

## YES the variable 'west.coast.m' was created correctly

19) Write code to find the share (proportion) of each marital status within the dataset:

table.marital<-table(marital)
table.marital

## marital
##   married separated    single     widow 
##       846        31       869        54

cat(table.marital[1]/obs*100,"% of people are married")

## 47 % of people are married

cat(table.marital[2]/obs*100,"% of people seperated are ")

## 1.722222 % of people seperated are

cat(table.marital[3]/obs*100,"% of people are single")

## 48.27778 % of people are single

cat(table.marital[4]/obs*100,"% of people are widows")

## 3 % of people are widows

20) Use tidyverse and the function mutate to create a dummy variable for single individuals named single:

data.4<-data.3 %>% 
  mutate(single =ifelse(marital == "single",1,0))

21) Write code to check if you created the dummy variable correctly

detach(data.3)
attach(data.4)

## The following objects are masked from data:
## 
##     age, college, gender, green.donations, income, kids, marital,
##     private.school, race, region, urban

table.marital

## marital
##   married separated    single     widow 
##       846        31       869        54

single.tab<-table(single)

cat(ifelse(single.tab[2] == table.marital[3],"YES","NO"), "the variable 'single' was created correctly")

## YES the variable 'single' was created correctly

Learning Objective 3

Students will visualize the data using ggplot2

22) [4 points] Generate a histogram of green.donations using ggplot2, which should include titles and appropriate horizontal and vertical scales.

data.4 %>% ggplot(aes(x = green.donations,)) + 
  geom_histogram(size = 1, colour = "darkgreen", na.rm = TRUE, binwidth = 50)+
  ggtitle(paste("Green Donations")) +
  theme_bw() + #to have black and white background
  scale_x_continuous(breaks = seq(0,1200,100))+
  scale_y_continuous(breaks=seq(0,200,10)) +
  theme(plot.title = element_text(hjust = 0.5))+
  theme(axis.text.x=element_text(size=12)) +
  theme(axis.text.y=element_text(size=16)) +
  theme(axis.title.x=element_text(size=16)) +
  theme(axis.title.y=element_text(size=16, angle=90)) +
  theme(plot.title=element_text(size=14)) +
  theme(legend.text=element_text(size=18)) +
  xlab("Annual Donations ($)")+ # to label the axis
  ylab("Frequency") # to label axis

23) [4 points] Generate a barplot of race using ggplot2, which should include titles and appropriate horizontal and vertical scales.

data.4 %>% ggplot(aes(x = race, fill= race)) + 
  geom_bar(size = 1, colour = "black", na.rm = TRUE)+
  ggtitle(paste("Count of Race")) +
  theme_bw() + #to have black and white background
  scale_y_continuous(breaks=seq(0,2000,100)) +
  theme(plot.title = element_text(hjust = 0.5))+
  theme(axis.text.x=element_text(size=12)) +
  theme(axis.text.y=element_text(size=16)) +
  theme(axis.title.x=element_text(size=16)) +
  theme(axis.title.y=element_text(size=16, angle=90)) +
  theme(plot.title=element_text(size=14)) +
  theme(legend.text=element_text(size=18)) +
  xlab("Race")+ # to label the axis
  ylab("Count") # to label axis

24) [4 points] Visualize the relationship green.donations and gender using ggplot2. Your plot should include titles and appropriate horizontal and vertical scales.

data.4 %>% ggplot(aes(x = gender, y= green.donations,fill = gender)) + 
  geom_boxplot(size = 1, colour = "black", na.rm = TRUE)+
  ggtitle(paste("Donations by Gender")) +
  theme_bw() + #to have black and white background
  scale_y_continuous(breaks=seq(0,2000,100)) +
  theme(plot.title = element_text(hjust = 0.5))+
  theme(axis.text.x=element_text(size=12)) +
  theme(axis.text.y=element_text(size=16)) +
  theme(axis.title.x=element_text(size=16)) +
  theme(axis.title.y=element_text(size=16, angle=90)) +
  theme(plot.title=element_text(size=14)) +
  theme(legend.text=element_text(size=18)) +
  xlab("Gender")+ # to label the axis
  ylab("Annual Donation ($)") # to label axis

25) [4 points] Visualize the relationship green.donations and income using ggplot2. Your plot should include titles and appropriate horizontal and vertical scales.

data.4 %>% ggplot(aes(y= green.donations, x=income)) + 
  geom_point(size = 1, colour = "blue", na.rm = TRUE)+
  ggtitle(paste("Donations VS Kids ")) +
  theme_bw() + #to have black and white background
  scale_x_continuous(breaks = seq(50000,100000,10000))+
  scale_y_continuous(breaks=seq(0,2000,100)) +
  theme(plot.title = element_text(hjust = 0.5))+
  theme(axis.text.x=element_text(size=12)) +
  theme(axis.text.y=element_text(size=16)) +
  theme(axis.title.x=element_text(size=16)) +
  theme(axis.title.y=element_text(size=16, angle=90)) +
  theme(plot.title=element_text(size=14)) +
  theme(legend.text=element_text(size=18)) +
  xlab("Income")+ # to label the axis
  ylab("Annual Donation ($)") # to label axis

26) [4 points] Visualize the relationship green.donations and kids using ggplot2. Your plot should include titles and appropriate horizontal and vertical scales.

data.4 %>% ggplot(aes(y= green.donations, x=kids, group = kids)) + 
  geom_boxplot(size = 1,  na.rm = TRUE)+
  ggtitle(paste("Donations VS Kids ")) +
  theme_bw() + #to have black and white background
  scale_x_continuous(breaks = seq(0,5,1))+
  scale_y_continuous(breaks=seq(0,2000,100)) +
  theme(plot.title = element_text(hjust = 0.5))+
  theme(axis.text.x=element_text(size=12)) +
  theme(axis.text.y=element_text(size=16)) +
  theme(axis.title.x=element_text(size=16)) +
  theme(axis.title.y=element_text(size=16, angle=90)) +
  theme(plot.title=element_text(size=14)) +
  theme(legend.text=element_text(size=18)) +
  xlab("# of Kids")+ # to label the axis
  ylab("Annual Donation ($)") # to label axis

Learning Objective 4

Students will run several OLS models to find the best fit for their data using robust standard errors

Estimate the following OLS models and named them accordingly:

ols1:

$\begin{equation} green.donations_{i}=\beta_0 + \beta_1age_{i} + \beta_3college_i +\beta_7white_i+\beta_8female_i +u_{i} \end{equation}$

ols2:

$\begin{equation} green.donations_{i}=\beta_0 + \beta_1age_{i} + \beta_3private.school_i +\beta_7white_i+\beta_8female_i +u_{i} \end{equation}$

ols3:

$\begin{equation} green.donations_{i}=\beta_0 + \beta_1age_{i} + \beta_3college_i +\beta_7latino_i+\beta_8female_i +u_{i} \end{equation}$

ols4:

$\begin{equation} green.donations_{i}=\beta_0 + \beta_1age_{i}+\beta_2age_{i}^2 + \beta_3college_i + \beta_4ln(income_i) +\beta_5single_i + \beta_6kids_i +\beta_7white_i+\beta_8female_i+\beta_9urban_i +u_{i} \end{equation}$

ols5:

$\begin{equation} green.donations_{i}=\beta_0 + \beta_1age_{i}+\beta_2age_{i}^2 + \beta_3private.school_i + \beta_4ln(income_i) +\beta_5single_i + \beta_6kids_i +\beta_7white_i+\beta_8female_i+\beta_9(north.east_i*west.coast_i) +u_{i} \end{equation}$

27) [5 points] Write code to estimate the models above.

attach(data.4)

## The following objects are masked from data.4 (pos = 3):
## 
##     age, asian, black, college, gender, gender.f, gender.m,
##     green.donations, income, kids, latino, marital, midwest.m,
##     mountain.m, north.east.m, other, private.school, race, region,
##     single, south.m, urban, west.coast.m, white

## The following objects are masked from data:
## 
##     age, college, gender, green.donations, income, kids, marital,
##     private.school, race, region, urban

mod1<- lm(green.donations~ age +college +white +gender.f)
mod2<-lm(green.donations~age+private.school+white+gender.f)
mod3<-lm(green.donations ~ age + college + latino + gender.f)
mod4<-lm(green.donations ~ age + age^2 + private.school + log(income) + single + kids+ white +gender.f + urban)
mod5<-lm(green.donations ~age +age^2 + private.school + log(income) + single + kids + white + gender.f + north.east.m + west.coast.m)

28) [5 points] Write code to gather the robust standard errors in a list.

rob_se <- list(sqrt(diag(vcovHC(mod1, type = "HC1"))),
               sqrt(diag(vcovHC(mod2, type = "HC1"))),
               sqrt(diag(vcovHC(mod3, type = "HC1"))),
               sqrt(diag(vcovHC(mod4, type = "HC1"))),
               sqrt(diag(vcovHC(mod5, type = "HC1"))))

Learning Objective 5

Students will use stargazer package to generate tables summarizing their regression results

29) [10 points] Use stargazer (only works in Rmd file) to generate a table with the results (Hint: make sure you include type = “html” within stargazer()):

stargazer(mod1, mod2, mod3, mod4, mod5,
         digits = 3,
         header = FALSE,
         type = "html",
         se = rob_se,
         title = "Linear Regression of Green Donations",
         model.numbers = FALSE,
         column.labels = c("(1)", "(2)", "(3)", "(4)", "(5)"))

**Linear Regression of Green Donations**

	Dependent variable:

	green.donations
	(1)	(2)	(3)	(4)	(5)

age	-15.574^***	-15.483^***	-15.576^***	-14.928^***	-14.929^***
	(0.248)	(0.327)	(0.248)	(0.064)	(0.066)

college	197.928^***		198.054^***
	(5.389)		(5.387)

private.school		2.648		2.266^*	2.648^*
		(7.010)		(1.344)	(1.394)

log(income)				1,711.266^***	1,708.884^***
				(8.154)	(8.516)

single				13.696^***	15.601^***
				(1.352)	(3.703)

kids				-23.687^***	-15.528^***
				(1.421)	(2.972)

white	-7.519	-10.885		2.728^**	2.379^*
	(5.465)	(7.250)		(1.383)	(1.435)

latino			0.072
			(6.566)

gender.f	-28.124^***	-33.092^***	-28.275^***	-27.282^***	-27.925^***
	(5.712)	(7.453)	(5.718)	(1.426)	(1.471)

urban				19.028^***
				(1.579)

north.east.m					7.406
					(7.416)

west.coast.m					9.437
					(7.827)

Constant	1,157.989^***	1,278.980^***	1,153.264^***	-17,847.060^***	-17,825.750^***
	(10.843)	(13.957)	(10.326)	(91.368)	(96.112)


Observations	1,800	1,800	1,800	1,800	1,800
R²	0.742	0.554	0.742	0.984	0.983
Adjusted R²	0.741	0.553	0.741	0.984	0.982
Residual Std. Error	113.073 (df = 1795)	148.579 (df = 1795)	113.131 (df = 1795)	28.441 (df = 1791)	29.422 (df = 1790)
F Statistic	1,289.101^*** (df = 4; 1795)	557.742^*** (df = 4; 1795)	1,287.317^*** (df = 4; 1795)	13,510.470^*** (df = 8; 1791)	11,208.830^*** (df = 9; 1790)

Note:	p<0.1; p<0.05; p<0.01

Learning Objective 6

Students will assess the goodness of fit of the regressions

30) [6 points] Complegte bullet points below describing the goodness of fit of the regression and state which regression is better

(I): Adj. R^2 = .741
(II): Adj. R^2 = .553
(III): Adj. R^2 = .741
(IV): Adj. R^2 = .984
(V): Adj. R^2 = .982
The best regression is: Model 4 because it has the highest Adj. R^2 value.

31) [1 points] UselinearHypothesis() to test simultaneously whether the following variables should be included in ols5 model.

linearHypothesis(mod5, c("white=0", "private.school=0"), white.adjust = "hc1")

## Linear hypothesis test
## 
## Hypothesis:
## white = 0
## private.school = 0
## 
## Model 1: restricted model
## Model 2: green.donations ~ age + age^2 + private.school + log(income) + 
##     single + kids + white + gender.f + north.east.m + west.coast.m
## 
## Note: Coefficient covariance matrix supplied.
## 
##   Res.Df Df      F  Pr(>F)  
## 1   1792                    
## 2   1790  2 3.0802 0.04619 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

32) [2 points] Interpret the results from the test above. Should you keep both variables?

With the F-Statistic being 3.08 and the pvalue of .04619 which is significant at the 99% level, we should keep both variables in the model.

Learning Objective 7

Students will interpret the coefficients from the models

33) [10 points] Interpret the coefficients of the following variables from the best ols model. Write a sentence explaining the intuition behind the result. Complete the following bullet points with your answers.

age:
- interpretation: Every year older the person gets, they donate $14.93 less dollars
- intuition: I believed that the relationship would be opposite, a possible reason for it being this way is that younger people may be more concerned with the environment than older people.
college:
- interpretation: N/A
- intuition: N/A
ln(income):
- interpretation: As income increases by 1% green donations increase by $17.11
- intuition: As people are making more money they feel more willing to donate to help the environment
single:
- interpretation: If someone is single, they donate $13.7 more dollars
- intuition: More income to spend on other things
kids:
- interpretation: As the amount of kids go up by 1 the amount donated goes down by $23.69
- intuition: having more kids forces the parents to spend money on them rather than donating it

Learning Objective 8

Students will comment about potential bias in their estimation strategy

34) [5 points] Write about potential sources of bias in your regression. Hint: Review Chapter 10 HAGS.

Ommited variable bias. If we added a variable for political orientation, i believe that it would have been a great addition to the model.

Learning Objective 9

Students will communicate findings and provide recommentaions based on analysis

35) [5 points] Write a paragraph reporting your findings to the NGO. In this paragraph, include recommendations for increasing green donations for climate change initiatives.

Based on my findings, the best demographic to collect donations from would be young wealthy individuals. This is because according to my best Model, younger individuals are more likely to donate, as well as more wealthy individuals. It would also be beneficial to target people whith few to zero children, as they are more likely to donate as well.