{r setup, include=FALSE} #####DO NOT MODIFY THIS CODE knitr::opts_chunk$set(echo = TRUE) library(tidyverse) library(knitr) #####DO NOT MODIFY THIS CODE - This will import the survey data we have been working with in this course. dat <- drop_na(read.csv(url("https://www.dropbox.com/s/uhfstf6g36ghxwp/cces_sample_coursera.csv?raw=1")))

Problem 1

Create a vector of five numbers of your choice between 0 and 10, save that vector to an object, and use the sum() function to calculate the sum of the numbers.


JJF <- c(2,3,5,8,9)
sum(JJF)

Problem 2

Create a data frame that includes two columns. One column must have the numbers 1 through 5, and the other column must have the numbers 6 through 10. The first column must be named “alpha” and the second column must be named “beta”. Name the object “my_dat”. Display the data.

Put your code and solution here:


alpha <- c(1,2,3,4,5)
beta <- c(6,7,8,9,10)
my_dat <- data.frame(alpha, beta)
my_dat

Problem 3

Using the data frame created in Problem 2, use the summary() command a create a five-number summary for the column named “beta”.

Put your code and solution here:


summary(my_dat$beta)


# Problem 4

There is code for importing the example survey data that will run automatically in the setup chunk for this report (Line 13). 
Using that data, make a boxplot of the Family Income column using the Base R function (not a figure drawn using qplot). 
Include your name in the title for the plot. Your name should be in the title. Relabel that x-axis as "Family Income".

Hint: consult the codebook to identify the correct column name for the family income question.

Put your code and solution here:

```{r,problem4}

view(dat)
boxplot(dat$faminc_new, main="Boxplot CCES Family Income - by Jennifer Fowler", xlab = "Family Income")

Problem 5

Using the survey data, filter to subset the survey data so you only have male survey respondents who live in the northwest or midwest of the United States, are married, and identify as being interested in the news most of the time.

Use the str() function to provide information about the resulting dataset.

Put your code and solution here:

```{r problem5,include=TRUE,echo=TRUE}

men_nwmw_married_news <- filter(dat, gender==1, region<=2, marstat==1, newsint==1) men_nwmw_married_news str(men_nwmw_married_news)


# Problem 6

Filter the data the same as in Problem 5. Use a R function to create a frequency table for the responses for the question asking whether these survey respondents are invested in the stock market. 

Put your code and solution here:
  
  ```{r problem6,include=TRUE,echo=TRUE}

men_nwmw_married_news <- filter(dat, gender==1, region<=2, marstat==1, newsint==1)
table(men_nwmw_married_news$investor)

Problem 7

Going back to using all rows in the dataset, create a new column in the data using mutate that is equal to either 0, 1, or 2, to reflect whether the respondent supports increasing the standard deduction from 12,000 to 25,000, supports cutting the corporate income tax rate from 39 to 21 percent, or both (so, support for neither policy equals 0, one of the two policies equals 1, and both policies equals two). Name the column “tax_scale”. Hint: you will need to use recode() as well.

Display the first twenty elements of the new column you create.

Put your code and solution here:

```{r problem7,include=TRUE,echo=TRUE}

dat2 <- dat %>% mutate(tax_scale = CC18_325d + CC18_325a) tax_scale <- recode(dat2\(tax_scale,`4`=0,`3`=1,`2`=2, ) dat2\)tax_scale <- tax_scale head(dat2$tax_scale, 20)


# Problem 8

Use a frequency table command to show how many 0s, 1s, and 2s are in the column you created in Problem 7.

Put your code and solution here:
  
  ```{r problem8,include=TRUE,echo=TRUE}

table(dat2$tax_scale)

Problem 9

Again using all rows in the original dataset, use summarise and group_by to calculate the average (mean) job of approval for President Trump in each of the four regions listed in the “region” column.

Put your code and solution here:

```{r problem9}

test <- rename(dat,trump_approval=CC18_308a) test$trump_approval dat <- test trump_rating <- dat %>% group_by(region) %>% summarize(mean(trump_approval)) trump_rating


# Problem 10

Again start with all rows in the original dataset, use summarise() to create a summary table for survey respondents who  
are not investors and who have an annual family income of between $40,000 and $119,999 per year. 
The table should have the mean, median and standard deviations for the importance of religion column.

Put your code and solution here:
  
  ```{r problem10}

no_invest_mid_inc <- filter(dat, investor==2 & faminc_new>=5 & faminc_new<=10)
no_invest_mid_inc %>% summarise(mean(pew_religimp), median(pew_religimp), sd(pew_religimp))

Problem 11

Use kable() and the the summarise() function to create a table with one row and three columns that provides the mean, median, and standard deviation for the column named faminc_new in the survey data.

Put your code and solution here:

```{r problem11}

kable(summarise(dat,Mean=mean(dat\(faminc_new),Median=median(dat\)faminc_new), StDev=sd(dat$faminc_new)),label=“Summary Statistics for Family Income”)


# Problem 12

With the survey data, use qplot() to make a histogram of the column named pid7. Change the x-axis label to "Seven Point Party ID" and the y-axis label to "Count".

Note: you can ignore the "stat_bin()" message that R generates when you draw this. The setup for the code chunk will suppress the message.

Put your code and solution here:
  
  ```{r problem12,message=FALSE}

qplot(x=pid7, data=dat, geom="histogram", bins=14, main="DEMOCRAT or REPUBLICAN", xlab="Seven Point Party ID", ylab="Count")