Question 1. Create the vectors.
x <- c(5, 10, 15, 20, 25, 30)
y <- c(-1, NA, 75, 3, 5, 8)
z <- c(5)
Question 2. Multiply the first two vectors by the z vector, and store the results in new objects. Print these new vectors.
A <- x * z  # z has length 1, so it is recycled across every element of x
print(A)
## [1] 25 50 75 100 125 150
B <- y * z  # the NA in y stays NA after multiplication
print(B)
## [1] -5 NA 375 15 25 40
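The NA in B comes straight from the NA in y. As a minimal sketch using the same vectors as above (nothing here is new data), the lines below show where the missing value sits and how a summary such as sum() can skip it:

```r
# Same vectors as in Question 1
x <- c(5, 10, 15, 20, 25, 30)
y <- c(-1, NA, 75, 3, 5, 8)
z <- 5

# z has length 1, so R recycles it across every element of y
B <- y * z

# Arithmetic with NA yields NA, so the second element is still missing
is.na(B)

# Without na.rm = TRUE this sum would itself be NA
sum(B, na.rm = TRUE)
```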
Question 3
library(haven)
library(readr)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
stata<- read_dta("C:/Users/chris/Downloads/stata_PSID_w1.dta")
View(stata)
assignment1<-subset(x=stata,select=c("id","age","marpi","adjwlth2","educ","h_race_ethnic_new","race5"))
Question 3.1. How many variables are in this data set, what are their names, and how many observations does the data file contain?
ncol(assignment1)
## [1] 7
names(assignment1)
## [1] "id" "age" "marpi"
## [4] "adjwlth2" "educ" "h_race_ethnic_new"
## [7] "race5"
nrow(assignment1)
## [1] 131361
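The row and column counts can also be read in a single call with dim(). The sketch below uses a small made-up data frame (toy is hypothetical and only stands in for assignment1) so it runs without the Stata file:

```r
# Hypothetical toy data frame; values are invented for illustration
toy <- data.frame(
  id   = 1:4,
  age  = c(34, 51, 27, 45),
  educ = c(12, 16, 14, 12)
)

# dim() returns the number of rows first, then the number of columns
dim(toy)

# names() lists the variable names, as in the answer above
names(toy)
```

Applied to assignment1, dim() should agree with the nrow() and ncol() results shown above.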
Question 3.2. Show the frequency distribution of the race/ethnicity variable.
hist(assignment1$race5, main="Frequency Distribution of Race/Ethnicity")

assignment1$race5 <- factor(assignment1$race5,
                            levels = c(1, 2, 3, 4, 5),
                            labels = c("Latino", "Asian", "Black", "Other", "White"))
barplot(table(assignment1$race5), main="Frequency Distribution of Race")
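A bar plot shows the shape of the distribution, but the counts themselves are easier to report from table(). This is a minimal sketch on invented codes (the codes vector is hypothetical; the real values come from race5 in the Stata file), using the same level-to-label mapping as above:

```r
# Hypothetical category codes, including one missing value
codes <- c(5, 3, 1, 5, 2, 5, 3, NA)

race <- factor(codes,
               levels = 1:5,
               labels = c("Latino", "Asian", "Black", "Other", "White"))

# Counts per category; useNA = "ifany" adds a column for missing values
counts <- table(race, useNA = "ifany")
counts

# Share of each category, rounded to three decimals
round(prop.table(counts), 3)
```

Keeping the NA column visible makes it clear how many cases are excluded from the bar plot.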

Question 3.5. How many people in the data received public assistance? How many Latino respondents received public assistance?
PublicAsst<-subset(stata,select=c("id","pubhs","h_race_ethnic_new","race5"))
View(PublicAsst)
sum(PublicAsst$pubhs == 1, na.rm = TRUE)
## [1] 6961
unique(PublicAsst$race5)
## <labelled<double>[5]>: Race/ethnicity updated codes (5/26/14)
## [1] 5 3 1 4 2
##
## Labels:
## value label
## 1 Latino- Any Race
## 2 NL Asian
## 3 NL Black
## 4 NL Other
## 5 NL White
PublicAsstLatinoRace5 <- filter(PublicAsst, race5 == 1)  # keep Latino respondents only (race5 code 1)
View(PublicAsstLatinoRace5)
sum(PublicAsstLatinoRace5$pubhs == 1, na.rm = TRUE)
## [1] 366
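The same two counts can be obtained without building an intermediate data frame by combining the conditions inside sum(). The sketch below uses a made-up data frame (toy and its values are hypothetical; only the column names pubhs and race5 come from the data above) and assumes, as the code above does, that pubhs == 1 marks receipt of assistance:

```r
# Hypothetical rows; NA marks a missing response
toy <- data.frame(
  pubhs = c(1, 0, 1, NA, 1, 0),
  race5 = c(1, 1, 5, 1, NA, 3)
)

# Everyone coded as receiving assistance
n_all <- sum(toy$pubhs == 1, na.rm = TRUE)

# Recipients who are also coded Latino (race5 == 1);
# rows where the combined condition is NA are dropped by na.rm = TRUE
n_latino <- sum(toy$pubhs == 1 & toy$race5 == 1, na.rm = TRUE)

n_all
n_latino
```

Note that a row with a missing value in either column is not counted when the combined condition evaluates to NA, which matches the behaviour of filtering first and summing afterwards.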
Question 3.6. Is there anything you wish to know about individuals’ experiences that is not included in the data set? (Note: the unit of analysis here is the individual. This is an open-ended question; examples include occupation, childhood maltreatment, and neighborhood characteristics.) List three variables that you wish you had access to.
"occupation, child and parent information, Literacy level,Household size, Educational level of Parents/Guardian
Number of Children, Type of occupation/occupation"
## [1] "occupation, child and parent information, Literacy level,Household size, Educational level of Parents/Guardian\nNumber of Children, Type of occupation/occupation"