Lab 0

Your name: Mo Pei

Collaborators' names:


Exercise 1:

Q: How many women were treated? How many men?

# code for Exercise 1 is already entered below as an example
tally( treat~sex, data=HELPrct, format="count" )
##      sex
## treat female male
##   no      55  173
##   yes     52  173

A: 52 women and 173 men were treated out of 107 women and 346 men.
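The totals in this answer can be checked with a minimal base R sketch. It assumes the four counts are typed in by hand from the `tally()` output above rather than recomputed from `HELPrct`, and the object names (`treat_by_sex`, `sex_totals`, `treated_share`) are illustrative only.

```r
# Counts copied by hand from the tally() output above (not recomputed from HELPrct)
treat_by_sex <- matrix(
  c(55, 52,      # female: no, yes
    173, 173),   # male:   no, yes
  nrow = 2,
  dimnames = list(treat = c("no", "yes"), sex = c("female", "male"))
)

# Column totals give the number of women and men in the sample
sex_totals <- colSums(treat_by_sex)

# Share of each sex that was treated
treated_share <- treat_by_sex["yes", ] / sex_totals

print(sex_totals)
print(round(treated_share, 3))
```

Dividing the "yes" row by the column totals shows roughly what fraction of each sex was treated, which is easier to compare than the raw counts.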

Exercise 2: Making a Table

Q: Please generate a table showing the number of subjects who prefer different drugs. Which drug is most commonly preferred?

# see R tutorial #1 for code.
tally( ~substance, data=HELPrct, format="count" )
## 
## alcohol cocaine  heroin 
##     177     152     124

A: Alcohol is the most commonly preferred drug (177 subjects, versus 152 for cocaine and 124 for heroin).
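As a quick check of this answer, the sketch below picks out the largest count and converts the counts to percentages. It assumes the three counts are copied by hand from the table above; the variable names are made up for illustration.

```r
# Counts copied by hand from the tally() output above
substance_counts <- c(alcohol = 177, cocaine = 152, heroin = 124)

# Name of the substance with the largest count
most_preferred <- names(which.max(substance_counts))

# Percentage of all subjects preferring each substance
shares <- round(100 * substance_counts / sum(substance_counts), 1)

print(most_preferred)
print(shares)
```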

Exercise 3: Examining a Bar Chart

Q: Please generate a bar chart showing the number of subjects who prefer different drugs. Then identify which drug is most popular for women, and which is most popular for men. (You will have to modify the code to get the right chart.)

bargraph(~substance, group=sex, data = HELPrct, auto.key=TRUE, horizontal = TRUE)

[Figure: bar chart of preferred substance, grouped by sex]

A: The drug most used by women in this sample is cocaine.

Exercise 4: Race and drug preference

Q: Using the ‘racegrp’ variable, write code to examine the relationship in the sample between race and preferred drug. Write a few sentences describing this relationship, making reference to any figures or charts you generate.

tally(substance~racegrp,data = HELPrct,format = "count")
##          racegrp
## substance black hispanic other white
##   alcohol    55       17     9    96
##   cocaine   125        7     7    13
##   heroin     31       26    10    57
tally(substance~racegrp,data = HELPrct)
##          racegrp
## substance   black hispanic   other   white
##   alcohol 0.26066  0.34000 0.34615 0.57831
##   cocaine 0.59242  0.14000 0.26923 0.07831
##   heroin  0.14692  0.52000 0.38462 0.34337
bargraph(~substance, group=racegrp, data = HELPrct, auto.key=TRUE, horizontal = TRUE)

[Figure: bar chart of preferred substance, grouped by racegrp]

A: Based on the chart and the tables, cocaine is the most common choice among Black subjects (59.2%), heroin among Hispanic subjects (52.0%), and alcohol among white subjects (57.8%). In the "other" group, heroin is the most common choice (38.5%).

Black subjects form the largest group in the dataset with 211 people, followed by white subjects with 166. The "other" group is the smallest, with 26 people.

Specifically, although white subjects mostly prefer alcohol, they also account for the largest number of subjects who prefer heroin (57).

One interesting point is that Black subjects are the only group in which cocaine outnumbers heroin, 125 versus 31. In every other group, more subjects prefer heroin than cocaine.
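The group sizes, percentages, and comparisons quoted in this answer can be reproduced with a short base R sketch. It assumes the counts are entered by hand from the first `tally()` table in this exercise instead of being read from `HELPrct`, and all object names are illustrative.

```r
# Counts copied by hand from the tally(..., format = "count") output above
counts <- matrix(
  c(55, 125, 31,    # black:    alcohol, cocaine, heroin
    17,   7, 26,    # hispanic
     9,   7, 10,    # other
    96,  13, 57),   # white
  nrow = 3,
  dimnames = list(
    substance = c("alcohol", "cocaine", "heroin"),
    racegrp   = c("black", "hispanic", "other", "white")
  )
)

# Number of subjects in each race group
group_sizes <- colSums(counts)

# Within-group proportions (each column sums to 1)
col_props <- prop.table(counts, margin = 2)

# Most common substance in each group
top_substance <- rownames(counts)[apply(counts, 2, which.max)]
names(top_substance) <- colnames(counts)

# Groups in which cocaine is preferred by more subjects than heroin
cocaine_over_heroin <- colnames(counts)[counts["cocaine", ] > counts["heroin", ]]

print(group_sizes)
print(round(col_props, 3))
print(top_substance)
print(cocaine_over_heroin)
```

Using `margin = 2` makes the proportions conditional on race group, which matches the second table above; `margin = 1` would instead describe the racial makeup of each substance's users.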

Exercise 5: Generalizing to the Population

Q: You have explored some relationships in the above between different variables. Do these data give you evidence that these relationships would hold for the US population in general? Why or why not. Explain in one or two sentences.

A: No, these data do not represent the U.S. population. The sample is not random, so it is biased. For example, the study only included people who speak Spanish or English. Also, patients without contact information were excluded. The data include only adults and ignore teenagers. Finally, 6 months versus 2 years is a big difference when it comes to a drug's effect.