Instruction:

For all the coding tasks below, create a single R script and save it using the file name format {familyname_firstname.R}. Submit the file on or before the deadline.

  1. In a study to estimate the proportion of residents in a certain city and its suburbs who favor the construction of a nuclear power plant, it is found that 63 of 100 urban residents favor the construction while only 59 of 125 suburban residents are in favor. Is there a significant difference between the proportions of urban and suburban residents who favor construction of the nuclear plant? Make use of a P-value.
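
One way the comparison of the two proportions might be set up in R is sketched below. It assumes a two-sided alternative and uses `prop.test()` without the continuity correction so that it matches the usual large-sample z test; both choices are assumptions, and the variable names are illustrative only.

```r
# Sketch: two-sample test of proportions, urban vs. suburban residents.
# Counts are taken directly from the problem statement.
favor <- c(urban = 63, suburban = 59)    # residents in favor
n     <- c(urban = 100, suburban = 125)  # residents sampled

# correct = FALSE turns off Yates' continuity correction (an assumption;
# drop the argument to use R's default corrected test instead).
prop_res <- prop.test(x = favor, n = n,
                      alternative = "two.sided", correct = FALSE)

print(prop_res)
prop_res$p.value  # compare the P-value with the chosen significance level
```

The decision then follows from comparing `prop_res$p.value` with the significance level you adopt.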

  2. A marketing expert for a pasta-making company believes that 40% of pasta lovers prefer lasagna. If 9 out of 20 pasta lovers choose lasagna over other pastas, what can be concluded about the expert’s claim? Use a 0.05 level of significance.
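
Because the sample here is small (n = 20), an exact binomial test is one reasonable option. The sketch below assumes a two-sided alternative (the claim is p = 0.40 versus p ≠ 0.40); the variable names are illustrative.

```r
# Sketch: exact binomial test of H0: p = 0.40 against H1: p != 0.40.
alpha <- 0.05

binom_res <- binom.test(x = 9, n = 20, p = 0.40,
                        alternative = "two.sided")

print(binom_res)

# TRUE would mean the expert's claim is rejected at the 0.05 level.
binom_res$p.value < alpha
```

Note that `binom.test()` computes its two-sided P-value from the exact binomial probabilities, so it may differ from a hand calculation that doubles a one-tail probability.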

  3. In an experiment to study the dependence of hypertension on smoking habits, the following data were taken on 180 individuals:

                    Non-smokers Moderate Smokers Heavy Smokers
    hypertension             21               36            30
    no hypertension          48               26            19
  • Test the hypothesis that the presence or absence of hypertension is independent of smoking habits. Use a 0.05 level of significance.
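
The independence test for the hypertension table might be sketched as follows. The counts come from the table above; the object names (`smoke`, `chi_res`) are illustrative assumptions.

```r
# Sketch: chi-squared test of independence for the 2 x 3 table.
smoke <- matrix(
  c(21, 36, 30,
    48, 26, 19),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    Status  = c("hypertension", "no hypertension"),
    Smoking = c("Non-smokers", "Moderate Smokers", "Heavy Smokers")
  )
)

chi_res <- chisq.test(smoke)

print(chi_res)
chi_res$expected        # expected counts under independence
chi_res$p.value < 0.05  # TRUE means independence is rejected at the 0.05 level
```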
  4. If the result in problem 3 suggests an association between hypertension and smoking habits, compute the Pearson residuals (r) and visualize them using the corrplot() function. Briefly discuss the notable associations between rows and columns.
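
A possible sketch for the residual analysis is given below. It rebuilds the hypertension table so that it runs on its own, and it assumes the corrplot package is installed; the `requireNamespace()` guard simply skips the plot when it is not. Object names are illustrative.

```r
# Sketch: Pearson residuals, r = (observed - expected) / sqrt(expected).
smoke <- matrix(
  c(21, 36, 30,
    48, 26, 19),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    Status  = c("hypertension", "no hypertension"),
    Smoking = c("Non-smokers", "Moderate Smokers", "Heavy Smokers")
  )
)

chi_res   <- chisq.test(smoke)
pearson_r <- chi_res$residuals
print(round(pearson_r, 3))

# is.corr = FALSE tells corrplot() that the matrix is not a correlation
# matrix, so values outside [-1, 1] are allowed.
if (requireNamespace("corrplot", quietly = TRUE)) {
  corrplot::corrplot(pearson_r, is.corr = FALSE)
}
```

A positive residual marks a cell with more observations than independence would predict, and a negative residual marks a cell with fewer; the cells with the largest absolute residuals are the ones worth discussing.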

  5. The following data represent the running times of films produced by two motion-picture companies:

  • Perform appropriate preliminary analysis to check for the assumptions of normality and equal variances.
  • Using the appropriate statistical test based on the results of the preliminary analysis, test the hypothesis that the average running time of films produced by company 2 exceeds the average running time of films produced by company 1 by 10 minutes against the one-sided alternative that the difference is less than 10 minutes. Use a 0.1 level of significance.
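
The workflow for the running-time comparison could be sketched as a small helper function. Everything here is an assumption for illustration: the function name `compare_running_times()`, the 0.05 level used for the preliminary checks, the fallback to a Wilcoxon rank-sum test when normality is doubtful, and above all the input values, which are SIMULATED stand-ins and not the assignment's data.

```r
# Sketch only: x1 and x2 are running times (minutes) for companies 1 and 2.
compare_running_times <- function(x1, x2, delta = 10,
                                  alpha_test = 0.10, alpha_check = 0.05) {
  # Preliminary check 1: Shapiro-Wilk normality test on each sample.
  sw1 <- shapiro.test(x1)
  sw2 <- shapiro.test(x2)
  normal_ok <- sw1$p.value > alpha_check && sw2$p.value > alpha_check

  # Preliminary check 2: F test for equality of the two variances.
  f_res <- var.test(x1, x2)
  equal_var <- f_res$p.value > alpha_check

  # H0: mu2 - mu1 = delta  versus  H1: mu2 - mu1 < delta.
  main <- if (normal_ok) {
    t.test(x2, x1, mu = delta, alternative = "less", var.equal = equal_var)
  } else {
    wilcox.test(x2, x1, mu = delta, alternative = "less")
  }

  list(shapiro   = list(company1 = sw1, company2 = sw2),
       var_test  = f_res,
       main_test = main,
       reject_h0 = main$p.value < alpha_test)
}

# SIMULATED placeholder samples; replace with the data from the problem.
set.seed(1)
sim1 <- rnorm(8, mean = 100, sd = 8)
sim2 <- rnorm(8, mean = 105, sd = 8)

out <- compare_running_times(sim1, sim2)
out$main_test
```

Passing company 2 as the first argument of `t.test()` makes the tested difference mu2 - mu1, which matches the direction of the stated hypotheses.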
  6. The yield of a chemical process is being studied. The two most important variables are thought to be the pressure and the temperature. Three levels of each factor are selected, and a factorial experiment with two replicates is performed. The yield data follow:
  • Test the hypotheses that:

    • there is no difference in the mean yield of the chemical process under different pressure levels
    • there is no difference in the mean yield of the chemical process under different temperature levels
    • the interaction between pressure and temperature has no significant effect on the yield of the chemical process.
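
The three hypotheses above correspond to the three F tests of a two-factor ANOVA with interaction. The sketch below shows the layout only: the level labels (`P1`..`P3`, `T1`..`T3`) and the yield values are SIMULATED placeholders, not the assignment's data, and the object names are illustrative.

```r
# Sketch: 3 x 3 factorial design with two replicates (18 runs in total).
design <- expand.grid(
  pressure    = factor(c("P1", "P2", "P3")),  # hypothetical level labels
  temperature = factor(c("T1", "T2", "T3")),  # hypothetical level labels
  replicate   = 1:2
)

# SIMULATED yields; replace with the observed yields from the problem.
set.seed(42)
design$yield <- rnorm(nrow(design), mean = 90, sd = 0.5)

# pressure * temperature expands to both main effects plus the interaction.
fit <- aov(yield ~ pressure * temperature, data = design)
anova_table <- summary(fit)[[1]]
print(anova_table)
```

Each of the first three rows of `anova_table` gives the F statistic and P-value for one hypothesis: pressure, temperature, and the pressure-by-temperature interaction, in that order.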