Problem 1

A Catalan polling agency conducted a study on Catalan independence. It ran an opinion poll in which 1200 respondents were asked whether they approve of the independence of Catalonia from Spain. The agency's analytical report included the following result: the 95% confidence interval for the percentage of people voting for the independence of Catalonia is (75%; 80%).

A student decided to interpret this result. He wrote:

(1) The percentage of all people supporting the independence of Catalonia lies between 75% and 80%. (2) To be more precise, there is 0.95 probability that the true percentage of people who approve the Catalan independence is between 75% and 80%.

1.1. This interpretation includes incorrect statements. Find them and explain why they are incorrect. Give the number of each wrong statement (1-2), followed by your explanation.

1.2. Provide your own interpretation of the 95% confidence interval described above.
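
As a side check on the numbers in the report, the sketch below recomputes a 95% interval with the usual normal-approximation formula for a proportion. The sample proportion of 0.775 is an assumption (the midpoint of the reported interval); the report does not state it, and the agency may have used a different method.

```r
# Normal-approximation (Wald) 95% confidence interval for a proportion.
n     <- 1200    # number of respondents, from the problem statement
p_hat <- 0.775   # ASSUMED sample proportion: midpoint of the reported interval

se <- sqrt(p_hat * (1 - p_hat) / n)   # standard error of the sample proportion
z  <- qnorm(0.975)                    # two-sided 95% critical value, about 1.96
ci <- p_hat + c(-1, 1) * z * se       # lower and upper bounds

print(round(100 * ci, 1))             # bounds in percent
```

Under that assumption the bounds come out close to 75% and 80%, which is in line with the reported interval.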

Problem 2

Figures A-C show 95% confidence intervals for the mean of the same population. The mean was estimated from samples of different sizes. Match each figure with its sample size \(n\) (\(n=50\), \(n=200\), \(n=1000\)) and explain your answer.

Problem 3

Figures A-C show the same normal distribution with different confidence intervals for the mean marked (shaded area). Decide which figure corresponds to each of the following confidence levels: 90%, 95%, 99%. Explain your choice.

Problem 4

Polling agency X organized a survey to evaluate citizens’ attitudes towards the smoking ban policy. If this survey were repeated 100 times by different polling organizations, in 90 cases the sample proportion of people favoring the policy would lie between 0.44 and 0.49, and in 10 cases it would lie outside this range. Express the same idea using the statistical concept(s) you know.

Problem 5

A hypothesis test comparing two population means results in a p-value of 0.04.

Choose all the incorrect statements. Explain why these statements are incorrect (one explanation for each statement).

Problem 6

Consider the following situation. We want to decide whether person N is ill. Our null hypothesis is that this person is not ill. We have some empirical data: the results of medical tests. What can be regarded as the p-value in this case? (Describe the idea; you do not need to invent any numbers or do any calculations.)

Problem 7

A group of students pursuing a master's degree in sociology drew a sample of 30 political science students and asked them how many hours a week they spend reading the news.

7.1. Which measure, the standard deviation or the standard error of the mean, estimates the error they make when they approximate the population mean by the sample mean?

7.2. Is it possible for them to obtain a standard deviation of 2 and a standard error of the mean of 6? Explain your answer.
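
For readers who want to see the two quantities side by side, here is a minimal sketch in R. The data are entirely hypothetical (randomly generated, not taken from any survey), and the variable names are illustrative only.

```r
# Sample standard deviation vs. standard error of the mean on toy data.
set.seed(1)                        # fixed seed so the toy example is reproducible
hours <- rexp(30, rate = 1 / 5)    # HYPOTHETICAL data: 30 made-up weekly hours

s  <- sd(hours)                    # sample standard deviation
se <- s / sqrt(length(hours))      # standard error of the mean: s / sqrt(n)

print(c(sd = s, se = se))
```

Rerunning the sketch with other seeds or sample sizes shows how the two numbers relate to each other.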

Problem 8

Imagine that you are conducting research on parliamentary representation in presidential and parliamentary European states. As part of your research, you need to decide whether the average number of seats held by all opposition parties differs between countries with a presidential system and countries with a parliamentary system.

For this particular problem you have a small dataset (hw2-dpi.csv) that contains data on European countries in 1990–2015. It is a subset of the large Database of Political Institutions (DPI).

Load the dataset for the task. Your variables of interest are numopp (number of seats held by opposition parties) and system (0 - presidential, 1 - parliamentary).
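
One possible way to load and inspect the data is sketched below. It assumes that hw2-dpi.csv sits in the current working directory and is a plain comma-separated file with a header row; the object name dpi is arbitrary. The sketch does not handle any special missing-value codes the file may use, so check the output before going further.

```r
# Load the homework dataset and take a first look at the two variables.
# ASSUMPTION: the file is in the working directory; adjust `path` otherwise.
path <- "hw2-dpi.csv"

if (file.exists(path)) {
  dpi <- read.csv(path)                       # comma-separated, header row assumed

  str(dpi[, c("numopp", "system")])           # types of the variables of interest
  print(table(dpi$system, useNA = "ifany"))   # observations per system (0/1)
  print(summary(dpi$numopp))                  # range and missing values in numopp
} else {
  message("File not found: ", path)
}
```

This only describes the data; it does not carry out the hypothesis test asked for below.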

8.1. Formulate the null hypothesis you are going to test. Formulate the alternative hypothesis.

8.2. Run commands in R to test your null hypothesis. Provide the code you use.

8.3. Interpret the output you get. Remember that your interpretation should include a correct statistical explanation as well as substantive conclusions.