What is a sample?
What is a population? (Think about sample mean and population mean)
What is inferential statistics? (Try to connect sample to population)
What are the common types of data?
How will you describe a continuous variable and a categorical variable?
What plot will you choose to visualize a continuous variable or a categorical variable?
What is left-skewed data and what is right-skewed data?
What is the numeric order of mean, median, and mode for right-skewed data? What is the order if the data is normally distributed?
How will you choose to visualize the association between two variables if the two variables are:
What are the common steps of hypothesis testing?
How will you explain a p-value to your neighbor who knows nothing about statistics?
What do you say if a test's p-value is smaller than 0.05?
How will you explain a confidence interval to your neighbor who somehow got interested in statistics after your previous explanation?
If you have to choose one, will you report the p-value or the confidence interval, and why?
What is a type-I error and what is a type-II error?
How do type-I and type-II errors connect with power (or sample size)?
When do you use each of the following tests?
What test will you choose to understand the association between the following two variables, if:
What will you do if an ANOVA test is significant (comparison made across multiple groups)?
What will you suggest if multiple tests are conducted simultaneously?
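One common suggestion for the multiple-testing question above is to adjust the p-values before comparing them with the significance level. The sketch below shows a Bonferroni correction in plain Python; the function name `bonferroni` and the example p-values are made up for illustration, and Bonferroni is only one of several possible corrections.

```python
def bonferroni(p_values, alpha=0.05):
    """Bonferroni correction (illustrative helper, not from any library).

    Each p-value is multiplied by the number of tests and capped at 1.
    A test is declared significant if its adjusted p-value is <= alpha.
    """
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    reject = [p_adj <= alpha for p_adj in adjusted]
    return adjusted, reject


# Hypothetical p-values from four tests run at the same time.
p_values = [0.01, 0.04, 0.03, 0.20]
adjusted, reject = bonferroni(p_values)

print(reject)  # [True, False, False, False]
```

Without the correction, three of these four hypothetical tests would fall below 0.05; after it, only the first does. This illustrates why the adjustment matters when many tests are run together.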
What is simple linear regression, and what is multiple linear regression?
When do you use linear regression, and what are the assumptions?
What if the assumptions were violated?
What if the outcome variable is not numeric, for example binary, ordinal, or nominal?
How do you interpret \(\beta_1\) in this hypothetical model: \(\text{income} = \beta_0 + \beta_1 \cdot \text{age} + \beta_2 \cdot \text{gender} + \beta_3 \cdot \text{education years}\)?
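As a hint for the interpretation question above, one way to reason about \(\beta_1\) is to compare the expected income at two ages that differ by one year while gender and education years stay fixed. The symbols \(a\), \(g\), and \(e\) below are placeholders introduced for this sketch, and the model is assumed to be linear with no interaction terms:

\[
\begin{aligned}
&E[\text{income} \mid \text{age} = a + 1,\ \text{gender} = g,\ \text{education years} = e] \\
&\quad - E[\text{income} \mid \text{age} = a,\ \text{gender} = g,\ \text{education years} = e] \\
&= \bigl(\beta_0 + \beta_1 (a + 1) + \beta_2 g + \beta_3 e\bigr) - \bigl(\beta_0 + \beta_1 a + \beta_2 g + \beta_3 e\bigr) \\
&= \beta_1
\end{aligned}
\]

Under these assumptions, \(\beta_1\) is the expected difference in income associated with a one-year difference in age, holding gender and education years constant.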
What do you know about machine learning?
Is regression a type of machine learning?
What other types of machine learning models do you know of, and when are they commonly used?
What statistical software do you use?
How familiar are you with each?
Pick one software package and give examples of what you used it for. (Consider a past project in which you used this software: what did you do, e.g., regression, t-test, data transformation, visualization?)
Why do you prefer this software to the others that you mentioned?