Instructions

You must follow the instructions below to receive credit for this assignment.

Q1 Describe the Early Childhood Longitudinal Study that the U.S. Department of Education undertook in the late 1990s.

Hint: Make sure to discuss the study’s goal, subjects, and variables in the data.

The purpose of this study was to measure the academic progress of more than 20,000 children from kindergarten through fifth grade, in relation to factors such as race, economic status, and gender (to name a few). Students of various ethnicities, socioeconomic backgrounds, and community types (city vs. rural) were studied to improve the Department of Education’s understanding of which factors most harm children’s academic performance.

Q2 Describe regression analysis in your own words.

Hint: A correct answer must discuss a main concept of regression, often called “all else being equal”, “controlling for other variables”, or “ceteris paribus”. The author explained this concept using “the circuit board analogy”.

Regression analysis is used in situations where there are too many variables to draw conclusions about the correlation between any two of them directly. It helps isolate a few variables in order to get more accurate results about how they are correlated. You should always try to hold the other variables as equal as you can within your analysis, so that only the variable of interest changes.
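To make the "all else being equal" idea concrete, here is a minimal sketch in Python using only the standard library. The data are invented for illustration and do not come from the ECLS; the variable names (`parent_educ`, `books_home`, `score`) and the helper functions `solve` and `ols` are hypothetical. By construction the score depends only on `parent_educ`, yet `books_home` looks important until education is held fixed.

```python
# Made-up data for illustration only; none of these numbers come from the ECLS.
parent_educ = [10, 12, 12, 14, 16, 16, 18, 20]   # hypothetical years of schooling
books_home = [5, 20, 10, 30, 40, 60, 50, 80]     # hypothetical book counts
# By construction, the score depends on parent_educ only, never on books_home.
score = [40 + 3 * e for e in parent_educ]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(columns, y):
    """Least-squares coefficients [intercept, slope_1, ...] via the normal equations."""
    X = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


simple = ols([books_home], score)                 # books as the only predictor
multiple = ols([parent_educ, books_home], score)  # books with education held fixed

print("books coefficient, alone:", round(simple[1], 3))
print("books coefficient, controlling for education:", round(multiple[2], 3))
```

In this toy setup the coefficient on books is clearly positive when books is the only predictor and drops to roughly zero once education enters the model, which is what controlling for another variable means in practice.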

Q3 What is a drawback of regression analysis? What types of questions can regression analysis not answer?

Hint: A correct answer must have a discussion on causality versus correlation.

Since there are so many variables, it’s often hard to assign causation to one factor or another, although regression does help to demonstrate correlation between variables. Causation is different from correlation: causation is when one variable causes another to appear more frequently, while correlation means only that two variables tend to appear together, whatever the reason.
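The difference between correlation and causation can be sketched with a small Python example (standard library only). Everything here is invented for illustration: `family_resources` is a hypothetical hidden factor that drives both `books_home` and `test_score`. The score is computed without ever looking at the number of books, yet the two still move together.

```python
from math import sqrt

# Hypothetical hidden factor that influences both observed variables.
family_resources = [1, 2, 3, 4, 5, 6, 7, 8]

# Books at home follow family resources, plus a small made-up wobble.
books_home = [z + b for z, b in zip(family_resources, [0.5, -0.5] * 4)]

# Test score is built from family resources only; books_home is never used.
test_score = [2 * z + b for z, b in zip(family_resources, [1, 1, -1, -1] * 2)]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson(books_home, test_score)
print("correlation between books and score:", round(r, 3))
```

The correlation comes out close to 1 even though, by construction, changing `books_home` could not change `test_score`. A regression on these two columns alone would report the association but could not reveal that a third factor produced it.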

Q4 What role does the quality of schools play in the academic performance of students?

Hint: See page 150.

The quality of the school plays a large part in determining students’ academic success. Growing up around gangs, a lack of funding, and high student-teacher ratios all contribute to a poor learning experience, compared with that of a student in the suburbs who attends a safer school and has a higher socioeconomic status.

Q5 Continued from Q4. How could you control for the quality of schools in the study?

Hint: For this question, you may need information beyond the assigned reading. You may Google something like “how does regression control for variables”.

When selecting controls for regression analysis, the best route is to choose variables that have a large impact on the two variables you’re trying to isolate. So you could make the control schools those in middle-class environments, then see how the grades of children in poorer school districts compare with those of the middle class. You could also compare rich to middle class, and finally find the difference between the grades of children in poor and rich schools.
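One common way to control for school quality, offered here as a sketch and not necessarily the approach the author has in mind, is to compare each student only with classmates at the same school. Subtracting each school's average from its students' values removes anything that is constant within a school, which is equivalent to adding one indicator variable per school to the regression. The two schools, the `reading_hours` variable, and all numbers below are hypothetical.

```python
from collections import defaultdict

# (school, weekly reading hours, test score); invented data.
# By construction: score = school effect (50 for A, 70 for B) + 2 * reading_hours.
students = [
    ("A", 1, 52), ("A", 2, 54), ("A", 3, 56),
    ("B", 4, 78), ("B", 5, 80), ("B", 6, 82),
]


def slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Pooled estimate: ignores which school each student attends.
pooled = slope([s[1] for s in students], [s[2] for s in students])

# Within-school estimate: subtract each school's mean before fitting.
by_school = defaultdict(list)
for school, hours, score in students:
    by_school[school].append((hours, score))

dx, dy = [], []
for rows in by_school.values():
    mean_hours = sum(h for h, _ in rows) / len(rows)
    mean_score = sum(s for _, s in rows) / len(rows)
    dx += [h - mean_hours for h, _ in rows]
    dy += [s - mean_score for _, s in rows]

within = slope(dx, dy)

print("pooled slope:", round(pooled, 2))
print("within-school slope:", round(within, 2))
```

In this toy data the pooled slope is inflated because the better school also has students who read more, while the within-school slope recovers the value of 2 that was built into the data.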

Q6 The author says that regression is more art than science. What does he mean?

Regression analysis is more about interpretation than about clean data. The results will vary depending on the selected variables, and no single set formula can be used. Some trial and error is often part of the process, given the number of variables.

Q7 What are major takeaways from the reading?

Regression analysis is good for showing correlations between variables, but not for establishing causation. Because there are so many variables, it takes time and effort to reach conclusions about your data. You have to learn to work with the data yourself to get good, accurate results from regression analysis.

Q8 Hide the messages, but display the code and its results on the webpage.

Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.

Q9 Display the title and your name correctly at the top of the webpage.

Q10 Use the correct slug.