You must follow the instructions below to receive credit for this assignment.
Hint: Make sure to discuss the study’s goal, subjects, and variables in the data.
During the 1990s, the U.S. Department of Education conducted a study known as the Early Childhood Longitudinal Study, or ECLS. The goal of this study was to measure the academic progress of more than 20,000 students from kindergarten through 5th grade. In addition to academic performance, the ECLS recorded variables such as race, gender, family structure, socioeconomic status, parent education, and more.
Hint: A correct answer must have a discussion on a main concept of regression, often called “all else being equal”, “controlling for other variables”, or “ceteris paribus”. The author explained this concept using “the circuit board analogy”.
Regression analysis is a statistical tool that economists and data analysts use to identify correlations in a data set. It is most useful when the data contain hundreds of different variables and the goal is to see which of them are correlated. It does this by holding the other variables constant, so that the relationship between two variables can be examined with all else being equal. As the article notes, “X can cause Y, Y can cause X, or some other factor can cause X and Y.” A regression can show correlation, but it cannot demonstrate the cause.
Hint: A correct answer must have a discussion on causality versus correlation.
A drawback of regression analysis, as I mentioned in the previous question, is that it can show correlation but not the cause. The article explains that regression analysis cannot answer a question such as “does having a lot of books in your home lead your child to do well in school?”, but it can answer a question such as “does a child with a lot of books in his home tend to do better than a child with no books?”
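The difference between those two questions can be illustrated with a small simulation. The sketch below is not from the article: the variable names, the coefficients, and the choice of parent education as the hidden common cause are all assumptions made up for illustration. Books are given no causal effect on scores at all, yet the two still come out positively correlated, because both depend on the same third factor.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n=5000, seed=0):
    """Generate made-up (books, scores) pairs that share a hidden cause."""
    rng = random.Random(seed)
    books, scores = [], []
    for _ in range(n):
        parent_edu = rng.gauss(0, 1)  # hypothetical hidden common cause
        books.append(50 + 20 * parent_edu + rng.gauss(0, 10))
        # Note: the number of books does not appear in the score equation.
        scores.append(100 + 8 * parent_edu + rng.gauss(0, 8))
    return books, scores


books, scores = simulate()
r = pearson(books, scores)
print(f"correlation(books, scores) = {r:.2f}")
```

A clearly positive correlation here answers the second question (children with more books tend to score higher) while saying nothing about the first, since in this made-up data adding books to a home would change nothing.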
Hint: See page 150.
Schools attended mostly by black students and schools attended mostly by white students are very similar on the traditional measures of education, such as class size, teacher education, and computer-to-student ratio. However, regardless of which race makes up the majority of the students, problems such as gangs, people loitering on school property, and a lack of PTA funding have a large impact on the learning environment.
Hint: For this question, you may need information beyond the assigned reading. You may Google search something like “how does regression control for variables”.
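To control for the quality of schools, you include a measure of school quality in the regression model alongside the other variables. The model then estimates each variable's relationship with test scores while holding the others constant. This means the effect of school quality is estimated with variables such as race or gender held constant, and comparisons across race or gender are made between students who attend schools of similar quality.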
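A minimal sketch of what controlling for a variable does, using made-up data rather than the ECLS: `group` is a hypothetical 0/1 student characteristic, `quality` is a hypothetical school-quality index, and every coefficient is an assumption chosen for illustration. In this simulated data, scores depend only on school quality, but students in group 1 attend lower-quality schools on average. A regression on `group` alone shows a gap; adding `quality` to the model makes the gap shrink toward zero.

```python
import random


def simple_slope(x, y):
    """Slope from least squares of y on an intercept and x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (c - my) for a, c in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def ols_two_predictors(x1, x2, y):
    """Return (b1, b2) from least squares of y on an intercept, x1 and x2."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    # Solve the 2x2 normal equations for the centered variables.
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


rng = random.Random(1)
group, quality, score = [], [], []
for _ in range(5000):
    g = 1 if rng.random() < 0.5 else 0
    q = rng.gauss(0, 1) - 1.0 * g     # group 1 attends lower-quality schools
    s = 50 + 5 * q + rng.gauss(0, 5)  # score depends on quality only
    group.append(g)
    quality.append(q)
    score.append(s)

raw_gap = simple_slope(group, score)
adj_gap, quality_coef = ols_two_predictors(group, quality, score)
print(f"gap without controls:        {raw_gap:.2f}")
print(f"gap controlling for quality: {adj_gap:.2f}")
print(f"coefficient on quality:      {quality_coef:.2f}")
```

In practice a statistics package would fit the model; the hand-written solver is used here only to keep the example self-contained.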
The main takeaway that I got from this reading is that there is a lot to read between the lines when it comes to data sets. This article shows that most questions do not have black-and-white answers (such as X causes Y or Y causes X); instead, there are hundreds of different variables with many different correlations. Beyond children and their school performance, I related this study to something such as the stock market. There are millions of different variables that can cause a stock to rise or fall, many of which have nothing to do with the company's performance at all.
Hint: Use message, echo and results in the chunk options. Refer to the RMarkdown Reference Guide.