Project 1 improvement
1 Introduction
In this document you will find directions to improve your first draft of Project 1 “Analyzing the US financial market”.
2 Feedback
You have to meet with me to receive feedback about your first version. I will focus on:
Reviewing how you selected and calculated the financial ratios as explanatory variables
How you interpreted the final multiple regression model: you have to take notes on how you can improve your interpretations. At a minimum, your interpretations must include the following:
Provide a “statistical” interpretation of the beta coefficients along with their corresponding p-values and 95% confidence intervals. You have to focus on the type of relationships and their statistical significance.
Interpret the adjusted R-squared of the model
Which is the variable with the highest explanatory power?
Provide a meaningful interpretation in the context of Finance. What does a negative and significant coefficient mean in this context? How can this be helpful for making business decisions?
Provide a conclusion of your interpretation with the most important points for making decisions
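As a reminder of what the 95% confidence interval and the adjusted R-squared measure, they are usually written as shown below. The notation is an assumption added here for illustration: n is the number of observations, k is the number of explanatory variables, and SE is the standard error of the estimated coefficient.

```latex
\hat{\beta}_j \pm t_{0.975,\,n-k-1}\,\mathrm{SE}(\hat{\beta}_j),
\qquad
\bar{R}^2 = 1 - (1 - R^2)\,\frac{n-1}{n-k-1}
```

If the interval for a coefficient does not contain zero, the coefficient is significant at the 5% level, which corresponds to a two-sided p-value below 0.05.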
3 Regarding descriptive statistics of financial ratios
For the industry and the ratios you selected, you have to provide descriptive statistics of the ratios with the following considerations:
You have to use only the most recent fiscal year observations (fiscalmonth=12, year=2023).
Besides calculating the arithmetic mean of the ratios, calculate the weighted average of the ratios. To calculate the weighted average of a ratio, you have to divide the sum of the numerator variable by the sum of the denominator variable. For example, to calculate the weighted average of profit margin, you first sum the net income of all firms and then divide it by the sum of the revenue of all firms.
Compare this weighted average with the arithmetic mean and the median. Which is the best measure of central tendency for the ratios? Interpret the weighted average of each of your ratios for your industry.
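The calculation described above can be sketched as follows. This is only a minimal illustration with made-up numbers: the firms, the values, and the column names netincome and revenue are assumptions, while fiscalmonth and year follow the filter stated above.

```python
from statistics import mean, median

# Hypothetical records (illustrative numbers, not real market data).
rows = [
    {"firm": "A", "year": 2023, "fiscalmonth": 12, "netincome": 50.0, "revenue": 1000.0},
    {"firm": "B", "year": 2023, "fiscalmonth": 12, "netincome": 5.0, "revenue": 50.0},
    {"firm": "C", "year": 2023, "fiscalmonth": 12, "netincome": -2.0, "revenue": 10.0},
    {"firm": "D", "year": 2023, "fiscalmonth": 12, "netincome": 1.0, "revenue": 20.0},
    {"firm": "A", "year": 2022, "fiscalmonth": 12, "netincome": 40.0, "revenue": 900.0},
]

# Keep only the most recent fiscal year observations.
recent = [r for r in rows if r["year"] == 2023 and r["fiscalmonth"] == 12]

# Firm-level profit margin: net income divided by revenue.
margins = [r["netincome"] / r["revenue"] for r in recent]

arithmetic_mean = mean(margins)
median_margin = median(margins)

# Weighted average: sum of the numerator over sum of the denominator.
weighted_average = (
    sum(r["netincome"] for r in recent) / sum(r["revenue"] for r in recent)
)

print("arithmetic mean :", round(arithmetic_mean, 4))
print("median          :", round(median_margin, 4))
print("weighted average:", round(weighted_average, 4))
```

In this made-up sample the arithmetic mean is pulled to about zero by one small firm with a loss, while the weighted average (5%) is dominated by the largest firm. That difference is the kind of contrast you are asked to interpret.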
4 Regression Diagnosis
Do a diagnosis for possible outliers and influential observations.
You have to study regression diagnosis on your own HERE before you work on this section.
You have to do the following:
Using matrix algebra calculate the beta coefficients of your multiple regression model
Using matrix algebra calculate the standard errors of the beta coefficients
Using matrix algebra calculate the R-squared and the adjusted R-squared of the model
Using matrix algebra calculate the Hat Matrix
With the hat matrix identify possible leverage observations. EXPLAIN why these observations can be leverage points.
Identify possible outliers using studentized residuals. EXPLAIN your criteria to identify these outliers, and how studentized residuals work.
Identify possible outliers using Cook’s distance. EXPLAIN your criteria to identify outliers with Cook’s distance and how it works.
Using studentized residuals and Cook’s distance identify possible influential observations. EXPLAIN your criteria.
List the influential observations and decide whether to drop them or keep some of them
Learn about winsorization. Check whether it is necessary to winsorize any of your variables. You have to justify whether you need winsorization or not. If you need to winsorize any variable, do so before you run your 2nd version of the model.
Re-run your multiple regression model without outliers and influential observations
Compare the model with the previous one. Which model was better? Explain the differences
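The steps listed above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not a required implementation: the data are simulated, and the cutoffs (leverage above 2p/n, absolute studentized residual above 3, Cook's distance above 4/n, winsorizing at the 1st and 99th percentiles) are common rules of thumb. You still have to choose and justify your own criteria.

```python
import numpy as np

# Simulated data (assumption for illustration): n firms, k = 2 ratios.
rng = np.random.default_rng(0)
n, k = 30, 2
ratios = rng.normal(size=(n, k))
y = 1.0 + ratios @ np.array([0.5, -0.3]) + rng.normal(scale=0.5, size=n)
y[0] += 5.0  # plant one unusual observation so there is something to flag

X = np.column_stack([np.ones(n), ratios])  # design matrix with intercept
p = X.shape[1]                             # estimated coefficients (k + 1)

# Beta coefficients: (X'X)^-1 X'y
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y

# Standard errors: sqrt of the diagonal of s^2 (X'X)^-1
resid = y - X @ beta
sse = resid @ resid
s2 = sse / (n - p)
se_beta = np.sqrt(np.diag(s2 * XtX_inv))

# R-squared and adjusted R-squared
sst = np.sum((y - y.mean()) ** 2)
r2 = 1 - sse / sst
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p)

# Hat matrix H = X (X'X)^-1 X'; its diagonal holds the leverages
H = X @ XtX_inv @ X.T
h = np.diag(H)

# Internally studentized residuals (use the full-sample s^2)
r_int = resid / np.sqrt(s2 * (1 - h))
# Externally studentized residuals (variance estimated without observation i)
s2_del = (sse - resid**2 / (1 - h)) / (n - p - 1)
t_ext = resid / np.sqrt(s2_del * (1 - h))

# Cook's distance combines residual size and leverage
cooks_d = r_int**2 * h / (p * (1 - h))

# Rule-of-thumb flags (assumed cutoffs, see the note above)
high_leverage = np.where(h > 2 * p / n)[0]
outliers = np.where(np.abs(t_ext) > 3)[0]
influential = np.where(cooks_d > 4 / n)[0]

def winsorize(x, lower=0.01, upper=0.99):
    """Cap values below/above the given quantiles (assumed 1% and 99%)."""
    lo, hi = np.quantile(x, [lower, upper])
    return np.clip(x, lo, hi)

# Re-run the model without the flagged observations
drop = np.union1d(outliers, influential)
keep = np.setdiff1d(np.arange(n), drop)
Xk, yk = X[keep], y[keep]
beta_refit = np.linalg.inv(Xk.T @ Xk) @ Xk.T @ yk

print("beta      :", beta)
print("se(beta)  :", se_beta)
print("R2, adj R2:", r2, adj_r2)
print("flagged   :", high_leverage, outliers, influential)
print("beta refit:", beta_refit)
```

The sum of the leverages equals the number of estimated coefficients p, so the average leverage is p/n. That is why 2p/n is often used as a cutoff for high leverage.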
5 Interpretation of the 2nd version of your model
You have to provide a detailed and clear interpretation of your 2nd version of the model. Check the points I listed in the Feedback section.
You have to deliver your improved project next Tuesday.