The Human Freedom Index (HFI) data set has 1,458 rows (observations) and 123 columns (variables). Each row represents one country in one year of the HFI.
## [1] 1458 123
## [1] "year" "ISO_code"
## [3] "countries" "region"
## [5] "pf_rol_procedural" "pf_rol_civil"
## [7] "pf_rol_criminal" "pf_rol"
## [9] "pf_ss_homicide" "pf_ss_disappearances_disap"
## [11] "pf_ss_disappearances_violent" "pf_ss_disappearances_organized"
## [13] "pf_ss_disappearances_fatalities" "pf_ss_disappearances_injuries"
## [15] "pf_ss_disappearances" "pf_ss_women_fgm"
## [17] "pf_ss_women_missing" "pf_ss_women_inheritance_widows"
## [19] "pf_ss_women_inheritance_daughters" "pf_ss_women_inheritance"
## [21] "pf_ss_women" "pf_ss"
## [23] "pf_movement_domestic" "pf_movement_foreign"
## [25] "pf_movement_women" "pf_movement"
## [27] "pf_religion_estop_establish" "pf_religion_estop_operate"
## [29] "pf_religion_estop" "pf_religion_harassment"
## [31] "pf_religion_restrictions" "pf_religion"
## [33] "pf_association_association" "pf_association_assembly"
## [35] "pf_association_political_establish" "pf_association_political_operate"
## [37] "pf_association_political" "pf_association_prof_establish"
## [39] "pf_association_prof_operate" "pf_association_prof"
## [41] "pf_association_sport_establish" "pf_association_sport_operate"
## [43] "pf_association_sport" "pf_association"
## [45] "pf_expression_killed" "pf_expression_jailed"
## [47] "pf_expression_influence" "pf_expression_control"
## [49] "pf_expression_cable" "pf_expression_newspapers"
## [51] "pf_expression_internet" "pf_expression"
## [53] "pf_identity_legal" "pf_identity_parental_marriage"
## [55] "pf_identity_parental_divorce" "pf_identity_parental"
## [57] "pf_identity_sex_male" "pf_identity_sex_female"
## [59] "pf_identity_sex" "pf_identity_divorce"
## [61] "pf_identity" "pf_score"
## [63] "pf_rank" "ef_government_consumption"
## [65] "ef_government_transfers" "ef_government_enterprises"
## [67] "ef_government_tax_income" "ef_government_tax_payroll"
## [69] "ef_government_tax" "ef_government"
## [71] "ef_legal_judicial" "ef_legal_courts"
## [73] "ef_legal_protection" "ef_legal_military"
## [75] "ef_legal_integrity" "ef_legal_enforcement"
## [77] "ef_legal_restrictions" "ef_legal_police"
## [79] "ef_legal_crime" "ef_legal_gender"
## [81] "ef_legal" "ef_money_growth"
## [83] "ef_money_sd" "ef_money_inflation"
## [85] "ef_money_currency" "ef_money"
## [87] "ef_trade_tariffs_revenue" "ef_trade_tariffs_mean"
## [89] "ef_trade_tariffs_sd" "ef_trade_tariffs"
## [91] "ef_trade_regulatory_nontariff" "ef_trade_regulatory_compliance"
## [93] "ef_trade_regulatory" "ef_trade_black"
## [95] "ef_trade_movement_foreign" "ef_trade_movement_capital"
## [97] "ef_trade_movement_visit" "ef_trade_movement"
## [99] "ef_trade" "ef_regulation_credit_ownership"
## [101] "ef_regulation_credit_private" "ef_regulation_credit_interest"
## [103] "ef_regulation_credit" "ef_regulation_labor_minwage"
## [105] "ef_regulation_labor_firing" "ef_regulation_labor_bargain"
## [107] "ef_regulation_labor_hours" "ef_regulation_labor_dismissal"
## [109] "ef_regulation_labor_conscription" "ef_regulation_labor"
## [111] "ef_regulation_business_adm" "ef_regulation_business_bureaucracy"
## [113] "ef_regulation_business_start" "ef_regulation_business_bribes"
## [115] "ef_regulation_business_licensing" "ef_regulation_business_compliance"
## [117] "ef_regulation_business" "ef_regulation"
## [119] "ef_score" "ef_rank"
## [121] "hf_score" "hf_rank"
## [123] "hf_quartile"
See below.
plot(hfi$pf_expression_control, hfi$pf_score,
     xlab = "Political Pressures and Media Control Score",
     ylab = "Personal Freedom Score",
     col = "skyblue",
     main = "Scatter Plot of pf_expression_control vs. pf_score")
lm_model <- lm(pf_score ~ pf_expression_control, data = hfi)
abline(lm_model, col = "red")

A scatter plot makes it much clearer that there is a linear relationship between “pf_expression_control” and “pf_score.”
The direction of the relationship is positive: as the “pf_expression_control” score increases, the “pf_score” tends to increase. Because HFI indicators are scored so that higher values mean more freedom, a higher “pf_expression_control” score indicates fewer political pressures and controls on media content. In other words, countries with less political pressure on media tend to have higher personal freedom scores. The relationship looks strong, because the data points cluster closely around the regression line.
## # A tibble: 1 × 1
## `cor(pf_expression_control, pf_score, use = "complete.obs")`
## <dbl>
## 1 0.796
Call:
lm(formula = y ~ x, data = pts)
Coefficients:
(Intercept)            x
     4.7963       0.4348
The smallest sum of squares I obtained was 112.255.
## [1] 952.1532
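To illustrate why the least squares line gives the smallest sum of squares, here is a minimal sketch on simulated data (not the HFI data). The simulated coefficients, the helper `sum_sq()`, and the eyeballed line are assumptions made purely for illustration.

```r
# Simulated stand-in data; the true intercept, slope, and noise level are made up
set.seed(42)
x <- runif(100, min = 0, max = 10)
y <- 4.6 + 0.49 * x + rnorm(100, sd = 0.8)

# Hypothetical helper: sum of squared residuals for the line y = b0 + b1 * x
sum_sq <- function(b0, b1) sum((y - (b0 + b1 * x))^2)

# Least squares fit and its sum of squared residuals
fit <- lm(y ~ x)
ss_ols <- sum_sq(unname(coef(fit)[1]), unname(coef(fit)[2]))

# An eyeballed line, similar in spirit to clicking two points on the plot
ss_guess <- sum_sq(4.8, 0.43)

# The fitted line can never do worse than any other line
ss_ols <= ss_guess
```

Any line chosen by eye can match the least squares line at best, so its sum of squares is never smaller.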
The slope of the regression line is positive, which means that a higher “pf_expression_control” score is associated with a higher human freedom score. Because higher HFI scores mean more freedom, this direct relationship indicates that countries with less political pressure on media content tend to have greater overall human freedom in the HFI data set.
plot(hfi$pf_expression_control, hfi$hf_score,
     xlab = "Political Pressures and Media Control Score",
     ylab = "Total Human Freedom Score",
     col = "green",
     main = "Scatter Plot of pf_expression_control vs. hf_score")
lm_model <- lm(hf_score ~ pf_expression_control, data = hfi)
abline(lm_model, col = "black")

Best answer I could get:
The equation of the least squares regression line for the linear model is: $\hat{y} = 4.61707 + 0.49143 \times \texttt{pf\_expression\_control}$
The predicted value of a country’s pf_score with a pf_expression_control rating of 6.7 is $\hat{y} = 4.61707 + 0.49143 \times 6.7 \approx 7.90965$
The residual is the observed score minus the predicted score. If the actual score is higher than 7.90965, the residual is positive and the prediction is an underestimate; conversely, if the actual score is lower, the residual is negative and the prediction is an overestimate.
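The prediction and the sign convention for the residual can be checked with a few lines of arithmetic. This is a small sketch that uses the coefficients reported in the model summary; the observed score `y_obs` is a hypothetical value chosen only to illustrate the calculation.

```r
# Coefficients reported for pf_score ~ pf_expression_control
b0 <- 4.61707   # intercept
b1 <- 0.49143   # slope

# Predicted pf_score for a pf_expression_control rating of 6.7
x_new <- 6.7
y_hat <- b0 + b1 * x_new
round(y_hat, 5)   # 7.90965

# Hypothetical observed score, used only to show the sign convention
y_obs <- 7.5
residual <- y_obs - y_hat   # observed minus predicted

# A negative residual means the model overestimated this observation
residual < 0
```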
##
## Call:
## lm(formula = pf_score ~ pf_expression_control, data = hfi)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.8467 -0.5704 0.1452 0.6066 3.2060
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.61707 0.05745 80.36 <2e-16 ***
## pf_expression_control 0.49143 0.01006 48.85 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.8318 on 1376 degrees of freedom
## Multiple R-squared: 0.6342, Adjusted R-squared: 0.634
## F-statistic: 2386 on 1 and 1376 DF, p-value: < 2.2e-16
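As a quick consistency check, the numbers in this summary can be related to values reported earlier. In simple linear regression, R-squared equals the squared correlation, the residual standard error is the square root of the residual sum of squares divided by the residual degrees of freedom, and each t value is the estimate divided by its standard error. The sketch below assumes that the earlier value 952.1532 is the residual sum of squares for this model.

```r
# Values copied from the output in this report
r      <- 0.796      # correlation, rounded to three decimals
rss    <- 952.1532   # assumed to be the residual sum of squares
df_res <- 1376       # residual degrees of freedom
slope  <- 0.49143    # estimate for pf_expression_control
se     <- 0.01006    # its standard error

r_squared <- r^2                 # compare with Multiple R-squared (0.6342)
rse       <- sqrt(rss / df_res)  # compare with Residual standard error (0.8318)
t_value   <- slope / se          # compare with the reported t value (48.85)

round(c(r_squared = r_squared, rse = rse, t_value = t_value), 4)
```

Small differences from the reported values are expected because the inputs are rounded.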
ggplot(data = hfi, aes(x = pf_expression_control, y = pf_score)) +
  geom_point() +
  stat_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula = 'y ~ x'
The residuals plot helps to assess linearity and reveal any patterns in the deviations. The residuals are scattered randomly around zero with no apparent pattern, which suggests that the relationship between the two variables is linear. The QQ plot helps to assess the normality of the residuals. The points in the QQ plot closely follow a straight line, which means the residuals are approximately normally distributed, as the linear regression model requires.
ggplot(data = m1, aes(x = .fitted, y = .resid)) +
  geom_point() +
  geom_hline(yintercept = 0, linetype = "dashed") +
  xlab("Fitted values") +
  ylab("Residuals")

The histogram of the residuals appears roughly symmetric and bell-shaped, with a slight skew to the left. There is only one real peak, so the nearly normal residuals condition appears to be met.
The variance of the residuals appears to be roughly constant across the fitted values, so the constant variability condition also appears to be met.
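The normality and constant variability checks described above can be sketched in base R. The example below runs on simulated data rather than the HFI data, and the simulated coefficients, the bin count, and the split at the median fitted value are assumptions made for illustration.

```r
# Simulated stand-in data; not the HFI data
set.seed(2024)
x <- runif(300, min = 0, max = 10)
y <- 4.6 + 0.49 * x + rnorm(300, sd = 0.8)

fit <- lm(y ~ x)
res <- resid(fit)

# Nearly normal residuals: histogram and normal QQ plot
hist(res, breaks = 25, main = "Histogram of residuals", xlab = "Residuals")
qqnorm(res)
qqline(res)

# Constant variability: compare residual spread in the lower and upper
# halves of the fitted values; a ratio near 1 suggests similar spread
lower    <- fitted(fit) <= median(fitted(fit))
sd_lower <- sd(res[lower])
sd_upper <- sd(res[!lower])
sd_ratio <- sd_upper / sd_lower
sd_ratio
```

The ratio is only a rough numeric companion to the residuals plot; a fan shape in the plot would show up as a ratio well away from 1.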