Email : ferdinand.widjaya@student.matanauniversity.ac.id
RPubs : https://rpubs.com/ferdnw/
Address : ARA Center, Matana University Tower, Jl. CBD Barat Kav, RT.1, Curug Sangereng, Kelapa Dua, Tangerang, Banten 15810.
library(ggplot2)
library(dplyr)
library(broom)
library(ggpubr)
inc <- read.csv("income.data.csv")
summary(inc)
##        X             income        happiness
##  Min.   :  1.0   Min.   :1.506   Min.   :0.266
##  1st Qu.:125.2   1st Qu.:3.006   1st Qu.:2.266
##  Median :249.5   Median :4.424   Median :3.473
##  Mean   :249.5   Mean   :4.467   Mean   :3.393
##  3rd Qu.:373.8   3rd Qu.:5.992   3rd Qu.:4.503
##  Max.   :498.0   Max.   :7.482   Max.   :6.863
Because we only have one independent variable and one dependent variable, we don’t need to test for any hidden relationships among variables.
Next, we test whether the data are normally distributed.
hist(inc$happiness)
shapiro.test(inc$happiness)
##
## Shapiro-Wilk normality test
##
## data: inc$happiness
## W = 0.98705, p-value = 0.0002095
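The Shapiro-Wilk p-value (0.0002) is below 0.05, so the test formally rejects exact normality. However, W = 0.987 is close to 1 and the histogram is roughly bell-shaped. With 498 observations the test is sensitive to small deviations, so we treat the data as approximately normal for this analysis.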
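A normal Q-Q plot is a useful companion to the histogram and the Shapiro-Wilk test. The sketch below is only an illustration: the vector `x` is simulated (its mean and standard deviation are assumptions, chosen to resemble the summary statistics above) so that the code runs without the survey file.

```r
# Simulated stand-in for inc$happiness (assumption: NOT the real survey data)
set.seed(42)
x <- rnorm(498, mean = 3.4, sd = 1.4)

# Shapiro-Wilk test: a small p-value is evidence AGAINST normality
sw <- shapiro.test(x)
print(sw)

# Normal Q-Q plot: points lying close to the reference line
# suggest the sample is approximately normal
qqnorm(x)
qqline(x, col = "red")
```

With the real data, `x` would be replaced by `inc$happiness`.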
The dependent and independent variables must have a clear linear relationship.
plot(happiness ~ income, data = inc)
The plot shows a strong positive linear relationship between happiness and income.
Homogeneity of variance (homoscedasticity) will be checked after the model has been fitted, to confirm that the prediction error stays roughly the same across the range of income.
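One way to carry out such a check on a fitted model is sketched below. It uses simulated data (the data frame `sim` and its generating parameters are assumptions made for illustration, not the survey data) and a rough comparison of residual spread. It is not a formal test.

```r
# Simulated data (assumption): linear relationship with constant error variance
set.seed(1)
income <- runif(498, min = 1.5, max = 7.5)
happiness <- 0.2 + 0.7 * income + rnorm(498, sd = 0.7)
sim <- data.frame(income, happiness)

fit <- lm(happiness ~ income, data = sim)

# Residuals vs fitted values: an even band around zero
# is what homoscedasticity looks like
plot(fitted(fit), resid(fit),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Rough numeric check: compare residual spread in the lower
# and upper half of the fitted values
lower <- resid(fit)[fitted(fit) <= median(fitted(fit))]
upper <- resid(fit)[fitted(fit) >  median(fitted(fit))]
spread_ratio <- sd(upper) / sd(lower)
print(spread_ratio)  # values near 1 are consistent with constant variance
```

A ratio far from 1, or a funnel shape in the plot, would suggest that the variance changes with income.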
lm.inc <- lm(happiness ~ income, data = inc)
summary(lm.inc)
##
## Call:
## lm(formula = happiness ~ income, data = inc)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -2.02479 -0.48526  0.04078  0.45898  2.37805
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.20427    0.08884   2.299   0.0219 *
## income       0.71383    0.01854  38.505   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7181 on 496 degrees of freedom
## Multiple R-squared: 0.7493, Adjusted R-squared: 0.7488
## F-statistic: 1483 on 1 and 496 DF, p-value: < 2.2e-16
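The quantities printed in the summary can also be extracted programmatically, which is handy for reporting. The following is a minimal sketch on simulated data (the data frame `sim` is an assumption standing in for `income.data.csv`, so the numbers will differ from the output above).

```r
# Simulated data (assumption): stands in for income.data.csv so the example runs alone
set.seed(123)
income <- runif(498, min = 1.5, max = 7.5)
happiness <- 0.2 + 0.7 * income + rnorm(498, sd = 0.7)
sim <- data.frame(income, happiness)

fit <- lm(happiness ~ income, data = sim)

# Point estimates and 95% confidence intervals for intercept and slope
est <- coef(fit)
ci <- confint(fit, level = 0.95)
print(cbind(estimate = est, ci))

# R-squared as reported by summary()
r2 <- summary(fit)$r.squared
print(r2)

# Predicted happiness with 95% confidence intervals at income = 3 and 6
# (that is, $30,000 and $60,000 on the scale used here)
new_income <- data.frame(income = c(3, 6))
pred <- predict(fit, newdata = new_income, interval = "confidence")
print(pred)
```

With the real data, the same calls would be applied to `lm.inc` instead of `fit`.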
The model above tests whether there is a linear relationship between income and happiness in our survey data (498 observations), with incomes ranging from about $15k to $75k and happiness reported on a scale of 0 to 10. The fitted regression line is:
\[ Y = 0.204 + 0.7138 X \]
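Because the p-value is below 0.05, the income coefficient is statistically significant, and the linear model fits reasonably well: it explains about 75% of the variance in happiness (R-squared = 0.7493).
This means that each one-unit increase in X (an additional $10,000 of income) is associated with an increase of about 0.71 in the happiness score.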
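As a worked example of reading the fitted equation, take X = 5, that is, an income of $50,000 (a value chosen here only for illustration):

\[ \hat{Y} = 0.204 + 0.7138 \times 5 = 0.204 + 3.569 = 3.773 \]

So the model predicts a happiness score of about 3.8 at that income, and about 3.773 + 0.714 ≈ 4.49 at $60,000.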
par(mfrow=c(2,2))
plot(lm.inc)
par(mfrow=c(1,1))
income.graph <- ggplot(inc, aes(x=income, y=happiness)) +
  geom_point()
income.graph
income.graph <- income.graph + geom_smooth(method="lm", col="black")
income.graph
## `geom_smooth()` using formula 'y ~ x'
income.graph <- income.graph +
  stat_regline_equation(label.x = 3, label.y = 7)
income.graph
## `geom_smooth()` using formula 'y ~ x'
income.graph +
  theme_minimal() +
  labs(title = "Reported happiness as a function of income",
       x = "Income (x$10,000)",
       y = "Happiness score (0 to 10)")
## `geom_smooth()` using formula 'y ~ x'
From the graph, we can conclude that there is a significant positive relationship between income and happiness.
cor(inc$income, inc$happiness)
## [1] 0.8656337
The correlation between the two variables is about 0.866, which indicates a strong positive linear association (it is a measure of association, not a percentage of influence). This agrees with the model we fitted: the two variables are positively correlated, and each $10,000 increase in income corresponds to an increase of about 0.71 on the happiness scale.
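For a simple regression with a single predictor, the squared correlation equals the model's R-squared, which ties this value back to the summary output:

\[ r^2 = 0.8656337^2 \approx 0.7493 = R^2 \]

So the correlation of about 0.87 and the figure of about 75% of variance explained describe the same fit.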
Therefore, it’s true that MONEY CAN’T BUY HAPPINESS, but it can be used to buy a BMW, which will make you happy :)