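The data set contains 200 observations. Salary ranges from 30,028 to 255,381, with a median of 93,164 and a mean of about 108,491. Mean experience is about 15.7 years.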
df$salary <- as.numeric(df$salary)
df$years_exp <- as.numeric(df$years_exp)
df$gender <- as.factor(df$gender)
nrow(df)
## [1] 200
mean_salary <- mean(df$salary, na.rm = TRUE)
mean_years <- mean(df$years_exp, na.rm = TRUE)
mean_salary
## [1] 108490.9
mean_years
## [1] 15.66648
sd_salary <- sd(df$salary, na.rm = TRUE)
sd_years <- sd(df$years_exp, na.rm = TRUE)
summary(df$salary)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 30028 60076 93164 108491 150437 255381
The plot shows the relationship between years of experience (independent variable) and salary (dependent variable). The association appears to be positive.
Pearson's r is 0.913 and Spearman's rho is 0.940.
plot(df$years_exp, df$salary)
# Note: this formula treats years_exp as the outcome and salary as the predictor
lm(df$years_exp ~ df$salary)
##
## Call:
## lm(formula = df$years_exp ~ df$salary)
##
## Coefficients:
## (Intercept) df$salary
## 1.232249 0.000133
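The call above regresses years of experience on salary. If salary is meant to be the dependent variable, as the text states, the formula would be reversed. A minimal sketch of that form, using a small simulated data frame (`sim` is a hypothetical stand-in for `df`, and its values are made up):

```r
# Hypothetical stand-in for df: simulated values, not the real data
set.seed(1)
sim <- data.frame(years_exp = runif(50, min = 0, max = 40))
sim$salary <- 30000 * exp(0.06 * sim$years_exp + rnorm(50, sd = 0.2))

# Salary as the dependent variable, years of experience as the predictor
fit_salary <- lm(salary ~ years_exp, data = sim)
coef(fit_salary)
```

Passing `data = sim` instead of writing `sim$salary ~ sim$years_exp` keeps the coefficient names short and lets `predict()` work with new data.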
cor(df$years_exp, df$salary, use = "complete.obs", method = "pearson")
## [1] 0.9129636
cor(df$years_exp, df$salary, use = "complete.obs", method = "spearman")
## [1] 0.9396925
A non-linear relationship can be observed between salary and years of experience. A linear regression model is therefore fitted to log-transformed salary to check the association with years of experience.
df$salary_model <- log(df$salary)
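# Note: df$year presumably resolves to df$years_exp via partial matching of `$`; the full name is the safer form
model <- lm(df$salary_model ~ df$year)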
summary(model)
##
## Call:
## lm(formula = df$salary_model ~ df$year)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.74993 -0.11686 0.00666 0.11146 0.77461
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 10.436444 0.032197 324.14 <2e-16 ***
## df$year 0.063322 0.001795 35.28 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2218 on 198 degrees of freedom
## Multiple R-squared: 0.8628, Adjusted R-squared: 0.8621
## F-statistic: 1245 on 1 and 198 DF, p-value: < 2.2e-16
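Because the outcome is log salary, the slope is on the log scale. One common way to read it, sketched here with the estimate copied by hand from the output above, is to convert it to an approximate percentage change in salary per additional year:

```r
# Slope estimate copied from the summary output above
slope_log <- 0.063322

# Back-transform: percentage change in salary per additional year
pct_per_year <- (exp(slope_log) - 1) * 100
round(pct_per_year, 1)
```

This gives roughly 6.5% per year, assuming the log-linear model is an adequate description of the data.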
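The model describes a positive linear relationship between years of experience and log salary (salary_model): each additional year is associated with an increase of about 0.063 in log salary, and the model explains about 86% of the variance (R-squared = 0.8628).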
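The same log-salary model can also be fitted separately for each gender. The following is only a sketch under stated assumptions: `sim` is a simulated, hypothetical stand-in for `df`, the level names `"female"` and `"male"` are assumed, and the numbers it produces say nothing about the real data.

```r
# Hypothetical stand-in for df: simulated values, not the real data
set.seed(2)
sim <- data.frame(
  years_exp = runif(100, min = 0, max = 40),
  gender = factor(rep(c("female", "male"), each = 50))  # assumed level names
)
sim$log_salary <- 10.4 + 0.06 * sim$years_exp + rnorm(100, sd = 0.2)

# Fit one log-salary model per gender level
models_by_gender <- lapply(split(sim, sim$gender), function(d) {
  lm(log_salary ~ years_exp, data = d)
})

# Compare intercepts and slopes side by side
sapply(models_by_gender, coef)
```

With the real data, `sim` would be replaced by `df` and `log_salary` by `salary_model`; `summary(models_by_gender$female)` would then give the full output for one group.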