#### 1. Sample description

The sample includes 100 women and 100 men (see Table 1), so both genders are equally represented. Years of employment and salary are both metric variables. The dependent variable, salary, has a mean of 122303.45 and a standard deviation of 79030.117. The independent variable, years of employment, has a mean of 15.7343624 and a standard deviation of 9.0356184.
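Descriptive statistics of this kind could be computed roughly as in the sketch below. It is only an illustration: the data frame `toy` is simulated and stands in for the real `df`, and the column name `gender` is an assumption (only `salary` and `years_empl` appear in the analysis code).

```r
# Toy stand-in for the real data frame `df`; values are simulated, not the study data.
set.seed(42)
toy <- data.frame(
  gender     = rep(c("female", "male"), each = 100),  # column name is an assumption
  years_empl = runif(200, min = 0, max = 30)
)
toy$salary <- 10^(4.5 + 0.03 * toy$years_empl + rnorm(200, sd = 0.1))

# Group sizes per gender
gender_counts <- table(toy$gender)

# Mean and standard deviation of the two metric variables
desc <- sapply(toy[c("salary", "years_empl")],
               function(x) c(mean = mean(x), sd = sd(x)))

print(gender_counts)
print(desc)
```

With the real data, replacing `toy` by `df` should give the figures reported above.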

#### 2. Association between years and salary as scatterplot

The scatterplot shows a strong positive association between salary and years of employment: there is a clear upward trend, but it is curved rather than linear.

plot(df$years_empl, df$salary)

#### 3. Estimate salary by years of employment

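To linearize the relationship, we transformed the dependent variable by taking the base-10 logarithm of salary, which turns geometric (multiplicative) salary growth into arithmetic (additive) growth. For comparison, the untransformed linear model and a natural-log model are fitted first.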

lm(df$salary ~ df$years_empl)
## 
## Call:
## lm(formula = df$salary ~ df$years_empl)
## 
## Coefficients:
##   (Intercept)  df$years_empl  
##         -2684           7944
lm(log(salary) ~ years_empl, data = df)
## 
## Call:
## lm(formula = log(salary) ~ years_empl, data = df)
## 
## Coefficients:
## (Intercept)   years_empl  
##      10.383        0.071
# Add linear regression line & plot
plot(df$years_empl, log10(df$salary), 
     main = "Linearized: Years of Experience vs log10(Salary)", 
     xlab = "Years of Experience", ylab = "log10(Salary)")
lm(log10(salary) ~ years_empl, data = df)
## 
## Call:
## lm(formula = log10(salary) ~ years_empl, data = df)
## 
## Coefficients:
## (Intercept)   years_empl  
##     4.50918      0.03083
abline(lm(log10(salary) ~ years_empl, data = df)) # response on the left of ~, so the line matches the plot axes

#### 4. Interpretation

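The linear model estimates a salary of -2684.255 for a person with zero years of employment (the intercept) and an increase of 7943.255 with each additional year of employment (the slope). The negative intercept is not a plausible salary, which is consistent with a straight line fitting the curved relationship poorly.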
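To see how the linear coefficients printed by `lm()` above (intercept -2684, slope 7944) translate into predictions, the line can be evaluated by hand. This is a minimal sketch using the rounded coefficients from the output; `predict_linear` is a hypothetical helper name, not part of the original analysis.

```r
# Rounded coefficients from the lm(df$salary ~ df$years_empl) output
b0 <- -2684  # intercept: predicted salary at zero years
b1 <- 7944   # slope: change in salary per additional year

# Hypothetical helper: predicted salary on the original scale
predict_linear <- function(years) b0 + b1 * years

pred <- predict_linear(c(0, 1, 10))
print(pred)  # -2684  5260 76756
```

The first value illustrates that the straight line predicts a negative salary at the start of employment.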

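The transformed model shows that with each additional year of employment, the base-10 logarithm of salary increases by 0.03083 (equivalently, the natural logarithm of salary increases by 0.071). Back-transformed to the original scale, this corresponds to a salary increase of about 7.4% per year.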
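The log-scale coefficients from the two transformed models above can be converted to the original salary scale with a few lines of arithmetic. The sketch below uses the rounded coefficients from the printed output; the variable names are illustrative only.

```r
# Rounded coefficients from the model outputs
b_log10  <- 0.03083   # slope of lm(log10(salary) ~ years_empl)
b_ln     <- 0.071     # slope of lm(log(salary) ~ years_empl)
a_log10  <- 4.50918   # intercept of the log10 model

# A slope on the log10 scale is a multiplicative factor on the salary scale
growth_factor <- 10^b_log10
pct_per_year  <- (growth_factor - 1) * 100   # about 7.4 percent per year

# The two log models describe the same fit: b_log10 * ln(10) is approximately b_ln
b_ln_from_log10 <- b_log10 * log(10)

# Back-transformed intercept: predicted salary at zero years, roughly 32,300
baseline <- 10^a_log10

print(c(pct_per_year = pct_per_year,
        b_ln_from_log10 = b_ln_from_log10,
        baseline = baseline))
```

Because the slope acts multiplicatively after back-transformation, the predicted salary stays positive for any number of years, unlike the straight-line model.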