hw1

Suschevskiy Vsevolod

2020-02-15

##                                                                               summary(df$MainBranch)
## I am a developer by profession                                                                 65679
## I am a student who is learning to code                                                         10189
## I am not primarily a developer, but I write code sometimes as part of my work                   7539
## I code primarily as a hobby                                                                     3340
## I used to be a developer by profession, but no longer am                                        1584
## NA's                                                                                             552

0.1 Intro

“Will People Born Today Have a Better Life Than Their Parents?” [1] is a popular question for measuring optimism and faith in the future.

Programming is a popular job: only about 3.8% of respondents (3340 out of 88331) code primarily as a hobby, while the rest are employed or are going to be employed. In this homework I will therefore look at coding as a profession and try to link optimism with some of the working conditions. Using structural equation modeling (SEM), Xanthopoulou et al. [2] showed the major role of personal resources (e.g. optimism) in the work environment. They showed that job satisfaction is connected with optimism. Also, [3] found a correlation between early career and optimism, so we could expect to see more optimism among people who started working earlier. Cheng [4] uses the theory of job adaptation by Hulin [5], who wrote about the gradual process of integrating people into the institution of a career. So we could expect less experience to go with lower optimism.

0.2 RQ

  • RQ1: Gender does not influence optimism
  • RQ2: Job satisfaction has a positive influence on optimism
  • RQ3: An earlier start of a job has a positive influence on optimism
  • RQ4: A longer career has a positive influence on optimism

Let's take a look at our data, and at the data that we do not have.

## 
##  Variables sorted by number of missings: 
##      Variable      Count
##        JobSat 0.20133209
##     CareerSat 0.18041695
##  YearsCodePro 0.16504843
##           Age 0.10882846
##  LastHireDate 0.10158298
##        Gender 0.03911884
##    BetterLife 0.02940945
##    Age1stCode 0.02019509
##    Employment 0.01914877
##     YearsCode 0.01063195

Most of the missing data is in job or career satisfaction, since some people might not have had any job yet. We could filter them out and impute the other data.
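A minimal sketch of how the share of missing values per variable could be computed, and how respondents without a job satisfaction answer could be dropped. The `toy` data frame and its values are made up for illustration, and the imputation step itself is not shown, because the method used in this homework is not stated.

```r
# Toy data frame standing in for the survey data (values are made up).
toy <- data.frame(
  JobSat       = c("Very satisfied", NA, "Slightly satisfied", NA),
  YearsCodePro = c(3, NA, 10, 7),
  Age          = c(28, 22, NA, 35),
  stringsAsFactors = FALSE
)

# Share of missing values per variable, largest first.
miss_share <- sort(colMeans(is.na(toy)), decreasing = TRUE)
print(miss_share)

# Keep only respondents who reported job satisfaction;
# the remaining gaps would be left for imputation.
toy_with_job <- toy[!is.na(toy$JobSat), ]
print(nrow(toy_with_job))
```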

We will save this imputed data for later, to compare with our final model.

## 
##  Descriptive statistics by group 
## group: No
##               vars     n  mean   sd median trimmed  mad min  max range  skew
## Employment*      1 22972  1.23 0.61      1    1.04 0.00   1    3     2  2.40
## Gender*          2 22972   NaN   NA     NA     NaN   NA Inf -Inf  -Inf    NA
## CareerSat*       3 22972  3.45 1.36      3    3.56 1.48   1    5     4 -0.22
## JobSat*          4 22972  3.25 1.36      3    3.31 1.48   1    5     4 -0.05
## LastHireDate*    5 22972  3.15 1.66      4    3.13 1.48   1    6     5 -0.11
## YearsCode        6 22972 13.68 9.42     11   12.42 7.41   0   51    51  1.13
## YearsCodePro     7 22972 13.68 9.42     11   12.42 7.41   0   51    51  1.13
## Age1stCode       8 22972 15.33 4.84     15   15.02 4.45   5   65    60  1.08
## Age              9 22972 32.60 8.83     30   31.54 7.41   1   99    98  1.22
## BetterLife*     10 22972  1.00 0.00      1    1.00 0.00   1    1     0   NaN
##               kurtosis   se
## Employment*       4.03 0.00
## Gender*             NA   NA
## CareerSat*       -1.11 0.01
## JobSat*          -1.14 0.01
## LastHireDate*    -1.50 0.01
## YearsCode         0.81 0.06
## YearsCodePro      0.81 0.06
## Age1stCode        3.70 0.03
## Age               2.16 0.06
## BetterLife*        NaN 0.00
## ------------------------------------------------------------ 
## group: Yes
##               vars     n  mean   sd median trimmed  mad min  max range  skew
## Employment*      1 38807  1.23 0.60      1    1.04 0.00   1    3     2  2.42
## Gender*          2 38807   NaN   NA     NA     NaN   NA Inf -Inf  -Inf    NA
## CareerSat*       3 38807  3.68 1.33      3    3.82 2.97   1    5     4 -0.45
## JobSat*          4 38807  3.36 1.37      3    3.45 1.48   1    5     4 -0.15
## LastHireDate*    5 38807  3.09 1.64      4    3.07 1.48   1    6     5 -0.08
## YearsCode        6 38807 12.23 8.42     10   11.04 7.41   0   51    51  1.32
## YearsCodePro     7 38807 12.23 8.42     10   11.04 7.41   0   51    51  1.32
## Age1stCode       8 38807 15.34 4.50     15   15.14 4.45   5   79    74  0.91
## Age              9 38807 30.77 7.94     29   29.80 5.93   1   99    98  1.38
## BetterLife*     10 38807  2.00 0.00      2    2.00 0.00   2    2     0   NaN
##               kurtosis   se
## Employment*       4.12 0.00
## Gender*             NA   NA
## CareerSat*       -0.99 0.01
## JobSat*          -1.15 0.01
## LastHireDate*    -1.49 0.01
## YearsCode         1.68 0.04
## YearsCodePro      1.68 0.04
## Age1stCode        3.86 0.02
## Age               2.92 0.04
## BetterLife*        NaN 0.00

Employment | Gender | CareerSat | JobSat | LastHireDate | YearsCode | YearsCodePro | Age1stCode | Age | BetterLife
Employed full-time | Man | Slightly satisfied | Slightly satisfied | 1-2 years ago | 3 | 3 | 22 | 28 | Yes
Employed full-time | Man | Very satisfied | Slightly satisfied | Less than a year ago | 3 | 3 | 16 | 22 | Yes
Employed full-time | Man | Very dissatisfied | Slightly dissatisfied | Less than a year ago | 16 | 16 | 14 | 30 | Yes
Employed full-time | Man | Very satisfied | Slightly satisfied | 1-2 years ago | 13 | 13 | 15 | 28 | No
Independent contractor, freelancer, or self-employed | Man | Slightly satisfied | Neither satisfied nor dissatisfied | NA - I am an independent contractor or self employed | 6 | 6 | 17 | 42 | No
Employed full-time | Man | Slightly satisfied | Slightly satisfied | Less than a year ago | 12 | 12 | 11 | 23 | No

Let's check our cell counts, because we have to. With such a big dataset there is only a small chance of a cell with fewer than 40 observations, and the minimal sample size is not a problem either.

0.3 Assumptions

##           CareerSat
## BetterLife Neither satisfied nor dissatisfied Slightly dissatisfied
##        No                                2425                  2721
##        Yes                               3131                  3563
##           CareerSat
## BetterLife Slightly satisfied Very dissatisfied Very satisfied
##        No                8372              1034           8420
##        Yes              13123              1911          17079
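The cell-count check can be sketched directly from the cross-tabulation above. The counts are copied from that output, and the threshold of 40 comes from the text. With the raw data the same matrix would presumably come from something like `table(df2$BetterLife, df2$CareerSat)`, which is an assumption about how the table was produced.

```r
# Counts copied from the BetterLife x CareerSat table above.
cells <- matrix(
  c(2425, 2721,  8372, 1034,  8420,
    3131, 3563, 13123, 1911, 17079),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    BetterLife = c("No", "Yes"),
    CareerSat  = c("Neither satisfied nor dissatisfied",
                   "Slightly dissatisfied", "Slightly satisfied",
                   "Very dissatisfied", "Very satisfied")
  )
)

# Smallest cell and whether every cell reaches the threshold of 40.
min_cell <- min(cells)
all_large_enough <- all(cells >= 40)

# Row totals should match the group sizes in the descriptive statistics.
group_sizes <- rowSums(cells)
print(min_cell)
print(group_sizes)
```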

Look at the outliers.

And remove them once.

Correlation.

YearsCodePro is highly correlated with YearsCode, so let's exclude YearsCode, since we are talking about the career.
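A small sketch of the kind of check behind that decision. The data here are simulated, not taken from the survey, and the assumed relationship (professional years trailing total years by up to five) is purely hypothetical. The 0.9 cut-off is also only an illustrative choice.

```r
set.seed(1)

# Simulated total years of coding for 200 hypothetical respondents.
years_code <- round(runif(200, min = 1, max = 40))

# Hypothetical: professional years trail total years by 0 to 5 years.
years_code_pro <- pmax(years_code - round(runif(200, min = 0, max = 5)), 0)

# Pearson correlation between the two predictors.
r <- cor(years_code, years_code_pro)
print(r)

# If the correlation is very high, keep only one of the two in the model.
drop_years_code <- r > 0.9
```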

0.4 Model

Let's build our base model.

## 
## Call:
## glm(formula = BetterLife ~ Employment + CareerSat + JobSat, family = "binomial", 
##     data = df2)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.5396  -1.3812   0.8822   0.9777   1.1348  
## 
## Coefficients:
##                                                                 Estimate
## (Intercept)                                                     0.290054
## EmploymentEmployed part-time                                    0.054970
## EmploymentIndependent contractor, freelancer, or self-employed  0.043073
## CareerSatSlightly dissatisfied                                  0.088125
## CareerSatSlightly satisfied                                     0.197477
## CareerSatVery dissatisfied                                      0.472137
## CareerSatVery satisfied                                         0.473048
## JobSatSlightly dissatisfied                                    -0.048260
## JobSatSlightly satisfied                                        0.002233
## JobSatVery dissatisfied                                        -0.189046
## JobSatVery satisfied                                           -0.020303
##                                                                Std. Error
## (Intercept)                                                      0.032924
## EmploymentEmployed part-time                                     0.044756
## EmploymentIndependent contractor, freelancer, or self-employed   0.030990
## CareerSatSlightly dissatisfied                                   0.039838
## CareerSatSlightly satisfied                                      0.032980
## CareerSatVery dissatisfied                                       0.053094
## CareerSatVery satisfied                                          0.035479
## JobSatSlightly dissatisfied                                      0.034275
## JobSatSlightly satisfied                                         0.030454
## JobSatVery dissatisfied                                          0.045441
## JobSatVery satisfied                                             0.033606
##                                                                z value Pr(>|z|)
## (Intercept)                                                      8.810  < 2e-16
## EmploymentEmployed part-time                                     1.228    0.219
## EmploymentIndependent contractor, freelancer, or self-employed   1.390    0.165
## CareerSatSlightly dissatisfied                                   2.212    0.027
## CareerSatSlightly satisfied                                      5.988 2.13e-09
## CareerSatVery dissatisfied                                       8.893  < 2e-16
## CareerSatVery satisfied                                         13.333  < 2e-16
## JobSatSlightly dissatisfied                                     -1.408    0.159
## JobSatSlightly satisfied                                         0.073    0.942
## JobSatVery dissatisfied                                         -4.160 3.18e-05
## JobSatVery satisfied                                            -0.604    0.546
##                                                                   
## (Intercept)                                                    ***
## EmploymentEmployed part-time                                      
## EmploymentIndependent contractor, freelancer, or self-employed    
## CareerSatSlightly dissatisfied                                 *  
## CareerSatSlightly satisfied                                    ***
## CareerSatVery dissatisfied                                     ***
## CareerSatVery satisfied                                        ***
## JobSatSlightly dissatisfied                                       
## JobSatSlightly satisfied                                          
## JobSatVery dissatisfied                                        ***
## JobSatVery satisfied                                              
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 75357  on 57479  degrees of freedom
## Residual deviance: 74919  on 57469  degrees of freedom
## AIC: 74941
## 
## Number of Fisher Scoring iterations: 4

As we can see, I forgot to properly re-level the factors, so we have to do it now.

Now it should be much better, because the lowest satisfaction with the job and career is our base level.
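A minimal sketch of the re-leveling, using a few made-up answers. By default `factor()` sorts levels alphabetically, which is why "Neither satisfied nor dissatisfied" was the reference category in the first model. Passing `levels` explicitly puts "Very dissatisfied" first.

```r
# Satisfaction categories in their natural order, lowest first.
sat_levels <- c("Very dissatisfied", "Slightly dissatisfied",
                "Neither satisfied nor dissatisfied",
                "Slightly satisfied", "Very satisfied")

# A few made-up answers.
answers <- c("Very satisfied", "Slightly satisfied",
             "Very dissatisfied", "Neither satisfied nor dissatisfied")

# Default behaviour: levels are sorted alphabetically.
default_factor <- factor(answers)

# Explicit levels: the first level becomes the reference category in glm().
ordered_factor <- factor(answers, levels = sat_levels)

print(levels(default_factor)[1])
print(levels(ordered_factor)[1])
```

The same approach would apply to both CareerSat and JobSat before refitting the model.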

## 
## Call:
## glm(formula = BetterLife ~ Employment + CareerSat + JobSat, family = "binomial", 
##     data = df2)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.5396  -1.3812   0.8822   0.9777   1.1348  
## 
## Coefficients:
##                                                                  Estimate
## (Intercept)                                                     0.5731454
## EmploymentEmployed part-time                                    0.0549699
## EmploymentIndependent contractor, freelancer, or self-employed  0.0430733
## CareerSatSlightly dissatisfied                                 -0.3840118
## CareerSatNeither satisfied nor dissatisfied                    -0.4721371
## CareerSatSlightly satisfied                                    -0.2746599
## CareerSatVery satisfied                                         0.0009108
## JobSatSlightly dissatisfied                                     0.1407860
## JobSatNeither satisfied nor dissatisfied                        0.1890460
## JobSatSlightly satisfied                                        0.1912792
## JobSatVery satisfied                                            0.1687434
##                                                                Std. Error
## (Intercept)                                                     0.0442484
## EmploymentEmployed part-time                                    0.0447565
## EmploymentIndependent contractor, freelancer, or self-employed  0.0309898
## CareerSatSlightly dissatisfied                                  0.0505812
## CareerSatNeither satisfied nor dissatisfied                     0.0530936
## CareerSatSlightly satisfied                                     0.0482488
## CareerSatVery satisfied                                         0.0496199
## JobSatSlightly dissatisfied                                     0.0421903
## JobSatNeither satisfied nor dissatisfied                        0.0454415
## JobSatSlightly satisfied                                        0.0418779
## JobSatVery satisfied                                            0.0439376
##                                                                z value Pr(>|z|)
## (Intercept)                                                     12.953  < 2e-16
## EmploymentEmployed part-time                                     1.228 0.219372
## EmploymentIndependent contractor, freelancer, or self-employed   1.390 0.164553
## CareerSatSlightly dissatisfied                                  -7.592 3.15e-14
## CareerSatNeither satisfied nor dissatisfied                     -8.893  < 2e-16
## CareerSatSlightly satisfied                                     -5.693 1.25e-08
## CareerSatVery satisfied                                          0.018 0.985355
## JobSatSlightly dissatisfied                                      3.337 0.000847
## JobSatNeither satisfied nor dissatisfied                         4.160 3.18e-05
## JobSatSlightly satisfied                                         4.568 4.93e-06
## JobSatVery satisfied                                             3.841 0.000123
##                                                                   
## (Intercept)                                                    ***
## EmploymentEmployed part-time                                      
## EmploymentIndependent contractor, freelancer, or self-employed    
## CareerSatSlightly dissatisfied                                 ***
## CareerSatNeither satisfied nor dissatisfied                    ***
## CareerSatSlightly satisfied                                    ***
## CareerSatVery satisfied                                           
## JobSatSlightly dissatisfied                                    ***
## JobSatNeither satisfied nor dissatisfied                       ***
## JobSatSlightly satisfied                                       ***
## JobSatVery satisfied                                           ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 75357  on 57479  degrees of freedom
## Residual deviance: 74919  on 57469  degrees of freedom
## AIC: 74941
## 
## Number of Fisher Scoring iterations: 4

Employment is not significant, so let's replace it with years of professional coding (YearsCodePro) and the age of first coding (Age1stCode).

## 
## Call:
## glm(formula = BetterLife ~ CareerSat + JobSat + YearsCodePro + 
##     Age1stCode, family = "binomial", data = df2)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.6340  -1.3586   0.8610   0.9621   1.2865  
## 
## Coefficients:
##                                              Estimate Std. Error z value
## (Intercept)                                  0.905860   0.063770  14.205
## CareerSatSlightly dissatisfied              -0.387145   0.050673  -7.640
## CareerSatNeither satisfied nor dissatisfied -0.485309   0.053202  -9.122
## CareerSatSlightly satisfied                 -0.279607   0.048328  -5.786
## CareerSatVery satisfied                     -0.009344   0.049699  -0.188
## JobSatSlightly dissatisfied                  0.137550   0.042275   3.254
## JobSatNeither satisfied nor dissatisfied     0.177702   0.045525   3.903
## JobSatSlightly satisfied                     0.189553   0.041956   4.518
## JobSatVery satisfied                         0.178589   0.044023   4.057
## YearsCodePro                                -0.019979   0.001351 -14.791
## Age1stCode                                  -0.005690   0.002375  -2.395
##                                             Pr(>|z|)    
## (Intercept)                                  < 2e-16 ***
## CareerSatSlightly dissatisfied              2.17e-14 ***
## CareerSatNeither satisfied nor dissatisfied  < 2e-16 ***
## CareerSatSlightly satisfied                 7.23e-09 ***
## CareerSatVery satisfied                      0.85087    
## JobSatSlightly dissatisfied                  0.00114 ** 
## JobSatNeither satisfied nor dissatisfied    9.49e-05 ***
## JobSatSlightly satisfied                    6.25e-06 ***
## JobSatVery satisfied                        4.98e-05 ***
## YearsCodePro                                 < 2e-16 ***
## Age1stCode                                   0.01661 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 75357  on 57479  degrees of freedom
## Residual deviance: 74687  on 57469  degrees of freedom
## AIC: 74709
## 
## Number of Fisher Scoring iterations: 4

And let's add gender, just because we have a tradition of adding gender to our models, along with age.

## 
## Call:
## glm(formula = BetterLife ~ CareerSat + JobSat + YearsCodePro + 
##     Age1stCode + Gender + Age, family = "binomial", data = df2)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.0480  -1.3436   0.8413   0.9569   1.6490  
## 
## Coefficients:
##                                                                    Estimate
## (Intercept)                                                        1.483756
## CareerSatSlightly dissatisfied                                    -0.375586
## CareerSatNeither satisfied nor dissatisfied                       -0.478794
## CareerSatSlightly satisfied                                       -0.276521
## CareerSatVery satisfied                                           -0.012703
## JobSatSlightly dissatisfied                                        0.130877
## JobSatNeither satisfied nor dissatisfied                           0.159409
## JobSatSlightly satisfied                                           0.179867
## JobSatVery satisfied                                               0.180895
## YearsCodePro                                                       0.003721
## Age1stCode                                                         0.006921
## GenderMan;Non-binary, genderqueer, or gender non-conforming       -0.466250
## GenderNon-binary, genderqueer, or gender non-conforming           -0.683841
## GenderWoman                                                       -0.517619
## GenderWoman;Man                                                    0.958924
## GenderWoman;Man;Non-binary, genderqueer, or gender non-conforming  0.436499
## GenderWoman;Non-binary, genderqueer, or gender non-conforming     -1.142684
## Age                                                               -0.032988
##                                                                   Std. Error
## (Intercept)                                                         0.072939
## CareerSatSlightly dissatisfied                                      0.050927
## CareerSatNeither satisfied nor dissatisfied                         0.053471
## CareerSatSlightly satisfied                                         0.048574
## CareerSatVery satisfied                                             0.049949
## JobSatSlightly dissatisfied                                         0.042476
## JobSatNeither satisfied nor dissatisfied                            0.045745
## JobSatSlightly satisfied                                            0.042151
## JobSatVery satisfied                                                0.044232
## YearsCodePro                                                        0.002131
## Age1stCode                                                          0.002493
## GenderMan;Non-binary, genderqueer, or gender non-conforming         0.191272
## GenderNon-binary, genderqueer, or gender non-conforming             0.114235
## GenderWoman                                                         0.033754
## GenderWoman;Man                                                     0.413599
## GenderWoman;Man;Non-binary, genderqueer, or gender non-conforming   0.474418
## GenderWoman;Non-binary, genderqueer, or gender non-conforming       0.206510
## Age                                                                 0.002148
##                                                                   z value
## (Intercept)                                                        20.342
## CareerSatSlightly dissatisfied                                     -7.375
## CareerSatNeither satisfied nor dissatisfied                        -8.954
## CareerSatSlightly satisfied                                        -5.693
## CareerSatVery satisfied                                            -0.254
## JobSatSlightly dissatisfied                                         3.081
## JobSatNeither satisfied nor dissatisfied                            3.485
## JobSatSlightly satisfied                                            4.267
## JobSatVery satisfied                                                4.090
## YearsCodePro                                                        1.746
## Age1stCode                                                          2.776
## GenderMan;Non-binary, genderqueer, or gender non-conforming        -2.438
## GenderNon-binary, genderqueer, or gender non-conforming            -5.986
## GenderWoman                                                       -15.335
## GenderWoman;Man                                                     2.318
## GenderWoman;Man;Non-binary, genderqueer, or gender non-conforming   0.920
## GenderWoman;Non-binary, genderqueer, or gender non-conforming      -5.533
## Age                                                               -15.360
##                                                                   Pr(>|z|)    
## (Intercept)                                                        < 2e-16 ***
## CareerSatSlightly dissatisfied                                    1.64e-13 ***
## CareerSatNeither satisfied nor dissatisfied                        < 2e-16 ***
## CareerSatSlightly satisfied                                       1.25e-08 ***
## CareerSatVery satisfied                                           0.799249    
## JobSatSlightly dissatisfied                                       0.002062 ** 
## JobSatNeither satisfied nor dissatisfied                          0.000493 ***
## JobSatSlightly satisfied                                          1.98e-05 ***
## JobSatVery satisfied                                              4.32e-05 ***
## YearsCodePro                                                      0.080854 .  
## Age1stCode                                                        0.005503 ** 
## GenderMan;Non-binary, genderqueer, or gender non-conforming       0.014784 *  
## GenderNon-binary, genderqueer, or gender non-conforming           2.15e-09 ***
## GenderWoman                                                        < 2e-16 ***
## GenderWoman;Man                                                   0.020423 *  
## GenderWoman;Man;Non-binary, genderqueer, or gender non-conforming 0.357535    
## GenderWoman;Non-binary, genderqueer, or gender non-conforming     3.14e-08 ***
## Age                                                                < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 75357  on 57479  degrees of freedom
## Residual deviance: 74136  on 57462  degrees of freedom
## AIC: 74172
## 
## Number of Fisher Scoring iterations: 4

Check if adding gender (together with age) improved our model.

## Analysis of Deviance Table
## 
## Model 1: BetterLife ~ CareerSat + JobSat + YearsCodePro + Age1stCode
## Model 2: BetterLife ~ CareerSat + JobSat + YearsCodePro + Age1stCode + 
##     Gender + Age
##   Resid. Df Resid. Dev Df Deviance  Pr(>Chi)    
## 1     57469      74687                          
## 2     57462      74136  7   550.92 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Yes, it did, and the same goes for age on its own.
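The comparison of the two nested models is a likelihood ratio test, and it can be reproduced by hand from the reported deviances. The numbers below are the rounded values from the model summaries, so the statistic comes out as 551 rather than the 550.92 in the table.

```r
# Residual deviances and degrees of freedom from the two model summaries.
dev_small <- 74687; df_small <- 57469   # without Gender and Age
dev_big   <- 74136; df_big   <- 57462   # with Gender and Age

# Likelihood ratio statistic and its degrees of freedom.
lr_stat <- dev_small - dev_big
lr_df   <- df_small - df_big

# Upper-tail chi-squared probability.
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)
print(c(statistic = lr_stat, df = lr_df, p = p_value))
```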

## Analysis of Deviance Table
## 
## Model: binomial, link: logit
## 
## Response: BetterLife
## 
## Terms added sequentially (first to last)
## 
## 
##              Df Deviance Resid. Df Resid. Dev  Pr(>Chi)    
## NULL                         57479      75357              
## CareerSat     4   411.69     57475      74945 < 2.2e-16 ***
## JobSat        4    22.56     57471      74922 0.0001549 ***
## YearsCodePro  1   229.74     57470      74693 < 2.2e-16 ***
## Age1stCode    1     5.74     57469      74687 0.0166171 *  
## Gender        6   316.03     57463      74371 < 2.2e-16 ***
## Age           1   234.88     57462      74136 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Every term is significant when added sequentially. Nice.

Let's write our model equation.

## [1] "Y = 1.48 + -0.38 * CareerSatSlightly dissatisfied + -0.48 * CareerSatNeither satisfied nor dissatisfied + -0.28 * CareerSatSlightly satisfied + -0.01 * CareerSatVery satisfied + 0.13 * JobSatSlightly dissatisfied + 0.16 * JobSatNeither satisfied nor dissatisfied + 0.18 * JobSatSlightly satisfied + 0.18 * JobSatVery satisfied + 0 * YearsCodePro + 0.01 * Age1stCode + -0.47 * GenderMan;Non-binary, genderqueer, or gender non-conforming + -0.68 * GenderNon-binary, genderqueer, or gender non-conforming + -0.52 * GenderWoman + 0.96 * GenderWoman;Man + 0.44 * GenderWoman;Man;Non-binary, genderqueer, or gender non-conforming + -1.14 * GenderWoman;Non-binary, genderqueer, or gender non-conforming + -0.03 * Age + e"
##                                                                          OR
## (Intercept)                                                       4.4094747
## CareerSatSlightly dissatisfied                                    0.6868864
## CareerSatNeither satisfied nor dissatisfied                       0.6195304
## CareerSatSlightly satisfied                                       0.7584177
## CareerSatVery satisfied                                           0.9873773
## JobSatSlightly dissatisfied                                       1.1398276
## JobSatNeither satisfied nor dissatisfied                          1.1728174
## JobSatSlightly satisfied                                          1.1970578
## JobSatVery satisfied                                              1.1982894
## YearsCodePro                                                      1.0037279
## Age1stCode                                                        1.0069451
## GenderMan;Non-binary, genderqueer, or gender non-conforming       0.6273502
## GenderNon-binary, genderqueer, or gender non-conforming           0.5046746
## GenderWoman                                                       0.5959381
## GenderWoman;Man                                                   2.6088865
## GenderWoman;Man;Non-binary, genderqueer, or gender non-conforming 1.5472808
## GenderWoman;Non-binary, genderqueer, or gender non-conforming     0.3189619
## Age                                                               0.9675503
##                                                                       2.5 %
## (Intercept)                                                       3.8227457
## CareerSatSlightly dissatisfied                                    0.6214952
## CareerSatNeither satisfied nor dissatisfied                       0.5577673
## CareerSatSlightly satisfied                                       0.6893486
## CareerSatVery satisfied                                           0.8950569
## JobSatSlightly dissatisfied                                       1.0487313
## JobSatNeither satisfied nor dissatisfied                          1.0722096
## JobSatSlightly satisfied                                          1.1020595
## JobSatVery satisfied                                              1.0987073
## YearsCodePro                                                      0.9995423
## Age1stCode                                                        1.0020361
## GenderMan;Non-binary, genderqueer, or gender non-conforming       0.4314732
## GenderNon-binary, genderqueer, or gender non-conforming           0.4032243
## GenderWoman                                                       0.5578043
## GenderWoman;Man                                                   1.2346419
## GenderWoman;Man;Non-binary, genderqueer, or gender non-conforming 0.6445233
## GenderWoman;Non-binary, genderqueer, or gender non-conforming     0.2110720
## Age                                                               0.9634859
##                                                                      97.5 %
## (Intercept)                                                       5.0880546
## CareerSatSlightly dissatisfied                                    0.7588282
## CareerSatNeither satisfied nor dissatisfied                       0.6878406
## CareerSatSlightly satisfied                                       0.8339485
## CareerSatVery satisfied                                           1.0886574
## JobSatSlightly dissatisfied                                       1.2387380
## JobSatNeither satisfied nor dissatisfied                          1.2828051
## JobSatSlightly satisfied                                          1.3000733
## JobSatVery satisfied                                              1.3067331
## YearsCodePro                                                      1.0079289
## Age1stCode                                                        1.0118773
## GenderMan;Non-binary, genderqueer, or gender non-conforming       0.9151325
## GenderNon-binary, genderqueer, or gender non-conforming           0.6312645
## GenderWoman                                                       0.6367214
## GenderWoman;Man                                                   6.4064091
## GenderWoman;Man;Non-binary, genderqueer, or gender non-conforming 4.2885341
## GenderWoman;Non-binary, genderqueer, or gender non-conforming     0.4755201
## Age                                                               0.9716319
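As a sketch of where these numbers come from: an odds ratio is the exponentiated coefficient. The example uses the GenderWoman estimate and standard error from the summary above. The interval here is a Wald interval, which is an assumption on my side: the table may have been produced with profile-likelihood intervals, so the limits agree only approximately.

```r
# Estimate and standard error for GenderWoman from the model summary.
est <- -0.517619
se  <- 0.033754

# Odds ratio: exponentiated log-odds coefficient.
or <- exp(est)

# 95% Wald confidence interval, built on the log-odds scale and exponentiated.
wald_ci <- exp(est + c(-1, 1) * qnorm(0.975) * se)

print(round(c(OR = or, lower = wald_ci[1], upper = wald_ci[2]), 4))
```

An odds ratio of about 0.6 means that, holding the other predictors fixed, the odds of an optimistic answer for women are about 40% lower than for men.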

So our baseline is a man who is very dissatisfied with both his career and his job, with mean age and experience,

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    12.0    25.0    29.0    30.1    34.0    50.0
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    5.00   12.00   15.00   15.09   18.00   27.00
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     0.0     6.0    10.0    11.6    15.0    33.0
## [1] 57.60989

who has an almost average chance of being optimistic (i.e. 57.6%),

while a woman who is satisfied with her job and career, has twice as much coding experience, and is 49 years old has

## [1] 89.77978

an almost 90% chance of being optimistic about the future.

Let's draw some pictures to better understand the effect of satisfaction with the career.

##       Age Age1stCode YearsCodePro Gender            JobSat
## 1 30.1006    15.0901      11.6039    Man Very dissatisfied
## 2 30.1006    15.0901      11.6039    Man Very dissatisfied
## 3 30.1006    15.0901      11.6039    Man Very dissatisfied
## 4 30.1006    15.0901      11.6039    Man Very dissatisfied
## 5 30.1006    15.0901      11.6039    Man Very dissatisfied
##                            CareerSat     rankP
## 1                  Very dissatisfied 0.6543962
## 2              Slightly dissatisfied 0.5653328
## 3 Neither satisfied nor dissatisfied 0.5398221
## 4                 Slightly satisfied 0.5895001
## 5                     Very satisfied 0.6515177

Being neither satisfied nor dissatisfied with the career gives the lowest predicted chance of optimism (about 54%), while being very dissatisfied or very satisfied raises that chance by almost the same amount, about 11 percentage points (to about 65%).
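The size of these gaps can be read directly off the `rankP` column. Here is a minimal Python sketch of that arithmetic, using the five predicted probabilities from the table above. The variable names are my own, and the values are copied from the output rather than recomputed from the model.

```python
# Predicted probabilities (rankP) by CareerSat level, copied from the table above.
rank_p = {
    "Very dissatisfied": 0.6543962,
    "Slightly dissatisfied": 0.5653328,
    "Neither satisfied nor dissatisfied": 0.5398221,
    "Slightly satisfied": 0.5895001,
    "Very satisfied": 0.6515177,
}

# Level with the lowest predicted chance of optimism.
lowest = min(rank_p, key=rank_p.get)

# Gap to that level, in percentage points.
gaps_pp = {level: 100 * (p - rank_p[lowest]) for level, p in rank_p.items()}

for level, gap in gaps_pp.items():
    print(f"{level:<36} +{gap:.1f} pp")
```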

So at the beginning of the career, satisfaction with the career plays a crucial role in the probability of being optimistic.

In the late career, by contrast, there is almost no difference between the satisfaction levels, except for “Neither satisfied nor dissatisfied”. However, that could be explained as a problem with my data, because there are not enough observations to make a valid prediction.

##           llh       llhNull            G2      McFadden          r2ML 
## -3.706799e+04 -3.767831e+04  1.220640e+03  1.619818e-02  2.101201e-02 
##          r2CU 
##  2.876580e-02

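However, the pseudo-R² values are only about 2% (from 1.6% for McFadden to 2.9% for Cragg-Uhler), so our model explains very little of the variation in optimism.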
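The deviance statistic and the McFadden value in the output above follow directly from the two log-likelihoods. Here is a minimal Python sketch of that arithmetic (not the R call that produced the output), with the log-likelihoods copied from the printed, rounded values:

```python
llh = -3.706799e4        # log-likelihood of the fitted model (rounded, from the output)
llh_null = -3.767831e4   # log-likelihood of the intercept-only model

# Likelihood-ratio statistic: twice the gain in log-likelihood over the null model.
g2 = 2.0 * (llh - llh_null)

# McFadden pseudo-R²: relative improvement in log-likelihood over the null model.
mcfadden = 1.0 - llh / llh_null

print(f"G2 = {g2:.2f}, McFadden = {mcfadden:.4f}")
```

The other two measures (`r2ML`, `r2CU`) also need the number of observations, which is not printed here, so they are left out of the sketch.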

## 
##  Hosmer and Lemeshow test (binary model)
## 
## data:  df2$BetterLife, fitted(mlg4)
## X-squared = 9.2123, df = 8, p-value = 0.3247

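The Hosmer and Lemeshow test is not significant (p = 0.32), so by itself it gives no evidence of a badly calibrated model. Still, given the very low pseudo-R² values, we can say that the parameters in this model were chosen poorly. A good student (and researcher) would start the whole homework again from the beginning; I will move on.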
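The p-value in the Hosmer and Lemeshow output can be checked by hand. The Python sketch below assumes the statistic is compared against a chi-square distribution with the reported 8 degrees of freedom. The helper `chi2_sf_even_df` is a hypothetical name; it uses the closed form of the upper tail that holds only for even degrees of freedom.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) of a chi-square variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive, even df")
    half = x / 2.0
    # For df = 2m: P(X > x) = exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms


x_squared = 9.2123   # test statistic from the output above
df_hl = 8            # degrees of freedom from the output above

p_value = chi2_sf_even_df(x_squared, df_hl)
print(f"p-value = {p_value:.4f}")
```

The result should agree with the reported p-value of 0.3247 up to rounding.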

0.5 Diagnostics

The residuals are not normal, there are still some outliers that need to be removed, and in the Scale-Location plot the observations cross the line. That is a sign of poor model fit.

The residuals are distributed pretty well, and we do not have any outliers here.

## # A tibble: 0 x 15
## # … with 15 variables: BetterLife <fct>, CareerSat <fct>, JobSat <fct>,
## #   YearsCodePro <dbl>, Age1stCode <dbl>, Gender <chr>, Age <dbl>,
## #   .fitted <dbl>, .se.fit <dbl>, .resid <dbl>, .hat <dbl>, .sigma <dbl>,
## #   .cooksd <dbl>, .std.resid <dbl>, index <int>

Yes, there is nothing to remove.

Multicollinearity

##                  GVIF Df GVIF^(1/(2*Df))
## CareerSat    2.321464  4        1.111016
## JobSat       2.326986  4        1.111346
## YearsCodePro 3.018257  1        1.737313
## Age1stCode   1.342531  1        1.158676
## Gender       1.018154  6        1.001500
## Age          2.523435  1        1.588532

No value is over 5, so we are fine at least here.
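The last column of the table is just the GVIF raised to the power 1/(2·Df), which makes terms with different degrees of freedom comparable. Here is a minimal Python sketch of that transformation, using the GVIF and Df values copied from the output above. The names `gvif_table` and `adjusted_gvif` are my own.

```python
# (GVIF, Df) per model term, copied from the output above.
gvif_table = {
    "CareerSat": (2.321464, 4),
    "JobSat": (2.326986, 4),
    "YearsCodePro": (3.018257, 1),
    "Age1stCode": (1.342531, 1),
    "Gender": (1.018154, 6),
    "Age": (2.523435, 1),
}


def adjusted_gvif(gvif, df):
    """Return GVIF^(1/(2*Df)), the df-adjusted generalized VIF."""
    return gvif ** (1.0 / (2 * df))


adjusted = {name: adjusted_gvif(g, d) for name, (g, d) in gvif_table.items()}

for name, value in adjusted.items():
    # The squared adjusted value is on the scale of an ordinary VIF.
    print(f"{name:<13} adjusted = {value:.6f}  squared = {value ** 2:.3f}")
```

Squaring the adjusted value puts it on the scale of an ordinary VIF, which is the scale a threshold such as 5 is commonly applied to. Read that way, the largest value here is about 3.0 (YearsCodePro), still under 5.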

0.6 Conclusion

Taking everything into consideration, a proper logistic regression is supposed to start with good descriptive statistics, which is where this research fails. On the other hand, the hypotheses RQ2-RQ4 were confirmed, while RQ1 needs more examination, since I am not used to this classification.

However, the overall poor model fit might be explained by the culture of programming, to which common models of analysis may not apply. So further research should pay more attention to the specific case of https://stackoverflow.com/

0.7 Resources

[1] Gallup, Inc. (2018, April 3). Americans More Optimistic About Future of Next Generation. Gallup.com. https://news.gallup.com/poll/232076/americans-optimistic-future-next-generation.aspx

[2] Xanthopoulou, D., Bakker, A. B., Demerouti, E., & Schaufeli, W. B. (2007). The role of personal resources in the job demands-resources model. International Journal of Stress Management, 14(2), 121–141. https://doi.org/10.1037/1072-5245.14.2.121

[3] Burke, R. J. (1991). Early Work and Career Experiences of Female and Male Managers and Professionals: Reasons for Optimism? Canadian Journal of Administrative Sciences / Revue Canadienne Des Sciences de l’Administration, 8(4), 224–230. https://doi.org/10.1111/j.1936-4490.1991.tb00565.x

[4] Cheng, G. H.-L., & Chan, D. K.-S. (2008). Who Suffers More from Job Insecurity? A Meta-Analytic Review. Applied Psychology, 57(2), 272–303. https://doi.org/10.1111/j.1464-0597.2007.00312.x

[5] APA Handbook of Industrial and Organizational Psychology. (n.d.). https://www.apa.org. Retrieved February 14, 2020, from https://www.apa.org/pubs/books/4311502