attach(Group_3_)
The following objects are masked from Group_3_ (pos = 3):

    ANNUITY, CHILDREN, CREDIT, DAYS_BIRTH, DAYS_EMPLOYED, DAYS_LAST_PHONE_CHANGE,
    DAYS_REGISTRATION, DEF_LARGE_SOCIAL_CIRCLE, DEF_SMALL_SOCIAL_CIRCLE, DOC_01,
    DOC_02, DOC_03, DOC_04, DOC_05, DOC_06, DOC_07, DOC_08, DOC_09, DOC_10, DOC_11,
    DOC_12, DOC_13, DOC_14, DOC_15, DOC_16, DOC_17, DOC_18, DOC_19, DOC_20,
    EDUCATION, EMAIL, EMPLOYMENT, EMP_PHONE, FAMILY, FAMILY_STATUS, GENDER,
    GOODS_PRICE, HOME_PHONE, HOUSING_TYPE, INCOME, LIVE_CITY_NOT_WORK_CITY,
    LIVE_REGION_NOT_WORK_REGION, NAME_TYPE_SUITE, OBS_LARGE_SOCIAL_CIRCLE,
    OBS_SMALL_SOCIAL_CIRCLE, OCCUPATION, ORGANIZATION_TYPE, OTHER_MOBILE, OWN_CAR,
    OWN_MOBILE, OWN_REALTY, PROCESS_START_DAY, PROCESS_START_HOUR, REGION_RATING,
    REGION_RATING_CITY, REG_CITY_NOT_LIVE_CITY, REG_CITY_NOT_WORK_CITY,
    REG_REGION_NOT_LIVE_REGION, REG_REGION_NOT_WORK_REGION, WORK_PHONE

The following objects are masked from Group_1_ (pos = 6):

    ANNUITY, CHILDREN, CREDIT, DAYS_BIRTH, DAYS_EMPLOYED, DAYS_LAST_PHONE_CHANGE,
    DAYS_REGISTRATION, DEF_LARGE_SOCIAL_CIRCLE, DEF_SMALL_SOCIAL_CIRCLE, DOC_01,
    DOC_02, DOC_03, DOC_04, DOC_05, DOC_06, DOC_07, DOC_08, DOC_09, DOC_10, DOC_11,
    DOC_12, DOC_13, DOC_14, DOC_15, DOC_16, DOC_17, DOC_18, DOC_19, DOC_20,
    EDUCATION, EMAIL, EMPLOYMENT, EMP_PHONE, FAMILY, FAMILY_STATUS, GENDER,
    GOODS_PRICE, HOME_PHONE, HOUSING_TYPE, INCOME, LIVE_CITY_NOT_WORK_CITY,
    LIVE_REGION_NOT_WORK_REGION, NAME_TYPE_SUITE, OBS_LARGE_SOCIAL_CIRCLE,
    OBS_SMALL_SOCIAL_CIRCLE, OCCUPATION, ORGANIZATION_TYPE, OTHER_MOBILE, OWN_CAR,
    OWN_MOBILE, OWN_REALTY, PROCESS_START_DAY, PROCESS_START_HOUR, REGION_RATING,
    REGION_RATING_CITY, REG_CITY_NOT_LIVE_CITY, REG_CITY_NOT_WORK_CITY,
    REG_REGION_NOT_LIVE_REGION, REG_REGION_NOT_WORK_REGION, WORK_PHONE
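
The masking messages above appear because `attach()` has been called several times, leaving copies of `Group_3_` and `Group_1_` on the search path with identical column names. A minimal sketch of one way to avoid this is to pass the data frame through `data =` instead of attaching it. The data frame `loans` and its simulated values are hypothetical stand-ins, not the real `Group_3_` data.

```r
# Hypothetical stand-in for Group_3_; the values are simulated, not the real data.
set.seed(1)
loans <- data.frame(INCOME = runif(200, min = 50000, max = 300000))
loans$ANNUITY <- 16000 + 0.066 * loans$INCOME + rnorm(200, sd = 12000)

# Columns are looked up in `loans` through `data =`, so nothing is attached
# and nothing on the search path can be masked.
fit_income <- lm(ANNUITY ~ INCOME, data = loans)
print(coef(fit_income))

# The data frame never appears on the search path.
print("loans" %in% search())
```

With this pattern each model names its data source explicitly, so it is always clear whether a column comes from `Group_3_` or `Group_1_`.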

# Regression of ANNUITY on INCOME
model_1 <- lm(ANNUITY ~ INCOME)
summary(model_1)

Call:
lm(formula = ANNUITY ~ INCOME)

Residuals:
    Min      1Q  Median      3Q     Max 
-109340   -8734   -1626    6664  142769 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 1.585e+04  2.356e+02   67.30   <2e-16 ***
INCOME      6.645e-02  1.207e-03   55.05   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 12840 on 11798 degrees of freedom
Multiple R-squared:  0.2043,    Adjusted R-squared:  0.2043 
F-statistic:  3030 on 1 and 11798 DF,  p-value: < 2.2e-16
plot(model_1)
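
As a worked reading of the `model_1` coefficients, the fitted line is ANNUITY = 15850 + 0.06645 * INCOME. The sketch below plugs in one income value by hand. The income of 100000 is a hypothetical figure chosen for illustration and is not taken from the data.

```r
# Coefficients copied from the summary(model_1) output above.
intercept <- 1.585e+04
slope     <- 6.645e-02

# Hypothetical applicant income, chosen only for illustration.
income <- 100000

# Fitted annuity for that income: intercept + slope * income.
predicted_annuity <- intercept + slope * income
print(predicted_annuity)
```

With the fitted model object available, `predict(model_1, newdata = data.frame(INCOME = 100000))` would give the same kind of value using the unrounded coefficients.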

# Regression of ANNUITY on GENDER (no intercept, so each coefficient is a group mean)
model_2 <- lm(ANNUITY ~ factor(GENDER) - 1)
summary(model_2)

Call:
lm(formula = ANNUITY ~ factor(GENDER) - 1)

Residuals:
   Min     1Q Median     3Q    Max 
-25389 -10300  -2247   7349 151056 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
factor(GENDER)F    26090.6      162.9  160.19   <2e-16 ***
factor(GENDER)M    28943.6      225.1  128.58   <2e-16 ***
factor(GENDER)XNA  16312.5    10134.8    1.61    0.108    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 14330 on 11797 degrees of freedom
Multiple R-squared:  0.7815,    Adjusted R-squared:  0.7815 
F-statistic: 1.407e+04 on 3 and 11797 DF,  p-value: < 2.2e-16
plot(model_2)
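
The R-squared of 0.7815 reported for `model_2` is not directly comparable with the 0.2043 of `model_1`. When a formula drops the intercept with `- 1`, `summary.lm()` measures the total sum of squares around zero instead of around the mean of the response, which inflates R-squared. The sketch below illustrates this on simulated data. The variables `gender` and `annuity` and all of their values are hypothetical.

```r
# Simulated data; not the real Group_3_ values.
set.seed(42)
n <- 500
gender  <- factor(sample(c("F", "M"), n, replace = TRUE))
annuity <- ifelse(gender == "M", 29000, 26000) + rnorm(n, sd = 14000)

fit_cells <- lm(annuity ~ gender - 1)  # one mean per level, no intercept
fit_treat <- lm(annuity ~ gender)      # intercept plus a contrast for "M"

# Both parameterisations give identical fitted values ...
print(all.equal(unname(fitted(fit_cells)), unname(fitted(fit_treat))))

# ... but R-squared is computed against different baselines.
r2_cells <- summary(fit_cells)$r.squared
r2_treat <- summary(fit_treat)$r.squared
print(c(no_intercept = r2_cells, with_intercept = r2_treat))
```

For comparing how much variation a factor explains, the R-squared of the version with an intercept is the one on the same scale as `model_1`.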

# Regression of ANNUITY on HOUSING_TYPE (no intercept, so each coefficient is a group mean)
model_3 <- lm(ANNUITY ~ factor(HOUSING_TYPE) - 1)
summary(model_3)

Call:
lm(formula = ANNUITY ~ factor(HOUSING_TYPE) - 1)

Residuals:
   Min     1Q Median     3Q    Max 
-24583 -10576  -2179   7513 152756 

Coefficients:
                                        Estimate Std. Error t value Pr(>|t|)    
factor(HOUSING_TYPE)Co-op Apartment      27315.3     2193.7   12.45   <2e-16 ***
factor(HOUSING_TYPE)Municipal Apartment  26769.7      672.9   39.78   <2e-16 ***
factor(HOUSING_TYPE)Office Apartment     26870.6     1329.9   20.20   <2e-16 ***
factor(HOUSING_TYPE)Own Apartment        27244.0      140.8  193.45   <2e-16 ***
factor(HOUSING_TYPE)Rented Apartment     26482.0     1046.4   25.31   <2e-16 ***
factor(HOUSING_TYPE)With Parents         24281.1      607.3   39.98   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 14390 on 11794 degrees of freedom
Multiple R-squared:   0.78, Adjusted R-squared:  0.7799 
F-statistic:  6968 on 6 and 11794 DF,  p-value: < 2.2e-16
plot(model_3)

# Regression of ANNUITY on EDUCATION (no intercept, so each coefficient is a group mean)
model_4 <- lm(ANNUITY ~ factor(EDUCATION) - 1)
summary(model_4)

Call:
lm(formula = ANNUITY ~ factor(EDUCATION) - 1)

Residuals:
   Min     1Q Median     3Q    Max 
-27097 -10198  -2013   7314 149181 

Coefficients:
                                   Estimate Std. Error t value Pr(>|t|)    
factor(EDUCATION)Academic degree    49114.5     7115.9   6.902 5.39e-12 ***
factor(EDUCATION)Higher education   30818.9      264.9 116.333  < 2e-16 ***
factor(EDUCATION)Incomplete higher  26070.7      723.4  36.037  < 2e-16 ***
factor(EDUCATION)Lower secondary    24140.4     1202.8  20.070  < 2e-16 ***
factor(EDUCATION)Secondary          25862.7      155.4 166.384  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 14230 on 11795 degrees of freedom
Multiple R-squared:  0.7846,    Adjusted R-squared:  0.7845 
F-statistic:  8593 on 5 and 11795 DF,  p-value: < 2.2e-16
plot(model_4)

# Regression of ANNUITY on EMPLOYMENT (no intercept, so each coefficient is a group mean)
model_5 <- lm(ANNUITY ~ factor(EMPLOYMENT) - 1)
plot(model_5)
not plotting observations with leverage one:
  7280
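
R prints the "leverage one" message when an observation has a hat value of 1, meaning the fit passes exactly through that point. One common cause is a factor level that contains a single row. Whether that is what makes row 7280 special here is an assumption, since the source does not show the level counts. The sketch below reproduces the situation on simulated data with a single hypothetical "Student" row.

```r
# Simulated data; the level counts are made up for illustration.
set.seed(7)
employment <- factor(c(rep("Working", 50), rep("Pensioner", 30), "Student"))
annuity    <- rnorm(length(employment), mean = 26000, sd = 14000)

fit <- lm(annuity ~ employment - 1)

# In a group-means model the hat value of a row is 1 / (size of its group),
# so the lone "Student" row gets leverage 1.
h <- hatvalues(fit)
lone_rows <- which(h > 1 - 1e-8)
print(lone_rows)

# Level counts show which group has only one member.
print(table(employment))
```

On the real data, `table(EMPLOYMENT)` and `hatvalues(model_5)[7280]` would be the corresponding checks.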

# Regression of ANNUITY on EMPLOYMENT and INCOME
model_6 <- lm(ANNUITY ~ factor(EMPLOYMENT) - 1 + INCOME)
summary(model_6)

Call:
lm(formula = ANNUITY ~ factor(EMPLOYMENT) - 1 + INCOME)

Residuals:
    Min      1Q  Median      3Q     Max 
-106258   -8683   -1606    6633  142945 

Coefficients:
                                        Estimate Std. Error t value Pr(>|t|)    
factor(EMPLOYMENT)Commercial Associate 1.694e+04  3.484e+02  48.617   <2e-16 ***
factor(EMPLOYMENT)Pensioner            1.481e+04  3.293e+02  44.977   <2e-16 ***
factor(EMPLOYMENT)State Servant        1.724e+04  4.971e+02  34.671   <2e-16 ***
factor(EMPLOYMENT)Student              1.033e+04  1.282e+04   0.805    0.421    
factor(EMPLOYMENT)Unemployed           7.047e+03  9.070e+03   0.777    0.437    
factor(EMPLOYMENT)Working              1.604e+04  2.609e+02  61.477   <2e-16 ***
INCOME                                 6.484e-02  1.239e-03  52.334   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 12820 on 11793 degrees of freedom
Multiple R-squared:  0.8252,    Adjusted R-squared:  0.8251 
F-statistic:  7951 on 7 and 11793 DF,  p-value: < 2.2e-16
plot(model_6)
not plotting observations with leverage one:
  7280
