Part 1. Code Reproduction

x <- 1:50
w <- 1 + sqrt(x)/2
example1 <- data.frame(x=x, y= x + rnorm(x)*w)
attach(example1)
## The following object is masked _by_ .GlobalEnv:
## 
##     x

The variables x and y are created.

fm <- lm(y ~ x)
summary(fm)
## 
## Call:
## lm(formula = y ~ x)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.2914 -2.4942  0.3117  2.3208  6.2038 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -1.25815    1.03345  -1.217    0.229    
## x            1.08099    0.03527  30.648   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.599 on 48 degrees of freedom
## Multiple R-squared:  0.9514, Adjusted R-squared:  0.9504 
## F-statistic: 939.3 on 1 and 48 DF,  p-value: < 2.2e-16
lrf <- lowess(x, y)
plot(x, y)
lines(x, lrf$y)
abline(0, 1, lty=3)
abline(coef(fm))
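
The masking message above appears because x exists both in the global environment and in example1 once the data frame is attached. A minimal sketch of the same fit without attach(), passing the data frame through the data argument. The set.seed() call and its value are assumptions added here only to make the random draws reproducible, so the numbers will not match the output above.

```r
set.seed(1)                      # assumed seed, only for reproducibility
x <- 1:50
w <- 1 + sqrt(x) / 2             # noise standard deviation grows with x
example1 <- data.frame(x = x, y = x + rnorm(50) * w)

fm  <- lm(y ~ x, data = example1)        # no attach() needed
lrf <- lowess(example1$x, example1$y)    # local regression smoother

plot(example1$x, example1$y, xlab = "x", ylab = "y")
lines(lrf)                       # lowess curve
abline(0, 1, lty = 3)            # true line y = x (dotted)
abline(coef(fm))                 # fitted least-squares line
```

Because every variable is looked up inside example1, nothing in the global environment can mask it.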

#Data Exploration: PUMS (the Census Bureau's "Public Use Microdata Sample")

load("/Users/new/Desktop/acs2017_ny/acs2017_ny_data.RData")
#glimpse(acs2017_ny) try this later
acs2017_ny[1:10,1:7]
##    AGE female educ_nohs educ_hs educ_somecoll educ_college educ_advdeg
## 1   72      1         0       0             0            0           1
## 2   72      0         0       0             0            0           1
## 3   31      0         0       0             0            1           0
## 4   28      1         0       0             0            1           0
## 5   54      0         0       0             0            0           1
## 6   45      1         0       1             0            0           0
## 7   84      1         0       0             1            0           0
## 8   71      0         0       0             0            1           0
## 9   68      1         0       0             1            0           0
## 10  37      1         1       0             0            0           0
attach(acs2017_ny)
summary(acs2017_ny)
##       AGE            female         educ_nohs        educ_hs      
##  Min.   : 0.00   Min.   :0.0000   Min.   :0.000   Min.   :0.0000  
##  1st Qu.:22.00   1st Qu.:0.0000   1st Qu.:0.000   1st Qu.:0.0000  
##  Median :42.00   Median :1.0000   Median :0.000   Median :0.0000  
##  Mean   :41.57   Mean   :0.5156   Mean   :0.271   Mean   :0.2804  
##  3rd Qu.:60.00   3rd Qu.:1.0000   3rd Qu.:1.000   3rd Qu.:1.0000  
##  Max.   :95.00   Max.   :1.0000   Max.   :1.000   Max.   :1.0000  
##                                                                   
##  educ_somecoll    educ_college     educ_advdeg                  SCHOOL      
##  Min.   :0.000   Min.   :0.0000   Min.   :0.000   N/A              :  5569  
##  1st Qu.:0.000   1st Qu.:0.0000   1st Qu.:0.000   No, not in school:144968  
##  Median :0.000   Median :0.0000   Median :0.000   Yes, in school   : 46048  
##  Mean   :0.173   Mean   :0.1567   Mean   :0.119   Missing          :     0  
##  3rd Qu.:0.000   3rd Qu.:0.0000   3rd Qu.:0.000                             
##  Max.   :1.000   Max.   :1.0000   Max.   :1.000                             
##                                                                             
##                         EDUC      
##  Grade 12                 :55119  
##  4 years of college       :30802  
##  5+ years of college      :23385  
##  1 year of college        :19947  
##  Nursery school to grade 4:14240  
##  2 years of college       :14065  
##  (Other)                  :39027  
##                                           EDUCD      
##  Regular high school diploma                 :35689  
##  Bachelor's degree                           :30802  
##  1 or more years of college credit, no degree:19947  
##  Master's degree                             :17010  
##  Associate's degree, type not specified      :14065  
##  Some college, but less than 1 year          : 9086  
##  (Other)                                     :69986  
##                                      DEGFIELD     
##  N/A                                     :142398  
##  Business                                :  9802  
##  Education Administration and Teaching   :  6708  
##  Social Sciences                         :  4836  
##  Medical and Health Sciences and Services:  3919  
##  Fine Arts                               :  3491  
##  (Other)                                 : 25431  
##                                   DEGFIELDD     
##  N/A                                   :142398  
##  Psychology                            :  2926  
##  Business Management and Administration:  2501  
##  Accounting                            :  2284  
##  General Education                     :  2238  
##  English Language and Literature       :  2202  
##  (Other)                               : 42036  
##                                  DEGFIELD2     
##  N/A                                  :190425  
##  Business                             :   972  
##  Social Sciences                      :   853  
##  Education Administration and Teaching:   611  
##  Fine Arts                            :   465  
##  Communications                       :   352  
##  (Other)                              :  2907  
##                                                            DEGFIELD2D    
##  N/A                                                            :190425  
##  Psychology                                                     :   284  
##  Economics                                                      :   260  
##  Political Science and Government                               :   243  
##  Business Management and Administration                         :   217  
##  French, German, Latin and Other Common Foreign Language Studies:   205  
##  (Other)                                                        :  4951  
##       PUMA            GQ           OWNERSHP       OWNERSHPD        MORTGAGE    
##  Min.   : 100   Min.   :1.000   Min.   :0.000   Min.   : 0.00   Min.   :0.000  
##  1st Qu.:1500   1st Qu.:1.000   1st Qu.:1.000   1st Qu.:12.00   1st Qu.:0.000  
##  Median :3201   Median :1.000   Median :1.000   Median :13.00   Median :1.000  
##  Mean   :2713   Mean   :1.148   Mean   :1.266   Mean   :14.95   Mean   :1.453  
##  3rd Qu.:3902   3rd Qu.:1.000   3rd Qu.:2.000   3rd Qu.:22.00   3rd Qu.:3.000  
##  Max.   :4114   Max.   :5.000   Max.   :2.000   Max.   :22.00   Max.   :4.000  
##                                                                                
##     OWNCOST           RENT         COSTELEC       COSTGAS        COSTWATR   
##  Min.   :    0   Min.   :   0   Min.   :   0   Min.   :   0   Min.   :   0  
##  1st Qu.: 1208   1st Qu.:   0   1st Qu.: 960   1st Qu.: 840   1st Qu.: 320  
##  Median : 2891   Median :   0   Median :1560   Median :2400   Median :1400  
##  Mean   :38582   Mean   : 393   Mean   :2311   Mean   :5032   Mean   :4836  
##  3rd Qu.:99999   3rd Qu.: 630   3rd Qu.:2520   3rd Qu.:9993   3rd Qu.:9993  
##  Max.   :99999   Max.   :3800   Max.   :9997   Max.   :9997   Max.   :9997  
##                                                                             
##     COSTFUEL       HHINCOME          FOODSTMP        LINGISOL    
##  Min.   :   0   Min.   : -11800   Min.   :1.000   Min.   :0.000  
##  1st Qu.:9993   1st Qu.:  41600   1st Qu.:1.000   1st Qu.:1.000  
##  Median :9993   Median :  81700   Median :1.000   Median :1.000  
##  Mean   :7935   Mean   : 114902   Mean   :1.147   Mean   :1.002  
##  3rd Qu.:9993   3rd Qu.: 140900   3rd Qu.:1.000   3rd Qu.:1.000  
##  Max.   :9997   Max.   :2030000   Max.   :2.000   Max.   :2.000  
##                 NA's   :10630                                    
##      ROOMS           BUILTYR2         UNITSSTR        FUELHEAT    
##  Min.   : 0.000   Min.   : 0.000   Min.   : 0.00   Min.   :0.000  
##  1st Qu.: 4.000   1st Qu.: 1.000   1st Qu.: 3.00   1st Qu.:2.000  
##  Median : 6.000   Median : 3.000   Median : 3.00   Median :2.000  
##  Mean   : 5.887   Mean   : 3.711   Mean   : 4.39   Mean   :2.959  
##  3rd Qu.: 8.000   3rd Qu.: 5.000   3rd Qu.: 6.00   3rd Qu.:4.000  
##  Max.   :16.000   Max.   :22.000   Max.   :10.00   Max.   :9.000  
##                                                                   
##       SSMC            FAMSIZE           NCHILD           NCHLT5       
##  Min.   :0.00000   Min.   : 1.000   Min.   :0.0000   Min.   :0.00000  
##  1st Qu.:0.00000   1st Qu.: 2.000   1st Qu.:0.0000   1st Qu.:0.00000  
##  Median :0.00000   Median : 3.000   Median :0.0000   Median :0.00000  
##  Mean   :0.01102   Mean   : 3.087   Mean   :0.5009   Mean   :0.08441  
##  3rd Qu.:0.00000   3rd Qu.: 4.000   3rd Qu.:1.0000   3rd Qu.:0.00000  
##  Max.   :2.00000   Max.   :19.000   Max.   :9.0000   Max.   :5.00000  
##                                                                       
##      RELATE          RELATED           MARST            RACE          RACED    
##  Min.   : 1.000   Min.   : 101.0   Min.   :1.000   Min.   :1.00   Min.   :100  
##  1st Qu.: 1.000   1st Qu.: 101.0   1st Qu.:1.000   1st Qu.:1.00   1st Qu.:100  
##  Median : 2.000   Median : 201.0   Median :5.000   Median :1.00   Median :100  
##  Mean   : 3.307   Mean   : 335.6   Mean   :3.742   Mean   :2.03   Mean   :205  
##  3rd Qu.: 3.000   3rd Qu.: 301.0   3rd Qu.:6.000   3rd Qu.:2.00   3rd Qu.:200  
##  Max.   :13.000   Max.   :1301.0   Max.   :6.000   Max.   :9.00   Max.   :990  
##                                                                                
##      HISPAN          HISPAND                  BPL        
##  Min.   :0.0000   Min.   :  0.00   New York     :128517  
##  1st Qu.:0.0000   1st Qu.:  0.00   West Indies  :  8481  
##  Median :0.0000   Median :  0.00   China        :  4964  
##  Mean   :0.4153   Mean   : 44.75   SOUTH AMERICA:  4957  
##  3rd Qu.:0.0000   3rd Qu.:  0.00   India        :  3476  
##  Max.   :4.0000   Max.   :498.00   Pennsylvania :  3303  
##                                    (Other)      : 42887  
##                  BPLD                            ANCESTR1    
##  New York          :128517   Not Reported            :32021  
##  China             :  4116   Italian                 :20577  
##  Dominican Republic:  3517   Irish, various subheads,:16388  
##  Pennsylvania      :  3303   German                  :12781  
##  New Jersey        :  3127   African-American        : 9559  
##  Puerto Rico       :  2272   United States           : 8209  
##  (Other)           : 51733   (Other)                 :97050  
##                                    ANCESTR1D             ANCESTR2     
##  Not Reported                           :32021   Not Reported:141487  
##  Italian (1990-2000, ACS, PRCS)         :20577   German      :  9476  
##  Irish                                  :15651   Irish       :  9238  
##  German (1990-2000, ACS/PRCS)           :12605   English     :  4895  
##  African-American (1990-2000, ACS, PRCS): 9559   Italian     :  4531  
##  United States                          : 8209   Polish      :  3113  
##  (Other)                                :97963   (Other)     : 23845  
##                           ANCESTR2D         CITIZEN          YRSUSA1      
##  Not Reported                  :141487   Min.   :0.0000   Min.   : 0.000  
##  German (1990-2000, ACS, PRCS) :  9441   1st Qu.:0.0000   1st Qu.: 0.000  
##  Irish                         :  8809   Median :0.0000   Median : 0.000  
##  English                       :  4895   Mean   :0.4793   Mean   : 5.377  
##  Italian (1990-2000, ACS, PRCS):  4531   3rd Qu.:0.0000   3rd Qu.: 0.000  
##  Polish                        :  3113   Max.   :3.0000   Max.   :92.000  
##  (Other)                       : 24309                                    
##     HCOVANY         HCOVPRIV         SEX            EMPSTAT     
##  Min.   :1.000   Min.   :1.000   Male  : 95222   Min.   :0.000  
##  1st Qu.:2.000   1st Qu.:1.000   Female:101363   1st Qu.:1.000  
##  Median :2.000   Median :2.000                   Median :1.000  
##  Mean   :1.951   Mean   :1.691                   Mean   :1.514  
##  3rd Qu.:2.000   3rd Qu.:2.000                   3rd Qu.:3.000  
##  Max.   :2.000   Max.   :2.000                   Max.   :3.000  
##                                                                 
##     EMPSTATD        LABFORCE          OCC              IND       
##  Min.   : 0.00   Min.   :0.000   0      : 79987   0      :79987  
##  1st Qu.:10.00   1st Qu.:1.000   2310   :  3494   7860   : 9025  
##  Median :10.00   Median :2.000   5700   :  3235   8680   : 6354  
##  Mean   :15.16   Mean   :1.331   430    :  3025   770    : 6279  
##  3rd Qu.:30.00   3rd Qu.:2.000   4720   :  2666   8190   : 5873  
##  Max.   :30.00   Max.   :2.000   4760   :  2563   7870   : 4041  
##                                  (Other):101615   (Other):85026  
##     CLASSWKR       CLASSWKRD        WKSWORK2        UHRSWORK    
##  Min.   :0.000   Min.   : 0.00   Min.   :0.000   Min.   : 0.00  
##  1st Qu.:0.000   1st Qu.: 0.00   1st Qu.:0.000   1st Qu.: 0.00  
##  Median :2.000   Median :22.00   Median :1.000   Median :12.00  
##  Mean   :1.116   Mean   :13.03   Mean   :2.701   Mean   :19.77  
##  3rd Qu.:2.000   3rd Qu.:22.00   3rd Qu.:6.000   3rd Qu.:40.00  
##  Max.   :2.000   Max.   :29.00   Max.   :6.000   Max.   :99.00  
##                                                                 
##      INCTOT           FTOTINC           INCWAGE          POVERTY     
##  Min.   :  -7300   Min.   : -11800   Min.   :     0   Min.   :  0.0  
##  1st Qu.:   8000   1st Qu.:  35550   1st Qu.:     0   1st Qu.:159.0  
##  Median :  25000   Median :  74000   Median : 10000   Median :351.0  
##  Mean   :  45245   Mean   : 107110   Mean   : 33796   Mean   :318.7  
##  3rd Qu.:  56500   3rd Qu.: 132438   3rd Qu.: 47000   3rd Qu.:501.0  
##  Max.   :1563000   Max.   :2030000   Max.   :638000   Max.   :501.0  
##  NA's   :31129     NA's   :10817     NA's   :33427                   
##     MIGRATE1       MIGRATE1D        MIGPLAC1         MIGCOUNTY1     
##  Min.   :0.000   Min.   : 0.00   Min.   :  0.000   Min.   :  0.000  
##  1st Qu.:1.000   1st Qu.:10.00   1st Qu.:  0.000   1st Qu.:  0.000  
##  Median :1.000   Median :10.00   Median :  0.000   Median :  0.000  
##  Mean   :1.122   Mean   :11.51   Mean   :  6.184   Mean   :  4.117  
##  3rd Qu.:1.000   3rd Qu.:10.00   3rd Qu.:  0.000   3rd Qu.:  0.000  
##  Max.   :4.000   Max.   :40.00   Max.   :900.000   Max.   :810.000  
##                                                                     
##     MIGPUMA1        VETSTAT          VETSTATD         PWPUMA00    
##  Min.   :    0   Min.   :0.0000   Min.   : 0.000   Min.   :    0  
##  1st Qu.:    0   1st Qu.:1.0000   1st Qu.:11.000   1st Qu.:    0  
##  Median :    0   Median :1.0000   Median :11.000   Median :    0  
##  Mean   :  277   Mean   :0.8621   Mean   : 9.412   Mean   : 1255  
##  3rd Qu.:    0   3rd Qu.:1.0000   3rd Qu.:11.000   3rd Qu.: 3100  
##  Max.   :70100   Max.   :2.0000   Max.   :20.000   Max.   :59300  
##                                                                   
##     TRANWORK         TRANTIME         DEPARTS           in_NYC      
##  Min.   : 0.000   Min.   :  0.00   Min.   :   0.0   Min.   :0.0000  
##  1st Qu.: 0.000   1st Qu.:  0.00   1st Qu.:   0.0   1st Qu.:0.0000  
##  Median : 0.000   Median :  0.00   Median :   0.0   Median :0.0000  
##  Mean   : 9.725   Mean   : 14.75   Mean   : 373.3   Mean   :0.3615  
##  3rd Qu.:10.000   3rd Qu.: 20.00   3rd Qu.: 732.0   3rd Qu.:1.0000  
##  Max.   :70.000   Max.   :138.00   Max.   :2345.0   Max.   :1.0000  
##                                                                     
##     in_Bronx       in_Manhattan       in_StatenI       in_Brooklyn   
##  Min.   :0.0000   Min.   :0.00000   Min.   :0.00000   Min.   :0.000  
##  1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.000  
##  Median :0.0000   Median :0.00000   Median :0.00000   Median :0.000  
##  Mean   :0.0538   Mean   :0.04981   Mean   :0.02084   Mean   :0.126  
##  3rd Qu.:0.0000   3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.000  
##  Max.   :1.0000   Max.   :1.00000   Max.   :1.00000   Max.   :1.000  
##                                                                      
##    in_Queens      in_Westchester      in_Nassau          Hispanic     
##  Min.   :0.0000   Min.   :0.00000   Min.   :0.00000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.00000   Median :0.00000   Median :0.0000  
##  Mean   :0.1111   Mean   :0.04413   Mean   :0.07032   Mean   :0.1387  
##  3rd Qu.:0.0000   3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.0000  
##  Max.   :1.0000   Max.   :1.00000   Max.   :1.00000   Max.   :1.0000  
##                                                                       
##     Hisp_Mex          Hisp_PR         Hisp_Cuban         Hisp_DomR      
##  Min.   :0.00000   Min.   :0.0000   Min.   :0.000000   Min.   :0.00000  
##  1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:0.000000   1st Qu.:0.00000  
##  Median :0.00000   Median :0.0000   Median :0.000000   Median :0.00000  
##  Mean   :0.01626   Mean   :0.0436   Mean   :0.003403   Mean   :0.02827  
##  3rd Qu.:0.00000   3rd Qu.:0.0000   3rd Qu.:0.000000   3rd Qu.:0.00000  
##  Max.   :1.00000   Max.   :1.0000   Max.   :1.000000   Max.   :1.00000  
##                                                                         
##      white             AfAm          Amindian            Asian        
##  Min.   :0.0000   Min.   :0.000   Min.   :0.000000   Min.   :0.00000  
##  1st Qu.:0.0000   1st Qu.:0.000   1st Qu.:0.000000   1st Qu.:0.00000  
##  Median :1.0000   Median :0.000   Median :0.000000   Median :0.00000  
##  Mean   :0.6997   Mean   :0.125   Mean   :0.003779   Mean   :0.08656  
##  3rd Qu.:1.0000   3rd Qu.:0.000   3rd Qu.:0.000000   3rd Qu.:0.00000  
##  Max.   :1.0000   Max.   :1.000   Max.   :1.000000   Max.   :1.00000  
##                                                                       
##     race_oth        unmarried       veteran        has_AnyHealthIns
##  Min.   :0.0000   Min.   :0.00   Min.   :0.00000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.00   1st Qu.:0.00000   1st Qu.:1.0000  
##  Median :0.0000   Median :0.00   Median :0.00000   Median :1.0000  
##  Mean   :0.1324   Mean   :0.45   Mean   :0.04443   Mean   :0.9513  
##  3rd Qu.:0.0000   3rd Qu.:1.00   3rd Qu.:0.00000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.00   Max.   :1.00000   Max.   :1.0000  
##                                                                    
##  has_PvtHealthIns  Commute_car      Commute_bus      Commute_subway   
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :1.0000   Median :0.0000   Median :0.00000   Median :0.00000  
##  Mean   :0.6906   Mean   :0.2997   Mean   :0.02162   Mean   :0.07468  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:0.00000   3rd Qu.:0.00000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.00000   Max.   :1.00000  
##                                                                       
##   Commute_rail     Commute_other     below_povertyline below_150poverty
##  Min.   :0.00000   Min.   :0.00000   Min.   :0.000     Min.   :0.0000  
##  1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.000     1st Qu.:0.0000  
##  Median :0.00000   Median :0.00000   Median :0.000     Median :0.0000  
##  Mean   :0.01332   Mean   :0.05506   Mean   :0.122     Mean   :0.1965  
##  3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.000     3rd Qu.:0.0000  
##  Max.   :1.00000   Max.   :1.00000   Max.   :1.000     Max.   :1.0000  
##                                                                        
##  below_200poverty   foodstamps    
##  Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :0.0000   Median :0.0000  
##  Mean   :0.2676   Mean   :0.1465  
##  3rd Qu.:1.0000   3rd Qu.:0.0000  
##  Max.   :1.0000   Max.   :1.0000  
## 
print(NN_obs <- length(AGE))
## [1] 196585

This is the total number of observations (people) in the data.

Average Age of Men & Women

summary(AGE[female == 1])
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00   23.00   44.00   42.72   61.00   95.00
summary(AGE[!female])
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00   21.00   40.00   40.35   59.00   95.00

Note: female == 1 selects women, and !female (logical NOT, denoted by the "!" symbol) selects everyone whose female value is not 1, i.e. men.

Mean & Standard Deviation Functions

mean(AGE[female == 1])
## [1] 42.71629
sd(AGE[female == 1])
## [1] 23.72012
mean(AGE[!female])
## [1] 40.35398
sd(AGE[!female])
## [1] 23.1098
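
The four calls above can be collapsed into one call per statistic with tapply(). A small sketch using only the ten rows of AGE and female printed earlier, so it runs without the data file. The data frame name first10 is made up for this illustration, and the results describe those ten rows only, not the full sample.

```r
# AGE and female copied from the first ten rows printed above
first10 <- data.frame(
  AGE    = c(72, 72, 31, 28, 54, 45, 84, 71, 68, 37),
  female = c(1, 0, 0, 1, 0, 1, 1, 0, 1, 1)
)

# One entry per value of female (0 = men, 1 = women)
age_mean <- tapply(first10$AGE, first10$female, mean)
age_sd   <- tapply(first10$AGE, first10$female, sd)

print(round(cbind(mean = age_mean, sd = age_sd), 2))
```

With the full data attached, the same pattern would be tapply(AGE, female, mean).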

Educational Attainments in NYC

Q4. Tell me something else interesting, that you learned from the data, for example about educational attainments in different neighborhoods in the city. Are there surprises for you?

Most people in the data have at least a high school education. The two largest groups are holders of a regular high school diploma (35,689) and holders of a Bachelor's degree (30,802). Only a small portion left school before finishing primary education or received no schooling at all. Note that these counts cover the whole acs2017_ny sample, of which only about 36% is in NYC (the mean of in_NYC is 0.3615).

summary(EDUCD)
##                            N/A or no schooling 
##                                              0 
##                                            N/A 
##                                           5569 
##                         No schooling completed 
##                                           6310 
##                      Nursery school to grade 4 
##                                              0 
##                      Nursery school, preschool 
##                                           2760 
##                                   Kindergarten 
##                                           2247 
##                            Grade 1, 2, 3, or 4 
##                                              0 
##                                        Grade 1 
##                                           2131 
##                                        Grade 2 
##                                           2276 
##                                        Grade 3 
##                                           2418 
##                                        Grade 4 
##                                           2408 
##                            Grade 5, 6, 7, or 8 
##                                              0 
##                                   Grade 5 or 6 
##                                              0 
##                                        Grade 5 
##                                           2835 
##                                        Grade 6 
##                                           3365 
##                                   Grade 7 or 8 
##                                              0 
##                                        Grade 7 
##                                           2839 
##                                        Grade 8 
##                                           4040 
##                                        Grade 9 
##                                           4088 
##                                       Grade 10 
##                                           4644 
##                                       Grade 11 
##                                           5337 
##                                       Grade 12 
##                                              0 
##                         12th grade, no diploma 
##                                           3879 
##                    High school graduate or GED 
##                                              0 
##                    Regular high school diploma 
##                                          35689 
##                  GED or alternative credential 
##                                           6465 
##             Some college, but less than 1 year 
##                                           9086 
##                              1 year of college 
##                                              0 
##   1 or more years of college credit, no degree 
##                                          19947 
##                             2 years of college 
##                                              0 
##         Associate's degree, type not specified 
##                                          14065 
##       Associate's degree, occupational program 
##                                              0 
##           Associate's degree, academic program 
##                                              0 
##                             3 years of college 
##                                              0 
##                             4 years of college 
##                                              0 
##                              Bachelor's degree 
##                                          30802 
##                            5+ years of college 
##                                              0 
##           6 years of college (6+ in 1960-1970) 
##                                              0 
##                             7 years of college 
##                                              0 
##                            8+ years of college 
##                                              0 
##                                Master's degree 
##                                          17010 
## Professional degree beyond a bachelor's degree 
##                                           4051 
##                                Doctoral degree 
##                                           2324 
##                                        Missing 
##                                              0
plot(EDUCD)
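
Raw counts are easier to compare as shares of the sample. A short worked example using three counts from the summary(EDUCD) output above and the total from length(AGE); the variable names are made up for this illustration.

```r
n_obs <- 196585                       # total observations, from length(AGE) above
counts <- c(
  "Regular high school diploma" = 35689,
  "Bachelor's degree"           = 30802,
  "Master's degree"             = 17010
)

share_pct <- round(100 * counts / n_obs, 1)   # percent of all observations
print(share_pct)

# With the data attached, the same idea for NYC residents only would be
# (not run here, since it needs acs2017_ny):
# round(100 * prop.table(table(EDUCD[in_NYC == 1])), 1)
```

The Bachelor's share comes out at about 15.7%, which agrees with the mean of educ_college (0.1567) in the summary table.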

summary(DEGFIELD)
##                                                        N/A 
##                                                     142398 
##                                                Agriculture 
##                                                        262 
##                          Environment and Natural Resources 
##                                                        282 
##                                               Architecture 
##                                                        442 
##                     Area, Ethnic, and Civilization Studies 
##                                                        258 
##                                             Communications 
##                                                       2105 
##                                 Communication Technologies 
##                                                         79 
##                          Computer and Information Sciences 
##                                                       1530 
##                     Cosmetology Services and Culinary Arts 
##                                                         50 
##                      Education Administration and Teaching 
##                                                       6708 
##                                                Engineering 
##                                                       3145 
##                                   Engineering Technologies 
##                                                        246 
##                          Linguistics and Foreign Languages 
##                                                        683 
##                               Family and Consumer Sciences 
##                                                        272 
##                                                        Law 
##                                                        124 
##              English Language, Literature, and Composition 
##                                                       2315 
##                                Liberal Arts and Humanities 
##                                                        779 
##                                            Library Science 
##                                                         42 
##                                  Biology and Life Sciences 
##                                                       2361 
##                                 Mathematics and Statistics 
##                                                        904 
##                                      Military Technologies 
##                                                          3 
## Interdisciplinary and Multi-Disciplinary Studies (General) 
##                                                        358 
##           Physical Fitness, Parks, Recreation, and Leisure 
##                                                        264 
##                           Philosophy and Religious Studies 
##                                                        523 
##                           Theology and Religious Vocations 
##                                                        259 
##                                          Physical Sciences 
##                                                       1597 
## Nuclear, Industrial Radiology, and Biological Technologies 
##                                                         12 
##                                                 Psychology 
##                                                       3208 
##                       Criminal Justice and Fire Protection 
##                                                        884 
##                    Public Affairs, Policy, and Social Work 
##                                                        789 
##                                            Social Sciences 
##                                                       4836 
##                                      Construction Services 
##                                                         46 
##           Electrical and Mechanic Repairs and Technologies 
##                                                         14 
##                   Precision Production and Industrial Arts 
##                                                          0 
##                   Transportation Sciences and Technologies 
##                                                         84 
##                                                  Fine Arts 
##                                                       3491 
##                   Medical and Health Sciences and Services 
##                                                       3919 
##                                                   Business 
##                                                       9802 
##                                                    History 
##                                                       1511
plot(DEGFIELD)

Q4. Dice experiments

Before class, you should have done about 20 experiments where you roll the dice (or use the app) and record whether the result was a 6 or not. If you’ve got an app, then drawing integers from interval [1,6] is like fair dice; integers from [1,5] will be rather boring and never produce a 6; but integers from [1,8] or [1,10] will produce 6 but at a lower rate than fair dice.

Rolling the dice physically

[1] p(6)=0

1.Fair Dice

roll_dice <- function(n) sample(1:6, n, replace = TRUE)
roll_dice(20)
##  [1] 6 4 1 4 2 4 1 2 3 5 6 4 4 2 1 5 1 4 5 3
mean(roll_dice(20))
## [1] 4.15
sd(roll_dice(20))
## [1] 1.65434

p(6)=0.3

roll_dice(40)
##  [1] 3 2 3 6 1 3 1 5 1 1 3 4 2 5 2 1 6 1 3 6 3 5 4 5 3 6 2 2 4 2 5 3 3 5 3 1 2 5
## [39] 5 4
mean(roll_dice(40))
## [1] 3.5
sd(roll_dice(40))
## [1] 1.648426
p(6)=0.225
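
Each call to roll_dice() above draws a fresh sample, so the printed rolls, the mean, and the standard deviation all come from different experiments. A minimal sketch that stores one sample and computes every statistic from it, including the observed share of 6s. The seed value is an assumption added for reproducibility.

```r
roll_dice <- function(n) sample(1:6, n, replace = TRUE)

set.seed(2000)                  # assumed seed, only for reproducibility
rolls <- roll_dice(20)          # draw once, reuse below

mean(rolls)                     # sample mean (a fair die has mean 3.5)
sd(rolls)                       # sample standard deviation
share_six <- mean(rolls == 6)   # observed share of 6s in this sample
share_six
```

mean(rolls == 6) works because the comparison gives TRUE/FALSE values, which R averages as 1/0.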


## Findings:
When I rolled the dice physically, the observed share of 6s was much lower than the expected probability of 1/6 per roll. Judging from that streak, intuition tends to suggest that I will continue to have only a thin chance of getting a 6 in the following rolls.
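
How unusual is a run of 20 rolls with no 6 at all? Assuming a fair die and independent rolls, the number of 6s follows a binomial distribution, so the chance can be computed directly. This is a sketch under those assumptions; the variable names are made up here.

```r
n_rolls <- 20
p_six   <- 1 / 6

# Probability of zero 6s in 20 fair, independent rolls; equals (5/6)^20
p_zero <- dbinom(0, size = n_rolls, prob = p_six)

# Expected number of 6s in 20 rolls
expected_sixes <- n_rolls * p_six

print(c(p_zero = p_zero, expected_sixes = expected_sixes))
```

The probability is roughly 2.6%, so such a streak is uncommon but not impossible, and it says nothing about the next roll.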

Q5. S&P500

Differences in means can be complicated. Find the mean return on SP500 index (choose a time period). What is the mean return on days when the previous day’s return was positive? When the previous 2 days were positive? Negative? Now read about “hot hands fallacy” and tell if you think that helps investment strategy. (You might start with this tweet, and read the papers referenced.)

Data source: Yahoo Finance

  
**A chart named return.png was meant to be attached here, but it failed to insert.**
## The mean return from Aug. 3rd to Aug. 10th is 0.447%, a period with six consecutive days of positive returns.

## The mean return over the two positive-return days from Aug. 7th to Aug. 8th is 0.168%.

## The mean return from Aug. 13th to Aug. 14th is -0.111%.

## The mean return for the whole month of August is 0.32%.
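
The conditional means asked for in the question can be computed from a vector of closing prices. A minimal sketch: the prices below are made up for illustration and are not real S&P 500 data, and the names close_px, prev_pos, and prev2_pos are hypothetical.

```r
# Hypothetical closing prices (NOT real S&P 500 data)
close_px <- c(100, 101, 102, 101.5, 103, 104, 103, 102, 103.5, 105)

ret <- diff(close_px) / head(close_px, -1)   # simple daily returns
n   <- length(ret)

# Was the previous day's return positive? (NA for the first day)
prev_pos <- c(NA, ret[-n] > 0)
# Were both of the previous two days' returns positive? (NA for the first two days)
prev2_pos <- c(NA, NA, ret[1:(n - 2)] > 0 & ret[2:(n - 1)] > 0)

mean_all        <- mean(ret)                    # unconditional mean return
mean_after_up   <- mean(ret[which(prev_pos)])   # after one positive day
mean_after_2up  <- mean(ret[which(prev2_pos)])  # after two positive days
mean_after_down <- mean(ret[which(!prev_pos)])  # after a non-positive day

print(c(all = mean_all, after_up = mean_after_up,
        after_2up = mean_after_2up, after_down = mean_after_down))
```

which() drops the NA entries, so the first days, which have no history, are left out of the conditional means.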



#Advice from observing the historical streaks

Historical prices of the S&P 500 from Aug. 03 to Sep. 11 were selected to observe the market's return streaks. From the beginning of August, the market enjoyed 7 consecutive days of rising returns, with a mean return of 0.447%. Calculated over the first two days of the positive run, the mean return is 0.57%, much higher than the 0.447% over the comparatively long rising period. In my opinion, over a short period when the market is thriving, investors tend to believe that returns will more often remain positive. Similarly, in the dice experiment, I got no 6 in the first 20 physical rolls, and I tended to think the chance of getting a 6 in the following rolls was thin: the **Hot Hand Fallacy**. However, once the market has been rising for a fairly long period, our intuition may alert us that a turnaround is due after a run of consecutive positive returns. In this situation, irrational investors may tend to go short, and the negative sentiment becomes contagious to other investors.
So, my conclusion is that streaks in stock returns can lead us into the hot hand fallacy or the gambler's fallacy. But when we go back to the simplest dice-rolling game, we should remember that the probability of getting any number from 1 to 6 is the same, 1/6, on every roll, regardless of the previous rolls. The same reasoning applies to the stock market: if each day's return is treated as an independent event, a past streak does not change the chance of a rise or a fall on the next day. We should instead take other factors into consideration, such as government policy, firm performance, other economic indexes, etc.
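
The independence argument can be checked on simulated data. A sketch under strong assumptions: the returns below are drawn independently from a normal distribution with a made-up mean and standard deviation, so they only illustrate what independence implies and say nothing about the real market.

```r
set.seed(500)                                  # assumed seed, only for reproducibility
r  <- rnorm(10000, mean = 0.0003, sd = 0.01)   # hypothetical independent daily returns
up <- r > 0                                    # TRUE on days with a positive return
n  <- length(up)

p_up            <- mean(up)                # share of up days overall
p_up_after_up   <- mean(up[-1][up[-n]])    # share of up days following an up day
p_up_after_down <- mean(up[-1][!up[-n]])   # share of up days following a down day

print(round(c(overall = p_up, after_up = p_up_after_up,
              after_down = p_up_after_down), 3))
```

When the draws are independent, the three shares should be close to each other, so the previous day's direction carries essentially no information about the next day.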

```r
getwd()
## [1] "/Users/new/Desktop/ecob2000_leture1"