x <- 1:50
w <- 1 + sqrt(x)/2
example1 <- data.frame(x=x, y= x + rnorm(x)*w)
attach(example1)
## The following object is masked _by_ .GlobalEnv:
##
## x
The variables x and y are created.
fm <- lm(y ~ x)
summary(fm)
##
## Call:
## lm(formula = y ~ x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.2914 -2.4942 0.3117 2.3208 6.2038
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.25815 1.03345 -1.217 0.229
## x 1.08099 0.03527 30.648 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.599 on 48 degrees of freedom
## Multiple R-squared: 0.9514, Adjusted R-squared: 0.9504
## F-statistic: 939.3 on 1 and 48 DF, p-value: < 2.2e-16
lrf <- lowess(x, y)
plot(x, y)
lines(x, lrf$y)
abline(0, 1, lty=3)
abline(coef(fm))
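Since the noise in y is scaled by w, its spread grows with x. A minimal sketch of a weighted fit for this situation follows, assuming weights of 1/w^2 (the inverse of the noise variance). The seed and the names `dat`, `fm_ols` and `fm_wls` are my own additions for illustration.

```r
set.seed(1)                        # assumption: fixed seed so the sketch is reproducible
x <- 1:50
w <- 1 + sqrt(x) / 2               # noise standard deviation grows with x
dat <- data.frame(x = x, y = x + rnorm(length(x)) * w)

fm_ols <- lm(y ~ x, data = dat)                      # ordinary least squares
fm_wls <- lm(y ~ x, data = dat, weights = 1 / w^2)   # weights proportional to 1/variance

# Compare the two sets of coefficients side by side
rbind(ols = coef(fm_ols), wls = coef(fm_wls))
```

Passing `data = dat` to lm() also avoids the attach() call and the masking warning it produces.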
#Data exploration: PUMS (Census Bureau’s "Public Use Microdata Sample")
load("/Users/new/Desktop/acs2017_ny/acs2017_ny_data.RData")
#glimpse(acs2017_ny) try this later
acs2017_ny[1:10,1:7]
## AGE female educ_nohs educ_hs educ_somecoll educ_college educ_advdeg
## 1 72 1 0 0 0 0 1
## 2 72 0 0 0 0 0 1
## 3 31 0 0 0 0 1 0
## 4 28 1 0 0 0 1 0
## 5 54 0 0 0 0 0 1
## 6 45 1 0 1 0 0 0
## 7 84 1 0 0 1 0 0
## 8 71 0 0 0 0 1 0
## 9 68 1 0 0 1 0 0
## 10 37 1 1 0 0 0 0
attach(acs2017_ny)
summary(acs2017_ny)
## AGE female educ_nohs educ_hs
## Min. : 0.00 Min. :0.0000 Min. :0.000 Min. :0.0000
## 1st Qu.:22.00 1st Qu.:0.0000 1st Qu.:0.000 1st Qu.:0.0000
## Median :42.00 Median :1.0000 Median :0.000 Median :0.0000
## Mean :41.57 Mean :0.5156 Mean :0.271 Mean :0.2804
## 3rd Qu.:60.00 3rd Qu.:1.0000 3rd Qu.:1.000 3rd Qu.:1.0000
## Max. :95.00 Max. :1.0000 Max. :1.000 Max. :1.0000
##
## educ_somecoll educ_college educ_advdeg SCHOOL
## Min. :0.000 Min. :0.0000 Min. :0.000 N/A : 5569
## 1st Qu.:0.000 1st Qu.:0.0000 1st Qu.:0.000 No, not in school:144968
## Median :0.000 Median :0.0000 Median :0.000 Yes, in school : 46048
## Mean :0.173 Mean :0.1567 Mean :0.119 Missing : 0
## 3rd Qu.:0.000 3rd Qu.:0.0000 3rd Qu.:0.000
## Max. :1.000 Max. :1.0000 Max. :1.000
##
## EDUC
## Grade 12 :55119
## 4 years of college :30802
## 5+ years of college :23385
## 1 year of college :19947
## Nursery school to grade 4:14240
## 2 years of college :14065
## (Other) :39027
## EDUCD
## Regular high school diploma :35689
## Bachelor's degree :30802
## 1 or more years of college credit, no degree:19947
## Master's degree :17010
## Associate's degree, type not specified :14065
## Some college, but less than 1 year : 9086
## (Other) :69986
## DEGFIELD
## N/A :142398
## Business : 9802
## Education Administration and Teaching : 6708
## Social Sciences : 4836
## Medical and Health Sciences and Services: 3919
## Fine Arts : 3491
## (Other) : 25431
## DEGFIELDD
## N/A :142398
## Psychology : 2926
## Business Management and Administration: 2501
## Accounting : 2284
## General Education : 2238
## English Language and Literature : 2202
## (Other) : 42036
## DEGFIELD2
## N/A :190425
## Business : 972
## Social Sciences : 853
## Education Administration and Teaching: 611
## Fine Arts : 465
## Communications : 352
## (Other) : 2907
## DEGFIELD2D
## N/A :190425
## Psychology : 284
## Economics : 260
## Political Science and Government : 243
## Business Management and Administration : 217
## French, German, Latin and Other Common Foreign Language Studies: 205
## (Other) : 4951
## PUMA GQ OWNERSHP OWNERSHPD MORTGAGE
## Min. : 100 Min. :1.000 Min. :0.000 Min. : 0.00 Min. :0.000
## 1st Qu.:1500 1st Qu.:1.000 1st Qu.:1.000 1st Qu.:12.00 1st Qu.:0.000
## Median :3201 Median :1.000 Median :1.000 Median :13.00 Median :1.000
## Mean :2713 Mean :1.148 Mean :1.266 Mean :14.95 Mean :1.453
## 3rd Qu.:3902 3rd Qu.:1.000 3rd Qu.:2.000 3rd Qu.:22.00 3rd Qu.:3.000
## Max. :4114 Max. :5.000 Max. :2.000 Max. :22.00 Max. :4.000
##
## OWNCOST RENT COSTELEC COSTGAS COSTWATR
## Min. : 0 Min. : 0 Min. : 0 Min. : 0 Min. : 0
## 1st Qu.: 1208 1st Qu.: 0 1st Qu.: 960 1st Qu.: 840 1st Qu.: 320
## Median : 2891 Median : 0 Median :1560 Median :2400 Median :1400
## Mean :38582 Mean : 393 Mean :2311 Mean :5032 Mean :4836
## 3rd Qu.:99999 3rd Qu.: 630 3rd Qu.:2520 3rd Qu.:9993 3rd Qu.:9993
## Max. :99999 Max. :3800 Max. :9997 Max. :9997 Max. :9997
##
## COSTFUEL HHINCOME FOODSTMP LINGISOL
## Min. : 0 Min. : -11800 Min. :1.000 Min. :0.000
## 1st Qu.:9993 1st Qu.: 41600 1st Qu.:1.000 1st Qu.:1.000
## Median :9993 Median : 81700 Median :1.000 Median :1.000
## Mean :7935 Mean : 114902 Mean :1.147 Mean :1.002
## 3rd Qu.:9993 3rd Qu.: 140900 3rd Qu.:1.000 3rd Qu.:1.000
## Max. :9997 Max. :2030000 Max. :2.000 Max. :2.000
## NA's :10630
## ROOMS BUILTYR2 UNITSSTR FUELHEAT
## Min. : 0.000 Min. : 0.000 Min. : 0.00 Min. :0.000
## 1st Qu.: 4.000 1st Qu.: 1.000 1st Qu.: 3.00 1st Qu.:2.000
## Median : 6.000 Median : 3.000 Median : 3.00 Median :2.000
## Mean : 5.887 Mean : 3.711 Mean : 4.39 Mean :2.959
## 3rd Qu.: 8.000 3rd Qu.: 5.000 3rd Qu.: 6.00 3rd Qu.:4.000
## Max. :16.000 Max. :22.000 Max. :10.00 Max. :9.000
##
## SSMC FAMSIZE NCHILD NCHLT5
## Min. :0.00000 Min. : 1.000 Min. :0.0000 Min. :0.00000
## 1st Qu.:0.00000 1st Qu.: 2.000 1st Qu.:0.0000 1st Qu.:0.00000
## Median :0.00000 Median : 3.000 Median :0.0000 Median :0.00000
## Mean :0.01102 Mean : 3.087 Mean :0.5009 Mean :0.08441
## 3rd Qu.:0.00000 3rd Qu.: 4.000 3rd Qu.:1.0000 3rd Qu.:0.00000
## Max. :2.00000 Max. :19.000 Max. :9.0000 Max. :5.00000
##
## RELATE RELATED MARST RACE RACED
## Min. : 1.000 Min. : 101.0 Min. :1.000 Min. :1.00 Min. :100
## 1st Qu.: 1.000 1st Qu.: 101.0 1st Qu.:1.000 1st Qu.:1.00 1st Qu.:100
## Median : 2.000 Median : 201.0 Median :5.000 Median :1.00 Median :100
## Mean : 3.307 Mean : 335.6 Mean :3.742 Mean :2.03 Mean :205
## 3rd Qu.: 3.000 3rd Qu.: 301.0 3rd Qu.:6.000 3rd Qu.:2.00 3rd Qu.:200
## Max. :13.000 Max. :1301.0 Max. :6.000 Max. :9.00 Max. :990
##
## HISPAN HISPAND BPL
## Min. :0.0000 Min. : 0.00 New York :128517
## 1st Qu.:0.0000 1st Qu.: 0.00 West Indies : 8481
## Median :0.0000 Median : 0.00 China : 4964
## Mean :0.4153 Mean : 44.75 SOUTH AMERICA: 4957
## 3rd Qu.:0.0000 3rd Qu.: 0.00 India : 3476
## Max. :4.0000 Max. :498.00 Pennsylvania : 3303
## (Other) : 42887
## BPLD ANCESTR1
## New York :128517 Not Reported :32021
## China : 4116 Italian :20577
## Dominican Republic: 3517 Irish, various subheads,:16388
## Pennsylvania : 3303 German :12781
## New Jersey : 3127 African-American : 9559
## Puerto Rico : 2272 United States : 8209
## (Other) : 51733 (Other) :97050
## ANCESTR1D ANCESTR2
## Not Reported :32021 Not Reported:141487
## Italian (1990-2000, ACS, PRCS) :20577 German : 9476
## Irish :15651 Irish : 9238
## German (1990-2000, ACS/PRCS) :12605 English : 4895
## African-American (1990-2000, ACS, PRCS): 9559 Italian : 4531
## United States : 8209 Polish : 3113
## (Other) :97963 (Other) : 23845
## ANCESTR2D CITIZEN YRSUSA1
## Not Reported :141487 Min. :0.0000 Min. : 0.000
## German (1990-2000, ACS, PRCS) : 9441 1st Qu.:0.0000 1st Qu.: 0.000
## Irish : 8809 Median :0.0000 Median : 0.000
## English : 4895 Mean :0.4793 Mean : 5.377
## Italian (1990-2000, ACS, PRCS): 4531 3rd Qu.:0.0000 3rd Qu.: 0.000
## Polish : 3113 Max. :3.0000 Max. :92.000
## (Other) : 24309
## HCOVANY HCOVPRIV SEX EMPSTAT
## Min. :1.000 Min. :1.000 Male : 95222 Min. :0.000
## 1st Qu.:2.000 1st Qu.:1.000 Female:101363 1st Qu.:1.000
## Median :2.000 Median :2.000 Median :1.000
## Mean :1.951 Mean :1.691 Mean :1.514
## 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:3.000
## Max. :2.000 Max. :2.000 Max. :3.000
##
## EMPSTATD LABFORCE OCC IND
## Min. : 0.00 Min. :0.000 0 : 79987 0 :79987
## 1st Qu.:10.00 1st Qu.:1.000 2310 : 3494 7860 : 9025
## Median :10.00 Median :2.000 5700 : 3235 8680 : 6354
## Mean :15.16 Mean :1.331 430 : 3025 770 : 6279
## 3rd Qu.:30.00 3rd Qu.:2.000 4720 : 2666 8190 : 5873
## Max. :30.00 Max. :2.000 4760 : 2563 7870 : 4041
## (Other):101615 (Other):85026
## CLASSWKR CLASSWKRD WKSWORK2 UHRSWORK
## Min. :0.000 Min. : 0.00 Min. :0.000 Min. : 0.00
## 1st Qu.:0.000 1st Qu.: 0.00 1st Qu.:0.000 1st Qu.: 0.00
## Median :2.000 Median :22.00 Median :1.000 Median :12.00
## Mean :1.116 Mean :13.03 Mean :2.701 Mean :19.77
## 3rd Qu.:2.000 3rd Qu.:22.00 3rd Qu.:6.000 3rd Qu.:40.00
## Max. :2.000 Max. :29.00 Max. :6.000 Max. :99.00
##
## INCTOT FTOTINC INCWAGE POVERTY
## Min. : -7300 Min. : -11800 Min. : 0 Min. : 0.0
## 1st Qu.: 8000 1st Qu.: 35550 1st Qu.: 0 1st Qu.:159.0
## Median : 25000 Median : 74000 Median : 10000 Median :351.0
## Mean : 45245 Mean : 107110 Mean : 33796 Mean :318.7
## 3rd Qu.: 56500 3rd Qu.: 132438 3rd Qu.: 47000 3rd Qu.:501.0
## Max. :1563000 Max. :2030000 Max. :638000 Max. :501.0
## NA's :31129 NA's :10817 NA's :33427
## MIGRATE1 MIGRATE1D MIGPLAC1 MIGCOUNTY1
## Min. :0.000 Min. : 0.00 Min. : 0.000 Min. : 0.000
## 1st Qu.:1.000 1st Qu.:10.00 1st Qu.: 0.000 1st Qu.: 0.000
## Median :1.000 Median :10.00 Median : 0.000 Median : 0.000
## Mean :1.122 Mean :11.51 Mean : 6.184 Mean : 4.117
## 3rd Qu.:1.000 3rd Qu.:10.00 3rd Qu.: 0.000 3rd Qu.: 0.000
## Max. :4.000 Max. :40.00 Max. :900.000 Max. :810.000
##
## MIGPUMA1 VETSTAT VETSTATD PWPUMA00
## Min. : 0 Min. :0.0000 Min. : 0.000 Min. : 0
## 1st Qu.: 0 1st Qu.:1.0000 1st Qu.:11.000 1st Qu.: 0
## Median : 0 Median :1.0000 Median :11.000 Median : 0
## Mean : 277 Mean :0.8621 Mean : 9.412 Mean : 1255
## 3rd Qu.: 0 3rd Qu.:1.0000 3rd Qu.:11.000 3rd Qu.: 3100
## Max. :70100 Max. :2.0000 Max. :20.000 Max. :59300
##
## TRANWORK TRANTIME DEPARTS in_NYC
## Min. : 0.000 Min. : 0.00 Min. : 0.0 Min. :0.0000
## 1st Qu.: 0.000 1st Qu.: 0.00 1st Qu.: 0.0 1st Qu.:0.0000
## Median : 0.000 Median : 0.00 Median : 0.0 Median :0.0000
## Mean : 9.725 Mean : 14.75 Mean : 373.3 Mean :0.3615
## 3rd Qu.:10.000 3rd Qu.: 20.00 3rd Qu.: 732.0 3rd Qu.:1.0000
## Max. :70.000 Max. :138.00 Max. :2345.0 Max. :1.0000
##
## in_Bronx in_Manhattan in_StatenI in_Brooklyn
## Min. :0.0000 Min. :0.00000 Min. :0.00000 Min. :0.000
## 1st Qu.:0.0000 1st Qu.:0.00000 1st Qu.:0.00000 1st Qu.:0.000
## Median :0.0000 Median :0.00000 Median :0.00000 Median :0.000
## Mean :0.0538 Mean :0.04981 Mean :0.02084 Mean :0.126
## 3rd Qu.:0.0000 3rd Qu.:0.00000 3rd Qu.:0.00000 3rd Qu.:0.000
## Max. :1.0000 Max. :1.00000 Max. :1.00000 Max. :1.000
##
## in_Queens in_Westchester in_Nassau Hispanic
## Min. :0.0000 Min. :0.00000 Min. :0.00000 Min. :0.0000
## 1st Qu.:0.0000 1st Qu.:0.00000 1st Qu.:0.00000 1st Qu.:0.0000
## Median :0.0000 Median :0.00000 Median :0.00000 Median :0.0000
## Mean :0.1111 Mean :0.04413 Mean :0.07032 Mean :0.1387
## 3rd Qu.:0.0000 3rd Qu.:0.00000 3rd Qu.:0.00000 3rd Qu.:0.0000
## Max. :1.0000 Max. :1.00000 Max. :1.00000 Max. :1.0000
##
## Hisp_Mex Hisp_PR Hisp_Cuban Hisp_DomR
## Min. :0.00000 Min. :0.0000 Min. :0.000000 Min. :0.00000
## 1st Qu.:0.00000 1st Qu.:0.0000 1st Qu.:0.000000 1st Qu.:0.00000
## Median :0.00000 Median :0.0000 Median :0.000000 Median :0.00000
## Mean :0.01626 Mean :0.0436 Mean :0.003403 Mean :0.02827
## 3rd Qu.:0.00000 3rd Qu.:0.0000 3rd Qu.:0.000000 3rd Qu.:0.00000
## Max. :1.00000 Max. :1.0000 Max. :1.000000 Max. :1.00000
##
## white AfAm Amindian Asian
## Min. :0.0000 Min. :0.000 Min. :0.000000 Min. :0.00000
## 1st Qu.:0.0000 1st Qu.:0.000 1st Qu.:0.000000 1st Qu.:0.00000
## Median :1.0000 Median :0.000 Median :0.000000 Median :0.00000
## Mean :0.6997 Mean :0.125 Mean :0.003779 Mean :0.08656
## 3rd Qu.:1.0000 3rd Qu.:0.000 3rd Qu.:0.000000 3rd Qu.:0.00000
## Max. :1.0000 Max. :1.000 Max. :1.000000 Max. :1.00000
##
## race_oth unmarried veteran has_AnyHealthIns
## Min. :0.0000 Min. :0.00 Min. :0.00000 Min. :0.0000
## 1st Qu.:0.0000 1st Qu.:0.00 1st Qu.:0.00000 1st Qu.:1.0000
## Median :0.0000 Median :0.00 Median :0.00000 Median :1.0000
## Mean :0.1324 Mean :0.45 Mean :0.04443 Mean :0.9513
## 3rd Qu.:0.0000 3rd Qu.:1.00 3rd Qu.:0.00000 3rd Qu.:1.0000
## Max. :1.0000 Max. :1.00 Max. :1.00000 Max. :1.0000
##
## has_PvtHealthIns Commute_car Commute_bus Commute_subway
## Min. :0.0000 Min. :0.0000 Min. :0.00000 Min. :0.00000
## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.00000 1st Qu.:0.00000
## Median :1.0000 Median :0.0000 Median :0.00000 Median :0.00000
## Mean :0.6906 Mean :0.2997 Mean :0.02162 Mean :0.07468
## 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:0.00000 3rd Qu.:0.00000
## Max. :1.0000 Max. :1.0000 Max. :1.00000 Max. :1.00000
##
## Commute_rail Commute_other below_povertyline below_150poverty
## Min. :0.00000 Min. :0.00000 Min. :0.000 Min. :0.0000
## 1st Qu.:0.00000 1st Qu.:0.00000 1st Qu.:0.000 1st Qu.:0.0000
## Median :0.00000 Median :0.00000 Median :0.000 Median :0.0000
## Mean :0.01332 Mean :0.05506 Mean :0.122 Mean :0.1965
## 3rd Qu.:0.00000 3rd Qu.:0.00000 3rd Qu.:0.000 3rd Qu.:0.0000
## Max. :1.00000 Max. :1.00000 Max. :1.000 Max. :1.0000
##
## below_200poverty foodstamps
## Min. :0.0000 Min. :0.0000
## 1st Qu.:0.0000 1st Qu.:0.0000
## Median :0.0000 Median :0.0000
## Mean :0.2676 Mean :0.1465
## 3rd Qu.:1.0000 3rd Qu.:0.0000
## Max. :1.0000 Max. :1.0000
##
print(NN_obs <- length(AGE))
## [1] 196585
This is the total number of observations (people) in the data.
summary(AGE[female == 1])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00 23.00 44.00 42.72 61.00 95.00
summary(AGE[!female])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00 21.00 40.00 40.35 59.00 95.00
Note: female == 1 selects women; !female (logical not, denoted with the “!” symbol) selects those who are not female, i.e. men.
mean(AGE[female == 1])
## [1] 42.71629
sd(AGE[female == 1])
## [1] 23.72012
mean(AGE[!female])
## [1] 40.35398
sd(AGE[!female])
## [1] 23.1098
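The same group comparison can be done in one step without attach(). This is a minimal sketch using tapply(); the data frame `toy` is a stand-in that holds only the AGE and female values from the first ten rows printed earlier, so its numbers will not match the full-sample results above.

```r
# Stand-in for acs2017_ny: AGE and female from the first ten rows shown earlier
toy <- data.frame(
  AGE    = c(72, 72, 31, 28, 54, 45, 84, 71, 68, 37),
  female = c(1, 0, 0, 1, 0, 1, 1, 0, 1, 1)
)

# Mean and standard deviation of AGE within each value of female
age_mean <- tapply(toy$AGE, toy$female, mean)
age_sd   <- tapply(toy$AGE, toy$female, sd)

# Rows are labeled "0" (men) and "1" (women)
cbind(mean = age_mean, sd = age_sd)
```

With the real data, replacing `toy` by `acs2017_ny` should give the same means and standard deviations as the four separate calls.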
Q4. Tell me something else interesting that you learned from the data, for example about educational attainment in different neighborhoods in the city. Are there surprises for you?
In this New York sample, the majority of people have at least a high school education. The two largest categories are holders of a regular high school diploma (35,689) and holders of a bachelor's degree (30,802). Only a small portion left school before finishing primary education or received no schooling at all.
summary(EDUCD)
## N/A or no schooling
## 0
## N/A
## 5569
## No schooling completed
## 6310
## Nursery school to grade 4
## 0
## Nursery school, preschool
## 2760
## Kindergarten
## 2247
## Grade 1, 2, 3, or 4
## 0
## Grade 1
## 2131
## Grade 2
## 2276
## Grade 3
## 2418
## Grade 4
## 2408
## Grade 5, 6, 7, or 8
## 0
## Grade 5 or 6
## 0
## Grade 5
## 2835
## Grade 6
## 3365
## Grade 7 or 8
## 0
## Grade 7
## 2839
## Grade 8
## 4040
## Grade 9
## 4088
## Grade 10
## 4644
## Grade 11
## 5337
## Grade 12
## 0
## 12th grade, no diploma
## 3879
## High school graduate or GED
## 0
## Regular high school diploma
## 35689
## GED or alternative credential
## 6465
## Some college, but less than 1 year
## 9086
## 1 year of college
## 0
## 1 or more years of college credit, no degree
## 19947
## 2 years of college
## 0
## Associate's degree, type not specified
## 14065
## Associate's degree, occupational program
## 0
## Associate's degree, academic program
## 0
## 3 years of college
## 0
## 4 years of college
## 0
## Bachelor's degree
## 30802
## 5+ years of college
## 0
## 6 years of college (6+ in 1960-1970)
## 0
## 7 years of college
## 0
## 8+ years of college
## 0
## Master's degree
## 17010
## Professional degree beyond a bachelor's degree
## 4051
## Doctoral degree
## 2324
## Missing
## 0
plot(EDUCD)
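Many EDUCD levels have a count of 0, which clutters both the summary and the plot. A minimal sketch of dropping unused levels and sorting the counts follows; the factor `educ` is hypothetical toy data, not taken from the survey.

```r
# Toy factor: levels "B" and "D" are declared but never observed (hypothetical data)
educ <- factor(c("A", "C", "A", "C", "C", "E"),
               levels = c("A", "B", "C", "D", "E"))

# droplevels() removes the empty levels; sort() orders the counts from largest to smallest
counts <- sort(table(droplevels(educ)), decreasing = TRUE)
counts
```

The same idea should carry over to the real column, for example `sort(table(droplevels(acs2017_ny$EDUCD)), decreasing = TRUE)`, and the result can be passed to barplot().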
summary(DEGFIELD)
## N/A
## 142398
## Agriculture
## 262
## Environment and Natural Resources
## 282
## Architecture
## 442
## Area, Ethnic, and Civilization Studies
## 258
## Communications
## 2105
## Communication Technologies
## 79
## Computer and Information Sciences
## 1530
## Cosmetology Services and Culinary Arts
## 50
## Education Administration and Teaching
## 6708
## Engineering
## 3145
## Engineering Technologies
## 246
## Linguistics and Foreign Languages
## 683
## Family and Consumer Sciences
## 272
## Law
## 124
## English Language, Literature, and Composition
## 2315
## Liberal Arts and Humanities
## 779
## Library Science
## 42
## Biology and Life Sciences
## 2361
## Mathematics and Statistics
## 904
## Military Technologies
## 3
## Interdisciplinary and Multi-Disciplinary Studies (General)
## 358
## Physical Fitness, Parks, Recreation, and Leisure
## 264
## Philosophy and Religious Studies
## 523
## Theology and Religious Vocations
## 259
## Physical Sciences
## 1597
## Nuclear, Industrial Radiology, and Biological Technologies
## 12
## Psychology
## 3208
## Criminal Justice and Fire Protection
## 884
## Public Affairs, Policy, and Social Work
## 789
## Social Sciences
## 4836
## Construction Services
## 46
## Electrical and Mechanic Repairs and Technologies
## 14
## Precision Production and Industrial Arts
## 0
## Transportation Sciences and Technologies
## 84
## Fine Arts
## 3491
## Medical and Health Sciences and Services
## 3919
## Business
## 9802
## History
## 1511
plot(DEGFIELD)
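Q4 asks about differences across neighborhoods, which the summaries above do not break out. A minimal sketch of comparing the share of college graduates across borough indicators follows. The column names educ_college, in_Bronx and in_Manhattan appear in the summary of the data; the values in `toy` are invented for illustration and are not survey results.

```r
# Hypothetical toy data; only the column names come from the data set
toy <- data.frame(
  educ_college = c(1, 0, 0, 1, 1, 0, 0, 1),
  in_Bronx     = c(1, 1, 1, 0, 0, 0, 0, 0),
  in_Manhattan = c(0, 0, 0, 1, 1, 1, 0, 0)
)

boroughs <- c("in_Bronx", "in_Manhattan")

# For each borough indicator, average educ_college over the rows where the indicator is 1
share_college <- sapply(boroughs, function(b) mean(toy$educ_college[toy[[b]] == 1]))
share_college
```

Because educ_college is a 0/1 dummy, its mean within a borough is the share of people there with a four-year degree.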
Before class, you should have done about 20 experiments where you roll the dice (or use the app) and record whether the result was a 6 or not. If you’ve got an app, then drawing integers from interval [1,6] is like fair dice; integers from [1,5] will be rather boring and never produce a 6; but integers from [1,8] or [1,10] will produce 6 but at a lower rate than fair dice.
Physical dice (20 rolls): p(6) = 0
roll_dice <- function(n) sample(1:6, n, replace = TRUE)
roll_dice(20)
## [1] 6 4 1 4 2 4 1 2 3 5 6 4 4 2 1 5 1 4 5 3
mean(roll_dice(20))
## [1] 4.15
sd(roll_dice(20))
## [1] 1.65434
p(6)=0.3
roll_dice(40)
## [1] 3 2 3 6 1 3 1 5 1 1 3 4 2 5 2 1 6 1 3 6 3 5 4 5 3 6 2 2 4 2 5 3 3 5 3 1 2 5
## [39] 5 4
mean(roll_dice(40))
## [1] 3.5
sd(roll_dice(40))
## [1] 1.648426
p(6)=0.225
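Each call to roll_dice() draws a fresh sample, so the printed rolls, the mean and the sd above all come from different draws. A minimal sketch that stores one sample and measures the share of 6s directly follows. The `sides` argument, the seed and the sample size of 1000 are my own assumptions for illustration.

```r
set.seed(42)   # assumption: fixed seed for reproducibility

# Same idea as roll_dice above, with a hypothetical `sides` argument added
roll_dice <- function(n, sides = 6) sample(seq_len(sides), n, replace = TRUE)

rolls   <- roll_dice(1000)       # store one sample, then reuse it
p6_fair <- mean(rolls == 6)      # share of 6s in that sample

rolls8   <- roll_dice(1000, sides = 8)   # integers from [1, 8]
p6_eight <- mean(rolls8 == 6)            # 6 still appears, but at a lower rate

c(fair = p6_fair, eight_sided = p6_eight, mean = mean(rolls), sd = sd(rolls))
```

With many rolls the fair-dice share should settle near 1/6 and the eight-sided share near 1/8, while 20 or 40 rolls can easily land far from those values.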
## Findings:
When I rolled the dice physically, the observed frequency of a "6" was much lower than the expected probability of 1/6 per roll. Judging from that streak, intuition tends to suggest that the chance of rolling more "6"s will stay small on the following rolls.
Differences in means can be complicated. Find the mean return on SP500 index (choose a time period). What is the mean return on days when the previous day’s return was positive? When the previous 2 days were positive? Negative? Now read about “hot hands fallacy” and tell if you think that helps investment strategy. (You might start with this tweet, and read the papers referenced.)
**A chart named return.pgn was meant to be attached here**, but it failed to insert.
##The mean return from Aug. 3rd to Aug. 10th is 0.447%, a period with six consecutive days of positive returns.
##The mean return over the two consecutive positive days from Aug. 7th to Aug. 8th is 0.168%.
##The mean return from Aug. 13th to Aug. 14th is -0.111%.
##The mean return for the whole month of August is 0.32%.
#advice from observing the historical streaks
Historical prices of the SP500 from Aug. 03 to Sep. 11 were selected to observe return streaks in the market. From the beginning of August, the market enjoyed 7 consecutive days of rising returns, with a mean return of 0.447%. Over the first two days on which the market started to generate positive returns, the mean return is 0.57%, much higher than the 0.447% over the longer rising period. In my opinion, over a short period when the market is in an upswing, investors tend to have faith that returns will remain positive more often than not. Similarly, in the dice experiment, I got no "6" in the first 20 physical rolls, and I tended to think that the chance of getting a 6 in the following rolls was thin: the **Hot Hand Fallacy**. However, when the market has been rising for a long period, our intuition may instead warn us that a turnaround is due after so many consecutive positive returns. In this situation, irrational investors may tend to go short, and the negative sentiment becomes contagious to other investors.
So, my conclusion is that streaks in stock returns can lead us into either the hot hand fallacy or the gambler's fallacy. When we go back to the simplest dice rolling game, we should remember that the probability of getting any number from 1 to 6 is the same, 1/6, no matter what came before. The stock market is similar: if daily returns are treated as independent events, a past streak does not change the probability of a rise or a fall on the next day. We should take other factors into consideration instead, such as government policy, the performance of firms, other economic indices, etc.
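The conditional means asked for in the question can be computed by lagging the return series. A minimal sketch under stated assumptions follows: `ret` is a simulated series of independent daily returns, not SP500 data, and every name in the block is hypothetical.

```r
set.seed(2020)                                # assumption: fixed seed
ret <- rnorm(250, mean = 0.0003, sd = 0.01)   # hypothetical daily returns, not SP500 data

prev1 <- c(NA, head(ret, -1))                 # return on the previous day
prev2 <- c(NA, NA, head(ret, -2))             # return two days earlier

mean_all        <- mean(ret)
mean_after_up   <- mean(ret[which(prev1 > 0)])               # previous day positive
mean_after_2up  <- mean(ret[which(prev1 > 0 & prev2 > 0)])   # previous two days positive
mean_after_down <- mean(ret[which(prev1 < 0)])               # previous day negative

round(100 * c(all = mean_all, after_up = mean_after_up,
              after_2up = mean_after_2up, after_down = mean_after_down), 3)
```

Because the simulated returns are independent by construction, any gap between these means is sampling noise. Running the same steps on actual index returns would show whether real streaks behave differently.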
```r
getwd()
## [1] "/Users/new/Desktop/ecob2000_leture1"