Statistics in life

tjlee

Fri Sep 09 20:45:42 2022


What is statistics?

Introduction

Statistics in everyday life

Weather forecast

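[Overcast with brief rain, cool temperatures, and possible strong gusts in open coastal areas.]

[Updated: 03/05 05:18]

Yesterday (the 4th), the lowest temperature in West Central District, Tainan City, was recorded in the middle of the night at 17.3 degrees; the midday high was 26.9 degrees.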

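Today (the 5th), under the influence of a continental cold air mass and a band of cloud and rain moving east from southern China, the Tainan area will be overcast with brief rain, with a high of about 23 degrees and a low of about 18 degrees. Carrying rain gear and dressing warmly is advised. Strong gusts are possible in open coastal areas, and seas will be rougher along the coast and in nearby waters, so take care during seaside activities.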

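One-week forecast for Tainan City

Weekly weather

Date   Daytime weather                Daytime temp (°C)  Night weather             Night temp (°C)
03/05  Overcast with brief rain       23-21              Cloudy                    21-18
03/06  Sunny with cloudy intervals    28-21              Cloudy with brief rain    25-23
03/07  Cloudy with sunny intervals    27-24              Overcast with brief rain  25-22
03/08  Overcast with sunny intervals  29-25              Cloudy with brief rain    26-23
03/09  Cloudy with sunny intervals    27-23              Cloudy                    23-18
03/10  Sunny with cloudy intervals    25-18              Cloudy                    22-19
03/11  Sunny with cloudy intervals    26-18              Cloudy                    23-20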
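
A forecast table like this one is already a small data set. The sketch below types the temperatures from the one-week forecast into a data frame and computes a few simple summaries. The column names are made up for illustration, and reading each entry such as 23-21 as "high-low" is an assumption.

```r
# Temperatures (degrees Celsius) transcribed from the one-week forecast.
# Column names are illustrative; "23-21" is assumed to mean high 23, low 21.
forecast <- data.frame(
  date       = c("03/05", "03/06", "03/07", "03/08", "03/09", "03/10", "03/11"),
  day_high   = c(23, 28, 27, 29, 27, 25, 26),
  day_low    = c(21, 21, 24, 25, 23, 18, 18),
  night_high = c(21, 25, 25, 26, 23, 22, 23),
  night_low  = c(18, 23, 22, 23, 18, 19, 20)
)

# Spread between the daytime high and the night-time low for each date
forecast$daily_range <- forecast$day_high - forecast$night_low

# Average forecast daytime high over the week
mean_high <- mean(forecast$day_high)

# Date with the highest forecast daytime high
warmest <- forecast$date[which.max(forecast$day_high)]

print(forecast[, c("date", "day_high", "night_low", "daily_range")])
print(round(mean_high, 1))
print(warmest)
```

Even this tiny example uses three basic ideas: a derived variable (the daily range), a measure of centre (the mean), and a maximum.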

Data source

Map

Which countries have you visited?

##   Country
## 1     TWN
## 2     JPN
## 2 codes from your data successfully matched countries in the map
## 0 codes from your data failed to match with a country code in the map
## 241 codes from the map weren't represented in your data
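
The code that produced this output is not shown. The three messages have the form printed by `joinCountryData2Map()` in the rworldmap package, so one plausible reconstruction is sketched below. Treat it as an assumption, not as the original code: the `visited` flag column and the map title are invented for illustration.

```r
# Countries visited, as ISO 3166-1 alpha-3 codes (same two codes as in the output).
# The 'visited' column is a hypothetical flag added so there is something to plot.
visited <- data.frame(Country = c("TWN", "JPN"), visited = 1)
print(visited["Country"])

# Only draw the map when rworldmap is installed.
if (requireNamespace("rworldmap", quietly = TRUE)) {
  library(rworldmap)
  # Match the codes in 'visited' to the countries in the package's world map
  visited_map <- joinCountryData2Map(visited,
                                     joinCode = "ISO3",
                                     nameJoinColumn = "Country")
  # Shade the matched countries
  mapCountryData(visited_map,
                 nameColumnToPlot = "visited",
                 catMethod = "categorical",
                 addLegend = FALSE,
                 mapTitle = "Countries visited")
}
```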

Location

Where have you been?

library(dplyr)
library(leaflet)
m <- leaflet() %>%
 addTiles() %>%  
 addMarkers(lng=120.31885387460432, 
            lat=22.6688240398752977, 
            popup="求真樓") %>%  
 addMarkers(lng=120.31822107036588, 
            lat=22.670444017273542, 
            popup="行政大樓")

m
# loc must sit inside the Brunch list so that dta9[[1]]$loc finds it
# popups: 食堂 (canteen), 咖哩 (curry)
dta9 <- list(Brunch = list(lat = c(22.670129444739622, 22.671108427618943),
                           lng = c(120.320389647869, 120.3206087585040),
                           loc = c('食堂', '咖哩')))
# dta9 has no icon element, so the default marker icon is used
m <- leaflet() %>% addTiles() %>%
  addMarkers(lat = dta9[[1]]$lat, lng = dta9[[1]]$lng, popup = dta9[[1]]$loc)
 
m

Lottery

Taiwan Lottery
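
Lottery odds are a counting problem. The sketch below uses a hypothetical game in which 6 numbers are drawn from 49 without replacement. The rules of the actual Taiwan Lottery games are not given here, so the numbers 49 and 6 are assumptions chosen only for illustration.

```r
# Hypothetical game: the player picks 6 numbers out of 49, and 6 are drawn.
n_numbers <- 49
n_drawn   <- 6

# Number of equally likely draws (order does not matter)
n_combinations <- choose(n_numbers, n_drawn)

# Probability that one ticket matches all 6 drawn numbers
p_jackpot <- 1 / n_combinations

# Probability of matching exactly k numbers, k = 0..6 (hypergeometric):
# 6 "winning" numbers, 43 "losing" numbers, 6 numbers on the ticket
p_match <- dhyper(0:6, m = n_drawn, n = n_numbers - n_drawn, k = n_drawn)
names(p_match) <- 0:6

print(n_combinations)
print(p_jackpot)
print(round(p_match, 6))
```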

Sports lottery

Poll

2020 presidential election polls

Cards

Probability

  • Sampling Distribution

  • Law of large numbers

https://onlinestatbook.com/stat_sim/sampling_dist/index.html
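
Both ideas can be tried with a simulated deck of cards. The sketch below is a minimal illustration with arbitrary choices for the seed, the number of draws, and the sample size. It draws cards with replacement, tracks the running proportion of hearts (law of large numbers), and then looks at how the proportion of hearts varies across many samples of 50 cards (sampling distribution).

```r
set.seed(2022)  # arbitrary seed so the run is reproducible

# A 52-card deck reduced to suits: 13 cards of each
suits <- rep(c("hearts", "diamonds", "clubs", "spades"), each = 13)

# Law of large numbers: the running proportion of hearts approaches 13/52 = 0.25
n_draws <- 10000
draws <- sample(suits, n_draws, replace = TRUE)
running_prop <- cumsum(draws == "hearts") / seq_len(n_draws)

# Sampling distribution: proportion of hearts in each of many samples of 50 cards
sample_size <- 50
n_samples   <- 2000
sample_props <- replicate(
  n_samples,
  mean(sample(suits, sample_size, replace = TRUE) == "hearts")
)

# Standard error of a sample proportion, sqrt(p * (1 - p) / n)
theoretical_se <- sqrt(0.25 * 0.75 / sample_size)

plot(running_prop, type = "l", xlab = "Number of draws",
     ylab = "Proportion of hearts")
abline(h = 0.25, lty = 2)
hist(sample_props, xlab = "Sample proportion of hearts",
     main = "Sampling distribution, n = 50")

print(c(simulated_sd = sd(sample_props), theoretical_se = theoretical_se))
```

The first plot settles near 0.25 as the number of draws grows. The histogram is centred near 0.25, with a spread close to the theoretical standard error.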

Your grades

Measure

Summary statistics

summary(cars)
##      speed           dist       
##  Min.   : 4.0   Min.   :  2.00  
##  1st Qu.:12.0   1st Qu.: 26.00  
##  Median :15.0   Median : 36.00  
##  Mean   :15.4   Mean   : 42.98  
##  3rd Qu.:19.0   3rd Qu.: 56.00  
##  Max.   :25.0   Max.   :120.00
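
Each number in this summary can also be computed on its own. The sketch below reproduces a few of them for the built-in `cars` data and adds the interquartile range. The variable names are illustrative.

```r
# Centre of the speed variable
speed_mean   <- mean(cars$speed)    # 15.4, as in the summary
speed_median <- median(cars$speed)  # 15

# Quartiles of the stopping distance (the same default method summary() uses)
dist_quartiles <- quantile(cars$dist, c(0.25, 0.50, 0.75))  # 26, 36, 56

# Two measures of spread: interquartile range and range
dist_iqr   <- IQR(cars$dist)    # 56 - 26 = 30
dist_range <- range(cars$dist)  # 2 and 120

print(speed_mean)
print(speed_median)
print(dist_quartiles)
print(dist_iqr)
print(dist_range)
```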

Plots

You can also embed plots, for example:

library(gridExtra)
library(grid)
library(ggplot2)
library(lattice)
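# resultdata is assumed to be a data frame loaded earlier, with columns
# lang, arith, and IQsize; it is not defined in this document.
p1 <- ggplot(resultdata,
             aes(lang, arith, color = IQsize)) +
  geom_point(size = rel(.5),
             alpha = .5) +
  # one least-squares line per panel, drawn without a confidence band
  stat_smooth(method = "lm",
              formula = y ~ x,
              se = FALSE,
              col = 'gray') +
  facet_wrap(~ IQsize) +
  labs(x = "Language score", y = "Arithmetic score") +
  theme_bw()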

p1

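Statistical inference

  • Example: chick weights by feed type

Seventy-one newly hatched chicks were randomly allocated to 6 groups, and each group was given a different feed supplement. Their weights (in grams) after six weeks were recorded together with the feed type.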

str(chickwts)
## 'data.frame':    71 obs. of  2 variables:
##  $ weight: num  179 160 136 227 217 168 108 124 143 140 ...
##  $ feed  : Factor w/ 6 levels "casein","horsebean",..: 2 2 2 2 2 2 2 2 2 2 ...

Computing numerical summaries

aggregate(weight ~ feed, data = chickwts, mean)
##        feed   weight
## 1    casein 323.5833
## 2 horsebean 160.2000
## 3   linseed 218.7500
## 4  meatmeal 276.9091
## 5   soybean 246.4286
## 6 sunflower 328.9167

Including Plots

You can also embed plots, for example:
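
The figure that belonged here is not preserved. One natural choice for these data, offered only as a sketch, is a side-by-side boxplot of weight by feed type in base R. The axis labels and title are illustrative.

```r
# Side-by-side boxplots: one box per feed type
bp <- boxplot(weight ~ feed, data = chickwts,
              xlab = "Feed type", ylab = "Weight (g)",
              main = "Chick weight by feed type")

# boxplot() also returns the plotted five-number summaries, one column per group
colnames(bp$stats) <- bp$names
print(bp$stats)
```

Boxes that barely overlap, such as horsebean against casein or sunflower, are a visual hint of the group differences tested formally in the analysis of variance.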

One-way analysis of variance

We can compare the effects of diet supplements on chick weights.

anova(lm(weight ~ feed, data = chickwts))
## Analysis of Variance Table
## 
## Response: weight
##           Df Sum Sq Mean Sq F value    Pr(>F)    
## feed       5 231129   46226  15.365 5.936e-10 ***
## Residuals 65 195556    3009                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
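
To see where the F value comes from, the table can be rebuilt by hand: split the total variation into a between-group and a within-group part, divide each by its degrees of freedom, and take the ratio. The sketch below does this for `chickwts`. The variable names are illustrative.

```r
weight <- chickwts$weight
feed   <- chickwts$feed

grand_mean  <- mean(weight)
group_means <- tapply(weight, feed, mean)    # one mean per feed type
group_sizes <- tapply(weight, feed, length)  # chicks per feed type

# Between-group sum of squares: group means around the grand mean
ss_between <- sum(group_sizes * (group_means - grand_mean)^2)

# Within-group sum of squares: each chick around its own group mean
ss_within <- sum((weight - ave(weight, feed))^2)

df_between <- nlevels(feed) - 1               # 6 - 1 = 5
df_within  <- length(weight) - nlevels(feed)  # 71 - 6 = 65

# F is the ratio of the two mean squares
f_stat  <- (ss_between / df_between) / (ss_within / df_within)
p_value <- pf(f_stat, df_between, df_within, lower.tail = FALSE)

print(c(ss_between = ss_between, ss_within = ss_within))
print(c(F = f_stat, p = p_value))
```

The two sums of squares and the F value should agree with the `feed` and `Residuals` rows of the table above, up to rounding.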

Software

SPSS

SAS

AMOS

R

https://www.r-project.org

Excel (?)

Variables

To be continued