What is statistics?
Introduction
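Statistics in everyday life
Weather forecast
[Overcast with brief rain and cooler temperatures; strong gusts are possible in open coastal areas.]
[Updated: 03/05 05:18]
Yesterday (the 4th), the lowest temperature in West Central District, Tainan City occurred in the middle of the night at 17.3°C; the midday high was 26.9°C.
Today (the 5th), a continental cold air mass and an eastward-moving band of cloud and rain from South China are affecting the area. The Tainan area will be overcast with brief rain, with an early-morning low of about 18°C and a high of about 23°C. Carry rain gear and dress warmly. Strong gusts are possible in open coastal areas, and seas will be rough along the coast and in nearby waters, so take care during seaside activities.
Tainan City one-week forecast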
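| | 03/05 | 03/06 | 03/07 | 03/08 | 03/09 | 03/10 | 03/11 |
|---|---|---|---|---|---|---|---|
| Daytime weather | Overcast, brief rain | Sunny to partly cloudy | Cloudy to sunny | Overcast to sunny | Cloudy to sunny | Sunny to partly cloudy | Sunny to partly cloudy |
| Daytime temperature (°C) | 23-21 | 28-21 | 27-24 | 29-25 | 27-23 | 25-18 | 26-18 |
| Night weather | Cloudy | Cloudy, brief rain | Overcast, brief rain | Cloudy, brief rain | Cloudy | Cloudy | Cloudy |
| Night temperature (°C) | 21-18 | 25-23 | 25-22 | 26-23 | 23-18 | 22-19 | 23-20 |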
Map
Which countries have you visited?
## Country
## 1 TWN
## 2 JPN
## 2 codes from your data successfully matched countries in the map
## 0 codes from your data failed to match with a country code in the map
## 241 codes from the map weren't represented in your data
location
Where have you been?
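library(dplyr)    # provides the %>% pipe
library(leaflet)  # interactive maps
# Start from an empty map, add the default OpenStreetMap tiles,
# then place one marker per building.
m <- leaflet() %>%
  addTiles() %>%
  addMarkers(lng = 120.31885387460432,
             lat = 22.6688240398752977,
             popup = "求真樓") %>%
  addMarkers(lng = 120.31822107036588,
             lat = 22.670444017273542,
             popup = "行政大樓")  # Administration Building
m  # print the map widget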
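# Two brunch spots: coordinates plus a popup label for each.
# `loc` must sit inside the `Brunch` list so that dta9[[1]]$loc can find it.
dta9 <- list(Brunch = list(lat = c(22.670129444739622, 22.671108427618943),
                           lng = c(120.320389647869, 120.3206087585040),
                           loc = c('食堂', '咖哩')))  # cafeteria, curry
# The data define no `icon` element, so dta9[[1]]$icon would be NULL;
# the default marker icon is used here instead of icons(iconUrl = ...).
m <- leaflet() %>% addTiles() %>%
  addMarkers(lat = dta9[[1]]$lat, lng = dta9[[1]]$lng, popup = dta9[[1]]$loc)
m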
lottery
Sports lottery
Poll
2020 presidential election polls
Cards
probability
Sampling Distribution
Law of large numbers
https://onlinestatbook.com/stat_sim/sampling_dist/index.html
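Both ideas can be tried out directly in R. The sketch below is only an illustration under assumed settings that do not come from the slides: a fair six-sided die, samples of size 25, 2000 repetitions, and an arbitrary seed. It tracks the running mean of many die rolls (law of large numbers) and then collects the means of repeated small samples (sampling distribution of the mean).

```r
set.seed(1)  # arbitrary seed so the run is reproducible

# Law of large numbers: the running mean of fair-die rolls approaches 3.5.
rolls <- sample(1:6, size = 10000, replace = TRUE)
running_mean <- cumsum(rolls) / seq_along(rolls)

# Sampling distribution: means of many samples of size n cluster around 3.5,
# with a spread close to (population SD) / sqrt(n).
n <- 25
sample_means <- replicate(2000, mean(sample(1:6, size = n, replace = TRUE)))

pop_sd <- sqrt(mean((1:6 - 3.5)^2))  # population SD of a fair die

c(final_running_mean   = running_mean[length(rolls)],
  mean_of_sample_means = mean(sample_means),
  sd_of_sample_means   = sd(sample_means),
  theoretical_se       = pop_sd / sqrt(n))
```

Calling `hist(sample_means)` afterwards should show a roughly bell-shaped histogram centered near 3.5, similar to what the simulation at the link above produces.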
Your grades
Measure
Summary statistics
summary(cars)
## speed dist
## Min. : 4.0 Min. : 2.00
## 1st Qu.:12.0 1st Qu.: 26.00
## Median :15.0 Median : 36.00
## Mean :15.4 Mean : 42.98
## 3rd Qu.:19.0 3rd Qu.: 56.00
## Max. :25.0 Max. :120.00
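Each entry in the `summary()` table can also be computed on its own, which makes it clearer what each measure is. The sketch below does this for the `speed` column only. The variable name `speed_summary` is introduced here for illustration, and `quantile()` is used with its default interpolation, which is what `summary()` relies on.

```r
speed <- cars$speed  # `cars` ships with base R (datasets package)

# Measures of location, one function call per entry of the summary table
speed_summary <- c(min    = min(speed),
                   q1     = unname(quantile(speed, 0.25)),
                   median = median(speed),
                   mean   = mean(speed),
                   q3     = unname(quantile(speed, 0.75)),
                   max    = max(speed))
speed_summary

# Measures of spread, which summary() does not report
sd(speed)   # sample standard deviation
IQR(speed)  # interquartile range, q3 - q1
```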
Plots
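For example, the code below draws scatterplots of arithmetic score against language score, one panel per IQ group, each with a fitted regression line: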
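library(gridExtra)
library(grid)
library(ggplot2)  # the only package this plot actually needs
library(lattice)
# `resultdata` is not defined in this section; it is assumed to be a data
# frame with numeric columns `lang` and `arith` and a grouping column `IQsize`.
p1 <- ggplot(resultdata,
             aes(lang, arith, color = IQsize)) +
  geom_point(size = 0.5,
             alpha = 0.5) +
  # one least-squares line per panel, drawn without a confidence band
  stat_smooth(method = "lm",
              formula = y ~ x,
              se = FALSE,
              col = 'gray') +
  facet_wrap(~ IQsize) +  # one panel per IQ group
  labs(x = "Language score", y = "Arithmetic score") +
  theme_bw()
p1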
statistical inference
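- Example: chick weights by feed type
Seventy-one newly hatched chicks were randomly allocated to six groups, and each group received a different feed supplement. After six weeks, each chick's weight (in grams) was recorded along with its feed type.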
str(chickwts)
## 'data.frame': 71 obs. of 2 variables:
## $ weight: num 179 160 136 227 217 168 108 124 143 140 ...
## $ feed : Factor w/ 6 levels "casein","horsebean",..: 2 2 2 2 2 2 2 2 2 2 ...
Computing numerical summaries
aggregate(weight ~ feed, data = chickwts, mean)
## feed weight
## 1 casein 323.5833
## 2 horsebean 160.2000
## 3 linseed 218.7500
## 4 meatmeal 276.9091
## 5 soybean 246.4286
## 6 sunflower 328.9167
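Group means are easier to compare when the group sizes and spreads are known as well. As a small sketch using base R only, the code below splits the weights by feed type and tabulates the count, mean, and standard deviation of each group. The names `by_feed` and `feed_table` are introduced here for illustration.

```r
# Split the weights into one numeric vector per feed type
by_feed <- split(chickwts$weight, chickwts$feed)

# One row per feed type: group size, mean, and standard deviation
feed_table <- data.frame(n    = sapply(by_feed, length),
                         mean = sapply(by_feed, mean),
                         sd   = sapply(by_feed, sd))
round(feed_table, 1)
```

The `mean` column reproduces the `aggregate()` result above, and the `n` column shows that the six groups are not all the same size.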
One-way analysis of variance
We can compare the effects of diet supplements on chick weights.
anova(lm(weight ~ feed, data = chickwts))
## Analysis of Variance Table
##
## Response: weight
## Df Sum Sq Mean Sq F value Pr(>F)
## feed 5 231129 46226 15.365 5.936e-10 ***
## Residuals 65 195556 3009
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
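The entries of the ANOVA table can be rebuilt from their definitions, which shows where the F value comes from. The sketch below computes the between-group and within-group sums of squares by hand and should agree with the `anova()` output above. All variable names are introduced here for illustration.

```r
w <- chickwts$weight
g <- chickwts$feed
k <- nlevels(g)  # number of feed types
N <- length(w)   # number of chicks

group_means <- tapply(w, g, mean)
group_n     <- tapply(w, g, length)

# Between groups: squared distance of each group mean from the grand mean,
# weighted by group size
ss_between <- sum(group_n * (group_means - mean(w))^2)

# Within groups: squared distance of each chick from its own group mean
# (ave() returns the group mean repeated for every observation)
ss_within <- sum((w - ave(w, g))^2)

# F is the ratio of the two mean squares
f_value <- (ss_between / (k - 1)) / (ss_within / (N - k))
p_value <- pf(f_value, df1 = k - 1, df2 = N - k, lower.tail = FALSE)

c(ss_between = ss_between, ss_within = ss_within,
  F = f_value, p = p_value)
```

A large F means the group means differ by much more than the chick-to-chick variation within groups would explain, which is what the very small p-value reports.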
Variables
To be continued