# On the first run, uncomment the following lines to install the packages.
# install.packages("tidyverse")
# install.packages("rstatix")
# install.packages("ggpubr")
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.5 ✓ purrr 0.3.4
## ✓ tibble 3.0.5 ✓ dplyr 1.0.3
## ✓ tidyr 1.1.2 ✓ stringr 1.4.0
## ✓ readr 1.4.0 ✓ forcats 0.5.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(rstatix)
##
## Attaching package: 'rstatix'
## The following object is masked from 'package:stats':
##
## filter
library(ggpubr)
The response is the length of odontoblasts (cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods, orange juice or ascorbic acid (a form of vitamin C and coded as VC).
Design: 3 (dose: 0.5, 1.0, 2.0) × 2 (supp: VC, OJ), between-subjects × between-subjects.
data("ToothGrowth") # load the sample data
# dat <- read_csv() # use this instead to read a data file stored on your computer
ToothGrowth
## len supp dose
## 1 4.2 VC 0.5
## 2 11.5 VC 0.5
## 3 7.3 VC 0.5
## 4 5.8 VC 0.5
## 5 6.4 VC 0.5
## 6 10.0 VC 0.5
## 7 11.2 VC 0.5
## 8 11.2 VC 0.5
## 9 5.2 VC 0.5
## 10 7.0 VC 0.5
## 11 16.5 VC 1.0
## 12 16.5 VC 1.0
## 13 15.2 VC 1.0
## 14 17.3 VC 1.0
## 15 22.5 VC 1.0
## 16 17.3 VC 1.0
## 17 13.6 VC 1.0
## 18 14.5 VC 1.0
## 19 18.8 VC 1.0
## 20 15.5 VC 1.0
## 21 23.6 VC 2.0
## 22 18.5 VC 2.0
## 23 33.9 VC 2.0
## 24 25.5 VC 2.0
## 25 26.4 VC 2.0
## 26 32.5 VC 2.0
## 27 26.7 VC 2.0
## 28 21.5 VC 2.0
## 29 23.3 VC 2.0
## 30 29.5 VC 2.0
## 31 15.2 OJ 0.5
## 32 21.5 OJ 0.5
## 33 17.6 OJ 0.5
## 34 9.7 OJ 0.5
## 35 14.5 OJ 0.5
## 36 10.0 OJ 0.5
## 37 8.2 OJ 0.5
## 38 9.4 OJ 0.5
## 39 16.5 OJ 0.5
## 40 9.7 OJ 0.5
## 41 19.7 OJ 1.0
## 42 23.3 OJ 1.0
## 43 23.6 OJ 1.0
## 44 26.4 OJ 1.0
## 45 20.0 OJ 1.0
## 46 25.2 OJ 1.0
## 47 25.8 OJ 1.0
## 48 21.2 OJ 1.0
## 49 14.5 OJ 1.0
## 50 27.3 OJ 1.0
## 51 25.5 OJ 2.0
## 52 26.4 OJ 2.0
## 53 22.4 OJ 2.0
## 54 24.5 OJ 2.0
## 55 24.8 OJ 2.0
## 56 30.9 OJ 2.0
## 57 26.4 OJ 2.0
## 58 27.3 OJ 2.0
## 59 29.4 OJ 2.0
## 60 23.0 OJ 2.0
That output is too long, so show only the first few rows:
head(ToothGrowth)
## len supp dose
## 1 4.2 VC 0.5
## 2 11.5 VC 0.5
## 3 7.3 VC 0.5
## 4 5.8 VC 0.5
## 5 6.4 VC 0.5
## 6 10.0 VC 0.5
You can inspect the contents of a data frame with summary() or str().
summary(ToothGrowth)
## len supp dose
## Min. : 4.20 OJ:30 Min. :0.500
## 1st Qu.:13.07 VC:30 1st Qu.:0.500
## Median :19.25 Median :1.000
## Mean :18.81 Mean :1.167
## 3rd Qu.:25.27 3rd Qu.:2.000
## Max. :33.90 Max. :2.000
str(ToothGrowth)
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
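As a quick check of the 3 × 2 design, the number of animals in each supp-by-dose cell can be counted. This is a minimal sketch that works on a copy of the built-in data; the names `tooth` and `cell_counts` are illustrative and not part of the original script.

```r
# Work on a copy so the ToothGrowth object used elsewhere is not touched.
tooth <- datasets::ToothGrowth

# Cross-tabulate delivery method (rows) against dose (columns).
cell_counts <- xtabs(~ supp + dose, data = tooth)
print(cell_counts)
```

Every cell should contain the same number of animals if the design is balanced.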
dose is stored as a numeric variable, so it has to be converted to a factor:
ToothGrowth$dose <- as.factor(ToothGrowth$dose)
Check the structure again:
str(ToothGrowth)
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: Factor w/ 3 levels "0.5","1","2": 1 1 1 1 1 1 1 1 1 1 ...
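Why the conversion matters can be illustrated with a small sketch. It assumes, as the 3 × 2 between-subjects design suggests, that dose will later enter an ANOVA-type model; the use of `aov()` here and the names `tooth_raw`, `df_numeric`, and `df_factor` are assumptions made for illustration, not part of the original script.

```r
# Use an untouched copy of the data, in which dose is still numeric.
tooth_raw <- datasets::ToothGrowth

# Numeric dose: the model fits a single slope (a linear trend).
fit_numeric <- aov(len ~ dose, data = tooth_raw)

# Factor dose: the model compares the three dose groups.
fit_factor <- aov(len ~ factor(dose), data = tooth_raw)

# Degrees of freedom of the dose term in each model.
df_numeric <- summary(fit_numeric)[[1]]$Df[1]
df_factor <- summary(fit_factor)[[1]]$Df[1]
print(c(numeric = df_numeric, factor = df_factor))
```

With a numeric dose the term uses one degree of freedom; with a factor it uses two, one fewer than the number of dose levels.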
To look at the data in more detail, compute summary statistics for each group:
ToothGrowth %>%
  group_by(dose, supp) %>%
  get_summary_stats(len)
## # A tibble: 6 x 15
## supp dose variable n min max median q1 q3 iqr mad mean
## <fct> <fct> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 OJ 0.5 len 10 8.2 21.5 12.2 9.7 16.2 6.48 4.3 13.2
## 2 VC 0.5 len 10 4.2 11.5 7.15 5.95 10.9 4.95 3.56 7.98
## 3 OJ 1 len 10 14.5 27.3 23.4 20.3 25.6 5.35 3.93 22.7
## 4 VC 1 len 10 13.6 22.5 16.5 15.3 17.3 2.02 1.70 16.8
## 5 OJ 2 len 10 22.4 30.9 26.0 24.6 27.1 2.5 2.08 26.1
## 6 VC 2 len 10 18.5 33.9 26.0 23.4 28.8 5.42 4.60 26.1
## # … with 3 more variables: sd <dbl>, se <dbl>, ci <dbl>
Adding type = "mean_sd" limits the output to n, the mean, and the standard deviation:
ToothGrowth %>%
  group_by(dose, supp) %>%
  get_summary_stats(len, type = "mean_sd")
## # A tibble: 6 x 6
## supp dose variable n mean sd
## <fct> <fct> <chr> <dbl> <dbl> <dbl>
## 1 OJ 0.5 len 10 13.2 4.46
## 2 VC 0.5 len 10 7.98 2.75
## 3 OJ 1 len 10 22.7 3.91
## 4 VC 1 len 10 16.8 2.52
## 5 OJ 2 len 10 26.1 2.66
## 6 VC 2 len 10 26.1 4.80
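For comparison, the same per-cell n, mean, and standard deviation can be computed with plain dplyr. This is a minimal sketch rather than a description of how rstatix works internally; the names `tooth` and `cell_stats` are illustrative.

```r
library(dplyr)

# Work on a copy of the built-in data (dose is numeric in this copy).
tooth <- datasets::ToothGrowth

# One row per supp-by-dose cell with its size, mean, and standard deviation.
cell_stats <- tooth %>%
  group_by(supp, dose) %>%
  summarise(
    n = n(),
    mean = mean(len),
    sd = sd(len),
    .groups = "drop"
  )
print(cell_stats)
```

The values are not rounded here, so they show more digits than the table above.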
Box plots (or boxplots) visualize the distribution of a continuous variable within each group through its quartiles.
Box plots take up less space than histograms or density plots, which is useful when comparing distributions across many groups.
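To see which numbers a box plot is built from, the five values that base R's boxplot() draws for each dose group can be extracted with boxplot.stats(). This is a small sketch; the names `tooth` and `box_values` and the row labels are illustrative.

```r
# Work on a copy of the built-in data.
tooth <- datasets::ToothGrowth

# For each dose group: whisker ends, the two hinges (box edges), and the median.
box_values <- sapply(
  split(tooth$len, tooth$dose),
  function(x) boxplot.stats(x)$stats
)
rownames(box_values) <- c(
  "lower_whisker", "lower_hinge", "median", "upper_hinge", "upper_whisker"
)
print(box_values)
```

Each column corresponds to one box: the hinges are the box edges, the median is the line inside it, and the whisker rows give the whisker ends.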
boxplot(len ~ dose, data = ToothGrowth)
e <- ggplot(ToothGrowth, aes(x = dose, y = len)) + geom_boxplot(aes(fill = supp))
e
e + facet_wrap(~supp)
The ggpubr package ('ggplot2' Based Publication Ready Plots) provides ggboxplot() for the same kind of plot:
ggboxplot(ToothGrowth, x = "dose", y = "len", color = "supp", add = "jitter")