library(gtsummary)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
ToothGrowth %>%
  tbl_summary()
Characteristic | N = 60¹ |
---|---|
len | 19 (13, 25) |
supp | |
OJ | 30 (50%) |
VC | 30 (50%) |
dose | |
0.5 | 20 (33%) |
1 | 20 (33%) |
2 | 20 (33%) |
¹ Median (Q1, Q3); n (%)
Interpretation:
1. What the Code Does
This code uses the gtsummary package to create a descriptive summary table of every variable in the built-in dataset ToothGrowth.
By default it provides:
The median and quartiles (Q1, Q3) for continuous variables, as in the len row above
Counts and percentages for categorical variables; dose is numeric but has only three distinct values, so tbl_summary() treats it as categorical
No grouping (by = …) is used here, so the whole dataset is summarized together.
📊 2. About the Dataset ToothGrowth
ToothGrowth contains results from an experiment studying the effect of vitamin C on tooth growth in guinea pigs.
It includes 60 observations and 3 variables:
Variable | Type | Description |
---|---|---|
len | Numeric | Tooth length |
supp | Factor | Supplement type: “OJ” = orange juice, “VC” = vitamin C |
dose | Numeric (0.5, 1, 2) | Vitamin C dose in milligrams |
🧾 3. Typical Output (Example)
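The default call above reports the median and quartiles for len. If you ask tbl_summary() for the mean and SD instead, the same information looks like this:
Variable | N | Summary |
---|---|---|
len | 60 | 18.81 (7.65) |
supp | 60 | OJ: 30 (50%); VC: 30 (50%) |
dose | 60 | 0.5: 20 (33.3%); 1: 20 (33.3%); 2: 20 (33.3%) |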
(The exact formatting may vary slightly in your R output.)
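A mean (SD) summary is not what tbl_summary() prints by default, so it has to be requested. The sketch below shows one way to do that with the statistic and digits arguments; the object name tbl_mean_sd is only illustrative, and the rendered layout will follow gtsummary's own format rather than the simplified table above.

```r
library(gtsummary)

# Ask for mean (SD) on continuous variables instead of the default
# median (Q1, Q3); categorical variables keep the default n (%).
tbl_mean_sd <- tbl_summary(
  ToothGrowth,
  statistic = list(all_continuous() ~ "{mean} ({sd})"),
  digits = list(all_continuous() ~ 2)
)

tbl_mean_sd
```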
🧩 4. Interpretation
🔹 Variable 1: Tooth Length (len)
There are 60 total observations of tooth length.
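The table reports a median of 19 (Q1 = 13, Q3 = 25). The mean, which tbl_summary() does not show by default, is 18.81 units with a standard deviation (SD) of 7.65. → This indicates that the typical tooth length is around 19, but values vary moderately (an SD of about 7.6 units) among guinea pigs.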
The range of tooth lengths in the data (if you check with summary(ToothGrowth$len)) is from 4.2 to 33.9, showing considerable variation in tooth growth responses.
Interpretation:
On average, guinea pigs had a tooth length of 18.81 units. The moderate standard deviation suggests variability among individuals, which might be influenced by supplement type or dose.
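The figures quoted for len can be checked directly in base R. This is a minimal sketch; the names len and len_stats are illustrative and not part of the original analysis.

```r
len <- ToothGrowth$len

# Collect the descriptive statistics discussed above in one named vector.
len_stats <- c(
  mean   = mean(len),
  sd     = sd(len),
  median = median(len),
  min    = min(len),
  max    = max(len)
)

round(len_stats, 2)
```

The mean and SD come out at about 18.81 and 7.65, and the minimum and maximum are 4.2 and 33.9, matching the values in the text.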
🔹 Variable 2: Supplement Type (supp)
There are two supplement types: orange juice (OJ) and ascorbic acid, a form of vitamin C (VC).
Each group has 30 guinea pigs (50% each).
Interpretation:
The dataset is balanced between the two supplement types, with an equal number of guinea pigs in both OJ and VC groups. This balance ensures fair comparison between the effects of the two supplements.
🔹 Variable 3: Vitamin C Dose (dose)
The doses used were 0.5, 1, and 2 mg/day.
Each dose level has 20 guinea pigs (33.3%).
Interpretation:
Each dose level of vitamin C is equally represented in the data. This balanced experimental design allows for reliable comparison of the dose effect on tooth growth.
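The balance described for supp and dose can be seen in a single cross-tabulation. A short base R sketch follows; the object name design is illustrative.

```r
# Cross-tabulate supplement type against dose.
design <- table(supp = ToothGrowth$supp, dose = ToothGrowth$dose)
design

# Share of animals at each dose level, as percentages.
round(100 * prop.table(table(ToothGrowth$dose)), 1)
```

Every supplement-by-dose cell holds 10 animals, which is why each supplement accounts for 50% of the data and each dose for about 33.3%.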
⚙️ 5. Overall Interpretation of the Table
The tbl_summary() output provides a clear descriptive overview of the dataset. It shows that:
The dataset includes 60 guinea pigs.
The median tooth length is 19 units (Q1 = 13, Q3 = 25), and the mean is approximately 18.8 units.
Supplement types (OJ and VC) are evenly distributed (each 50%).
Dose levels (0.5, 1, and 2 mg/day) are also evenly distributed (each 33.3%).
This summary confirms that the experiment was well-balanced, making it suitable for further analysis to determine whether tooth growth depends on supplement type or dose level.
ToothGrowth %>%
  select(len, supp, dose) %>%
  tbl_summary(by = supp) %>%
  add_p()
## The following warnings were returned during `add_p()`:
## ! For variable `len` (`supp`) and "estimate", "statistic", "p.value",
## "conf.low", and "conf.high" statistics: cannot compute exact p-value with
## ties
## ! For variable `len` (`supp`) and "estimate", "statistic", "p.value",
## "conf.low", and "conf.high" statistics: cannot compute exact confidence
## intervals with ties
Characteristic | OJ N = 30¹ | VC N = 30¹ | p-value² |
---|---|---|---|
len | 23 (15, 26) | 17 (11, 23) | 0.064 |
dose | | | >0.9 |
0.5 | 10 (33%) | 10 (33%) | |
1 | 10 (33%) | 10 (33%) | |
2 | 10 (33%) | 10 (33%) | |
¹ Median (Q1, Q3); n (%)
² Wilcoxon rank sum test; Pearson’s Chi-squared test
Interpretation:
✅ 1. ToothGrowth dataset
What it is: A built-in R dataset.
Content: Contains data on the effect of vitamin C on tooth growth in guinea pigs.
Variables include:
len: Tooth length (numeric)
supp: Supplement type (factor; either “VC” for ascorbic acid, a form of vitamin C, or “OJ” for orange juice)
dose: Dose of supplement in mg/day (numeric; 0.5, 1.0, or 2.0)
✅ 2. select(len, supp, dose)
Purpose: Selects only the columns len, supp, and dose from the dataset.
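Why: These are the variables of interest for the summary, especially since we are comparing groups based on supplement type. ToothGrowth has only these three columns, so the call makes the choice explicit rather than dropping anything.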
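It is worth checking what select() is actually selecting from. The base R sketch below lists the columns of ToothGrowth; the name cols is illustrative.

```r
# List the columns available in the dataset.
cols <- names(ToothGrowth)
cols

# TRUE when the selected variables are the whole dataset, in the same order.
identical(cols, c("len", "supp", "dose"))
```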
✅ 3. tbl_summary(by = supp)
Function from: gtsummary package.
Purpose: Creates a summary table of descriptive statistics.
Grouping: The summary is done by the supp variable, meaning two groups:
One group for VC (Vitamin C)
One group for OJ (Orange Juice)
Output:
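For len (tooth length): the median (Q1, Q3) in each group, which is the default for continuous variables.
For dose: counts and percentages at each dose level within each supplement type, because dose has only three distinct values and is treated as categorical.
The column headers give the number of observations (N) per group.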
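The grouped figures can be approximated in base R as a cross-check. This is a sketch rather than what gtsummary does internally, and the names group_medians and group_sizes are illustrative. The quartiles shown by tbl_summary() may differ slightly from quantile() defaults, so only medians and group sizes are computed here.

```r
# Median tooth length within each supplement group.
group_medians <- tapply(ToothGrowth$len, ToothGrowth$supp, median)
group_medians

# Group sizes, matching the N in each column header.
group_sizes <- table(ToothGrowth$supp)
group_sizes
```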
✅ 4. add_p()
Function from: gtsummary package.
Purpose: Adds p-values to the summary table to test for statistical differences between the two supplement groups.
How it works:
For a continuous variable compared across two groups (len here), it uses the Wilcoxon rank sum test by default.
For a categorical variable (dose here, since it is summarized as categorical), it uses Pearson’s chi-squared test, or Fisher’s exact test when expected cell counts are small.
Interpretation of p-value:
A small p-value (usually < 0.05) suggests a statistically significant difference between the two supplement types for that variable. In the table above, len has p = 0.064, which falls short of that threshold, and dose has p > 0.9.
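The two tests named in the table footnote can be run directly in base R to see where the p-values come from. This sketch assumes add_p() used those defaults with no further options; the names len_test and dose_test are illustrative.

```r
# Continuous variable across two groups: Wilcoxon rank sum test.
# Ties in len trigger a warning that an exact p-value cannot be computed,
# which is the warning add_p() reported above.
len_test <- suppressWarnings(wilcox.test(len ~ supp, data = ToothGrowth))
len_test$p.value

# dose is treated as categorical: Pearson's chi-squared test on the
# dose-by-supplement contingency table.
dose_test <- chisq.test(table(ToothGrowth$dose, ToothGrowth$supp))
dose_test$p.value
```

Because every dose-by-supplement cell has the same count, the chi-squared statistic is zero, which is why the table reports p > 0.9 for dose.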
🧠 Interpretation in Context of Assignment
This code compares tooth length (len) and the distribution of dose levels (dose) between the two supplement types, Vitamin C (VC) and Orange Juice (OJ), in guinea pigs.
The tbl_summary(by = supp) helps us break down the summary statistics for each supplement type separately.
The add_p() function provides a statistical test of whether the observed differences between the supplement groups are significant.
🔎 This is useful in determining whether the type of supplement (Vitamin C or Orange Juice) has a significant impact on tooth growth or whether the dose levels differ across the supplement groups.
📊 Example Output (Hypothetical)
Variable | VC (N=30) | OJ (N=30) | p-value |
---|---|---|---|
len | 17.0 (7.1) | 20.7 (6.6) | 0.02 |
dose | 1.17 (0.6) | 1.17 (0.6) | 1.00 |
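In this hypothetical table, p = 0.02 would suggest a statistically significant difference in tooth length between the supplement groups, with no difference in dose (p = 1.00), which makes sense since the study was controlled to give equal doses. The actual output above is less clear-cut: the Wilcoxon rank sum test gives p = 0.064 for len, which is not significant at the 0.05 level, and dose has p > 0.9.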
📝 Summary:
This R code analyzes the ToothGrowth dataset by summarizing len and dose separately for the two supplement types (VC and OJ). The tbl_summary() function provides descriptive statistics by group, while add_p() calculates p-values to test whether the differences between groups are statistically significant. In the output shown here, the difference in tooth length between OJ and VC has p = 0.064, so it does not reach the conventional 0.05 threshold, and the dose distribution is identical in both groups.