Load libraries
library(gtsummary)
library(dplyr)
library(ggplot2)
Load the airquality dataset
data("airquality")
Show the first few rows of the dataset
head(airquality)
Univariate analysis using gtsummary
univariate_summary <- airquality %>%
  select(Ozone, Solar.R, Wind, Temp) %>%
  tbl_summary() %>%
  add_n() # Add sample size
Display the univariate summary
univariate_summary
| Characteristic | N | N = 153 |
|---|---|---|
| Ozone | 116 | 32 (18, 64) |
| Unknown | | 37 |
| Solar.R | 146 | 205 (115, 259) |
| Unknown | | 7 |
| Wind | 153 | 9.7 (7.4, 11.5) |
| Temp | 153 | 79 (72, 85) |
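The table reports each continuous variable as median (Q1, Q3), with missing values counted on the Unknown row. As a rough cross-check in base R, the sketch below computes the same kind of summary for Ozone. It is not gtsummary's internal code, and the helper name `median_iqr` is made up for illustration. `quantile()` supports several quartile definitions through its `type` argument, so the quartiles may differ slightly from the table.

```r
# Hypothetical helper: median and quartiles with missing values dropped,
# plus the number of missing values.
median_iqr <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), na.rm = TRUE)
  c(
    median  = unname(q[2]),
    q1      = unname(q[1]),
    q3      = unname(q[3]),
    unknown = sum(is.na(x))
  )
}

data("airquality")
ozone_stats <- median_iqr(airquality$Ozone)
ozone_stats
```

The `unknown` element should match the Unknown row for Ozone in the table.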
Create a categorical variable for Ozone
airquality <- airquality %>%
  mutate(Ozone_Category = ifelse(Ozone > 50, "High", "Low"))
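Because `Ozone > 50` evaluates to `NA` when Ozone is missing, `ifelse()` leaves those days as `NA` instead of assigning them to Low. The quick check below illustrates this in base R. It works on a copy named `aq` (an illustrative name) so the original data frame is untouched.

```r
data("airquality")

# Work on a copy so the original data frame is not modified.
aq <- airquality
aq$Ozone_Category <- ifelse(aq$Ozone > 50, "High", "Low")

# useNA = "ifany" keeps the missing category visible in the counts.
category_counts <- table(aq$Ozone_Category, useNA = "ifany")
category_counts
```

These missing days are what the later tables label as Unknown.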
Cross table between Ozone_Category and Month
cross_table <- airquality %>%
  tbl_cross(row = Ozone_Category, col = Month)
Display the cross table
cross_table
| Ozone_Category | Month 5 | Month 6 | Month 7 | Month 8 | Month 9 | Total |
|---|---|---|---|---|---|---|
| High | 1 | 1 | 15 | 13 | 4 | 34 |
| Low | 25 | 8 | 11 | 13 | 25 | 82 |
| Unknown | 5 | 21 | 5 | 5 | 1 | 37 |
| Total | 31 | 30 | 31 | 31 | 30 | 153 |
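The same counts can be cross-checked without gtsummary. The sketch below uses base R's `table()`, with `useNA = "ifany"` playing the role of the Unknown row. The names `aq`, `counts`, `row_totals`, and `col_totals` are illustrative.

```r
data("airquality")

aq <- airquality
aq$Ozone_Category <- ifelse(aq$Ozone > 50, "High", "Low")

# Rows are ozone categories (including NA), columns are months 5 to 9.
counts <- table(
  Ozone_Category = aq$Ozone_Category,
  Month = aq$Month,
  useNA = "ifany"
)
counts

# Margins corresponding to the Total column and Total row.
row_totals <- rowSums(counts)
col_totals <- colSums(counts)
```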
Bivariate analysis with p-values
bivariate_summary <- airquality %>%
  tbl_summary(by = Ozone_Category) %>%
  add_p()
Display the bivariate summary
bivariate_summary
| Characteristic | High, N = 34 | Low, N = 82 | p-value |
|---|---|---|---|
| Ozone | 80 (66, 97) | 22 (14, 34) | <0.001 |
| Solar.R | 222 (188, 255) | 191 (81, 259) | 0.10 |
| Unknown | 2 | 3 | |
| Wind | 6.3 (5.1, 8.0) | 10.6 (9.2, 13.2) | <0.001 |
| Temp | 88 (84, 92) | 75 (68, 81) | <0.001 |
| Month | | | <0.001 |
| 5 | 1 (2.9%) | 25 (30%) | |
| 6 | 1 (2.9%) | 8 (9.8%) | |
| 7 | 15 (44%) | 11 (13%) | |
| 8 | 13 (38%) | 13 (16%) | |
| 9 | 4 (12%) | 25 (30%) | |
| Day | 16 (7, 27) | 16 (9, 21) | 0.8 |
Multilayer analysis
multilayer_summary <- airquality %>%
  tbl_summary(by = Ozone_Category) %>%
  add_p() %>%
  add_overall() # Add overall statistics
Display the multilayer summary
multilayer_summary
| Characteristic | Overall, N = 116 | High, N = 34 | Low, N = 82 | p-value |
|---|---|---|---|---|
| Ozone | 32 (18, 64) | 80 (66, 97) | 22 (14, 34) | <0.001 |
| Solar.R | 207 (112, 256) | 222 (188, 255) | 191 (81, 259) | 0.10 |
| Unknown | 5 | 2 | 3 | |
| Wind | 9.7 (7.4, 11.5) | 6.3 (5.1, 8.0) | 10.6 (9.2, 13.2) | <0.001 |
| Temp | 79 (71, 85) | 88 (84, 92) | 75 (68, 81) | <0.001 |
| Month | | | | <0.001 |
| 5 | 26 (22%) | 1 (2.9%) | 25 (30%) | |
| 6 | 9 (7.8%) | 1 (2.9%) | 8 (9.8%) | |
| 7 | 26 (22%) | 15 (44%) | 11 (13%) | |
| 8 | 26 (22%) | 13 (38%) | 13 (16%) | |
| 9 | 29 (25%) | 4 (12%) | 25 (30%) | |
| Day | 16 (8, 22) | 16 (7, 27) | 16 (9, 21) | 0.8 |
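The Overall column is headed N = 116, not 153: the 37 days with a missing Ozone_Category do not appear in any column of this table. The base R sketch below rebuilds that subset as a sanity check. The names `aq` and `with_category` are illustrative and not part of the notebook.

```r
data("airquality")

aq <- airquality
aq$Ozone_Category <- ifelse(aq$Ozone > 50, "High", "Low")

# Keep only the days that fall into the High or Low group.
with_category <- aq[!is.na(aq$Ozone_Category), ]

n_overall <- nrow(with_category)
solar_unknown <- sum(is.na(with_category$Solar.R))

n_overall
solar_unknown

# Group medians for Wind, for comparison with the High and Low columns.
tapply(with_category$Wind, with_category$Ozone_Category, median)
```

`n_overall` and `solar_unknown` should line up with the Overall header and the Unknown row under Solar.R.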
Interpretation of Results
Univariate Analysis: The univariate summary describes the distribution
of each variable (Ozone, Solar.R, Wind, Temp). By default, tbl_summary()
reports continuous variables as median (Q1, Q3) and counts missing
values on an Unknown row. For example, the median Ozone level is 32
(Q1 18, Q3 64), and 37 of the 153 days have no Ozone measurement.
Bivariate Analysis:
• Cross Table: The cross table counts High and Low ozone days (Ozone
above 50 versus 50 or below) within each Month. High-ozone days are
concentrated in July and August (15 and 13 days), while May and
September are mostly Low. The table shows counts only; it does not test
whether the association is statistically significant.
• P-values: The bivariate summary with p-values indicates whether the
differences observed between the High and Low Ozone groups are
statistically significant. A p-value below 0.05 is conventionally
treated as significant. Here Wind, Temp, and Month differ between the
groups (p < 0.001), while Solar.R (p = 0.10) and Day (p = 0.8) do not.
The p-value for Ozone itself is uninformative, because Ozone_Category
is derived from Ozone.
• Multilayer Analysis: This table adds an Overall column next to the
High and Low columns, so each group can be compared with all 116 days
that have an Ozone measurement. It summarizes each variable separately
and does not model how variables such as temperature and solar
radiation jointly affect ozone.
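To make one of these p-values concrete, the sketch below compares Wind between the High and Low groups with a Wilcoxon rank-sum test in base R. This assumes that the Wilcoxon rank-sum test is the test `add_p()` applied to a continuous variable across two groups; the footnote of the rendered table would confirm which test was actually used. The names `aq` and `wind_test` are illustrative.

```r
data("airquality")

aq <- airquality
aq$Ozone_Category <- ifelse(aq$Ozone > 50, "High", "Low")

# The formula interface drops rows where Ozone_Category is NA.
# exact = FALSE requests the normal approximation, which avoids the
# warning about ties in Wind.
wind_test <- wilcox.test(Wind ~ Ozone_Category, data = aq, exact = FALSE)
wind_test$p.value
```

A p-value this small is displayed as <0.001 in the summary table.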
Conclusion
The gtsummary tables give a compact descriptive overview of the
airquality dataset through univariate and bivariate summaries. They
point to factors associated with ozone levels (temperature, wind, and
month) and can guide further research or policy-making decisions. These
are descriptive associations, not causal estimates.
Run the code above in your own R environment to reproduce the tables;
the numbers will differ if you apply it to a different dataset.