I chose this data set because it provides a rich source of data on a topic that is both timely and important. Tobacco use is a major public health concern and is the leading cause of preventable disease, disability, and death in the United States. Nearly 40 million U.S. adults still smoke cigarettes, and 3.08 million middle and high school students use at least one tobacco product, including e-cigarettes.

I got this data set from the Centers for Disease Control and Prevention (CDC): https://www.cdc.gov/statesystem/featured-datasets/index.html. It highlights trends in adult total and per capita consumption of both combustible and noncombustible tobacco from 2000 to the present. There are 15 variables and 273 observations in this data set.

A CDC fact sheet states that in 2020 more than 2 in every 100 adults aged 18 or older reported current use of smokeless tobacco products, which represents 5.7 million adults (https://www.cdc.gov/tobacco/data_statistics/fact_sheets/smokeless/use_us/index.htm#adult-national). I am going to look at the relationship between the total tobacco consumption per capita and different measures related to tobacco use in the US in 2022.
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.0 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.1 ✔ tibble 3.1.8
## ✔ lubridate 1.9.2 ✔ tidyr 1.3.0
## ✔ purrr 1.0.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(tmap)
## Warning: package 'tmap' was built under R version 4.2.3
library(tmaptools)
## Warning: package 'tmaptools' was built under R version 4.2.3
library(leaflet)
## Warning: package 'leaflet' was built under R version 4.2.3
library(sf)
## Warning: package 'sf' was built under R version 4.2.3
## Linking to GEOS 3.9.3, GDAL 3.5.2, PROJ 8.2.1; sf_use_s2() is TRUE
library(leaflet.extras)
## Warning: package 'leaflet.extras' was built under R version 4.2.3
library(dplyr)
library(rio)
## Warning: package 'rio' was built under R version 4.2.3
library(sp)
## Warning: package 'sp' was built under R version 4.2.3
library(ggplot2)
library(plotly)
##
## Attaching package: 'plotly'
##
## The following object is masked from 'package:rio':
##
## export
##
## The following object is masked from 'package:ggplot2':
##
## last_plot
##
## The following object is masked from 'package:stats':
##
## filter
##
## The following object is masked from 'package:graphics':
##
## layout
setwd("C:/Users/amani/OneDrive/Desktop/Data110")
adultTobaccoUseUS <- read.csv("adultTobaccoUseUS.csv")
head(adultTobaccoUseUS)
## Year LocationAbbrev LocationDesc Population Topic
## 1 2000 US National 209,786,736 Noncombustible Tobacco
## 2 2000 US National 209,786,736 Combustible Tobacco
## 3 2000 US National 209,786,736 Combustible Tobacco
## 4 2000 US National 209,786,736 Combustible Tobacco
## 5 2000 US National 209,786,736 Combustible Tobacco
## 6 2000 US National 209,786,736 Combustible Tobacco
## Measure Submeasure Data.Value.Unit Domestic
## 1 Smokeless Tobacco Chewing Tobacco Pounds 45,502,156
## 2 Cigarettes Cigarette Removals Cigarettes 423,250,355,675
## 3 Cigars Total Cigars Cigars 5,612,867,329
## 4 Loose Tobacco Total Loose Tobacco Cigarette Equivalents 8,291,276,800
## 5 Loose Tobacco Total Loose Tobacco Pounds 16,841,656
## 6 Cigars Small Cigars Cigars 2,243,135,044
## Imports Total Domestic.Per.Capita Imports.Per.Capita
## 1 91,965 45,594,121 0.217 0
## 2 12,319,663,000 435,570,018,675 2,018 59
## 3 548,243,000 6,161,110,329 27 3
## 4 702,741,662 8,994,018,462 40 3
## 5 1,427,444 18,269,100 0 0
## 6 36,049,000 2,279,184,044 11 0
## Total.Per.Capita
## 1 0.217
## 2 2,076
## 3 29
## 4 43
## 5 0
## 6 11
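The counts in the output above print with thousands separators (for example 209,786,736), which suggests that read.csv() stored those columns as text rather than numbers. Below is a minimal sketch of one way to convert them before plotting or summarising. It assumes readr's parse_number() is acceptable for this data and uses three rows copied from the output above instead of the full file. The names sample_rows, numeric_cols, and cleaned are only for illustration.

```r
library(dplyr)
library(readr)

# Three rows copied from the head() output above; the real data comes from the CSV.
sample_rows <- data.frame(
  Measure = c("Smokeless Tobacco", "Cigarettes", "Cigars"),
  Population = c("209,786,736", "209,786,736", "209,786,736"),
  Total = c("45,594,121", "435,570,018,675", "6,161,110,329"),
  Total.Per.Capita = c("0.217", "2,076", "29"),
  stringsAsFactors = FALSE
)

# Columns that hold numbers written as text with comma separators
numeric_cols <- c("Population", "Total", "Total.Per.Capita")

# parse_number() drops the grouping commas and returns plain numeric values
cleaned <- sample_rows %>%
  mutate(across(all_of(numeric_cols), parse_number))

str(cleaned)
```

The same mutate() call could be applied to adultTobaccoUseUS, extending numeric_cols to the Domestic, Imports, and remaining per capita columns if they turn out to be text as well.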
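# Reuse the data frame loaded above instead of reading the CSV a second time
df <- adultTobaccoUseUS
# geom_bar() counts rows, so convert each count to its share of all records
ggplot(df, aes(x = Topic, fill = Submeasure)) +
  geom_bar(aes(y = after_stat(count / sum(count))), position = "dodge", alpha = 0.8) +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "Topic", y = "Percentage of records", title = "Percentage of records by tobacco topic and submeasure")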
# Keep only the most recent years (2020 onward)
Recent_data <- adultTobaccoUseUS %>%
  filter(Year >= 2020)
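To compare per capita consumption across measures, one option is to average Total.Per.Capita within each measure after filtering by year. The sketch below is only an illustration: summarise_per_capita() is a hypothetical helper, not part of the original analysis, and the demo uses rows copied from the year 2000 output shown earlier, since values for 2020 onward are not printed here. It groups by Data.Value.Unit as well, because pounds, cigars, and cigarettes are not directly comparable.

```r
library(dplyr)
library(readr)

# Hypothetical helper: keep rows from min_year onward and average
# Total.Per.Capita within each Measure, Submeasure, and unit.
summarise_per_capita <- function(data, min_year) {
  data %>%
    filter(Year >= min_year) %>%
    # The column may be text with comma separators, so parse it first
    mutate(Total.Per.Capita = parse_number(as.character(Total.Per.Capita))) %>%
    group_by(Measure, Submeasure, Data.Value.Unit) %>%
    summarise(mean_per_capita = mean(Total.Per.Capita), .groups = "drop")
}

# Demo rows copied from the head() output (year 2000), not from 2020 onward
demo_rows <- data.frame(
  Year = c(2000, 2000, 2000),
  Measure = c("Cigarettes", "Cigars", "Cigars"),
  Submeasure = c("Cigarette Removals", "Total Cigars", "Small Cigars"),
  Data.Value.Unit = c("Cigarettes", "Cigars", "Cigars"),
  Total.Per.Capita = c("2,076", "29", "11"),
  stringsAsFactors = FALSE
)

demo_summary <- summarise_per_capita(demo_rows, min_year = 2000)
print(demo_summary)
```

For the analysis itself, the call would presumably be summarise_per_capita(adultTobaccoUseUS, min_year = 2020), which covers the same rows as Recent_data.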