Original Visualization

Source: CDC Youth Risk Behavior Surveillance

The data visualization shows percentage trends for three variables from 2001 to 2017. The data come from the Centers for Disease Control and Prevention (CDC) Youth Risk Behavior Surveillance System, which surveys students aged 13 to 18 about behaviors ranging from sexual activity to recreational drug use. The CDC conducts the survey every two years, and respondents are anonymous. Its main purpose is to inform school districts and health departments of these trends so that they can support young people in the United States and thereby reduce the transmission of sexually transmitted diseases and infections, including human immunodeficiency virus (HIV).

Online Article

https://www.ashasexualhealth.org/national-youth-risk-behavior-survey-yrbs/

Objective

Based on the organization and the content hosted on its website, its main audience is families with teenagers who wish to learn about leading sexually healthy lives. The organization focuses on raising awareness through education and on providing accessible information. Its aim is to help prevent STIs and unwanted pregnancies among young people by informing teenagers of the risks and supporting them in practicing safe sex.

The article above includes a data visualization indicating a downward trend in unsafe sexual activity as of 2017. This suggests that the organization (American Sexual Health Association) and similar groups are on the right track and that their efforts are producing positive results.

The objective of the data visualization is therefore to inform all related parties that, in general, the percentage of surveyed students reporting risky behavior has fallen compared with previous years, and that they should keep up this work to reduce the percentage further.

Issues

Reconstruction Code

#load the tidyverse, which provides %>%, pivot_longer() and ggplot()
library(tidyverse)

#the dataset includes values for the four categories mentioned in both the data visualization and the online article.
#the time frame runs from 2001 to 2019 to show a roughly 20-year trend.
#the article was published around 2018, so the author only had access to data up to 2017. the author could have
#included data from 1999 to obtain a 20-year time frame, since the CDC has documentation going back to 1991.
dataset <- read.csv("Assignment2.csv")

dataset
##    Year Ever.had.sex Had.4.or.more.partners Used.a.condom.during.sex
## 1  2001         45.6                   14.2                     57.9
## 2  2003         46.7                   14.4                     63.0
## 3  2005         46.8                   14.3                     62.8
## 4  2007         47.8                   14.9                     61.5
## 5  2009         46.0                   13.8                     61.1
## 6  2011         47.4                   15.3                     60.2
## 7  2013         46.8                   15.0                     59.1
## 8  2015         41.2                   11.5                     56.9
## 9  2017         39.5                    9.7                     53.8
## 10 2019         38.4                    8.6                     54.3
##    Have.used.drugs
## 1             25.6
## 2             25.4
## 3             23.3
## 4             22.6
## 5             20.0
## 6             22.5
## 7             17.3
## 8             15.4
## 9             14.0
## 10            14.8
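Because `Assignment2.csv` is not distributed with this write-up, the table printed above can also be rebuilt by hand. The following is a minimal sketch that copies the printed values into a data frame; the name `dataset_inline` is a hypothetical one chosen here so that it does not overwrite `dataset`.

```r
# Rebuild the printed table without reading Assignment2.csv.
# Column names match those shown in the read.csv() output above.
# `dataset_inline` is an illustrative name, not part of the original code.
dataset_inline <- data.frame(
  Year = seq(2001L, 2019L, by = 2L),
  Ever.had.sex = c(45.6, 46.7, 46.8, 47.8, 46.0, 47.4, 46.8, 41.2, 39.5, 38.4),
  Had.4.or.more.partners = c(14.2, 14.4, 14.3, 14.9, 13.8, 15.3, 15.0, 11.5, 9.7, 8.6),
  Used.a.condom.during.sex = c(57.9, 63.0, 62.8, 61.5, 61.1, 60.2, 59.1, 56.9, 53.8, 54.3),
  Have.used.drugs = c(25.6, 25.4, 23.3, 22.6, 20.0, 22.5, 17.3, 15.4, 14.0, 14.8)
)

# Sanity check: ten survey years and five columns, as in the printed output.
stopifnot(nrow(dataset_inline) == 10, ncol(dataset_inline) == 5)
```

Assigning `dataset_inline` to `dataset` would let the remaining steps run without the CSV file.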
#use pivot_longer() to convert to long format
dataset <- dataset %>%
  pivot_longer(
    cols = c("Ever.had.sex", "Had.4.or.more.partners", "Used.a.condom.during.sex", "Have.used.drugs"),
    names_to = "Category",
    values_to = "Percentage"
  )

dataset
## # A tibble: 40 × 3
##     Year Category                 Percentage
##    <int> <chr>                         <dbl>
##  1  2001 Ever.had.sex                   45.6
##  2  2001 Had.4.or.more.partners         14.2
##  3  2001 Used.a.condom.during.sex       57.9
##  4  2001 Have.used.drugs                25.6
##  5  2003 Ever.had.sex                   46.7
##  6  2003 Had.4.or.more.partners         14.4
##  7  2003 Used.a.condom.during.sex       63  
##  8  2003 Have.used.drugs                25.4
##  9  2005 Ever.had.sex                   46.8
## 10  2005 Had.4.or.more.partners         14.3
## # ℹ 30 more rows
#change Category to factor and set according to intended legend ordering
dataset$Category <- dataset$Category %>% factor(levels=c("Used.a.condom.during.sex","Ever.had.sex","Have.used.drugs","Had.4.or.more.partners"))
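The effect of setting the factor levels can be seen in a small base R sketch. It assumes only the four category names shown above; `categories`, `default_order` and `legend_order` are illustrative names introduced here.

```r
# Category names as they appear after pivot_longer().
categories <- c("Ever.had.sex", "Had.4.or.more.partners",
                "Used.a.condom.during.sex", "Have.used.drugs")

# Without explicit levels, factor() sorts the levels alphabetically.
default_order <- levels(factor(categories))

# With explicit levels, the order is exactly the one supplied.
legend_order <- levels(factor(categories, levels = c(
  "Used.a.condom.during.sex", "Ever.had.sex",
  "Have.used.drugs", "Had.4.or.more.partners"
)))

print(default_order)
print(legend_order)
```

ggplot2 draws the legend in level order, and unnamed `labels` and `values` passed to `scale_color_manual()` are matched to the levels by position, so the ordering here determines which label and color each line receives.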

#graph plotting, y-axis using intervals of 5 to better show the trend
#linewidth replaces the deprecated size argument for lines (ggplot2 3.4.0 and later)
datasetplot <- ggplot(dataset, aes(x = Year, y = Percentage, label = Percentage)) +
  geom_line(aes(color = Category), linewidth = 1) +
  scale_color_manual(
    name = "Category",
    labels = c("Used a condom", "Ever had sex", "Have taken drugs", "Had 4 or more sexual partners"),
    values = c("darkgrey", "blue", "darkgreen", "red")
  ) +
  scale_x_continuous(breaks = seq(2001, 2019, 2)) +
  scale_y_continuous(breaks = seq(0, 70, 5)) +
  ggtitle("Trends in Youth Risk Behaviors: 2001-2019")

datasetplot
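To put numbers on the trend lines, one option is to compute the change in percentage points between the first and last survey years in the dataset. The sketch below copies the 2001 and 2019 rows from the table above; `first_year`, `last_year` and `change` are illustrative names introduced here.

```r
# Values copied from the 2001 and 2019 rows of the dataset above.
first_year <- c(Ever.had.sex = 45.6, Had.4.or.more.partners = 14.2,
                Used.a.condom.during.sex = 57.9, Have.used.drugs = 25.6)
last_year <- c(Ever.had.sex = 38.4, Had.4.or.more.partners = 8.6,
               Used.a.condom.during.sex = 54.3, Have.used.drugs = 14.8)

# Change in percentage points; a negative value means the percentage fell.
change <- round(last_year - first_year, 1)
print(change)
```

All four categories show a negative change between 2001 and 2019, which matches the direction of the lines in the plot.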

References

American Sexual Health Association (2017-2018), National Youth Risk Behavior Survey, viewed 5th April 2023, https://www.ashasexualhealth.org/national-youth-risk-behavior-survey-yrbs/

Centers for Disease Control and Prevention (2001), Youth Risk Behavior Surveillance – United States, 2001, https://www.cdc.gov/mmwr/preview/mmwrhtml/ss5104a1.htm

Centers for Disease Control and Prevention (2003), Youth Risk Behavior Surveillance – United States, 2003, https://www.cdc.gov/mmwr/preview/mmwrhtml/ss5302a1.htm

Centers for Disease Control and Prevention (2005), Youth Risk Behavior Surveillance – United States, 2005, https://www.cdc.gov/mmwr/preview/mmwrhtml/ss5505a1.htm

Centers for Disease Control and Prevention (2018), Youth Risk Behavior Survey Data Summary & Trends Report 2007-2017, https://www.cdc.gov/healthyyouth/data/yrbs/pdf/trendsreport.pdf

Centers for Disease Control and Prevention (2020), Youth Risk Behavior Survey Data Summary & Trends Report 2009-2019, https://www.cdc.gov/healthyyouth/data/yrbs/pdf/YRBSDataSummaryTrendsReport2019-508.pdf