HIV_AIDS_NY

Author

Zijin Wang

Introduction

HIV/AIDS has long been a significant public health concern, particularly in densely populated areas. This document explores a dataset of HIV/AIDS statistics for New York City, sourced from NYC Health. Each record is described by year, borough, UHF, gender, age, and race, and reports measures such as the number of HIV and AIDS diagnoses, diagnosis rates, and death rates. The main objective is to uncover trends and patterns in the disease's prevalence and impact.

Loading necessary libraries

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.3     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.2     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggplot2)
library(plotly)

Attaching package: 'plotly'

The following object is masked from 'package:ggplot2':

    last_plot

The following object is masked from 'package:stats':

    filter

The following object is masked from 'package:graphics':

    layout

Loading the dataset

getwd()
[1] "/Users/zwang30/Desktop/DATA110"
data <- read_csv("HIV_AIDS_NY.csv")
Rows: 6005 Columns: 18
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (5): Borough, UHF, Gender, Age, Race
dbl (13): Year, HIV diagnoses, HIV diagnosis rate, Concurrent diagnoses, % l...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Preview the first rows to check the structure and cleanliness of the data
head(data)
# A tibble: 6 × 18
   Year Borough UHF   Gender    Age   Race  `HIV diagnoses` `HIV diagnosis rate`
  <dbl> <chr>   <chr> <chr>     <chr> <chr>           <dbl>                <dbl>
1  2011 All     All   All       All   All              3379                 48.3
2  2011 All     All   Male      All   All              2595                 79.1
3  2011 All     All   Female    All   All               733                 21.1
4  2011 All     All   Transgen… All   All                51              99999  
5  2011 All     All   Female    13 -… All                47                 13.6
6  2011 All     All   Female    20 -… All               178                 24.7
# ℹ 10 more variables: `Concurrent diagnoses` <dbl>,
#   `% linked to care within 3 months` <dbl>, `AIDS diagnoses` <dbl>,
#   `AIDS diagnosis rate` <dbl>, `PLWDHI prevalence` <dbl>,
#   `% viral suppression` <dbl>, Deaths <dbl>, `Death rate` <dbl>,
#   `HIV-related death rate` <dbl>, `Non-HIV-related death rate` <dbl>
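The preview above shows an `HIV diagnosis rate` of 99999 in the fourth row, which is far outside the range of the other rates. A minimal sketch of one way to handle this follows, assuming that 99999 is a placeholder for an unavailable rate (the source does not confirm this) and that it only needs replacing in columns whose names end in "rate". The `sample_rows` tibble is a stand-in built from the first four rows shown above, not the full dataset.

```r
library(dplyr)

# Stand-in for the first four rows printed by head(data)
sample_rows <- tibble::tibble(
  `HIV diagnoses` = c(3379, 2595, 733, 51),
  `HIV diagnosis rate` = c(48.3, 79.1, 21.1, 99999)
)

# Assumption: 99999 is a sentinel for "rate not available", not a real value.
# Only columns whose names end in "rate" are touched, so counts stay as they are.
cleaned <- sample_rows %>%
  mutate(across(ends_with("rate"), ~ na_if(.x, 99999)))

# The sentinel row is now NA and no longer distorts summaries
mean(cleaned$`HIV diagnosis rate`, na.rm = TRUE)
```

The same `mutate()` call could be applied to `data` itself; whether other columns use the same placeholder would need to be checked against the dataset's documentation.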

Conclusion and Analysis

My analysis began by loading tidyverse, ggplot2, and plotly, which provide tools for data manipulation and visualization. A central component was the visualization of yearly trends in HIV diagnoses across the boroughs of New York City. The plot's interactive elements allow dynamic exploration of the data and help uncover patterns that may inform public health interventions.

While this analysis provided insight into overall trends, future work could examine disparities among demographic groups, for instance the relationship between HIV diagnoses and gender or race, or investigate the impact of interventions over time. In summary, this analysis is a preliminary step toward understanding HIV/AIDS in New York City, and it shows the potential for further research in this critical public health domain.
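The yearly borough trend mentioned above could be sketched roughly as follows. This is not the original plotting code, which is not shown in this document. The helper name `plot_borough_trend` is hypothetical, and the filter assumes that "All" marks an aggregate row in each grouping column, as the preview of the data suggests for the citywide rows.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: line chart of yearly HIV diagnoses per borough.
plot_borough_trend <- function(df) {
  borough_totals <- df %>%
    # Assumption: "All" marks an aggregate row in each grouping column,
    # so this keeps one borough-level total per year.
    filter(Borough != "All", UHF == "All", Gender == "All",
           Age == "All", Race == "All") %>%
    group_by(Year, Borough) %>%
    summarise(diagnoses = sum(`HIV diagnoses`, na.rm = TRUE),
              .groups = "drop")

  ggplot(borough_totals, aes(x = Year, y = diagnoses, colour = Borough)) +
    geom_line() +
    geom_point() +
    labs(title = "HIV diagnoses by borough",
         x = "Year", y = "HIV diagnoses", colour = "Borough")
}

# Usage with the dataset loaded earlier:
# p <- plot_borough_trend(data)
# plotly::ggplotly(p)  # converts the static plot to an interactive one
```

If the borough-level rows are encoded differently in the full dataset, the `filter()` step would need adjusting before the totals are meaningful.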