Reporting Data

The Office for Civil Rights (OCR) in the Department of Health and Human Services (HHS) is responsible for collecting and reporting unauthorized disclosures of protected health information (PHI), as mandated by law. The law requires OCR to report cases where a covered entity has a breach that affects 500 or more individuals.

What information is included for each breach?

• Name of the covered entity (Organization responsible for the PHI)

• State (US State where the breach was reported)

• Covered Entity Type (Type of organization responsible for the PHI)

• Individuals Affected (Number of records affected by the breach)

• Breach submission date (Date the breach was reported by the covered entity)

• Type of breach (how unauthorized access to the PHI was obtained)

• Location of breached information (Where was the PHI when unauthorized access was obtained)

• Business associate present (Was a business associate such as a consultant or contractor involved in the breach)

• Web description (An optional statement explaining what happened and the resolution)

Individual Breaches

This table shows total breaches by type within the database.

## # A tibble: 1 × 5
##   Unkown Theft IT_Hacking  Loss Improper_Disposal
##    <dbl> <dbl>      <dbl> <dbl>             <dbl>
## 1     13   765        234   765                66

The table shows that theft and loss are the most common types of breaches, with 765 cases each. IT Hacking is third with 234 cases. As more and more information goes digital, I would expect the number of IT Hacking breaches to be much higher in the future.
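A tally like the one above could be produced along these lines. This is only a minimal sketch on invented toy data, not the code behind the report: the data frame `breaches` and the column name `breach_type` are assumptions.

```r
library(dplyr)

# Toy stand-in for the breach data; the column name `breach_type` is assumed.
breaches <- data.frame(
  breach_type = c("Theft", "Loss", "Theft", "IT_Hacking", "Loss", "Theft")
)

# One row per breach type, sorted from most to least common.
type_counts <- breaches %>%
  count(breach_type, name = "total", sort = TRUE)

print(type_counts)
```

This gives one row per type (long format); the table in the report has one column per type instead, which is the same information laid out wide.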

How many breaches have we seen each year?

The graph shows that the most breaches happened in 2014, at around 375. The 2nd and 3rd highest years were 2013 and 2016. As technology improves over time, this could be why we see more breaches as time goes on.
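The yearly totals behind a graph like this can be sketched as follows. The toy dates are invented and the column name `submission_date` is an assumption; the year is taken from the breach submission date, since that is the only date field listed for this data.

```r
library(dplyr)

# Toy stand-in for the breach data; `submission_date` is an assumed column name.
breaches <- data.frame(
  submission_date = as.Date(c("2013-03-01", "2014-06-15",
                              "2014-11-02", "2016-01-20"))
)

# Pull the calendar year out of each date, then count breaches per year.
per_year <- breaches %>%
  mutate(year = as.integer(format(submission_date, "%Y"))) %>%
  count(year, name = "breaches")

print(per_year)
```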

Where are the top 25 healthcare breaches from?

## # A tibble: 25 × 5
## # Groups:   Name of Covered Entity, year, State [25]
##    `Name of Covered Entity`                           year State Cover…¹ Indiv…²
##    <chr>                                             <dbl> <chr> <chr>     <dbl>
##  1 Anthem, Inc. Affiliated Covered Entity             2015 IN    Health…  7.88e7
##  2 Science Applications International Corporation (…  2011 VA    Busine…  4.9 e6
##  3 Advocate Health and Hospitals Corporation, d/b/a…  2013 IL    Health…  4.03e6
##  4 21st Century Oncology                              2016 FL    Health…  2.21e6
##  5 Xerox State Healthcare, LLC                        2014 TX    Busine…  2   e6
##  6 IBM                                                2011 NY    Busine…  1.9 e6
##  7 GRM Information Management Services                2011 NJ    Busine…  1.7 e6
##  8 AvMed, Inc.                                        2010 FL    Health…  1.22e6
##  9 Montana Department of Public Health & Human Serv…  2014 MT    Health…  1.06e6
## 10 The Nemours Foundation                             2011 FL    Health…  1.06e6
## # … with 15 more rows, and abbreviated variable names ¹​`Covered Entity Type`,
## #   ²​Individuals_Affected

The 3 largest healthcare data breaches are from Anthem, Inc., Science Applications International Corporation, and Advocate Health and Hospitals Corporation.
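A ranking like this comes down to sorting by the number of individuals affected and keeping the first rows. The sketch below uses invented entities and counts; only the column name `Individuals_Affected` is taken from the output above, and `entity` and `n_top` are hypothetical names.

```r
library(dplyr)

# Toy stand-in for the breach data; entity names and counts are invented.
breaches <- data.frame(
  entity = c("Entity A", "Entity B", "Entity C", "Entity D"),
  Individuals_Affected = c(1200, 560000, 83000, 4100)
)

n_top <- 3  # the report keeps the top 25

# Sort from largest to smallest breach and keep the first n_top rows.
top_breaches <- breaches %>%
  arrange(desc(Individuals_Affected)) %>%
  head(n_top)

print(top_breaches)
```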

Which states have the highest number of breaches in the United States?

As expected, Indiana ranks first among the states because the largest breach (Anthem, Inc.) was in Indiana. Virginia ranks second, helped by the Science Applications International Corporation breach in 2011.

Which months have the most breaches?

The graph shows that the most breaches happen in April, May and October. The fewest breaches happen in February and June.

Which covered entity type has the most breaches?

## # A tibble: 4 × 2
##   `Covered Entity Type`     `Amount of Breaches`
##   <chr>                                    <int>
## 1 Business Associate                         285
## 2 Health Plan                                200
## 3 Healthcare Clearing House                    4
## 4 Healthcare Provider                       1220

Healthcare Providers have the most breaches by a large margin compared with the other 3 covered entity types. I found it interesting that there were only 4 breaches at Healthcare Clearing Houses. Since Healthcare Clearing Houses are the middlemen between the big companies and the clients, I assumed more data would get lost within them.

Which day of the week has the most breaches?

## # A tibble: 7 × 2
##   day   `Amount of Breaches`
##   <ord>                <int>
## 1 Sun                     19
## 2 Mon                    286
## 3 Tue                    281
## 4 Wed                    282
## 5 Thu                    300
## 6 Fri                    512
## 7 Sat                     29

The data shows that most breaches fall later in the work week. Friday has the most with 512, followed by Thursday with 300. This makes sense because companies usually work Monday through Friday, so there wouldn’t be many chances on the weekend. The breaches on Saturday and Sunday could be mostly from hospitals, because they are open 7 days a week.
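A day-of-week breakdown like this can be sketched as below, on invented dates. The column name `submission_date` is an assumption, and the weekday is derived from it with base R's `%w` format code (0 = Sunday) rather than whatever helper the original analysis used.

```r
library(dplyr)

# Toy stand-in for the breach data; `submission_date` is an assumed column name.
breaches <- data.frame(
  submission_date = as.Date(c("2014-01-03", "2014-01-06",
                              "2014-01-10", "2014-01-11"))
)

day_labels <- c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")

by_day <- breaches %>%
  mutate(
    # "%w" gives the weekday as 0-6 with Sunday as 0, independent of locale.
    day = factor(as.integer(format(submission_date, "%w")),
                 levels = 0:6, labels = day_labels, ordered = TRUE)
  ) %>%
  # .drop = FALSE keeps days with zero breaches in the result.
  count(day, name = "breaches", .drop = FALSE)

print(by_day)
```

Because the weekday comes from the date the breach was submitted, a count like this describes when breaches were reported, which may differ from when they actually occurred.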

Number of Business Associate Breaches

## # A tibble: 10 × 2
##     year `Number of Business Associate Breaches > 50`
##    <dbl>                                        <int>
##  1  2009                                            3
##  2  2010                                           43
##  3  2011                                           45
##  4  2012                                           37
##  5  2013                                           64
##  6  2014                                           67
##  7  2015                                            9
##  8  2016                                           14
##  9  2017                                            3
## 10  2018                                            0

As the table shows, there were 2 years with more than 50 business associate breaches: 2013 with 64 and 2014 with 67.
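One way to find the years above a threshold is to count business associate breaches per year and then filter. This is a sketch on invented data with a threshold of 2 instead of 50; the column names and the choice to identify these breaches by covered entity type are assumptions, since the report does not show how they were selected.

```r
library(dplyr)

# Toy stand-in for the breach data; column names are assumed.
breaches <- data.frame(
  year = c(2013, 2013, 2013, 2014, 2014, 2015),
  covered_entity_type = c("Business Associate", "Business Associate",
                          "Business Associate", "Business Associate",
                          "Healthcare Provider", "Business Associate")
)

threshold <- 2  # the report uses 50

# Count business associate breaches per year.
ba_per_year <- breaches %>%
  filter(covered_entity_type == "Business Associate") %>%
  count(year, name = "ba_breaches")

# Keep only the years that exceed the threshold.
over_threshold <- ba_per_year %>%
  filter(ba_breaches > threshold)

print(over_threshold)
```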

Which states had the most Unauthorized Access breaches?

This question asks which states had the most Unauthorized Access breaches. I find this very interesting because I have had social media accounts stolen and used by hackers. Someone got hold of my password, which led to them having access to my account, where they could see my personal data. California, Florida, Texas and New York have the most Unauthorized Access breaches. Since they are bigger states with a lot of tech companies, this makes sense: with many more residents, there is a lot more data that can be stolen. I created a bar plot because it is the easiest way to view the data when comparing states.

Was a business associate present during the theft breaches?

Breaches and data leaks aren’t always just unfortunate circumstances. There can be human errors, such as leaving a computer inside a car. I wanted to know how many theft breaches involved a consultant or some other type of worker. According to the data, about 600 thefts did not involve a business associate and about 175 did. I expected the number with an associate present to be around this level. If you are an employee of a hospital, you are presumably careful about information being stolen, so I am not sure why employees would be involved in some of these thefts. Many of the theft descriptions say that a laptop was stolen from a bag or car. One interesting one I read said that an employee of FTGU medical Consulting sent data to an unknown 3rd party. If I could change the data or make the types of breaches more specific, I would break down the different types of theft. A stolen computer and an employee sending patients’ information to a third party are not the same, in my opinion.
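The split between thefts with and without a business associate can be sketched like this. The data is invented and the column names `breach_type` and `business_associate_present` are assumptions, as is the "Yes"/"No" coding of the flag.

```r
library(dplyr)

# Toy stand-in for the breach data; column names and coding are assumed.
breaches <- data.frame(
  breach_type = c("Theft", "Theft", "Loss", "Theft", "IT_Hacking"),
  business_associate_present = c("No", "Yes", "No", "No", "Yes")
)

# Keep only theft breaches, then count them by the business associate flag.
theft_by_associate <- breaches %>%
  filter(breach_type == "Theft") %>%
  count(business_associate_present, name = "thefts")

print(theft_by_associate)
```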