Reporting Data
The Office for Civil Rights (OCR) within the Department of Health and Human Services (HHS) is responsible for collecting and reporting disclosures of protected health information (PHI), as mandated by law. The law requires OCR to report cases in which a covered entity has a breach that affects 500 or more individuals.
What information is included for each breach?
• Name of the covered entity (Organization responsible for the PHI)
• State (US State where the breach was reported)
• Covered Entity Type (Type of organization responsible for the PHI)
• Individuals Affected (Number of records affected by the breach)
• Breach submission date (Date the breach was reported by the covered entity)
• Type of breach (how unauthorized access to the PHI was obtained)
• Location of breached information (Where was the PHI when unauthorized access was obtained)
• Business associate present (Was a business associate such as a consultant or contractor involved in the breach)
• Web description (An optional statement explaining what happened and the resolution)
Individual Breaches
This table shows total breaches by type within the database.
## # A tibble: 1 × 5
## Unkown Theft IT_Hacking Loss Improper_Disposal
## <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 13 765 234 765 66
The table shows that theft and loss are the most common types of breaches, each with 765 cases. IT hacking is third with 234 cases. As more and more information moves to digital storage, I would expect the number of IT hacking breaches to be much higher in the future.
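A one-row summary like the table above could be produced along the following lines. This is a minimal sketch, not the code behind the report: the toy data, the column name `type_of_breach`, and the type labels are assumptions. It counts pattern matches rather than exact values, on the assumption that a single breach can list more than one type.

```r
library(dplyr)
library(stringr)

# Toy stand-in for the breach data; the column name and labels are assumptions.
breaches <- tibble(
  type_of_breach = c("Theft", "Hacking/IT Incident", "Loss", "Theft, Loss",
                     "Improper Disposal", "Unknown")
)

# Count rows whose type string mentions each category.
type_counts <- breaches %>%
  summarise(
    Unknown           = sum(str_detect(type_of_breach, "Unknown")),
    Theft             = sum(str_detect(type_of_breach, "Theft")),
    IT_Hacking        = sum(str_detect(type_of_breach, "Hacking")),
    Loss              = sum(str_detect(type_of_breach, "Loss")),
    Improper_Disposal = sum(str_detect(type_of_breach, "Improper Disposal"))
  )

print(type_counts)
```

When two columns come out identical, as Theft and Loss do in the table above, it may be worth checking that each column really uses its own pattern.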
How many breaches have we seen each year?
The graph shows that the most breaches happened in 2014, with around 375. The second- and third-highest totals were in 2013 and 2016. Advancing technology could be one reason we see more breaches as time goes on.
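The yearly totals behind a graph like this can be sketched as follows. The toy data, the column name `submission_date`, and the month/day/year date format are all assumptions for illustration.

```r
library(dplyr)

# Toy stand-in for the breach data; column name and date format are assumptions.
breaches <- tibble(
  submission_date = c("10/21/2013", "03/04/2014", "07/15/2014", "01/09/2016")
)

# Parse the submission date, pull out the year, and count breaches per year.
breaches_per_year <- breaches %>%
  mutate(
    year = as.integer(format(as.Date(submission_date, format = "%m/%d/%Y"), "%Y"))
  ) %>%
  count(year, name = "breaches")

print(breaches_per_year)
```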
Where are the top 25 healthcare breaches from?
## # A tibble: 25 × 5
## # Groups: Name of Covered Entity, year, State [25]
## `Name of Covered Entity` year State Cover…¹ Indiv…²
## <chr> <dbl> <chr> <chr> <dbl>
## 1 Anthem, Inc. Affiliated Covered Entity 2015 IN Health… 7.88e7
## 2 Science Applications International Corporation (… 2011 VA Busine… 4.9 e6
## 3 Advocate Health and Hospitals Corporation, d/b/a… 2013 IL Health… 4.03e6
## 4 21st Century Oncology 2016 FL Health… 2.21e6
## 5 Xerox State Healthcare, LLC 2014 TX Busine… 2 e6
## 6 IBM 2011 NY Busine… 1.9 e6
## 7 GRM Information Management Services 2011 NJ Busine… 1.7 e6
## 8 AvMed, Inc. 2010 FL Health… 1.22e6
## 9 Montana Department of Public Health & Human Serv… 2014 MT Health… 1.06e6
## 10 The Nemours Foundation 2011 FL Health… 1.06e6
## # … with 15 more rows, and abbreviated variable names ¹`Covered Entity Type`,
## # ²Individuals_Affected
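A ranking like this one can be sketched with `group_by()`, `summarise()`, and `slice_max()`. The toy data, entity names, and column names below are hypothetical. Passing `.groups = "drop"` returns an ungrouped result, so the ranking runs across all rows instead of within each group.

```r
library(dplyr)

# Toy stand-in for the breach data; all names and values are hypothetical.
breaches <- tibble(
  name                 = c("Entity A", "Entity B", "Entity C", "Entity A"),
  year                 = c(2015, 2011, 2013, 2015),
  state                = c("IN", "VA", "IL", "IN"),
  individuals_affected = c(1000, 5000, 3000, 2500)
)

# Total the individuals affected per entity, year, and state, then keep the top 2.
top_breaches <- breaches %>%
  group_by(name, year, state) %>%
  summarise(individuals_affected = sum(individuals_affected), .groups = "drop") %>%
  slice_max(individuals_affected, n = 2)

print(top_breaches)
```

For the report's table, `n = 25` would take the place of `n = 2`.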
The three largest healthcare data breaches are from Anthem, Inc., Science Applications International Corporation, and Advocate Health and Hospitals Corporation.
Which states have the highest amount of breaches in the United States?
As expected, Indiana has the highest total of any state because the largest breach (Anthem, Inc.) was in Indiana. Virginia is second, largely because of the Science Applications International Corporation breach in 2011.
Which months have the most breaches?
The graph shows that the most breaches happen in April, May, and October. The fewest breaches happen in February and June.
Which covered entity type has the most breaches?
## # A tibble: 4 × 2
## `Covered Entity Type` `Amount of Breaches`
## <chr> <int>
## 1 Business Associate 285
## 2 Health Plan 200
## 3 Healthcare Clearing House 4
## 4 Healthcare Provider 1220
Healthcare providers have the most breaches by a large margin compared with the other three covered entity types. I found it interesting that there were only 4 breaches at healthcare clearing houses. Since healthcare clearing houses are the middlemen between the big companies and the clients, I assumed more data would get lost within them.
Which day of the week has the most breaches?
## # A tibble: 7 × 2
## day `Amount of Breaches`
## <ord> <int>
## 1 Sun 19
## 2 Mon 286
## 3 Tue 281
## 4 Wed 282
## 5 Thu 300
## 6 Fri 512
## 7 Sat 29
The graph shows that breaches are concentrated toward the end of the work week. Thursday and Friday have the highest counts, at 300 and 512. This makes sense because companies usually work Monday through Friday, so there would be few opportunities on the weekend. The breaches on Saturday and Sunday could mostly come from hospitals, because they are open 7 days a week.
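A weekday breakdown like this can be sketched with lubridate. The toy data, the column name `submission_date`, and the date format are assumptions. Because the only date in the data is the breach submission date, the weekday reflects when a breach was reported, which is not necessarily when it happened.

```r
library(dplyr)
library(lubridate)

# Toy stand-in for the breach data; column name and date format are assumptions.
breaches <- tibble(
  submission_date = c("03/06/2014", "03/07/2014", "03/10/2014", "03/14/2014")
)

# wday(label = TRUE) returns an ordered factor, so rows sort Sun through Sat.
breaches_by_day <- breaches %>%
  mutate(day = wday(mdy(submission_date), label = TRUE)) %>%
  count(day)

print(breaches_by_day)
```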
Number of Business Associate Breaches
## # A tibble: 10 × 2
## year `Number of Business Associate Breaches > 50`
## <dbl> <int>
## 1 2009 3
## 2 2010 43
## 3 2011 45
## 4 2012 37
## 5 2013 64
## 6 2014 67
## 7 2015 9
## 8 2016 14
## 9 2017 3
## 10 2018 0
As the graph shows, there were two years with more than 50 business associate breaches: 64 in 2013 and 67 in 2014.
Was a business associate present during the theft breaches?
Breaches and data leaks are not always just unfortunate circumstances. Human error can play a part, such as leaving a computer inside a car. I wanted to know how many theft breaches involved a consultant or some other type of worker. According to the data, about 600 thefts did not involve a business associate and about 175 did. I expected the number to be around this level. If you are an employee of a hospital, you are presumably careful about information being stolen, so I am not sure why employees would be involved in some of these thefts. Many of the theft descriptions say that a laptop was stolen from a bag or car. One interesting case I read was an employee of FTGU medical Consulting who sent data to an unknown third party. If I could change the data or make the types of breaches more specific, I would break theft down into different types. A stolen computer and an employee sending patient information to a third party are not the same, in my opinion.
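The split between thefts with and without a business associate can be sketched as a filter followed by a count. The toy data, the column names, and the "Yes"/"No" coding are assumptions for illustration.

```r
library(dplyr)
library(stringr)

# Toy stand-in for the breach data; column names and coding are assumptions.
breaches <- tibble(
  type_of_breach = c("Theft", "Theft", "Theft, Loss", "Loss", "Hacking/IT Incident"),
  business_associate_present = c("No", "Yes", "No", "Yes", "No")
)

# Keep only breaches that mention theft, then count by business associate flag.
theft_by_associate <- breaches %>%
  filter(str_detect(type_of_breach, "Theft")) %>%
  count(business_associate_present, name = "thefts")

print(theft_by_associate)
```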