The Office for Civil Rights (OCR) in the US Department of Health and Human Services (HHS) is responsible for collecting and reporting unauthorized disclosures of protected health information (PHI) as mandated by law. Part of the law requires the OCR to report cases where a covered entity (CE, an organization responsible for protecting health information) has a breach that affects 500 or more individuals.
The data for this analysis comes from the Breach Portal: Notice to the Secretary of HHS Breach of Unsecured Protected Health Information, maintained by the U.S. Department of Health and Human Services Office for Civil Rights. The data contains the 9 variables described below and consists of 1,709 rows. Each row of the data represents one reported breach at a health care entity. Each column records an aspect of the breach, including state, entity type, individuals affected, breach submission date, type of breach, and more. The data has been reported on a clunky OCR data portal since 2009: https://ocrportal.hhs.gov/ocr/breach/breach_report.jsf.
The data reported for each of these breaches include:
• Name of the covered entity (organization responsible for the PHI)
• State (US state where the breach was reported)
• Covered Entity Type (type of organization responsible for the PHI)
• Individuals Affected (number of records affected by the breach)
• Breach submission date (date the breach was reported by the CE)
• Type of breach (how unauthorized access to the PHI was obtained)
• Location of breached information (where the PHI was when unauthorized access was obtained)
• Business associate present (whether a business associate, such as a consultant or contractor, was involved in the breach)
• Web description (an optional statement explaining what happened and the resolution)
The following table shows the number of data breaches by state and the average number of individuals affected per breach. California has had 252 data breaches, the most of any state in the dataset.
##Breaches by entity type
The following table summarizes the number of breaches by covered entity type. Healthcare providers account for 71% of the 2,048 entries.
| Covered Entity Type | Number of breaches |
|---|---|
| Business Associate | 315 |
| Health Plan | 268 |
| Healthcare Clearing House | 4 |
| Healthcare Provider | 1459 |
| NA | 2 |
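The 71% figure can be checked directly from the counts in the table. The short sketch below is illustrative only: the dictionary and the helper name `percentage_shares` are not part of the original analysis, and the counts are simply copied from the table above.

```python
# Counts copied from the table above (covered entity type -> number of breaches).
breaches_by_type = {
    "Business Associate": 315,
    "Health Plan": 268,
    "Healthcare Clearing House": 4,
    "Healthcare Provider": 1459,
    "NA": 2,
}


def percentage_shares(counts):
    """Return each category's share of the total, as a percentage."""
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}


total_breaches = sum(breaches_by_type.values())
shares = percentage_shares(breaches_by_type)

print(f"Total breaches: {total_breaches}")
# List the entity types from the largest share to the smallest.
for name, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<26} {share:5.1f}%")
```

Working from the published counts rather than the raw rows keeps the check independent of how the original table was produced.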
##Summary of all Thefts
Below is a searchable table of all breaches that the reporting entity classified as theft.
##Breaches with a business associate present for each entity type
This is a quick breakdown of the number of breaches with a business associate present.
| Covered Entity Type | Number of breaches |
|---|---|
| Business Associate | 308 |
| Health Plan | 39 |
| Healthcare Provider | 61 |
| NA | 2 |
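To put these counts in context, they can be divided by the total number of breaches for each entity type, taken from the earlier table of breaches by entity type. The sketch below is a rough illustration, not the original analysis code. It assumes that Healthcare Clearing House, which does not appear in the table above, had no breaches with a business associate present; the names `all_breaches`, `with_associate`, and `associate_rate` are made up for this example.

```python
# Total breaches per covered entity type (from the entity type table).
all_breaches = {
    "Business Associate": 315,
    "Health Plan": 268,
    "Healthcare Clearing House": 4,
    "Healthcare Provider": 1459,
    "NA": 2,
}

# Breaches with a business associate present (from the table above).
with_associate = {
    "Business Associate": 308,
    "Health Plan": 39,
    "Healthcare Provider": 61,
    "NA": 2,
}


def associate_rate(entity_type):
    """Percentage of an entity type's breaches with a business associate present.

    Assumption: an entity type missing from `with_associate` had zero such breaches.
    """
    return 100 * with_associate.get(entity_type, 0) / all_breaches[entity_type]


rates = {entity_type: associate_rate(entity_type) for entity_type in all_breaches}
overall_rate = 100 * sum(with_associate.values()) / sum(all_breaches.values())

for entity_type, rate in rates.items():
    print(f"{entity_type:<26} {rate:5.1f}%")
print(f"{'All entity types':<26} {overall_rate:5.1f}%")
```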
##Summary of all breaches on paper by state
Here is a quick breakdown of the number of paper/film data breaches by state over the time frame. California has the most of any state, with 56 instances.
##Number of healthcare data breaches by year
Below is a bar chart showing the number of data breaches by year. As the graph shows, there was a significant decrease in the number of breaches in 2017.
##List of the top 25 largest healthcare data breaches
Here is a searchable list of the top 25 data breaches. The most common type of breach was hacking/IT incident, but interestingly the fourth largest breach (about half a million people) was recorded as a loss of data.
##Visual: Total healthcare records (individuals affected) exposed by state for the top 10 states
Here we observe a large spike in the number of individuals affected in Indiana. This is due to a single outlier, Anthem, Inc. Affiliated Covered Entity, which accounts for 78,800,000 of the individuals affected. Strangely enough, even though California has had the highest number of breaches of any state, it ranks only fifth in total individuals affected.
##Number of healthcare hacking incidents by month
Here we can see a bar chart of the number of healthcare hacking incidents by month. March and April have the highest counts of hacking incidents. Could this be related to tax season?
##Number of breaches by covered entity type
As the table below shows, between 2009 and 2018 Healthcare Providers had noticeably more breaches than the other entity types.
| Covered Entity Type | Number of breaches |
|---|---|
| Business Associate | 315 |
| Health Plan | 268 |
| Healthcare Clearing House | 4 |
| Healthcare Provider | 1459 |
| NA | 2 |
##Breaches during days of the week
The two lowest counts fall on the weekend, but we can also see a large spike in the number of reported breaches on Friday.
| day | Number of Breaches |
|---|---|
| Friday | 617 |
| Monday | 333 |
| Saturday | 32 |
| Sunday | 22 |
| Thursday | 366 |
| Tuesday | 340 |
| Wednesday | 338 |
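A table like this can be built by converting each breach submission date to a weekday name and counting. The sketch below shows one way to do that. It assumes the dates are strings in `MM/DD/YYYY` form, which may not match the actual export, and the four sample dates are made up for illustration. The second half uses the counts from the table above to express the Friday spike and the weekend dip as shares of all breaches.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def breaches_by_weekday(submission_dates, date_format="%m/%d/%Y"):
    """Count breaches per weekday, returned in Monday-to-Sunday order."""
    counts = Counter(
        WEEKDAYS[datetime.strptime(text, date_format).weekday()]
        for text in submission_dates
    )
    return {day: counts[day] for day in WEEKDAYS}


# Hypothetical submission dates, used only to exercise the function.
sample_dates = ["01/05/2018", "01/08/2018", "01/12/2018", "01/13/2018"]
sample_counts = breaches_by_weekday(sample_dates)
print(sample_counts)

# Counts copied from the table above.
reported = {"Monday": 333, "Tuesday": 340, "Wednesday": 338, "Thursday": 366,
            "Friday": 617, "Saturday": 32, "Sunday": 22}
total = sum(reported.values())
friday_share = 100 * reported["Friday"] / total
weekend_share = 100 * (reported["Saturday"] + reported["Sunday"]) / total
print(f"Friday: {friday_share:.1f}% of breaches, weekend: {weekend_share:.1f}%")
```

Looking up the weekday name by index, rather than with `strftime("%A")`, keeps the labels in English regardless of the system locale.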
##How often were there at least 150 healthcare provider breaches and at least 50 business associate breaches?
2013 and 2014 were the only two years between 2009 and 2018 in which at least 150 Healthcare Providers and at least 50 Business Associates had data breaches.
| Year | HealthProvider | BusinessAssociate |
|---|---|---|
| 2013 | 193 | 64 |
##Breaches by Year
The table below helps visualize the change in the number of breaches by breach type from year to year. In the case of thefts, fewer have been reported every year since 2010. In 2016 there was an extreme increase in the number of hacking incidents compared to the other years in this time frame: 71 Hacking/IT incidents took place, while the largest count in prior years was 33, in 2014.
Part two of this document covers curious findings in the data set that raise interesting questions and presents information useful to the reader.
##Theft at work?
Between 2009 and 2018 there were 118 thefts in which a business associate was present. Interestingly, 108 of those 118 instances happened at business associate entities. That is 37% of all business associate entries in this database. By comparison, Healthcare Providers had only 6 thefts with a business associate present, representing only 0.5% of all healthcare provider entries. Thefts with a business associate present are noticeably more common at business associate entities.
##Which Business Associate Entities were they?
To better support the visual above, here is a searchable table of the 108 business associate entities where a business associate was present during the theft.
##Social Security Numbers… better keep them safe!
The table below lists the top 10 states by individuals affected in data breaches whose web description contained the keyword "social security number". California had the highest number of such breaches, at 49, but ranked only 7th in individuals affected. Across the entire data set, 450 entries contained the keyword "social security number", affecting 11,673,076 individuals.
| State | Number of breaches | Number Of Individuals Affected |
|---|---|---|
| FL | 40 | 4022358 |
| NJ | 7 | 1752386 |
| UT | 3 | 786557 |
| TX | 39 | 749640 |
| SC | 9 | 730118 |
| WA | 16 | 597482 |
| CA | 49 | 509730 |
| PA | 20 | 312742 |
| IN | 12 | 274688 |
| PR | 14 | 186598 |
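The ranking claims can be checked against the rows of this table. The sketch below is illustrative and is not the original analysis code. The rows are copied from the table above, the data-set-wide total of 11,673,076 comes from the text, and the helper `mentions_ssn` is a hypothetical stand-in for the keyword filter, assuming a simple case-insensitive substring match on the web description.

```python
# Rows copied from the table above: (state, number of breaches, individuals affected).
ssn_rows = [
    ("FL", 40, 4022358), ("NJ", 7, 1752386), ("UT", 3, 786557),
    ("TX", 39, 749640), ("SC", 9, 730118), ("WA", 16, 597482),
    ("CA", 49, 509730), ("PA", 20, 312742), ("IN", 12, 274688),
    ("PR", 14, 186598),
]

# Total individuals affected across all keyword matches, as stated in the text.
TOTAL_SSN_INDIVIDUALS = 11673076


def mentions_ssn(web_description):
    """Hypothetical keyword filter: case-insensitive substring match."""
    return "social security number" in web_description.lower()


# Rank the listed states by individuals affected (1 = most affected).
by_individuals = sorted(ssn_rows, key=lambda row: row[2], reverse=True)
rank_by_individuals = {row[0]: i + 1 for i, row in enumerate(by_individuals)}

# State with the most keyword-matching breaches among the ten states listed.
most_breaches_state = max(ssn_rows, key=lambda row: row[1])[0]

top10_individuals = sum(row[2] for row in ssn_rows)
top10_share = 100 * top10_individuals / TOTAL_SSN_INDIVIDUALS

print(f"Most breaches: {most_breaches_state}")
print(f"CA rank by individuals affected: {rank_by_individuals['CA']}")
print(f"Top 10 states cover {top10_share:.1f}% of affected individuals")
```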