I am an analyst working for the US Department of Health and Human Services (HHS) in the Office for Civil Rights (OCR). This office is responsible for collecting and reporting disclosures of protected health information (PHI) as mandated by law. Part of the law requires that the OCR report cases where covered entities (CEs, the organizations responsible for protecting health information) have a breach that affects 500 or more individuals. The data reported for each of these breaches include:
• Name of the covered entity (Organization responsible for the PHI)
• State (US State where the breach was reported)
• Covered Entity Type (Type of organization responsible for the PHI)
• Individuals Affected (Number of records affected by the breach)
• Breach submission date (Date the breach was reported by the CE)
• Type of breach (how unauthorized access to the PHI was obtained)
• Location of breached information (Where was the PHI when unauthorized access was obtained)
• Business associate present (Was a business associate such as a consultant or contractor involved in the breach)
• Web description (An optional statement explaining what happened and the resolution)
Below is a table summarizing the overall data. It includes the total number of people affected, the average number affected per breach, and the standard deviation. I included the total to emphasize just how many people are affected by this issue. I included the average to show how many people are typically affected per breach, while the standard deviation shows how dispersed the data is. I also included the total number of breaches and the largest and smallest breaches to show the variation in size throughout the data.
| Average Affected | Standard Deviation Of Affected | Total Affected | Total Breaches | Largest Group | Smallest Group |
|---|---|---|---|---|---|
| 72703.15 | 1915658 | 124249678 | 1709 | 78800000 | 500 |
These summary statistics show that an average of 72,703 people are affected per breach. However, the standard deviation of 1,915,658 is roughly 26 times the average, so the data is very dispersed. With 1,709 total breaches, we can see that there have been many different breaches over the last few years. The same dispersion appears in the largest versus smallest group: the largest breach affected 78,800,000 people and the smallest affected 500.
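The six columns in the summary table can be computed in a few lines. The following is a minimal sketch in Python using only the standard library; the `summarize` function and the sample values are hypothetical and are not the code or data behind this report. It uses the sample standard deviation, which is an assumption about how the reported figure was calculated. The last lines check that the reported average agrees with the reported total and number of breaches.

```python
import statistics


def summarize(affected):
    """Return the six summary statistics for a list of per-breach counts."""
    return {
        "Average Affected": statistics.mean(affected),
        # Sample standard deviation (n - 1 denominator); this is an assumption.
        "Standard Deviation Of Affected": statistics.stdev(affected),
        "Total Affected": sum(affected),
        "Total Breaches": len(affected),
        "Largest Group": max(affected),
        "Smallest Group": min(affected),
    }


# Hypothetical per-breach counts, for illustration only.
sample = [500, 1200, 3400, 78000]
summary = summarize(sample)
for name, value in summary.items():
    print(f"{name}: {value}")

# Consistency check on the figures reported in the summary table:
# the average should equal the total divided by the number of breaches.
reported_total = 124_249_678
reported_breaches = 1_709
reported_average = round(reported_total / reported_breaches, 2)
print(reported_average)
```

A single very large value, such as the 78,000 in the sample, pulls both the average and the standard deviation far above the typical breach size, which is the same effect the largest breach has on the real data.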
This bar plot shows how many breaches there were per year, which lets us see whether there were any spikes and whether recent years are consistent. There was a big spike in breaches from 2009 to 2010. After this large jump, the number of breaches held steady at around 200 for the next three years before jumping again in 2013. An even larger jump followed in 2015, after which the number of breaches declined, to almost 150 fewer by 2017.
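The yearly counts behind a plot like this come from grouping breaches by the year of their submission date. A minimal sketch, assuming the dates are stored as `MM/DD/YYYY` strings; the `breaches_per_year` function, the date format, and the sample dates are all hypothetical:

```python
from collections import Counter
from datetime import datetime


def breaches_per_year(submission_dates, date_format="%m/%d/%Y"):
    """Count breaches by the year of their submission date."""
    years = (datetime.strptime(d, date_format).year for d in submission_dates)
    # Sort by year so the result reads left to right like the bar plot.
    return dict(sorted(Counter(years).items()))


# Hypothetical submission dates, for illustration only.
dates = ["10/21/2009", "02/03/2010", "11/30/2010", "06/15/2010", "01/09/2011"]
per_year = breaches_per_year(dates)
print(per_year)
```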
This table shows the covered entities with the largest breaches and how many individuals each breach affected. The Anthem breach is the largest by a wide margin, exceeding the second-largest breach, at Science Applications International Corporation, by 73,900,000 individuals affected.
| Name of Covered Entity | Individuals Affected |
|---|---|
| Anthem, Inc. Affiliated Covered Entity | 78800000 |
| Science Applications International Corporation (SA | 4900000 |
| Advocate Health and Hospitals Corporation, d/b/a Advocate Medical Group | 4029530 |
| 21st Century Oncology | 2213597 |
| Xerox State Healthcare, LLC | 2000000 |
| IBM | 1900000 |
| GRM Information Management Services | 1700000 |
| AvMed, Inc. | 1220000 |
| Montana Department of Public Health & Human Services | 1062509 |
| The Nemours Foundation | 1055489 |
| BlueCross BlueShield of Tennessee, Inc. | 1023209 |
| Sutter Medical Foundation | 943434 |
| Valley Anesthesiology Consultants, Inc. d/b/a Valley Anesthesiology and Pain Consultants | 882590 |
| Horizon Healthcare Services, Inc., doing business as Horizon Blue Cross Blue Shield of New Jersey, and its affiliates | 839711 |
| Iron Mountain Data Products, Inc. (now known as | 800000 |
| Utah Department of Technology Services | 780000 |
| AHMC Healthcare Inc. and affiliated Hospitals | 729000 |
| EISENHOWER MEDICAL CENTER | 514330 |
| Radiology Regional Center, PA | 483063 |
| Puerto Rico Department of Health - Triple S Management Corp. | 475000 |
| St Joseph Health System | 405000 |
| Spartanburg Regional Healthcare System | 400000 |
| Triple-S Salud, Inc. - Breach Case#2 | 398000 |
| Triple-S Salud, Inc. | 398000 |
| Community Health Plan of Washington | 381504 |
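The margin between the two largest breaches can be checked directly from the table. The sketch below copies the three largest rows from the table (names as they appear there) and uses the hypothetical helper `largest_breaches` to rank them; it illustrates the calculation and is not the code used to build the table.

```python
import heapq

# The three largest rows from the table above: (name, individuals affected).
breaches = [
    ("Advocate Health and Hospitals Corporation, d/b/a Advocate Medical Group", 4_029_530),
    ("Anthem, Inc. Affiliated Covered Entity", 78_800_000),
    ("Science Applications International Corporation (SA", 4_900_000),
]


def largest_breaches(rows, n):
    """Return the n rows with the most individuals affected, largest first."""
    return heapq.nlargest(n, rows, key=lambda row: row[1])


top_two = largest_breaches(breaches, 2)
gap = top_two[0][1] - top_two[1][1]
print(f"{top_two[0][0]} exceeds {top_two[1][0]} by {gap:,}")
```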
This bar plot shows the number of individuals affected per state. Indiana stands out because Anthem, Inc. is headquartered there. Apart from Indiana, the states are all similar in the number of individuals affected, with Florida second and Virginia a close third. Interestingly, the states with notably larger cities are included, yet they are not even close to the Indiana total driven by Anthem. For example, California, Illinois, and New York have some of the most well-known cities in the world, but they are not at the top of the list. This suggests that the size of a state or its cities does not determine how many individuals are affected.
The bar plot below shows how many hacking incidents there were in each month of the year. There is a noticeable drop in hacking incidents in November and February, and the number of incidents spikes in March, April, September, and December.
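Monthly counts like these require two steps: keeping only the hacking breaches, then grouping them by the month of the submission date. A minimal sketch under stated assumptions: the `hacking_incidents_per_month` function, the `MM/DD/YYYY` date format, the breach-type labels, and the sample records are all hypothetical.

```python
from collections import Counter
from datetime import datetime

MONTHS = [
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
]


def hacking_incidents_per_month(records, date_format="%m/%d/%Y"):
    """Count records whose breach type mentions hacking, by calendar month."""
    counts = Counter(
        datetime.strptime(date, date_format).month
        for date, breach_type in records
        if "hacking" in breach_type.lower()
    )
    # Report all twelve months in calendar order, including months with zero.
    return {MONTHS[m - 1]: counts.get(m, 0) for m in range(1, 13)}


# Hypothetical (submission date, type of breach) pairs, for illustration only.
records = [
    ("03/14/2016", "Hacking/IT Incident"),
    ("03/20/2015", "Hacking/IT Incident"),
    ("11/02/2016", "Theft"),
    ("09/09/2014", "Hacking/IT Incident"),
]
per_month = hacking_incidents_per_month(records)
print(per_month)
```

Counting by month across all years, as here, pools every year together, so a month can look high because of a single unusual year.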
This table shows that healthcare providers have far more breaches than any other covered entity type: 1,220, compared with 285 for business associates and 200 for health plans. This raises the question of how many individuals are affected by the breaches at each type of entity. Although healthcare providers have the largest number of breaches, they may not have the most affected individuals.
| Covered Entity Type | Number of Breaches |
|---|---|
| Business Associate | 285 |
| Health Plan | 200 |
| Healthcare Clearing House | 4 |
| Healthcare Provider | 1220 |
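To put these counts in proportion, the short sketch below computes each entity type's share of all breaches from the values in the table. The variable names are illustrative only. The counts sum to 1,709, which matches the total number of breaches in the summary table.

```python
# Breach counts by covered entity type, copied from the table above.
breach_counts = {
    "Business Associate": 285,
    "Health Plan": 200,
    "Healthcare Clearing House": 4,
    "Healthcare Provider": 1220,
}

total = sum(breach_counts.values())
shares = {name: count / total for name, count in breach_counts.items()}

# Print the entity types from most to fewest breaches.
for name, count in sorted(breach_counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {count} breaches ({shares[name]:.1%})")
print(f"Total: {total}")
```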
From this graph, we can see that breaches are most often reported on Fridays. One plausible explanation is that people prefer to report a breach and then escape to the bliss of the weekend before having to deal with the repercussions on Monday. Saturday and Sunday are the two smallest bars, which is understandable because most providers aren’t operating on the weekend.
From this table we can see that there were only 2 years in which Business Associates had more than 50 breaches and Healthcare Providers had more than 150 breaches. The two years were consecutive, and within each Covered Entity type the number of breaches was about the same in both years.
| Year | Business Associate Breaches (> 50) | Healthcare Provider Breaches (> 150) |
|---|---|---|
| 2013 | 64 | 187 |
| 2014 | 67 | 179 |
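A table like this results from counting breaches per year and entity type, then keeping only the years where both counts pass their thresholds. A minimal sketch, assuming each record is a `(year, entity type)` pair; the `years_over_thresholds` function is hypothetical. The 2013 and 2014 counts are taken from the table, while the 2012 counts are invented to show a year that is filtered out.

```python
from collections import Counter


def years_over_thresholds(records, ba_min=50, provider_min=150):
    """Return {year: (ba_count, provider_count)} where both counts exceed their minimums."""
    counts = Counter(records)  # keys are (year, entity_type) pairs
    result = {}
    for year in sorted({year for year, _ in counts}):
        ba = counts[(year, "Business Associate")]
        provider = counts[(year, "Healthcare Provider")]
        if ba > ba_min and provider > provider_min:
            result[year] = (ba, provider)
    return result


records = (
    # Hypothetical 2012 counts: the provider count passes, the associate count does not.
    [(2012, "Business Associate")] * 40
    + [(2012, "Healthcare Provider")] * 160
    # 2013 and 2014 counts as reported in the table above.
    + [(2013, "Business Associate")] * 64
    + [(2013, "Healthcare Provider")] * 187
    + [(2014, "Business Associate")] * 67
    + [(2014, "Healthcare Provider")] * 179
)
qualifying = years_over_thresholds(records)
print(qualifying)
```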
This bar plot shows a large disparity between the number of theft breaches and every other type. The only other type that came close to theft was Unauthorized Access/Disclosure. Also worth noting are a spike in Hacking incidents in 2016 and a spike in every type except theft in 2014.
This plot answers the question I raised earlier in the Number of Breaches by Covered Entity Type chunk: does the Covered Entity type with the most breaches also have the most individuals affected? From the bar plot, we can see that the covered entity type with the most breaches did not have the most affected individuals. This is interesting because Healthcare Providers had almost 1,000 more breaches than any other Covered Entity type.
This bar plot shows how many above-average breaches occurred when a Business Associate was and was not present. It helps us tell whether having a Business Associate present correlates with fewer above-average breaches. In this graph, fewer above-average breaches occur when a Business Associate is present.
Although the number of breaches is higher when a Business Associate is not present, that count alone does not tell us the total number of Individuals Affected. This bar plot shows that not only do the most breaches occur when a Business Associate is not present, but the most Individuals Affected also occur when a Business Associate is not present.
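Both comparisons, the number of above-average breaches and the total individuals affected, can be computed in one pass over the data. A minimal sketch, assuming each record is a `(business associate present, individuals affected)` pair with the presence flag stored as `"Yes"` or `"No"`; the `compare_by_associate` function, the flag encoding, and the sample values are all hypothetical.

```python
from statistics import mean


def compare_by_associate(records):
    """Summarize breaches by whether a Business Associate was present."""
    # "Above average" is measured against the mean over all breaches.
    average = mean([affected for _, affected in records])
    result = {}
    for present in ("Yes", "No"):
        sizes = [affected for flag, affected in records if flag == present]
        result[present] = {
            "above_average_breaches": sum(1 for size in sizes if size > average),
            "total_affected": sum(sizes),
        }
    return result


# Hypothetical records, for illustration only.
records = [
    ("Yes", 600), ("Yes", 9000),
    ("No", 700), ("No", 12000), ("No", 15000), ("No", 800),
]
comparison = compare_by_associate(records)
print(comparison)
```

Raw counts and totals like these depend on how many breaches fall in each group, so dividing by the number of breaches in each group would give a per-breach comparison.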
What we can conclude from these bar plots is that a Business Associate should be present at all times. This would help reduce not only the number of above-average breaches but also the overall number of individuals affected. Adding a Business Associate will not completely rid a system of breaches, but based on these graphs it is reasonable to assume that having one present will help.