Introduction

This health information data set comes from the Office for Civil Rights (OCR). This office is responsible for collecting and reporting disclosures of protected health information (PHI) as mandated by law. Part of the law requires that the OCR report cases where covered entities (CEs, the organizations responsible for protecting health information) have a breach that affects 500 or more individuals.

For each of these breaches, we have been given the following variables:

| Variable | Description |
| --- | --- |
| Name of Covered Entity | Organization responsible for the PHI |
| State | US state where the breach was reported |
| Covered Entity Type | Type of organization responsible for the PHI |
| Individuals Affected | Number of records affected by the breach |
| Breach Submission Date | Date the breach was reported by the CE |
| Type of Breach | How unauthorized access to the PHI was obtained |
| Location of Breached Information | Where the PHI was when unauthorized access was obtained |
| Business Associate Present | Whether a business associate, such as a consultant or contractor, was involved in the breach |
| Web Description | An optional statement explaining what happened and the resolution |

##Top 100 Breaches by Individuals Affected

###This table lists the 100 breaches with the most individuals affected, along with the state and the type of breach.

##Table of Every Value of Location of Breached Information

###Since we will most likely be using this data to pinpoint where breached information was located, it is interesting to see that some breaches involved more than one location.
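To illustrate how breaches with more than one location could be counted, here is a minimal Python sketch. It assumes that multiple locations are stored in one field separated by commas; the sample values and the `split_locations` helper are hypothetical and are not taken from the data set.

```python
from collections import Counter

# Hypothetical sample values; the real column and its separator may differ.
locations = [
    "Laptop",
    "Email",
    "Laptop, Paper/Films",
    "Network Server, Email",
]

def split_locations(value, sep=","):
    """Split one field value into its individual locations."""
    return [part.strip() for part in value.split(sep) if part.strip()]

# Count how often each individual location appears across all breaches.
location_counts = Counter(
    loc for value in locations for loc in split_locations(value)
)

# Count the breaches that list more than one location.
multi_location = sum(1 for value in locations if len(split_locations(value)) > 1)

print(location_counts.most_common())
print("Breaches with more than one location:", multi_location)
```

Counting individual locations this way avoids treating each combination of locations as its own category.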

##Business Associate Involvement

###With this graph we wanted to look at the effect of business associates and whether they played an important role in deterring some breaches. The graph suggests a correlation: fewer breaches were reported when a business associate was present than when one was not.

##Types of Breaches

###In contrast to the table of breach locations, I now wanted to graph each unique type of breach and determine which type occurred the most. As seen in this graph, “unknown” seems to be the most common type of breach, which does not look good from a security standpoint.

##Top 50 most recent reports from the data set

The resulting table shows which kinds of breaches have occurred most recently. In this case, most of the recent breaches seem to involve email or laptops. This analysis can help determine what kind of formal training employees should go through, based on the recent breaches.

##Number of Healthcare Data Breaches by year

###With this visual, we wanted to see how many healthcare breaches there were every year. From it we can see that the number of breaches per year stays fairly steady. The endpoints (2009 and 2018) are much smaller than the rest of the years, which could be due to incomplete data collection in those years. For example, the reporting of data breaches may not have started until late 2009, and the data may stop in early 2018.
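One way to count breaches per year, and to check whether the first and last years only cover part of the year, is sketched below in Python. The sample dates and the `MM/DD/YYYY` format are assumptions for illustration; the real submission dates may be stored differently.

```python
from collections import Counter
from datetime import datetime

# Hypothetical submission dates; the MM/DD/YYYY format is an assumption.
submission_dates = ["10/21/2009", "03/15/2012", "07/02/2012", "01/09/2018"]

DATE_FORMAT = "%m/%d/%Y"

# Parse every date once, then count the breaches reported in each year.
parsed_dates = [datetime.strptime(text, DATE_FORMAT) for text in submission_dates]
breaches_per_year = Counter(d.year for d in parsed_dates)

# The earliest and latest dates show whether the endpoint years are partial.
first_date = min(parsed_dates)
last_date = max(parsed_dates)

for year in sorted(breaches_per_year):
    print(year, breaches_per_year[year])
print("First report:", first_date.date(), "Last report:", last_date.date())
```

If the earliest date falls late in its year, or the latest date falls early in its year, the low counts at the endpoints reflect partial years rather than fewer breaches.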

##List of the Top 25 largest Healthcare Data Breaches

###This visual shows us the 25 largest data breaches within our data. These 25 could be the reason why there are such high outliers within our data. We can see the largest number of individuals affected by a single breach, which is a substantial amount. This visual shows how severely a data breach can affect personal records.
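As a rough illustration of how the largest breaches can be selected, the Python sketch below uses `heapq.nlargest` on the number of individuals affected. The entity names and counts are made up, and `largest_breaches` is a hypothetical helper.

```python
import heapq

# Hypothetical (covered entity, individuals affected) records.
breaches = [
    ("Entity A", 1_200),
    ("Entity B", 4_500_000),
    ("Entity C", 650),
    ("Entity D", 78_000),
    ("Entity E", 310_000),
]

def largest_breaches(records, n=25):
    """Return the n records with the most individuals affected, largest first."""
    return heapq.nlargest(n, records, key=lambda record: record[1])

top = largest_breaches(breaches, n=3)
for name, affected in top:
    print(f"{name}: {affected:,}")
```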

##Total Healthcare Records (Individuals Affected) Exposed by State for the top 10 States

###This visual shows the top 10 states and the number of individuals affected in each of them. As you can see, Oklahoma and West Virginia have the highest number of people affected. This can tell us which states need to improve their security measures so that these breaches do not happen as often.
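The per-state totals behind a chart like this can be computed by summing individuals affected for each state and keeping the largest totals. The Python sketch below assumes each breach is a `(state, individuals affected)` pair; the state labels and numbers are placeholders, not values from the data set.

```python
from collections import defaultdict

# Hypothetical (state, individuals affected) records with placeholder labels.
records = [
    ("State A", 500),
    ("State B", 1_200),
    ("State C", 800),
    ("State A", 2_500),
]

# Sum the individuals affected for each state.
totals = defaultdict(int)
for state, affected in records:
    totals[state] += affected

# Keep the 10 states with the highest totals, largest first.
top_states = sorted(totals.items(), key=lambda item: item[1], reverse=True)[:10]

for state, total in top_states:
    print(f"{state}: {total:,}")
```

Note that a total by state can be dominated by one very large breach, so a high total does not necessarily mean that breaches happen more often in that state.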

##Number of Healthcare hacking incidents by month

###This visual shows how many hacking incidents there were each month. As seen, the majority of incidents happen at the end of the year. One possible explanation is that people become more lax about security as the year goes on.

##Number of breaches by covered entity type

###This table shows the four covered entity types and how many breaches each one has. As seen, “Health Care Provider” has the most, at 1220 total breaches. This could be because patient information is very valuable and can be used in a variety of ways.

##On what day of the week (Sunday, Monday, etc.) are breaches most often reported?

###As seen from this table, the day of the week on which the most breaches are reported is Friday. This could be attributed to the laziness that we all experience on the final day of the work week. Employees are more likely to make careless mistakes when they are least focused. Also, the two days with the fewest reported breaches are Saturday and Sunday. This makes sense, since many employees do not work on the weekends.
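A count of reports by day of the week can be sketched in Python as follows. The sample dates are hypothetical, and the day names are listed explicitly so that the result does not depend on the system locale.

```python
from collections import Counter
from datetime import date

# date.weekday() returns 0 for Monday through 6 for Sunday.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Hypothetical submission dates.
submission_dates = [
    date(2014, 3, 7),
    date(2014, 3, 10),
    date(2014, 3, 14),
    date(2014, 3, 15),
]

# Count how many breaches were reported on each day of the week.
weekday_counts = Counter(DAY_NAMES[d.weekday()] for d in submission_dates)

for day in DAY_NAMES:
    print(day, weekday_counts[day])
```

Because this counts the submission date, it describes when breaches were reported, which may differ from when they actually happened.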

##2.2.2 In which year (or years) were there at least 50 breaches from a ‘Business Associate’ covered entity type and at least 150 breaches from a healthcare provider covered entity type?

###This table shows a summary of breaches by year for the business associate and healthcare provider entity types. As we can see, 2013 and 2014 seem to be the weakest years for these two entity types.
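The two thresholds in the question can be applied as a simple filter over yearly counts. The Python sketch below assumes the counts have already been tallied per year and entity type; the years and numbers are made up for illustration, and `years_meeting_thresholds` is a hypothetical helper.

```python
def years_meeting_thresholds(counts_by_year, ba_min=50, provider_min=150):
    """Return the years that meet both minimum breach counts."""
    return sorted(
        year
        for year, counts in counts_by_year.items()
        if counts.get("Business Associate", 0) >= ba_min
        and counts.get("Health Care Provider", 0) >= provider_min
    )

# Hypothetical yearly counts; these years and values are not from the data set.
counts_by_year = {
    2001: {"Business Associate": 30, "Health Care Provider": 160},
    2002: {"Business Associate": 55, "Health Care Provider": 150},
    2003: {"Business Associate": 70, "Health Care Provider": 120},
}

qualifying_years = years_meeting_thresholds(counts_by_year)
print(qualifying_years)
```

Both conditions use "at least", so a year with exactly 50 and exactly 150 breaches qualifies.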

##2.2.3 How has the type of breach (hacking, improper disposal, loss, etc.) changed for each year?

###This table shows the number of breaches of each type that occurred in each year. We can see in which years each type of breach happened most often.
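A year-by-type table of this kind can be built by counting `(year, type)` pairs. The Python sketch below uses hypothetical records, and it also finds the year in which each type occurred most often (on a tie, the earliest year is reported).

```python
from collections import Counter

# Hypothetical (year, type of breach) records, one per breach.
records = [
    (2001, "Theft"),
    (2001, "Theft"),
    (2001, "Loss"),
    (2002, "Hacking/IT Incident"),
    (2002, "Theft"),
]

# Count the breaches for every (year, type) combination.
table = Counter(records)
years = sorted({year for year, _ in records})
types = sorted({kind for _, kind in records})

# For each type, find the year with the most breaches of that type.
peak_year = {
    kind: max(years, key=lambda year: table[(year, kind)]) for kind in types
}

for kind in types:
    row = [table[(year, kind)] for year in years]
    print(kind, row, "peak year:", peak_year[kind])
```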

##What is the difference between “High Affect” and “Low Affect”?

## Affected
## High Affect  Low Affect 
##          49        1660

###I defined a breach as “High Affect” if the number of individuals affected was over 100,000 and as “Low Affect” if it was not. As seen, there were far more “Low Affect” breaches (1660) than “High Affect” breaches (49).
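The classification rule can be written as a small function. The Python sketch below uses the 100,000 threshold described above; the sample values are hypothetical, and `affect_level` is an illustrative name.

```python
from collections import Counter

THRESHOLD = 100_000

def affect_level(individuals_affected, threshold=THRESHOLD):
    """Label a breach by whether it affected more than `threshold` individuals."""
    return "High Affect" if individuals_affected > threshold else "Low Affect"

# Hypothetical values for the number of individuals affected.
individuals = [501, 100_000, 100_001, 2_500_000, 1_200]

affect_counts = Counter(affect_level(n) for n in individuals)
print(affect_counts)
```

Because the rule is "over 100,000", a breach affecting exactly 100,000 individuals is labeled “Low Affect”.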

##Show the Web Descriptions for the first 100 breaches

###This table gives the reader some background on what happened in some of these breaches.