This data is obtained from the US Department of Health and Human Services (HHS) and tracks all health care information breaches. This analysis will help you better understand those affected by these data breaches.
| Required Packages: | |
|---|---|
| readr | |
| stringr | |
| ggplot2 | |
| dplyr | |
| lubridate | |

Which data loss types account for the highest percentage of cases?
| % Hacking | % Disposal | % Loss | % Theft | % Disclosure | % Unknown | % Other |
|---|---|---|---|---|---|---|
| 13.7 | 3.86 | 9.19 | 44.8 | 27.8 | 0.761 | 5.50 |

Theft accounts for the largest share of incidents, 44.8%.
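The percentages above could be computed along the following lines. This is a minimal sketch, not the original code: the toy rows and the exact breach-type strings are assumptions made for illustration, while the indicator names (`hackIT`, `theft`, `loss`, `unauth`) follow the columns that appear in the output further down. Because a single report can list several breach types, the shares can add up to more than 100%.

```r
library(dplyr)
library(stringr)

# Hypothetical toy rows; the real data come from the HHS breach report.
breaches <- tibble(
  `Type of Breach` = c(
    "Theft",
    "Hacking/IT Incident",
    "Theft, Loss",
    "Unauthorized Access/Disclosure"
  )
)

# Flag each breach type with a 0/1 indicator; one row can match several types.
flagged <- breaches %>%
  mutate(
    hackIT = as.numeric(str_detect(`Type of Breach`, "Hacking")),
    theft  = as.numeric(str_detect(`Type of Breach`, "Theft")),
    loss   = as.numeric(str_detect(`Type of Breach`, "Loss")),
    unauth = as.numeric(str_detect(`Type of Breach`, "Unauthorized"))
  )

# Share of rows carrying each flag, expressed as a percentage.
pct <- flagged %>%
  summarise(
    `% Hacking`    = 100 * mean(hackIT),
    `% Theft`      = 100 * mean(theft),
    `% Loss`       = 100 * mean(loss),
    `% Disclosure` = 100 * mean(unauth)
  )
pct
```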
Breaches increased until 2015, when we saw a slight decrease.
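A yearly trend like this one can be checked by counting reports per submission year. The sketch below assumes a `Breach Submission Date` column of class `Date`, as shown in the output in this report; the four dates are toy values and the chart type is a guess, since the original plot is not reproduced here.

```r
library(dplyr)
library(lubridate)
library(ggplot2)

# Hypothetical submission dates, for illustration only.
breaches <- tibble(
  `Breach Submission Date` = as.Date(
    c("2013-05-01", "2014-02-10", "2014-09-10", "2015-03-13")
  )
)

# Count breaches per calendar year of submission.
per_year <- breaches %>%
  mutate(year = year(`Breach Submission Date`)) %>%
  count(year, name = "Breaches")

# Bar chart of the yearly counts; it is drawn when `p` is printed.
p <- ggplot(per_year, aes(x = year, y = Breaches)) +
  geom_col()
```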
| Name of Covered Entity | State | Covered Entity Type | Individuals Affected | Breach Submission Date | Type of Breach | Location of Breached Information | Business Associate Present | Web Description |
|---|---|---|---|---|---|---|---|---|
| Anthem, Inc… | IN | Health… | 7.88e7 | 2015-03-13 | Hackin… | Networ… | No | `"On Fe…` |
| Science App… | VA | Busine… | 4.9e6 | 2011-11-04 | Loss | Other | Yes | `"\\N"` |
| Advocate He… | IL | Health… | 4.03e6 | 2013-08-23 | Theft | Deskto… | No | `"Advoc…` |
| 21st Centur… | FL | Health… | 2.21e6 | 2016-03-04 | Hackin… | Networ… | No | `"Failu…` |
| Xerox State… | TX | Busine… | 2e6 | 2014-09-10 | Unauth… | Deskto… | Yes | `"\\N"` |
| IBM | NY | Busine… | 1.9e6 | 2011-04-14 | Unknown | Other | Yes | `"\\N"` |
| GRM Informa… | NJ | Busine… | 1.7e6 | 2011-02-11 | Theft | Electr… | Yes | `"Unenc…` |
| AvMed, Inc. | FL | Health… | 1.22e6 | 2010-06-03 | Theft | Laptop | No | `"Two l…` |
| Montana Dep… | MT | Health… | 1.06e6 | 2014-07-07 | Hackin… | Networ… | No | `"Monta…` |
| The Nemours… | FL | Health… | 1.06e6 | 2011-10-07 | Loss | Other | No | `"A loc…` |

… with 15 more rows (25 rows and 16 variables in total) and 7 more variables, all numeric: hackIT, dispo, loss, theft, unauth, unknown, other.

Indiana has the largest number of affected individuals.
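One way to rank states is to sum `Individuals Affected` within each `State` and sort in descending order. The following is a minimal sketch under that assumption; the four rows and their counts are invented for illustration and are not taken from the HHS data.

```r
library(dplyr)

# Hypothetical toy rows, for illustration only.
breaches <- tibble(
  State = c("IN", "FL", "FL", "TX"),
  `Individuals Affected` = c(500, 120, 80, 150)
)

# Total individuals affected per state, largest first.
by_state <- breaches %>%
  group_by(State) %>%
  summarise(Affected = sum(`Individuals Affected`, na.rm = TRUE)) %>%
  arrange(desc(Affected))
by_state
```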
There is no consistent trend in which months see the most breaches.
| Covered Entity Type | Breaches |
|---|---|
| Business Associate | 285 |
| Health Plan | 200 |
| Healthcare Clearing House | 4 |
| Healthcare Provider | 1220 |

Healthcare Providers account for the most breaches, 1,220 of the 1,709 reported.
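A tally like this is a one-line `count()` on the `Covered Entity Type` column. The sketch below uses toy rows (an assumption) and names the result column `Breaches` to match the output in this report.

```r
library(dplyr)

# Hypothetical toy rows, for illustration only.
breaches <- tibble(
  `Covered Entity Type` = c(
    "Healthcare Provider", "Health Plan",
    "Healthcare Provider", "Business Associate"
  )
)

# Number of breaches per covered entity type.
by_type <- breaches %>%
  count(`Covered Entity Type`, name = "Breaches")
by_type
```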
| day | Breaches |
|---|---|
| Sun | 19 |
| Mon | 286 |
| Tue | 281 |
| Wed | 282 |
| Thu | 300 |
| Fri | 512 |
| Sat | 29 |

There are fewer reports on weekends, likely because businesses are not usually operating. Monday through Thursday are consistent (281 to 300 reports each), but Friday has the most reports (512).
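The ordered `day` column in the output is consistent with `lubridate::wday(label = TRUE)`, although the original code is not shown. A minimal sketch under that assumption, using a handful of submission dates:

```r
library(dplyr)
library(lubridate)

# A few submission dates, used here only as a small example.
breaches <- tibble(
  `Breach Submission Date` = as.Date(
    c("2015-03-13", "2014-09-10", "2011-11-04", "2014-07-07")
  )
)

# Label each date with its weekday (an ordered factor starting on Sunday
# under lubridate's default settings), then count reports per weekday.
by_day <- breaches %>%
  mutate(day = wday(`Breach Submission Date`, label = TRUE)) %>%
  count(day, name = "Breaches")
by_day
```

Weekday labels depend on the session locale, so the abbreviations may differ from `Sun` through `Sat` on non-English systems.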
| year | biz | care |
|---|---|---|
| 2013 | 64 | 187 |
| 2014 | 67 | 179 |

2013 and 2014 were the only years with more than 50 Business Associate breaches and more than 150 Healthcare Provider breaches.
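A filter of this kind can be expressed as a grouped summary followed by `filter()`. The sketch below is an assumption about the approach: it reuses the `biz` and `care` column names from the output, but the rows are toy data and the thresholds (`biz_min`, `care_min`, both hypothetical names) are scaled down so the example stays small. The text above uses 50 and 150.

```r
library(dplyr)

# Hypothetical toy rows, for illustration only.
breaches <- tibble(
  year = c(2013, 2013, 2013, 2014, 2014),
  `Covered Entity Type` = c(
    "Business Associate", "Healthcare Provider", "Healthcare Provider",
    "Business Associate", "Health Plan"
  )
)

# Toy thresholds; the report's question uses 50 and 150 instead.
biz_min  <- 0
care_min <- 1

# Count both entity types per year, then keep years above both thresholds.
by_year <- breaches %>%
  group_by(year) %>%
  summarise(
    biz  = sum(`Covered Entity Type` == "Business Associate"),
    care = sum(`Covered Entity Type` == "Healthcare Provider")
  ) %>%
  filter(biz > biz_min, care > care_min)
by_year
```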
| Breach Submission Date | hackIT | dispo | loss | theft | unauth | unknown | other | year |
|---|---|---|---|---|---|---|---|---|
| 2018-03-12 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 2018 |
| 2017-12-29 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 2017 |
| 2017-12-21 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 2017 |
| 2017-12-05 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 2017 |
| 2017-12-05 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 2017 |
| 2017-11-13 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 2017 |
| 2017-10-24 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 2017 |
| 2017-10-23 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 2017 |
| 2017-10-20 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 2017 |
| 2017-10-20 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 2017 |

… with 1,699 more rows (1,709 rows in total, grouped by year into 10 groups).

a.) How many times does data get breached from a laptop when an associate is present? I am asking this to understand how mishandling company laptops can lead to a data breach.
| AssociatePresent |
|---|
| 52 |

A business associate was present in 52 breaches that occurred through a laptop. Did these employees face criminal charges?
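A count like this can be obtained by filtering on both conditions and counting the remaining rows. The sketch below assumes that laptop breaches are identified by the string "Laptop" in `Location of Breached Information` and that `Business Associate Present` holds "Yes" or "No", as the output in this report suggests; the rows themselves are toy data.

```r
library(dplyr)
library(stringr)

# Hypothetical toy rows, for illustration only.
breaches <- tibble(
  `Location of Breached Information` = c(
    "Laptop", "Network Server", "Laptop, Email", "Laptop"
  ),
  `Business Associate Present` = c("Yes", "Yes", "No", "Yes")
)

# Keep laptop breaches with a business associate present, then count them.
laptop_assoc <- breaches %>%
  filter(
    str_detect(`Location of Breached Information`, "Laptop"),
    `Business Associate Present` == "Yes"
  ) %>%
  summarise(AssociatePresent = n())
laptop_assoc
```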
b.) Which of these words regarding legal action appear most in the web description? I am curious to know if sanctions occur and how they talk about protected documents.
| sanctions | counsel | protected |
|---|---|---|
| 173 | 40 | 1006 |

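"Protected" is by far the most frequent of the three terms (1,006 matches), followed by "sanctions" (173) and "counsel" (40). The 173 matches for "sanctions" suggest that a comparable number of breaches resulted in sanctions.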
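Term counts of this kind can be produced with `str_detect()` on the `Web Description` column. The sketch below is one possible reading, not the original code: it counts the number of descriptions that mention each word at least once, ignoring case, and the four descriptions are invented for illustration. Whether the figures above count descriptions or individual occurrences is not stated in the report.

```r
library(dplyr)
library(stringr)

# Hypothetical descriptions, for illustration only; "\\N" marks a missing one.
breaches <- tibble(
  `Web Description` = c(
    "The covered entity applied sanctions and retrained staff.",
    "Protected health information was on a stolen laptop.",
    "\\N",
    "Outside counsel reviewed the incident; sanctions followed."
  )
)

# Number of descriptions mentioning each term, case-insensitively.
word_counts <- breaches %>%
  summarise(
    sanctions = sum(str_detect(`Web Description`, regex("sanctions", ignore_case = TRUE))),
    counsel   = sum(str_detect(`Web Description`, regex("counsel", ignore_case = TRUE))),
    protected = sum(str_detect(`Web Description`, regex("protected", ignore_case = TRUE)))
  )
word_counts
```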