This dataset covers school shootings in the United States from 1999 to 2018.

Here is a preview of the data. It contains a lot of information about the shooter and any accomplice, and about the school: school type, location, time, enrollment, grades and so on. At the time the data was published, there had been 217 school shootings since 1999.

glimpse(shoot)
## Observations: 217
## Variables: 50
## $ uid                              <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10...
## $ nces_school_id                   <chr> "080480000707", "220054000422...
## $ school_name                      <chr> "Columbine High School", "Sco...
## $ nces_district_id                 <int> 804800, 2200540, 1304410, 421...
## $ district_name                    <chr> "Jefferson County R-1", "East...
## $ date                             <chr> "4/20/1999", "4/22/1999", "5/...
## $ school_year                      <chr> "1998-1999", "1998-1999", "19...
## $ year                             <int> 1999, 1999, 1999, 1999, 1999,...
## $ time                             <chr> "11:19 AM", "12:30 PM", "8:03...
## $ day_of_week                      <chr> "Tuesday", "Thursday", "Thurs...
## $ city                             <chr> "Littleton", "Baton Rouge", "...
## $ state                            <chr> "Colorado", "Louisiana", "Geo...
## $ school_type                      <chr> "public", "public", "public",...
## $ enrollment                       <chr> "1965", "588", "1,369", "3147...
## $ killed                           <int> 13, 0, 0, 0, 0, 1, 0, 1, 0, 0...
## $ injured                          <int> 21, 1, 6, 1, 1, 0, 5, 0, 0, 1...
## $ casualties                       <int> 34, 1, 6, 1, 1, 1, 5, 1, 0, 1...
## $ shooting_type                    <chr> "indiscriminate", "targeted",...
## $ age_shooter1                     <int> 18, 14, 15, 17, NA, 12, 13, 1...
## $ gender_shooter1                  <chr> "m", "m", "m", "m", "m", "m",...
## $ race_ethnicity_shooter1          <chr> "w", "", "w", "", "", "h", "a...
## $ shooter_relationship1            <chr> "student", "former student (e...
## $ shooter_deceased1                <int> 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,...
## $ deceased_notes1                  <chr> "suicide", "", "", "", "", ""...
## $ age_shooter2                     <int> 17, NA, NA, NA, NA, NA, NA, N...
## $ gender_shooter2                  <chr> "m", "", "", "", "", "", "", ...
## $ race_ethnicity_shooter2          <chr> "w", "", "", "", "", "", "", ...
## $ shooter_relationship2            <chr> "student", "", "", "", "", ""...
## $ shooter_deceased2                <int> 1, NA, NA, NA, NA, NA, NA, NA...
## $ deceased_notes2                  <chr> "suicide", "", "", "", "", ""...
## $ white                            <chr> "1783", "5", "1189", "209", "...
## $ black                            <int> 16, 583, 136, 2736, 755, 6, 3...
## $ hispanic                         <int> 112, 0, 28, 27, 287, 583, 12,...
## $ asian                            <int> 42, 0, 15, 170, 29, 2, 0, 26,...
## $ american_indian_alaska_native    <int> 12, 0, 1, 5, 5, 2, 153, 5, 1,...
## $ hawaiian_native_pacific_islander <int> NA, NA, NA, NA, NA, NA, NA, N...
## $ two_or_more                      <int> NA, NA, NA, NA, NA, NA, NA, N...
## $ resource_officer                 <int> 1, 0, 1, 1, 0, 0, 0, 1, 0, 0,...
## $ weapon                           <chr> "12-gauge Savage-Springfield ...
## $ weapon_source                    <chr> "purchased from friends", "",...
## $ lat                              <dbl> 39.60391, 30.52996, 33.62692,...
## $ long                             <dbl> -105.07500, -91.16997, -84.04...
## $ staffing                         <dbl> 89.600, 39.000, 84.000, 41.00...
## $ low_grade                        <chr> "9", "6", "9", "9", "9", "6",...
## $ high_grade                       <chr> "12", "8", "12", "12", "12", ...
## $ lunch                            <int> 41, 495, 125, 2007, 543, 502,...
## $ county                           <chr> "Jefferson County", "East Bat...
## $ state_fips                       <int> 8, 22, 13, 42, 25, 35, 40, 12...
## $ county_fips                      <int> 8059, 22033, 13247, 42101, 25...
## $ ulocale                          <int> 21, 12, 21, 11, 11, 33, 32, 2...
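
The preview shows that some numeric and date columns were read as character: `enrollment` and `white` (one value contains a thousands separator, "1,369") and `date`. Below is a minimal sketch of one way to convert them before analysis. It uses a small stand-in data frame `toy` built from values visible in the preview rather than the real `shoot`, and the names `date_parsed`, `enrollment_num` and `white_num` are hypothetical.

```r
library(dplyr)

# Toy stand-in for `shoot`, built from values visible in the glimpse() preview.
# The third date is truncated in the preview, so it is left as NA here.
toy <- tibble(
  date       = c("4/20/1999", "4/22/1999", NA),
  enrollment = c("1965", "588", "1,369"),
  white      = c("1783", "5", "1189")
)

toy_clean <- toy %>%
  mutate(
    # dates are stored as month/day/year strings
    date_parsed    = as.Date(date, format = "%m/%d/%Y"),
    # strip thousands separators before converting to integer
    enrollment_num = as.integer(gsub(",", "", enrollment)),
    white_num      = as.integer(gsub(",", "", white))
  )

print(toy_clean)
```

The same `mutate()` call could be applied to `shoot` itself, assuming the remaining values follow the formats seen in the preview.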

Deadliest school shootings.

The Sandy Hook Elementary School massacre tops the list. Unsurprisingly, the primary shooter in each of the ten deadliest shootings was male.

shoot %>% arrange(desc(killed)) %>% select(school_name, gender_shooter1, killed, year, city, state)
## # A tibble: 217 x 6
##    school_name             gender_shooter1 killed  year city      state   
##    <chr>                   <chr>            <int> <int> <chr>     <chr>   
##  1 Sandy Hook Elementary ~ m                   26  2012 Newtown   Connect~
##  2 Marjory Stoneman Dougl~ m                   17  2018 Parkland  Florida 
##  3 Columbine High School   m                   13  1999 Littleton Colorado
##  4 Red Lake High School    m                    7  2005 Red Lake  Minneso~
##  5 West Nickel Mines Amis~ m                    5  2006 Nickel M~ Pennsyl~
##  6 Marysville Pilchuck Hi~ m                    4  2014 Marysvil~ Washing~
##  7 Chardon High School     m                    3  2012 Chardon   Ohio    
##  8 Santana High School     m                    2  2001 Santee    Califor~
##  9 Rocori High School      m                    2  2003 Cold Spr~ Minneso~
## 10 North Park Elementary ~ m                    2  2017 San Bern~ Califor~
## # ... with 207 more rows

Plot of all the school shootings since 1999.

From the previous output we know that only three school shootings resulted in more than 10 deaths: Sandy Hook, Marjory Stoneman Douglas and Columbine, which are in Newtown, Parkland and Littleton respectively. The map labels the cities of the ten deadliest shootings.

# `US` must be a polygon data frame with long, lat and group columns
ggplot() +
  geom_polygon(data = US, aes(x = long, y = lat, group = group),
               fill = "grey", alpha = 0.4) +
  geom_point(data = shoot, aes(x = long, y = lat, colour = killed), size = 3) +
  theme_void() +
  scale_color_viridis(direction = -1) +
  # label the cities of the ten deadliest shootings
  geom_text_repel(data = shoot %>% arrange(desc(killed)) %>% head(10),
                  aes(x = long, y = lat, label = city), size = 3) +
  ylim(10, 60) + xlim(-130, -40) + coord_map()
## Warning: Removed 2 rows containing missing values (geom_point).
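
The warning means two points were not drawn. With `xlim()` and `ylim()` set, ggplot2 drops points whose coordinates are missing or fall outside the limits. Below is a minimal sketch of how such rows might be found, using the same limits as the plot. The data frame `toy` is a stand-in for `shoot`, and the two "Hypothetical" rows are invented purely for illustration.

```r
library(dplyr)

# Toy stand-in for `shoot`; only the Littleton row comes from the data,
# the other two rows are invented to illustrate the two failure modes.
toy <- tibble(
  city = c("Littleton", "Hypothetical A", "Hypothetical B"),
  lat  = c(39.60391, 21.3, NA),
  long = c(-105.07500, -157.8, -91.2)
)

# Rows with missing coordinates or outside ylim(10, 60) / xlim(-130, -40)
dropped <- toy %>%
  filter(is.na(lat) | is.na(long) |
           lat < 10 | lat > 60 |
           long < -130 | long > -40)

print(dropped)
```

Running the same filter on `shoot` should list the rows behind the warning, assuming nothing else is causing points to be removed.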

Shootings with a male primary shooter account for over 97% of deaths

# deaths by gender of the primary shooter; p is the share of all deaths
shoot %>% filter(gender_shooter1 != "") %>% group_by(gender_shooter1) %>%
  summarise(n = sum(killed)) %>% mutate(p = n / sum(n))
## # A tibble: 2 x 3
##   gender_shooter1     n      p
##   <chr>           <int>  <dbl>
## 1 f                   3 0.0234
## 2 m                 125 0.977
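
Note that `n` above sums deaths, so `p` is a share of deaths, not a share of shooters. To get the share of shootings by gender, the rows can be counted instead. Below is a minimal sketch that computes both side by side. The data frame `toy` and its values are made up for illustration, and the column names `incidents`, `p_incidents` and `p_killed` are hypothetical.

```r
library(dplyr)

# Made-up stand-in for `shoot`; one row per shooting.
toy <- tibble(
  gender_shooter1 = c("m", "m", "m", "f", ""),
  killed          = c(13L, 0L, 1L, 0L, 2L)
)

by_gender <- toy %>%
  filter(gender_shooter1 != "") %>%      # drop rows with unknown gender
  group_by(gender_shooter1) %>%
  summarise(incidents = n(),             # number of shootings
            killed    = sum(killed)) %>% # number of deaths
  mutate(p_incidents = incidents / sum(incidents),
         p_killed    = killed / sum(killed))

print(by_gender)
```

The two shares can differ considerably, because a handful of shootings account for most of the deaths.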

Most shooters are young, with an average age of about 19. The table below shows deaths by the age of the primary shooter.

# deaths by age of the primary shooter; p is the share of all deaths
shoot %>% filter(gender_shooter1 != "") %>% group_by(age_shooter1) %>%
  summarise(n = sum(killed)) %>% mutate(p = n / sum(n)) %>%
  arrange(desc(n)) %>% na.omit()
## # A tibble: 32 x 3
##    age_shooter1     n      p
##           <int> <int>  <dbl>
##  1           20    27 0.211 
##  2           19    20 0.156 
##  3           15    17 0.133 
##  4           18    16 0.125 
##  5           16     8 0.0625
##  6           17     7 0.0547
##  7           32     5 0.0391
##  8           14     4 0.0312
##  9           21     3 0.0234
## 10           53     3 0.0234
## # ... with 22 more rows
mean(shoot$age_shooter1, na.rm = TRUE)
## [1] 19.20904
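
The table includes shooters aged 32 and 53, and a mean is sensitive to such values, so it may be worth checking the median as well. Below is a minimal sketch using only the first few ages visible in the preview, not the full column, so the numbers it produces are not the dataset's statistics. The vector name `ages` is hypothetical.

```r
# First seven values of age_shooter1 as shown in the glimpse() preview
ages <- c(18L, 14L, 15L, 17L, NA, 12L, 13L)

# na.rm = TRUE skips shooters whose age is unknown
age_mean   <- mean(ages, na.rm = TRUE)
age_median <- median(ages, na.rm = TRUE)

print(c(mean = age_mean, median = age_median))
```

On the real data the same two calls would take `shoot$age_shooter1` in place of `ages`.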