Definition
An instance in which a gun is brandished or fired, or a bullet hits school property, for any reason, regardless of the number of victims, time of day, or day of week. The scope is K-12 schools.
Data Source
https://www.chds.us/ssdb/dataset/
- The K-12 School Shooting Database is designed to be as inclusive as possible so that the end user is empowered to filter the information based on their specific area of focus. For example, someone who is interested in only studying teen suicides in schools, bullying, or gang violence can filter the data specifically for those types of school shooting incidents. A researcher who wants to look at incidents involving just current students can filter out all of the shootings on school property committed by parents, teachers, and other non-students. While this database uses this very expansive definition of a school shooting, the authors were very careful to design a format that would allow other researchers to query the database using different definitions, in order to avoid semantic disputes.
The database is available for download as a CSV file. The School Shooting Database Project is conducted as part of the Advanced Thinking in Homeland Security (HSx) program at the Naval Postgraduate School’s Center for Homeland Defense and Security (CHDS).
The database compiles information from more than 25 different sources, including peer-reviewed studies, government reports, mainstream media, non-profits, private websites, blogs, and crowd-sourced lists, that have been analyzed, filtered, deconflicted, and cross-referenced. All of the information is based on open-source information and third-party reporting.
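The kind of filtering described above can be sketched in a few lines of dplyr. This is only an illustration: the `incidents` table and its rows are made up, while the column name and the values "Student", "Parent", and "Teacher" are taken from the dataset's own fields as they appear later in this report.

```r
library(dplyr)

# Toy stand-in for the downloaded CSV; these rows are invented for illustration.
incidents <- tibble(
  School = c("School A", "School B", "School C", "School D"),
  `Shooter's Affiliation with School` = c("Student", "Parent", "Student", "Teacher")
)

# Keep only incidents where the shooter was a current student,
# dropping parents, teachers, and other non-students.
students_only <- incidents %>%
  filter(`Shooter's Affiliation with School` == "Student")

nrow(students_only)
```

The same pattern applies to any other column, which is how a narrower definition of "school shooting" could be imposed on the inclusive dataset.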
Data source: Advanced Thinking in Homeland Security (HSx)
Advanced Thinking in Homeland Security (HSx) is an 18-month collaborative program from the Center for Homeland Defense and Security. It is designed to build our knowledge and create new paradigms for the future security challenges facing the nation and our global community. The program is not intended to forecast or predict one or more futures in which its participants will learn to operate; instead, participants will be taught, discover, and create the skills and abilities needed to lead and thrive in an environment that is unknown, complex, chaotic, and evolving exponentially.
Data source: the Center for Homeland Defense and Security at the Naval Postgraduate School
The Center for Homeland Defense and Security (CHDS) is located at the Naval Postgraduate School in Monterey, CA. Since 2003, CHDS has conducted a wide range of programs focused on assisting current and emerging leaders in Homeland Defense and Security to develop the policies, strategies, programs and organizational elements needed to defeat terrorism and prepare for and respond to natural disasters and public safety threats across the United States. The programs are developed in partnership with and are sponsored by the National Preparedness Directorate, FEMA.
Loading the dataset and exploratory data analysis
## [1] "Date"
## [2] "School"
## [3] "City"
## [4] "State"
## [5] "Reliability Score (1-5)"
## [6] "Killed (includes shooter)"
## [7] "Wounded"
## [8] "Total Injured/Killed Victims"
## [9] "Gender of Victims (M/F/Both)"
## [10] "Victim's Affiliation w/ School"
## [11] "Victim's age(s)"
## [12] "Victims Race"
## [13] "Victim Ethnicity"
## [14] "Targeted Specific Victim(s)"
## [15] "Random Victims"
## [16] "Bullied (Y/N/ N/A)"
## [17] "Domestic Violence (Y/N)"
## [18] "Suicide (Shooter was only victim) Y/N/ N/A"
## [19] "Suicide (shot self immediately following initial shootings) Y/N/ N/A"
## [20] "Suicide (e.g., shot self at end of incident - time period between first shots and suicide, different location, when confronted by police) Y/N/ N/A"
## [21] "Suicide (or attempted suicide) by Shooter (Y/N)"
## [22] "Shooter's actions immediately after shots fired"
## [23] "Pre-planned school attack"
## [24] "Summary"
## [25] "Category"
## [26] "School Type"
## [27] "Narrative (Detailed Summary/ Background)"
## [28] "Sources"
## [29] "Time of Occurrence (12 hour AM/PM)"
## [30] "Duration (minutes)"
## [31] "Day of week (formula)"
## [32] "During School Day (Y/N)"
## [33] "Time Period"
## [34] "Location"
## [35] "Number of Shots Fired"
## [36] "Firearm Type"
## [37] "Number of Shooters"
## [38] "Shooter Name"
## [39] "Shooter Age"
## [40] "Shooter Gender"
## [41] "Race"
## [42] "Shooter Ethnicity"
## [43] "Shooter's Affiliation with School"
## [44] "Shooter had an accomplice who did not fire gun (Y/N)"
## [45] "Hostages Taken (Y/N)"
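The column listing above has no `Year` or `Month` field, yet the grouping code in the following sections relies on both, so they were presumably derived from `Date` in a step that is not shown. A minimal sketch of one way to do that, assuming `Date` is (or can be parsed as) a date column; the three dates are taken from the California tables later in this report, and the CSV path in the comment is hypothetical:

```r
library(dplyr)

# In the real analysis the data would come from the downloaded CSV, e.g.
#   school <- readr::read_csv(path_to_csv)   # path_to_csv is hypothetical
# Here a three-row stand-in keeps the example self-contained.
school <- tibble(Date = as.Date(c("2018-02-01", "2018-09-20", "2019-01-07")))

# Derive the Year and Month columns used by the grouping code.
school <- school %>%
  mutate(
    Year  = as.integer(format(Date, "%Y")),
    Month = as.integer(format(Date, "%m"))
  )

school
```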
Segregate the dataset by year and month
Grouping by Year
by_date <- school %>%
group_by(Date) %>%
summarise(Count=n())
by_year <- school %>%
group_by(Year) %>%
summarise(Count=n())
by_year <- filter(by_year,Year<2019)
ggplot(data = by_year, aes(x=Year,y=Count, width=.6))+
geom_bar(stat = "identity", fill="blue")+
geom_smooth(color="red")+
geom_text(data=by_year, aes(x=Year, y=Count, label=paste0(Count),hjust=-1,angle=90))+
theme(axis.text.x = element_text(angle = 90, hjust = .5))+
labs(title="Count of Shooting Incidents ", subtitle = "From 1970 to 2018")+
ylim(0, 130)
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

kable(by_year) %>%
kable_styling(bootstrap_options = c("striped", "hover"))
| Year | Count |
| --- | --- |
| 1970 | 19 |
| 1971 | 21 |
| 1972 | 18 |
| 1973 | 18 |
| 1974 | 16 |
| 1975 | 14 |
| 1976 | 11 |
| 1977 | 16 |
| 1978 | 16 |
| 1979 | 14 |
| 1980 | 20 |
| 1981 | 17 |
| 1982 | 18 |
| 1983 | 25 |
| 1984 | 25 |
| 1985 | 19 |
| 1986 | 16 |
| 1987 | 24 |
| 1988 | 38 |
| 1989 | 16 |
| 1990 | 15 |
| 1991 | 31 |
| 1992 | 30 |
| 1993 | 42 |
| 1994 | 37 |
| 1995 | 19 |
| 1996 | 21 |
| 1997 | 20 |
| 1998 | 28 |
| 1999 | 22 |
| 2000 | 28 |
| 2001 | 25 |
| 2002 | 19 |
| 2003 | 31 |
| 2004 | 35 |
| 2005 | 47 |
| 2006 | 59 |
| 2007 | 43 |
| 2008 | 35 |
| 2009 | 31 |
| 2010 | 15 |
| 2011 | 13 |
| 2012 | 18 |
| 2013 | 33 |
| 2014 | 43 |
| 2015 | 35 |
| 2016 | 45 |
| 2017 | 45 |
| 2018 | 110 |
Year = 2018
Incidents from 2010 to 2018
school_2010 <- filter(school,Year>2009 & Year <2019)
type <- school_2010 %>%
group_by(`School Type`) %>%
summarise(Count=n())
colnames(type) <- c("school_type","Count")
ggplot(data = type, aes(x=reorder(school_type,-Count),y=Count))+
geom_bar(stat = "identity",fill="blue")+
geom_text(data=type, aes(x=reorder(school_type,-Count), y=Count,
label=paste0(Count),vjust=-1,angle=0))+
theme(axis.text.x = element_text(angle = 45, hjust = .5))+
ylim(0, 250)+
labs(title = "Incident by School Types", subtitle = "from 2010 to 2018")

by_month <- school_2010 %>%
group_by(Month) %>%
summarise(Count=n())
ggplot(data = by_month, aes(x=as.factor(Month),y=Count))+
geom_bar(stat = "identity",fill="maroon")+
geom_text(data=by_month, aes(x=as.factor(Month), y=Count,
label=paste0(Count),vjust=-1,angle=0))+
ylim(0, 100)+
labs(title = "Incidents by Month", subtitle = "from 2010 to 2018")

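Who is shooting at elementary schools?
Caution: the Shooter Age variable has many missing values.
The chart shows only incidents with a confirmed shooter age. Because na.omit() is applied to the whole data frame, rows with a missing value in any other column are excluded as well.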
elementary <- filter(na.omit(school),`School Type`=="Elementary")
elementary$age_grp <- elementary$`Shooter Age`
elementary$age_grp <- ifelse((elementary$`Shooter Age`>= 0& elementary$`Shooter Age`<=9),'minor', elementary$age_grp)
elementary$age_grp <- ifelse((elementary$`Shooter Age`>=10 & elementary$`Shooter Age`<=19),'teen', elementary$age_grp)
elementary$age_grp <- ifelse((elementary$`Shooter Age`>= 20& elementary$`Shooter Age`<=30),'20_30', elementary$age_grp)
elementary$age_grp <- ifelse((elementary$`Shooter Age`>= 31& elementary$`Shooter Age`<=40),'31_40', elementary$age_grp)
elementary$age_grp <- ifelse((elementary$`Shooter Age`>= 41& elementary$`Shooter Age`<=50),'41_50', elementary$age_grp)
elementary$age_grp <- ifelse((elementary$`Shooter Age`>= 51& elementary$`Shooter Age`<=60),'51_60', elementary$age_grp)
elementary$age_grp <- ifelse((elementary$`Shooter Age`>= 61& elementary$`Shooter Age`<=70),'61_70', elementary$age_grp)
shooter_elementary <- elementary %>%
group_by(age_grp) %>%
summarise(Count=n())
shooter_elementary
## # A tibble: 7 x 2
## age_grp Count
## <chr> <int>
## 1 20_30 2
## 2 31_40 2
## 3 41_50 2
## 4 51_60 2
## 5 61_70 1
## 6 minor 1
## 7 teen 5
ggplot(data = shooter_elementary, aes(x=age_grp,y=Count))+
geom_bar(stat = "identity", fill="darkgreen")+
theme(axis.text.x = element_text(angle = 90, hjust = .5))+
labs(title = "Elementary school shooting _ Age of shooter ", subtitle = "1970 to 2019")+
ylim(0,10)

Who is shooting at high schools?
Caution: the Shooter Age variable has many missing values.
The chart shows only incidents with a confirmed shooter age.
high <- filter(na.omit(school),`School Type`=="High")
high$age_grp <- high$`Shooter Age`
high$age_grp <- ifelse((high$`Shooter Age`>= 0& high$`Shooter Age`<=9),'minor', high$age_grp)
high$age_grp <- ifelse((high$`Shooter Age`>=10 & high$`Shooter Age`<=19),'teen', high$age_grp)
high$age_grp <- ifelse((high$`Shooter Age`>= 20& high$`Shooter Age`<=30),'20_30', high$age_grp)
high$age_grp <- ifelse((high$`Shooter Age`>= 31& high$`Shooter Age`<=40),'31_40', high$age_grp)
high$age_grp <- ifelse((high$`Shooter Age`>= 41& high$`Shooter Age`<=50),'41_50', high$age_grp)
high$age_grp <- ifelse((high$`Shooter Age`>= 51& high$`Shooter Age`<=60),'51_60', high$age_grp)
high$age_grp <- ifelse((high$`Shooter Age`>= 61& high$`Shooter Age`<=70),'61_70', high$age_grp)
shooter_high <- high %>%
group_by(age_grp) %>%
summarise(Count=n())
shooter_high
## # A tibble: 3 x 2
## age_grp Count
## <chr> <int>
## 1 20_30 5
## 2 41_50 1
## 3 teen 81
ggplot(data = shooter_high, aes(x=age_grp,y=Count))+
geom_bar(stat = "identity", fill="darkgreen")+
theme(axis.text.x = element_text(angle = 90, hjust = .5))+
labs(title = "High school shooting _ Age of shooter ", subtitle = "1970 to 2019")+
ylim(0,100)
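The age banding is written out twice above as a chain of `ifelse()` calls. A more compact alternative is base R's `cut()`, wrapped in a helper that both the elementary and high school sections could share. The sketch below is an assumption about how that might look, not the code that produced the charts: `age_group()` is a hypothetical helper, the `over_70` label is added here (the original leaves ages above 70 as raw numbers), and the toy ages are invented. It also filters on `Shooter Age` alone instead of calling `na.omit()` on every column, which would keep more rows and therefore change the counts shown above.

```r
library(dplyr)

# Hypothetical helper: map numeric ages to the bands used in this report.
# With right = TRUE the intervals are (-Inf,9], (9,19], (19,30], and so on,
# which matches the ifelse() ranges for whole-number ages.
age_group <- function(age) {
  cut(
    age,
    breaks = c(-Inf, 9, 19, 30, 40, 50, 60, 70, Inf),
    labels = c("minor", "teen", "20_30", "31_40", "41_50",
               "51_60", "61_70", "over_70"),
    right = TRUE
  )
}

# Toy ages for illustration only.
toy <- tibble(`Shooter Age` = c(6, 15, 17, 25, 45, 74, NA))

toy_counts <- toy %>%
  filter(!is.na(`Shooter Age`)) %>%            # drop only rows with a missing age
  mutate(age_grp = age_group(`Shooter Age`)) %>%
  count(age_grp, name = "Count")

toy_counts
```

Because `cut()` returns an ordered set of factor levels, a bar chart built from `age_grp` would also list the bands in age order instead of alphabetically.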

Injuries & Fatalities
subset <- school %>%
select(c("Year","Total Injured/Killed Victims", "Gender of Victims (M/F/Both)",
"Shooter's Affiliation with School","Shooter Age","Shooter Gender","Firearm Type","State") )
colnames(subset) <- c("Year","Total","Victim_Gender","Affiliation","Shooter_Age","Shooter_Gender","Firearm","State")
#-----------------------------
injury_year <- subset %>%
group_by(Year) %>%
summarise(Count=sum(Total))
ggplot(data = injury_year, aes(x=Year,y=Count))+
geom_bar(stat = "identity", fill="navy")+
geom_text(data=injury_year, aes(x=Year, y=Count, label=paste0(Count),hjust=-1,angle=90))+
theme(axis.text.x = element_text(angle = 90, hjust = .5))+
labs(title = "Count of injuries & Fatalities", subtitle = "from 1970 to present, including year 2019")+
ylim(0,200)

#----------------------------
shooter <- subset %>%
filter(Year>2009) %>%
group_by(Shooter_Age,Shooter_Gender) %>%
summarise(Count=n())
ggplot(data = shooter, aes(x=Shooter_Age,y=Count,fill=Shooter_Gender))+
geom_bar(stat = "identity")+
theme(axis.text.x = element_text(angle = 90, hjust = .5))+
labs(title = "Age and Gender of shooter ", subtitle = "2010 to 2019")+
ylim(0,40)
## Warning: Removed 5 rows containing missing values (position_stack).

kable(shooter) %>%
kable_styling(bootstrap_options = "hover")
| Shooter_Age | Shooter_Gender | Count |
| --- | --- | --- |
| 5 | M | 1 |
| 6 | M | 1 |
| 7 | M | 1 |
| 8 | M | 2 |
| 9 | M | 1 |
| 10 | M | 1 |
| 11 | M | 2 |
| 12 | F | 1 |
| 12 | M | 3 |
| 13 | M | 6 |
| 14 | F | 1 |
| 14 | M | 20 |
| 15 | F | 1 |
| 15 | M | 23 |
| 16 | M | 32 |
| 17 | F | 2 |
| 17 | M | 35 |
| 18 | M | 21 |
| 19 | M | 9 |
| 20 | M | 3 |
| 21 | M | 8 |
| 22 | F | 1 |
| 22 | M | 3 |
| 23 | F | 1 |
| 23 | M | 3 |
| 24 | M | 2 |
| 25 | M | 2 |
| 26 | F | 1 |
| 26 | M | 2 |
| 27 | M | 1 |
| 28 | M | 2 |
| 29 | M | 1 |
| 30 | M | 4 |
| 31 | M | 2 |
| 32 | M | 3 |
| 33 | M | 1 |
| 34 | M | 1 |
| 35 | M | 1 |
| 36 | F | 1 |
| 36 | M | 1 |
| 37 | F | 1 |
| 37 | M | 2 |
| 38 | M | 1 |
| 39 | F | 1 |
| 40 | M | 2 |
| 41 | M | 3 |
| 42 | M | 1 |
| 44 | M | 2 |
| 46 | M | 1 |
| 47 | M | 1 |
| 48 | M | 3 |
| 49 | M | 1 |
| 51 | M | 2 |
| 53 | M | 4 |
| 55 | M | 1 |
| 56 | M | 1 |
| 59 | M | 1 |
| 62 | M | 3 |
| 63 | M | 1 |
| 70 | M | 1 |
| 74 | M | 1 |
| NA | F | 2 |
| NA | M | 104 |
| NA | Multiple | 3 |
| NA | Police Officer/SRO | 8 |
| NA | Unknown | 56 |
weapon <- subset %>%
group_by(Firearm) %>%
summarise(Count=n())
kable(weapon) %>%
kable_styling(bootstrap_options = "hover")
| Firearm | Count |
| --- | --- |
| Combination of Different Types of Weapons | 34 |
| Handgun | 951 |
| Multiple Handguns | 16 |
| Multiple Rifles | 1 |
| Other | 39 |
| Rifle | 75 |
| Shotgun | 49 |
| Unknown | 248 |
ggplot(data = weapon, aes(x=reorder(Firearm,Count),y=Count,fill=Firearm))+
geom_bar(stat = "identity")+
theme(axis.text.x = element_text(angle = 90, hjust = .5))+
coord_flip()+
labs(title = "Weapons", subtitle = "1970 to 2019")+
theme(legend.position = "none")

California State
ca <- filter(school,State=="CA")
ca_date <- ca %>%
group_by(Date) %>%
summarise(Count=n())
ca_year <- ca %>%
filter(Year<2019) %>%
group_by(Year) %>%
summarise(Count=n())
ggplot(data = ca_year, aes(x=Year,y=Count))+
geom_bar(stat = "identity", fill="blue")+
theme(axis.text.x = element_text(angle = 90))+
labs(title="California _ Count of Shooting Incidents ", subtitle = "From 1970 to 2018")+
ylim(0, 15)

Year = 2018 in California
ca_subset <- filter(ca,Year==2018) %>%
select(c("Date", "School", "City","Total Injured/Killed Victims","Firearm Type","Summary",
"Narrative (Detailed Summary/ Background)","Month"))
colnames(ca_subset) <- c("Date","School","City","Casualties","Firearm","Summary","Narrative","Month")
kable(ca_subset) %>%
kable_styling(bootstrap_options = c("striped", "hover"))
| Date | School | City | Casualties | Firearm | Summary | Narrative | Month |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2018-02-01 | Salvador B. Castro Middle School | Los Angeles | 2 | Handgun | Accidental Discharge Inside of Backpack; possible bullying | Gun inside 12 YOF student’s backpack discharged inside the classroom striking two students. Two other students and a teacher suffered minor abrasions. Shooter may have been bullied and showed the gun off to other students that day. | 2 |
| 2018-03-13 | Seaside High School | Seaside | 3 | Handgun | Accidental discharge during teacher’s gun safety demonstration | Teacher (reserve police officer) was showing gun as part of demonstration and firearm discharged in the ceiling. Bullet fragments struck 1 student, ceiling debris hurt 2 students, all have minor injuries. Guns not allowed on school premises in CA. | 3 |
| 2018-05-11 | Highland High School | Palmdale | 1 | Rifle | Shot fired into air during argument between students | Prior to the school day starting (before the SRO’s shift started), the shooter (former student) fired 10 shots from the bathroom window striking one student in the shoulder. Shooter called his father immediately after the shooting to say he fired his gun into the air. Father called a friend who is a police officer and directed him to the shooter’s location. The shooter fled the area and was apprehended at a nearby grocery store. The weapon was found near the school. Unclear if the victim was random or targeted. | 5 |
| 2018-08-31 | Balboa High School | San Francisco | 0 | Handgun | Accidental discharge in classroom, student with gun fled | Student brought gun to school in backpack. Discharged in classroom. Shooter dropped the gun, fled the school and was later arrested when his mother took him to the police station. 2 other students arrested and charged with being accessories. School went on lockdown and police conducted room by room search. Nearby Elementary school and Middle school were also locked down. | 8 |
| 2018-09-09 | Gilroy High School | Gilroy | 0 | Handgun | Officer shot a vehicle driving recklessly on football field | Officer fired at a vehicle driving recklessly around football field during youth football tournament. No injuries. Driver was a former police officer, had made 911 calls, and was allegedly involved in the kidnapping of a woman. He was arrested, no known connection to the school. | 9 |
| 2018-09-20 | Pomona High School | Pomona | 1 | Unknown | Adult make commit suicide on the school campus before classes started | Adult male’s body was found at 9AM near the bleachers on the campus. Police believe the suicide had occurred prior to classes starting that day. | 9 |
| 2018-09-20 | CHAMPS Charter High School | Los Angeles | 2 | Handgun | Shooting at fast food restaurant directly across from school, student and employee struck ran back to school | School employee and a student were stuck when shots were fired in a large group outside a fast food restaurant across the street from the school. Victims returned to the school building where they were treated by first responders. School was locked down. Police later arrested 18YOM and 20YOM. | 9 |
| 2018-11-08 | Cleveland Elementary School | Santa Barbara | 1 | Unknown | Shot during robbery outside of school | Two adult males shot an unidentified adult male during a robbery outside of the school after hours. Two adult male shooters were involved in multiple crimes in the area. | 11 |
ca_city <- ca_subset %>%
group_by(City) %>%
summarise(Total=sum(Casualties))
ggplot(data = ca_city, aes(x=City,y=Total))+
geom_bar(stat = "identity",fill="red")+
theme(axis.text.x = element_text(angle = 90, hjust = .5))+
ylim(0, 7)+
labs(title = "Casualties by City", subtitle = "2018 California",x = "City", y = "Total Injured & Killed")

#-----------------------
ca_month <- ca_subset %>%
group_by(Month) %>%
summarise(Total=sum(Casualties))
ggplot(data = ca_month, aes(x=as.factor(Month),y=Total))+
geom_bar(stat = "identity",fill="brown")+
theme(axis.text.x = element_text(angle = 0, hjust = .5))+
labs(title = "Casualties by month", subtitle = "2018 California",x = "Month", y = "Casualty count")
# Year = 2019
ca_2019 <- filter(ca,Year==2019) %>%
select(c("Date", "School", "City","Total Injured/Killed Victims","Firearm Type","Summary",
"Narrative (Detailed Summary/ Background)","Month"))
colnames(ca_2019) <- c("Date","School","City","Casualties","Firearm","Summary","Narrative","Month")
kable(ca_2019, caption = " 2019 California") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
2019 California
| Date | School | City | Casualties | Firearm | Summary | Narrative | Month |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2019-01-07 | Central Elementary School | Belmont | 1 | Unknown | 17YOM high school student shot in elementary school parking lot | 17YOM varsity athlete shot and killed in elementary school parking lot. Shooter fled the scene. Shooting occurred at night when the school was closed. School was closed following day while police searched for shooter in the neighborhood. | 1 |
| 2019-07-19 | Monroe Clark Middle School | San Diego | 0 | Other | Teen fired BB’s at school building then fled | Teen in sweatshirt was seen firing BB’s at the school building. School was locked down. Teen fled the area and police were unable to locate a suspect. School was locked down for 1 hour. | 7 |
| 2019-08-27 | Hollenbeck Middle School | Los Angeles | 1 | Unknown | Student hit by bullet in lunch line. Shot fired from off campus. | Student standing in lunch line felt sharp pain in jaw and went to nurses office. X-ray later showed bullet in students neck. Shot was fired from off campus. No other students were hurt. School did not report shooting to police until 3pm (occured at 11 AM). Parents were not notified that student was shot until 9/12/2019. No suspect or motive. | 8 |
Which California cities experience the most incidents?
ca_cities <- ca %>%
group_by(City) %>%
summarise(Count=n()) %>%
top_n(10,Count) %>%
arrange(desc(Count))
kable(ca_cities, caption = "California Cities with high incidents") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
California Cities with high incidents
| City | Count |
| --- | --- |
| Los Angeles | 45 |
| Sacramento | 7 |
| Compton | 5 |
| Oakland | 5 |
| Merced | 4 |
| Oxnard | 4 |
| San Diego | 4 |
| San Francisco | 4 |
| Fresno | 3 |
| Long Beach | 3 |
| Pomona | 3 |
| San Jose | 3 |
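The table above lists 12 cities even though the code asks for the top 10. That is how `top_n()` treats ties: four cities share a count of 3, so all of them are kept. The sketch below reproduces the effect with the counts from the table and shows `slice_max()` (available in dplyr 1.0.0 and later) as one way to cap the result at exactly 10 rows. The final row is a hypothetical placeholder for a lower-count city, added only so that something is actually filtered out.

```r
library(dplyr)

# City counts copied from the table above; the last row is a hypothetical
# placeholder, not a real entry from the dataset.
ca_cities_all <- tibble(
  City = c("Los Angeles", "Sacramento", "Compton", "Oakland", "Merced",
           "Oxnard", "San Diego", "San Francisco", "Fresno", "Long Beach",
           "Pomona", "San Jose", "(hypothetical city)"),
  Count = c(45, 7, 5, 5, 4, 4, 4, 4, 3, 3, 3, 3, 2)
)

# top_n() keeps every row tied at the cutoff, so more than 10 rows come back.
with_ties <- ca_cities_all %>% top_n(10, Count)

# slice_max() with with_ties = FALSE returns exactly 10 rows,
# breaking the tie by row order.
exactly_ten <- ca_cities_all %>% slice_max(Count, n = 10, with_ties = FALSE)

nrow(with_ties)    # 12
nrow(exactly_ten)  # 10
```

Which behavior is preferable depends on the question: keeping ties avoids an arbitrary cutoff among equally ranked cities, while a fixed row count gives a predictable table size.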
#--------------------------------
ca <- select(ca, c("Date", "School", "City","Total Injured/Killed Victims","Firearm Type","Summary",
"Narrative (Detailed Summary/ Background)","Month"))
colnames(ca) <- c("Date","School","City","Casualties","Firearm","Summary","Narrative","Month")
ca_casualty <- ca %>%
group_by(City) %>%
summarise(Total=sum(Casualties)) %>%
top_n(10,Total) %>%
arrange(desc(Total))
kable(ca_casualty, caption = "California cities with high number of injuries and deaths") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
California cities with high number of injuries and deaths
| City | Total |
| --- | --- |
| Los Angeles | 98 |
| Stockton | 38 |
| Rancho Tehama Reserve | 24 |
| Santee | 15 |
| Olivehurst | 14 |
| San Diego | 13 |
| Norco | 12 |
| Sacramento | 10 |
| Compton | 9 |
| Oakland | 8 |
| San Francisco | 8 |
relation <- subset %>%
group_by(Affiliation) %>%
summarise(Count=n()) %>%
arrange(-Count)
relation <- na.omit(relation)
kable(relation) %>%
kable_styling(bootstrap_options = "hover")
| Affiliation | Count |
| --- | --- |
| Student | 713 |
| Unknown | 271 |
| No Relation | 151 |
| Former Student | 61 |
| Non-Student Using Athletic Facilities/Attending Game | 37 |
| Police Officer/SRO | 33 |
| Intimate Relationship with Victim | 31 |
| Parent | 26 |
| Other Staff | 22 |
| Multiple Shooters | 19 |
| Teacher | 18 |
| Relative | 17 |
| Students from Rival School | 10 |
| Former Teacher | 2 |
| Visiting Student | 1 |
ggplot(data = relation, aes(x=reorder(Affiliation,Count),y=Count))+
geom_bar(stat = "identity",fill="darkblue")+
theme(axis.text.x = element_text(angle = 90, hjust = .5))+
coord_flip()+
labs(title = "Shooter's affiliation with school", subtitle = "1970 to 2019")+
theme(legend.position = "none")

Conclusion:
- Nationally, 2018 had the most incidents of any year since 1970, with 110 recorded.
- 2019 had recorded 69 shooting incidents as of the end of September.
- The overall trend from 1970 to the present is upward.
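The upward trend can be checked roughly by fitting a straight line to the yearly counts. The sketch below uses the 1970 to 2018 counts from the Year/Count table earlier in this report and an ordinary least-squares fit. It is a simple sanity check, not the loess smoother drawn on the chart, and a linear model is an assumption that ignores the year-to-year swings visible in the data.

```r
# Yearly incident counts copied from the Year/Count table (1970 to 2018).
by_year <- data.frame(
  Year = 1970:2018,
  Count = c(19, 21, 18, 18, 16, 14, 11, 16, 16, 14,
            20, 17, 18, 25, 25, 19, 16, 24, 38, 16,
            15, 31, 30, 42, 37, 19, 21, 20, 28, 22,
            28, 25, 19, 31, 35, 47, 59, 43, 35, 31,
            15, 13, 18, 33, 43, 35, 45, 45, 110)
)

# Fit Count as a linear function of Year; a positive slope means the
# fitted line rises over the period.
fit <- lm(Count ~ Year, data = by_year)
slope <- coef(fit)[["Year"]]

slope > 0
summary(fit)
```

The 2019 figure is left out of the fit because it covers only part of the year.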