Introduction

Sometimes things go wrong when dealing with the TSA, the U.S. Transportation Security Administration. Property can be damaged or lost, and people can even be injured. When that happens, travelers can file claims against the TSA. The visualizations in this analysis cover TSA claims from 2002 through 2015. The data is visualized through a heat map, a scatter plot, a dual axis bar chart, a nested donut chart, and a stacked bar chart. The analysis starts out general with the heat map, which shows the number of claims per year separated by claim status, and then gets more specific with each visualization. Each graph looks at a different aspect of the data set: claims by incident year and airport, the average amount of money claimed versus actually received for each claim type, total claim value by quarter and month, and finally the total value of claims at each claim site, broken out by day of the week. Together, these views help readers understand where claims usually occur, how successful they tend to be, and how much money a claimant could receive.

Dataset

This data set, TSA Claims, comes from the file tsa_claims.csv, found on Kaggle: https://www.kaggle.com/datasets/terminal-security-agency/tsa-claims-database/data. It includes claims filed against the TSA between 2000 and 2017, with the bulk of the data falling between 2002 and 2015. There are 13 columns, each a variable of the data set: Claim Number, Date Received, Incident Date, Airport Code, Airport Name, Airline Name, Claim Type, Claim Site, Item, Claim Amount, Status, Close Amount, and Disposition. Each row represents a filed claim with a unique claim number. There are 204,244 rows, but the data contained many NaNs and messy values that needed cleaning. Most of the NaNs were removed, which took out some data, but not enough to matter substantially. The cleaning included converting the Date Received column to a consistent date format, stripping dollar signs from prices so they could be used as floats, and removing stray semicolons and dashes. Status was narrowed down to three categories (Settled, Denied, Approved), and the Airport Code column was cleaned of missing data. Once cleaned, the data was much easier to work with and visualize.

# Import Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib.ticker import FuncFormatter
import warnings
warnings.filterwarnings("ignore")

# Load in the data set
path = "C:/Users/VictoriaKwortnik/Desktop/GB 736/Python_datafiles/"
filename = path + 'tsa_claims.csv'
tsaClaims = pd.read_csv(filename)
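# Quick look at the raw data before cleaning (a sketch, not part of the original
# workflow): row/column counts and the NaNs per column discussed above.
print(tsaClaims.shape)
print(tsaClaims.isna().sum().sort_values(ascending=False))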

# Some of the cleaning code below is adapted from Kaggle user PERRYCHU's notebook "TSA Claims Classification (Part 1)" (published 6 years ago)

#Format columns nicely for dataframe index
tsaClaims.columns = [s.strip().replace(" ","_") for s in tsaClaims.columns]

#Drop rows with too many nulls
tsaClaims.dropna(thresh=6, inplace=True)

#Fill NA for categorical columns
fill_columns = ["Airline_Name","Airport_Name","Airport_Code","Claim_Type","Claim_Site","Item"]
tsaClaims[fill_columns] = tsaClaims[fill_columns].fillna("-")

#Set NA Claim Amount to 0. Zeros are dropped later in the code.
tsaClaims["Claim_Amount"] = tsaClaims.Claim_Amount.fillna("$0.00")

# Clean Data in Claim Status
valid_targets = ['Denied','Approved','Deny','Settled','Approve in Full', 'Settle']

tsaClaims = tsaClaims[tsaClaims.Status.isin(valid_targets)]
tsaClaims.Status.replace("Approve in Full","Approved",inplace=True)
tsaClaims.Status.replace("Deny","Denied",inplace=True)
tsaClaims.Status.replace("Settle","Settled",inplace=True)

#Drop nulls
tsaClaims.dropna(subset=['Date_Received'], inplace=True)

#Format datetime
tsaClaims['Date_Received'] = pd.to_datetime(tsaClaims['Date_Received'],format="%d-%b-%y")

#Keep only claims received between 2000 and 2016
tsaClaims = tsaClaims[tsaClaims['Date_Received'].dt.year.isin(range(2000,2016+1))]

tsaClaims['Date_Rcvd_Year'] = tsaClaims['Date_Received'].dt.year.astype(pd.Int64Dtype())
tsaClaims['Date_Rcvd_Month'] = tsaClaims['Date_Received'].dt.month.astype(pd.Int64Dtype())
tsaClaims['Date_Rcvd_Day'] = tsaClaims['Date_Received'].dt.day.astype(pd.Int64Dtype())
tsaClaims['Date_Rcvd_Quarter'] = tsaClaims['Date_Received'].dt.quarter.astype(pd.Int64Dtype())
tsaClaims['Date_Rcvd_MonthName'] = tsaClaims['Date_Received'].dt.strftime('%B')
tsaClaims['Date_Rcvd_MonthNameAbbrev'] = tsaClaims['Date_Received'].dt.strftime('%b')
tsaClaims['Date_Rcvd_DayName'] = tsaClaims['Date_Received'].dt.strftime('%A')
tsaClaims['Date_Rcvd_DayNameAbbrev'] = tsaClaims['Date_Received'].dt.strftime('%a')

#Check multiple Airport Names assigned to one Airport Code
temp = tsaClaims.groupby("Airport_Code").Airport_Name.nunique().sort_values(ascending=False)

#Duplicates are from excess spaces
tsaClaims["Airport_Code"] = tsaClaims.Airport_Code.str.strip()
tsaClaims["Airport_Name"] = tsaClaims.Airport_Name.str.strip()

#Re-check: the duplicate Airport Names should now be resolved
temp = tsaClaims.groupby("Airport_Code").Airport_Name.nunique().sort_values(ascending=False)

tsaClaims["Airline_Name"] = tsaClaims.Airline_Name.str.strip().str.replace(" ","")
tsaClaims.Airline_Name.replace("AmericanEagle","AmericanAirlines",inplace=True)
tsaClaims.Airline_Name.replace("AmericanWest","AmericaWest",inplace=True)
tsaClaims.Airline_Name.replace("AirTranAirlines(donotuse)","AirTranAirlines",inplace=True)
tsaClaims.Airline_Name.replace("AeroflotRussianInternational","AeroFlot",inplace=True)
tsaClaims.Airline_Name.replace("ContinentalExpressInc","ContinentalAirlines",inplace=True)
tsaClaims.Airline_Name.replace("Delta(Song)","DeltaAirLines",inplace=True)
tsaClaims.Airline_Name.replace("FrontierAviationInc","FrontierAirlines",inplace=True)
tsaClaims.Airline_Name.replace("NorthwestInternationalAirwaysLtd","NorthwestAirlines",inplace=True)
tsaClaims.Airline_Name.replace("SkywestAirlinesAustralia","SkywestAirlinesIncUSA",inplace=True)

#Keep only the leading category of each free-text Item entry
df_item = tsaClaims.Item.str.split("-").map(lambda x: "" if type(x) == float else x[0])
df_item = df_item.str.split(r" \(").map(lambda x: x[0])
df_item = df_item.str.split(r" &").map(lambda x: x[0])
df_item = df_item.str.split(r"; ").map(lambda x: x[0])
df_item = df_item.str.strip()

#Exploratory check of the most frequent item categories (not used in the plots below)
categories = df_item.value_counts()
items = categories[categories > 12600]

# Data Cleaning: strip "$" and ";" so the amounts can later be parsed as floats
# (regex=False makes these literal replacements; as a bare regex, "$" is an end-of-string anchor)
tsaClaims["Claim_Amount"] = tsaClaims["Claim_Amount"].str.replace("$", "", regex=False)
tsaClaims["Close_Amount"] = tsaClaims["Close_Amount"].str.replace("$", "", regex=False)
tsaClaims["Claim_Amount"] = tsaClaims["Claim_Amount"].str.replace(";", "", regex=False)
tsaClaims["Close_Amount"] = tsaClaims["Close_Amount"].str.replace(";", "", regex=False)

# Treat "-" placeholders as missing, then drop rows with no Claim Amount
tsaClaims = tsaClaims.replace('-',np.nan)
tsaClaims = tsaClaims[pd.notnull(tsaClaims['Claim_Amount'])]

# Claim Amounts: final tidy-up, then convert to float
tsaClaims["Claim_Amount"] = tsaClaims.Claim_Amount.str.strip()
tsaClaims["Claim_Amount"] = tsaClaims.Claim_Amount.str.replace(";","",regex=False).str.replace("$","",regex=False).str.replace("-","0",regex=False)
tsaClaims["Claim_Value"] = tsaClaims.Claim_Amount.astype(float)

tsaClaims_copy = tsaClaims.copy()

# Close Amounts: same cleaning, then convert to float
tsaClaims["Close_Amount"] = tsaClaims.Close_Amount.str.strip()
tsaClaims["Close_Amount"] = tsaClaims.Close_Amount.str.replace(";","",regex=False).str.replace("$","",regex=False)
tsaClaims["Close_Value"] = tsaClaims.Close_Amount.astype(float)

Findings

The findings below include: number of claims by status and year, claims by incident date and airport code, dollars claimed vs. received by claim type, total claim value by quarter and month, and total claim value by claim site and day. The visualizations deliberately start out general and get progressively more specific, so we can narrow in on particular slices of the data. Each graph covers a different aspect of the data set, giving a fuller picture of what TSA claims look like and how they affect travel.

Heatmap

After cleaning the data, the first thing I wanted to look at was the status of each claim across the years in the data set. The heatmap below shows claims by status (Settled, Denied, or Approved) on the left (y) axis and the year the claim was filed (2000-2015) on the bottom (x) axis. The color scale on the right encodes the number of claims, moving from dark (few claims) to light (many claims). For example, in 2000 there were two settled claims, two denied claims, and no approved claims. In 2007, 2,413 claims were settled, 11,879 were denied (the light yellow cell), and 3,091 were approved. The colors and in-cell labels make this an easy visual to read, and it is a good starting point for understanding the data.

# Count claims by year received and status
newScatter1 = tsaClaims.groupby(['Date_Rcvd_Year', 'Status'])['Status'].count().reset_index(name='count')

# Add a zero count for 2000/Approved so the heatmap has no empty cells
newScatter1.loc[len(newScatter1.index)] = [2000, 'Approved', 0]

hm_df = pd.pivot_table(newScatter1, index='Status', columns='Date_Rcvd_Year', values='count')

# Create the Heatmap
fig = plt.figure(figsize=(18,10))
ax = fig.add_subplot(1, 1, 1)

comma_fmt = FuncFormatter(lambda x, p: format(int(x), ','))

ax = sns.heatmap(hm_df, linewidth = 0.2, annot = True, cmap = 'magma', fmt = ',.0f',
                 square = True, annot_kws={'size': 11},
                 cbar_kws = {'format': comma_fmt, 'orientation': 'vertical', 'shrink':0.6})
plt.title('Heatmap of the Number of Claims by Status and Year', fontsize=18, pad=15)
plt.xlabel('Claim Year', fontsize=18, labelpad=10)
plt.ylabel('Status', fontsize=18, labelpad=10)
plt.yticks(rotation=0, size=14)
plt.xticks(size=14)
ax.invert_yaxis()

cbar = ax.collections[0].colorbar

max_count = hm_df.to_numpy().astype(int).max()

my_colorbar_ticks = [*range(1000, max_count, 1000)]
cbar.set_ticks(my_colorbar_ticks)

my_colorbar_tick_labels = ['{:,}'.format(each) for each in my_colorbar_ticks]
cbar.set_ticklabels(my_colorbar_tick_labels)

cbar.set_label('Number of Claims', rotation=270, fontsize=14, color='black', labelpad=20)

plt.show()
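
The exact counts quoted above can be read straight off the pivot table; for example (a quick check, assuming hm_df as built above):

# Look up individual heatmap cells: denied claims in 2007, and all of year 2000
print(hm_df.loc['Denied', 2007])
print(hm_df[2000])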

Scatterplot

Moving on to the scatterplot, I wanted to see how airport plays a role in how many claims are filed with the TSA. The plot below shows the years (2000-2015) on the y axis and the top 10 airport codes, ranked by number of claims, on the x axis: EWR, ATL, JFK, LAS, LAX, MCO, MIA, ORD, PHX, and SEA. It makes sense that these airports top the list, since they are all large, mostly international airports with busy TSA checkpoints. The color scale on the right runs from dark blue (few claims) to light yellow (many claims). Reading across the plot reveals a lot about the data: for example, the color suggests LAX in 2004 had the highest number of claims (around 1,300), while 2000 and 2001 have barely any claims at all, and 2002 has very low counts across all airports.

# Count claims by year and airport, then total claims per airport
scatter = tsaClaims.groupby(['Date_Rcvd_Year', 'Airport_Code'])['Date_Rcvd_Year'].count().reset_index(name='count')

newScatter2 = scatter.groupby(['Airport_Code'])['count'].sum().reset_index().sort_values(by='count', ascending=False)

# Only keep the 10 airport codes with the most claims, derived from the counts above
top10_codes = newScatter2.head(10)['Airport_Code']
goodRows = scatter[scatter['Airport_Code'].isin(top10_codes)]
goodRows = goodRows.groupby(['Date_Rcvd_Year', 'Airport_Code'])['count'].sum().reset_index()


# Construct Scatterplot (Top 10 Airport Codes w Years and # of Claims)
plt.figure(figsize=(18,10))

plt.scatter(goodRows['Airport_Code'], goodRows['Date_Rcvd_Year'], marker='X', cmap='magma',
            c=goodRows['count'], s=goodRows['count'], edgecolors='black')

plt.title('TSA Claims by Incident Date and Airport', fontsize=18)
plt.xlabel('Top 10 Airport Codes', fontsize=14)
plt.ylabel('Year', fontsize = 14)

cbar = plt.colorbar()
cbar.set_label('Number of Claims', rotation=270, fontsize=14, color='black', labelpad=30)

my_colorbar_ticks = [*range(100, int(goodRows['count'].max()), 100)]
cbar.set_ticks(my_colorbar_ticks)
my_colorbar_tick_labels = ['{:,}'.format(each) for each in my_colorbar_ticks]
cbar.set_ticklabels(my_colorbar_tick_labels)

my_x_ticks = goodRows['Airport_Code'].unique()
plt.xticks(my_x_ticks, fontsize=10, color='black')
my_y_ticks = [*range(goodRows['Date_Rcvd_Year'].min(), goodRows['Date_Rcvd_Year'].max()+1)]
plt.yticks(my_y_ticks, fontsize=14, color='black')
plt.show()
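
The ranking behind the top 10 list can be printed directly (a quick check using newScatter2 from above):

# The ten airports with the most claims, as used on the x axis
print(newScatter2.head(10))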

Dual Axis Chart

The dual axis bar chart gets even more specific, comparing how much money was claimed against how much the claimant actually received, broken down by claim type: Passenger Property Loss, Property Damage, Employee Loss (MPCECA), Passenger Theft, and Motor Vehicle. Personal Injury is excluded; its average claim amounts are so large that including it would make the other types hard to see. The left y axis is the average dollars received (pink bars), and the right y axis is the average dollars claimed (purple bars), with each claim type on the x axis showing one bar of each. One takeaway is that the average amount claimed is higher than the amount received for every claim type, which makes sense. Motor Vehicle has the most money claimed and received on average, while Employee Loss has the least.

tsaClaims['Claim_Amount'] = tsaClaims['Claim_Amount'].astype(float)
tsaClaims['Close_Amount'] = tsaClaims['Close_Amount'].astype(float)

# Drop rows missing any of the fields needed for this chart
noNaDf = tsaClaims[tsaClaims['Claim_Type'].notna() & tsaClaims['Claim_Amount'].notna() & tsaClaims['Close_Amount'].notna()]

dualAxis = noNaDf.groupby(['Claim_Type']).agg({'Claim_Type':['count'], 'Claim_Amount':['sum', 'mean'], 'Close_Amount':['sum', 'mean']}).reset_index()
dualAxis.columns = ['Claim Type', 'Count', 'TotalClaimsFiled', 'AverClaimsFiled', 'TotalClaimsRcvd', 'AverClaimsRcvd']

dualAxis = dualAxis.sort_values('Count', ascending = False).reset_index(drop=True)

# Exclude Personal Injury: its averages are so large they would dwarf the other bars
dualAxis = dualAxis[dualAxis['Claim Type'] != 'Personal Injury'].reset_index(drop=True)

def autolabel(these_bars, this_ax, place_of_decimals, symbol):
    for each_bar in these_bars:
        height = each_bar.get_height()
        this_ax.text(each_bar.get_x() + each_bar.get_width()/2, height * 1.01, symbol + format(height, place_of_decimals), 
                    fontsize=11, color='black', ha='center', va='bottom')

# Create dual axis chart and plot                   
fig = plt.figure(figsize=(18,10))
ax1 = fig.add_subplot(1, 1, 1)
ax2 = ax1.twinx()
bar_width = 0.4

x_pos = np.arange(len(dualAxis))
rcvd_bars = ax1.bar(x_pos-(0.5*bar_width), dualAxis.AverClaimsRcvd, bar_width, color='pink', edgecolor='black', label='Average of $ Received')
claimed_bars = ax2.bar(x_pos+(0.5*bar_width), dualAxis.AverClaimsFiled, bar_width, color='purple', edgecolor='black', label='Average of $ Claimed')

ax1.set_xlabel('Claim Types', fontsize=18)
ax1.set_ylabel('Average of $ Received', fontsize=18, labelpad=20)
ax2.set_ylabel('Average of $ Claimed', fontsize=18, rotation=270, labelpad=20)
ax1.tick_params(axis='y', labelsize=14)
ax2.tick_params(axis='y', labelsize=14)

plt.title('Comparison of $ Claimed vs. Received for Claim Types\n Top 5 Most Frequently Claimed (excluding Personal Injury)', fontsize=18)
ax1.set_xticks(x_pos)
ax1.set_xticklabels(dualAxis['Claim Type'], fontsize=14)

rcvd_handles, rcvd_labels = ax1.get_legend_handles_labels()
claimed_handles, claimed_labels = ax2.get_legend_handles_labels()
legend = ax1.legend(rcvd_handles + claimed_handles, rcvd_labels + claimed_labels, loc='upper left', frameon=True, ncol=1, shadow=True,
                    borderpad=1, fontsize=14)
ax1.set_ylim(0, dualAxis.Count.max()*0.07)
autolabel(rcvd_bars, ax1, '.2f', '$')
autolabel(claimed_bars, ax2, '.2f', '$')

plt.show()
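
To see why Personal Injury was excluded, the average claimed amount per type can be printed before the drop (a quick sketch using the noNaDf frame from above):

# Average $ claimed per claim type; Personal Injury's mean dwarfs the rest
print(noNaDf.groupby('Claim_Type')['Claim_Amount'].mean().sort_values(ascending=False))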

Nested Donut Chart

This nested donut chart splits total claim value by the quarter in which claims were received, and breaks each quarter down further by month. The chart shows that total claim value over the years was $13.16 million. The quarters are fairly similar in size, but Quarter 1 has the largest total claim value at $3.6M, or 27.63% of the total. Within that quarter, February makes up the largest share of total claim value at 9.92%. This shows when the most claim money is filed for, and more specifically how much money the TSA might expect to pay out and when. That is useful for the TSA's planning, and for knowing when to be especially careful.

# Add 'Quarter'
tsaClaims['Date_Rcvd_Quarter'] = 'Quarter ' + tsaClaims['Date_Rcvd_Quarter'].astype('string')

pie_df = tsaClaims.groupby(['Date_Rcvd_Quarter', 'Date_Rcvd_MonthName', 'Date_Rcvd_Month'])['Close_Value'].sum().reset_index(name='TotalClaims')

# Sort by calendar month (groupby orders month names alphabetically), then drop the helper column
pie_df = pie_df.sort_values(['Date_Rcvd_Quarter', 'Date_Rcvd_Month']).reset_index(drop=True)
del pie_df['Date_Rcvd_Month']

# Set up inside & outside reference #s for colors
number_outside_colors = len(pie_df.Date_Rcvd_Quarter.unique())
outside_color_ref_number = np.arange(number_outside_colors)*4

number_inside_colors = len(pie_df.Date_Rcvd_MonthName.unique())
all_color_ref_number = np.arange(number_outside_colors + number_inside_colors)

inside_color_ref_number = []
for each in all_color_ref_number:
    if each not in outside_color_ref_number:
        inside_color_ref_number.append(each)

# Created nested donut chart
fig = plt.figure(figsize=(10, 10))
ax = fig.add_subplot(1, 1, 1)

colormap = plt.get_cmap("tab20")
outer_colors = colormap(outside_color_ref_number)

total_claims = pie_df.TotalClaims.sum()

pie_df.groupby(['Date_Rcvd_Quarter'])['TotalClaims'].sum().plot(
    kind = 'pie', radius=1, colors=outer_colors, pctdistance=0.85, labeldistance = 1.1,
    wedgeprops = dict(edgecolor='w'), textprops={'fontsize':18},
    autopct = lambda p: '{:.2f}%\n(${:.1f}M)'.format(p,(p/100)*total_claims/1e+6),
    startangle=90)

inner_colors = colormap(inside_color_ref_number)
pie_df.TotalClaims.plot(
    kind = 'pie', radius=0.7, colors=inner_colors, pctdistance=0.55, labeldistance = 0.8,
    wedgeprops = dict(edgecolor='w'), textprops={'fontsize':13},
    labels = pie_df.Date_Rcvd_MonthName, 
    autopct = '%1.2f%%',
    startangle=90)

# Punch a white circle in the middle to create the donut hole
hole = plt.Circle((0,0), 0.3, fc='white')
ax.add_artist(hole)

ax.yaxis.set_visible(False)
plt.title('Total Claim Value by Quarter and Month', fontsize=18)

ax.text(0, 0, 'Total Claim Value\n' + '$' + str(round(total_claims/1e6, 2)) + 'M', size=18, ha='center', va='center')

ax.axis('equal')
plt.tight_layout()
plt.show()
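
The outer-ring percentages can be verified directly from pie_df (a quick check using the total_claims sum computed above):

# Share of total claim value per quarter, as shown in the outer ring
quarter_share = pie_df.groupby('Date_Rcvd_Quarter')['TotalClaims'].sum() / total_claims * 100
print(quarter_share.round(2))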

Stacked Bar Chart

Finally, we turn to the claim sites and how much claim value accumulates at each, broken down by day of the week. The stacked bar chart has the total value of claims on the y axis, running from $0M to about $8M, and the claim site, where the incident occurred, on the x axis. Each color within a bar represents a day of the week, as shown in the legend in the top right. Checked Baggage totals $8.2M, with Tuesday contributing the most, though most weekdays are fairly similar; Saturday and Sunday contribute very little. Checkpoint accounts for $4.3M, Motor Vehicle for $0.4M, and Other, the lowest, for $0.3M. Across all the bars, relatively little claim value comes from the weekend, which is worth noting.

stacked_df = tsaClaims.groupby(['Claim_Site', 'Date_Rcvd_DayNameAbbrev'])['Close_Value'].sum().reset_index(name='TotalClaims')

stacked_df = stacked_df.pivot(index='Claim_Site', columns='Date_Rcvd_DayNameAbbrev', values='TotalClaims')

day_order = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
stacked_df = stacked_df.reindex(columns=reversed(day_order))

# Drop 'Bus Station' because it has essentially no claims
stacked_df = stacked_df.drop(['Bus Station'])

# Plot Stacked Bar Chart
fig = plt.figure(figsize=(18, 10))
ax = fig.add_subplot(1, 1, 1)

stacked_df.plot(kind='bar', stacked=True, ax=ax)

plt.ylabel('Total Value of Claims', fontsize=18, labelpad=10)
plt.title('Total Value Claims by Claim Site and by Day \n Stacked Bar Plot', fontsize=18)
plt.xticks(rotation=0, horizontalalignment='center', fontsize=14)
plt.yticks(rotation=0, fontsize=14)
ax.set_xlabel('Claim Site', fontsize=18)

# Reverse the legend entries so the days read Mon through Sun
handles, labels = ax.get_legend_handles_labels()
plt.legend(handles[::-1], labels[::-1], loc='best', fontsize=14)

ax.yaxis.set_major_formatter(FuncFormatter(lambda x, pos:('$%1.1fM')%(x*1e-6)))

# Sum up the rows of our data to get the total value of each bar.
totals = stacked_df.sum(axis=1)

# Set an offset that is used to bump the label up a bit above the bar.
y_offset = 1e5

# Add labels to each bar.
for i, total in enumerate(totals):
    ax.text(i, total + y_offset, "$"+str(round(total/1e6,1))+"M", ha='center', weight='bold', size=11)


plt.show()
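
The weekend observation can be quantified from the pivoted table (a quick sketch using stacked_df as built above):

# Weekend vs. total share of claim value
weekend = stacked_df[['Sat', 'Sun']].sum().sum()
total = stacked_df.sum().sum()
print('Weekend share of total claim value: {:.1f}%'.format(weekend / total * 100))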

Conclusion

Working through this data set and the patterns in these visualizations gives a better sense of how TSA claims work and how often they occur. For an airline or airport that deals with the TSA, this information shows how many incidents are occurring and how much those incidents cost. For a traveler, it is useful for gauging how likely issues with the TSA are, and how much other people have claimed in the past in similar situations.