This data set contains 1000 observations on 39 variables concerning insurance claims made by motorists after being involved in a collision. Data can be downloaded from: https://www.kaggle.com/roshansharma/insurance-claim.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 19.00 32.00 38.00 38.95 44.00 64.00
##
##   Associate     College High School          JD     Masters          MD
##       0.145       0.122       0.160       0.161       0.143       0.144
##         PhD
##       0.125
##
##     Accura       Audi        BMW  Chevrolet      Dodge       Ford      Honda
##      0.068      0.069      0.072      0.076      0.080      0.072      0.055
##       Jeep   Mercedes     Nissan       Saab     Suburu     Toyota Volkswagen
##      0.067      0.065      0.078      0.080      0.080      0.070      0.068
Question Raised: Does gender affect how often car insurance claims are made?
Investigation Using Graphs:
Question Answered: According to the graphs above, females appear to make more car insurance claims than males. In the relative frequency graph, about 55% of the people who made claims were female and 45% were male. Although this might lead one to conclude that women make more claims than men, I do not think that conclusion is reasonable based on so little information.
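The relative frequencies behind such a graph are simply each category's count divided by the total number of claims. Here is a minimal Python sketch of that calculation. The column name `insured_sex`, the labels `FEMALE` and `MALE`, and the toy values are assumptions for illustration, chosen only to mirror the roughly 55/45 split described above; they are not the real data.

```python
from collections import Counter


def relative_frequencies(values):
    """Return each distinct value's share of the total, rounded to 3 places."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: round(count / total, 3) for value, count in counts.items()}


# Hypothetical toy column with one entry per claim (not the real data).
insured_sex = ["FEMALE"] * 11 + ["MALE"] * 9

shares = relative_frequencies(insured_sex)
print(shares)
```

Note that a share of claims is not the same thing as a claim rate. Comparing how often each group makes claims would also require the number of insured drivers in each group, which a table of claims alone does not provide.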
Question Raised: Does the severity of the collision affect how expensive the total car insurance claim will be?
Investigation Using Graphs:
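A box plot of claim amount by severity summarizes each group by its median and quartiles, so the same comparison can be made numerically. The following is a minimal Python sketch. The column names `incident_severity` and `total_claim_amount`, the severity labels, and the toy amounts are assumptions for illustration only, not values taken from the data set.

```python
from collections import defaultdict
from statistics import median, quantiles


def summarize_by_group(rows, group_key, value_key):
    """Return the quartiles, median, and IQR of value_key for each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])

    summary = {}
    for name, values in groups.items():
        # quantiles() needs at least two values per group on older Pythons.
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        summary[name] = {
            "q1": q1,
            "median": median(values),
            "q3": q3,
            "iqr": q3 - q1,  # interquartile range, one measure of spread
        }
    return summary


# Hypothetical toy rows (not the real data).
toy = [
    ("Trivial Damage", 4000), ("Trivial Damage", 5000), ("Trivial Damage", 6000),
    ("Major Damage", 50000), ("Major Damage", 60000),
    ("Major Damage", 70000), ("Major Damage", 80000),
]
rows = [{"incident_severity": s, "total_claim_amount": a} for s, a in toy]

summary = summarize_by_group(rows, "incident_severity", "total_claim_amount")
for severity, stats in summary.items():
    print(severity, stats)
```

A smaller `iqr` means the middle half of the claims in that group is packed more tightly, which is what a shorter box in the plot shows.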
Question Answered: As the graph shows, the severity of the incident likely affects the total claim amount. Trivial damage has a far smaller spread and a lower median than the other three severity levels, and minor damage has a slightly lower median and first quartile than the more severe incidents, total loss and major damage.

## Analysis Critique
A hypothetical data analyst created the following graph to help him figure out whether certain types of incidents tend to occur more often at certain times of the day.
The two variables are hour of the day and incident type.
This graph is useful because it shows that different incident types do in fact occur at different hours of the day. According to the graph, parked car and vehicle theft incidents occur more often in the early hours of the day, whereas multi-vehicle and single-vehicle collisions occur more often during the busy hours of the day, when more people are out driving.
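The counts behind a graph like this form a two-way table of incident type by hour. Here is a minimal Python sketch of building that table. The column names `incident_type` and `incident_hour_of_the_day`, the type labels, and the toy rows are assumptions for illustration, not values read from the data set.

```python
from collections import Counter, defaultdict


def hour_counts_by_type(rows):
    """Count incidents for every (incident type, hour of day) pair."""
    table = defaultdict(Counter)
    for row in rows:
        table[row["incident_type"]][row["incident_hour_of_the_day"]] += 1
    return table


# Hypothetical toy rows as (type, hour 0-23); not the real data.
toy = [
    ("Vehicle Theft", 3), ("Vehicle Theft", 3), ("Parked Car", 4),
    ("Multi-vehicle Collision", 17), ("Multi-vehicle Collision", 17),
    ("Single Vehicle Collision", 9),
]
rows = [{"incident_type": t, "incident_hour_of_the_day": h} for t, h in toy]

table = hour_counts_by_type(rows)
for incident_type, hours in table.items():
    hour, count = hours.most_common(1)[0]
    print(f"{incident_type}: busiest hour {hour} ({count} incidents)")
```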
An alternative approach to this question would be to compare the number of vehicles involved in the incident with the time at which the incident occurs. If more than one vehicle is involved, it is a multi-vehicle incident; if not, it is one of the other three incident types. The only drawback of this method is that it cannot distinguish between theft, single-vehicle collision, and parked car incidents. However, I do not see that as a major issue, because the method can still clearly show whether multi-vehicle incidents occur more often during the busy hours of the day than incidents involving one vehicle.
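This alternative can be sketched by flagging each incident as multi-vehicle when more than one vehicle is involved, then comparing the share of flagged incidents across parts of the day. In the Python sketch below, the column names `number_of_vehicles_involved` and `incident_hour_of_the_day`, the 6:00 cut-off between "early" and "later" hours, and the toy rows are all assumptions made for illustration.

```python
from collections import defaultdict

EARLY_HOURS_END = 6  # hypothetical cut-off: hours 0-5 count as "early"


def multi_vehicle_share_by_period(rows, early_end=EARLY_HOURS_END):
    """Return the share of multi-vehicle incidents in each part of the day."""
    totals = defaultdict(int)
    multi = defaultdict(int)
    for row in rows:
        period = "early" if row["incident_hour_of_the_day"] < early_end else "later"
        totals[period] += 1
        if row["number_of_vehicles_involved"] > 1:
            multi[period] += 1
    return {period: multi[period] / totals[period] for period in totals}


# Hypothetical toy rows as (hour 0-23, vehicles involved); not the real data.
toy = [(2, 1), (3, 1), (4, 1), (5, 3), (9, 3), (14, 1), (17, 2), (18, 3)]
rows = [
    {"incident_hour_of_the_day": h, "number_of_vehicles_involved": n}
    for h, n in toy
]

share = multi_vehicle_share_by_period(rows)
print(share)
```

As the paragraph above notes, this flag collapses theft, single-vehicle collision, and parked car incidents into a single "not multi-vehicle" group.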