Map of Claims by Cities

Map without Tirane

By Gender

Map without Tirane

First analysis

## # A tibble: 21 × 4
##    Refined_Employment_Category   Count Mean_InsuranceAmount Mean_InsurancePeriod
##    <chr>                         <int>                <dbl>                <dbl>
##  1 Agriculture                       9              500000                  2.11
##  2 Art and Design                   14              928571.                 8.71
##  3 Construction and Labor          354              538136.                 6.76
##  4 Customer Service                 24              687500                  5.42
##  5 Driving/Security/General Lab…   505              593069.                 7.64
##  6 Education and Training          513              764133.                 9.16
##  7 Engineering and IT               85              870588.                16.8 
##  8 Finance and Economics           440              592045.                 7.59
##  9 Health and Medicine              69              891304.                11.0 
## 10 Hospitality and Service          79              500000                  4.37
## # ℹ 11 more rows

Group sizes vary widely across Refined_Employment_Category, from 9 entries in Agriculture to 513 in Education and Training among the rows shown. Mean insurance amounts and mean insurance periods also differ considerably between categories.

For instance, Management and Administration has a higher mean insurance amount than Construction and Labor (about 538,136).

This suggests that insurance amount and period are associated with employment category.
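A summary like the one above can be sketched as a group-by over the raw records. The snippet below is a minimal illustration in plain Python, not the code that produced the table; the `records` list is hypothetical toy data and the `summarize` helper is a name introduced here for illustration only.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy records: (category, insurance amount, insurance period).
# These values are made up for illustration and are NOT taken from the data set.
records = [
    ("Agriculture", 500_000, 2),
    ("Agriculture", 500_000, 3),
    ("Engineering and IT", 1_000_000, 20),
    ("Engineering and IT", 500_000, 10),
    ("Engineering and IT", 1_500_000, 15),
]


def summarize(rows):
    """Return {category: (count, mean amount, mean period)}."""
    groups = defaultdict(list)
    for category, amount, period in rows:
        groups[category].append((amount, period))
    return {
        category: (
            len(values),
            mean(a for a, _ in values),
            mean(p for _, p in values),
        )
        for category, values in groups.items()
    }


summary = summarize(records)
for category, (count, mean_amount, mean_period) in sorted(summary.items()):
    print(f"{category:<20} {count:>3} {mean_amount:>12.0f} {mean_period:>6.2f}")
```

Each output row mirrors the columns of the tibble: the category, its count, and the two group means.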

Correlation Analysis

## [1] 0.06398

The Pearson correlation coefficient between InsuranceAmmount and InsurancePeriod is approximately 0.064, with a very small p-value.

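This suggests a very weak positive linear relationship between the two variables: r² is about 0.004, so the linear relationship accounts for under half a percent of the variance. The relationship is statistically significant, but with a sample of this size significance does not imply a strong association.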
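The "very small p-value" can be roughly checked from the reported coefficient alone. The sketch below assumes the correlation was computed on the same 4,248 observations implied by the ANOVA degrees of freedom (20 + 4227 + 1), which the report does not state explicitly. It also uses a normal approximation to the t distribution, so the p-value is approximate.

```python
import math

r = 0.06398        # Pearson correlation reported in the output above
n = 20 + 4227 + 1  # ASSUMPTION: same rows as the ANOVA tables (total df + 1)

# t-statistic for H0: rho = 0, with n - 2 degrees of freedom
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r**2)

# Two-sided p-value via the normal approximation (reasonable for large df)
p_approx = math.erfc(abs(t_stat) / math.sqrt(2))

# Share of variance explained by the linear relationship
r_squared = r**2

print(f"t = {t_stat:.2f}, approximate p = {p_approx:.1e}, r^2 = {r_squared:.4f}")
```

Under these assumptions the t-statistic is a little above 4, which is consistent with a p-value well below 0.001 even though r² is tiny.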

ANOVA Analysis

##                               Df          Sum Sq       Mean Sq F value
## Refined_Employment_Category   20  48068010537774 2403400526889    13.9
## Residuals                   4227 730850951326634  172900627236        
##                                          Pr(>F)    
## Refined_Employment_Category <0.0000000000000002 ***
## Residuals                                          
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##                               Df Sum Sq Mean Sq F value              Pr(>F)    
## Refined_Employment_Category   20  15725     786     9.4 <0.0000000000000002 ***
## Residuals                   4227 353542      84                                
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

ANOVA (Analysis of Variance):

For InsuranceAmmount across employment categories (first table), the F-statistic is approximately 13.9 on 20 and 4227 degrees of freedom, with a p-value below 2e-16.

This indicates that the mean insurance amount differs significantly between at least some employment categories.

For InsurancePeriod (second table), the F-statistic is approximately 9.4 on the same degrees of freedom, also with a p-value below 2e-16, indicating significant differences in the mean insurance period across employment categories.
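The F-statistics can be reproduced from the sums of squares and degrees of freedom printed in the two tables. The sketch below also computes eta squared (between-group sum of squares divided by total sum of squares) as an effect-size measure. Eta squared is not part of the original output and is added here only as an illustration; the function names are likewise introduced for this example.

```python
def f_statistic(ss_between, df_between, ss_resid, df_resid):
    """F = mean square between groups / mean square of residuals."""
    return (ss_between / df_between) / (ss_resid / df_resid)


def eta_squared(ss_between, ss_resid):
    """Share of total variance attributable to the grouping factor."""
    return ss_between / (ss_between + ss_resid)


# Values copied from the ANOVA tables above
amount = dict(ss_between=48068010537774, df_between=20,
              ss_resid=730850951326634, df_resid=4227)
period = dict(ss_between=15725, df_between=20,
              ss_resid=353542, df_resid=4227)

f_amount = f_statistic(**amount)
f_period = f_statistic(**period)
eta_amount = eta_squared(amount["ss_between"], amount["ss_resid"])
eta_period = eta_squared(period["ss_between"], period["ss_resid"])

print(f"InsuranceAmmount: F = {f_amount:.1f}, eta^2 = {eta_amount:.3f}")
print(f"InsurancePeriod:  F = {f_period:.1f}, eta^2 = {eta_period:.3f}")
```

The recomputed F values match the tables (about 13.9 and 9.4). The eta squared values of roughly 0.06 and 0.04 indicate that employment category accounts for a modest share of the total variance in each outcome.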

Visualization

The box plots for Insurance Amount by Employment Category and Insurance Period by Employment Category show a wide variation within each category.

Some categories have a higher median and wider spread, indicating more variability and higher values.
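The quantities a box plot displays (median, quartiles, and interquartile range) can be computed directly per category. The following is a minimal sketch using Python's `statistics.quantiles`. The two groups and their values are hypothetical and are not drawn from the data set, and `box_stats` is a helper name introduced here.

```python
from statistics import quantiles

# Hypothetical insurance periods for two made-up categories (illustration only)
periods_by_category = {
    "Category A": [1, 2, 3, 4, 5],
    "Category B": [2, 4, 6, 8, 30],
}


def box_stats(values):
    """Return (q1, median, q3, iqr) using the inclusive quartile method."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return q1, median, q3, q3 - q1


stats = {name: box_stats(vals) for name, vals in periods_by_category.items()}
for name, (q1, median, q3, iqr) in stats.items():
    print(f"{name}: median = {median}, IQR = {iqr} (Q1 = {q1}, Q3 = {q3})")
```

A category with a higher median and a larger IQR corresponds to a box that sits higher and is taller in the plot. A value such as 30 in the second group would appear as an outlier beyond the whisker.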

Conclusions:

Both InsuranceAmmount and InsurancePeriod differ significantly across Refined_Employment_Category. Different categories have different typical amounts and periods for insurance.

There is a very weak but statistically significant linear relationship between InsuranceAmmount and InsurancePeriod (r ≈ 0.064).

The significant ANOVA results confirm that the mean insurance amounts and periods vary across employment categories, suggesting that employment type is a relevant factor in insurance specifics. Because ANOVA tests for differences in group means, these results show association rather than causation.