Predictive Model

lm() is the R function for fitting linear models. Linear regression is one of the simplest statistical models and one of the easiest to interpret. A common way to judge the fit is R-squared, which measures how close the data are to the fitted regression line. It ranges from 0 to 1, where 1 means the model explains all of the variability in the response.
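To make the definition of R-squared concrete, here is a minimal sketch that computes it by hand as one minus the ratio of the residual sum of squares to the total sum of squares. It uses the built-in mtcars data and a hypothetical model named toy_fit, both assumptions made only for illustration; neither is part of the claims analysis.

```r
# Toy fit on the built-in mtcars data (an assumption for illustration, not the claims data)
toy_fit <- lm(mpg ~ wt + hp, data = mtcars)

# R-squared by hand: 1 - (residual sum of squares / total sum of squares)
ss_res <- sum(residuals(toy_fit)^2)
ss_tot <- sum((mtcars$mpg - mean(mtcars$mpg))^2)
r2_manual <- 1 - ss_res / ss_tot

# Compare with the value that summary() reports for the same model
all.equal(r2_manual, summary(toy_fit)$r.squared)
```

The identity holds for models that include an intercept, which is the default in lm().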
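# Load the insurance claims data (the path is specific to the author's machine)
base_de_datos <- read.csv("C:/Users/eleyva1/Downloads/seguros.csv")

# Summary statistics for every column
resumen <- summary(base_de_datos)
resumen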
##     ClaimID           TotalPaid       TotalReserves     TotalRecovery      
##  Min.   :  777632   Min.   :      0   Min.   :      0   Min.   :     0.00  
##  1st Qu.:  800748   1st Qu.:     83   1st Qu.:      0   1st Qu.:     0.00  
##  Median :  812128   Median :    271   Median :      0   Median :     0.00  
##  Mean   : 1864676   Mean   :  10404   Mean   :   3368   Mean   :    66.05  
##  3rd Qu.:  824726   3rd Qu.:   1122   3rd Qu.:      0   3rd Qu.:     0.00  
##  Max.   :62203364   Max.   :4527291   Max.   :1529053   Max.   :100000.00  
##                                                                            
##  IndemnityPaid      OtherPaid       TotalIncurredCost ClaimStatus       
##  Min.   :     0   Min.   :      0   Min.   : -10400   Length:31619      
##  1st Qu.:     0   1st Qu.:     80   1st Qu.:     80   Class :character  
##  Median :     0   Median :    265   Median :    266   Mode  :character  
##  Mean   :  4977   Mean   :   5427   Mean   :  13706                     
##  3rd Qu.:     0   3rd Qu.:   1023   3rd Qu.:   1098                     
##  Max.   :640732   Max.   :4129915   Max.   :4734750                     
##                                                                         
##     IsDenied       Transaction_Time Procesing_Time     ClaimantAge_at_DOI
##  Min.   :0.00000   Min.   :    0    Min.   :    0.00   Min.   :14.0      
##  1st Qu.:0.00000   1st Qu.:  211    1st Qu.:    4.00   1st Qu.:33.0      
##  Median :0.00000   Median :  780    Median :   10.00   Median :42.0      
##  Mean   :0.04463   Mean   : 1004    Mean   :   62.99   Mean   :41.6      
##  3rd Qu.:0.00000   3rd Qu.: 1440    3rd Qu.:   24.00   3rd Qu.:50.0      
##  Max.   :1.00000   Max.   :16428    Max.   :11558.00   Max.   :94.0      
##                    NA's   :614                                           
##     Gender          ClaimantType       InjuryNature       BodyPartRegion    
##  Length:31619       Length:31619       Length:31619       Length:31619      
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##  AverageWeeklyWage1    ClaimID1        BillReviewALE        Hospital        
##  Min.   : 100.0     Min.   :  777632   Min.   : -448.0   Min.   : -12570.4  
##  1st Qu.: 492.0     1st Qu.:  800748   1st Qu.:   16.0   1st Qu.:    210.5  
##  Median : 492.0     Median :  812128   Median :   24.0   Median :    613.9  
##  Mean   : 536.5     Mean   : 1864676   Mean   :  188.7   Mean   :   5113.2  
##  3rd Qu.: 492.0     3rd Qu.:  824726   3rd Qu.:   64.1   3rd Qu.:   2349.1  
##  Max.   :8613.5     Max.   :62203364   Max.   :46055.3   Max.   :2759604.0  
##                                        NA's   :14912     NA's   :19655      
##  PhysicianOutpatient       Rx          
##  Min.   :   -549.5   Min.   :  -160.7  
##  1st Qu.:    105.8   1st Qu.:    22.9  
##  Median :    218.0   Median :    61.5  
##  Mean   :   1813.2   Mean   :  1695.2  
##  3rd Qu.:    680.6   3rd Qu.:   189.0  
##  Max.   :1219766.6   Max.   :631635.5  
##  NA's   :2329        NA's   :20730
# Fit a linear model of total incurred cost on total paid and indemnity paid
regresion <- lm(TotalIncurredCost ~ TotalPaid + IndemnityPaid, data = base_de_datos)
summary(regresion)
## 
## Call:
## lm(formula = TotalIncurredCost ~ TotalPaid + IndemnityPaid, data = base_de_datos)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -886159     727     899     937 1304420 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   -9.520e+02  1.748e+02  -5.448 5.14e-08 ***
## TotalPaid      1.204e+00  5.195e-03 231.822  < 2e-16 ***
## IndemnityPaid  4.278e-01  1.128e-02  37.919  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 30500 on 31616 degrees of freedom
## Multiple R-squared:  0.869,  Adjusted R-squared:  0.869 
## F-statistic: 1.048e+05 on 2 and 31616 DF,  p-value: < 2.2e-16
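The figures printed in a summary like the one above can also be pulled out programmatically, which is handy when they feed later calculations. The sketch below shows one way to do that. The model demo_fit is a hypothetical stand-in fitted on the built-in mtcars data, since the claims file is not available here.

```r
# Hypothetical stand-in model on the built-in mtcars data
demo_fit <- lm(mpg ~ wt + hp, data = mtcars)
demo_summary <- summary(demo_fit)

# Coefficient table as a matrix: Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- coef(demo_summary)
coef_table[, "Estimate"]

# Fit statistics stored in the summary object
demo_summary$r.squared      # multiple R-squared
demo_summary$adj.r.squared  # adjusted R-squared
demo_summary$sigma          # residual standard error
```

The same accessors should work on summary(regresion) once the claims data is loaded.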
# Predict the total incurred cost for a new claim
datos_nuevos <- data.frame(TotalPaid = 8000, IndemnityPaid = 3000)
predict(regresion, datos_nuevos)
##        1 
## 9965.171
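As a sanity check, the prediction can be reproduced by plugging the new values into the fitted equation. The sketch below copies the coefficients from the printed summary, where they are rounded to four significant digits, so the result only approximates what predict() returns. The variable names are introduced here for illustration.

```r
# Coefficients copied from the summary(regresion) output (rounded as printed)
b_intercept      <- -9.520e+02
b_total_paid     <-  1.204e+00
b_indemnity_paid <-  4.278e-01

# Fitted equation evaluated at TotalPaid = 8000 and IndemnityPaid = 3000
manual_prediction <- b_intercept + b_total_paid * 8000 + b_indemnity_paid * 3000
manual_prediction
```

This gives about 9963.4. The gap of roughly 1.8 from the 9965.171 returned by predict() is consistent with rounding in the printed coefficients: a change of 0.0005 in the TotalPaid coefficient alone moves the result by 4.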