EJERCICIO 1. MODELOS DE REGRESIÓN PARA LOS CASOS ACUMULADOS DE COVID-19 EN REINO UNIDO

Grupo 10. Marcos Flórez Fernández, Julio González Escudero, Marina Martínez González, Lucía Ojer Guerra, Roberto Panero Hoz, Sergi Sos Lluch.

Apartado a): Modelar el número acumulado de personas infectadas mediante un ajuste lineal.

Los datos introducidos a través de un fichero han sido modificados hasta el día 03/05/2020.

days<-read.table("AcummulativeDaysUK")
days<-c(days$V1)
NewCasesUK<-read.table("NewCasesUK",sep=".")
NewCasesUK<-c(NewCasesUK$V1);NewCasesUK
 [1]    2    0    0    0    0    1    0    0    1    4    0    0    1    0    0
[16]    0    0    0    0    0    0    0    0    4    0    0    0    6    4   12
[31]    5   11   34   29   46   46   65   50   52   83  134  207  264  330  152
[46]  407  676  643  714 1040  665  967 1430 1450 2130 2890 2560 2500 2670 3250
[61] 4570 4520 4670 4000 6200 4140 3890 5870 4680 5710 5230 5290 4340 5250 4600
[76] 4620 5600 5530 5850 4680 4300 4450 4580 5390 4910 4460 4310 4000 4080 6030
[91] 6200 4810 4340
AllCasesUK<-cumsum(NewCasesUK)
AllCasesUK
 [1]      2      2      2      2      2      3      3      3      4      8
[11]      8      8      9      9      9      9      9      9      9      9
[21]      9      9      9     13     13     13     13     19     23     35
[31]     40     51     85    114    160    206    271    321    373    456
[41]    590    797   1061   1391   1543   1950   2626   3269   3983   5023
[51]   5688   6655   8085   9535  11665  14555  17115  19615  22285  25535
[61]  30105  34625  39295  43295  49495  53635  57525  63395  68075  73785
[71]  79015  84305  88645  93895  98495 103115 108715 114245 120095 124775
[81] 129075 133525 138105 143495 148405 152865 157175 161175 165255 171285
[91] 177485 182295 186635

plot of chunk unnamed-chunk-4plot of chunk unnamed-chunk-4

RECTA DE REGRESIÓN O AJUSTE LINEAL.\( \ \ \ \ \ \ \ \ y=a+bx\rightarrow y=-46280+1840x \)

LinealModel<-lm(AllCasesUK~days)
summary(LinealModel)

Call:
lm(formula = AllCasesUK ~ days)

Residuals:
   Min     1Q Median     3Q    Max 
-43570 -28214  -1559  26043  61752 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   -46280       6387  -7.246 1.35e-10 ***
days            1840        118  15.598  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 30550 on 91 degrees of freedom
Multiple R-squared:  0.7278,    Adjusted R-squared:  0.7248 
F-statistic: 243.3 on 1 and 91 DF,  p-value: < 2.2e-16

plot of chunk unnamed-chunk-6

Apartado b): Modelar el número acumulado de personas infectadas mediante un ajuste no lineal de los presentados en las transparencias teóricas de la asignatura.

\[ \begin{align*} y&=\alpha e^{\beta x}\\ \\ log(y)&=log(\alpha)+\beta x\\ \\ Z&=A+Bx \end{align*} \]

\[ Z=log(y);\ A=log(\alpha);\ B=\beta \]

Z<-log(AllCasesUK)

\[ Cambio\ de\ variable\ para\ los\ valores\ de\ "AllCasesUK". \]

cor(Z,days)
[1] 0.9788588
ExpModel<-lm(Z~days)
summary(ExpModel)

Call:
lm(formula = Z ~ days)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.81048 -0.67971  0.06022  0.70234  1.18595 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 0.034520   0.177367   0.195    0.846    
days        0.149601   0.003277  45.653   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.8483 on 91 degrees of freedom
Multiple R-squared:  0.9582,    Adjusted R-squared:  0.9577 
F-statistic:  2084 on 1 and 91 DF,  p-value: < 2.2e-16
A<-ExpModel$coefficients[1]
b<-ExpModel$coefficients[2]

\[ \alpha=e^A;\ \beta=B \]

a<-exp(A)
a
(Intercept) 
   1.035123 
b
     days 
0.1496007 

Ecuación del modelo exponencial:

\[ y=1.035123 e^{0.1496007x} \]

plot of chunk unnamed-chunk-12

Apartado c): Mejorar dicho ajuste mediante técnicas de mínimos cuadrados no lineales.

\[ A=\alpha = 1.035123; B=\beta = 0.1496007 \]

NLS<-nls(AllCasesUK~A*exp(B*days),start=list(A=a,B=b))
summary(NLS)

Formula: AllCasesUK ~ A * exp(B * days)

Parameters:
   Estimate Std. Error t value Pr(>|t|)    
A 1.381e+03  2.150e+02   6.426  5.9e-09 ***
B 5.429e-02  1.835e-03  29.580  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 11320 on 91 degrees of freedom

Number of iterations to convergence: 24 
Achieved convergence tolerance: 8.401e-06
A<-summary(NLS)$coefficients[1]
B<-summary(NLS)$coefficients[2]

\[ y=1381.294 e^{0.05429373x} \]

plot of chunk unnamed-chunk-14

Apartado d): Modelar el número acumulado de personas infectadas mediante un ajuste no lineal de tipo asintótico más realista que los anteriores.

Benjamin GompertzSigmoide

Modelo sigmoide de Gompertz:

\[ Y = Ke^{-e^{a-bx}} \]

Relación con Ec. Lineal:

\[ loglog\dfrac{k}{y}=a-bx \]

Punto de inflexión:

\[ \left. x=\dfrac{a}{b} \atop y=\dfrac{k}{e} \right\} \left. 93 \approx\dfrac{a}{b} \atop 186635 \approx\dfrac{k}{e} \right\} \left. \begin{array}{rcl} a = 9.3\\ b = 0.1\\ k = 186635e \end{array} \right\} \]

Mínimos cuadrados no lineales para este modelo:

Gompertz<-nls(AllCasesUK~k*exp(-exp(a-b*days)),
star=list(a=9.3,b=0.1,k=507326.5291),
control=nls.control(minFactor=2^-24, maxiter=200))
summary(Gompertz)

Formula: AllCasesUK ~ k * exp(-exp(a - b * days))

Parameters:
   Estimate Std. Error t value Pr(>|t|)    
a 4.177e+00  4.098e-02  101.95   <2e-16 ***
b 5.606e-02  7.123e-04   78.70   <2e-16 ***
k 2.609e+05  2.905e+03   89.79   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1022 on 90 degrees of freedom

Number of iterations to convergence: 159 
Achieved convergence tolerance: 2.967e-06
a<-summary(Gompertz)$coefficients[1];a
[1] 4.177245
b<-summary(Gompertz)$coefficients[2];b
[1] 0.05605733
k<-summary(Gompertz)$coefficients[3];k
[1] 260853.8

\[ \]

Ecuación final del modelo de Gompertz:

\[ \] \[ Y = 260853.9e^{-e^{4.177245-0.05605732x}} \]

plot of chunk unnamed-chunk-17