Central America and the Caribbean

This report organizes and analyzes landslide data for a sample of 15 countries in Central America and North America, broken down by the states and cities of each sampled country. It applies several statistical methods to examine the landslides both in general and in detail, including their relationship to variables such as distance and date. The overall aim is to show how useful statistics are for organizing, categorizing, and plotting data accurately and meaningfully.

Dominican Republic

library(readr)
library(knitr)
df <- read.csv("https://raw.githubusercontent.com/lihkir/AnalisisEstadisticoUN/main/Data/catalog.csv")
library(dplyr)
# Rename the columns used in this report (positions refer to the raw catalog).
colnames(df)[2] <- "Date"
colnames(df)[4] <- "Continent"
colnames(df)[5] <- "Country"
colnames(df)[7] <- "State"
colnames(df)[9] <- "City"
colnames(df)[10] <- "Distance"
# Keep only the events recorded in the Dominican Republic.
df_DO <- subset(df, Country == "Dominican Republic")
knitr::kable(head(df_DO))
| | id | Date | time | Continent | Country | country_code | State | population | City | Distance | location_description | latitude | longitude | geolocation | hazard_type | landslide_type | landslide_size | trigger | storm_name | injuries | fatalities | source_name | source_link |
| 15 | 124 | 7/13/07 | Night | NA | Dominican Republic | DO | Distrito Nacional | 13456 | San Carlos | 1.70298 | | 18.4757 | -69.9140 | (18.4757, -69.914000000000001) | Landslide | Landslide | Small | Unknown | | NA | NA | Dominican Today | http://www.dominicantoday.com/app/article.aspx?id=24682 |
| 52 | 333 | 10/29/07 | | NA | Dominican Republic | DO | San Cristóbal | 66784 | Bajos de Haina | 1.72138 | | 18.4270 | -70.0440 | (18.427, -70.043999999999997) | Landslide | Mudslide | Medium | Tropical cyclone | Tropical Storm Noel | NA | 3 | United Nations Development Programme - Relief Web | http://news.scotsman.com/international.cfm?id=1730152007 |
| 58 | 343 | 11/1/07 | | NA | Dominican Republic | DO | La Vega | 3613 | Río Verde Abajo | 3.72637 | | 19.3050 | -70.6000 | (19.305, -70.599999999999994) | Landslide | Complex | Large | Tropical cyclone | Tropical Storm Noel | NA | 68 | United Nations Development Programme - Relief Web | http://www.reliefweb.int/rw/fullMaps_Am.nsf/luFullMap/CEB72F0756431A7CC125738D003E2EF4/$File/ifrc_TC_carib071108.pdf?OpenElement |
| 64 | 388 | 12/11/07 | | NA | Dominican Republic | DO | Santiago | 1200000 | Santiago de los Caballeros | 1.10868 | | 19.4550 | -70.7070 | (19.454999999999998, -70.706999999999994) | Landslide | Landslide | Medium | Tropical cyclone | Tropical Storm Olga | NA | 17 | news.gossip.info | http://clutchmagonline.com/newsgossipinfo/caribbean-storm-death-toll-rises/ |
| 132 | 724 | 8/17/08 | | NA | Dominican Republic | DO | Hato Mayor | 13977 | Sabana de La Mar | 0.75284 | | 19.0560 | -69.3822 | (19.056000000000001, -69.382199999999997) | Landslide | Complex | Medium | Tropical cyclone | Tropical Storm Fay | NA | NA | | http://www.dominicantoday.com/dr/economy/2008/8/18/29085/Storms-downpours-block-transit-on-newest-Dominican-highway |
| 138 | 746 | 8/26/08 | | NA | Dominican Republic | DO | Distrito Nacional | 10457 | La Agustina | 5.71058 | | 18.5500 | -69.9200 | (18.55, -69.92) | Landslide | Mudslide | Medium | Tropical cyclone | Hurricane Gustav | NA | 8 | | http://www.reuters.com/article/worldNews/idUSN2541891320080827?pageNumber=1&virtualBrandChannel=0 |
df_DO %>% 
  select(Country, State, City, Distance, Date)
##                 Country             State                       City Distance
## 15   Dominican Republic Distrito Nacional                 San Carlos  1.70298
## 52   Dominican Republic    San Cristóbal             Bajos de Haina  1.72138
## 58   Dominican Republic           La Vega           Río Verde Abajo  3.72637
## 64   Dominican Republic          Santiago Santiago de los Caballeros  1.10868
## 132  Dominican Republic        Hato Mayor           Sabana de La Mar  0.75284
## 138  Dominican Republic Distrito Nacional                La Agustina  5.71058
## 178  Dominican Republic          Santiago              Pedro García  4.86398
## 211  Dominican Republic      Puerto Plata                   Altamira  0.88500
## 212  Dominican Republic          Santiago                   Tamboril  4.31327
## 750  Dominican Republic          Santiago     San José de Las Matas  2.72462
## 774  Dominican Republic Distrito Nacional              Santo Domingo  0.55721
## 833  Dominican Republic           La Vega                  Constanza  0.52969
## 923  Dominican Republic      Puerto Plata               Puerto Plata  1.19636
## 1394 Dominican Republic     Santo Domingo         Santo Domingo Este  3.98059
## 1395 Dominican Republic      Puerto Plata                   Luperón  1.54885
##          Date
## 15    7/13/07
## 52   10/29/07
## 58    11/1/07
## 64   12/11/07
## 132   8/17/08
## 138   8/26/08
## 178   2/12/09
## 211   9/20/09
## 212   9/20/09
## 750    6/3/11
## 774    7/6/11
## 833  11/18/11
## 923   12/5/12
## 1394   8/3/14
## 1395  11/7/14

Landslides by state

library(ggplot2)
ggplot(data=df_DO, aes(x = "Dominican Republic", y = Distance, fill=State)) +
  geom_bar(stat = "identity", width = 1, color = "black") +
  coord_polar("y", start = 0)

ggplot(data=df_DO, aes(fill=State, y=Distance, x="Dominican Republic")) +
  geom_bar(position="dodge", stat="identity")
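The two charts above use the summed `Distance` of each state as the bar height. If the quantity of interest is the number of recorded landslides per state, a count may be more direct. The following is a minimal sketch in base R, assuming a small data frame `toy` (a hypothetical name) whose `State` values are copied from the Dominican Republic listing above.

```r
# States copied from the 15 Dominican Republic events listed above.
toy <- data.frame(
  State = c("Distrito Nacional", "San Cristóbal", "La Vega", "Santiago",
            "Hato Mayor", "Distrito Nacional", "Santiago", "Puerto Plata",
            "Santiago", "Santiago", "Distrito Nacional", "La Vega",
            "Puerto Plata", "Santo Domingo", "Puerto Plata")
)

# Number of recorded events per state, largest first.
state_counts <- sort(table(toy$State), decreasing = TRUE)
print(state_counts)

# Horizontal bar chart of the counts.
barplot(state_counts, horiz = TRUE, las = 1, cex.names = 0.7,
        xlab = "Number of landslides")
```

With the real data, the same idea would apply to `df_DO$State` instead of the toy column.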

Distrito Nacional

Landslides in the cities of Distrito Nacional

library(readr)
library(knitr)
df_DN <- subset (df, State == "Distrito Nacional")
df_DN %>% 
  select(Country, State, City, Distance, Date) 
##                Country             State          City Distance    Date
## 15  Dominican Republic Distrito Nacional    San Carlos  1.70298 7/13/07
## 138 Dominican Republic Distrito Nacional   La Agustina  5.71058 8/26/08
## 774 Dominican Republic Distrito Nacional Santo Domingo  0.55721  7/6/11
ggplot(data=df_DN, aes(x=City, y=Distance)) + geom_bar(stat="identity", color="blue", fill="white")

Pie chart

ggplot(df_DN,aes(x="Distrito Nacional",y=Distance, fill=City))+
  geom_bar(stat = "identity",
           color="white")+
    geom_text(aes(label=(Distance*1)),
              position=position_stack(vjust=0.5),color="white",size=4)+
  coord_polar(theta = "y")+
    labs(title="Landslide chart")

Pareto chart

City with the largest landslide

library(qcc)
Distance <- df_DN$Distance
names(Distance) <- df_DN$City 

pareto.chart(Distance, 
             ylab="Distance",
             col = heat.colors(length(Distance)),
             cumperc = seq(0, 100, by = 10),
             ylab2 = "Cumulative percentage",
             main = "CITIES WITH THE LARGEST LANDSLIDES"
)

##                
## Pareto chart analysis for Distance
##                  Frequency  Cum.Freq. Percentage Cum.Percent.
##   La Agustina     5.710580   5.710580  71.644019    71.644019
##   San Carlos      1.702980   7.413560  21.365314    93.009333
##   Santo Domingo   0.557210   7.970770   6.990667   100.000000
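The columns of the Pareto analysis above can be reproduced by hand, which may help when reading the chart. This is a sketch in base R rather than the internals of `pareto.chart`; the vector `distance` and the data frame `pareto` are illustrative names, and the values are copied from the Distrito Nacional listing.

```r
# Distances for Distrito Nacional, copied from the listing above.
distance <- c("San Carlos" = 1.70298, "La Agustina" = 5.71058,
              "Santo Domingo" = 0.55721)

# A Pareto chart orders the categories from largest to smallest.
sorted <- sort(distance, decreasing = TRUE)

# Running total and percentages relative to the overall sum.
pareto <- data.frame(
  value       = sorted,
  cum_value   = cumsum(sorted),
  percent     = 100 * sorted / sum(sorted),
  cum_percent = 100 * cumsum(sorted) / sum(sorted)
)
print(round(pareto, 3))
```

Note that the column labelled `Frequency` in the output above holds distances, not counts, because the chart was given `Distance` values as input.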

Stem-and-leaf plot

stem(df_DN$"Distance")
## 
##   The decimal point is at the |
## 
##   0 | 67
##   2 | 
##   4 | 7
stem(df_DN$"Distance", scale = 2)
## 
##   The decimal point is at the |
## 
##   0 | 6
##   1 | 7
##   2 | 
##   3 | 
##   4 | 
##   5 | 7

Time series

library(forecast)
data_serie<- ts(df_DN$Distance, frequency=12, start=2007)
head(data_serie)
##          Jan     Feb     Mar
## 2007 1.70298 5.71058 0.55721
autoplot(data_serie) +
  labs(title = "Landslide series", x = "Years", y = "Distance") + theme_bw()
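The `ts()` call above places the three events in consecutive months (January to March 2007), although the listing shows them on 7/13/07, 8/26/08, and 7/6/11. One possible alternative is to plot each event at its recorded date. The sketch below uses base R only; the data frame `events` is an illustrative name, and the date format `%m/%d/%y` is an assumption based on how the dates appear in the listing.

```r
# Dates and distances for Distrito Nacional, copied from the listing above.
events <- data.frame(
  Date     = c("7/13/07", "8/26/08", "7/6/11"),
  Distance = c(1.70298, 5.71058, 0.55721)
)

# Parse month/day/two-digit-year strings into Date objects.
events$Date <- as.Date(events$Date, format = "%m/%d/%y")

# Order chronologically before plotting.
events <- events[order(events$Date), ]

# Plot each event at its actual date instead of at consecutive months.
plot(events$Date, events$Distance, type = "b",
     xlab = "Date", ylab = "Distance",
     main = "Landslide events, Distrito Nacional")
```

Because the events are irregularly spaced, a regular monthly `ts` object is hard to justify here; a dated scatter or line plot keeps the real gaps visible.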

Frequency tables

library(questionr)
table <- questionr::freq(Distance, cum = TRUE, sort = "dec", total = TRUE)
knitr::kable(table)
| | n | % | val% | %cum | val%cum |
| 0.55721 | 1 | 33.3 | 33.3 | 33.3 | 33.3 |
| 1.70298 | 1 | 33.3 | 33.3 | 66.7 | 66.7 |
| 5.71058 | 1 | 33.3 | 33.3 | 100.0 | 100.0 |
| Total | 3 | 100.0 | 100.0 | 100.0 | 100.0 |
str(table) 
## Classes 'freqtab' and 'data.frame':  4 obs. of  5 variables:
##  $ n      : num  1 1 1 3
##  $ %      : num  33.3 33.3 33.3 100
##  $ val%   : num  33.3 33.3 33.3 100
##  $ %cum   : num  33.3 66.7 100 100
##  $ val%cum: num  33.3 66.7 100 100
x <- row.names(table)
y <- table$n
names <- x[1:(length(x)-1)]
freqs <- y[1:(length(y)-1)]
df <- data.frame(x = names, y = freqs)
knitr::kable(df)
x y
0.55721 1
1.70298 1
5.71058 1
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) + 
  geom_bar(stat="identity", color="white", fill="blue") +
  xlab("Distance") +
  ylab("Frequency")

Grouped frequency table

n_sturges = 1 + log(length(Distance))/log(2)
n_sturgesc = ceiling(n_sturges)
n_sturgesf = floor(n_sturges)

n_clases = 0
if (n_sturgesc%%2 == 0) {
  n_clases = n_sturgesf
} else {
  n_clases = n_sturgesc
}
R = max(Distance) - min(Distance)
w = ceiling(R/n_clases)
bins <- seq(min(Distance), max(Distance) + w, by = w)
bins
## [1] 0.55721 2.55721 4.55721 6.55721
# Assign each distance to a class (this variable is not used by the table below).
Distance_classes <- cut(Distance, bins)
Freq_table <- transform(table(Distance), Rel_Freq=prop.table(Freq), Cum_Freq=cumsum(Freq))
knitr::kable(Freq_table)
Distance Freq Rel_Freq Cum_Freq
0.55721 1 0.3333333 1
1.70298 1 0.3333333 2
5.71058 1 0.3333333 3
str(Freq_table)
## 'data.frame':    3 obs. of  4 variables:
##  $ Distance: Factor w/ 3 levels "0.55721","1.70298",..: 1 2 3
##  $ Freq    : int  1 1 1
##  $ Rel_Freq: num  0.333 0.333 0.333
##  $ Cum_Freq: int  1 2 3
df <- data.frame(x = Freq_table$Distance, y = Freq_table$Freq)
knitr::kable(df)
x y
0.55721 1
1.70298 1
5.71058 1
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) +
  geom_bar(stat="identity", color="blue", fill="green") +
  xlab("Distance range") +
  ylab("Frequency")
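The frequency table above is built from `table(Distance)`, so each distinct value forms its own row and the class limits in `bins` are not used. A grouped table could be obtained by tabulating the classes returned by `cut()` instead. The sketch below assumes the same class width of 2 shown in the `bins` output; `distance`, `breaks`, `classes`, and `grouped` are illustrative names.

```r
# Distances for Distrito Nacional, copied from the listing above.
distance <- c(1.70298, 5.71058, 0.55721)

# Class limits matching the `bins` output above: start at the minimum, width 2.
breaks <- seq(min(distance), by = 2, length.out = 4)

# include.lowest = TRUE keeps the minimum, which lies exactly on the first limit;
# without it, cut() would return NA for that observation.
classes <- cut(distance, breaks = breaks, include.lowest = TRUE)

# Absolute, relative, and cumulative frequency per class.
grouped <- as.data.frame(table(classes))
names(grouped) <- c("Class", "Freq")
grouped$Rel_Freq <- grouped$Freq / sum(grouped$Freq)
grouped$Cum_Freq <- cumsum(grouped$Freq)
print(grouped)
```

With only three observations the grouping adds little, but the same steps carry over to states with more events.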

People affected by landslides

summary(df_DN$Distance)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.5572  1.1301  1.7030  2.6569  3.7068  5.7106
library(pastecs)
stat.desc(df_DN)
## Warning in min(x): no non-missing arguments to min; returning Inf
## Warning in max(x): no non-missing arguments to max; returning -Inf
## Warning in qt((0.5 + p/2), (Nbrval - 1)): NaNs produced
##                        id Date time Continent Country country_code State
## nbr.val      3.000000e+00   NA   NA        NA      NA           NA    NA
## nbr.null     0.000000e+00   NA   NA        NA      NA           NA    NA
## nbr.na       0.000000e+00   NA   NA        NA      NA           NA    NA
## min          1.240000e+02   NA   NA        NA      NA           NA    NA
## max          3.736000e+03   NA   NA        NA      NA           NA    NA
## range        3.612000e+03   NA   NA        NA      NA           NA    NA
## sum          4.606000e+03   NA   NA        NA      NA           NA    NA
## median       7.460000e+02   NA   NA        NA      NA           NA    NA
## mean         1.535333e+03   NA   NA        NA      NA           NA    NA
## SE.mean      1.114887e+03   NA   NA        NA      NA           NA    NA
## CI.mean.0.95 4.796973e+03   NA   NA        NA      NA           NA    NA
## var          3.728921e+06   NA   NA        NA      NA           NA    NA
## std.dev      1.931042e+03   NA   NA        NA      NA           NA    NA
## coef.var     1.257734e+00   NA   NA        NA      NA           NA    NA
##                population City Distance location_description     latitude
## nbr.val      3.000000e+00   NA 3.000000                   NA  3.000000000
## nbr.null     0.000000e+00   NA 0.000000                   NA  0.000000000
## nbr.na       0.000000e+00   NA 0.000000                   NA  0.000000000
## min          1.045700e+04   NA 0.557210                   NA 18.475700000
## max          2.201941e+06   NA 5.710580                   NA 18.550000000
## range        2.191484e+06   NA 5.153370                   NA  0.074300000
## sum          2.225854e+06   NA 7.970770                   NA 55.525700000
## median       1.345600e+04   NA 1.702980                   NA 18.500000000
## mean         7.419513e+05   NA 2.656923                   NA 18.508566667
## SE.mean      7.299953e+05   NA 1.562243                   NA  0.021872078
## CI.mean.0.95 3.140916e+06   NA 6.721790                   NA  0.094107954
## var          1.598680e+12   NA 7.321812                   NA  0.001435163
## std.dev      1.264389e+06   NA 2.705885                   NA  0.037883550
## coef.var     1.704140e+00   NA 1.018428                   NA  0.002046812
##                  longitude geolocation hazard_type landslide_type
## nbr.val       3.000000e+00          NA          NA             NA
## nbr.null      0.000000e+00          NA          NA             NA
## nbr.na        0.000000e+00          NA          NA             NA
## min          -6.998330e+01          NA          NA             NA
## max          -6.991400e+01          NA          NA             NA
## range         6.930000e-02          NA          NA             NA
## sum          -2.098173e+02          NA          NA             NA
## median       -6.992000e+01          NA          NA             NA
## mean         -6.993910e+01          NA          NA             NA
## SE.mean       2.216777e-02          NA          NA             NA
## CI.mean.0.95  9.538021e-02          NA          NA             NA
## var           1.474230e-03          NA          NA             NA
## std.dev       3.839570e-02          NA          NA             NA
## coef.var     -5.489877e-04          NA          NA             NA
##              landslide_size trigger storm_name injuries fatalities source_name
## nbr.val                  NA      NA         NA        0   2.000000          NA
## nbr.null                 NA      NA         NA        0   0.000000          NA
## nbr.na                   NA      NA         NA        3   1.000000          NA
## min                      NA      NA         NA      Inf   1.000000          NA
## max                      NA      NA         NA     -Inf   8.000000          NA
## range                    NA      NA         NA     -Inf   7.000000          NA
## sum                      NA      NA         NA        0   9.000000          NA
## median                   NA      NA         NA       NA   4.500000          NA
## mean                     NA      NA         NA      NaN   4.500000          NA
## SE.mean                  NA      NA         NA       NA   3.500000          NA
## CI.mean.0.95             NA      NA         NA      NaN  44.471717          NA
## var                      NA      NA         NA       NA  24.500000          NA
## std.dev                  NA      NA         NA       NA   4.949747          NA
## coef.var                 NA      NA         NA       NA   1.099944          NA
##              source_link
## nbr.val               NA
## nbr.null              NA
## nbr.na                NA
## min                   NA
## max                   NA
## range                 NA
## sum                   NA
## median                NA
## mean                  NA
## SE.mean               NA
## CI.mean.0.95          NA
## var                   NA
## std.dev               NA
## coef.var              NA
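Most columns in the output above are `NA` because `stat.desc()` was applied to the whole data frame, including text columns, and the `Inf`/`NaN` entries come from `injuries`, which has no observed values. Restricting the summary to numeric columns gives a more compact result. The sketch below uses base R on a toy data frame; the `fatalities` value for Santo Domingo (1) is inferred from the minimum, maximum, and sum reported above, not read from the raw catalog.

```r
# Toy subset of Distrito Nacional; `fatalities` for Santo Domingo is inferred
# from the stat.desc output above (min 1, max 8, sum 9, one NA).
toy <- data.frame(
  City       = c("San Carlos", "La Agustina", "Santo Domingo"),
  Distance   = c(1.70298, 5.71058, 0.55721),
  fatalities = c(NA, 8, 1)
)

# Keep only the numeric columns.
numeric_cols <- toy[sapply(toy, is.numeric)]

# A few descriptive statistics per column, ignoring missing values.
stats <- sapply(numeric_cols, function(x) c(
  n    = sum(!is.na(x)),
  n_na = sum(is.na(x)),
  mean = mean(x, na.rm = TRUE),
  sd   = sd(x, na.rm = TRUE)
))
print(round(stats, 4))
```

The same column filter could presumably be passed to `pastecs::stat.desc()` to obtain its full set of statistics without the text columns.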

Box-and-whisker plot

boxplot(Distance, horizontal=TRUE, col='steelblue')
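The box drawn above is defined by a five-number summary, and points beyond 1.5 times the box width from the hinges are flagged as outliers. The sketch below computes those quantities in base R for the three Distrito Nacional distances; `distance`, `five`, and the fence variables are illustrative names.

```r
# Distances for Distrito Nacional, copied from the listing above.
distance <- c(1.70298, 5.71058, 0.55721)

# Tukey's five-number summary: minimum, lower hinge, median, upper hinge, maximum.
five <- fivenum(distance)
names(five) <- c("min", "lower_hinge", "median", "upper_hinge", "max")
print(five)

# Fences at 1.5 box widths beyond the hinges, the default rule in boxplot().
box_width   <- five[["upper_hinge"]] - five[["lower_hinge"]]
lower_fence <- five[["lower_hinge"]] - 1.5 * box_width
upper_fence <- five[["upper_hinge"]] + 1.5 * box_width

# Observations outside the fences would be drawn as individual points.
outliers <- distance[distance < lower_fence | distance > upper_fence]
print(outliers)
```

For this sample the hinges coincide with the quartiles reported by `summary()` above.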

library(tidyverse)
library(hrbrthemes)
library(viridis)
df <- data.frame(Distance)
df %>% ggplot(aes(x = "", y = Distance)) +
  geom_boxplot(color="red", fill="orange", alpha=0.5) +
  theme_ipsum() +
  theme(legend.position="none", plot.title = element_text(size=11)) +
  ggtitle("Landslides") +
  coord_flip() +
  xlab("") +
  ylab("")
## Warning in grid.Call(C_stringMetric, as.graphicsAnnot(x$label)): font family not
## found in Windows font database

library(readr)
library(knitr)
df <- read.csv("https://raw.githubusercontent.com/lihkir/AnalisisEstadisticoUN/main/Data/catalog.csv")
library(dplyr)
colnames(df)[4] <- "Continent"
colnames(df)[10] <- "Distance"
colnames(df)[5] <- "Country"
colnames(df)[7] <- "State"
colnames(df)[9] <- "City"
colnames(df)[2] <- "Date"
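The catalog is read again here because `df` was reassigned to smaller data frames in the previous section. One way to avoid the reload is to keep the catalog in `df` untouched and derive each state's subset through a small helper. The function `subset_state` below is hypothetical (it is not part of the original report), and the toy rows are copied from the listings above.

```r
# Hypothetical helper: return selected columns for one state
# without modifying the full catalog.
subset_state <- function(data, state,
                         cols = c("Country", "State", "City", "Distance", "Date")) {
  stopifnot(all(cols %in% names(data)))
  # which() drops rows where State is NA instead of returning NA rows.
  data[which(data$State == state), cols, drop = FALSE]
}

# Toy catalog with three rows copied from the listings above.
toy <- data.frame(
  Country  = "Dominican Republic",
  State    = c("La Vega", "Distrito Nacional", "La Vega"),
  City     = c("Río Verde Abajo", "San Carlos", "Constanza"),
  Distance = c(3.72637, 1.70298, 0.52969),
  Date     = c("11/1/07", "7/13/07", "11/18/11")
)

la_vega <- subset_state(toy, "La Vega")
print(la_vega)
```

Intermediate tables would then use their own names (for example `freq_df`), so the catalog stays available for the next state.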

La Vega

Landslides in the cities of La Vega

library(readr)
library(knitr)
df_LV <- subset (df, State == "La Vega")
df_LV %>% 
  select(Country, State, City, Distance, Date) 
##                Country   State             City Distance     Date
## 58  Dominican Republic La Vega Río Verde Abajo  3.72637  11/1/07
## 833 Dominican Republic La Vega        Constanza  0.52969 11/18/11
head(df_LV)
##       id     Date time Continent            Country country_code   State
## 58   343  11/1/07           <NA> Dominican Republic           DO La Vega
## 833 4051 11/18/11           <NA> Dominican Republic           DO La Vega
##     population             City Distance location_description latitude
## 58        3613 Río Verde Abajo  3.72637                       19.3050
## 833      29481        Constanza  0.52969                       18.9045
##     longitude                   geolocation hazard_type landslide_type
## 58    -70.600 (19.305, -70.599999999999994)   Landslide        Complex
## 833   -70.744 (18.904499999999999, -70.744)   Landslide       Mudslide
##     landslide_size          trigger          storm_name injuries fatalities
## 58           Large Tropical cyclone Tropical Storm Noel       NA         68
## 833         Medium         Downpour                           NA          0
##                                           source_name
## 58  United Nations Development Programme - Relief Web
## 833                                                  
##                                                                                                                          source_link
## 58  http://www.reliefweb.int/rw/fullMaps_Am.nsf/luFullMap/CEB72F0756431A7CC125738D003E2EF4/$File/ifrc_TC_carib071108.pdf?OpenElement
## 833                                      http://www.dominicantoday.com/dr/local/2011/11/18/41684/Mudslides-halt-traffic-to-Constanza
ggplot(data=df_LV, aes(x=City, y=Distance)) + geom_bar(stat="identity", color="blue", fill="white")

Pie chart

ggplot(df_LV,aes(x="La Vega",y=Distance, fill=City))+
  geom_bar(stat = "identity",
           color="white")+
    geom_text(aes(label=(Distance*1)),
              position=position_stack(vjust=0.5),color="white",size=4)+
  coord_polar(theta = "y")+
    labs(title="Landslide chart")

Pareto chart

City with the largest landslide
library(qcc)

Distance <- df_LV$Distance
names(Distance) <- df_LV$City 

pareto.chart(Distance, 
             ylab="Distance",
             col = heat.colors(length(Distance)),
             cumperc = seq(0, 100, by = 10),
             ylab2 = "Cumulative percentage",
             main = "CITIES WITH THE LARGEST LANDSLIDES"
)

##                   
## Pareto chart analysis for Distance
##                    Frequency Cum.Freq. Percentage Cum.Percent.
##   Río Verde Abajo   3.72637   3.72637   87.55445     87.55445
##   Constanza          0.52969   4.25606   12.44555    100.00000

Stem-and-leaf plot

stem(df_LV$"Distance")
## 
##   The decimal point is at the |
## 
##   0 | 5
##   1 | 
##   2 | 
##   3 | 7
stem(df_LV$"Distance", scale = 2)
## 
##   The decimal point is at the |
## 
##   0 | 5
##   1 | 
##   1 | 
##   2 | 
##   2 | 
##   3 | 
##   3 | 7

Time series

library(forecast)
data_serie<- ts(df_LV$Distance, frequency=12, start=2007)
head(data_serie)
##          Jan     Feb
## 2007 3.72637 0.52969
autoplot(data_serie) +
  labs(title = "Landslide series", x = "Years", y = "Distance") + theme_bw()

Frequency tables

library(questionr)

table <- questionr::freq(Distance, cum = TRUE, sort = "dec", total = TRUE)
knitr::kable(table)
| | n | % | val% | %cum | val%cum |
| 0.52969 | 1 | 50 | 50 | 50 | 50 |
| 3.72637 | 1 | 50 | 50 | 100 | 100 |
| Total | 2 | 100 | 100 | 100 | 100 |
str(table) 
## Classes 'freqtab' and 'data.frame':  3 obs. of  5 variables:
##  $ n      : num  1 1 2
##  $ %      : num  50 50 100
##  $ val%   : num  50 50 100
##  $ %cum   : num  50 100 100
##  $ val%cum: num  50 100 100
x <- row.names(table)
y <- table$n
names <- x[1:(length(x)-1)]
freqs <- y[1:(length(y)-1)]
df <- data.frame(x = names, y = freqs)
knitr::kable(df)
x y
0.52969 1
3.72637 1
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) + 
  geom_bar(stat="identity", color="white", fill="blue") +
  xlab("Distance") +
  ylab("Frequency")

Grouped frequency table

n_sturges = 1 + log(length(Distance))/log(2)
n_sturgesc = ceiling(n_sturges)
n_sturgesf = floor(n_sturges)

n_clases = 0
if (n_sturgesc%%2 == 0) {
  n_clases = n_sturgesf
} else {
  n_clases = n_sturgesc
}
R = max(Distance) - min(Distance)
w = ceiling(R/n_clases)
bins <- seq(min(Distance), max(Distance) + w, by = w)
bins
## [1] 0.52969 2.52969 4.52969
# Assign each distance to a class (this variable is not used by the table below).
Distance_classes <- cut(Distance, bins)
Freq_table <- transform(table(Distance), Rel_Freq=prop.table(Freq), Cum_Freq=cumsum(Freq))
knitr::kable(Freq_table)
Distance Freq Rel_Freq Cum_Freq
0.52969 1 0.5 1
3.72637 1 0.5 2
str(Freq_table)
## 'data.frame':    2 obs. of  4 variables:
##  $ Distance: Factor w/ 2 levels "0.52969","3.72637": 1 2
##  $ Freq    : int  1 1
##  $ Rel_Freq: num  0.5 0.5
##  $ Cum_Freq: int  1 2
df <- data.frame(x = Freq_table$Distance, y = Freq_table$Freq)
knitr::kable(df)
x y
0.52969 1
3.72637 1
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) +
  geom_bar(stat="identity", color="blue", fill="green") +
  xlab("Distance range") +
  ylab("Frequency")

People affected by landslides

summary(df_LV$Distance)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.5297  1.3289  2.1280  2.1280  2.9272  3.7264
library(pastecs)
stat.desc(df_LV)
## Warning in min(x): no non-missing arguments to min; returning Inf
## Warning in max(x): no non-missing arguments to max; returning -Inf
## Warning in qt((0.5 + p/2), (Nbrval - 1)): NaNs produced
##                        id Date time Continent Country country_code State
## nbr.val      2.000000e+00   NA   NA        NA      NA           NA    NA
## nbr.null     0.000000e+00   NA   NA        NA      NA           NA    NA
## nbr.na       0.000000e+00   NA   NA        NA      NA           NA    NA
## min          3.430000e+02   NA   NA        NA      NA           NA    NA
## max          4.051000e+03   NA   NA        NA      NA           NA    NA
## range        3.708000e+03   NA   NA        NA      NA           NA    NA
## sum          4.394000e+03   NA   NA        NA      NA           NA    NA
## median       2.197000e+03   NA   NA        NA      NA           NA    NA
## mean         2.197000e+03   NA   NA        NA      NA           NA    NA
## SE.mean      1.854000e+03   NA   NA        NA      NA           NA    NA
## CI.mean.0.95 2.355730e+04   NA   NA        NA      NA           NA    NA
## var          6.874632e+06   NA   NA        NA      NA           NA    NA
## std.dev      2.621952e+03   NA   NA        NA      NA           NA    NA
## coef.var     1.193424e+00   NA   NA        NA      NA           NA    NA
##                population City  Distance location_description    latitude
## nbr.val      2.000000e+00   NA  2.000000                   NA  2.00000000
## nbr.null     0.000000e+00   NA  0.000000                   NA  0.00000000
## nbr.na       0.000000e+00   NA  0.000000                   NA  0.00000000
## min          3.613000e+03   NA  0.529690                   NA 18.90450000
## max          2.948100e+04   NA  3.726370                   NA 19.30500000
## range        2.586800e+04   NA  3.196680                   NA  0.40050000
## sum          3.309400e+04   NA  4.256060                   NA 38.20950000
## median       1.654700e+04   NA  2.128030                   NA 19.10475000
## mean         1.654700e+04   NA  2.128030                   NA 19.10475000
## SE.mean      1.293400e+04   NA  1.598340                   NA  0.20025000
## CI.mean.0.95 1.643421e+05   NA 20.308835                   NA  2.54441750
## var          3.345767e+08   NA  5.109382                   NA  0.08020013
## std.dev      1.829144e+04   NA  2.260394                   NA  0.28319627
## coef.var     1.105423e+00   NA  1.062200                   NA  0.01482334
##                  longitude geolocation hazard_type landslide_type
## nbr.val       2.000000e+00          NA          NA             NA
## nbr.null      0.000000e+00          NA          NA             NA
## nbr.na        0.000000e+00          NA          NA             NA
## min          -7.074400e+01          NA          NA             NA
## max          -7.060000e+01          NA          NA             NA
## range         1.440000e-01          NA          NA             NA
## sum          -1.413440e+02          NA          NA             NA
## median       -7.067200e+01          NA          NA             NA
## mean         -7.067200e+01          NA          NA             NA
## SE.mean       7.200000e-02          NA          NA             NA
## CI.mean.0.95  9.148467e-01          NA          NA             NA
## var           1.036800e-02          NA          NA             NA
## std.dev       1.018234e-01          NA          NA             NA
## coef.var     -1.440788e-03          NA          NA             NA
##              landslide_size trigger storm_name injuries  fatalities source_name
## nbr.val                  NA      NA         NA        0    2.000000          NA
## nbr.null                 NA      NA         NA        0    1.000000          NA
## nbr.na                   NA      NA         NA        2    0.000000          NA
## min                      NA      NA         NA      Inf    0.000000          NA
## max                      NA      NA         NA     -Inf   68.000000          NA
## range                    NA      NA         NA     -Inf   68.000000          NA
## sum                      NA      NA         NA        0   68.000000          NA
## median                   NA      NA         NA       NA   34.000000          NA
## mean                     NA      NA         NA      NaN   34.000000          NA
## SE.mean                  NA      NA         NA       NA   34.000000          NA
## CI.mean.0.95             NA      NA         NA      NaN  432.010961          NA
## var                      NA      NA         NA       NA 2312.000000          NA
## std.dev                  NA      NA         NA       NA   48.083261          NA
## coef.var                 NA      NA         NA       NA    1.414214          NA
##              source_link
## nbr.val               NA
## nbr.null              NA
## nbr.na                NA
## min                   NA
## max                   NA
## range                 NA
## sum                   NA
## median                NA
## mean                  NA
## SE.mean               NA
## CI.mean.0.95          NA
## var                   NA
## std.dev               NA
## coef.var              NA
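The `CI.mean.0.95` row above reports the half-width of the 95% confidence interval for the mean, that is, the Student t quantile times the standard error. With only two observations the t quantile is very large, which explains why the half-width for `Distance` (about 20.3) far exceeds the mean (about 2.13). The sketch below reproduces the figure in base R; `distance`, `se`, and `half_width` are illustrative names.

```r
# Distances for La Vega, copied from the listing above.
distance <- c(3.72637, 0.52969)
n <- length(distance)

# Standard error of the mean.
se <- sd(distance) / sqrt(n)

# Half-width of the 95% interval: t quantile with n - 1 degrees of freedom times SE.
t_quantile <- qt(0.975, df = n - 1)
half_width <- t_quantile * se

print(c(mean = mean(distance), SE = se, t = t_quantile, half_width = half_width))
```

An interval this wide says very little about the mean, so the La Vega statistics are best read as a description of two events rather than as estimates.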

Box-and-whisker plot

boxplot(Distance, horizontal=TRUE, col='steelblue')

library(tidyverse)
library(hrbrthemes)
library(viridis)

df <- data.frame(Distance)
df %>% ggplot(aes(x = "", y = Distance)) +
  geom_boxplot(color="red", fill="orange", alpha=0.5) +
  theme_ipsum() +
  theme(legend.position="none", plot.title = element_text(size=11)) +
  ggtitle("Landslides") +
  coord_flip() +
  xlab("") +
  ylab("")
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font
## family not found in Windows font database


library(readr)
library(knitr)
df <- read.csv("https://raw.githubusercontent.com/lihkir/AnalisisEstadisticoUN/main/Data/catalog.csv")
library(dplyr)
colnames(df)[4] <- "Continent"
colnames(df)[10] <- "Distance"
colnames(df)[5] <- "Country"
colnames(df)[7] <- "State"
colnames(df)[9] <- "City"
colnames(df)[2] <- "Date"

Santiago

Landslides in the cities of Santiago

library(readr)
library(knitr)
df_St <- subset (df, State == "Santiago")
df_St %>% 
  select(Country, State, City, Distance, Date) 
##                Country    State                       City Distance     Date
## 64  Dominican Republic Santiago Santiago de los Caballeros  1.10868 12/11/07
## 178 Dominican Republic Santiago              Pedro García  4.86398  2/12/09
## 212 Dominican Republic Santiago                   Tamboril  4.31327  9/20/09
## 750 Dominican Republic Santiago     San José de Las Matas  2.72462   6/3/11
head(df_St)
##       id     Date time Continent            Country country_code    State
## 64   388 12/11/07           <NA> Dominican Republic           DO Santiago
## 178  984  2/12/09           <NA> Dominican Republic           DO Santiago
## 212 1178  9/20/09           <NA> Dominican Republic           DO Santiago
## 750 3569   6/3/11           <NA> Dominican Republic           DO Santiago
##     population                       City Distance location_description
## 64     1200000 Santiago de los Caballeros  1.10868                     
## 178       1457              Pedro García  4.86398                     
## 212      23304                   Tamboril  4.31327                     
## 750       9853     San José de Las Matas  2.72462                     
##     latitude longitude                               geolocation hazard_type
## 64   19.4550  -70.7070 (19.454999999999998, -70.706999999999994)   Landslide
## 178  19.5500  -70.6390              (19.55, -70.638999999999996)   Landslide
## 212  19.5167  -70.5866            (19.5167, -70.586600000000004)   Landslide
## 750  19.3556  -70.9189 (19.355599999999999, -70.918899999999994)   Landslide
##     landslide_type landslide_size          trigger          storm_name injuries
## 64       Landslide         Medium Tropical cyclone Tropical Storm Olga       NA
## 178       Mudslide         Medium         Downpour                           NA
## 212      Landslide          Small         Downpour                           NA
## 750      Landslide         Medium         Downpour                           NA
##     fatalities      source_name
## 64          17 news.gossip.info
## 178          0                 
## 212         NA                 
## 750          1                 
##                                                                        source_link
## 64     http://clutchmagonline.com/newsgossipinfo/caribbean-storm-death-toll-rises/
## 178 http://us.puerto-plata-live.com/puerto-plata/news/year-2009/february-2009.html
## 212              http://www.laht.com/article.asp?CategoryId=14092&ArticleId=327347
## 750               http://english.peopledaily.com.cn/90001/90777/90852/7402423.html
ggplot(data=df_St, aes(x=City, y=Distance)) + geom_bar(stat="identity", color="blue", fill="white")

Pie chart

ggplot(df_St, aes(x = "Santiago", y = Distance, fill = City)) +
  geom_bar(stat = "identity", color = "white") +
  geom_text(aes(label = Distance),
            position = position_stack(vjust = 0.5), color = "white", size = 4) +
  coord_polar(theta = "y") +
  labs(title = "Landslide distance by city")

Pareto chart

City with the largest landslide distance
library(qcc)

Distance <- df_St$Distance
names(Distance) <- df_St$City

pareto.chart(Distance,
             ylab = "Distance",
             col = heat.colors(length(Distance)),
             cumperc = seq(0, 100, by = 10),
             ylab2 = "Cumulative percentage",
             main = "CITIES WITH THE LARGEST LANDSLIDE DISTANCES"
)

##                             
## Pareto chart analysis for Distance
##                               Frequency  Cum.Freq. Percentage Cum.Percent.
##   Pedro García                4.863980   4.863980  37.384891    37.384891
##   Tamboril                     4.313270   9.177250  33.152096    70.536987
##   San José de Las Matas       2.724620  11.901870  20.941620    91.478608
##   Santiago de los Caballeros   1.108680  13.010550   8.521392   100.000000
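
The Percentage and Cum.Percent. columns in the output above are each city's share of the total distance and the running sum of those shares. The following is a minimal base-R sketch of that arithmetic, assuming the four Santiago distances typed in by hand from the table above; the names `sorted`, `pct` and `cum_pct` are illustrative and not part of the original analysis.

```r
# Distances for Santiago, copied by hand from the output above
Distance <- setNames(
  c(1.10868, 4.86398, 4.31327, 2.72462),
  c("Santiago de los Caballeros", "Pedro García", "Tamboril", "San José de Las Matas")
)

sorted  <- sort(Distance, decreasing = TRUE)  # largest distance first
pct     <- 100 * sorted / sum(sorted)         # share of the total, in percent
cum_pct <- cumsum(pct)                        # running total of the shares

print(round(cbind(Distance = sorted, Percentage = pct, Cum.Percent = cum_pct), 2))
```

Because `pareto.chart` receives distances rather than counts, the column it labels Frequency holds distances, and the percentages are shares of summed distance rather than shares of events.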

Stem-and-leaf plot

stem(df_St$"Distance")
## 
##   The decimal point is at the |
## 
##   1 | 1
##   2 | 7
##   3 | 
##   4 | 39
stem(df_St$"Distance", scale = 2)
## 
##   The decimal point is at the |
## 
##   1 | 1
##   1 | 
##   2 | 
##   2 | 7
##   3 | 
##   3 | 
##   4 | 3
##   4 | 9

Time series

library(forecast)
data_serie<- ts(df_St$Distance, frequency=12, start=2007)
head(data_serie)
##          Jan     Feb     Mar     Apr
## 2007 1.10868 4.86398 4.31327 2.72462
autoplot(data_serie) +
  labs(title = "Landslide series", x = "Year", y = "Distance", colour = "#00a0dc") +
  theme_bw()
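
Note that `ts(..., frequency = 12, start = 2007)` places the four observations in consecutive months (January to April 2007), whereas the Date column shows events between December 2007 and June 2011. A hedged alternative, assuming the dates follow the month/day/two-digit-year pattern seen in the table, is to parse them and plot distance against the actual event date. The `santiago` data frame below is typed in by hand for illustration and is not an object from the original report.

```r
# Santiago events copied by hand from the table above
santiago <- data.frame(
  City     = c("Santiago de los Caballeros", "Pedro García",
               "Tamboril", "San José de Las Matas"),
  Date     = c("12/11/07", "2/12/09", "9/20/09", "6/3/11"),
  Distance = c(1.10868, 4.86398, 4.31327, 2.72462)
)

# Parse month/day/two-digit-year strings into Date objects
santiago$Date <- as.Date(santiago$Date, format = "%m/%d/%y")

# Sort chronologically so the connecting line follows time
santiago <- santiago[order(santiago$Date), ]

# Irregularly spaced events: plot against the real dates, not a monthly index
plot(santiago$Date, santiago$Distance, type = "b",
     xlab = "Date", ylab = "Distance",
     main = "Landslide distance by event date (Santiago)")
```

With only four irregularly spaced events, this is closer to a dated scatter plot than to a regular monthly series, so seasonal tools from `forecast` would not apply to it directly.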

Frequency tables

library(questionr)

table <- questionr::freq(Distance, cum = TRUE, sort = "dec", total = TRUE)
knitr::kable(table)
           n    %   val%   %cum   val%cum
1.10868    1   25     25     25        25
2.72462    1   25     25     50        50
4.31327    1   25     25     75        75
4.86398    1   25     25    100       100
Total      4  100    100    100       100
str(table) 
## Classes 'freqtab' and 'data.frame':  5 obs. of  5 variables:
##  $ n      : num  1 1 1 1 4
##  $ %      : num  25 25 25 25 100
##  $ val%   : num  25 25 25 25 100
##  $ %cum   : num  25 50 75 100 100
##  $ val%cum: num  25 50 75 100 100
x <- row.names(table)
y <- table$n
names <- x[1:(length(x)-1)]
freqs <- y[1:(length(y)-1)]
df <- data.frame(x = names, y = freqs)
knitr::kable(df)
x y
1.10868 1
2.72462 1
4.31327 1
4.86398 1
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) + 
  geom_bar(stat="identity", color="white", fill="blue") +
  xlab("Distance") +
  ylab("Frequency")

Grouped frequency table

n_sturges = 1 + log(length(Distance))/log(2)
n_sturgesc = ceiling(n_sturges)
n_sturgesf = floor(n_sturges)

n_clases = 0
if (n_sturgesc%%2 == 0) {
  n_clases = n_sturgesf
} else {
  n_clases = n_sturgesc
}
R = max(Distance) - min(Distance)
w = ceiling(R/n_clases)
bins <- seq(min(Distance), max(Distance) + w, by = w)
bins
## [1] 1.10868 3.10868 5.10868
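# Class label for each distance; note that Freq_table below is built from the raw values, not from these classes
Distance_class <- cut(Distance, bins, include.lowest = TRUE)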
Freq_table <- transform(table(Distance), Rel_Freq=prop.table(Freq), Cum_Freq=cumsum(Freq))
knitr::kable(Freq_table)
Distance Freq Rel_Freq Cum_Freq
1.10868 1 0.25 1
2.72462 1 0.25 2
4.31327 1 0.25 3
4.86398 1 0.25 4
str(Freq_table)
## 'data.frame':    4 obs. of  4 variables:
##  $ Distance: Factor w/ 4 levels "1.10868","2.72462",..: 1 2 3 4
##  $ Freq    : int  1 1 1 1
##  $ Rel_Freq: num  0.25 0.25 0.25 0.25
##  $ Cum_Freq: int  1 2 3 4
df <- data.frame(x = Freq_table$Distance, y = Freq_table$Freq)
knitr::kable(df)
x y
1.10868 1
2.72462 1
4.31327 1
4.86398 1
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) +
  geom_bar(stat="identity", color="blue", fill="green") +
  xlab("Distance range") +
  ylab("Frequency")
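
The frequency table above is built with `table(Distance)`, so each distinct distance forms its own row and the class breaks in `bins` are never used. A minimal sketch of a table that is actually grouped, assuming the breaks printed above and the four Santiago distances typed in by hand (the names `classes` and `grouped` are illustrative):

```r
# Santiago distances and the class breaks printed above
Distance <- c(1.10868, 4.86398, 4.31327, 2.72462)
bins     <- c(1.10868, 3.10868, 5.10868)

# include.lowest = TRUE keeps the minimum, which coincides with the first break;
# without it the smallest distance would be assigned NA
classes <- cut(Distance, breaks = bins, include.lowest = TRUE)

# Absolute, relative and cumulative frequency per class
grouped <- transform(as.data.frame(table(classes)),
                     Rel_Freq = Freq / sum(Freq),
                     Cum_Freq = cumsum(Freq))
print(grouped)
```

With these breaks, two distances fall in each of the two classes.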

People affected by landslides

summary(df_St$Distance)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.109   2.321   3.519   3.253   4.451   4.864
library(pastecs)
stat.desc(df_St)
## Warning in min(x): no non-missing arguments to min; returning Inf
## Warning in max(x): no non-missing arguments to max; returning -Inf
## Warning in qt((0.5 + p/2), (Nbrval - 1)): NaNs produced
##                        id Date time Continent Country country_code State
## nbr.val      4.000000e+00   NA   NA        NA      NA           NA    NA
## nbr.null     0.000000e+00   NA   NA        NA      NA           NA    NA
## nbr.na       0.000000e+00   NA   NA        NA      NA           NA    NA
## min          3.880000e+02   NA   NA        NA      NA           NA    NA
## max          3.569000e+03   NA   NA        NA      NA           NA    NA
## range        3.181000e+03   NA   NA        NA      NA           NA    NA
## sum          6.119000e+03   NA   NA        NA      NA           NA    NA
## median       1.081000e+03   NA   NA        NA      NA           NA    NA
## mean         1.529750e+03   NA   NA        NA      NA           NA    NA
## SE.mean      7.002205e+02   NA   NA        NA      NA           NA    NA
## CI.mean.0.95 2.228414e+03   NA   NA        NA      NA           NA    NA
## var          1.961235e+06   NA   NA        NA      NA           NA    NA
## std.dev      1.400441e+03   NA   NA        NA      NA           NA    NA
## coef.var     9.154705e-01   NA   NA        NA      NA           NA    NA
##                population City   Distance location_description     latitude
## nbr.val      4.000000e+00   NA  4.0000000                   NA  4.000000000
## nbr.null     0.000000e+00   NA  0.0000000                   NA  0.000000000
## nbr.na       0.000000e+00   NA  0.0000000                   NA  0.000000000
## min          1.457000e+03   NA  1.1086800                   NA 19.355600000
## max          1.200000e+06   NA  4.8639800                   NA 19.550000000
## range        1.198543e+06   NA  3.7553000                   NA  0.194400000
## sum          1.234614e+06   NA 13.0105500                   NA 77.877300000
## median       1.657850e+04   NA  3.5189450                   NA 19.485850000
## mean         3.086535e+05   NA  3.2526375                   NA 19.469325000
## SE.mean      2.971496e+05   NA  0.8464003                   NA  0.042711657
## CI.mean.0.95 9.456625e+05   NA  2.6936236                   NA  0.135927554
## var          3.531914e+11   NA  2.8655741                   NA  0.007297143
## std.dev      5.942991e+05   NA  1.6928007                   NA  0.085423314
## coef.var     1.925457e+00   NA  0.5204394                   NA  0.004387585
##                  longitude geolocation hazard_type landslide_type
## nbr.val       4.000000e+00          NA          NA             NA
## nbr.null      0.000000e+00          NA          NA             NA
## nbr.na        0.000000e+00          NA          NA             NA
## min          -7.091890e+01          NA          NA             NA
## max          -7.058660e+01          NA          NA             NA
## range         3.323000e-01          NA          NA             NA
## sum          -2.828515e+02          NA          NA             NA
## median       -7.067300e+01          NA          NA             NA
## mean         -7.071287e+01          NA          NA             NA
## SE.mean       7.296329e-02          NA          NA             NA
## CI.mean.0.95  2.322018e-01          NA          NA             NA
## var           2.129457e-02          NA          NA             NA
## std.dev       1.459266e-01          NA          NA             NA
## coef.var     -2.063649e-03          NA          NA             NA
##              landslide_size trigger storm_name injuries fatalities source_name
## nbr.val                  NA      NA         NA        0   3.000000          NA
## nbr.null                 NA      NA         NA        0   1.000000          NA
## nbr.na                   NA      NA         NA        4   1.000000          NA
## min                      NA      NA         NA      Inf   0.000000          NA
## max                      NA      NA         NA     -Inf  17.000000          NA
## range                    NA      NA         NA     -Inf  17.000000          NA
## sum                      NA      NA         NA        0  18.000000          NA
## median                   NA      NA         NA       NA   1.000000          NA
## mean                     NA      NA         NA      NaN   6.000000          NA
## SE.mean                  NA      NA         NA       NA   5.507571          NA
## CI.mean.0.95             NA      NA         NA      NaN  23.697163          NA
## var                      NA      NA         NA       NA  91.000000          NA
## std.dev                  NA      NA         NA       NA   9.539392          NA
## coef.var                 NA      NA         NA       NA   1.589899          NA
##              source_link
## nbr.val               NA
## nbr.null              NA
## nbr.na                NA
## min                   NA
## max                   NA
## range                 NA
## sum                   NA
## median                NA
## mean                  NA
## SE.mean               NA
## CI.mean.0.95          NA
## var                   NA
## std.dev               NA
## coef.var              NA
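
Calling `stat.desc` on the whole data frame yields all-NA columns for the text fields and warnings for columns with no observed values, as seen above. One possible way to tidy the output, sketched here under the assumption that only numeric columns with at least one observed value are of interest, is to filter the columns first. The `santiago` data frame is a hand-typed subset of the table above, and `numeric_part` is an illustrative name.

```r
# Hand-typed subset of the Santiago rows shown above
santiago <- data.frame(
  City       = c("Santiago de los Caballeros", "Pedro García",
                 "Tamboril", "San José de Las Matas"),
  Distance   = c(1.10868, 4.86398, 4.31327, 2.72462),
  population = c(1200000, 1457, 23304, 9853),
  fatalities = c(17, 0, NA, 1),
  injuries   = c(NA_real_, NA_real_, NA_real_, NA_real_)
)

# Keep numeric columns that contain at least one non-missing value
is_num       <- vapply(santiago, is.numeric, logical(1))
has_data     <- vapply(santiago, function(col) any(!is.na(col)), logical(1))
numeric_part <- santiago[, is_num & has_data, drop = FALSE]

# Use pastecs if it is installed; otherwise fall back to base summary()
if (requireNamespace("pastecs", quietly = TRUE)) {
  print(pastecs::stat.desc(numeric_part))
} else {
  print(summary(numeric_part))
}
```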

Box plot

boxplot(Distance, horizontal=TRUE, col='steelblue')

library(tidyverse)
library(hrbrthemes)
library(viridis)

df <- data.frame(Distance)
df %>% ggplot(aes(x = "", y = Distance)) +
  geom_boxplot(color="red", fill="orange", alpha=0.5) +
  theme_ipsum() +
  theme(legend.position="none", plot.title = element_text(size=11)) +
  ggtitle("Landslides") +
  coord_flip() +
  xlab("") +
  ylab("")
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font
## family not found in Windows font database

library(readr)
library(knitr)
df <- read.csv("https://raw.githubusercontent.com/lihkir/AnalisisEstadisticoUN/main/Data/catalog.csv")
library(dplyr)
colnames(df)[4] <- "Continent"
colnames(df)[10] <- "Distance"
colnames(df)[5] <- "Country"
colnames(df)[7] <- "State"
colnames(df)[9] <- "City"
colnames(df)[2] <- "Date"

Hato Mayor

Landslides in the cities of Hato Mayor

library(readr)
library(knitr)
df_HM <- subset (df, State == "Hato Mayor")
df_HM %>% 
  select(Country, State, City, Distance, Date) 
##                Country      State             City Distance    Date
## 132 Dominican Republic Hato Mayor Sabana de La Mar  0.75284 8/17/08
head(df_HM)
##      id    Date time Continent            Country country_code      State
## 132 724 8/17/08           <NA> Dominican Republic           DO Hato Mayor
##     population             City Distance location_description latitude
## 132      13977 Sabana de La Mar  0.75284                        19.056
##     longitude                               geolocation hazard_type
## 132  -69.3822 (19.056000000000001, -69.382199999999997)   Landslide
##     landslide_type landslide_size          trigger         storm_name injuries
## 132        Complex         Medium Tropical cyclone Tropical Storm Fay       NA
##     fatalities source_name
## 132         NA            
##                                                                                                             source_link
## 132 http://www.dominicantoday.com/dr/economy/2008/8/18/29085/Storms-downpours-block-transit-on-newest-Dominican-highway
ggplot(data=df_HM, aes(x=City, y=Distance)) + geom_bar(stat="identity", color="blue", fill="white")

Pie chart

ggplot(df_HM, aes(x = "Hato Mayor", y = Distance, fill = City)) +
  geom_bar(stat = "identity", color = "white") +
  geom_text(aes(label = Distance),
            position = position_stack(vjust = 0.5), color = "white", size = 4) +
  coord_polar(theta = "y") +
  labs(title = "Landslide distance by city")

Pareto chart

City with the largest landslide distance
library(qcc)

Distance <- df_HM$Distance
names(Distance) <- df_HM$City

pareto.chart(Distance,
             ylab = "Distance",
             col = heat.colors(length(Distance)),
             cumperc = seq(0, 100, by = 10),
             ylab2 = "Cumulative percentage",
             main = "CITIES WITH THE LARGEST LANDSLIDE DISTANCES"
)

##                   
## Pareto chart analysis for Distance
##                    Frequency Cum.Freq. Percentage Cum.Percent.
##   Sabana de La Mar   0.75284   0.75284  100.00000    100.00000

Stem-and-leaf plot

stem(df_HM$"Distance")
stem(df_HM$"Distance", scale = 2)

Time series

library(forecast)
data_serie<- ts(df_HM$Distance, frequency=12, start=2007)
head(data_serie)
##          Jan
## 2007 0.75284
autoplot(data_serie) +
  labs(title = "Landslide series", x = "Year", y = "Distance", colour = "#00a0dc") +
  theme_bw()
## geom_path: Each group consists of only one observation. Do you need to adjust
## the group aesthetic?

Frequency tables

library(questionr)

table <- questionr::freq(Distance, cum = TRUE, sort = "dec", total = TRUE)
knitr::kable(table)
           n    %   val%   %cum   val%cum
0.75284    1  100    100    100       100
Total      1  100    100    100       100
str(table) 
## Classes 'freqtab' and 'data.frame':  2 obs. of  5 variables:
##  $ n      : num  1 1
##  $ %      : num  100 100
##  $ val%   : num  100 100
##  $ %cum   : num  100 100
##  $ val%cum: num  100 100

Grouped frequency table

n_sturges = 1 + log(length(Distance))/log(2)
n_sturgesc = ceiling(n_sturges)
n_sturgesf = floor(n_sturges)

n_clases = 0
if (n_sturgesc%%2 == 0) {
  n_clases = n_sturgesf
} else {
  n_clases = n_sturgesc
}
R = max(Distance) - min(Distance)
w = ceiling(R/n_clases)
# Rebuild Freq_table for Hato Mayor; otherwise the Santiago table from the previous section is reused
Freq_table <- transform(table(Distance), Rel_Freq=prop.table(Freq), Cum_Freq=cumsum(Freq))
str(Freq_table)
## 'data.frame':    1 obs. of  4 variables:
##  $ Distance: Factor w/ 1 level "0.75284": 1
##  $ Freq    : int 1
##  $ Rel_Freq: num 1
##  $ Cum_Freq: int 1
df <- data.frame(x = Freq_table$Distance, y = Freq_table$Freq)
knitr::kable(df)
x y
0.75284 1
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) +
  geom_bar(stat="identity", color="blue", fill="green") +
  xlab("Distance range") +
  ylab("Frequency")

x <- row.names(table)
y <- table$n
names <- x[1:(length(x)-1)]
freqs <- y[1:(length(y)-1)]
df <- data.frame(x = names, y = freqs)
knitr::kable(df)
x y
0.75284 1
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) + 
  geom_bar(stat="identity", color="white", fill="blue") +
  xlab("Distance") +
  ylab("Frequency")

People affected by landslides

head(summary(df_HM$Distance))
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.75284 0.75284 0.75284 0.75284 0.75284 0.75284
library(pastecs)
stat.desc(df_HM)
## Warning in qt((0.5 + p/2), (Nbrval - 1)): NaNs produced

## Warning in min(x): no non-missing arguments to min; returning Inf
## Warning in max(x): no non-missing arguments to max; returning -Inf
##               id Date time Continent Country country_code State population City
## nbr.val        1   NA   NA        NA      NA           NA    NA          1   NA
## nbr.null       0   NA   NA        NA      NA           NA    NA          0   NA
## nbr.na         0   NA   NA        NA      NA           NA    NA          0   NA
## min          724   NA   NA        NA      NA           NA    NA      13977   NA
## max          724   NA   NA        NA      NA           NA    NA      13977   NA
## range          0   NA   NA        NA      NA           NA    NA          0   NA
## sum          724   NA   NA        NA      NA           NA    NA      13977   NA
## median       724   NA   NA        NA      NA           NA    NA      13977   NA
## mean         724   NA   NA        NA      NA           NA    NA      13977   NA
## SE.mean       NA   NA   NA        NA      NA           NA    NA         NA   NA
## CI.mean.0.95 NaN   NA   NA        NA      NA           NA    NA        NaN   NA
## var           NA   NA   NA        NA      NA           NA    NA         NA   NA
## std.dev       NA   NA   NA        NA      NA           NA    NA         NA   NA
## coef.var      NA   NA   NA        NA      NA           NA    NA         NA   NA
##              Distance location_description latitude longitude geolocation
## nbr.val       1.00000                   NA    1.000    1.0000          NA
## nbr.null      0.00000                   NA    0.000    0.0000          NA
## nbr.na        0.00000                   NA    0.000    0.0000          NA
## min           0.75284                   NA   19.056  -69.3822          NA
## max           0.75284                   NA   19.056  -69.3822          NA
## range         0.00000                   NA    0.000    0.0000          NA
## sum           0.75284                   NA   19.056  -69.3822          NA
## median        0.75284                   NA   19.056  -69.3822          NA
## mean          0.75284                   NA   19.056  -69.3822          NA
## SE.mean            NA                   NA       NA        NA          NA
## CI.mean.0.95      NaN                   NA      NaN       NaN          NA
## var                NA                   NA       NA        NA          NA
## std.dev            NA                   NA       NA        NA          NA
## coef.var           NA                   NA       NA        NA          NA
##              hazard_type landslide_type landslide_size trigger storm_name
## nbr.val               NA             NA             NA      NA         NA
## nbr.null              NA             NA             NA      NA         NA
## nbr.na                NA             NA             NA      NA         NA
## min                   NA             NA             NA      NA         NA
## max                   NA             NA             NA      NA         NA
## range                 NA             NA             NA      NA         NA
## sum                   NA             NA             NA      NA         NA
## median                NA             NA             NA      NA         NA
## mean                  NA             NA             NA      NA         NA
## SE.mean               NA             NA             NA      NA         NA
## CI.mean.0.95          NA             NA             NA      NA         NA
## var                   NA             NA             NA      NA         NA
## std.dev               NA             NA             NA      NA         NA
## coef.var              NA             NA             NA      NA         NA
##              injuries fatalities source_name source_link
## nbr.val             0          0          NA          NA
## nbr.null            0          0          NA          NA
## nbr.na              1          1          NA          NA
## min               Inf        Inf          NA          NA
## max              -Inf       -Inf          NA          NA
## range            -Inf       -Inf          NA          NA
## sum                 0          0          NA          NA
## median             NA         NA          NA          NA
## mean              NaN        NaN          NA          NA
## SE.mean            NA         NA          NA          NA
## CI.mean.0.95      NaN        NaN          NA          NA
## var                NA         NA          NA          NA
## std.dev            NA         NA          NA          NA
## coef.var           NA         NA          NA          NA

Box plot

boxplot(Distance, horizontal=TRUE, col='steelblue')

library(tidyverse)
library(hrbrthemes)
library(viridis)

df <- data.frame(Distance)
df %>% ggplot(aes(x = "", y = Distance)) +
  geom_boxplot(color="red", fill="orange", alpha=0.5) +
  theme_ipsum() +
  theme(legend.position="none", plot.title = element_text(size=11)) +
  ggtitle("Landslides") +
  coord_flip() +
  xlab("") +
  ylab("")
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font
## family not found in Windows font database

library(readr)
library(knitr)
df <- read.csv("https://raw.githubusercontent.com/lihkir/AnalisisEstadisticoUN/main/Data/catalog.csv")
library(dplyr)
colnames(df)[4] <- "Continent"
colnames(df)[5] <- "Country"
colnames(df)[7] <- "State"
colnames(df)[9] <- "City"
colnames(df)[10] <- "Distance"
colnames(df)[2] <- "Date"

Santo Domingo

Landslides in the cities of Santo Domingo

library(readr)
library(knitr)
df_SD <- subset (df, State == "Santo Domingo")
df_SD %>% 
  select(Country, State, City, Distance, Date) 
##                 Country         State               City Distance   Date
## 1394 Dominican Republic Santo Domingo Santo Domingo Este  3.98059 8/3/14
head(df_SD)
##        id   Date time Continent            Country country_code         State
## 1394 6706 8/3/14           <NA> Dominican Republic           DO Santo Domingo
##      population               City Distance location_description latitude
## 1394          0 Santo Domingo Este  3.98059           Urban area  18.5225
##      longitude                               geolocation hazard_type
## 1394  -69.8693 (18.522500000000001, -69.869299999999996)   Landslide
##      landslide_type landslide_size          trigger storm_name injuries
## 1394      Landslide         Medium Tropical cyclone     Bertha        0
##      fatalities   source_name
## 1394          0 Zona Oriental
##                                                                                                                                             source_link
## 1394 http://www.delazonaoriental.net/2014/08/03/derrumbes-y-deslizamientos-de-tierra-afectan-varias-viviendas-en-la-barquita-tras-paso-tormenta-bertha/
ggplot(data=df_SD, aes(x=City, y=Distance)) + geom_bar(stat="identity", color="blue", fill="white")

Pie chart

ggplot(df_SD, aes(x = "Santo Domingo", y = Distance, fill = City)) +
  geom_bar(stat = "identity", color = "white") +
  geom_text(aes(label = Distance),
            position = position_stack(vjust = 0.5), color = "white", size = 4) +
  coord_polar(theta = "y") +
  labs(title = "Landslide distance by city")

Pareto chart

City with the largest landslide distance
library(qcc)

Distance <- df_SD$Distance
names(Distance) <- df_SD$City

pareto.chart(Distance,
             ylab = "Distance",
             col = heat.colors(length(Distance)),
             cumperc = seq(0, 100, by = 10),
             ylab2 = "Cumulative percentage",
             main = "CITIES WITH THE LARGEST LANDSLIDE DISTANCES"
)

##                     
## Pareto chart analysis for Distance
##                      Frequency Cum.Freq. Percentage Cum.Percent.
##   Santo Domingo Este   3.98059   3.98059  100.00000    100.00000

Stem-and-leaf plot

stem(df_SD$"Distance")
stem(df_SD$"Distance", scale = 2)

Time series

library(forecast)
data_serie<- ts(df_SD$Distance, frequency=12, start=2007)
head(data_serie)
##          Jan
## 2007 3.98059
autoplot(data_serie) +
  labs(title = "Landslide series", x = "Year", y = "Distance", colour = "#00a0dc") +
  theme_bw()
## geom_path: Each group consists of only one observation. Do you need to adjust
## the group aesthetic?

Frequency tables

library(questionr)

table <- questionr::freq(Distance, cum = TRUE, sort = "dec", total = TRUE)
knitr::kable(table)
           n    %   val%   %cum   val%cum
3.98059    1  100    100    100       100
Total      1  100    100    100       100
str(table) 
## Classes 'freqtab' and 'data.frame':  2 obs. of  5 variables:
##  $ n      : num  1 1
##  $ %      : num  100 100
##  $ val%   : num  100 100
##  $ %cum   : num  100 100
##  $ val%cum: num  100 100
x <- row.names(table)
y <- table$n
names <- x[1:(length(x)-1)]
freqs <- y[1:(length(y)-1)]
df <- data.frame(x = names, y = freqs)
knitr::kable(df)
x y
3.98059 1
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) + 
  geom_bar(stat="identity", color="white", fill="blue") +
  xlab("Distance") +
  ylab("Frequency")

Grouped frequency table

n_sturges = 1 + log(length(Distance))/log(2)
n_sturgesc = ceiling(n_sturges)
n_sturgesf = floor(n_sturges)

n_clases = 0
if (n_sturgesc%%2 == 0) {
  n_clases = n_sturgesf
} else {
  n_clases = n_sturgesc
}
R = max(Distance) - min(Distance)
w = ceiling(R/n_clases)
bins <- seq(min(Distance), max(Distance) + w, by = w)
bins
## [1] 3.98059
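# Class label for the distance; note that Freq_table below is built from the raw values, not from these classes
Distance_class <- cut(Distance, bins)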
Freq_table <- transform(table(Distance), Rel_Freq=prop.table(Freq), Cum_Freq=cumsum(Freq))
knitr::kable(Freq_table)
Distance Freq Rel_Freq Cum_Freq
3.98059 1 1 1
str(Freq_table)
## 'data.frame':    1 obs. of  4 variables:
##  $ Distance: Factor w/ 1 level "3.98059": 1
##  $ Freq    : int 1
##  $ Rel_Freq: num 1
##  $ Cum_Freq: int 1
df <- data.frame(x = Freq_table$Distance, y = Freq_table$Freq)
knitr::kable(df)
x y
3.98059 1
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) +
  geom_bar(stat="identity", color="blue", fill="green") +
  xlab("Distance range") +
  ylab("Frequency")
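
With a single observation the range is 0, so the class width `w` is 0 and `bins` collapses to one value, as the output above shows. The following sketch guards against that degenerate case. It is not the method used in this report: `sturges_bins` is a hypothetical helper, it uses equal-width classes that span exactly the observed range instead of the rounded-up width `w`, and the half-unit padding for the single-value case is an arbitrary assumption.

```r
# Hypothetical helper: Sturges class breaks that tolerate a zero range
sturges_bins <- function(x) {
  n_classes <- ceiling(1 + log2(length(x)))  # Sturges' rule
  if (max(x) == min(x)) {
    # Degenerate case (for example one observation): one class around the value
    return(c(min(x) - 0.5, max(x) + 0.5))
  }
  # n_classes equal-width classes from the minimum to the maximum
  seq(min(x), max(x), length.out = n_classes + 1)
}

# One observation (Santo Domingo) and four observations (Santiago)
single <- 3.98059
four   <- c(1.10868, 4.86398, 4.31327, 2.72462)

print(table(cut(single, sturges_bins(single), include.lowest = TRUE)))
print(table(cut(four,   sturges_bins(four),   include.lowest = TRUE)))
```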

People affected by landslides

summary(df_SD$Distance)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   3.981   3.981   3.981   3.981   3.981   3.981
library(pastecs)
stat.desc(df_SD)
## Warning in qt((0.5 + p/2), (Nbrval - 1)): NaNs produced

##                id Date time Continent Country country_code State population
## nbr.val         1   NA   NA        NA      NA           NA    NA          1
## nbr.null        0   NA   NA        NA      NA           NA    NA          1
## nbr.na          0   NA   NA        NA      NA           NA    NA          0
## min          6706   NA   NA        NA      NA           NA    NA          0
## max          6706   NA   NA        NA      NA           NA    NA          0
## range           0   NA   NA        NA      NA           NA    NA          0
## sum          6706   NA   NA        NA      NA           NA    NA          0
## median       6706   NA   NA        NA      NA           NA    NA          0
## mean         6706   NA   NA        NA      NA           NA    NA          0
## SE.mean        NA   NA   NA        NA      NA           NA    NA         NA
## CI.mean.0.95  NaN   NA   NA        NA      NA           NA    NA        NaN
## var            NA   NA   NA        NA      NA           NA    NA         NA
## std.dev        NA   NA   NA        NA      NA           NA    NA         NA
## coef.var       NA   NA   NA        NA      NA           NA    NA         NA
##              City Distance location_description latitude longitude geolocation
## nbr.val        NA  1.00000                   NA   1.0000    1.0000          NA
## nbr.null       NA  0.00000                   NA   0.0000    0.0000          NA
## nbr.na         NA  0.00000                   NA   0.0000    0.0000          NA
## min            NA  3.98059                   NA  18.5225  -69.8693          NA
## max            NA  3.98059                   NA  18.5225  -69.8693          NA
## range          NA  0.00000                   NA   0.0000    0.0000          NA
## sum            NA  3.98059                   NA  18.5225  -69.8693          NA
## median         NA  3.98059                   NA  18.5225  -69.8693          NA
## mean           NA  3.98059                   NA  18.5225  -69.8693          NA
## SE.mean        NA       NA                   NA       NA        NA          NA
## CI.mean.0.95   NA      NaN                   NA      NaN       NaN          NA
## var            NA       NA                   NA       NA        NA          NA
## std.dev        NA       NA                   NA       NA        NA          NA
## coef.var       NA       NA                   NA       NA        NA          NA
##              hazard_type landslide_type landslide_size trigger storm_name
## nbr.val               NA             NA             NA      NA         NA
## nbr.null              NA             NA             NA      NA         NA
## nbr.na                NA             NA             NA      NA         NA
## min                   NA             NA             NA      NA         NA
## max                   NA             NA             NA      NA         NA
## range                 NA             NA             NA      NA         NA
## sum                   NA             NA             NA      NA         NA
## median                NA             NA             NA      NA         NA
## mean                  NA             NA             NA      NA         NA
## SE.mean               NA             NA             NA      NA         NA
## CI.mean.0.95          NA             NA             NA      NA         NA
## var                   NA             NA             NA      NA         NA
## std.dev               NA             NA             NA      NA         NA
## coef.var              NA             NA             NA      NA         NA
##              injuries fatalities source_name source_link
## nbr.val             1          1          NA          NA
## nbr.null            1          1          NA          NA
## nbr.na              0          0          NA          NA
## min                 0          0          NA          NA
## max                 0          0          NA          NA
## range               0          0          NA          NA
## sum                 0          0          NA          NA
## median              0          0          NA          NA
## mean                0          0          NA          NA
## SE.mean            NA         NA          NA          NA
## CI.mean.0.95      NaN        NaN          NA          NA
## var                NA         NA          NA          NA
## std.dev            NA         NA          NA          NA
## coef.var           NA         NA          NA          NA

Box-and-whisker plot

boxplot(Distance, horizontal=TRUE, col='steelblue')

library(tidyverse)
library(hrbrthemes)
library(viridis)

df <- data.frame(Distance)
df %>% ggplot(aes(x = "", y = Distance)) +
  geom_boxplot(color="red", fill="orange", alpha=0.5) +
  theme_ipsum() +
  theme(legend.position="none", plot.title = element_text(size=11)) +
  ggtitle("Landslides") +
  coord_flip() +
  xlab("") +
  ylab("")
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font
## family not found in Windows font database


library(readr)
library(knitr)
df <- read.csv("https://raw.githubusercontent.com/lihkir/AnalisisEstadisticoUN/main/Data/catalog.csv")
library(dplyr)
colnames(df)[4] <- "Continent"
colnames(df)[10] <- "Distance"
colnames(df)[5] <- "Country"
colnames(df)[7] <- "State"
colnames(df)[9] <- "City"
colnames(df)[2] <- "Date"

Puerto Plata

Landslides in the cities of Puerto Plata

library(readr)
library(knitr)
df_PP <- subset (df, State == "Puerto Plata")
df_PP %>% 
  select(Country, State, City, Distance, Date) 
##                 Country        State         City Distance    Date
## 211  Dominican Republic Puerto Plata     Altamira  0.88500 9/20/09
## 923  Dominican Republic Puerto Plata Puerto Plata  1.19636 12/5/12
## 1395 Dominican Republic Puerto Plata     Luperón  1.54885 11/7/14
head(df_PP)
##        id    Date time Continent            Country country_code        State
## 211  1177 9/20/09           <NA> Dominican Republic           DO Puerto Plata
## 923  4655 12/5/12           <NA> Dominican Republic           DO Puerto Plata
## 1395 6707 11/7/14           <NA> Dominican Republic           DO Puerto Plata
##      population         City Distance location_description latitude longitude
## 211        4563     Altamira  0.88500                       19.6750  -70.8362
## 923      146000 Puerto Plata  1.19636                       19.7827  -70.6871
## 1395       4393     Luperón  1.54885           Below road  19.9053  -70.9630
##                                    geolocation hazard_type landslide_type
## 211  (19.675000000000001, -70.836200000000005)   Landslide      Landslide
## 923  (19.782699999999998, -70.687100000000001)   Landslide      Landslide
## 1395            (19.9053, -70.962999999999994)   Landslide      Landslide
##      landslide_size  trigger storm_name injuries fatalities source_name
## 211          Medium Downpour                  NA          2            
## 923          Medium     Rain                  NA         NA            
## 1395         Medium     Rain                   0          0         Hoy
##                                                                                                              source_link
## 211                                                    http://www.laht.com/article.asp?CategoryId=14092&ArticleId=327347
## 923  http://www.dominicantoday.com/dr/local/2012/12/5/45992/Crews-clear-Santiago-Puerto-Plata-road-blocked-by-landslides
## 1395                             http://hoy.com.do/carretera-luperon-presenta-hundimientos-y-deslizamientos-por-lluvias/
ggplot(data=df_PP, aes(x=City, y=Distance)) + geom_bar(stat="identity", color="blue", fill="white")

Pie chart

ggplot(df_PP, aes(x = "Puerto Plata", y = Distance, fill = City)) +
  geom_bar(stat = "identity", color = "white") +
  geom_text(aes(label = Distance),
            position = position_stack(vjust = 0.5), color = "white", size = 6) +
  coord_polar(theta = "y") +
  labs(title = "Landslide chart")

Pareto chart

City with the largest landslide distance
library(qcc)

Distance <- df_PP$Distance
names(Distance) <- df_PP$City 

pareto.chart(Distance, 
             ylab="Distance",
             col = heat.colors(length(Distance)),
             cumperc = seq(0, 100, by = 10),
             ylab2 = "Cumulative percentage",
             main = "CITIES WITH THE LARGEST LANDSLIDE DISTANCES"
)

##               
## Pareto chart analysis for Distance
##                Frequency Cum.Freq. Percentage Cum.Percent.
##   Luperón       1.54885   1.54885   42.66558     42.66558
##   Puerto Plata   1.19636   2.74521   32.95567     75.62125
##   Altamira       0.88500   3.63021   24.37875    100.00000
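
The percentages in the Pareto output can be checked by hand. The following is a minimal base-R sketch, assuming the three Puerto Plata distances shown above; the object names `distance` and `pareto` are illustrative and are not part of the original report.

```r
# Distances for the three Puerto Plata events, copied from the table above
distance <- setNames(c(0.88500, 1.19636, 1.54885),
                     c("Altamira", "Puerto Plata", "Luperón"))

# Sort from largest to smallest, as a Pareto chart does
sorted <- sort(distance, decreasing = TRUE)

# Build the same four columns that pareto.chart() prints
pareto <- data.frame(
  city         = names(sorted),
  distance     = unname(sorted),
  cum_distance = cumsum(unname(sorted)),
  percent      = 100 * unname(sorted) / sum(sorted),
  cum_percent  = 100 * cumsum(unname(sorted)) / sum(sorted)
)
print(pareto)
```

Note that the chart ranks cities by the `Distance` value of a single event each, not by the number of landslides, so the "Frequency" column in the output is really a distance.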

Stem-and-leaf plot

stem(df_PP$"Distance")
## 
##   The decimal point is 1 digit(s) to the left of the |
## 
##    8 | 9
##   10 | 
##   12 | 0
##   14 | 5
stem(df_PP$"Distance", scale = 2)
## 
##   The decimal point is 1 digit(s) to the left of the |
## 
##    8 | 9
##    9 | 
##   10 | 
##   11 | 
##   12 | 0
##   13 | 
##   14 | 
##   15 | 5

Time series

library(forecast)
data_serie<- ts(df_PP$Distance, frequency=12, start=2007)
head(data_serie)
##          Jan     Feb     Mar
## 2007 0.88500 1.19636 1.54885
autoplot(data_serie) +
  labs(title = "Landslide series", x = "Year", y = "Distance") +
  theme_bw()
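
The call `ts(..., frequency=12, start=2007)` labels the three observations as January to March 2007, although the `Date` column places them in 2009, 2012 and 2014. A possible alternative, sketched here under the assumption that the dates follow the month/day/two-digit-year pattern seen in the table, is to parse `Date` and plot against the real event dates. The data frame `events` is a hand-copied stand-in for `df_PP`, not an object from the report.

```r
# Stand-in for df_PP with the three rows shown above
events <- data.frame(
  City     = c("Altamira", "Puerto Plata", "Luperón"),
  Distance = c(0.88500, 1.19636, 1.54885),
  Date     = c("9/20/09", "12/5/12", "11/7/14"),
  stringsAsFactors = FALSE
)

# Parse month/day/two-digit-year strings into Date objects
events$Date <- as.Date(events$Date, format = "%m/%d/%y")

# Order chronologically before plotting
events <- events[order(events$Date), ]

# Irregularly spaced events: plot distance against the actual date
plot(events$Date, events$Distance, type = "b",
     xlab = "Event date", ylab = "Distance",
     main = "Landslide distance by event date")
```

Because the events are irregularly spaced, a plain date axis is used here rather than a regular monthly `ts` object.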

Frequency tables

library(questionr)

table <- questionr::freq(Distance, cum = TRUE, sort = "dec", total = TRUE)
knitr::kable(table)
|        |  n|     %|  val%|  %cum| val%cum|
|:-------|--:|-----:|-----:|-----:|-------:|
|0.885   |  1|  33.3|  33.3|  33.3|    33.3|
|1.19636 |  1|  33.3|  33.3|  66.7|    66.7|
|1.54885 |  1|  33.3|  33.3| 100.0|   100.0|
|Total   |  3| 100.0| 100.0| 100.0|   100.0|
str(table) 
## Classes 'freqtab' and 'data.frame':  4 obs. of  5 variables:
##  $ n      : num  1 1 1 3
##  $ %      : num  33.3 33.3 33.3 100
##  $ val%   : num  33.3 33.3 33.3 100
##  $ %cum   : num  33.3 66.7 100 100
##  $ val%cum: num  33.3 66.7 100 100
x <- row.names(table)
y <- table$n
names <- x[1:(length(x)-1)]
freqs <- y[1:(length(y)-1)]
df <- data.frame(x = names, y = freqs)
knitr::kable(df)
|x       |  y|
|:-------|--:|
|0.885   |  1|
|1.19636 |  1|
|1.54885 |  1|
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) + 
  geom_bar(stat="identity", color="white", fill="blue") +
  xlab("Distance") +
  ylab("Frequency")

Grouped frequency table

n_sturges = 1 + log(length(Distance))/log(2)
n_sturgesc = ceiling(n_sturges)
n_sturgesf = floor(n_sturges)

n_clases = 0
if (n_sturgesc%%2 == 0) {
  n_clases = n_sturgesf
} else {
  n_clases = n_sturgesc
}
R = max(Distance) - min(Distance)
w = ceiling(R/n_clases)
bins <- seq(min(Distance), max(Distance) + w, by = w)
bins
## [1] 0.885 1.885
Edades <- cut(Distance, bins)
Freq_table <- transform(table(Distance), Rel_Freq=prop.table(Freq), Cum_Freq=cumsum(Freq))
knitr::kable(Freq_table)
|Distance | Freq|  Rel_Freq| Cum_Freq|
|:--------|----:|---------:|--------:|
|0.885    |    1| 0.3333333|        1|
|1.19636  |    1| 0.3333333|        2|
|1.54885  |    1| 0.3333333|        3|
str(Freq_table)
## 'data.frame':    3 obs. of  4 variables:
##  $ Distance: Factor w/ 3 levels "0.885","1.19636",..: 1 2 3
##  $ Freq    : int  1 1 1
##  $ Rel_Freq: num  0.333 0.333 0.333
##  $ Cum_Freq: int  1 2 3
df <- data.frame(x = Freq_table$Distance, y = Freq_table$Freq)
knitr::kable(df)
|x       |  y|
|:-------|--:|
|0.885   |  1|
|1.19636 |  1|
|1.54885 |  1|
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) +
  geom_bar(stat="identity", color="blue", fill="green") +
  xlab("Distance range") +
  ylab("Frequency")
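
In the code above the class limits in `bins` are computed but the table is still built from the raw `Distance` values, so each row is a single observation rather than a class. In addition, `cut()` by default uses right-closed intervals and leaves the lowest break out, so the minimum value would not be assigned to any class. The following is one possible sketch of a truly grouped table, assuming Sturges' rule rounded up and equal-width classes spanning the observed range; this is a simplification of the report's class-count logic, and the names `k`, `breaks`, `classes` and `grouped` are illustrative.

```r
# The three Puerto Plata distances from the table above
distance <- c(0.88500, 1.19636, 1.54885)

# Sturges' rule: k = 1 + log2(n), rounded up (an assumption of this sketch)
k <- ceiling(1 + log2(length(distance)))

# k equal-width classes covering the observed range exactly
breaks <- seq(min(distance), max(distance), length.out = k + 1)

# include.lowest = TRUE keeps the minimum inside the first class
classes <- cut(distance, breaks = breaks, include.lowest = TRUE)

# Absolute, relative and cumulative frequency per class
grouped <- as.data.frame(table(classes), responseName = "Freq")
grouped$Rel_Freq <- grouped$Freq / sum(grouped$Freq)
grouped$Cum_Freq <- cumsum(grouped$Freq)
print(grouped)
```

With only three observations the grouping is not very informative, but the same steps apply unchanged to larger subsets of the catalog.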

People affected by landslides

summary(df_PP$Distance)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.885   1.041   1.196   1.210   1.373   1.549
library(pastecs)
stat.desc(df_PP)
## Warning in qt((0.5 + p/2), (Nbrval - 1)): NaNs produced
##                        id Date time Continent Country country_code State
## nbr.val      3.000000e+00   NA   NA        NA      NA           NA    NA
## nbr.null     0.000000e+00   NA   NA        NA      NA           NA    NA
## nbr.na       0.000000e+00   NA   NA        NA      NA           NA    NA
## min          1.177000e+03   NA   NA        NA      NA           NA    NA
## max          6.707000e+03   NA   NA        NA      NA           NA    NA
## range        5.530000e+03   NA   NA        NA      NA           NA    NA
## sum          1.253900e+04   NA   NA        NA      NA           NA    NA
## median       4.655000e+03   NA   NA        NA      NA           NA    NA
## mean         4.179667e+03   NA   NA        NA      NA           NA    NA
## SE.mean      1.613968e+03   NA   NA        NA      NA           NA    NA
## CI.mean.0.95 6.944345e+03   NA   NA        NA      NA           NA    NA
## var          7.814681e+06   NA   NA        NA      NA           NA    NA
## std.dev      2.795475e+03   NA   NA        NA      NA           NA    NA
## coef.var     6.688273e-01   NA   NA        NA      NA           NA    NA
##                population City  Distance location_description    latitude
## nbr.val      3.000000e+00   NA 3.0000000                   NA  3.00000000
## nbr.null     0.000000e+00   NA 0.0000000                   NA  0.00000000
## nbr.na       0.000000e+00   NA 0.0000000                   NA  0.00000000
## min          4.393000e+03   NA 0.8850000                   NA 19.67500000
## max          1.460000e+05   NA 1.5488500                   NA 19.90530000
## range        1.416070e+05   NA 0.6638500                   NA  0.23030000
## sum          1.549560e+05   NA 3.6302100                   NA 59.36300000
## median       4.563000e+03   NA 1.1963600                   NA 19.78270000
## mean         5.165200e+04   NA 1.2100700                   NA 19.78766667
## SE.mean      4.717403e+04   NA 0.1917596                   NA  0.06652825
## CI.mean.0.95 2.029734e+05   NA 0.8250748                   NA  0.28624795
## var          6.676166e+09   NA 0.1103152                   NA  0.01327802
## std.dev      8.170781e+04   NA 0.3321373                   NA  0.11523031
## coef.var     1.581891e+00   NA 0.2744777                   NA  0.00582334
##                  longitude geolocation hazard_type landslide_type
## nbr.val       3.000000e+00          NA          NA             NA
## nbr.null      0.000000e+00          NA          NA             NA
## nbr.na        0.000000e+00          NA          NA             NA
## min          -7.096300e+01          NA          NA             NA
## max          -7.068710e+01          NA          NA             NA
## range         2.759000e-01          NA          NA             NA
## sum          -2.124863e+02          NA          NA             NA
## median       -7.083620e+01          NA          NA             NA
## mean         -7.082877e+01          NA          NA             NA
## SE.mean       7.973214e-02          NA          NA             NA
## CI.mean.0.95  3.430597e-01          NA          NA             NA
## var           1.907164e-02          NA          NA             NA
## std.dev       1.381001e-01          NA          NA             NA
## coef.var     -1.949774e-03          NA          NA             NA
##              landslide_size trigger storm_name injuries fatalities source_name
## nbr.val                  NA      NA         NA        1   2.000000          NA
## nbr.null                 NA      NA         NA        1   1.000000          NA
## nbr.na                   NA      NA         NA        2   1.000000          NA
## min                      NA      NA         NA        0   0.000000          NA
## max                      NA      NA         NA        0   2.000000          NA
## range                    NA      NA         NA        0   2.000000          NA
## sum                      NA      NA         NA        0   2.000000          NA
## median                   NA      NA         NA        0   1.000000          NA
## mean                     NA      NA         NA        0   1.000000          NA
## SE.mean                  NA      NA         NA       NA   1.000000          NA
## CI.mean.0.95             NA      NA         NA      NaN  12.706205          NA
## var                      NA      NA         NA       NA   2.000000          NA
## std.dev                  NA      NA         NA       NA   1.414214          NA
## coef.var                 NA      NA         NA       NA   1.414214          NA
##              source_link
## nbr.val               NA
## nbr.null              NA
## nbr.na                NA
## min                   NA
## max                   NA
## range                 NA
## sum                   NA
## median                NA
## mean                  NA
## SE.mean               NA
## CI.mean.0.95          NA
## var                   NA
## std.dev               NA
## coef.var              NA
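
Most columns in the output above are `NA` because `stat.desc()` was applied to the whole data frame, including text columns. A minimal base-R sketch of restricting the summary to numeric columns follows; the data frame `df_pp` is a hand-copied subset of the rows shown above, and the selection of statistics is an assumption made for illustration.

```r
# Hand-copied subset of the Puerto Plata rows (values from the output above)
df_pp <- data.frame(
  City       = c("Altamira", "Puerto Plata", "Luperón"),
  population = c(4563, 146000, 4393),
  Distance   = c(0.88500, 1.19636, 1.54885),
  fatalities = c(2, NA, 0),
  stringsAsFactors = FALSE
)

# Keep only the numeric columns
numeric_cols <- df_pp[, sapply(df_pp, is.numeric), drop = FALSE]

# A few descriptive statistics per column, ignoring missing values
stats <- sapply(numeric_cols, function(x) c(
  n    = sum(!is.na(x)),
  min  = min(x, na.rm = TRUE),
  max  = max(x, na.rm = TRUE),
  mean = mean(x, na.rm = TRUE),
  sd   = sd(x, na.rm = TRUE)
))
print(round(stats, 4))

# If pastecs is installed, the same subset can be passed to
# pastecs::stat.desc(numeric_cols) to get the full table without NA columns.
```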

Box-and-whisker plot

boxplot(Distance, horizontal=TRUE, col='steelblue')

library(tidyverse)
library(hrbrthemes)
library(viridis)

df <- data.frame(Distance)
df %>% ggplot(aes(x = "", y = Distance)) +
  geom_boxplot(color="red", fill="orange", alpha=0.5) +
  theme_ipsum() +
  theme(legend.position="none", plot.title = element_text(size=11)) +
  ggtitle("Landslides") +
  coord_flip() +
  xlab("") +
  ylab("")
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font
## family not found in Windows font database


library(readr)
library(knitr)
df <- read.csv("https://raw.githubusercontent.com/lihkir/AnalisisEstadisticoUN/main/Data/catalog.csv")
library(dplyr)
colnames(df)[4] <- "Continent"
colnames(df)[5] <- "Country"
colnames(df)[7] <- "State"
colnames(df)[9] <- "City"
colnames(df)[10] <- "Distance"
colnames(df)[2] <- "Date"

Cuba

library(readr)
library(knitr)
df_CU <- subset (df, Country == "Cuba")
knitr::kable(head(df_CU))
| | id | Date | time | Continent | Country | country_code | State | population | City | Distance | location_description | latitude | longitude | geolocation | hazard_type | landslide_type | landslide_size | trigger | storm_name | injuries | fatalities | source_name | source_link |
|:--|--:|:--|:--|:--|:--|:--|:--|--:|:--|--:|:--|--:|--:|:--|:--|:--|:--|:--|:--|--:|--:|:--|:--|
| 483 | 2611 | 10/18/10 | | NA | Cuba | CU | Provincia de La Habana | 132351 | Cerro | 0.89865 | | 23.1098 | -82.3691 | (23.1098, -82.369100000000003) | Landslide | Complex | Medium | Tropical cyclone | Tropical Storm Paula | NA | 0 | | http://www.reliefweb.int/rw/RWFiles2010.nsf/FilesByRWDocUnidFilename/VDUX-8ADM53-full_report.pdf/$File/full_report.pdf |
| 515 | 2706 | 11/7/10 | | NA | Cuba | CU | Guantanamo | 48362 | Baracoa | 10.45795 | | 20.2526 | -74.4867 | (20.252600000000001, -74.486699999999999) | Landslide | Landslide | Medium | Tropical cyclone | Hurricane Tomas | NA | 0 | | http://www.solvision.co.cu/english/index.php?option=com_content&view=article&id=1631:viaduct-la-farola-in-baracoa-traffic-restored&catid=34:portada&Itemid=171 |
| 1031 | 5067 | 7/9/13 | | NA | Cuba | CU | Artemisa Province | 7205 | Soroa | 11.87914 | | 22.7943 | -83.1322 | (22.7943, -83.132199999999997) | Landslide | Landslide | Medium | Downpour | | NA | 0 | www.havanatimes.org | http://www.havanatimes.org/?p=96131 |
df_CU %>% 
  select(Country, State, City, Distance, Date)
##      Country                  State    City Distance     Date
## 483     Cuba Provincia de La Habana   Cerro  0.89865 10/18/10
## 515     Cuba             Guantanamo Baracoa 10.45795  11/7/10
## 1031    Cuba      Artemisa Province   Soroa 11.87914   7/9/13
head(df_CU)
##        id     Date time Continent Country country_code                  State
## 483  2611 10/18/10           <NA>    Cuba           CU Provincia de La Habana
## 515  2706  11/7/10           <NA>    Cuba           CU             Guantanamo
## 1031 5067   7/9/13           <NA>    Cuba           CU      Artemisa Province
##      population    City Distance location_description latitude longitude
## 483      132351   Cerro  0.89865                       23.1098  -82.3691
## 515       48362 Baracoa 10.45795                       20.2526  -74.4867
## 1031       7205   Soroa 11.87914                       22.7943  -83.1322
##                                    geolocation hazard_type landslide_type
## 483             (23.1098, -82.369100000000003)   Landslide        Complex
## 515  (20.252600000000001, -74.486699999999999)   Landslide      Landslide
## 1031            (22.7943, -83.132199999999997)   Landslide      Landslide
##      landslide_size          trigger           storm_name injuries fatalities
## 483          Medium Tropical cyclone Tropical Storm Paula       NA          0
## 515          Medium Tropical cyclone      Hurricane Tomas       NA          0
## 1031         Medium         Downpour                            NA          0
##              source_name
## 483                     
## 515                     
## 1031 www.havanatimes.org
##                                                                                                                                                         source_link
## 483                                          http://www.reliefweb.int/rw/RWFiles2010.nsf/FilesByRWDocUnidFilename/VDUX-8ADM53-full_report.pdf/$File/full_report.pdf
## 515  http://www.solvision.co.cu/english/index.php?option=com_content&view=article&id=1631:viaduct-la-farola-in-baracoa-traffic-restored&catid=34:portada&Itemid=171
## 1031                                                                                                                            http://www.havanatimes.org/?p=96131

Landslides by state

library(ggplot2)
ggplot(data=df_CU, aes(x = "Cuba", y = Distance, fill=State)) +
  geom_bar(stat = "identity", width = 1, color = "black") +
  coord_polar("y", start = 0)

ggplot(data=df_CU, aes(fill=State, y=Distance, x="Cuba")) +
  geom_bar(position="dodge", stat="identity")
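
Both charts above use `Distance` as the bar height, so they show distance per state rather than how many landslides each state recorded. If a count per state is also wanted, one possible base-R sketch is shown below; `df_cu` is a hand-copied stand-in for `df_CU` with only the two columns needed, and the output names `Events` and `summary_by_state` are illustrative.

```r
# Stand-in for df_CU with the three Cuban events listed above
df_cu <- data.frame(
  State    = c("Provincia de La Habana", "Guantanamo", "Artemisa Province"),
  Distance = c(0.89865, 10.45795, 11.87914),
  stringsAsFactors = FALSE
)

# Number of recorded events per state
events_by_state <- as.data.frame(table(State = df_cu$State),
                                 responseName = "Events")

# Mean distance per state
mean_by_state <- aggregate(Distance ~ State, data = df_cu, FUN = mean)

# One row per state with both measures
summary_by_state <- merge(events_by_state, mean_by_state, by = "State")
print(summary_by_state)
```

In this sample every state has exactly one event, which is why the later per-state sections contain a single row each.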

Artemisa Province

Landslides in the cities of Artemisa Province

library(readr)
library(knitr)
df_AP <- subset (df, State == "Artemisa Province")
df_AP %>% 
  select(Country, State, City, Distance, Date) 
##      Country             State  City Distance   Date
## 1031    Cuba Artemisa Province Soroa 11.87914 7/9/13
head(df_AP)
##        id   Date time Continent Country country_code             State
## 1031 5067 7/9/13           <NA>    Cuba           CU Artemisa Province
##      population  City Distance location_description latitude longitude
## 1031       7205 Soroa 11.87914                       22.7943  -83.1322
##                         geolocation hazard_type landslide_type landslide_size
## 1031 (22.7943, -83.132199999999997)   Landslide      Landslide         Medium
##       trigger storm_name injuries fatalities         source_name
## 1031 Downpour                  NA          0 www.havanatimes.org
##                              source_link
## 1031 http://www.havanatimes.org/?p=96131
ggplot(data=df_AP, aes(x=City, y=Distance)) + geom_bar(stat="identity", color="blue", fill="white")

Pie chart

ggplot(df_AP, aes(x = "Artemisa Province", y = Distance, fill = City)) +
  geom_bar(stat = "identity", color = "white") +
  geom_text(aes(label = Distance),
            position = position_stack(vjust = 0.5), color = "white", size = 6) +
  coord_polar(theta = "y") +
  labs(title = "Landslide chart")

Pareto chart

City with the largest landslide distance
library(qcc)

Distance <- df_AP$Distance
names(Distance) <- df_AP$City 

pareto.chart(Distance, 
             ylab="Distance",
             col = heat.colors(length(Distance)),
             cumperc = seq(0, 100, by = 10),
             ylab2 = "Cumulative percentage",
             main = "CITIES WITH THE LARGEST LANDSLIDE DISTANCES"
)

##        
## Pareto chart analysis for Distance
##         Frequency Cum.Freq. Percentage Cum.Percent.
##   Soroa  11.87914  11.87914  100.00000    100.00000

Stem-and-leaf plot

stem(df_AP$"Distance")
stem(df_AP$"Distance", scale = 2)

Time series

library(forecast)
data_serie<- ts(df_AP$Distance, frequency=12, start=2007)
head(data_serie)
##           Jan
## 2007 11.87914
autoplot(data_serie) +
  labs(title = "Landslide series", x = "Year", y = "Distance") +
  theme_bw()
## geom_path: Each group consists of only one observation. Do you need to adjust
## the group aesthetic?

Frequency tables

library(questionr)

table <- questionr::freq(Distance, cum = TRUE, sort = "dec", total = TRUE)
knitr::kable(table)
|         |  n|   %| val%| %cum| val%cum|
|:--------|--:|---:|----:|----:|-------:|
|11.87914 |  1| 100|  100|  100|     100|
|Total    |  1| 100|  100|  100|     100|
str(table) 
## Classes 'freqtab' and 'data.frame':  2 obs. of  5 variables:
##  $ n      : num  1 1
##  $ %      : num  100 100
##  $ val%   : num  100 100
##  $ %cum   : num  100 100
##  $ val%cum: num  100 100
x <- row.names(table)
y <- table$n
names <- x[1:(length(x)-1)]
freqs <- y[1:(length(y)-1)]
df <- data.frame(x = names, y = freqs)
knitr::kable(df)
|x        |  y|
|:--------|--:|
|11.87914 |  1|
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) + 
  geom_bar(stat="identity", color="white", fill="blue") +
  xlab("Distance") +
  ylab("Frequency")

Grouped frequency table

n_sturges = 1 + log(length(Distance))/log(2)
n_sturgesc = ceiling(n_sturges)
n_sturgesf = floor(n_sturges)

n_clases = 0
if (n_sturgesc%%2 == 0) {
  n_clases = n_sturgesf
} else {
  n_clases = n_sturgesc
}
R = max(Distance) - min(Distance)
w = ceiling(R/n_clases)
bins <- seq(min(Distance), max(Distance) + w, by = w)
bins
## [1] 11.87914
Edades <- cut(Distance, bins)
Freq_table <- transform(table(Distance), Rel_Freq=prop.table(Freq), Cum_Freq=cumsum(Freq))
knitr::kable(Freq_table)
|Distance | Freq| Rel_Freq| Cum_Freq|
|:--------|----:|--------:|--------:|
|11.87914 |    1|        1|        1|
str(Freq_table)
## 'data.frame':    1 obs. of  4 variables:
##  $ Distance: Factor w/ 1 level "11.87914": 1
##  $ Freq    : int 1
##  $ Rel_Freq: num 1
##  $ Cum_Freq: int 1
df <- data.frame(x = Freq_table$Distance, y = Freq_table$Freq)
knitr::kable(df)
|x        |  y|
|:--------|--:|
|11.87914 |  1|
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) +
  geom_bar(stat="identity", color="blue", fill="green") +
  xlab("Distance range") +
  ylab("Frequency")

People affected by landslides

summary(df_AP$Distance)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   11.88   11.88   11.88   11.88   11.88   11.88
library(pastecs)
stat.desc(df_AP)
## Warning in qt((0.5 + p/2), (Nbrval - 1)): NaNs produced

## Warning in min(x): no non-missing arguments to min; returning Inf
## Warning in max(x): no non-missing arguments to max; returning -Inf
##                id Date time Continent Country country_code State population
## nbr.val         1   NA   NA        NA      NA           NA    NA          1
## nbr.null        0   NA   NA        NA      NA           NA    NA          0
## nbr.na          0   NA   NA        NA      NA           NA    NA          0
## min          5067   NA   NA        NA      NA           NA    NA       7205
## max          5067   NA   NA        NA      NA           NA    NA       7205
## range           0   NA   NA        NA      NA           NA    NA          0
## sum          5067   NA   NA        NA      NA           NA    NA       7205
## median       5067   NA   NA        NA      NA           NA    NA       7205
## mean         5067   NA   NA        NA      NA           NA    NA       7205
## SE.mean        NA   NA   NA        NA      NA           NA    NA         NA
## CI.mean.0.95  NaN   NA   NA        NA      NA           NA    NA        NaN
## var            NA   NA   NA        NA      NA           NA    NA         NA
## std.dev        NA   NA   NA        NA      NA           NA    NA         NA
## coef.var       NA   NA   NA        NA      NA           NA    NA         NA
##              City Distance location_description latitude longitude geolocation
## nbr.val        NA  1.00000                   NA   1.0000    1.0000          NA
## nbr.null       NA  0.00000                   NA   0.0000    0.0000          NA
## nbr.na         NA  0.00000                   NA   0.0000    0.0000          NA
## min            NA 11.87914                   NA  22.7943  -83.1322          NA
## max            NA 11.87914                   NA  22.7943  -83.1322          NA
## range          NA  0.00000                   NA   0.0000    0.0000          NA
## sum            NA 11.87914                   NA  22.7943  -83.1322          NA
## median         NA 11.87914                   NA  22.7943  -83.1322          NA
## mean           NA 11.87914                   NA  22.7943  -83.1322          NA
## SE.mean        NA       NA                   NA       NA        NA          NA
## CI.mean.0.95   NA      NaN                   NA      NaN       NaN          NA
## var            NA       NA                   NA       NA        NA          NA
## std.dev        NA       NA                   NA       NA        NA          NA
## coef.var       NA       NA                   NA       NA        NA          NA
##              hazard_type landslide_type landslide_size trigger storm_name
## nbr.val               NA             NA             NA      NA         NA
## nbr.null              NA             NA             NA      NA         NA
## nbr.na                NA             NA             NA      NA         NA
## min                   NA             NA             NA      NA         NA
## max                   NA             NA             NA      NA         NA
## range                 NA             NA             NA      NA         NA
## sum                   NA             NA             NA      NA         NA
## median                NA             NA             NA      NA         NA
## mean                  NA             NA             NA      NA         NA
## SE.mean               NA             NA             NA      NA         NA
## CI.mean.0.95          NA             NA             NA      NA         NA
## var                   NA             NA             NA      NA         NA
## std.dev               NA             NA             NA      NA         NA
## coef.var              NA             NA             NA      NA         NA
##              injuries fatalities source_name source_link
## nbr.val             0          1          NA          NA
## nbr.null            0          1          NA          NA
## nbr.na              1          0          NA          NA
## min               Inf          0          NA          NA
## max              -Inf          0          NA          NA
## range            -Inf          0          NA          NA
## sum                 0          0          NA          NA
## median             NA          0          NA          NA
## mean              NaN          0          NA          NA
## SE.mean            NA         NA          NA          NA
## CI.mean.0.95      NaN        NaN          NA          NA
## var                NA         NA          NA          NA
## std.dev            NA         NA          NA          NA
## coef.var           NA         NA          NA          NA

Box-and-whisker plot

boxplot(Distance, horizontal=TRUE, col='steelblue')

library(tidyverse)
library(hrbrthemes)
library(viridis)

df <- data.frame(Distance)
df %>% ggplot(aes(x = "", y = Distance)) +
  geom_boxplot(color="red", fill="orange", alpha=0.5) +
  theme_ipsum() +
  theme(legend.position="none", plot.title = element_text(size=11)) +
  ggtitle("Landslides") +
  coord_flip() +
  xlab("") +
  ylab("")
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font
## family not found in Windows font database

library(readr)
library(knitr)
df <- read.csv("https://raw.githubusercontent.com/lihkir/AnalisisEstadisticoUN/main/Data/catalog.csv")
library(dplyr)
colnames(df)[4] <- "Continent"
colnames(df)[5] <- "Country"
colnames(df)[7] <- "State"
colnames(df)[9] <- "City"
colnames(df)[10] <- "Distance"
colnames(df)[2] <- "Date"

Provincia de La Habana

Landslides in the cities of Provincia de La Habana

library(readr)
library(knitr)
df_HA <- subset (df, State == "Provincia de La Habana")
df_HA %>% 
  select(Country, State, City, Distance, Date) 
##     Country                  State  City Distance     Date
## 483    Cuba Provincia de La Habana Cerro  0.89865 10/18/10
head(df_HA)
##       id     Date time Continent Country country_code                  State
## 483 2611 10/18/10           <NA>    Cuba           CU Provincia de La Habana
##     population  City Distance location_description latitude longitude
## 483     132351 Cerro  0.89865                       23.1098  -82.3691
##                        geolocation hazard_type landslide_type landslide_size
## 483 (23.1098, -82.369100000000003)   Landslide        Complex         Medium
##              trigger           storm_name injuries fatalities source_name
## 483 Tropical cyclone Tropical Storm Paula       NA          0            
##                                                                                                                source_link
## 483 http://www.reliefweb.int/rw/RWFiles2010.nsf/FilesByRWDocUnidFilename/VDUX-8ADM53-full_report.pdf/$File/full_report.pdf
ggplot(data=df_HA, aes(x=City, y=Distance)) + geom_bar(stat="identity", color="blue", fill="white")
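
The subset, select, and plot steps for Provincia de La Habana are repeated almost verbatim for each province in this report. As a minimal sketch, assuming a hypothetical helper named `state_events()` and a made-up three-row `catalog` that stands in for the downloaded data, the subsetting could be wrapped in one base R function:

```r
# Hypothetical miniature of the catalog, using the column names this report assigns
catalog <- data.frame(
  Country = c("Cuba", "Cuba", "Costa Rica"),
  State = c("Provincia de La Habana", "Guantanamo", "Heredia"),
  City = c("Cerro", "Baracoa", "Heredia"),
  Distance = c(0.89865, 10.45795, 0.26208),
  Date = c("10/18/10", "11/7/10", "9/9/07"),
  stringsAsFactors = FALSE
)

# Return the five display columns for every event recorded in one state
state_events <- function(data, state) {
  rows <- data$State == state
  data[rows, c("Country", "State", "City", "Distance", "Date"), drop = FALSE]
}

habana <- state_events(catalog, "Provincia de La Habana")
habana
```

Calling the helper once per province would avoid reloading the CSV and renaming the columns before each section, although that restructuring is a suggestion rather than something the report does.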

Pie chart

ggplot(df_HA, aes(x = "Provincia de La Habana", y = Distance, fill = City)) +
  geom_bar(stat = "identity", color = "white") +
  geom_text(aes(label = Distance),
            position = position_stack(vjust = 0.5), color = "white", size = 4) +
  coord_polar(theta = "y") +
  labs(title = "Landslide chart")

Pareto chart

City with the largest landslide
library(qcc)
Distance <- df_HA$Distance
names(Distance) <- df_HA$City 

pareto.chart(Distance, 
             ylab="Distance",
             col = heat.colors(length(Distance)),
             cumperc = seq(0, 100, by = 10),
             ylab2 = "Cumulative percentage",
             main = "CITIES WITH THE LARGEST LANDSLIDES"
)

##        
## Pareto chart analysis for Distance
##         Frequency Cum.Freq. Percentage Cum.Percent.
##   Cerro   0.89865   0.89865  100.00000    100.00000

Stem-and-leaf plot

stem(df_HA$Distance)
stem(df_HA$Distance, scale = 2)

Time series

library(forecast)
data_serie<- ts(df_HA$Distance, frequency=12, start=2007)
head(data_serie)
##          Jan
## 2007 0.89865
autoplot(data_serie)+
labs(title = "Landslide series", x="Years", y = "Distance", colour = "#00a0dc") +theme_bw()
## geom_path: Each group consists of only one observation. Do you need to adjust
## the group aesthetic?
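
`ts()` assumes equally spaced observations, and here it is told that the series starts in January 2007 even though the only event in this subset is dated 10/18/10. The sketch below uses a hypothetical three-row `events` data frame to show one way of parsing the catalog's month/day/year strings and deriving the start from the data. It is an illustration under those assumptions, not part of the original analysis.

```r
# Hypothetical sample of the catalog's Date and Distance columns
events <- data.frame(
  Date = c("10/18/10", "11/7/10", "11/4/10"),
  Distance = c(0.89865, 10.45795, 3.67691),
  stringsAsFactors = FALSE
)

# Dates in the catalog are month/day/two-digit-year strings
events$Date <- as.Date(events$Date, format = "%m/%d/%y")
events <- events[order(events$Date), ]

# Derive the start of the series from the earliest event instead of hard-coding 2007
first_year <- as.integer(format(events$Date[1], "%Y"))
first_month <- as.integer(format(events$Date[1], "%m"))

# Average the distances per calendar month to obtain evenly spaced values
month_key <- format(events$Date, "%Y-%m")
monthly_mean <- tapply(events$Distance, month_key, mean)

# The toy sample covers two consecutive months, so a monthly ts() is regular here
series <- ts(as.numeric(monthly_mean), start = c(first_year, first_month), frequency = 12)
series
```

With real data, months without events would have to be filled in (for example with `NA`) before calling `ts()`, since the function does not know about gaps.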

Frequency tables

library(questionr)

table <- questionr::freq(Distance, cum = TRUE, sort = "dec", total = TRUE)
knitr::kable(table)
|         |  n |   % | val% | %cum | val%cum |
|:--------|---:|----:|-----:|-----:|--------:|
| 0.89865 |  1 | 100 |  100 |  100 |     100 |
| Total   |  1 | 100 |  100 |  100 |     100 |
str(table) 
## Classes 'freqtab' and 'data.frame':  2 obs. of  5 variables:
##  $ n      : num  1 1
##  $ %      : num  100 100
##  $ val%   : num  100 100
##  $ %cum   : num  100 100
##  $ val%cum: num  100 100
x <- row.names(table)
y <- table$n
names <- x[1:(length(x)-1)]
freqs <- y[1:(length(y)-1)]
df <- data.frame(x = names, y = freqs)
knitr::kable(df)
| x       | y |
|:--------|--:|
| 0.89865 | 1 |
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) + 
  geom_bar(stat="identity", color="white", fill="blue") +
  xlab("Distance") +
  ylab("Frequency")

Grouped frequency table

n_sturges = 1 + log(length(Distance))/log(2)
n_sturgesc = ceiling(n_sturges)
n_sturgesf = floor(n_sturges)

n_clases = 0
if (n_sturgesc%%2 == 0) {
  n_clases = n_sturgesf
} else {
  n_clases = n_sturgesc
}
R = max(Distance) - min(Distance)
w = ceiling(R/n_clases)
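# Build the frequency table from this province's distances
Freq_table <- transform(table(Distance), Rel_Freq=prop.table(Freq), Cum_Freq=cumsum(Freq))
str(Freq_table)
## 'data.frame':    1 obs. of  4 variables:
##  $ Distance: Factor w/ 1 level "0.89865": 1
##  $ Freq    : int 1
##  $ Rel_Freq: num 1
##  $ Cum_Freq: int 1
df <- data.frame(x = Freq_table$Distance, y = Freq_table$Freq)
knitr::kable(df)
| x       | y |
|:--------|--:|
| 0.89865 | 1 |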
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) +
  geom_bar(stat="identity", color="blue", fill="green") +
  xlab("Distance range") +
  ylab("Frequency")

People affected by landslides

summary(df_HA$Distance)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.8986  0.8986  0.8986  0.8986  0.8986  0.8986
library(pastecs)
stat.desc(df_HA)
## Warning in qt((0.5 + p/2), (Nbrval - 1)): NaNs produced
## Warning in min(x): no non-missing arguments to min; returning Inf
## Warning in max(x): no non-missing arguments to max; returning -Inf
##                id Date time Continent Country country_code State population
## nbr.val         1   NA   NA        NA      NA           NA    NA          1
## nbr.null        0   NA   NA        NA      NA           NA    NA          0
## nbr.na          0   NA   NA        NA      NA           NA    NA          0
## min          2611   NA   NA        NA      NA           NA    NA     132351
## max          2611   NA   NA        NA      NA           NA    NA     132351
## range           0   NA   NA        NA      NA           NA    NA          0
## sum          2611   NA   NA        NA      NA           NA    NA     132351
## median       2611   NA   NA        NA      NA           NA    NA     132351
## mean         2611   NA   NA        NA      NA           NA    NA     132351
## SE.mean        NA   NA   NA        NA      NA           NA    NA         NA
## CI.mean.0.95  NaN   NA   NA        NA      NA           NA    NA        NaN
## var            NA   NA   NA        NA      NA           NA    NA         NA
## std.dev        NA   NA   NA        NA      NA           NA    NA         NA
## coef.var       NA   NA   NA        NA      NA           NA    NA         NA
##              City Distance location_description latitude longitude geolocation
## nbr.val        NA  1.00000                   NA   1.0000    1.0000          NA
## nbr.null       NA  0.00000                   NA   0.0000    0.0000          NA
## nbr.na         NA  0.00000                   NA   0.0000    0.0000          NA
## min            NA  0.89865                   NA  23.1098  -82.3691          NA
## max            NA  0.89865                   NA  23.1098  -82.3691          NA
## range          NA  0.00000                   NA   0.0000    0.0000          NA
## sum            NA  0.89865                   NA  23.1098  -82.3691          NA
## median         NA  0.89865                   NA  23.1098  -82.3691          NA
## mean           NA  0.89865                   NA  23.1098  -82.3691          NA
## SE.mean        NA       NA                   NA       NA        NA          NA
## CI.mean.0.95   NA      NaN                   NA      NaN       NaN          NA
## var            NA       NA                   NA       NA        NA          NA
## std.dev        NA       NA                   NA       NA        NA          NA
## coef.var       NA       NA                   NA       NA        NA          NA
##              hazard_type landslide_type landslide_size trigger storm_name
## nbr.val               NA             NA             NA      NA         NA
## nbr.null              NA             NA             NA      NA         NA
## nbr.na                NA             NA             NA      NA         NA
## min                   NA             NA             NA      NA         NA
## max                   NA             NA             NA      NA         NA
## range                 NA             NA             NA      NA         NA
## sum                   NA             NA             NA      NA         NA
## median                NA             NA             NA      NA         NA
## mean                  NA             NA             NA      NA         NA
## SE.mean               NA             NA             NA      NA         NA
## CI.mean.0.95          NA             NA             NA      NA         NA
## var                   NA             NA             NA      NA         NA
## std.dev               NA             NA             NA      NA         NA
## coef.var              NA             NA             NA      NA         NA
##              injuries fatalities source_name source_link
## nbr.val             0          1          NA          NA
## nbr.null            0          1          NA          NA
## nbr.na              1          0          NA          NA
## min               Inf          0          NA          NA
## max              -Inf          0          NA          NA
## range            -Inf          0          NA          NA
## sum                 0          0          NA          NA
## median             NA          0          NA          NA
## mean              NaN          0          NA          NA
## SE.mean            NA         NA          NA          NA
## CI.mean.0.95      NaN        NaN          NA          NA
## var                NA         NA          NA          NA
## std.dev            NA         NA          NA          NA
## coef.var           NA         NA          NA          NA
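
Most of the `NA`, `NaN`, and `Inf` entries above arise because `stat.desc()` receives text columns, a column that is entirely missing (`injuries`), and only one row, so no spread can be estimated. The following base R sketch, built on a hypothetical one-row data frame named `one_row`, shows one way to check which columns are worth summarising before calling a descriptive-statistics function. The column selection rule is an assumption for illustration.

```r
# Hypothetical one-row data frame standing in for a single-event subset
one_row <- data.frame(
  City = "Cerro",
  Distance = 0.89865,
  injuries = NA_real_,
  fatalities = 0,
  stringsAsFactors = FALSE
)

# Keep only numeric columns that contain at least one observed value
is_usable <- vapply(
  one_row,
  function(col) is.numeric(col) && any(!is.na(col)),
  logical(1)
)
numeric_part <- one_row[, is_usable, drop = FALSE]

# Variance, standard error, and confidence intervals need at least two rows
can_estimate_spread <- nrow(numeric_part) >= 2

names(numeric_part)
can_estimate_spread
```

Passing only `numeric_part` to the summary function would remove the all-`NA` columns from the output, and the `can_estimate_spread` flag makes explicit why the spread rows stay empty for a single event.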

Box-and-whisker plot

boxplot(Distance, horizontal=TRUE, col='steelblue')

library(tidyverse)
library(hrbrthemes)
library(viridis)

df <- data.frame(Distance)
df %>% ggplot(aes(x = "", y = Distance)) +
  geom_boxplot(color="red", fill="orange", alpha=0.5) +
  theme_ipsum() +
  theme(legend.position="none", plot.title = element_text(size=11)) +
  ggtitle("Landslides") +
  coord_flip() +
  xlab("") +
  ylab("")
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font
## family not found in Windows font database

library(readr)
library(knitr)
df <- read.csv("https://raw.githubusercontent.com/lihkir/AnalisisEstadisticoUN/main/Data/catalog.csv")
library(dplyr)
colnames(df)[4] <- "Continent"
colnames(df)[10] <- "Distance"
colnames(df)[5] <- "Country"
colnames(df)[7] <- "State"
colnames(df)[9] <- "City"
colnames(df)[2] <- "Date"

Guantanamo

Landslides in the cities of Guantanamo

library(readr)
library(knitr)
df_Gu <- subset (df, State == "Guantanamo")
df_Gu %>% 
  select(Country, State, City, Distance, Date) 
##     Country      State    City Distance    Date
## 515    Cuba Guantanamo Baracoa 10.45795 11/7/10
head(df_Gu)
##       id    Date time Continent Country country_code      State population
## 515 2706 11/7/10           <NA>    Cuba           CU Guantanamo      48362
##        City Distance location_description latitude longitude
## 515 Baracoa 10.45795                       20.2526  -74.4867
##                                   geolocation hazard_type landslide_type
## 515 (20.252600000000001, -74.486699999999999)   Landslide      Landslide
##     landslide_size          trigger      storm_name injuries fatalities
## 515         Medium Tropical cyclone Hurricane Tomas       NA          0
##     source_name
## 515            
##                                                                                                                                                        source_link
## 515 http://www.solvision.co.cu/english/index.php?option=com_content&view=article&id=1631:viaduct-la-farola-in-baracoa-traffic-restored&catid=34:portada&Itemid=171
ggplot(data=df_Gu, aes(x=City, y=Distance)) + geom_bar(stat="identity", color="blue", fill="white")

Pie chart

ggplot(df_Gu, aes(x = "Guantanamo", y = Distance, fill = City)) +
  geom_bar(stat = "identity", color = "white") +
  geom_text(aes(label = Distance),
            position = position_stack(vjust = 0.5), color = "white", size = 4) +
  coord_polar(theta = "y") +
  labs(title = "Landslide chart")

Pareto chart

City with the largest landslide
library(qcc)
Distance <- df_Gu$Distance
names(Distance) <- df_Gu$City 

pareto.chart(Distance, 
             ylab="Distance",
             col = heat.colors(length(Distance)),
             cumperc = seq(0, 100, by = 10),
             ylab2 = "Cumulative percentage",
             main = "CITIES WITH THE LARGEST LANDSLIDES"
)

##          
## Pareto chart analysis for Distance
##           Frequency Cum.Freq. Percentage Cum.Percent.
##   Baracoa  10.45795  10.45795  100.00000    100.00000
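
With a single city the Pareto table is trivially 100%. To make the four columns that `pareto.chart()` prints easier to read, here is a minimal base R sketch that recomputes them by hand for a hypothetical named vector `distance_by_city` (the four values are illustrative). It mirrors the column names of the printed table and is not a replacement for `qcc`.

```r
# Hypothetical distances (km) keyed by city
distance_by_city <- c(Heredia = 0.26208, Atenas = 3.08459,
                      Miramar = 3.82425, Bagaces = 17.65521)

# A Pareto table orders the categories from largest to smallest
ordered <- sort(distance_by_city, decreasing = TRUE)

pareto <- data.frame(
  Frequency = as.numeric(ordered),
  Cum.Freq = cumsum(as.numeric(ordered)),
  Percentage = 100 * as.numeric(ordered) / sum(ordered),
  Cum.Percent = 100 * cumsum(as.numeric(ordered)) / sum(ordered),
  row.names = names(ordered)
)
pareto
```

The first row is the largest contributor, and the last value of `Cum.Percent` always reaches 100.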

Stem-and-leaf plot

stem(df_Gu$Distance)
stem(df_Gu$Distance, scale = 2)

Time series

library(forecast)
data_serie<- ts(df_Gu$Distance, frequency=12, start=2007)
head(data_serie)
##           Jan
## 2007 10.45795
autoplot(data_serie)+
labs(title = "Landslide series", x="Years", y = "Distance", colour = "#00a0dc") +theme_bw()
## geom_path: Each group consists of only one observation. Do you need to adjust
## the group aesthetic?

Frequency tables

library(questionr)

table <- questionr::freq(Distance, cum = TRUE, sort = "dec", total = TRUE)
knitr::kable(table)
|          |  n |   % | val% | %cum | val%cum |
|:---------|---:|----:|-----:|-----:|--------:|
| 10.45795 |  1 | 100 |  100 |  100 |     100 |
| Total    |  1 | 100 |  100 |  100 |     100 |
str(table) 
## Classes 'freqtab' and 'data.frame':  2 obs. of  5 variables:
##  $ n      : num  1 1
##  $ %      : num  100 100
##  $ val%   : num  100 100
##  $ %cum   : num  100 100
##  $ val%cum: num  100 100
x <- row.names(table)
y <- table$n
names <- x[1:(length(x)-1)]
freqs <- y[1:(length(y)-1)]
df <- data.frame(x = names, y = freqs)
knitr::kable(df)
| x        | y |
|:---------|--:|
| 10.45795 | 1 |
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) + 
  geom_bar(stat="identity", color="white", fill="blue") +
  xlab("Distance") +
  ylab("Frequency")

Grouped frequency table

n_sturges = 1 + log(length(Distance))/log(2)
n_sturgesc = ceiling(n_sturges)
n_sturgesf = floor(n_sturges)

n_clases = 0
if (n_sturgesc%%2 == 0) {
  n_clases = n_sturgesf
} else {
  n_clases = n_sturgesc
}
R = max(Distance) - min(Distance)
w = ceiling(R/n_clases)
bins <- seq(min(Distance), max(Distance) + w, by = w)
bins
## [1] 10.45795
Edades <- cut(Distance, bins)
Freq_table <- transform(table(Distance), Rel_Freq=prop.table(Freq), Cum_Freq=cumsum(Freq))
knitr::kable(Freq_table)
| Distance | Freq | Rel_Freq | Cum_Freq |
|:---------|-----:|---------:|---------:|
| 10.45795 |    1 |        1 |        1 |
str(Freq_table)
## 'data.frame':    1 obs. of  4 variables:
##  $ Distance: Factor w/ 1 level "10.45795": 1
##  $ Freq    : int 1
##  $ Rel_Freq: num 1
##  $ Cum_Freq: int 1
df <- data.frame(x = Freq_table$Distance, y = Freq_table$Freq)
knitr::kable(df)
| x        | y |
|:---------|--:|
| 10.45795 | 1 |
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) +
  geom_bar(stat="identity", color="blue", fill="green") +
  xlab("Distance range") +
  ylab("Frequency")
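
The class-count code in this section follows Sturges' rule, k = 1 + log2(n). With one observation the range is 0, the class width `w` becomes 0, and `bins` collapses to a single value, so no real grouping takes place. The sketch below shows the rule together with a guard for that degenerate case. The helper names `sturges_classes()` and `class_breaks()` and the ten sample distances are hypothetical, and the sketch rounds up instead of applying the report's odd-number adjustment.

```r
# Number of classes by Sturges' rule, k = 1 + log2(n), rounded up
sturges_classes <- function(x) {
  ceiling(1 + log2(length(x)))
}

# Equal-width class breaks; NULL when all values coincide, so the caller can skip binning
class_breaks <- function(x) {
  value_range <- max(x) - min(x)
  if (value_range == 0) {
    return(NULL)
  }
  k <- sturges_classes(x)
  seq(min(x), max(x), length.out = k + 1)
}

# Hypothetical sample of ten distances (km)
distances <- c(0.26208, 4.57763, 3.08459, 9.56251, 3.82425,
               17.65521, 1.85787, 16.24937, 12.85801, 11.74074)

breaks <- class_breaks(distances)
grouped <- table(cut(distances, breaks, include.lowest = TRUE))
grouped

# A single observation yields no breaks at all
class_breaks(10.45795)
```

Checking for `NULL` before calling `cut()` lets a per-province report print a short note such as "only one event" instead of a one-row table.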

People affected by landslides

summary(df_Gu$Distance)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   10.46   10.46   10.46   10.46   10.46   10.46
library(pastecs)
stat.desc(df_Gu)
## Warning in qt((0.5 + p/2), (Nbrval - 1)): NaNs produced
## Warning in min(x): no non-missing arguments to min; returning Inf
## Warning in max(x): no non-missing arguments to max; returning -Inf
##                id Date time Continent Country country_code State population
## nbr.val         1   NA   NA        NA      NA           NA    NA          1
## nbr.null        0   NA   NA        NA      NA           NA    NA          0
## nbr.na          0   NA   NA        NA      NA           NA    NA          0
## min          2706   NA   NA        NA      NA           NA    NA      48362
## max          2706   NA   NA        NA      NA           NA    NA      48362
## range           0   NA   NA        NA      NA           NA    NA          0
## sum          2706   NA   NA        NA      NA           NA    NA      48362
## median       2706   NA   NA        NA      NA           NA    NA      48362
## mean         2706   NA   NA        NA      NA           NA    NA      48362
## SE.mean        NA   NA   NA        NA      NA           NA    NA         NA
## CI.mean.0.95  NaN   NA   NA        NA      NA           NA    NA        NaN
## var            NA   NA   NA        NA      NA           NA    NA         NA
## std.dev        NA   NA   NA        NA      NA           NA    NA         NA
## coef.var       NA   NA   NA        NA      NA           NA    NA         NA
##              City Distance location_description latitude longitude geolocation
## nbr.val        NA  1.00000                   NA   1.0000    1.0000          NA
## nbr.null       NA  0.00000                   NA   0.0000    0.0000          NA
## nbr.na         NA  0.00000                   NA   0.0000    0.0000          NA
## min            NA 10.45795                   NA  20.2526  -74.4867          NA
## max            NA 10.45795                   NA  20.2526  -74.4867          NA
## range          NA  0.00000                   NA   0.0000    0.0000          NA
## sum            NA 10.45795                   NA  20.2526  -74.4867          NA
## median         NA 10.45795                   NA  20.2526  -74.4867          NA
## mean           NA 10.45795                   NA  20.2526  -74.4867          NA
## SE.mean        NA       NA                   NA       NA        NA          NA
## CI.mean.0.95   NA      NaN                   NA      NaN       NaN          NA
## var            NA       NA                   NA       NA        NA          NA
## std.dev        NA       NA                   NA       NA        NA          NA
## coef.var       NA       NA                   NA       NA        NA          NA
##              hazard_type landslide_type landslide_size trigger storm_name
## nbr.val               NA             NA             NA      NA         NA
## nbr.null              NA             NA             NA      NA         NA
## nbr.na                NA             NA             NA      NA         NA
## min                   NA             NA             NA      NA         NA
## max                   NA             NA             NA      NA         NA
## range                 NA             NA             NA      NA         NA
## sum                   NA             NA             NA      NA         NA
## median                NA             NA             NA      NA         NA
## mean                  NA             NA             NA      NA         NA
## SE.mean               NA             NA             NA      NA         NA
## CI.mean.0.95          NA             NA             NA      NA         NA
## var                   NA             NA             NA      NA         NA
## std.dev               NA             NA             NA      NA         NA
## coef.var              NA             NA             NA      NA         NA
##              injuries fatalities source_name source_link
## nbr.val             0          1          NA          NA
## nbr.null            0          1          NA          NA
## nbr.na              1          0          NA          NA
## min               Inf          0          NA          NA
## max              -Inf          0          NA          NA
## range            -Inf          0          NA          NA
## sum                 0          0          NA          NA
## median             NA          0          NA          NA
## mean              NaN          0          NA          NA
## SE.mean            NA         NA          NA          NA
## CI.mean.0.95      NaN        NaN          NA          NA
## var                NA         NA          NA          NA
## std.dev            NA         NA          NA          NA
## coef.var           NA         NA          NA          NA

Box-and-whisker plot

boxplot(Distance, horizontal=TRUE, col='steelblue')

library(tidyverse)
library(hrbrthemes)
library(viridis)

df <- data.frame(Distance)
df %>% ggplot(aes(x = "", y = Distance)) +
  geom_boxplot(color="red", fill="orange", alpha=0.5) +
  theme_ipsum() +
  theme(legend.position="none", plot.title = element_text(size=11)) +
  ggtitle("Landslides") +
  coord_flip() +
  xlab("") +
  ylab("")
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font
## family not found in Windows font database

library(readr)
library(knitr)
df <- read.csv("https://raw.githubusercontent.com/lihkir/AnalisisEstadisticoUN/main/Data/catalog.csv")
library(dplyr)
colnames(df)[4] <- "Continent"
colnames(df)[10] <- "Distance"
colnames(df)[5] <- "Country"
colnames(df)[7] <- "State"
colnames(df)[9] <- "City"
colnames(df)[2] <- "Date"

Costa Rica

library(readr)
library(knitr)
df_CR <- subset (df, Country == "Costa Rica")
knitr::kable(head(df_CR, n=4))
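|    | id  | Date     | time | Continent | Country    | country_code | State    | population | City        | Distance | location_description | latitude | longitude | geolocation                               | hazard_type | landslide_type | landslide_size | trigger | storm_name | injuries | fatalities | source_name                          | source_link |
|:---|----:|:---------|:-----|:----------|:-----------|:-------------|:---------|-----------:|:------------|---------:|:---------------------|---------:|----------:|:------------------------------------------|:------------|:---------------|:---------------|:--------|:-----------|---------:|-----------:|:-------------------------------------|:------------|
| 38 | 249 | 9/9/07   |      | NA        | Costa Rica | CR           | Heredia  | 21947      | Heredia     | 0.26208  |                      | 10.0000  | -84.1167  | (10, -84.116699999999994)                 | Landslide   | Landslide      | Medium         | Rain    |            | NA       | NA         | ticotimes.net                        | http://www.ticotimes.net/dailyarchive/2007_09/0911072.htm |
| 44 | 299 | 10/9/07  |      | NA        | Costa Rica | CR           | San José | 3072       | San Ignacio | 4.57763  |                      | 9.7789   | -84.1250  | (9.7789000000000001, -84.125)             | Landslide   | Complex        | Medium         | Rain    |            | NA       | 4          | ticotimes.net                        | http://www.ticotimes.net/dailyarchive/2007_10/1010071.htm |
| 45 | 301 | 10/11/07 |      | NA        | Costa Rica | CR           | Alajuela | 7014       | Atenas      | 3.08459  |                      | 9.9869   | -84.4070  | (9.9869000000000003, -84.406999999999996) | Landslide   | Mudslide       | Large          | Rain    |            | NA       | 14         | Agence France-Presse, afp.google.com | http://afp.google.com/article/ALeqM5hu6a8oyAM1ycq9nU_6Zyj_l7F0AA |
| 46 | 302 | 10/11/07 |      | NA        | Costa Rica | CR           | San José | 26669      |             | 9.56251  |                      | 10.0214  | -83.9451  | (10.0214, -83.945099999999996)            | Landslide   | Landslide      | Large          | Rain    |            | NA       | 10         | International Herald                 | http://www.iht.com/articles/ap/2007/10/12/america/LA-GEN-Costa-Rica-Mudslide.php |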
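
Accented place names such as San José, Limón, or Tilarán survive loading only if the CSV is decoded as UTF-8. When UTF-8 bytes are decoded as Latin-1 instead, "é" turns into the two characters "Ã©". The sketch below shows one way to repair such a string after the fact in base R. The variable names are hypothetical, and the load-time alternative mentioned in the final comment is an assumption that has not been checked against this file.

```r
# A place name as it appears when UTF-8 bytes were decoded as Latin-1
garbled <- "San Jos\u00c3\u00a9"

# Re-encode to Latin-1 to recover the original bytes, then declare them as UTF-8
repaired <- iconv(garbled, from = "UTF-8", to = "latin1")
Encoding(repaired) <- "UTF-8"

repaired == "San Jos\u00e9"

# Possible load-time alternative (assumption): read.csv(catalog_url, encoding = "UTF-8"),
# where catalog_url is a placeholder for the CSV address used in this report
```

Applying the same two steps to the `State` and `City` columns would clean every affected name at once, since `iconv()` is vectorised over character vectors.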
df_CR %>% 
  select(Country, State, City, Distance, Date)
##         Country      State                  City Distance     Date
## 38   Costa Rica    Heredia               Heredia  0.26208   9/9/07
## 44   Costa Rica  San José           San Ignacio  4.57763  10/9/07
## 45   Costa Rica   Alajuela                Atenas  3.08459 10/11/07
## 46   Costa Rica  San José                        9.56251 10/11/07
## 51   Costa Rica Puntarenas               Miramar  3.82425 10/24/07
## 102  Costa Rica Guanacaste               Bagaces 17.65521  5/29/08
## 147  Costa Rica  San José         Daniel Flores  1.85787   9/6/08
## 153  Costa Rica  San José            San Isidro 16.24937 10/12/08
## 154  Costa Rica  San José              Santiago 12.85801 10/12/08
## 156  Costa Rica Puntarenas               Golfito 11.74074 10/15/08
## 157  Costa Rica Puntarenas               Miramar  8.92048 10/16/08
## 229  Costa Rica Puntarenas              San Vito 18.00524 11/13/09
## 302  Costa Rica   Alajuela          Desamparados  6.88715  4/14/10
## 311  Costa Rica    Heredia              Ã\201ngeles 19.51432  4/27/10
## 347  Costa Rica   Alajuela          Desamparados  6.92174  5/22/10
## 395  Costa Rica   Alajuela          Desamparados  4.24199  7/30/10
## 459  Costa Rica   Alajuela            San Rafael  1.47396  9/29/10
## 469  Costa Rica  San José              Salitral  0.25254  10/1/10
## 470  Costa Rica  San José              Salitral  0.25254  10/1/10
## 480  Costa Rica    Heredia              Ã\201ngeles 14.81614 10/15/10
## 501  Costa Rica  San José               Escazú  3.67691  11/4/10
## 502  Costa Rica  San José            San Marcos  0.55804  11/4/10
## 503  Costa Rica   Alajuela            San Rafael  9.61692  11/4/10
## 504  Costa Rica Guanacaste              Tilarán 10.21631  11/4/10
## 505  Costa Rica    Cartago                Orosí 19.28722  11/4/10
## 506  Costa Rica Puntarenas               Golfito  7.87044  11/4/10
## 507  Costa Rica  San José                 Tejar  6.49523  11/4/10
## 508  Costa Rica  San José            San Isidro 15.64997  11/4/10
## 509  Costa Rica Puntarenas              Corredor  4.93053  11/4/10
## 510  Costa Rica Puntarenas               Parrita 13.48919  11/4/10
## 511  Costa Rica Puntarenas        Ciudad Cortés 20.06633  11/4/10
## 512  Costa Rica  San José            San Isidro 11.31047  11/4/10
## 513  Costa Rica  San José              Mercedes  8.21372  11/4/10
## 514  Costa Rica   Alajuela              Santiago  5.43516  11/5/10
## 529  Costa Rica    Heredia               Ángeles 19.54581 11/21/10
## 579  Costa Rica     Limón             Guápiles 17.23264  1/11/11
## 702  Costa Rica    Heredia               Ángeles 15.05161   5/8/11
## 780  Costa Rica   Alajuela                 Upala  0.70048  7/12/11
## 819  Costa Rica  San José            San Isidro 21.67452  9/25/11
## 828  Costa Rica    Cartago                   Cot  9.63616 10/31/11
## 884  Costa Rica    Heredia         Santo Domingo 21.95470  5/13/12
## 888  Costa Rica Guanacaste              Tilarán 12.33807  5/31/12
## 889  Costa Rica     Limón             Siquirres  5.36500  6/14/12
## 913  Costa Rica  San José         Daniel Flores  4.89954 10/23/12
## 1098 Costa Rica   Alajuela             Sabanilla  4.87432  8/27/13
## 1156 Costa Rica   Alajuela             Sabanilla 10.32968  9/16/13
## 1157 Costa Rica    Heredia         Santo Domingo  9.85736  9/16/13
## 1169 Costa Rica Guanacaste              Tilarán 12.21952  10/3/13
## 1173 Costa Rica Guanacaste              Tilarán 12.18115  10/8/13
## 1289 Costa Rica   Alajuela            La Fortuna  9.84213  10/4/14
## 1301 Costa Rica   Alajuela                        5.57523  9/19/14
## 1308 Costa Rica   Alajuela          Desamparados  5.95519  11/1/14
## 1342 Costa Rica   Alajuela           Rio Segundo 11.96524  8/21/14
## 1364 Costa Rica   Alajuela          Desamparados  5.12667  8/10/14
## 1383 Costa Rica    Cartago               Cartago  3.07297  9/13/14
## 1384 Costa Rica    Heredia Dulce Nombre de Jesus 10.01310 12/13/14
## 1385 Costa Rica  San José Dulce Nombre de Jesus  2.92605  11/3/14
## 1386 Costa Rica  San José            San Isidro 10.73752  9/19/14
## 1404 Costa Rica  San José            San Isidro 22.32368  1/28/15
## 1406 Costa Rica  San José Dulce Nombre de Jesus  8.39161   2/6/15
## 1461 Costa Rica   Alajuela            La Fortuna  5.96634  6/17/15
## 1475 Costa Rica   Alajuela                Atenas  6.80061   6/3/15
## 1528 Costa Rica  San José               Ángeles  9.53611   7/6/15
## 1529 Costa Rica  San José Dulce Nombre de Jesus  3.71407   7/6/15
## 1600 Costa Rica  San José              San Juan  0.72957 10/29/15
## 1642 Costa Rica   Alajuela         Santo Domingo  3.21979 10/27/15
## 1643 Costa Rica   Alajuela              Alajuela  3.08916 11/18/15
## 1644 Costa Rica   Alajuela               Naranjo  2.08469 10/29/15
## 1646 Costa Rica    Cartago                        5.15142 10/15/15
## 1647 Costa Rica    Cartago                   Cot  9.53493  3/20/15
## 1648 Costa Rica    Cartago               Cartago  2.94804  3/18/15
## 1649 Costa Rica Puntarenas          Buenos Aires  0.35225 11/23/15
## 1650 Costa Rica  San José             San José  1.16705  9/25/15
## 1651 Costa Rica  San José              Mercedes 10.01198  11/5/15
## 1652 Costa Rica  San José              Santiago  8.27042 11/11/15
head(df_CR)
##      id     Date time Continent    Country country_code      State population
## 38  249   9/9/07           <NA> Costa Rica           CR    Heredia      21947
## 44  299  10/9/07           <NA> Costa Rica           CR  San José       3072
## 45  301 10/11/07           <NA> Costa Rica           CR   Alajuela       7014
## 46  302 10/11/07           <NA> Costa Rica           CR  San José      26669
## 51  323 10/24/07           <NA> Costa Rica           CR Puntarenas       6540
## 102 556  5/29/08           <NA> Costa Rica           CR Guanacaste       4108
##            City Distance location_description latitude longitude
## 38      Heredia  0.26208                       10.0000  -84.1167
## 44  San Ignacio  4.57763                        9.7789  -84.1250
## 45       Atenas  3.08459                        9.9869  -84.4070
## 46               9.56251                       10.0214  -83.9451
## 51      Miramar  3.82425    Mine construction  10.0715  -84.7575
## 102     Bagaces 17.65521                       10.4024  -85.3555
##                                   geolocation hazard_type landslide_type
## 38                  (10, -84.116699999999994)   Landslide      Landslide
## 44              (9.7789000000000001, -84.125)   Landslide        Complex
## 45  (9.9869000000000003, -84.406999999999996)   Landslide       Mudslide
## 46             (10.0214, -83.945099999999996)   Landslide      Landslide
## 51             (10.0715, -84.757499999999993)   Landslide       Mudslide
## 102            (10.4024, -85.355500000000006)   Landslide      Landslide
##     landslide_size          trigger          storm_name injuries fatalities
## 38          Medium             Rain                           NA         NA
## 44          Medium             Rain                           NA          4
## 45           Large             Rain                           NA         14
## 46           Large             Rain                           NA         10
## 51          Medium         Downpour                           NA         NA
## 102         Medium Tropical cyclone Tropical Storm Alma       NA         NA
##                              source_name
## 38                         ticotimes.net
## 44                         ticotimes.net
## 45  Agence France-Presse, afp.google.com
## 46                  International Herald
## 51                Reuters - AlertNet.org
## 102                                     
##                                                                          source_link
## 38                         http://www.ticotimes.net/dailyarchive/2007_09/0911072.htm
## 44                         http://www.ticotimes.net/dailyarchive/2007_10/1010071.htm
## 45                  http://afp.google.com/article/ALeqM5hu6a8oyAM1ycq9nU_6Zyj_l7F0AA
## 46  http://www.iht.com/articles/ap/2007/10/12/america/LA-GEN-Costa-Rica-Mudslide.php
## 51             http://www.reuters.com/article/companyNewsAndPR/idUSN2435152820071025
## 102            http://www.reliefweb.int/rw/RWB.NSF/db900SID/ASAZ-7FHCHL?OpenDocument

Landslides by state

library(ggplot2)
ggplot(data=df_CR, aes(x = "Costa Rica", y = Distance, fill=State)) +
  geom_bar(stat = "identity", width = 1, color = "black") +
  coord_polar("y", start = 0)

ggplot(data=df_CR, aes(fill=State, y=Distance, x="Costa Rica")) +
  geom_bar(position="dodge", stat="identity")
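Both charts above use the Distance values as bar heights, so they show distance rather than the number of recorded events. A minimal sketch, assuming the goal is also to see how many landslides each state recorded, tabulates the events per state. The small `events` data frame is an illustrative stand-in built from the first six Costa Rica rows, not the full `df_CR`.

```r
# Illustrative subset: the first six Costa Rica rows listed above.
events <- data.frame(
  State = c("Heredia", "San José", "Alajuela", "San José", "Puntarenas", "Guanacaste"),
  Distance = c(0.26208, 4.57763, 3.08459, 9.56251, 3.82425, 17.65521)
)

# Number of recorded landslides per state, most frequent first.
events_per_state <- sort(table(events$State), decreasing = TRUE)
events_per_state

# Total distance per state, for comparison with the counts.
distance_per_state <- tapply(events$Distance, events$State, sum)
distance_per_state
```

On the full data, `ggplot(df_CR, aes(x = State)) + geom_bar()` would draw the same kind of count, since `geom_bar()` counts rows by default.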

Guanacaste

Landslides in the cities of Guanacaste

library(readr)
library(knitr)
df_GU <- subset (df, State == "Guanacaste")
df_GU %>% 
  select(Country, State, City, Distance, Date) 
##         Country      State     City Distance    Date
## 102  Costa Rica Guanacaste  Bagaces 17.65521 5/29/08
## 504  Costa Rica Guanacaste Tilarán 10.21631 11/4/10
## 888  Costa Rica Guanacaste Tilarán 12.33807 5/31/12
## 1169 Costa Rica Guanacaste Tilarán 12.21952 10/3/13
## 1173 Costa Rica Guanacaste Tilarán 12.18115 10/8/13
head(df_GU, n=4)
##        id    Date time Continent    Country country_code      State population
## 102   556 5/29/08           <NA> Costa Rica           CR Guanacaste       4108
## 504  2683 11/4/10           <NA> Costa Rica           CR Guanacaste       7301
## 888  4375 5/31/12           <NA> Costa Rica           CR Guanacaste       7301
## 1169 5571 10/3/13           <NA> Costa Rica           CR Guanacaste       7301
##          City Distance location_description latitude longitude
## 102   Bagaces 17.65521                       10.4024  -85.3555
## 504  Tilarán 10.21631                       10.4548  -84.8751
## 888  Tilarán 12.33807                       10.5562  -84.8952
## 1169 Tilarán 12.21952                       10.5543  -84.8946
##                                    geolocation hazard_type landslide_type
## 102             (10.4024, -85.355500000000006)   Landslide      Landslide
## 504  (10.454800000000001, -84.875100000000003)   Landslide      Landslide
## 888             (10.5562, -84.895200000000003)   Landslide      Landslide
## 1169            (10.5543, -84.894599999999997)   Landslide      Landslide
##      landslide_size          trigger           storm_name injuries fatalities
## 102          Medium Tropical cyclone  Tropical Storm Alma       NA         NA
## 504          Medium Tropical cyclone Tropical Storm Tomas       NA          0
## 888           Large         Downpour                            NA         NA
## 1169         Medium   Mining digging                            NA         NA
##            source_name
## 102                   
## 504                   
## 888                   
## 1169 www.ticotimes.net
##                                                                                                                                          source_link
## 102                                                                            http://www.reliefweb.int/rw/RWB.NSF/db900SID/ASAZ-7FHCHL?OpenDocument
## 504                                                                  http://fortunatimes.com/2010/11/06/no-passage-to-the-south-and-central-pacific/
## 888                                     http://thecostaricanews.com/landslides-and-wash-outs-continue-to-cause-problems-in-northern-costa-rica/12129
## 1169 http://www.ticotimes.net/More-news/News-Briefs/TRAVEL-ALERT-UPDATE-Rains-landslides-close-eight-routes-across-Costa-Rica_Friday-October-04-2013
ggplot(data=df_GU, aes(x=City, y=Distance)) + geom_bar(stat="identity", color="blue", fill="white")

Pie chart

ggplot(df_GU, aes(x = "Guanacaste", y = Distance, fill = City)) +
  geom_bar(stat = "identity", color = "white") +
  # Label each slice with its Distance value
  geom_text(aes(label = Distance),
            position = position_stack(vjust = 0.5), color = "white", size = 4) +
  coord_polar(theta = "y") +
  labs(title = "Landslide distance by city")

Pareto chart

City with the largest landslide distance
library(qcc)

Distance <- df_GU$Distance
names(Distance) <- df_GU$City 

pareto.chart(Distance, 
             ylab="Distance",
             col = heat.colors(length(Distance)),
             cumperc = seq(0, 100, by = 10),
             ylab2 = "Cumulative percentage",
             main = "CITIES WITH THE LARGEST LANDSLIDE DISTANCES"
)

##           
## Pareto chart analysis for Distance
##            Frequency Cum.Freq. Percentage Cum.Percent.
##   Bagaces   17.65521  17.65521   27.32571     27.32571
##   Tilarán  12.33807  29.99328   19.09615     46.42185
##   Tilarán  12.21952  42.21280   18.91266     65.33451
##   Tilarán  12.18115  54.39395   18.85328     84.18779
##   Tilarán  10.21631  64.61026   15.81221    100.00000
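In the Pareto output above, Tilarán appears four times because every event is passed to the chart as its own bar. A minimal sketch, assuming the intent is to rank cities by their total distance, sums the values per city first. The aggregation step is an assumption and is not part of the original analysis.

```r
# Distances and cities copied from the Guanacaste listing above.
Distance <- c(17.65521, 10.21631, 12.33807, 12.21952, 12.18115)
City <- c("Bagaces", "Tilarán", "Tilarán", "Tilarán", "Tilarán")

# Sum the distances per city, then sort from largest to smallest.
by_city <- sort(tapply(Distance, City, sum), decreasing = TRUE)

# Cumulative percentage, the quantity a Pareto chart draws as a line.
cum_pct <- cumsum(by_city) / sum(by_city) * 100
by_city
cum_pct

# With qcc installed, the aggregated vector can be charted directly:
# qcc::pareto.chart(by_city, ylab = "Distance")
```

With one bar per city, the chart answers the question in the heading directly instead of leaving the reader to add up repeated labels.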

Stem-and-leaf plot

stem(df_GU$"Distance")
## 
##   The decimal point is at the |
## 
##   10 | 2
##   12 | 223
##   14 | 
##   16 | 7
stem(df_GU$"Distance", scale = 2)
## 
##   The decimal point is at the |
## 
##   10 | 2
##   11 | 
##   12 | 223
##   13 | 
##   14 | 
##   15 | 
##   16 | 
##   17 | 7

Time series

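library(forecast)
# Note: ts() places the five values at consecutive months starting in January 2007;
# these positions are not the recorded event dates (5/29/08 to 10/8/13).
data_serie <- ts(df_GU$Distance, frequency = 12, start = 2007)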
head(data_serie)
##           Jan      Feb      Mar      Apr      May
## 2007 17.65521 10.21631 12.33807 12.21952 12.18115
autoplot(data_serie) +
  labs(title = "Landslide distance series", x = "Year", y = "Distance") +
  theme_bw()
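The series above indexes the events as January to May 2007, whereas the Date column records them between 2008 and 2013. A minimal sketch, assuming the aim is to see distance against the actual event dates, parses the Date strings and plots each event at its recorded date. The `events` data frame is a stand-in for the corresponding columns of `df_GU`, and base `plot()` is used here only to keep the example dependency-free.

```r
# Dates and distances copied from the Guanacaste listing above.
events <- data.frame(
  Date = c("5/29/08", "11/4/10", "5/31/12", "10/3/13", "10/8/13"),
  Distance = c(17.65521, 10.21631, 12.33807, 12.21952, 12.18115)
)

# Parse month/day/two-digit-year strings into Date objects.
events$Date <- as.Date(events$Date, format = "%m/%d/%y")

# Order chronologically and plot each event at its recorded date.
events <- events[order(events$Date), ]
plot(events$Date, events$Distance, type = "b",
     xlab = "Date", ylab = "Distance",
     main = "Landslide distance by event date")
```

Because the events are irregularly spaced, a plain date axis avoids implying the fixed monthly spacing that `ts(..., frequency = 12)` assumes.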

Frequency tables

library(questionr)

table <- questionr::freq(Distance, cum = TRUE, sort = "dec", total = TRUE)
knitr::kable(table)
Distance n % val% %cum val%cum
10.21631 1 20 20 20 20
12.18115 1 20 20 40 40
12.21952 1 20 20 60 60
12.33807 1 20 20 80 80
17.65521 1 20 20 100 100
Total 5 100 100 100 100
str(table) 
## Classes 'freqtab' and 'data.frame':  6 obs. of  5 variables:
##  $ n      : num  1 1 1 1 1 5
##  $ %      : num  20 20 20 20 20 100
##  $ val%   : num  20 20 20 20 20 100
##  $ %cum   : num  20 40 60 80 100 100
##  $ val%cum: num  20 40 60 80 100 100
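# Drop the final "Total" row before plotting
x <- row.names(table)
y <- table$n
value_labels <- x[1:(length(x)-1)]
freqs <- y[1:(length(y)-1)]
# Note: this reassigns df, which until now held the full catalog
df <- data.frame(x = value_labels, y = freqs)
knitr::kable(df)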
x y
10.21631 1
12.18115 1
12.21952 1
12.33807 1
17.65521 1
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) + 
  geom_bar(stat="identity", color="white", fill="blue") +
  xlab("Distance") +
  ylab("Frequency")

Grouped frequency table

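# Sturges' rule: k = 1 + log2(n)
n_sturges <- 1 + log(length(Distance)) / log(2)
n_sturgesc <- ceiling(n_sturges)
n_sturgesf <- floor(n_sturges)

# Use the floor when the rounded-up class count is even, otherwise the ceiling
n_clases <- 0
if (n_sturgesc %% 2 == 0) {
  n_clases <- n_sturgesf
} else {
  n_clases <- n_sturgesc
}
R <- max(Distance) - min(Distance)   # range of the data
w <- ceiling(R / n_clases)           # class width, rounded up
bins <- seq(min(Distance), max(Distance) + w, by = w)
bins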
## [1] 10.21631 13.21631 16.21631 19.21631
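# Assign each distance to a class; note that Freq_table below tabulates the
# individual Distance values rather than these classes.
Distance_classes <- cut(Distance, bins)
Freq_table <- transform(table(Distance), Rel_Freq=prop.table(Freq), Cum_Freq=cumsum(Freq))
knitr::kable(Freq_table)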
Distance Freq Rel_Freq Cum_Freq
10.21631 1 0.2 1
12.18115 1 0.2 2
12.21952 1 0.2 3
12.33807 1 0.2 4
17.65521 1 0.2 5
str(Freq_table)
## 'data.frame':    5 obs. of  4 variables:
##  $ Distance: Factor w/ 5 levels "10.21631","12.18115",..: 1 2 3 4 5
##  $ Freq    : int  1 1 1 1 1
##  $ Rel_Freq: num  0.2 0.2 0.2 0.2 0.2
##  $ Cum_Freq: int  1 2 3 4 5
df <- data.frame(x = Freq_table$Distance, y = Freq_table$Freq)
knitr::kable(df)
x y
10.21631 1
12.18115 1
12.21952 1
12.33807 1
17.65521 1
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) +
  geom_bar(stat="identity", color="blue", fill="green") +
  xlab("Distance") +
  ylab("Frequency")
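The table in this section lists each distance once, so it is not yet grouped into the classes defined by `bins`. A minimal sketch, assuming the intent is a frequency table per class, tabulates the result of `cut()` instead of the raw values. The names `classes` and `freq` are illustrative, and `include.lowest = TRUE` is an added assumption: without it the minimum, which lies exactly on the first boundary, would become `NA`.

```r
# Guanacaste distances, as listed above.
Distance <- c(17.65521, 10.21631, 12.33807, 12.21952, 12.18115)

# Class boundaries matching the printed bins (width 3, starting at the minimum).
bins <- seq(min(Distance), by = 3, length.out = 4)

# include.lowest = TRUE keeps the minimum inside the first class.
classes <- cut(Distance, bins, include.lowest = TRUE)

# Frequency, relative frequency and cumulative frequency per class.
freq <- as.data.frame(table(Class = classes))
freq$Rel_Freq <- freq$Freq / sum(freq$Freq)
freq$Cum_Freq <- cumsum(freq$Freq)
freq
```

Plotting `freq$Class` against `freq$Freq` would then give one bar per class rather than one bar per observation.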

Descriptive statistics of landslide distance

summary(df_GU$Distance)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   10.22   12.18   12.22   12.92   12.34   17.66
library(pastecs)
stat.desc(df_GU)
## Warning in min(x): no non-missing arguments to min; returning Inf
## Warning in max(x): no non-missing arguments to max; returning -Inf
## Warning in qt((0.5 + p/2), (Nbrval - 1)): NaNs produced
##                        id Date time Continent Country country_code State
## nbr.val      5.000000e+00   NA   NA        NA      NA           NA    NA
## nbr.null     0.000000e+00   NA   NA        NA      NA           NA    NA
## nbr.na       0.000000e+00   NA   NA        NA      NA           NA    NA
## min          5.560000e+02   NA   NA        NA      NA           NA    NA
## max          5.591000e+03   NA   NA        NA      NA           NA    NA
## range        5.035000e+03   NA   NA        NA      NA           NA    NA
## sum          1.877600e+04   NA   NA        NA      NA           NA    NA
## median       4.375000e+03   NA   NA        NA      NA           NA    NA
## mean         3.755200e+03   NA   NA        NA      NA           NA    NA
## SE.mean      9.601025e+02   NA   NA        NA      NA           NA    NA
## CI.mean.0.95 2.665672e+03   NA   NA        NA      NA           NA    NA
## var          4.608984e+06   NA   NA        NA      NA           NA    NA
## std.dev      2.146854e+03   NA   NA        NA      NA           NA    NA
## coef.var     5.717018e-01   NA   NA        NA      NA           NA    NA
##                population City   Distance location_description     latitude
## nbr.val      5.000000e+00   NA  5.0000000                   NA  5.000000000
## nbr.null     0.000000e+00   NA  0.0000000                   NA  0.000000000
## nbr.na       0.000000e+00   NA  0.0000000                   NA  0.000000000
## min          4.108000e+03   NA 10.2163100                   NA 10.402400000
## max          7.301000e+03   NA 17.6552100                   NA 10.556200000
## range        3.193000e+03   NA  7.4389000                   NA  0.153800000
## sum          3.331200e+04   NA 64.6102600                   NA 52.522300000
## median       7.301000e+03   NA 12.2195200                   NA 10.554300000
## mean         6.662400e+03   NA 12.9220520                   NA 10.504460000
## SE.mean      6.386000e+02   NA  1.2471437                   NA  0.032060437
## CI.mean.0.95 1.773038e+03   NA  3.4626259                   NA  0.089014042
## var          2.039050e+06   NA  7.7768366                   NA  0.005139358
## std.dev      1.427953e+03   NA  2.7886980                   NA  0.071689316
## coef.var     2.143301e-01   NA  0.2158092                   NA  0.006824655
##                  longitude geolocation hazard_type landslide_type
## nbr.val       5.000000e+00          NA          NA             NA
## nbr.null      0.000000e+00          NA          NA             NA
## nbr.na        0.000000e+00          NA          NA             NA
## min          -8.535550e+01          NA          NA             NA
## max          -8.487510e+01          NA          NA             NA
## range         4.804000e-01          NA          NA             NA
## sum          -4.249159e+02          NA          NA             NA
## median       -8.489520e+01          NA          NA             NA
## mean         -8.498318e+01          NA          NA             NA
## SE.mean       9.316065e-02          NA          NA             NA
## CI.mean.0.95  2.586554e-01          NA          NA             NA
## var           4.339454e-02          NA          NA             NA
## std.dev       2.083136e-01          NA          NA             NA
## coef.var     -2.451233e-03          NA          NA             NA
##              landslide_size trigger storm_name injuries fatalities source_name
## nbr.val                  NA      NA         NA        0   2.000000          NA
## nbr.null                 NA      NA         NA        0   1.000000          NA
## nbr.na                   NA      NA         NA        5   3.000000          NA
## min                      NA      NA         NA      Inf   0.000000          NA
## max                      NA      NA         NA     -Inf   2.000000          NA
## range                    NA      NA         NA     -Inf   2.000000          NA
## sum                      NA      NA         NA        0   2.000000          NA
## median                   NA      NA         NA       NA   1.000000          NA
## mean                     NA      NA         NA      NaN   1.000000          NA
## SE.mean                  NA      NA         NA       NA   1.000000          NA
## CI.mean.0.95             NA      NA         NA      NaN  12.706205          NA
## var                      NA      NA         NA       NA   2.000000          NA
## std.dev                  NA      NA         NA       NA   1.414214          NA
## coef.var                 NA      NA         NA       NA   1.414214          NA
##              source_link
## nbr.val               NA
## nbr.null              NA
## nbr.na                NA
## min                   NA
## max                   NA
## range                 NA
## sum                   NA
## median                NA
## mean                  NA
## SE.mean               NA
## CI.mean.0.95          NA
## var                   NA
## std.dev               NA
## coef.var              NA
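Most columns in the output above are `NA` because `stat.desc()` was given the character columns along with the numeric ones. A minimal sketch, assuming only the numeric columns are of interest, selects them before summarising. `df_example` is a small stand-in for `df_GU`, and the base R summary is used so the example runs without extra packages.

```r
# Small stand-in for df_GU: one character column and two numeric columns.
df_example <- data.frame(
  City = c("Bagaces", "Tilarán", "Tilarán", "Tilarán", "Tilarán"),
  population = c(4108, 7301, 7301, 7301, 7301),
  Distance = c(17.65521, 10.21631, 12.33807, 12.21952, 12.18115)
)

# Keep only the numeric columns before summarising.
numeric_cols <- df_example[sapply(df_example, is.numeric)]

# Mean and standard deviation per numeric column.
stats <- sapply(numeric_cols, function(x) c(mean = mean(x), std.dev = sd(x)))
stats

# With pastecs installed, the same selection can be passed on:
# pastecs::stat.desc(numeric_cols)
```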

Box plot

boxplot(Distance, horizontal=TRUE, col='steelblue')
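To read the box plot it helps to know which values fall outside the whiskers. A minimal sketch, assuming the conventional 1.5 × IQR rule, computes the fences by hand for the Guanacaste distances and compares them with `boxplot.stats()`. `boxplot()` works from hinges rather than `quantile()`, but for these five values the two coincide.

```r
# Guanacaste distances, as listed above.
Distance <- c(17.65521, 10.21631, 12.33807, 12.21952, 12.18115)

# Quartiles and interquartile range.
q <- quantile(Distance, c(0.25, 0.75))
iqr <- q[[2]] - q[[1]]

# Conventional 1.5 * IQR fences.
lower_fence <- q[[1]] - 1.5 * iqr
upper_fence <- q[[2]] + 1.5 * iqr

# Values outside the fences are drawn as individual points.
outliers <- Distance[Distance < lower_fence | Distance > upper_fence]
outliers

# boxplot.stats() reports the points that boxplot() draws beyond the whiskers.
boxplot.stats(Distance)$out
```

With three of the five values lying within about 0.16 of each other, the box is very narrow, which is why both the smallest and the largest distance fall outside the fences.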

library(tidyverse)
library(hrbrthemes)
library(viridis)

df <- data.frame(Distance)
df %>% ggplot(aes(x = "", y = Distance)) +
  geom_boxplot(color="red", fill="orange", alpha=0.5) +
  theme_ipsum() +
  theme(legend.position="none", plot.title = element_text(size=11)) +
  ggtitle("Landslides") +
  coord_flip() +
  xlab("") +
  ylab("")
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font
## family not found in Windows font database

# Reload the catalog: df was overwritten above by the frequency-table and box-plot steps
library(readr)
library(knitr)
df <- read.csv("https://raw.githubusercontent.com/lihkir/AnalisisEstadisticoUN/main/Data/catalog.csv")
library(dplyr)
# Rename the columns used in the analysis (selected by position)
colnames(df)[4] <- "Continent"
colnames(df)[10] <- "Distance"
colnames(df)[5] <- "Country"
colnames(df)[7] <- "State"
colnames(df)[9] <- "City"
colnames(df)[2] <- "Date"
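The catalog has to be reloaded here because earlier steps reuse the name `df` for intermediate tables. A minimal sketch of one way to avoid this, assuming a small helper is acceptable: the function `state_distances()` is hypothetical (it is not part of the original report) and returns the distances for one state, named by city, without touching the catalog itself. The `catalog` data frame is a toy stand-in for the renamed `df`.

```r
# Hypothetical helper: returns the distances for one state, named by city,
# without modifying the data frame that is passed in.
state_distances <- function(catalog, state) {
  rows <- catalog[which(catalog$State == state), ]
  stats::setNames(rows$Distance, rows$City)
}

# Toy stand-in for the renamed catalog (values taken from the listings above).
catalog <- data.frame(
  State = c("Guanacaste", "Guanacaste", "Alajuela", "Alajuela"),
  City = c("Bagaces", "Tilarán", "Atenas", "Upala"),
  Distance = c(17.65521, 10.21631, 3.08459, 0.70048)
)

dist_gu <- state_distances(catalog, "Guanacaste")
dist_aj <- state_distances(catalog, "Alajuela")
dist_gu
dist_aj
```

Keeping per-state results in their own objects (`dist_gu`, `dist_aj`) means the full catalog stays available for the next state without a second download.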

Alajuela

Landslides in the cities of Alajuela

library(readr)
library(knitr)
df_AJ <- subset (df, State == "Alajuela")
df_AJ %>% 
  select(Country, State, City, Distance, Date) 
##         Country    State          City Distance     Date
## 45   Costa Rica Alajuela        Atenas  3.08459 10/11/07
## 302  Costa Rica Alajuela  Desamparados  6.88715  4/14/10
## 347  Costa Rica Alajuela  Desamparados  6.92174  5/22/10
## 395  Costa Rica Alajuela  Desamparados  4.24199  7/30/10
## 459  Costa Rica Alajuela    San Rafael  1.47396  9/29/10
## 503  Costa Rica Alajuela    San Rafael  9.61692  11/4/10
## 514  Costa Rica Alajuela      Santiago  5.43516  11/5/10
## 780  Costa Rica Alajuela         Upala  0.70048  7/12/11
## 1098 Costa Rica Alajuela     Sabanilla  4.87432  8/27/13
## 1156 Costa Rica Alajuela     Sabanilla 10.32968  9/16/13
## 1289 Costa Rica Alajuela    La Fortuna  9.84213  10/4/14
## 1301 Costa Rica Alajuela                5.57523  9/19/14
## 1308 Costa Rica Alajuela  Desamparados  5.95519  11/1/14
## 1342 Costa Rica Alajuela   Rio Segundo 11.96524  8/21/14
## 1364 Costa Rica Alajuela  Desamparados  5.12667  8/10/14
## 1461 Costa Rica Alajuela    La Fortuna  5.96634  6/17/15
## 1475 Costa Rica Alajuela        Atenas  6.80061   6/3/15
## 1642 Costa Rica Alajuela Santo Domingo  3.21979 10/27/15
## 1643 Costa Rica Alajuela      Alajuela  3.08916 11/18/15
## 1644 Costa Rica Alajuela       Naranjo  2.08469 10/29/15
head(df_AJ)
##       id     Date     time Continent    Country country_code    State
## 45   301 10/11/07               <NA> Costa Rica           CR Alajuela
## 302 1749  4/14/10               <NA> Costa Rica           CR Alajuela
## 347 1886  5/22/10 18:00:00      <NA> Costa Rica           CR Alajuela
## 395 2174  7/30/10  9:30:00      <NA> Costa Rica           CR Alajuela
## 459 2516  9/29/10               <NA> Costa Rica           CR Alajuela
## 503 2682  11/4/10               <NA> Costa Rica           CR Alajuela
##     population         City Distance location_description latitude longitude
## 45        7014       Atenas  3.08459                        9.9869  -84.4070
## 302      14448 Desamparados  6.88715           Above road   9.9323  -84.4453
## 347      14448 Desamparados  6.92174           Above road   9.9290  -84.4428
## 395      14448 Desamparados  4.24199           Above road   9.9271  -84.4568
## 459       3624   San Rafael  1.47396                       10.0757  -84.4793
## 503       3624   San Rafael  9.61692                       10.0421  -84.5577
##                                   geolocation hazard_type landslide_type
## 45  (9.9869000000000003, -84.406999999999996)   Landslide       Mudslide
## 302 (9.9322999999999997, -84.445300000000003)   Landslide      Landslide
## 347 (9.9290000000000003, -84.442800000000005)   Landslide      Landslide
## 395 (9.9270999999999994, -84.456800000000001)   Landslide      Landslide
## 459 (10.075699999999999, -84.479299999999995)   Landslide       Mudslide
## 503            (10.0421, -84.557699999999997)   Landslide      Landslide
##     landslide_size          trigger           storm_name injuries fatalities
## 45           Large             Rain                            NA         14
## 302         Medium         Downpour                            NA          0
## 347         Medium         Downpour                             3          0
## 395         Medium             Rain                            NA          0
## 459         Medium         Downpour                            NA          0
## 503         Medium Tropical cyclone Tropical Storm Tomas       NA          0
##                              source_name
## 45  Agence France-Presse, afp.google.com
## 302                                     
## 347                      Costa Rica News
## 395                           La Fortuna
## 459                                     
## 503                                     
##                                                                                                                     source_link
## 45                                                             http://afp.google.com/article/ALeqM5hu6a8oyAM1ycq9nU_6Zyj_l7F0AA
## 302                                                http://www.insidecostarica.com/dailynews/2010/april/16/costarica10041602.htm
## 347                                       http://thecostaricanews.com/rains-cause-landslides-and-road-accidents-on-caldera/3255
## 395    https://lafortunatimes.wordpress.com/2010/07/30/landslide-caused-closure-of-san-jose-caldera-for-most-of-the-day-friday/
## 459 http://www.ticotimes.net/News/Daily-News/Inter-American-Highway-Reopens-Caldera-Highway-Under-Repair_Monday-October-04-2010
## 503                                             http://fortunatimes.com/2010/11/06/no-passage-to-the-south-and-central-pacific/
ggplot(data=df_AJ, aes(x=City, y=Distance)) + geom_bar(stat="identity", color="blue", fill="white")

Pie chart

ggplot(df_AJ, aes(x = "Alajuela", y = Distance, fill = City)) +
  geom_bar(stat = "identity", color = "white") +
  # Label each slice with its Distance value
  geom_text(aes(label = Distance),
            position = position_stack(vjust = 0.5), color = "white", size = 4) +
  coord_polar(theta = "y") +
  labs(title = "Landslide distance by city")

Pareto chart

City with the largest landslide distance
library(qcc)
Distance <- df_AJ$Distance
names(Distance) <- df_AJ$City 

pareto.chart(Distance, 
             ylab="Distance",
             col = heat.colors(length(Distance)),
             cumperc = seq(0, 100, by = 10),
             ylab2 = "Cumulative percentage",
             main = "CITIES WITH THE LARGEST LANDSLIDE DISTANCES"
)

##                
## Pareto chart analysis for Distance
##                   Frequency   Cum.Freq.  Percentage Cum.Percent.
##   Rio Segundo    11.9652400  11.9652400  10.5708367   10.5708367
##   Sabanilla      10.3296800  22.2949200   9.1258813   19.6967180
##   La Fortuna      9.8421300  32.1370500   8.6951494   28.3918674
##   San Rafael      9.6169200  41.7539700   8.4961849   36.8880523
##   Desamparados    6.9217400  48.6757100   6.1150953   43.0031476
##   Desamparados    6.8871500  55.5628600   6.0845364   49.0876840
##   Atenas          6.8006100  62.3634700   6.0080816   55.0957655
##   La Fortuna      5.9663400  68.3298100   5.2710356   60.3668011
##   Desamparados    5.9551900  74.2850000   5.2611850   65.6279861
##                   5.5752300  79.8602300   4.9255047   70.5534908
##   Santiago        5.4351600  85.2953900   4.8017582   75.3552490
##   Desamparados    5.1266700  90.4220600   4.5292189   79.8844679
##   Sabanilla       4.8743200  95.2963800   4.3062772   84.1907451
##   Desamparados    4.2419900  99.5383700   3.7476376   87.9383828
##   Santo Domingo   3.2197900 102.7581600   2.8445626   90.7829454
##   Alajuela        3.0891600 105.8473200   2.7291559   93.5121013
##   Atenas          3.0845900 108.9319100   2.7251185   96.2372198
##   Naranjo         2.0846900 111.0166000   1.8417447   98.0789646
##   San Rafael      1.4739600 112.4905600   1.3021879   99.3811524
##   Upala           0.7004800 113.1910400   0.6188476  100.0000000

Stem-and-leaf plot

stem(df_AJ$"Distance")
## 
##   The decimal point is 1 digit(s) to the right of the |
## 
##   0 | 1123334
##   0 | 555666777
##   1 | 0002
stem(df_AJ$"Distance", scale = 2)
## 
##   The decimal point is at the |
## 
##    0 | 75
##    2 | 1112
##    4 | 29146
##    6 | 00899
##    8 | 68
##   10 | 3
##   12 | 0

Time series

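library(forecast)
# Note: ts() places the values at consecutive months starting in January 2007;
# these positions are not the recorded event dates.
data_serie <- ts(df_AJ$Distance, frequency = 12, start = 2007)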
head(data_serie)
##          Jan     Feb     Mar     Apr     May     Jun
## 2007 3.08459 6.88715 6.92174 4.24199 1.47396 9.61692
autoplot(data_serie) +
  labs(title = "Landslide distance series", x = "Year", y = "Distance") +
  theme_bw()

Frequency tables

library(questionr)

table <- questionr::freq(Distance, cum = TRUE, sort = "dec", total = TRUE)
knitr::kable(table)
Distance n % val% %cum val%cum
0.70048 1 5 5 5 5
1.47396 1 5 5 10 10
2.08469 1 5 5 15 15
3.08459 1 5 5 20 20
3.08916 1 5 5 25 25
3.21979 1 5 5 30 30
4.24199 1 5 5 35 35
4.87432 1 5 5 40 40
5.12667 1 5 5 45 45
5.43516 1 5 5 50 50
5.57523 1 5 5 55 55
5.95519 1 5 5 60 60
5.96634 1 5 5 65 65
6.80061 1 5 5 70 70
6.88715 1 5 5 75 75
6.92174 1 5 5 80 80
9.61692 1 5 5 85 85
9.84213 1 5 5 90 90
10.32968 1 5 5 95 95
11.96524 1 5 5 100 100
Total 20 100 100 100 100
str(table) 
## Classes 'freqtab' and 'data.frame':  21 obs. of  5 variables:
##  $ n      : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ %      : num  5 5 5 5 5 5 5 5 5 5 ...
##  $ val%   : num  5 5 5 5 5 5 5 5 5 5 ...
##  $ %cum   : num  5 10 15 20 25 30 35 40 45 50 ...
##  $ val%cum: num  5 10 15 20 25 30 35 40 45 50 ...
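# Drop the final "Total" row before plotting
x <- row.names(table)
y <- table$n
value_labels <- x[1:(length(x)-1)]
freqs <- y[1:(length(y)-1)]
# Note: this reassigns df, which until now held the full catalog
df <- data.frame(x = value_labels, y = freqs)
knitr::kable(df)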
x y
0.70048 1
1.47396 1
2.08469 1
3.08459 1
3.08916 1
3.21979 1
4.24199 1
4.87432 1
5.12667 1
5.43516 1
5.57523 1
5.95519 1
5.96634 1
6.80061 1
6.88715 1
6.92174 1
9.61692 1
9.84213 1
10.32968 1
11.96524 1
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) + 
  geom_bar(stat="identity", color="white", fill="blue") +
  xlab("Distance") +
  ylab("Frequency")

Grouped frequency table

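# Sturges' rule: k = 1 + log2(n)
n_sturges <- 1 + log(length(Distance)) / log(2)
n_sturgesc <- ceiling(n_sturges)
n_sturgesf <- floor(n_sturges)

# Use the floor when the rounded-up class count is even, otherwise the ceiling
n_clases <- 0
if (n_sturgesc %% 2 == 0) {
  n_clases <- n_sturgesf
} else {
  n_clases <- n_sturgesc
}
R <- max(Distance) - min(Distance)   # range of the data
w <- ceiling(R / n_clases)           # class width, rounded up
bins <- seq(min(Distance), max(Distance) + w, by = w)
bins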
## [1]  0.70048  3.70048  6.70048  9.70048 12.70048
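# Assign each distance to a class; note that Freq_table below tabulates the
# individual Distance values rather than these classes.
Distance_classes <- cut(Distance, bins)
Freq_table <- transform(table(Distance), Rel_Freq=prop.table(Freq), Cum_Freq=cumsum(Freq))
knitr::kable(Freq_table)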
Distance Freq Rel_Freq Cum_Freq
0.70048 1 0.05 1
1.47396 1 0.05 2
2.08469 1 0.05 3
3.08459 1 0.05 4
3.08916 1 0.05 5
3.21979 1 0.05 6
4.24199 1 0.05 7
4.87432 1 0.05 8
5.12667 1 0.05 9
5.43516 1 0.05 10
5.57523 1 0.05 11
5.95519 1 0.05 12
5.96634 1 0.05 13
6.80061 1 0.05 14
6.88715 1 0.05 15
6.92174 1 0.05 16
9.61692 1 0.05 17
9.84213 1 0.05 18
10.32968 1 0.05 19
11.96524 1 0.05 20
str(Freq_table)
## 'data.frame':    20 obs. of  4 variables:
##  $ Distance: Factor w/ 20 levels "0.70048","1.47396",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ Freq    : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ Rel_Freq: num  0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 ...
##  $ Cum_Freq: int  1 2 3 4 5 6 7 8 9 10 ...
df <- data.frame(x = Freq_table$Distance, y = Freq_table$Freq)
knitr::kable(df)
x y
0.70048 1
1.47396 1
2.08469 1
3.08459 1
3.08916 1
3.21979 1
4.24199 1
4.87432 1
5.12667 1
5.43516 1
5.57523 1
5.95519 1
5.96634 1
6.80061 1
6.88715 1
6.92174 1
9.61692 1
9.84213 1
10.32968 1
11.96524 1
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) +
  geom_bar(stat="identity", color="blue", fill="green") +
  xlab("Distance range") +
  ylab("Frequency")
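
The frequency table above is built from `table(Distance)`, so every distinct distance gets its own row and the Sturges classes stored in `bins` are never tabulated. The following is a minimal sketch of a grouped table, assuming the 20 distances printed above and the same breakpoints (class width 3, starting at the minimum). The names `distances`, `breaks`, `classes` and `grouped` are illustrative, and `include.lowest = TRUE` is an added choice so that the minimum, which coincides with the first breakpoint, is not dropped as `NA`.

```r
# The 20 distances listed in the table above
distances <- c(0.70048, 1.47396, 2.08469, 3.08459, 3.08916, 3.21979, 4.24199,
               4.87432, 5.12667, 5.43516, 5.57523, 5.95519, 5.96634, 6.80061,
               6.88715, 6.92174, 9.61692, 9.84213, 10.32968, 11.96524)

# Same construction as `bins`: start at the minimum and step by the class width w = 3
breaks <- seq(min(distances), max(distances) + 3, by = 3)

# Assign each distance to a class; include.lowest keeps the minimum in the first class
classes <- cut(distances, breaks, include.lowest = TRUE)

# Absolute, relative and cumulative frequency per class
grouped <- transform(table(classes),
                     Rel_Freq = prop.table(Freq),
                     Cum_Freq = cumsum(Freq))
print(grouped)
```

Plotting `grouped$classes` against `grouped$Freq` would then give one bar per class instead of one bar per observation.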

People affected by landslides

summary(df_AJ$Distance)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.7005  3.1871  5.5052  5.6596  6.8958 11.9652
library(pastecs)
stat.desc(df_AJ)
##                        id Date time Continent Country country_code State
## nbr.val      2.000000e+01   NA   NA        NA      NA           NA    NA
## nbr.null     0.000000e+00   NA   NA        NA      NA           NA    NA
## nbr.na       0.000000e+00   NA   NA        NA      NA           NA    NA
## min          3.010000e+02   NA   NA        NA      NA           NA    NA
## max          7.488000e+03   NA   NA        NA      NA           NA    NA
## range        7.187000e+03   NA   NA        NA      NA           NA    NA
## sum          9.718800e+04   NA   NA        NA      NA           NA    NA
## median       5.878000e+03   NA   NA        NA      NA           NA    NA
## mean         4.859400e+03   NA   NA        NA      NA           NA    NA
## SE.mean      5.261514e+02   NA   NA        NA      NA           NA    NA
## CI.mean.0.95 1.101248e+03   NA   NA        NA      NA           NA    NA
## var          5.536707e+06   NA   NA        NA      NA           NA    NA
## std.dev      2.353021e+03   NA   NA        NA      NA           NA    NA
## coef.var     4.842204e-01   NA   NA        NA      NA           NA    NA
##                population City    Distance location_description     latitude
## nbr.val      2.000000e+01   NA  20.0000000                   NA  20.00000000
## nbr.null     0.000000e+00   NA   0.0000000                   NA   0.00000000
## nbr.na       0.000000e+00   NA   0.0000000                   NA   0.00000000
## min          1.015000e+03   NA   0.7004800                   NA   9.91890000
## max          4.749400e+04   NA  11.9652400                   NA  10.89160000
## range        4.647900e+04   NA  11.2647600                   NA   0.97270000
## sum          1.924900e+05   NA 113.1910400                   NA 202.24760000
## median       7.014000e+03   NA   5.5051950                   NA  10.04315000
## mean         9.624500e+03   NA   5.6595520                   NA  10.11238000
## SE.mean      2.281502e+03   NA   0.6812501                   NA   0.05493583
## CI.mean.0.95 4.775238e+03   NA   1.4258729                   NA   0.11498201
## var          1.041050e+08   NA   9.2820347                   NA   0.06035891
## std.dev      1.020319e+04   NA   3.0466432                   NA   0.24568050
## coef.var     1.060126e+00   NA   0.5383188                   NA   0.02429502
##                  longitude geolocation hazard_type landslide_type
## nbr.val       2.000000e+01          NA          NA             NA
## nbr.null      0.000000e+00          NA          NA             NA
## nbr.na        0.000000e+00          NA          NA             NA
## min          -8.501410e+01          NA          NA             NA
## max          -8.418070e+01          NA          NA             NA
## range         8.334000e-01          NA          NA             NA
## sum          -1.688552e+03          NA          NA             NA
## median       -8.444405e+01          NA          NA             NA
## mean         -8.442758e+01          NA          NA             NA
## SE.mean       4.594981e-02          NA          NA             NA
## CI.mean.0.95  9.617405e-02          NA          NA             NA
## var           4.222770e-02          NA          NA             NA
## std.dev       2.054938e-01          NA          NA             NA
## coef.var     -2.433965e-03          NA          NA             NA
##              landslide_size trigger storm_name   injuries fatalities
## nbr.val                  NA      NA         NA 11.0000000 18.0000000
## nbr.null                 NA      NA         NA 10.0000000 15.0000000
## nbr.na                   NA      NA         NA  9.0000000  2.0000000
## min                      NA      NA         NA  0.0000000  0.0000000
## max                      NA      NA         NA  3.0000000 14.0000000
## range                    NA      NA         NA  3.0000000 14.0000000
## sum                      NA      NA         NA  3.0000000 16.0000000
## median                   NA      NA         NA  0.0000000  0.0000000
## mean                     NA      NA         NA  0.2727273  0.8888889
## SE.mean                  NA      NA         NA  0.2727273  0.7749716
## CI.mean.0.95             NA      NA         NA  0.6076742  1.6350471
## var                      NA      NA         NA  0.8181818 10.8104575
## std.dev                  NA      NA         NA  0.9045340  3.2879260
## coef.var                 NA      NA         NA  3.3166248  3.6989168
##              source_name source_link
## nbr.val               NA          NA
## nbr.null              NA          NA
## nbr.na                NA          NA
## min                   NA          NA
## max                   NA          NA
## range                 NA          NA
## sum                   NA          NA
## median                NA          NA
## mean                  NA          NA
## SE.mean               NA          NA
## CI.mean.0.95          NA          NA
## var                   NA          NA
## std.dev               NA          NA
## coef.var              NA          NA

Box-and-whisker plot

boxplot(Distance, horizontal=TRUE, col='steelblue')

library(tidyverse)
library(hrbrthemes)
library(viridis)

df <- data.frame(Distance)
df %>% ggplot(aes(x = "", y = Distance)) +
  geom_boxplot(color="red", fill="orange", alpha=0.5) +
  theme_ipsum() +
  theme(legend.position="none", plot.title = element_text(size=11)) +
  ggtitle("Landslides") +
  coord_flip() +
  xlab("") +
  ylab("")
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font
## family not found in Windows font database

library(readr)
library(knitr)
df <- read.csv("https://raw.githubusercontent.com/lihkir/AnalisisEstadisticoUN/main/Data/catalog.csv")
library(dplyr)
colnames(df)[4] <- "Continent"
colnames(df)[10] <- "Distance"
colnames(df)[5] <- "Country"
colnames(df)[7] <- "State"
colnames(df)[9] <- "City"
colnames(df)[2] <- "Date"

Cartago Province

Landslides in the cities of Cartago

library(readr)
library(knitr)
df_CA <- subset (df, State == "Cartago")
head(df_CA, n=4)
##        id     Date time Continent    Country country_code   State population
## 505  2684  11/4/10           <NA> Costa Rica           CR Cartago       4350
## 828  4031 10/31/11           <NA> Costa Rica           CR Cartago       6784
## 1383 6695  9/13/14           <NA> Costa Rica           CR Cartago      26594
## 1646 7490 10/15/15           <NA> Costa Rica           CR Cartago       4060
##         City Distance location_description latitude longitude
## 505   Orosí 19.28722                        9.6227  -83.8359
## 828      Cot  9.63616        Natural slope   9.9792  -83.8525
## 1383 Cartago  3.07297           Below road   9.8895  -83.9316
## 1646          5.15142           Above road   9.7917  -83.9815
##                                    geolocation hazard_type landslide_type
## 505              (9.6227, -83.835899999999995)   Landslide      Landslide
## 828  (9.9792000000000005, -83.852500000000006)   Landslide      Landslide
## 1383             (9.8895, -83.931600000000003)   Landslide      Landslide
## 1646 (9.7917000000000005, -83.981499999999997)   Landslide      Landslide
##      landslide_size          trigger           storm_name injuries fatalities
## 505          Medium Tropical cyclone Tropical Storm Tomas       NA          0
## 828          Medium         Downpour                            NA          0
## 1383          Small             Rain                             0          0
## 1646         Medium         Downpour                             0          0
##            source_name
## 505                   
## 828  Inside Costa Rica
## 1383             Ahora
## 1646             crhoy
##                                                                              source_link
## 505      http://fortunatimes.com/2010/11/06/no-passage-to-the-south-and-central-pacific/
## 828       http://www.insidecostarica.com/dailynews/2011/october/31/costarica11103102.htm
## 1383 http://www.ahora.cr/nacionales/Derrumbe-pone-riesgo-linea-Cartago_0_1439256064.html
## 1646     http://www.crhoy.com/carril-cerrado-sobre-interamericana-sur-por-deslizamiento/
df_CA %>% 
  select(Country, State, City, Distance, Date) 
##         Country   State    City Distance     Date
## 505  Costa Rica Cartago  Orosí 19.28722  11/4/10
## 828  Costa Rica Cartago     Cot  9.63616 10/31/11
## 1383 Costa Rica Cartago Cartago  3.07297  9/13/14
## 1646 Costa Rica Cartago          5.15142 10/15/15
## 1647 Costa Rica Cartago     Cot  9.53493  3/20/15
## 1648 Costa Rica Cartago Cartago  2.94804  3/18/15
ggplot(data=df_CA, aes(x=City, y=Distance)) + geom_bar(stat="identity", color="blue", fill="white")

Pie chart

ggplot(df_CA,aes(x="Cartago",y=Distance, fill=City))+
  geom_bar(stat = "identity",
           color="white")+
    geom_text(aes(label=(Distance*1)),
              position=position_stack(vjust=0.5),color="white",size=4)+
  coord_polar(theta = "y")+
    labs(title="Landslide chart")

Pareto chart

City with the largest landslide
library(qcc)
Distance <- df_CA$Distance
names(Distance) <- df_CA$City 

pareto.chart(Distance, 
             ylab="Distance",
             col = heat.colors(length(Distance)),
             cumperc = seq(0, 100, by = 10),
             ylab2 = "Cumulative percentage",
             main = "CITIES WITH THE LARGEST LANDSLIDES"
)

##          
## Pareto chart analysis for Distance
##            Frequency  Cum.Freq. Percentage Cum.Percent.
##   Orosí   19.287220  19.287220  38.861440    38.861440
##   Cot       9.636160  28.923380  19.415709    58.277148
##   Cot       9.534930  38.458310  19.211743    77.488891
##             5.151420  43.609730  10.379495    87.868386
##   Cartago   3.072970  46.682700   6.191667    94.060052
##   Cartago   2.948040  49.630740   5.939948   100.000000

Stem-and-leaf plot

stem(df_CA$"Distance")
## 
##   The decimal point is 1 digit(s) to the right of the |
## 
##   0 | 33
##   0 | 5
##   1 | 00
##   1 | 9
stem(df_CA$"Distance", scale = 2)
## 
##   The decimal point is at the |
## 
##    2 | 91
##    4 | 2
##    6 | 
##    8 | 56
##   10 | 
##   12 | 
##   14 | 
##   16 | 
##   18 | 3

Time series

library(forecast)
data_serie<- ts(df_CA$Distance, frequency=12, start=2007)
head(data_serie)
##           Jan      Feb      Mar      Apr      May      Jun
## 2007 19.28722  9.63616  3.07297  5.15142  9.53493  2.94804
autoplot(data_serie)+
labs(title = "Landslide series", x="Years", y = "Distance", colour = "#00a0dc") +theme_bw()
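
`ts(..., frequency = 12, start = 2007)` labels the observations as consecutive months from January 2007, whereas the six events listed earlier for Cartago fall between 2010 and 2015 at irregular intervals. A minimal sketch of one alternative, assuming the dates and distances printed above, parses the `Date` strings and orders the events chronologically. The data frame name `events` is illustrative.

```r
# Dates and distances of the six Cartago events, as printed above
events <- data.frame(
  Date     = c("11/4/10", "10/31/11", "9/13/14", "10/15/15", "3/20/15", "3/18/15"),
  Distance = c(19.28722, 9.63616, 3.07297, 5.15142, 9.53493, 2.94804)
)

# Parse month/day/two-digit-year strings into Date objects
events$Date <- as.Date(events$Date, format = "%m/%d/%y")

# Order chronologically so that any plot follows real time
events <- events[order(events$Date), ]
print(events)
```

With the dates parsed, `ggplot(events, aes(x = Date, y = Distance)) + geom_line() + geom_point()` would place each event at its actual date on the time axis.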

Frequency tables

library(questionr)

table <- questionr::freq(Distance, cum = TRUE, sort = "dec", total = TRUE)
knitr::kable(table)
Distance n % val% %cum val%cum
2.94804 1 16.7 16.7 16.7 16.7
3.07297 1 16.7 16.7 33.3 33.3
5.15142 1 16.7 16.7 50.0 50.0
9.53493 1 16.7 16.7 66.7 66.7
9.63616 1 16.7 16.7 83.3 83.3
19.28722 1 16.7 16.7 100.0 100.0
Total 6 100.0 100.0 100.0 100.0
str(table) 
## Classes 'freqtab' and 'data.frame':  7 obs. of  5 variables:
##  $ n      : num  1 1 1 1 1 1 6
##  $ %      : num  16.7 16.7 16.7 16.7 16.7 16.7 100
##  $ val%   : num  16.7 16.7 16.7 16.7 16.7 16.7 100
##  $ %cum   : num  16.7 33.3 50 66.7 83.3 100 100
##  $ val%cum: num  16.7 33.3 50 66.7 83.3 100 100
x <- row.names(table)
y <- table$n
names <- x[1:(length(x)-1)]
freqs <- y[1:(length(y)-1)]
df <- data.frame(x = names, y = freqs)
knitr::kable(df)
x y
2.94804 1
3.07297 1
5.15142 1
9.53493 1
9.63616 1
19.28722 1
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) + 
  geom_bar(stat="identity", color="white", fill="blue") +
  xlab("Distance") +
  ylab("Frequency")

Grouped frequency table

n_sturges = 1 + log(length(Distance))/log(2)
n_sturgesc = ceiling(n_sturges)
n_sturgesf = floor(n_sturges)

n_clases = 0
if (n_sturgesc%%2 == 0) {
  n_clases = n_sturgesf
} else {
  n_clases = n_sturgesc
}
R = max(Distance) - min(Distance)
w = ceiling(R/n_clases)
bins <- seq(min(Distance), max(Distance) + w, by = w)
bins
## [1]  2.94804  8.94804 14.94804 20.94804
Edades <- cut(Distance, bins)
Freq_table <- transform(table(Distance), Rel_Freq=prop.table(Freq), Cum_Freq=cumsum(Freq))
knitr::kable(Freq_table)
Distance Freq Rel_Freq Cum_Freq
2.94804 1 0.1666667 1
3.07297 1 0.1666667 2
5.15142 1 0.1666667 3
9.53493 1 0.1666667 4
9.63616 1 0.1666667 5
19.28722 1 0.1666667 6
str(Freq_table)
## 'data.frame':    6 obs. of  4 variables:
##  $ Distance: Factor w/ 6 levels "2.94804","3.07297",..: 1 2 3 4 5 6
##  $ Freq    : int  1 1 1 1 1 1
##  $ Rel_Freq: num  0.167 0.167 0.167 0.167 0.167 ...
##  $ Cum_Freq: int  1 2 3 4 5 6
df <- data.frame(x = Freq_table$Distance, y = Freq_table$Freq)
knitr::kable(df)
x y
2.94804 1
3.07297 1
5.15142 1
9.53493 1
9.63616 1
19.28722 1
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) +
  geom_bar(stat="identity", color="blue", fill="green") +
  xlab("Distance range") +
  ylab("Frequency")

People affected by landslides

summary(df_CA$Distance)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   2.948   3.593   7.343   8.272   9.611  19.287
library(pastecs)
stat.desc(df_CA)
##                        id Date time Continent Country country_code State
## nbr.val      6.000000e+00   NA   NA        NA      NA           NA    NA
## nbr.null     0.000000e+00   NA   NA        NA      NA           NA    NA
## nbr.na       0.000000e+00   NA   NA        NA      NA           NA    NA
## min          2.684000e+03   NA   NA        NA      NA           NA    NA
## max          7.492000e+03   NA   NA        NA      NA           NA    NA
## range        4.808000e+03   NA   NA        NA      NA           NA    NA
## sum          3.588300e+04   NA   NA        NA      NA           NA    NA
## median       7.092500e+03   NA   NA        NA      NA           NA    NA
## mean         5.980500e+03   NA   NA        NA      NA           NA    NA
## SE.mean      8.567926e+02   NA   NA        NA      NA           NA    NA
## CI.mean.0.95 2.202455e+03   NA   NA        NA      NA           NA    NA
## var          4.404561e+06   NA   NA        NA      NA           NA    NA
## std.dev      2.098705e+03   NA   NA        NA      NA           NA    NA
## coef.var     3.509246e-01   NA   NA        NA      NA           NA    NA
##                population City   Distance location_description    latitude
## nbr.val      6.000000e+00   NA  6.0000000                   NA  6.00000000
## nbr.null     0.000000e+00   NA  0.0000000                   NA  0.00000000
## nbr.na       0.000000e+00   NA  0.0000000                   NA  0.00000000
## min          4.060000e+03   NA  2.9480400                   NA  9.62270000
## max          2.659400e+04   NA 19.2872200                   NA  9.97920000
## range        2.253400e+04   NA 16.3391800                   NA  0.35650000
## sum          7.516600e+04   NA 49.6307400                   NA 59.14320000
## median       6.784000e+03   NA  7.3431750                   NA  9.88550000
## mean         1.252767e+04   NA  8.2717900                   NA  9.85720000
## SE.mean      4.473174e+03   NA  2.5159722                   NA  0.05493519
## CI.mean.0.95 1.149866e+04   NA  6.4675124                   NA  0.14121539
## var          1.200557e+08   NA 37.9806957                   NA  0.01810725
## std.dev      1.095699e+04   NA  6.1628480                   NA  0.13456317
## coef.var     8.746236e-01   NA  0.7450441                   NA  0.01365126
##                  longitude geolocation hazard_type landslide_type
## nbr.val       6.000000e+00          NA          NA             NA
## nbr.null      0.000000e+00          NA          NA             NA
## nbr.na        0.000000e+00          NA          NA             NA
## min          -8.398150e+01          NA          NA             NA
## max          -8.383590e+01          NA          NA             NA
## range         1.456000e-01          NA          NA             NA
## sum          -5.033958e+02          NA          NA             NA
## median       -8.389290e+01          NA          NA             NA
## mean         -8.389930e+01          NA          NA             NA
## SE.mean       2.429580e-02          NA          NA             NA
## CI.mean.0.95  6.245435e-02          NA          NA             NA
## var           3.541716e-03          NA          NA             NA
## std.dev       5.951232e-02          NA          NA             NA
## coef.var     -7.093303e-04          NA          NA             NA
##              landslide_size trigger storm_name injuries fatalities source_name
## nbr.val                  NA      NA         NA        4          6          NA
## nbr.null                 NA      NA         NA        4          6          NA
## nbr.na                   NA      NA         NA        2          0          NA
## min                      NA      NA         NA        0          0          NA
## max                      NA      NA         NA        0          0          NA
## range                    NA      NA         NA        0          0          NA
## sum                      NA      NA         NA        0          0          NA
## median                   NA      NA         NA        0          0          NA
## mean                     NA      NA         NA        0          0          NA
## SE.mean                  NA      NA         NA        0          0          NA
## CI.mean.0.95             NA      NA         NA        0          0          NA
## var                      NA      NA         NA        0          0          NA
## std.dev                  NA      NA         NA        0          0          NA
## coef.var                 NA      NA         NA      NaN        NaN          NA
##              source_link
## nbr.val               NA
## nbr.null              NA
## nbr.na                NA
## min                   NA
## max                   NA
## range                 NA
## sum                   NA
## median                NA
## mean                  NA
## SE.mean               NA
## CI.mean.0.95          NA
## var                   NA
## std.dev               NA
## coef.var              NA
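
`stat.desc()` applied to the whole data frame returns `NA` for every non-numeric column, which makes the output hard to read. As a check on the `Distance` column, the sketch below recomputes its main statistics in base R, assuming the six Cartago distances printed earlier. The variable names are illustrative.

```r
# Distances of the six Cartago events, as printed earlier in this section
distance <- c(19.28722, 9.63616, 3.07297, 5.15142, 9.53493, 2.94804)

n        <- length(distance)
mean_d   <- mean(distance)
sd_d     <- sd(distance)
se_mean  <- sd_d / sqrt(n)                    # standard error of the mean
ci_half  <- qt(0.975, df = n - 1) * se_mean   # half-width of the 95% confidence interval
coef_var <- sd_d / mean_d                     # coefficient of variation

print(round(c(mean = mean_d, std.dev = sd_d, SE.mean = se_mean,
              CI.mean.0.95 = ci_half, coef.var = coef_var), 4))
```

To keep the `pastecs` output but drop the `NA` columns, one option is to pass only the numeric columns, for example `stat.desc(df_CA[sapply(df_CA, is.numeric)])`.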

Box-and-whisker plot

boxplot(Distance, horizontal=TRUE, col='steelblue')
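
The numbers behind the box plot can be inspected with `boxplot.stats()`, which returns the whisker ends, hinges and median that `boxplot()` draws, plus any points flagged as outliers. This is a small sketch assuming the six Cartago distances printed earlier. The names `bs` and `upper_fence` are illustrative.

```r
# Distances of the six Cartago events, as printed earlier in this section
distance <- c(19.28722, 9.63616, 3.07297, 5.15142, 9.53493, 2.94804)

# Five numbers drawn by boxplot(): lower whisker, lower hinge, median,
# upper hinge, upper whisker
bs <- boxplot.stats(distance)
print(bs$stats)

# Upper fence: upper hinge plus 1.5 times the spread between the hinges
upper_fence <- bs$stats[4] + 1.5 * (bs$stats[4] - bs$stats[2])
print(upper_fence)

# Observations beyond the fences, which boxplot() would draw as separate points
print(bs$out)
```

The hinges come from `fivenum()` and can differ slightly from the quartiles reported by `summary()`, which uses a different interpolation rule.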

library(tidyverse)
library(hrbrthemes)
library(viridis)

df <- data.frame(Distance)
df %>% ggplot(aes(x = "", y = Distance)) +
  geom_boxplot(color="red", fill="orange", alpha=0.5) +
  theme_ipsum() +
  theme(legend.position="none", plot.title = element_text(size=11)) +
  ggtitle("Landslides") +
  coord_flip() +
  xlab("") +
  ylab("")
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font
## family not found in Windows font database

library(readr)
library(knitr)
df <- read.csv("https://raw.githubusercontent.com/lihkir/AnalisisEstadisticoUN/main/Data/catalog.csv")
library(dplyr)
colnames(df)[4] <- "Continent"
colnames(df)[10] <- "Distance"
colnames(df)[5] <- "Country"
colnames(df)[7] <- "State"
colnames(df)[9] <- "City"
colnames(df)[2] <- "Date"

Heredia

Landslides in the cities of Heredia

library(readr)
library(knitr)
df_HE <- subset (df, State == "Heredia")
df_HE %>% 
  select(Country, State, City, Distance, Date) 
##         Country   State                  City Distance     Date
## 38   Costa Rica Heredia               Heredia  0.26208   9/9/07
## 311  Costa Rica Heredia               Ángeles 19.51432  4/27/10
## 480  Costa Rica Heredia               Ángeles 14.81614 10/15/10
## 529  Costa Rica Heredia               Ángeles 19.54581 11/21/10
## 702  Costa Rica Heredia               Ángeles 15.05161   5/8/11
## 884  Costa Rica Heredia         Santo Domingo 21.95470  5/13/12
## 1157 Costa Rica Heredia         Santo Domingo  9.85736  9/16/13
## 1384 Costa Rica Heredia Dulce Nombre de Jesus 10.01310 12/13/14
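
Accented city names such as Ángeles are stored as multi-byte sequences, so they display correctly only when R knows the encoding of the strings. The sketch below assumes the catalog file is UTF-8-encoded (the source does not state this) and shows how an undeclared byte string can be marked as UTF-8. The variable `city` is illustrative.

```r
# UTF-8 bytes for the city name, written out explicitly (0xC3 0x81 encodes the accented A)
city <- "\xc3\x81ngeles"

# Declare the bytes as UTF-8 so R treats the first two bytes as one accented letter
Encoding(city) <- "UTF-8"

# Number of characters once the encoding is declared
print(nchar(city))
```

The same declaration can be made when loading the data, for example with `read.csv(url, encoding = "UTF-8")`, where `url` stands for the catalog address used in this report.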
head(df_HE)
##       id     Date          time Continent    Country country_code   State
## 38   249   9/9/07                    <NA> Costa Rica           CR Heredia
## 311 1786  4/27/10 Early morning      <NA> Costa Rica           CR Heredia
## 480 2598 10/15/10                    <NA> Costa Rica           CR Heredia
## 529 2742 11/21/10                    <NA> Costa Rica           CR Heredia
## 702 3472   5/8/11         Night      <NA> Costa Rica           CR Heredia
## 884 4358  5/13/12                    <NA> Costa Rica           CR Heredia
##     population          City Distance location_description latitude longitude
## 38       21947       Heredia  0.26208                       10.0000  -84.1167
## 311       1355       Ángeles 19.51432                       10.1452  -83.9564
## 480       1355       Ángeles 14.81614                       10.1067  -83.9753
## 529       1355       Ángeles 19.54581                       10.1433  -83.9529
## 702       1355       Ángeles 15.05161                       10.1118  -83.9793
## 884       5745 Santo Domingo 21.95470                       10.1981  -84.0074
##                                   geolocation hazard_type landslide_type
## 38                  (10, -84.116699999999994)   Landslide      Landslide
## 311 (10.145200000000001, -83.956400000000002)   Landslide      Landslide
## 480            (10.1067, -83.975300000000004)   Landslide       Rockfall
## 529                       (10.1433, -83.9529)   Landslide      Landslide
## 702 (10.111800000000001, -83.979299999999995)   Landslide      Landslide
## 884            (10.1981, -84.007400000000004)   Landslide      Landslide
##     landslide_size  trigger storm_name injuries fatalities   source_name
## 38          Medium     Rain                  NA         NA ticotimes.net
## 311         Medium Downpour                  NA          0              
## 480         Medium Downpour                  NA          2              
## 529         Medium Downpour                  NA          0              
## 702         Medium     Rain                  NA          0              
## 884         Medium Downpour                  NA         NA              
##                                                                                                        source_link
## 38                                                       http://www.ticotimes.net/dailyarchive/2007_09/0911072.htm
## 311                                                                  http://en.trend.az/news/incident/1678592.html
## 480 http://www.ticotimes.net/News/Daily-News/Two-People-Die-in-Landslide-on-Limon-Highway_Saturday-October-16-2010
## 529                                    http://insidecostarica.com/dailynews/2010/november/22/costarica10112204.htm
## 702                                         http://insidecostarica.com/dailynews/2011/may/10/costarica11051010.htm
## 884                                     http://www.insidecostarica.com/dailynews/2012/may/17/costarica12051708.htm
ggplot(data=df_HE, aes(x=City, y=Distance)) + geom_bar(stat="identity", color="blue", fill="white")

Pie chart

ggplot(df_HE,aes(x="Heredia",y=Distance, fill=City))+
  geom_bar(stat = "identity",
           color="white")+
    geom_text(aes(label=(Distance*1)),
              position=position_stack(vjust=0.5),color="white",size=4)+
  coord_polar(theta = "y")+
    labs(title="Landslide chart")

Pareto chart

City with the largest landslide
library(qcc)

Distance <- df_HE$Distance
names(Distance) <- df_HE$City 

pareto.chart(Distance, 
             ylab="Distance",
             col = heat.colors(length(Distance)),
             cumperc = seq(0, 100, by = 10),
             ylab2 = "Cumulative percentage",
             main = "CITIES WITH THE LARGEST LANDSLIDES"
)

##                        
## Pareto chart analysis for Distance
##                          Frequency  Cum.Freq. Percentage Cum.Percent.
##   Santo Domingo          21.954700  21.954700  19.776315    19.776315
##   Ángeles                19.545810  41.500510  17.606440    37.382755
##   Ángeles                19.514320  61.014830  17.578074    54.960829
##   Ángeles                15.051610  76.066440  13.558162    68.518991
##   Ángeles                14.816140  90.882580  13.346056    81.865047
##   Dulce Nombre de Jesus  10.013100 100.895680   9.019582    90.884629
##   Santo Domingo           9.857360 110.753040   8.879295    99.763924
##   Heredia                 0.262080 111.015120   0.236076   100.000000
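
The chart above ranks individual distance values, so a city appears once for each of its events. A Pareto analysis more commonly ranks categories by how often they occur. The sketch below counts events per city and accumulates their percentages, assuming the eight Heredia records listed earlier. The names `city`, `events` and `cum_pct` are illustrative.

```r
# City of each Heredia event, as listed earlier in this section
city <- c("Heredia", "Ángeles", "Ángeles", "Ángeles", "Ángeles",
          "Santo Domingo", "Santo Domingo", "Dulce Nombre de Jesus")

# Number of events per city, largest first
events <- sort(table(city), decreasing = TRUE)

# Cumulative percentage, the quantity a Pareto chart accumulates
cum_pct <- cumsum(events) / sum(events) * 100

print(data.frame(events  = as.vector(events),
                 cum_pct = round(as.vector(cum_pct), 1),
                 row.names = names(events)))
```

A named vector of these counts could be passed to `pareto.chart()` in place of `Distance` to chart event frequency per city.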

Stem-and-leaf plot

stem(df_HE$"Distance")
## 
##   The decimal point is 1 digit(s) to the right of the |
## 
##   0 | 0
##   0 | 
##   1 | 00
##   1 | 55
##   2 | 002
stem(df_HE$"Distance", scale = 2)
## 
##   The decimal point is 1 digit(s) to the right of the |
## 
##   0 | 0
##   0 | 
##   1 | 00
##   1 | 55
##   2 | 002

Time series

library(forecast)
data_serie<- ts(df_HE$Distance, frequency=12, start=2007)
head(data_serie)
##           Jan      Feb      Mar      Apr      May      Jun
## 2007  0.26208 19.51432 14.81614 19.54581 15.05161 21.95470
autoplot(data_serie)+
labs(title = "Landslide series", x="Years", y = "Distance", colour = "#00a0dc") +theme_bw()

Frequency tables

library(questionr)

table <- questionr::freq(Distance, cum = TRUE, sort = "dec", total = TRUE)
knitr::kable(table)
Distance n % val% %cum val%cum
0.26208 1 12.5 12.5 12.5 12.5
9.85736 1 12.5 12.5 25.0 25.0
10.0131 1 12.5 12.5 37.5 37.5
14.81614 1 12.5 12.5 50.0 50.0
15.05161 1 12.5 12.5 62.5 62.5
19.51432 1 12.5 12.5 75.0 75.0
19.54581 1 12.5 12.5 87.5 87.5
21.9547 1 12.5 12.5 100.0 100.0
Total 8 100.0 100.0 100.0 100.0
str(table) 
## Classes 'freqtab' and 'data.frame':  9 obs. of  5 variables:
##  $ n      : num  1 1 1 1 1 1 1 1 8
##  $ %      : num  12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 100
##  $ val%   : num  12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 100
##  $ %cum   : num  12.5 25 37.5 50 62.5 75 87.5 100 100
##  $ val%cum: num  12.5 25 37.5 50 62.5 75 87.5 100 100
x <- row.names(table)
y <- table$n
names <- x[1:(length(x)-1)]
freqs <- y[1:(length(y)-1)]
df <- data.frame(x = names, y = freqs)
knitr::kable(df)
x y
0.26208 1
9.85736 1
10.0131 1
14.81614 1
15.05161 1
19.51432 1
19.54581 1
21.9547 1
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) + 
  geom_bar(stat="identity", color="white", fill="blue") +
  xlab("Distance") +
  ylab("Frequency")

Grouped frequency table

n_sturges = 1 + log(length(Distance))/log(2)
n_sturgesc = ceiling(n_sturges)
n_sturgesf = floor(n_sturges)

n_clases = 0
if (n_sturgesc%%2 == 0) {
  n_clases = n_sturgesf
} else {
  n_clases = n_sturgesc
}
R = max(Distance) - min(Distance)
w = ceiling(R/n_clases)
bins <- seq(min(Distance), max(Distance) + w, by = w)
bins
## [1]  0.26208  6.26208 12.26208 18.26208 24.26208
Edades <- cut(Distance, bins)
Freq_table <- transform(table(Distance), Rel_Freq=prop.table(Freq), Cum_Freq=cumsum(Freq))
knitr::kable(Freq_table)
Distance Freq Rel_Freq Cum_Freq
0.26208 1 0.125 1
9.85736 1 0.125 2
10.0131 1 0.125 3
14.81614 1 0.125 4
15.05161 1 0.125 5
19.51432 1 0.125 6
19.54581 1 0.125 7
21.9547 1 0.125 8
str(Freq_table)
## 'data.frame':    8 obs. of  4 variables:
##  $ Distance: Factor w/ 8 levels "0.26208","9.85736",..: 1 2 3 4 5 6 7 8
##  $ Freq    : int  1 1 1 1 1 1 1 1
##  $ Rel_Freq: num  0.125 0.125 0.125 0.125 0.125 0.125 0.125 0.125
##  $ Cum_Freq: int  1 2 3 4 5 6 7 8
df <- data.frame(x = Freq_table$Distance, y = Freq_table$Freq)
knitr::kable(df)
x y
0.26208 1
9.85736 1
10.0131 1
14.81614 1
15.05161 1
19.51432 1
19.54581 1
21.9547 1
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) +
  geom_bar(stat="identity", color="blue", fill="green") +
  xlab("Distance range") +
  ylab("Frequency")

People affected by landslides

summary(df_HE$Distance)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.2621  9.9742 14.9339 13.8769 19.5222 21.9547
library(pastecs)
stat.desc(df_HE)
## Warning in qt((0.5 + p/2), (Nbrval - 1)): NaNs produced
##                        id Date time Continent Country country_code State
## nbr.val      8.000000e+00   NA   NA        NA      NA           NA    NA
## nbr.null     0.000000e+00   NA   NA        NA      NA           NA    NA
## nbr.na       0.000000e+00   NA   NA        NA      NA           NA    NA
## min          2.490000e+02   NA   NA        NA      NA           NA    NA
## max          6.696000e+03   NA   NA        NA      NA           NA    NA
## range        6.447000e+03   NA   NA        NA      NA           NA    NA
## sum          2.744200e+04   NA   NA        NA      NA           NA    NA
## median       3.107000e+03   NA   NA        NA      NA           NA    NA
## mean         3.430250e+03   NA   NA        NA      NA           NA    NA
## SE.mean      7.315967e+02   NA   NA        NA      NA           NA    NA
## CI.mean.0.95 1.729951e+03   NA   NA        NA      NA           NA    NA
## var          4.281870e+06   NA   NA        NA      NA           NA    NA
## std.dev      2.069268e+03   NA   NA        NA      NA           NA    NA
## coef.var     6.032412e-01   NA   NA        NA      NA           NA    NA
##                population City    Distance location_description     latitude
## nbr.val      8.000000e+00   NA   8.0000000                   NA  8.000000000
## nbr.null     1.000000e+00   NA   0.0000000                   NA  0.000000000
## nbr.na       0.000000e+00   NA   0.0000000                   NA  0.000000000
## min          0.000000e+00   NA   0.2620800                   NA 10.000000000
## max          2.194700e+04   NA  21.9547000                   NA 10.205400000
## range        2.194700e+04   NA  21.6926200                   NA  0.205400000
## sum          3.885700e+04   NA 111.0151200                   NA 81.063300000
## median       1.355000e+03   NA  14.9338750                   NA 10.144250000
## mean         4.857125e+03   NA  13.8768900                   NA 10.132912500
## SE.mean      2.557523e+03   NA   2.4924134                   NA  0.022739522
## CI.mean.0.95 6.047580e+03   NA   5.8936213                   NA  0.053770426
## var          5.232738e+07   NA  49.6969984                   NA  0.004136687
## std.dev      7.233767e+03   NA   7.0496098                   NA  0.064317081
## coef.var     1.489310e+00   NA   0.5080108                   NA  0.006347344
##                  longitude geolocation hazard_type landslide_type
## nbr.val       8.000000e+00          NA          NA             NA
## nbr.null      0.000000e+00          NA          NA             NA
## nbr.na        0.000000e+00          NA          NA             NA
## min          -8.414890e+01          NA          NA             NA
## max          -8.390410e+01          NA          NA             NA
## range         2.448000e-01          NA          NA             NA
## sum          -6.720410e+02          NA          NA             NA
## median       -8.397730e+01          NA          NA             NA
## mean         -8.400512e+01          NA          NA             NA
## SE.mean       2.987758e-02          NA          NA             NA
## CI.mean.0.95  7.064924e-02          NA          NA             NA
## var           7.141356e-03          NA          NA             NA
## std.dev       8.450655e-02          NA          NA             NA
## coef.var     -1.005969e-03          NA          NA             NA
##              landslide_size trigger storm_name injuries fatalities source_name
## nbr.val                  NA      NA         NA        1  6.0000000          NA
## nbr.null                 NA      NA         NA        1  5.0000000          NA
## nbr.na                   NA      NA         NA        7  2.0000000          NA
## min                      NA      NA         NA        0  0.0000000          NA
## max                      NA      NA         NA        0  2.0000000          NA
## range                    NA      NA         NA        0  2.0000000          NA
## sum                      NA      NA         NA        0  2.0000000          NA
## median                   NA      NA         NA        0  0.0000000          NA
## mean                     NA      NA         NA        0  0.3333333          NA
## SE.mean                  NA      NA         NA       NA  0.3333333          NA
## CI.mean.0.95             NA      NA         NA      NaN  0.8568606          NA
## var                      NA      NA         NA       NA  0.6666667          NA
## std.dev                  NA      NA         NA       NA  0.8164966          NA
## coef.var                 NA      NA         NA       NA  2.4494897          NA
##              source_link
## nbr.val               NA
## nbr.null              NA
## nbr.na                NA
## min                   NA
## max                   NA
## range                 NA
## sum                   NA
## median                NA
## mean                  NA
## SE.mean               NA
## CI.mean.0.95          NA
## var                   NA
## std.dev               NA
## coef.var              NA

Box-and-whisker plot

boxplot(Distance, horizontal=TRUE, col='steelblue')

library(tidyverse)
library(hrbrthemes)
library(viridis)

df <- data.frame(Distance)
df %>% ggplot(aes(x = "", y = Distance)) +
  geom_boxplot(color="red", fill="orange", alpha=0.5) +
  theme_ipsum() +
  theme(legend.position="none", plot.title = element_text(size=11)) +
  ggtitle("Landslides") +
  coord_flip() +
  xlab("") +
  ylab("")
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font
## family not found in Windows font database

library(readr)
library(knitr)
df <- read.csv("https://raw.githubusercontent.com/lihkir/AnalisisEstadisticoUN/main/Data/catalog.csv")
library(dplyr)
colnames(df)[4] <- "Continent"
colnames(df)[5] <- "Country"
colnames(df)[7] <- "State"
colnames(df)[9] <- "City"
colnames(df)[10] <- "Distance"
colnames(df)[2] <- "Date"

Puntarenas

Landslides in the cities of Puntarenas

library(readr)
library(knitr)
df_PNS <- subset (df, State == "Puntarenas")
df_PNS %>% 
  select(Country, State, City, Distance, Date) 
##         Country      State           City Distance     Date
## 51   Costa Rica Puntarenas        Miramar  3.82425 10/24/07
## 156  Costa Rica Puntarenas        Golfito 11.74074 10/15/08
## 157  Costa Rica Puntarenas        Miramar  8.92048 10/16/08
## 229  Costa Rica Puntarenas       San Vito 18.00524 11/13/09
## 506  Costa Rica Puntarenas        Golfito  7.87044  11/4/10
## 509  Costa Rica Puntarenas       Corredor  4.93053  11/4/10
## 510  Costa Rica Puntarenas        Parrita 13.48919  11/4/10
## 511  Costa Rica Puntarenas Ciudad Cortés 20.06633  11/4/10
## 1649 Costa Rica Puntarenas   Buenos Aires  0.35225 11/23/15
head(df_PNS, n=4)
##       id     Date time Continent    Country country_code      State population
## 51   323 10/24/07           <NA> Costa Rica           CR Puntarenas       6540
## 156  845 10/15/08           <NA> Costa Rica           CR Puntarenas       6777
## 157  848 10/16/08           <NA> Costa Rica           CR Puntarenas       6540
## 229 1296 11/13/09           <NA> Costa Rica           CR Puntarenas       3981
##         City Distance location_description latitude longitude
## 51   Miramar  3.82425    Mine construction  10.0715  -84.7575
## 156  Golfito 11.74074                        8.6700  -83.0640
## 157  Miramar  8.92048                       10.1110  -84.8090
## 229 San Vito 18.00524                        8.8021  -83.1335
##                                   geolocation hazard_type landslide_type
## 51             (10.0715, -84.757499999999993)   Landslide       Mudslide
## 156               (8.67, -83.063999999999993)   Landslide      Landslide
## 157 (10.111000000000001, -84.808999999999997)   Landslide        Complex
## 229 (8.8020999999999994, -83.133499999999998)   Landslide      Landslide
##     landslide_size    trigger storm_name injuries fatalities
## 51          Medium   Downpour                  NA         NA
## 156         Medium   Downpour                  NA         NA
## 157         Medium   Downpour                  NA         NA
## 229         Medium Earthquake                  NA          1
##                source_name
## 51  Reuters - AlertNet.org
## 156                       
## 157                       
## 229                       
##                                                               source_link
## 51  http://www.reuters.com/article/companyNewsAndPR/idUSN2435152820071025
## 156             http://www.ticotimes.net/dailyarchive/2008_10/1016081.htm
## 157        http://insidecostarica.com/dailynews/2008/october/17/nac01.htm
## 229             http://www.ticotimes.net/dailyarchive/2009_11/1116092.cfm
ggplot(data=df_PNS, aes(x=City, y=Distance)) + geom_bar(stat="identity", color="blue", fill="white")

Pie chart

ggplot(df_PNS,aes(x="Puntarenas",y=Distance, fill=City))+
  geom_bar(stat = "identity",
           color="white")+
    geom_text(aes(label=Distance),
              position=position_stack(vjust=0.5),color="white",size=4)+
  coord_polar(theta = "y")+
    labs(title="Landslide chart")

Pareto chart

City with the largest landslide
library(qcc)

Distance <- df_PNS$Distance
names(Distance) <- df_PNS$City 

pareto.chart(Distance, 
             ylab="Distance",
             col = heat.colors(length(Distance)),
             cumperc = seq(0, 100, by = 10),
             ylab2 = "Cumulative percentage",
             main = "CITIES WITH THE LARGEST LANDSLIDES"
)

##                 
## Pareto chart analysis for Distance
##                    Frequency   Cum.Freq.  Percentage Cum.Percent.
##   Ciudad Cortés  20.0663300  20.0663300  22.4960244   22.4960244
##   San Vito        18.0052400  38.0715700  20.1853711   42.6813955
##   Parrita         13.4891900  51.5607600  15.1225036   57.8038990
##   Golfito         11.7407400  63.3015000  13.1623457   70.9662447
##   Miramar          8.9204800  72.2219800  10.0005998   80.9668445
##   Golfito          7.8704400  80.0924200   8.8234176   89.7902622
##   Corredor         4.9305300  85.0229500   5.5275341   95.3177962
##   Miramar          3.8242500  88.8472000   4.2873022   99.6050985
##   Buenos Aires     0.3522500  89.1994500   0.3949015  100.0000000
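
The percentages in the Pareto output above can be reproduced by hand. The following is a minimal sketch in base R, with the distances and cities copied from the Puntarenas listing; the names `distance`, `city` and `pareto` are illustrative and not part of the original analysis.

```r
# Distances and cities copied from the Puntarenas listing (assumed input)
distance <- c(3.82425, 11.74074, 8.92048, 18.00524, 7.87044,
              4.93053, 13.48919, 20.06633, 0.35225)
city <- c("Miramar", "Golfito", "Miramar", "San Vito", "Golfito",
          "Corredor", "Parrita", "Ciudad Cortés", "Buenos Aires")

# Order the events from largest to smallest distance
ord <- order(distance, decreasing = TRUE)

# Each event's share of the total, and the running (cumulative) share
pareto <- data.frame(
  City       = city[ord],
  Distance   = distance[ord],
  Percentage = 100 * distance[ord] / sum(distance)
)
pareto$Cum_Percent <- cumsum(pareto$Percentage)

print(pareto, digits = 4)
```

Miramar and Golfito each appear twice in this listing. If one bar per city is preferred, the distances could first be combined with `tapply(distance, city, sum)` before charting.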

Stem-and-leaf plot

stem(df_PNS$"Distance")
## 
##   The decimal point is 1 digit(s) to the right of the |
## 
##   0 | 04
##   0 | 589
##   1 | 23
##   1 | 8
##   2 | 0
stem(df_PNS$"Distance", scale = 2)
## 
##   The decimal point is at the |
## 
##    0 | 4
##    2 | 8
##    4 | 9
##    6 | 9
##    8 | 9
##   10 | 7
##   12 | 5
##   14 | 
##   16 | 
##   18 | 0
##   20 | 1

Time series

library(forecast)
data_serie<- ts(df_PNS$Distance, frequency=12, start=2007)
head(data_serie)
##           Jan      Feb      Mar      Apr      May      Jun
## 2007  3.82425 11.74074  8.92048 18.00524  7.87044  4.93053
autoplot(data_serie)+
labs(title = "Landslide series", x="Years", y = "Distance", colour = "#00a0dc") +theme_bw()
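
The `ts()` call above labels the nine events as consecutive monthly observations starting in January 2007, while the `Date` column shows they occurred between 2007 and 2015. As a hedged alternative, the sketch below parses the dates and plots each distance against its actual event date. It uses base graphics only; the `events` data frame is an illustrative copy of the Puntarenas listing, not an object from the original report.

```r
# Dates and distances copied from the Puntarenas listing (assumed input)
events <- data.frame(
  Date = c("10/24/07", "10/15/08", "10/16/08", "11/13/09", "11/4/10",
           "11/4/10", "11/4/10", "11/4/10", "11/23/15"),
  Distance = c(3.82425, 11.74074, 8.92048, 18.00524, 7.87044,
               4.93053, 13.48919, 20.06633, 0.35225)
)

# Parse month/day/two-digit-year strings into Date objects
events$Date <- as.Date(events$Date, format = "%m/%d/%y")

# One vertical line per event, placed at its real date
plot(events$Date, events$Distance, type = "h",
     xlab = "Date", ylab = "Distance",
     main = "Landslide distance by event date")
points(events$Date, events$Distance, pch = 19)
```

Plotting against real dates makes the cluster of four events on the same day in November 2010 visible, which a regularly spaced monthly series hides.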

Frequency tables

library(questionr)

table <- questionr::freq(Distance, cum = TRUE, sort = "dec", total = TRUE)
knitr::kable(table)
n % val% %cum val%cum
0.35225 1 11.1 11.1 11.1 11.1
3.82425 1 11.1 11.1 22.2 22.2
4.93053 1 11.1 11.1 33.3 33.3
7.87044 1 11.1 11.1 44.4 44.4
8.92048 1 11.1 11.1 55.6 55.6
11.74074 1 11.1 11.1 66.7 66.7
13.48919 1 11.1 11.1 77.8 77.8
18.00524 1 11.1 11.1 88.9 88.9
20.06633 1 11.1 11.1 100.0 100.0
Total 9 100.0 100.0 100.0 100.0
str(table) 
## Classes 'freqtab' and 'data.frame':  10 obs. of  5 variables:
##  $ n      : num  1 1 1 1 1 1 1 1 1 9
##  $ %      : num  11.1 11.1 11.1 11.1 11.1 11.1 11.1 11.1 11.1 100
##  $ val%   : num  11.1 11.1 11.1 11.1 11.1 11.1 11.1 11.1 11.1 100
##  $ %cum   : num  11.1 22.2 33.3 44.4 55.6 66.7 77.8 88.9 100 100
##  $ val%cum: num  11.1 22.2 33.3 44.4 55.6 66.7 77.8 88.9 100 100
x <- row.names(table)
y <- table$n
names <- x[1:(length(x)-1)]
freqs <- y[1:(length(y)-1)]
df <- data.frame(x = names, y = freqs)
knitr::kable(df)
x y
0.35225 1
3.82425 1
4.93053 1
7.87044 1
8.92048 1
11.74074 1
13.48919 1
18.00524 1
20.06633 1
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) + 
  geom_bar(stat="identity", color="white", fill="blue") +
  xlab("Distance") +
  ylab("Frequency")

Grouped frequency table

n_sturges = 1 + log(length(Distance))/log(2)
n_sturgesc = ceiling(n_sturges)
n_sturgesf = floor(n_sturges)

n_clases = 0
if (n_sturgesc%%2 == 0) {
  n_clases = n_sturgesf
} else {
  n_clases = n_sturgesc
}
R = max(Distance) - min(Distance)
w = ceiling(R/n_clases)
bins <- seq(min(Distance), max(Distance) + w, by = w)
bins
## [1]  0.35225  4.35225  8.35225 12.35225 16.35225 20.35225
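# Assign each distance to a class interval (this vector is not used by the table below)
Distance_class <- cut(Distance, bins)
# Tabulate each distinct distance with its relative and cumulative frequency
Freq_table <- transform(table(Distance), Rel_Freq=prop.table(Freq), Cum_Freq=cumsum(Freq))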
knitr::kable(Freq_table)
Distance Freq Rel_Freq Cum_Freq
0.35225 1 0.1111111 1
3.82425 1 0.1111111 2
4.93053 1 0.1111111 3
7.87044 1 0.1111111 4
8.92048 1 0.1111111 5
11.74074 1 0.1111111 6
13.48919 1 0.1111111 7
18.00524 1 0.1111111 8
20.06633 1 0.1111111 9
str(Freq_table)
## 'data.frame':    9 obs. of  4 variables:
##  $ Distance: Factor w/ 9 levels "0.35225","3.82425",..: 1 2 3 4 5 6 7 8 9
##  $ Freq    : int  1 1 1 1 1 1 1 1 1
##  $ Rel_Freq: num  0.111 0.111 0.111 0.111 0.111 ...
##  $ Cum_Freq: int  1 2 3 4 5 6 7 8 9
df <- data.frame(x = Freq_table$Distance, y = Freq_table$Freq)
knitr::kable(df)
x y
0.35225 1
3.82425 1
4.93053 1
7.87044 1
8.92048 1
11.74074 1
13.48919 1
18.00524 1
20.06633 1
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) +
  geom_bar(stat="identity", color="blue", fill="green") +
  xlab("Distance range") +
  ylab("Frequency")
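
The table above counts each distinct distance once, so the class intervals computed with Sturges' rule are never used. The sketch below shows one way the grouped version could be built, tabulating the result of `cut()` instead of the raw values. It is an illustration under stated assumptions: the distances are copied from the Puntarenas listing, the names `k`, `breaks`, `classes` and `grouped` are hypothetical, and `nclass.Sturges()` is used in place of the parity adjustment above (both give five classes for nine observations).

```r
# Distances copied from the Puntarenas listing (assumed input)
distance <- c(3.82425, 11.74074, 8.92048, 18.00524, 7.87044,
              4.93053, 13.48919, 20.06633, 0.35225)

# Sturges' rule: k = ceiling(1 + log2(n))
k <- nclass.Sturges(distance)

# Class width rounded up to a whole number, then the break points
w <- ceiling(diff(range(distance)) / k)
breaks <- seq(min(distance), max(distance) + w, by = w)

# include.lowest = TRUE keeps the minimum inside the first interval
classes <- cut(distance, breaks, include.lowest = TRUE)

# Frequency, relative frequency and cumulative frequency per class
grouped <- transform(table(classes),
                     Rel_Freq = prop.table(Freq),
                     Cum_Freq = cumsum(Freq))
print(grouped)
```

With these breaks the nine events fall into five classes with counts 2, 2, 2, 1 and 2. Without `include.lowest = TRUE`, the smallest distance would sit exactly on the open lower bound of the first interval and be returned as `NA`.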

People affected by landslides

summary(df_PNS$Distance)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.3523  4.9305  8.9205  9.9110 13.4892 20.0663
library(pastecs)
stat.desc(df_PNS)
## Warning in qt((0.5 + p/2), (Nbrval - 1)): NaNs produced
##                        id Date time Continent Country country_code State
## nbr.val      9.000000e+00   NA   NA        NA      NA           NA    NA
## nbr.null     0.000000e+00   NA   NA        NA      NA           NA    NA
## nbr.na       0.000000e+00   NA   NA        NA      NA           NA    NA
## min          3.230000e+02   NA   NA        NA      NA           NA    NA
## max          7.493000e+03   NA   NA        NA      NA           NA    NA
## range        7.170000e+03   NA   NA        NA      NA           NA    NA
## sum          2.155700e+04   NA   NA        NA      NA           NA    NA
## median       2.685000e+03   NA   NA        NA      NA           NA    NA
## mean         2.395222e+03   NA   NA        NA      NA           NA    NA
## SE.mean      7.132643e+02   NA   NA        NA      NA           NA    NA
## CI.mean.0.95 1.644790e+03   NA   NA        NA      NA           NA    NA
## var          4.578713e+06   NA   NA        NA      NA           NA    NA
## std.dev      2.139793e+03   NA   NA        NA      NA           NA    NA
## coef.var     8.933588e-01   NA   NA        NA      NA           NA    NA
##                population City   Distance location_description    latitude
## nbr.val      9.000000e+00   NA  9.0000000                   NA  9.00000000
## nbr.null     0.000000e+00   NA  0.0000000                   NA  0.00000000
## nbr.na       0.000000e+00   NA  0.0000000                   NA  0.00000000
## min          3.734000e+03   NA  0.3522500                   NA  8.61170000
## max          1.168000e+04   NA 20.0663300                   NA 10.11100000
## range        7.946000e+03   NA 19.7140800                   NA  1.49930000
## sum          5.696300e+04   NA 89.1994500                   NA 82.72080000
## median       6.540000e+03   NA  8.9204800                   NA  8.98960000
## mean         6.329222e+03   NA  9.9110500                   NA  9.19120000
## SE.mean      8.172298e+02   NA  2.1831653                   NA  0.19984316
## CI.mean.0.95 1.884535e+03   NA  5.0343882                   NA  0.46083916
## var          6.010781e+06   NA 42.8958955                   NA  0.35943561
## std.dev      2.451689e+03   NA  6.5494958                   NA  0.59952949
## coef.var     3.873603e-01   NA  0.6608276                   NA  0.06522864
##                  longitude geolocation hazard_type landslide_type
## nbr.val       9.000000e+00          NA          NA             NA
## nbr.null      0.000000e+00          NA          NA             NA
## nbr.na        0.000000e+00          NA          NA             NA
## min          -8.480900e+01          NA          NA             NA
## max          -8.294180e+01          NA          NA             NA
## range         1.867200e+00          NA          NA             NA
## sum          -7.528426e+02          NA          NA             NA
## median       -8.332680e+01          NA          NA             NA
## mean         -8.364918e+01          NA          NA             NA
## SE.mean       2.553648e-01          NA          NA             NA
## CI.mean.0.95  5.888723e-01          NA          NA             NA
## var           5.869007e-01          NA          NA             NA
## std.dev       7.660945e-01          NA          NA             NA
## coef.var     -9.158422e-03          NA          NA             NA
##              landslide_size trigger storm_name injuries fatalities source_name
## nbr.val                  NA      NA         NA        1  6.0000000          NA
## nbr.null                 NA      NA         NA        1  5.0000000          NA
## nbr.na                   NA      NA         NA        8  3.0000000          NA
## min                      NA      NA         NA        0  0.0000000          NA
## max                      NA      NA         NA        0  1.0000000          NA
## range                    NA      NA         NA        0  1.0000000          NA
## sum                      NA      NA         NA        0  1.0000000          NA
## median                   NA      NA         NA        0  0.0000000          NA
## mean                     NA      NA         NA        0  0.1666667          NA
## SE.mean                  NA      NA         NA       NA  0.1666667          NA
## CI.mean.0.95             NA      NA         NA      NaN  0.4284303          NA
## var                      NA      NA         NA       NA  0.1666667          NA
## std.dev                  NA      NA         NA       NA  0.4082483          NA
## coef.var                 NA      NA         NA       NA  2.4494897          NA
##              source_link
## nbr.val               NA
## nbr.null              NA
## nbr.na                NA
## min                   NA
## max                   NA
## range                 NA
## sum                   NA
## median                NA
## mean                  NA
## SE.mean               NA
## CI.mean.0.95          NA
## var                   NA
## std.dev               NA
## coef.var              NA
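
The `SE.mean`, `CI.mean.0.95` and `coef.var` rows for `Distance` can be checked with base R. The sketch below assumes the usual definitions (standard error as `sd / sqrt(n)`, the confidence half-width from the `qt()` expression shown in the warning above, and the coefficient of variation as `sd / mean`); the variable names are illustrative.

```r
# Distances copied from the Puntarenas listing (assumed input)
distance <- c(3.82425, 11.74074, 8.92048, 18.00524, 7.87044,
              4.93053, 13.48919, 20.06633, 0.35225)

n  <- length(distance)
se <- sd(distance) / sqrt(n)          # standard error of the mean
ci <- qt(0.975, df = n - 1) * se      # half-width of the 95% confidence interval
cv <- sd(distance) / mean(distance)   # coefficient of variation

round(c(SE.mean = se, CI.mean.0.95 = ci, coef.var = cv), 4)
```

The same `qt()` expression explains the `NaN` entries in the table: a column with a single non-missing value, such as `injuries`, has zero degrees of freedom, so no interval can be computed.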

Box-and-whisker plot

boxplot(Distance, horizontal=TRUE, col='steelblue')
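
To read the box plot numerically, the quartiles and the usual 1.5 × IQR fences can be computed directly. This is a minimal sketch with the Puntarenas distances copied from the listing above; `q`, `iqr`, `fences` and `outliers` are illustrative names. Base `boxplot()` derives its hinges through `boxplot.stats()`, which can differ slightly from `quantile()` for other sample sizes.

```r
# Distances copied from the Puntarenas listing (assumed input)
distance <- c(3.82425, 11.74074, 8.92048, 18.00524, 7.87044,
              4.93053, 13.48919, 20.06633, 0.35225)

# Quartiles (edges and centre line of the box)
q   <- quantile(distance, c(0.25, 0.5, 0.75))
iqr <- IQR(distance)

# Points beyond these fences would be drawn as individual outliers
fences <- c(lower = q[["25%"]] - 1.5 * iqr,
            upper = q[["75%"]] + 1.5 * iqr)
outliers <- distance[distance < fences[["lower"]] | distance > fences[["upper"]]]

print(q)
print(fences)
print(outliers)
```

For these nine values no distance lies outside the fences, so the whiskers extend to the minimum and maximum.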

library(tidyverse)
library(hrbrthemes)
library(viridis)

df <- data.frame(Distance)
df %>% ggplot(aes(x = "", y = Distance)) +
  geom_boxplot(color="red", fill="orange", alpha=0.5) +
  theme_ipsum() +
  theme(legend.position="none", plot.title = element_text(size=11)) +
  ggtitle("Landslides") +
  coord_flip() +
  xlab("") +
  ylab("")
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font
## family not found in Windows font database

library(readr)
library(knitr)
df <- read.csv("https://raw.githubusercontent.com/lihkir/AnalisisEstadisticoUN/main/Data/catalog.csv")
library(dplyr)
colnames(df)[4] <- "Continent"
colnames(df)[10] <- "Distance"
colnames(df)[5] <- "Country"
colnames(df)[7] <- "State"
colnames(df)[9] <- "City"
colnames(df)[2] <- "Date"

Barbados

library(readr)
library(knitr)
df_BA <- subset (df, Country == "Barbados")
knitr::kable(head(df_BA))
id Date time Continent Country country_code State population City Distance location_description latitude longitude geolocation hazard_type landslide_type landslide_size trigger storm_name injuries fatalities source_name source_link
161 857 10/22/08 NA Barbados BB Saint Joseph 1765 Bathsheba 2.87363 13.229 -59.54 (13.228999999999999, -59.54) Landslide Mudslide Medium Downpour NA NA http://www.nationnews.com/story/326456269849259.php
df_BA %>% 
  select(Country, State, City, Distance, Date)
##      Country        State      City Distance     Date
## 161 Barbados Saint Joseph Bathsheba  2.87363 10/22/08

Landslides by state

library(ggplot2)
ggplot(data=df_BA, aes(x = "Barbados", y = Distance, fill=State)) +
  geom_bar(stat = "identity", width = 1, color = "black") +
  coord_polar("y", start = 0)

ggplot(data=df_BA, aes(fill=State, y=Distance, x="Barbados")) +
  geom_bar(position="dodge", stat="identity")

Saint Joseph:

Landslides in the cities of Saint Joseph

library(readr)
library(knitr)
df_SJ <- subset (df, State == "Saint Joseph")
df_SJ %>% 
  select(Country, State, City, Distance, Date) 
##       Country        State         City Distance     Date
## 161  Barbados Saint Joseph    Bathsheba  2.87363 10/22/08
## 1201 Dominica Saint Joseph Saint Joseph  2.38605   1/7/14
head(df_SJ)
##        id     Date    time Continent  Country country_code        State
## 161   857 10/22/08              <NA> Barbados           BB Saint Joseph
## 1201 5754   1/7/14 Morning      <NA> Dominica           DM Saint Joseph
##      population         City Distance location_description latitude longitude
## 161        1765    Bathsheba  2.87363                        13.229  -59.5400
## 1201       2184 Saint Joseph  2.38605           Above road   15.421  -61.4285
##                         geolocation hazard_type landslide_type landslide_size
## 161    (13.228999999999999, -59.54)   Landslide       Mudslide         Medium
## 1201 (15.420999999999999, -61.4285)   Landslide      Landslide         Medium
##       trigger storm_name injuries fatalities                       source_name
## 161  Downpour                  NA         NA                                  
## 1201  unknown                   0          0 DaVibes The Caribbean News Portal
##                                                 source_link
## 161     http://www.nationnews.com/story/326456269849259.php
## 1201 http://dominicavibes.dm/colihaut-men-escape-landslide/
ggplot(data=df_SJ, aes(x=City, y=Distance)) + geom_bar(stat="identity", color="blue", fill="white")
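
The subset above matches on `State` alone, so it returns the Saint Joseph record from Barbados together with the one from Dominica. If only Barbados is intended, the filter could also condition on `Country`. The sketch below illustrates this on a two-row stand-in for the catalog; the `landslides` and `barbados_sj` names are hypothetical.

```r
# Two-row stand-in for the catalog, copied from the listing above (assumed input)
landslides <- data.frame(
  Country  = c("Barbados", "Dominica"),
  State    = c("Saint Joseph", "Saint Joseph"),
  City     = c("Bathsheba", "Saint Joseph"),
  Distance = c(2.87363, 2.38605)
)

# Require both the country and the state to match
barbados_sj <- subset(landslides,
                      Country == "Barbados" & State == "Saint Joseph")
print(barbados_sj)
```

With the combined condition only the Bathsheba event remains, so the charts and statistics that follow would describe a single observation.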

Pie chart

ggplot(df_SJ,aes(x="Saint Joseph",y=Distance, fill=City))+
  geom_bar(stat = "identity",
           color="white")+
    geom_text(aes(label=Distance),
              position=position_stack(vjust=0.5),color="white",size=4)+
  coord_polar(theta = "y")+
    labs(title="Landslide chart")

Pareto chart

City with the largest landslide

library(qcc)

Distance <- df_SJ$Distance
names(Distance) <- df_SJ$City 

pareto.chart(Distance, 
             ylab="Distance",
             col = heat.colors(length(Distance)),
             cumperc = seq(0, 100, by = 10),
             ylab2 = "Cumulative percentage",
             main = "CITIES WITH THE LARGEST LANDSLIDES"
)

##               
## Pareto chart analysis for Distance
##                Frequency Cum.Freq. Percentage Cum.Percent.
##   Bathsheba      2.87363   2.87363   54.63507     54.63507
##   Saint Joseph   2.38605   5.25968   45.36493    100.00000

Stem-and-leaf plot

stem(df_SJ$"Distance")
## 
##   The decimal point is 1 digit(s) to the left of the |
## 
##   23 | 9
##   24 | 
##   25 | 
##   26 | 
##   27 | 
##   28 | 7
stem(df_SJ$"Distance", scale = 2)
## 
##   The decimal point is 1 digit(s) to the left of the |
## 
##   23 | 9
##   24 | 
##   24 | 
##   25 | 
##   25 | 
##   26 | 
##   26 | 
##   27 | 
##   27 | 
##   28 | 
##   28 | 7

Time series

library(forecast)
data_serie<- ts(df_SJ$Distance, frequency=12, start=2007)
head(data_serie)
##          Jan     Feb
## 2007 2.87363 2.38605
autoplot(data_serie)+
labs(title = "Landslide series", x="Years", y = "Distance", colour = "#00a0dc") +theme_bw()

Frequency tables

library(questionr)

table <- questionr::freq(Distance, cum = TRUE, sort = "dec", total = TRUE)
knitr::kable(table)
n % val% %cum val%cum
2.38605 1 50 50 50 50
2.87363 1 50 50 100 100
Total 2 100 100 100 100
str(table) 
## Classes 'freqtab' and 'data.frame':  3 obs. of  5 variables:
##  $ n      : num  1 1 2
##  $ %      : num  50 50 100
##  $ val%   : num  50 50 100
##  $ %cum   : num  50 100 100
##  $ val%cum: num  50 100 100
x <- row.names(table)
y <- table$n
names <- x[1:(length(x)-1)]
freqs <- y[1:(length(y)-1)]
df <- data.frame(x = names, y = freqs)
knitr::kable(df)
x y
2.38605 1
2.87363 1
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) + 
  geom_bar(stat="identity", color="white", fill="blue") +
  xlab("Distance") +
  ylab("Frequency")

Grouped frequency table

n_sturges = 1 + log(length(Distance))/log(2)
n_sturgesc = ceiling(n_sturges)
n_sturgesf = floor(n_sturges)

n_clases = 0
if (n_sturgesc%%2 == 0) {
  n_clases = n_sturgesf
} else {
  n_clases = n_sturgesc
}
R = max(Distance) - min(Distance)
w = ceiling(R/n_clases)
bins <- seq(min(Distance), max(Distance) + w, by = w)
bins
## [1] 2.38605 3.38605
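# Assign each distance to a class interval (this vector is not used by the table below)
Distance_class <- cut(Distance, bins)
# Tabulate each distinct distance with its relative and cumulative frequency
Freq_table <- transform(table(Distance), Rel_Freq=prop.table(Freq), Cum_Freq=cumsum(Freq))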
knitr::kable(Freq_table)
Distance Freq Rel_Freq Cum_Freq
2.38605 1 0.5 1
2.87363 1 0.5 2
str(Freq_table)
## 'data.frame':    2 obs. of  4 variables:
##  $ Distance: Factor w/ 2 levels "2.38605","2.87363": 1 2
##  $ Freq    : int  1 1
##  $ Rel_Freq: num  0.5 0.5
##  $ Cum_Freq: int  1 2
df <- data.frame(x = Freq_table$Distance, y = Freq_table$Freq)
knitr::kable(df)
x y
2.38605 1
2.87363 1
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) +
  geom_bar(stat="identity", color="blue", fill="green") +
  xlab("Distance range") +
  ylab("Frequency")

Statistics

People affected by landslides

summary(df_SJ$Distance)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   2.386   2.508   2.630   2.630   2.752   2.874
library(pastecs)
stat.desc(df_SJ)
## Warning in qt((0.5 + p/2), (Nbrval - 1)): NaNs produced

## Warning in qt((0.5 + p/2), (Nbrval - 1)): NaNs produced
##                        id Date time Continent Country country_code State
## nbr.val      2.000000e+00   NA   NA        NA      NA           NA    NA
## nbr.null     0.000000e+00   NA   NA        NA      NA           NA    NA
## nbr.na       0.000000e+00   NA   NA        NA      NA           NA    NA
## min          8.570000e+02   NA   NA        NA      NA           NA    NA
## max          5.754000e+03   NA   NA        NA      NA           NA    NA
## range        4.897000e+03   NA   NA        NA      NA           NA    NA
## sum          6.611000e+03   NA   NA        NA      NA           NA    NA
## median       3.305500e+03   NA   NA        NA      NA           NA    NA
## mean         3.305500e+03   NA   NA        NA      NA           NA    NA
## SE.mean      2.448500e+03   NA   NA        NA      NA           NA    NA
## CI.mean.0.95 3.111114e+04   NA   NA        NA      NA           NA    NA
## var          1.199030e+07   NA   NA        NA      NA           NA    NA
## std.dev      3.462702e+03   NA   NA        NA      NA           NA    NA
## coef.var     1.047558e+00   NA   NA        NA      NA           NA    NA
##                population City  Distance location_description   latitude
## nbr.val          2.000000   NA 2.0000000                   NA  2.0000000
## nbr.null         0.000000   NA 0.0000000                   NA  0.0000000
## nbr.na           0.000000   NA 0.0000000                   NA  0.0000000
## min           1765.000000   NA 2.3860500                   NA 13.2290000
## max           2184.000000   NA 2.8736300                   NA 15.4210000
## range          419.000000   NA 0.4875800                   NA  2.1920000
## sum           3949.000000   NA 5.2596800                   NA 28.6500000
## median        1974.500000   NA 2.6298400                   NA 14.3250000
## mean          1974.500000   NA 2.6298400                   NA 14.3250000
## SE.mean        209.500000   NA 0.2437900                   NA  1.0960000
## CI.mean.0.95  2661.949892   NA 3.0976457                   NA 13.9260004
## var          87780.500000   NA 0.1188671                   NA  2.4024320
## std.dev        296.277741   NA 0.3447711                   NA  1.5499781
## coef.var         0.150052   NA 0.1310997                   NA  0.1082009
##                longitude geolocation hazard_type landslide_type landslide_size
## nbr.val         2.000000          NA          NA             NA             NA
## nbr.null        0.000000          NA          NA             NA             NA
## nbr.na          0.000000          NA          NA             NA             NA
## min           -61.428500          NA          NA             NA             NA
## max           -59.540000          NA          NA             NA             NA
## range           1.888500          NA          NA             NA             NA
## sum          -120.968500          NA          NA             NA             NA
## median        -60.484250          NA          NA             NA             NA
## mean          -60.484250          NA          NA             NA             NA
## SE.mean         0.944250          NA          NA             NA             NA
## CI.mean.0.95   11.997834          NA          NA             NA             NA
## var             1.783216          NA          NA             NA             NA
## std.dev         1.335371          NA          NA             NA             NA
## coef.var       -0.022078          NA          NA             NA             NA
##              trigger storm_name injuries fatalities source_name source_link
## nbr.val           NA         NA        1          1          NA          NA
## nbr.null          NA         NA        1          1          NA          NA
## nbr.na            NA         NA        1          1          NA          NA
## min               NA         NA        0          0          NA          NA
## max               NA         NA        0          0          NA          NA
## range             NA         NA        0          0          NA          NA
## sum               NA         NA        0          0          NA          NA
## median            NA         NA        0          0          NA          NA
## mean              NA         NA        0          0          NA          NA
## SE.mean           NA         NA       NA         NA          NA          NA
## CI.mean.0.95      NA         NA      NaN        NaN          NA          NA
## var               NA         NA       NA         NA          NA          NA
## std.dev           NA         NA       NA         NA          NA          NA
## coef.var          NA         NA       NA         NA          NA          NA

Box-and-whisker plot

boxplot(Distance, horizontal=TRUE, col='steelblue')

library(tidyverse)
library(hrbrthemes)
library(viridis)

# Use a separate name so the catalog stored in df is not overwritten
df_box <- data.frame(Distance)
df_box %>% ggplot(aes(x = "", y = Distance)) +
  geom_boxplot(color="red", fill="orange", alpha=0.5) +
  theme_ipsum() +
  theme(legend.position="none", plot.title = element_text(size=11)) +
  ggtitle("Landslides") +
  coord_flip() +
  xlab("") +
  ylab("")
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font
## family not found in Windows font database
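
The values a box plot draws can also be inspected numerically. The following is a minimal sketch, assuming a small hypothetical vector `distances_example` (these are not catalog values), that uses base R's `boxplot.stats()` to list the whisker ends, hinges, median and outliers.

```r
# Hypothetical distances, for illustration only (not catalog values)
distances_example <- c(0.5, 1.2, 2.7, 3.1, 4.4, 17.3)

# boxplot.stats() returns the numbers that boxplot() draws
bp <- boxplot.stats(distances_example)

bp$stats  # lower whisker, lower hinge, median, upper hinge, upper whisker
bp$out    # points more than 1.5 * IQR beyond the hinges, drawn as outliers
```

With a single observation, as in some of the subsets in this report, all five values coincide and the box collapses to a line.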

library(readr)
library(knitr)
df <- read.csv("https://raw.githubusercontent.com/lihkir/AnalisisEstadisticoUN/main/Data/catalog.csv")
library(dplyr)
colnames(df)[4] <- "Continent"
colnames(df)[5] <- "Country"
colnames(df)[7] <- "State"
colnames(df)[9] <- "City"
colnames(df)[10] <- "Distance"
colnames(df)[2] <- "Date"

Belize

library(readr)
library(knitr)
df_BZ <- subset (df, Country == "Belize")
knitr::kable(head(df_BZ, n=4))
id Date time Continent Country country_code State population City Distance location_description latitude longitude geolocation hazard_type landslide_type landslide_size trigger storm_name injuries fatalities source_name source_link
1593 7437 11/24/15 10:00 NA Belize BZ Cayo 13381 Belmopan 9.71758 Above road 17.2183 -88.8519 (17.218299999999999, -88.851900000000001) Landslide Rockfall Small Mining digging 0 0 Plus TV http://www.plustvbelize.com/landslide-in-arizona-village-blocks-road-for-hours/
df_BZ %>% 
  select(Country, State, City, Distance, Date)
##      Country State     City Distance     Date
## 1593  Belize  Cayo Belmopan  9.71758 11/24/15
head(df_BZ)
##        id     Date  time Continent Country country_code State population
## 1593 7437 11/24/15 10:00      <NA>  Belize           BZ  Cayo      13381
##          City Distance location_description latitude longitude
## 1593 Belmopan  9.71758           Above road  17.2183  -88.8519
##                                    geolocation hazard_type landslide_type
## 1593 (17.218299999999999, -88.851900000000001)   Landslide       Rockfall
##      landslide_size        trigger storm_name injuries fatalities source_name
## 1593          Small Mining digging                   0          0     Plus TV
##                                                                          source_link
## 1593 http://www.plustvbelize.com/landslide-in-arizona-village-blocks-road-for-hours/

Landslides by state

library(ggplot2)
ggplot(data=df_BZ, aes(x = "Belize", y = Distance, fill=State)) +
  geom_bar(stat = "identity", width = 1, color = "black") +
  coord_polar("y", start = 0)

ggplot(data=df_BZ, aes(fill=State, y=Distance, x="Belize")) +
  geom_bar(position="dodge", stat="identity")

Cayo

Landslides in the cities of Cayo

library(readr)
library(knitr)
df_CY <- subset (df, State == "Cayo")
df_CY %>% 
  select(Country, State, City, Distance, Date) 
##      Country State     City Distance     Date
## 1593  Belize  Cayo Belmopan  9.71758 11/24/15
head(df_CY, n=4)
##        id     Date  time Continent Country country_code State population
## 1593 7437 11/24/15 10:00      <NA>  Belize           BZ  Cayo      13381
##          City Distance location_description latitude longitude
## 1593 Belmopan  9.71758           Above road  17.2183  -88.8519
##                                    geolocation hazard_type landslide_type
## 1593 (17.218299999999999, -88.851900000000001)   Landslide       Rockfall
##      landslide_size        trigger storm_name injuries fatalities source_name
## 1593          Small Mining digging                   0          0     Plus TV
##                                                                          source_link
## 1593 http://www.plustvbelize.com/landslide-in-arizona-village-blocks-road-for-hours/
ggplot(data=df_CY, aes(x=City, y=Distance)) + geom_bar(stat="identity", color="blue", fill="white")

Pie chart

ggplot(df_CY, aes(x = "Cayo", y = Distance, fill = City)) +
  geom_bar(stat = "identity", color = "white") +
  geom_text(aes(label = Distance),
            position = position_stack(vjust = 0.5), color = "white", size = 4) +
  coord_polar(theta = "y") +
  labs(title = "Landslide chart")

Pareto chart

City with the largest landslide
library(qcc)

Distance <- df_CY$Distance
names(Distance) <- df_CY$City 

pareto.chart(Distance, 
             ylab="Distance",
             col = heat.colors(length(Distance)),
             cumperc = seq(0, 100, by = 10),
             ylab2 = "Cumulative percentage",
             main = "CITIES WITH THE LARGEST LANDSLIDES"
)

##           
## Pareto chart analysis for Distance
##            Frequency Cum.Freq. Percentage Cum.Percent.
##   Belmopan   9.71758   9.71758  100.00000    100.00000

Stem-and-leaf plot

stem(df_CY$Distance)
stem(df_CY$Distance, scale = 2)

Time series

library(forecast)
# The only Cayo event is dated 11/24/15, so the monthly series starts in November 2015
data_serie <- ts(df_CY$Distance, frequency = 12, start = c(2015, 11))
head(data_serie)
##          Nov
## 2015 9.71758
autoplot(data_serie) +
  labs(title = "Landslide series", x = "Year", y = "Distance") + theme_bw()
## geom_path: Each group consists of only one observation. Do you need to adjust
## the group aesthetic?
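
A series built with `ts()` follows row order rather than the event dates, so it can help to parse the `Date` column first. The following is a minimal sketch, assuming a few illustrative date strings in the catalog's month/day/two-digit-year format (`"10/30/07"` is hypothetical), that counts events per calendar month.

```r
# Illustrative date strings in the catalog's m/d/yy format ("10/30/07" is hypothetical)
dates_chr <- c("7/13/07", "10/29/07", "10/30/07", "11/24/15")

# Parse with an explicit format string, because the year has only two digits
dates <- as.Date(dates_chr, format = "%m/%d/%y")

# Count events per calendar month
events_per_month <- table(format(dates, "%Y-%m"))
events_per_month
```

Months without events do not appear in this table. They would have to be filled in with zeros before the counts are passed to `ts()`.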

Frequency tables

library(questionr)

# Named freq_tab so that the base function table() is not masked
freq_tab <- questionr::freq(Distance, cum = TRUE, sort = "dec", total = TRUE)
knitr::kable(freq_tab)
         n   %  val%  %cum  val%cum
9.71758  1 100   100   100      100
Total    1 100   100   100      100
str(freq_tab)
## Classes 'freqtab' and 'data.frame':  2 obs. of  5 variables:
##  $ n      : num  1 1
##  $ %      : num  100 100
##  $ val%   : num  100 100
##  $ %cum   : num  100 100
##  $ val%cum: num  100 100
x <- row.names(freq_tab)
y <- freq_tab$n
# Drop the last row ("Total") before plotting
values <- x[1:(length(x)-1)]
freqs <- y[1:(length(y)-1)]
df <- data.frame(x = values, y = freqs)
knitr::kable(df)
x        y
9.71758  1
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) + 
  geom_bar(stat="identity", color="white", fill="blue") +
  xlab("Distance") +
  ylab("Frequency")

Grouped frequency table

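# Sturges' rule: number of classes k = 1 + log2(n)
n_sturges = 1 + log(length(Distance))/log(2)
n_sturgesc = ceiling(n_sturges)
n_sturgesf = floor(n_sturges)

# Use the rounded-down value when the rounded-up value is even, otherwise the rounded-up value
n_clases = 0
if (n_sturgesc%%2 == 0) {
  n_clases = n_sturgesf
} else {
  n_clases = n_sturgesc
}
R = max(Distance) - min(Distance)  # range of the data
w = ceiling(R/n_clases)            # class width
bins <- seq(min(Distance), max(Distance) + w, by = w)
bins
## [1] 9.71758
# With a single observation the range is 0, so only one break point is produced
Distance_bins <- cut(Distance, bins)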
Freq_table <- transform(table(Distance), Rel_Freq=prop.table(Freq), Cum_Freq=cumsum(Freq))
knitr::kable(Freq_table)
Distance Freq Rel_Freq Cum_Freq
9.71758 1 1 1
str(Freq_table)
## 'data.frame':    1 obs. of  4 variables:
##  $ Distance: Factor w/ 1 level "9.71758": 1
##  $ Freq    : int 1
##  $ Rel_Freq: num 1
##  $ Cum_Freq: int 1
df <- data.frame(x = Freq_table$Distance, y = Freq_table$Freq)
knitr::kable(df)
x y
9.71758 1
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) +
  geom_bar(stat="identity", color="blue", fill="green") +
  xlab("Distance range") +
  ylab("Frequency")
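
The grouping steps above can be collected into two small helpers. The following is a minimal sketch, assuming a hypothetical sample `x` with at least two distinct values. Unlike the code above, it always rounds Sturges' rule up and uses equal-width breaks from `min(x)` to `max(x)`. The names `sturges_classes` and `sturges_breaks` are illustrative.

```r
# Sturges' rule, rounded up: k = ceiling(1 + log2(n))
sturges_classes <- function(x) ceiling(1 + log2(length(x)))

# Equal-width break points from min(x) to max(x); needs at least two distinct values
sturges_breaks <- function(x) {
  stopifnot(length(unique(x)) >= 2)
  seq(min(x), max(x), length.out = sturges_classes(x) + 1)
}

# Hypothetical sample, for illustration only
x <- c(0.3, 1.1, 1.8, 2.4, 2.9, 3.5, 4.7, 6.2, 8.9, 17.3)

breaks <- sturges_breaks(x)
# include.lowest = TRUE so that min(x) falls in the first class
classes <- cut(x, breaks, include.lowest = TRUE)

grouped <- as.data.frame(table(classes))
grouped$Rel_Freq <- grouped$Freq / sum(grouped$Freq)
grouped$Cum_Freq <- cumsum(grouped$Freq)
grouped
```

For a subset with a single observation, such as Cayo, the range is zero and no meaningful classes can be formed, which is why the helper stops early in that case.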

People affected by landslides

summary(df_CY$Distance)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   9.718   9.718   9.718   9.718   9.718   9.718
library(pastecs)
stat.desc(df_CY)
## Warning in qt((0.5 + p/2), (Nbrval - 1)): NaNs produced

##                id Date time Continent Country country_code State population
## nbr.val         1   NA   NA        NA      NA           NA    NA          1
## nbr.null        0   NA   NA        NA      NA           NA    NA          0
## nbr.na          0   NA   NA        NA      NA           NA    NA          0
## min          7437   NA   NA        NA      NA           NA    NA      13381
## max          7437   NA   NA        NA      NA           NA    NA      13381
## range           0   NA   NA        NA      NA           NA    NA          0
## sum          7437   NA   NA        NA      NA           NA    NA      13381
## median       7437   NA   NA        NA      NA           NA    NA      13381
## mean         7437   NA   NA        NA      NA           NA    NA      13381
## SE.mean        NA   NA   NA        NA      NA           NA    NA         NA
## CI.mean.0.95  NaN   NA   NA        NA      NA           NA    NA        NaN
## var            NA   NA   NA        NA      NA           NA    NA         NA
## std.dev        NA   NA   NA        NA      NA           NA    NA         NA
## coef.var       NA   NA   NA        NA      NA           NA    NA         NA
##              City Distance location_description latitude longitude geolocation
## nbr.val        NA  1.00000                   NA   1.0000    1.0000          NA
## nbr.null       NA  0.00000                   NA   0.0000    0.0000          NA
## nbr.na         NA  0.00000                   NA   0.0000    0.0000          NA
## min            NA  9.71758                   NA  17.2183  -88.8519          NA
## max            NA  9.71758                   NA  17.2183  -88.8519          NA
## range          NA  0.00000                   NA   0.0000    0.0000          NA
## sum            NA  9.71758                   NA  17.2183  -88.8519          NA
## median         NA  9.71758                   NA  17.2183  -88.8519          NA
## mean           NA  9.71758                   NA  17.2183  -88.8519          NA
## SE.mean        NA       NA                   NA       NA        NA          NA
## CI.mean.0.95   NA      NaN                   NA      NaN       NaN          NA
## var            NA       NA                   NA       NA        NA          NA
## std.dev        NA       NA                   NA       NA        NA          NA
## coef.var       NA       NA                   NA       NA        NA          NA
##              hazard_type landslide_type landslide_size trigger storm_name
## nbr.val               NA             NA             NA      NA         NA
## nbr.null              NA             NA             NA      NA         NA
## nbr.na                NA             NA             NA      NA         NA
## min                   NA             NA             NA      NA         NA
## max                   NA             NA             NA      NA         NA
## range                 NA             NA             NA      NA         NA
## sum                   NA             NA             NA      NA         NA
## median                NA             NA             NA      NA         NA
## mean                  NA             NA             NA      NA         NA
## SE.mean               NA             NA             NA      NA         NA
## CI.mean.0.95          NA             NA             NA      NA         NA
## var                   NA             NA             NA      NA         NA
## std.dev               NA             NA             NA      NA         NA
## coef.var              NA             NA             NA      NA         NA
##              injuries fatalities source_name source_link
## nbr.val             1          1          NA          NA
## nbr.null            1          1          NA          NA
## nbr.na              0          0          NA          NA
## min                 0          0          NA          NA
## max                 0          0          NA          NA
## range               0          0          NA          NA
## sum                 0          0          NA          NA
## median              0          0          NA          NA
## mean                0          0          NA          NA
## SE.mean            NA         NA          NA          NA
## CI.mean.0.95      NaN        NaN          NA          NA
## var                NA         NA          NA          NA
## std.dev            NA         NA          NA          NA
## coef.var           NA         NA          NA          NA
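
The `NA` and `NaN` entries above, and the `qt()` warnings, follow from the subset having a single observation. The following is a minimal sketch using the one Cayo distance, showing that the standard deviation is undefined and that the t quantile with zero degrees of freedom is `NaN`.

```r
# A single observation, as in the Cayo subset
x <- 9.71758
n <- length(x)

# The sample standard deviation divides by n - 1 = 0, so R returns NA
s <- sd(x)

# The confidence interval needs a t quantile with n - 1 = 0 degrees of freedom,
# which is NaN; this is the call that triggers the warning shown above
t_crit <- suppressWarnings(qt(0.975, df = n - 1))

c(sd = s, t_crit = t_crit)
```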

Box-and-whisker plot

boxplot(Distance, horizontal=TRUE, col='steelblue')

library(tidyverse)
library(hrbrthemes)
library(viridis)

# Use a separate name so the catalog stored in df is not overwritten
df_box <- data.frame(Distance)
df_box %>% ggplot(aes(x = "", y = Distance)) +
  geom_boxplot(color="red", fill="orange", alpha=0.5) +
  theme_ipsum() +
  theme(legend.position="none", plot.title = element_text(size=11)) +
  ggtitle("Landslides") +
  coord_flip() +
  xlab("") +
  ylab("")
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font
## family not found in Windows font database

Guatemala

library(readr)
library(knitr)
df <- read.csv("https://raw.githubusercontent.com/lihkir/AnalisisEstadisticoUN/main/Data/catalog.csv")
library(dplyr)
colnames(df)[2] <- "Date"
colnames(df)[5] <- "Country"
colnames(df)[7] <- "State"
colnames(df)[8] <- "Population"
colnames(df)[9] <- "City"
colnames(df)[10] <- "Distance"
colnames(df)[18] <- "Trigger"
colnames(df)[21] <- "Fatalities"
library(readr)
library(knitr)
df_GT <- subset (df, Country == "Guatemala")
knitr::kable(head(df_GT))
id Date time continent_code Country country_code State Population City Distance location_description latitude longitude geolocation hazard_type landslide_type landslide_size Trigger storm_name injuries Fatalities source_name source_link
17 165 8/9/07 NA Guatemala GT Guatemala 47247 San José Pinula 4.74385 14.5667 -90.4500 (14.566700000000001, -90.45) Landslide Mudslide Medium Rain NA 5 International Herald http://www.iht.com/articles/ap/2007/08/09/america/LA-GEN-Guatemala-Deadly-Mudslide.php
27 198 8/21/07 NA Guatemala GT Alta Verapaz 2006 Lanquín 13.39817 15.6046 -90.0853 (15.6046, -90.085300000000004) Landslide Landslide Medium Tropical cyclone Hurricane Dean NA NA United Nations Development Programme - Relief Web http://www.reliefweb.int/rw/RWB.NSF/db900SID/EDIS-76BSG6?OpenDocument
28 199 8/21/07 NA Guatemala GT Izabal 18994 Morales 12.55184 15.5163 -88.9286 (15.516299999999999, -88.928600000000003) Landslide Landslide Medium Tropical cyclone Hurricane Dean NA NA United Nations Development Programme - Relief Web http://www.reliefweb.int/rw/RWB.NSF/db900SID/EDIS-76BSG6?OpenDocument
41 277 9/22/07 NA Guatemala GT Guatemala 994938 Guatemala City 2.79113 14.6229 -90.5316 (14.6229, -90.531599999999997) Landslide Mudslide Medium Rain NA 3 Fox News http://www.foxnews.com/story/0,2933,297714,00.html
104 563 6/1/08 NA Guatemala GT Escuintla 31329 Palín 3.10150 14.4226 -90.6755 (14.422599999999999, -90.6755) Landslide Mudslide Medium Tropical cyclone Tropical Storm Arthur NA 1 http://209.85.215.104/search?q=cache:QU_lPxNfk78J:www.plenglish.com/article.asp?ID=%7B1D4A74F7-CDCA-49D0-ABD4-D2E0FD9D2130%7D&language=EN+Colom+said+the+declaration+came+after+a+death+in+Palin+and+40+houses+partially&hl=en&ct=clnk&cd=1&gl=us&c
108 591 6/18/08 NA Guatemala GT Guatemala 994938 Guatemala City 3.12614 14.6510 -90.5403 (14.651, -90.540300000000002) Landslide Complex Medium Rain NA 8 http://cnnwire.blogs.cnn.com/2008/06/20/8-dead-in-rough-weather-in-guatemala/

Classic bar chart

df_GT %>% 
  select(Country, Distance, Trigger) 
##        Country Distance          Trigger
## 17   Guatemala  4.74385             Rain
## 27   Guatemala 13.39817 Tropical cyclone
## 28   Guatemala 12.55184 Tropical cyclone
## 41   Guatemala  2.79113             Rain
## 104  Guatemala  3.10150 Tropical cyclone
## 108  Guatemala  3.12614             Rain
## 120  Guatemala  0.80640 Tropical cyclone
## 158  Guatemala  5.31511 Tropical cyclone
## 162  Guatemala  1.58358         Downpour
## 169  Guatemala 23.92309             Rain
## 351  Guatemala  0.77254 Tropical cyclone
## 353  Guatemala  0.18542 Tropical cyclone
## 354  Guatemala  2.02891 Tropical cyclone
## 355  Guatemala  0.44764 Tropical cyclone
## 356  Guatemala  6.13527 Tropical cyclone
## 357  Guatemala  4.07930 Tropical cyclone
## 358  Guatemala  6.00513 Tropical cyclone
## 359  Guatemala  0.99952 Tropical cyclone
## 360  Guatemala  0.50611 Tropical cyclone
## 361  Guatemala  0.89040 Tropical cyclone
## 362  Guatemala  8.93658 Tropical cyclone
## 363  Guatemala  0.17513 Tropical cyclone
## 372  Guatemala  3.85753 Tropical cyclone
## 383  Guatemala  3.85648         Downpour
## 427  Guatemala  2.10418         Downpour
## 428  Guatemala  3.64749         Downpour
## 429  Guatemala  2.81128         Downpour
## 430  Guatemala  6.15103         Downpour
## 431  Guatemala  0.03280         Downpour
## 432  Guatemala  0.00359         Downpour
## 433  Guatemala  2.30104         Downpour
## 437  Guatemala  3.04642         Downpour
## 438  Guatemala  0.92729         Downpour
## 439  Guatemala 21.83272         Downpour
## 440  Guatemala  0.63089         Downpour
## 441  Guatemala  1.36473         Downpour
## 442  Guatemala  0.35171         Downpour
## 818  Guatemala  0.45507             Rain
## 885  Guatemala  7.39906         Downpour
## 1112 Guatemala  0.96647       Earthquake
## 1244 Guatemala  0.91108         Downpour
## 1347 Guatemala  7.03115             Rain
## 1352 Guatemala  5.88787             Rain
## 1353 Guatemala  2.70053             Rain
## 1354 Guatemala  2.59620             Rain
## 1356 Guatemala 22.56101         Downpour
## 1357 Guatemala  4.51954         Downpour
## 1358 Guatemala  3.30989         Downpour
## 1359 Guatemala  5.94535         Downpour
## 1360 Guatemala  3.98185             Rain
## 1361 Guatemala  0.75729             Rain
## 1557 Guatemala  0.94245             Rain
## 1559 Guatemala  3.96161 Tropical cyclone
## 1560 Guatemala  0.82332 Tropical cyclone
## 1561 Guatemala  3.47803             Rain
## 1568 Guatemala  6.19218             Rain
## 1569 Guatemala  5.52205             Rain
## 1570 Guatemala  1.87009             Rain
## 1571 Guatemala  4.20726             Rain
## 1572 Guatemala  3.18658             Rain
## 1573 Guatemala  0.67040             Rain
## 1574 Guatemala  3.80312             Rain
## 1575 Guatemala  1.68290             Rain
## 1576 Guatemala  2.08425             Rain
## 1577 Guatemala  3.25675             Rain
## 1578 Guatemala  3.49341             Rain
## 1579 Guatemala  1.83863             Rain
## 1580 Guatemala  1.57381             Rain
## 1581 Guatemala  1.70147             Rain
## 1582 Guatemala  3.00314             Rain
## 1583 Guatemala  2.27725             Rain
## 1584 Guatemala  2.36376             Rain
## 1585 Guatemala  2.66358             Rain
## 1588 Guatemala  1.45200  Continuous rain
## 1589 Guatemala  5.14479             Rain
## 1590 Guatemala  8.25465          Unknown
## 1591 Guatemala  0.65744             Rain
## 1592 Guatemala  0.75685             Rain
## 1595 Guatemala  1.81216             Rain
library(ggplot2)
library(dplyr)
ggplot(data=df_GT, aes(fill=Trigger, x="Guatemala", y=Distance)) +
  geom_bar(position="dodge", stat="identity")

Pie chart

library(ggplot2)
library(dplyr)

data <- data.frame(Trigger = 
                     c("Tropical cyclone", 
                       "Rain", 
                       "Downpour", 
                       "Earthquake",
                       "Continuous rain", 
                       "Unknown"),
                   value = c(20, 35, 21, 1, 1, 1))
knitr::kable(data)
Trigger value
Tropical cyclone 20
Rain 35
Downpour 21
Earthquake 1
Continuous rain 1
Unknown 1
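
The counts in this table were typed by hand. They could also be tabulated directly from the `Trigger` column. The following is a minimal sketch, assuming only the first six `Trigger` values from the listing above, that builds the same two-column layout with `table()`.

```r
# Trigger values of the first six Guatemala rows in the listing above
trigger <- c("Rain", "Tropical cyclone", "Tropical cyclone",
             "Rain", "Tropical cyclone", "Rain")

# Tabulate and convert to a data frame with columns Trigger and value
counts <- as.data.frame(table(Trigger = trigger),
                        responseName = "value",
                        stringsAsFactors = FALSE)

# Share of each trigger among all tabulated events
counts$prop <- counts$value / sum(counts$value)
counts
```

Applied to the full column (for example `table(Trigger = df_GT$Trigger)`), the same steps would avoid transcription errors in the frequencies.
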
library(ggplot2)
library(dplyr)

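# Share of each trigger in percent, and the midpoint of each slice (in units of value) for label placement
data <- data %>% 
  arrange(desc(Trigger)) %>%
  mutate(prop = value / sum(value) * 100) %>%
  mutate(ypos = cumsum(value) - 0.5 * value)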
require(scales)
## Loading required package: scales
## 
## Attaching package: 'scales'
## The following object is masked from 'package:viridis':
## 
##     viridis_pal
## The following object is masked from 'package:purrr':
## 
##     discard
## The following object is masked from 'package:readr':
## 
##     col_factor
ggplot(data, aes(x="Trigger", y = value, fill=Trigger)) +
  geom_bar(stat="identity", width=1, color="white") +
  coord_polar("y", start=0) +
  theme_void() + 
  theme(legend.position="none") +
  
  geom_text(aes(y = ypos, label = percent(value / sum(value), accuracy = 0.1)), color = "white", size=6) +
  scale_fill_brewer(palette="Set1")+
    labs(title="Trigger")

ggplot(data, aes(x = "Trigger", y = value, fill=Trigger)) +
  geom_bar(stat = "identity", width = 1) +
  coord_polar("y", start = 0)

Pareto chart

df <- data.frame(Trigger =
                   c("Tropical cyclone", 
                       "Rain", 
                       "Downpour", 
                       "Earthquake",
                       "Continuous rain", 
                       "Unknown"),
                   Frequency = c(20, 35, 21, 1, 1, 1))
knitr::kable(df)
Trigger Frequency
Tropical cyclone 20
Rain 35
Downpour 21
Earthquake 1
Continuous rain 1
Unknown 1
library(qcc)

Frequency <- df$Frequency
names(Frequency) <- df$Trigger 

pareto.chart(Frequency, 
             ylab="Frequency",
             col = heat.colors(length(Frequency)),
             cumperc = seq(0, 100, by = 10),
             ylab2 = "Accumulated Percentage",
             main = "Events that trigger landslides "
)

##                   
## Pareto chart analysis for Frequency
##                     Frequency  Cum.Freq. Percentage Cum.Percent.
##   Rain              35.000000  35.000000  44.303797    44.303797
##   Downpour          21.000000  56.000000  26.582278    70.886076
##   Tropical cyclone  20.000000  76.000000  25.316456    96.202532
##   Earthquake         1.000000  77.000000   1.265823    97.468354
##   Continuous rain    1.000000  78.000000   1.265823    98.734177
##   Unknown            1.000000  79.000000   1.265823   100.000000
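
The columns of this Pareto analysis can be reproduced with a few lines of base R. The following is a minimal sketch using the trigger frequencies from the table above; the names `freq_sorted` and `pareto_tab` are illustrative.

```r
# Trigger frequencies as listed in the table above
freq <- c("Tropical cyclone" = 20, "Rain" = 35, "Downpour" = 21,
          "Earthquake" = 1, "Continuous rain" = 1, "Unknown" = 1)

# A Pareto chart orders the categories from most to least frequent
freq_sorted <- sort(freq, decreasing = TRUE)

pareto_tab <- data.frame(
  Frequency  = as.numeric(freq_sorted),
  Cum_Freq   = cumsum(as.numeric(freq_sorted)),
  Percentage = 100 * as.numeric(freq_sorted) / sum(freq_sorted),
  row.names  = names(freq_sorted)
)
pareto_tab$Cum_Percent <- cumsum(pareto_tab$Percentage)
pareto_tab
```

The three rain-related or storm-related triggers (Rain, Downpour and Tropical cyclone) together account for 76 of the 79 events.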

Haiti

library(readr)
library(knitr)
df <- read.csv("https://raw.githubusercontent.com/lihkir/AnalisisEstadisticoUN/main/Data/catalog.csv")
library(dplyr)
colnames(df)[2] <- "Date"
colnames(df)[5] <- "Country"
colnames(df)[7] <- "State"
colnames(df)[8] <- "Population"
colnames(df)[9] <- "City"
colnames(df)[10] <- "Distance"
colnames(df)[18] <- "Trigger"
colnames(df)[21] <- "Fatalities"
library(readr)
library(knitr)
df_HT <- subset (df, Country == "Haiti")
knitr::kable(df_HT)
id Date time continent_code Country country_code State Population City Distance location_description latitude longitude geolocation hazard_type landslide_type landslide_size Trigger storm_name injuries Fatalities source_name source_link
43 297 10/8/07 NA Haiti HT Artibonite 7294 Gros Morne 8.70343 19.6990 -72.7540 (19.699000000000002, -72.754000000000005) Landslide Landslide Medium Downpour NA NA https://www-secure.ifrc.org/dmis/prepare/view_report.asp?ReportID=3285
47 303 10/12/07 NA Haiti HT Ouest 3951 Cabaret 0.51272 18.7335 -72.4133 (18.733499999999999, -72.413300000000007) Landslide Complex Large Rain NA 23 Euronews.net http://www.euronews.net/index.php?page=info&article=448067&lng=1
53 334 10/29/07 NA Haiti HT Ouest 1234742 Port-au-Prince 2.72168 18.5146 -72.3361 (18.514600000000002, -72.336100000000002) Landslide Complex Medium Tropical cyclone Tropical Storm Noel NA NA ABC news http://www.abcnews.go.com/International/wireStory?id=3807131
94 506 4/20/08 NA Haiti HT Ouest 1234742 Port-au-Prince 1.80063 18.5283 -72.3224 (18.528300000000002, -72.322400000000002) Landslide Mudslide Medium Rain NA 3 http://www.news.com.au/heraldsun/story/0,21985,23596379-5005961,00.html
139 747 8/26/08 NA Haiti HT Sud-Est 137966 Jacmel 4.41574 18.2640 -72.5070 (18.263999999999999, -72.507000000000005) Landslide Landslide Medium Tropical cyclone Hurricane Gustav NA 25 http://ap.google.com/article/ALeqM5gVWjsPEiqe1tEu2mhBIRaxxGi8owD92RGO9O1
140 748 8/26/08 NA Haiti HT Ouest 1234742 Port-au-Prince 3.50201 18.5090 -72.3450 (18.509, -72.344999999999999) Landslide Mudslide Medium Tropical cyclone Hurricane Gustav NA 3 http://www.reuters.com/article/worldNews/idUSN2541891320080827?pageNumber=1&virtualBrandChannel=0
145 771 9/3/08 NA Haiti HT Artibonite 84961 Gonaïves 4.72379 19.4300 -72.6480 (19.43, -72.647999999999996) Landslide Mudslide Medium Tropical cyclone Hurricane Hannah NA 26 http://www.miamiherald.com/news/americas/cuba/story/671682.html
208 1140 9/7/09 Early morning NA Haiti HT Artibonite 66226 Saint-Marc 17.29836 18.9523 -72.7053 (18.952300000000001, -72.705299999999994) Landslide Mudslide Medium Downpour NA 1 http://www.google.com/hostednews/ap/article/ALeqM5hdjzxxFRHymhlrd1BpUjDSV3HK6AD9AIQ5OO0
223 1266 10/20/09 NA Haiti HT Ouest 442156 Carrefour 1.31659 18.5347 -72.4097 (18.534700000000001, -72.409700000000001) Landslide Landslide Small Downpour NA 4 http://www.etaiwannews.com/etn/news_content.php?id=1088959&lang=eng_news
264 1506 2/15/10 12:00 NA Haiti HT Nord 134815 Cap-Haïtien 0.27505 Urban area 19.7560 -72.2060 (19.756, -72.206000000000003) Landslide Mudslide Medium Downpour NA 4 Times Live http://www.timeslive.co.za/world/article311411.ece
471 2528 10/1/10 NA Haiti HT Ouest 442156 Carrefour 12.13199 18.4468 -72.4577 (18.4468, -72.457700000000003) Landslide Mudslide Medium Downpour NA 3 http://www.presstv.ir/detail/144854.html
481 2604 10/17/10 NA Haiti HT Ouest 134190 Léogâne 7.67473 18.4674 -72.5738 (18.467400000000001, -72.573800000000006) Landslide Complex Medium Downpour NA 8 http://edition.cnn.com/2010/WORLD/americas/10/19/haiti.flooding/
482 2605 10/17/10 NA Haiti HT Ouest 442156 Carrefour 2.63565 18.5202 -72.4111 (18.520199999999999, -72.411100000000005) Landslide Mudslide Medium Downpour NA 2 http://www.npr.org/templates/story/story.php?storyId=130649188
748 3563 6/2/11 NA Haiti HT Sud-Est 137966 Jacmel 0.19079 18.2348 -72.5364 (18.2348, -72.5364) Landslide Landslide Small Downpour NA 0 http://www.haitilibre.com/en/news-3095-haiti-climate-the-situation-by-department.html
749 3564 6/2/11 NA Haiti HT Centre 18590 Hinche 7.86436 19.2088 -71.9747 (19.2088, -71.974699999999999) Landslide Landslide Medium Downpour NA 1 http://www.haitilibre.com/en/news-3095-haiti-climate-the-situation-by-department.html
754 3576 6/7/11 NA Haiti HT Ouest 283052 Pétionville 0.11071 18.5135 -72.2853 (18.513500000000001, -72.285300000000007) Landslide Landslide Large Downpour NA 13 http://www.bbc.co.uk/news/world-latin-america-13689711
873 4289 3/30/12 Late night NA Haiti HT Ouest 283052 Pétionville 1.33931 18.5044 -72.2947 (18.5044, -72.294700000000006) Landslide Landslide Medium Downpour NA 6 http://www.haitilibre.com/en/news-5290-haiti-weather-first-drama-of-the-rain.html
875 4312 4/8/12 NA Haiti HT Nord 32645 Limbé 0.03471 19.7041 -72.4006 (19.7041, -72.400599999999997) Landslide Landslide Medium Downpour NA 2 http://www.usatoday.com/news/world/story/2012-04-10/Haiti-floods/54160810/1
1401 6713 11/1/14 NA Haiti HT Nord 134815 Okap 5.23459 Urban area 19.7450 -72.2152 (19.745000000000001, -72.215199999999996) Landslide Landslide Medium Downpour 0 1 reliefweb http://reliefweb.int/report/haiti/undp-government-haiti-provide-immediate-support-flood-affected-victims
1402 6722 5/27/14 NA Haiti HT Nord 134815 Okap 1.58489 Unknown 19.7698 -72.2085 (19.7698, -72.208500000000001) Landslide Landslide Small Continuous rain 1 3 Business Recorder http://www.brecorder.com/world/north-america/15393-three-children-die-in-haiti-landslide.html

Stacked bar chart

df_HT %>% 
  select(Country, City, Distance)
##      Country           City Distance
## 43     Haiti     Gros Morne  8.70343
## 47     Haiti        Cabaret  0.51272
## 53     Haiti Port-au-Prince  2.72168
## 94     Haiti Port-au-Prince  1.80063
## 139    Haiti         Jacmel  4.41574
## 140    Haiti Port-au-Prince  3.50201
## 145    Haiti      Gonaïves  4.72379
## 208    Haiti     Saint-Marc 17.29836
## 223    Haiti      Carrefour  1.31659
## 264    Haiti   Cap-Haïtien  0.27505
## 471    Haiti      Carrefour 12.13199
## 481    Haiti      Léogâne  7.67473
## 482    Haiti      Carrefour  2.63565
## 748    Haiti         Jacmel  0.19079
## 749    Haiti         Hinche  7.86436
## 754    Haiti   Pétionville  0.11071
## 873    Haiti   Pétionville  1.33931
## 875    Haiti         Limbé  0.03471
## 1401   Haiti           Okap  5.23459
## 1402   Haiti           Okap  1.58489
library(ggplot2)
library(dplyr)
ggplot(data=df_HT, aes(fill=City, x="Haiti", y=Distance)) +
  geom_bar(position="stack", stat="identity")

Grouped frequency chart

df_HT %>% 
  select(Country, Fatalities)
##      Country Fatalities
## 43     Haiti         NA
## 47     Haiti         23
## 53     Haiti         NA
## 94     Haiti          3
## 139    Haiti         25
## 140    Haiti          3
## 145    Haiti         26
## 208    Haiti          1
## 223    Haiti          4
## 264    Haiti          4
## 471    Haiti          3
## 481    Haiti          8
## 482    Haiti          2
## 748    Haiti          0
## 749    Haiti          1
## 754    Haiti         13
## 873    Haiti          6
## 875    Haiti          2
## 1401   Haiti          1
## 1402   Haiti          3
# Fatalities typed in from the table above; the two NA entries are entered as 0
Fatalities <- c(0, 23, 0, 3, 25, 3, 26, 1, 4, 4, 3, 8, 2, 0, 1, 13, 6, 2, 1, 3)
knitr::kable(head(Fatalities))
x
0
23
0
3
25
3
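# Sturges' rule: k = 1 + log2(n)
n_sturges = 1 + log(length(Fatalities))/log(2)
n_sturgesc = ceiling(n_sturges)
n_sturgesf = floor(n_sturges)

# Use the floor when the ceiling is even, otherwise the ceiling
n_clases = 0
if (n_sturgesc%%2 == 0) {
  n_clases = n_sturgesf
} else {
  n_clases = n_sturgesc
}
# Range of the data and class width
R = max(Fatalities) - min(Fatalities)
w = ceiling(R/n_clases)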

bins <- seq(min(Fatalities), max(Fatalities) + w, by = w)
bins
## [1]  0  6 12 18 24 30
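As a worked check of the class computation above, with n = 20 fatality values ranging from 0 to 26, the arithmetic can be written out as follows (the symbols k, R and w mirror n_sturges, R and w in the code):

```latex
\begin{align*}
k &= 1 + \log_2 n = 1 + \log_2 20 \approx 5.32, \\
\lceil k \rceil &= 6 \ (\text{even}) \;\Rightarrow\; n_{\text{clases}} = \lfloor k \rfloor = 5, \\
R &= 26 - 0 = 26, \\
w &= \lceil R / n_{\text{clases}} \rceil = \lceil 26 / 5 \rceil = 6, \\
\text{breaks} &= 0,\ 6,\ 12,\ 18,\ 24,\ 30 .
\end{align*}
```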
Fatalities <- cut(Fatalities, bins)
Freq_table <- transform(table(Fatalities), Rel_Freq=prop.table(Freq), Cum_Freq=cumsum(Freq))
knitr::kable(Freq_table)
Fatalities Freq Rel_Freq Cum_Freq
(0,6] 12 0.7058824 12
(6,12] 1 0.0588235 13
(12,18] 1 0.0588235 14
(18,24] 1 0.0588235 15
(24,30] 2 0.1176471 17
str(Freq_table)
## 'data.frame':    5 obs. of  4 variables:
##  $ Fatalities: Factor w/ 5 levels "(0,6]","(6,12]",..: 1 2 3 4 5
##  $ Freq      : int  12 1 1 1 2
##  $ Rel_Freq  : num  0.7059 0.0588 0.0588 0.0588 0.1176
##  $ Cum_Freq  : int  12 13 14 15 17
df <- data.frame(x = Freq_table$Fatalities, y = Freq_table$Freq)
knitr::kable(df)
x y
(0,6] 12
(6,12] 1
(12,18] 1
(18,24] 1
(24,30] 2
library(ggplot2)

ggplot(data=df, aes(x=x, y=y)) +
  geom_bar(stat="identity", color="blue", fill="green") +
  xlab("Fatalities") +
  ylab("Frequency")
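
One detail worth checking in the frequency table above: `cut()` builds left-open intervals by default, so the three values entered as 0 fall outside `(0,6]` and the frequencies add up to 17 rather than 20. The following is a minimal sketch of one possible adjustment, assuming the zeros should be counted in the first class. The names `fatalities_raw`, `breaks`, `classes`, `freq` and `freq_table` are illustrative and not part of the original analysis.

```r
# Same values as the Fatalities vector defined above (NA entries entered as 0).
fatalities_raw <- c(0, 23, 0, 3, 25, 3, 26, 1, 4, 4, 3, 8, 2, 0, 1, 13, 6, 2, 1, 3)
breaks <- seq(0, 30, by = 6)

# include.lowest = TRUE closes the first interval on the left: [0,6]
classes <- cut(fatalities_raw, breaks, include.lowest = TRUE)
freq <- table(classes)

# Absolute, relative and cumulative frequencies per class.
freq_table <- data.frame(
  Fatalities = names(freq),
  Freq = as.vector(freq),
  Rel_Freq = as.vector(prop.table(freq)),
  Cum_Freq = cumsum(as.vector(freq))
)
print(freq_table)
```

With this option every one of the 20 values lands in a class. Whether the two NA records should instead be dropped rather than treated as 0 is a separate modelling choice.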

Dominica

library(readr)
library(knitr)
library(ggplot2)
df <- read.csv("https://raw.githubusercontent.com/lihkir/AnalisisEstadisticoUN/main/Data/catalog.csv")
library(dplyr)
colnames(df)[2] <- "Date"
colnames(df)[5] <- "Country"
colnames(df)[7] <- "State"
colnames(df)[8] <- "Population"
colnames(df)[9] <- "City"
colnames(df)[10] <- "Distance"
library(readr)
library(knitr)
df_DOM <- subset (df, Country == "Dominica")
knitr::kable(head(df_DOM))
id Date time continent_code Country country_code State Population City Distance location_description latitude longitude geolocation hazard_type landslide_type landslide_size trigger storm_name injuries fatalities source_name source_link
20 186 8/17/07 NA Dominica DM Saint Paul 702 Pont Cassé 3.39516 15.3379 -61.3610 (15.337899999999999, -61.360999999999997) Landslide Mudslide Small Tropical cyclone Hurricane Dean NA 2 Tribune India http://www.tribuneindia.com/2007/20070817/himachal.htm
39 250 9/9/07 NA Dominica DM Saint George 16571 Roseau 2.59849 15.3055 -61.3642 (15.3055, -61.364199999999997) Landslide Landslide Medium Rain Tropical Wave NA NA RadioJamaica http://www.radiojamaica.com/content/view/1156/88/
267 1552 3/11/10 NA Dominica DM Saint Paul 702 Pont Cassé 3.98646 15.3356 -61.3312 (15.335599999999999, -61.331200000000003) Landslide Landslide Medium Rain NA 0 http://stormcarib.com/reports/current/report.php?id=1268397271_8827
297 1743 4/12/10 NA Dominica DM Saint Patrick 2608 Berekua 2.08997 15.2454 -61.3017 (15.2454, -61.301699999999997) Landslide Landslide Medium Downpour NA 0 http://www.dominicacentral.com/general/community/heavy-overnight-rains-cause-landslides-across-island.html
298 1744 4/12/10 NA Dominica DM Saint Paul 702 Pont Cassé 3.78784 15.4004 -61.3440 (15.400399999999999, -61.344000000000001) Landslide Landslide Medium Downpour NA 0 http://www.dominicacentral.com/general/community/heavy-overnight-rains-cause-landslides-across-island.html
299 1745 4/12/10 NA Dominica DM Saint Patrick 2608 Berekua 4.08252 15.2458 -61.2809 (15.245799999999999, -61.280900000000003) Landslide Landslide Small Downpour NA 0 http://www.dominicacentral.com/general/community/heavy-overnight-rains-cause-landslides-across-island.html
df_DOM %>% 
  select(Country, State, City, Distance, Date) 
##       Country         State         City Distance     Date
## 20   Dominica    Saint Paul  Pont Cassé  3.39516  8/17/07
## 39   Dominica  Saint George       Roseau  2.59849   9/9/07
## 267  Dominica    Saint Paul  Pont Cassé  3.98646  3/11/10
## 297  Dominica Saint Patrick      Berekua  2.08997  4/12/10
## 298  Dominica    Saint Paul  Pont Cassé  3.78784  4/12/10
## 299  Dominica Saint Patrick      Berekua  4.08252  4/12/10
## 300  Dominica Saint Patrick      Berekua  5.61495  4/12/10
## 301  Dominica Saint Patrick    La Plaine  5.11600  4/12/10
## 304  Dominica    Saint Paul  Pont Cassé  6.45930  4/16/10
## 476  Dominica  Saint Andrew   Calibishie  2.64873  10/5/10
## 1190 Dominica    Saint Paul  Pont Cassé  4.20239 12/24/13
## 1193 Dominica    Saint John   Portsmouth  5.92994 12/24/13
## 1194 Dominica    Saint Mark   Soufrière  1.80847 12/24/13
## 1201 Dominica  Saint Joseph Saint Joseph  2.38605   1/7/14
head(df_DOM)
##       id    Date time continent_code  Country country_code         State
## 20   186 8/17/07                <NA> Dominica           DM    Saint Paul
## 39   250  9/9/07                <NA> Dominica           DM  Saint George
## 267 1552 3/11/10                <NA> Dominica           DM    Saint Paul
## 297 1743 4/12/10                <NA> Dominica           DM Saint Patrick
## 298 1744 4/12/10                <NA> Dominica           DM    Saint Paul
## 299 1745 4/12/10                <NA> Dominica           DM Saint Patrick
##     Population        City Distance location_description latitude longitude
## 20         702 Pont Cassé  3.39516                       15.3379  -61.3610
## 39       16571      Roseau  2.59849                       15.3055  -61.3642
## 267        702 Pont Cassé  3.98646                       15.3356  -61.3312
## 297       2608     Berekua  2.08997                       15.2454  -61.3017
## 298        702 Pont Cassé  3.78784                       15.4004  -61.3440
## 299       2608     Berekua  4.08252                       15.2458  -61.2809
##                                   geolocation hazard_type landslide_type
## 20  (15.337899999999999, -61.360999999999997)   Landslide       Mudslide
## 39             (15.3055, -61.364199999999997)   Landslide      Landslide
## 267 (15.335599999999999, -61.331200000000003)   Landslide      Landslide
## 297            (15.2454, -61.301699999999997)   Landslide      Landslide
## 298 (15.400399999999999, -61.344000000000001)   Landslide      Landslide
## 299 (15.245799999999999, -61.280900000000003)   Landslide      Landslide
##     landslide_size          trigger     storm_name injuries fatalities
## 20           Small Tropical cyclone Hurricane Dean       NA          2
## 39          Medium             Rain  Tropical Wave       NA         NA
## 267         Medium             Rain                      NA          0
## 297         Medium         Downpour                      NA          0
## 298         Medium         Downpour                      NA          0
## 299          Small         Downpour                      NA          0
##       source_name
## 20  Tribune India
## 39   RadioJamaica
## 267              
## 297              
## 298              
## 299              
##                                                                                                    source_link
## 20                                                      http://www.tribuneindia.com/2007/20070817/himachal.htm
## 39                                                           http://www.radiojamaica.com/content/view/1156/88/
## 267                                        http://stormcarib.com/reports/current/report.php?id=1268397271_8827
## 297 http://www.dominicacentral.com/general/community/heavy-overnight-rains-cause-landslides-across-island.html
## 298 http://www.dominicacentral.com/general/community/heavy-overnight-rains-cause-landslides-across-island.html
## 299 http://www.dominicacentral.com/general/community/heavy-overnight-rains-cause-landslides-across-island.html

Pareto chart

Used to show the cities with the largest landslide Distance values.

library(qcc)

Distance <- df_DOM$Distance
names(Distance) <- df_DOM$City 

pareto.chart(Distance, 
             ylab = "Distance",
             col = heat.colors(length(Distance)),
             cumperc = seq(0, 100, by = 10),
             ylab2 = "Cumulative percentage",
             main = "Cities with the largest landslide distances"
)

##               
## Pareto chart analysis for Distance
##                 Frequency  Cum.Freq. Percentage Cum.Percent.
##   Pont Cassé    6.459300   6.459300  11.938173    11.938173
##   Portsmouth     5.929940  12.389240  10.959802    22.897975
##   Berekua        5.614950  18.004190  10.377633    33.275607
##   La Plaine      5.116000  23.120190   9.455466    42.731073
##   Pont Cassé    4.202390  27.322580   7.766919    50.497992
##   Berekua        4.082520  31.405100   7.545373    58.043365
##   Pont Cassé    3.986460  35.391560   7.367834    65.411199
##   Pont Cassé    3.787840  39.179400   7.000741    72.411940
##   Pont Cassé    3.395160  42.574560   6.274984    78.686925
##   Calibishie     2.648730  45.223290   4.895422    83.582346
##   Roseau         2.598490  47.821780   4.802567    88.384914
##   Saint Joseph   2.386050  50.207830   4.409933    92.794846
##   Berekua        2.089970  52.297800   3.862713    96.657559
##   Soufrière     1.808470  54.106270   3.342441   100.000000
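
Several cities appear more than once in the output above, because each bar is one record rather than one city. The following is a minimal sketch of aggregating first, assuming the goal is to rank cities by number of recorded events. The `city` vector is typed in from the table printed earlier, `events_per_city` is an illustrative name, and the `qcc` call is skipped if the package is not installed.

```r
# City of each Dominica record, copied from the table printed above.
city <- c("Pont Cassé", "Roseau", "Pont Cassé", "Berekua", "Pont Cassé",
          "Berekua", "Berekua", "La Plaine", "Pont Cassé", "Calibishie",
          "Pont Cassé", "Portsmouth", "Soufrière", "Saint Joseph")

# Number of events per city, most frequent first.
events_per_city <- sort(table(city), decreasing = TRUE)
print(events_per_city)

# One bar per city instead of one bar per event.
if (requireNamespace("qcc", quietly = TRUE)) {
  qcc::pareto.chart(events_per_city,
                    ylab = "Events",
                    ylab2 = "Cumulative percentage",
                    main = "Events per city (Dominica)")
}
```

Summing `Distance` per city with `tapply()` would be an equally simple variant if the total distance is the quantity of interest.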
library(forecast)
data_serie<- ts(df_DOM$Distance, frequency=12, start=2007)
head(data_serie)
##          Jan     Feb     Mar     Apr     May     Jun
## 2007 3.39516 2.59849 3.98646 2.08997 3.78784 4.08252

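Time series

Shows the Distance values recorded for Dominica over time. Because the series is built with start = 2007 and frequency = 12, the records are indexed as consecutive months from January 2007 rather than by their actual event dates.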

autoplot(data_serie) +
  labs(title = "Landslides", x = "Year", y = "Distance", colour = "#00a0dc") + theme_grey()
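
As an alternative that places the events on the real calendar rather than on consecutive monthly slots, the following is a minimal sketch that counts events per year. It assumes the `Date` strings follow the month/day/two-digit-year pattern seen in the table above. The names `date_chr`, `event_dates`, `years`, `events_per_year` and `yearly_series` are illustrative.

```r
# Date of each Dominica record, copied from the table printed above.
date_chr <- c("8/17/07", "9/9/07", "3/11/10", "4/12/10", "4/12/10", "4/12/10",
              "4/12/10", "4/12/10", "4/16/10", "10/5/10", "12/24/13",
              "12/24/13", "12/24/13", "1/7/14")

# Parse the strings and extract the calendar year of each event.
event_dates <- as.Date(date_chr, format = "%m/%d/%y")
years <- as.integer(format(event_dates, "%Y"))

# Count events per year, keeping years without events as zeros.
all_years <- seq(min(years), max(years))
events_per_year <- table(factor(years, levels = all_years))

# Yearly series whose time axis matches the real calendar years.
yearly_series <- ts(as.vector(events_per_year), start = min(years), frequency = 1)
print(yearly_series)
```

The resulting object can be passed to `autoplot()` in the same way as `data_serie`.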

El Salvador

library(readr)
library(knitr)
df_SV <- subset (df, Country == "El Salvador")
knitr::kable(head(df_SV))
id Date time continent_code Country country_code State Population City Distance location_description latitude longitude geolocation hazard_type landslide_type landslide_size trigger storm_name injuries fatalities source_name source_link
34 230 9/5/07 NA El Salvador SV Ahuachapán 7797 Concepción de Ataco 0.00273 13.8703 -89.8486 (13.8703, -89.848600000000005) Landslide Mudslide Medium Tropical cyclone Hurricane Felix NA NA Azcentral.com http://www.azcentral.com/news/articles/1108sr-fhsistercity1109-ON.html
105 564 6/2/08 NA El Salvador SV La Libertad 124694 Santa Tecla 4.96416 13.7205 -89.2687 (13.720499999999999, -89.268699999999995) Landslide Landslide Medium Tropical cyclone Tropical Storm Arthur NA NA http://news.xinhuanet.com/english/2008-06/04/content_8310737.htm
224 1285 11/8/09 NA El Salvador SV San Vicente 41504 San Vicente 7.60946 13.6409 -88.8699 (13.6409, -88.869900000000001) Landslide Complex Very_large Tropical cyclone Tropical Cyclone Ida NA 23 http://www.google.com/hostednews/ap/article/ALeqM5j0XCCb1n12DyhoBoDzGj_hTyEtrAD9BRKPRG0
225 1286 11/8/09 NA El Salvador SV La Libertad 33767 Antiguo Cuscatlán 4.86219 13.7156 -89.2521 (13.7156, -89.252099999999999) Landslide Mudslide Medium Tropical cyclone Tropical Cyclone Ida NA 4 http://www.google.com/hostednews/ap/article/ALeqM5j0XCCb1n12DyhoBoDzGj_hTyEtrAD9BRKPRG0
226 1287 11/8/09 NA El Salvador SV San Vicente 41504 San Vicente 5.90726 13.6094 -88.8488 (13.609400000000001, -88.848799999999997) Landslide Rockfall Medium Tropical cyclone Tropical Cyclone Ida NA NA http://news.bbc.co.uk/2/hi/in_depth/8349333.stm
227 1288 11/8/09 NA El Salvador SV San Vicente 41504 San Vicente 4.03125 13.6466 -88.8347 (13.646599999999999, -88.834699999999998) Landslide Mudslide Medium Tropical cyclone Tropical Cyclone Ida NA NA http://news.yahoo.com/s/afp/20091109/wl_afp/salvadorweatherstorm_20091109100952
df_SV %>% 
  select(Country, State, City, Distance, Date) 
##          Country        State                 City Distance     Date
## 34   El Salvador  Ahuachapán Concepción de Ataco  0.00273   9/5/07
## 105  El Salvador  La Libertad          Santa Tecla  4.96416   6/2/08
## 224  El Salvador  San Vicente          San Vicente  7.60946  11/8/09
## 225  El Salvador  La Libertad   Antiguo Cuscatlán  4.86219  11/8/09
## 226  El Salvador  San Vicente          San Vicente  5.90726  11/8/09
## 227  El Salvador  San Vicente          San Vicente  4.03125  11/8/09
## 453  El Salvador  Ahuachapán               Tacuba  5.29901  9/26/10
## 824  El Salvador San Salvador                Apopa  3.01739 10/10/11
## 1294 El Salvador   San Miguel           Chirilagua  6.94536 10/13/14
## 1366 El Salvador   San Miguel   San Rafael Oriente 10.06695  5/22/14
## 1367 El Salvador     Cabañas          San Martín  8.82525  4/21/14
## 1369 El Salvador    Sonsonate           Nahuizalco  4.23875 10/15/14
## 1370 El Salvador    Sonsonate            Sonzacate  3.22235 10/15/14
## 1371 El Salvador       La Paz   San Pedro Masahuat  0.31933 10/15/14
## 1372 El Salvador   San Miguel           Chirilagua  9.97227 10/15/14
## 1373 El Salvador    Santa Ana           Coatepeque  8.83210 10/12/14
## 1374 El Salvador  La Libertad          Santa Tecla  4.60655 10/12/14
## 1375 El Salvador San Salvador   Antiguo Cuscatlán  3.25227 10/12/14
## 1594 El Salvador    Santa Ana          Ciudad Arce  1.15810  7/18/15
## 1596 El Salvador  La Libertad          Santa Tecla  4.67722  11/3/15
## 1597 El Salvador  La Libertad          Santa Tecla  9.87553  11/4/15
## 1598 El Salvador    Sonsonate              Juayúa  0.49346 10/19/15
head(df_SV)
##       id    Date time continent_code     Country country_code       State
## 34   230  9/5/07                <NA> El Salvador           SV Ahuachapán
## 105  564  6/2/08                <NA> El Salvador           SV La Libertad
## 224 1285 11/8/09                <NA> El Salvador           SV San Vicente
## 225 1286 11/8/09                <NA> El Salvador           SV La Libertad
## 226 1287 11/8/09                <NA> El Salvador           SV San Vicente
## 227 1288 11/8/09                <NA> El Salvador           SV San Vicente
##     Population                 City Distance location_description latitude
## 34        7797 Concepción de Ataco  0.00273                       13.8703
## 105     124694          Santa Tecla  4.96416                       13.7205
## 224      41504          San Vicente  7.60946                       13.6409
## 225      33767   Antiguo Cuscatlán  4.86219                       13.7156
## 226      41504          San Vicente  5.90726                       13.6094
## 227      41504          San Vicente  4.03125                       13.6466
##     longitude                               geolocation hazard_type
## 34   -89.8486            (13.8703, -89.848600000000005)   Landslide
## 105  -89.2687 (13.720499999999999, -89.268699999999995)   Landslide
## 224  -88.8699            (13.6409, -88.869900000000001)   Landslide
## 225  -89.2521            (13.7156, -89.252099999999999)   Landslide
## 226  -88.8488 (13.609400000000001, -88.848799999999997)   Landslide
## 227  -88.8347 (13.646599999999999, -88.834699999999998)   Landslide
##     landslide_type landslide_size          trigger            storm_name
## 34        Mudslide         Medium Tropical cyclone       Hurricane Felix
## 105      Landslide         Medium Tropical cyclone Tropical Storm Arthur
## 224        Complex     Very_large Tropical cyclone  Tropical Cyclone Ida
## 225       Mudslide         Medium Tropical cyclone  Tropical Cyclone Ida
## 226       Rockfall         Medium Tropical cyclone  Tropical Cyclone Ida
## 227       Mudslide         Medium Tropical cyclone  Tropical Cyclone Ida
##     injuries fatalities   source_name
## 34        NA         NA Azcentral.com
## 105       NA         NA              
## 224       NA         23              
## 225       NA          4              
## 226       NA         NA              
## 227       NA         NA              
##                                                                                 source_link
## 34                   http://www.azcentral.com/news/articles/1108sr-fhsistercity1109-ON.html
## 105                        http://news.xinhuanet.com/english/2008-06/04/content_8310737.htm
## 224 http://www.google.com/hostednews/ap/article/ALeqM5j0XCCb1n12DyhoBoDzGj_hTyEtrAD9BRKPRG0
## 225 http://www.google.com/hostednews/ap/article/ALeqM5j0XCCb1n12DyhoBoDzGj_hTyEtrAD9BRKPRG0
## 226                                         http://news.bbc.co.uk/2/hi/in_depth/8349333.stm
## 227         http://news.yahoo.com/s/afp/20091109/wl_afp/salvadorweatherstorm_20091109100952

Pareto chart, used to show the cities with the largest landslide Distance values.

library(qcc)

Distance <- df_SV$Distance
names(Distance) <- df_SV$City 

pareto.chart(Distance, 
             ylab = "Distance",
             col = heat.colors(length(Distance)),
             cumperc = seq(0, 100, by = 10),
             ylab2 = "Cumulative percentage",
             main = "Cities with the largest landslide distances"
)

##                       
## Pareto chart analysis for Distance
##                           Frequency    Cum.Freq.   Percentage Cum.Percent.
##   San Rafael Oriente   1.006695e+01 1.006695e+01 8.974011e+00 8.974011e+00
##   Chirilagua           9.972270e+00 2.003922e+01 8.889610e+00 1.786362e+01
##   Santa Tecla          9.875530e+00 2.991475e+01 8.803373e+00 2.666699e+01
##   Coatepeque           8.832100e+00 3.874685e+01 7.873225e+00 3.454022e+01
##   San Martín          8.825250e+00 4.757210e+01 7.867118e+00 4.240734e+01
##   San Vicente          7.609460e+00 5.518156e+01 6.783323e+00 4.919066e+01
##   Chirilagua           6.945360e+00 6.212692e+01 6.191323e+00 5.538198e+01
##   San Vicente          5.907260e+00 6.803418e+01 5.265926e+00 6.064791e+01
##   Tacuba               5.299010e+00 7.333319e+01 4.723712e+00 6.537162e+01
##   Santa Tecla          4.964160e+00 7.829735e+01 4.425216e+00 6.979684e+01
##   Antiguo Cuscatlán   4.862190e+00 8.315954e+01 4.334316e+00 7.413115e+01
##   Santa Tecla          4.677220e+00 8.783676e+01 4.169428e+00 7.830058e+01
##   Santa Tecla          4.606550e+00 9.244331e+01 4.106430e+00 8.240701e+01
##   Nahuizalco           4.238750e+00 9.668206e+01 3.778561e+00 8.618557e+01
##   San Vicente          4.031250e+00 1.007133e+02 3.593589e+00 8.977916e+01
##   Antiguo Cuscatlán   3.252270e+00 1.039656e+02 2.899181e+00 9.267834e+01
##   Sonzacate            3.222350e+00 1.071879e+02 2.872509e+00 9.555085e+01
##   Apopa                3.017390e+00 1.102053e+02 2.689801e+00 9.824065e+01
##   Ciudad Arce          1.158100e+00 1.113634e+02 1.032368e+00 9.927302e+01
##   Juayúa              4.934600e-01 1.118569e+02 4.398865e-01 9.971291e+01
##   San Pedro Masahuat   3.193300e-01 1.121762e+02 2.846613e-01 9.999757e+01
##   Concepción de Ataco 2.730000e-03 1.121789e+02 2.433612e-03 1.000000e+02
library(forecast)
data_serie<- ts(df_SV$Distance, frequency=12, start=2007)
head(data_serie)
##          Jan     Feb     Mar     Apr     May     Jun
## 2007 0.00273 4.96416 7.60946 4.86219 5.90726 4.03125

This time series shows the Distance values recorded for El Salvador over the years.

autoplot(data_serie) +
  labs(title = "Landslides", x = "Year", y = "Distance", colour = "#752514") + theme_grey()
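
The rows of `df_SV` are not in chronological order: record 1294 is dated 10/13/14 but precedes record 1366, dated 5/22/14. Plotting `Distance` in row order therefore mixes up the timeline. The following is a minimal sketch of sorting by the parsed date first, assuming the month/day/two-digit-year format shown in the table. The three-row data frame `events` is a hand-copied excerpt, and its name and `events_sorted` are illustrative.

```r
# Three El Salvador records copied from the table above (rows 1294, 1366, 1367).
events <- data.frame(
  City = c("Chirilagua", "San Rafael Oriente", "San Martín"),
  Distance = c(6.94536, 10.06695, 8.82525),
  Date = c("10/13/14", "5/22/14", "4/21/14")
)

# Parse the date strings, then sort by actual event date.
events$Date <- as.Date(events$Date, format = "%m/%d/%y")
events_sorted <- events[order(events$Date), ]
print(events_sorted)

# Distance against the true event date rather than the row position.
plot(events_sorted$Date, events_sorted$Distance, type = "b",
     xlab = "Date", ylab = "Distance", main = "El Salvador (excerpt)")
```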

Honduras

library(readr)
library(knitr)
df_HON <- subset (df, Country == "Honduras")
knitr::kable(head(df_HON))
id Date time continent_code Country country_code State Population City Distance location_description latitude longitude geolocation hazard_type landslide_type landslide_size trigger storm_name injuries fatalities source_name source_link
159 854 10/19/08 NA Honduras HN Copán 4752 Corquín 0.43391 14.5637 -88.8693 (14.563700000000001, -88.869299999999996) Landslide Landslide Large Tropical cyclone Tropical Depression 16 NA 23 http://www.chron.com/disp/story.mpl/ap/world/6068144.html
160 855 10/20/08 NA Honduras HN Francisco Morazán 850848 Tegucigalpa 2.99239 14.1080 -87.2137 (14.108000000000001, -87.213700000000003) Landslide Mudslide Large Tropical cyclone Tropical Depression 16 NA 29 http://in.ibtimes.com/articles/20081021/honduras-landslide-tegucigalpa-victim.htm
376 2062 7/12/10 5:30:00 NA Honduras HN Francisco Morazán 850848 Tegucigalpa 0.98377 14.0831 -87.1978 (14.0831, -87.197800000000001) Landslide Mudslide Medium Downpour NA 1 http://mdn.mainichi.jp/mdnnews/news/20100713p2a00m0na013000c.html
381 2093 7/18/10 NA Honduras HN Francisco Morazán 850848 Tegucigalpa 1.24404 14.0814 -87.1953 (14.0814, -87.195300000000003) Landslide Landslide Medium Downpour NA 0 http://www.insidecostarica.com/dailynews/2010/july/19/centralamerica10071903.htm
406 2217 8/7/10 Overnight NA Honduras HN Francisco Morazán 850848 Tegucigalpa 2.21442 14.0783 -87.2270 (14.0783, -87.227000000000004) Landslide Mudslide Medium Downpour NA 3
435 2358 8/29/10 4:30:00 NA Honduras HN Francisco Morazán 2288 Santa Lucía 4.75791 14.1015 -87.1607 (14.1015, -87.160700000000006) Landslide Rockfall Medium Downpour NA 5
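
Because each country section repeats the same subset-and-preview steps, copy-paste slips are easy to make. The following is a minimal sketch of a helper that ties the subset and its preview to a single argument. The function `preview_country` and the toy data frame `toy` are hypothetical and not part of the original analysis; the three rows reuse values from the tables in this report.

```r
# Toy stand-in for the catalog with only the columns needed here.
toy <- data.frame(
  Country = c("Honduras", "Dominica", "Honduras"),
  City = c("Tegucigalpa", "Roseau", "Yoro"),
  Distance = c(2.99239, 2.59849, 0.31238)
)

# Subset one country and return the first rows of that same subset.
preview_country <- function(data, country, n = 6) {
  rows <- data[data$Country == country, , drop = FALSE]
  if (nrow(rows) == 0) {
    stop("no records found for country: ", country)
  }
  head(rows, n)
}

print(preview_country(toy, "Honduras"))
```

With the full catalog, the call would take the form `knitr::kable(preview_country(df, "Honduras"))`, so the country name is written only once.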
df_HON %>% 
  select(Country, State, City, Distance, Date) 
##       Country               State                       City Distance     Date
## 159  Honduras              Copán                   Corquín  0.43391 10/19/08
## 160  Honduras Francisco Morazán                 Tegucigalpa  2.99239 10/20/08
## 376  Honduras Francisco Morazán                 Tegucigalpa  0.98377  7/12/10
## 381  Honduras Francisco Morazán                 Tegucigalpa  1.24404  7/18/10
## 406  Honduras Francisco Morazán                 Tegucigalpa  2.21442   8/7/10
## 435  Honduras Francisco Morazán                Santa Lucía  4.75791  8/29/10
## 474  Honduras           Comayagua                  El Rancho  4.53362  10/3/10
## 485  Honduras              Colón                     Cusuna 36.37629 10/25/10
## 820  Honduras Francisco Morazán                 Tegucigalpa  1.23639  9/26/11
## 1100 Honduras             Cortés                Los Caminos  3.53737  8/29/13
## 1279 Honduras           Choluteca           Ciudad Choluteca  3.69596   7/2/14
## 1288 Honduras                Yoro                       Yoro  0.31238  5/20/14
## 1363 Honduras          Ocotepeque                    Sinuapa  2.00805 10/13/14
## 1377 Honduras             Cortés           Agua Azul Rancho  0.97057  7/31/14
## 1379 Honduras      Santa Bárbara                   Agualote  2.91594 10/14/14
## 1599 Honduras         El Paraíso                             1.90052   6/9/15
## 1602 Honduras Francisco Morazán                     El Lolo  1.85897  9/28/15
## 1603 Honduras Francisco Morazán                 Tegucigalpa  3.25281  9/28/15
## 1604 Honduras           Choluteca                     Duyure 11.67237  6/11/15
## 1605 Honduras           Choluteca                     Corpus  0.36987 12/15/15
## 1610 Honduras           Comayagua                   El Sauce  7.28575 10/16/15
## 1611 Honduras           Comayagua                La Libertad 17.28613 10/29/15
## 1612 Honduras           Comayagua Concepción de Guasistagua  8.52584 10/16/15
## 1613 Honduras              Copán       Santa Rosa de Copán  0.74414   9/6/15
## 1614 Honduras              Copán       Santa Rosa de Copán  0.28887   9/6/15
## 1615 Honduras              Copán               Ojos de Agua  1.39095 11/21/15
## 1616 Honduras              La Paz                  San José  4.69133  9/25/15
## 1617 Honduras              Copán                    Lucerna  5.89721  9/24/15
## 1618 Honduras          Ocotepeque                   La Labor  5.79867  9/25/15
## 1619 Honduras Francisco Morazán                 Villa Nueva  2.00830  6/13/15
## 1620 Honduras      Santa Bárbara                      Ilama  2.87349  9/28/15
## 1622 Honduras Francisco Morazán                 El Guapinol  3.54399 10/21/15
## 1623 Honduras                Yoro                 La Sarrosa  6.66574  1/22/15
## 1624 Honduras Francisco Morazán                 El Tablón   3.12986  6/13/15
## 1638 Honduras Francisco Morazán                 Tegucigalpa  0.91552  9/28/15
## 1639 Honduras Francisco Morazán                   Yaguacire  1.30583  9/28/15
## 1640 Honduras Francisco Morazán                  Río Abajo  3.63962   9/9/15
## 1641 Honduras Francisco Morazán                 Tegucigalpa  2.91326  9/20/15
head(df_HON)
##       id     Date      time continent_code  Country country_code
## 159  854 10/19/08                     <NA> Honduras           HN
## 160  855 10/20/08                     <NA> Honduras           HN
## 376 2062  7/12/10   5:30:00           <NA> Honduras           HN
## 381 2093  7/18/10                     <NA> Honduras           HN
## 406 2217   8/7/10 Overnight           <NA> Honduras           HN
## 435 2358  8/29/10   4:30:00           <NA> Honduras           HN
##                   State Population         City Distance location_description
## 159              Copán       4752     Corquín  0.43391                     
## 160 Francisco Morazán      850848  Tegucigalpa  2.99239                     
## 376 Francisco Morazán      850848  Tegucigalpa  0.98377                     
## 381 Francisco Morazán      850848  Tegucigalpa  1.24404                     
## 406 Francisco Morazán      850848  Tegucigalpa  2.21442                     
## 435 Francisco Morazán        2288 Santa Lucía  4.75791                     
##     latitude longitude                               geolocation hazard_type
## 159  14.5637  -88.8693 (14.563700000000001, -88.869299999999996)   Landslide
## 160  14.1080  -87.2137 (14.108000000000001, -87.213700000000003)   Landslide
## 376  14.0831  -87.1978            (14.0831, -87.197800000000001)   Landslide
## 381  14.0814  -87.1953            (14.0814, -87.195300000000003)   Landslide
## 406  14.0783  -87.2270            (14.0783, -87.227000000000004)   Landslide
## 435  14.1015  -87.1607            (14.1015, -87.160700000000006)   Landslide
##     landslide_type landslide_size          trigger             storm_name
## 159      Landslide          Large Tropical cyclone Tropical Depression 16
## 160       Mudslide          Large Tropical cyclone Tropical Depression 16
## 376       Mudslide         Medium         Downpour                       
## 381      Landslide         Medium         Downpour                       
## 406       Mudslide         Medium         Downpour                       
## 435       Rockfall         Medium         Downpour                       
##     injuries fatalities source_name
## 159       NA         23            
## 160       NA         29            
## 376       NA          1            
## 381       NA          0            
## 406       NA          3            
## 435       NA          5            
##                                                                           source_link
## 159                         http://www.chron.com/disp/story.mpl/ap/world/6068144.html
## 160 http://in.ibtimes.com/articles/20081021/honduras-landslide-tegucigalpa-victim.htm
## 376                 http://mdn.mainichi.jp/mdnnews/news/20100713p2a00m0na013000c.html
## 381  http://www.insidecostarica.com/dailynews/2010/july/19/centralamerica10071903.htm
## 406                                                                                  
## 435

Pareto chart, used to show the cities with the largest landslide Distance values.

library(qcc)

Distance <- df_HON$Distance
names(Distance) <- df_HON$City 

pareto.chart(Distance, 
             ylab = "Distance",
             col = heat.colors(length(Distance)),
             cumperc = seq(0, 100, by = 10),
             ylab2 = "Cumulative percentage",
             main = "Cities with the largest landslide distances"
)

##                             
## Pareto chart analysis for Distance
##                                Frequency   Cum.Freq.  Percentage Cum.Percent.
##   Cusuna                      36.3762900  36.3762900  21.8907391   21.8907391
##   La Libertad                 17.2861300  53.6624200  10.4025496   32.2932888
##   Duyure                      11.6723700  65.3347900   7.0242679   39.3175567
##   Concepción de Guasistagua   8.5258400  73.8606300   5.1307305   44.4482872
##   El Sauce                     7.2857500  81.1463800   4.3844618   48.8327489
##   La Sarrosa                   6.6657400  87.8121200   4.0113485   52.8440974
##   Lucerna                      5.8972100  93.7093300   3.5488579   56.3929554
##   La Labor                     5.7986700  99.5080000   3.4895580   59.8825133
##   Santa Lucía                 4.7579100 104.2659100   2.8632432   62.7457566
##   San José                    4.6913300 108.9572400   2.8231763   65.5689329
##   El Rancho                    4.5336200 113.4908600   2.7282687   68.2972016
##   Ciudad Choluteca             3.6959600 117.1868200   2.2241767   70.5213783
##   Río Abajo                   3.6396200 120.8264400   2.1902721   72.7116504
##   El Guapinol                  3.5439900 124.3704300   2.1327233   74.8443736
##   Los Caminos                  3.5373700 127.9078000   2.1287395   76.9731131
##   Tegucigalpa                  3.2528100 131.1606100   1.9574953   78.9306084
##   El Tablón                   3.1298600 134.2904700   1.8835057   80.8141140
##   Tegucigalpa                  2.9923900 137.2828600   1.8007782   82.6148922
##   Agualote                     2.9159400 140.1988000   1.7547716   84.3696639
##   Tegucigalpa                  2.9132600 143.1120600   1.7531588   86.1228227
##   Ilama                        2.8734900 145.9855500   1.7292258   87.8520485
##   Tegucigalpa                  2.2144200 148.1999700   1.3326068   89.1846553
##   Villa Nueva                  2.0083000 150.2082700   1.2085667   90.3932220
##   Sinuapa                      2.0080500 152.2163200   1.2084162   91.6016382
##                                1.9005200 154.1168400   1.1437062   92.7453444
##   El Lolo                      1.8589700 155.9758100   1.1187020   93.8640463
##   Ojos de Agua                 1.3909500 157.3667600   0.8370541   94.7011005
##   Yaguacire                    1.3058300 158.6725900   0.7858301   95.4869306
##   Tegucigalpa                  1.2440400 159.9166300   0.7486458   96.2355763
##   Tegucigalpa                  1.2363900 161.1530200   0.7440421   96.9796184
##   Tegucigalpa                  0.9837700 162.1367900   0.5920189   97.5716373
##   Agua Azul Rancho             0.9705700 163.1073600   0.5840754   98.1557127
##   Tegucigalpa                  0.9155200 164.0228800   0.5509470   98.7066598
##   Santa Rosa de Copán         0.7441400 164.7670200   0.4478130   99.1544727
##   Corquín                     0.4339100 165.2009300   0.2611209   99.4155937
##   Corpus                       0.3698700 165.5708000   0.2225826   99.6381762
##   Yoro                         0.3123800 165.8831800   0.1879859   99.8261621
##   Santa Rosa de Copán         0.2888700 166.1720500   0.1738379  100.0000000
library(forecast)
data_serie<- ts(df_HON$Distance, frequency=12, start=2007)
head(data_serie)
##          Jan     Feb     Mar     Apr     May     Jun
## 2007 0.43391 2.99239 0.98377 1.24404 2.21442 4.75791

This time series shows the Distance values recorded for Honduras over the years.

autoplot(data_serie) +
  labs(title = "Landslides", x = "Year", y = "Distance", colour = "#752514") + theme_grey()

Puerto Rico

library(readr)
library(knitr)
df_PR <- subset (df, Country == "Puerto Rico")
knitr::kable(head(df_PR))
id Date time continent_code Country country_code State Population City Distance location_description latitude longitude geolocation hazard_type landslide_type landslide_size trigger storm_name injuries fatalities source_name source_link
68 393 12/12/07 NA Puerto Rico PR San Juan 418140 San Juan 6.91777 18.4320 -66.0510 (18.431999999999999, -66.051000000000002) Landslide Landslide Medium Tropical cyclone Tropical Storm Olga NA NA AP.google.com http://ap.google.com/article/ALeqM5gVWjsPEiqe1tEu2mhBIRaxxGi8owD8TFVR600
477 2550 10/6/10 NA Puerto Rico PR Orocovis 944 Orocovis 6.85760 18.1652 -66.3969 (18.165199999999999, -66.396900000000002) Landslide Complex Medium Tropical cyclone Tropical Storm Otto NA 0 http://www.whitehouse.gov/the-press-office/2010/10/26/president-obama-signs-puerto-rico-disaster-declaration
1396 6708 5/18/14 16:30 NA Puerto Rico PR Vega Alta 12036 Vega Alta 3.49090 Mine construction 18.3806 -66.3319 (18.380600000000001, -66.331900000000005) Landslide Other Small Rain 0 0 Telemundo http://www.telemundopr.com/telenoticias/puerto-rico/Deslizamiento-deja-a-familias-incomunicadas-en-Vega-Alta-258522361.html
1397 6709 9/24/14 NA Puerto Rico PR Aguada 4040 Aguada 1.40257 Unknown 18.3711 -67.1782 (18.371099999999998, -67.178200000000004) Landslide Landslide Medium Downpour 0 0 Telemundo http://www.telemundopr.com/telenoticias/puerto-rico/Viviendas-inhabitables-luego-de-deslizamiento-de-tierras-en-Aguada-277123031.html
1398 6710 8/24/14 3:00 NA Puerto Rico PR Ponce 5080 Adjuntas 5.78872 Unknown 18.1283 -66.6810 (18.128299999999999, -66.680999999999997) Landslide Landslide Small Downpour 0 0 Perla del Sur http://www.periodicolaperla.com/index.php?option=com_content&view=article&id=6371:surgen-nuevos-deslizamientos-en-ponce&catid=135:actualidad-del-sur&Itemid=423
1399 6711 8/24/14 NA Puerto Rico PR Ponce 5080 Adjuntas 6.89036 Unknown 18.1254 -66.6700 (18.125399999999999, -66.67) Landslide Landslide Medium Downpour 0 0 Perla del Sur http://www.periodicolaperla.com/index.php?option=com_content&view=article&id=6371:surgen-nuevos-deslizamientos-en-ponce&catid=135:actualidad-del-sur&Itemid=423
df_PR %>% 
  select(Country, State, City, Distance, Date) 
##          Country     State      City Distance     Date
## 68   Puerto Rico  San Juan  San Juan  6.91777 12/12/07
## 477  Puerto Rico  Orocovis  Orocovis  6.85760  10/6/10
## 1396 Puerto Rico Vega Alta Vega Alta  3.49090  5/18/14
## 1397 Puerto Rico    Aguada    Aguada  1.40257  9/24/14
## 1398 Puerto Rico     Ponce  Adjuntas  5.78872  8/24/14
## 1399 Puerto Rico     Ponce  Adjuntas  6.89036  8/24/14
## 1400 Puerto Rico  Villalba  Villalba  3.65535  11/7/14
head(df_PR)
##        id     Date  time continent_code     Country country_code     State
## 68    393 12/12/07                 <NA> Puerto Rico           PR  San Juan
## 477  2550  10/6/10                 <NA> Puerto Rico           PR  Orocovis
## 1396 6708  5/18/14 16:30           <NA> Puerto Rico           PR Vega Alta
## 1397 6709  9/24/14                 <NA> Puerto Rico           PR    Aguada
## 1398 6710  8/24/14  3:00           <NA> Puerto Rico           PR     Ponce
## 1399 6711  8/24/14                 <NA> Puerto Rico           PR     Ponce
##      Population      City Distance location_description latitude longitude
## 68       418140  San Juan  6.91777                       18.4320  -66.0510
## 477         944  Orocovis  6.85760                       18.1652  -66.3969
## 1396      12036 Vega Alta  3.49090    Mine construction  18.3806  -66.3319
## 1397       4040    Aguada  1.40257              Unknown  18.3711  -67.1782
## 1398       5080  Adjuntas  5.78872              Unknown  18.1283  -66.6810
## 1399       5080  Adjuntas  6.89036              Unknown  18.1254  -66.6700
##                                    geolocation hazard_type landslide_type
## 68   (18.431999999999999, -66.051000000000002)   Landslide      Landslide
## 477  (18.165199999999999, -66.396900000000002)   Landslide        Complex
## 1396 (18.380600000000001, -66.331900000000005)   Landslide          Other
## 1397 (18.371099999999998, -67.178200000000004)   Landslide      Landslide
## 1398 (18.128299999999999, -66.680999999999997)   Landslide      Landslide
## 1399              (18.125399999999999, -66.67)   Landslide      Landslide
##      landslide_size          trigger          storm_name injuries fatalities
## 68           Medium Tropical cyclone Tropical Storm Olga       NA         NA
## 477          Medium Tropical cyclone Tropical Storm Otto       NA          0
## 1396          Small             Rain                            0          0
## 1397         Medium         Downpour                            0          0
## 1398          Small         Downpour                            0          0
## 1399         Medium         Downpour                            0          0
##        source_name
## 68   AP.google.com
## 477               
## 1396     Telemundo
## 1397     Telemundo
## 1398 Perla del Sur
## 1399 Perla del Sur
##                                                                                                                                                          source_link
## 68                                                                                          http://ap.google.com/article/ALeqM5gVWjsPEiqe1tEu2mhBIRaxxGi8owD8TFVR600
## 477                                                     http://www.whitehouse.gov/the-press-office/2010/10/26/president-obama-signs-puerto-rico-disaster-declaration
## 1396                                     http://www.telemundopr.com/telenoticias/puerto-rico/Deslizamiento-deja-a-familias-incomunicadas-en-Vega-Alta-258522361.html
## 1397                           http://www.telemundopr.com/telenoticias/puerto-rico/Viviendas-inhabitables-luego-de-deslizamiento-de-tierras-en-Aguada-277123031.html
## 1398 http://www.periodicolaperla.com/index.php?option=com_content&view=article&id=6371:surgen-nuevos-deslizamientos-en-ponce&catid=135:actualidad-del-sur&Itemid=423
## 1399 http://www.periodicolaperla.com/index.php?option=com_content&view=article&id=6371:surgen-nuevos-deslizamientos-en-ponce&catid=135:actualidad-del-sur&Itemid=423

Few records are available for Puerto Rico, so the most informative comparison is the distance recorded for each city.

ggplot(data=df_PR, aes(x=City, y=Distance)) + geom_bar(stat="identity", color="aquamarine2", fill="aquamarine2")
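
Because Adjuntas appears in two records, geom_bar(stat = "identity") stacks both distances into a single bar. The sketch below does the same aggregation explicitly in base R, using the City and Distance values from the listing above, to show what each bar height represents. The object names pr and per_city are chosen here for illustration.

```r
# City and Distance values copied from the Puerto Rico listing.
pr <- data.frame(
  City     = c("San Juan", "Orocovis", "Vega Alta", "Aguada",
               "Adjuntas", "Adjuntas", "Villalba"),
  Distance = c(6.91777, 6.85760, 3.49090, 1.40257, 5.78872, 6.89036, 3.65535)
)

# Sum the distances per city; this matches the stacked bar heights.
per_city <- aggregate(Distance ~ City, data = pr, FUN = sum)

# Order from largest to smallest total.
per_city <- per_city[order(-per_city$Distance), ]
print(per_city)
```

Read this way, the tallest bar belongs to Adjuntas (two records combined), while San Juan has the largest single record.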

Pie chart

Each record's share of the total distance recorded for the country is computed and shown by city.

library(ggplot2)
library(dplyr)

df_PR <- df_PR %>% 
  arrange(desc(City)) %>%
  mutate(prop = Distance / sum(Distance) * 100) %>%   # share of the total distance, in percent
  mutate(ypos = cumsum(prop) - 0.5 * prop)            # midpoint of each slice, used to place labels
library(scales)
ggplot(df_PR, aes(x = "Puerto Rico", y = prop, fill = City)) +
  geom_bar(stat = "identity", width = 1, color = "white") +
  coord_polar("y", start = 0) +
  theme_void() + 
  theme(legend.position = "none") +
  # label each slice with its share (prop), not with the raw Distance value
  geom_text(aes(y = ypos, label = percent(prop / 100)), color = "white", size = 6) +
  scale_fill_brewer(palette = "Set2") +
  labs(title = "Distance")

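From this chart it follows that San Juan has the largest single recorded distance in the country, 6.918, which is about 19.8% of the total of 35.003 across the seven records. Adjuntas, which appears twice, has the largest combined distance (12.679, about 36.2%).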
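
The percentages behind the pie chart can be checked numerically. The sketch below recomputes each record's share of the total distance and the label position passed to geom_text, with the Distance values taken from the Puerto Rico listing. The vector names city, distance, and labels are illustrative.

```r
# Values from the Puerto Rico listing, one entry per record.
city     <- c("San Juan", "Orocovis", "Vega Alta", "Aguada",
              "Adjuntas", "Adjuntas", "Villalba")
distance <- c(6.91777, 6.85760, 3.49090, 1.40257, 5.78872, 6.89036, 3.65535)

# Share of the total, in percent (the prop column of the chart code).
prop <- distance / sum(distance) * 100

# Midpoint of each slice along the cumulative scale (the ypos column).
ypos <- cumsum(prop) - 0.5 * prop

# Percentage labels with one decimal place.
labels <- sprintf("%.1f%%", prop)

print(data.frame(city = city, prop = round(prop, 2), ypos = round(ypos, 2), label = labels))
```

The shares add up to 100, which is a quick sanity check that the labels describe proportions rather than raw distances.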

Trinidad and Tobago

library(readr)
library(knitr)
df_TNT <- subset (df, Country == "Trinidad and Tobago")
knitr::kable(head(df_TNT))
id Date time continent_code Country country_code State Population City Distance location_description latitude longitude geolocation hazard_type landslide_type landslide_size trigger storm_name injuries fatalities source_name source_link
30 224 9/1/07 NA Trinidad and Tobago TT Tobago 17000 Scarborough 9.11607 11.2415 -60.6742 (11.2415, -60.674199999999999) Landslide Landslide Medium Tropical cyclone Hurricane Felix NA NA Trinadad Express http://www.trinidadexpress.com/index.pl/article_news?id=161197580
61 357 11/17/07 NA Trinidad and Tobago TT Eastern Tobago 0 Roxborough 7.33295 11.2965 -60.6312 (11.2965, -60.6312) Landslide Landslide Medium Rain NA NA Trinadad Express http://www.trinidadexpress.com/index.pl/article_news?id=161237574
65 390 12/11/07 NA Trinidad and Tobago TT Sangre Grande 15968 Sangre Grande 29.28864 10.8410 -61.0550 (10.840999999999999, -61.055) Landslide Landslide Medium Tropical cyclone Tropical Storm Olga NA 3 Trinidad and Tobago’s Newsday http://www.newsday.co.tt/news/0,69681.html
66 391 12/11/07 NA Trinidad and Tobago TT Eastern Tobago 0 Roxborough 8.62938 11.3000 -60.6440 (11.3, -60.643999999999998) Landslide Landslide Medium Tropical cyclone Tropical Storm Olga NA NA Trinidad and Tobago’s Newsday http://www.newsday.co.tt/news/0,69681.html
67 392 12/11/07 NA Trinidad and Tobago TT Eastern Tobago 0 Roxborough 2.66802 11.2670 -60.5660 (11.266999999999999, -60.566000000000003) Landslide Landslide Small Tropical cyclone Tropical Storm Olga NA NA Trinidad and Tobago’s Newsday http://www.newsday.co.tt/news/0,69681.html
149 780 9/7/08 NA Trinidad and Tobago TT Diego Martin 8140 Petit Valley 10.61854 10.7603 -61.4578 (10.760300000000001, -61.457799999999999) Landslide Landslide Medium Downpour NA NA http://www.newsday.co.tt/news/0,85847.html

Bar chart

The population of each city in Trinidad and Tobago is compared.

library(ggplot2)
library(dplyr)
ggplot(data=df_TNT, aes(fill=City, x="Trinidad and Tobago", y=Population)) +
  geom_bar(position="dodge", stat="identity")

It is worth noting that the division shown in the chart corresponds to an unknown value.
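
One plausible reading of the unknown value is the Roxborough rows, whose Population is recorded as 0. The following is a minimal sketch of how such entries could be marked as missing before plotting. Treating 0 as "not recorded" is an assumption made here, not something the catalog states, and the rows are the six shown in the table above.

```r
# City and Population from the six Trinidad and Tobago rows shown above.
tnt <- data.frame(
  City       = c("Scarborough", "Roxborough", "Sangre Grande",
                 "Roxborough", "Roxborough", "Petit Valley"),
  Population = c(17000, 0, 15968, 0, 0, 8140)
)

# Keep one row per city so repeated records are not drawn several times.
cities <- unique(tnt)

# Assumption: a population of 0 means "not recorded", so mark it as NA.
cities$Population[cities$Population == 0] <- NA

print(cities)
```

With the value stored as NA, a plotting call can drop or flag the city explicitly instead of drawing a zero-height bar.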

Nicaragua

library(readr)
library(knitr)
df_NIC <- subset (df, Country == "Nicaragua")
knitr::kable(head(df_NIC))
id Date time continent_code Country country_code State Population City Distance location_description latitude longitude geolocation hazard_type landslide_type landslide_size trigger storm_name injuries fatalities source_name source_link
33 229 9/4/07 NA Nicaragua NI Atlántico Norte 6315 Bonanza 54.90196 13.6670 -84.2435 (13.667, -84.243499999999997) Landslide Complex Medium Tropical cyclone Hurricane Felix NA NA United Nations Development Programme - Relief Web http://www.reliefweb.int/
151 826 10/3/08 NA Nicaragua NI Masaya 5182 Tisma 14.49301 12.1200 -85.8900 (12.12, -85.89) Landslide Landslide Medium Downpour NA 9 CBC http://www.cbc.ca/world/story/2008/10/04/nicaragua-flooding.html
420 2289 8/20/10 NA Nicaragua NI Managua 16469 El Crucero 5.84054 12.0420 -86.2998 (12.042, -86.299800000000005) Landslide Mudslide Medium Downpour NA 3
424 2330 8/25/10 NA Nicaragua NI Jinotega 2367 San José de Bocay 1.36745 13.5317 -85.5325 (13.531700000000001, -85.532499999999999) Landslide Landslide Medium Downpour NA NA
1261 6089 6/23/14 NA Nicaragua NI Chontales 5827 Santo Domingo 31.14242 Unknown 12.3535 -84.8095 (12.3535, -84.8095) Landslide Landslide Small Continuous rain 0 0 Wilfried Strauch
1262 6090 6/23/14 NA Nicaragua NI Chontales 5827 Santo Domingo 31.24511 Unknown 12.3521 -84.8080 (12.3521, -84.808000000000007) Landslide Landslide Medium Continuous rain 0 0 Wilfried Strauch
df_NIC %>% 
  select(Country, State, City, Distance, Date) 
##        Country            State                City Distance     Date
## 33   Nicaragua Atlántico Norte             Bonanza 54.90196   9/4/07
## 151  Nicaragua           Masaya               Tisma 14.49301  10/3/08
## 420  Nicaragua          Managua          El Crucero  5.84054  8/20/10
## 424  Nicaragua         Jinotega  San José de Bocay  1.36745  8/25/10
## 1261 Nicaragua        Chontales       Santo Domingo 31.14242  6/23/14
## 1262 Nicaragua        Chontales       Santo Domingo 31.24511  6/23/14
## 1263 Nicaragua        Chontales       Santo Domingo 31.37360  6/23/14
## 1264 Nicaragua        Chontales       Santo Domingo 31.10125  6/23/14
## 1265 Nicaragua        Chontales       Santo Domingo 30.99704  6/23/14
## 1266 Nicaragua        Chontales       Santo Domingo 30.77070  6/23/14
## 1267 Nicaragua        Chontales       Santo Domingo 30.27546  6/23/14
## 1268 Nicaragua        Chontales       Santo Domingo 29.95253  6/23/14
## 1269 Nicaragua        Chontales       Santo Domingo 29.92927  6/23/14
## 1270 Nicaragua        Chontales       Santo Domingo 28.90294  6/23/14
## 1271 Nicaragua        Chontales       Santo Domingo 32.69694  6/23/14
## 1272 Nicaragua        Chontales       Santo Domingo 32.96402  6/23/14
## 1273 Nicaragua        Chontales       Santo Domingo 32.77401  6/23/14
## 1274 Nicaragua        Chontales       Santo Domingo 29.94574  6/23/14
## 1299 Nicaragua          Managua      Ciudad Sandino  5.59574 10/16/14
## 1321 Nicaragua       Ogun State             Bonanza  0.37593  8/28/14
## 1380 Nicaragua            Rivas          Altagracia  1.97784 11/23/14
## 1381 Nicaragua            Rivas          Altagracia  5.77119  10/9/14
## 1382 Nicaragua    Río San Juan          San Carlos  0.67752  10/9/14
## 1626 Nicaragua         Jinotega             Wiwilí 25.81514  10/8/15
## 1627 Nicaragua         Jinotega            Jinotega  2.44880  2/19/16
## 1631 Nicaragua           Madriz         Las Sabanas  7.21108  9/27/15
## 1632 Nicaragua           Madriz         Las Sabanas  4.86364  9/27/15
## 1633 Nicaragua          Managua           Terrabona 18.92056  6/12/15
## 1634 Nicaragua       Ogun State             Bonanza 10.61568  6/10/15
## 1636 Nicaragua       Ogun State               Siuna  1.68056  7/23/15
## 1637 Nicaragua           Masaya San Juan de Oriente  1.56730  5/13/15
head(df_NIC)
##        id    Date time continent_code   Country country_code            State
## 33    229  9/4/07                <NA> Nicaragua           NI Atlántico Norte
## 151   826 10/3/08                <NA> Nicaragua           NI           Masaya
## 420  2289 8/20/10                <NA> Nicaragua           NI          Managua
## 424  2330 8/25/10                <NA> Nicaragua           NI         Jinotega
## 1261 6089 6/23/14                <NA> Nicaragua           NI        Chontales
## 1262 6090 6/23/14                <NA> Nicaragua           NI        Chontales
##      Population               City Distance location_description latitude
## 33         6315            Bonanza 54.90196                       13.6670
## 151        5182              Tisma 14.49301                       12.1200
## 420       16469         El Crucero  5.84054                       12.0420
## 424        2367 San José de Bocay  1.36745                       13.5317
## 1261       5827      Santo Domingo 31.14242              Unknown  12.3535
## 1262       5827      Santo Domingo 31.24511              Unknown  12.3521
##      longitude                               geolocation hazard_type
## 33    -84.2435             (13.667, -84.243499999999997)   Landslide
## 151   -85.8900                           (12.12, -85.89)   Landslide
## 420   -86.2998             (12.042, -86.299800000000005)   Landslide
## 424   -85.5325 (13.531700000000001, -85.532499999999999)   Landslide
## 1261  -84.8095                       (12.3535, -84.8095)   Landslide
## 1262  -84.8080            (12.3521, -84.808000000000007)   Landslide
##      landslide_type landslide_size          trigger      storm_name injuries
## 33          Complex         Medium Tropical cyclone Hurricane Felix       NA
## 151       Landslide         Medium         Downpour                       NA
## 420        Mudslide         Medium         Downpour                       NA
## 424       Landslide         Medium         Downpour                       NA
## 1261      Landslide          Small  Continuous rain                        0
## 1262      Landslide         Medium  Continuous rain                        0
##      fatalities                                       source_name
## 33           NA United Nations Development Programme - Relief Web
## 151           9                                               CBC
## 420           3                                                  
## 424          NA                                                  
## 1261          0                                  Wilfried Strauch
## 1262          0                                  Wilfried Strauch
##                                                           source_link
## 33                                          http://www.reliefweb.int/
## 151  http://www.cbc.ca/world/story/2008/10/04/nicaragua-flooding.html
## 420                                                                  
## 424                                                                  
## 1261                                                                 
## 1262

Bar chart

Population is related to each state/province of Nicaragua.

ggplot(data=df_NIC, aes(x=State, y=Population)) + geom_bar(stat="identity", color="aquamarine4", fill="aquamarine4")
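
geom_bar(stat = "identity") stacks one segment per row, so a state with many records accumulates its Population value once per record. Counting the records per State makes this visible. The sketch below uses the State column from the 31-row listing above. The vector name state is illustrative, and "Ogun State" is kept exactly as it appears in the data.

```r
# State of each Nicaragua record, in the order of the listing above.
state <- c("Atlántico Norte", "Masaya", "Managua", "Jinotega",
           rep("Chontales", 14),
           "Managua", "Ogun State", "Rivas", "Rivas", "Río San Juan",
           "Jinotega", "Jinotega", "Madriz", "Madriz", "Managua",
           "Ogun State", "Ogun State", "Masaya")

# Number of records per state, most frequent first.
records <- sort(table(state), decreasing = TRUE)
print(records)
```

Chontales contributes 14 of the 31 records, so the height of its bar is driven largely by the number of reports.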

Frequency table

Population is set against the number of deaths attributed to environmental factors:

City (Nicaragua)      Population  Fatalities
Bonanza                     6315           7
Bonanza                     6315          38
Tisma                       5182           9
El Crucero                 16469           3
San José de Bocay           2367           0
Santo Domingo               5827           0
Ciudad Sandino             70013           9
Altagracia                  2771           1
San Carlos                 13451           0
Wiwilí                      6955           0
Jinotega                   51073           0
Las Sabanas                 1257           0
Terrabona                   1902           0
Siuna                      16056           0
San Juan de Oriente         2111           0

In this case, Bonanza is the city with the most deaths (38) attributed to environmental factors.
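
Bonanza appears twice in the table, so its per-city total is larger than its single highest entry. The following sketch, using only the city names and fatality counts listed above, totals the deaths per city and picks the maximum. The names fat, totals, and top are illustrative.

```r
# City and fatality counts as listed in the table above.
fat <- data.frame(
  City = c("Bonanza", "Bonanza", "Tisma", "El Crucero", "San José de Bocay",
           "Santo Domingo", "Ciudad Sandino", "Altagracia", "San Carlos",
           "Wiwilí", "Jinotega", "Las Sabanas", "Terrabona", "Siuna",
           "San Juan de Oriente"),
  Fatalities = c(7, 38, 9, 3, 0, 0, 9, 1, 0, 0, 0, 0, 0, 0, 0)
)

# Total fatalities per city, then the city with the highest total.
totals <- aggregate(Fatalities ~ City, data = fat, FUN = sum)
top <- totals[which.max(totals$Fatalities), ]
print(top)
```

With these figures, Bonanza totals 45 deaths across its two entries, 38 of them in a single entry.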

Conclusion

Based on the results and charts presented, RStudio is an application that simplifies carrying out statistical procedures and calculations. Working with its features also provided basic tools for applying the theoretical concepts already discussed in class.

Bar charts make it possible to relate both population and distance to the corresponding city. Likewise, pie charts make it possible to relate each city's share to its state/province.

These data were also useful for building frequency tables and box-and-whisker plots, which gave a broader view not only of population and landslide distance but also of fatalities, environmental phenomena, and the triggers associated with them.

What was learned here will be useful regardless of the intended use of R, since these are fundamental concepts that open the way to more complex and advanced ones.