Graded Practice 2
library(htmltab)
link='https://en.wikipedia.org/wiki/List_of_Wimbledon_gentlemen%27s_singles_champions'
linktabla= '//*[@id="mw-content-text"]/div/table[4]'
hombres=htmltab(doc=link, which=linktabla)
names(hombres)
## [1] "Year" "Country" "Champion"
## [4] "Country" "Runner-up" "Score in the final"
hombres=hombres[c(2)]
names(hombres)
## [1] "Country"
str(hombres)
## 'data.frame': 52 obs. of 1 variable:
## $ Country: chr " AUS" " AUS" " AUS" " AUS" ...
The values are character strings (chr); note that each country code carries a leading space.
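The `str()` output shows every code with leading whitespace (" AUS"), and the same padding reappears later in the result of `Mode()`. A minimal sketch of stripping it with base R's `trimws()`, using a small illustrative vector rather than the scraped table (the name `country_raw` is hypothetical). If the padding in the real table is a non-breaking space rather than a plain space, the `whitespace` argument of `trimws()` may need adjusting.

```r
# Illustrative values copied from the str() output above (not the scraped table)
country_raw <- c(" AUS", " AUS", " USA", " TCH")

# trimws() removes leading and trailing whitespace from each string
country_clean <- trimws(country_raw)

print(country_clean)
```

Applied to the real data, the same call would be `hombres$Country <- trimws(hombres$Country)`.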
hombres
## Country
## 2 AUS
## 3 AUS
## 4 AUS
## 5 AUS
## 6 USA
## 7 TCH
## 8 USA
## 9 USA
## 10 SWE
## 11 SWE
## 12 SWE
## 13 SWE
## 14 SWE
## 15 USA
## 16 USA
## 17 USA
## 18 USA
## 19 FRG
## 20 FRG
## 21 AUS
## 22 SWE
## 23 FRG
## 24 SWE
## 25 GER
## 26 USA
## 27 USA
## 28 USA
## 29 USA
## 30 NED
## 31 USA
## 32 USA
## 33 USA
## 34 USA
## 35 CRO
## 36 AUS
## 37 SUI
## 38 SUI
## 39 SUI
## 40 SUI
## 41 SUI
## 42 ESP
## 43 SUI
## 44 ESP
## 45 SRB
## 46 SUI
## 47 GBR
## 48 SRB
## 49 SRB
## 50 GBR
## 51 SUI
## 52 SRB
## 53 SRB
head(hombres)
## Country
## 2 AUS
## 3 AUS
## 4 AUS
## 5 AUS
## 6 USA
## 7 TCH
Frequency table
library(questionr)
library(magrittr)
Nom=freq(hombres$Country,total = F,sort = 'dec',exclude = c(NA)) %>% data.frame()
Nom=data.frame(variable=row.names(Nom),Nom,row.names = NULL)
Nom
## variable n X.
## 1 USA 15 28.8
## 2 SUI 8 15.4
## 3 SWE 7 13.5
## 4 AUS 6 11.5
## 5 SRB 5 9.6
## 6 FRG 3 5.8
## 7 ESP 2 3.8
## 8 GBR 2 3.8
## 9 CRO 1 1.9
## 10 GER 1 1.9
## 11 NED 1 1.9
## 12 TCH 1 1.9
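For comparison, the same table can be approximated with base R only. This is a sketch under stated assumptions: the counts are copied from the output above instead of being scraped, and the names `counts`, `country` and `freq_table` are illustrative.

```r
# Counts copied from the frequency table above
counts <- c(USA = 15, SUI = 8, SWE = 7, AUS = 6, SRB = 5, FRG = 3,
            ESP = 2, GBR = 2, CRO = 1, GER = 1, NED = 1, TCH = 1)

# Rebuild one country code per champion row
country <- rep(names(counts), times = counts)

# Tabulate and sort in decreasing order of frequency
tab <- sort(table(country), decreasing = TRUE)

# Assemble a data frame with counts and percentages
freq_table <- data.frame(
  variable = names(tab),
  n = as.integer(tab),
  pct = round(100 * as.numeric(prop.table(tab)), 1)
)
print(freq_table)
```

The `pct` column plays the role of the `X.` column that `data.frame()` produces from the `%` column of `questionr::freq()`.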
Plots
library(ggplot2)
base= ggplot(data=Nom, aes(x=variable, y=n))
bar1=base+geom_bar(stat='identity')
bar1
Order (bars sorted by decreasing frequency):
bar1= bar1 + scale_x_discrete(limits=Nom$variable)
bar1
Nominal data
Titles
text1="Open Era - Men"
text2="Country"
text3="Count"
text4="Source: Wikipedia"
bar2= bar1 + labs(title=text1,
x =text2,
y = text3,
caption = text4)
bar2
library(qcc)
## Package 'qcc' version 2.7
## Type 'citation("qcc")' for citing this R package in publications.
pareto.chart(table(hombres$Country), cumperc = c(0,50, 80, 100))
##
## Pareto chart analysis for table(hombres$Country)
## Frequency Cum.Freq. Percentage Cum.Percent.
## USA 15.000000 15.000000 28.846154 28.846154
## SUI 8.000000 23.000000 15.384615 44.230769
## SWE 7.000000 30.000000 13.461538 57.692308
## AUS 6.000000 36.000000 11.538462 69.230769
## SRB 5.000000 41.000000 9.615385 78.846154
## FRG 3.000000 44.000000 5.769231 84.615385
## ESP 2.000000 46.000000 3.846154 88.461538
## GBR 2.000000 48.000000 3.846154 92.307692
## CRO 1.000000 49.000000 1.923077 94.230769
## GER 1.000000 50.000000 1.923077 96.153846
## NED 1.000000 51.000000 1.923077 98.076923
## TCH 1.000000 52.000000 1.923077 100.000000
Roughly 80% of the titles (78.8%) are shared among five countries: USA, SUI, SWE, AUS and SRB.
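The cumulative percentages behind this reading can be checked without `qcc`. The sketch below assumes the counts from the Pareto output above and uses illustrative names (`counts`, `cum_pct`, `first_over_80`). It shows that the five listed countries reach 78.8%, and that the 80% line is only crossed once FRG is included.

```r
# Counts copied from the Pareto output above, already in decreasing order
counts <- c(USA = 15, SUI = 8, SWE = 7, AUS = 6, SRB = 5, FRG = 3,
            ESP = 2, GBR = 2, CRO = 1, GER = 1, NED = 1, TCH = 1)

# Cumulative share of titles, in percent
cum_pct <- 100 * cumsum(counts) / sum(counts)

# First country at which the cumulative share reaches 80%
first_over_80 <- names(cum_pct)[which(cum_pct >= 80)[1]]

print(round(cum_pct, 1))
print(first_over_80)
```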
STATISTICAL CALCULATIONS
library(DescTools)
Mode(hombres$Country)
## [1] " USA"
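The mode can also be obtained in base R as the most frequent category of a table. This is a sketch with counts copied from the frequency table and illustrative names. Note that `which.max()` returns only the first maximum, so it would hide ties, which may not match how `DescTools::Mode()` handles them.

```r
# Counts copied from the frequency table above
counts <- c(USA = 15, SUI = 8, SWE = 7, AUS = 6, SRB = 5, FRG = 3,
            ESP = 2, GBR = 2, CRO = 1, GER = 1, NED = 1, TCH = 1)

# One country code per champion row
country <- rep(names(counts), times = counts)

# Most frequent category (first one only, if there are ties)
mode_value <- names(which.max(table(country)))
print(mode_value)
```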
Concentration: Herfindahl-Hirschman
dataTable=table(hombres$Country)
Herfindahl(dataTable)
## [1] 0.1553254
Because the index (0.155) is above 0.15, the distribution is concentrated enough to have a meaningful mode.
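The index is the sum of squared shares, so the value above can be reproduced by hand. A minimal sketch, assuming the counts from the frequency table and illustrative names (`counts`, `shares`, `hhi`):

```r
# Counts copied from the frequency table above
counts <- c(USA = 15, SUI = 8, SWE = 7, AUS = 6, SRB = 5, FRG = 3,
            ESP = 2, GBR = 2, CRO = 1, GER = 1, NED = 1, TCH = 1)

# Share of titles held by each country
shares <- counts / sum(counts)

# Herfindahl-Hirschman index: sum of squared shares
hhi <- sum(shares^2)
print(hhi)
```

With these counts the sum of squared counts is 420 and the squared total is 2704, which gives 420 / 2704, about 0.1553, in line with the `Herfindahl()` output.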
Effective number of groups: Laakso-Taagepera
1/sum(prop.table(dataTable)**2)
## [1] 6.438095
There are about 6 representative groups (6.44).
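The Laakso-Taagepera figure is the reciprocal of the Herfindahl-Hirschman index, which is why the two results move in opposite directions. A short sketch of that relationship, again assuming the counts from the frequency table (variable names are illustrative):

```r
# Counts copied from the frequency table above
counts <- c(USA = 15, SUI = 8, SWE = 7, AUS = 6, SRB = 5, FRG = 3,
            ESP = 2, GBR = 2, CRO = 1, GER = 1, NED = 1, TCH = 1)

# Share of titles held by each country
shares <- counts / sum(counts)

# Effective number of groups: reciprocal of the sum of squared shares
effective_n <- 1 / sum(shares^2)
print(effective_n)
```

If all 12 countries had equal shares the result would be 12; a value near 6.4 indicates that the titles behave as if spread evenly over about six countries.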
PART 2: WOMEN
library(htmltab)
linkm="https://en.wikipedia.org/wiki/List_of_Wimbledon_ladies%27_singles_champions"
linktabm="//div/table[4]"
mujer=htmltab(doc=linkm, which=linktabm)