I like sports, especially tennis, and I'm sure you do too. Watching the 2015 Wimbledon matches on TV made me wonder: is there any difference between ladies and gentlemen on the grass court, apart from gender and the country they represent? To find out, we need some data: http://www.Wimbledon.com/.
## 'data.frame': 40 obs. of 6 variables:
## $ Rank : int 1 2 3 4 5 6 7 7 7 10 ...
## $ Player : Factor w/ 40 levels "A.Friedsam","A.Murray",..: 12 10 20 8 30 39 32 16 33 4 ...
## $ Country : Factor w/ 20 levels "AUS","BEL","BLR",..: 8 20 15 11 1 4 9 6 9 11 ...
## $ Matches : int 3 4 4 3 4 5 2 4 6 2 ...
## $ Double.faults: int 34 33 28 25 23 22 21 21 21 19 ...
## $ Gender : Factor w/ 2 levels "female","male": 2 2 2 2 2 2 2 2 2 2 ...
## 'data.frame': 40 obs. of 6 variables:
## $ Rank : int 1 1 3 4 5 6 7 8 9 9 ...
## $ Player : Factor w/ 40 levels "A.Kerber","A.Murray",..: 27 31 32 13 2 10 9 5 33 39 ...
## $ Country : Factor w/ 23 levels "AUS","BEL","BLR",..: 20 21 11 11 12 23 2 14 21 20 ...
## $ Matches : int 7 7 6 5 6 4 4 3 5 4 ...
## $ Break.points.won: int 29 29 26 24 23 20 19 18 16 16 ...
## $ gender : Factor w/ 2 levels "female","male": 2 2 2 2 2 2 2 2 2 2 ...
## 'data.frame': 40 obs. of 6 variables:
## $ Rank : int 1 2 3 4 5 5 5 8 8 8 ...
## $ Player : Factor w/ 40 levels "A.Dolgopolov",..: 29 27 20 21 15 34 17 18 12 13 ...
## $ Country : Factor w/ 16 levels "AUS","AUT","BUL",..: 1 4 16 12 3 14 5 8 2 7 ...
## $ Matches : int 3 3 3 4 3 5 4 3 2 2 ...
## $ Fastest.serve.speed.mph: int 147 145 140 139 137 137 137 136 136 136 ...
## $ gender : Factor w/ 2 levels "female","male": 2 2 2 2 2 2 2 2 2 2 ...
## Rank Player Country Matches Double.faults Gender
## 1 1 F.Verdasco ESP 3 34 male
## 2 2 D.Kudla USA 4 33 male
## 3 3 K.Anderson RSA 4 28 male
## 4 4 D.Brown GER 3 25 male
## 5 5 N.Kyrgios AUS 4 23 male
## 6 6 V.Pospisil CAN 5 22 male
## 7 7 P.Herbert FRA 2 21 male
## 8 7 I.Karlovic CRO 4 21 male
## 9 7 R.Gasquet FRA 6 21 male
## 10 10 A.Zverev GER 2 19 male
## 11 10 S.Wawrinka SUI 5 19 male
## 12 10 B.Coric CRO 2 19 male
## 13 13 V.Troicki SRB 4 17 male
## 14 14 M.Cilic CRO 5 16 male
## 15 14 A.Murray GBR 6 16 male
## 16 14 M.Raonic CAN 3 16 male
## 17 17 S.Johnson USA 2 15 male
## 18 17 D.Goffin BEL 4 15 male
## 19 17 J.Struff GER 1 15 male
## 20 20 J.Duckworth AUS 2 14 male
## 21 1 M.Sharapova RUS 6 44 female
## 22 2 C.Giorgi ITA 3 28 female
## 23 3 V.Azarenka BLR 5 26 female
## 24 4 C.Vandeweghe USA 5 22 female
## 25 5 S.Williams USA 7 21 female
## 26 6 G.Muguruza ESP 7 20 female
## 27 7 J.Ostapenko LAT 2 18 female
## 28 8 L.Tsurenko UKR 2 17 female
## 29 9 E.Kulichkova RUS 2 16 female
## 30 9 S.Lisicki GER 3 16 female
## 31 11 L.Safarova CZE 4 15 female
## 32 11 A.Friedsam GER 2 15 female
## 33 11 M.Niculescu ROU 4 15 female
## 34 11 I.Begu ROU 3 15 female
## 35 15 M.Duque-Marino COL 2 13 female
## 36 15 M.Lucic-Baroni CRO 2 13 female
## 37 17 H.Watson GBR 3 12 female
## 38 17 A.Riske USA 1 12 female
## 39 19 O.Govortsova BLR 4 11 female
## 40 19 Kr.Pliskova CZE 3 11 female
## Rank Player Country Matches Break.points.won gender
## 1 1 N.Djokovic SRB 7 29 male
## 2 1 R.Federer SUI 7 29 male
## 3 3 R.Gasquet FRA 6 26 male
## 4 4 G.Simon FRA 5 24 male
## 5 5 A.Murray GBR 6 23 male
## 6 6 D.Kudla USA 4 20 male
## 7 7 D.Goffin BEL 4 19 male
## 8 8 A.Seppi ITA 3 18 male
## 9 9 S.Wawrinka SUI 5 16 male
## 10 9 V.Troicki SRB 4 16 male
## 11 9 R.Bautista Agut ESP 4 16 male
## 12 12 T.Berdych CZE 4 15 male
## 13 12 M.Baghdatis CYP 3 15 male
## 14 14 M.Cilic CRO 5 14 male
## 15 15 V.Pospisil CAN 5 12 male
## 16 15 N.Kyrgios AUS 4 12 male
## 17 17 K.Anderson RSA 4 11 male
## 18 17 J.Ward GBR 3 11 male
## 19 17 G.Monfils FRA 3 11 male
## 20 17 J.Nieminen FIN 2 11 male
## 21 1 S.Williams USA 7 30 female
## 22 2 G.Muguruza ESP 7 29 female
## 23 3 M.Sharapova RUS 6 24 female
## 24 3 A.Radwanska POL 6 24 female
## 25 5 M.Niculescu ROU 4 21 female
## 26 6 B.Bencic SUI 4 19 female
## 27 6 T.Bacsinszky SUI 5 19 female
## 28 8 O.Govortsova BLR 4 18 female
## 29 8 M.Keys USA 5 18 female
## 30 8 V.Azarenka BLR 5 18 female
## 31 11 C.Wozniacki DEN 4 16 female
## 32 11 L.Safarova CZE 4 16 female
## 33 11 C.Vandeweghe USA 5 16 female
## 34 14 A.Kerber GER 3 15 female
## 35 14 J.Jankovic SRB 4 15 female
## 36 16 H.Watson GBR 3 14 female
## 37 16 I.Begu ROU 3 14 female
## 38 16 Z.Diyas KAZ 4 14 female
## 39 19 A.Petkovic GER 3 13 female
## 40 20 L.Tsurenko UKR 2 12 female
## Rank Player Country Matches Fastest.serve.speed.mph gender
## 1 1 S.Groth AUS 3 147 male
## 2 2 M.Raonic CAN 3 145 male
## 3 3 J.Isner USA 3 140 male
## 4 4 K.Anderson RSA 4 139 male
## 5 5 G.Dimitrov BUL 3 137 male
## 6 5 S.Wawrinka SUI 5 137 male
## 7 5 I.Karlovic CRO 4 137 male
## 8 8 J-W.Tsonga FRA 3 136 male
## 9 8 D.Thiem AUT 2 136 male
## 10 8 F.Lopez ESP 2 136 male
## 11 8 S.Querrey USA 2 136 male
## 12 8 N.Kyrgios AUS 4 136 male
## 13 13 G.Monfils FRA 3 135 male
## 14 14 M.Cilic CRO 5 134 male
## 15 14 F.Verdasco ESP 3 134 male
## 16 16 J.Chardy FRA 1 133 male
## 17 16 V.Pospisil CAN 5 133 male
## 18 16 D.Brown GER 3 133 male
## 19 19 T.Berdych CZE 4 132 male
## 20 19 A.Dolgopolov UKR 2 132 male
## 21 1 L.Hradecka CZE 1 125 female
## 22 2 S.Lisicki GER 3 123 female
## 23 2 S.Williams USA 7 123 female
## 24 4 V.Williams USA 4 122 female
## 25 5 A.Friedsam GER 2 120 female
## 26 5 M.Keys USA 5 120 female
## 27 5 C.Garcia FRA 1 120 female
## 28 8 S.Stephens USA 3 119 female
## 29 8 T.Babos HUN 2 119 female
## 30 10 A.Tomljanovic AUS 2 117 female
## 31 10 C.Vandeweghe USA 5 117 female
## 32 12 A.Krunic SRB 3 116 female
## 33 12 S.Stosur AUS 3 116 female
## 34 14 B.Mattek-Sands USA 3 114 female
## 35 15 B.Bencic SUI 4 113 female
## 36 15 Kr.Pliskova CZE 3 113 female
## 37 15 T.Pironkova BUL 1 113 female
## 38 15 K.Mladenovic FRA 3 113 female
## 39 15 D.Kovinic MNE 1 113 female
## 40 15 C.Witthoeft GER 1 113 female
## The following objects are masked from wmb15:
##
## Country, Matches, Player, Rank
## The following objects are masked from bp2015:
##
## Country, gender, Matches, Player, Rank
## The following objects are masked from wmb15:
##
## Country, Matches, Player, Rank
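These messages appear when several attached data frames share column names: each table attached later hides the same-named columns of the earlier ones, so a bare name such as `Matches` refers to the most recently attached table. A minimal sketch of the effect, using two three-row excerpts of the tables above (the names `a`, `b`, `seen` and `scoped` are made up for this illustration), and of how `with()` avoids the ambiguity:

```r
# Three-row excerpts of the double-faults and break-points tables.
a <- data.frame(Matches = c(3L, 4L, 4L), Double.faults = c(34L, 33L, 28L))
b <- data.frame(Matches = c(7L, 7L, 6L), Break.points.won = c(29L, 29L, 26L))

attach(a)
attach(b)              # reports that Matches is masked from a

seen <- Matches        # bare name: resolves to b$Matches, not a$Matches

detach(b)
detach(a)

# with() evaluates the expression inside one data frame only,
# so there is no doubt about which Matches column is used.
scoped <- with(a, Matches)

print(seen)
print(scoped)
```

Passing `data = ` to modelling functions, or writing `wmb15$Matches` explicitly, has the same effect as `with()`.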
summary(wmb15)
## Rank Player Country Matches
## Min. : 1.00 A.Friedsam: 1 GER : 5 Min. :1.000
## 1st Qu.: 5.75 A.Murray : 1 USA : 5 1st Qu.:2.000
## Median :10.00 A.Riske : 1 CRO : 4 Median :3.000
## Mean : 9.95 A.Zverev : 1 AUS : 2 Mean :3.525
## 3rd Qu.:14.25 B.Coric : 1 BLR : 2 3rd Qu.:4.250
## Max. :20.00 C.Giorgi : 1 CAN : 2 Max. :7.000
## (Other) :34 (Other):20
## Double.faults Gender
## Min. :11.00 female:20
## 1st Qu.:15.00 male :20
## Median :17.00
## Mean :19.23
## 3rd Qu.:21.25
## Max. :44.00
##
summary(bp2015)
## Rank Player Country Matches
## Min. : 1.00 A.Kerber : 1 SUI : 4 Min. :2.00
## 1st Qu.: 5.75 A.Murray : 1 USA : 4 1st Qu.:3.75
## Median : 9.00 A.Petkovic : 1 FRA : 3 Median :4.00
## Mean : 9.90 A.Radwanska: 1 GBR : 3 Mean :4.40
## 3rd Qu.:15.00 A.Seppi : 1 SRB : 3 3rd Qu.:5.00
## Max. :20.00 B.Bencic : 1 BLR : 2 Max. :7.00
## (Other) :34 (Other):21
## Break.points.won gender
## Min. :11.00 female:20
## 1st Qu.:14.00 male :20
## Median :16.00
## Mean :17.82
## 3rd Qu.:20.25
## Max. :30.00
##
summary(fss2015)
## Rank Player Country Matches
## Min. : 1.0 A.Dolgopolov : 1 USA : 8 Min. :1.000
## 1st Qu.: 5.0 A.Friedsam : 1 FRA : 5 1st Qu.:2.000
## Median : 8.0 A.Krunic : 1 AUS : 4 Median :3.000
## Mean : 9.5 A.Tomljanovic : 1 GER : 4 Mean :3.025
## 3rd Qu.:15.0 B.Bencic : 1 CZE : 3 3rd Qu.:4.000
## Max. :19.0 B.Mattek-Sands: 1 BUL : 2 Max. :7.000
## (Other) :34 (Other):14
## Fastest.serve.speed.mph gender
## Min. :113.0 female:20
## 1st Qu.:117.0 male :20
## Median :128.5
## Mean :126.9
## 3rd Qu.:136.0
## Max. :147.0
##
cor.test(Double.faults,Matches)
##
## Pearson's product-moment correlation
##
## data: Double.faults and Matches
## t = 0.43576, df = 38, p-value = 0.6655
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.2464086 0.3738116
## sample estimates:
## cor
## 0.0705132
cor.test(Break.points.won,Matches)
##
## Pearson's product-moment correlation
##
## data: Break.points.won and Matches
## t = 0.63952, df = 38, p-value = 0.5263
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.2152380 0.4017842
## sample estimates:
## cor
## 0.1031901
cor.test(Fastest.serve.speed.mph,Matches)
##
## Pearson's product-moment correlation
##
## data: Fastest.serve.speed.mph and Matches
## t = 1.0883, df = 38, p-value = 0.2833
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.1455362 0.4604287
## sample estimates:
## cor
## 0.173855
library(plotly)
## Loading required package: ggplot2
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:graphics':
##
## layout
f <- list(
family = "Courier New, monospace",
size = 12,
color = "#7f7f7f"
)
plot_ly(wmb15,x=Double.faults, color=Country,type="box")%>%
  layout(title = "Wimbledon 2015: Double faults versus Country", font = f)
plot_ly(bp2015,x=Break.points.won,color=Country,type="box") %>%
  layout(title = "Wimbledon 2015: Break points won versus Country", font = f)
plot_ly(fss2015,x=Fastest.serve.speed.mph,color=Country,type="box")%>%
  layout(title = "Wimbledon 2015: Fastest serve speed mph versus Country", font = f)
p1=plot_ly(wmb15,x=Double.faults, color=Gender,type="box")
p2=plot_ly(wmb15,x=Double.faults, y=Rank, color=Gender)
subplot(p2,p1)
p3=plot_ly(bp2015,x=Break.points.won, color=Gender,type="box")
p4=plot_ly(bp2015,x=Break.points.won, y=Rank, color=Gender)
subplot(p4,p3)
p5=plot_ly(fss2015,x=Fastest.serve.speed.mph, color=Gender,type="box")
p6=plot_ly(fss2015,x=Fastest.serve.speed.mph, y=Rank, color=Gender)
subplot(p6,p5)
p7=plot_ly(wmb15,x=Double.faults, type="histogram")
p8=plot_ly(bp2015,x=Break.points.won,type="histogram")
p9=plot_ly(fss2015,x=Fastest.serve.speed.mph,type="histogram")
subplot(p7,p8,p9)%>%
  layout(title = "Wimbledon 2015", font = f)
plot_ly(wmb15, x = Double.faults, y = Rank, text=Player, size=Matches ,color=Gender, mode = "markers") %>%
  layout(title = "Wimbledon 2015: Rank versus Double faults", font = f)
plot_ly(fss2015, x=Fastest.serve.speed.mph,y = Rank, text=Player, size=Matches ,color=Gender,mode="markers") %>%
  layout(title = "Wimbledon 2015: Rank versus Fastest serve speed mph", font = f)
plot_ly(bp2015, x=Break.points.won,y = Rank, text=Player, size=Matches,color=Gender,mode="markers") %>%
  layout(title = "Wimbledon 2015: Rank versus Break points won", font = f)
wmb_df=as.matrix(wmb15[,-c(2,3,6)])
plot_ly(z = wmb_df, type = "surface") %>%
  layout(title = "Wimbledon 2015: Rank versus Double faults and Matches", font = f)
wmb_fss=as.matrix(fss2015[,-c(2,3,6)])
plot_ly(z = wmb_fss, type = "surface") %>%
  layout(title = "Wimbledon 2015: Rank versus Fastest serve speed mph and Matches", font = f)
wmb_bpw=as.matrix(bp2015[,-c(2,3,6)])
plot_ly(z = wmb_bpw, type = "surface") %>%
  layout(title = "Wimbledon 2015: Rank versus Break points won and Matches", font = f)
t.test(Double.faults~Gender)
##
## Welch Two Sample t-test
##
## data: Double.faults by Gender
## t = -1.1349, df = 35.225, p-value = 0.2641
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -6.831585 1.931585
## sample estimates:
## mean in group female mean in group male
## 18.00 20.45
t.test(Break.points.won~Gender)
##
## Welch Two Sample t-test
##
## data: Break.points.won by Gender
## t = 0.48555, df = 36.984, p-value = 0.6302
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -2.697108 4.397108
## sample estimates:
## mean in group female mean in group male
## 18.25 17.40
t.test(Fastest.serve.speed.mph~Gender)
##
## Welch Two Sample t-test
##
## data: Fastest.serve.speed.mph by Gender
## t = -15.16, df = 37.998, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -21.4805 -16.4195
## sample estimates:
## mean in group female mean in group male
## 117.45 136.40
model1=lm(data=wmb15, Rank~Double.faults+Gender+Matches+Country)
summary(model1)
##
## Call:
## lm(formula = Rank ~ Double.faults + Gender + Matches + Country,
## data = wmb15)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.3228 -0.8802 0.0000 1.3595 6.3228
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 24.2149 3.2951 7.349 1.14e-06 ***
## Double.faults -0.6413 0.1280 -5.011 0.000107 ***
## Gendermale 2.1661 1.8548 1.168 0.258979
## Matches -0.6724 0.4720 -1.425 0.172392
## CountryBEL 2.9279 4.1697 0.702 0.492068
## CountryBLR 1.6747 3.7874 0.442 0.663937
## CountryCAN -1.5070 3.3717 -0.447 0.660556
## CountryCOL 0.4666 4.4611 0.105 0.917920
## CountryCRO -1.0920 2.9242 -0.373 0.713443
## CountryCZE 1.4752 3.7239 0.396 0.696927
## CountryESP -1.1212 3.6767 -0.305 0.764112
## CountryFRA -3.2244 3.3688 -0.957 0.351910
## CountryGBR 2.2058 3.5195 0.627 0.539152
## CountryGER -2.2921 2.9348 -0.781 0.445539
## CountryITA -2.2416 4.8732 -0.460 0.651352
## CountryLAT -4.3269 4.5603 -0.949 0.356004
## CountryROU -1.2422 3.7350 -0.333 0.743515
## CountryRSA -2.7354 4.2304 -0.647 0.526533
## CountryRUS 2.7133 4.2802 0.634 0.534566
## CountrySRB 0.2105 4.1358 0.051 0.960008
## CountrySUI -0.8346 4.1944 -0.199 0.844644
## CountryUKR -3.9682 4.5334 -0.875 0.393595
## CountryUSA -0.3157 3.0337 -0.104 0.918332
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.342 on 17 degrees of freedom
## Multiple R-squared: 0.842, Adjusted R-squared: 0.6375
## F-statistic: 4.118 on 22 and 17 DF, p-value: 0.002185
plot_ly(model1, y=model1$residuals, mode="markers", marker = list(color = "pink")) %>%
layout(title = "Scatter plot of residuals for model1", font = f)
plot_ly(model1, x=model1$residuals, marker = list(color = "pink"), type="histogram") %>%
layout(title = "Histogram of residuals for model1", font = f, yaxis = list(title = "Frequency" ))
shapiro.test(model1$residuals)
##
## Shapiro-Wilk normality test
##
## data: model1$residuals
## W = 0.95714, p-value = 0.1336
model2=lm(data=fss2015, Rank~Fastest.serve.speed.mph+Gender+Matches+Country)
summary(model2)
##
## Call:
## lm(formula = Rank ~ Fastest.serve.speed.mph + Gender + Matches +
## Country, data = fss2015)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.054 -1.173 0.000 1.018 4.166
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 161.54921 11.97858 13.487 8.21e-12 ***
## Fastest.serve.speed.mph -1.28849 0.09913 -12.997 1.65e-11 ***
## Gendermale 24.55328 2.11095 11.631 1.29e-10 ***
## Matches 0.04636 0.34826 0.133 0.895
## CountryAUT -2.96086 2.45625 -1.205 0.241
## CountryBUL -2.85759 1.91204 -1.495 0.150
## CountryCAN 1.81187 1.90588 0.951 0.353
## CountryCRO -2.22102 1.96103 -1.133 0.270
## CountryCZE 0.72290 1.64493 0.439 0.665
## CountryESP -1.27253 1.93420 -0.658 0.518
## CountryFRA -0.82982 1.49760 -0.554 0.585
## CountryGER -0.77420 1.54341 -0.502 0.621
## CountryHUN -0.31187 2.42358 -0.129 0.899
## CountryMNE -0.99644 2.57543 -0.387 0.703
## CountryRSA -3.18813 2.42358 -1.315 0.203
## CountrySRB -0.22370 2.42813 -0.092 0.927
## CountrySUI -2.97350 1.94144 -1.532 0.141
## CountryUKR 2.88519 2.53362 1.139 0.268
## CountryUSA -1.51154 1.36681 -1.106 0.281
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.117 on 21 degrees of freedom
## Multiple R-squared: 0.9165, Adjusted R-squared: 0.845
## F-statistic: 12.81 on 18 and 21 DF, p-value: 1.453e-07
plot_ly(model2, y=model2$residuals, mode="markers", marker = list(color = "yellow")) %>%
layout(title = "Scatter plot of residuals for model2", font = f)
plot_ly(model2, x=model2$residuals, marker = list(color = "yellow"), type="histogram") %>%
layout(title = "Histogram of residuals for model2", font = f, yaxis = list(title = "Frequency" ))
shapiro.test(model2$residuals)
##
## Shapiro-Wilk normality test
##
## data: model2$residuals
## W = 0.97904, p-value = 0.6539
model3=lm(data=bp2015, Rank~Break.points.won+Gender+Matches+Country)
summary(model3)
##
## Call:
## lm(formula = Rank ~ Break.points.won + Gender + Matches + Country,
## data = bp2015)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.5545 -0.6268 0.0000 0.4984 2.6314
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 24.6223 2.6008 9.467 1.83e-07 ***
## Break.points.won -0.9483 0.1863 -5.089 0.000165 ***
## Gendermale -1.2301 0.9325 -1.319 0.208292
## Matches 0.7467 0.8496 0.879 0.394284
## CountryBEL -1.3622 2.9605 -0.460 0.652478
## CountryBLR -2.9139 2.5888 -1.126 0.279276
## CountryCAN -0.7467 2.7901 -0.268 0.792896
## CountryCRO 0.1498 2.7095 0.055 0.956686
## CountryCYP 0.5914 2.9950 0.197 0.846296
## CountryCZE -0.7961 2.4333 -0.327 0.748364
## CountryDEN -1.4370 2.9059 -0.495 0.628604
## CountryESP -1.2784 2.5281 -0.506 0.620944
## CountryFIN 2.5451 3.0664 0.830 0.420463
## CountryFRA 0.4043 2.4133 0.168 0.869344
## CountryGBR 1.0497 2.3122 0.454 0.656811
## CountryGER 2.9131 2.7387 1.064 0.305473
## CountryITA -0.5638 3.2834 -0.172 0.866120
## CountryKAZ 1.6665 2.8372 0.587 0.566311
## CountryPOL -3.3444 2.9796 -1.122 0.280559
## CountryROU -0.1413 2.8450 -0.050 0.961083
## CountryRSA 1.0517 2.6641 0.395 0.698956
## CountryRUS -3.3444 2.9796 -1.122 0.280559
## CountrySRB -0.5707 2.3283 -0.245 0.809929
## CountrySUI -2.7512 2.2894 -1.202 0.249418
## CountryUKR 5.2633 3.2627 1.613 0.129017
## CountryUSA -1.8216 2.3617 -0.771 0.453336
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.879 on 14 degrees of freedom
## Multiple R-squared: 0.9591, Adjusted R-squared: 0.8861
## F-statistic: 13.14 on 25 and 14 DF, p-value: 5.047e-06
plot_ly(model3, y=model3$residuals, mode="markers") %>%
layout(title = "Scatter plot of residuals for model3", font = f)
plot_ly(model3, x=model3$residuals, type="histogram") %>%
layout(title = "Histogram of residuals for model3", font = f, yaxis = list(title = "Frequency" ))
shapiro.test(model3$residuals)
##
## Shapiro-Wilk normality test
##
## data: model3$residuals
## W = 0.96171, p-value = 0.1915
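As run above, none of the three correlation tests against matches played was significant: double faults (r = 0.07, p = 0.67), break points won (r = 0.10, p = 0.53) and fastest serve speed (r = 0.17, p = 0.28). One caveat: all three tables were attached, and the masking messages show that the bare name Matches then resolves to the fss2015 column. Only the serve-speed test therefore pairs values from the same table; the other two should be rerun against their own Matches column before any conclusion is drawn.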
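A correlation test can be tied to a single table, whatever happens to be attached, by wrapping it in `with()`. The sketch below rebuilds only the two relevant columns of the double-faults table as printed earlier (20 gentlemen followed by 20 ladies); the names `wmb15_check` and `ct` are made up for this illustration. Rerun this way, the estimate comes out near 0.37 rather than 0.07, so the double-faults result in particular deserves a second look.

```r
# Double faults and matches played, copied from the printed wmb15 table.
wmb15_check <- data.frame(
  Double.faults = c(34, 33, 28, 25, 23, 22, 21, 21, 21, 19,
                    19, 19, 17, 16, 16, 16, 15, 15, 15, 14,
                    44, 28, 26, 22, 21, 20, 18, 17, 16, 16,
                    15, 15, 15, 15, 13, 13, 12, 12, 11, 11),
  Matches       = c(3, 4, 4, 3, 4, 5, 2, 4, 6, 2,
                    5, 2, 4, 5, 6, 3, 2, 4, 1, 2,
                    6, 3, 5, 5, 7, 7, 2, 2, 2, 3,
                    4, 2, 4, 3, 2, 2, 3, 1, 4, 3)
)

# Both names are looked up inside wmb15_check only,
# so no attached table can supply a different Matches column.
ct <- with(wmb15_check, cor.test(Double.faults, Matches))

print(ct$estimate)  # Pearson correlation
print(ct$p.value)   # two-sided p-value
```

The same pattern applies to the break-points table with its own `Matches` column.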
We see no significant difference between ladies and gentlemen in double faults (means 18.00 vs 20.45, p = 0.26) or break points won (18.25 vs 17.40, p = 0.63). Fastest serve speed is another matter: the gentlemen's mean of 136.40 mph is significantly higher than the ladies' 117.45 mph (p < 2.2e-16).
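To make the serve-speed comparison concrete, the Welch statistic can be computed by hand as the difference in means divided by the square root of the summed per-group variances over group size, and then checked against `t.test()`. This is a minimal sketch using the serve speeds from the table printed earlier; the vector names `female` and `male` and the helper variables are made up for this illustration.

```r
# Fastest serve speeds (mph), copied from the printed fss2015 table.
female <- c(125, 123, 123, 122, 120, 120, 120, 119, 119, 117,
            117, 116, 116, 114, 113, 113, 113, 113, 113, 113)
male   <- c(147, 145, 140, 139, 137, 137, 137, 136, 136, 136,
            136, 136, 135, 134, 134, 133, 133, 133, 132, 132)

# Welch t statistic: no assumption of equal variances.
se       <- sqrt(var(female) / length(female) + var(male) / length(male))
t_manual <- (mean(female) - mean(male)) / se

# t.test() uses the Welch form by default (var.equal = FALSE).
welch <- t.test(female, male)

print(t_manual)
print(welch$statistic)
print(welch$p.value)
```

The hand-computed value agrees with the t = -15.16 reported above for `Fastest.serve.speed.mph ~ Gender`.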
Some interesting facts from these statistics concern Serena Williams, who won the ladies' singles: her fastest serve (123 mph) was second only to L.Hradecka's 125 mph among the ladies; her 30 break points won were the most among both ladies and gentlemen; and she made 21 double faults, compared with 44 for M.Sharapova.
The gentlemen's singles champion, N.Djokovic (SRB), on the other hand, appears in these tables only for his 29 break points won.
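Player ranks in each table (double faults, break points won, fastest serve speed) are described reasonably well by linear models (adjusted R-squared of 0.64, 0.89 and 0.85 respectively), and the Shapiro-Wilk test does not reject normality of the residuals in any of them (p = 0.13, 0.19 and 0.65).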
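The fit-then-check-residuals workflow can be sketched on a reduced version of the serve-speed model. The example below is a simplification, not the model reported above: it keeps only serve speed and gender as predictors (dropping Matches and Country), so its coefficients will differ from those of model2. The data are copied from the printed serve-speed table, and the names `fss_check`, `fit` and `sw` are made up for this illustration.

```r
# Rank, fastest serve speed and gender from the printed fss2015 table
# (20 gentlemen followed by 20 ladies).
fss_check <- data.frame(
  Rank = c(1, 2, 3, 4, 5, 5, 5, 8, 8, 8,
           8, 8, 13, 14, 14, 16, 16, 16, 19, 19,
           1, 2, 2, 4, 5, 5, 5, 8, 8, 10,
           10, 12, 12, 14, 15, 15, 15, 15, 15, 15),
  Fastest.serve.speed.mph = c(147, 145, 140, 139, 137, 137, 137, 136, 136, 136,
                              136, 136, 135, 134, 134, 133, 133, 133, 132, 132,
                              125, 123, 123, 122, 120, 120, 120, 119, 119, 117,
                              117, 116, 116, 114, 113, 113, 113, 113, 113, 113),
  gender = factor(rep(c("male", "female"), each = 20))
)

# Reduced model: rank explained by serve speed and gender only.
fit <- lm(Rank ~ Fastest.serve.speed.mph + gender, data = fss_check)

# Shapiro-Wilk test on the residuals; a large p-value means
# normality is not rejected, not that it is proven.
sw <- shapiro.test(residuals(fit))

print(coef(fit))
print(sw$p.value)
```

Passing `data = fss_check` keeps every variable in the formula tied to one table, which matters here because the tables above name the gender column inconsistently (`Gender` in one, `gender` in the others).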