Research goal

I like sports, especially tennis, and I am sure you do too. Watching the 2015 Wimbledon matches on TV made me wonder: is there any difference between ladies and gentlemen on the grass court, other than gender and the country they represent? To find out, we need some data: http://www.Wimbledon.com/.

Data

## 'data.frame':    40 obs. of  6 variables:
##  $ Rank         : int  1 2 3 4 5 6 7 7 7 10 ...
##  $ Player       : Factor w/ 40 levels "A.Friedsam","A.Murray",..: 12 10 20 8 30 39 32 16 33 4 ...
##  $ Country      : Factor w/ 20 levels "AUS","BEL","BLR",..: 8 20 15 11 1 4 9 6 9 11 ...
##  $ Matches      : int  3 4 4 3 4 5 2 4 6 2 ...
##  $ Double.faults: int  34 33 28 25 23 22 21 21 21 19 ...
##  $ Gender       : Factor w/ 2 levels "female","male": 2 2 2 2 2 2 2 2 2 2 ...
## 'data.frame':    40 obs. of  6 variables:
##  $ Rank            : int  1 1 3 4 5 6 7 8 9 9 ...
##  $ Player          : Factor w/ 40 levels "A.Kerber","A.Murray",..: 27 31 32 13 2 10 9 5 33 39 ...
##  $ Country         : Factor w/ 23 levels "AUS","BEL","BLR",..: 20 21 11 11 12 23 2 14 21 20 ...
##  $ Matches         : int  7 7 6 5 6 4 4 3 5 4 ...
##  $ Break.points.won: int  29 29 26 24 23 20 19 18 16 16 ...
##  $ gender          : Factor w/ 2 levels "female","male": 2 2 2 2 2 2 2 2 2 2 ...
## 'data.frame':    40 obs. of  6 variables:
##  $ Rank                   : int  1 2 3 4 5 5 5 8 8 8 ...
##  $ Player                 : Factor w/ 40 levels "A.Dolgopolov",..: 29 27 20 21 15 34 17 18 12 13 ...
##  $ Country                : Factor w/ 16 levels "AUS","AUT","BUL",..: 1 4 16 12 3 14 5 8 2 7 ...
##  $ Matches                : int  3 3 3 4 3 5 4 3 2 2 ...
##  $ Fastest.serve.speed.mph: int  147 145 140 139 137 137 137 136 136 136 ...
##  $ gender                 : Factor w/ 2 levels "female","male": 2 2 2 2 2 2 2 2 2 2 ...
##    Rank         Player Country Matches Double.faults Gender
## 1     1     F.Verdasco     ESP       3            34   male
## 2     2        D.Kudla     USA       4            33   male
## 3     3     K.Anderson     RSA       4            28   male
## 4     4        D.Brown     GER       3            25   male
## 5     5      N.Kyrgios     AUS       4            23   male
## 6     6     V.Pospisil     CAN       5            22   male
## 7     7      P.Herbert     FRA       2            21   male
## 8     7     I.Karlovic     CRO       4            21   male
## 9     7      R.Gasquet     FRA       6            21   male
## 10   10       A.Zverev     GER       2            19   male
## 11   10     S.Wawrinka     SUI       5            19   male
## 12   10        B.Coric     CRO       2            19   male
## 13   13      V.Troicki     SRB       4            17   male
## 14   14        M.Cilic     CRO       5            16   male
## 15   14       A.Murray     GBR       6            16   male
## 16   14       M.Raonic     CAN       3            16   male
## 17   17      S.Johnson     USA       2            15   male
## 18   17       D.Goffin     BEL       4            15   male
## 19   17       J.Struff     GER       1            15   male
## 20   20    J.Duckworth     AUS       2            14   male
## 21    1    M.Sharapova     RUS       6            44 female
## 22    2       C.Giorgi     ITA       3            28 female
## 23    3     V.Azarenka     BLR       5            26 female
## 24    4   C.Vandeweghe     USA       5            22 female
## 25    5     S.Williams     USA       7            21 female
## 26    6     G.Muguruza     ESP       7            20 female
## 27    7    J.Ostapenko     LAT       2            18 female
## 28    8     L.Tsurenko     UKR       2            17 female
## 29    9   E.Kulichkova     RUS       2            16 female
## 30    9      S.Lisicki     GER       3            16 female
## 31   11     L.Safarova     CZE       4            15 female
## 32   11     A.Friedsam     GER       2            15 female
## 33   11    M.Niculescu     ROU       4            15 female
## 34   11         I.Begu     ROU       3            15 female
## 35   15 M.Duque-Marino     COL       2            13 female
## 36   15 M.Lucic-Baroni     CRO       2            13 female
## 37   17       H.Watson     GBR       3            12 female
## 38   17        A.Riske     USA       1            12 female
## 39   19   O.Govortsova     BLR       4            11 female
## 40   19    Kr.Pliskova     CZE       3            11 female
##    Rank          Player Country Matches Break.points.won gender
## 1     1      N.Djokovic     SRB       7               29   male
## 2     1       R.Federer     SUI       7               29   male
## 3     3       R.Gasquet     FRA       6               26   male
## 4     4         G.Simon     FRA       5               24   male
## 5     5        A.Murray     GBR       6               23   male
## 6     6         D.Kudla     USA       4               20   male
## 7     7        D.Goffin     BEL       4               19   male
## 8     8         A.Seppi     ITA       3               18   male
## 9     9      S.Wawrinka     SUI       5               16   male
## 10    9       V.Troicki     SRB       4               16   male
## 11    9 R.Bautista Agut     ESP       4               16   male
## 12   12       T.Berdych     CZE       4               15   male
## 13   12     M.Baghdatis     CYP       3               15   male
## 14   14         M.Cilic     CRO       5               14   male
## 15   15      V.Pospisil     CAN       5               12   male
## 16   15       N.Kyrgios     AUS       4               12   male
## 17   17      K.Anderson     RSA       4               11   male
## 18   17          J.Ward     GBR       3               11   male
## 19   17       G.Monfils     FRA       3               11   male
## 20   17      J.Nieminen     FIN       2               11   male
## 21    1      S.Williams     USA       7               30 female
## 22    2      G.Muguruza     ESP       7               29 female
## 23    3     M.Sharapova     RUS       6               24 female
## 24    3     A.Radwanska     POL       6               24 female
## 25    5     M.Niculescu     ROU       4               21 female
## 26    6        B.Bencic     SUI       4               19 female
## 27    6    T.Bacsinszky     SUI       5               19 female
## 28    8    O.Govortsova     BLR       4               18 female
## 29    8          M.Keys     USA       5               18 female
## 30    8      V.Azarenka     BLR       5               18 female
## 31   11     C.Wozniacki     DEN       4               16 female
## 32   11      L.Safarova     CZE       4               16 female
## 33   11    C.Vandeweghe     USA       5               16 female
## 34   14        A.Kerber     GER       3               15 female
## 35   14      J.Jankovic     SRB       4               15 female
## 36   16        H.Watson     GBR       3               14 female
## 37   16          I.Begu     ROU       3               14 female
## 38   16         Z.Diyas     KAZ       4               14 female
## 39   19      A.Petkovic     GER       3               13 female
## 40   20      L.Tsurenko     UKR       2               12 female
##    Rank         Player Country Matches Fastest.serve.speed.mph gender
## 1     1        S.Groth     AUS       3                     147   male
## 2     2       M.Raonic     CAN       3                     145   male
## 3     3        J.Isner     USA       3                     140   male
## 4     4     K.Anderson     RSA       4                     139   male
## 5     5     G.Dimitrov     BUL       3                     137   male
## 6     5     S.Wawrinka     SUI       5                     137   male
## 7     5     I.Karlovic     CRO       4                     137   male
## 8     8     J-W.Tsonga     FRA       3                     136   male
## 9     8        D.Thiem     AUT       2                     136   male
## 10    8        F.Lopez     ESP       2                     136   male
## 11    8      S.Querrey     USA       2                     136   male
## 12    8      N.Kyrgios     AUS       4                     136   male
## 13   13      G.Monfils     FRA       3                     135   male
## 14   14        M.Cilic     CRO       5                     134   male
## 15   14     F.Verdasco     ESP       3                     134   male
## 16   16       J.Chardy     FRA       1                     133   male
## 17   16     V.Pospisil     CAN       5                     133   male
## 18   16        D.Brown     GER       3                     133   male
## 19   19      T.Berdych     CZE       4                     132   male
## 20   19   A.Dolgopolov     UKR       2                     132   male
## 21    1     L.Hradecka     CZE       1                     125 female
## 22    2      S.Lisicki     GER       3                     123 female
## 23    2     S.Williams     USA       7                     123 female
## 24    4     V.Williams     USA       4                     122 female
## 25    5     A.Friedsam     GER       2                     120 female
## 26    5         M.Keys     USA       5                     120 female
## 27    5       C.Garcia     FRA       1                     120 female
## 28    8     S.Stephens     USA       3                     119 female
## 29    8        T.Babos     HUN       2                     119 female
## 30   10  A.Tomljanovic     AUS       2                     117 female
## 31   10   C.Vandeweghe     USA       5                     117 female
## 32   12       A.Krunic     SRB       3                     116 female
## 33   12       S.Stosur     AUS       3                     116 female
## 34   14 B.Mattek-Sands     USA       3                     114 female
## 35   15       B.Bencic     SUI       4                     113 female
## 36   15    Kr.Pliskova     CZE       3                     113 female
## 37   15    T.Pironkova     BUL       1                     113 female
## 38   15   K.Mladenovic     FRA       3                     113 female
## 39   15      D.Kovinic     MNE       1                     113 female
## 40   15    C.Witthoeft     GER       1                     113 female
## The following objects are masked from wmb15:
## 
##     Country, Matches, Player, Rank
## The following objects are masked from bp2015:
## 
##     Country, gender, Matches, Player, Rank
## The following objects are masked from wmb15:
## 
##     Country, Matches, Player, Rank
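
The masking messages above are what attach() prints when several attached data frames share column names, and the str() output shows that wmb15 spells its column Gender while the other two use gender. Below is a minimal sketch of one way to sidestep both issues. The toy frames wmb15_head and bp2015_head are hypothetical names, filled with the first three male and first three female rows transcribed from the printed tables; this is not the loading code used for the report.

```r
# Toy subsets transcribed from the printed tables (three male, three female rows).
# The names wmb15_head and bp2015_head are hypothetical, used only in this sketch.
wmb15_head <- data.frame(
  Player        = c("F.Verdasco", "D.Kudla", "K.Anderson",
                    "M.Sharapova", "C.Giorgi", "V.Azarenka"),
  Matches       = c(3, 4, 4, 6, 3, 5),
  Double.faults = c(34, 33, 28, 44, 28, 26),
  Gender        = factor(rep(c("male", "female"), each = 3))
)

bp2015_head <- data.frame(
  Player           = c("N.Djokovic", "R.Federer", "R.Gasquet",
                       "S.Williams", "G.Muguruza", "M.Sharapova"),
  Matches          = c(7, 7, 6, 7, 7, 6),
  Break.points.won = c(29, 29, 26, 30, 29, 24),
  gender           = factor(rep(c("male", "female"), each = 3))
)

# Harmonise the column name so both frames expose "Gender".
names(bp2015_head)[names(bp2015_head) == "gender"] <- "Gender"

# with() evaluates the expression inside one data frame only,
# so same-named columns in other frames cannot mask each other.
mean_df <- with(wmb15_head,  sapply(split(Double.faults,    Gender), mean))
mean_bp <- with(bp2015_head, sapply(split(Break.points.won, Gender), mean))

print(mean_df)
print(mean_bp)
```

With the columns named consistently, each later call can take its variables from a single data frame (through with() or a data argument) instead of relying on the search path.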

Preliminary results

summary(wmb15)
##       Rank              Player      Country      Matches     
##  Min.   : 1.00   A.Friedsam: 1   GER    : 5   Min.   :1.000  
##  1st Qu.: 5.75   A.Murray  : 1   USA    : 5   1st Qu.:2.000  
##  Median :10.00   A.Riske   : 1   CRO    : 4   Median :3.000  
##  Mean   : 9.95   A.Zverev  : 1   AUS    : 2   Mean   :3.525  
##  3rd Qu.:14.25   B.Coric   : 1   BLR    : 2   3rd Qu.:4.250  
##  Max.   :20.00   C.Giorgi  : 1   CAN    : 2   Max.   :7.000  
##                  (Other)   :34   (Other):20                  
##  Double.faults      Gender  
##  Min.   :11.00   female:20  
##  1st Qu.:15.00   male  :20  
##  Median :17.00              
##  Mean   :19.23              
##  3rd Qu.:21.25              
##  Max.   :44.00              
## 
summary(bp2015)
##       Rank               Player      Country      Matches    
##  Min.   : 1.00   A.Kerber   : 1   SUI    : 4   Min.   :2.00  
##  1st Qu.: 5.75   A.Murray   : 1   USA    : 4   1st Qu.:3.75  
##  Median : 9.00   A.Petkovic : 1   FRA    : 3   Median :4.00  
##  Mean   : 9.90   A.Radwanska: 1   GBR    : 3   Mean   :4.40  
##  3rd Qu.:15.00   A.Seppi    : 1   SRB    : 3   3rd Qu.:5.00  
##  Max.   :20.00   B.Bencic   : 1   BLR    : 2   Max.   :7.00  
##                  (Other)    :34   (Other):21                 
##  Break.points.won    gender  
##  Min.   :11.00    female:20  
##  1st Qu.:14.00    male  :20  
##  Median :16.00               
##  Mean   :17.82               
##  3rd Qu.:20.25               
##  Max.   :30.00               
## 
summary(fss2015)
##       Rank                 Player      Country      Matches     
##  Min.   : 1.0   A.Dolgopolov  : 1   USA    : 8   Min.   :1.000  
##  1st Qu.: 5.0   A.Friedsam    : 1   FRA    : 5   1st Qu.:2.000  
##  Median : 8.0   A.Krunic      : 1   AUS    : 4   Median :3.000  
##  Mean   : 9.5   A.Tomljanovic : 1   GER    : 4   Mean   :3.025  
##  3rd Qu.:15.0   B.Bencic      : 1   CZE    : 3   3rd Qu.:4.000  
##  Max.   :19.0   B.Mattek-Sands: 1   BUL    : 2   Max.   :7.000  
##                 (Other)       :34   (Other):14                  
##  Fastest.serve.speed.mph    gender  
##  Min.   :113.0           female:20  
##  1st Qu.:117.0           male  :20  
##  Median :128.5                      
##  Mean   :126.9                      
##  3rd Qu.:136.0                      
##  Max.   :147.0                      
## 
cor.test(Double.faults,Matches)
## 
##  Pearson's product-moment correlation
## 
## data:  Double.faults and Matches
## t = 0.43576, df = 38, p-value = 0.6655
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.2464086  0.3738116
## sample estimates:
##       cor 
## 0.0705132
cor.test(Break.points.won,Matches)
## 
##  Pearson's product-moment correlation
## 
## data:  Break.points.won and Matches
## t = 0.63952, df = 38, p-value = 0.5263
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.2152380  0.4017842
## sample estimates:
##       cor 
## 0.1031901
cor.test(Fastest.serve.speed.mph,Matches)
## 
##  Pearson's product-moment correlation
## 
## data:  Fastest.serve.speed.mph and Matches
## t = 1.0883, df = 38, p-value = 0.2833
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.1455362  0.4604287
## sample estimates:
##      cor 
## 0.173855
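
As a quick check on the first correlation test, the t statistic and p-value can be recovered from the reported correlation alone, using t = r * sqrt(n - 2) / sqrt(1 - r^2) with n = 40 players. The sketch below only uses numbers printed in the output; the variable names are illustrative.

```r
# Values taken from the cor.test(Double.faults, Matches) output.
r  <- 0.0705132   # sample Pearson correlation
n  <- 40          # number of players in wmb15
df <- n - 2       # degrees of freedom reported as 38

# t statistic for H0: true correlation equals 0.
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution.
p_value <- 2 * pt(-abs(t_stat), df)

print(round(c(t = t_stat, p = p_value), 4))
```

The result agrees with the reported t = 0.43576 and p-value = 0.6655, so the weak correlation is well within what chance alone would produce at this sample size.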

Visualising data

library(plotly)
## Loading required package: ggplot2
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:graphics':
## 
##     layout
f <- list(
  family = "Courier New, monospace",
  size = 12,
  color = "#7f7f7f"
)

plot_ly(wmb15,x=Double.faults, color=Country,type="box")%>%
  layout(title = "Wimbledon 2015: Double faults versus Country", font = f)

plot_ly(bp2015,x=Break.points.won,color=Country,type="box") %>%
  layout(title = "Wimbledon 2015: Break points won versus Country", font = f)

plot_ly(fss2015,x=Fastest.serve.speed.mph,color=Country,type="box")%>%
  layout(title = "Wimbledon 2015: Fastest serve speed mph versus Country", font = f)

p1=plot_ly(wmb15,x=Double.faults, color=Gender,type="box")
p2=plot_ly(wmb15,x=Double.faults, y=Rank, color=Gender)
subplot(p2,p1)

p3=plot_ly(bp2015,x=Break.points.won, color=Gender,type="box")
p4=plot_ly(bp2015,x=Break.points.won, y=Rank, color=Gender)
subplot(p4,p3)

p5=plot_ly(fss2015,x=Fastest.serve.speed.mph, color=Gender,type="box")
p6=plot_ly(fss2015,x=Fastest.serve.speed.mph, y=Rank, color=Gender)
subplot(p6,p5)

p7=plot_ly(wmb15,x=Double.faults, type="histogram")
p8=plot_ly(bp2015,x=Break.points.won,type="histogram")
p9=plot_ly(fss2015,x=Fastest.serve.speed.mph,type="histogram")
subplot(p7,p8,p9)%>%
  layout(title = "Wimbledon 2015", font = f)

plot_ly(wmb15, x = Double.faults, y = Rank, text=Player, size=Matches ,color=Gender, mode = "markers")   %>%
  layout(title = "Wimbledon 2015: Rank versus Double faults", font = f)

plot_ly(fss2015, x=Fastest.serve.speed.mph,y = Rank, text=Player, size=Matches ,color=Gender,mode="markers") %>%
  layout(title = "Wimbledon 2015: Rank versus Fastest serve speed mph", font = f)

plot_ly(bp2015, x=Break.points.won,y = Rank, text=Player, size=Matches,color=Gender,mode="markers") %>%
  layout(title = "Wimbledon 2015: Rank versus Break points won", font = f)

wmb_df=as.matrix(wmb15[,-c(2,3,6)])
plot_ly(z = wmb_df, type = "surface") %>%
  layout(title = "Wimbledon 2015: Rank versus Double faults and Matches", font = f)

wmb_fss=as.matrix(fss2015[,-c(2,3,6)])
plot_ly(z = wmb_fss, type = "surface") %>%
  layout(title = "Wimbledon 2015: Rank versus Fastest serve speed mph and Matches", font = f)

wmb_bpw=as.matrix(bp2015[,-c(2,3,6)])
plot_ly(z = wmb_bpw, type = "surface") %>%
  layout(title = "Wimbledon 2015: Rank versus Break points won and Matches", font = f)

Note

  • wmb_df - double faults
  • wmb_fss - fastest serve speed mph
  • wmb_bpw - break points won
  • x - matches
  • y - rank
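
To make clear what the surface plots receive, here is a small sketch of the matrix construction. Dropping columns 2, 3 and 6 (Player, Country, Gender) leaves only the numeric columns, which as.matrix() turns into a numeric matrix. The frame wmb15_top3 is a hypothetical three-row excerpt transcribed from the printed table, not the full data set.

```r
# Hypothetical three-row excerpt of wmb15, transcribed from the printed table.
wmb15_top3 <- data.frame(
  Rank          = c(1, 2, 3),
  Player        = c("F.Verdasco", "D.Kudla", "K.Anderson"),
  Country       = c("ESP", "USA", "RSA"),
  Matches       = c(3, 4, 4),
  Double.faults = c(34, 33, 28),
  Gender        = c("male", "male", "male")
)

# Same column selection as wmb_df: drop Player (2), Country (3), Gender (6).
z <- as.matrix(wmb15_top3[, -c(2, 3, 6)])

print(colnames(z))   # the columns that remain
print(dim(z))        # one row per player, one column per numeric variable
print(z)
```

For the full data the same selection yields a 40 x 3 matrix, one row per player.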

Student's t-test

t.test(Double.faults~Gender)
## 
##  Welch Two Sample t-test
## 
## data:  Double.faults by Gender
## t = -1.1349, df = 35.225, p-value = 0.2641
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -6.831585  1.931585
## sample estimates:
## mean in group female   mean in group male 
##                18.00                20.45
t.test(Break.points.won~Gender)
## 
##  Welch Two Sample t-test
## 
## data:  Break.points.won by Gender
## t = 0.48555, df = 36.984, p-value = 0.6302
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -2.697108  4.397108
## sample estimates:
## mean in group female   mean in group male 
##                18.25                17.40
t.test(Fastest.serve.speed.mph~Gender)
## 
##  Welch Two Sample t-test
## 
## data:  Fastest.serve.speed.mph by Gender
## t = -15.16, df = 37.998, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -21.4805 -16.4195
## sample estimates:
## mean in group female   mean in group male 
##               117.45               136.40
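
Note that t.test() applies the Welch correction by default, which is why the degrees of freedom are fractional. As an illustration, the double-faults test can be reproduced by hand from the values printed in the Data section. The vectors male and female below are transcribed from that table; the variable names are illustrative.

```r
# Double faults transcribed from the printed wmb15 table (20 players each).
male   <- c(34, 33, 28, 25, 23, 22, 21, 21, 21, 19,
            19, 19, 17, 16, 16, 16, 15, 15, 15, 14)
female <- c(44, 28, 26, 22, 21, 20, 18, 17, 16, 16,
            15, 15, 15, 15, 13, 13, 12, 12, 11, 11)

# Squared standard errors of the two group means.
se2_m <- var(male)   / length(male)
se2_f <- var(female) / length(female)

# Welch t statistic (female minus male, matching the factor level order).
t_stat <- (mean(female) - mean(male)) / sqrt(se2_f + se2_m)

# Welch-Satterthwaite approximation to the degrees of freedom.
df_welch <- (se2_f + se2_m)^2 /
  (se2_f^2 / (length(female) - 1) + se2_m^2 / (length(male) - 1))

# Cross-check against the built-in test (Welch is the default).
welch <- t.test(female, male)

print(c(t = t_stat, df = df_welch, p = welch$p.value))
```

The hand calculation matches the reported t = -1.1349 on about 35.2 degrees of freedom: the 2.45 difference in mean double faults is small relative to the spread within each group.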

Linear models

model1=lm(data=wmb15, Rank~Double.faults+Gender+Matches+Country)
summary(model1)
## 
## Call:
## lm(formula = Rank ~ Double.faults + Gender + Matches + Country, 
##     data = wmb15)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6.3228 -0.8802  0.0000  1.3595  6.3228 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    24.2149     3.2951   7.349 1.14e-06 ***
## Double.faults  -0.6413     0.1280  -5.011 0.000107 ***
## Gendermale      2.1661     1.8548   1.168 0.258979    
## Matches        -0.6724     0.4720  -1.425 0.172392    
## CountryBEL      2.9279     4.1697   0.702 0.492068    
## CountryBLR      1.6747     3.7874   0.442 0.663937    
## CountryCAN     -1.5070     3.3717  -0.447 0.660556    
## CountryCOL      0.4666     4.4611   0.105 0.917920    
## CountryCRO     -1.0920     2.9242  -0.373 0.713443    
## CountryCZE      1.4752     3.7239   0.396 0.696927    
## CountryESP     -1.1212     3.6767  -0.305 0.764112    
## CountryFRA     -3.2244     3.3688  -0.957 0.351910    
## CountryGBR      2.2058     3.5195   0.627 0.539152    
## CountryGER     -2.2921     2.9348  -0.781 0.445539    
## CountryITA     -2.2416     4.8732  -0.460 0.651352    
## CountryLAT     -4.3269     4.5603  -0.949 0.356004    
## CountryROU     -1.2422     3.7350  -0.333 0.743515    
## CountryRSA     -2.7354     4.2304  -0.647 0.526533    
## CountryRUS      2.7133     4.2802   0.634 0.534566    
## CountrySRB      0.2105     4.1358   0.051 0.960008    
## CountrySUI     -0.8346     4.1944  -0.199 0.844644    
## CountryUKR     -3.9682     4.5334  -0.875 0.393595    
## CountryUSA     -0.3157     3.0337  -0.104 0.918332    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.342 on 17 degrees of freedom
## Multiple R-squared:  0.842,  Adjusted R-squared:  0.6375 
## F-statistic: 4.118 on 22 and 17 DF,  p-value: 0.002185
plot_ly(model1, y=model1$residuals, mode="markers", marker = list(color = "pink")) %>%
  layout(title = "Scatter plot of residuals for model1", font = f)

plot_ly(model1, x=model1$residuals, marker = list(color = "pink"), type="histogram") %>%
  layout(title = "Histogram of residuals for model1", font = f, yaxis = list(title = "Frequency" ))

shapiro.test(model1$residuals)
## 
##  Shapiro-Wilk normality test
## 
## data:  model1$residuals
## W = 0.95714, p-value = 0.1336
model2=lm(data=fss2015, Rank~Fastest.serve.speed.mph+Gender+Matches+Country)
summary(model2)
## 
## Call:
## lm(formula = Rank ~ Fastest.serve.speed.mph + Gender + Matches + 
##     Country, data = fss2015)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -3.054 -1.173  0.000  1.018  4.166 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)    
## (Intercept)             161.54921   11.97858  13.487 8.21e-12 ***
## Fastest.serve.speed.mph  -1.28849    0.09913 -12.997 1.65e-11 ***
## Gendermale               24.55328    2.11095  11.631 1.29e-10 ***
## Matches                   0.04636    0.34826   0.133    0.895    
## CountryAUT               -2.96086    2.45625  -1.205    0.241    
## CountryBUL               -2.85759    1.91204  -1.495    0.150    
## CountryCAN                1.81187    1.90588   0.951    0.353    
## CountryCRO               -2.22102    1.96103  -1.133    0.270    
## CountryCZE                0.72290    1.64493   0.439    0.665    
## CountryESP               -1.27253    1.93420  -0.658    0.518    
## CountryFRA               -0.82982    1.49760  -0.554    0.585    
## CountryGER               -0.77420    1.54341  -0.502    0.621    
## CountryHUN               -0.31187    2.42358  -0.129    0.899    
## CountryMNE               -0.99644    2.57543  -0.387    0.703    
## CountryRSA               -3.18813    2.42358  -1.315    0.203    
## CountrySRB               -0.22370    2.42813  -0.092    0.927    
## CountrySUI               -2.97350    1.94144  -1.532    0.141    
## CountryUKR                2.88519    2.53362   1.139    0.268    
## CountryUSA               -1.51154    1.36681  -1.106    0.281    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.117 on 21 degrees of freedom
## Multiple R-squared:  0.9165, Adjusted R-squared:  0.845 
## F-statistic: 12.81 on 18 and 21 DF,  p-value: 1.453e-07
plot_ly(model2, y=model2$residuals, mode="markers", marker = list(color = "yellow")) %>%
  layout(title = "Scatter plot of residuals for model2", font = f)

plot_ly(model2, x=model2$residuals, marker = list(color = "yellow"), type="histogram") %>%
  layout(title = "Histogram of residuals for model2", font = f, yaxis = list(title = "Frequency" ))

shapiro.test(model2$residuals)
## 
##  Shapiro-Wilk normality test
## 
## data:  model2$residuals
## W = 0.97904, p-value = 0.6539
model3=lm(data=bp2015, Rank~Break.points.won+Gender+Matches+Country)
summary(model3)
## 
## Call:
## lm(formula = Rank ~ Break.points.won + Gender + Matches + Country, 
##     data = bp2015)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.5545 -0.6268  0.0000  0.4984  2.6314 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       24.6223     2.6008   9.467 1.83e-07 ***
## Break.points.won  -0.9483     0.1863  -5.089 0.000165 ***
## Gendermale        -1.2301     0.9325  -1.319 0.208292    
## Matches            0.7467     0.8496   0.879 0.394284    
## CountryBEL        -1.3622     2.9605  -0.460 0.652478    
## CountryBLR        -2.9139     2.5888  -1.126 0.279276    
## CountryCAN        -0.7467     2.7901  -0.268 0.792896    
## CountryCRO         0.1498     2.7095   0.055 0.956686    
## CountryCYP         0.5914     2.9950   0.197 0.846296    
## CountryCZE        -0.7961     2.4333  -0.327 0.748364    
## CountryDEN        -1.4370     2.9059  -0.495 0.628604    
## CountryESP        -1.2784     2.5281  -0.506 0.620944    
## CountryFIN         2.5451     3.0664   0.830 0.420463    
## CountryFRA         0.4043     2.4133   0.168 0.869344    
## CountryGBR         1.0497     2.3122   0.454 0.656811    
## CountryGER         2.9131     2.7387   1.064 0.305473    
## CountryITA        -0.5638     3.2834  -0.172 0.866120    
## CountryKAZ         1.6665     2.8372   0.587 0.566311    
## CountryPOL        -3.3444     2.9796  -1.122 0.280559    
## CountryROU        -0.1413     2.8450  -0.050 0.961083    
## CountryRSA         1.0517     2.6641   0.395 0.698956    
## CountryRUS        -3.3444     2.9796  -1.122 0.280559    
## CountrySRB        -0.5707     2.3283  -0.245 0.809929    
## CountrySUI        -2.7512     2.2894  -1.202 0.249418    
## CountryUKR         5.2633     3.2627   1.613 0.129017    
## CountryUSA        -1.8216     2.3617  -0.771 0.453336    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.879 on 14 degrees of freedom
## Multiple R-squared:  0.9591, Adjusted R-squared:  0.8861 
## F-statistic: 13.14 on 25 and 14 DF,  p-value: 5.047e-06
plot_ly(model3, y=model3$residuals, mode="markers") %>%
  layout(title = "Scatter plot of residuals for model3", font = f)

plot_ly(model3, x=model3$residuals, type="histogram") %>%
  layout(title = "Histogram of residuals for model3", font = f, yaxis = list(title = "Frequency" ))

shapiro.test(model3$residuals)
## 
##  Shapiro-Wilk normality test
## 
## data:  model3$residuals
## W = 0.96171, p-value = 0.1915
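
The gap between multiple and adjusted R-squared in the three summaries reflects how many coefficients each model estimates from only 40 observations, mostly because of the Country dummies. A short sketch using only figures from the summary() output: adjusted R-squared is 1 - (1 - R^2) * (n - 1) / (residual degrees of freedom). The helper adj_r2 and the models table are illustrative names, not part of the original analysis.

```r
# Adjusted R-squared from the multiple R-squared, sample size and residual df.
adj_r2 <- function(r2, n, resid_df) {
  1 - (1 - r2) * (n - 1) / resid_df
}

# Figures copied from the three summary() outputs (n = 40 in each data set).
models <- data.frame(
  model    = c("model1", "model2", "model3"),
  r2       = c(0.842, 0.9165, 0.9591),
  resid_df = c(17, 21, 14)
)

models$adj_r2 <- adj_r2(models$r2, n = 40, resid_df = models$resid_df)

# Number of estimated coefficients, intercept included.
models$n_coef <- 40 - models$resid_df

print(models)
```

The recomputed values agree with the reported 0.6375, 0.845 and 0.8861. With 19 to 26 coefficients per model, the penalty is largest for model1, where the unadjusted fit is weakest.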

Conclusions

  1. No significant correlation was found between matches played and double faults, break points won, or fastest serve speed (Pearson r between 0.07 and 0.17, all p-values above 0.28).

  2. We see no significant difference between ladies and gentlemen as far as double faults and break points won are concerned. This is not the case for fastest serve speed: gentlemen serve significantly faster (mean 136.40 mph versus 117.45 mph, p-value < 2.2e-16).

  3. Some interesting facts from these statistics concern Serena Williams, who won the ladies' singles: she had the second-fastest serve among ladies (123 mph, tied with S.Lisicki and behind only L.Hradecka's 125 mph), the most break points won of any player, ladies or gentlemen (30), and 21 double faults compared with M.Sharapova's 44.

  4. Unfortunately, the gentlemen's singles champion N.Djokovic (SRB) appears in these results only for his 29 break points won.

  5. For each statistic, a linear model of player rank fits the data, and the Shapiro-Wilk test does not reject normality of the residuals (p-value = 0.13 for double faults, 0.19 for break points won, 0.65 for fastest serve speed).