The stargazer package is a great R package for comparing model outputs and displaying basic summary statistics.
The examples below use the pgatour2006 data from MARR 6.5.
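# Install each package if it is missing, then load it
if (!require('dplyr')) { install.packages('dplyr'); library('dplyr') }
if (!require('stargazer')) { install.packages('stargazer'); library('stargazer') }
if (!require('psych')) { install.packages('psych'); library('psych') }
pga <- read.csv('pgatour2006.csv', header = TRUE)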
pga <- pga %>%
  select(PrizeMoney, DrivingAccuracy, GIR, PuttingAverage,
         BirdieConversion, SandSaves, Scrambling, PuttsPerRound)
pgaLog <- pga
pgaLog$logPrizeMoney <- log(pga$PrizeMoney)
pgaLog$PrizeMoney <- NULL
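The call that produces the summary table is not shown in this post. A minimal sketch of how such a table could be generated follows, assuming the `summary.stat` argument of `stargazer()` is used to request the same columns. The data frame here is a small simulated stand-in (hypothetical values, fewer columns) so that the snippet runs on its own; in practice you would pass the real `pgaLog` built above.

```r
library(stargazer)

# Hypothetical stand-in for pgaLog so the sketch runs without pgatour2006.csv.
set.seed(42)
pgaLog <- data.frame(
  GIR           = rnorm(196, mean = 65, sd = 2.7),
  PuttsPerRound = rnorm(196, mean = 29.2, sd = 0.44),
  logPrizeMoney = rnorm(196, mean = 10.4, sd = 0.98)
)

# Passing a data frame makes stargazer print a summary-statistics table.
# summary.stat lists the statistics to report, in column order.
stargazer(pgaLog,
          type = "text",
          summary.stat = c("n", "mean", "sd", "min",
                           "p25", "median", "p75", "max"))
```

In an R Markdown document, setting `type = "html"` together with the chunk option `results='asis'` should render the table as HTML rather than as plain text.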
| Statistic | N | Mean | St. Dev. | Min | Pctl(25) | Median | Pctl(75) | Max |
|---|---|---|---|---|---|---|---|---|
| DrivingAccuracy | 196 | 63.380 | 5.413 | 49.750 | 59.758 | 63.240 | 66.965 | 78.430 |
| GIR | 196 | 65.186 | 2.722 | 56.870 | 63.523 | 65.355 | 66.770 | 74.150 |
| PuttingAverage | 196 | 1.780 | 0.025 | 1.712 | 1.763 | 1.778 | 1.796 | 1.851 |
| BirdieConversion | 196 | 28.982 | 2.207 | 23.170 | 27.508 | 29.010 | 30.553 | 35.660 |
| SandSaves | 196 | 48.972 | 5.828 | 33.910 | 45.130 | 48.655 | 52.870 | 63.640 |
| Scrambling | 196 | 57.494 | 3.162 | 49.020 | 55.260 | 57.650 | 59.457 | 66.450 |
| PuttsPerRound | 196 | 29.201 | 0.442 | 27.960 | 28.910 | 29.190 | 29.477 | 30.190 |
| logPrizeMoney | 196 | 10.378 | 0.980 | 7.714 | 9.762 | 10.509 | 10.967 | 13.404 |
One drawback of using stargazer for summary statistics is that it does not report skewness and kurtosis. We can get both with the describe function in the psych library.
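As a rough sketch of that step, the snippet below calls `describe()` and keeps only the columns shown in the table that follows. The data frame is a simulated stand-in with hypothetical values so the example runs on its own; with the real data you would call `describe(pga)` directly.

```r
library(psych)

# Hypothetical stand-in for pga so the sketch runs without pgatour2006.csv.
set.seed(42)
pga <- data.frame(
  PrizeMoney = rlnorm(196, meanlog = 10.4, sdlog = 0.98),
  GIR        = rnorm(196, mean = 65, sd = 2.7)
)

# describe() returns one row per variable, including skew and kurtosis.
desc <- describe(pga)

# Keep only the statistics of interest.
desc[, c("n", "mean", "sd", "median", "min", "max", "skew", "kurtosis")]
```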
| | n | mean | sd | median | min | max | skew | kurtosis |
|---|---|---|---|---|---|---|---|---|
| PrizeMoney | 196 | 50891.168367 | 6.390295e+04 | 36644.500 | 2240.000 | 662771.000 | 5.2943317 | 42.5710106 |
| DrivingAccuracy | 196 | 63.380102 | 5.413023e+00 | 63.240 | 49.750 | 78.430 | 0.0942275 | 0.0281998 |
| GIR | 196 | 65.186071 | 2.722364e+00 | 65.355 | 56.870 | 74.150 | -0.2459686 | 0.6762018 |
| PuttingAverage | 196 | 1.779852 | 2.472810e-02 | 1.778 | 1.712 | 1.851 | 0.1562248 | -0.2411670 |
| BirdieConversion | 196 | 28.982296 | 2.206556e+00 | 29.010 | 23.170 | 35.660 | -0.0215926 | 0.2586721 |
| SandSaves | 196 | 48.971735 | 5.828313e+00 | 48.655 | 33.910 | 63.640 | 0.0028410 | -0.2420466 |
| Scrambling | 196 | 57.494439 | 3.162257e+00 | 57.650 | 49.020 | 66.450 | 0.0037607 | 0.0927853 |
| PuttsPerRound | 196 | 29.201071 | 4.417023e-01 | 29.190 | 27.960 | 30.190 | 0.1282510 | -0.1031871 |
Where the stargazer library really shines is in its ability to compare different models in a single table.
mod1 <- lm(PrizeMoney ~., pga)
mod2 <- lm(logPrizeMoney ~., pgaLog)
mod3 <- lm(logPrizeMoney ~ GIR + BirdieConversion + SandSaves + Scrambling + PuttsPerRound, pgaLog)
This example uses three models: two use a log transformation of Y and one uses the original values of Y. The stargazer package makes it easy to display these models side by side and to see clearly how they differ.
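The stargazer call behind the comparison table is not shown in this post. One plausible way to produce it is sketched below, assuming the `column.labels` argument supplies the Good/Better/Best labels. The data are simulated stand-ins with hypothetical values and fewer predictors, so the numbers will not match the table; with the real data you would pass the `mod1`, `mod2` and `mod3` objects fitted above.

```r
library(stargazer)

# Hypothetical stand-in data so the sketch runs without pgatour2006.csv.
set.seed(42)
n <- 196
pga <- data.frame(
  GIR              = rnorm(n, mean = 65, sd = 2.7),
  BirdieConversion = rnorm(n, mean = 29, sd = 2.2)
)
pga$PrizeMoney <- exp(-7 + 0.2 * pga$GIR + 0.16 * pga$BirdieConversion +
                        rnorm(n, sd = 0.66))

# Same transformation as in the post: replace PrizeMoney with its log.
pgaLog <- pga
pgaLog$logPrizeMoney <- log(pga$PrizeMoney)
pgaLog$PrizeMoney <- NULL

mod1 <- lm(PrizeMoney ~ ., data = pga)        # original response
mod2 <- lm(logPrizeMoney ~ ., data = pgaLog)  # log response, all predictors
mod3 <- lm(logPrizeMoney ~ GIR, data = pgaLog) # log response, reduced model

# One column per model; column.labels names the columns.
stargazer(mod1, mod2, mod3,
          type = "text",
          column.labels = c("Good", "Better", "Best"))
```

Using `type = "text"` keeps the output readable in the console; `type = "html"` or `type = "latex"` would suit a rendered report.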
| | (1) Good | (2) Better | (3) Best |
|---|---|---|---|
| Dependent variable | PrizeMoney | logPrizeMoney | logPrizeMoney |
| DrivingAccuracy | -1,835.830** | -0.004 | |
| | (889.161) | (0.012) | |
| GIR | 9,671.334*** | 0.199*** | 0.197*** |
| | (3,309.355) | (0.044) | (0.029) |
| PuttingAverage | -47,435.300 | -0.466 | |
| | (521,566.400) | (6.906) | |
| BirdieConversion | 10,426.030*** | 0.157*** | 0.163*** |
| | (3,049.642) | (0.040) | (0.033) |
| SandSaves | 1,182.058 | 0.015 | 0.016 |
| | (744.818) | (0.010) | (0.010) |
| Scrambling | 4,741.258** | 0.052 | 0.050** |
| | (2,400.818) | (0.032) | (0.025) |
| PuttsPerRound | 5,267.517 | -0.343 | -0.350 |
| | (35,765.740) | (0.474) | (0.231) |
| Constant | -1,165,233.000** | 0.194 | -0.583 |
| | (587,382.900) | (7.777) | (7.159) |
| Observations | 196 | 196 | 196 |
| R² | 0.406 | 0.558 | 0.557 |
| Adjusted R² | 0.384 | 0.541 | 0.546 |
| Residual Std. Error | 50,142.970 (df = 188) | 0.664 (df = 188) | 0.661 (df = 190) |
| F Statistic | 18.387*** (df = 7; 188) | 33.866*** (df = 7; 188) | 47.875*** (df = 5; 190) |
| Note: | *p<0.1; **p<0.05; ***p<0.01 | | |
The stargazer package makes it easier to display data summaries and model outputs, which helps in decision making.