Introduction

As a student of applied economics majoring in econometrics, I often find myself looking for datasets on which to apply the techniques I learn in class.

Not long ago, one of my teachers introduced me to the book Introductory Econometrics: A Modern Approach, 6e, by Jeffrey M. Wooldridge.

The book is full of examples, so every topic it covers can be put into practice. Better still, all the datasets used in the examples are publicly available, and there is an R package that contains them all.

The R package I will introduce in this article is the {wooldridge} package, written by Justin M. Shea [aut, cre] and Kenneth H. Brown [ctb].

You can read the official documentation at.

The official title of the package is “111 Data Sets for Econometrics”. It contains 111 datasets for regression analysis and other kinds of econometric modelling.

Installing the package is easy: just run install.packages("wooldridge") in an R console.

Let’s now install the package and review what it contains.

Installing the package and exploring it

Installing and loading

# install.packages("wooldridge")
library(wooldridge)
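
If you rerun a script often, you may not want to reinstall the package every time. Here is a minimal sketch of a conditional install, assuming the package is available from your configured CRAN mirror; it uses only base R functions.

```r
# Install wooldridge only when it is not already available,
# then attach it so the datasets can be used by name.
if (!requireNamespace("wooldridge", quietly = TRUE)) {
  install.packages("wooldridge")
}
library(wooldridge)
```

After this runs, "package:wooldridge" should appear in the output of search().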

What’s inside the package

As stated in the title of the package, it contains 111 datasets for econometrics.

Let’s print the list of all the datasets:

ls("package:wooldridge")
##   [1] "admnrev"       "affairs"       "airfare"       "alcohol"      
##   [5] "apple"         "approval"      "athlet1"       "athlet2"      
##   [9] "attend"        "audit"         "barium"        "beauty"       
##  [13] "benefits"      "beveridge"     "big9salary"    "bwght"        
##  [17] "bwght2"        "campus"        "card"          "catholic"     
##  [21] "cement"        "census2000"    "ceosal1"       "ceosal2"      
##  [25] "charity"       "consump"       "corn"          "countymurders"
##  [29] "cps78_85"      "cps91"         "crime1"        "crime2"       
##  [33] "crime3"        "crime4"        "discrim"       "driving"      
##  [37] "earns"         "econmath"      "elem94_95"     "engin"        
##  [41] "expendshares"  "ezanders"      "ezunem"        "fair"         
##  [45] "fertil1"       "fertil2"       "fertil3"       "fish"         
##  [49] "fringe"        "gpa1"          "gpa2"          "gpa3"         
##  [53] "happiness"     "hprice1"       "hprice2"       "hprice3"      
##  [57] "hseinv"        "htv"           "infmrt"        "injury"       
##  [61] "intdef"        "intqrt"        "inven"         "jtrain"       
##  [65] "jtrain2"       "jtrain3"       "k401k"         "k401ksubs"    
##  [69] "kielmc"        "lawsch85"      "loanapp"       "lowbrth"      
##  [73] "mathpnl"       "meap00_01"     "meap01"        "meap93"       
##  [77] "meapsingle"    "minwage"       "mlb1"          "mroz"         
##  [81] "murder"        "nbasal"        "nyse"          "okun"         
##  [85] "openness"      "pension"       "phillips"      "pntsprd"      
##  [89] "prison"        "prminwge"      "rdchem"        "rdtelec"      
##  [93] "recid"         "rental"        "return"        "saving"       
##  [97] "sleep75"       "slp75_81"      "smoke"         "traffic1"     
## [101] "traffic2"      "twoyear"       "volat"         "vote1"        
## [105] "vote2"         "voucher"       "wage1"         "wage2"        
## [109] "wagepan"       "wageprc"       "wine"
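
The names alone do not say much about what each dataset holds. As a small sketch, the base R function data() can list the datasets of an installed package together with their short titles; the variable name catalog is just my choice for illustration.

```r
# data(package = ...) returns an object whose $results matrix
# has one row per dataset, with "Item" and "Title" columns.
catalog <- data(package = "wooldridge")$results

# Show the name and short title of the first few datasets
head(catalog[, c("Item", "Title")])
```

For the full description of a single dataset, its help page can be opened with, for example, ?kielmc once the package is loaded.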
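
Here is a preview of one of them, kielmc:

set.seed(101)
# sample() on a data frame draws columns, so this previews
# the first 10 rows of 12 randomly chosen columns
knitr::kable(
  head(
    sample(kielmc, 12),
    10)
)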
rooms     ldist  y81nrinc  y81  agesq  lintstsq   lrprice  y81ldist     larea  age  nbh  baths
    7  9.277999         0    0   2304  47.71770  11.00210         0  7.414573   48    4      1
    6  9.305651         0    0   6889  47.71770  10.59663         0  7.867871   83    4      2
    6  9.350102         0    0   3364  47.71770  10.43412         0  7.042286   58    4      1
    5  9.384294         0    0    121  47.71770  11.06507         0  7.035269   11    4      1
    5  9.400961         0    0   2304  57.77368  10.69195         0  7.532624   48    4      1
    6  9.210339         0    0   6084  57.77368  10.73640         0  7.484369   78    4      3
    6  9.367344         0    0    484  57.77368  10.93311         0  7.438384   22    4      2
    6  9.230143         0    0   6084  57.77368  10.55841         0  7.349874   78    4      2
    8  9.259131         0    0   1764  57.77368  11.01040         0  7.403670   42    4      2
    5  9.305651         0    0   1681  57.77368  10.91509         0  7.274479   41    4      2
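
Because a data frame is a list of columns, sample(kielmc, 12) picks 12 random columns, not rows. If you would rather preview random rows, here is a minimal sketch; the names row_ids and kielmc_rows are only illustrative.

```r
library(wooldridge)

set.seed(101)  # make the random draw reproducible

# Draw 10 distinct row indices, then keep every column for those rows
row_ids <- sample(nrow(kielmc), 10)
kielmc_rows <- kielmc[row_ids, ]

dim(kielmc_rows)
```

The result has 10 rows and as many columns as kielmc itself.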

Let’s write a function to display the dimensions of each dataset:

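# Return a data frame with the number of rows and columns of each dataset
print_dimension <- function(names){
  nb_rows <- vector(mode = "numeric")
  nb_cols <- vector(mode = "numeric")

  for (i in names){
    # look the dataset up by name in the attached wooldridge package
    df <- get(i, envir = as.environment("package:wooldridge"))

    nb_rows[i] <- nrow(df)
    nb_cols[i] <- ncol(df)
  }

  data.frame(row.names = names,
             "Num.Obs" = nb_rows, "Num.Cols" = nb_cols)
}

# names of every dataset in the package
data_list <- ls("package:wooldridge")

knitr::kable(print_dimension(data_list))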
Dataset Num.Obs Num.Cols
admnrev 153 5
affairs 601 19
airfare 4596 14
alcohol 9822 33
apple 660 17
approval 78 16
athlet1 118 23
athlet2 30 10
attend 680 11
audit 241 3
barium 131 31
beauty 1260 17
benefits 1848 18
beveridge 135 8
big9salary 786 30
bwght 1388 14
bwght2 1832 23
campus 97 7
card 3010 34
catholic 7430 13
cement 312 30
census2000 29501 6
ceosal1 209 12
ceosal2 177 15
charity 4268 8
consump 37 24
corn 37 5
countymurders 37349 20
cps78_85 1084 15
cps91 5634 24
crime1 2725 16
crime2 92 34
crime3 106 12
crime4 630 59
discrim 410 37
driving 1200 56
earns 41 14
econmath 856 17
elem94_95 1848 14
engin 403 17
expendshares 1519 13
ezanders 108 25
ezunem 198 37
fair 21 28
fertil1 1129 27
fertil2 4361 27
fertil3 72 24
fish 97 20
fringe 616 39
gpa1 141 29
gpa2 4137 12
gpa3 732 23
happiness 17137 33
hprice1 88 10
hprice2 506 12
hprice3 321 19
hseinv 42 14
htv 1230 23
infmrt 102 12
injury 7150 30
intdef 56 13
intqrt 124 23
inven 37 13
jtrain 471 30
jtrain2 445 19
jtrain3 2675 20
k401k 1534 8
k401ksubs 9275 11
kielmc 321 25
lawsch85 156 21
loanapp 1989 59
lowbrth 100 36
mathpnl 3850 52
meap00_01 1692 9
meap01 1823 11
meap93 408 17
meapsingle 229 18
minwage 612 58
mlb1 353 47
mroz 753 22
murder 153 13
nbasal 269 22
nyse 691 8
okun 47 4
openness 114 12
pension 194 19
phillips 56 7
pntsprd 553 12
prison 714 45
prminwge 38 25
rdchem 32 8
rdtelec 29 6
recid 1445 18
rental 128 23
return 142 12
saving 100 7
sleep75 706 34
slp75_81 239 20
smoke 807 10
traffic1 51 13
traffic2 108 48
twoyear 6763 23
volat 558 17
vote1 173 10
vote2 186 26
voucher 990 19
wage1 526 24
wage2 935 17
wagepan 4360 44
wageprc 286 20
wine 21 5
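
The same table can also be built without an explicit loop. The following is only a sketch of one alternative, using vapply() from base R and assuming every object in the package is a data frame; the names pkg_env, data_names and dims are mine.

```r
library(wooldridge)

pkg_env <- as.environment("package:wooldridge")
data_names <- ls(pkg_env)

# dim() returns c(rows, cols) for each dataset; vapply() stacks these
# into a 2 x n matrix, which t() turns into one row per dataset.
dims <- t(vapply(
  data_names,
  function(nm) dim(get(nm, envir = pkg_env)),
  integer(2)
))
colnames(dims) <- c("Num.Obs", "Num.Cols")
dims <- as.data.frame(dims)

# The five datasets with the most observations
head(dims[order(-dims$Num.Obs), ], 5)
```

Sorting the result this way makes it easy to spot which datasets are large enough for a given exercise.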