Have you ever come across some valuable data on a website and wished to access and use it in your research?[1] Did you get discouraged from using data from the internet because writing down the data by hand and transferring it to a spreadsheet seemed daunting?[2] Have you ever spent a substantial amount of time copying and pasting data from a website, with plenty of frustration and headache? If you have found yourself in any of these or similar situations, this write-up is for you. I will describe the basics of harvesting data from websites, commonly known as web scraping, using the R programming language (R Core Team 2021). I start with tabular data that most financial analysts, economists, and other business professionals and academics use on many occasions. In future articles, I will delve deeper into scraping websites for text and other data types. The basics in this section should meet the needs of most users, but watch out for more advanced applications of web scraping on my blog and rpubs.
[1] The author is at Karatina University, School of Business, P.O. Box 1957-10101, Karatina, Kenya.
[2] The R code for this project is available in my GitHub account at https://github.com/Karuitha/scrapping_la_liga/blob/master/code/scraps.R. Copy and paste the address into a new browser tab.
R Core Team. 2021. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
I have conveniently chosen the results of the premier soccer league in Spain, La Liga. The data comes from two sources. The first source is Wikipedia. On this site, we shall scrape data on the results of the Spanish La Liga from 1929 to 2021. The site gives the top three teams over the years together with the names of the top scorers.[3] The second source is Sky Sports. This second source provides the more recent La Liga results, from 2009 to 2021. I use this site to illustrate how to scrape data that spans multiple web pages.[4]
[3] You can view this page at https://en.wikipedia.org/wiki/List_of_Spanish_football_champions
[4] Again, you can view this data at https://www.skysports.com/la-liga-table/
In this article, I aim to demonstrate web scraping using R and the rvest package from the tidyverse.
In my analysis, I assume basic knowledge of the R programming language and regular expressions (Regex). Moreover, my article aims mainly at demonstrating web scraping and not the analysis of the data. However, I do a bit of data cleaning because it is a natural extension of web scraping. Data acquired through web scraping is rarely clean and hence requires a substantial amount of pre-processing to be useful for analysis. Finally, this project targets a general audience that may not appreciate the academic nuances of inserting citations at the end of a sentence. However, I do include the references as side notes to the main text.
As in every other activity, it is essential to weigh ethical considerations when harvesting data from the internet. Ethics in gathering data from the internet is especially pertinent with the increase in online criminal activity (Krotov and Silva 2018). Several issues are important in this respect.
Krotov, Vlad, and Leiser Silva. 2018. “Legality and Ethics of Web Scraping.”
NB: Just because you can do web scraping does not mean you must do it.
It is important to grasp XML, HTML, and CSS fundamentals to appreciate the basics of web scraping in any programming language. Starting with William W. Tunnicliffe’s presentation in 1967, markup languages have evolved into various forms to suit different applications. Examples include TeX, HTML, XML, and XHTML. I highlight XML and HTML and how we can take advantage of each to identify relevant content (Ramasubramanian and Singh 2017).
Ramasubramanian, Karthik, and Abhishek Singh. 2017. Machine Learning Using R. 1. Springer.
The Hypertext Markup Language (HTML) is the language that programmers use to create web pages. Usually, HTML is combined with CSS to create elegant web pages. To scrape web pages for data, we have to scan for the following five elements (Bradley and James 2019).
Bradley, Alex, and Richard JE James. 2019. “Web Scraping Using R.” Advances in Methods and Practices in Psychological Science 2 (3): 264–70.
Commonly, there are six headers, h1 to h6, as follows (Michaud 2013).
<h1> Header 1 </h1>
<h2> Header 2 </h2>
<h3> Header 3 </h3>
<h4> Header 4 </h4>
<h5> Header 5 </h5>
<h6> Header 6 </h6>
Michaud, Thomas. 2013. Foundations of Web Design: Introduction to HTML & CSS. New Riders.
Paragraphs use the tag p.
<p> Paragraph 1 </p>
<p> Paragraph 2 </p>
Tables use the following tags.
<table> tag that captures the main table structure.
<tbody> tag that specifies the body of the table.
<thead> tag that specifies the table header.
<tr> tag that specifies each of the rows of the table.
Links use the a tag, with the address in the href attribute.
<a href="https://rpubs.com/Karuitha/karuitha_cars_pressure"> Click to see my project </a>
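To see how these tags map onto rvest selectors, here is a minimal sketch that parses a small inline HTML page rather than a live website. The page content and the object names (page, headers, and so on) are made up for illustration, and the sketch assumes a recent version of rvest (1.0.0 or later) for minimal_html, html_elements, and html_text2.

```r
library(rvest)

# A tiny, made-up page containing the tags discussed above
page <- minimal_html('
  <h1> Header 1 </h1>
  <h2> Header 2 </h2>
  <p> Paragraph 1 </p>
  <p> Paragraph 2 </p>
  <table>
    <thead><tr><th>team</th><th>pts</th></tr></thead>
    <tbody>
      <tr><td>Atletico Madrid</td><td>86</td></tr>
      <tr><td>Real Madrid</td><td>84</td></tr>
    </tbody>
  </table>
  <a href="https://rpubs.com/Karuitha/karuitha_cars_pressure"> Click to see my project </a>
')

# Headers: select several tags at once with a comma-separated CSS selector
headers <- page %>% html_elements("h1, h2") %>% html_text2()

# Paragraphs: one string per <p> tag
paragraphs <- page %>% html_elements("p") %>% html_text2()

# Tables: html_table() returns a tibble when applied to a single table node
standings <- page %>% html_element("table") %>% html_table()

# Links: the address lives in the href attribute, not in the visible text
link <- page %>% html_element("a") %>% html_attr("href")
```

The same selectors work on a page read from the web with read_html, although real pages usually need more specific CSS selectors than bare tag names.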
In this section, I highlight how one can get data from a webpage. As noted earlier, I use the Wikipedia site. The data of interest is tabular. The steps in scraping this data using the rvest package (Wickham 2021) in R are as follows (Wickham et al. 2019).
Read the web page using the read_html function.
Capture the table nodes using the html_nodes function and specifying table.
Extract the tables using the html_table function. Note that this page has many tables.
Pick the table of interest by its position a in the list using [[a]].
The code below reproduces these steps.
Wickham, Hadley. 2021. Rvest: Easily Harvest (Scrape) Web Pages. https://CRAN.R-project.org/package=rvest.
Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.
Because the dataset is large, I only present the first ten rows of the data. Please scroll right or left to view the entire set of variables in the table (Mailund 2017).
Mailund, Thomas. 2017. Beginning Data Science in R. Springer.
## Load the packages used throughout
library(tidyverse)
library(rvest)
## Define the url
url2 <- "https://en.wikipedia.org/wiki/List_of_Spanish_football_champions"
## Read the html
read_html(url2) %>%
  ## Capture the nodes for tables
  html_nodes("table") %>%
  ## Capture the tables
  html_table() %>%
  ## Capture the third table in the series
  .[[3]] %>%
  ## Remove square brackets and digits from the column names
  set_names(names(.) %>% str_remove_all("\\[|\\]|\\d")) %>%
  ## Clean names by removing spaces
  janitor::clean_names() %>%
  ## Remove redundant columns
  select(-starts_with("x")) %>%
  ## Clean the team names in the winners column
  mutate(winners = str_remove_all(winners, "\\(\\d+\\)|\\*")) %>%
  ## Pick the first 10 rows
  head(10) %>%
  ## Make a nice table
  knitr::kable()
season | winners | runners_up | third_place | top_scorer_s | top_scorers_club_s | goals |
---|---|---|---|---|---|---|
1929 | Barcelona | Real Madrid (1) | Athletic Bilbao | Paco Bienzobas | Real Sociedad | 14 |
1929–30 | Athletic Bilbao | Barcelona (1) | Arenas | Guillermo Gorostiza | Athletic Bilbao | 19 |
1930–31 | Athletic Bilbao | Racing Santander (1) | Real Sociedad | Bata | Athletic Bilbao | 27 |
1931–32 | Real Madrid | Athletic Bilbao (1) | Barcelona | Guillermo Gorostiza | Athletic Bilbao | 12 |
1932–33 | Real Madrid | Athletic Bilbao (2) | Espanyol | Manuel Olivares | Real Madrid | 16 |
1933–34 | Athletic Bilbao | Real Madrid (2) | Racing Santander | Isidro Lángara | Oviedo | 27 |
1934–35 | Real Betis | Real Madrid (3) | Oviedo | Isidro Lángara | Oviedo | 26 |
1935–36 | Athletic Bilbao | Real Madrid (4) | Oviedo | Isidro Lángara | Oviedo | 27 |
1936–37 | Spanish Civil War (League Cancelled) | Spanish Civil War (League Cancelled) | Spanish Civil War (League Cancelled) | Spanish Civil War (League Cancelled) | Spanish Civil War (League Cancelled) | Spanish Civil War (League Cancelled) |
1937–38 | Spanish Civil War (League Cancelled) | Spanish Civil War (League Cancelled) | Spanish Civil War (League Cancelled) | Spanish Civil War (League Cancelled) | Spanish Civil War (League Cancelled) | Spanish Civil War (League Cancelled) |
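The index 3 in the code above depends on where the table sits on the page, so it can silently pick the wrong table if the page layout changes. One way to guard against that is to pick the table by its column names. The sketch below illustrates the idea on a small made-up page; the helper name pick_table and the column name it searches for are assumptions for illustration and are not part of rvest.

```r
library(rvest)

# A made-up page with two tables; only the second holds the data we want
page <- minimal_html('
  <table>
    <tr><th>Symbol</th><th>Meaning</th></tr>
    <tr><td>*</td><td>Example note</td></tr>
  </table>
  <table>
    <tr><th>Season</th><th>Winners</th></tr>
    <tr><td>1929</td><td>Barcelona</td></tr>
    <tr><td>1929-30</td><td>Athletic Bilbao</td></tr>
  </table>
')

# Hypothetical helper: return the first table that has a given column name
pick_table <- function(tables, column) {
  has_column <- vapply(tables, function(tbl) column %in% names(tbl), logical(1))
  if (!any(has_column)) {
    stop("No table with a column named '", column, "'")
  }
  tables[[which(has_column)[1]]]
}

# html_table() on several nodes returns a list with one tibble per table
tables <- page %>% html_elements("table") %>% html_table()
champions <- pick_table(tables, "Winners")
```

Because the helper stops with an error when no table matches, a change in the page is noticed immediately instead of producing a wrong table.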
I follow the same steps on the Sky Sports website to generate the La Liga standings for the 2020-2021 season.
#########################################
## PART 2: SCRAPING ONE WEB PAGE
## Initial trial with one url for the 2020/2021 season
## Define the url
url <- "https://www.skysports.com/la-liga-table/2020"
## Read the html
read_html(url) %>%
  ## Capture the nodes for tables
  html_nodes("table") %>%
  ## Capture the tables
  html_table() %>%
  ## Capture the first table in the series
  .[[1]] %>%
  ## Rename the position column
  rename(pos = `#`) %>%
  ## Clean the column names
  janitor::clean_names() %>%
  ## Remove the redundant last_6 column
  select(-last_6) %>%
  ## Make a nice table
  knitr::kable(caption = "La Liga Standings 2020-2021 Season")
La Liga Standings 2020-2021 Season
pos | team | pl | w | d | l | f | a | gd | pts |
---|---|---|---|---|---|---|---|---|---|
1 | Atletico Madrid | 38 | 26 | 8 | 4 | 67 | 25 | 42 | 86 |
2 | Real Madrid | 38 | 25 | 9 | 4 | 67 | 28 | 39 | 84 |
3 | Barcelona | 38 | 24 | 7 | 7 | 85 | 38 | 47 | 79 |
4 | Sevilla | 38 | 24 | 5 | 9 | 53 | 33 | 20 | 77 |
5 | Real Sociedad | 38 | 17 | 11 | 10 | 59 | 38 | 21 | 62 |
6 | Real Betis | 38 | 17 | 10 | 11 | 50 | 50 | 0 | 61 |
7 | Villarreal | 38 | 15 | 13 | 10 | 60 | 44 | 16 | 58 |
8 | Celta Vigo | 38 | 14 | 11 | 13 | 55 | 57 | -2 | 53 |
9 | Granada | 38 | 13 | 7 | 18 | 47 | 65 | -18 | 46 |
10 | Athletic Bilbao | 38 | 11 | 13 | 14 | 46 | 42 | 4 | 46 |
11 | Osasuna | 38 | 11 | 11 | 16 | 37 | 48 | -11 | 44 |
12 | Cadiz | 38 | 11 | 11 | 16 | 36 | 58 | -22 | 44 |
13 | Valencia | 38 | 10 | 13 | 15 | 50 | 53 | -3 | 43 |
14 | Levante | 38 | 9 | 14 | 15 | 46 | 57 | -11 | 41 |
15 | Getafe | 38 | 9 | 11 | 18 | 28 | 43 | -15 | 38 |
16 | Alaves | 38 | 9 | 11 | 18 | 36 | 57 | -21 | 38 |
17 | Elche | 38 | 8 | 12 | 18 | 34 | 55 | -21 | 36 |
18 | SD Huesca | 38 | 7 | 13 | 18 | 34 | 53 | -19 | 34 |
19 | Real Valladolid | 38 | 5 | 16 | 17 | 34 | 57 | -23 | 31 |
20 | Eibar | 38 | 6 | 12 | 20 | 29 | 52 | -23 | 30 |
#########################################
So far, so good.
What happens when the data of interest spans multiple pages of a website? For instance, on the Sky Sports website for the La Liga results, each season has its own page. However, the page addresses follow a consistent pattern: every address starts with https://www.skysports.com/la-liga-table/ followed by the year. For instance, the address for 2020 is https://www.skysports.com/la-liga-table/2020.
Hence, we can generate web addresses that match the seasons we are targeting. We shall revisit[5] this critical issue in a moment. First, I write a function that will allow us to scrape a website when provided with a URL.
[5] This reminds me of Kenya’s President Uhuru Kenyatta’s promise to “revisit” the judiciary. Both parties are now busy reviewing each other.
#########################################
## PART 3: SCRAPE MULTIPLE WEB PAGES
# Scrape the 12 links of the results from 2009 to 2021.
# NB: Every page starts with "https://www.skysports.com/la-liga-table/"
# Then for every respective year, the year is appended.
# For instance for 2020, the address is "https://www.skysports.com/la-liga-table/2020"
## Write a scraping function
scrapper <- function(url){
  ## Pause for two seconds before reading each page
  Sys.sleep(2)
  ## Read the html
  read_html(url) %>%
    ## Capture the nodes for tables
    html_nodes("table") %>%
    ## Capture the tables
    html_table() %>%
    ## Capture the first table in the series
    .[[1]] %>%
    ## Rename the position column
    rename(pos = `#`) %>%
    ## Clean the column names
    janitor::clean_names() %>%
    ## Remove the redundant last_6 column
    select(-last_6)
}
##########################################
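As written, the function stops with an error if any single page fails to load, which would abort a run over many pages. A minimal sketch of one way to handle this, using base R's tryCatch, is shown below. The wrapper name safe_scrape and the stand-in function fake_scraper are hypothetical; in practice one would pass the scrapper function defined above in place of fake_scraper.

```r
# Hypothetical wrapper: run a scraping function on one URL and return NULL
# (with a warning) instead of stopping when that page fails
safe_scrape <- function(url, scrape_fun) {
  tryCatch(
    scrape_fun(url),
    error = function(e) {
      warning("Could not scrape ", url, ": ", conditionMessage(e), call. = FALSE)
      NULL
    }
  )
}

# Stand-in for a real scraper so the sketch runs offline:
# it fails for any address ending in "1999"
fake_scraper <- function(url) {
  if (grepl("1999$", url)) stop("page not found")
  data.frame(pos = 1:2, team = c("Team A", "Team B"))
}

urls <- paste0("https://www.skysports.com/la-liga-table/", c(2020, 1999))
results <- suppressWarnings(lapply(urls, safe_scrape, scrape_fun = fake_scraper))

# Drop the pages that failed
results <- Filter(Negate(is.null), results)
```

Returning NULL for a failed page keeps the successful pages and leaves it to the analyst to decide whether to retry the missing ones.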
The following code generates the web addresses on the Sky Sports website that correspond to the seasons from 2009 to 2020. Notice that we now have all the URLs.
##########################################
## Capture all the urls for years 2009-2021
many_urls <- paste0("https://www.skysports.com/la-liga-table/", 2020:2009)
many_urls
## [1] "https://www.skysports.com/la-liga-table/2020"
## [2] "https://www.skysports.com/la-liga-table/2019"
## [3] "https://www.skysports.com/la-liga-table/2018"
## [4] "https://www.skysports.com/la-liga-table/2017"
## [5] "https://www.skysports.com/la-liga-table/2016"
## [6] "https://www.skysports.com/la-liga-table/2015"
## [7] "https://www.skysports.com/la-liga-table/2014"
## [8] "https://www.skysports.com/la-liga-table/2013"
## [9] "https://www.skysports.com/la-liga-table/2012"
## [10] "https://www.skysports.com/la-liga-table/2011"
## [11] "https://www.skysports.com/la-liga-table/2010"
## [12] "https://www.skysports.com/la-liga-table/2009"
########################################
Now, we run a loop over all URLs using the function defined earlier. The results are in the appendix.
#######################################
## Run a loop over all the web pages
la_liga_09_2020 <- lapply(many_urls, scrapper)
## Give each table a name corresponding to the year
names(la_liga_09_2020) <- paste0("year", 2020:2009)
##########################################
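A named list of tables is convenient for printing, but a single data frame is often easier to analyze. The sketch below, which is an addition and not part of the original analysis, stacks a list of tables into one data frame with dplyr::bind_rows and keeps the list names in a new column. The small list standings_list is a stand-in built from the first two rows of the 2020 and 2019 tables in the appendix, and the column name season is my own choice.

```r
library(dplyr)

# Stand-in for la_liga_09_2020: a named list with one table per year
standings_list <- list(
  year2020 = tibble(pos = 1:2,
                    team = c("Atletico Madrid", "Real Madrid"),
                    pts = c(86L, 84L)),
  year2019 = tibble(pos = 1:2,
                    team = c("Real Madrid", "Barcelona"),
                    pts = c(87L, 82L))
)

# Stack the tables; .id stores each table's list name in a "season" column
all_seasons <- bind_rows(standings_list, .id = "season")

# The champions are then the rows in first position
champions <- all_seasons %>%
  filter(pos == 1) %>%
  select(season, team)
```

With the full list, the same two steps would give one row per season without indexing into each table separately.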
In this section, I do some data exploration using the La Liga data from 2009-2021. My question is, which team has won the most La Liga trophies over these seasons (2009-2021)? The visualization below tells it all.
##########################################
## Get the top team in each year
top_teams_2009_20 <- sapply(la_liga_09_2020, "[", 1, "team") %>%
  ## Unlist to make one vector
  unlist() %>%
  ## Coerce to tibble
  tibble() %>%
  ## Rename column
  rename(team = ".")
################################################################################
################################################################################
## Capture the top teams
top_teams_2009_20 %>%
  ## Make a table of counts (sorted alphabetically by team)
  table() %>%
  ## Convert table into tibble
  tibble() %>%
  ## Rename the tibble column
  rename(no = ".") %>%
  ## Add team names in the same alphabetical order as the counts
  mutate(team = c("Atletico Madrid", "Barcelona", "Real Madrid")) %>%
  ## Convert teams to factors
  mutate(team = factor(team)) %>%
  ## Relocate team column to first position
  relocate(team) %>%
  ## Plot the data
  ggplot(aes(x = fct_reorder(team, no), y = no, fill = team)) +
  ## Add a geom
  geom_col(col = "black", show.legend = FALSE) +
  ## Add text labels
  geom_text(aes(label = no), nudge_y = 0.3) +
  ## Add x and y labels and a title, subtitle
  labs(x = "Teams", y = "", title = "LA LIGA ANALYSIS",
       subtitle = "La Liga Winners, 2009-2021",
       caption = "Developed by John Karuitha using R and the ggplot2 package") +
  ## Select bar colors
  scale_fill_manual(values = c("red", "blue", "white")) +
  ## Add a pleasant theme
  ggthemes::theme_clean()
## Don't know how to automatically pick scale for object of type table. Defaulting to continuous.
## Don't know how to automatically pick scale for object of type table. Defaulting to continuous.
############################################
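The counting step above relies on table() and on typing the team names by hand in alphabetical order, and the table-typed column is also what triggers the "Don't know how to automatically pick scale for object of type table" messages. A possible alternative, sketched here as a suggestion rather than the original method, is dplyr::count, which keeps the team names attached to their counts. The tibble winners lists the first-placed team of each season from 2020 back to 2009, as shown in the appendix.

```r
library(dplyr)

# First-placed teams from the appendix tables, seasons 2020 down to 2009
winners <- tibble(team = c(
  "Atletico Madrid", "Real Madrid", "Barcelona", "Barcelona",
  "Real Madrid", "Barcelona", "Barcelona", "Atletico Madrid",
  "Barcelona", "Real Madrid", "Barcelona", "Barcelona"
))

# Count titles per team, most successful first; the count column is named "no"
title_counts <- winners %>%
  count(team, name = "no", sort = TRUE)
```

The result has an ordinary integer column no, so it can be passed to ggplot in the same way as before without the scale messages.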
Web scraping is a valuable skill in every researcher’s toolkit, more so with the rise of social media research. However, most scraped data can be messy and may require substantial cleaning effort. In this write-up, I used R to demonstrate web scraping. Other programming languages like Python offer the same capabilities.
############################################
## Appendix: The full list of scraped data
la_liga_09_2020
## $year2020
## # A tibble: 20 x 10
## pos team pl w d l f a gd pts
## <int> <chr> <int> <int> <int> <int> <int> <int> <int> <int>
## 1 1 Atletico Madrid 38 26 8 4 67 25 42 86
## 2 2 Real Madrid 38 25 9 4 67 28 39 84
## 3 3 Barcelona 38 24 7 7 85 38 47 79
## 4 4 Sevilla 38 24 5 9 53 33 20 77
## 5 5 Real Sociedad 38 17 11 10 59 38 21 62
## 6 6 Real Betis 38 17 10 11 50 50 0 61
## 7 7 Villarreal 38 15 13 10 60 44 16 58
## 8 8 Celta Vigo 38 14 11 13 55 57 -2 53
## 9 9 Granada 38 13 7 18 47 65 -18 46
## 10 10 Athletic Bilbao 38 11 13 14 46 42 4 46
## 11 11 Osasuna 38 11 11 16 37 48 -11 44
## 12 12 Cadiz 38 11 11 16 36 58 -22 44
## 13 13 Valencia 38 10 13 15 50 53 -3 43
## 14 14 Levante 38 9 14 15 46 57 -11 41
## 15 15 Getafe 38 9 11 18 28 43 -15 38
## 16 16 Alaves 38 9 11 18 36 57 -21 38
## 17 17 Elche 38 8 12 18 34 55 -21 36
## 18 18 SD Huesca 38 7 13 18 34 53 -19 34
## 19 19 Real Valladolid 38 5 16 17 34 57 -23 31
## 20 20 Eibar 38 6 12 20 29 52 -23 30
##
## $year2019
## # A tibble: 20 x 10
## pos team pl w d l f a gd pts
## <int> <chr> <int> <int> <int> <int> <int> <int> <int> <int>
## 1 1 Real Madrid 38 26 9 3 70 25 45 87
## 2 2 Barcelona 38 25 7 6 86 38 48 82
## 3 3 Atletico Madrid 38 18 16 4 51 27 24 70
## 4 4 Sevilla 38 19 13 6 54 34 20 70
## 5 5 Villarreal 38 18 6 14 63 49 14 60
## 6 6 Real Sociedad 38 16 8 14 56 48 8 56
## 7 7 Granada 38 16 8 14 52 45 7 56
## 8 8 Getafe 38 14 12 12 43 37 6 54
## 9 9 Valencia 38 14 11 13 46 53 -7 53
## 10 10 Osasuna 38 13 13 12 46 54 -8 52
## 11 11 Athletic Bilbao 38 13 12 13 41 38 3 51
## 12 12 Levante 38 14 7 17 47 53 -6 49
## 13 13 Real Valladolid 38 9 15 14 32 43 -11 42
## 14 14 Eibar 38 11 9 18 39 56 -17 42
## 15 15 Real Betis 38 10 11 17 48 60 -12 41
## 16 16 Alaves 38 10 9 19 34 59 -25 39
## 17 17 Celta Vigo 38 7 16 15 37 49 -12 37
## 18 18 Leganes 38 8 12 18 30 51 -21 36
## 19 19 Real Mallorca 38 9 6 23 40 65 -25 33
## 20 20 Espanyol 38 5 10 23 27 58 -31 25
##
## $year2018
## # A tibble: 20 x 10
## pos team pl w d l f a gd pts
## <int> <chr> <int> <int> <int> <int> <int> <int> <int> <int>
## 1 1 Barcelona 38 26 9 3 90 36 54 87
## 2 2 Atletico Madrid 38 22 10 6 55 29 26 76
## 3 3 Real Madrid 38 21 5 12 63 46 17 68
## 4 4 Valencia 38 15 16 7 51 35 16 61
## 5 5 Getafe 38 15 14 9 48 35 13 59
## 6 6 Sevilla 38 17 8 13 62 47 15 59
## 7 7 Espanyol 38 14 11 13 48 50 -2 53
## 8 8 Athletic Bilbao 38 13 14 11 41 45 -4 53
## 9 9 Real Sociedad 38 13 11 14 45 46 -1 50
## 10 10 Real Betis 38 14 8 16 44 52 -8 50
## 11 11 Alaves 38 13 11 14 39 50 -11 50
## 12 12 Eibar 38 11 14 13 46 50 -4 47
## 13 13 Leganes 38 11 12 15 37 43 -6 45
## 14 14 Villarreal 38 10 14 14 49 52 -3 44
## 15 15 Levante 38 11 11 16 59 66 -7 44
## 16 16 Real Valladolid 38 10 11 17 32 51 -19 41
## 17 17 Celta Vigo 38 10 11 17 53 62 -9 41
## 18 18 Girona 38 9 10 19 37 53 -16 37
## 19 19 SD Huesca 38 7 12 19 43 65 -22 33
## 20 20 Rayo Vallecano 38 8 8 22 41 70 -29 32
##
## $year2017
## # A tibble: 20 x 10
## pos team pl w d l f a gd pts
## <int> <chr> <int> <int> <int> <int> <int> <int> <int> <int>
## 1 1 Barcelona 38 28 9 1 99 29 70 93
## 2 2 Atletico Madrid 38 23 10 5 58 22 36 79
## 3 3 Real Madrid 38 22 10 6 94 44 50 76
## 4 4 Valencia 38 22 7 9 65 38 27 73
## 5 5 Villarreal 38 18 7 13 57 50 7 61
## 6 6 Real Betis 38 18 6 14 60 61 -1 60
## 7 7 Sevilla 38 17 7 14 49 58 -9 58
## 8 8 Getafe 38 15 10 13 42 33 9 55
## 9 9 Eibar 38 14 9 15 44 50 -6 51
## 10 10 Girona 38 14 9 15 50 59 -9 51
## 11 11 Espanyol 38 12 13 13 36 42 -6 49
## 12 12 Real Sociedad 38 14 7 17 66 59 7 49
## 13 13 Celta Vigo 38 13 10 15 59 60 -1 49
## 14 14 Alaves 38 15 2 21 40 50 -10 47
## 15 15 Levante 38 11 13 14 44 58 -14 46
## 16 16 Athletic Bilbao 38 10 13 15 41 49 -8 43
## 17 17 Leganes 38 12 7 19 34 51 -17 43
## 18 18 Deportivo La Coruna 38 6 11 21 38 76 -38 29
## 19 19 Las Palmas 38 5 7 26 24 74 -50 22
## 20 20 Malaga 38 5 5 28 24 61 -37 20
##
## $year2016
## # A tibble: 20 x 10
## pos team pl w d l f a gd pts
## <int> <chr> <int> <int> <int> <int> <int> <int> <int> <int>
## 1 1 Real Madrid 38 29 6 3 106 41 65 93
## 2 2 Barcelona 38 28 6 4 116 37 79 90
## 3 3 Atletico Madrid 38 23 9 6 70 27 43 78
## 4 4 Sevilla 38 21 9 8 69 49 20 72
## 5 5 Villarreal 38 19 10 9 56 33 23 67
## 6 6 Real Sociedad 38 19 7 12 59 53 6 64
## 7 7 Athletic Bilbao 38 19 6 13 53 43 10 63
## 8 8 Espanyol 38 15 11 12 49 50 -1 56
## 9 9 Alaves 38 14 13 11 41 43 -2 55
## 10 10 Eibar 38 15 9 14 56 51 5 54
## 11 11 Malaga 38 12 10 16 49 55 -6 46
## 12 12 Valencia 38 13 7 18 56 65 -9 46
## 13 13 Celta Vigo 38 13 6 19 53 69 -16 45
## 14 14 Las Palmas 38 10 9 19 53 74 -21 39
## 15 15 Real Betis 38 10 9 19 41 64 -23 39
## 16 16 Deportivo La Coruna 38 8 12 18 43 61 -18 36
## 17 17 Leganes 38 8 11 19 36 55 -19 35
## 18 18 Sporting Gijon 38 7 10 21 42 72 -30 31
## 19 19 Osasuna 38 4 10 24 40 94 -54 22
## 20 20 Granada 38 4 8 26 30 82 -52 20
##
## $year2015
## # A tibble: 20 x 10
## pos team pl w d l f a gd pts
## <int> <chr> <int> <int> <int> <int> <int> <int> <int> <int>
## 1 1 Barcelona 38 29 4 5 112 29 83 91
## 2 2 Real Madrid 38 28 6 4 110 34 76 90
## 3 3 Atletico Madrid 38 28 4 6 63 18 45 88
## 4 4 Villarreal 38 18 10 10 44 35 9 64
## 5 5 Athletic Bilbao 38 18 8 12 58 45 13 62
## 6 6 Celta Vigo 38 17 9 12 51 59 -8 60
## 7 7 Sevilla 38 14 10 14 51 50 1 52
## 8 8 Malaga 38 12 12 14 38 35 3 48
## 9 9 Real Sociedad 38 13 9 16 45 48 -3 48
## 10 10 Real Betis 38 11 12 15 34 52 -18 45
## 11 11 Las Palmas 38 12 8 18 45 53 -8 44
## 12 12 Valencia 38 11 11 16 46 48 -2 44
## 13 13 Espanyol 38 12 7 19 40 74 -34 43
## 14 14 Eibar 38 11 10 17 49 61 -12 43
## 15 15 Deportivo La Coruna 38 8 18 12 45 61 -16 42
## 16 16 Granada 38 10 9 19 46 69 -23 39
## 17 17 Sporting Gijon 38 10 9 19 40 62 -22 39
## 18 18 Rayo Vallecano 38 9 11 18 52 73 -21 38
## 19 19 Getafe 38 9 9 20 37 67 -30 36
## 20 20 Levante 38 8 8 22 37 70 -33 32
##
## $year2014
## # A tibble: 20 x 10
## pos team pl w d l f a gd pts
## <int> <chr> <int> <int> <int> <int> <int> <int> <int> <int>
## 1 1 Barcelona 38 30 4 4 110 21 89 94
## 2 2 Real Madrid 38 30 2 6 118 38 80 92
## 3 3 Atletico Madrid 38 23 9 6 67 29 38 78
## 4 4 Valencia 38 22 11 5 70 32 38 77
## 5 5 Sevilla 38 23 7 8 71 45 26 76
## 6 6 Villarreal 38 16 12 10 48 37 11 60
## 7 7 Athletic Bilbao 38 15 10 13 42 41 1 55
## 8 8 Celta Vigo 38 13 12 13 47 44 3 51
## 9 9 Malaga 38 14 8 16 42 48 -6 50
## 10 10 Espanyol 38 13 10 15 47 51 -4 49
## 11 11 Rayo Vallecano 38 15 4 19 46 68 -22 49
## 12 12 Real Sociedad 38 11 13 14 44 51 -7 46
## 13 13 Elche 38 11 8 19 35 62 -27 41
## 14 14 Levante 38 9 10 19 34 67 -33 37
## 15 15 Getafe 38 10 7 21 33 64 -31 37
## 16 16 Deportivo La Coruna 38 7 14 17 35 60 -25 35
## 17 17 Granada 38 7 14 17 29 64 -35 35
## 18 18 Eibar 38 9 8 21 34 55 -21 35
## 19 19 Almeria 38 8 8 22 35 64 -29 32
## 20 20 Cordoba 38 3 11 24 22 68 -46 20
##
## $year2013
## # A tibble: 20 x 10
## pos team pl w d l f a gd pts
## <int> <chr> <int> <int> <int> <int> <int> <int> <int> <int>
## 1 1 Atletico Madrid 38 28 6 4 77 26 51 90
## 2 2 Barcelona 38 27 6 5 100 33 67 87
## 3 3 Real Madrid 38 27 6 5 104 38 66 87
## 4 4 Athletic Bilbao 38 20 10 8 66 39 27 70
## 5 5 Sevilla 38 18 9 11 69 52 17 63
## 6 6 Villarreal 38 17 8 13 60 44 16 59
## 7 7 Real Sociedad 38 16 11 11 62 55 7 59
## 8 8 Valencia 38 13 10 15 51 53 -2 49
## 9 9 Celta Vigo 38 14 7 17 49 54 -5 49
## 10 10 Levante 38 12 12 14 35 43 -8 48
## 11 11 Malaga 38 12 9 17 39 46 -7 45
## 12 12 Rayo Vallecano 38 13 4 21 46 80 -34 43
## 13 13 Getafe 38 11 9 18 35 54 -19 42
## 14 14 Espanyol 38 11 9 18 41 51 -10 42
## 15 15 Granada 38 12 5 21 32 56 -24 41
## 16 16 Elche 38 9 13 16 30 50 -20 40
## 17 17 Almeria 38 11 7 20 43 71 -28 40
## 18 18 Osasuna 38 10 9 19 32 62 -30 39
## 19 19 Real Valladolid 38 7 15 16 38 60 -22 36
## 20 20 Real Betis 38 6 7 25 36 78 -42 25
##
## $year2012
## # A tibble: 20 x 10
## pos team pl w d l f a gd pts
## <int> <chr> <int> <int> <int> <int> <int> <int> <int> <int>
## 1 1 Barcelona 38 32 4 2 115 40 75 100
## 2 2 Real Madrid 38 26 7 5 103 42 61 85
## 3 3 Atletico Madrid 38 23 7 8 65 31 34 76
## 4 4 Real Sociedad 38 18 12 8 70 49 21 66
## 5 5 Valencia 38 19 8 11 67 54 13 65
## 6 6 Malaga 38 16 9 13 53 50 3 57
## 7 7 Real Betis 38 16 8 14 57 56 1 56
## 8 8 Rayo Vallecano 38 16 5 17 50 66 -16 53
## 9 9 Sevilla 38 14 8 16 58 54 4 50
## 10 10 Getafe 38 13 8 17 43 57 -14 47
## 11 11 Levante 38 12 10 16 40 57 -17 46
## 12 12 Athletic Bilbao 38 12 9 17 44 65 -21 45
## 13 13 Espanyol 38 11 11 16 43 52 -9 44
## 14 14 Real Valladolid 38 11 10 17 49 58 -9 43
## 15 15 Granada 38 11 9 18 37 54 -17 42
## 16 16 Osasuna 38 10 9 19 33 50 -17 39
## 17 17 Celta Vigo 38 10 7 21 37 52 -15 37
## 18 18 Real Mallorca 38 9 9 20 43 72 -29 36
## 19 19 Deportivo La Coruna 38 8 11 19 47 70 -23 35
## 20 20 Real Zaragoza 38 9 7 22 37 62 -25 34
##
## $year2011
## # A tibble: 20 x 10
## pos team pl w d l f a gd pts
## <int> <chr> <int> <int> <int> <int> <int> <int> <int> <int>
## 1 1 Real Madrid 38 32 4 2 121 32 89 100
## 2 2 Barcelona 38 28 7 3 114 29 85 91
## 3 3 Valencia 38 17 10 11 59 44 15 61
## 4 4 Malaga 38 17 7 14 54 53 1 58
## 5 5 Atletico Madrid 38 15 11 12 53 46 7 56
## 6 6 Levante 38 16 7 15 54 50 4 55
## 7 7 Osasuna 38 13 15 10 44 61 -17 54
## 8 8 Real Mallorca 38 14 10 14 42 46 -4 52
## 9 9 Sevilla 38 13 11 14 48 47 1 50
## 10 10 Athletic Bilbao 38 12 13 13 49 52 -3 49
## 11 11 Getafe 38 12 11 15 40 51 -11 47
## 12 12 Real Sociedad 38 12 11 15 46 52 -6 47
## 13 13 Real Betis 38 13 8 17 47 56 -9 47
## 14 14 Espanyol 38 12 10 16 46 56 -10 46
## 15 15 Rayo Vallecano 38 13 4 21 53 73 -20 43
## 16 16 Real Zaragoza 38 12 7 19 36 61 -25 43
## 17 17 Granada 38 12 6 20 35 56 -21 42
## 18 18 Villarreal 38 9 14 15 39 53 -14 41
## 19 19 Sporting Gijon 38 10 7 21 42 69 -27 37
## 20 20 Racing Santander 38 4 15 19 28 63 -35 27
##
## $year2010
## # A tibble: 20 x 10
## pos team pl w d l f a gd pts
## <int> <chr> <int> <int> <int> <int> <int> <int> <int> <int>
## 1 1 Barcelona 38 30 6 2 95 21 74 96
## 2 2 Real Madrid 38 29 5 4 102 33 69 92
## 3 3 Valencia 38 21 8 9 64 44 20 71
## 4 4 Villarreal 38 18 8 12 54 44 10 62
## 5 5 Sevilla 38 17 7 14 62 61 1 58
## 6 6 Athletic Bilbao 38 18 4 16 59 55 4 58
## 7 7 Atletico Madrid 38 17 7 14 62 53 9 58
## 8 8 Espanyol 38 15 4 19 46 55 -9 49
## 9 9 Osasuna 38 13 8 17 45 46 -1 47
## 10 10 Sporting Gijon 38 11 14 13 35 42 -7 47
## 11 11 Malaga 38 13 7 18 54 68 -14 46
## 12 12 Racing Santander 38 12 10 16 41 56 -15 46
## 13 13 Real Zaragoza 38 12 9 17 40 53 -13 45
## 14 14 Levante 38 12 9 17 41 52 -11 45
## 15 15 Real Sociedad 38 14 3 21 49 66 -17 45
## 16 16 Getafe 38 12 8 18 49 60 -11 44
## 17 17 Real Mallorca 38 12 8 18 41 56 -15 44
## 18 18 Deportivo La Coruna 38 10 13 15 31 47 -16 43
## 19 19 Hercules 38 9 8 21 36 60 -24 35
## 20 20 Almeria 38 6 12 20 36 70 -34 30
##
## $year2009
## # A tibble: 20 x 10
## pos team pl w d l f a gd pts
## <int> <chr> <int> <int> <int> <int> <int> <int> <int> <int>
## 1 1 Barcelona 38 31 6 1 98 24 74 99
## 2 2 Real Madrid 38 31 3 4 102 35 67 96
## 3 3 Valencia 38 21 8 9 59 40 19 71
## 4 4 Sevilla 38 19 6 13 65 49 16 63
## 5 5 Real Mallorca 38 18 8 12 59 44 15 62
## 6 6 Getafe 38 17 7 14 58 48 10 58
## 7 7 Villarreal 38 16 8 14 58 57 1 56
## 8 8 Athletic Bilbao 38 15 9 14 50 53 -3 54
## 9 9 Atletico Madrid 38 13 8 17 57 61 -4 47
## 10 10 Deportivo La Coruna 38 13 8 17 35 49 -14 47
## 11 11 Espanyol 38 11 11 16 29 46 -17 44
## 12 12 Osasuna 38 11 10 17 37 46 -9 43
## 13 13 Almeria 38 10 12 16 43 55 -12 42
## 14 14 Real Zaragoza 38 10 11 17 46 64 -18 41
## 15 15 Sporting Gijon 38 9 13 16 36 51 -15 40
## 16 16 Racing Santander 38 9 12 17 42 59 -17 39
## 17 17 Malaga 38 7 16 15 42 48 -6 37
## 18 18 Tenerife 38 9 9 20 40 74 -34 36
## 19 19 Real Valladolid 38 7 15 16 37 62 -25 36
## 20 20 Xerez 38 8 10 20 38 66 -28 34
############################################
season | winners | runners_up | third_place | top_scorer_s | top_scorers_club_s | goals |
---|---|---|---|---|---|---|
1929 | Barcelona | Real Madrid (1) | Athletic Bilbao | Paco Bienzobas | Real Sociedad | 14 |
1929–30 | Athletic Bilbao | Barcelona (1) | Arenas | Guillermo Gorostiza | Athletic Bilbao | 19 |
1930–31 | Athletic Bilbao | Racing Santander (1) | Real Sociedad | Bata | Athletic Bilbao | 27 |
1931–32 | Real Madrid | Athletic Bilbao (1) | Barcelona | Guillermo Gorostiza | Athletic Bilbao | 12 |
1932–33 | Real Madrid | Athletic Bilbao (2) | Espanyol | Manuel Olivares | Real Madrid | 16 |
1933–34 | Athletic Bilbao | Real Madrid (2) | Racing Santander | Isidro Lángara | Oviedo | 27 |
1934–35 | Real Betis | Real Madrid (3) | Oviedo | Isidro Lángara | Oviedo | 26 |
1935–36 | Athletic Bilbao | Real Madrid (4) | Oviedo | Isidro Lángara | Oviedo | 27 |
1936–37 | Spanish Civil War (League Cancelled) | Spanish Civil War (League Cancelled) | Spanish Civil War (League Cancelled) | Spanish Civil War (League Cancelled) | Spanish Civil War (League Cancelled) | Spanish Civil War (League Cancelled) |
1937–38 | Spanish Civil War (League Cancelled) | Spanish Civil War (League Cancelled) | Spanish Civil War (League Cancelled) | Spanish Civil War (League Cancelled) | Spanish Civil War (League Cancelled) | Spanish Civil War (League Cancelled) |
1938–39 | Spanish Civil War (League Cancelled) | Spanish Civil War (League Cancelled) | Spanish Civil War (League Cancelled) | Spanish Civil War (League Cancelled) | Spanish Civil War (League Cancelled) | Spanish Civil War (League Cancelled) |
1939–40 | Atlético Aviación[a] | Sevilla (1) | Athletic Bilbao | Víctor Unamuno | Athletic Bilbao | 22 |
1940–41 | Atlético Aviación[a] | Athletic Bilbao (3) | Valencia | Pruden | Atlético Aviación | 30 |
1941–42 | Valencia | Real Madrid (5) | Atlético Aviación | Edmundo Suárez | Valencia | 27 |
1942–43 | Athletic Bilbao | Sevilla (2) | Barcelona | Mariano Martín | Barcelona | 32 |
1943–44 | Valencia | Atlético Aviación (1) | Sevilla | Edmundo Suárez | Valencia | 27 |
1944–45 | Barcelona | Real Madrid (6) | Atlético Aviación | Telmo Zarra | Atlético Bilbao | 19 |
1945–46 | Sevilla | Barcelona (2) | Athletic Bilbao | Telmo Zarra | Atlético Bilbao | 24 |
1946–47 | Valencia | Athletic Bilbao (4) | Atlético Aviación | Telmo Zarra | Atlético Bilbao | 34 |
1947–48 | Barcelona | Valencia (1) | Atlético Madrid | Pahiño | Celta Vigo | 23 |
1948–49 | Barcelona | Valencia (2) | Real Madrid | César Rodríguez Álvarez | Barcelona | 28 |
1949–50 | Atlético Madrid | Deportivo La Coruña (1) | Valencia | Telmo Zarra | Athletic Bilbao | 25 |
1950–51 | Atlético Madrid | Sevilla (3) | Valencia | Telmo Zarra | Athletic Bilbao | 38 |
1951–52 | Barcelona | Athletic Bilbao (5) | Real Madrid | Pahiño | Real Madrid | 28 |
1952–53 | Barcelona | Valencia (3) | Real Madrid | Telmo Zarra | Athletic Bilbao | 24 |
1953–54 | Real Madrid | Barcelona (3) | Valencia | Alfredo Di Stéfano | Real Madrid | 27 |
1954–55 | Real Madrid | Barcelona (4) | Athletic Bilbao | Juan Arza | Sevilla | 28 |
1955–56 | Athletic Bilbao | Barcelona (5) | Real Madrid | Alfredo Di Stéfano | Real Madrid | 24 |
1956–57 | Real Madrid † | Sevilla (4) | Barcelona | Alfredo Di Stéfano | Real Madrid | 31 |
1957–58 | Real Madrid † | Atlético Madrid (2) | Barcelona | Manuel BadenesAlfredo Di StéfanoRicardo | ValladolidReal MadridValencia | 19 |
1958–59 | Barcelona | Real Madrid (7) | Athletic Bilbao | Alfredo Di Stéfano | Real Madrid | 23 |
1959–60 | Barcelona | Real Madrid (8) | Athletic Bilbao | Ferenc Puskás | Real Madrid | 26 |
1960–61 | Real Madrid | Atlético Madrid (3) | Zaragoza | Ferenc Puskás | Real Madrid | 27 |
1961–62 | Real Madrid | Barcelona (6) | Atlético Madrid | Juan Seminario | Zaragoza | 25 |
1962–63 | Real Madrid | Atlético Madrid (4) | Oviedo | Ferenc Puskás | Real Madrid | 26 |
1963–64 | Real Madrid | Barcelona (7) | Real Betis | Ferenc Puskás | Real Madrid | 20 |
1964–65 | Real Madrid | Atlético Madrid (5) | Zaragoza | Cayetano Ré | Barcelona | 25 |
1965–66 | Atlético Madrid | Real Madrid (9) | Barcelona | Vavá | Elche | 19 |
1966–67 | Real Madrid | Barcelona (8) | Espanyol | Waldo Machado | Valencia | 24 |
1967–68 | Real Madrid | Barcelona (9) | Las Palmas | Fidel Uriarte | Athletic Bilbao | 22 |
1968–69 | Real Madrid | Las Palmas (1) | Barcelona | Amancio AmaroJosé Eulogio Gárate | Real MadridAtlético Madrid | 14 |
1969–70 | Atlético Madrid | Athletic Bilbao (6) | Sevilla | Amancio AmaroLuis AragonésJosé Eulogio Gárate | Real MadridAtlético MadridAtlético Madrid | 16 |
1970–71 | Valencia | Barcelona (10) | Atlético Madrid | José Eulogio GárateCarles Rexach | Atlético MadridBarcelona | 17 |
1971–72 | Real Madrid | Valencia (4) | Barcelona | Enrique Porta | Granada | 20 |
1972–73 | Atlético Madrid | Barcelona (11) | Espanyol | Marianín | Oviedo | 19 |
1973–74 | Barcelona | Atlético Madrid (6) | Zaragoza | Quini | Sporting Gijón | 20 |
1974–75 | Real Madrid | Zaragoza (1) | Barcelona | Carlos | Athletic Bilbao | 19 |
1975–76 | Real Madrid | Barcelona (12) | Atlético Madrid | Quini | Sporting Gijón | 21 |
1976–77 | Atlético Madrid | Barcelona (13) | Athletic Bilbao | Mario Kempes | Valencia | 24 |
1977–78 | Real Madrid | Barcelona (14) | Athletic Bilbao | Mario Kempes | Valencia | 28 |
1978–79 | Real Madrid | Sporting Gijón (1) | Atlético Madrid | Hans Krankl | Barcelona | 29 |
1979–80 | Real Madrid | Real Sociedad (1) | Sporting Gijón | Quini | Sporting Gijón | 24 |
1980–81 | Real Sociedad | Real Madrid (10) | Atlético Madrid | Quini | Barcelona | 20 |
1981–82 | Real Sociedad | Barcelona (15) | Real Madrid | Quini | Barcelona | 26 |
1982–83 | Athletic Bilbao | Real Madrid (11) | Atlético Madrid | Hipólito Rincón | Real Betis | 20 |
1983–84 | Athletic Bilbao | Real Madrid (12) | Barcelona | Jorge da SilvaJuanito | ValladolidReal Madrid | 17 |
1984–85 | Barcelona | Atlético Madrid (7) | Athletic Bilbao | Hugo Sánchez | Atlético Madrid | 19 |
1985–86 | Real Madrid ‡ | Barcelona (16) | Athletic Bilbao | Hugo Sánchez | Real Madrid | 22 |
1986–87 | Real Madrid | Barcelona (17) | Espanyol | Hugo Sánchez | Real Madrid | 34 |
1987–88 | Real Madrid | Real Sociedad (2) | Atlético Madrid | Hugo Sánchez | Real Madrid | 29 |
1988–89 | Real Madrid | Barcelona (18) | Valencia | Baltazar | Atlético Madrid | 35 |
1989–90 | Real Madrid | Valencia (5) | Barcelona | Hugo Sánchez | Real Madrid | 38 |
1990–91 | Barcelona | Atlético Madrid (8) | Real Madrid | Emilio Butragueño | Real Madrid | 19 |
1991–92 | Barcelona † | Real Madrid (13) | Atlético Madrid | Manolo | Atlético Madrid | 27 |
1992–93 | Barcelona | Real Madrid (14) | Deportivo La Coruña | Bebeto | Deportivo La Coruña | 29 |
1993–94 | Barcelona | Deportivo La Coruña (2) | Zaragoza | Romário | Barcelona | 30 |
1994–95 | Real Madrid | Deportivo La Coruña (3) | Real Betis | Iván Zamorano | Real Madrid | 28 |
1995–96 | Atlético Madrid | Valencia (6) | Barcelona | Juan Antonio Pizzi | Tenerife | 31 |
1996–97 | Real Madrid | Barcelona (19) | Deportivo La Coruña | Ronaldo | Barcelona | 34 |
1997–98 | Barcelona | Athletic Bilbao (7) | Real Sociedad | Christian Vieri | Atlético Madrid | 24 |
1998–99 | Barcelona | Real Madrid (15) | Mallorca | Raúl | Real Madrid | 25 |
1999–2000 | Deportivo La Coruña | Barcelona (20) | Valencia | Salva Ballesta | Racing Santander | 27 |
2000–01 | Real Madrid | Deportivo La Coruña (4) | Mallorca | Raúl | Real Madrid | 24 |
2001–02 | Valencia | Deportivo La Coruña (5) | Real Madrid | Diego Tristán | Deportivo La Coruña | 21 |
2002–03 | Real Madrid | Real Sociedad (3) | Deportivo La Coruña | Roy Makaay | Deportivo La Coruña | 29 |
2003–04 | Valencia ‡ | Barcelona (21) | Deportivo La Coruña | Ronaldo | Real Madrid | 25 |
2004–05 | Barcelona | Real Madrid (16) | Villarreal | Diego Forlán | Villarreal | 25 |
2005–06 | Barcelona † | Real Madrid (17) | Valencia | Samuel Eto’o | Barcelona | 26 |
2006–07 | Real Madrid | Barcelona (22) | Sevilla | Ruud van Nistelrooy | Real Madrid | 25 |
2007–08 | Real Madrid | Villarreal (1) | Barcelona | Daniel Güiza | Mallorca | 27 |
2008–09 | Barcelona | Real Madrid (18) | Sevilla | Diego Forlán | Atlético Madrid | 32 |
2009–10 | Barcelona | Real Madrid (19) | Valencia | Lionel Messi | Barcelona | 34 |
2010–11 | Barcelona † | Real Madrid (20) | Valencia | Cristiano Ronaldo | Real Madrid | 40 |
2011–12 | Real Madrid | Barcelona (23) | Valencia | Lionel Messi | Barcelona | 50 |
2012–13 | Barcelona | Real Madrid (21) | Atlético Madrid | Lionel Messi | Barcelona | 46 |
2013–14 | Atlético Madrid | Barcelona (24) | Real Madrid | Cristiano Ronaldo | Real Madrid | 31 |
2014–15 | Barcelona | Real Madrid (22) | Atlético Madrid | Cristiano Ronaldo | Real Madrid | 48 |
2015–16 | Barcelona | Real Madrid (23) | Atlético Madrid | Luis Suárez | Barcelona | 40 |
2016–17 | Real Madrid † | Barcelona (25) | Atlético Madrid | Lionel Messi | Barcelona | 37 |
2017–18 | Barcelona | Atlético Madrid (9) | Real Madrid | Lionel Messi | Barcelona | 34 |
2018–19 | Barcelona | Atlético Madrid (10) | Real Madrid | Lionel Messi | Barcelona | 36 |
2019–20 | Real Madrid | Barcelona (26) | Atlético Madrid | Lionel Messi | Barcelona | 25 |
2020–21 | Atlético Madrid | Real Madrid (24) | Barcelona | Lionel Messi | Barcelona | 30 |
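
In seasons where several players tie as top scorer (for example 1957–58, 1968–69 and 1969–70), the names and their clubs run together in a single cell, as in "Amancio AmaroJosé Eulogio Gárate". One possible way to separate them, offered only as a minimal sketch, is to split wherever a lowercase letter is immediately followed by an uppercase letter. The helper `split_fused()` and the two example vectors below are hypothetical names introduced for illustration; they are not taken from the project code. The rule would wrongly split names with internal capitals (say, "McDonald"), so check the result by eye.

```r
# Hypothetical helper (an assumption, not part of the original project):
# split a fused cell at every lowercase-to-uppercase boundary.
# \p{Ll} and \p{Lu} are the Unicode classes for lowercase and uppercase
# letters, so accented characters such as "é" are handled as well.
split_fused <- function(x) {
  strsplit(enc2utf8(x), "(?<=\\p{Ll})(?=\\p{Lu})", perl = TRUE)
}

# Two cells copied from the table above: one fused, one clean.
scorers <- c("Amancio AmaroJosé Eulogio Gárate", "Ferenc Puskás")
clubs   <- c("Real MadridAtlético Madrid", "Real Madrid")

# Each call returns a list with one character vector per input cell.
split_scorers <- split_fused(scorers)
split_clubs   <- split_fused(clubs)

print(split_scorers)
print(split_clubs)
```

Cells that hold a single name pass through unchanged, so the helper can be applied to a whole column. The result is a list, which leaves open whether to keep one row per season or to expand to one row per scorer.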