This redesign proposes some changes to tables published on Wikipedia that present the results of the 2006 elections for the National Congress in Peru, a topic we will also use for the final project. The tables are found at the following links:
link1 = "http://es.wikipedia.org/wiki/Elecciones_generales_de_Per%C3%BA_de_2006"
link2 = "http://en.wikipedia.org/wiki/Peruvian_general_election,_2006"
DATA PREPARATION STAGE:
We show the links because we will use them to scrape the data directly from those pages, using the functionality that R offers to parse webpages. For that, we have written a function to simplify the code in this data preparation stage:
options(width = 180)
library(XML)
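getTableWWW = function(link, which, headers, skips, cols) {
    # Parse the page and keep only the requested <table> node
    dirty = getNodeSet(htmlParse(link, encoding = "UTF-8"), "//table")[[which]]
    # Convert the node into a data frame, applying the header, skipped rows and column classes
    return(readHTMLTable(dirty, as.data.frame = TRUE, header = headers, skip.rows = skips,
        colClasses = cols))
}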
Using the function above, we can scrape the tables we will redesign:
tableParty = getTableWWW(link1, 5, c("Party", "Seats", "Votes"), c(1, 9), c("character", "numeric", "FormattedInteger"))
tableRegion = getTableWWW(link2, 8, T, 26, c("character", "FormattedInteger", rep("integer", 4)))
tableRegionParty = getTableWWW(link2, 11, T, 26, c("character", rep("integer", 8)))
tableRegionParty[is.na(tableRegionParty)] <- 0
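The "FormattedInteger" column class used above asks readHTMLTable to read numbers written with thousands separators as integers. The idea can be sketched in base R as follows, assuming commas are the separator; the helper name parse_formatted_int is hypothetical and is not part of the XML package:

```r
# Hypothetical helper (not part of the XML package): convert strings with
# thousands separators, such as "6,063,109", into integers.
parse_formatted_int <- function(x) {
  as.integer(gsub(",", "", x, fixed = TRUE))
}

voters <- parse_formatted_int(c("179,331", "6,063,109"))
voters

# Replace missing cells with zero, as done for tableRegionParty above.
m <- data.frame(UPP = c(1, NA), PAP = c(NA, 2))
m[is.na(m)] <- 0
m
```

Empty cells in the scraped table come back as NA, which is why the zero replacement is needed before any comparison of seat counts.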
IMPROVEMENT 1:
We will first propose some improvements to the table that shows the results by party, which simply looks like this:

As we can see:
In this situation, we start by adding some information to the original table, namely the role or position of each party in Congress:
tableParty$Position = c("Opposition", "Governing", rep("", length(tableParty[, 2]) - 2))
As you can see, we have not included the label “Minority”, because that would later cause us trouble when we include the position as text in the chart. We now use lattice to plot a bar chart that quickly improves comparability, highlights differences and adds some context to get the reader's attention:
options(width = 180)
library(lattice)
author = "author: Jose Magallanes"
ord <- order(as.numeric(tableParty$Seats))
levelnew = factor(tableParty$Party)[ord]
tableParty$Party = factor(tableParty$Party, levels = levelnew, ordered = TRUE)
voteOp = tableParty$Seats[1]
voteGb = tableParty$Seats[2]
barchart(Party ~ Seats, data = tableParty, xlab = "Number of Seats obtained\n", ylab = "Parties", sub = paste("Source: National Office of Electoral Processes\n", author), main = "Results in 2006-2011 Elections for National Congress in Peru",
scale = list(x = list(limits = c(-1, 62)), y = list(cex = 1.2, rot = 45)), panel = function(y, x, ...) {
panel.fill(col = "gray90")
panel.grid(h = 0, v = -1, col = "black", lty = "dotted")
panel.barchart(x, y, col = c("#FF3333", "#004C99", rep("#FF8000", 5)))
panel.text(x + 1.1, y, label = round(x, 2), cex = 1.5, col = "grey")
panel.text(x + 1, y, label = round(x, 2), cex = 1.5, col = "black")
panel.text(-1.2, y, label = tableParty$Position, cex = 2, col = "white", pos = 4)
panel.text(voteGb + 2, 5.78, label = paste("Votes to get majority:", 61 - voteGb), cex = 1.5, col = "grey", pos = 4)
panel.text(voteGb + 2, 5.8, label = paste("Votes to get majority:", 61 - voteGb), cex = 1.5, col = "blue", pos = 4)
panel.text(voteOp + 1.8, 6.78, label = paste("Votes to get majority:", 61 - voteOp), cex = 1.5, col = "grey", pos = 4)
panel.text(voteOp + 1.8, 6.8, label = paste("Votes to get majority:", 61 - voteOp), cex = 1.5, col = "red", pos = 4)
panel.text((voteGb - 10) - 0.01, 2.45, label = "Which party\nfrom the minority\nwill help the governing party\n rule the Congress?", cex = 3, col = "black")
panel.text(voteGb - 10, 2.5, label = "Which party\nfrom the minority\nwill help the governing party\n rule the Congress?", cex = 3, col = "#CC6600")
panel.arrows(voteOp + 2, 7, 61, 7, col = "red", lwd = 3, length = 0.1)
panel.arrows(voteGb + 2, 6, 61, 6, col = "blue", lwd = 3, length = 0.1)
panel.abline(h = 0, v = 61, col = "black", lwd = 3)
})
As can be seen, it is now clear that:
The colors chosen help us organise the parties according to their position; for this, we do not need a particular color for every party. The shadows on some of the texts help important information and questions pop out and engage the reader. The grid helps, although the differences are already very clear. The line indicating the number of seats needed to achieve a majority, however, is very important to clarify the purpose of the chart, as it shows where every party stands and how many seats it lacks to reach the optimal position.
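The majority line at 61 and the "seats missing" labels in the chart follow from simple arithmetic. A minimal sketch, assuming the seat totals per party are the column sums of the regional table printed later in this document (they add up to 120 seats):

```r
# Seat totals per party: column sums of the regional table shown later on.
seats <- c(UPP = 45, PAP = 36, UN = 17, AF = 13, FC = 5, PP = 2, RN = 2)

total_seats <- sum(seats)           # size of the Congress
majority <- total_seats %/% 2 + 1   # more than half of the seats

# Seats each party still needs to reach a majority on its own.
seats_missing <- majority - seats
seats_missing[c("UPP", "PAP")]
```

The two values printed at the end correspond to the quantities the chart computes as 61 - voteOp and 61 - voteGb.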
IMPROVEMENT 2:
Now let's check the regional situation, which is in two separate tables:

In Peru, instead of “States”, as in the USA, we have “Regions”. The seats are distributed in proportion to the population of every Region. However, as Lima, the capital of Peru, is the most populated Region, it affects the representativeness of the whole country and clearly affects the result.
To highlight this situation, we need to combine both tables into one, so that we can precisely visualize the distribution of seats per Region and per party. We will have to do some work on both tables to get the data:
options(width = 180)
tableRegionParty$Total = NULL
for (i in (1:25)) {
if (tableRegionParty[i, 3] == max(tableRegionParty[i, 2:8])) {
tableRegionParty$group[i] = "PAP"
tableRegionParty$groupcode[i] = 2
}
if (tableRegionParty[i, 4] == max(tableRegionParty[i, 2:8])) {
tableRegionParty$group[i] = "UN"
tableRegionParty$groupcode[i] = 3
}
if (tableRegionParty[i, 5] == max(tableRegionParty[i, 2:8])) {
tableRegionParty$group[i] = "AF"
tableRegionParty$groupcode[i] = 4
}
if (tableRegionParty[i, 6] == max(tableRegionParty[i, 2:8])) {
tableRegionParty$group[i] = "FC"
tableRegionParty$groupcode[i] = 5
}
if (tableRegionParty[i, 7] == max(tableRegionParty[i, 2:8])) {
tableRegionParty$group[i] = "PP"
tableRegionParty$groupcode[i] = 6
}
if (tableRegionParty[i, 8] == max(tableRegionParty[i, 2:8])) {
tableRegionParty$group[i] = "RN"
tableRegionParty$groupcode[i] = 7
}
if (tableRegionParty[i, 2] == max(tableRegionParty[i, 2:8])) {
tableRegionParty$group[i] = "UPP"
tableRegionParty$groupcode[i] = 1
}
if (tableRegionParty[i, 1] == "Pasco") {
tableRegionParty$group[i] = "AF"
tableRegionParty$groupcode[i] = 4
}
if (tableRegionParty[i, 1] == "Lima") {
tableRegionParty$group[i] = "UN"
tableRegionParty$groupcode[i] = 3
}
if (tableRegionParty[i, 1] == "Ancash") {
tableRegionParty$group[i] = "PAP"
tableRegionParty$groupcode[i] = 2
}
}
table <- merge(tableRegionParty, tableRegion, by = "Electoral District")
for (i in (1:25)) {
if (table$group[i] == "UPP")
table$percentWinner[i] = table[i, 2]/table[i, 12]
if (table$group[i] == "UN")
table$percentWinner[i] = table[i, 4]/table[i, 12]
if (table$group[i] == "RN")
table$percentWinner[i] = table[i, 8]/table[i, 12]
if (table$group[i] == "PAP")
table$percentWinner[i] = table[i, 3]/table[i, 12]
if (table$group[i] == "AF")
table$percentWinner[i] = table[i, 5]/table[i, 12]
}
table
## Electoral District UPP PAP UN AF FC PP RN group groupcode Registered voters Seats in Congress Candidates per party Participating parties Total candidates percentWinner
## 1 Amazonas 1 1 0 0 0 0 0 UPP 1 179331 2 3 17 47 0.5000
## 2 Ancash 2 2 1 0 0 0 0 PAP 2 611881 5 5 21 99 0.4000
## 3 Apurímac 2 0 0 0 0 0 0 UPP 1 195954 2 3 21 55 1.0000
## 4 Arequipa 3 1 1 0 0 0 0 UPP 1 770535 5 5 21 101 0.6000
## 5 Ayacucho 3 0 0 0 0 0 0 UPP 1 306662 3 3 20 58 1.0000
## 6 Cajamarca 2 1 1 1 0 0 0 UPP 1 721239 5 5 23 109 0.4000
## 7 Callao 1 2 1 0 0 0 0 PAP 2 541730 4 4 24 92 0.5000
## 8 Cusco 4 1 0 0 0 0 0 UPP 1 643629 5 5 22 98 0.8000
## 9 Huancavelica 2 0 0 0 0 0 0 UPP 1 203844 2 3 15 39 1.0000
## 10 Huánuco 2 1 0 0 0 0 0 UPP 1 354416 3 3 22 65 0.6667
## 11 Ica 1 2 1 0 0 0 0 PAP 2 451197 4 5 22 88 0.5000
## 12 Junín 2 1 1 1 0 0 0 UPP 1 701190 5 5 22 99 0.4000
## 13 La Libertad 1 5 1 0 0 0 0 PAP 2 942656 7 7 22 145 0.7143
## 14 Lambayeque 1 2 1 1 0 0 0 PAP 2 676735 5 5 22 101 0.4000
## 15 Lima 6 7 8 8 3 2 1 UN 3 6063109 35 35 24 738 0.2286
## 16 Loreto 1 1 0 0 1 0 0 UPP 1 416419 3 3 22 60 0.3333
## 17 Madre de Dios 0 0 0 0 0 0 1 RN 7 47742 1 3 14 35 1.0000
## 18 Moquegua 1 1 0 0 0 0 0 UPP 1 99962 2 3 18 44 0.5000
## 19 Pasco 1 0 0 1 0 0 0 AF 4 135670 2 3 17 51 0.5000
## 20 Piura 2 3 1 0 0 0 0 PAP 2 914912 6 6 23 136 0.5000
## 21 Puno 3 1 0 0 1 0 0 UPP 1 674865 5 5 23 106 0.6000
## 22 San Martín 1 1 0 1 0 0 0 UPP 1 357124 3 3 17 47 0.3333
## 23 Tacna 1 1 0 0 0 0 0 UPP 1 172427 2 3 18 57 0.5000
## 24 Tumbes 1 1 0 0 0 0 0 UPP 1 110335 2 3 19 57 0.5000
## 25 Ucayali 1 1 0 0 0 0 0 UPP 1 201342 2 3 22 60 0.5000
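The two loops above can also be written without explicit if-blocks. The following is only a sketch on four rows copied from the printed output, not a replacement for the code above; the names seats, priority and parties are introduced here for illustration, and the tie-breaking order is our reading of the order in which the if-blocks overwrite each other:

```r
# Four rows copied from the merged table printed above.
seats <- data.frame(
  District = c("Amazonas", "La Libertad", "Lima", "Madre de Dios"),
  UPP = c(1, 1, 6, 0), PAP = c(1, 5, 7, 0), UN = c(0, 1, 8, 0),
  AF  = c(0, 0, 8, 0), FC  = c(0, 0, 3, 0), PP = c(0, 0, 2, 0),
  RN  = c(0, 0, 1, 1)
)
parties <- c("UPP", "PAP", "UN", "AF", "FC", "PP", "RN")

# Assumed tie-breaking priority: the first party listed wins a tie.
priority <- c("UPP", "RN", "PP", "FC", "AF", "UN", "PAP")
winner_idx <- max.col(as.matrix(seats[, priority]), ties.method = "first")
seats$group <- priority[winner_idx]

# Manual override for the Lima tie, as in the loop above.
seats$group[seats$District == "Lima"] <- "UN"

# Share of the Region's seats held by the winning party.
m <- as.matrix(seats[, parties])
won <- m[cbind(seq_len(nrow(seats)), match(seats$group, parties))]
seats$percentWinner <- won / rowSums(m)
seats[, c("District", "group", "percentWinner")]
```

Indexing the seat matrix with a two-column (row, column) matrix picks the winner's seats for every Region in one step, which also covers parties that the second loop does not list explicitly.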
We have merged both tables into one; however, we need to do some further work to prepare our chart. In particular, the seats of each party are stored in a separate column, and we need the parties to be a factor, so we convert this “wide” table into a “long” format:
options(width = 180)
names(table)[1] = "Electoral_District"
x = table[order(-table[, 10], table[, 16], table[, 1]), ]
a = as.character(x$Electoral_District)
l <- reshape(table, varying = c("UPP", "PAP", "UN", "AF", "FC", "PP", "RN"), v.names = "Seats_Won", timevar = "Party", times = c("UPP", "PAP", "UN", "AF", "FC", "PP", "RN"), direction = "long")
l <- l[!(l$Seats_Won == 0), ]
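To see what the reshape step does, here is a small sketch on a toy table with two Regions and three parties (values copied from the merged table above; the object names wide and long are ours). The idvar argument is added here only to keep the toy output tidy:

```r
# Toy "wide" table: one column of seats per party.
wide <- data.frame(
  Electoral_District = c("Pasco", "Tacna"),
  UPP = c(1, 1), PAP = c(0, 1), AF = c(1, 0)
)

# "Long" format: one row per Region and party, with the party as a variable.
long <- reshape(wide,
  varying = c("UPP", "PAP", "AF"), v.names = "Seats_Won",
  timevar = "Party", times = c("UPP", "PAP", "AF"),
  idvar = "Electoral_District", direction = "long")

# Drop the combinations where a party won no seats.
long <- long[long$Seats_Won != 0, ]
long
```

Each remaining row is one bar segment in the stacked chart: a Region, a party and the seats that party won there.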
Now we are ready to present our chart, using ggplot2:
require(ggplot2)
## Loading required package: ggplot2
cols = c("orange", "black", "blue", "yellow", "magenta", "green", "red")
congPlot = ggplot(l, aes(Electoral_District, Seats_Won, fill = Party))
congPlot = congPlot + geom_bar(position = "stack", stat = "identity") + coord_flip()
congPlot = congPlot + xlim(a)
congPlot = congPlot + ggtitle("Distribution of Seats won\n per Party by Region\n(2006 Congress Election - PERU)\n")
congPlot = congPlot + annotate("text", x = 17.95, y = 19.95, label = "DO YOU SEE \nA STRUCTURAL PROBLEM?", color = "black", size = 5)
congPlot = congPlot + annotate("text", x = 18, y = 20, label = "DO YOU SEE \nA STRUCTURAL PROBLEM?", color = "#696969", size = 5)
congPlot = congPlot + annotate("text", x = 12, y = 20, label = "SMALL AND LOCAL PARTIES \nGETTING\n COUNTRY WIDE\n REPRESENTATION", color = "black", size = 6)
congPlot = congPlot + annotate("text", x = 11.95, y = 19.95, label = "SMALL AND LOCAL PARTIES \nGETTING\n COUNTRY WIDE\n REPRESENTATION", color = "red", size = 6)
congPlot = congPlot + scale_fill_manual(values = cols, breaks = c("UPP", "PAP", "UN", "AF", "FC", "PP", "RN"))
congPlot + theme(axis.text.y = element_text(colour = "black", size = 10, angle = 0, hjust = 1, vjust = 0, face = "bold"), axis.text.x = element_text(colour = "grey10", size = 18, hjust = 0.5,
vjust = 0.5, face = "plain"))
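One detail worth checking in the chart above is which colour ends up on which party. To our knowledge, the way scale_fill_manual matches an unnamed vector of values (to the scale limits or to the breaks) has changed between ggplot2 releases, so a named vector is the safer option. The sketch below assumes that the intended pairing follows the order of the breaks, which is our assumption and is not stated in the original code; it uses Lima's seats from the merged table as toy data:

```r
# Assumed pairing: colours follow the order of the breaks in the original call.
parties <- c("UPP", "PAP", "UN", "AF", "FC", "PP", "RN")
cols <- setNames(c("orange", "black", "blue", "yellow", "magenta", "green", "red"),
                 parties)

# Toy data: Lima's seats, copied from the merged table above.
lima <- data.frame(Electoral_District = "Lima", Party = parties,
                   Seats_Won = c(6, 7, 8, 8, 3, 2, 1))

# Only draw the chart when ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(lima, aes(Electoral_District, Seats_Won, fill = Party)) +
    geom_bar(position = "stack", stat = "identity") +
    coord_flip() +
    scale_fill_manual(values = cols, breaks = parties)
  print(p)
}
```

With named values, each party keeps its colour regardless of the order in which the factor levels or the breaks are listed.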
Our purpose here was: