The Doctorate in Economics, Management and Organization (DEMO) program of the Autonomous University of Barcelona (UAB) has had, according to the program’s website, 21 graduates since 2008. This brief report looks for insight into the placement data of the DEMO alumni through simple statistical visualization. More specifically, I would like to investigate the relationships between variables such as graduation year, placement sector, and placement continent.
I take a visual approach because it is easy to read and because the plots could be displayed on the DEMO website. Since 2014, when I joined the Business Department of UAB as a PhD candidate, the components of the website have remained unchanged, so one peripheral goal of this report is to suggest some additions to the website.
The alumni data is mainly extracted from the program’s website. The dataset includes five variables:
* Graduation year: the graduation date according to the DEMO website
* Placement: whether the alumnus ended up in the Private Sector or in Academia
* Region: the placement region, a categorical variable taking the values “Europe”, “North America”, and “South America”
* Country: the country of the placement
* Higher Rank Uni?: for the alumni who chose Academia over the Private Sector, whether the placement university ranks higher than UAB. The ranking is based on the QS ranking, 2017.

A sixth column, Nationality, also appears in the table and is used for the last plot of this report.
Let’s have a look at the first six rows of the dataset.
| Graduation year | Placemenet | Region | Country | Higher Rank Uni? | Nationality |
|---|---|---|---|---|---|
| 2016 | Academia | Europe | Spain | TRUE | Europe |
| 2016 | Academia | Europe | Spain | FALSE | Europe |
| 2015 | Academia | Europe | Spain | FALSE | Europe |
| 2015 | Private Sector | Europe | Spain | NA | Europe |
| 2015 | Academia | Europe | England | TRUE | Europe |
| 2014 | Academia | Europe | England | TRUE | Europe |
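To experiment with these rows, they can be typed in by hand as a small data frame. The following is only a minimal sketch: `first_rows` and `sector_counts` are names introduced here for illustration, the column names are simplified (the original dataset uses names such as `Placemenet`), only the six rows shown above are included rather than all 21, and base R's `table()` stands in for the dplyr pipeline used in the report.

```r
# The six rows shown above, entered by hand (not the full dataset of 21 alumni).
# Column names are simplified for this sketch.
first_rows <- data.frame(
  graduation_year = c(2016, 2016, 2015, 2015, 2015, 2014),
  placement = c("Academia", "Academia", "Academia",
                "Private Sector", "Academia", "Academia"),
  region = rep("Europe", 6),
  country = c("Spain", "Spain", "Spain", "Spain", "England", "England"),
  higher_rank_uni = c(TRUE, FALSE, FALSE, NA, TRUE, TRUE),
  nationality = rep("Europe", 6),
  stringsAsFactors = FALSE
)

# Count the alumni in each (graduation year, placement) combination.
sector_counts <- as.data.frame(
  table(
    graduation_year = first_rows$graduation_year,
    placement = first_rows$placement
  ),
  stringsAsFactors = FALSE
)

# Keep only the combinations that actually occur in these six rows.
sector_counts <- sector_counts[sector_counts$Freq > 0, ]
print(sector_counts)
```

A table of categories with one frequency column, like `sector_counts`, is the shape of input that the plots in this report are built from.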
The diagrams used in this report are called “alluvial diagrams”, a variation of parallel coordinates that can clearly show flow and change.
To prepare this report, I used the R statistical software and the alluvial, dplyr, readr, and knitr packages.
The main question I wanted to investigate concerns the placement sector: do the DEMO alumni move to the private sector, outside academia, or do they continue their careers in academia after graduation? The latter was what I expected.
The following plot shows the relation between the year of graduation and the placement sector. It is important to clarify that alluvial diagrams work with categorical variables, so I have grouped the year of graduation into three categories: (2007,2010], (2010,2014], (2014,2016]. These break points were chosen mainly because they give a relatively uniform number of members in each category. The categories can also be seen as three generations of graduates.
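The interval notation above is what base R's `cut()` produces. As a small sketch of how the binning behaves at the break points, the years below are illustrative values only, not the actual alumni records, and `years` and `year_bins` are names introduced just for this example.

```r
# Illustrative years only; these are not the actual alumni records.
years <- c(2008, 2010, 2011, 2014, 2015, 2016)

# Same break points as in the report. Intervals are right-closed by default,
# so 2010 falls in (2007,2010] and 2014 falls in (2010,2014].
year_bins <- cut(years, breaks = c(2007, 2010, 2014, 2016))

print(levels(year_bins))
print(table(year_bins))

# The lowest break is excluded: a year of exactly 2007 would become NA.
print(cut(2007, breaks = c(2007, 2010, 2014, 2016)))
```

Because the intervals are open on the left, the first break has to sit below the earliest graduation year (2008 here) for every graduate to land in a category.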
Figure 1: Graduation Year and Placement Sector
As one can see, the majority of the alumni have pursued their careers in Academia rather than in the Private Sector. The proportion of graduates who took a job outside academia appears roughly constant across the graduation year categories.
The second question, also investigated visually, concerns the continent of the placements. Do DEMO graduates prefer to remain in Europe? Do they choose to go to North America? Is there any relation between the graduation year and the placement continent?
Figure 2: Graduation Year and Placement Continent
Figure 2 shows that the recent graduates have all found their first job after graduation in Europe. More generally, the majority of the DEMO graduates have found a job in Europe, with South America and North America as the next destinations.
Now let’s put all three variables of Figure 1 and Figure 2 in one plot.
Figure 3 has three variables as its coordinates. The coloring is based on the categories of the Placement variable. One can see that the majority of the graduates who remained in Europe also remained in Academia, while the proportions of Academia and Private Sector are more balanced for the graduates who headed to South or North America.
All alumni are, of course, graduates of the Autonomous University of Barcelona (UAB). According to the 2016 QS World University Rankings, UAB is ranked 203rd in the world. Based on the QS ranking of the placement universities, I investigate whether the DEMO alumni found their first placement in a higher-ranked university.
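One detail worth spelling out is that a higher-ranked university has a numerically smaller rank. The sketch below shows the comparison under that reading. The helper `is_higher_ranked` and the example ranks are hypothetical and made up for illustration; they are not the ranks of any actual placement university.

```r
# UAB's QS rank as quoted in the report.
uab_rank <- 203

# Hypothetical helper: TRUE when the placement university has a better
# (numerically smaller) rank than UAB. The result is NA when there is no
# rank to compare, e.g. for a private-sector placement.
is_higher_ranked <- function(placement_rank, reference_rank = uab_rank) {
  placement_rank < reference_rank
}

# Made-up ranks for illustration only; NA stands for a placement with no rank.
example_ranks <- c(50, 203, 450, NA)
print(is_higher_ranked(example_ranks))
```

In the code at the end of this report, a missing value for an academic placement is recoded as FALSE before plotting, so such placements are counted as not higher ranked.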
Figure 4: Is the quality of the DEMO program increasing?
Figure 4 is quite interesting. Apparently, newer graduates have found placements in higher-ranked universities. Is this a random event, or has the quality of the DEMO graduates meaningfully increased in recent years? Has the DEMO community established new links with higher-ranked universities? Chance or trend? From another perspective, since UAB is the 203rd university in the QS ranking, finding a placement in a lower-ranked university, among the thousands of higher education institutes and universities, is hardly surprising.
It is not unreasonable to think that Europeans tend to remain in Europe and South Americans tend to go back to South America after graduation. The next graph shows the flow from nationality categories to placement continents, in order to investigate this hypothesis.
Figure 5: Do alumni prefer to work in their original continent?
Figure 5 shows that this initial conjecture is not fully compatible with our data. All the DEMO graduates who found their first job in South America are South Americans. However, the South American graduates are spread almost equally among the continents. An absolute majority of the European graduates remained in Europe for their first placement.
The prospective graduates of DEMO are more diverse, coming from the Middle East to North America, so this graph should become more interesting in future updates.
The DEMO program is a very young PhD program, whose first graduate received her degree in 2008. By 2016, 20 more students had received their doctoral degrees from the DEMO program of the Business Department of the Autonomous University of Barcelona (UAB), Spain.
In this report, I tried to get some insight into the data on these 21 alumni through 5 alluvial diagrams. While each plot addresses one specific question of interest, the most interesting question to me was whether our graduates find their way to higher-ranked universities. Figure 4 reflects this question on the DEMO alumni data, and it raises further questions.
As one of the DEMO PhD candidates who is going to defend his work very soon, and more importantly as a data enthusiast and a data analyst, I would like to explore the department’s data a little in my spare time. This report is the first to satisfy that curiosity, and it won’t be the last.
Shahin Ashkiani September 2017
Hadley Wickham, Romain Francois, Lionel Henry and Kirill Müller (2017). dplyr: A Grammar of Data Manipulation. R package version 0.7.2. https://CRAN.R-project.org/package=dplyr
Bojanowski M and Edwards R (2016). alluvial: R Package for Creating Alluvial Diagrams. R package version: 0.1-2, <URL: https://github.com/mbojan/alluvial>.
Hadley Wickham, Jim Hester and Romain Francois (2017). readr: Read Rectangular Text Data. R package version 1.1.1. https://CRAN.R-project.org/package=readr
Yihui Xie (2017). knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.17.
Yihui Xie (2015) Dynamic Documents with R and knitr. 2nd edition. Chapman and Hall/CRC. ISBN 978-1498716963
Yihui Xie (2014) knitr: A Comprehensive Tool for Reproducible Research in R. In Victoria Stodden, Friedrich Leisch and Roger D. Peng, editors, Implementing Reproducible Computational Research. Chapman and Hall/CRC. ISBN 978-1466561595
library(alluvial)
library(dplyr)
library(readr)
library(knitr)
# `dataset` is assumed to be loaded already; the loading step is not shown here.
alluvial_dataset <- dataset
# Bin the graduation year into three right-closed categories:
# (2007,2010], (2010,2014], (2014,2016]
alluvial_dataset$`Graduation year ` <- cut(x = dataset$`Graduation year `, breaks = c(2007, 2010, 2014, 2016))
# Figure 1: Private Sector or Academia?
alluvial_dataset_1 <- alluvial_dataset %>% group_by(`Graduation year `, Region, Placemenet) %>% summarise(freq = n())
alluvial(alluvial_dataset_1[,c(1,3)], freq=alluvial_dataset_1$freq,
col = ifelse(alluvial_dataset_1$Placemenet == "Academia", "green", "Skyblue"),
border = ifelse(alluvial_dataset_1$Placemenet == "Academia", "green", "Skyblue"),
hide = alluvial_dataset_1$freq == 0,
cex = 0.7
)
# Figure 2: In which regions did our alumni end up?
alluvial_dataset_2 <- alluvial_dataset %>% group_by(`Graduation year `,Region) %>% summarise(freq = n())
colnames(alluvial_dataset_2)[which(colnames(alluvial_dataset_2)=="Region")] <- "Placement Region"
alluvial(alluvial_dataset_2[,c(1,2)], freq=alluvial_dataset_2$freq,
col = ifelse(alluvial_dataset_2$`Placement Region` == "Europe", "green",no = ifelse(test = alluvial_dataset_2$`Placement Region` == "North America",yes = "Skyblue" , no = "gold" )),
border = ifelse(alluvial_dataset_2$`Placement Region` == "Europe", "green",no = ifelse(test = alluvial_dataset_2$`Placement Region` == "North America",yes = "Skyblue" , no = "gold" )),
hide = alluvial_dataset_2$freq == 0,
cex = 0.7
)
# Figure 3: graduation year, placement region, and placement sector in one plot
alluvial_dataset_22 <- alluvial_dataset %>% select(`Graduation year `, Placemenet, Region) %>% group_by(`Graduation year `, Region, Placemenet) %>% summarise(freq = n())
colnames(alluvial_dataset_22)[which(colnames(alluvial_dataset_22)=="Region")] <- "Placement Region"
alluvial(alluvial_dataset_22[,c(1,2,3)], freq=alluvial_dataset_22$freq,
col = ifelse(alluvial_dataset_22$Placemenet == "Academia", "green", "Skyblue"),
border = ifelse(alluvial_dataset_22$Placemenet == "Academia", "green", "Skyblue"),
hide = alluvial_dataset_22$freq == 0,
cex = 0.7
)
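# Figure 4: among academic placements, is the university ranked higher than UAB?
alluvial_dataset_3 <- alluvial_dataset %>% filter(Placemenet == "Academia")
# Treat a missing rank comparison as "not higher ranked"
alluvial_dataset_3$`Higher Rank Uni? `[which(is.na(alluvial_dataset_3$`Higher Rank Uni? `))] <- FALSE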
alluvial_dataset_3 <- alluvial_dataset_3 %>% group_by(`Graduation year `,`Higher Rank Uni? `) %>% summarise(freq = n())
alluvial(alluvial_dataset_3[,c(1,2)], freq=alluvial_dataset_3$freq,
col = ifelse(test = alluvial_dataset_3$`Higher Rank Uni? `==TRUE , yes = "green", no = "skyblue"),
border = ifelse(test = alluvial_dataset_3$`Higher Rank Uni? `==TRUE , yes = "green", no = "skyblue"),
hide = alluvial_dataset_3$freq == 0,
cex = 0.7
)
# Figure 5: flow from nationality to placement region
alluvial_dataset_5 <- dataset %>% select(Nationality, Region) %>% group_by(Nationality, Region) %>% summarise(freq = n())
colnames(alluvial_dataset_5)[which(colnames(alluvial_dataset_5)=="Region")] <- "Placement Region"
alluvial(alluvial_dataset_5[,c(1,2)], freq=alluvial_dataset_5$freq,
col = ifelse(alluvial_dataset_5$`Placement Region` == "Europe", "green",no = ifelse(test = alluvial_dataset_5$`Placement Region` == "North America",yes = "Skyblue" , no = "gold" )),
border = ifelse(alluvial_dataset_5$`Placement Region` == "Europe", "green",no = ifelse(test = alluvial_dataset_5$`Placement Region` == "North America",yes = "Skyblue" , no = "gold" )),
hide = alluvial_dataset_5$freq == 0,
cex = 0.7
)