Background

The objective of this midterm submission is to perform exploratory data analysis on the World University Rankings dataset to gain insight into the performance of Singapore's universities (NUS and NTU) and how they compare with universities in the rest of the world. We also explore each score component for the two local universities to identify their strengths and weaknesses.

The dataset was obtained from Kaggle. We focus only on the top 100 universities in the “Times Higher Education World University Ranking”, which is widely regarded as one of the most influential and closely watched university rankings in the world.

Please note that the ranking methodology may differ from year to year, which may affect the interpretation of results when comparing different years.

Data Source

https://www.kaggle.com/mylesoneill/world-university-rankings
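
The report does not show how the top-100 subsets were built from the source data. The following is a minimal sketch of one possible approach, not the original preparation code. It assumes that `world_rank` may arrive as text, with ties written like "=26" and rank bands like "201-225"; both formats are assumptions about the raw file. The toy data frame `df_times` and the helper `parse_rank` are hypothetical, and all values are made up. Only the names `df_times_100_2016` and `df_times_100_Spore` match data frames used later in this report.

```r
# Toy stand-in for the Times table; values are illustrative, not real rankings.
df_times <- data.frame(
  world_rank      = c("1", "=26", "55", "201-225", "99"),
  university_name = c("University A",
                      "National University of Singapore",
                      "Nanyang Technological University",
                      "University B",
                      "University C"),
  country         = c("Country X", "Singapore", "Singapore",
                      "Country Y", "Country X"),
  year            = c(2016, 2016, 2016, 2016, 2015),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop a leading "=" and convert to a number.
# Banded ranks such as "201-225" become NA, so they fall outside the top 100.
parse_rank <- function(x) {
  suppressWarnings(as.numeric(sub("^=", "", x)))
}

rank_num <- parse_rank(df_times$world_rank)

# Keep only rows with a single numeric rank of 100 or better
df_times_100 <- df_times[!is.na(rank_num) & rank_num <= 100, ]

# Subsets corresponding to the data frames plotted later
df_times_100_2016  <- df_times_100[df_times_100$year == 2016, ]
df_times_100_Spore <- df_times_100[df_times_100$country == "Singapore", ]
```

Filtering on a parsed numeric rank rather than on the raw text avoids treating a band such as "201-225" as a top-100 entry.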

Definitions of Variables

Variable                 Description
world_rank               World rank of the university
university_name          Name of the university
country                  Country of the university
teaching                 University score for teaching (the learning environment)
international            University score for international outlook (staff, students, research)
research                 University score for research (volume, income and reputation)
citations                University score for citations (research influence)
income                   University score for industry income (knowledge transfer)
total_score              Total score for the university, used to determine rank
num_students             Number of students at the university
student_staff_ratio      Number of students divided by number of staff
international_students   Percentage of students who are international
female_male_ratio        Ratio of female to male students
year                     Year of the ranking (2011 to 2016 inclusive)
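
Some of these variables may be stored as text in the raw file and need converting before they can be plotted. The sketch below assumes that student counts carry thousands separators (for example "31,592") and that percentages carry a "%" sign; these formats are assumptions and should be checked against the downloaded data. The data frame `raw` is hypothetical, and the percentage values are made up for illustration.

```r
# Illustrative raw values; the text formats are assumptions about the source file.
raw <- data.frame(
  num_students           = c("31,592", "25,028"),
  international_students = c("30%", "25%"),   # made-up percentages
  stringsAsFactors = FALSE
)

# Remove thousands separators, then convert to numeric
raw$num_students <- as.numeric(gsub(",", "", raw$num_students))

# Remove the percent sign and express the value as a proportion
raw$international_students <-
  as.numeric(sub("%", "", raw$international_students)) / 100

str(raw)
```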

Data Exploration

Number of Top 100 Universities by Countries in 2016

library(ggplot2)

# Bar chart of top-100 universities per country, labelled with the counts
ggplot(df_times_100_2016, aes(x = country, fill = country)) +
geom_bar(stat = "count", color = "black") +
geom_text(stat = "count", aes(label = after_stat(count)), vjust = 1.5) +
theme(axis.text.x = element_text(angle = 90)) +
labs(title = "Number of Top 100 Universities by Countries")

# table(df_times_100_2016$country)

The USA had the most top-100 universities in 2016, with 39, followed by the UK (16), Germany (9), the Netherlands (8) and Australia (6).

Our local universities, NUS and NTU, are also among the top 100 universities in 2016 according to the Times Higher Education World University Ranking.
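
The per-country counts quoted above can also be read off a sorted frequency table rather than from the bar labels. A minimal sketch follows, using a made-up `country` vector in place of `df_times_100_2016$country`; the counts it produces are illustrative only.

```r
# Toy country column; the resulting counts are illustrative only.
country <- c("USA", "UK", "USA", "Germany", "UK", "USA")

# Frequency table sorted from most to fewest universities
country_counts <- sort(table(country), decreasing = TRUE)

# Show the five countries with the most universities
head(country_counts, 5)
```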

Number of Students in NUS and NTU

ggplot(df_times_100_Spore, aes(x = year, y = num_students)) +
geom_line(aes(color = university_name)) + 
geom_text(aes(color = university_name, label = num_students), vjust = 1.5) + 
geom_point(aes(color = university_name)) + 
scale_color_manual(values = c("Dark Blue", "Dark Orange")) +
labs(title = "Number of Students in NUS and NTU", color = 'University')

According to the Times dataset, the number of students at NUS and NTU stays constant at 31592 and 25028 respectively in every year. It is unlikely that enrolment was exactly unchanged throughout the years, so these numbers are probably estimates.
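
A quick way to confirm that a column never changes within a group is to count its distinct values per university. The sketch below uses a hypothetical data frame `df` that repeats the two student counts reported above over three years; the choice of years is illustrative.

```r
# Hypothetical data frame repeating the reported student counts over three years
df <- data.frame(
  university_name = rep(c("NUS", "NTU"), each = 3),
  year            = rep(2014:2016, times = 2),
  num_students    = rep(c(31592, 25028), each = 3)
)

# Number of distinct num_students values per university;
# a value of 1 means the figure did not change across years.
n_distinct_values <- tapply(df$num_students, df$university_name,
                            function(x) length(unique(x)))
n_distinct_values
```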

Performance Trend of NUS and NTU (Teaching)

ggplot(df_times_100_Spore, aes(x = year, y = teaching)) +
geom_line(aes(color = university_name)) + 
geom_text(aes(color = university_name, label = teaching), vjust = 1.5) + 
geom_point(aes(color = university_name)) + 
scale_color_manual(values = c("Dark Blue", "Dark Orange")) +
labs(title = "Performance Trend of NUS and NTU (Teaching)", color = 'University')

The Teaching component measures the University score for teaching and the learning environment. It examines the perceived prestige of institutions in teaching.

NUS scored between 65.5 and 74.4 from 2011 to 2016.

NTU scored between 37.7 and 48.4 from 2013 to 2016.

NTU's teaching score has been consistently lower than NUS's.
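
The score ranges quoted in this report can be computed directly instead of being read from the plot labels. The following sketch shows one way to do this for a single component; the data frame `scores` is hypothetical and its values are made up, so the output does not reproduce the figures above.

```r
# Hypothetical scores; values are made up for illustration.
scores <- data.frame(
  university_name = c("NUS", "NUS", "NUS", "NTU", "NTU"),
  year            = c(2014, 2015, 2016, 2015, 2016),
  teaching        = c(70, 72, 71, 40, 45)
)

# Minimum and maximum teaching score per university
score_range <- sapply(split(scores$teaching, scores$university_name), range)
rownames(score_range) <- c("min", "max")
score_range
```

The same call works for any other component by swapping `teaching` for the relevant column.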

Performance Trend of NUS and NTU (International Outlook)

ggplot(df_times_100_Spore, aes(x = year, y = international)) +
geom_line(aes(color = university_name)) + 
geom_text(aes(color = university_name, label = international), vjust = 1.5) + 
geom_point(aes(color = university_name)) + 
scale_color_manual(values = c("Dark Blue", "Dark Orange")) +
labs(title = "Performance Trend of NUS and NTU (International Outlook)", color = 'University')

The International Outlook component measures the University score for the international outlook of staff, students and research. It measures the ability of a university to attract undergraduates, postgraduates and faculty from other countries. It also calculates the proportion of a university’s total research journal publications with at least one international co-author and rewards higher volumes.

NUS scored between 92.3 and 97.8 from 2011 to 2016.

NTU scored between 90.5 and 94.6 from 2013 to 2016.

Both NUS and NTU improved from 2013 to 2016.

Performance Trend of NUS and NTU (Research)

ggplot(df_times_100_Spore, aes(x = year, y = research)) +
geom_line(aes(color = university_name)) + 
geom_text(aes(color = university_name, label = research), vjust = 1.5) + 
geom_point(aes(color = university_name)) + 
scale_color_manual(values = c("Dark Blue", "Dark Orange")) +
labs(title = "Performance Trend of NUS and NTU (Research)", color = 'University') 

The Research component measures the University score for research by volume, income and reputation. It examines the universities' reputation for research excellence among their peers as well as their research productivity.

NUS scored between 72.6 and 87.2 from 2011 to 2016.

NTU scored between 54.3 and 66.9 from 2013 to 2016.

Both NUS and NTU improved from 2014 to 2016.

Performance Trend of NUS and NTU (Citations)

ggplot(df_times_100_Spore, aes(x = year, y = citations)) +
geom_line(aes(color = university_name)) + 
geom_text(aes(color = university_name, label = citations), vjust = 1.5) + 
geom_point(aes(color = university_name)) + 
scale_color_manual(values = c("Dark Blue", "Dark Orange")) +
labs(title = "Performance Trend of NUS and NTU (Citations)", color = 'University') 

The Citations component measures the University score for citations and research influence. It examines the research influence of the universities in spreading new knowledge and ideas.

NUS scored between 63.4 and 79.4 from 2011 to 2016.

NTU scored between 54.5 and 85.6 from 2013 to 2016.

NTU's score rose sharply from 2013 to 2016.
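
One way to characterise how fast a score is rising is to look at its year-on-year changes: roughly equal absolute changes suggest linear growth, while roughly equal percentage changes suggest exponential growth. The sketch below illustrates the calculation on a made-up series; the values in `citations` are not taken from the dataset.

```r
# Made-up yearly scores, named by year; not taken from the dataset.
citations <- c(`2013` = 50, `2014` = 60, `2015` = 72, `2016` = 86)

# Absolute change from one year to the next
abs_change <- diff(citations)

# Percentage change relative to the previous year's score
pct_change <- 100 * diff(citations) / head(citations, -1)

abs_change
round(pct_change, 1)
```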

Performance Trend of NUS and NTU (Industry Income)

ggplot(df_times_100_Spore, aes(x = year, y = income)) +
geom_line(aes(color = university_name)) + 
geom_text(aes(color = university_name, label = income), vjust = 1.5) + 
geom_point(aes(color = university_name)) + 
scale_color_manual(values = c("Dark Blue", "Dark Orange")) +
labs(title = "Performance Trend of NUS and NTU (Industry Income)", color = 'University') 

The Industry Income component measures the University score for industry income and knowledge transfer. It captures the universities' knowledge-transfer activity by examining the research income they earn from industry. It also measures the extent to which businesses are willing to pay for research and the university’s ability to attract funding in the commercial marketplace.

NUS scored between 40.5 and 77.4 from 2011 to 2016.

NTU scored between 99.5 and 100 from 2013 to 2016.

NTU is doing extremely well in this component, scoring at or near the maximum of 100 in every year.

Performance Trend of NUS and NTU (Overall Score)

ggplot(df_times_100_Spore, aes(x = year, y = total_score)) +
geom_line(aes(color = university_name)) + 
geom_text(aes(color = university_name, label = total_score), vjust = 1.5) + 
geom_point(aes(color = university_name)) + 
scale_color_manual(values = c("Dark Blue", "Dark Orange")) +
labs(title = "Performance Trend of NUS and NTU (Overall Score)", color = 'University') 

This is the Overall Score of the university used to determine university rank.

NUS scored between 70.9 and 79.2 from 2011 to 2016.

NTU scored between 57.2 and 68.2 from 2013 to 2016.

NTU improved rapidly from 2014 to 2016.
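
The trend plots in this report all repeat the same line, point and label layers, changing only the plotted column and the title. A helper function could capture that pattern in one place. The sketch below is one possible version, not part of the original analysis: `plot_trend` and the toy data frame `toy` are hypothetical, and the scores are made up.

```r
library(ggplot2)

# Hypothetical helper wrapping the line/point/label pattern used for each component.
# `column` is the name of the score column, `label` is the text shown in the title.
plot_trend <- function(data, column, label) {
  ggplot(data, aes(x = year, y = .data[[column]], color = university_name)) +
    geom_line() +
    geom_point() +
    geom_text(aes(label = .data[[column]]), vjust = 1.5) +
    scale_color_manual(values = c("darkblue", "darkorange")) +
    labs(title = paste0("Performance Trend of NUS and NTU (", label, ")"),
         y = column, color = "University")
}

# Toy data with made-up scores
toy <- data.frame(
  university_name = rep(c("NTU", "NUS"), each = 2),
  year            = rep(c(2015, 2016), times = 2),
  teaching        = c(40, 45, 70, 72)
)

p <- plot_trend(toy, "teaching", "Teaching")
```

With such a helper, each component plot reduces to a single call, for example `plot_trend(df_times_100_Spore, "research", "Research")`.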

Performance Trend of NUS and NTU (World Ranking)

df_times_100_Spore$world_rank_fac = factor(df_times_100_Spore$world_rank)

ggplot(df_times_100_Spore, aes(x = year, y = world_rank_fac)) +
geom_text(aes(color = university_name, label = world_rank_fac), vjust = 1.5) + 
geom_point(aes(color = university_name)) + 
scale_color_manual(values = c("Dark Blue", "Dark Orange")) +
labs(title = "Performance Trend of NUS and NTU (World Ranking)", color = 'University') +
scale_y_discrete(limits = rev(levels(df_times_100_Spore$world_rank_fac)))

df_times_100_Spore$world_rank_fac = NULL

This is the world ranking of NUS and NTU.

NUS ranked between 40th and 25th from 2011 to 2016.

NTU ranked between 86th and 55th from 2013 to 2016.

NTU improved rapidly from 2013 to 2016.
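
Plotting the rank as a factor spaces every observed rank evenly on the axis, regardless of how far apart the ranks are. An alternative, sketched below under the assumption that the rank is available as a number, is to keep it numeric and reverse the axis so that better ranks appear higher and distances stay proportional. The data frame `ranks` is hypothetical and its values are made up.

```r
library(ggplot2)

# Hypothetical numeric ranks; values are made up for illustration.
ranks <- data.frame(
  university_name = rep(c("NTU", "NUS"), each = 2),
  year            = rep(c(2015, 2016), times = 2),
  world_rank      = c(80, 60, 30, 28)
)

# Numeric y axis, reversed so that rank 1 would sit at the top
p_rank <- ggplot(ranks, aes(x = year, y = world_rank, color = university_name)) +
  geom_line() +
  geom_point() +
  geom_text(aes(label = world_rank), vjust = 1.5) +
  scale_y_reverse() +
  scale_color_manual(values = c("darkblue", "darkorange")) +
  labs(title = "Performance Trend of NUS and NTU (World Ranking)",
       y = "World rank", color = "University")
```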

Comparison between NUS and NTU in 2016

df_times_100_Spore_kpi = df_times_100_Spore[df_times_100_Spore$year == 2016, ]
rownames(df_times_100_Spore_kpi) = df_times_100_Spore_kpi$university_name
df_times_100_Spore_kpi = df_times_100_Spore_kpi[, c("teaching", "international", "research", "citations", "income")]
colnames(df_times_100_Spore_kpi) = c("Teaching", "International Outlook", "Research", "Citations", "Industry Income")

# Max and Min value for Radar Chart
df_times_100_Spore_kpi = rbind(rep(100,5) , rep(0,5) , df_times_100_Spore_kpi)

# Color vector
colors_border = c(rgb(0.2,0.5,0.5,0.9), rgb(0.8,0.2,0.5,0.9))
colors_in = c(rgb(0.2,0.5,0.5,0.4), rgb(0.8,0.2,0.5,0.4))

# Radar Chart (radarchart() is provided by the fmsb package)
library(fmsb)
radarchart(df_times_100_Spore_kpi, axistype = 1 , 
pcol = colors_border, pfcol = colors_in , plwd = 4, plty = 1,
cglcol = "grey", cglty = 1, axislabcol = "grey",  cglwd = 0.8,
vlcex = 0.8,  title = "NUS VS NTU")

# Add Radar Chart Legend
legend(x = 0.7, y = 1, legend = rownames(df_times_100_Spore_kpi[-c(1,2), ]), 
bty = "n", pch = 20, col = colors_in , text.col = "grey", cex = 1.2, pt.cex = 3)

This radar chart compares the five components for NUS and NTU in 2016.

In 2016, NUS performed better in Teaching, Research and International Outlook.

NTU performed better in Industry Income and Citations.
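
The comparison read from the radar chart can be cross-checked numerically by subtracting one university's scores from the other's. The sketch below shows the idea on a hypothetical matrix `kpi_2016`; the scores are made up and chosen only to mirror the direction of the findings above, not the actual 2016 values.

```r
# Made-up 2016 component scores; not the actual values from the dataset.
kpi_2016 <- rbind(
  NUS = c(Teaching = 70, `International Outlook` = 95, Research = 85,
          Citations = 78, `Industry Income` = 60),
  NTU = c(Teaching = 40, `International Outlook` = 93, Research = 65,
          Citations = 85, `Industry Income` = 100)
)

# Positive gap: NUS scores higher; negative gap: NTU scores higher
gap <- kpi_2016["NUS", ] - kpi_2016["NTU", ]

# University with the higher score in each component
leader <- ifelse(gap > 0, "NUS", "NTU")

gap
leader
```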