The objective of this midterm submission is to perform exploratory data analysis (EDA) on the World University Rankings dataset, to gain insight into the performance of Singapore's universities (NUS and NTU) and how they compare with other universities worldwide. We also examine each scoring component for the two local universities to identify their strengths and weaknesses.
The dataset was obtained from Kaggle. We focus only on the top 100 universities in the "Times Higher Education World University Ranking", which is widely regarded as one of the most influential and closely followed university rankings in the world.
Please note that the ranking methodology may differ from year to year, which may affect interpretation when results are compared across years.
| Variable | Description |
|---|---|
| world_rank | World rank of the university |
| university_name | Name of university |
| country | Country of each university |
| teaching | University score for teaching (the learning environment) |
| international | University score for international outlook (staff, students, research) |
| research | University score for research (volume, income and reputation) |
| citations | University score for citations (research influence) |
| income | University score for industry income (knowledge transfer) |
| total_score | Total score for university, used to determine rank |
| num_students | Number of students at the university |
| student_staff_ratio | Number of students divided by number of staff |
| international_students | Percentage of students who are international |
| female_male_ratio | Ratio of female to male students |
| year | Year of the ranking (2011 to 2016 inclusive) |
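The data frames `df_times_100_2016` and `df_times_100_Spore` used in the plots are not constructed in this section. The following is a minimal sketch of how they could be derived, assuming a full Times table called `df_times` (a hypothetical name) in which `world_rank` may be stored as text. The rows are placeholders, not real rankings.

```r
# Toy stand-in for the Kaggle Times table; every row here is a placeholder.
df_times <- data.frame(
  world_rank      = c("1", "50", "150", "2", "60"),
  university_name = c("Univ A", "Univ B", "Univ C", "Univ A", "Univ B"),
  country         = c("Country X", "Singapore", "Country X", "Country X", "Singapore"),
  year            = c(2016, 2016, 2016, 2015, 2015),
  stringsAsFactors = FALSE
)

# Assumption: ranks may be text, so convert before comparing.
# Values that cannot be parsed become NA and are dropped below.
df_times$rank_num <- suppressWarnings(as.integer(df_times$world_rank))

# Keep only the top 100 of each year.
df_times_100 <- df_times[!is.na(df_times$rank_num) & df_times$rank_num <= 100, ]

# Subsets used later: the 2016 list, and the Singapore universities over all years.
df_times_100_2016  <- df_times_100[df_times_100$year == 2016, ]
df_times_100_Spore <- df_times_100[df_times_100$country == "Singapore", ]
```

Filtering to the top 100 before subsetting by country means a Singapore university appears only in the years in which it made the top 100.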
ggplot(df_times_100_2016, aes(x = country, fill = country)) +
geom_bar(stat = "count", color = "black") +
geom_text(stat = "count", aes(label = after_stat(count)), vjust = 1.5) +  # after_stat() replaces the deprecated ..count.. notation
theme(axis.text.x = element_text(angle = 90)) +
labs(title = "Number of Top 100 Universities by Countries")
# table(df_times_100_2016$country)
In 2016, the USA had the most top-100 universities (39), followed by the UK (16), Germany (9), the Netherlands (8) and Australia (6).
Our local universities, NUS and NTU, are also among the top 100 universities in 2016 according to the Times Higher Education World University Ranking.
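The counts read off the bar chart can also be tabulated directly, as the commented-out `table()` call suggests. The sketch below rebuilds only the five counts quoted above (using the short country labels from the text, which may differ from the labels in the dataset) and sorts them in descending order.

```r
# Rebuild a country vector from the five counts quoted in the text.
country <- rep(c("USA", "UK", "Germany", "Netherlands", "Australia"),
               times = c(39, 16, 9, 8, 6))

# Frequency table, largest first.
counts <- sort(table(country), decreasing = TRUE)

# Number of top-100 places held by these five countries combined.
top5_total <- sum(counts)
print(counts)
print(top5_total)
```

On the real data, `sort(table(df_times_100_2016$country), decreasing = TRUE)` would give the full ordering. With the figures above, the five leading countries together account for 78 of the listed universities.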
ggplot(df_times_100_Spore, aes(x = year, y = num_students)) +
geom_line(aes(color = university_name)) +
geom_text(aes(color = university_name, label = num_students), vjust = 1.5) +
geom_point(aes(color = university_name)) +
scale_color_manual(values = c("Dark Blue", "Dark Orange")) +
labs(title = "Number of Students in NUS and NTU", color = 'University')
According to the Times dataset, the student numbers for NUS and NTU stay constant at 31592 and 25028 respectively in every year covered. It is unlikely that enrolment was exactly unchanged throughout these years, so these figures are probably estimates.
NUS has around 31592 students and NTU has around 25028 students.
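One way to confirm that a column is constant within each university is to count its distinct values per group. The sketch below uses the figures quoted above and assumes the year coverage given in the later sections (2011 to 2016 for NUS, 2013 to 2016 for NTU). The short labels "NUS" and "NTU" are used for brevity and are not the names stored in the dataset.

```r
# Student numbers as described in the text, one row per university-year.
students <- data.frame(
  university_name = c(rep("NUS", 6), rep("NTU", 4)),
  year            = c(2011:2016, 2013:2016),
  num_students    = c(rep(31592, 6), rep(25028, 4))
)

# Count distinct values of num_students within each university.
distinct_counts <- tapply(students$num_students, students$university_name,
                          function(x) length(unique(x)))

# TRUE where a university reports exactly one value across all years.
constant <- distinct_counts == 1
print(constant)
```

A result of `TRUE` for both universities supports the reading that the dataset carries a single snapshot figure rather than yearly enrolment.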
ggplot(df_times_100_Spore, aes(x = year, y = teaching)) +
geom_line(aes(color = university_name)) +
geom_text(aes(color = university_name, label = teaching), vjust = 1.5) +
geom_point(aes(color = university_name)) +
scale_color_manual(values = c("Dark Blue", "Dark Orange")) +
labs(title = "Performance Trend of NUS and NTU (Teaching)", color = 'University')
The Teaching component measures the university's score for teaching and the learning environment. It examines the perceived prestige of institutions in teaching.
NUS scored between 65.5 and 74.4 from 2011 to 2016.
NTU scored between 37.7 and 48.4 from 2013 to 2016.
NTU's teaching score has been consistently lower than NUS's: NTU's highest score (48.4) is below NUS's lowest (65.5).
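The per-university ranges quoted in each component section can be computed rather than read off the chart. The following is a small sketch with a hypothetical helper, `score_range()`, demonstrated on only the minimum and maximum teaching scores given above, since the yearly values are not reproduced in the text.

```r
# Only the extreme teaching scores quoted in the text; "NUS"/"NTU" are short labels.
teaching <- data.frame(
  university_name = c("NUS", "NUS", "NTU", "NTU"),
  teaching        = c(65.5, 74.4, 37.7, 48.4)
)

# Hypothetical helper: min and max of one score column per university.
score_range <- function(df, column) {
  rng <- lapply(split(df[[column]], df$university_name), range)
  out <- do.call(rbind, rng)          # one row per university
  colnames(out) <- c("min", "max")
  out
}

teaching_range <- score_range(teaching, "teaching")

# Distance between NUS's lowest and NTU's highest teaching score.
gap <- teaching_range["NUS", "min"] - teaching_range["NTU", "max"]
print(teaching_range)
print(gap)
```

Applied to the real data, the same call with `"international"`, `"research"`, `"citations"`, `"income"` or `"total_score"` would give the ranges for the other components.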
ggplot(df_times_100_Spore, aes(x = year, y = international)) +
geom_line(aes(color = university_name)) +
geom_text(aes(color = university_name, label = international), vjust = 1.5) +
geom_point(aes(color = university_name)) +
scale_color_manual(values = c("Dark Blue", "Dark Orange")) +
labs(title = "Performance Trend of NUS and NTU (International Outlook)", color = 'University')
The International Outlook component measures the university's score for the international outlook of its staff, students and research. It measures the ability of a university to attract undergraduates, postgraduates and faculty from other countries. It also calculates the proportion of a university's total research journal publications that have at least one international co-author, and rewards higher volumes.
NUS scored between 92.3 and 97.8 from 2011 to 2016.
NTU scored between 90.5 and 94.6 from 2013 to 2016.
Both NUS and NTU improved from 2013 to 2016.
ggplot(df_times_100_Spore, aes(x = year, y = research)) +
geom_line(aes(color = university_name)) +
geom_text(aes(color = university_name, label = research), vjust = 1.5) +
geom_point(aes(color = university_name)) +
scale_color_manual(values = c("Dark Blue", "Dark Orange")) +
labs(title = "Performance Trend of NUS and NTU (Research)", color = 'University')
The Research component measures the university's score for research by volume, income and reputation. It examines a university's reputation for research excellence among its peers as well as its research productivity.
NUS scored between 72.6 and 87.2 from 2011 to 2016.
NTU scored between 54.3 and 66.9 from 2013 to 2016.
Both NUS and NTU improved from 2014 to 2016.
ggplot(df_times_100_Spore, aes(x = year, y = citations)) +
geom_line(aes(color = university_name)) +
geom_text(aes(color = university_name, label = citations), vjust = 1.5) +
geom_point(aes(color = university_name)) +
scale_color_manual(values = c("Dark Blue", "Dark Orange")) +
labs(title = "Performance Trend of NUS and NTU (Citations)", color = 'University')
The Citations component measures the university's score for citations, that is, its research influence. It examines the role of the universities in spreading new knowledge and ideas.
NUS scored between 63.4 and 79.4 from 2011 to 2016.
NTU scored between 54.5 and 85.6 from 2013 to 2016.
NTU improved sharply from 2013 to 2016.
ggplot(df_times_100_Spore, aes(x = year, y = income)) +
geom_line(aes(color = university_name)) +
geom_text(aes(color = university_name, label = income), vjust = 1.5) +
geom_point(aes(color = university_name)) +
scale_color_manual(values = c("Dark Blue", "Dark Orange")) +
labs(title = "Performance Trend of NUS and NTU (Industry Income)", color = 'University')
The Industry Income component measures the university's score for industry income and knowledge transfer. It captures a university's knowledge-transfer activity by examining the research income it earns from industry. It also reflects the extent to which businesses are willing to pay for research and the university's ability to attract funding in the commercial marketplace.
NUS scored between 40.5 and 77.4 from 2011 to 2016.
NTU scored between 99.5 and 100 from 2013 to 2016.
NTU performs extremely well in this component, scoring at or near the maximum of 100 in every year.
ggplot(df_times_100_Spore, aes(x = year, y = total_score)) +
geom_line(aes(color = university_name)) +
geom_text(aes(color = university_name, label = total_score), vjust = 1.5) +
geom_point(aes(color = university_name)) +
scale_color_manual(values = c("Dark Blue", "Dark Orange")) +
labs(title = "Performance Trend of NUS and NTU (Overall Score)", color = 'University')
The Overall Score is the total score used to determine a university's rank.
NUS scored between 70.9 and 79.2 from 2011 to 2016.
NTU scored between 57.2 and 68.2 from 2013 to 2016.
NTU improved rapidly from 2014 to 2016.
df_times_100_Spore$world_rank_fac = factor(df_times_100_Spore$world_rank)
ggplot(df_times_100_Spore, aes(x = year, y = world_rank_fac)) +
geom_text(aes(color = university_name, label = world_rank_fac), vjust = 1.5) +
geom_point(aes(color = university_name)) +
scale_color_manual(values = c("Dark Blue", "Dark Orange")) +
labs(title = "Performance Trend of NUS and NTU (World Ranking)", color = 'University') +
scale_y_discrete(limits = rev(levels(df_times_100_Spore$world_rank_fac)))
df_times_100_Spore$world_rank_fac = NULL
This chart shows the world ranking of NUS and NTU.
NUS ranked between 25th and 40th from 2011 to 2016.
NTU ranked between 55th and 86th from 2013 to 2016.
NTU improved rapidly from 2013 to 2016.
df_times_100_Spore_kpi = df_times_100_Spore[df_times_100_Spore$year == 2016, ]
rownames(df_times_100_Spore_kpi) = df_times_100_Spore_kpi$university_name
df_times_100_Spore_kpi = df_times_100_Spore_kpi[, c("teaching", "international", "research", "citations", "income")]
colnames(df_times_100_Spore_kpi) = c("Teaching", "International Outlook", "Research", "Citations", "Industry Income")
# Max and Min value for Radar Chart
df_times_100_Spore_kpi = rbind(rep(100,5) , rep(0,5) , df_times_100_Spore_kpi)
# Color vector
colors_border = c(rgb(0.2,0.5,0.5,0.9), rgb(0.8,0.2,0.5,0.9))
colors_in = c(rgb(0.2,0.5,0.5,0.4), rgb(0.8,0.2,0.5,0.4))
# Radar Chart (radarchart() is provided by the fmsb package)
library(fmsb)
radarchart(df_times_100_Spore_kpi, axistype = 1 ,
pcol = colors_border, pfcol = colors_in , plwd = 4, plty = 1,
cglcol = "grey", cglty = 1, axislabcol = "grey", cglwd = 0.8,
vlcex = 0.8, title = "NUS VS NTU")
# Add Radar Chart Legend
legend(x = 0.7, y = 1, legend = rownames(df_times_100_Spore_kpi[-c(1,2), ]),
bty = "n", pch = 20, col = colors_in , text.col = "grey", cex = 1.2, pt.cex = 3)
This radar chart compares NUS and NTU across the five components in 2016.
In 2016, NUS performed better in Teaching, Research and International Outlook.
NTU performed better in Industry Income and Citations.
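The leader in each component can also be read from the data frame behind the radar chart instead of from the plot. The sketch below uses placeholder universities and scores (not the 2016 values, which are not listed in the text) laid out in the same shape as `df_times_100_Spore_kpi` before the max and min rows are added.

```r
# Placeholder scores: one row per university, one column per component.
kpi <- data.frame(
  Teaching  = c(70, 40),
  Research  = c(80, 60),
  Citations = c(75, 85),
  row.names = c("Univ A", "Univ B")
)

# Row index of the highest score in each column (ties go to the first row).
best_row <- vapply(kpi, which.max, integer(1))

# Map each index back to a university name, keyed by component.
leader <- setNames(rownames(kpi)[best_row], colnames(kpi))
print(leader)
```

On the real table this should be run before the `rbind()` step, since the added row of 100s would otherwise win every component.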