QUIZ1

Author

selhan çil

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.2.0     ✔ readr     2.2.0
✔ forcats   1.0.1     ✔ stringr   1.6.0
✔ ggplot2   4.0.2     ✔ tibble    3.3.1
✔ lubridate 1.9.5     ✔ tidyr     1.3.2
✔ purrr     1.2.1     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(readxl)   # provides read_excel()
# ggplot2 is already attached by library(tidyverse), so no separate call is needed

1. Download the dataset below. `impact` gives the impact of each researcher; `aff` denotes affiliation.

data <- read_excel("~/Econ 465 R/data/Incites Researchers.xlsx")
New names:
• `` -> `...19`
• `` -> `...20`
• `` -> `...21`
• `` -> `...22`
• `` -> `...23`
• `` -> `...24`
• `` -> `...25`
• `` -> `...26`
• `` -> `...27`
• `` -> `...28`
• `` -> `...29`
• `` -> `...30`
• `` -> `...31`
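
The `New names` message means `read_excel()` found 13 columns with blank headers and auto-named them `...19` through `...31`. If those columns turn out to be empty spreadsheet padding (an assumption that should be checked before dropping anything), a sketch like the following would remove them. The toy data frame `raw` is hypothetical and only stands in for `data`.

```r
# Hypothetical stand-in for the imported sheet: two real columns plus
# two blank-header columns that readxl would auto-name `...3` and `...4`.
raw <- data.frame(
  aff1   = c("A", "B"),
  impact = c(0.5, 1.2),
  pad1   = c(NA, NA),
  pad2   = c(NA, NA)
)
names(raw)[3:4] <- c("...3", "...4")

# Flag auto-named columns (three dots followed by digits) ...
auto_named <- grepl("^\\.\\.\\.[0-9]+$", names(raw))
# ... and columns that contain nothing but missing values.
all_missing <- vapply(raw, function(col) all(is.na(col)), logical(1))

# Keep everything except columns that are both auto-named and empty.
cleaned <- raw[, !(auto_named & all_missing), drop = FALSE]
names(cleaned)
```

Requiring both conditions is deliberate: an auto-named column that still holds values is kept, so nothing with data in it is discarded silently.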

2. Filter for observations where the **first affiliation** is **İzmir Ekonomi Üniversitesi**

ieu <- data %>% 
  filter(aff1 == "Izmir Ekonomi Universitesi")
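
Note that the task names the university with Turkish characters, while the filter uses the ASCII spelling. Because `==` compares strings exactly, only the spelling actually stored in `aff1` will match. The sketch below illustrates this on a hypothetical toy data frame (`toy` and its values are made up for illustration).

```r
library(dplyr)

# Hypothetical toy data; the real `aff1` values come from the spreadsheet.
toy <- data.frame(
  aff1   = c("Izmir Ekonomi Universitesi",
             "Izmir Ekonomi Universitesi",
             "Some Other University"),
  impact = c(0.2, 1.1, 0.7)
)

# Exact matching: the ASCII spelling matches, the Turkish spelling does not.
n_ascii   <- toy %>% filter(aff1 == "Izmir Ekonomi Universitesi") %>% nrow()
n_turkish <- toy %>% filter(aff1 == "İzmir Ekonomi Üniversitesi") %>% nrow()

# Listing the distinct values is a quick way to confirm the stored spelling.
toy %>% count(aff1, sort = TRUE)
```

Running `count(aff1)` on the real data before filtering would confirm which spelling the file uses.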

3. Create a **histogram** of the `impact` variable

ggplot(ieu, aes(x = impact)) +
  geom_histogram(binwidth = 0.5, fill = "#2E86AB", color = "white", alpha = 0.85) +
  geom_vline(aes(xintercept = mean(impact, na.rm = TRUE), color = "Mean"),
             linetype = "dashed", linewidth = 1) +
  geom_vline(aes(xintercept = median(impact, na.rm = TRUE), color = "Median"),
             linetype = "solid", linewidth = 1) +
  scale_color_manual(name = "Reference Lines",
                     values = c("Mean" = "#E63946", "Median" = "#F4A261")) +
  labs(
    title    = "Histogram of Research Impact",
    subtitle = "Researchers with First Affiliation: İzmir Ekonomi Üniversitesi",
    x        = "Impact Score",
    y        = "Number of Researchers"
  ) +
  theme_minimal(base_size = 13) +
  theme(plot.title = element_text(face = "bold"))

4. Create a **boxplot** of the `impact` variable

ggplot(ieu, aes(x = "", y = impact)) +
  geom_boxplot(fill = "#2E86AB", color = "#1B4F72",
               outlier.colour = "#E63946", outlier.shape = 16,
               outlier.size = 2, alpha = 0.8) +
  stat_summary(fun = mean, geom = "point", shape = 18,
               size = 4, color = "#E63946") +
  labs(
    title    = "Boxplot of Research Impact",
    subtitle = "Researchers with First Affiliation: İzmir Ekonomi Üniversitesi\n(Red diamond = Mean)",
    y        = "Impact Score",
    x        = ""
  ) +
  theme_minimal(base_size = 13) +
  theme(
    plot.title   = element_text(face = "bold"),
    axis.text.x  = element_blank(),
    axis.ticks.x = element_blank()
  )

summary_stats <- ieu %>%
  summarise(
    N        = n(),
    Mean     = round(mean(impact, na.rm = TRUE), 4),
    Median   = round(median(impact, na.rm = TRUE), 4),
    SD       = round(sd(impact, na.rm = TRUE), 4),
    Min      = round(min(impact, na.rm = TRUE), 4),
    Q1       = round(quantile(impact, 0.25, na.rm = TRUE), 4),
    Q3       = round(quantile(impact, 0.75, na.rm = TRUE), 4),
    Max      = round(max(impact, na.rm = TRUE), 4),
    IQR      = round(IQR(impact, na.rm = TRUE), 4),
    Skewness = round(e1071::skewness(impact, na.rm = TRUE), 4)
  )

knitr::kable(t(summary_stats), col.names = c("Value"),
             caption = "Descriptive Statistics for Impact (İzmir Ekonomi Üniversitesi)")
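Descriptive Statistics for Impact (İzmir Ekonomi Üniversitesi)

| Statistic | Value    |
|-----------|----------|
| N         | 630.0000 |
| Mean      | 0.6888   |
| Median    | 0.2410   |
| SD        | 1.8707   |
| Min       | 0.0000   |
| Q1        | 0.0000   |
| Q3        | 0.7804   |
| Max       | 33.0800  |
| IQR       | 0.7804   |
| Skewness  | 11.0529  |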

Distribution Interpretation

Shape

The distribution of impact scores is strongly right-skewed (skewness ≈ 11.05). Most researchers are concentrated at low values, while a few have very high scores, up to a maximum of 33.08. This is a common pattern in academic citation data.
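
To make the skewness figure less of a black box, here is a minimal sketch of a moment-based skewness calculation in base R. The helper `skew_by_hand()` and the toy vectors are hypothetical. The formula (third central moment divided by the cubed sample standard deviation) is what I understand `e1071::skewness()` to use by default, but treat that correspondence as an assumption.

```r
# Moment-based sample skewness: third central moment divided by sd^3.
# Assumed to correspond to e1071's default (type = 3); not verified here.
skew_by_hand <- function(x) {
  x  <- x[!is.na(x)]
  m3 <- mean((x - mean(x))^3)
  m3 / sd(x)^3
}

# Hypothetical scores: many low values and one very large one.
toy_impact <- c(0, 0, 0, 0.2, 0.3, 0.8, 33.08)
skew_right <- skew_by_hand(toy_impact)          # positive: long right tail

# A symmetric vector has no skew.
skew_sym <- skew_by_hand(c(-2, -1, 0, 1, 2))    # zero
```

A positive value indicates a long right tail; the larger the value, the more the tail dominates.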

Center

The median (0.24) is noticeably lower than the mean (0.69). This gap is caused by a small number of high-impact researchers pulling the mean upward. Therefore, the median better represents the typical researcher at this institution.
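
The effect of a few large values on the mean can be seen in a small sketch. The numbers below are hypothetical, apart from 33.08, which is the maximum reported in the summary table.

```r
# Hypothetical low-impact scores.
base_scores <- c(0, 0, 0.2, 0.3, 0.8)

# The same scores plus one very high-impact researcher.
with_outlier <- c(base_scores, 33.08)

mean_before   <- mean(base_scores)     # 0.26
median_before <- median(base_scores)   # 0.2

mean_after    <- mean(with_outlier)    # 5.73
median_after  <- median(with_outlier)  # 0.25
```

One extreme score multiplies the mean many times over, while the median barely moves. That is why the median is the safer summary of a typical researcher here.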

Spread

The standard deviation (1.87) is much larger than the mean (0.69), which indicates high variability among researchers. The middle 50% of researchers have impact scores between 0 (Q1) and 0.78 (Q3), showing that most researchers are clustered at the lower end.
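
One way to quantify "much larger than the mean" is the coefficient of variation (SD divided by mean). This is an extra illustration, not part of the original analysis. The sketch uses the values from the summary table; the variable names are hypothetical.

```r
# Values copied from the descriptive statistics table.
reported_mean <- 0.6888
reported_sd   <- 1.8707

# Coefficient of variation: spread relative to the mean.
cv <- reported_sd / reported_mean
round(cv, 2)   # about 2.72
```

A coefficient of variation well above 1 means the spread is several times the size of the average.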

Outliers

The boxplot shows several high-value outliers, with the largest score at 33.08. A small group reaches scores above 10; these are the institution’s highest performers. At the other end, about 32% of researchers (199 out of 630) have an impact score of zero, meaning they have no recorded citation impact. These zeros are not outliers: they make up the lower part of the distribution, which is why Q1 is 0.
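
The cutoff the boxplot uses for outliers can be reproduced from the summary table. The sketch assumes the default 1.5 × IQR whisker rule of `geom_boxplot()`; the variable names are hypothetical.

```r
# Quartiles copied from the descriptive statistics table.
q1  <- 0
q3  <- 0.7804
iqr <- q3 - q1

# Tukey fences: points beyond these are drawn as outliers.
upper_fence <- q3 + 1.5 * iqr   # 1.951
lower_fence <- q1 - 1.5 * iqr   # -1.1706, below the minimum of 0

# Share of researchers with an impact score of exactly zero.
share_zero <- 199 / 630         # about 0.316
```

Under this rule, any score above roughly 1.95 is plotted as an outlier. Since the lower fence is negative and the minimum is 0, all outliers lie on the high side.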

Summary

Overall, research impact at İzmir Ekonomi Üniversitesi is highly unequal. The typical researcher has a low impact score, while a small number of highly cited individuals raise the average. These descriptive statistics alone cannot show what would improve performance, but one possible implication is that the institution could benefit from supporting mid-level researchers rather than depending on a few top performers.