Introduction

This report analyzes science and technology indicators for Kenya. It focuses on charges for the use of intellectual property and also covers related measures: high-technology exports, patent applications, research and development expenditure, researchers and technicians in R&D, and scientific and technical journal articles. Science and technology are widely regarded as drivers of economic growth, innovation, and competitiveness, and Kenya increasingly recognizes their importance in its national development agenda.

The data span multiple years, with one value per indicator per year. By examining trends in these indicators, the report aims to show how Kenya is progressing in its efforts to strengthen its science and technology sectors, with a focus on the commercialization of intellectual property.

Through statistical summaries and a trend plot, the analysis highlights both growth and challenges within the sector and offers takeaways for policymakers, researchers, and other stakeholders in science, technology, and innovation. It also considers the potential implications for Kenya’s economic development and its role within the broader regional and global landscape.

This report is intended to support evidence-based decision-making and guide strategies for fostering innovation, ensuring that Kenya continues to build a resilient, knowledge-driven economy.

# Load the data
data <- read.csv("C:\\Users\\elvir\\OneDrive\\Desktop\\RProjects\\kenya\\education\\science-and-technology_ken.csv")
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
# Drop the first row, which is assumed to hold header-like metadata rather than an observation
data.clean <- data[-1, ]

# Rename columns
colnames(data.clean) <- c("Country", "Country_ISO3", "Year", "Indicator_Name", "Indicator_Code", "Value")

# Convert columns to appropriate data types
data.clean$Year <- as.numeric(data.clean$Year)
data.clean$Value <- as.numeric(data.clean$Value)

# Drop rows whose Value is missing or could not be converted to a number
data.clean <- data.clean %>% filter(!is.na(Value))

# Preview the cleaned data
head(data.clean)
##   Country Country_ISO3 Year
## 1   Kenya          KEN 2022
## 2   Kenya          KEN 2021
## 3   Kenya          KEN 2020
## 4   Kenya          KEN 2019
## 5   Kenya          KEN 2018
## 6   Kenya          KEN 2017
##                                                              Indicator_Name
## 1 Charges for the use of intellectual property, payments (BoP, current US$)
## 2 Charges for the use of intellectual property, payments (BoP, current US$)
## 3 Charges for the use of intellectual property, payments (BoP, current US$)
## 4 Charges for the use of intellectual property, payments (BoP, current US$)
## 5 Charges for the use of intellectual property, payments (BoP, current US$)
## 6 Charges for the use of intellectual property, payments (BoP, current US$)
##   Indicator_Code     Value
## 1 BM.GSR.ROYL.CD  39863722
## 2 BM.GSR.ROYL.CD  42251397
## 3 BM.GSR.ROYL.CD  76192115
## 4 BM.GSR.ROYL.CD 121644123
## 5 BM.GSR.ROYL.CD 131726255
## 6 BM.GSR.ROYL.CD 206821952
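The conversion step above relies on `as.numeric()`, which turns any string it cannot parse into `NA`; the filter then drops those rows without reporting how many were lost. A minimal sketch of a check that counts such entries before they are filtered out is shown here. The vector `raw_value` is a made-up stand-in for the raw `Value` column, not data from the file.

```r
# Hypothetical raw values as they might arrive from read.csv() (all character).
# The empty string and "n/a" are invented examples of unparseable entries.
raw_value <- c("39863722", "42251397", "", "n/a", "76192115")

# Convert to numeric; the coercion warning is expected, so suppress it
num_value <- suppressWarnings(as.numeric(raw_value))

# Flag entries that were present as text but failed to parse
failed <- !is.na(raw_value) & is.na(num_value)

# Number of values lost to coercion, and the offending strings
n_failed <- sum(failed)
bad_strings <- raw_value[failed]
```

Inspecting `bad_strings` before filtering makes it clear whether the dropped rows are genuine gaps or values in an unexpected format.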
# Summary of data
summary_statistics <- data.clean %>% 
  group_by(Indicator_Name) %>% 
  summarize(mean_value = mean(Value, na.rm = TRUE), 
            median_value = median(Value, na.rm = TRUE), 
            min_value = min(Value, na.rm = TRUE), 
            max_value = max(Value, na.rm = TRUE))

# Print summary
summary_statistics
## # A tibble: 10 × 5
##    Indicator_Name                    mean_value median_value min_value max_value
##    <chr>                                  <dbl>        <dbl>     <dbl>     <dbl>
##  1 Charges for the use of intellect…    4.36e+7 30109419.      0         2.07e+8
##  2 Charges for the use of intellect…    2.24e+7 11354740.      0         9.17e+7
##  3 High-technology exports (% of ma…    4.63e+0        4.48    2.32e+0   7.59e+0
##  4 High-technology exports (current…    8.06e+7 80661302.      4.69e+7   1.36e+8
##  5 Patent applications, nonresidents    7.11e+1       62       2   e+1   1.36e+2
##  6 Patent applications, residents       9.13e+1       48       1   e+0   3.41e+2
##  7 Research and development expendi…    4.86e-1        0.410   3.55e-1   6.92e-1
##  8 Researchers in R&D (per million …    1.51e+2      169.      5.62e+1   2.27e+2
##  9 Scientific and technical journal…    7.31e+2      591.      3.60e+2   1.75e+3
## 10 Technicians in R&D (per million …    3.44e+2      344.      6.10e+1   6.28e+2
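The summary table above does not show how many observations or which years each indicator covers, which matters when comparing means across indicators. The following sketch adds coverage columns and a first-to-last-year percentage change. It uses only the six payment values shown in the `head()` preview; the names `ip_payments`, `coverage`, `n_obs`, `first_year`, `last_year`, and `pct_change` are illustrative and not part of the original analysis.

```r
library(dplyr)

# The six rows shown in the head() preview above (payments indicator only)
ip_payments <- data.frame(
  Indicator_Code = "BM.GSR.ROYL.CD",
  Year  = c(2022, 2021, 2020, 2019, 2018, 2017),
  Value = c(39863722, 42251397, 76192115, 121644123, 131726255, 206821952)
)

coverage <- ip_payments %>%
  group_by(Indicator_Code) %>%
  summarize(
    n_obs       = n(),                       # number of yearly observations
    first_year  = min(Year),
    last_year   = max(Year),
    first_value = Value[which.min(Year)],    # value in the earliest year
    last_value  = Value[which.max(Year)],    # value in the latest year
    pct_change  = 100 * (last_value - first_value) / first_value
  )

coverage
```

For these six rows the change from 2017 to 2022 is a decline of roughly 81 percent. Applied to the full cleaned data, the same grouping would give one coverage row per indicator.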
library(ggplot2)
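# Trend analysis
# Plot only the intellectual property charge indicators, to match the title;
# the other indicators use different units and scales
ip_charges <- data.clean %>%
  filter(startsWith(Indicator_Name, "Charges for the use of intellectual property"))

ggplot(ip_charges, aes(x = Year, y = Value, color = Indicator_Name)) +
  geom_line() +
  labs(title = "Trend of Intellectual Property Charges Over the Years",
       x = "Year",
       y = "Value (current US$)",
       color = "Indicator")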
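The indicators in the summary table range from percentages below 1 to values on the order of 1e+08 US$, so drawing them against one shared y axis would flatten most of the series. One possible alternative, sketched here, is a panel per indicator with independent y axes via `facet_wrap(scales = "free_y")`. The helper `plot_indicator_panels()` and the `toy` data frame are hypothetical; the second toy series is placeholder data, not values from the dataset.

```r
library(ggplot2)

# Hypothetical helper: one panel per indicator, each with its own y axis.
# Expects columns Year, Value, and Indicator_Name, as in the cleaned data.
plot_indicator_panels <- function(df) {
  ggplot(df, aes(x = Year, y = Value)) +
    geom_line() +
    geom_point() +
    facet_wrap(~ Indicator_Name, scales = "free_y", ncol = 2) +
    labs(x = "Year", y = "Value (units differ by panel)")
}

# Toy input: the first series reuses three values from the head() preview;
# the second series is placeholder data, NOT taken from the dataset.
toy <- data.frame(
  Indicator_Name = rep(c("IP charges, payments", "Placeholder indicator"), each = 3),
  Year  = rep(c(2020, 2021, 2022), times = 2),
  Value = c(76192115, 42251397, 39863722, 0.4, 0.5, 0.6)
)

p <- plot_indicator_panels(toy)
```

Printing `p` draws the panels. Passing the full cleaned data frame instead of `toy` would give one panel for each of the ten indicators.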

# Save the cleaned data (row.names = FALSE avoids writing an extra index column)
write.csv(data.clean, "cleaned_science_and_technology_ken.csv", row.names = FALSE)

# Save summary statistics
write.csv(summary_statistics, "summary_statistics_ken.csv", row.names = FALSE)