CSIR Ecosystem Data

Exploratory Data Analysis (EDA)

Introduction

Mapping the entrepreneurial ecosystem is an important step in understanding how diverse actors interact to drive technological and economic development. An entrepreneurial ecosystem typically includes creators of new products or services, organizations that support or enable innovation, and providers of funding or investment, all of which interact in complex and interdependent ways. Understanding these networks helps identify structural strengths and weaknesses in how innovation is supported and sustained. Prior work has emphasized that flows of information, knowledge, and technology among individuals, enterprises, and institutions are central to innovation processes, and that systematic mapping of these relationships can reveal mismatches or gaps that constrain innovation outcomes (Organisation for Economic Co-operation and Development [OECD], 1997). Ecosystem mapping therefore provides a structured approach for diagnosing coordination failures and informing policy and strategic interventions.

Beyond diagnostic value, ecosystem mapping enables the identification of opportunities for growth and collaboration. By visually and analytically representing actors and linkages, analysts and policymakers can identify under-served sectors, missing intermediaries, or weak connections between innovators and sources of support or capital (Derr, 2025). As such, ecosystem mapping not only offers a snapshot of the current innovation landscape but also supports forward-looking decision-making by highlighting areas where targeted investment, policy reform, or partnership development may be most impactful.

This EDA study draws on a comprehensive master dataset designed to capture the core components of an entrepreneurial ecosystem. The dataset integrates information across four primary actor categories: Innovators, Supporters, Entrepreneurs, and Investors. This structure makes the dataset well suited to ecosystem-level analysis, including ecosystem mapping, gap analysis, and matchmaking across actors. In particular, ecosystem mapping makes it possible to visualize how innovators connect with supporters and investors across sectors and geographies.


Methodology

All analyses were conducted using the R statistical computing environment (R Core Team, 2023). The master dataset, originally stored in an Excel workbook, was imported into R and structured into analytical data frames corresponding to the four ecosystem components: innovators, supporters, entrepreneurs, and investors. Exploratory data analysis (EDA) was undertaken to examine the structure, completeness, and distributional properties of the data prior to any advanced modeling or inferential analysis.

Data manipulation and summarization were performed using the tidyverse suite of packages (Wickham et al., 2019), which provides a consistent and reproducible framework for data science workflows in R. In particular, the dplyr package (Wickham et al., 2025) was used for data cleaning, variable selection, grouping, and aggregation. These operations enabled the generation of descriptive summaries for each ecosystem component, including counts of organizations by sector, role, and geographic coverage.
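The grouping and aggregation step described above can be illustrated with a minimal sketch. The data frame `innovators_toy`, its column names, and its values are hypothetical and invented for illustration; they are not drawn from the master dataset.

```r
library(dplyr)

# Hypothetical toy data: column names and values are invented for illustration.
innovators_toy <- tibble(
  organisation_name = c("Org A", "Org B", "Org C", "Org D", "Org E"),
  main_category     = c("Private", "Public", "Private", NA, "Private"),
  sector            = c("Manufacturing", "Education", "Manufacturing",
                        "Education", NA)
)

# Count organisations per category. Missing categories are kept as their own
# group so that gaps in the data remain visible in the summary.
category_counts <- innovators_toy %>%
  count(main_category, name = "n_organisations", sort = TRUE) %>%
  mutate(share = n_organisations / sum(n_organisations))

print(category_counts)
```

Keeping `NA` as an explicit group is a deliberate choice in this sketch: dropping it would make a sparsely populated field look more complete than it is.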

Initial dataset diagnostics were produced using the skimr package (Waring et al., 2025), which generates compact summaries of variable types, distributions, and missing values. This step provided a rapid overview of dataset quality, highlighting patterns of completeness across variables and actor categories, and identifying fields with substantial missingness that may warrant caution in interpretation.
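For readers unfamiliar with the skimr output, the two completeness columns it reports (n_missing and complete_rate) can be reproduced in base R. The sketch below is illustrative only: `completeness_summary()` is a hypothetical helper and the toy data are invented. As a worked check against the reported results, a field with 20 missing values out of 182 rows has a complete rate of 1 - 20/182, or about 0.89.

```r
# Hypothetical helper: per-column missing count and completeness rate,
# mirroring the n_missing and complete_rate columns of a skimr summary.
completeness_summary <- function(df) {
  n_missing <- vapply(df, function(col) sum(is.na(col)), integer(1))
  data.frame(
    variable      = names(df),
    n_missing     = unname(n_missing),
    complete_rate = round(1 - unname(n_missing) / nrow(df), 2),
    stringsAsFactors = FALSE
  )
}

# Invented toy data with different amounts of missingness per column.
toy <- data.frame(
  name   = c("Org A", "Org B", "Org C", "Org D"),
  email  = c("a@x.org", NA, "c@x.org", "d@x.org"),
  sector = c(NA, NA, NA, "Education"),
  stringsAsFactors = FALSE
)

summary_tbl <- completeness_summary(toy)
print(summary_tbl)
```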

The EDA comprised both univariate and bivariate analyses. Univariate analysis focused on describing individual variables through frequency distributions and summary statistics. This included counts of organizations by ecosystem role, sector classification, and beneficiary group, as well as summary measures for any available numeric variables, such as founding year or investment-related attributes. Bivariate analysis examined relationships between pairs of variables, primarily through grouped summaries and cross-tabulations. Examples included comparing sector distributions across innovators, supporters, and investors, and assessing alignment between targeted business stages and types of support or funding offered.
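The distinction between the univariate and bivariate steps can be sketched with base R tables. The `actors` data frame below is hypothetical, and combining roles into one table is an assumption made for illustration; the report keeps each actor group in its own data frame.

```r
# Hypothetical combined table of actors; roles and sectors are invented.
actors <- data.frame(
  role   = c("Innovator", "Innovator", "Supporter",
             "Supporter", "Investor", "Investor"),
  sector = c("Manufacturing", "Education", "Manufacturing",
             "Manufacturing", "Education", "Manufacturing"),
  stringsAsFactors = FALSE
)

# Univariate: frequency of a single variable.
role_counts <- table(actors$role)

# Bivariate: cross-tabulation of role by sector, then row-wise shares so
# that sector mixes can be compared across roles of different sizes.
role_by_sector <- table(role = actors$role, sector = actors$sector)
row_shares     <- prop.table(role_by_sector, margin = 1)

print(role_counts)
print(round(row_shares, 2))
```

Row-wise shares are used here rather than raw counts because the actor groups differ in size, so counts alone would mostly reflect group size.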

Data visualization was used extensively to support interpretation and communication of findings. Visual outputs were generated using the ggplot2 package (Wickham, 2016). Bar charts were used to illustrate distributions of organizations by category and sector, while grouped bar charts were employed to compare patterns across ecosystem roles. Where applicable, histograms and boxplots were used to examine the distribution of numeric variables and identify potential outliers.
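A grouped bar chart of the kind described above could be built roughly as follows. The counts in `sector_counts` are invented and the aesthetic choices are assumptions; this is a sketch of the chart type, not a reproduction of the report's figures.

```r
library(ggplot2)

# Hypothetical pre-aggregated counts; the numbers are invented.
sector_counts <- data.frame(
  role   = rep(c("Innovators", "Supporters", "Investors"), each = 2),
  sector = rep(c("Education", "Manufacturing"), times = 3),
  n      = c(4, 7, 6, 9, 2, 5)
)

# position = "dodge" places the bars for each role side by side within a
# sector, which is what makes this a grouped rather than stacked bar chart.
sector_plot <- ggplot(sector_counts, aes(x = sector, y = n, fill = role)) +
  geom_col(position = "dodge") +
  labs(x = "Sector", y = "Number of organisations", fill = "Ecosystem role")

# Call print(sector_plot) to render the chart in an interactive session.
```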

To assess data quality, missing data maps were produced to visualize patterns of missingness across variables and observations. These visualizations allowed for the identification of systematic gaps, such as variables that were consistently missing for particular actor categories. Identifying such patterns is essential for interpreting descriptive results and for informing subsequent analytical steps.
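The report does not state which package produced the missing data maps. The sketch below shows one possible approach using ggplot2 tiles; the toy data and all object names are hypothetical.

```r
library(ggplot2)

# Invented toy data; province is entirely empty, as happens for several
# fields in the real datasets.
toy <- data.frame(
  name     = c("Org A", "Org B", "Org C", "Org D"),
  email    = c("a@x.org", NA, "c@x.org", NA),
  province = c(NA, NA, NA, NA),
  stringsAsFactors = FALSE
)

# Logical matrix: TRUE wherever a cell is missing.
na_matrix <- is.na(toy)

# Reshape to long form, one row per (observation, variable) cell.
# as.vector() unrolls the matrix column by column, matching the rep() calls.
na_long <- data.frame(
  row      = rep(seq_len(nrow(na_matrix)), times = ncol(na_matrix)),
  variable = rep(colnames(na_matrix), each = nrow(na_matrix)),
  missing  = as.vector(na_matrix)
)

# One tile per cell; a fully missing variable shows as a solid column.
missing_map <- ggplot(na_long, aes(x = variable, y = row, fill = missing)) +
  geom_tile(colour = "white") +
  scale_y_reverse() +
  labs(x = "Variable", y = "Observation", fill = "Missing")
```

In a map like this, systematic gaps appear as solid vertical bands, while sporadic missingness appears as scattered tiles.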


Results

This section presents descriptive and data quality results for each actor group within the innovation ecosystem. The analysis focuses on examining the structure of the dataset, the distribution of variables, and patterns of missingness across key fields. Summary statistics and visual diagnostics are used to assess the completeness and consistency of organisational, sectoral, geographic, and role-specific attributes. Emphasis is placed on understanding how data availability and missingness may influence interpretation, rather than on drawing causal or evaluative conclusions.

Innovators

Missing Data

Descriptive Summaries

Data summary
Name innovators
Number of rows 182
Number of columns 22
_______________________
Column type frequency:
character 11
logical 11
________________________
Group variables None

Variable type: character

skim_variable                   n_missing  complete_rate  min    max  empty  n_unique  whitespace
Organisation Name                       0           1.00    3     53      0       182           0
Description                            20           0.89    9   3492      0       161           0
Organisation Main Category             31           0.83   10     23      0         4           0
EMAIL Address                           2           0.99   13     33      0       180           0
Countries supported                     0           1.00   12     58      0        36           0
Main Service Offering Category        163           0.10   27    253      0        16           0
Targeted Business Stage               162           0.11   13     20      0         3           0
Funding Required                      139           0.24    5    106      0        20           0
IBP registration reason               178           0.02   29    121      0         4           0
Sector                                 10           0.95   10   7209      0       139           0
Organisation Custodian                116           0.36   13     33      0        66           0

Variable type: logical

skim_variable                         n_missing  complete_rate  mean  count
Organisation Sub Category                   182              0   NaN      :
CONTACT NUMBER                              182              0   NaN      :
URL                                         182              0   NaN      :
Physical address                            182              0   NaN      :
Province supported                          182              0   NaN      :
Service Offering Sub-Categories             182              0   NaN      :
Stakeholder Group/Cluster                   182              0   NaN      :
Primary targeted beneficiary group          182              0   NaN      :
Secondary targeted beneficiary group        182              0   NaN      :
Sector Sub-Group                            182              0   NaN      :
Sector Subclass                             182              0   NaN      :

Supporters

Missing Data

Descriptive Summaries

Data summary
Name supporters
Number of rows 251
Number of columns 21
_______________________
Column type frequency:
character 8
logical 13
________________________
Group variables None

Variable type: character

skim_variable           n_missing  complete_rate  min    max  empty  n_unique  whitespace
Organisation Name               0           1.00    3     93      0       250           0
Description                     4           0.98   17   5739      0       247           0
EMAIL ADDRESS                  34           0.86   12     37      0       214           0
Countries supported            99           0.61   11   1162      0        54           0
Service Offerings              70           0.72   18   1106      0       112           0
…17                            35           0.86   12     57      0        30           0
Sector                         40           0.84   10  10302      0       126           0
Organisation Custodian        101           0.60   12     35      0       149           0

Variable type: logical

skim_variable                         n_missing  complete_rate  mean  count
Organisation Main Category                  251              0   NaN      :
Organisation Sub Category                   251              0   NaN      :
CONTACT NUMBER                              251              0   NaN      :
URL                                         251              0   NaN      :
PHYSICAL ADDRESS                            251              0   NaN      :
Provinces                                   251              0   NaN      :
Service Offering Sub-Class                  251              0   NaN      :
Stakeholder Group/Cluster                   251              0   NaN      :
Targeted Business Stage                     251              0   NaN      :
Primary targeted beneficiary group          251              0   NaN      :
Secondary targeted beneficiary group        251              0   NaN      :
Sector Sub-Group                            251              0   NaN      :
Sector Sub-Class                            251              0   NaN      :

Investors

Missing Data

Descriptive Summaries

Data summary
Name investors
Number of rows 152
Number of columns 24
_______________________
Column type frequency:
character 24
________________________
Group variables None

Variable type: character

skim_variable                         n_missing  complete_rate  min    max  empty  n_unique  whitespace
Organisation Name                             0           1.00    4     55      0       152           0
Description                                  39           0.74   44   1781      0       113           0
Organisation Main Category                   39           0.74    6     26      0         5           0
Organisation Sub Category                    40           0.74    7     27      0         7           0
EMAIL ADDRESS                                 5           0.97   11     40      0       146           0
CONTACT NUMBER                               80           0.47   12     15      0        70           0
URL                                          40           0.74   10     67      0       112           0
PHYSICAL ADDRESS                             49           0.68   21    145      0       101           0
Countries supported                           8           0.95   12    182      0        45           0
HQ Country                                    1           0.99   12     56      0        11           0
Province                                    148           0.03    3     12      0         2           0
Main Service Offering Category               45           0.70    7     67      0         5           0
Service Offering Sub categories              13           0.91    4     36      0        27           0
Stakeholder Group/Cluster                    46           0.70    6     43      0        14           0
Targeted Business Stage                      46           0.70   10     49      0        13           0
Primary targeted beneficiary group          128           0.16    5     38      0        10           0
Secondary targeted beneficiary group         97           0.36    5     85      0        18           0
Sector                                       67           0.56   13  25925      0        76           0
Sector Sub-Group                            151           0.01   78     78      0         1           0
Sector Subclass                             138           0.09    4    285      0        14           0
Service data                                137           0.10   13     98      0         9           0
Investment Instrument                       148           0.03    6     23      0         3           0
Operational Status                          136           0.11    9     27      0         4           0
Organisation Custodian                      135           0.11   14     43      0        17           0

Entrepreneurs

Missing Data

Descriptive Summaries

Data summary
Name entrepreneurs
Number of rows 214
Number of columns 228
_______________________
Column type frequency:
character 7
logical 8
numeric 213
________________________
Group variables None

Variable type: character

skim_variable      n_missing  complete_rate  min    max  empty  n_unique  whitespace
Organisation               1           1.00    3     75      0       212           0
Description                1           1.00   41    939      0       212           0
Contact Number            52           0.76   11     16      0       159           0
E-mail Address            36           0.83   11     41      0       175           0
Location                  20           0.91   12    159      0       192           0
URL                        2           0.99   10    145      0       209           0
Stakeholder Group          1           1.00    6     38      0        13           0

Variable type: logical

skim_variable                   n_missing  complete_rate  mean  count
Main Service Offering Category        214              0   NaN      :
Organisation Main Categories          214              0   NaN      :
Primary Targeted Beneficiary          214              0   NaN      :
Province                              214              0   NaN      :
Secondary Targeted Beneficiary        214              0   NaN      :
Service Offering Subgroup             214              0   NaN      :
Targeted Business Stage               214              0   NaN      :
Countries                             214              0   NaN      :

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
Main Sectors 210 0.02 0.00 0.00 0 0 0 0 0 ▁▁▇▁▁
​Main Sectors: Accommodation and food service activities 1 1.00 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
​Administrative and support service activities 1 1.00 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
​Construction 1 1.00 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
​Education 1 1.00 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
​Electricity, gas, steam and air conditioning supply 1 1.00 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
​Financial and insurance activities 1 1.00 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
​Human health and social work activities 1 1.00 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
​Information and communication 1 1.00 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
​Manufacturing 1 1.00 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
​Professional scientific and technical activities 1 1.00 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
​Transportation and storage 1 1.00 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
​Water supply; sewerage, waste management and remediation activities 1 1.00 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
​Wholesale and retail trade and repair of motor vehicles and motorcycles 1 1.00 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
Accommodation and food service activities 1 1.00 0.41 0.49 0 0 0 1 1 ▇▁▁▁▆
Activities of households as employers 1 1.00 0.05 0.21 0 0 0 0 1 ▇▁▁▁▁
Administrative and support service activities 1 1.00 0.39 0.49 0 0 0 1 1 ▇▁▁▁▅
Agriculture forestry and fishing 1 1.00 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
Agriculture, forestry and fishing 1 1.00 0.54 0.50 0 0 1 1 1 ▇▁▁▁▇
Arts and entertainment and recreation 1 1.00 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
Arts, entertainment and recreation 1 1.00 0.39 0.49 0 0 0 1 1 ▇▁▁▁▅
Construction 1 1.00 0.42 0.50 0 0 0 1 1 ▇▁▁▁▆
Creative, arts and entertainment activities 1 1.00 0.01 0.10 0 0 0 0 1 ▇▁▁▁▁
Education 1 1.00 0.55 0.50 0 0 1 1 1 ▆▁▁▁▇
Electricity, gas, steam and air conditioning supply 1 1.00 0.47 0.50 0 0 0 1 1 ▇▁▁▁▇
Financial and insurance activities 1 1.00 0.51 0.50 0 0 1 1 1 ▇▁▁▁▇
Human health and social work activities 1 1.00 0.53 0.50 0 0 1 1 1 ▇▁▁▁▇
Information and communication 1 1.00 0.58 0.50 0 0 1 1 1 ▆▁▁▁▇
Manufacturing 1 1.00 0.53 0.50 0 0 1 1 1 ▇▁▁▁▇
Mining and quarrying 1 1.00 0.43 0.50 0 0 0 1 1 ▇▁▁▁▆
Not Sector Specific 1 1.00 0.09 0.29 0 0 0 0 1 ▇▁▁▁▁
Other sector wide service activities 1 1.00 0.01 0.10 0 0 0 0 1 ▇▁▁▁▁
Professional, scientific and technical activities 1 1.00 0.46 0.50 0 0 0 1 1 ▇▁▁▁▇
Public administration and defence; compulsory social security 1 1.00 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
Real estate activities 1 1.00 0.39 0.49 0 0 0 1 1 ▇▁▁▁▅
Transportation and storage 1 1.00 0.42 0.50 0 0 0 1 1 ▇▁▁▁▆
Water supply; sewerage, waste management and remediation activities 1 1.00 0.42 0.49 0 0 0 1 1 ▇▁▁▁▆
Wholesale and retail trade and repair of motor vehicles and motorcycles 1 1.00 0.42 0.50 0 0 0 1 1 ▇▁▁▁▆
Business Support & Development 5 0.98 0.69 0.46 0 0 1 1 1 ▃▁▁▁▇
Community & Ecosystem Development 5 0.98 0.49 0.50 0 0 0 1 1 ▇▁▁▁▇
Funding Support 5 0.98 0.41 0.49 0 0 0 1 1 ▇▁▁▁▆
Legal & Compliance Support 5 0.98 0.05 0.22 0 0 0 0 1 ▇▁▁▁▁
Networking & Collaboration 5 0.98 0.32 0.47 0 0 0 1 1 ▇▁▁▁▃
Government Initiative 2 0.99 0.33 0.47 0 0 0 1 1 ▇▁▁▁▃
Not for Profit 2 0.99 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
Private 2 0.99 0.32 0.47 0 0 0 1 1 ▇▁▁▁▃
Public 2 0.99 0.64 0.48 0 0 1 1 1 ▅▁▁▁▇
Public-Private Partnership 2 0.99 0.35 0.48 0 0 0 1 1 ▇▁▁▁▅
Science and Research Institute 2 0.99 0.01 0.10 0 0 0 0 1 ▇▁▁▁▁
State Owned Enterprise 2 0.99 0.02 0.15 0 0 0 0 1 ▇▁▁▁▁
University 2 0.99 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
Communities 7 0.97 0.04 0.20 0 0 0 0 1 ▇▁▁▁▁
Co-operatives 7 0.97 0.02 0.15 0 0 0 0 1 ▇▁▁▁▁
Entrepreneurs 7 0.97 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
Industry 7 0.97 0.21 0.41 0 0 0 0 1 ▇▁▁▁▂
Innovators 7 0.97 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
Innovators/Inventors 7 0.97 0.24 0.43 0 0 0 0 1 ▇▁▁▁▂
Investors 7 0.97 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
Policy makers 7 0.97 0.02 0.14 0 0 0 0 1 ▇▁▁▁▁
SMMEs…64 7 0.97 0.97 0.17 0 1 1 1 1 ▁▁▁▁▇
Support Providers 7 0.97 0.01 0.10 0 0 0 0 1 ▇▁▁▁▁
University and Research Institutes 7 0.97 0.02 0.14 0 0 0 0 1 ▇▁▁▁▁
University students 7 0.97 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
Vocational and college learners 7 0.97 0.01 0.10 0 0 0 0 1 ▇▁▁▁▁
Eastern Cape 30 0.86 1.17 7.93 0 0 1 1 108 ▇▁▁▁▁
Free State 30 0.86 0.34 2.30 0 0 0 0 31 ▇▁▁▁▁
Gauteng 30 0.86 1.50 10.13 0 1 1 1 138 ▇▁▁▁▁
KwaZulu Natal 30 0.86 1.27 8.59 0 0 1 1 117 ▇▁▁▁▁
Limpopo 30 0.86 1.08 7.28 0 0 1 1 99 ▇▁▁▁▁
Mpumalanga 30 0.86 1.00 6.76 0 0 1 1 92 ▇▁▁▁▁
North West…76 30 0.86 1.02 6.91 0 0 1 1 94 ▇▁▁▁▁
North West…77 30 0.86 0.01 0.10 0 0 0 0 1 ▇▁▁▁▁
Northern Cape 30 0.86 1.02 6.91 0 0 1 1 94 ▇▁▁▁▁
Western Cape 30 0.86 1.36 9.18 0 0 1 1 125 ▇▁▁▁▁
Black owned enterprises 8 0.96 0.41 0.49 0 0 0 1 1 ▇▁▁▁▆
Debt funding…82 8 0.96 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
High-growth 8 0.96 0.37 0.48 0 0 0 1 1 ▇▁▁▁▅
Informal enterprises 8 0.96 0.36 0.48 0 0 0 1 1 ▇▁▁▁▅
People living with disabilities 8 0.96 0.33 0.47 0 0 0 1 1 ▇▁▁▁▅
Service Providers 8 0.96 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
SMMEs…87 8 0.96 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
Social enterprises 8 0.96 0.36 0.48 0 0 0 1 1 ▇▁▁▁▅
Technology driven 8 0.96 0.40 0.49 0 0 0 1 1 ▇▁▁▁▆
Women 8 0.96 0.46 0.50 0 0 0 1 1 ▇▁▁▁▇
Youth 1 1.00 1.00 0.00 1 1 1 1 1 ▁▁▇▁▁
Acceleration 5 0.98 0.09 0.29 0 0 0 0 1 ▇▁▁▁▁
Access to funders 5 0.98 0.08 0.27 0 0 0 0 1 ▇▁▁▁▁
Alumni and mentorship networks 5 0.98 0.02 0.14 0 0 0 0 1 ▇▁▁▁▁
Angel funding 5 0.98 0.01 0.12 0 0 0 0 1 ▇▁▁▁▁
Business development support 5 0.98 0.42 0.49 0 0 0 1 1 ▇▁▁▁▆
Business scaling structuring and strategy development 5 0.98 0.10 0.29 0 0 0 0 1 ▇▁▁▁▁
CashFlow Financing 5 0.98 0.03 0.17 0 0 0 0 1 ▇▁▁▁▁
Compliance assistance 5 0.98 0.05 0.21 0 0 0 0 1 ▇▁▁▁▁
Conditional grants 5 0.98 0.06 0.24 0 0 0 0 1 ▇▁▁▁▁
Coworking spaces and community support 5 0.98 0.13 0.34 0 0 0 0 1 ▇▁▁▁▁
Crowdfunding 5 0.98 0.02 0.14 0 0 0 0 1 ▇▁▁▁▁
Debt funding…104 5 0.98 0.11 0.31 0 0 0 0 1 ▇▁▁▁▁
Ecosystem events and opportunities 5 0.98 0.15 0.36 0 0 0 0 1 ▇▁▁▁▂
Enterprise and supplier development 5 0.98 0.06 0.24 0 0 0 0 1 ▇▁▁▁▁
Equity funding 5 0.98 0.10 0.29 0 0 0 0 1 ▇▁▁▁▁
Funders 5 0.98 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
Grant funding 5 0.98 0.05 0.21 0 0 0 0 1 ▇▁▁▁▁
Import and Export services 5 0.98 0.01 0.10 0 0 0 0 1 ▇▁▁▁▁
Incubation 5 0.98 0.19 0.39 0 0 0 0 1 ▇▁▁▁▂
Investor readiness 5 0.98 0.00 0.07 0 0 0 0 1 ▇▁▁▁▁
Invoice and Purchase Order Financing 5 0.98 0.01 0.10 0 0 0 0 1 ▇▁▁▁▁
Legal services 5 0.98 0.02 0.14 0 0 0 0 1 ▇▁▁▁▁
Market access and validation 5 0.98 0.11 0.31 0 0 0 0 1 ▇▁▁▁▁
Marketing and branding services 5 0.98 0.02 0.15 0 0 0 0 1 ▇▁▁▁▁
Mentorship and coaching 5 0.98 0.14 0.35 0 0 0 0 1 ▇▁▁▁▁
MicroCredit and MicroEnterprise funding 5 0.98 0.02 0.15 0 0 0 0 1 ▇▁▁▁▁
Networking opportunities 5 0.98 0.16 0.37 0 0 0 0 1 ▇▁▁▁▂
Partnerships and collaborations 5 0.98 0.02 0.14 0 0 0 0 1 ▇▁▁▁▁
Private Equity 5 0.98 0.04 0.20 0 0 0 0 1 ▇▁▁▁▁
Regulatory guidance 5 0.98 0.01 0.10 0 0 0 0 1 ▇▁▁▁▁
Research and development 5 0.98 0.25 0.43 0 0 0 0 1 ▇▁▁▁▂
Technical advice and consulting 5 0.98 0.11 0.31 0 0 0 0 1 ▇▁▁▁▁
Technology development services 5 0.98 0.03 0.18 0 0 0 0 1 ▇▁▁▁▁
Training and skills development 5 0.98 0.29 0.46 0 0 0 1 1 ▇▁▁▁▃
Venture building 5 0.98 0.03 0.17 0 0 0 0 1 ▇▁▁▁▁
Venture Capital 5 0.98 0.04 0.20 0 0 0 0 1 ▇▁▁▁▁
Idea stage 2 0.99 0.67 0.47 0 0 1 1 1 ▅▁▁▁▇
Scale-up growth stage 2 0.99 0.77 0.42 0 1 1 1 1 ▂▁▁▁▇
Start-up stage 2 0.99 0.79 0.41 0 1 1 1 1 ▂▁▁▁▇
Algeria 26 0.88 0.03 0.16 0 0 0 0 1 ▇▁▁▁▁
America 26 0.88 0.03 0.16 0 0 0 0 1 ▇▁▁▁▁
Angola 26 0.88 0.04 0.19 0 0 0 0 1 ▇▁▁▁▁
Australia 26 0.88 0.01 0.10 0 0 0 0 1 ▇▁▁▁▁
Benin 26 0.88 0.03 0.18 0 0 0 0 1 ▇▁▁▁▁
Botswana 26 0.88 0.09 0.28 0 0 0 0 1 ▇▁▁▁▁
Brazil 26 0.88 0.01 0.10 0 0 0 0 1 ▇▁▁▁▁
Burkina Faso 26 0.88 0.03 0.16 0 0 0 0 1 ▇▁▁▁▁
Burundi 26 0.88 0.02 0.14 0 0 0 0 1 ▇▁▁▁▁
Cabo Verde 26 0.88 0.03 0.16 0 0 0 0 1 ▇▁▁▁▁
Cameroon 26 0.88 0.03 0.18 0 0 0 0 1 ▇▁▁▁▁
Canada 26 0.88 0.01 0.07 0 0 0 0 1 ▇▁▁▁▁
Central African Republic (CAR) 26 0.88 0.02 0.14 0 0 0 0 1 ▇▁▁▁▁
Chad 26 0.88 0.02 0.14 0 0 0 0 1 ▇▁▁▁▁
Chile 26 0.88 0.01 0.07 0 0 0 0 1 ▇▁▁▁▁
China 26 0.88 0.01 0.07 0 0 0 0 1 ▇▁▁▁▁
Colombia 26 0.88 0.01 0.07 0 0 0 0 1 ▇▁▁▁▁
Comoros 26 0.88 0.03 0.18 0 0 0 0 1 ▇▁▁▁▁
Congo 26 0.88 0.04 0.20 0 0 0 0 1 ▇▁▁▁▁
Côte d’Ivoire 26 0.88 0.04 0.19 0 0 0 0 1 ▇▁▁▁▁
Democratic Republic of the Congo 26 0.88 0.02 0.14 0 0 0 0 1 ▇▁▁▁▁
Djibouti 26 0.88 0.02 0.14 0 0 0 0 1 ▇▁▁▁▁
Eastern Africa 26 0.88 0.01 0.10 0 0 0 0 1 ▇▁▁▁▁
Egypt 26 0.88 0.03 0.18 0 0 0 0 1 ▇▁▁▁▁
Equatorial Guinea 26 0.88 0.02 0.14 0 0 0 0 1 ▇▁▁▁▁
Eritrea 26 0.88 0.02 0.14 0 0 0 0 1 ▇▁▁▁▁
Eswatini 26 0.88 0.05 0.21 0 0 0 0 1 ▇▁▁▁▁
Ethiopia 26 0.88 0.04 0.20 0 0 0 0 1 ▇▁▁▁▁
France 26 0.88 0.01 0.07 0 0 0 0 1 ▇▁▁▁▁
Gabon 26 0.88 0.03 0.16 0 0 0 0 1 ▇▁▁▁▁
Gambia 26 0.88 0.03 0.18 0 0 0 0 1 ▇▁▁▁▁
Germany 26 0.88 0.01 0.10 0 0 0 0 1 ▇▁▁▁▁
Ghana 26 0.88 0.09 0.29 0 0 0 0 1 ▇▁▁▁▁
Greece 26 0.88 0.01 0.07 0 0 0 0 1 ▇▁▁▁▁
Guinea 26 0.88 0.02 0.14 0 0 0 0 1 ▇▁▁▁▁
Guinea-Bissau 26 0.88 0.02 0.14 0 0 0 0 1 ▇▁▁▁▁
India 26 0.88 0.01 0.07 0 0 0 0 1 ▇▁▁▁▁
International 26 0.88 0.01 0.10 0 0 0 0 1 ▇▁▁▁▁
Ireland 26 0.88 0.01 0.07 0 0 0 0 1 ▇▁▁▁▁
Italy 26 0.88 0.01 0.07 0 0 0 0 1 ▇▁▁▁▁
Ivory Coast 26 0.88 0.01 0.07 0 0 0 0 1 ▇▁▁▁▁
Japan 26 0.88 0.01 0.07 0 0 0 0 1 ▇▁▁▁▁
Kenya 26 0.88 0.13 0.34 0 0 0 0 1 ▇▁▁▁▁
Lesotho 26 0.88 0.07 0.26 0 0 0 0 1 ▇▁▁▁▁
Liberia 26 0.88 0.03 0.16 0 0 0 0 1 ▇▁▁▁▁
Libya 26 0.88 0.02 0.14 0 0 0 0 1 ▇▁▁▁▁
Madagascar 26 0.88 0.04 0.19 0 0 0 0 1 ▇▁▁▁▁
Malawi 26 0.88 0.06 0.25 0 0 0 0 1 ▇▁▁▁▁
Mali 26 0.88 0.02 0.14 0 0 0 0 1 ▇▁▁▁▁
Mauritania 26 0.88 0.02 0.14 0 0 0 0 1 ▇▁▁▁▁
Mauritius 26 0.88 0.05 0.23 0 0 0 0 1 ▇▁▁▁▁
Mayotte 26 0.88 0.01 0.07 0 0 0 0 1 ▇▁▁▁▁
Mexico 26 0.88 0.01 0.07 0 0 0 0 1 ▇▁▁▁▁
Morocco 26 0.88 0.03 0.18 0 0 0 0 1 ▇▁▁▁▁
Mozambique 26 0.88 0.07 0.26 0 0 0 0 1 ▇▁▁▁▁
Namibia 26 0.88 0.11 0.32 0 0 0 0 1 ▇▁▁▁▁
Netherlands 26 0.88 0.01 0.07 0 0 0 0 1 ▇▁▁▁▁
Niger 26 0.88 0.03 0.16 0 0 0 0 1 ▇▁▁▁▁
Nigeria 26 0.88 0.11 0.32 0 0 0 0 1 ▇▁▁▁▁
Not specific 26 0.88 0.01 0.07 0 0 0 0 1 ▇▁▁▁▁
Pakistan 26 0.88 0.01 0.07 0 0 0 0 1 ▇▁▁▁▁
Pan African 26 0.88 0.02 0.14 0 0 0 0 1 ▇▁▁▁▁
Peru 26 0.88 0.01 0.07 0 0 0 0 1 ▇▁▁▁▁
Philippines 26 0.88 0.01 0.07 0 0 0 0 1 ▇▁▁▁▁
Poland 26 0.88 0.01 0.07 0 0 0 0 1 ▇▁▁▁▁
Réunion 26 0.88 0.01 0.07 0 0 0 0 1 ▇▁▁▁▁
Rwanda 26 0.88 0.04 0.19 0 0 0 0 1 ▇▁▁▁▁
Sao Tome and Principe 26 0.88 0.02 0.14 0 0 0 0 1 ▇▁▁▁▁
Saudi Arabi 26 0.88 0.01 0.07 0 0 0 0 1 ▇▁▁▁▁
Senegal 26 0.88 0.04 0.19 0 0 0 0 1 ▇▁▁▁▁
Seychelles 26 0.88 0.03 0.18 0 0 0 0 1 ▇▁▁▁▁
Sierra Leone 26 0.88 0.02 0.14 0 0 0 0 1 ▇▁▁▁▁
Somalia 26 0.88 0.02 0.14 0 0 0 0 1 ▇▁▁▁▁
South Africa 26 0.88 0.94 0.25 0 1 1 1 1 ▁▁▁▁▇
South Sudan 26 0.88 0.03 0.16 0 0 0 0 1 ▇▁▁▁▁
Southern Africa 26 0.88 0.01 0.07 0 0 0 0 1 ▇▁▁▁▁
Sub-Saharan Africa 26 0.88 0.01 0.07 0 0 0 0 1 ▇▁▁▁▁
Sudan 26 0.88 0.02 0.14 0 0 0 0 1 ▇▁▁▁▁
Swaziland 26 0.88 0.02 0.14 0 0 0 0 1 ▇▁▁▁▁
Tanzania 26 0.88 0.09 0.29 0 0 0 0 1 ▇▁▁▁▁
Togo 26 0.88 0.02 0.14 0 0 0 0 1 ▇▁▁▁▁
Tunisia 26 0.88 0.03 0.18 0 0 0 0 1 ▇▁▁▁▁
Uganda 26 0.88 0.09 0.28 0 0 0 0 1 ▇▁▁▁▁
UK 26 0.88 0.02 0.14 0 0 0 0 1 ▇▁▁▁▁
Ukraine 26 0.88 0.01 0.07 0 0 0 0 1 ▇▁▁▁▁
Western Africa 26 0.88 0.01 0.07 0 0 0 0 1 ▇▁▁▁▁
Zambia 26 0.88 0.07 0.26 0 0 0 0 1 ▇▁▁▁▁
Zimbabwe 26 0.88 0.10 0.30 0 0 0 0 1 ▇▁▁▁▁
Grand Total 26 0.88 3.48 8.23 1 1 1 3 54 ▇▁▁▁▁

Key Takeaways

This section synthesises the main insights emerging from the exploratory data analysis across the four ecosystem actor groups.

Innovators, key takeaways

  • Coverage and size: 182 organisations and 22 variables, with a split between character fields and fields imported as logical.
  • Strong basics: Organisation Name and Countries supported are complete, and EMAIL Address is nearly complete, so identification and high-level reach are usable.
  • Weak strategic detail: Main Service Offering Category and Targeted Business Stage are each missing for roughly 90% of records (163 and 162 of 182), which limits needs profiling and pathway mapping.
  • Funding signals are thin: Funding Required is missing for 139 of 182 records (76%), so any reading of finance demand patterns will be unreliable without enrichment.
  • IBP registration information is largely absent: IBP registration reason is missing for 178 of 182 records, so compliance-driven or registration-driven insights are not currently feasible.
  • Import or schema issue present: Eleven fields appear as logical with 100% missingness, suggesting a data type or ingestion problem that should be fixed before deeper analysis.
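The all-missing logical fields noted above can be located programmatically before any re-import or recoding. The following is a minimal base R sketch; `find_empty_logical()` is a hypothetical helper, the toy data are invented, and recasting to character is only one possible remedy. Whether the source cells are truly blank or were mistyped during import would need to be checked against the workbook.

```r
# Hypothetical helper: names of columns that are typed logical and contain
# no observed values at all.
find_empty_logical <- function(df) {
  is_empty <- vapply(
    df,
    function(col) is.logical(col) && all(is.na(col)),
    logical(1)
  )
  names(df)[is_empty]
}

# Invented toy data: url has no values, so R stores it as logical NA,
# while is_active is a genuine logical field and must not be flagged.
toy <- data.frame(
  organisation = c("Org A", "Org B", "Org C"),
  url          = c(NA, NA, NA),
  is_active    = c(TRUE, NA, FALSE),
  stringsAsFactors = FALSE
)

empty_cols <- find_empty_logical(toy)
print(empty_cols)

# One possible remedy (an assumption, not the report's method): recast the
# flagged columns as character so later text values are not coerced.
toy[empty_cols] <- lapply(toy[empty_cols], as.character)
```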

Supporters, key takeaways

  • Largest group: With 251 organisations and 21 variables, supporters are the most represented actor category.
  • Descriptions are strong: Description is 98% complete, which is useful for text-based classification or tagging.
  • Reach is inconsistently captured: Countries supported is missing for 99 of 251 records (39%), so geographic coverage comparisons will be biased toward better-documented organisations.
  • Service offering fields are incomplete: Service Offerings is missing for 70 records (28%), limiting the ability to match innovators and entrepreneurs to support types.
  • Sector tagging is partial: Sector is missing for 40 records (16%), which weakens sectoral gap analysis for enabling services.
  • Many key fields appear fully missing: Targeted Business Stage, both beneficiary group fields, and several classification variables (13 fields in total) were imported as logical with 100% missingness. This needs a data structure fix.

Entrepreneurs, key takeaways

  • High-dimensional dataset: 214 organisations with 228 columns. Most variables (213) are numeric indicators, which enables rich profiling but requires careful variable selection during analysis.
  • Core identifiers are solid: Organisation, Description, and URL are near complete, with acceptable completeness for email addresses (83%) and contact numbers (76%).
  • Designed for indicator-based analysis: Most numeric fields are 0/1 flags across sectors, beneficiary groups, stages, service types, and countries, which supports scoring, indexing, and segmentation. The province columns are an exception: most have maxima well above 1 (up to 138 for Gauteng), and Grand Total reaches 54, which may indicate counts or an imported totals row rather than flags.
  • Interpretability depends on a data dictionary: To avoid misreading indicator columns, we will need clear documentation of what each numeric field represents, how multiple selections are encoded, and how repeated column names (for example SMMEs…64 and SMMEs…87) differ.
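Two checks follow naturally from the indicator structure described above: the mean of a 0/1 column gives the share of organisations carrying that flag (for example, a mean of 0.94 for South Africa), and any numeric column with values other than 0 or 1 deserves inspection before it is treated as a flag. The sketch below uses invented data; `non_binary_columns()` is a hypothetical helper.

```r
# Hypothetical helper: names of numeric columns whose observed values are
# not all 0 or 1, and which therefore should not be read as flags.
non_binary_columns <- function(df) {
  numeric_cols <- df[vapply(df, is.numeric, logical(1))]
  is_binary <- vapply(
    numeric_cols,
    function(col) all(col[!is.na(col)] %in% c(0, 1)),
    logical(1)
  )
  names(numeric_cols)[!is_binary]
}

# Invented toy data: the last row behaves like a column total, which is
# one way a flag column can end up with values above 1.
toy <- data.frame(
  organisation = c("Org A", "Org B", "Org C", "Total"),
  women        = c(1, 0, 1, NA),
  gauteng      = c(1, 1, 0, 2),
  stringsAsFactors = FALSE
)

flagged <- non_binary_columns(toy)
print(flagged)

# For a genuine 0/1 indicator, the mean is the share of flagged organisations.
share_women <- mean(toy$women, na.rm = TRUE)
print(share_women)
```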

Investors, key takeaways

  • Smaller but information-dense: 152 organisations and 24 character variables, with more structured fields than innovators and supporters.
  • Good identification and reach: Organisation Name and Countries supported are highly complete, and HQ Country is almost complete.
  • Contactability is mixed: CONTACT NUMBER is missing for 80 of 152 records (53%), and URL and PHYSICAL ADDRESS are also incomplete, so outreach and verification will require supplementation.
  • Local granularity is absent: Province is missing for 148 of 152 records, so sub-national investor mapping is not currently feasible.
  • Impact alignment variables are weak: Primary and Secondary targeted beneficiary groups are missing for 84% and 64% of records respectively, limiting inclusive finance or target group analysis.
  • Investment mechanism fields are largely missing: Investment Instrument and Operational Status are nearly empty (3% and 11% complete), which constrains segmentation by instrument and active status.

References

Derr, A. (2025, February 10). What is ecosystem mapping? A beginner’s guide. Visible Network Labs. https://visiblenetworklabs.com/2025/02/10/what-is-ecosystem-mapping-a-beginners-guide/

Organisation for Economic Co-operation and Development. (1997). National innovation systems. OECD Publishing.

R Core Team. (2023). R: A language and environment for statistical computing (Version 4.x) [Computer software]. R Foundation for Statistical Computing. https://www.r-project.org/

Waring, E., Quinn, M., McNamara, A., Arino de la Rubia, E., Zhu, H., & Ellis, S. (2025). skimr: Compact and flexible summaries of data (R package version 2.2.1) [Computer software]. https://CRAN.R-project.org/package=skimr

Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Springer.

Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L. D., François, R., … Yutani, H. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686

Wickham, H., François, R., Henry, L., Müller, K., & Vaughan, D. (2025). dplyr: A grammar of data manipulation (R package version 1.1.4) [Computer software]. https://CRAN.R-project.org/package=dplyr