Predicting Startup Success
ECON 465 - Stage 3 Final Presentation
Silan Kilicarslan - Arda Cem Acar
Economic Question
Can startup success be predicted using founder characteristics, funding structure, and market indicators?
- The Objective: Examine whether early-stage startup features can predict success or failure.
- The Impact: Better prediction supports:
  - Venture capital decisions
  - Entrepreneurship policy
  - Job creation & efficient resource allocation
Why Does This Matter?
Startups are inherently risky, and many fail before reaching long-term success.
For Investors
- Predicting success reduces bad investment decisions.
- Optimizes capital allocation and risk management.
For Policymakers
- Helps design better entrepreneurship support programs.
- Identifies which startup characteristics act as useful market signals.
The Dataset
Startup Funding and Outcome Dataset (Source: Kaggle)
- Observation Unit: One startup (100,000 startups in total)
- Outcome Variable: Binary (success vs. failure)
Features Included: Information about funding structure, founder characteristics, market conditions, and specific startup features.
Stage 1: Probability Analysis
Failure occurs more often than success, but the difference is not extreme.
Key Findings
- Failure: 55.6%
- Success: 44.4%
With a success share of 44.4%, each startup's outcome can be treated as a Bernoulli trial with p ≈ 0.444. The two classes are balanced enough that a model has to do more than predict the majority class, which makes predictive modeling meaningful.
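As a quick check on the class balance above, the sketch below treats each outcome as a Bernoulli draw with the reported success share and computes its variance, along with the accuracy of a naive rule that always predicts the majority class. The variable names are illustrative and are not taken from the project code.

```python
# Success share reported in the Stage 1 findings.
p_success = 0.444
p_failure = 1.0 - p_success

# Variance of a Bernoulli(p) outcome is p * (1 - p); it peaks at 0.25 when p = 0.5.
bernoulli_variance = p_success * p_failure

# A rule that always predicts the majority class ("failure") is correct as often
# as that class occurs, so this is the accuracy a useful model has to beat.
majority_baseline_accuracy = max(p_success, p_failure)

print(f"variance = {bernoulli_variance:.4f}")
print(f"majority-class baseline accuracy = {majority_baseline_accuracy:.3f}")
```

A variance close to 0.25 indicates that the outcome is near its maximum uncertainty, so the features have room to add information.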
Predictive Modeling Approach
Since the outcome variable is binary, we used logistic regression.
- Why? It predicts the probability of success/failure and offers highly interpretable coefficients for economic analysis.
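To illustrate how a logistic regression turns features into a probability, here is a minimal sketch. The coefficient and intercept values are hypothetical placeholders chosen for illustration; they are not the fitted values from this project, and the function name is an assumption.

```python
import math

def predict_success_probability(features, coefficients, intercept):
    """Logistic model: P(success) = 1 / (1 + exp(-(intercept + sum(b_i * x_i))))."""
    linear_index = intercept + sum(
        coefficients[name] * value for name, value in features.items()
    )
    return 1.0 / (1.0 + math.exp(-linear_index))

# Hypothetical coefficients for illustration only; not the project's estimates.
example_coefficients = {"funding_rounds": 0.30, "team_size": 0.02}
example_intercept = -1.0

startup = {"funding_rounds": 3, "team_size": 10}
probability = predict_success_probability(
    startup, example_coefficients, example_intercept
)
print(f"predicted P(success) = {probability:.3f}")
```

Because the output is always between 0 and 1, it can be read directly as a predicted probability, and each coefficient describes how a feature shifts the log-odds of success.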
Data Splitting (Reproducible via Seed):
- 80% Training Set
- 20% Test Set
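A seeded 80/20 split can be sketched with the standard library alone. The seed value 42 and the helper name `train_test_split_indices` are assumptions for illustration; the slides do not state which seed or tool was used.

```python
import random

def train_test_split_indices(n_rows, test_share=0.20, seed=42):
    """Shuffle row indices with a fixed seed and split them into train/test."""
    indices = list(range(n_rows))
    # A local RNG keeps the split reproducible without touching global state.
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_rows * test_share))
    return indices[n_test:], indices[:n_test]

# 100,000 rows, matching the dataset size reported above.
train_idx, test_idx = train_test_split_indices(100_000)
print(len(train_idx), len(test_idx))
```

Calling the function again with the same seed returns the same partition, which is what makes the evaluation reproducible.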
Models Compared
We compared two logistic regression models to see if adding categorical data improves prediction.
Model 1: Full
Used both numerical and categorical features:
- Investor type
- Sector
- Funding rounds
- Team size
Model 2: Reduced
A simpler model using only the numerical features:
- Funding rounds
- Team size
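The difference between the two models comes down to which columns enter the design matrix. The sketch below builds both versions, one-hot encoding the categorical features for the full model. The example rows and category labels ("fintech", "angel", and so on) are hypothetical, and the helper names are assumptions rather than the project's actual code.

```python
def one_hot(rows, column, drop_first=True):
    """Return (dummy_names, encoded_rows) for one categorical column.

    With drop_first=True the alphabetically first category is the reference
    level, which avoids perfect collinearity with the model intercept.
    """
    levels = sorted({row[column] for row in rows})
    kept = levels[1:] if drop_first else levels
    names = [f"{column}={level}" for level in kept]
    encoded = [[1 if row[column] == level else 0 for level in kept] for row in rows]
    return names, encoded

def design_matrix(rows, numeric_cols, categorical_cols=()):
    """Stack numeric columns and one-hot encoded categorical columns."""
    names = list(numeric_cols)
    matrix = [[row[col] for col in numeric_cols] for row in rows]
    for col in categorical_cols:
        dummy_names, dummies = one_hot(rows, col)
        names += dummy_names
        matrix = [x + d for x, d in zip(matrix, dummies)]
    return names, matrix

# Hypothetical example rows; category labels are made up for illustration.
rows = [
    {"funding_rounds": 1, "team_size": 4, "sector": "fintech", "investor_type": "angel"},
    {"funding_rounds": 3, "team_size": 12, "sector": "health", "investor_type": "vc"},
    {"funding_rounds": 2, "team_size": 7, "sector": "saas", "investor_type": "angel"},
]
numeric = ["funding_rounds", "team_size"]

reduced_names, reduced_X = design_matrix(rows, numeric)                            # Model 2
full_names, full_X = design_matrix(rows, numeric, ["investor_type", "sector"])    # Model 1
print(reduced_names)
print(full_names)
```

The full model carries one extra column per non-reference category, so it can assign a different baseline to each sector and investor type.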
Model Selection: Model 1
The full model was selected as the final model.
- Although the reduced model is simpler, the full model captures crucial economic context.
- Investor type and sector capture vital differences in startup environments.
- Example: Startups in different sectors face fundamentally different risks, market sizes, and funding opportunities.
- Therefore, Model 1 is more useful for answering our economic question.
Main Results & Economic Insight
Finding: Funding-related and structural characteristics are informative predictors of success.
The “Signaling Theory”
- ‘funding_rounds’ is economically meaningful.
- Receiving more funding rounds acts as a positive market signal.
- It shows continued investor confidence and validates the startup’s potential to the broader market, which is associated with a higher predicted probability of success.
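One way to read a logistic regression coefficient on `funding_rounds` is as an odds ratio. The sketch below uses a hypothetical coefficient of 0.30, chosen only for illustration and not the project's estimate, to show how one additional funding round would shift the odds and the implied probability from the overall success share.

```python
import math

# Hypothetical coefficient on funding_rounds, for illustration only.
beta_funding_rounds = 0.30

# In a logistic regression, exp(beta) is the multiplicative change in the
# odds of success for a one-unit increase in the feature, other things equal.
odds_ratio = math.exp(beta_funding_rounds)

def shift_probability(p, odds_multiplier):
    """Apply an odds multiplier to a probability and convert back."""
    odds = p / (1.0 - p)
    new_odds = odds * odds_multiplier
    return new_odds / (1.0 + new_odds)

baseline_p = 0.444  # overall success share reported in Stage 1
new_p = shift_probability(baseline_p, odds_ratio)
print(f"odds ratio = {odds_ratio:.3f}, P(success): {baseline_p:.3f} -> {new_p:.3f}")
```

An odds ratio above 1 is consistent with the signaling interpretation, although it describes an association in the data rather than a causal effect.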
Limitations and Improvements
This project has some limitations. The dataset does not include macroeconomic factors such as inflation, interest rates, or economic growth.
It may also miss important financial details such as profitability, cash flow, and exact investment amounts.
In future work, we could add macroeconomic variables and test interaction effects such as funding_rounds * sector.
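An interaction such as funding_rounds * sector can be sketched as one extra column per non-reference sector, equal to the number of funding rounds for startups in that sector and zero otherwise. The sample rows, sector labels, and the helper name `add_interactions` are hypothetical.

```python
def add_interactions(rows, numeric_col, categorical_col):
    """Add one numeric_col x indicator column per non-reference category level."""
    levels = sorted({row[categorical_col] for row in rows})
    non_reference = levels[1:]  # alphabetically first level is the reference group
    augmented = []
    for row in rows:
        new_row = dict(row)
        for level in non_reference:
            indicator = 1 if row[categorical_col] == level else 0
            column_name = f"{numeric_col}_x_{categorical_col}_{level}"
            new_row[column_name] = row[numeric_col] * indicator
        augmented.append(new_row)
    return augmented

# Hypothetical rows; sector labels are made up for illustration.
sample = [
    {"funding_rounds": 1, "sector": "fintech"},
    {"funding_rounds": 3, "sector": "health"},
    {"funding_rounds": 2, "sector": "saas"},
]
with_interactions = add_interactions(sample, "funding_rounds", "sector")
for row in with_interactions:
    print(row)
```

With these columns in the model, alongside the main-effect sector dummies, the coefficient on funding rounds is allowed to differ by sector instead of being forced to a single value.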
Conclusion
This project shows that startup success can be predicted to some extent using funding and structural characteristics.
The final logistic regression model was chosen because it is interpretable and includes economically meaningful variables.
The main takeaway is that funding rounds, investor information, sector, and team structure can provide useful signals about startup success.
However, startup outcomes remain uncertain, so predictions should be used carefully.