Post-Harvest Loss and Agripreneur Profitability Analysis
Author
Zephania Mwangi
library(readxl)
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.2 ✔ tibble 3.3.0
✔ lubridate 1.9.4 ✔ tidyr 1.3.1
✔ purrr 1.0.4
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(knitr)
library(scales)
Attaching package: 'scales'
The following object is masked from 'package:purrr':
discard
The following object is masked from 'package:readr':
col_factor
# Load the data set
data <- read_excel("Nigeria_agri_dataset.xlsx")
dim(data)
This project analyzes post-harvest loss, profitability, and operational efficiency among youth agripreneurs in Nigeria. It uses real-world agricultural data to identify patterns and recommend interventions to reduce spoilage and increase revenue.
2 Objectives
To identify factors contributing to post-harvest losses.
To assess the impact of storage, transport, and training on spoilage.
To evaluate how training and technology influence revenue.
To provide regional insights for targeted policy interventions.
3 Data Description
Source: Nigerian agricultural field survey on youth agripreneurs.
Variables: Includes crop type, storage and transport methods, spoilage, revenue, training status, and environmental conditions.
Time Range: Not explicitly provided.
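Size: The row and column counts are reported by dim(data) at load time; the regression in Section 6 is fit on 5,000 observations (4,996 residual degrees of freedom plus 4 estimated coefficients).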
4 Data Cleaning & Preparation
Calculated post-harvest loss percentage (PHL_Percent).
Calculated adjusted revenue per kg (Revenue_per_kg).
Created indicators for long-distance transport and potential savings.
Removed NA values for numeric analysis where needed.
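The preparation steps above could be sketched as follows. This is a minimal illustration on a toy data frame, not the report's actual code: Quantity_Harvested_kg, Quantity_Lost_kg, Revenue_NGN, Long_Distance, and Potential_Savings_NGN are hypothetical column names, the 50 km cutoff is borrowed from the results section, and the formulas for revenue per kg and potential savings are assumed definitions.

```r
library(dplyr)

# Toy data standing in for the survey; all values are made up for illustration.
toy <- data.frame(
  Quantity_Harvested_kg = c(1000, 500, 800, NA),
  Quantity_Lost_kg      = c(100, 25, 200, 40),
  Revenue_NGN           = c(450000, 237500, 300000, 120000),
  Transport_Distance_km = c(20, 75, 50, 120)
)

cleaned <- toy %>%
  # Drop rows with missing values in the numeric fields used below
  filter(!is.na(Quantity_Harvested_kg), !is.na(Quantity_Lost_kg)) %>%
  mutate(
    # Share of the harvest lost, in percent
    PHL_Percent = 100 * Quantity_Lost_kg / Quantity_Harvested_kg,
    # Revenue per kg actually sold (harvest minus loss); assumed definition
    Revenue_per_kg = Revenue_NGN / (Quantity_Harvested_kg - Quantity_Lost_kg),
    # Flag trips longer than 50 km
    Long_Distance = Transport_Distance_km > 50,
    # Revenue the lost quantity would have earned at the same price; assumed definition
    Potential_Savings_NGN = Quantity_Lost_kg * Revenue_per_kg
  )

print(cleaned)
```

Filtering only on the columns that feed each calculation, rather than calling na.omit() on the whole table, keeps rows that are incomplete in unrelated fields.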
ggplot(data, aes(x = Storage_Duration_Days, y = PHL_Percent)) +
  geom_point(alpha = 0.4) +
  geom_smooth(method = "loess") +
  labs(title = "PHL vs Storage Duration", x = "Storage Days", y = "PHL (%)")
`geom_smooth()` using formula = 'y ~ x'
5.4 Revenue Loss by Transport Distance
ggplot(data, aes(x = Transport_Distance_km, y = Revenue_Loss_NGN)) +
  geom_point(alpha = 0.4) +
  geom_smooth(method = "lm", color = "red") +
  labs(title = "Revenue Loss vs Transport Distance", x = "Distance (km)", y = "Loss (NGN)")
`geom_smooth()` using formula = 'y ~ x'
5.5 Market Access and Spoilage
ggplot(data, aes(x = Market_Access, y = PHL_Percent)) +
  geom_boxplot(fill = "skyblue") +
  labs(title = "PHL by Market Access", x = "Market Access", y = "PHL (%)")
6 Modeling
# Linear model: PHL as a function of storage duration, humidity, and temperature
model <- lm(PHL_Percent ~ Storage_Duration_Days + Humidity_Percent + Temperature_C, data = data)
summary(model)
Call:
lm(formula = PHL_Percent ~ Storage_Duration_Days + Humidity_Percent +
    Temperature_C, data = data)

Residuals:
     Min       1Q   Median       3Q      Max
-20.8894  -3.4220   0.0153   3.3712  17.3315

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)
(Intercept)           -29.77685    0.80549  -36.97   <2e-16 ***
Storage_Duration_Days   0.92034    0.02298   40.05   <2e-16 ***
Humidity_Percent        0.16678    0.00495   33.70   <2e-16 ***
Temperature_C           0.99363    0.02343   42.41   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 5.029 on 4996 degrees of freedom
Multiple R-squared:  0.4729,  Adjusted R-squared:  0.4726
F-statistic:  1494 on 3 and 4996 DF,  p-value: < 2.2e-16
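To make the coefficients concrete, the fitted equation can be evaluated by hand. The sketch below plugs the estimates reported above into the linear formula for one hypothetical scenario (30 days of storage, 70% humidity, 30 °C). The scenario values are illustrative assumptions, not rows from the dataset.

```r
# Coefficient estimates copied from the summary(model) output above
coefs <- c(
  "(Intercept)"         = -29.77685,
  Storage_Duration_Days = 0.92034,
  Humidity_Percent      = 0.16678,
  Temperature_C         = 0.99363
)

# Hypothetical scenario (illustrative values only); the intercept term is multiplied by 1
scenario <- c(
  "(Intercept)"         = 1,
  Storage_Duration_Days = 30,
  Humidity_Percent      = 70,
  Temperature_C         = 30
)

# Predicted PHL (%) is the sum of coefficient * predictor value
predicted_phl <- sum(coefs * scenario[names(coefs)])
print(round(predicted_phl, 2))  # 39.32
```

With the fitted model object available, predict(model, newdata = ...) on a one-row data frame with the same three predictors should give the same value up to rounding of the printed coefficients.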
7 Results & Discussion
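Post-harvest loss increases sharply after 30 days of storage. In the linear model, each additional storage day is associated with about 0.92 percentage points more loss, holding humidity and temperature constant, and the model explains about 47% of the variance in PHL.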
Long-distance transport (>50 km) is associated with higher revenue loss.
Regions differ significantly in spoilage and revenue efficiency.
Trained farmers and those using technology generally experience lower losses.
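Group comparisons like the ones summarized above can be produced with a grouped summary. The sketch below uses a small made-up data frame: Training_Received, Tech_Used, and PHL_Percent follow names used in this report, while the values and the "Yes"/"No" coding are assumptions.

```r
library(dplyr)

# Toy data; values and category coding are illustrative assumptions
toy <- data.frame(
  Training_Received = c("Yes", "Yes", "No", "No", "Yes", "No"),
  Tech_Used         = c("Yes", "No", "No", "Yes", "Yes", "No"),
  PHL_Percent       = c(8, 12, 20, 15, 10, 25)
)

# Mean post-harvest loss for each training/technology combination,
# sorted from lowest to highest loss
phl_by_group <- toy %>%
  group_by(Training_Received, Tech_Used) %>%
  summarise(
    n = n(),
    mean_PHL = mean(PHL_Percent),
    .groups = "drop"
  ) %>%
  arrange(mean_PHL)

print(phl_by_group)
```

Reporting the group size n next to each mean makes it easier to spot comparisons that rest on very few farmers.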
8 Limitations
No clear time reference in the dataset (harvest year or month missing).
Some missing values in critical numeric fields.
Potential underreporting or misclassification in Tech_Used and Training_Received.
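The extent of missing data can be quantified per column before deciding how to handle it. The following is a minimal sketch on made-up values; the column names match variables used in this report, but the data and the resulting counts are purely illustrative.

```r
# Toy data with deliberately missing entries (illustrative only)
toy <- data.frame(
  PHL_Percent           = c(10, NA, 25, 5),
  Revenue_Loss_NGN      = c(50000, 12000, NA, NA),
  Transport_Distance_km = c(20, 75, 50, 120)
)

# Number and share of missing values per column
missing_summary <- data.frame(
  variable    = names(toy),
  n_missing   = colSums(is.na(toy)),
  pct_missing = 100 * colMeans(is.na(toy)),
  row.names   = NULL
)

print(missing_summary)
```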
9 Conclusion & Recommendations
Invest in improved, region-specific storage methods for high-loss crops.
Encourage training and technology adoption among youth farmers.
Prioritize transport infrastructure in regions where produce travels long distances to market, to reduce spoilage.
Use insights from this report to design dashboards and mobile alerts for farmers.