R project folder on GitHub

Business task:

Identify any relationship between food item price changes and 1) COVID-19 cases, 2) the fall of the Afghan government (in 2021).

Data Overview:

All the datasets were downloaded from HDX. They are publicly available, and I expect them to have been cleaned and validated already by the organizations that own them. The business task needs two datasets: 1) the Afghanistan COVID-19 dataset, 2) the Joint Market Monitoring Initiative (JMMI) dataset.

Important! No data was available for March 2021 in the JMMI datasets, so that month is excluded.

Manipulation of datasets:

Both datasets are publicly published and appear clean. I have made only a few modifications to make them easier to work with. Please note that every modification except the file renaming is done in the R code, so the script handles it by itself and you don’t need to worry about it.

  1. I’ve slightly renamed the files, changing the month in each file name from a name to a number, e.g. January to 01, April to 04, etc. (If you want to download the data directly from HDX and feed it to the script, you just need to rename the files in the same way OR change the part of the script that reads the data.)
  2. I’ve added a Date column to the JMMI data to align it with the COVID data and make the two analyzable together. The date is taken from the file name.
  3. I’ve kept only the food items in the JMMI data, as they are all we need, and removed the other items.
  4. In the first two months of JMMI data, the district names were not aligned with the other months, so I looked up the matching names in the file itself and harmonized them. (This is all done in the R code.)
  5. In the JMMI data, I’ve changed the text N/A to NA so that R treats it as a missing value.
  6. I’ve added a collapse variable to the JMMI data to separate the periods before and after the fall of the country (15 August is taken as the date of the fall). January to August is pre-fall, and August to December is post-fall.
  7. I’ve calculated the average food price at the national level and used it for the visualizations.
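
Steps 2, 5, 6 and 7 above can be sketched roughly as follows. This is only a minimal illustration in base R, not the project script: the file names, the column names (`date`, `item`, `price`) and the toy prices are all hypothetical, and assigning each file the first day of its month is an assumption (it places the August file in the pre-fall period).

```r
# Hypothetical file names following the renamed pattern (month as a number).
files <- c("JMMI_2021_01.csv", "JMMI_2021_08.csv", "JMMI_2021_09.csv")

# Step 2: derive a Date (assumed: first day of the month) from each file name.
month_num <- sub(".*_([0-9]{2})\\.csv$", "\\1", files)
file_date <- as.Date(paste0("2021-", month_num, "-01"))

# Toy stand-in for the combined JMMI data; the real columns will differ.
jmmi <- data.frame(
  date  = rep(file_date, each = 2),
  item  = rep(c("Sugar", "Lentils"), times = 3),
  price = c("50", "N/A", "60", "90", "70", "100"),
  stringsAsFactors = FALSE
)

# Step 5: turn the literal text "N/A" into a real NA, then make prices numeric.
jmmi$price[jmmi$price == "N/A"] <- NA
jmmi$price <- as.numeric(jmmi$price)

# Step 6: flag each row as pre- or post-fall, using 15 August 2021 as the cut-off.
fall_date <- as.Date("2021-08-15")
jmmi$collapse <- ifelse(jmmi$date < fall_date, "pre-fall", "post-fall")

# Step 7: national average food price per month.
# The formula interface of aggregate() drops rows with NA prices by default.
national_avg <- aggregate(price ~ date + collapse, data = jmmi, FUN = mean)
print(national_avg)
```

Deriving the date from the file name keeps the renaming in step 1 as the only manual step; everything else follows from it.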

Findings or summary of analysis:

I’ve done some exploratory data analysis to better understand my datasets, then plotted them to see if there is any relationship between these indicators. The findings show a clear positive relationship between food price change and 1) COVID-19 cases, and 2) the Taliban takeover.

Supporting visualizations:

#To see if there is any relationship between food item price change and COVID cases change
library(dplyr)    # provides the %>% pipe
library(ggplot2)  # plotting
price_vs_covid %>% ggplot(aes(x = cases_avg, y = food_items_avg))+
  geom_point()+
  geom_smooth(method = 'lm', se = FALSE)+
  labs(title = "Relationship between COVID and food item price change", x = "COVID cases", y = "Food items average price (per kg)")
## `geom_smooth()` using formula 'y ~ x'
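
The plot above needs one row per month holding both `cases_avg` and `food_items_avg`. A minimal sketch of how such a table could be assembled is shown here in base R. The input data frames, the `month` key and all the numbers are hypothetical; only the two output column names come from the code above.

```r
# Toy daily COVID case counts (hypothetical values and column names).
covid <- data.frame(
  date  = as.Date(c("2021-01-05", "2021-01-20", "2021-02-10", "2021-02-25")),
  cases = c(100, 200, 50, 150)
)

# Average the cases within each calendar month.
covid$month <- format(covid$date, "%Y-%m")
cases_monthly <- aggregate(cases ~ month, data = covid, FUN = mean)
names(cases_monthly)[2] <- "cases_avg"

# Toy monthly national food price averages (hypothetical values).
food <- data.frame(
  month          = c("2021-01", "2021-02", "2021-04"),
  food_items_avg = c(65, 68, 72)
)

# Inner join on month: months missing from either source are dropped.
price_vs_covid <- merge(cases_monthly, food, by = "month")
print(price_vs_covid)
```

Because `merge()` performs an inner join by default, a month with no data in one of the sources simply does not appear in the result.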

#To see if there is any change in food item price pre and post fall of the country
price_vs_collapse %>% ggplot(aes(x = collapse, y = food_items_avg, fill = collapse))+
  geom_bar(stat = 'identity', position = 'dodge')+
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 1))+
  labs(title = "Collapse impact on food item price", y = "Food items average price (per kg)", x = "Collapse")

Test of significance:

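The relationship between food price change and COVID cases is statistically significant (p = 0.0094, based on 11 monthly observations). The R-squared of 0.55 (adjusted 0.50) means that roughly half of the variation in one variable is accounted for by the other. This is an association and does not by itself show that the COVID-19 pandemic caused the price change. Note that the model is fitted with cases_avg as the response and food_items_avg as the predictor; in a regression with a single predictor, the R-squared and the p-value are the same whichever way round the two variables are placed.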

##Modeling
#linear regression on food item price and COVID-19 cases 
price_mod <- lm(cases_avg ~ food_items_avg, data = price_vs_covid)
summary(price_mod)
## 
## Call:
## lm(formula = cases_avg ~ food_items_avg, data = price_vs_covid)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -907.2 -772.3  108.7  630.3 1287.8 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)   
## (Intercept)    -4793.56    2398.49  -1.999  0.07673 . 
## food_items_avg   112.74      34.29   3.288  0.00941 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 863.7 on 9 degrees of freedom
## Multiple R-squared:  0.5457, Adjusted R-squared:  0.4952 
## F-statistic: 10.81 on 1 and 9 DF,  p-value: 0.009411
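
To help read the R-squared and p-value in the output above, here is a small sketch on made-up numbers (not the project data). It illustrates that in a one-predictor linear regression the R-squared equals the squared Pearson correlation, so it does not depend on which variable is the response, and neither does the p-value of the slope.

```r
# Made-up illustrative values, NOT the project data.
x <- c(60, 62, 65, 67, 70, 72, 75)
y <- c(300, 500, 450, 900, 1100, 1000, 1600)

# Fit the regression in both directions.
fit_xy <- lm(y ~ x)
fit_yx <- lm(x ~ y)

# R-squared from each fit, and the squared correlation.
r2_xy  <- summary(fit_xy)$r.squared
r2_yx  <- summary(fit_yx)$r.squared
r2_cor <- cor(x, y)^2

# p-value of the slope from each fit (row 2, column 4 of the coefficient table).
p_xy <- summary(fit_xy)$coefficients[2, 4]
p_yx <- summary(fit_yx)$coefficients[2, 4]

print(c(r2_xy = r2_xy, r2_yx = r2_yx, r2_cor = r2_cor))
print(c(p_xy = p_xy, p_yx = p_yx))
```

The slope and intercept do change when the variables are swapped, so those estimates should be read in the direction the model was actually fitted.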

Note: JMMI food items are:

  1. Wheat local
  2. Wheat imported
  3. Local rice
  4. Vegetable oil
  5. Lentils
  6. Beans
  7. Split peas
  8. Sugar
  9. Tomatoes