Install Quality Over Install Volume: A Marketing Analytics Study of Credit Direct Mobile App Acquisition — April 2026
Author
Victoria Arinze
Published
May 13, 2026
1. Executive Summary
Credit Direct Mobile ran paid acquisition across multiple channels in April 2026, generating approximately 51,000 non-organic Android installs. Channels included Google Ads, programmatic display networks, Meta Ads, and other paid media sources. Install volume is a useful headline metric, but it does not tell you whether the spend was efficient.
This analysis reframes the performance question: not how many people installed, but how many actually engaged. Using raw AppsFlyer attribution data, a user is classified as converted if they triggered at least one tracked in-app event after installation. Five analytical techniques are applied: exploratory data analysis, data visualisation, hypothesis testing, correlation analysis, and logistic regression.
The findings show statistically significant differences in conversion rates across acquisition channels. These differences are systematic enough to support a reallocation of media spend. The core recommendation is a shift from cost-per-install to cost-per-engaged-user as the primary performance KPI for Q3 2026 planning.
2. Professional Disclosure
I am Victoria Arinze, Product Marketing and Growth Lead at Credit Direct Finance Company Limited, a Nigerian digital lending and fintech company. My work sits at the intersection of product, growth, marketing, and customer experience. A core part of my role involves evaluating the quality of paid acquisition channels, understanding user engagement post-install, and translating data insights into channel investment and campaign decisions.
Exploratory Data Analysis is the starting point for every channel review I conduct. Before making any budget recommendation, I need to understand the structure of the acquisition data — which channels drive volume, how installs are distributed across geographies and devices, and where data quality issues exist.
Data Visualisation is central to how performance is communicated at Credit Direct across stakeholder levels, from the growth team to the CFO. Attribution data needs to be translated into stories that non-technical stakeholders can act on.
Hypothesis Testing provides the statistical rigour needed to confirm whether observed differences in channel performance are real or simply noise from volume variation. When one channel appears to convert better than another, I need statistical confirmation before recommending a budget shift.
Correlation Analysis helps identify which channel mix, device profile, or geographic concentration is most associated with engaged users at the daily level — directly relevant to campaign scheduling and budget pacing decisions.
Logistic Regression provides a predictive framework for estimating the probability of conversion based on observable acquisition attributes. The output — odds ratios per channel — translates directly into a channel efficiency ranking that informs bid strategy and campaign prioritisation.
3. Data Collection & Sampling
Source: Raw AppsFlyer attribution data exported from the Credit Direct Mobile app account by the Growth team.
Files used: installs.csv contains non-organic Android install records for April 2026, approximately 51,000 rows across 50 variables. events.csv contains in-app event records for the same period, approximately 200,000 rows across 44 variables.
Collection method: Direct export from AppsFlyer raw data reports. The installs file contains one row per attributed install. The events file contains one row per tracked in-app event.
Time period: April 2026 (full calendar month).
Sampling frame: All non-organic Android installs attributed to paid channels during the period. No sampling was applied — this is the complete population of paid installs for the month.
Ethical notes: All personally identifiable information — including IP addresses, device IDs, and IMEI numbers — was removed from both files before analysis. The data was accessed in the course of my professional role. No external consent was required as the data reflects campaign attribution records, not individual user profiles.
Outcome variable: converted = 1 if the AppsFlyer ID appears in the events file (at least one in-app event recorded post-install). converted = 0 if no events were recorded.
Channel classification: If the Partner field is populated, the install is classified as Programmatic. If Media Source contains google, it is Google Ads. If Media Source contains facebook, instagram, or meta, it is Meta Ads. All other cases are Other Paid.
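The two rules above can be sketched in a few lines of base R. This is a minimal illustration on toy rows, not the production script: the column names `appsflyer_id`, `partner`, and `media_source` are assumptions standing in for whatever the AppsFlyer export actually calls these fields.

```r
# Toy stand-ins for the two exports; column names and values are hypothetical.
installs <- data.frame(
  appsflyer_id = c("a1", "a2", "a3", "a4", "a5"),
  partner      = c("dsp_x", NA, "", NA, NA),
  media_source = c("dsp_x_int", "googleadwords_int", "Facebook Ads",
                   "instagram", "tiktok_int"),
  stringsAsFactors = FALSE
)
events <- data.frame(
  appsflyer_id = c("a1", "a1", "a3", "a5"),
  event_name   = c("login", "loan_start", "login", "login"),
  stringsAsFactors = FALSE
)

# Normalise both fields so the matching is type-safe and case-insensitive.
partner <- trimws(as.character(installs$partner))
media   <- tolower(as.character(installs$media_source))

# Rules are applied in the order given in the text: a populated Partner
# field wins over anything found in Media Source.
installs$channel <- ifelse(
  !is.na(partner) & partner != "", "Programmatic",
  ifelse(grepl("google", media), "Google Ads",
  ifelse(grepl("facebook|instagram|meta", media), "Meta Ads", "Other Paid"))
)

# converted = 1 when the install's ID appears at least once in the events file.
installs$converted <- as.integer(installs$appsflyer_id %in% events$appsflyer_id)

print(installs[, c("appsflyer_id", "channel", "converted")])
```

Because the rules are evaluated in sequence, an install with both a populated Partner field and a Google media source is counted as Programmatic. Substring matching on "meta" would also catch any unrelated source name containing those letters, so the pattern is worth checking against the distinct Media Source values in the real export.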
4. Exploratory Data Analysis
EDA Interpretation: Two data quality issues were identified and resolved. First, the Partner column was read with different data types in the two files, which was resolved by casting it to character before channel classification. Second, install timestamps were stored as text and required substring extraction to produce usable date and hour features. No rows were dropped; the full population of April 2026 non-organic Android installs is retained.
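A minimal sketch of the two fixes follows, under stated assumptions: the timestamp is taken to be text in `YYYY-MM-DD HH:MM:SS` form, and the column names `partner` and `install_time` are hypothetical. The real export may differ on both counts.

```r
# Toy rows; the formats and column names are assumptions, not the real export.
installs <- data.frame(
  partner      = c(NA, 101, NA),   # read as numeric here, as text elsewhere
  install_time = c("2026-04-01 08:15:42", "2026-04-01 23:59:01",
                   "2026-04-30 00:04:10"),
  stringsAsFactors = FALSE
)
rows_before <- nrow(installs)

# Fix 1: force a single type so the column behaves the same in both files.
installs$partner <- as.character(installs$partner)

# Fix 2: characters 1-10 hold the date, characters 12-13 hold the hour.
installs$install_date <- as.Date(substr(installs$install_time, 1, 10))
installs$install_hour <- as.integer(substr(installs$install_time, 12, 13))

# Confirm that cleaning did not drop any rows.
stopifnot(nrow(installs) == rows_before)
str(installs)
```

Substring extraction is robust when every timestamp has the same fixed width. If the export mixes formats or time zones, parsing with `as.POSIXct()` and an explicit `tz` would be the safer choice.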
5. Data Visualisation
Code
library(ggplot2)
# Bar chart of total installs per channel, ordered from highest to lowest volume.
ggplot(channel_summary, aes(x = reorder(channel, -Installs), y = Installs, fill = channel)) +
  geom_col(show.legend = FALSE) +
  geom_text(aes(label = scales::comma(Installs)), vjust = -0.4, size = 3.5) +
  scale_y_continuous(labels = scales::comma) +
  scale_fill_manual(values = c('Google Ads' = '#4285F4', 'Meta Ads' = '#1877F2',
                               'Programmatic' = '#FF6B35', 'Other Paid' = '#6C757D')) +
  labs(title = 'Install Volume by Acquisition Channel',
       subtitle = 'Credit Direct Mobile App — April 2026',
       x = 'Channel', y = 'Total Installs') +
  theme_minimal(base_size = 13)
Figure 1: Total Installs by Acquisition Channel
Code
# Bar chart of conversion rate per channel, ordered from highest to lowest rate.
ggplot(channel_summary, aes(x = reorder(channel, -Conv_Rate_Pct), y = Conv_Rate_Pct, fill = channel)) +
  geom_col(show.legend = FALSE) +
  geom_text(aes(label = paste0(Conv_Rate_Pct, '%')), vjust = -0.4, size = 3.5) +
  scale_fill_manual(values = c('Google Ads' = '#4285F4', 'Meta Ads' = '#1877F2',
                               'Programmatic' = '#FF6B35', 'Other Paid' = '#6C757D')) +
  labs(title = 'Conversion Rate by Acquisition Channel',
       subtitle = 'Converted = at least one in-app event post-install',
       x = 'Channel', y = 'Conversion Rate (%)') +
  theme_minimal(base_size = 13)
Figure 2: Conversion Rate by Acquisition Channel
Figure 3: Daily Install Volume by Channel
Code
library(dplyr)
# Ten countries with the most installs.
top10 <- installs |>
  count(country_code, sort = TRUE) |>
  slice_head(n = 10)
# Horizontal bars, largest at the top, with extra room on the right for labels.
ggplot(top10, aes(x = reorder(country_code, n), y = n)) +
  geom_col(fill = '#2E86AB') +
  geom_text(aes(label = scales::comma(n)), hjust = -0.2, size = 3.2) +
  coord_flip() +
  scale_y_continuous(labels = scales::comma, expand = expansion(mult = c(0, 0.15))) +
  labs(title = 'Top 10 Countries by Install Volume', x = 'Country', y = 'Installs') +
  theme_minimal(base_size = 13)
Figure 4: Top 10 Countries by Install Volume
Figure 5: Device Category Mix by Channel
Visualisation Narrative: These five charts tell a single story: install volume and install quality are not the same thing. Figures 1 and 2 make that gap explicit. On the counts in the Section 6 contingency table, Google Ads leads on volume (20,362 installs, 66.9% converted), yet Other Paid converts at the highest rate (70.2% of 1,909 installs), while Programmatic delivers almost as many installs as Google Ads (19,746) but converts only 53.7% of them. Figure 3 shows daily volume patterns by channel. Figure 4 shows the geographic concentration of installs. Figure 5 shows device mix by channel, which matters because creative performance can differ significantly across device types.
6. Hypothesis Testing
Hypothesis 1: Conversion rate differs significantly across acquisition channels
H0: There is no significant association between acquisition channel and conversion outcome.
H1: Conversion rate differs significantly across acquisition channels.
Code
# Cross-tabulate acquisition channel against the binary conversion outcome.
ct1 <- table(installs$channel, installs$converted)
cat('Contingency Table — Channel vs Converted:\n')
print(ct1)
Contingency Table — Channel vs Converted:
                   0      1
Google Ads      6738  13624
Meta Ads           1      2
Programmatic    9135  10611
Other Paid       569   1340
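Interpretation: A p-value below 0.05 means we reject H0: the differences in conversion rates across channels are statistically significant rather than random noise. With a sample this large, even small differences reach significance, so Cramér's V is used to quantify the practical strength of the association independent of sample size. Meta Ads contributes only 3 installs, so its row carries almost no weight in the test and its conversion rate should not be read as a reliable estimate.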
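The test and effect size described above can be reproduced from the printed counts alone. The sketch below assumes a Pearson chi-square test of independence, the usual basis for Cramér's V; the counts are copied from the contingency table, and everything else is base R.

```r
# Counts copied from the contingency table printed above.
ct1 <- matrix(
  c(6738, 13624,
       1,     2,
    9135, 10611,
     569,  1340),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    channel   = c("Google Ads", "Meta Ads", "Programmatic", "Other Paid"),
    converted = c("0", "1")
  )
)

# Row-wise conversion rate per channel, in percent.
conv_rate <- round(100 * prop.table(ct1, margin = 1)[, "1"], 1)
print(conv_rate)

# Pearson chi-square test of independence. The Meta Ads row has expected
# counts below 5, so R would warn that the approximation may be inaccurate.
chi1 <- suppressWarnings(chisq.test(ct1))
print(chi1)

# Cramer's V = sqrt(chi-square / (n * (min(rows, cols) - 1))).
n <- sum(ct1)
cramers_v <- sqrt(unname(chi1$statistic) / (n * (min(dim(ct1)) - 1)))
cat("Cramer's V:", round(cramers_v, 3), "\n")
```

If the sparse Meta Ads row is a concern, one option is to rerun the test with that row dropped, or with `simulate.p.value = TRUE`, and check that the conclusion does not change.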
Hypothesis 2: Conversion rate differs significantly by device category
H0: Device category has no significant association with conversion outcome.
H1: Conversion rate differs significantly by device category.
Code
# Cross-tabulate device category against the binary conversion outcome.
ct2 <- table(installs$device_category, installs$converted)
cat('Contingency Table — Device Category vs Converted:\n')
print(ct2)
Interpretation: This test checks whether device type at install is associated with post-install engagement. A significant result means device type is not independent of conversion — an actionable signal for targeting and creative decisions.
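7. Correlation Analysis
Correlation Interpretation: The matrix is built from daily aggregates, so each row represents one calendar day in April 2026. The key relationships are between Conv. Rate and the four channel proportion columns. A positive correlation means that days on which a channel made up a larger share of installs tended to have a higher conversion rate. With at most 30 daily observations, these coefficients describe association at the day level and should not be read as the effect of a channel on an individual install. The Day of Week variable captures whether engagement is systematically higher on certain days, which is relevant to budget pacing and scheduling decisions.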
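The daily aggregation behind such a matrix can be sketched as follows. The data here are simulated purely to make the example runnable, so the resulting correlations say nothing about Credit Direct's channels; the column names and the three-channel mix are likewise assumptions.

```r
set.seed(42)

# Simulated install-level rows standing in for the April 2026 export.
n <- 6000
installs <- data.frame(
  install_date = as.Date("2026-04-01") + sample(0:29, n, replace = TRUE),
  channel = sample(c("Google Ads", "Programmatic", "Other Paid"), n,
                   replace = TRUE, prob = c(0.48, 0.47, 0.05))
)
installs$converted <- rbinom(
  n, 1, ifelse(installs$channel == "Programmatic", 0.54, 0.67)
)

# One value per calendar day: the share of that day's installs that converted.
daily_rate <- tapply(installs$converted, installs$install_date, mean)

# One row per calendar day: each channel's share of that day's installs.
daily_share <- prop.table(table(installs$install_date, installs$channel), margin = 1)
share_df <- as.data.frame(unclass(daily_share))
names(share_df) <- paste0("share_", gsub("[ .]", "_", tolower(names(share_df))))

# Align by date label, then correlate conversion rate with each channel share.
daily <- cbind(conv_rate = as.numeric(daily_rate[rownames(share_df)]), share_df)
cors <- cor(daily)["conv_rate", -1]
print(round(cors, 2))
```

Channel shares sum to 1 within each day, so the share columns are not independent of one another: a positive correlation for one channel mechanically implies negative pressure on the others. That is worth keeping in mind when reading the matrix.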
8. Logistic Regression
Call:
glm(formula = converted ~ channel + country_group + device_cat,
    family = binomial(link = "logit"), data = train_data)
Coefficients:
                                    Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)                         0.083420    1.367657    0.061   0.95136
channelMeta Ads                     0.000909    1.224875    0.001   0.99941
channelProgrammatic                -0.538412    0.024734  -21.768   < 2e-16 ***
channelOther Paid                   0.166891    0.062157    2.685   0.00725 **
country_groupNG                     0.555147    0.606297    0.916   0.35986
country_groupNL                     0.726689    0.855871    0.849   0.39585
country_groupOther                  0.394840    0.642765    0.614   0.53903
country_groupUK                     2.046413    1.211781    1.689   0.09126 .
country_groupUS                     0.985706    0.700277    1.408   0.15925
device_catmobile_phone              0.053671    1.225782    0.044   0.96508
device_catother                     9.927460  119.474330    0.083   0.93378
device_catset_top_box             -11.204595   84.485555   -0.133   0.89449
device_cattablet                    0.458248    1.236851    0.370   0.71101
device_catunknown_device_category   0.307876    1.231290    0.250   0.80255
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 39354 on 29412 degrees of freedom
Residual deviance: 38802 on 29399 degrees of freedom
AIC: 38830
Number of Fisher Scoring iterations: 9
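For readers who want to see how a model of this shape is set up, the sketch below fits the same kind of `glm()` call on simulated data. The 70/30 train/test split, the simulated conversion probabilities, and the reduced predictor set are all assumptions made for illustration; none of them is taken from the actual analysis script.

```r
set.seed(2026)

# Simulated installs; values are illustrative only.
n <- 5000
installs <- data.frame(
  channel = factor(sample(c("Google Ads", "Programmatic", "Other Paid"),
                          n, replace = TRUE)),
  device_cat = factor(sample(c("mobile_phone", "tablet"), n,
                             replace = TRUE, prob = c(0.9, 0.1)))
)
# Simulated outcome: Programmatic converts less often than the other channels.
p_true <- ifelse(installs$channel == "Programmatic", 0.54, 0.67)
installs$converted <- rbinom(n, 1, p_true)

# Make Google Ads the reference level so other channels are compared to it.
installs$channel <- relevel(installs$channel, ref = "Google Ads")

# Hold out 30% of rows for evaluation (the split ratio is an assumption).
train_idx  <- sample(seq_len(n), size = floor(0.7 * n))
train_data <- installs[train_idx, ]
test_data  <- installs[-train_idx, ]

log_model <- glm(converted ~ channel + device_cat,
                 family = binomial(link = "logit"), data = train_data)
print(round(summary(log_model)$coefficients, 4))

# Predicted conversion probabilities for the held-out installs.
test_data$p_hat <- predict(log_model, newdata = test_data, type = "response")
```

Setting the reference level explicitly with `relevel()` avoids relying on alphabetical factor ordering, which would silently change the baseline if channel names were ever renamed.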
Code
# Odds ratios with 95% Wald confidence intervals, exponentiated from the log-odds scale.
ci <- confint.default(log_model)
or_table <- data.frame(
  Predictor  = names(coef(log_model)),
  Odds_Ratio = round(exp(coef(log_model)), 3),
  CI_Lower   = round(exp(ci[, 1]), 3),
  CI_Upper   = round(exp(ci[, 2]), 3)
)
knitr::kable(or_table,
             col.names = c('Predictor', 'Odds Ratio', '95% CI Lower', '95% CI Upper'),
             caption = 'Table 4: Logistic Regression Odds Ratios and Confidence Intervals')
Table 4: Logistic Regression Odds Ratios and Confidence Intervals
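Regression Interpretation: The logistic regression models the probability of conversion from acquisition channel, country group, and device category. Google Ads is the reference channel, so every other channel coefficient is interpreted relative to it: an odds ratio above 1 means higher odds of converting than a Google Ads install, and an odds ratio below 1 means lower odds. Exponentiating the estimates in the summary, Programmatic has an odds ratio of about 0.58 (p < 0.001) and Other Paid about 1.18 (p = 0.007). The Meta Ads estimate cannot be distinguished from Google Ads, which is expected given how few Meta Ads installs the data contain. None of the country or device terms is significant at the 5% level, and the very large standard errors on device_catother and device_catset_top_box point to sparse categories whose estimates should not be interpreted. These figures translate directly into a channel efficiency ranking for Q3 budget decisions.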
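The conversion from log-odds coefficients to odds ratios can be checked by hand. The sketch below uses only the channel estimates and standard errors printed in the model summary and builds 95% Wald intervals, the same type that `confint.default()` returns.

```r
# Estimates and standard errors copied from the channel rows of the summary.
coefs <- data.frame(
  term     = c("Meta Ads", "Programmatic", "Other Paid"),
  estimate = c(0.000909, -0.538412, 0.166891),
  std_err  = c(1.224875, 0.024734, 0.062157)
)

z <- qnorm(0.975)  # about 1.96 for a 95% Wald interval

# Exponentiate the point estimate and both interval ends.
coefs$odds_ratio <- exp(coefs$estimate)
coefs$ci_lower   <- exp(coefs$estimate - z * coefs$std_err)
coefs$ci_upper   <- exp(coefs$estimate + z * coefs$std_err)

# Percent change in the odds of converting relative to Google Ads.
coefs$pct_change_odds <- 100 * (coefs$odds_ratio - 1)

print(cbind(coefs["term"], round(coefs[-1], 3)))
```

An interval that excludes 1 corresponds to a coefficient that is significant at the 5% level. Here that holds for Programmatic (roughly 42% lower odds than Google Ads) and Other Paid, while the Meta Ads interval is far too wide to support any ranking.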
9. Integrated Findings
The five techniques applied here converge on one finding: install volume and install quality are not the same metric, and they do not reliably come from the same channel.
EDA established the baseline distribution of installs and conversion rates. Visualisation made the quality gap visible across channels, geographies, and device types. Hypothesis testing confirmed that differences in conversion rates across channels are statistically significant — not random variation. Correlation analysis identified which daily channel mix is most associated with higher conversion outcomes. Logistic regression quantified the channel-level effect on conversion probability, controlling for country and device type.
Recommendation: Credit Direct should adopt cost-per-engaged-user as the primary channel performance KPI for Q3 2026, replacing cost-per-install as the lead metric. Budget should be reallocated toward channels with the highest odds ratios in the regression model, with a 30-day observation window after reallocation to confirm the relationship holds at scale.
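To make the KPI change concrete, the sketch below computes cost-per-install and cost-per-engaged-user side by side. The install and engaged-user counts come from the contingency table in Section 6; the spend figures are hypothetical placeholders chosen only to show the arithmetic, and Meta Ads is omitted because it has just 3 installs.

```r
# Counts from the Section 6 contingency table.
# Spend values are HYPOTHETICAL placeholders, not actual Credit Direct spend.
channels <- data.frame(
  channel  = c("Google Ads", "Programmatic", "Other Paid"),
  installs = c(6738 + 13624, 9135 + 10611, 569 + 1340),
  engaged  = c(13624, 10611, 1340),
  spend    = c(10000, 9000, 1000)  # hypothetical, arbitrary currency units
)

# Cost per install and cost per engaged user.
channels$cpi  <- channels$spend / channels$installs
channels$cpeu <- channels$spend / channels$engaged

# Rank 1 is the cheapest channel on each metric.
channels$rank_cpi  <- rank(channels$cpi)
channels$rank_cpeu <- rank(channels$cpeu)

print(cbind(channels["channel"],
            round(channels[c("cpi", "cpeu")], 3),
            channels[c("rank_cpi", "rank_cpeu")]))
```

Cost-per-engaged-user equals cost-per-install divided by the conversion rate, so a channel with cheap installs and weak conversion can rank first on one metric and last on the other. With these placeholder spend figures that is what happens to Programmatic, but the actual ranking depends entirely on real spend data.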
10. Limitations & Further Work
The converted variable is binary and coarse — a user who opened the app once is treated identically to one who completed a loan application. A graduated engagement score would improve model precision. The analysis covers a single month, so seasonal effects cannot be assessed. Logistic regression excludes campaign-level variables due to high cardinality; regularisation could improve this. AppsFlyer defaults to last-touch attribution, which may overstate bottom-of-funnel channel contributions.
Further work should extend the dataset to Q1 and Q2 2026 to test whether channel quality patterns hold over time, build a multi-touch attribution model to understand assisted conversion contributions, and implement a live dashboard tracking cost-per-engaged-user by channel on a rolling 7-day basis.
References
Adi, B. (2026). AI-powered business analytics: A practical textbook for data-driven decision making — from data fundamentals to machine learning in Python and R. Lagos Business School / markanalytics.online. https://markanalytics.online
Arinze, V. (2026). Credit Direct Mobile App — AppsFlyer non-organic install and event data, April 2026 [Dataset]. Collected from Credit Direct Finance Company Limited, Lagos, Nigeria. Data available on request from the author.
R Core Team. (2024). R: A language and environment for statistical computing (Version 4.6.0). R Foundation for Statistical Computing. https://www.R-project.org/
Wickham, H., et al. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686
Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Springer. https://doi.org/10.1007/978-3-319-24277-4
Allaire, J. J., Teague, C., Scheidegger, C., Xie, Y., & Dervieux, C. (2022). Quarto (Version 1.x) [Computer software]. https://doi.org/10.5281/zenodo.5960048
Appendix: AI Usage Statement
Claude (Anthropic) was used to assist with debugging Quarto rendering errors and structuring R code for data loading, variable construction, and output formatting. The analytical decisions — choice of techniques, business framing, interpretation of outputs, and the final recommendation — were made independently based on direct professional familiarity with the Credit Direct acquisition data and the requirements of this assessment. All text interpretation and business conclusions are my own.