This case study examines the operational factors that influence whether Lagos real estate leads progress from initial inquiry to property inspection. The analysis uses 100 anonymised lead records independently assembled from JahDay Real Estate’s buyer engagement activity between November 2025 and May 2026. The dataset captures inquiry date, lead source, property type, location, budget, response time, follow-up activity, and inspection outcome. Five CS1 techniques were applied: exploratory data analysis, data visualisation, hypothesis testing, correlation analysis, and logistic regression. The analysis shows that inquiry volume alone is not a reliable measure of lead quality. Instagram produced the highest number of leads, but inspection progression appears to depend on a wider combination of source quality, budget profile, response behaviour, and follow-up activity. Budget values were highly skewed because of luxury property inquiries, so log budget was used to improve interpretation. Overall, the findings support a more structured lead prioritisation process. JahDay Real Estate should focus on leads with stronger budget alignment, credible source quality, and consistent follow-up potential, rather than relying only on speed of response or total inquiry volume.
2. Professional Disclosure
I work in the Lagos real estate sector through JahDay Real Estate, where I support high-value client acquisition, buyer advisory, lead qualification, inspection coordination, developer engagement, and investment-focused property transactions. My role requires evaluating buyer intent across multiple inquiry channels, including Instagram, WhatsApp, referrals, agent networks, and website leads. Because property inspection is a key milestone between initial interest and serious transaction activity, understanding what moves a lead from inquiry to inspection is directly relevant to my professional decision making.
The business question guiding this analysis is: Which operational factors influence whether a Lagos real estate inquiry progresses into a serious property inspection opportunity?
Exploratory Data Analysis relevance: Exploratory Data Analysis is relevant because it provides the first view of lead behaviour before deeper testing. In my work, it helps identify where inquiries are coming from, how buyer budgets are distributed, which property types attract interest, and whether the data contains missing values, outliers, or unusual patterns that could affect decision making.
Data Visualisation relevance: Data visualisation is important because real estate decisions are often discussed with non-technical stakeholders, including buyers, developers, agents, and strategic partners. Visuals make it easier to communicate channel performance, property demand, budget patterns, and inspection behaviour in a clear and practical way.
Hypothesis Testing relevance: Hypothesis testing is useful because it allows me to move beyond observation and test whether differences in lead behaviour are statistically meaningful. For example, it can help determine whether inspected leads have different budget profiles from non-inspected leads, or whether inspection outcomes are associated with lead source.
Correlation Analysis relevance: Correlation analysis helps assess whether numeric operational variables move together. In this case, it supports a better understanding of how budget, response time, follow-up activity, and inspection outcome relate to one another. This is useful for deciding which lead management factors deserve closer attention.
Regression relevance: Logistic regression is appropriate because the main business outcome is binary: a lead either inspected or did not inspect. This technique helps estimate how multiple factors, such as budget, response time, follow-up activity, lead source, and property type, are associated with the probability of inspection. In practice, this supports a more structured and evidence-based approach to client prioritisation and inspection conversion.
3. Data Collection and Sampling
The dataset used for this analysis was collected from anonymised Lagos real estate inquiry records connected to active property marketing and buyer engagement activity through JahDay Real Estate.
Each row represents one unique lead inquiry. The dataset contains 100 observations and 9 original variables, with additional derived variables created during analysis for inspection outcome and log-transformed budget. The time period covered by the dataset runs from November 2025 to May 2026.
The sampling approach was purposive sampling because the dataset focused specifically on real estate inquiries connected to active Lagos property marketing operations. The sampling frame includes inquiries received through Instagram, WhatsApp, referrals, agent networks, and website channels. These channels are directly relevant to my professional work because they represent the main ways real estate leads enter the buyer engagement process.
To protect confidentiality, all personally identifiable information was removed before analysis. The dataset does not include buyer names, phone numbers, email addresses, or sensitive client information. The data is used only for academic analysis and operational learning.
Code
# Summarise how the dataset was collected (assumes df is already loaded and cleaned)
data_collection_summary <- data.frame(
  Item = c(
    "Number of observations",
    "Number of original variables",
    "Earliest inquiry date",
    "Latest inquiry date",
    "Lead source channels",
    "Property type categories",
    "Inspection outcome variable"
  ),
  Value = c(
    nrow(df),
    "9 original variables, plus 2 derived variables for analysis",
    as.character(min(df$inquiry_date, na.rm = TRUE)),
    as.character(max(df$inquiry_date, na.rm = TRUE)),
    paste(levels(df$lead_source), collapse = ", "),
    paste(levels(df$property_type), collapse = ", "),
    "Inspected: Yes or No"
  )
)
knitr::kable(data_collection_summary)
| Item | Value |
| --- | --- |
| Number of observations | 100 |
| Number of original variables | 9 original variables, plus 2 derived variables for analysis |
Sampling justification: A sample size of 100 lead inquiries is sufficient for this exploratory and inferential case study because it meets the assignment requirement and provides enough variation across lead source, property type, budget, response time, follow-up activity, and inspection outcome. The dataset is not intended to represent the entire Lagos real estate market. Instead, it is intended to analyse operational lead behaviour within my own professional context.
4. Data Description
The dataset contains categorical, numeric, date, and outcome variables. This makes it suitable for CS1 because it supports exploratory data analysis, visualisation, hypothesis testing, correlation analysis, and regression modelling.
The main outcome variable is inspected, which shows whether a real estate inquiry progressed into a property inspection opportunity. The main predictor variables include lead source, property type, location, budget, response time, and follow-up activity.
Code
# Data dictionary: one row per variable, including the two derived variables
variable_table <- data.frame(
  Variable = c(
    "lead_id", "inquiry_date", "lead_source", "property_type", "location",
    "budget_naira", "response_time_hours", "follow_ups", "inspected",
    "inspected_binary", "log_budget"
  ),
  Description = c(
    "Anonymous identifier for each lead",
    "Date the inquiry was received",
    "Channel where the inquiry came from",
    "Property category requested by the lead",
    "Property location associated with the inquiry",
    "Estimated property value or client budget in Naira",
    "Time taken to respond to the inquiry",
    "Number of follow-up interactions after first contact",
    "Whether the lead progressed to inspection",
    "Numeric version of inspection outcome, where Yes = 1 and No = 0",
    "Log-transformed budget used to reduce skewness"
  ),
  Type = c(
    "Categorical", "Date", "Categorical", "Categorical", "Categorical",
    "Numeric", "Numeric", "Numeric", "Categorical outcome",
    "Numeric outcome", "Numeric"
  )
)
knitr::kable(variable_table)
| Variable | Description | Type |
| --- | --- | --- |
| lead_id | Anonymous identifier for each lead | Categorical |
| inquiry_date | Date the inquiry was received | Date |
| lead_source | Channel where the inquiry came from | Categorical |
| property_type | Property category requested by the lead | Categorical |
| location | Property location associated with the inquiry | Categorical |
| budget_naira | Estimated property value or client budget in Naira | Numeric |
| response_time_hours | Time taken to respond to the inquiry | Numeric |
| follow_ups | Number of follow-up interactions after first contact | Numeric |
| inspected | Whether the lead progressed to inspection | Categorical outcome |
| inspected_binary | Numeric version of inspection outcome, where Yes = 1 and No = 0 | Numeric outcome |
| log_budget | Log-transformed budget used to reduce skewness | Numeric |
Code
# Check the dataset against the CS1 data requirements
requirement_check <- data.frame(
  Requirement = c(
    "Minimum observations",
    "Minimum variables",
    "At least 3 numeric variables",
    "At least 2 categorical variables",
    "At least 1 date variable",
    "At least 1 outcome variable"
  ),
  Dataset_Status = c(
    paste(nrow(df), "observations"),
    paste(ncol(df), "variables after derived variables"),
    "budget_naira, response_time_hours, follow_ups, inspected_binary, log_budget",
    "lead_source, property_type, location, inspected",
    "inquiry_date",
    "inspected"
  ),
  Meets_Requirement = c("Yes", "Yes", "Yes", "Yes", "Yes", "Yes")
)
knitr::kable(requirement_check)

# Count leads by source
lead_source_summary <- as.data.frame(table(df$lead_source))
names(lead_source_summary) <- c("Lead Source", "Number of Leads")
knitr::kable(lead_source_summary)
| Lead Source | Number of Leads |
| --- | --- |
| Agent Network | 9 |
| Instagram | 47 |
| Referral | 19 |
| Website | 3 |
| WhatsApp | 22 |
Code
# Count leads by property type
property_type_summary <- as.data.frame(table(df$property_type))
names(property_type_summary) <- c("Property Type", "Number of Leads")
knitr::kable(property_type_summary)
| Property Type | Number of Leads |
| --- | --- |
| Apartment | 27 |
| Bungalow | 26 |
| Duplex | 25 |
| Land | 22 |
Code
# Count leads by inspection outcome
inspection_summary <- as.data.frame(table(df$inspected))
names(inspection_summary) <- c("Inspection Outcome", "Number of Leads")
knitr::kable(inspection_summary)
| Inspection Outcome | Number of Leads |
| --- | --- |
| No | 49 |
| Yes | 51 |
Data description interpretation: The dataset meets the CS1 data requirements because it contains 100 observations, more than 6 variables, at least 3 numeric variables, at least 2 categorical variables, a date variable, and a clear outcome variable. The outcome variable, inspected, allows the analysis to focus on real estate lead conversion rather than only general inquiry activity.
5. Analysis 1: Exploratory Data Analysis
Exploratory Data Analysis was conducted to understand the structure, quality, and operational behaviour of the Lagos real estate lead dataset before applying statistical tests and regression modelling.
For JahDay Real Estate, this stage is important because lead conversion decisions should not be based only on instinct. EDA helps identify which channels generate inquiries, how budgets are distributed, whether response times vary meaningfully, and whether the dataset contains missing values, outliers, or unusual patterns that could affect interpretation.
EDA interpretation: The missing-value check confirms whether the dataset is complete enough for analysis. The key summary statistics show that the budget variable is highly skewed because the mean budget is much higher than the median budget. This is expected in Lagos real estate because a small number of luxury inquiries can raise the average significantly. To handle this, a log-transformed budget variable was created for later visualisation and regression. A second data quality issue was the date format. The inquiry date was imported from CSV and converted into a proper date variable before analysis. Outlier detection was also performed for budget, response time, and follow-up activity so that unusual values could be identified and interpreted rather than ignored.
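The checks described above could be sketched as follows. This is a minimal illustration rather than the report's original code: the six-row `df_demo` data frame and all of its values are hypothetical, the date format string is assumed, and the 1.5 × IQR rule is an assumed outlier definition. Only the column names come from the variable table. The natural log is used, which is consistent with the scale of the log-budget values reported later.

```r
# Hypothetical six-row sample; values are illustrative, not taken from the real dataset
df_demo <- data.frame(
  inquiry_date = c("2025-11-03", "2025-12-14", "2026-01-20",
                   "2026-02-09", "2026-03-27", "2026-05-02"),
  budget_naira = c(80e6, 120e6, 150e6, 200e6, 350e6, 2500e6),
  response_time_hours = c(1, 2, 3, 4, 6, 48),
  follow_ups = c(1, 2, 2, 3, 4, 5),
  stringsAsFactors = FALSE
)

# 1. Missing-value check: number of NA values per column
missing_counts <- colSums(is.na(df_demo))

# 2. Convert the text date (as read from CSV) into a proper Date variable
#    (the "%Y-%m-%d" format is an assumption about how the CSV stores dates)
df_demo$inquiry_date <- as.Date(df_demo$inquiry_date, format = "%Y-%m-%d")

# 3. Skewness check: a mean far above the median signals a right-skewed budget
budget_mean <- mean(df_demo$budget_naira)
budget_median <- median(df_demo$budget_naira)

# 4. Natural-log transform to reduce the influence of luxury inquiries
df_demo$log_budget <- log(df_demo$budget_naira)

# 5. Flag outliers with the 1.5 x IQR rule (assumed rule, not confirmed by the report)
flag_outliers <- function(x) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE)
  fence <- 1.5 * (q[2] - q[1])
  x < q[1] - fence | x > q[2] + fence
}
outlier_counts <- sapply(
  df_demo[c("budget_naira", "response_time_hours", "follow_ups")],
  function(x) sum(flag_outliers(x))
)

print(missing_counts)
print(c(mean = budget_mean, median = budget_median))
print(outlier_counts)
```

In this hypothetical sample, the single luxury budget pulls the mean well above the median, which mirrors the pattern the report describes for the real data.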
6. Analysis 2: Data Visualisation
Data visualisation was used to communicate the Lagos real estate lead conversion story in a clear and manager-friendly way. The visual narrative moves from lead source volume, to inspection quality, to property demand, budget profile, and operational engagement behaviour.
These charts help distinguish lead volume from lead quality. A source may generate many inquiries, but that does not automatically mean it produces inspection-ready buyers.
6.1 Lead Volume by Source
This chart shows where the 100 real estate inquiries came from. It helps identify which channels are generating the highest inquiry volume.
Chart interpretation: Instagram generated the highest number of inquiries, followed by WhatsApp and referrals. This shows that digital visibility and direct relationship channels are important sources of buyer interest. However, inquiry volume alone does not confirm lead quality, so this chart should be read together with the inspection rate chart below.
6.2 Inspection Rate by Lead Source
This chart shows the share of leads from each source that progressed to inspection. It is useful because lead quality is not the same as lead volume.
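Chart interpretation: This chart shifts the focus from quantity to quality. Based on the lead source by inspection outcome counts, referrals converted at the highest rate (15 of 19 leads, about 79 percent), followed by WhatsApp (13 of 22, about 59 percent), while Instagram converted 19 of its 47 leads (about 40 percent). Agent Network (3 of 9) and Website (1 of 3) each converted about 33 percent, although both groups are too small to support firm conclusions. A source with fewer total inquiries can therefore produce stronger inspection movement than a high-volume channel. This helps JahDay Real Estate avoid overvaluing high-volume channels and instead identify channels that are more likely to produce serious buyers.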
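The inspection rate behind this chart can be recomputed from the lead source by inspection outcome counts reported in the hypothesis testing section. The sketch below rebuilds that table by hand instead of reading the original `df`; the object names `inspection_rate` and `rate_summary` are illustrative, not the report's own.

```r
# Counts copied from the report's lead source by inspection outcome table
lead_source_table <- matrix(
  c( 6,  3,
    28, 19,
     4, 15,
     2,  1,
     9, 13),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    lead_source = c("Agent Network", "Instagram", "Referral", "Website", "WhatsApp"),
    inspected = c("No", "Yes")
  )
)

# Share of each source's leads that progressed to inspection (row-wise proportions)
inspection_rate <- prop.table(lead_source_table, margin = 1)[, "Yes"]

rate_summary <- data.frame(
  lead_source = names(inspection_rate),
  leads = as.vector(rowSums(lead_source_table)),
  inspection_rate_pct = round(100 * as.vector(inspection_rate), 1)
)

# Rank sources by conversion quality rather than by volume
rate_summary <- rate_summary[order(-rate_summary$inspection_rate_pct), ]
print(rate_summary, row.names = FALSE)
```

Sorting by rate rather than by lead count puts Referral first and Instagram third, which is the volume versus quality distinction the chart is meant to show.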
6.3 Lead Volume by Property Type
This chart shows the distribution of inquiries across apartments, bungalows, duplexes, and land. It helps identify which property categories attract the most interest.
Chart interpretation: Property interest is fairly spread across the four categories, with apartments and bungalows slightly ahead. This suggests that the lead pool is not concentrated in only one property type, which makes the dataset more useful for comparing buyer behaviour across different real estate categories.
6.4 Lead Volume by Budget Band
This chart groups buyer budgets into practical business bands. This is easier to interpret than a raw budget histogram because Lagos real estate values are highly skewed by a small number of luxury inquiries.
Code
# Group budgets into business bands (upper limits are inclusive)
df$budget_band <- cut(
  df$budget_naira,
  breaks = c(-Inf, 100000000, 250000000, 500000000, 1000000000, Inf),
  labels = c("Up to 100M", "101M to 250M", "251M to 500M", "501M to 1B", "Above 1B")
)
budget_band_counts <- sort(table(df$budget_band), decreasing = TRUE)

# Horizontal bar chart with wide left margin for the band labels
par(mar = c(5, 12, 4, 5))
bp4 <- barplot(
  budget_band_counts,
  horiz = TRUE,
  col = "orange",
  border = "grey30",
  main = "Lead Volume by Budget Band",
  xlab = "Number of Leads",
  las = 1,
  xlim = c(0, max(budget_band_counts) * 1.3)
)

# Print the count at the end of each bar
text(x = budget_band_counts, y = bp4, labels = budget_band_counts, pos = 4, cex = 0.9)
mtext("Budget bands are in Naira", side = 1, line = 4, cex = 0.8)
Chart interpretation: Most leads fall within the lower and mid-range budget bands, while fewer leads sit in the highest luxury budget category. This is important because the average budget can be misleading when a few very high-value leads are present. Grouping budgets into bands gives a more practical view of buyer affordability and helps with lead qualification.
6.5 Response Time by Inspection Outcome
This chart compares response time for leads that inspected versus leads that did not inspect. It helps assess whether faster response appears linked to inspection progression.
Code
par(mar = c(5, 5, 4, 2))
boxplot(
  response_time_hours ~ inspected,
  data = df,
  col = c("lightcoral", "lightblue"),
  main = "Response Time by Inspection Outcome",
  xlab = "Inspection Outcome",
  ylab = "Response Time in Hours"
)
Chart interpretation: This chart helps show whether inspected leads were generally responded to faster than non-inspected leads. If the inspected group has a lower median response time, it suggests that speed may support conversion. If the difference is small, it suggests that response time matters, but it is not enough on its own to explain inspection movement.
6.6 Follow-Up Activity by Inspection Outcome
This chart compares the number of follow-ups for inspected and non-inspected leads. It helps evaluate whether sustained engagement is associated with inspection movement.
Chart interpretation: This chart shows whether inspected leads received more follow-up activity than non-inspected leads. In Lagos real estate, follow-up can be especially important because buyers often need additional information about location, title, documentation, pricing, and inspection logistics before taking action.
6.7 Response Time and Follow-Up Activity
This scatterplot shows the relationship between response time and follow-up activity, with inspection outcome shown by colour. It helps show whether leads that inspect have a distinct response or follow-up pattern.
Code
par(mar = c(5, 5, 4, 2))
plot(
  df$response_time_hours, df$follow_ups,
  pch = 19,
  col = ifelse(df$inspected == "Yes", "darkgreen", "darkred"),
  main = "Response Time and Follow-Up Activity",
  xlab = "Response Time in Hours",
  ylab = "Number of Follow-Ups"
)
legend(
  "topright",
  legend = c("Inspected = Yes", "Inspected = No"),
  col = c("darkgreen", "darkred"),
  pch = 19,
  bty = "n"
)
Chart interpretation: The scatterplot shows that response time and follow-up activity do not form a perfectly simple pattern. Some leads require many follow-ups even when response time is fast, while others do not progress despite engagement. This supports the broader finding that inspection conversion is influenced by multiple factors, not one operational variable alone.
Overall visualisation interpretation: The visualisations show that lead volume, lead quality, budget profile, and engagement behaviour should be evaluated together. Instagram may produce the highest inquiry volume, but inspection rate and budget band distribution provide deeper insight into lead seriousness. The response time and follow-up charts show that operational discipline matters, but conversion is not explained by speed alone. Together, the charts support a more structured approach to client prioritisation and inspection conversion.
7. Analysis 3: Hypothesis Testing
Hypothesis testing was used to evaluate whether observed differences in the data are statistically meaningful or likely due to random variation. Two hypotheses were tested. The first examines whether inspected and non-inspected leads differ by budget profile. The second examines whether inspection outcome is associated with lead source.
Hypothesis 1: Budget Difference by Inspection Outcome
H0: The average log-transformed budget is the same for inspected and non-inspected leads.
H1: The average log-transformed budget is different for inspected and non-inspected leads.
A Welch two-sample t-test was used because the test compares the mean of a numeric variable across two independent groups. Log-transformed budget was used instead of raw budget because the original budget values were highly skewed by a small number of luxury property inquiries.
| Measure | Result |
| --- | --- |
| Mean log-transformed budget for non-inspected leads | 19.572 |
| p-value | 0.0559 |
| Effect size: Cohen’s d | -0.393 |
| Decision at 5 percent significance level | Fail to reject H0 |
Code
assumption_check_h1 <- data.frame(
  Assumption = c(
    "Independent observations",
    "Numeric dependent variable",
    "Two comparison groups",
    "Equal variance requirement"
  ),
  Check = c(
    "Each row represents one unique lead inquiry",
    "Log-transformed budget is numeric",
    "Inspection outcome has two groups: Yes and No",
    "Welch t-test was used because it does not require equal variances"
  )
)
knitr::kable(assumption_check_h1)
| Assumption | Check |
| --- | --- |
| Independent observations | Each row represents one unique lead inquiry |
| Numeric dependent variable | Log-transformed budget is numeric |
| Two comparison groups | Inspection outcome has two groups: Yes and No |
| Equal variance requirement | Welch t-test was used because it does not require equal variances |
Hypothesis 1 interpretation: The Welch two-sample t-test produced a p-value slightly above the 0.05 significance level. This means there is not enough statistical evidence to conclude that inspected and non-inspected leads have different average log-transformed budgets. Therefore, I fail to reject the null hypothesis. However, the result is close to the threshold, and the Cohen’s d value indicates a small to moderate difference between the two groups. From a business perspective, budget profile may still matter, but it should not be used alone to predict inspection conversion. JahDay Real Estate should evaluate budget together with lead source, response behaviour, and follow-up activity.
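The test code for Hypothesis 1 is not shown in the report, so the following is only a sketch of how a Welch t-test and Cohen's d could be computed in base R. The data are simulated and hypothetical (only the group sizes of 49 and 51 mirror the report), the pooled-standard-deviation form of Cohen's d is an assumption, and the sign convention (No minus Yes) is also assumed, so the output will not match the reported values.

```r
set.seed(42)  # arbitrary seed so the simulated example is repeatable

# Hypothetical log budgets; group sizes mirror the report (49 No, 51 Yes)
log_budget <- c(rnorm(49, mean = 19.6, sd = 1.0),
                rnorm(51, mean = 19.9, sd = 1.2))
inspected <- factor(rep(c("No", "Yes"), times = c(49, 51)),
                    levels = c("No", "Yes"))

# Welch two-sample t-test: var.equal = FALSE avoids the equal-variance assumption
welch_test <- t.test(log_budget ~ inspected, var.equal = FALSE)

# Cohen's d using the pooled standard deviation (assumed variant), as No minus Yes
group_means <- tapply(log_budget, inspected, mean)
group_sds <- tapply(log_budget, inspected, sd)
group_n <- tapply(log_budget, inspected, length)
pooled_sd <- sqrt(
  ((group_n["No"] - 1) * group_sds["No"]^2 +
     (group_n["Yes"] - 1) * group_sds["Yes"]^2) /
    (sum(group_n) - 2)
)
cohens_d <- unname((group_means["No"] - group_means["Yes"]) / pooled_sd)

print(welch_test)
print(round(cohens_d, 3))
```

Under this convention, a negative d means the non-inspected group has the lower mean log budget. Whether the report's value of -0.393 follows the same convention cannot be confirmed from the text.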
Hypothesis 2: Lead Source and Inspection Outcome
H0: Lead source and inspection outcome are independent.
H1: Lead source and inspection outcome are associated.
A chi-square test was used because both lead source and inspection outcome are categorical variables.
Code
# Cross-tabulate lead source against inspection outcome
lead_source_table <- table(df$lead_source, df$inspected)

# Simulated p-value because some lead source categories have small counts
chi_test <- chisq.test(lead_source_table, simulate.p.value = TRUE, B = 10000)

# Cramer's V effect size
cramers_v <- sqrt(
  as.numeric(chi_test$statistic) /
    (sum(lead_source_table) * (min(dim(lead_source_table)) - 1))
)

hypothesis_2_results <- data.frame(
  Measure = c(
    "Test used",
    "p-value",
    "Effect size: Cramer's V",
    "Decision at 5 percent significance level"
  ),
  Result = c(
    "Chi-square test with simulated p-value",
    round(chi_test$p.value, 4),
    round(cramers_v, 3),
    ifelse(chi_test$p.value < 0.05, "Reject H0", "Fail to reject H0")
  )
)
knitr::kable(as.data.frame.matrix(lead_source_table))
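| Lead Source | Inspected: No | Inspected: Yes |
| --- | --- | --- |
| Agent Network | 6 | 3 |
| Instagram | 28 | 19 |
| Referral | 4 | 15 |
| Website | 2 | 1 |
| WhatsApp | 9 | 13 |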
Code
knitr::kable(hypothesis_2_results)
| Measure | Result |
| --- | --- |
| Test used | Chi-square test with simulated p-value |
| p-value | 0.0343 |
| Effect size: Cramer’s V | 0.318 |
| Decision at 5 percent significance level | Reject H0 |
Code
assumption_check_h2 <- data.frame(
  Assumption = c(
    "Categorical variables",
    "Independent observations",
    "Expected cell count concern"
  ),
  Check = c(
    "Lead source and inspection outcome are both categorical",
    "Each row represents one unique lead inquiry",
    "Simulated p-value was used because some lead source categories have small counts"
  )
)
knitr::kable(assumption_check_h2)
| Assumption | Check |
| --- | --- |
| Categorical variables | Lead source and inspection outcome are both categorical |
| Independent observations | Each row represents one unique lead inquiry |
| Expected cell count concern | Simulated p-value was used because some lead source categories have small counts |
Hypothesis 2 interpretation: The chi-square test evaluates whether inspection outcome differs by lead source. The simulated p-value of 0.0343 is below the 0.05 significance level, so I reject the null hypothesis and conclude that lead source and inspection outcome are statistically associated. Cramer’s V of 0.318 indicates a moderate association. The contingency table shows where the difference lies: 15 of 19 referral leads progressed to inspection, compared with 19 of 47 Instagram leads. In business terms, some inquiry channels are more likely to produce inspection-ready leads than others. For JahDay Real Estate, this is important because time and follow-up effort should not be allocated only based on lead volume. Channels that produce stronger inspection movement should receive greater attention in lead prioritisation and client engagement planning.
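The expected cell count concern and the reported effect size can be checked directly from the published contingency table. The sketch below rebuilds the table by hand, so it does not depend on the original `df`. The seed is arbitrary, which means the simulated p-value will differ slightly from the reported 0.0343, whereas Cramer's V depends only on the counts.

```r
# Counts copied from the report's lead source by inspection outcome table
lead_source_table <- matrix(
  c( 6,  3,
    28, 19,
     4, 15,
     2,  1,
     9, 13),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    lead_source = c("Agent Network", "Instagram", "Referral", "Website", "WhatsApp"),
    inspected = c("No", "Yes")
  )
)

# Expected counts under independence; the warning about small counts is
# suppressed here because checking for small cells is the point of this step
chi_approx <- suppressWarnings(chisq.test(lead_source_table))
expected_counts <- round(chi_approx$expected, 2)
small_cells <- sum(chi_approx$expected < 5)

# Monte Carlo p-value, as in the report (the seed is an arbitrary choice)
set.seed(2026)
chi_sim <- chisq.test(lead_source_table, simulate.p.value = TRUE, B = 10000)

# Cramer's V: sqrt(chi-square / (n * (min(rows, cols) - 1)))
cramers_v <- sqrt(
  unname(chi_sim$statistic) /
    (sum(lead_source_table) * (min(dim(lead_source_table)) - 1))
)

print(expected_counts)
print(small_cells)
print(round(cramers_v, 3))
```

The Agent Network and Website rows have expected counts below 5, which is the usual rule of thumb for doubting the chi-square approximation and supports the report's choice of a simulated p-value.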
Overall hypothesis testing interpretation: The hypothesis tests show that budget profile alone does not fully explain inspection conversion (p = 0.0559), while lead source is significantly associated with inspection outcome (p = 0.0343, Cramer’s V = 0.318). This supports the broader conclusion that conversion is influenced by a combination of buyer profile, acquisition channel, and operational follow-up rather than one single variable.
8. Analysis 4: Correlation Analysis
Correlation analysis was used to examine the strength and direction of relationships between numeric variables in the dataset. This helps identify whether budget, response time, follow-up activity, and inspection outcome move together in ways that may be useful for real estate lead prioritisation.
Correlation does not prove causation, but it helps identify operational relationships that deserve further investigation.
Correlation interpretation: The correlation matrix shows how the numeric variables relate to one another. A positive correlation means that two variables tend to increase together, while a negative correlation means that one variable tends to decrease as the other increases. The strongest relationships are useful because they indicate which operational variables may be connected to inspection progression. In this dataset, the key managerial question is whether response time, follow-up activity, and budget profile are meaningfully related to inspection outcome. If follow-up activity has a stronger relationship with inspection than response time, this would suggest that sustained engagement may matter more than speed alone. However, correlation should not be treated as proof of causation. A stronger causal test would require tracking similar leads over time and comparing outcomes based on controlled differences in response strategy or follow-up intensity.
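The correlation code is not shown in the report, so the following is a hedged sketch of how such a matrix could be produced in base R. The `num_demo` data frame is simulated and hypothetical; only the variable names follow the report. Using Pearson correlation with the 0/1 outcome is an assumption about the original analysis.

```r
set.seed(7)  # arbitrary seed so the simulated example is repeatable
n <- 100

# Hypothetical numeric variables named after the report's columns
num_demo <- data.frame(
  log_budget = rnorm(n, mean = 19.7, sd = 1.1),
  response_time_hours = rexp(n, rate = 1 / 6),
  follow_ups = rpois(n, lambda = 3)
)

# Hypothetical outcome loosely tied to follow-ups so the matrix has some structure
num_demo$inspected_binary <- rbinom(n, 1, plogis(-1 + 0.35 * num_demo$follow_ups))

# Pearson correlation matrix; with a 0/1 variable this is the point-biserial correlation
cor_matrix <- round(cor(num_demo, method = "pearson", use = "complete.obs"), 2)

# Significance test for one pairing of managerial interest
follow_up_test <- cor.test(num_demo$follow_ups, num_demo$inspected_binary)

print(cor_matrix)
print(follow_up_test$p.value)
```

Reading the `inspected_binary` row of the matrix gives a quick comparison of which numeric variable moves most closely with inspection outcome, which is the comparison the interpretation above describes.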
9. Analysis 5: Logistic Regression
Logistic regression was used because the main business outcome in this case study is binary: a lead either progressed to inspection or did not progress to inspection. The model estimates how selected lead characteristics are associated with the probability of inspection.
For JahDay Real Estate, this is useful because it supports a more structured approach to lead prioritisation. Instead of treating all inquiries equally, the model helps assess whether budget profile, response time, follow-up activity, lead source, and property type are associated with inspection conversion.
# Keep only predictors that are significant at the 5 percent level
significant_terms <- model_coefficients[model_coefficients$`Pr(>|z|)` < 0.05, ]

# Show an explanatory note instead of an empty table
if (nrow(significant_terms) == 0) {
  significant_terms <- data.frame(
    Note = "No predictor was statistically significant at the 5 percent level in this model."
  )
}
knitr::kable(significant_terms)
Regression interpretation: The logistic regression model estimates the likelihood that a lead will progress to inspection based on lead characteristics. Odds ratios above 1 suggest that a variable is associated with higher inspection odds, while odds ratios below 1 suggest lower inspection odds. Statistically significant predictors should be interpreted as the strongest candidates for operational action. For example, if follow-up activity has an odds ratio above 1 and is statistically significant, this would support a more structured follow-up process. If a lead source has a higher odds ratio than the reference category, that channel should receive greater prioritisation in marketing and response planning. The model should not be treated as a perfect prediction system because the sample size is limited, but it provides a practical evidence base for improving lead scoring and inspection conversion.
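The model-fitting step is not shown in the report. As a sketch only, a table like `model_coefficients` could be built as follows. The data are simulated and hypothetical, the model uses a reduced set of predictors and only three lead sources, and the Wald confidence intervals are an assumed choice. The names `model_demo` and `inspection_model` are illustrative.

```r
set.seed(11)  # arbitrary seed so the simulated example is repeatable
n <- 100

# Hypothetical predictors named after the report's columns
model_demo <- data.frame(
  log_budget = rnorm(n, mean = 19.7, sd = 1.1),
  response_time_hours = rexp(n, rate = 1 / 6),
  follow_ups = rpois(n, lambda = 3),
  lead_source = factor(
    sample(c("Instagram", "Referral", "WhatsApp"), n, replace = TRUE),
    levels = c("Instagram", "Referral", "WhatsApp")  # Instagram is the reference category
  )
)

# Hypothetical outcome generated from follow-ups and referral status
linear_part <- -1.5 + 0.4 * model_demo$follow_ups +
  0.8 * (model_demo$lead_source == "Referral")
model_demo$inspected_binary <- rbinom(n, 1, plogis(linear_part))

# Logistic regression: binomial family with the default logit link
inspection_model <- glm(
  inspected_binary ~ log_budget + response_time_hours + follow_ups + lead_source,
  data = model_demo,
  family = binomial
)

# Coefficient table with odds ratios and Wald 95 percent intervals
model_coefficients <- as.data.frame(summary(inspection_model)$coefficients)
model_coefficients$odds_ratio <- exp(model_coefficients$Estimate)
wald_ci <- exp(confint.default(inspection_model))
model_coefficients$or_lower <- wald_ci[, 1]
model_coefficients$or_upper <- wald_ci[, 2]

print(round(model_coefficients, 3))
```

Exponentiating a coefficient converts it from the log-odds scale to an odds ratio, so an interval that excludes 1 corresponds to a predictor that is significant at the 5 percent level under the Wald test.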
10. Integrated Findings
The five analyses collectively show that Lagos real estate lead conversion is not driven by one isolated factor. Inspection progression is shaped by a combination of lead source quality, buyer budget profile, response behaviour, and follow-up activity. This is important because real estate inquiries can appear promising at first glance, but not every inquiry represents a buyer who is financially ready, serious, or prepared to move toward inspection.
The exploratory data analysis confirmed that the dataset was suitable for CS1 and met the required structure for analysis. It also revealed two important data quality issues. First, budget values were highly skewed because a small number of luxury property inquiries increased the average budget. This meant that the mean budget alone could give a misleading impression of the typical buyer profile. To address this, log-transformed budget was created for analysis, while budget bands were used in the visual section to make the pattern easier to interpret from a business perspective. Second, inquiry date was imported from the CSV as a text field and was converted into a proper date variable before analysis. These cleaning steps made the dataset more reliable for interpretation.
The visualisation section showed that lead volume and lead quality are not the same. Instagram generated the highest number of inquiries, which confirms its usefulness as a visibility and discovery channel. However, the inspection rate chart showed that high inquiry volume does not automatically translate into stronger inspection conversion. This distinction is important for decision making because a business can waste time chasing a large number of low-readiness leads if it focuses only on volume. The budget band chart also gave a more practical view of buyer affordability by showing how many leads fall into each budget range instead of allowing a few luxury inquiries to distort the full picture.
The hypothesis testing section added statistical discipline to the business interpretation. The first test showed that the difference in log-transformed budget between inspected and non-inspected leads was not statistically significant at the 5 percent level, although the result was close enough to remain operationally relevant. This suggests that budget profile may matter, but budget alone should not be treated as a reliable predictor of inspection conversion. The second test showed that lead source and inspection outcome were statistically associated. This finding is more actionable because it indicates that where a lead comes from may influence the likelihood of inspection. In practical terms, source quality deserves attention alongside buyer budget.
The correlation analysis reinforced the idea that no single numeric variable fully explains inspection behaviour. Budget, response time, follow-up activity, and inspection outcome should be interpreted together rather than separately. The relationships between these variables help identify patterns, but they do not prove causation. For example, stronger follow-up activity may be linked to inspection progression, but it may also reflect the fact that more serious buyers naturally engage more. This means the business should use correlation as a signal for further investigation, not as final proof.
The logistic regression analysis brought the variables together into one model and provided a more structured way to assess inspection probability. Because the outcome variable is binary, logistic regression was the appropriate method for this business problem. The model supports the broader finding that inspection conversion should be viewed as a probability influenced by several factors, not a simple yes-or-no judgment based on one characteristic. Even if not every predictor is statistically significant, the model is useful as a decision-support tool because it shows how multiple operational and buyer characteristics can be considered together.
The single recommendation is that JahDay Real Estate should adopt a simple lead prioritisation framework. Leads should not be ranked only by who came in first, which channel produced the most inquiries, or who appears to have the highest budget. Instead, priority should be given to leads with stronger source quality, realistic budget alignment, clear inspection intent, and consistent follow-up potential. This would help focus time, inspection coordination, buyer advisory effort, and developer engagement on the leads most likely to progress into serious transaction opportunities.
Overall, the analysis supports a shift from reactive lead handling to more structured client prioritisation. The business should continue responding quickly, but speed should be supported by better qualification. A practical next step would be to create a simple scoring system that assigns weight to lead source, budget band, follow-up engagement, and inspection readiness. This would make the lead management process more evidence-based while still allowing room for professional judgment in a relationship-driven market like Lagos real estate.
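One possible shape for such a scoring system is sketched below. Everything in it is a hypothetical placeholder: the `score_lead` function, the point values, the cap on follow-up points, and the `ready_to_inspect` flag (which is not a variable in the dataset) are assumptions for illustration. Only the budget band cut-offs and the ordering of sources by observed inspection rate come from the report. Real weights would need to be calibrated against actual outcomes.

```r
# Hypothetical scoring function; all point values are illustrative placeholders
score_lead <- function(lead_source, budget_naira, follow_ups, ready_to_inspect) {
  # Source points follow the observed ordering of inspection rates in the report
  source_points <- c("Referral" = 3, "WhatsApp" = 2, "Instagram" = 1,
                     "Agent Network" = 1, "Website" = 1)

  # Same band cut-offs as the report's budget band chart; returns band index 1 to 5
  budget_band <- cut(
    budget_naira,
    breaks = c(-Inf, 100e6, 250e6, 500e6, 1000e6, Inf),
    labels = FALSE
  )
  budget_points <- c(1, 2, 3, 3, 3)[budget_band]  # hypothetical points per band

  follow_up_points <- pmin(follow_ups, 3)            # cap so follow-ups cannot dominate
  readiness_points <- ifelse(ready_to_inspect, 2, 0) # hypothetical readiness flag

  unname(source_points[as.character(lead_source)]) +
    budget_points + follow_up_points + readiness_points
}

# Three hypothetical leads
example_leads <- data.frame(
  lead_id = c("L001", "L002", "L003"),
  lead_source = c("Instagram", "Referral", "WhatsApp"),
  budget_naira = c(90e6, 300e6, 180e6),
  follow_ups = c(1, 4, 2),
  ready_to_inspect = c(FALSE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

example_leads$priority_score <- with(
  example_leads,
  score_lead(lead_source, budget_naira, follow_ups, ready_to_inspect)
)

# Highest-priority leads first
print(example_leads[order(-example_leads$priority_score), ], row.names = FALSE)
```

An additive score like this is easy to explain to agents and partners, and the weights can be revised as more inspection and sale outcomes are recorded.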
11. Limitations and Further Work
This study is limited by the sample size of 100 observations. Although this meets the assignment requirement, a larger dataset collected over a longer period would provide more stable and generalisable findings. The data reflects lead behaviour within one real estate business context and should not be interpreted as a complete representation of the entire Lagos real estate market.
The dataset also does not include some variables that may strongly influence inspection conversion. These include buyer income level, financing structure, urgency, title concerns, property documentation stage, marketing spend, developer reputation, and final sale outcome. These factors are important in Lagos real estate because buyers often require trust, legal comfort, and financial readiness before progressing from inquiry to inspection.
Another limitation is that the analysis focuses on inspection as the outcome variable, not final purchase. Inspection is an important operational milestone, but it does not always result in a completed transaction.
With more data and time, I would collect additional observations across multiple property campaigns and track each lead from first inquiry to inspection, negotiation, and final sale. Further work could also apply customer segmentation, predictive modelling, and time-based analysis to improve lead scoring, marketing allocation, and follow-up planning.
References
Adi, B. (2026). Data Analytics 1 Course Materials. Lagos Business School.
JahDay Real Estate. (2026). Anonymised Lagos real estate lead inquiry dataset, November 2025 to May 2026 [Unpublished primary dataset].
R Core Team. (2026). R: A language and environment for statistical computing. R Foundation for Statistical Computing.
Xie, Y. (2026). knitr: A general-purpose package for dynamic report generation in R.
Appendix: AI Usage Statement
ChatGPT was used as a support tool during the preparation of this report, specifically for code troubleshooting, Quarto formatting guidance, and refinement of interpretation language. The dataset was independently assembled, cleaned, and anonymised by the author from real estate lead activity connected to JahDay Real Estate. The business question, professional context, selection of analytical techniques, review of outputs, interpretation of results, and final recommendations were independently evaluated and validated by the author. AI assistance was used to support presentation, formatting, and clarity, while the underlying data ownership, analytical judgement, and business reasoning remained with the author.