Customer Purchase Behaviour Analytics: Evaluating Price Sensitivity & Competition in the Lubricants Market of TotalEnergies Marketing Nigeria PLC
Author
Victory Ishioma Onyenwosa
Published
May 26, 2026
1. Executive Summary
This study investigates customer purchase behaviour among lubricant distributors and resellers within the sales network of TotalEnergies Marketing Nigeria PLC. The central business problem is the growing instability in distributor purchasing patterns driven by competitive market pressure, price fluctuations, and shifting brand loyalties — all of which directly erode sales performance and customer retention in the Nigerian downstream lubricants sector.
Primary data were collected through a structured questionnaire administered to 102 lubricant distributors and resellers between January and March 2026. The dataset comprises 32 variables covering respondent demographics, price sensitivity, competitor influence, purchase frequency, brand loyalty, and switching behaviour, all measured on a 5-point Likert scale.
Five analytical techniques were applied: Exploratory Data Analysis (EDA), Data Visualisation, Hypothesis Testing, Correlation Analysis, and Linear Regression. Key findings indicate that price sensitivity is a dominant driver of purchase decisions, with strong associations between competitor promotions, brand-switching tendencies, and reduced purchase volume. Respondents who actively compare prices and seek deals from competitors exhibit significantly higher brand-switching scores.
The study recommends that TotalEnergies Marketing Nigeria PLC strengthen its value proposition beyond price — through credit flexibility, distributor loyalty programmes, and targeted promotional incentives — to reduce competitor-induced churn and stabilise purchase volumes across the distributor network.
2. Professional Disclosure
Job Role and Organisation
I currently work as a Sales Representative within the lubricants business segment of TotalEnergies Marketing Nigeria PLC, a major player in the downstream oil and gas industry in Nigeria. My responsibilities include distributor relationship management, sales performance monitoring, market intelligence gathering, customer engagement, and sales reporting across assigned territories. This role provides direct operational exposure to the purchasing behaviour of distributors, retailers, workshops, and fleet operators, making the data collected highly relevant to real decisions taken in my day-to-day work.
Technique 1 — Exploratory Data Analysis (EDA)
EDA is directly relevant to my role because lubricant sales activities generate multi-dimensional customer data from dozens of distributors monthly. Before making any commercial decision — such as adjusting credit terms, redesigning promotional offers, or targeting specific distributor segments — a sales representative must first understand the shape of that data: who the customers are, how they distribute across purchase categories, and where anomalies or data gaps exist. EDA provides the foundation for all subsequent analysis.
Technique 2 — Data Visualisation
Visual communication of data is a core sales management skill. Weekly and monthly sales reviews at TotalEnergies require the ability to present distributor performance patterns, market share shifts, and pricing responses to non-technical managers and regional directors. Data visualisation translates raw numbers into stories — identifying which distributor segments are at risk, where competitor pressure is highest, and which territories show growth potential.
Technique 3 — Hypothesis Testing
Hypothesis testing is operationally valuable because commercial decisions in lubricants sales are often debated without statistical rigour. For instance, whether price increases significantly reduce purchase volumes across distributor types is a question that directly affects pricing strategy. Formal hypothesis testing replaces intuition with statistical evidence, enabling sales leadership to make more defensible decisions on promotions, credit terms, and competitive responses.
Technique 4 — Correlation Analysis
In lubricant distribution, many commercial factors — pricing, competitor activity, credit terms, and purchase frequency — interact simultaneously. Correlation analysis helps determine which of these factors move together and by how much. This is particularly useful for identifying which variables to prioritise in customer engagement: if competitor influence and brand switching are highly correlated, for example, the business should invest more in competitive intelligence and counter-promotion strategies.
Technique 5 — Linear Regression
Regression analysis supports evidence-based forecasting and targeted intervention by quantifying how much each factor contributes to price sensitivity outcomes. In a sales context, understanding that a one-unit increase in competitor influence leads to a measurable increase in price sensitivity allows the organisation to allocate resources — promotional budgets, relationship management effort, credit concessions — to the factors with the highest return on investment.
3. Data Collection & Sampling
Data Source
The primary dataset was collected through a structured, self-administered questionnaire designed specifically for this research. The questionnaire was distributed both physically (during field visits to distributors) and electronically (via WhatsApp and email) to active lubricant distributors and resellers within TotalEnergies Marketing Nigeria PLC’s assigned sales territory in Rivers State and surrounding areas.
Data Collection Method
All survey items used a 5-point Likert scale: 1 = Strongly Disagree, 2 = Disagree, 3 = Neutral, 4 = Agree, 5 = Strongly Agree. The questionnaire covered five thematic blocks: (1) demographics and business profile, (2) price sensitivity behaviours, (3) competitor influence, (4) purchase frequency and volume patterns, and (5) brand-switching tendencies.
Sampling Frame
The sampling frame consisted of lubricant distributors and resellers actively purchasing TotalEnergies lubricant products within the assigned sales territories. Respondents were selected using convenience and purposive sampling — purposive in that only active purchasers with at least three months of trading history were eligible; convenience in that accessibility during field visits and responsiveness to electronic invitations determined inclusion.
Sample Size
A total of 102 valid responses were collected and used for analysis. This exceeds the minimum threshold of 100 observations specified for this case study. The sample size is also considered adequate for parametric statistical tests given the predominantly ordinal Likert-scale data and the analytical techniques applied.
Time Period Covered
Data collection was conducted between January 2026 and March 2026, covering an active trading quarter that captures post-holiday demand recovery, mid-quarter pricing cycles, and competitive market conditions typical of the Nigerian lubricants distribution environment.
Ethical Considerations
Participation was entirely voluntary. All respondents were informed of the academic purpose of the research prior to participation, and verbal consent was obtained in all cases. No personally identifiable information — such as names, phone numbers, or company identities — was collected. Business types and purchase values were captured only in broad categorical bands. All responses are anonymised and published only in aggregated analytical form. This study does not conflict with TotalEnergies’ data confidentiality policies, as no proprietary pricing structures, customer account records, or internal sales figures are disclosed.
Data Quality Issue 1 — Missing values in Likert items: Fourteen of the 27 Likert-scale variables have between 1 and 5 missing observations (1–5% missingness). These are attributed to survey non-response on specific items. Handling approach: Because missingness is low (maximum 5 out of 102) and appears missing at random (MAR), composite scores were computed using row-wise means with na.rm = TRUE in R and skipna=True in Python, preserving all 102 observations for analysis.
Data Quality Issue 2 — Ambiguous category label “Option 5” in Purchase_Value_Cat: Five respondents selected “Option 5” rather than the intended label “Above ₦500,000,000” in the monthly purchase value variable, likely due to a survey design error in the electronic version. Handling approach: These were recoded to “Above ₦500,000,000” prior to analysis, as the option’s ordinal position clearly corresponds to that category.
Additional quality note — Business_Type: Four respondents left Business_Type blank. These are retained for all analyses not involving Business_Type as a grouping variable; they are excluded only in group-comparison analyses.
import pandas as pdimport numpy as npfrom scipy.stats import skewcomposites = ['Price_Sensitivity_Score','Competitor_Influence_Score','Purchase_Volume_Score','Brand_Switching_Score']stats_rows = []for c in composites: s = df_py[c].dropna() stats_rows.append({'Construct': c.replace('_',' '),'N': len(s),'Mean': round(s.mean(), 3),'Median': round(s.median(), 3),'SD': round(s.std(), 3),'Min': round(s.min(), 3),'Max': round(s.max(), 3),'Skewness': round(skew(s), 3) })stats_df = pd.DataFrame(stats_rows)print("=== Composite Score Descriptive Statistics ===")
=== Composite Score Descriptive Statistics ===
Code
print(stats_df.to_string(index=False))
Construct N Mean Median SD Min Max Skewness
Price Sensitivity Score 102 3.975 4.000 0.861 1.000 5.0 -1.333
Competitor Influence Score 102 3.824 4.000 0.891 1.167 5.0 -1.055
Purchase Volume Score 102 4.072 4.000 0.635 1.286 5.0 -1.439
Brand Switching Score 102 3.547 3.667 0.646 1.500 5.0 -0.904
Code
print("\n=== Categorical Frequencies ===")
=== Categorical Frequencies ===
Code
for col in ['Gender','Age','Business_Type','Years_Business','Purchase_Value_Cat']: freq = df_py[col].value_counts(dropna=False) pct = (freq /len(df_py) *100).round(1) out = pd.DataFrame({'Count': freq, 'Percent(%)': pct})print(f"\n{col}:\n{out.to_string()}")
Gender:
Count Percent(%)
Gender
Male 85 83.3
Female 17 16.7
Age:
Count Percent(%)
Age
35-44 38 37.3
45-54 36 35.3
25-34 17 16.7
Above 54 9 8.8
Below 25 2 2.0
Business_Type:
Count Percent(%)
Business_Type
Reseller 71 69.6
Distributor 27 26.5
NaN 4 3.9
Years_Business:
Count Percent(%)
Years_Business
7-10 years 29 28.4
1-3 years 27 26.5
4-6 years 22 21.6
Above 10 years 20 19.6
Less that 1 year 4 3.9
Purchase_Value_Cat:
Count Percent(%)
Purchase_Value_Cat
Below ₦150,000,000 70 68.6
₦150,000,000 – ₦300,000,000 13 12.7
Above ₦500,000,000 12 11.8
₦300,000,001 – ₦500,000,000 7 6.9
EDA Interpretation
The four composite constructs show the following key patterns:
Price Sensitivity Score (mean ≈ 3.98): Distributors lean clearly toward price-sensitive behaviour, with items such as “Price changes significantly affect my buying decisions” and “I often compare prices before placing an order” scoring above 4.0 on average. The distribution shows mild negative skew, indicating that most respondents cluster at the higher end of the scale.
Competitor Influence Score (mean ≈ 3.84): Competitor-related behaviours are also elevated, particularly active deal-seeking and responsiveness to competitor promotions. This confirms a market environment where TotalEnergies faces sustained competitive pressure at the distributor level.
Purchase Volume Score (mean ≈ 4.05): Respondents generally report stable, planned purchasing behaviour influenced by downstream customer demand. The relatively low standard deviation suggests moderate consistency across the sample.
Brand Switching Score (mean ≈ 3.55): Brand switching is moderate. While not extreme, the score above the scale midpoint of 3.0 signals a meaningful proportion of distributors who have either switched brands or would do so under the right conditions.
Demographically, the sample is male-dominated (85 of 102), with most respondents aged 35–54 and transacting below ₦150 million monthly — consistent with the mid-tier distributor and reseller profile typical of Nigeria’s lubricants distribution network.
6. Data Visualisation
The five visualisations below tell a coherent story: price sensitivity is high across the distributor base, competitor influence is a real behavioural driver, brand-switching is moderate but concentrated, and the relationship between price sensitivity and brand switching is the central commercial risk facing TotalEnergies.
# Colour palettepal <-c("#003087","#E8002D","#F0A500","#00843D","#7B2D8B")# ── Plot 1: Composite score distributions ──p1 <- df %>% dplyr::select(Price_Sensitivity_Score, Competitor_Influence_Score, Purchase_Volume_Score, Brand_Switching_Score) %>%pivot_longer(everything(), names_to ="Construct", values_to ="Score") %>%mutate(Construct =str_replace_all(Construct, "_", " ")) %>%ggplot(aes(x = Score, fill = Construct)) +geom_histogram(bins =10, colour ="white", alpha =0.85) +facet_wrap(~Construct, scales ="free_y") +scale_fill_manual(values = pal) +labs(title ="Plot 1: Distribution of Composite Construct Scores",subtitle ="Scores are mean of Likert items (1–5); most constructs cluster above the midpoint (3.0)",x ="Composite Score", y ="Count") +theme_minimal(base_size =12) +theme(legend.position ="none",strip.text =element_text(face ="bold"),plot.title =element_text(face ="bold"))# ── Plot 2: Price Sensitivity by Business Type ──p2 <- df %>%filter(!is.na(Business_Type)) %>%ggplot(aes(x = Business_Type, y = Price_Sensitivity_Score, fill = Business_Type)) +geom_boxplot(outlier.colour ="#E8002D", outlier.size =2.5, alpha =0.8) +scale_fill_manual(values =c("#003087","#F0A500")) +labs(title ="Plot 2: Price Sensitivity by Business Type",subtitle ="Resellers show slightly higher price sensitivity than Distributors",x ="Business Type", y ="Price Sensitivity Score") +theme_minimal(base_size =12) +theme(legend.position ="none", plot.title =element_text(face="bold"))# ── Plot 3: Competitor Influence by Years in Business ──p3 <- df %>%mutate(Years_Business =factor(Years_Business,levels =c("Less that 1 year","1-3 years","4-6 years","7-10 years","Above 10 years"))) %>%ggplot(aes(x = Years_Business, y = Competitor_Influence_Score, fill = Years_Business)) +geom_violin(trim =FALSE, alpha =0.7) +geom_boxplot(width =0.15, fill ="white", outlier.size =1.5) +scale_fill_brewer(palette ="Blues") +labs(title ="Plot 3: Competitor Influence Score by Years in Business",subtitle ="Newer distributors (1–3 years) show broader variance in competitor responsiveness",x ="Years in Business", y ="Competitor Influence Score") +theme_minimal(base_size =12) +theme(legend.position ="none", axis.text.x =element_text(angle =30, hjust =1),plot.title =element_text(face ="bold"))# ── Plot 4: Scatter — Price Sensitivity vs Brand Switching ──p4 <-ggplot(df, aes(x = Price_Sensitivity_Score, y = Brand_Switching_Score,colour = Business_Type)) +geom_point(alpha =0.6, size =3) +geom_smooth(method ="lm", se =TRUE, colour ="#003087", fill ="#003087", alpha =0.15) +scale_colour_manual(values =c("#E8002D","#F0A500"), na.value ="grey70") +labs(title ="Plot 4: Price Sensitivity vs Brand Switching Tendency",subtitle ="Positive association: more price-sensitive distributors show higher brand-switching scores",x ="Price Sensitivity Score", y ="Brand Switching Score",colour ="Business Type") +theme_minimal(base_size =12) +theme(plot.title =element_text(face ="bold"))# ── Plot 5: Stacked bar — Purchase value by Business Type ──p5 <- df %>%filter(!is.na(Business_Type)) %>%mutate(Purchase_Value_Cat =factor(Purchase_Value_Cat,levels =c("Below ₦150,000,000","₦150,000,000 – ₦300,000,000","₦300,000,001 – ₦500,000,000","Above ₦500,000,000"))) %>%count(Business_Type, Purchase_Value_Cat) %>%group_by(Business_Type) %>%mutate(Pct = n /sum(n) *100) %>%ggplot(aes(x = Business_Type, y = Pct, fill = Purchase_Value_Cat)) +geom_col(position ="stack", colour ="white") +scale_fill_brewer(palette ="RdYlBu", direction =-1) +labs(title ="Plot 5: Monthly Purchase Value Category by Business Type",subtitle ="Distributors transact at higher volumes; most Resellers fall below ₦150M/month",x ="Business Type", y ="Percentage (%)", fill ="Purchase Value Band") +theme_minimal(base_size =12) +theme(plot.title =element_text(face ="bold"),legend.position ="right")# Combine with patchwork(p1) / (p2 | p3) / (p4 | p5) +plot_annotation(title ="Customer Purchase Behaviour — TotalEnergies Lubricants Distributors",subtitle ="Survey data: January–March 2026 | n = 102",theme =theme(plot.title =element_text(size =16, face ="bold"),plot.subtitle =element_text(size =11)) )
Code
import matplotlib.pyplot as pltimport matplotlib.patches as mpatchesimport seaborn as snsimport numpy as npsns.set_theme(style="whitegrid", palette="deep")fig, axes = plt.subplots(3, 2, figsize=(13, 15))fig.suptitle("Customer Purchase Behaviour — TotalEnergies Lubricants Distributors\n""Survey data: January–March 2026 | n = 102", fontsize=14, fontweight='bold', y=1.01)composites = ['Price_Sensitivity_Score','Competitor_Influence_Score','Purchase_Volume_Score','Brand_Switching_Score']colors = ['#003087','#E8002D','#F0A500','#00843D']# Plot 1 — Histograms (top row, spanning both cols via loop in 2x2 grid)ax0 = axes[0, 0]for c, col inzip(composites, colors): ax0.hist(df_py[c].dropna(), bins=12, alpha=0.55, color=col, label=c.replace('_Score','').replace('_',' '))ax0.axvline(3.0, color='black', linestyle='--', linewidth=1, label='Scale midpoint')ax0.set_title("Plot 1: Composite Score Distributions", fontweight='bold')ax0.set_xlabel("Score (1–5)")ax0.set_ylabel("Frequency")ax0.legend(fontsize=8)# Plot 2 — Boxplot Price Sensitivity by Business Typeax1 = axes[0, 1]biz_types = df_py['Business_Type'].dropna().unique()data_box = [df_py[df_py['Business_Type']==bt]['Price_Sensitivity_Score'].dropna()for bt in biz_types]bp = ax1.boxplot(data_box, patch_artist=True, labels=biz_types, medianprops=dict(color='black', linewidth=2))for patch, c inzip(bp['boxes'], ['#003087','#F0A500']): patch.set_facecolor(c) patch.set_alpha(0.75)ax1.set_title("Plot 2: Price Sensitivity by Business Type", fontweight='bold')ax1.set_ylabel("Price Sensitivity Score")# Plot 3 — Competitor Influence by Years in Businessax2 = axes[1, 0]order = ["Less that 1 year","1-3 years","4-6 years","7-10 years","Above 10 years"]yb = df_py['Years_Business'].map({v: i for i, v inenumerate(order)})palette_b = sns.color_palette("Blues", len(order))for i, yr inenumerate(order): vals = df_py[df_py['Years_Business']==yr]['Competitor_Influence_Score'].dropna() ax2.scatter([i]*len(vals), vals, color=palette_b[i], alpha=0.5, s=20) ax2.boxplot(vals, positions=[i], widths=0.4, medianprops=dict(color='navy', linewidth=2), patch_artist=True, boxprops=dict(facecolor=palette_b[i], alpha=0.4))
<matplotlib.collections.PathCollection object at 0x14f9d13a0>
{'whiskers': [<matplotlib.lines.Line2D object at 0x15671dfa0>, <matplotlib.lines.Line2D object at 0x15671ddf0>], 'caps': [<matplotlib.lines.Line2D object at 0x15671cec0>, <matplotlib.lines.Line2D object at 0x15671d520>], 'boxes': [<matplotlib.patches.PathPatch object at 0x15671e540>], 'medians': [<matplotlib.lines.Line2D object at 0x15671d220>], 'fliers': [<matplotlib.lines.Line2D object at 0x1566f3b30>], 'means': []}
<matplotlib.collections.PathCollection object at 0x1566f3c50>
{'whiskers': [<matplotlib.lines.Line2D object at 0x1566bda00>, <matplotlib.lines.Line2D object at 0x15671e120>], 'caps': [<matplotlib.lines.Line2D object at 0x15671c6e0>, <matplotlib.lines.Line2D object at 0x15671c9e0>], 'boxes': [<matplotlib.patches.PathPatch object at 0x15671e6f0>], 'medians': [<matplotlib.lines.Line2D object at 0x15671c5f0>], 'fliers': [<matplotlib.lines.Line2D object at 0x15671c0b0>], 'means': []}
<matplotlib.collections.PathCollection object at 0x14f987a70>
{'whiskers': [<matplotlib.lines.Line2D object at 0x15671f740>, <matplotlib.lines.Line2D object at 0x15671f980>], 'caps': [<matplotlib.lines.Line2D object at 0x15671fc20>, <matplotlib.lines.Line2D object at 0x15671fe90>], 'boxes': [<matplotlib.patches.PathPatch object at 0x15671f3e0>], 'medians': [<matplotlib.lines.Line2D object at 0x1567501d0>], 'fliers': [<matplotlib.lines.Line2D object at 0x156750470>], 'means': []}
<matplotlib.collections.PathCollection object at 0x15671ed50>
{'whiskers': [<matplotlib.lines.Line2D object at 0x156750c50>, <matplotlib.lines.Line2D object at 0x156750b90>], 'caps': [<matplotlib.lines.Line2D object at 0x1567515b0>, <matplotlib.lines.Line2D object at 0x1567518b0>], 'boxes': [<matplotlib.patches.PathPatch object at 0x156750ec0>], 'medians': [<matplotlib.lines.Line2D object at 0x156751b80>], 'fliers': [<matplotlib.lines.Line2D object at 0x156751e20>], 'means': []}
<matplotlib.collections.PathCollection object at 0x1566bd7f0>
{'whiskers': [<matplotlib.lines.Line2D object at 0x156752b70>, <matplotlib.lines.Line2D object at 0x156752e10>], 'caps': [<matplotlib.lines.Line2D object at 0x156753110>, <matplotlib.lines.Line2D object at 0x1567533b0>], 'boxes': [<matplotlib.patches.PathPatch object at 0x156752780>], 'medians': [<matplotlib.lines.Line2D object at 0x156753680>], 'fliers': [<matplotlib.lines.Line2D object at 0x156753890>], 'means': []}
Code
ax2.set_xticks(range(len(order)))ax2.set_xticklabels([o.replace(' ','\n') for o in order], fontsize=8)ax2.set_title("Plot 3: Competitor Influence by Years in Business", fontweight='bold')ax2.set_ylabel("Competitor Influence Score")# Plot 4 — Scatter Price Sensitivity vs Brand Switchingax3 = axes[1, 1]for bt, c inzip(['Distributor','Reseller'], ['#E8002D','#F0A500']): sub = df_py[df_py['Business_Type']==bt] ax3.scatter(sub['Price_Sensitivity_Score'], sub['Brand_Switching_Score'], color=c, alpha=0.55, s=40, label=bt)m, b = np.polyfit(df_py['Price_Sensitivity_Score'].dropna(), df_py.loc[df_py['Price_Sensitivity_Score'].notna(),'Brand_Switching_Score'].fillna(df_py['Brand_Switching_Score'].mean()),1)x_line = np.linspace(1, 5, 100)ax3.plot(x_line, m*x_line + b, color='#003087', linewidth=2, label='OLS trend')ax3.set_title("Plot 4: Price Sensitivity vs Brand Switching", fontweight='bold')ax3.set_xlabel("Price Sensitivity Score")ax3.set_ylabel("Brand Switching Score")ax3.legend()# Plot 5 — Stacked bar purchase value by business typeax4 = axes[2, 0]pv_order = ["Below ₦150,000,000","₦150,000,000 – ₦300,000,000","₦300,000,001 – ₦500,000,000","Above ₦500,000,000"]ct = df_py[df_py['Business_Type'].notna()].groupby( ['Business_Type','Purchase_Value_Cat']).size().unstack(fill_value=0)# Normalisect_pct = ct.div(ct.sum(axis=1), axis=0) *100ct_pct = ct_pct.reindex(columns=[c for c in pv_order if c in ct_pct.columns])ct_pct.plot(kind='bar', stacked=True, ax=ax4, colormap='RdYlBu', edgecolor='white')ax4.set_title("Plot 5: Monthly Purchase Value by Business Type", fontweight='bold')ax4.set_xlabel("Business Type")ax4.set_ylabel("Percentage (%)")ax4.legend(loc='upper right', fontsize=7)ax4.tick_params(axis='x', rotation=0)# Hide unused subplotaxes[2, 1].axis('off')
Plot 1 confirms that all four constructs concentrate above the 3.0 midpoint, with price sensitivity and purchase volume showing the least dispersion. Plot 2 reveals that Resellers are more price-sensitive than Distributors, consistent with their higher exposure to end-consumer price competition. Plot 3 shows that mid-tenure distributors (4–10 years) exhibit more concentrated competitor influence scores, while newer entrants show wider variance — suggesting newer distributors are still forming stable supplier preferences. Plot 4 — the most commercially critical plot — demonstrates a clear positive linear relationship between price sensitivity and brand switching: as price sensitivity rises, so does brand-switching tendency, identifying price-sensitive distributors as the highest churn risk. Plot 5 confirms that most resellers operate below the ₦150M monthly threshold, while distributors span a broader purchase value range, indicating different commercial prioritisation is appropriate across the two customer types.
7. Hypothesis Testing
7.1 Hypothesis 1 — Price Sensitivity and Business Type
Business question: Do Resellers exhibit significantly higher price sensitivity than Distributors?
This matters because if the difference is statistically significant, TotalEnergies should design separate pricing and retention strategies for each business type rather than applying a uniform approach.
cat("Shapiro-Wilk — Reseller: W =", round(sw_res$statistic,4), "p =", round(sw_res$p.value,4), "\n")
Shapiro-Wilk — Reseller: W = 0.8501 p = 0
Code
# Levene's test for equal varianceslevene_result <- car::leveneTest(Price_Sensitivity_Score ~ Business_Type,data = df %>%filter(!is.na(Business_Type)))cat("Levene's test p-value:", round(levene_result$`Pr(>F)`[1], 4), "\n\n")
Levene's test p-value: 0.2347
Code
# Welch's independent t-test (robust to unequal variances)t_result <-t.test(res_ps, dist_ps, alternative ="two.sided", var.equal =FALSE)print(t_result)
Welch Two Sample t-test
data: res_ps and dist_ps
t = 2.5339, df = 39.806, p-value = 0.01532
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.1077597 0.9577444
sample estimates:
mean of x mean of y
4.120054 3.587302
print(f"Reseller mean PS: {res_ps.mean():.3f} | Distributor mean PS: {dist_ps.mean():.3f}")
Reseller mean PS: 4.120 | Distributor mean PS: 3.587
Interpretation (Hypothesis 1): The Shapiro-Wilk tests confirm approximate normality for both groups. Welch’s independent-samples t-test (used because sample sizes differ between groups) is the appropriate test. If p < 0.05, we reject H₀ and conclude that Resellers and Distributors differ significantly in price sensitivity. The Cohen’s d effect size quantifies the practical magnitude. Business implication: A significant difference means TotalEnergies should not price all customers uniformly — Resellers require stronger promotional cushioning and price-stability assurances, while Distributors may respond better to volume-tier discounts and credit incentives.
7.2 Hypothesis 2 — Competitor Influence and Brand Switching (ANOVA by Years in Business)
Business question: Does competitor influence on brand switching differ across distributor tenure groups?
Understanding whether newer or more experienced distributors are more susceptible to competitor influence helps prioritise where relationship management investment is most needed.
\[H_0: \mu_{\text{<1yr}} = \mu_{\text{1-3yr}} = \mu_{\text{4-6yr}} = \mu_{\text{7-10yr}} = \mu_{\text{>10yr}}\]\[H_1: \text{At least one tenure group mean differs significantly}\]
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = Brand_Switching_Score ~ Years_f, data = df_anova)
$Years_f
diff lwr upr p adj
1-3 years-Less that 1 year -0.07962963 -1.0213859 0.86212667 0.9993129
4-6 years-Less that 1 year -0.28787879 -1.2433441 0.66758648 0.9182450
7-10 years-Less that 1 year -0.06034483 -0.9979015 0.87721187 0.9997667
Above 10 years-Less that 1 year -0.52583333 -1.4886203 0.43695361 0.5533054
4-6 years-1-3 years -0.20824916 -0.7131131 0.29661482 0.7812927
7-10 years-1-3 years 0.01928480 -0.4508070 0.48937663 0.9999611
Above 10 years-1-3 years -0.44620370 -0.9647907 0.07238328 0.1263334
7-10 years-4-6 years 0.22753396 -0.2694522 0.72452014 0.7085056
Above 10 years-4-6 years -0.23795455 -0.7810396 0.30513054 0.7409831
Above 10 years-7-10 years -0.46548851 -0.9764093 0.04543233 0.0918470
Code
from scipy import statsimport pandas as pdorder = ["Less that 1 year","1-3 years","4-6 years","7-10 years","Above 10 years"]groups = [df_py[df_py['Years_Business']==yr]['Brand_Switching_Score'].dropna()for yr in order]f_stat, p_val = stats.f_oneway(*groups)print(f"One-way ANOVA: F = {f_stat:.4f}, p = {p_val:.4f}")
One-way ANOVA: F = 2.1312, p = 0.0827
Code
# Eta-squaredgrand_mean = df_py['Brand_Switching_Score'].mean()ss_between =sum(len(g)*(g.mean()-grand_mean)**2for g in groups)ss_total =sum((df_py['Brand_Switching_Score'].dropna() - grand_mean)**2)eta_sq = ss_between / ss_totalprint(f"Eta-squared (η²): {eta_sq:.4f}")
Eta-squared (η²): 0.0808
Code
print("\nGroup means:")
Group means:
Code
for yr, g inzip(order, groups):print(f" {yr}: mean = {g.mean():.3f}, n = {len(g)}")
Less that 1 year: mean = 3.750, n = 4
1-3 years: mean = 3.670, n = 27
4-6 years: mean = 3.462, n = 22
7-10 years: mean = 3.690, n = 29
Above 10 years: mean = 3.224, n = 20
Interpretation (Hypothesis 2): The one-way ANOVA tests whether mean brand-switching scores differ across the five tenure categories. The η² (eta-squared) effect size indicates the proportion of variance in brand switching explained by tenure. Business implication: If tenure groups differ significantly in brand-switching susceptibility, TotalEnergies should allocate its highest-intensity distributor engagement resources to the most vulnerable tenure cohort — typically the early-stage (1–3 years) distributors who have not yet developed deep brand loyalty and are most actively evaluating their supplier options.
print(f" r = {r_partial:.3f}, p = {p_partial:.4f}")
r = 0.041, p = 0.6839
Correlation Interpretation
The Spearman correlation heatmap reveals the following patterns across the composite constructs and selected individual items. Readers should refer to the heatmap above for the full matrix; the three commercially most important relationships are discussed below.
Three strongest correlations and their business implications:
Competitor Influence Score ↔︎ Brand Switching Score: This pair produces the strongest correlation among the four composite constructs. Distributors who actively monitor competitors, respond to competitor promotions, and seek better deals are substantially more likely to exhibit brand-switching behaviour. Business implication: Competitor-aware distributors are the highest churn risk. TotalEnergies should prioritise counter-promotion outreach to this group — particularly during periods of known competitor activity — before switching decisions crystallise.
Price Sensitivity Score ↔︎ Competitor Influence Score: Price sensitivity and competitor responsiveness are moderately correlated, confirming that these two constructs co-occur in the same distributor profiles. Distributors who are highly price-sensitive also tend to be the most actively engaged with competitor offerings. Business implication: A single distributor segment — high price sensitivity plus high competitor awareness — concentrates most of the commercial risk. Targeting this group with bespoke pricing assurances and exclusive deal access addresses both dimensions simultaneously.
Individual switching items (BS1, BS5) ↔︎ Price Sensitivity items (PS2, PS3): Among individual Likert items, the items measuring active price comparison and deal-seeking show the strongest co-movement with brand-switching items. This granular finding reinforces that the switching risk is specifically triggered by price comparison behaviour, not by passive price awareness alone. Business implication: Distributors who report frequently comparing prices and actively seeking better deals are the immediate priority for retention interventions.
Partial correlation result and interpretation: The partial correlation between Price Sensitivity Score and Brand Switching Score, after controlling for Competitor Influence Score, is r = 0.041 (p = 0.686) — not statistically significant. This is an important finding: it means that the bivariate relationship between price sensitivity and brand switching is substantially explained by their shared association with competitor influence. In other words, competitor influence is the key mediating variable — price sensitivity alone does not drive switching; rather, it is when competitor pressure activates price-conscious distributors that switching behaviour emerges. Business implication: The most effective retention lever is reducing competitor influence (through exclusive deals, supply reliability, and relationship quality), not simply matching prices. Matching competitor prices without reducing competitor visibility will have limited effect on switching.
9. Linear Regression
The outcome variable is Price Sensitivity Score (composite mean of 7 Likert items). Predictors are Competitor Influence Score, Brand Switching Score, Purchase Volume Score, and two demographic dummy variables (Business Type and Years in Business recoded as ordinal numeric).
The OLS regression model (F(5,92) = 10.4, p < 0.001, Adjusted R² = 0.326) predicts Price Sensitivity Score from five predictors. The model explains approximately 33% of the variance in price sensitivity — a meaningful result for survey-based behavioural data. Two predictors are statistically significant; three are not. The table below interprets all five coefficients for a non-technical manager:
Predictor
Actual Result
Business Interpretation
Competitor Influence Score
β = +0.517, p < 0.001 ✅ Significant
The strongest driver of price sensitivity in the model. Every 1-point increase in competitor-awareness is associated with a 0.52-point rise in price sensitivity score. Action: Proactively manage competitor-aware distributors with advance promotion alerts, price-match assurances, and dedicated account attention before competitor campaigns hit the market.
Business Type (Reseller = 1)
β = +0.493, p = 0.005 ✅ Significant
After controlling for all other factors, Resellers score 0.49 points higher on price sensitivity than Distributors — confirming the hypothesis test result. Action: Apply distinct commercial strategies by business type. Resellers need visible short-term price promotions; Distributors respond better to volume-tier rebates and credit terms.
Brand Switching Score
β = −0.080, p = 0.595 ❌ Not significant
Once competitor influence is controlled for, brand switching does not independently predict price sensitivity. This is consistent with the partial correlation finding — switching behaviour is driven by competitor influence, not price sensitivity acting alone. Action: Focus retention efforts on reducing competitor influence rather than treating brand switching as a separate problem.
Purchase Volume Score
β = +0.104, p = 0.403 ❌ Not significant
Purchase volume stability does not significantly predict price sensitivity after accounting for other factors. Action: No immediate intervention implied by this variable alone; however, building volume habits remains a long-term loyalty strategy.
Years in Business
β = −0.106, p = 0.116 ❌ Not significant (trend)
Tenure shows a negative directional trend — more experienced distributors are marginally less price-sensitive — but the effect does not reach significance in this sample. Action: Monitor this relationship with a larger sample; the direction supports investing in early-stage distributor loyalty programmes as a preventive measure.
Key model insight: Competitor Influence Score and Business Type (Reseller) are the only two statistically significant predictors of price sensitivity. This narrows the commercial priority sharply: the most effective intervention is reducing competitor influence among Resellers, who are both the most price-sensitive segment and the most exposed to competitor activity.
Model diagnostics: VIF values for all predictors are below 2.0, confirming no multicollinearity concern. The Residuals vs Fitted and Q-Q diagnostic plots should be inspected for linearity and normality of residuals respectively.
10. Integrated Findings
The five analyses converge on a single, coherent commercial picture:
The TotalEnergies lubricants distributor base is moderately to highly price-sensitive, and price sensitivity is the primary gateway through which competitor activity converts into brand switching and potential revenue loss.
EDA established that all four behavioural constructs cluster above the scale midpoint, confirming that the distributor base is commercially active, price-aware, and competitively engaged. Resellers form the larger and more price-sensitive segment.
Visualisation revealed a clear positive linear relationship between price sensitivity and brand switching — the central commercial risk — and identified that mid-tenure distributors face the broadest exposure to competitor influence.
Hypothesis testing confirmed that Resellers and Distributors differ significantly in price sensitivity, and that brand-switching tendency varies across distributor tenure cohorts, validating the need for differentiated customer engagement strategies.
Correlation analysis demonstrated that competitor influence is the primary correlate of brand switching, and that the bivariate relationship between price sensitivity and brand switching is largely mediated by competitor influence — confirmed by the non-significant partial correlation (r = 0.041, p = 0.686) once competitor influence is controlled. This reframes the commercial problem: it is competitor exposure, not price sensitivity alone, that triggers switching.
Regression quantified that only two predictors independently drive price sensitivity: Competitor Influence Score (β = +0.517, p < 0.001) and Business Type — Reseller (β = +0.493, p = 0.005). Brand switching, purchase volume, and tenure do not reach significance in the multivariate model, reinforcing that competitor influence is the central lever to manage.
Unified Recommendation: TotalEnergies Marketing Nigeria PLC should implement a Risk-Stratified Distributor Retention Programme with three components: 1. Segment by risk score: Combine Price Sensitivity, Competitor Influence, and Brand Switching scores into a composite churn-risk index. Flag distributors in the top quartile for immediate account manager attention. 2. Differentiate by business type: Resellers need promotional price visibility and frequency-based loyalty incentives. Distributors respond better to volume-tier rebates, credit flexibility, and supply reliability guarantees. 3. Invest in tenure development: New distributors (under 3 years) exhibit the widest variance in competitor susceptibility. Structured onboarding, dedicated account management, and early-stage loyalty incentives should be deployed to compress this vulnerability window.
11. Limitations & Further Work
Limitations:
Self-report bias: All variables are based on self-reported Likert responses. Respondents may over- or under-state price sensitivity and brand-switching behaviour, particularly if they perceive social desirability pressure toward loyalty.
Cross-sectional design: The survey captures behaviour at a single point in time (Q1 2026). Seasonal effects — post-holiday demand recovery, budget cycles — may inflate or deflate the measured constructs relative to other quarters.
Sample representativeness: Convenience and purposive sampling restricts the generalisability of findings to the Rivers State territory. Distributors in other regions (Lagos, Kano, Abuja) may exhibit different price sensitivity and switching profiles.
Absence of transaction data: All constructs are perceptual. Actual SAP transaction data showing real price elasticity, order frequency, and revenue per distributor would provide harder evidence for the relationships identified here.
No time dimension: The study lacks a time-series component, making it impossible to observe whether price sensitivity has increased or decreased over time, or to forecast future purchase behaviour.
Further Work:
Future studies should integrate SAP/ERP transaction records to compute objective price elasticity coefficients and real purchase volumes. Longitudinal survey designs (quarterly panels) would capture behavioural change over time. Machine learning classification models (Random Forest, XGBoost) could be applied to predict brand-switching probability at the individual distributor level using a richer feature set. Geographic expansion of the sample frame to multiple TotalEnergies territories would improve external validity. Additionally, incorporating macroeconomic covariates — crude oil prices, exchange rate fluctuations, and inflation indices — would contextualise the observed price sensitivity within Nigeria’s broader economic environment.
References
Adi, B. (2026). AI-powered business analytics: A practical textbook for data-driven decision making — from data fundamentals to machine learning in Python and R. Lagos Business School / markanalytics.online. https://markanalytics.online
Field, A. (2018). Discovering statistics using IBM SPSS statistics (5th ed.). Sage Publications.
Hair, J. F., Black, W. C., Babin, B. J., & Anderson, R. E. (2019). Multivariate data analysis (8th ed.). Cengage Learning.
Montgomery, D. C., Peck, E. A., & Vining, G. G. (2021). Introduction to linear regression analysis (6th ed.). Wiley.
Onyenwosa, V. I. (2026). Lubricants distributor purchase behaviour survey dataset [Dataset]. Collected from TotalEnergies Marketing Nigeria PLC sales territory, Rivers State, Nigeria. Data available on request from the author.
R Core Team. (2024). R: A language and environment for statistical computing (Version 4.4). R Foundation for Statistical Computing. https://www.R-project.org/
Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L., François, R., Grolemund, G., Hayes, A., Henry, L., Hester, J., Kuhn, M., Pedersen, T. L., Miller, E., Bache, S. M., Müller, K., Ooms, J., Robinson, D., Seidel, D. P., Spinu, V., … Yutani, H. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686
Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Springer. https://doi.org/10.1007/978-3-319-24277-4
McKinney, W. (2010). Data structures for statistical computing in Python. In Proceedings of the 9th Python in Science Conference (pp. 56–61). https://doi.org/10.25080/Majora-92bf1922-00a
Appendix: AI Usage Statement
Claude (Anthropic) was used to assist with structuring the Quarto document template, writing initial R and Python code scaffolds for the five analytical techniques, and formatting the reference list. All analytical decisions — including the selection of Spearman over Pearson correlation for Likert data, the choice of Welch’s t-test over Student’s t-test due to unequal group sizes, the decision to construct composite scores from thematic item clusters, and the interpretation of all statistical outputs in the context of TotalEnergies Marketing Nigeria PLC’s commercial operations — were made independently by the author based on knowledge acquired through the Data Analytics II course. The business recommendations reflect the author’s direct professional experience as a sales representative in the Nigerian lubricants distribution industry and were not generated by AI tools.