DT2 ASSIGNMENT II

Author

UZOMA

# Data-based claim

# Claim

This analysis examines the assumption that customer satisfaction causally drives sales outcomes. The idea is widely accepted in business and marketing, where higher customer satisfaction is often associated with greater customer loyalty and repeat purchasing behaviour. As a result, many organisations treat customer satisfaction as a key performance indicator and expect improvements in satisfaction to lead directly to higher sales.

However, this assumption may oversimplify a more complex relationship. Much of the supporting evidence rests on correlation rather than demonstrated causation: satisfaction and sales may move in the same direction without satisfaction alone driving sales performance.

There are also limitations related to data quality and measurement. Customer satisfaction is typically collected through surveys, which are subjective and can be influenced by temporary emotions or external circumstances. In addition, datasets often omit other important factors, such as pricing strategies, marketing activities, economic conditions, competition, and customer purchasing power, all of which can significantly affect sales.

Furthermore, the claim assumes that the relationship between satisfaction and sales is consistent across all contexts. In reality, it may vary by customer segment, product type, and external conditions. Therefore, although customer satisfaction can influence sales, it is unlikely to be a consistent or sole determinant of sales outcomes.

# Data Selection

The dataset used in this analysis is the “Sales and Satisfaction” dataset obtained from Kaggle. For this assignment, a subset of the dataset (the first 20 observations) was used to explore the relationship between customer satisfaction and sales.

This dataset is appropriate because it contains both sales and customer satisfaction values measured before and after a given period or intervention. This allows changes in both variables to be calculated, which gives a more meaningful way to assess whether increases in satisfaction are accompanied by increases in sales. The dataset also includes customer segments (High Value, Medium Value, Low Value), which allows further exploration of how different groups respond.

However, there are some limitations. The dataset contains missing values and inconsistencies, which require cleaning before analysis. It is also synthetic, meaning it may not fully represent real-world customer behaviour, and it does not include other important variables, such as marketing activity, pricing, or economic conditions, which may influence sales outcomes. Overall, while the dataset is suitable for exploring the relationship between satisfaction and sales, the findings should be interpreted with caution.
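The subsetting and cleaning steps described above could look roughly like the sketch below. It is a minimal illustration rather than the exact procedure used for this assignment: the file name `sales_and_satisfaction.csv` is hypothetical, the small inline data frame is a stand-in with made-up values, and the order of the steps (take the first rows, then drop incomplete ones) is an assumption.

```r
# Hypothetical: in practice the Kaggle file would be read from disk, e.g.
# raw <- read.csv("sales_and_satisfaction.csv")  # file name is an assumption

# Toy stand-in with illustrative values only (not taken from the dataset)
raw <- data.frame(
  Sales_Before = c(200, NA, 150, 180, 210, 190),
  Sales_After = c(250, 300, 170, NA, 260, 240),
  Customer_Satisfaction_Before = c(70, 80, 75, 60, 90, 65),
  Customer_Satisfaction_After = c(75, 85, 65, 62, 88, 70)
)

# Keep the first n observations (the assignment used n = 20; 5 here for the toy data)
n <- 5
subset_raw <- head(raw, n)

# Count the missing values in each column before cleaning
missing_per_column <- colSums(is.na(subset_raw))
print(missing_per_column)

# Drop every row that has at least one missing value
subset_clean <- subset_raw[complete.cases(subset_raw), ]
print(subset_clean)
```

Dropping incomplete rows is only one possible cleaning choice. It shrinks an already small sample, so the number of rows removed is worth reporting alongside the results.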

# Create sample data
data <- data.frame(
  Group = c("Control", "Treatment", "Control", "Control", "Control", "Treatment", "Control", "Control", "Control", "Treatment",
            "Control", "Control", "Control", "Control", "Treatment", "Control", "Treatment", "Treatment", "Treatment", "Control"),
  Customer_Segment = c("High Value", "High Value", "High Value", "Medium Value", "High Value", "Low Value", "High Value", "Low Value", "High Value", "High Value",
                       "Low Value", "High Value", "High Value", "High Value", "Medium Value", "Low Value", "High Value", "Low Value", "Low Value", "Low Value"),
  Sales_Before = c(240.5, 246.8, 156.9, 192.2, 229.6, 135.5, 191.7, 173.7, 208.3, 235.0,
                   139.9, 270.9, 211.8, 217.7, 173.1, 188.3, 306.7, 164.6, 151.7, 136.6),
  Sales_After = c(300.0, 381.3, 179.3, 229.2, 270.1, 218.5, 222.4, 213.6, 248.1, 352.7,
                  170.0, 333.0, 254.4, 259.9, 284.9, 232.5, 485.1, 242.3, 231.6, 162.7),
  # Third value corrected from "98,7" to 98.7 so that every column has 20 values
  Customer_Satisfaction_Before = c(74.6, 100.0, 98.7, 49.3, 83.9, 58.0, 89.9, 66.9, 95.3, 72.9,
                                   59.8, 74.3, 80.5, 100.0, 81.9, 53.4, 77.4, 50.5, 51.6, 50.2),
  Customer_Satisfaction_After = c(74.0, 100.0, 100.0, 39.8, 87.7, 69.4, 85.1, 67.8, 84.7, 70.7,
                                  51.8, 67.9, 87.6, 100.0, 90.3, 63.4, 78.3, 63.4, 55.7, 46.9),
  Purchase_Made = c("No", "Yes", "No", "Yes", "Yes", "No", "Yes", "No", "Yes", "No",
                    "Yes", "No", "Yes", "Yes", "No", "Yes", "Yes", "No", "Yes", "Yes")
)

library(tidyverse)

# Compute the before/after change in sales and in satisfaction for each observation
data_clean <- data %>%
  mutate(
    Sales_Change = Sales_After - Sales_Before,
    Satisfaction_Change = Customer_Satisfaction_After - Customer_Satisfaction_Before
  )

data_clean

ggplot(data_clean, aes(x = Satisfaction_Change, y = Sales_Change)) +
  geom_point(alpha = 0.6) +
  geom_smooth(method = "lm") +
  labs(
    title = "Change in Customer Satisfaction vs Change in Sales",
    x = "Change in Satisfaction (0–100 scale)",
    y = "Change in Sales"
  ) +
  theme_minimal()

ggplot(data_clean, aes(x = Customer_Segment, y = Sales_Change)) +
  geom_boxplot() +
  labs(
    title = "Sales Change by Customer Segment",
    x = "Customer Segment",
    y = "Sales Change"
  ) +
  theme_minimal()

# Data Story

This analysis investigates the assumption of a causal relationship between customer satisfaction and sales outcomes. The claim suggests that higher customer satisfaction consistently leads to increased sales, implying a direct and stable relationship between the two variables. To evaluate this, a subset of the dataset containing customer satisfaction and sales values before and after a given period was analysed.

To provide a more meaningful assessment, changes in both customer satisfaction and sales were calculated for each observation. This shows how increases or decreases in satisfaction relate to changes in sales, rather than relying solely on absolute values.

The findings indicate that there is some evidence of a positive relationship between customer satisfaction and sales. In several instances, increases in satisfaction are associated with increases in sales, supporting the general notion that improved customer experience can contribute to better business outcomes.

However, the analysis also highlights notable inconsistencies that challenge the claim. There are multiple cases where sales increase despite a decrease in customer satisfaction, as well as situations where improvements in satisfaction do not result in significant changes in sales. This variability demonstrates that the relationship between the two variables is not consistent.
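One way to put numbers on the pattern described above is to compute the correlation between the two change scores and to count how often they move in opposite directions. The sketch below uses base R only and assumes the 20 rows listed in this document, with the third "before" satisfaction value read as 98.7. The variable names and the `direction()` helper are illustrative, not part of the original analysis.

```r
# Values copied from the sample data in this document
# (assumption: the third satisfaction value is 98.7)
sales_before <- c(240.5, 246.8, 156.9, 192.2, 229.6, 135.5, 191.7, 173.7, 208.3, 235.0,
                  139.9, 270.9, 211.8, 217.7, 173.1, 188.3, 306.7, 164.6, 151.7, 136.6)
sales_after <- c(300.0, 381.3, 179.3, 229.2, 270.1, 218.5, 222.4, 213.6, 248.1, 352.7,
                 170.0, 333.0, 254.4, 259.9, 284.9, 232.5, 485.1, 242.3, 231.6, 162.7)
sat_before <- c(74.6, 100.0, 98.7, 49.3, 83.9, 58.0, 89.9, 66.9, 95.3, 72.9,
                59.8, 74.3, 80.5, 100.0, 81.9, 53.4, 77.4, 50.5, 51.6, 50.2)
sat_after <- c(74.0, 100.0, 100.0, 39.8, 87.7, 69.4, 85.1, 67.8, 84.7, 70.7,
               51.8, 67.9, 87.6, 100.0, 90.3, 63.4, 78.3, 63.4, 55.7, 46.9)

sales_change <- sales_after - sales_before
sat_change <- sat_after - sat_before

# Pearson correlation between the change scores, with a confidence interval
ct <- cor.test(sat_change, sales_change)
print(ct$estimate)
print(ct$conf.int)

# Hypothetical helper: label each change as up, down, or flat
direction <- function(x) ifelse(x > 0, "up", ifelse(x < 0, "down", "flat"))

# Cross-tabulate the direction of the satisfaction change against the sales change
direction_counts <- table(Satisfaction = direction(sat_change),
                          Sales = direction(sales_change))
print(direction_counts)

# Observations where sales rose although satisfaction fell
n_discordant <- sum(sat_change < 0 & sales_change > 0)
print(n_discordant)
```

With only 20 observations the confidence interval around the correlation will be wide, so the estimate describes this subset rather than supporting a firm conclusion.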

These results suggest that customer satisfaction alone is not a sufficient predictor of sales performance. Other factors, such as pricing strategies, marketing efforts, customer preferences, and external economic conditions, are likely to influence sales outcomes. Additionally, differences observed across customer segments indicate that the impact of satisfaction may vary depending on the type of customer.
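The segment differences mentioned above can be summarised numerically as well as with a boxplot. The following base R sketch assumes the same 20 rows listed in this document (third "before" satisfaction value read as 98.7); the data frame `df` and its column names are illustrative.

```r
# Values copied from the sample data in this document
segment <- c("High Value", "High Value", "High Value", "Medium Value", "High Value",
             "Low Value", "High Value", "Low Value", "High Value", "High Value",
             "Low Value", "High Value", "High Value", "High Value", "Medium Value",
             "Low Value", "High Value", "Low Value", "Low Value", "Low Value")
sales_before <- c(240.5, 246.8, 156.9, 192.2, 229.6, 135.5, 191.7, 173.7, 208.3, 235.0,
                  139.9, 270.9, 211.8, 217.7, 173.1, 188.3, 306.7, 164.6, 151.7, 136.6)
sales_after <- c(300.0, 381.3, 179.3, 229.2, 270.1, 218.5, 222.4, 213.6, 248.1, 352.7,
                 170.0, 333.0, 254.4, 259.9, 284.9, 232.5, 485.1, 242.3, 231.6, 162.7)
sat_before <- c(74.6, 100.0, 98.7, 49.3, 83.9, 58.0, 89.9, 66.9, 95.3, 72.9,
                59.8, 74.3, 80.5, 100.0, 81.9, 53.4, 77.4, 50.5, 51.6, 50.2)
sat_after <- c(74.0, 100.0, 100.0, 39.8, 87.7, 69.4, 85.1, 67.8, 84.7, 70.7,
               51.8, 67.9, 87.6, 100.0, 90.3, 63.4, 78.3, 63.4, 55.7, 46.9)

df <- data.frame(
  segment = segment,
  sales_change = sales_after - sales_before,
  sat_change = sat_after - sat_before
)

# Number of observations in each segment
segment_n <- table(df$segment)
print(segment_n)

# Mean change in sales and in satisfaction within each segment
segment_means <- aggregate(cbind(sales_change, sat_change) ~ segment,
                           data = df, FUN = mean)
print(segment_means)
```

The Medium Value segment has only two rows in this subset, so any comparison across segments here is indicative at best.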

It is also important to acknowledge limitations within the dataset. The data is synthetic and contains missing values, which required cleaning prior to analysis. These factors may affect the reliability and generalisability of the findings.

In conclusion, while customer satisfaction appears to have some influence on sales, the results do not support the assumption that it always leads to increased sales. The relationship is more complex and influenced by multiple interacting factors, meaning that customer satisfaction should be considered as one of several drivers of sales rather than a guaranteed determinant.