import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv(r"C:\Users\LJ\OneDrive - Loyola University Maryland\Desktop\index_1.csv")
df['date'] = pd.to_datetime(df['date'])

Introduction

This report explores a dataset of coffee shop sales transactions using Python code embedded in R Markdown. Visualizations are organized into tabs, and the source code is hidden by default but can be revealed if desired.

Descriptive Statistics

Summary Stats

df.describe()
##                                 date        money
## count                           3636  3636.000000
## mean   2024-09-30 11:56:02.376237568    31.746859
## min              2024-03-01 00:00:00    18.120000
## 25%              2024-07-03 00:00:00    27.920000
## 50%              2024-10-06 12:00:00    32.820000
## 75%              2025-01-08 00:00:00    35.760000
## max              2025-03-23 00:00:00    40.000000
## std                              NaN     4.919926
{
    "Unique Customers": df['card'].nunique(),
    "Unique Coffee Types": df['coffee_name'].nunique()
}
## {'Unique Customers': 1316, 'Unique Coffee Types': 8}

Visualizations

1. Total Sales Over Time (Line Chart)

This chart shows the total revenue earned per day from March 2024 through March 2025 and highlights how business activity fluctuates over time. Peaks may correspond to busier days of the week or successful promotional events, and troughs to quieter days, holidays, or low customer turnout. The chart alone does not show which of these explanations applies.

sales_over_time = df.groupby('date')['money'].sum().reset_index()
plt.figure(figsize=(10, 5))
sns.lineplot(data=sales_over_time, x='date', y='money')
plt.title('Total Sales Over Time')
plt.xlabel('Date')
plt.ylabel('Sales ($)')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
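
Whether the peaks fall on weekdays or weekends can be checked rather than guessed. The sketch below is one possible way to do that, assuming the `date` (already converted to datetime) and `money` columns used above. The helper name `sales_by_weekday` and its output column names are illustrative and not part of the original report.

```python
import pandas as pd

WEEKDAY_ORDER = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def sales_by_weekday(df: pd.DataFrame) -> pd.DataFrame:
    """Average daily revenue per day of the week (hypothetical helper)."""
    # Total revenue per calendar day, the same aggregation as the line chart.
    daily = df.groupby("date")["money"].sum()
    # Group the daily totals by weekday name; requires a datetime 'date' column.
    by_weekday = daily.groupby(daily.index.day_name()).agg(["mean", "count"])
    by_weekday = by_weekday.rename(
        columns={"mean": "avg_daily_sales", "count": "days"}
    )
    # Put the rows in calendar order; weekdays absent from the data become NaN.
    return by_weekday.reindex(WEEKDAY_ORDER)
```

Calling `sales_by_weekday(df)` on the report's data frame would give one row per weekday. Comparing the Saturday and Sunday rows with the others indicates which reading of the peaks the data supports.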


2. Payment Method Distribution (Bar Chart)

This chart compares the number of transactions made by cash and card. It reveals a clear preference among customers for card payments, which suggests a need for reliable card processing systems and potentially a contactless or mobile payment option.

plt.figure(figsize=(6, 4))
sns.countplot(data=df, x='cash_type')
plt.title('Payment Method Distribution')
plt.xlabel('Payment Type')
plt.ylabel('Count')
plt.tight_layout()
plt.show()
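
The size of the card preference can be stated as a number alongside the chart. The following is a minimal sketch, assuming the `cash_type` column used above; the function name `payment_share` and the output column names are made up for illustration.

```python
import pandas as pd


def payment_share(df: pd.DataFrame) -> pd.DataFrame:
    """Transaction count and percentage share per payment type (hypothetical helper)."""
    # Raw number of transactions per payment type, as plotted by countplot.
    counts = df["cash_type"].value_counts()
    # The same counts expressed as a percentage of all transactions.
    share = df["cash_type"].value_counts(normalize=True).mul(100).round(1)
    return pd.DataFrame({"transactions": counts, "share_pct": share})
```

`payment_share(df)` would return one row per payment type, so the card share can be quoted directly in the text.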


3. Top 5 Coffee Types (Pie Chart)

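This pie chart displays the relative shares of the five most frequently ordered coffee types out of the eight on the menu. The percentages are shares of top-five orders only, not of all transactions. It offers insight into the most popular menu items: Americano with Milk and Latte together make up 54% of top-five orders. The business could focus marketing efforts on these top drinks or explore product bundles involving them.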

top = df['coffee_name'].value_counts().nlargest(5)
plt.figure(figsize=(6, 6))
plt.pie(top, labels=top.index, autopct='%1.1f%%')
## Share of top-five orders: Americano with Milk 27.7%, Latte 26.3%, Americano 19.4%, Cappuccino 16.8%, Cortado 9.8%
plt.title('Top 5 Coffee Types')
plt.tight_layout()
plt.show()
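
Because the pie chart is drawn from the top five types only, its percentages overstate each drink's share of total orders. The sketch below, which assumes the `coffee_name` column used above, puts both views side by side. The name `top_n_share` and its column labels are illustrative assumptions.

```python
import pandas as pd


def top_n_share(df: pd.DataFrame, n: int = 5) -> pd.DataFrame:
    """Order counts for the n most popular drinks with two percentage bases."""
    # Orders per coffee type across the whole dataset.
    counts = df["coffee_name"].value_counts()
    top = counts.nlargest(n)
    return pd.DataFrame({
        "orders": top,
        # Share within the top n, which is what the pie chart displays.
        "pct_of_top_n": (top / top.sum() * 100).round(1),
        # Share of every transaction, including drinks outside the top n.
        "pct_of_all": (top / counts.sum() * 100).round(1),
    })
```

With `top_n_share(df)`, the `pct_of_top_n` column should reproduce the pie chart labels, while `pct_of_all` shows how much of total demand those drinks represent.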


4. Spending by Payment Method (Box Plot)

This chart compares how much customers typically spend depending on whether they pay by cash or card. The box plot shows that card payments generally involve slightly higher amounts, possibly indicating larger or more customized orders. Any high-end outliers are modest, since no transaction in the dataset exceeds $40.00.

plt.figure(figsize=(7, 5))
sns.boxplot(data=df, x='cash_type', y='money')
plt.title('Spending by Payment Method')
plt.xlabel('Payment Type')
plt.ylabel('Amount ($)')
plt.tight_layout()
plt.show()
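
The comparison in the box plot can be backed by a small table. The sketch below assumes the `cash_type` and `money` columns used above and counts high outliers with the usual 1.5 x IQR rule, which matches the default whisker length of the box plot. The helper name `spending_by_payment` is hypothetical.

```python
import pandas as pd


def spending_by_payment(df: pd.DataFrame) -> pd.DataFrame:
    """Per-payment-type spending summary with a count of high outliers."""

    def high_outliers(amounts: pd.Series) -> int:
        # Points above Q3 + 1.5 * IQR, i.e. beyond the upper whisker.
        q1, q3 = amounts.quantile([0.25, 0.75])
        return int((amounts > q3 + 1.5 * (q3 - q1)).sum())

    grouped = df.groupby("cash_type")["money"]
    summary = grouped.agg(["count", "median", "mean", "max"])
    summary["high_outliers"] = grouped.apply(high_outliers)
    return summary
```

`spending_by_payment(df)` would show whether the median and mean for card payments really sit above those for cash, and how many points, if any, lie beyond the upper whisker for each group.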


5. Distribution of Transaction Amounts (Density Plot)

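This plot shows how customer spending is distributed. The peak of the curve indicates that the most common transaction amounts fall between $30 and $40. The distribution is capped at $40.00 and its tail extends toward lower amounts (the minimum is $18.12), so the skew is to the left rather than the right: the mean ($31.75) sits below the median ($32.82).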

plt.figure(figsize=(8, 5))
sns.kdeplot(df['money'], fill=True)
plt.title('Transaction Amount Density')
plt.xlabel('Amount ($)')
plt.ylabel('Density')
plt.tight_layout()
plt.show()
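
The shape of the distribution can also be summarized numerically. The following sketch assumes the `money` column used above; the function name `distribution_shape` and the default range bounds of 30 and 40 are illustrative choices.

```python
import pandas as pd


def distribution_shape(df: pd.DataFrame, low: float = 30.0, high: float = 40.0) -> dict:
    """Mean, median, sample skewness, and share of amounts within [low, high]."""
    money = df["money"]
    return {
        "mean": money.mean(),
        "median": money.median(),
        # Negative skewness means the longer tail is on the low side.
        "skewness": money.skew(),
        # Percentage of transactions with low <= amount <= high (inclusive).
        "share_in_range_pct": money.between(low, high).mean() * 100,
    }
```

A negative `skewness` from `distribution_shape(df)` would indicate a tail toward lower amounts, and `share_in_range_pct` gives the fraction of transactions in the $30 to $40 band.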


Conclusion

Summary

This analysis of coffee shop sales data reveals several trends in customer behavior and business performance. The line chart of total daily sales shows noticeable fluctuations, suggesting that certain days, possibly particular days of the week or promotional periods, drive higher revenue. The bar chart of payment methods indicates that the majority of customers prefer cards over cash, which aligns with current consumer habits and may influence future investment in contactless payment systems. The pie chart of the top five coffee types shows that a small group of products dominates customer preference: Americano with Milk and Latte alone account for 54% of top-five orders, which provides an opportunity to focus marketing and inventory on those high-demand items. The box plot of spending by payment method shows that card users tend to spend slightly more, which may be associated with larger or more customized orders, although no transaction exceeds $40.00. The density plot of transaction amounts reinforces that most purchases fall between $30 and $40, with a tail toward lower amounts.

Together, these visualizations suggest that the business has consistent customer spending behavior, with a lean toward digital payment and loyalty to certain drinks. These insights can help guide business decisions in marketing, product placement, and operational efficiency.