Week 4 Homework

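1. Which product sells the best?
2. Which customers have bought this product?
3. In which years did this product sell the most?
4. Which countries has this product been sold to?
5. Which country bought the most units of this product?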

1. Which product sells the best?

#1. Enter your code here: load the required packages
library(shiny)
library(rmarkdown)
library(tinytex)

## Warning: package 'tinytex' was built under R version 4.3.3

library(dplyr)

## Warning: package 'dplyr' was built under R version 4.3.3

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(readxl)

## Warning: package 'readxl' was built under R version 4.3.3

#2. Read the data
OrderDetails <- read_excel("C:\\data\\Northwind\\OrderDetails.xlsx")
Orders <- read_excel("C:\\data\\Northwind\\Orders.xlsx")
Products <- read_excel("C:\\data\\Northwind\\Products.xlsx")

#3. Build small lookup tables with just the columns of interest
price <- data.frame(ProductID = Products$ProductID, Price = Products$Price)
Quantity <- data.frame(ProductID = OrderDetails$ProductID, Quantity = OrderDetails$Quantity)

#4. Merge OrderDetails with Products to attach product information
#   (Products already carries Price, so merging `price` in again would only
#   create duplicate Price.x / Price.y columns)
product_sales <- merge(OrderDetails, Products, by = "ProductID")

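#5. Group by ProductID and compute the total quantity sold for each product
product_sales <- product_sales %>%
  group_by(ProductID) %>%
  summarise(TotalQuantity = sum(Quantity)) %>%
  arrange(desc(TotalQuantity))  # sort by quantity sold, descending

#6. Show the product with the highest quantity sold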
most_sold_product <- product_sales[1,]
most_sold_product

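#Enter your interpretation of this question here

This is similar to the Week 3 question about which customer is the biggest spender. The only change is that the target is product sales volume instead of customer spending; in both cases the task is to find the maximum.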
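
Taking the first row after sorting returns a single product even when several products share the highest total. The following is a minimal base R sketch of a tie-aware variant; the `order_details` table and its values are made up for illustration and are not the Northwind data.

```r
# Toy stand-in for OrderDetails (hypothetical values, not Northwind)
order_details <- data.frame(
  OrderID   = c(1, 1, 2, 3, 3),
  ProductID = c(10, 20, 10, 20, 30),
  Quantity  = c(5, 8, 3, 0, 2)
)

# Total quantity per product
totals <- aggregate(Quantity ~ ProductID, data = order_details, FUN = sum)
names(totals)[2] <- "TotalQuantity"

# Keep every product whose total equals the maximum, so ties are not lost
top_products <- totals[totals$TotalQuantity == max(totals$TotalQuantity), ]
top_products
```

In this toy table products 10 and 20 both total 8 units, so both rows are kept, whereas indexing with `[1, ]` would report only one of them.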

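2. Which customers have bought this product?

#Enter your code here
#1. The best-selling product
most_sold_product <- product_sales[1,]

#2. Get the ID of the best-selling product
most_sold_product_id <- most_sold_product$ProductID
#($ extracts the ProductID column from the data frame named most_sold_product.)

#3. Find the order lines that include this product
purchased_orders <- OrderDetails %>%
  filter(ProductID == most_sold_product_id)
#(filter keeps only the rows that satisfy the condition)

#4. Merge purchased_orders with Orders to get customer information (customer_info)
customer_info <- merge(purchased_orders, Orders, by = "OrderID")

#5. Show the IDs of the customers who bought this product (without duplicates)
customers <- sort(unique(customer_info$CustomerID))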
customers

##  [1] 17 20 25 31 34 37 39 51 63 65 71 72 86 91

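Enter your interpretation of this question here

Reusing the best-selling product found in the previous question, take that product's ID from the data frame, filter the order lines that contain it, and merge purchased_orders with Orders by OrderID. Then list the IDs of the customers who bought the product, with duplicates removed and sorted in ascending order.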
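
The filter, merge, and de-duplicate steps described above can be traced on a small example. This is only a sketch in base R; the two toy tables and the `target_id` value are hypothetical and stand in for OrderDetails, Orders, and the best-selling product's ID.

```r
# Toy stand-ins for OrderDetails and Orders (hypothetical values)
order_details <- data.frame(
  OrderID   = c(1, 2, 3, 4),
  ProductID = c(10, 10, 10, 20),
  Quantity  = c(5, 3, 2, 7)
)
orders <- data.frame(
  OrderID    = c(1, 2, 3, 4),
  CustomerID = c(7, 3, 7, 5)
)

target_id <- 10  # assumed ID of the best-selling product

# Step 1: keep only the order lines for the target product
purchased <- order_details[order_details$ProductID == target_id, ]

# Step 2: attach the customer of each order
info <- merge(purchased, orders, by = "OrderID")

# Step 3: unique customer IDs, ascending
customers <- sort(unique(info$CustomerID))
customers
```

Customer 7 appears in two matching orders but is listed once, and customer 5 is excluded because that order contains a different product.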

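3. In which years did this product sell the most?

# Enter your code here

#1. Extract the year and compute the quantity sold per year
yearly_sales <- customer_info %>%
  mutate(Year = as.integer(format(OrderDate, "%Y"))) %>%  # extract the year
  group_by(Year) %>%  # group by year
  summarise(TotalQuantity = sum(Quantity)) %>%  # total quantity sold per year
  arrange(Year)  # sort by year, ascending


#2. Result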
yearly_sales

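Enter your interpretation of this question here

Continuing from the previous question, where purchased_orders and Orders were merged by OrderID and the unique customer IDs were found, this step extracts the quantity sold in each year from the merged table. mutate creates or modifies variables in a data frame; group_by groups the rows; summarise computes summary values for each group defined by one or more grouping variables; sum adds up the values.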
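
The year extraction can be checked on a few rows. The sketch below uses base R only and made-up dates and quantities; it also sorts by quantity rather than by year, which is one possible way to read "which years sold the most" and is an assumption, not the code used above.

```r
# Toy stand-in for the merged order table (hypothetical values)
sales <- data.frame(
  OrderDate = as.Date(c("1996-07-04", "1996-12-01", "1997-01-15", "1997-03-02")),
  Quantity  = c(10, 5, 20, 1)
)

# format(..., "%Y") returns the four-digit year as text; convert it to integer
sales$Year <- as.integer(format(sales$OrderDate, "%Y"))

# Total quantity per year
yearly <- aggregate(Quantity ~ Year, data = sales, FUN = sum)

# Largest total first, so the best year is in row 1
yearly <- yearly[order(-yearly$Quantity), ]
yearly
```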

4. Which countries has this product been sold to?

#1. Enter your code here: read the Customers table
Customers <- read_excel("C:\\data\\Northwind\\Customers.xlsx") 

#2. Merge customer_info with Customers to get each customer's country
customer_con <- merge(customer_info, Customers, by = "CustomerID")

#3. Extract the country names
countries <- unique(customer_con$Country)

#4. Result
countries

## [1] "Germany" "Austria" "Brazil"  "Ireland" "Canada"  "USA"     "UK"     
## [8] "Poland"

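Enter your interpretation of this question here

Add the Customers data and, building on the earlier merge of purchased_orders and Orders by OrderID, extract the countries that bought this product.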
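
By default `merge` keeps only rows whose key appears in both tables, so an order whose CustomerID is missing from the customer table would silently drop out of the country list. A small base R sketch of how one might check for that, using hypothetical toy tables:

```r
# Toy stand-ins (hypothetical values); customer 3 has no row in `customers`
customer_info <- data.frame(
  CustomerID = c(1, 2, 3),
  Quantity   = c(10, 4, 6)
)
customers <- data.frame(
  CustomerID = c(1, 2),
  Country    = c("Germany", "USA")
)

# Default merge: rows without a match are dropped
inner <- merge(customer_info, customers, by = "CustomerID")

# all.x = TRUE keeps every row of the first table and fills Country with NA
left <- merge(customer_info, customers, by = "CustomerID", all.x = TRUE)

# Number of order rows that found no matching customer
unmatched <- sum(is.na(left$Country))
unmatched
```

If `unmatched` is zero on the real tables, the default merge and the `all.x = TRUE` merge give the same set of countries.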

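5. Which country bought the most units of this product?

# Enter your code here
#1. Group by country and compute the total quantity bought by each country
country_sales <- customer_con %>%
  group_by(Country) %>%  # group by country
  summarise(TotalQuantity = sum(Quantity)) %>%  # total quantity bought per country
  arrange(desc(TotalQuantity))  # sort by quantity, descending

#2. Find the country that bought the most
top_country <- country_sales[1,]

#3. Result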
top_country

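Enter your interpretation of this question here

customer_info comes from merging purchased_orders and Orders by OrderID, and merging it with Customers adds each customer's country. Group by country, compute the total quantity bought, sort in descending order, and take the country that bought the most.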
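
The per-country totals can also be put in context by showing each country's share of all units sold. The following base R sketch uses `tapply` on a hypothetical toy table; the country names and quantities are made up.

```r
# Toy stand-in for the merged table with countries (hypothetical values)
customer_con <- data.frame(
  Country  = c("Germany", "USA", "Germany", "UK"),
  Quantity = c(10, 25, 20, 5)
)

# Sum Quantity within each country
by_country <- tapply(customer_con$Quantity, customer_con$Country, sum)

# Largest total first
by_country <- by_country[order(by_country, decreasing = TRUE)]

# Name of the top country and each country's share of all units
top_country <- names(by_country)[1]
share <- by_country / sum(by_country)

top_country
share
```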