Data Exploration

Exercises ~ Week 2


1 Exercise 1

The following table shows sample information for three students. Each observation represents a single student and includes details such as their unique student ID, name, age, total credits completed, major field of study, and year level.

This dataset demonstrates a mixture of variable types:

  • Nominal: StudentID, Name, Major
  • Numeric: Age (continuous), CreditsCompleted (discrete)
  • Ordinal: YearLevel (Freshman → Senior)

| StudentID | Name  | Age | CreditsCompleted | Major        | YearLevel |
|-----------|-------|-----|------------------|--------------|-----------|
| S001      | Alice | 20  | 45               | Data Science | Sophomore |
| S002      | Budi  | 21  | 60               | Mathematics  | Junior    |
| S003      | Citra | 19  | 30               | Statistics   | Freshman  |

# 1. Create vectors for each variable
StudentID <- c("S001", "S002", "S003")       # Nominal / ID
Name <- c("Alice", "Budi", "Citra")          # Nominal / Name
Age <- c(20, 21, 19)                         # Numeric / Continuous
CreditsCompleted <- c(45, 60, 30)            # Numeric / Discrete

# Nominal
Major <- c("Data Science", "Mathematics", "Statistics")

# Ordinal
YearLevel <- factor(c("Sophomore", "Junior", "Freshman"),
                    levels = c("Freshman", "Sophomore", "Junior", "Senior"),
                    ordered = TRUE)

# 2. Combine all vectors into a data frame
students <- data.frame(
  StudentID, Name, Age, CreditsCompleted, Major, YearLevel,
  stringsAsFactors = FALSE
)

# 3. Display the data frame
print(students)
##   StudentID  Name Age CreditsCompleted        Major YearLevel
## 1      S001 Alice  20               45 Data Science Sophomore
## 2      S002  Budi  21               60  Mathematics    Junior
## 3      S003 Citra  19               30   Statistics  Freshman

2 Exercise 2

Identify Data Types: Determine the type of data for each of the following variables:

# Install knitr package if not already installed
# install.packages("knitr")
library(knitr)

# Create a data frame for Data Types
variables_info <- data.frame(
  No = 1:5,
  Variable = c(
    "Number of vehicles passing through the toll road each day",
    "Student height in cm",
    "Employee gender (Male / Female)",
    "Customer satisfaction level: Low, Medium, High",
    "Respondent's favorite color: Red, Blue, Green"
  ),
  DataType = c(
    "Quantitative",
    "Quantitative",
    "Qualitative",   # gender is a category, not a measurement
    "Qualitative",
    "Qualitative"
  ),
  Subtype = c(
    "Discrete",
    "Continuous",
    "Nominal",
    "Ordinal",
    "Nominal"
  ),
  stringsAsFactors = FALSE
)

# Display the data frame as a neat table
kable(variables_info, 
      caption = "Table of Variables and Data Types")

Table of Variables and Data Types

| No | Variable | DataType | Subtype |
|----|----------|----------|---------|
| 1 | Number of vehicles passing through the toll road each day | Quantitative | Discrete |
| 2 | Student height in cm | Quantitative | Continuous |
| 3 | Employee gender (Male / Female) | Qualitative | Nominal |
| 4 | Customer satisfaction level: Low, Medium, High | Qualitative | Ordinal |
| 5 | Respondent’s favorite color: Red, Blue, Green | Qualitative | Nominal |

3 Exercise 3

Classify Data Sources: Determine whether the following data comes from internal or external sources, and whether it is structured or unstructured:

# Install DT package if not already installed
# install.packages("DT") 
library(DT)

# Create a data frame for data sources 
data_sources <- data.frame(
  No = 1:4,
  DataSource = c(
    "Daily sales transaction data of the company",
    "Weather reports from BMKG",
    "Product reviews on social media",
    "Warehouse inventory reports"
  ),
  Internal_External = c(
    "Internal",
    "External",
    "External",
    "Internal"
  ),
  Structured_Unstructured = c(
    "Structured",
    "Structured",
    "Unstructured",
    "Structured"
  ),
  stringsAsFactors = FALSE
)

# Display the data frame as a neat table
datatable(data_sources, 
          caption = "Table of Data Sources",
          rownames = FALSE) # hides the index column

4 Exercise 4

Dataset Structure: Consider the following transaction table:

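| Date       | Qty | Price | Product  | CustomerTier |
|------------|-----|-------|----------|--------------|
| 2025-10-01 | 2   | 1000  | Laptop   | High         |
| 2025-10-01 | 5   | 20    | Mouse    | Medium       |
| 2025-10-02 | 1   | 1000  | Laptop   | Low          |
| 2025-10-02 | 3   | 30    | Keyboard | Medium       |
| 2025-10-03 | 4   | 50    | Mouse    | Medium       |
| 2025-10-03 | 2   | 1000  | Laptop   | High         |
| 2025-10-04 | 6   | 25    | Keyboard | Low          |
| 2025-10-04 | 1   | 1000  | Laptop   | High         |
| 2025-10-05 | 3   | 40    | Mouse    | Low          |
| 2025-10-05 | 5   | 10    | Keyboard | Medium       |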

Assignment instructions: recreate the transactions table above in R.

  1. Create a data frame in R called transactions containing the data above.

  2. Identify which variables are numeric and which are categorical.

  3. Calculate total revenue for each transaction by multiplying Qty × Price and add it as a new column Total.

  4. Compute summary statistics:

    • Total quantity sold for each product
    • Total revenue per product
    • Average price per product
  5. Visualize the data:

    • Create a barplot showing total quantity sold per product.
    • Create a pie chart showing the proportion of total revenue per customer tier.
  6. Optional Challenge:

    • Find which date had the highest total revenue.
    • Create a stacked bar chart showing quantity sold per product by customer tier.

Hints: Use data.frame(), aggregate(), barplot(), pie(), and basic arithmetic operations in R.

4.1 Create Data Frame

transactions <- data.frame(
  Date = as.Date(c("2025-10-01", "2025-10-01", "2025-10-02", "2025-10-02",
                   "2025-10-03", "2025-10-03", "2025-10-04", "2025-10-04",
                   "2025-10-05", "2025-10-05")),
  Qty = c(2, 5, 1, 3, 4, 2, 6, 1, 3, 5),
  Price = c(1000, 20, 1000, 30, 50, 1000, 25, 1000, 40, 10),
  Product = c("Laptop", "Mouse", "Laptop", "Keyboard", "Mouse",
              "Laptop", "Keyboard", "Laptop", "Mouse", "Keyboard"),
  CustomerTier = c("High", "Medium", "Low", "Medium", "Medium",
                   "High", "Low", "High", "Low", "Medium"),
  stringsAsFactors = FALSE
)

# View the dataset contents
kable(transactions,
      caption = "Transactions Data by Customer Tier")
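
Transactions Data by Customer Tier

| Date       | Qty | Price | Product  | CustomerTier |
|------------|-----|-------|----------|--------------|
| 2025-10-01 | 2   | 1000  | Laptop   | High         |
| 2025-10-01 | 5   | 20    | Mouse    | Medium       |
| 2025-10-02 | 1   | 1000  | Laptop   | Low          |
| 2025-10-02 | 3   | 30    | Keyboard | Medium       |
| 2025-10-03 | 4   | 50    | Mouse    | Medium       |
| 2025-10-03 | 2   | 1000  | Laptop   | High         |
| 2025-10-04 | 6   | 25    | Keyboard | Low          |
| 2025-10-04 | 1   | 1000  | Laptop   | High         |
| 2025-10-05 | 3   | 40    | Mouse    | Low          |
| 2025-10-05 | 5   | 10    | Keyboard | Medium       |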

4.2 Identify data types

str(transactions)  # Inspect each column's type (numeric vs. categorical)
## 'data.frame':    10 obs. of  5 variables:
##  $ Date        : Date, format: "2025-10-01" "2025-10-01" ...
##  $ Qty         : num  2 5 1 3 4 2 6 1 3 5
##  $ Price       : num  1000 20 1000 30 50 1000 25 1000 40 10
##  $ Product     : chr  "Laptop" "Mouse" "Laptop" "Keyboard" ...
##  $ CustomerTier: chr  "High" "Medium" "Low" "Medium" ...

From this output, Qty and Price are numeric, Product and CustomerTier are categorical (stored as character), and Date is a Date object.

4.3 Add a Total column

transactions$Total <- transactions$Qty * transactions$Price

4.4 View results

print(transactions)
##          Date Qty Price  Product CustomerTier Total
## 1  2025-10-01   2  1000   Laptop         High  2000
## 2  2025-10-01   5    20    Mouse       Medium   100
## 3  2025-10-02   1  1000   Laptop          Low  1000
## 4  2025-10-02   3    30 Keyboard       Medium    90
## 5  2025-10-03   4    50    Mouse       Medium   200
## 6  2025-10-03   2  1000   Laptop         High  2000
## 7  2025-10-04   6    25 Keyboard          Low   150
## 8  2025-10-04   1  1000   Laptop         High  1000
## 9  2025-10-05   3    40    Mouse          Low   120
## 10 2025-10-05   5    10 Keyboard       Medium    50

4.5 Summary Statistics

### Total quantity sold per product
total_qty <- aggregate(Qty ~ Product, data = transactions, sum)

### Total revenue per product
total_revenue <- aggregate(Total ~ Product, data = transactions, sum)

### Average price per product
avg_price <- aggregate(Price ~ Product, data = transactions, mean)

4.6 Show summary results

cat("\n Total Quantity per Product \n")
## 
##  Total Quantity per Product
print(total_qty)
##    Product Qty
## 1 Keyboard  14
## 2   Laptop   6
## 3    Mouse  12
cat("\n Total Revenue per Product \n")
## 
##  Total Revenue per Product
print(total_revenue)
##    Product Total
## 1 Keyboard   290
## 2   Laptop  6000
## 3    Mouse   420
cat("\n Average Price per Product \n")
## 
##  Average Price per Product
print(avg_price)
##    Product      Price
## 1 Keyboard   21.66667
## 2   Laptop 1000.00000
## 3    Mouse   36.66667

4.7 Visualization

### (a) Barplot - total quantity sold per product
barplot(
  total_qty$Qty,
  names.arg = total_qty$Product,
  main = "Total Quantity Sold per Product",
  xlab = "Product",
  ylab = "Total Quantity",
  col = c("skyblue", "lightgreen", "orange")
)

### (b) Pie chart - proportion of total revenue per Customer Tier
revenue_tier <- aggregate(Total ~ CustomerTier, data = transactions, sum)
pie(
  revenue_tier$Total,
  labels = paste(revenue_tier$CustomerTier, "-", revenue_tier$Total),
  main = "Proportion of Total Revenue per Customer Tier",
  col = c("gold", "lightblue", "tomato")
)

4.8 Optional Challenge

### (a) Date with highest total revenue
date_revenue <- aggregate(Total ~ Date, data = transactions, sum)
max_rev_date <- date_revenue[which.max(date_revenue$Total), ]
cat("\nDate with highest total revenue:\n")
## 
## Date with highest total revenue:
print(max_rev_date)
##         Date Total
## 3 2025-10-03  2200
### (b) Stacked bar chart: quantity sold per product by customer tier
qty_stack <- aggregate(Qty ~ Product + CustomerTier, data = transactions, sum)
qty_matrix <- xtabs(Qty ~ CustomerTier + Product, data = qty_stack)
barplot(
  qty_matrix,
  beside = FALSE,
  main = "Quantity Sold per Product by Customer Tier",
  xlab = "Product",
  ylab = "Quantity Sold",
  col = c("lightblue", "gold", "tomato")
)
legend("topright", legend = rownames(qty_matrix),
       fill = c("lightblue", "gold", "tomato"), title = "Customer Tier")

5 Exercise 5

Create Your Own Data Frame:

Objective: Create a data frame in R with 30 rows containing a mix of data types: continuous, discrete, nominal, and ordinal.

5.1 Instructions

  1. Open RStudio or the R console.

  2. Create a vector for each column in your data frame:

    • Date: 30 dates (can be sequential or random within a month/year)
    • Continuous: numeric values that can take decimal values (e.g., height, weight, temperature)
    • Discrete: numeric values that can only take whole numbers (e.g., number of items, number of vehicles)
    • Nominal: categorical values with no order (e.g., color, gender, city)
    • Ordinal: categorical values with a defined order (e.g., Low, Medium, High; Beginner, Intermediate, Expert)
  3. Combine all vectors into a data frame called my_data.

  4. Check your data frame using head() or View() to ensure it has 30 rows and the columns are correct.

  5. Optional tasks:

    • Summarize each column using summary()
    • Count the frequency of each category for Nominal and Ordinal columns using table()

5.2 Hints

  • Use seq.Date() or as.Date() to generate the Date column.
  • Use runif() or rnorm() for continuous numeric data.
  • Use sample() for discrete, nominal, and ordinal data.
  • Ensure the ordinal vector is created with factor(..., levels = c("Low","Medium","High"), ordered = TRUE) (or similar).

5.3 Create each column

## Date: 30 consecutive dates in October 2025
Date <- seq.Date(from = as.Date("2025-10-01"), 
                 by = "day", 
                 length.out = 30)

## Continuous: for example body temperature data (in °C), use decimals
Continuous <- round(runif(30, min = 35.5, max = 37.5), 1)

## Discrete: e.g. number of items sold (whole number)
Discrete <- sample(1:50, 30, replace = TRUE)

## Nominal: e.g. customer's city of origin (no order)
Nominal <- sample(c("Jakarta", "Bandung", "Surabaya", "Medan", "Bali"),
                  30, replace = TRUE)

## Ordinal: e.g. satisfaction level (has a natural order)
Ordinal <- factor(
  sample(c("Low", "Medium", "High"), 30, replace = TRUE),
  levels = c("Low", "Medium", "High"),
  ordered = TRUE
)

5.4 Combine all into a data frame

my_data <- data.frame(Date, Continuous, Discrete, Nominal, Ordinal)

5.5 Check the data contents

head(my_data)   # display the first 6 rows
if (interactive()) View(my_data)   # open in the RStudio viewer (optional; skipped when knitting)

5.6 (Optional) Data Summary

summary(my_data)
##       Date              Continuous       Discrete       Nominal         
##  Min.   :2025-10-01   Min.   :35.60   Min.   : 1.00   Length:30         
##  1st Qu.:2025-10-08   1st Qu.:36.00   1st Qu.:13.50   Class :character  
##  Median :2025-10-15   Median :36.65   Median :27.50   Mode  :character  
##  Mean   :2025-10-15   Mean   :36.57   Mean   :25.17                     
##  3rd Qu.:2025-10-22   3rd Qu.:37.27   3rd Qu.:33.50                     
##  Max.   :2025-10-30   Max.   :37.40   Max.   :48.00                     
##    Ordinal  
##  Low   :10  
##  Medium:15  
##  High  : 5  
##             
##             
## 

5.7 Count category frequencies

cat("\n Nominal Frequency (City) \n")
## 
##  Nominal Frequency (City)
print(table(my_data$Nominal))
## 
##     Bali  Bandung  Jakarta    Medan Surabaya 
##        8        7        2        7        6
cat("\n Ordinal Frequency (Level of Satisfaction) \n")
## 
##  Ordinal Frequency (Level of Satisfaction)
print(table(my_data$Ordinal))
## 
##    Low Medium   High 
##     10     15      5
---
title: "Data Exploration"       # Main title of the document
subtitle: "Exercises ~ Week 2"  # Subtitle or topic for week 2
author: 
- "Paskalis Farelnata Zamasi"
- "M. Fitrah Aidil Harahap"
- "Hanafi Malik Rifa'i"
- "Den Yuan Frasseka"
- "Zidhan Alfarezi Afdi"
date:  "`r format(Sys.Date(), '%B %d, %Y')`" # Auto displays the current date
output:                         # Output section defines the format and layout 
  rmdformats::readthedown:      # https://github.com/juba/rmdformats
    self_contained: true        # Embeds all resources (CSS, JS, images) 
    thumbnails: true            # Displays image thumbnails in the doc
    lightbox: true              # Enables click to enlarge images
    gallery: true               # Groups images into an interactive gallery
    number_sections: true       # Automatically numbers all sections
    lib_dir: libs               # Directory where JavaScript/CSS libraries are stored
    df_print: "paged"           # Displays data frames as interactive paged tables
    code_folding: "show"        # Allows folding/unfolding R code blocks 
    code_download: yes          # Adds a button to download all R code
---


<img id="Foto" src="C:\Users\HYPE AMD\Pictures\WhatsApp Image 2025-10-10 at 18.42.13_3e2e72f4.jpg" alt="Logo" style="width:200px; display: block; margin: auto;">

---

# Exercise 1

The following table shows sample information for three students. Each observation represents a single student and includes details such as their unique student ID, name, age, total credits completed, major field of study, and year level.  

This dataset demonstrates a mixture of variable types:  

- **Nominal:** StudentID, Name, Major  
- **Numeric:** Age (continuous), CreditsCompleted (discrete)  
- **Ordinal:** YearLevel (Freshman → Senior)  

| StudentID | Name   | Age | CreditsCompleted | Major        | YearLevel |
|-----------|--------|-----|------------------|--------------|-----------|
| S001      | Alice  | 20  | 45               | Data Science | Sophomore |
| S002      | Budi   | 21  | 60               | Mathematics  | Junior    |
| S003      | Citra  | 19  | 30               | Statistics   | Freshman  |

```{r}
# 1. Create vectors for each variable
StudentID <- c("S001", "S002", "S003")       # Nominal / ID
Name <- c("Alice", "Budi", "Citra")          # Nominal / Name
Age <- c(20, 21, 19)                         # Numeric / Continuous
CreditsCompleted <- c(45, 60, 30)            # Numeric / Discrete

# Nominal
Major <- c("Data Science", "Mathematics", "Statistics")

# Ordinal
YearLevel <- factor(c("Sophomore", "Junior", "Freshman"),
                    levels = c("Freshman","Sophomore","Junior","Senior"),
                    ordered = TRUE)          

# 2. Combine all vectors into a data frame
students <- data.frame(
  StudentID, Name, Age, CreditsCompleted, Major, YearLevel,
  stringsAsFactors = FALSE
)

# 3. Display the data frame
print(students)
```
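
The difference between nominal and ordinal variables shows up in R when values are compared: only ordered factors support operators such as `>` and functions such as `max()`. The following is a minimal sketch using two hypothetical vectors, `year_level` and `major` (illustrative names and values, not objects from the exercise).

```r
# Hypothetical example vectors (illustrative names, not part of the exercise)
year_level <- factor(c("Sophomore", "Junior", "Freshman"),
                     levels = c("Freshman", "Sophomore", "Junior", "Senior"),
                     ordered = TRUE)
major <- factor(c("Mathematics", "Statistics", "Mathematics"))

# Ordered factors can be compared against one of their levels
above_freshman <- year_level > "Freshman"
print(above_freshman)

# Levels keep their declared order, so max() is defined for ordered factors
highest_level <- as.character(max(year_level))
print(highest_level)

# An unordered (nominal) factor carries no ranking
print(is.ordered(major))
print(is.ordered(year_level))
```

Declaring `levels` explicitly is what fixes the ranking; without it, R would sort the levels alphabetically.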


# Exercise 2

**Identify Data Types:** Determine the type of data for each of the following variables:

```{r}
# Install knitr package if not already installed
# install.packages("knitr")
library(knitr)

# Create a data frame for Data Types
variables_info <- data.frame(
  No = 1:5,
  Variable = c(
    "Number of vehicles passing through the toll road each day",
    "Student height in cm",
    "Employee gender (Male / Female)",
    "Customer satisfaction level: Low, Medium, High",
    "Respondent's favorite color: Red, Blue, Green"
  ),
  DataType = c(
    "Quantitative",
    "Quantitative",
    "Qualitative",   # gender is a category, not a measurement
    "Qualitative",
    "Qualitative"
  ),
  Subtype = c(
    "Discrete",
    "Continuous",
    "Nominal",
    "Ordinal",
    "Nominal"
  ),
  stringsAsFactors = FALSE
)

# Display the data frame as a neat table
kable(variables_info, 
      caption = "Table of Variables and Data Types")
```
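
As a rough sketch of how these four categories can map onto R storage types, the block below builds one small vector per category and records its class. All variable names and values here are made up for illustration.

```r
# Made-up example values; the variable names are illustrative only
vehicles_per_day <- c(1200L, 980L, 1430L)                   # quantitative, discrete
height_cm        <- c(165.2, 171.8, 158.4)                  # quantitative, continuous
gender           <- factor(c("Male", "Female", "Female"))   # qualitative, nominal
satisfaction     <- factor(c("Low", "High", "Medium"),
                           levels = c("Low", "Medium", "High"),
                           ordered = TRUE)                  # qualitative, ordinal

# Collect the class of each vector in a small overview table
type_overview <- data.frame(
  Variable = c("vehicles_per_day", "height_cm", "gender", "satisfaction"),
  Class    = c(class(vehicles_per_day), class(height_cm),
               class(gender), paste(class(satisfaction), collapse = "/")),
  stringsAsFactors = FALSE
)
print(type_overview)
```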
---

# Exercise 3

**Classify Data Sources:** Determine whether the following data comes from **internal** or **external sources**, and whether it is **structured** or **unstructured**:

```{r}
# Install DT package if not already installed
# install.packages("DT") 
library(DT)

# Create a data frame for data sources 
data_sources <- data.frame(
  No = 1:4,
  DataSource = c(
    "Daily sales transaction data of the company",
    "Weather reports from BMKG",
    "Product reviews on social media",
    "Warehouse inventory reports"
  ),
  Internal_External = c(
    "Internal",
    "External",
    "External",
    "Internal"
  ),
  Structured_Unstructured = c(
    "Structured",
    "Structured",
    "Unstructured",
    "Structured"
  ),
  stringsAsFactors = FALSE
)

# Display the data frame as a neat table
datatable(data_sources, 
          caption = "Table of Data Sources",
          rownames = FALSE) # hides the index column
```
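
One way to summarize the two classifications together is a cross-tabulation. The sketch below copies the four classifications from the table into two standalone vectors (`origin` and `structure` are illustrative names) and counts each combination with `table()`.

```r
# Classifications copied from the exercise table; vector names are illustrative
origin    <- c("Internal", "External", "External", "Internal")
structure <- c("Structured", "Structured", "Unstructured", "Structured")

# Count how many sources fall into each origin/structure combination
source_counts <- table(Origin = origin, Structure = structure)
print(source_counts)
```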

---

# Exercise 4

**Dataset Structure:** Consider the following transaction table:

| Date       | Qty | Price | Product  | CustomerTier |
|------------|-----|-------|----------|--------------|
| 2025-10-01 | 2   | 1000  | Laptop   | High         |
| 2025-10-01 | 5   | 20    | Mouse    | Medium       |
| 2025-10-02 | 1   | 1000  | Laptop   | Low          |
| 2025-10-02 | 3   | 30    | Keyboard | Medium       |
| 2025-10-03 | 4   | 50    | Mouse    | Medium       |
| 2025-10-03 | 2   | 1000  | Laptop   | High         |
| 2025-10-04 | 6   | 25    | Keyboard | Low          |
| 2025-10-04 | 1   | 1000  | Laptop   | High         |
| 2025-10-05 | 3   | 40    | Mouse    | Low          |
| 2025-10-05 | 5   | 10    | Keyboard | Medium       |


**Assignment instructions:** Recreate the transactions table above in R.

1. **Create a data frame** in R called `transactions` containing the data above.

2. Identify which variables are numeric and which are categorical.

3. **Calculate total revenue** for each transaction by multiplying `Qty × Price` and add it as a new column `Total`.

4. **Compute summary statistics**:
   - Total quantity sold for each product
   - Total revenue per product
   - Average price per product

5. **Visualize the data**:
   - Create a **barplot** showing total quantity sold per product.
   - Create a **pie chart** showing the proportion of total revenue per customer tier.

6. **Optional Challenge**:
   - Find which **date** had the highest total revenue.
   - Create a **stacked bar chart** showing quantity sold per product by customer tier.

**Hints:** Use `data.frame()`, `aggregate()`, `barplot()`, `pie()`, and basic arithmetic operations in R.


## Create Data Frame
```{R}
transactions <- data.frame(
  Date = as.Date(c("2025-10-01", "2025-10-01", "2025-10-02", "2025-10-02",
                   "2025-10-03", "2025-10-03", "2025-10-04", "2025-10-04",
                   "2025-10-05", "2025-10-05")),
  Qty = c(2, 5, 1, 3, 4, 2, 6, 1, 3, 5),
  Price = c(1000, 20, 1000, 30, 50, 1000, 25, 1000, 40, 10),
  Product = c("Laptop", "Mouse", "Laptop", "Keyboard", "Mouse",
              "Laptop", "Keyboard", "Laptop", "Mouse", "Keyboard"),
  CustomerTier = c("High", "Medium", "Low", "Medium", "Medium",
                   "High", "Low", "High", "Low", "Medium"),
  stringsAsFactors = FALSE
)

# View the dataset contents
kable(transactions,
      caption = "Transactions Data by Customer Tier")
```

## Identify data types
```{R}
str(transactions)  # Inspect each column's type (numeric vs. categorical)
```
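
In the `str()` output, `Qty` and `Price` are numeric, `Product` and `CustomerTier` are character (categorical), and `Date` has class `Date`. The same split can also be made programmatically. The sketch below assumes a small stand-in data frame, `tx_sample` (a hypothetical name, holding only the first three rows), so that it runs on its own.

```r
# Small stand-in for the transactions table (first three rows only)
tx_sample <- data.frame(
  Date = as.Date(c("2025-10-01", "2025-10-01", "2025-10-02")),
  Qty = c(2, 5, 1),
  Price = c(1000, 20, 1000),
  Product = c("Laptop", "Mouse", "Laptop"),
  CustomerTier = c("High", "Medium", "Low"),
  stringsAsFactors = FALSE
)

# Columns that hold numbers (is.numeric() is FALSE for Date columns)
numeric_cols <- names(tx_sample)[sapply(tx_sample, is.numeric)]

# Columns that hold categories, stored as character or factor
is_categorical <- function(x) is.character(x) || is.factor(x)
categorical_cols <- names(tx_sample)[sapply(tx_sample, is_categorical)]

print(numeric_cols)
print(categorical_cols)
```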

## Add a Total column
```{R}
transactions$Total <- transactions$Qty * transactions$Price
```

## View results
```{R}
print(transactions)
```

## Summary Statistics
```{R}

### Total quantity sold per product
total_qty <- aggregate(Qty ~ Product, data = transactions, sum)

### Total revenue per product
total_revenue <- aggregate(Total ~ Product, data = transactions, sum)

### Average price per product
avg_price <- aggregate(Price ~ Product, data = transactions, mean)
```

## Show summary results
```{R}
cat("\n Total Quantity per Product \n")
print(total_qty)
cat("\n Total Revenue per Product \n")
print(total_revenue)
cat("\n Average Price per Product \n")
print(avg_price)
```
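
The average price above is a simple mean over transactions, so each sale counts equally regardless of quantity. A quantity-weighted average (total revenue divided by total quantity) is another reasonable reading of "average price". The sketch below is an alternative, not part of the assignment, and uses a self-contained copy of the data named `tx` (an illustrative name).

```r
# Self-contained copy of the relevant columns from the transactions table
tx <- data.frame(
  Qty = c(2, 5, 1, 3, 4, 2, 6, 1, 3, 5),
  Price = c(1000, 20, 1000, 30, 50, 1000, 25, 1000, 40, 10),
  Product = c("Laptop", "Mouse", "Laptop", "Keyboard", "Mouse",
              "Laptop", "Keyboard", "Laptop", "Mouse", "Keyboard"),
  stringsAsFactors = FALSE
)
tx$Total <- tx$Qty * tx$Price

# Sum revenue and quantity per product, then divide
revenue_by_product <- tapply(tx$Total, tx$Product, sum)
qty_by_product     <- tapply(tx$Qty, tx$Product, sum)
weighted_avg_price <- revenue_by_product / qty_by_product

print(round(weighted_avg_price, 2))
```

For products whose price never changes, such as Laptop, the two averages coincide; they differ when larger orders were sold at different prices.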

## Visualization
```{R}

### (a) Barplot - total quantity sold per product
barplot(
  total_qty$Qty,
  names.arg = total_qty$Product,
  main = "Total Quantity Sold per Product",
  xlab = "Product",
  ylab = "Total Quantity",
  col = c("skyblue", "lightgreen", "orange")
)

### (b) Pie chart - proportion of total revenue per Customer Tier
revenue_tier <- aggregate(Total ~ CustomerTier, data = transactions, sum)
pie(
  revenue_tier$Total,
  labels = paste(revenue_tier$CustomerTier, "-", revenue_tier$Total),
  main = "Proportion of Total Revenue per Customer Tier",
  col = c("gold", "lightblue", "tomato")
)
```
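
Since the pie chart is meant to show proportions, labelling the slices with percentage shares may be easier to read than raw totals. The sketch below is one possible variant. It hard-codes the per-tier revenue obtained by summing `Qty * Price` over the table rows, and the names `tier_revenue`, `tier_share`, and `pie_labels` are illustrative.

```r
# Revenue per tier, summed by hand from the transactions table (Qty * Price)
tier_revenue <- c(High = 5000, Low = 1270, Medium = 440)

# Convert totals to percentage shares, rounded to one decimal place
tier_share <- round(100 * prop.table(tier_revenue), 1)
pie_labels <- paste0(names(tier_share), " (", tier_share, "%)")
print(pie_labels)

pie(
  tier_revenue,
  labels = pie_labels,
  main = "Share of Total Revenue per Customer Tier",
  col = c("gold", "lightblue", "tomato")
)
```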

## Optional Challenge
```{r}

### (a) Date with highest total revenue
date_revenue <- aggregate(Total ~ Date, data = transactions, sum)
max_rev_date <- date_revenue[which.max(date_revenue$Total), ]
cat("\nDate with highest total revenue:\n")
print(max_rev_date)

### (b) Stacked bar chart: quantity sold per product by customer tier
qty_stack <- aggregate(Qty ~ Product + CustomerTier, data = transactions, sum)
qty_matrix <- xtabs(Qty ~ CustomerTier + Product, data = qty_stack)
barplot(
  qty_matrix,
  beside = FALSE,
  main = "Quantity Sold per Product by Customer Tier",
  xlab = "Product",
  ylab = "Quantity Sold",
  col = c("lightblue", "gold", "tomato")
)
legend("topright", legend = rownames(qty_matrix),
       fill = c("lightblue", "gold", "tomato"), title = "Customer Tier")
```

# Exercise 5

**Create Your Own Data Frame:**

**Objective:** Create a data frame in R with **30 rows** containing a mix of data types: continuous, discrete, nominal, and ordinal.  

## Instructions

1. **Open RStudio** or the R console.  

2. **Create a vector for each column** in your data frame:  

   - **Date**: 30 dates (can be sequential or random within a month/year)  
   - **Continuous**: numeric values that can take decimal values (e.g., height, weight, temperature)  
   - **Discrete**: numeric values that can only take whole numbers (e.g., number of items, number of vehicles)  
   - **Nominal**: categorical values with **no order** (e.g., color, gender, city)  
   - **Ordinal**: categorical values with a **defined order** (e.g., Low, Medium, High; Beginner, Intermediate, Expert)  

3. **Combine all vectors into a data frame** called `my_data`.  

4. **Check your data frame** using `head()` or `View()` to ensure it has **30 rows** and the columns are correct.  

5. **Optional tasks**:  
   - Summarize each column using `summary()`  
   - Count the frequency of each category for **Nominal** and **Ordinal** columns using `table()`  

## Hints

- Use `seq.Date()` or `as.Date()` to generate the Date column.  
- Use `runif()` or `rnorm()` for continuous numeric data.  
- Use `sample()` for discrete, nominal, and ordinal data.  
- Ensure the **ordinal vector** is created with `factor(..., levels = c("Low","Medium","High"), ordered = TRUE)` (or similar).  
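
Because `runif()`, `rnorm()`, and `sample()` draw random values, every run produces different data unless the random number generator is seeded first. The sketch below shows the effect of `set.seed()`; the seed value 123 is an arbitrary choice for illustration.

```r
# The seed value 123 is arbitrary; any fixed integer works
set.seed(123)
first_draw <- sample(1:50, 5, replace = TRUE)

# Resetting the same seed restarts the same random sequence
set.seed(123)
second_draw <- sample(1:50, 5, replace = TRUE)

# TRUE: both draws contain identical values
print(identical(first_draw, second_draw))
```

Calling `set.seed()` once before the column vectors are created would make the summaries and frequency tables repeatable from one knit to the next.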


## Create each column
```{R}
## Date: 30 consecutive dates in October 2025
Date <- seq.Date(from = as.Date("2025-10-01"), 
                 by = "day", 
                 length.out = 30)

## Continuous: for example body temperature data (in °C), use decimals
Continuous <- round(runif(30, min = 35.5, max = 37.5), 1)

## Discrete: e.g. number of items sold (whole number)
Discrete <- sample(1:50, 30, replace = TRUE)

## Nominal: e.g. customer's city of origin (no order)
Nominal <- sample(c("Jakarta", "Bandung", "Surabaya", "Medan", "Bali"),
                  30, replace = TRUE)

## Ordinal: e.g. satisfaction level (has a natural order)
Ordinal <- factor(
  sample(c("Low", "Medium", "High"), 30, replace = TRUE),
  levels = c("Low", "Medium", "High"),
  ordered = TRUE
)
```

## Combine all into a data frame
```{R}
my_data <- data.frame(Date, Continuous, Discrete, Nominal, Ordinal)
```

## Check the data contents
```{R}
head(my_data)   # display the first 6 rows
if (interactive()) View(my_data)   # open in the RStudio viewer (optional; skipped when knitting)
```
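
`head()` shows only the first six rows, so it does not by itself confirm the row count or the column types. The sketch below shows one way to check both. It uses a stand-in data frame, `check_data` (a hypothetical name with made-up, non-random values), laid out like `my_data`.

```r
# Stand-in data frame with the same column layout as my_data (values are made up)
check_data <- data.frame(
  Date = seq.Date(from = as.Date("2025-10-01"), by = "day", length.out = 30),
  Continuous = seq(35.5, 37.5, length.out = 30),
  Discrete = rep(1:10, times = 3),
  Nominal = rep(c("Jakarta", "Bandung", "Surabaya"), times = 10),
  Ordinal = factor(rep(c("Low", "Medium", "High"), times = 10),
                   levels = c("Low", "Medium", "High"), ordered = TRUE),
  stringsAsFactors = FALSE
)

# Row count and the first class of every column
n_rows <- nrow(check_data)
col_classes <- sapply(check_data, function(x) class(x)[1])

print(n_rows)
print(col_classes)
print(is.ordered(check_data$Ordinal))
```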

## (Optional) Data Summary
```{R}
summary(my_data)
```

## Count category frequencies
```{R}
cat("\n Nominal Frequency (City) \n")
print(table(my_data$Nominal))

cat("\n Ordinal Frequency (Level of Satisfaction) \n")
print(table(my_data$Ordinal))
```




