The built-in cars dataset records the speed and stopping
distance of 50 cars.
I’ll compute the median of the first column (speed).
data(cars)
median_speed <- median(cars[,1])
median_speed
## [1] 15
✅ Answer: The median of the first column is 15.
I’ll use the CryptoCompare API to retrieve daily BTC-USD prices
for roughly the past 100 days (with limit=100, this run returned
101 daily rows).
This demonstrates working with JSON data.
# fromJSON() comes from the jsonlite package
library(jsonlite)
# Construct the URL
url <- "https://min-api.cryptocompare.com/data/v2/histoday?fsym=BTC&tsym=USD&limit=100"
# Read JSON
btc_data <- fromJSON(url)
# Inspect structure
str(btc_data, max.level = 2)
## List of 6
## $ Response : chr "Success"
## $ Message : chr ""
## $ HasWarning: logi FALSE
## $ Type : int 100
## $ RateLimit : Named list()
## $ Data :List of 4
## ..$ Aggregated: logi FALSE
## ..$ TimeFrom : int 1751155200
## ..$ TimeTo : int 1759795200
## ..$ Data :'data.frame': 101 obs. of 9 variables:
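Because the response carries its own status fields (Response and Message), it can be worth checking them before pulling out the nested data frame. The following is a minimal sketch of that check. The helper name extract_histoday and the mock response are hypothetical and only mirror the structure printed above; the values are made up.

```r
# Hypothetical helper: validate a parsed histoday response before using it.
extract_histoday <- function(resp) {
  if (!identical(resp$Response, "Success")) {
    stop("API request failed: ", resp$Message)
  }
  # The price table sits one level below the top-level Data element.
  resp$Data$Data
}

# Mock response with the same nesting as the str() output (made-up values).
mock_ok <- list(
  Response = "Success",
  Message  = "",
  Data     = list(
    Data = data.frame(time = c(1751155200, 1751241600), close = c(100, 101))
  )
)

prices <- extract_histoday(mock_ok)
nrow(prices)
```

With the real object, the call would be extract_histoday(btc_data), which stops with the API's message instead of failing later on a missing column.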
# Extract DataFrame
df <- btc_data$Data$Data
# Convert time to date
df$time <- as.POSIXct(df$time, origin="1970-01-01")
# Preview data
head(df)
## time high low open volumefrom volumeto close
## 1 2025-06-28 20:00:00 108538.4 107233.6 107346.4 6094.35 657403121 108391.9
## 2 2025-06-29 20:00:00 108815.7 106756.4 108391.9 14220.51 1530756559 107169.8
## 3 2025-06-30 20:00:00 107574.1 105295.0 107169.8 15730.48 1673173145 105724.2
## 4 2025-07-01 20:00:00 109818.5 105143.1 105724.2 19611.45 2124516059 108886.6
## 5 2025-07-02 20:00:00 110584.4 108579.6 108886.6 16647.63 1825182142 109639.0
## 6 2025-07-03 20:00:00 109810.0 107283.9 109639.0 10440.09 1130647601 108027.7
## conversionType conversionSymbol
## 1 direct
## 2 direct
## 3 direct
## 4 direct
## 5 direct
## 6 direct
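The times above print as 20:00:00 because as.POSIXct() displays in the local time zone of the machine when no tz is given. The epoch values themselves fall on midnight UTC. A small sketch, using the TimeFrom and TimeTo values from the str() output and assuming UTC is the intended reference zone:

```r
# TimeFrom and TimeTo as printed by str(btc_data, max.level = 2)
epoch <- c(1751155200, 1759795200)

# Passing tz = "UTC" makes the printed time independent of the local machine.
utc_time <- as.POSIXct(epoch, origin = "1970-01-01", tz = "UTC")
format(utc_time, "%Y-%m-%d %H:%M:%S")
# "2025-06-29 00:00:00" "2025-10-07 00:00:00"

# For daily data, a plain Date is often enough.
as.Date(utc_time)
```

Applied to the data frame, the same idea would be as.POSIXct(df$time, origin = "1970-01-01", tz = "UTC"), which shifts the displayed dates by one day relative to the preview above.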
Now I’ll find the maximum daily close price:
max_close <- max(df$close, na.rm = TRUE)
max_close
## [1] 124723
✅ Answer: In this run the maximum daily close is 124,723 USD (as printed). The value depends on the data the API returns on the day the code is run.
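To see when the maximum occurred, which.max() gives the row index of the largest value. The sketch below uses a made-up toy data frame rather than the live API data, so the numbers are illustrative only.

```r
# Toy data standing in for df (values are made up).
toy <- data.frame(
  time  = as.POSIXct(c("2025-01-01", "2025-01-02", "2025-01-03"), tz = "UTC"),
  close = c(100, NA, 120)
)

# which.max() discards missing values and returns the position of the maximum.
i_max <- which.max(toy$close)
toy[i_max, c("time", "close")]
```

On the real data the equivalent would be df[which.max(df$close), c("time", "close")].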
Title: Bitcoin Price Trends Over the Last 100 Days
Research Questions:
1. What is the overall trend in BTC closing prices?
2. Are there missing or special values in the dataset?
3. How do mean and kNN imputation affect missing values?
4. Can we visualize the price changes clearly over time?
I’ll reuse the df dataset extracted above. This dataset
includes OHLCV data (open, high, low, close, volume) for BTC.
summary(df)
## time high low
## Min. :2025-06-28 20:00:00 Min. :107574 Min. :105143
## 1st Qu.:2025-07-23 20:00:00 1st Qu.:112938 1st Qu.:110217
## Median :2025-08-17 20:00:00 Median :116361 Median :114140
## Mean :2025-08-17 20:00:00 Mean :116154 Mean :113405
## 3rd Qu.:2025-09-11 20:00:00 3rd Qu.:118927 3rd Qu.:116866
## Max. :2025-10-06 20:00:00 Max. :126287 Max. :123143
## open volumefrom volumeto close
## Min. :105724 Min. : 3465 Min. :3.747e+08 Min. :105724
## 1st Qu.:111292 1st Qu.:10960 1st Qu.:1.291e+09 1st Qu.:111549
## Median :115392 Median :18335 Median :2.125e+09 Median :115401
## Mean :114732 Mean :18821 Mean :2.168e+09 Mean :114878
## 3rd Qu.:117778 3rd Qu.:25660 3rd Qu.:2.999e+09 3rd Qu.:117853
## Max. :124723 Max. :50123 Max. :5.882e+09 Max. :124723
## conversionType conversionSymbol
## Length:101 Length:101
## Class :character Class :character
## Mode :character Mode :character
##
##
##
To demonstrate imputation, I’ll introduce some artificial
missing values into the close column.
set.seed(4310)
missing_idx <- sample(1:nrow(df), 5)
df$close[missing_idx] <- NA
table(is.na(df$close))
##
## FALSE TRUE
## 96 5
any(is.nan(df$close))
## [1] FALSE
any(is.infinite(df$close))
## [1] FALSE
No NaN or Inf values were detected.
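One detail behind these checks: is.na() also returns TRUE for NaN, so the three tests overlap slightly. A minimal sketch of a helper that reports each kind separately follows; the name count_special is hypothetical and the input vector is made up.

```r
# Hypothetical helper: count NA, NaN and infinite values separately.
count_special <- function(x) {
  c(
    na  = sum(is.na(x) & !is.nan(x)),  # true NA only, excluding NaN
    nan = sum(is.nan(x)),
    inf = sum(is.infinite(x))          # counts both Inf and -Inf
  )
}

count_special(c(1, NA, NaN, Inf, -Inf))
```

Calling count_special(df$close) on the data above would be expected to report 5 NA values and no NaN or infinite values, matching the individual checks.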
df_mean_imp <- df
mean_val <- mean(df_mean_imp$close, na.rm = TRUE)
df_mean_imp$close[is.na(df_mean_imp$close)] <- mean_val
table(is.na(df_mean_imp$close))
##
## FALSE
## 101
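Mean imputation fills every gap with the same value, which keeps the mean unchanged but pulls the spread down. The toy example below, with made-up numbers, is a sketch of that effect and not a result from the BTC data.

```r
# Toy series with two gaps (values are made up).
x <- c(10, 12, NA, 14, NA, 18)

# Replace each NA with the mean of the observed values.
x_imp <- x
x_imp[is.na(x_imp)] <- mean(x, na.rm = TRUE)

# The mean is preserved, but the standard deviation shrinks.
c(mean_before = mean(x, na.rm = TRUE), mean_after = mean(x_imp))
c(sd_before = sd(x, na.rm = TRUE), sd_after = sd(x_imp))
```

The same comparison can be run on df$close and df_mean_imp$close to quantify the effect for this dataset.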
I’ll use the VIM package for kNN imputation (k = 5 by
default). kNN() also appends a logical close_imp column that
flags the imputed rows.
library(VIM)
df_knn_imp <- kNN(df, variable = "close")
## high low open volumefrom volumeto high
## 1.075741e+05 1.051431e+05 1.057242e+05 3.465160e+03 3.746681e+08 1.262873e+05
## low open volumefrom volumeto
## 1.231435e+05 1.247230e+05 5.012319e+04 5.881848e+09
table(is.na(df_knn_imp$close))
##
## FALSE
## 101
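To make the idea behind kNN imputation concrete, here is a deliberately simplified base-R sketch. It is not VIM's implementation: it uses a single predictor and absolute distance, whereas VIM's kNN() works across several distance variables and mixed column types, so its results will differ. The function name knn_impute_1d and the toy values are hypothetical.

```r
# Simplified illustration: for each missing target value, take the median of
# the target over the k observed rows whose predictor value is closest.
knn_impute_1d <- function(target, predictor, k = 5) {
  observed <- which(!is.na(target))
  out <- target
  for (i in which(is.na(target))) {
    d <- abs(predictor[observed] - predictor[i])
    nearest <- observed[order(d)[seq_len(min(k, length(observed)))]]
    out[i] <- median(target[nearest])
  }
  out
}

# Toy data (made up): close is missing on two days, open is fully observed.
open_toy  <- c(100, 101, 102, 103, 104, 105)
close_toy <- c(100.5, NA, 102.5, 103.5, NA, 105.5)

imputed <- knn_impute_1d(close_toy, open_toy, k = 2)
imputed
```

Unlike mean imputation, each gap is filled from rows that look similar, so the filled values follow the local price level instead of the overall average.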
I’ll plot the closing price trends before and after imputation.
library(ggplot2)
# Original series: the missing values appear as gaps in the line
ggplot(df, aes(x = time, y = close)) +
  geom_line(color = "blue") +
  labs(title = "Bitcoin Daily Closing Prices (With Missing Values)", x = "Date", y = "Closing Price (USD)")
# Mean-imputed series
ggplot(df_mean_imp, aes(x = time, y = close)) +
  geom_line(color = "green") +
  labs(title = "Bitcoin Daily Closing Prices (Mean Imputation)", x = "Date", y = "Closing Price (USD)")
# kNN-imputed series
ggplot(df_knn_imp, aes(x = time, y = close)) +
  geom_line(color = "red") +
  labs(title = "Bitcoin Daily Closing Prices (kNN Imputation)", x = "Date", y = "Closing Price (USD)")
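The three plots can also be compared in a single figure by stacking the series into one long data frame with a method column. The sketch below uses made-up toy values in place of df and df_mean_imp, and the column name method is an assumption. The ggplot2 call is guarded so the data preparation runs even where the package is not installed.

```r
# Toy stand-ins for the original and mean-imputed series (values are made up).
time_toy <- as.Date("2025-01-01") + 0:4
raw      <- c(100, NA, 104, 106, 108)
mean_imp <- ifelse(is.na(raw), mean(raw, na.rm = TRUE), raw)

# Long format: one row per date and method.
plot_df <- rbind(
  data.frame(time = time_toy, close = raw,      method = "Original (with NA)"),
  data.frame(time = time_toy, close = mean_imp, method = "Mean imputation")
)

# One line per method, distinguished by color.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(plot_df, ggplot2::aes(x = time, y = close, color = method)) +
    ggplot2::geom_line() +
    ggplot2::labs(x = "Date", y = "Closing Price (USD)", color = "Series")
}
```

A kNN-imputed series could be added as a third block in the rbind() call in the same way.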
Below are the answers to the quiz associated with this assignment:
Q1: Median of the first column in the cars dataset?
✅ C. 15
Q2: Extract BTC data and find max close.
✅ Code shown in Section 2, answer depends on latest data.
Q3: Mini project steps (Title, Questions, Data, Cleaning, Imputation).
✅ Completed in Section 3.
Q4: Presentation Bonus.
✅ If presented, click True.