1. Introduction

This report analyzes the returns of six portfolios formed on size and book-to-market ratios from the Kenneth French Data Library. The goal is to compute descriptive statistics and compare the return distributions across two time periods.

2. Load Packages

library(tidyverse)
library(readr)
library(dplyr)
library(moments)

3. Import Data

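# skip = 15 skips the first 15 lines of the file. The next line is treated as
# the header, so the columns arrive with names built from its values
# (X192708, X2.4876, ...) and are renamed below.
data <- read.csv("6_Portfolios_2x3.csv", skip = 15)
view(data)

data <- data %>%
  rename(
    date = X192708,
    SL = X2.4876,
    SM = X.2.3003,
    SH = X0.7332,
    BL = X4.0055,
    BM = X1.1524,
    BH = X.1.9351
  )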

data <- data %>%
  mutate(
    # Convert numeric-like characters/factors to numeric
    across(where(~all(grepl("^[0-9]+$", .))), ~as.numeric(.)),

    # Convert date-like strings to Date
    across(where(~all(grepl("^\\d{4}-\\d{2}-\\d{2}$", .))), ~as.Date(., format = "%Y-%m-%d")),

    # Leave text/categorical columns as character
    across(where(is.character), ~as.character(.))
  )

data[,2:7] <- data[,2:7] / 100
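
The unusual column names in the `rename()` call come from how R turns header values into valid names. The short sketch below reproduces that step with `make.names()`, which `read.csv()` applies by default through `check.names = TRUE`. The three input strings are taken from the column names above; nothing is read from the CSV file.

```r
# Header values as they would appear in the raw row used as the header
raw_header <- c("192708", "2.4876", "-2.3003")

# Names that do not start with a letter get an "X" prefix,
# and invalid characters such as "-" are replaced by "."
clean_header <- make.names(raw_header)
clean_header
```

This is why a negative return such as -2.3003 shows up as `X.2.3003`, while a positive one such as 2.4876 becomes `X2.4876`.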

4. Split Dataset

first_half <- data %>% filter(date >= 193001 & date <= 197412)
second_half <- data %>% filter(date >= 197501 & date <= 201812)
View(first_half)
view(second_half)
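
As a sanity check on the split, the number of monthly observations each window should contain can be worked out independently of the data file. The sketch below builds a hypothetical vector of YYYYMM codes (`yyyymm` is constructed for illustration, not read from the CSV) and applies the same integer comparisons used in the filters.

```r
# Hypothetical YYYYMM codes for 1927-2018, built only for illustration
years  <- 1927:2018
yyyymm <- as.vector(outer(1:12, years * 100, "+"))  # 192701, 192702, ..., 201812

# Same integer comparisons as in the filter() calls
n_first  <- sum(yyyymm >= 193001 & yyyymm <= 197412)
n_second <- sum(yyyymm >= 197501 & yyyymm <= 201812)

n_first   # 45 years * 12 months = 540
n_second  # 44 years * 12 months = 528
```

If `nrow(first_half)` or `nrow(second_half)` differs from these counts, the date column probably contains rows from another section of the file or was not read as a number.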

5. Statistics Function

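portfolio_stats <- function(df) {
  # All six size/book-to-market portfolios described in the introduction
  cols <- c("SL", "SM", "SH", "BL", "BM", "BH")

  data.frame(
    Portfolio = cols,
    Mean      = sapply(cols, function(p) mean(df[[p]])),
    SD        = sapply(cols, function(p) sd(df[[p]])),
    Skewness  = sapply(cols, function(p) skewness(df[[p]])),
    # moments::kurtosis() is not excess kurtosis; a normal distribution gives 3
    Kurtosis  = sapply(cols, function(p) kurtosis(df[[p]])),
    row.names = NULL
  )
}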
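
To make the skewness and kurtosis columns concrete, here is a minimal sketch of the moment-based definitions, assuming the divide-by-n form of the central moments (believed to be what `moments::skewness()` and `moments::kurtosis()` compute). The helper names `skew_manual` and `kurt_manual` and the sample vector `x` are illustrative only and are not part of the analysis.

```r
# Moment-based skewness and kurtosis using divide-by-n central moments
skew_manual <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

kurt_manual <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2   # not excess kurtosis: the normal benchmark is 3
}

# Small made-up sample with one large positive value
x <- c(-0.02, -0.01, 0.00, 0.01, 0.02, 0.15)
skew_manual(x)  # positive, because the right tail is longer
kurt_manual(x)
```

Under this definition, a kurtosis value is read against 3 rather than 0 when judging whether the tails are fatter than those of a normal distribution.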

6. Calculate Statistics

stats_1930_1974 <- portfolio_stats(first_half)
stats_1975_2018 <- portfolio_stats(second_half)

stats_1930_1974
stats_1975_2018

view(stats_1930_1974)
view(stats_1975_2018)

7. Comparison Table

comparison <- merge(stats_1930_1974, stats_1975_2018,
                    by = "Portfolio",
                    suffixes = c("_1930_1974", "_1975_2018"))

comparison

View(comparison)
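
The column names used in the plots that follow (for example `Mean_1930_1974`) are produced by the `suffixes` argument of `merge()`. The sketch below shows the mechanism on two toy tables; the values in `a` and `b` are invented for illustration.

```r
# Two toy statistics tables (values are invented for illustration)
a <- data.frame(Portfolio = c("SL", "SM"), Mean = c(0.010, 0.012), SD = c(0.08, 0.07))
b <- data.frame(Portfolio = c("SL", "SM"), Mean = c(0.011, 0.014), SD = c(0.06, 0.05))

# Shared non-key columns receive the suffixes; the key column does not
toy <- merge(a, b, by = "Portfolio", suffixes = c("_1930_1974", "_1975_2018"))
names(toy)
```

Note that `merge()` sorts the result by the key column by default, so the portfolios appear in alphabetical order rather than in the order of the input tables.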

8. Average Return Comparison

ggplot(comparison, aes(x = Portfolio, y = Mean_1930_1974)) +
  geom_bar(stat = "identity") +
  ggtitle("Average Returns (1930–1974)")

ggplot(comparison, aes(x = Portfolio, y = Mean_1975_2018)) +
  geom_bar(stat = "identity") +
  ggtitle("Average Returns (1975–2018)")

9. Risk Comparison

ggplot(comparison, aes(x = Portfolio, y = SD_1930_1974)) +
  geom_bar(stat = "identity") +
  ggtitle("Volatility (1930–1974)")

ggplot(comparison, aes(x = Portfolio, y = SD_1975_2018)) +
  geom_bar(stat = "identity") +
  ggtitle("Volatility (1975–2018)")
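
Separate charts use separate y-axes, which makes the two periods harder to compare by eye. One possible alternative, sketched here on toy data, is a single chart with dodged bars. The `long` table and its values are invented for illustration; with the real results, a table of this shape would have to be built from the two statistics tables first.

```r
library(ggplot2)

# Toy long-format table: one row per portfolio and period (invented values)
long <- data.frame(
  Portfolio = rep(c("SL", "SH"), times = 2),
  Period    = rep(c("1930-1974", "1975-2018"), each = 2),
  SD        = c(0.08, 0.09, 0.06, 0.05)
)

# Dodged bars put the two periods side by side on a common y-axis
p <- ggplot(long, aes(x = Portfolio, y = SD, fill = Period)) +
  geom_col(position = "dodge") +
  ggtitle("Volatility by Period (toy data)")
p
```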

10. Discussion

The descriptive statistics differ between the two periods. Average returns are higher in the later period (1975–2018) than in the earlier one (1930–1974), and portfolios with high book-to-market ratios tend to earn higher average returns.

The standard deviation of returns is also larger in the second period, which indicates greater volatility.

Both periods exhibit positive skewness and high kurtosis relative to the normal benchmark of 3, suggesting that the returns have asymmetric, fat-tailed distributions.

11. Conclusion

Because the mean, standard deviation, skewness, and kurtosis all differ substantially between the two periods, the results suggest that the portfolio returns are not drawn from the same distribution over time.
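
A comparison of moments is descriptive. One way the conclusion could be checked more formally is a two-sample Kolmogorov-Smirnov test, sketched below on simulated data. The vectors `r_early` and `r_late` and their parameters are invented for illustration and are not the portfolio returns; the test also assumes independent observations, which monthly returns may only approximately satisfy.

```r
set.seed(1)

# Simulated monthly returns, NOT the portfolio data
r_early <- rnorm(540, mean = 0.010, sd = 0.08)
r_late  <- rnorm(528, mean = 0.012, sd = 0.05)

# Two-sample Kolmogorov-Smirnov test
# H0: both samples are drawn from the same continuous distribution
ks <- ks.test(r_early, r_late)
ks$statistic
ks$p.value
```

With the real data, the same call could be applied to one portfolio at a time, for example `ks.test(first_half$SL, second_half$SL)`; a small p-value would be evidence against a common distribution.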


Hw02

2026-03-09