List of required packages
Package Version
moments 0.14.1
knitr 1.51
kableExtra 1.4.0
ggplot2 4.0.3
tidyr 1.3.2
dplyr 1.2.1
scales 1.4.0
stringr 1.6.0
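# load the packages listed above (assumed to be installed)
invisible(lapply(c("moments", "knitr", "kableExtra", "ggplot2", "tidyr", "dplyr", "scales", "stringr"),
                 library, character.only = TRUE))
# load dataset stored in the same folder as the .Rmd file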
df <- read.csv("realestate_texas.csv")
# Randomly select row indices, then sort them to keep original order
idx <- sort(sample(nrow(df), min(10, nrow(df))))
kable(df[idx, ])
city year month sales volume median_price listings months_inventory
10 Beaumont 2010 10 150 23.904 138500 1779 11.5
34 Beaumont 2012 10 193 27.350 121800 1671 9.9
42 Beaumont 2013 6 232 36.275 134100 1675 9.0
74 Bryan-College Station 2011 2 101 16.125 148500 1562 9.3
95 Bryan-College Station 2012 11 159 28.882 149100 1442 7.3
119 Bryan-College Station 2014 11 169 34.903 172800 973 3.8
165 Tyler 2013 9 287 51.099 147600 2917 10.2
199 Wichita Falls 2011 7 127 13.594 102300 1029 9.2
222 Wichita Falls 2013 6 121 15.547 104700 923 7.9
233 Wichita Falls 2014 5 140 17.833 115700 899 7.6

1. Analysis of the variables

Description of the study variables
Variable Description Type
city City or market area observed Qualitative nominal
year Year of the observation Quantitative discrete
month Month of the observation Quantitative discrete
sales Number of sales in that city–month Quantitative discrete
volume Total sales volume in millions of US dollars Quantitative continuous
median_price Median sale price (US dollars) Quantitative continuous
listings Active for-sale listings (inventory) Quantitative discrete
months_inventory Months needed to clear inventory at the current sales pace Quantitative continuous

Time dimension: year and month index the repeated observations for each city. For time-based analysis, a period variable (e.g., the first day of the month) can be built from year and month. This enables chronological ordering, moving averages for analyzing price trends, trend and seasonality analysis (on aggregated or city-level series), and cross-city comparisons within the same period.
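
As a minimal sketch of this idea, the snippet below builds a period variable and a trailing 3-month moving average. The data frame toy and its values are hypothetical and are not taken from the dataset; the window length of 3 is also an assumption chosen only for illustration.

```r
# Hypothetical toy series: one city, six consecutive months (made-up values)
toy <- data.frame(
  year         = rep(2010L, 6),
  month        = 1:6,
  median_price = c(100, 110, 120, 130, 140, 150)
)

# Period = first day of each month, built from year and month
toy$period <- as.Date(sprintf("%04d-%02d-01", as.integer(toy$year), as.integer(toy$month)))

# Sort chronologically, then compute a trailing 3-month moving average
# (the first two values are NA because the window is incomplete)
toy <- toy[order(toy$period), ]
toy$ma3 <- as.numeric(stats::filter(toy$median_price, rep(1 / 3, 3), sides = 1))

print(toy)
```

With sides = 1 the filter uses only the current and previous months, so the average never looks ahead in time.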

2. Measures of location, variability and shape

Comprehensive descriptive analysis involves using measures of position (central tendency), variability (dispersion) and shape to summarize distributions. For quantitative variables, the relevant measures include the mean, median, 1st and 3rd quartiles, standard deviation, interquartile range, variance and skewness/kurtosis indices.
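
The shape indices can be illustrated by hand. The sketch below assumes the moment-based (population) definitions, skewness = m3 / m2^(3/2) and excess kurtosis = m4 / m2^2 - 3, which are understood to be the definitions used by the moments package; the vector x is a hypothetical example, not data from the report.

```r
# Hypothetical sample with one large value in the upper tail
x <- c(1, 2, 3, 4, 10)

n <- length(x)
d <- x - mean(x)          # deviations from the mean

# Central moments (divided by n, not n - 1)
m2 <- sum(d^2) / n
m3 <- sum(d^3) / n
m4 <- sum(d^4) / n

skew_manual        <- m3 / m2^(3 / 2)   # > 0 means a longer right tail
kurt_excess_manual <- m4 / m2^2 - 3     # 0 for a normal distribution

print(c(skewness = skew_manual, kurtosis_excess = kurt_excess_manual))
```

A positive skewness here reflects the single large observation pulling the upper tail out.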

In the code chunk below, the function compute_indices() is defined to produce a comprehensive set of descriptive statistics for a quantitative variable. The function returns measures of position (mean, median, first and third quartiles) together with minimum and maximum values, dispersion (variance, standard deviation, interquartile range and coefficient of variation), and distributional shape (skewness and excess kurtosis).

The function is then applied to the selected quantitative variables (sales, volume, median_price, listings, and months_inventory), and the resulting statistics are assembled into a single summary table and rounded to improve readability in the final report.

# Function to compute descriptive statistics for a quantitative variable
compute_indices <- function(x) {
  # Convert to numeric and remove missing values
  x <- na.omit(as.numeric(x))
  
  # if no valid observations (size of data equals to 0)
  if (length(x) == 0) { 
    return(c(mean = NA, median = NA, q1 = NA, q3 = NA,min = NA, max = NA,
             variance = NA, sd = NA, iqr = NA, cv = NA,skewness = NA, kurtosis_excess = NA))
  }
  
  # Cache the mean and standard deviation for reuse below
  m <- mean(x)
  s <- sd(x)
  
  # Return a named vector of descriptive indices
  c(
    # --- position indices ---
    mean   = m,
    median = median(x),
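    # names = FALSE keeps the element names as "q1"/"q3" (not "q1.25%"/"q3.75%")
    q1     = quantile(x, 0.25, names = FALSE),
    q3     = quantile(x, 0.75, names = FALSE),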
    
    # --- extreme values ---
    min = min(x),
    max = max(x),
    
    # --- variability indices ---
    variance = var(x),
    sd       = s,
    iqr      = IQR(x),
    cv       = ifelse(m == 0, NA, s/m),
    
    # --- shape indices ---
    # When std.dev is 0, all observations are identical (constant variable),
    # so skewness and kurtosis are not defined
    skewness = ifelse(is.na(s) || s == 0, NA, skewness(x)),
    # the excess kurtosis is defined as kurtosis minus 3
    kurtosis_excess = ifelse(is.na(s) || s == 0, NA, kurtosis(x) - 3)
  )
}

# Quantitative variables of interest
vars <- c("sales", "volume", "median_price", "listings", "months_inventory")

# Apply the function to each variable and combine results in a table
results <- t(sapply(vars, function(v) compute_indices(df[[v]])))
results <- as.data.frame(results)

# round values for cleaner reporting
results_rounded <- round(results, 2)

Then, the summary statistics are transformed into a report-ready table.

# Build final table from computed results
# names of variables are moved from row names into first column (Variable)
table_results <- cbind(Variable = rownames(results_rounded), results_rounded)
rownames(table_results) <- NULL

# Rename columns to report-friendly labels,
# with the first column header intentionally left blank
colnames(table_results) <- c("","Mean","Median","Q1","Q3","Min","Max","Variance","SD","IQR","CV","Skewness","Excess Kurtosis")

# Create formatted table
kable(
  table_results,
  caption = "Descriptive statistics for quantitative variables",
  align = "lrrrrrrrrrrrr",
  booktabs = TRUE
) %>%
  # Add headers above for grouping indices
  add_header_above(c(" " = 1,"Position" = 4,"Extremes" = 2,"Variability" = 4,"Shape" = 2)) %>%
  # Add style for HTML
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed")
  )
Descriptive statistics for quantitative variables
Position
Extremes
Variability
Shape
Mean Median Q1 Q3 Min Max Variance SD IQR CV Skewness Excess Kurtosis
sales 192.29 175.50 127.00 247.00 79.00 423.00 6.34430e+03 79.65 120.00 0.41 0.72 -0.31
volume 31.01 27.06 17.66 40.89 8.17 83.55 2.77270e+02 16.65 23.23 0.54 0.88 0.18
median_price 132665.42 134500.00 117300.00 150050.00 73800.00 180000.00 5.13573e+08 22662.15 32750.00 0.17 -0.36 -0.62
listings 1738.02 1618.50 1026.50 2056.00 743.00 3296.00 5.66569e+05 752.71 1029.50 0.43 0.65 -0.79
months_inventory 9.19 8.95 7.80 10.95 3.40 14.90 5.31000e+00 2.30 3.15 0.25 0.04 -0.17

Since descriptive statistics were computed for the quantitative variables (sales, volume, median_price, listings, and months_inventory), frequency distributions are computed for the remaining categorical or discrete time-related variables, starting with year:

# function used to compute frequency table for specific variable
freq_dist_1var <- function(x) {
  ni <- table(x)
  fi <- ni / sum(ni)  # relative frequencies (table() already excludes NA values)
  Ni <- cumsum(ni)
  Fi <- cumsum(fi)
  return (cbind(ni,fi,Ni,Fi))
}

# apply function to variable year
freq_year  <- freq_dist_1var(df$year)
# render table
kable(freq_year, caption = "Frequency distribution - Year", align = "lrrrr", booktabs = TRUE)
Frequency distribution - Year
ni fi Ni Fi
2010 48 0.2 48 0.2
2011 48 0.2 96 0.4
2012 48 0.2 144 0.6
2013 48 0.2 192 0.8
2014 48 0.2 240 1.0

3. Identification of variables with the highest variability and skewness

Based on the computed summary table, the conclusions are as follows:

## 1 - Highest variability (CV): volume - 0.54
## 2 - Highest skewness (absolute): volume - 0.88

The first conclusion is based on the coefficient of variation (CV = sd / mean), which is the appropriate measure for comparing dispersion across variables expressed on different scales. In the reported results, volume has the largest CV (0.54), therefore volume exhibits the greatest relative variability.

The second conclusion is based on skewness. Considering asymmetry in absolute value, volume shows the largest skewness (0.88). Since the skewness is positive, the distribution of volume is right-skewed, indicating a longer upper tail and the presence of relatively high observations.
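
The selection can be sketched programmatically as follows. The values of cv and skew are copied from the rounded summary table in section 2; the object names are illustrative and do not come from the original code.

```r
# CV and skewness as reported (rounded) in the summary table of section 2
cv <- c(sales = 0.41, volume = 0.54, median_price = 0.17,
        listings = 0.43, months_inventory = 0.25)
skew <- c(sales = 0.72, volume = 0.88, median_price = -0.36,
          listings = 0.65, months_inventory = 0.04)

# Variable with the largest relative variability
most_variable <- names(which.max(cv))

# Variable with the largest asymmetry, regardless of direction
most_skewed <- names(which.max(abs(skew)))

cat("Highest variability (CV):", most_variable, "-", cv[most_variable], "\n")
cat("Highest skewness (absolute):", most_skewed, "-", skew[most_skewed], "\n")
```

Taking the absolute value matters because a strongly left-skewed variable would otherwise be ranked below a mildly right-skewed one.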

4. Creating class intervals for a quantitative variable

The quantitative variable median_price was selected and partitioned into class intervals in order to construct a frequency distribution and visualize the resulting frequencies using a bar chart.
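
Before building the classes, it may help to see how cut() treats values that fall exactly on a class boundary. The sketch below uses the same breakpoints as the classes in this section and a hypothetical vector x of boundary values; it suggests that, with right = FALSE and include.lowest = TRUE, the last class is closed on the right, so its exact label would be [160000 - 180000].

```r
# Breakpoints matching the classes used in this section
breaks <- seq(60000, 180000, by = 20000)

# Hypothetical prices placed on class boundaries
x <- c(60000, 80000, 179999, 180000)

# right = FALSE gives left-closed classes [a, b);
# include.lowest = TRUE additionally closes the last class on the right
cls <- cut(x, breaks = breaks, right = FALSE, include.lowest = TRUE)
class_index <- as.integer(cls)
print(data.frame(x = x, class = cls, class_index = class_index))

# Without include.lowest, the maximum value would fall outside every class
no_lowest <- cut(180000, breaks = breaks, right = FALSE)
print(is.na(no_lowest))
```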

# breakpoints for equal-width class intervals (chosen by pretty())
breaks_price <- pretty(df$median_price)

# Human-readable numeric labels (no scientific notation)
labels_price <- paste0(
  "[",
  formatC(breaks_price[-length(breaks_price)], format = "f", digits = 0, big.mark = ""),
  " - ",
  formatC(breaks_price[-1],                   format = "f", digits = 0, big.mark = ""),
  ")"
)

df$median_price_class <- cut(
  df$median_price,
  breaks = breaks_price,
  labels = labels_price,
  include.lowest = TRUE,
  right = FALSE
)
# apply function to variable median_price_class
freq_median_price  <- freq_dist_1var(df$median_price_class)
# render table
kable(freq_median_price, caption = "Frequency distribution - Median price", align = "lrrrr", booktabs = TRUE)
Frequency distribution - Median price
ni fi Ni Fi
[60000 - 80000) 1 0.0041667 1 0.0041667
[80000 - 100000) 23 0.0958333 24 0.1000000
[100000 - 120000) 41 0.1708333 65 0.2708333
[120000 - 140000) 74 0.3083333 139 0.5791667
[140000 - 160000) 80 0.3333333 219 0.9125000
[160000 - 180000) 21 0.0875000 240 1.0000000
# Convert matrix to data frame and keep class labels
freq_median_price_df <- as.data.frame(freq_median_price)
freq_median_price_df$Class <- rownames(freq_median_price_df)
# Preserve original order of classes
freq_median_price_df$Class <- factor(freq_median_price_df$Class, levels = freq_median_price_df$Class)

# Relative frequency bar chart
ggplot(freq_median_price_df, aes(x = Class, y = fi)) +
  geom_col(fill = "#2c7fb8", width = 0.5, color = "grey25") +
  labs(
    title = "Relative frequency distribution for median price classes",
    x = "Median price",
    y = "Relative frequency"
  ) +
  theme_bw(base_size = 12) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    panel.grid.minor = element_blank()
  )

The Gini heterogeneity index was computed based on the distribution of observations across classes by using the relative frequency fi values.

The unnormalized index is

\[ G = 1 - \sum_{i=1}^{k} f_i^2 \]

while the normalized version is

\[ G^{*} = \frac{k}{k-1}\left(1 - \sum_{i=1}^{k} f_i^2\right) \]

where \(f_i\) is the relative frequency (fi) of class i and k is the number of non-empty classes.

p <- freq_median_price_df$fi
# remove possible zero/NA classes
p <- p[!is.na(p) & p > 0]

# number of non empty classes
k <- length(p)

# gini heterogeneity (unnormalized)
G <- 1 - sum(p^2)

# normalized Gini heterogeneity in [0,1]
G_norm <- if (k > 1) (k / (k - 1)) * G else NA_real_

gini_df <- data.frame(
  k = k,
  G = G,
  G_norm = G_norm
)

kable(
  gini_df,
  caption = "Gini heterogeneity indices based on class relative frequencies",
  digits = 4,
  col.names = c("Number of non empty classes (k)",
                "Gini heterogeneity (G)",
                "Normalized Gini heterogeneity (G*)"),
  align = "lll",
  booktabs = TRUE
)
Gini heterogeneity indices based on class relative frequencies
Number of non empty classes (k) Gini heterogeneity (G) Normalized Gini heterogeneity (G*)
6 0.7478 0.8973

Based on the reported frequency distribution, the empirical distribution of median_price across the six class intervals has a relatively high Gini heterogeneity index (G = 0.7478). With k = 6 non-empty classes, the corresponding normalized index (G* = 0.8973) is close to the upper bound of 1, indicating substantial dispersion of observations across price bands.

At the same time, the class frequencies show clear substantive concentration in the mid-to-upper market segments: the intervals [120,000–140,000) and [140,000–160,000) jointly account for approximately 64% of the sample (about 0.31 and 0.33 relative frequency, respectively). By contrast, the lowest interval [60,000–80,000) is extremely sparse (ni=1, fi≈0.0042), indicating negligible representation at the bottom of the defined price bands.
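
The reported indices can be reproduced directly from the absolute frequencies in the table above. This is a minimal cross-check; the object names are illustrative.

```r
# Absolute frequencies (ni) of the six median_price classes, from the table above
ni <- c(1, 23, 41, 74, 80, 21)

# Relative frequencies and number of non-empty classes
p <- ni / sum(ni)
k <- sum(p > 0)

# Gini heterogeneity: 0 when all observations fall in one class
G <- 1 - sum(p^2)

# Normalized version: 1 when observations are spread evenly over the k classes
G_norm <- (k / (k - 1)) * G

print(round(c(k = k, G = G, G_norm = G_norm), 4))

# Share of the two central classes [120000, 140000) and [140000, 160000)
central_share <- sum(p[4:5])
print(round(central_share, 4))
```

The two readings are compatible: the index is high because five of the six classes carry non-negligible mass, even though two adjacent classes hold most of the observations.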

5. Probability calculation

Probabilities are estimated empirically from observed frequencies. For an event \(A\), the estimated probability is \[ \widehat{P}(A)=\frac{n_A}{N}, \] where \(n_A\) denotes the absolute frequency of event \(A\), and \(N\) is the total number of observations.
For two events \(A\) and \(B\), the joint probability is \[ \widehat{P}(A \cap B)=\frac{n_{A,B}}{N}, \] while the conditional probability (for \(n_B>0\)) is \[ \widehat{P}(A \mid B)=\frac{n_{A,B}}{n_B}. \] Percent values are obtained as \(100 \times \widehat{P}(\cdot)\).
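
The three estimators can be sketched on a stand-in data frame. The object toy below is hypothetical: it only reproduces the balanced design suggested by the frequency tables in this report (4 cities × 5 years × 12 months = 240 rows) and contains no real measurements. The choice of events A and B is likewise only an illustration.

```r
# Hypothetical balanced panel: one row per city, year and month
toy <- expand.grid(
  city  = c("Beaumont", "Bryan-College Station", "Tyler", "Wichita Falls"),
  year  = 2010:2014,
  month = 1:12,
  stringsAsFactors = FALSE
)

# Events as logical vectors
A <- toy$city == "Beaumont"
B <- toy$year == 2012

# Marginal probability: n_A / N
p_A <- mean(A)

# Joint probability: n_{A,B} / N
p_AB <- mean(A & B)

# Conditional probability: n_{A,B} / n_B (defined only when n_B > 0)
p_A_given_B <- if (sum(B) > 0) sum(A & B) / sum(B) else NA_real_

print(c(P_A = p_A, P_A_and_B = p_AB, P_A_given_B = p_A_given_B))
```

In this balanced layout the conditional probability equals the marginal one, which is what independence between city and year would imply.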

5.1 Probability that a randomly selected row from this dataset corresponds to the city of Beaumont

The probability is estimated using an empirical frequency approach by computing a frequency distribution for city, obtaining absolute frequencies (ni) and relative frequencies (fi) for each category.

# apply function to variable city
freq_city  <- freq_dist_1var(df$city)
# render table
kable(freq_city, caption = "Frequency distribution - City", align = "lrrrr", booktabs = TRUE)
Frequency distribution - City
ni fi Ni Fi
Beaumont 60 0.25 60 0.25
Bryan-College Station 60 0.25 120 0.50
Tyler 60 0.25 180 0.75
Wichita Falls 60 0.25 240 1.00

The estimated probability of selecting a row corresponding to Beaumont is then derived from the relative frequency of that category:

\[ \hat{P}(\text{City}=\text{Beaumont}) = f_{\text{Beaumont}} \]

# Estimated probability P(city = "Beaumont")
p_beaumont <- freq_city["Beaumont", "fi"]
# Build report table
prob_table <- data.frame(Event = "City = Beaumont", Probability = p_beaumont, Percentage = 100 * p_beaumont)
# Print with kable
knitr::kable(prob_table, caption = "Probability of selecting Beaumont",
             digits = c(0, 2, 2), align = "lrr", booktabs = TRUE)
Probability of selecting Beaumont
Event Probability Percentage
City = Beaumont 0.25 25

Based on the frequency distribution, the estimate is:

\[ \hat{P}(\text{City}=\text{Beaumont}) = 0.25 \]

which corresponds to 25% of the sample.

5.2 Probability that a randomly selected row corresponds to the month of July

Following the same approach described in section 5.1, the probability of selecting a row corresponding to July is obtained from the relative frequency of month = 7 in the distribution of month:

\[ \hat{P}(\text{Month}=7)=f_{7} \]

# apply function to variable month
freq_month  <- freq_dist_1var(df$month)

# render table
kable(freq_month, row.names = TRUE, caption = "Frequency distribution - Month", 
      digits = c(0, 4, 0, 4), align = "lrrrr", booktabs = TRUE)
Frequency distribution - Month
ni fi Ni Fi
1 20 0.0833 20 0.0833
2 20 0.0833 40 0.1667
3 20 0.0833 60 0.2500
4 20 0.0833 80 0.3333
5 20 0.0833 100 0.4167
6 20 0.0833 120 0.5000
7 20 0.0833 140 0.5833
8 20 0.0833 160 0.6667
9 20 0.0833 180 0.7500
10 20 0.0833 200 0.8333
11 20 0.0833 220 0.9167
12 20 0.0833 240 1.0000
# Estimated probability P(month = 7)
# index by row name rather than position, so the lookup does not depend on row order
p_month_july <- freq_month["7", "fi"]
# Build report table
prob_table <- data.frame(Event = "Month = 7", Probability = p_month_july, Percentage = 100 * p_month_july)
# Print with kable
knitr::kable(prob_table,caption = "Probability of selecting July",
             digits = c(0, 4, 4), align = "lrr", booktabs = TRUE)
Probability of selecting July
Event Probability Percentage
Month = 7 0.0833 8.3333

The estimated probability is:

\[ \hat{P}(\text{Month}=7)=0.0833 \]

which corresponds to approximately 8.33% of the sample.
This equals 20/240 = 1/12: the frequency table above shows a balanced temporal structure in which each of the 12 months contributes the same number of observations (20).

5.3 Probability that a randomly selected row corresponds to December 2012

Consistent with the empirical procedure adopted in sections 5.1 and 5.2, the probability for this event is obtained from the relative frequency of the category period = 2012-12-01 in the frequency distribution of period, a date variable that combines year and month.

# define column period in the data frame as the first day of each month
# zero-pad the month to 2 digits so the date string parses and sorts correctly
df$period <- as.Date(sprintf("%04d-%02d-01", as.integer(df$year), as.integer(df$month)))


# apply function to variable period
freq_period  <- freq_dist_1var(df$period)

# filter only 2012 rows (2012-01-01 ... 2012-12-01) to avoid printing full table
# Ni and Fi remain cumulative with respect to the original full period ordering
freq_period_2012 <- freq_period[grepl("^2012-", rownames(freq_period)), , drop = FALSE]

# render table 
kable(freq_period_2012, caption = "Frequency distribution - Period \n (filtered for 2012)", 
      digits = c(0, 4, 0, 4), align = "lrrrr", booktabs = TRUE)
Frequency distribution - Period (filtered for 2012)
ni fi Ni Fi
2012-01-01 4 0.0167 100 0.4167
2012-02-01 4 0.0167 104 0.4333
2012-03-01 4 0.0167 108 0.4500
2012-04-01 4 0.0167 112 0.4667
2012-05-01 4 0.0167 116 0.4833
2012-06-01 4 0.0167 120 0.5000
2012-07-01 4 0.0167 124 0.5167
2012-08-01 4 0.0167 128 0.5333
2012-09-01 4 0.0167 132 0.5500
2012-10-01 4 0.0167 136 0.5667
2012-11-01 4 0.0167 140 0.5833
2012-12-01 4 0.0167 144 0.6000
# Estimated probability P(month = 12 and year = 2012)
p_event_dec2012 <- freq_period["2012-12-01", "fi"]
# Build report table
prob_table <- data.frame(Event = "December 2012",
                         Probability = p_event_dec2012, Percentage = 100 * p_event_dec2012)
# Print with kable
knitr::kable(prob_table, caption = "Probability of selecting December 2012",
             digits = c(0, 4, 4), align = "lrr", booktabs = TRUE)
Probability of selecting December 2012
Event Probability Percentage
December 2012 0.0167 1.6667

Therefore, the estimated probability is:

\[ \hat{P}(\text{December 2012})=0.0167 \]

which corresponds to approximately 1.67% of the full dataset.

6. Creation of new variables

6.1 Define estimated average sale price

The estimated average sale price is computed as: \[ \text{avg_sale_price}=\frac{\text{volume}\times 10^{6}}{\text{sales}} \] because, as described in section 1, volume is measured in millions of dollars, while sales is the number of transactions per city-month.

# volume is in millions of USD, so multiply by 10^6
# then divide by number of sales
df$avg_sale_price <- ifelse(df$sales > 0, (df$volume * 1e+06) / df$sales, NA)

# print examples
# randomly select row indices, then sort them to keep original order
idx <- sort(sample(nrow(df), min(5, nrow(df))))
kable(df[idx, c("city", "year", "month", "volume", "sales", "avg_sale_price")])
city year month volume sales avg_sale_price
40 Beaumont 2013 4 29.433 198 148651.5
93 Bryan-College Station 2012 9 28.434 149 190832.2
206 Wichita Falls 2012 2 10.697 90 118855.6
209 Wichita Falls 2012 5 12.451 102 122068.6
229 Wichita Falls 2014 1 9.626 89 108157.3

The avg_sale_price provides a mean value per transaction and complements the existing median-based price indicator (median_price).
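
As a quick hand check of the formula, the snippet below recomputes the value for the Beaumont, April 2013 row shown above (volume = 29.433, sales = 198). The variable names are illustrative.

```r
# Values from the Beaumont, April 2013 row printed above
volume_musd <- 29.433   # total sales volume, in millions of USD
sales       <- 198      # number of transactions

# Convert millions of USD to USD, then divide by the number of sales
avg_sale_price <- (volume_musd * 1e6) / sales

print(round(avg_sale_price, 1))
```

The result agrees with the avg_sale_price reported for that row (148651.5).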

6.2 Define listing effectiveness

Listing effectiveness is defined as the extent to which available inventory is converted into sales.
A simple monthly absorption proxy is: \[ \text{listing_effectiveness}=\frac{\text{sales}}{\text{listings}} \] An alternative turnover-based indicator is: \[ \text{inventory_turnover_ratio}=\frac{1}{\text{months_inventory}} \]

While months_inventory describes how long the current stock would last, its inverse flips the perspective to show how quickly that inventory is being converted into sales: higher values indicate faster market turnover.
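
Both indicators can be checked by hand on a single observation. The sketch below uses the Wichita Falls, May 2014 row from the sample printed at the start of the report (sales = 140, listings = 899, months_inventory = 7.6); the variable names are illustrative.

```r
# Values from the Wichita Falls, May 2014 row of the sample table
sales            <- 140
listings         <- 899
months_inventory <- 7.6

# Share of active listings sold in the month
listing_effectiveness <- sales / listings

# Inverse of months of inventory: fraction of stock absorbed per month
inventory_turnover_ratio <- 1 / months_inventory

print(round(c(listing_effectiveness    = listing_effectiveness,
              inventory_turnover_ratio = inventory_turnover_ratio), 4))
```

The two values are of the same order of magnitude, which fits the idea that both measure the monthly rate at which inventory is absorbed.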

#listing effectiveness
df$listing_effectiveness <- ifelse(df$listings > 0, df$sales / df$listings, NA)

# inverse of months_inventory (higher means faster inventory turnover)
df$inventory_turnover_ratio <- ifelse(df$months_inventory > 0, 1 / df$months_inventory, NA)

# print examples
# randomly select row indices, then sort them to keep original order
idx <- sort(sample(nrow(df), min(5, nrow(df))))
kable(df[idx, c("city", "year", "month", "sales", "listings", "months_inventory",
                "listing_effectiveness", "inventory_turnover_ratio")])
city year month sales listings months_inventory listing_effectiveness inventory_turnover_ratio
25 Beaumont 2012 1 110 1647 11.4 0.0667881 0.0877193
92 Bryan-College Station 2012 8 296 1518 8.1 0.1949934 0.1234568
133 Tyler 2011 1 143 2852 12.6 0.0501403 0.0793651
163 Tyler 2013 7 369 2998 10.7 0.1230821 0.0934579
233 Wichita Falls 2014 5 140 899 7.6 0.1557286 0.1315789
# aggregate data by period across all cities
monthly_trend <- df %>%
  group_by(period) %>%
  summarise(
    # calculate means for both indicators
    listing_effectiveness = mean(listing_effectiveness, na.rm = TRUE),
    inventory_turnover_ratio = mean(inventory_turnover_ratio, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  pivot_longer(cols = c(listing_effectiveness, inventory_turnover_ratio),
               names_to = "Metric", values_to = "Value")

# plot monthly trends of effectiveness metrics
ggplot(monthly_trend, aes(x = period, y = Value, color = Metric)) +
  geom_line(linewidth = 0.9) +
  theme_bw(base_size = 11) +
  labs(title = "Monthly Trend of effectiveness metrics", x = NULL, y = "Mean value", color = NULL)

The monthly trend confirms a general increase in market absorption over the sample period for both indicators, which is consistent with their conceptual link as both proxy the speed of inventory absorption:

  • inventory_turnover_ratio shows a relatively smooth upward trajectory, rising from about 0.10–0.12 in early years to around 0.17–0.18 by the end of the period

  • listing_effectiveness is more volatile, with pronounced short-term oscillations and several peaks (up to ~0.20), suggesting higher sensitivity to short-run fluctuations in sales and listings; it nevertheless exhibits a clear positive trend

7. Conditional analysis

Conditional summary statistics are descriptive measures computed on subsets of the data defined by specific criteria (strata). Typical measures include the mean, median, total, and standard deviation; in particular, the mean under a given condition can be interpreted as a conditional expectation. These summaries are used to characterize how a variable behaves across different sub-populations or time segments.

This section reports conditional statistical analyses stratified by city, year, and month. The analysis is implemented in R (using either dplyr or base R). For each stratum, key summary statistics, primarily mean and standard deviation, are estimated and then visualized to support comparison of cross-sectional differences and temporal patterns.
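
For readers who prefer base R, a conditional mean and standard deviation can be sketched with tapply(). The data frame toy, its columns g and x, and the output name summary_base are hypothetical and serve only to illustrate the pattern.

```r
# Hypothetical data: a grouping variable g and a numeric variable x
toy <- data.frame(
  g = c("a", "a", "b", "b"),
  x = c(1, 3, 10, 14)
)

# Conditional mean and standard deviation of x within each level of g
means <- tapply(toy$x, toy$g, mean)
sds   <- tapply(toy$x, toy$g, sd)

# Assemble one row per stratum
summary_base <- data.frame(
  group  = names(means),
  x_mean = as.numeric(means),
  x_sd   = as.numeric(sds)
)

print(summary_base)
```

The dplyr version used in this section produces the same kind of table, with one row per stratum and one mean and sd column per variable.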

# quantitative variables and derived indicators
num_vars <- c("avg_sale_price","listing_effectiveness", "inventory_turnover_ratio")

# compute mean and standard deviation of every variable in num_vars
summarise_mean_sd <- function(data) {
  data %>%
    summarise(
      # mean of each selected column within the current group
      across(all_of(num_vars),\(x) mean(x, na.rm = TRUE), .names = "{.col}_mean"),
      # standard deviation of each selected column within the current group
      across(all_of(num_vars), \(x) sd(x, na.rm = TRUE), .names = "{.col}_sd"),
      .groups = "drop"  # return an ungrouped data frame
    )
}

# conditional summaries for different stratifications
summary_by_city        <- df %>% group_by(city)        %>% summarise_mean_sd()
summary_by_year        <- df %>% group_by(year)        %>% summarise_mean_sd()
summary_by_month       <- df %>% group_by(month)       %>% summarise_mean_sd()
summary_by_city_year   <- df %>% group_by(city, year)  %>% summarise_mean_sd()
summary_by_city_month  <- df %>% group_by(city, month) %>% summarise_mean_sd()
# print summary tables
kable(summary_by_city,       caption = "Conditional summary by city (mean, sd)",       digits = 4, booktabs = TRUE)
Conditional summary by city (mean, sd)
city avg_sale_price_mean listing_effectiveness_mean inventory_turnover_ratio_mean avg_sale_price_sd listing_effectiveness_sd inventory_turnover_ratio_sd
Beaumont 146640.4 0.1061 0.1032 11232.13 0.0267 0.0182
Bryan-College Station 183534.3 0.1473 0.1453 15149.35 0.0729 0.0533
Tyler 167676.8 0.0935 0.0909 12350.51 0.0235 0.0164
Wichita Falls 119430.0 0.1280 0.1293 11398.48 0.0247 0.0136
kable(summary_by_year,       caption = "Conditional summary by year (mean, sd)",       digits = 4, booktabs = TRUE)
Conditional summary by year (mean, sd)
year avg_sale_price_mean listing_effectiveness_mean inventory_turnover_ratio_mean avg_sale_price_sd listing_effectiveness_sd inventory_turnover_ratio_sd
2010 150188.6 0.0997 0.1046 23279.55 0.0337 0.0217
2011 148250.6 0.0927 0.0950 24938.38 0.0232 0.0181
2012 150898.7 0.1097 0.1040 26438.50 0.0281 0.0178
2013 158705.2 0.1346 0.1288 26523.81 0.0448 0.0314
2014 163558.7 0.1570 0.1534 31740.53 0.0618 0.0496
kable(summary_by_month,      caption = "Conditional summary by month (mean, sd)",      digits = 4, booktabs = TRUE)
Conditional summary by month (mean, sd)
month avg_sale_price_mean listing_effectiveness_mean inventory_turnover_ratio_mean avg_sale_price_sd listing_effectiveness_sd inventory_turnover_ratio_sd
1 145640.4 0.0831 0.1190 29819.11 0.0230 0.0290
2 148840.5 0.0878 0.1160 25120.42 0.0219 0.0284
3 151136.5 0.1160 0.1119 23237.92 0.0346 0.0283
4 151461.3 0.1253 0.1091 26174.30 0.0380 0.0297
5 158235.0 0.1415 0.1102 25787.19 0.0503 0.0314
6 161545.8 0.1424 0.1104 23470.46 0.0576 0.0337
7 156881.0 0.1435 0.1127 27220.12 0.0740 0.0386
8 156455.6 0.1419 0.1154 28253.21 0.0526 0.0394
9 156522.3 0.1117 0.1188 29669.41 0.0348 0.0417
10 155897.4 0.1119 0.1218 32527.29 0.0360 0.0418
11 154233.0 0.1025 0.1259 29684.87 0.0293 0.0437
12 154995.5 0.1173 0.1351 27008.87 0.0379 0.0496
kable(summary_by_city_year,  caption = "Conditional summary by city/year (mean, sd)",  digits = 4, booktabs = TRUE)
Conditional summary by city/year (mean, sd)
city year avg_sale_price_mean listing_effectiveness_mean inventory_turnover_ratio_mean avg_sale_price_sd listing_effectiveness_sd inventory_turnover_ratio_sd
Beaumont 2010 146582.5 0.0898 0.0920 13960.173 0.0195 0.0062
Beaumont 2011 145921.9 0.0823 0.0855 12655.337 0.0117 0.0052
Beaumont 2012 141475.9 0.1015 0.0933 10345.771 0.0158 0.0079
Beaumont 2013 150079.0 0.1225 0.1142 6245.121 0.0215 0.0069
Beaumont 2014 149142.7 0.1346 0.1311 11234.169 0.0218 0.0064
Bryan-College Station 2010 174601.8 0.1056 0.1163 11964.068 0.0396 0.0111
Bryan-College Station 2011 173689.0 0.1027 0.1033 11645.001 0.0315 0.0117
Bryan-College Station 2012 179360.6 0.1215 0.1141 9072.876 0.0423 0.0167
Bryan-College Station 2013 187315.8 0.1708 0.1613 12931.505 0.0649 0.0372
Bryan-College Station 2014 202704.3 0.2362 0.2318 8625.369 0.0768 0.0312
Tyler 2010 159537.5 0.0745 0.0795 8554.899 0.0151 0.0059
Tyler 2011 160248.0 0.0773 0.0747 8949.978 0.0126 0.0064
Tyler 2012 165533.0 0.0902 0.0866 12271.146 0.0134 0.0052
Tyler 2013 174501.8 0.1012 0.0986 8939.224 0.0143 0.0065
Tyler 2014 178563.5 0.1242 0.1152 10805.818 0.0199 0.0122
Wichita Falls 2010 120032.5 0.1290 0.1306 12351.214 0.0302 0.0084
Wichita Falls 2011 113143.6 0.1085 0.1166 8247.222 0.0159 0.0084
Wichita Falls 2012 117225.3 0.1255 0.1222 13981.539 0.0154 0.0072
Wichita Falls 2013 122924.3 0.1439 0.1413 8760.490 0.0283 0.0133
Wichita Falls 2014 123824.3 0.1331 0.1355 10994.397 0.0187 0.0135
kable(summary_by_city_month, caption = "Conditional summary by city/month (mean, sd)", digits = 4, booktabs = TRUE)
Conditional summary by city/month (mean, sd)
city month avg_sale_price_mean listing_effectiveness_mean inventory_turnover_ratio_mean avg_sale_price_sd listing_effectiveness_sd inventory_turnover_ratio_sd
Beaumont 1 142059.2 0.0760 0.1050 20363.512 0.0201 0.0151
Beaumont 2 146503.0 0.0826 0.1030 12974.719 0.0197 0.0146
Beaumont 3 149918.4 0.1037 0.1029 5398.706 0.0132 0.0198
Beaumont 4 142949.1 0.1118 0.1000 5511.596 0.0141 0.0176
Beaumont 5 146873.9 0.1208 0.0993 6495.480 0.0303 0.0187
Beaumont 6 148591.7 0.1183 0.0990 4913.971 0.0252 0.0186
Beaumont 7 153993.7 0.1061 0.0981 15215.577 0.0179 0.0190
Beaumont 8 150966.9 0.1278 0.1012 6549.042 0.0352 0.0198
Beaumont 9 144663.8 0.1043 0.1038 13874.571 0.0352 0.0238
Beaumont 10 148133.6 0.1137 0.1051 9899.859 0.0319 0.0213
Beaumont 11 134896.1 0.0966 0.1074 11773.634 0.0173 0.0223
Beaumont 12 150135.5 0.1119 0.1139 10028.542 0.0229 0.0228
Bryan-College Station 1 179365.7 0.0862 0.1403 13494.092 0.0256 0.0355
Bryan-College Station 2 169985.7 0.0867 0.1330 18446.113 0.0305 0.0389
Bryan-College Station 3 174920.3 0.1226 0.1246 8552.149 0.0546 0.0435
Bryan-College Station 4 182128.2 0.1443 0.1200 14123.928 0.0573 0.0468
Bryan-College Station 5 181804.4 0.1950 0.1294 18412.798 0.0620 0.0480
Bryan-College Station 6 181582.2 0.2164 0.1363 18298.850 0.0701 0.0530
Bryan-College Station 7 183344.8 0.2228 0.1447 16508.899 0.1132 0.0612
Bryan-College Station 8 184104.9 0.1943 0.1506 16633.849 0.0737 0.0608
Bryan-College Station 9 191815.7 0.1236 0.1578 9544.628 0.0520 0.0618
Bryan-College Station 10 193938.3 0.1214 0.1614 13905.882 0.0587 0.0614
Bryan-College Station 11 192760.5 0.1167 0.1661 11943.247 0.0436 0.0659
Bryan-College Station 12 186660.8 0.1381 0.1802 15651.209 0.0618 0.0775
Tyler 1 154935.3 0.0669 0.0929 6400.878 0.0161 0.0126
Tyler 2 164516.8 0.0768 0.0921 8645.045 0.0132 0.0135
Tyler 3 161441.0 0.0947 0.0899 11066.124 0.0114 0.0126
Tyler 4 162962.8 0.0971 0.0868 10856.908 0.0148 0.0133
Tyler 5 178711.5 0.1042 0.0866 6087.930 0.0233 0.0155
Tyler 6 180028.9 0.1071 0.0854 11050.260 0.0258 0.0151
Tyler 7 170866.7 0.1040 0.0852 8333.915 0.0225 0.0149
Tyler 8 173738.0 0.1028 0.0871 11343.693 0.0213 0.0159
Tyler 9 169106.3 0.0955 0.0896 17250.045 0.0248 0.0180
Tyler 10 167987.0 0.0950 0.0927 15113.128 0.0300 0.0199
Tyler 11 166102.4 0.0826 0.0975 7061.601 0.0267 0.0223
Tyler 12 161724.3 0.0952 0.1053 14740.546 0.0302 0.0260
Wichita Falls 1 106201.5 0.1032 0.1379 9788.224 0.0169 0.0156
Wichita Falls 2 114356.4 0.1052 0.1359 7397.539 0.0152 0.0120
Wichita Falls 3 118266.5 0.1428 0.1301 12167.279 0.0263 0.0067
Wichita Falls 4 117805.3 0.1481 0.1294 7684.451 0.0286 0.0116
Wichita Falls 5 125550.3 0.1459 0.1256 5015.104 0.0285 0.0136
Wichita Falls 6 135980.5 0.1278 0.1208 13412.726 0.0119 0.0092
Wichita Falls 7 119318.8 0.1411 0.1229 7206.987 0.0288 0.0120
Wichita Falls 8 117012.4 0.1428 0.1226 5664.009 0.0211 0.0128
Wichita Falls 9 120503.5 0.1235 0.1241 6905.672 0.0213 0.0162
Wichita Falls 10 113530.6 0.1176 0.1281 13971.742 0.0165 0.0166
Wichita Falls 11 123173.0 0.1141 0.1325 12234.014 0.0145 0.0154
Wichita Falls 12 121461.4 0.1242 0.1412 12532.343 0.0176 0.0144

7.1 Comparison of average sale price by city (and year)

The city-level chart summarizes conditional means of average sale price across all months and years within each market. Bryan–College Station (about 183,500 USD) and Tyler (about 167,700 USD) exhibit the highest central price levels, Wichita Falls records the lowest mean (about 119,400 USD), and Beaumont occupies an intermediate position (about 146,600 USD). The error bars show the within-city standard deviation, indicating that cities differ not only in price level but also in short-run variability.

# average sale price by city (mean ± SD)
ggplot(summary_by_city, 
       aes(x = city, y = avg_sale_price_mean)) +
  geom_col(col="black", fill = "#2c7fb8", width = 0.7) +
  geom_errorbar(
    aes(
      ymin = avg_sale_price_mean - avg_sale_price_sd,
      ymax = avg_sale_price_mean + avg_sale_price_sd
    ),
    width = 0.3
  ) +
  labs(
    title = "Average sale price by city (mean ± SD)",
    x = "City",
    y = "Average sale price (USD)"
  ) +
  theme_bw(base_size = 11)

The grouped bar chart refines the comparison by conditioning on both city and year, so each bar represents the mean price within a city–year stratum and the error bar captures dispersion within that stratum. Bryan–College Station shows the highest levels and a positive temporal gradient from 2011 to 2014, consistent with strengthening market conditions in that sub-market. Tyler also shows a steady increase over the sample period. Beaumont appears comparatively stable, with limited evidence of a sustained upward shift. Wichita Falls remains the lowest-priced market throughout, with only modest growth after 2011.

# average sale price by city and year (mean ± SD)
ggplot(summary_by_city_year, 
       aes(x = city, y = avg_sale_price_mean, fill = factor(year))) +
  geom_col(color = "black", position = position_dodge(width = 0.8), width = 0.7) +
  geom_errorbar(
    aes(
      ymin = avg_sale_price_mean - avg_sale_price_sd,
      ymax = avg_sale_price_mean + avg_sale_price_sd
    ),
    position = position_dodge(width = 0.8),
    width = 0.2
  ) +
  labs(
    title = "Average sale price by city and year (mean ± SD)",
    x = "City",
    y = "Average sale price (USD)",
    fill = "Year"
  ) +
  theme_bw(base_size = 11)

Overall, the figures point to marked cross-sectional heterogeneity in the Texas panel: geographic stratification is a primary source of variation in average transaction values.

7.2 Comparison of listing effectiveness over years by city

The line chart displays conditional means of listing effectiveness (sales relative to listings) for each city–year stratum. The series show a common upward shift after an early dip around 2011, consistent with an improvement in the conversion of active listings into sales. The magnitude and timing of that improvement, however, differ across cities:

  1. Bryan–College Station stands out from 2012 onward: effectiveness rises from roughly 0.10 in the early years to about 0.24 by 2014, the highest level in the plot.

  2. Wichita Falls begins with the highest effectiveness in 2010 (near 0.13), declines in 2011, recovers to a local peak around 2013 (≈0.144), and then falls slightly to ≈0.133 in 2014.

  3. Beaumont follows a comparatively smooth upward path after 2011, from about 0.09 to roughly 0.135 in 2014, converging toward the levels of Wichita Falls by the end of the period.

  4. Tyler shows a stable linear improvement, reaching approximately 0.124 in 2014 and reducing the gap with Beaumont and Wichita Falls.

# listing effectiveness (sales/listings) over time, by city
ggplot(
  summary_by_city_year,
  aes(x = year, y = listing_effectiveness_mean, color = city, group = city)
) +
  geom_line(linewidth = 0.8) +
  geom_point(size = 1.8) +
  labs(
    title = "Listing Effectiveness by year and city",
    x = "Year",
    y = "Listing effectiveness (mean)",
    color = "City"
  ) +
  theme_bw(base_size = 11)

The figure supports two main conclusions.

First, city-specific dynamics dominate: Bryan–College Station diverges upward while the other three markets move in a narrower band.

Second, by 2014 there is partial convergence among Beaumont, Tyler and Wichita Falls (effectiveness roughly 0.12–0.14), whereas Bryan–College Station remains an outlier on the high side.
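
The convergence statement can be checked with a few lines of arithmetic. The sketch below uses only the approximate 2014 values quoted in the list above (readings from the chart, not exact figures); the object names are hypothetical.

```r
# Approximate 2014 listing effectiveness, as quoted in the text above
eff_2014 <- c(
  "Beaumont"              = 0.135,
  "Bryan-College Station" = 0.24,
  "Tyler"                 = 0.124,
  "Wichita Falls"         = 0.133
)

# The three markets described as converging
others <- eff_2014[names(eff_2014) != "Bryan-College Station"]

# Width of the band spanned by the three converging cities
band_width <- diff(range(others))

# Distance between Bryan-College Station and the next-highest city
bcs_gap <- eff_2014[["Bryan-College Station"]] - max(others)

print(round(c(band_width = band_width, bcs_gap = bcs_gap), 3))
```

With these readings the three lower markets sit within about one percentage point of each other, while Bryan–College Station is roughly ten points above the nearest of them.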

7.3 Seasonal patterns of inventory turnover and listing effectiveness by month and city

The chart reports conditional means of inventory turnover by calendar month within each city by aggregating across years in summary_by_city_month:

  1. Bryan–College Station shows the strongest seasonal swing: turnover falls to about 0.12 around month 4, then rises steadily to roughly 0.18 in December, the highest value on the plot, which is consistent with faster clearing of listings in the second half of the year.

  2. Wichita Falls starts close to Bryan–College Station (≈0.138), drifts down to a trough near 0.12 in month 6, and recovers to about 0.14 by December, ranking second.

  3. Beaumont and Tyler follow a milder U-shaped pattern, with mid-year lows (Tyler near 0.085 in month 7, the lowest overall) and modest year-end gains (Tyler ≈0.105, Beaumont ≈0.115).

Shared features include a mid-year dip (months 4–7) and a fourth-quarter rise, which supports interpreting turnover as driven by both seasonality and persistent city effects.

# seasonal pattern of inventory turnover by month and city
ggplot(
  summary_by_city_month,
  aes(x = month, y = inventory_turnover_ratio_mean, color = city, group = city)
) +
  geom_line(linewidth = 0.9) +
  geom_point(size = 1.6) +
  scale_x_continuous(breaks = 1:12) +
  labs(
    title = "Inventory turnover ratio by month and city",
    x = "Month",
    y = "Inventory turnover ratio (mean)",
    color = "City"
  ) +
  theme_bw(base_size = 11)

The second figure plots conditional means of listing effectiveness by month and city. Seasonality is again evident, with most cities showing higher absorption in spring–summer and lower values in late autumn, then a small December uptick.

Bryan–College Station exhibits the largest amplitude: effectiveness rises from about 0.09 early in the year to a peak near 0.22 in July, then falls sharply through August–September and stabilizes around 0.12–0.14. That pattern indicates a concentrated summer selling season in that market.

Wichita Falls peaks earlier, around 0.15 in April, with a secondary high near 0.14 in August, and ends the year near 0.12.

Beaumont fluctuates between roughly 0.08 and 0.13, with a high near 0.13 in August.

Tyler stays below the other cities all year (about 0.07–0.11), with a gentle peak near 0.11 in June and a November low near 0.08.

# seasonal pattern of listing effectiveness by month and city
ggplot(
  summary_by_city_month,
  aes(x = month, y = listing_effectiveness_mean, color = city, group = city)
) +
  geom_line(linewidth = 0.9) +
  geom_point(size = 1.6) +
  scale_x_continuous(breaks = 1:12) +
  labs(
    title = "Seasonal pattern of listing effectiveness by month and city",
    x = "Month",
    y = "Listing effectiveness (mean)",
    color = "City"
  ) +
  theme_bw(base_size = 11)

8. Creating visualizations with ggplot2

In this section, customized graphics are produced using ggplot2 to support comparative analysis of the Texas real estate data. The visualizations address three objectives:

  1. compare the distribution of median sale price across cities (boxplots);
  2. compare total sales by month and city (bar charts);
  3. examine sales dynamics over different historical periods (line charts).

Together, these figures facilitate assessment of cross-sectional heterogeneity, seasonal patterns, and temporal trends in market activity.

8.1 Distribution of median sale price by city

The boxplots summarize the empirical distribution of monthly median sale prices within each city (all years pooled). The median line and interquartile range (IQR) describe the central level and spread of typical prices; whiskers and points indicate the remaining range and upper-tail outliers.

Bryan–College Station shows the highest price level (mean of monthly medians about $157,000), followed by Tyler (≈ $141,000), Beaumont (≈ $130,000), and Wichita Falls (≈ $102,000). The gap between cities indicates strong cross-sectional heterogeneity in price levels.

Mean and IQR of median sale price by city
City Mean (USD) IQR (USD)
Beaumont 129988.3 11525
Bryan-College Station 157488.3 11175
Tyler 141441.7 13700
Wichita Falls 101743.3 16375

Box heights (IQR) are broadly similar across cities in absolute terms (about $11,200 to $16,400), although Wichita Falls and Tyler are somewhat wider; relative to its lower price level, Wichita Falls has the largest spread. That pattern suggests differences between cities are driven mainly by level shifts, not by market volatility.

ggplot(df, aes(x = city, y = median_price, fill = city)) +
  geom_boxplot(width = 0.6, color = "black") +
  labs(
    title = "Distribution of median sale price by city",
    x = "City",
    y = "Median sale price (USD)"
  ) +
  theme_bw(base_size = 11) +
  theme(legend.position = "none")
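
To compare spread across cities with different price levels, the IQR can be expressed as a fraction of the mean. The sketch below uses the values from the table above; the ratio `rel_iqr` is an illustrative measure introduced here, not one reported in the original analysis.

```r
# Values copied from the "Mean and IQR of median sale price by city" table
price_summary <- data.frame(
  city = c("Beaumont", "Bryan-College Station", "Tyler", "Wichita Falls"),
  mean_usd = c(129988.3, 157488.3, 141441.7, 101743.3),
  iqr_usd  = c(11525, 11175, 13700, 16375)
)

# Relative spread: IQR as a share of the mean price level (illustrative)
price_summary$rel_iqr <- price_summary$iqr_usd / price_summary$mean_usd

# Order cities from the widest to the narrowest relative spread
price_summary <- price_summary[order(-price_summary$rel_iqr), ]
print(price_summary, row.names = FALSE, digits = 3)
```

On this measure Wichita Falls has the widest relative spread (about 16% of its mean) and Bryan–College Station the narrowest (about 7%), even though the absolute IQRs are of similar magnitude.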

Disaggregating by year refines the pooled comparison and highlights temporal dynamics within each market.

The ordering by price level is preserved in most years: Bryan–College Station remains highest and Wichita Falls lowest, with Tyler and Beaumont in between. By 2014, Tyler's distribution has shifted upward and overtakes Beaumont. Tyler also displays the most regular year-over-year rise in median level, consistent with a sustained local price increase.

ggplot(df, aes(x = city, y = median_price, fill = factor(year))) +
  geom_boxplot(
    position = position_dodge(width = 0.8),
    width = 0.7,
    color = "black"
  ) +
  labs(
    title = "Distribution of median sale price by city and year",
    x = "City",
    y = "Median sale price (USD)",
    fill = "Year"
  ) +
  theme_bw(base_size = 11)

8.2 Total sales by month and city

The stacked bar chart aggregates sales counts by calendar month and city, pooling observations across all years in the panel.

Total market activity shows a pronounced seasonal cycle: volumes are lowest in January (2548 sales), rise through spring, and peak in June (4871 sales), then decline through autumn to a secondary low in November (3137), with a modest recovery in December, defining a summer-weighted selling season pattern.

ggplot(total_sales_month_city, aes(x = factor(month), y = total_sales, fill = city)) +
  geom_col(position = "stack", width = 0.7, color = "black") +
  scale_x_discrete(breaks = 1:12) +
  labs(title = "Total sales by month and city (absolute)", x = "Month", y = "Total sales (count)", fill = "City") +
  theme_bw(base_size = 11)

The normalized chart displays each city's share of that month's aggregate. Relative composition is broadly stable across months: Tyler accounts for about 33–39% of monthly sales, Beaumont about 20–26%, Bryan–College Station about 21–33%, and Wichita Falls about 13–17%. The main shift occurs mid-year, when Bryan–College Station gains relative share around months 5–7 (32–33%), with small offsetting declines for Tyler and Beaumont.

This view supports the conclusion that geographic structure in the four-city panel is persistent, with only limited evidence of month-specific reallocation among cities.

Month Beaumont Bryan-College Station Tyler Wichita Falls
1 0.24 0.23 0.36 0.17
2 0.24 0.22 0.38 0.16
3 0.23 0.25 0.35 0.17
4 0.22 0.28 0.34 0.16
5 0.22 0.32 0.33 0.14
6 0.21 0.33 0.34 0.13
7 0.20 0.32 0.34 0.14
8 0.23 0.28 0.34 0.15
9 0.24 0.22 0.39 0.16
10 0.26 0.21 0.38 0.15
11 0.25 0.23 0.36 0.16
12 0.26 0.23 0.36 0.15

# each city's share of monthly sales (position = "fill" rescales bars to 1)
ggplot(total_sales_month_city) +
  geom_bar(aes(x = month, y = total_sales, fill = factor(city)), stat = "identity", position = "fill") +
  scale_x_continuous(breaks = 1:12) +
  labs(title = "Total sales by month and city (normalized)", x = "Month", y = "Share of total sales", fill = "City") +
  theme_bw(base_size = 11)
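
The share table above can be derived from a month-by-city totals table by dividing each city's total by the monthly aggregate. The block below is a sketch on toy data; `toy_sales` and `toy_shares` are hypothetical names, and the numbers are not from the dataset.

```r
library(dplyr)

# Toy totals by month and city (hypothetical values)
toy_sales <- data.frame(
  month       = rep(1:2, each = 2),
  city        = rep(c("A", "B"), times = 2),
  total_sales = c(30, 70, 45, 55)
)

# Share of each city within its month: total / sum of totals in that month
toy_shares <- toy_sales %>%
  group_by(month) %>%
  mutate(share = total_sales / sum(total_sales)) %>%
  ungroup()

print(toy_shares)
```

Within each month the shares sum to 1, which is the same rescaling that `position = "fill"` applies to the stacked bars.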

The chart below displays absolute monthly sales counts stacked by city within each calendar year. Each bar's height is the total number of transactions in that month–year, and segment heights show each city's contribution in units.

Total monthly sales generally increase over the sample period, and city segments grow in parallel during many months, so the rise reflects an overall volume gain rather than a single city driving the entire increase.

In every year considered, sales tend to be lower in the early months (especially January), peak in mid-summer (June–August), and then ease toward year-end. This pattern aligns with the pooled monthly stacked chart, with summer as the dominant selling season in these markets.

# total sales per city and month and years
total_sales_year_month_city <- df %>%
  group_by(city, year, month) %>%
  summarise(total_sales = sum(sales, na.rm = TRUE), .groups = "drop")

ggplot(total_sales_year_month_city) +
  geom_bar(aes(x = month, y = total_sales, fill = factor(city)), stat = "identity", position = "stack") +
  scale_x_continuous(breaks = 1:12) +
  facet_wrap(~ year, ncol = 2) +
  labs(title = "Total sales by year, month and city", x = "Month", y = "Total sales (count)", fill = "City") +
  theme_bw(base_size = 11)

8.3 Sales dynamics over historical periods

The city-specific line charts (month on the x-axis, one line per year, 2010–2014) show strong seasonality and generally rising sales over the period, but with clear cross-city differences.

Tyler, Beaumont, and Bryan–College Station all record higher volumes in the last two years (2013–2014), with strong counts in spring–summer and lower counts in winter. In particular, Bryan–College Station has the most regular pattern (sharp June–July peak, then a fast drop) and the highest peak levels.

# plot monthly sales for one city with one line per calendar year.
plot_sales_by_city_period <- function(city_name) {
  df %>%
    # case insensitive comparison
    filter(str_to_upper(city) == str_to_upper(city_name)) %>%
    # one line for each year
    ggplot(aes(x = month, y = sales, color = factor(year), group = year)) +
    geom_line(linewidth = 0.9) +
    geom_point(size = 1.5) +
    scale_x_continuous(breaks = 1:12) +
    labs(
      title = paste("Seasonal sales by year —", city_name),
      x = "Month",
      y = "Sales (count)",
      color = "Year"
    ) +
    theme_bw(base_size = 11)
}

Wichita Falls is smaller in scale and less predictable: year lines overlap, peaks shift across months, and there is no clear upward trend over 2010–2014, which makes it the most volatile market in the report.

The figure plots monthly sales against period (2010–early 2015), faceted by city with free y-scales so each panel shows local level and variation. Observed series (blue) are overlaid with a linear trend (red dashed).

Beaumont shows a positive upward trend with clear seasonal oscillation.

Bryan–College Station combines a steep positive trend with regular, amplifying seasonality: peak months rise markedly in later years (above 400 units), while troughs remain near 100, so this market exhibits both the strongest growth and large seasonal swings.

Tyler displays a steady upward trend and consistent seasonal cycles, with a higher baseline than Beaumont and Wichita Falls and peaks exceeding 400 in 2014. Tyler is the largest active market in peak volume among the four.

For Wichita Falls the linear trend is essentially flat, implying little net growth over the period, and the seasonal pattern is less smooth than in the other three cities.

ggplot(df, aes(x = period, y = sales)) +
  geom_line(linewidth = 0.8, color = "#2c7fb8") +
  geom_point(size = 1.2, color = "#2c7fb8") +
  geom_smooth(method = "lm", se = FALSE, color = "red",
              linewidth = 0.8, linetype = "dashed") +
  facet_wrap(~ city, ncol = 2, scales = "free_y") +
  labs(
    title = "Sales dynamics over time by city",
    x = NULL,
    y = "Sales (count)"
  ) +
  theme_bw(base_size = 11)
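
The chart relies on a `period` column and on a fitted linear trend. The sketch below shows one way to build such a date from `year` and `month` (first day of the month) and to read the trend slope from `lm()`. It uses a toy series with a built-in upward drift; the names `toy`, `fit`, `slope_per_day` and `slope_per_year` are hypothetical, and the report's own construction of `period` may differ.

```r
# Toy monthly series over two years (hypothetical values)
toy <- data.frame(
  year  = rep(2010:2011, each = 12),
  month = rep(1:12, times = 2)
)
toy$sales <- 100 + 2 * (seq_len(nrow(toy)) - 1)  # +2 sales per month

# Period variable: first day of each month as a Date
toy$period <- as.Date(sprintf("%d-%02d-01", toy$year, toy$month))

# Linear trend of sales on time, analogous to geom_smooth(method = "lm")
fit <- lm(sales ~ as.numeric(period), data = toy)

# as.numeric(Date) counts days, so the slope is in sales per day
slope_per_day  <- unname(coef(fit)[2])
slope_per_year <- slope_per_day * 365.25

print(round(slope_per_year, 1))
```

Fitting the same model separately for each city would give one slope per panel; a slope close to zero corresponds to the flat trend line described for Wichita Falls.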

9. Conclusions

This report presents a descriptive statistical analysis of the Texas real estate panel in realestate_texas.csv with monthly market indicators for four cities (Beaumont, Bryan–College Station, Tyler, Wichita Falls) over 2010–2014 (N = 240 city–month observations).

Pooled descriptive statistics show that volume has the greatest relative dispersion (CV ≈ 0.54) and the strongest right-skewness (≈ 0.88), while median_price is comparatively stable (CV ≈ 0.17).

Classifying median_price into six intervals yields substantial heterogeneity across bands (Gini G = 0.7478, normalized G = 0.8973), yet most observations fall in mid-to-upper segments ([120,000–140,000) and [140,000–160,000) ≈ 64% combined), with negligible mass at the lowest band.
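
As a sketch of the arithmetic behind the two Gini figures, the block below assumes the index is the Gini heterogeneity index, G = 1 - sum(f_i^2) over the relative class frequencies f_i, normalized by its maximum (k - 1) / k for k classes. The function name `gini_heterogeneity` and the example counts are hypothetical; only the value 0.7478 and the six classes come from the text.

```r
# Gini heterogeneity index for a vector of class counts (assumed definition)
gini_heterogeneity <- function(counts) {
  f <- counts / sum(counts)          # relative frequencies
  k <- length(counts)                # number of classes
  g <- 1 - sum(f^2)                  # raw index, between 0 and (k - 1) / k
  c(G = g, G_normalized = g / ((k - 1) / k))
}

# Uniform counts over six classes give the maximum heterogeneity
uniform_six <- gini_heterogeneity(rep(10, 6))

# Normalizing the reported G = 0.7478 by the six-class maximum of 5/6
reported_ratio <- 0.7478 / ((6 - 1) / 6)

print(round(uniform_six, 4))
print(reported_ratio)
```

Under this assumed definition, normalizing the rounded G of 0.7478 gives a value very close to the reported 0.8973; the small difference in the last digit is consistent with rounding of G.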

Conditional summaries by city, year, and month confirm that geography, calendar year, and season jointly shape outcomes.

In the graphical evidence, boxplots of median_price reveal a stable city ranking (Bryan–College Station and Tyler highest; Wichita Falls lowest), with appreciation in 2013–2014 for the upper-tier cities.

Stacked and normalized sales charts show strong seasonality in aggregate volume (peak around June) but broadly stable city shares across months and years.

Line charts of sales over period and by year within city indicate positive trends in Beaumont, Bryan–College Station, and Tyler, with pronounced and regular seasonality in the larger markets. By contrast, Wichita Falls shows stagnant dynamics.

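As an overall assessment, the Texas submarkets in this sample are characterized by persistent city-level differences in price, pronounced summer-weighted seasonality in sales, and rising sales volumes in three of the four markets.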

Wichita Falls behaves as a smaller and less trending market.