At Texas Realty Insights, we continuously strive to provide data-driven insights that guide our strategic decisions in the real estate market. This analysis focuses on historical sales data across Texas cities, examining trends in property prices, sales volume, and listing activity over time.
By combining descriptive statistics with clear visualizations, our goal is to identify patterns that reveal market dynamics, seasonal fluctuations, and city-level differences, allowing us to optimize marketing strategies and resource allocation.
The dataset used for this analysis contains historical real estate sales information for four Texas cities between 2010 and 2014. It includes metrics such as the total number of sales, total sales volume, median sale price, number of active listings, and months of inventory.
In this section, we first provide a preview of the dataset and then describe each variable in terms of type, meaning, and statistical classification.
The dataset includes 240 monthly records covering four metropolitan areas in Texas (Beaumont, Bryan-College Station, Tyler, and Wichita Falls) from 2010 to 2014.
The following table shows the first few rows of the dataset, providing a quick glimpse of the cities, time frame, and the key numerical metrics.
Table 1. Dataset Preview

| city | year | month | sales | volume | median_price | listings | months_inventory |
|---|---|---|---|---|---|---|---|
| Beaumont | 2010 | 1 | 83 | 14.162 | 163800 | 1533 | 9.5 |
| Beaumont | 2010 | 2 | 108 | 17.690 | 138200 | 1586 | 10.0 |
| Beaumont | 2010 | 3 | 182 | 28.701 | 122400 | 1689 | 10.6 |
| Beaumont | 2010 | 4 | 200 | 26.819 | 123200 | 1708 | 10.6 |
| Beaumont | 2010 | 5 | 202 | 28.833 | 123100 | 1771 | 10.9 |
| Beaumont | 2010 | 6 | 189 | 27.219 | 122800 | 1803 | 11.1 |
Below is a detailed description of each variable, including its type and statistical classification.
Table 2. Variable Description

| Variable Name | R Type | Meaning | Classification | Possible Analysis |
|---|---|---|---|---|
| city | Character | Reference city | Qualitative nominal | Frequency, mode |
| year | Integer | Reference year | Qualitative ordinal | Frequency, mode, temporal trends |
| month | Integer | Reference month | Qualitative ordinal | Frequency, mode, seasonal trends |
| sales | Integer | Total number of sales | Quantitative discrete | Position, variability, and shape indexes |
| volume | Numeric | Total value of sales (million USD) | Quantitative continuous | Position, variability, and shape indexes |
| median_price | Numeric | Median sale price (USD) | Quantitative continuous | Position, variability, and shape indexes |
| listings | Integer | Total number of active listings | Quantitative discrete | Position, variability, and shape indexes |
| months_inventory | Numeric | Months of inventory | Quantitative continuous | Position, variability, and shape indexes |
In this dataset, city, year, and
month are treated as qualitative variables. While
year and month are stored as integers,
arithmetic operations such as averaging are not meaningful for these
variables. Instead, they serve to group and organize the data over time,
enabling the examination of temporal and seasonal trends across
cities.
For all qualitative variables, descriptive analysis focuses on
frequencies, proportions, and modes, while preserving the natural order
of ordinal variables like year and month for
clear visualizations and summaries.
Conversely, for all quantitative variables, it is possible to compute the full range of descriptive statistics — including measures of position, dispersion, and shape — to better understand their distributional characteristics.
This chapter summarizes the key statistical characteristics of the quantitative and qualitative variables in the dataset. For quantitative variables, measures of position, variability, and shape are calculated. For qualitative variables, frequency distributions are created to understand the distribution of observations across categories.
This section presents the position indexes of the quantitative variables:
Mean: the arithmetic average of the variable, representing its central value.
Min: the smallest observed value in the dataset.
Q1 (1st quartile): the value below which 25% of the observations fall.
Median (Q2 / 2nd quartile): the value below which 50% of the observations fall, representing central tendency.
Q3 (3rd quartile): the value below which 75% of the observations fall.
Max: the largest observed value in the dataset.
Table 3. Position indexes: mean, median, and quartiles

| Variable | Mean | Min | Q1 | Median | Q3 | Max |
|---|---|---|---|---|---|---|
| sales | 192.29 | 79.00 | 127.00 | 175.50 | 247.00 | 423.00 |
| volume | 31.01 | 8.17 | 17.66 | 27.06 | 40.89 | 83.55 |
| median_price | 132,665.42 | 73,800.00 | 117,300.00 | 134,500.00 | 150,050.00 | 180,000.00 |
| listings | 1,738.02 | 743.00 | 1,026.50 | 1,618.50 | 2,056.00 | 3,296.00 |
| months_inventory | 9.19 | 3.40 | 7.80 | 8.95 | 10.95 | 14.90 |
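The position indexes defined above can be sketched in a few lines of Python (the report itself was produced in R, so this is an independent illustration, not the original code). The sample is the six `sales` values from the dataset preview in Table 1, so the output will not match Table 3, which covers all 240 records. The quartiles use linear interpolation between order statistics (Python's `"inclusive"` method, the same rule as R's default `quantile()` type); that the report used this rule is an assumption.

```python
import statistics

# First six `sales` values from the dataset preview (Table 1), Beaumont 2010.
sales_preview = [83, 108, 182, 200, 202, 189]


def position_indexes(values):
    """Return mean, min, quartiles and max of a numeric sample."""
    # "inclusive" interpolates linearly between order statistics.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.fmean(values),
        "min": min(values),
        "q1": q1,
        "median": q2,
        "q3": q3,
        "max": max(values),
    }


summary = position_indexes(sales_preview)
for name, value in summary.items():
    print(f"{name:>6}: {value:.2f}")
```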
This section presents measures of dispersion for the quantitative variables, providing insight into the spread and relative variability of the data:
Standard Deviation (SD): measures the typical distance of observations from the mean (the square root of the average squared deviation), indicating how spread out the values are.
Range: the difference between the maximum and minimum values, showing the total spread of the data.
Interquartile Range (IQR): the difference between the third quartile (Q3) and the first quartile (Q1), representing the spread of the middle 50% of the data.
Coefficient of Variation (CV%): the standard deviation expressed as a percentage of the mean, allowing comparison of variability between variables with different units or scales.
Table 4. Variability Indexes

| Variable | StdDev | Range | IQR | CV |
|---|---|---|---|---|
| sales | 79.65 | 344.00 | 120.00 | 41.42% |
| volume | 16.65 | 75.38 | 23.23 | 53.71% |
| median_price | 22,662.15 | 106,200.00 | 32,750.00 | 17.08% |
| listings | 752.71 | 2,553.00 | 1,029.50 | 43.31% |
| months_inventory | 2.30 | 11.50 | 3.15 | 25.06% |
To compare variability across variables measured in different units, the coefficient of variation (CV) is used. In this dataset:
The variable volume exhibits the highest relative
variability, with a CV of 53.71%, indicating that the
total sales value fluctuates substantially from month to month and
across cities.
In contrast, median_price has the lowest CV at
17.08%, suggesting that median property prices are
relatively less dispersed over time and cities compared to other
metrics.
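As a cross-check, Range, IQR, and CV can be recomputed from the published figures. The Python sketch below (an illustration, not the report's original R code) takes the values of Table 3 and the StdDev column of Table 4 as inputs. Because those inputs are already rounded, a recomputed CV can differ from Table 4 in the second decimal place.

```python
# Values copied from Table 3: variable -> (mean, min, q1, q3, max).
table3 = {
    "sales": (192.29, 79.00, 127.00, 247.00, 423.00),
    "volume": (31.01, 8.17, 17.66, 40.89, 83.55),
    "median_price": (132665.42, 73800.00, 117300.00, 150050.00, 180000.00),
    "listings": (1738.02, 743.00, 1026.50, 2056.00, 3296.00),
    "months_inventory": (9.19, 3.40, 7.80, 10.95, 14.90),
}
# StdDev column of Table 4.
std_dev = {
    "sales": 79.65,
    "volume": 16.65,
    "median_price": 22662.15,
    "listings": 752.71,
    "months_inventory": 2.30,
}


def variability(name):
    """Range, IQR and coefficient of variation (in %) for one variable."""
    mean, lowest, q1, q3, highest = table3[name]
    return {
        "range": highest - lowest,
        "iqr": q3 - q1,
        "cv_pct": 100 * std_dev[name] / mean,
    }


results = {name: variability(name) for name in table3}
for name, r in results.items():
    print(f"{name:<17} range={r['range']:.2f} iqr={r['iqr']:.2f} cv={r['cv_pct']:.2f}%")

most_variable = max(results, key=lambda n: results[n]["cv_pct"])
least_variable = min(results, key=lambda n: results[n]["cv_pct"])
print(most_variable, least_variable)
```

The ranking by CV is unaffected by the rounding: `volume` remains the most variable and `median_price` the least.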
This section examines the asymmetry and peakedness of the quantitative variables using skewness and kurtosis, providing insight into the shape of their distributions:
Skewness measures the asymmetry of a distribution. Positive skewness indicates a longer right tail, while negative skewness indicates a longer left tail. Values near 0 suggest a roughly symmetric distribution.
Kurtosis measures tail weight (and, loosely, peakedness) relative to a normal distribution. The values reported here are excess kurtosis, so 0 corresponds to a normal-like shape; positive values indicate heavier tails and a sharper peak (leptokurtic), and negative values indicate lighter tails and a flatter shape (platykurtic).
Table 5. Shape Indexes

| Variable | Skewness | Kurtosis |
|---|---|---|
| sales | 0.72 | -0.31 |
| volume | 0.88 | 0.18 |
| median_price | -0.36 | -0.62 |
| listings | 0.65 | -0.79 |
| months_inventory | 0.04 | -0.17 |
In this dataset, the shape indexes of the quantitative variables indicate the following:
sales is moderately positively skewed (0.72) and
slightly platykurtic (-0.31).
volume is positively skewed (0.88) and slightly
leptokurtic (0.18).
median_price is slightly negatively skewed (-0.36)
and platykurtic (-0.62).
listings is positively skewed (0.65) and platykurtic
(-0.79).
months_inventory is nearly symmetric (0.04) and
slightly platykurtic (-0.17), indicating a distribution quite similar to
a normal distribution.
Of these variables, volume exhibits the most pronounced
asymmetry, with a positive skew. In real estate, this implies that,
although the total sales value is generally moderate, there are
occasional months with exceptionally high sales values. These spikes may
be due to seasonal effects, large transactions or market
opportunities.
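The report does not state which estimator produced the shape indexes. A minimal Python sketch, assuming the plain moment-based definitions (skewness as m3 / m2^1.5 and excess kurtosis as m4 / m2^2 - 3), is shown below. The two samples are hypothetical toy data chosen to make the sign of the skewness obvious; they are not taken from the dataset.

```python
import statistics


def central_moment(values, order):
    """Average of (x - mean) ** order over the sample."""
    mean = statistics.fmean(values)
    return sum((x - mean) ** order for x in values) / len(values)


def skewness(values):
    """Moment coefficient of skewness: m3 / m2 ** 1.5."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment coefficient of kurtosis minus 3, so a normal shape scores 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3


symmetric = [1, 2, 3, 4, 5]       # hypothetical toy sample
right_tailed = [1, 1, 1, 2, 10]   # hypothetical toy sample with one large value

print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(right_tailed), excess_kurtosis(right_tailed))
```

A single large value is enough to produce positive skewness, which is the mechanism the text describes for `volume`.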
This section summarizes the distribution of observations across categorical variables using frequency counts, relative frequencies, and cumulative frequencies to describe how data are distributed among categories.
A frequency distribution is a summary of how often each category occurs in a dataset:
Frequency (ni): the number of observations in each category.
Relative frequency (fi): the proportion of observations in each category, calculated as the count divided by the total number of observations.
Cumulative count (Ni) and cumulative frequency (Fi): the running total of frequencies as categories are ordered, showing how many observations fall below or within a given category.
Frequency distributions are essential for categorical variables because arithmetic measures like mean or standard deviation do not apply.
Table 6. Frequency distribution: city

| city | ni | fi | Ni | Fi |
|---|---|---|---|---|
| Beaumont | 60 | 25.00% | 60 | 25.00% |
| Bryan-College Station | 60 | 25.00% | 120 | 50.00% |
| Tyler | 60 | 25.00% | 180 | 75.00% |
| Wichita Falls | 60 | 25.00% | 240 | 100.00% |
Table 7. Frequency distribution: year

| year | ni | fi | Ni | Fi |
|---|---|---|---|---|
| 2010 | 48 | 20.00% | 48 | 20.00% |
| 2011 | 48 | 20.00% | 96 | 40.00% |
| 2012 | 48 | 20.00% | 144 | 60.00% |
| 2013 | 48 | 20.00% | 192 | 80.00% |
| 2014 | 48 | 20.00% | 240 | 100.00% |
Table 8. Frequency distribution: month

| month | ni | fi | Ni | Fi |
|---|---|---|---|---|
| 1 | 20 | 8.33% | 20 | 8.33% |
| 2 | 20 | 8.33% | 40 | 16.67% |
| 3 | 20 | 8.33% | 60 | 25.00% |
| 4 | 20 | 8.33% | 80 | 33.33% |
| 5 | 20 | 8.33% | 100 | 41.67% |
| 6 | 20 | 8.33% | 120 | 50.00% |
| 7 | 20 | 8.33% | 140 | 58.33% |
| 8 | 20 | 8.33% | 160 | 66.67% |
| 9 | 20 | 8.33% | 180 | 75.00% |
| 10 | 20 | 8.33% | 200 | 83.33% |
| 11 | 20 | 8.33% | 220 | 91.67% |
| 12 | 20 | 8.33% | 240 | 100.00% |
For each qualitative variable, all categories show the same relative
frequency: 25% for city, 20% for year, and
8.33% for month. This uniform distribution reflects the
balanced structure of the dataset and confirms that no city, year, or
month is missing from the records.
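The frequency tables above follow directly from the balanced structure of the data. The Python sketch below (an illustration, not the report's R code) rebuilds only the key columns of the 240 records, one per city, year, and month, and derives ni, fi, Ni, and Fi from them. The function name `frequency_table` is a hypothetical helper.

```python
from collections import Counter
from itertools import product

cities = ["Beaumont", "Bryan-College Station", "Tyler", "Wichita Falls"]
years = range(2010, 2015)
months = range(1, 13)

# One record per city-year-month combination, as described in the text.
records = [
    {"city": c, "year": y, "month": m} for c, y, m in product(cities, years, months)
]


def frequency_table(records, key):
    """Return rows (category, ni, fi, Ni, Fi) with categories in sorted order."""
    counts = Counter(r[key] for r in records)
    total = sum(counts.values())
    rows, running = [], 0
    for category in sorted(counts):
        running += counts[category]
        rows.append(
            (category, counts[category], counts[category] / total,
             running, running / total)
        )
    return rows


for category, ni, fi, Ni, Fi in frequency_table(records, "year"):
    print(f"{category}  ni={ni}  fi={fi:.2%}  Ni={Ni}  Fi={Fi:.2%}")
```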
In this chapter, the frequency distribution for the
variable sales is constructed to show how the monthly number
of transactions is distributed across ranges and to highlight
potential concentration patterns.
A frequency distribution groups data values into intervals (classes) and shows how often observations fall within each range. This approach helps summarize large datasets and reveals where observations are more concentrated, making it easier to interpret overall trends and variability in the data.
Table 9. Frequency distribution for total sales classes

| class | ni | fi | Ni | Fi |
|---|---|---|---|---|
| [79,148] | 84 | 35.00% | 84 | 35.00% |
| (148,217] | 77 | 32.08% | 161 | 67.08% |
| (217,285] | 41 | 17.08% | 202 | 84.17% |
| (285,354] | 27 | 11.25% | 229 | 95.42% |
| (354,423] | 11 | 4.58% | 240 | 100.00% |
The distribution of sales classes shows a higher
concentration of observations in the lower and mid-range intervals, with
only a few months recording an exceptionally high number of sales. This
pattern reflects the positive skewness observed earlier for this
variable, indicating that most months have moderate market activity, while
a smaller number of months experience unusually strong sales
performance. Such periods may correspond to seasonal peaks or
particularly favorable market conditions.
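The class limits in Table 9 are consistent with five equal-width intervals between the minimum (79) and maximum (423) of `sales`, with right-closed classes and the first class also including its lower bound. The Python sketch below assumes that construction; the report does not state how the breaks were chosen. It is applied to the six preview values from Table 1 only, so the counts differ from Table 9.

```python
def equal_width_breaks(lowest, highest, n_classes):
    """Return n_classes + 1 equally spaced break points from lowest to highest."""
    width = (highest - lowest) / n_classes
    # The last break is set to `highest` exactly to avoid floating-point drift.
    return [lowest + k * width for k in range(n_classes)] + [highest]


def class_index(value, breaks):
    """Index of the class containing value.

    Classes are right-closed, and the first class also includes its lower bound.
    """
    if value < breaks[0] or value > breaks[-1]:
        raise ValueError(f"{value} is outside [{breaks[0]}, {breaks[-1]}]")
    for k in range(1, len(breaks)):
        if value <= breaks[k]:
            return k - 1


breaks = equal_width_breaks(79, 423, 5)   # min and max of `sales` from Table 3
sales_preview = [83, 108, 182, 200, 202, 189]  # `sales` column of Table 1

counts = [0] * 5
for s in sales_preview:
    counts[class_index(s, breaks)] += 1

print([round(b) for b in breaks])
print(counts)
```

Rounding the breaks to whole numbers gives the class labels shown in Table 9.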
The Gini index used here is a normalized measure of heterogeneity that assesses how evenly observations are distributed across classes. It ranges between 0 and 1:
A value close to 0 indicates low variability or a highly concentrated distribution (most observations fall in one class).
A value close to 1 indicates high heterogeneity, meaning observations are spread more evenly across classes.
For the variable sales, the Gini index is 0.913.
This value suggests that the distribution of sales across the defined classes is highly heterogeneous. Even though the classes are not perfectly balanced—reflecting the positive skewness of the distribution—no single class dominates, indicating that observations are spread across multiple sales ranges.
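The text does not give the formula behind this figure. The reported 0.913 is reproduced by the normalized Gini heterogeneity index, G = 1 - Σ fi², divided by its maximum (k - 1) / k for k classes, applied to the class counts of Table 9, so that definition is assumed in the Python sketch below.

```python
def normalized_gini(counts):
    """Gini heterogeneity index, rescaled to [0, 1] by its maximum (k - 1) / k."""
    total = sum(counts)
    k = len(counts)
    gini = 1 - sum((n / total) ** 2 for n in counts)
    return gini / ((k - 1) / k)


sales_class_counts = [84, 77, 41, 27, 11]  # ni column of Table 9
gini_sales = normalized_gini(sales_class_counts)
print(round(gini_sales, 3))
```

With perfectly balanced classes the index equals 1, and with all observations in a single class it equals 0, matching the interpretation given above.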
This section calculates specific probabilities for selected events in the dataset: the probability of a randomly chosen observation corresponding to the city Beaumont, the probability of it being in July, and the joint probability of being in December 2012. These calculations help quantify the likelihood of individual and combined events.
Table 10. Specific Probability Calculations

| P(city == 'Beaumont') | P(month == 7) | P(month == 12 & year == 2012) |
|---|---|---|
| 25.00% | 8.33% | 1.67% |
As shown in Table 10, the probability of a randomly selected
observation belonging to the city Beaumont is
25.00%, and the probability of the month being July is
8.33%, consistent with the frequency distributions
presented in Chapter 3.4.
The joint probability of an observation being in December 2012 is 1.67%. This value is the same for every combination of year and month: each of the 12 months × 5 years = 60 month-year combinations appears once per city, so it accounts for 4 of the 240 records (4 / 240 ≈ 1.67%).
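These probabilities are simple relative frequencies, so they can be checked by counting. The Python sketch below (an illustration, not the report's R code) rebuilds the 240 city-year-month keys and counts the favourable cases for each event; `probability` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product

cities = ["Beaumont", "Bryan-College Station", "Tyler", "Wichita Falls"]
# One (city, year, month) key per record: 4 cities x 5 years x 12 months.
records = list(product(cities, range(2010, 2015), range(1, 13)))


def probability(event):
    """Share of records for which event(city, year, month) is true."""
    favourable = sum(1 for city, year, month in records if event(city, year, month))
    return Fraction(favourable, len(records))


p_beaumont = probability(lambda city, year, month: city == "Beaumont")
p_july = probability(lambda city, year, month: month == 7)
p_dec_2012 = probability(lambda city, year, month: month == 12 and year == 2012)

for label, p in [("Beaumont", p_beaumont), ("July", p_july), ("Dec 2012", p_dec_2012)]:
    print(f"{label:<9} {p}  ({float(p):.2%})")
```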
Two new variables were created to enrich the analysis:
average_unit_price estimates the average price per
property by dividing the total sales volume by the number of
transactions.
listings_effectiveness measures the efficiency of
property listings by calculating the ratio of sales to active
listings.
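A minimal Python sketch of the two definitions, applied to the six preview rows of Table 1, is shown below. It assumes that `volume` is expressed in million USD (as stated in Table 2), so it is multiplied by 1,000,000 before dividing by the number of sales, and that effectiveness is reported as a percentage. The function names mirror the variable names but are otherwise hypothetical.

```python
# Preview rows from Table 1: (sales, volume in million USD, listings).
preview = [
    (83, 14.162, 1533),
    (108, 17.690, 1586),
    (182, 28.701, 1689),
    (200, 26.819, 1708),
    (202, 28.833, 1771),
    (189, 27.219, 1803),
]


def average_unit_price(volume_musd, sales):
    """Average price per property in USD; volume is stored in million USD."""
    return volume_musd * 1_000_000 / sales


def listings_effectiveness(sales, listings):
    """Sales as a percentage of active listings."""
    return 100 * sales / listings


derived = [
    (average_unit_price(volume, sales), listings_effectiveness(sales, listings))
    for sales, volume, listings in preview
]
for price, effectiveness in derived:
    print(f"{price:12,.2f} USD   {effectiveness:5.2f}%")
```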
Table 11 summarizes these new variables with key statistics, including mean, median, variability, skewness, and excess kurtosis, providing insights into typical prices and the effectiveness of sales listings across the dataset.
Table 11. Summary of New Variables

| Statistic | Average Property Price (USD) | Listings effectiveness (%) |
|---|---|---|
| Mean | 154,320.37 | 11.87% |
| Median | 156,588.48 | 10.96% |
| StdDev | 27,147.46 | 4.69% |
| Min | 97,010.20 | 5.01% |
| Max | 213,233.94 | 38.71% |
| CV (%) | 17.59% | 39.50% |
| Skewness | -0.07 | 2.09 |
| Kurtosis | -0.78 | 6.88 |
For average_unit_price, the mean (154,320 USD) is higher
than the mean of the median_price variable (132,665 USD)
calculated before. This difference indicates that, on average, the newly
calculated average property price tends to exceed the median property
price, which is consistent with the positively skewed property price
distributions often observed in real estate markets, where occasional
high-value transactions elevate the overall average relative to the
medians.
The listings_effectiveness variable has a mean of
11.87%, indicating that, on average, 11.87% of active property listings
are converted into actual sales monthly. This metric reflects the
efficiency of listings in generating transactions across the market. The
relatively low average effectiveness rate highlights opportunities to
optimize marketing and sales strategies. Actions such as targeting
high-demand periods, enhancing listing visibility, or refining pricing
approaches could improve overall effectiveness and transaction
volume.
Additionally, the coefficient of variation (CV) for
average_unit_price is 17.59%, indicating moderate
variability relative to the mean, similar to the CV of the previously
analyzed median_price variable (17.08%). In contrast,
listings_effectiveness shows a CV of 39.50%, highlighting
substantially greater relative variability.
Regarding distribution shape, average_unit_price shows
negligible skewness (-0.07) and negative excess kurtosis (-0.78),
indicating a roughly symmetric, somewhat flat (platykurtic)
distribution. By comparison, listings_effectiveness has high
excess kurtosis (6.88) and positive skewness (2.09): most months
exhibit relatively low effectiveness, while a few observations in the
right tail achieve unusually high sales efficiency.
Overall, these statistics suggest that while property prices remain relatively stable across cities and time, the effectiveness of listings is highly variable, with occasional peaks likely driven by successful marketing campaigns or favorable market conditions.
The following boxplots illustrate how the newly created
variables—average_unit_price and
listings_effectiveness—are distributed across the four
cities, highlighting differences in central tendency, variability, and
the presence of outliers in each location.
The boxplots for average_unit_price reveal notable
differences in both central tendency and interquartile range (IQR)
across cities. The cities ranked by decreasing median average property
price are: Bryan-College Station, Tyler, Beaumont, and Wichita Falls.
The IQRs for most cities do not overlap, with only a slight overlap
between Bryan-College Station and Tyler, suggesting that the four cities
cover different market price ranges. Bryan-College Station stands out
with a visibly larger IQR, indicating a wider spread of average property
prices within this city.
The medians of listings_effectiveness are relatively
similar across cities, ranging approximately between 9% and 13%.
Bryan-College Station stands out with a notably larger interquartile
range (IQR) and more pronounced outliers compared to the other
cities.
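The report does not say how boxplot outliers were flagged. A common convention is Tukey's rule, which marks points lying more than 1.5 × IQR beyond the quartiles. The Python sketch below assumes that rule and applies it to the pooled quartiles and extremes of Table 3, since per-city quartiles are not tabulated. It therefore only shows whether each original variable has any value beyond the fences across all cities together.

```python
# Quartiles and extremes from Table 3: variable -> (min, q1, q3, max).
table3 = {
    "sales": (79.00, 127.00, 247.00, 423.00),
    "volume": (8.17, 17.66, 40.89, 83.55),
    "median_price": (73800.00, 117300.00, 150050.00, 180000.00),
    "listings": (743.00, 1026.50, 2056.00, 3296.00),
    "months_inventory": (3.40, 7.80, 10.95, 14.90),
}


def tukey_fences(q1, q3, k=1.5):
    """Lower and upper fences at k * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def beyond_fences(minimum, q1, q3, maximum):
    """Return (has_low_outliers, has_high_outliers) for one variable."""
    lower, upper = tukey_fences(q1, q3)
    return minimum < lower, maximum > upper


flags = {name: beyond_fences(*values) for name, values in table3.items()}
for name, (low, high) in flags.items():
    print(f"{name:<17} below lower fence: {low}   above upper fence: {high}")
```

Under this assumption only `volume` has values above its upper fence in the pooled data, in line with its positive skewness.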
To complement the exploratory analysis of
listings_effectiveness by city, the boxplot below
illustrates the distribution of months_inventory, which
represents the number of months it would take to sell the current active
listings at the current pace of sales. The results reveal substantial
variability across cities, with the median values decreasing in the
following order: Tyler, Beaumont, Bryan-College Station, and Wichita
Falls. This indicates that, at the current sales pace, inventory in Tyler
generally takes the longest to clear, whereas inventory in Wichita Falls
clears the fastest. Furthermore, at the city level
months_inventory does not show the same pattern as
listings_effectiveness, so the boxplots alone do not suggest a
clear relationship between the two variables.
This chapter transitions from the exploratory analysis of the newly created variables, initially examined through boxplots, to a more detailed conditional analysis.
Conditional analysis consists of calculating summary statistics, such
as the mean and standard deviation, separately for different
subgroups—here defined by city, year, and month. This approach allows
for understanding how the variables, in this case
average_unit_price and listings_effectiveness,
vary across different locations and time periods, highlighting trends
and patterns that may not be visible in the overall data.
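The mechanics of a conditional analysis can be sketched as follows in Python (an illustration, not the report's R code). Only the six preview rows of Table 1 are available here, all from Beaumont in 2010, so the example groups them by calendar quarter. That grouping is purely illustrative and is not used in the report, which groups by city, year, and month. `conditional_stats` is a hypothetical helper, and the use of the sample standard deviation is an assumption.

```python
import statistics
from collections import defaultdict

# Preview rows from Table 1: (city, year, month, sales, listings).
rows = [
    ("Beaumont", 2010, 1, 83, 1533),
    ("Beaumont", 2010, 2, 108, 1586),
    ("Beaumont", 2010, 3, 182, 1689),
    ("Beaumont", 2010, 4, 200, 1708),
    ("Beaumont", 2010, 5, 202, 1771),
    ("Beaumont", 2010, 6, 189, 1803),
]


def effectiveness(row):
    """Sales as a percentage of active listings."""
    return 100 * row[3] / row[4]


def quarter(row):
    """Calendar quarter of the row's month (illustrative grouping only)."""
    return (row[2] - 1) // 3 + 1


def conditional_stats(rows, key, value):
    """Group rows by key(row); return mean, SD and CV% of value(row) per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(value(row))
    stats = {}
    for group, values in groups.items():
        mean = statistics.fmean(values)
        sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
        stats[group] = {"mean": mean, "sd": sd, "cv_pct": 100 * sd / mean}
    return stats


by_quarter = conditional_stats(rows, key=quarter, value=effectiveness)
for group, s in sorted(by_quarter.items()):
    print(f"Q{group}: mean={s['mean']:.2f}%  sd={s['sd']:.2f}  cv={s['cv_pct']:.2f}%")
```

Passing a different `key`, for example one that returns the city or the year, would give the kind of per-group summary shown in the tables that follow.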
Table 12. Conditional Analysis of New Variables by City

| city | Mean Average Price (USD) | StdDev Average Price (USD) | CV Average Price (%) | Mean Listings Effectiveness (%) | StdDev Listings Effectiveness (%) | CV Listings Effectiveness (%) |
|---|---|---|---|---|---|---|
| Beaumont | 146,640.41 | 11,232.13 | 7.66% | 10.61% | 2.67% | 25.14% |
| Bryan-College Station | 183,534.29 | 15,149.35 | 8.25% | 14.73% | 7.29% | 49.44% |
| Tyler | 167,676.76 | 12,350.51 | 7.37% | 9.35% | 2.35% | 25.09% |
| Wichita Falls | 119,430.00 | 11,398.48 | 9.54% | 12.80% | 2.47% | 19.31% |
The conditional analysis by city highlights notable differences in both the average property price and the effectiveness of listings:
Average Property Price (USD): Bryan-College Station records the highest mean average price at $183,534, followed by Tyler ($167,677) and Beaumont ($146,640), while Wichita Falls has the lowest ($119,430). The coefficients of variation are relatively consistent across cities, indicating similar levels of price dispersion within each market.
Listings Effectiveness (%): Bryan-College Station also stands out with the highest mean effectiveness (14.73%) and the largest coefficient of variation (49.44%), suggesting substantial variability in the performance of listings. Wichita Falls follows with 12.80%, then Beaumont with 10.61%, while Tyler has the lowest mean effectiveness (9.35%).
Table 13. Conditional Analysis of New Variables by Year

| year | Mean Average Price (USD) | StdDev Average Price (USD) | CV Average Price (%) | Mean Listings Effectiveness (%) | StdDev Listings Effectiveness (%) | CV Listings Effectiveness (%) |
|---|---|---|---|---|---|---|
| 2010 | 150,188.58 | 23,279.55 | 15.50% | 9.97% | 3.37% | 33.84% |
| 2011 | 148,250.63 | 24,938.38 | 16.82% | 9.27% | 2.32% | 25.04% |
| 2012 | 150,898.68 | 26,438.50 | 17.52% | 10.97% | 2.81% | 25.59% |
| 2013 | 158,705.25 | 26,523.81 | 16.71% | 13.46% | 4.48% | 33.28% |
| 2014 | 163,558.70 | 31,740.53 | 19.41% | 15.70% | 6.18% | 39.34% |
The conditional analysis by year reveals clear temporal patterns in both average property price and listings effectiveness:
Average Property Price (USD): After a slight dip in 2011 ($148,251), the mean rises each year, from $150,189 in 2010 to $163,559 in 2014. The coefficient of variation also tends upward, from 15.50% in 2010 to 19.41% in 2014, indicating that price variability widened as the market expanded.
Listings Effectiveness (%): Apart from a dip in 2011 (9.27%), effectiveness trends upward, from 9.97% in 2010 to 15.70% in 2014, suggesting that listings became more successful over time. Both the standard deviation (6.18%) and the coefficient of variation (39.34%) peak in 2014, indicating that, although listings have become more successful overall, the disparity in performance across cities and months has grown as well.
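Because every city contributes 60 records and every year 48, the overall mean of each variable should equal the plain average of its conditional means. The Python sketch below checks this with the means published in Tables 12 and 13; the inputs are rounded, so agreement with Table 11 is expected only to about two decimals.

```python
import statistics

# (mean average price in USD, mean listings effectiveness in %) from Table 12.
by_city = {
    "Beaumont": (146640.41, 10.61),
    "Bryan-College Station": (183534.29, 14.73),
    "Tyler": (167676.76, 9.35),
    "Wichita Falls": (119430.00, 12.80),
}
# Same two columns from Table 13.
by_year = {
    2010: (150188.58, 9.97),
    2011: (148250.63, 9.27),
    2012: (150898.68, 10.97),
    2013: (158705.25, 13.46),
    2014: (163558.70, 15.70),
}


def mean_of_group_means(groups):
    """Unweighted average of the group means (valid for equal group sizes)."""
    prices, effectiveness = zip(*groups.values())
    return statistics.fmean(prices), statistics.fmean(effectiveness)


city_price, city_eff = mean_of_group_means(by_city)
year_price, year_eff = mean_of_group_means(by_year)
print(f"from cities: {city_price:,.2f} USD, {city_eff:.2f}%")
print(f"from years:  {year_price:,.2f} USD, {year_eff:.2f}%")
```

Both routes recover the overall means reported in Table 11 (154,320.37 USD and 11.87%) up to rounding.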
Table 14. Conditional Analysis of New Variables by Month

| month | Mean Average Price (USD) | StdDev Average Price (USD) | CV Average Price (%) | Mean Listings Effectiveness (%) | StdDev Listings Effectiveness (%) | CV Listings Effectiveness (%) |
|---|---|---|---|---|---|---|
| 1 | 145,640.42 | 29,819.11 | 20.47% | 8.31% | 2.30% | 27.71% |
| 2 | 148,840.48 | 25,120.42 | 16.88% | 8.78% | 2.19% | 24.97% |
| 3 | 151,136.54 | 23,237.92 | 15.38% | 11.60% | 3.46% | 29.82% |
| 4 | 151,461.33 | 26,174.30 | 17.28% | 12.53% | 3.80% | 30.30% |
| 5 | 158,235.03 | 25,787.19 | 16.30% | 14.15% | 5.03% | 35.53% |
| 6 | 161,545.82 | 23,470.46 | 14.53% | 14.24% | 5.76% | 40.44% |
| 7 | 156,881.00 | 27,220.12 | 17.35% | 14.35% | 7.40% | 51.60% |
| 8 | 156,455.56 | 28,253.21 | 18.06% | 14.19% | 5.26% | 37.09% |
| 9 | 156,522.32 | 29,669.41 | 18.96% | 11.17% | 3.48% | 31.16% |
| 10 | 155,897.37 | 32,527.29 | 20.86% | 11.19% | 3.60% | 32.14% |
| 11 | 154,233.00 | 29,684.87 | 19.25% | 10.25% | 2.93% | 28.60% |
| 12 | 154,995.52 | 27,008.87 | 17.43% | 11.73% | 3.79% | 32.29% |
The conditional analysis by month highlights seasonal patterns in both average property price and listings effectiveness:
Average Property Price (USD): Prices start relatively lower in January ($145,640) and gradually increase, peaking in June ($161,546), before stabilizing at around $154,000–$157,000 from July to December. This suggests a mid-year peak in property values. The coefficient of variation does not show any particular seasonal pattern.
Listings Effectiveness (%): Effectiveness is lowest at the beginning of the year (8.31% in January), rises steadily to a peak in July (14.35%), and stays above 14% from May through August before dropping to roughly 10–12% from September onward. The coefficient of variation also rises mid-year, particularly in July (51.60%), suggesting more variability in listing success during peak months.
Overall, the data shows a clear seasonal trend, with both prices and listing effectiveness peaking around mid-year, reflecting higher market activity in the summer months.
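The seasonal peaks described above can be read directly from Table 14. The Python sketch below (an illustration, not the report's R code) finds the months with the highest and lowest means and compares May–August with the rest of the year; `extreme_month` is a hypothetical helper.

```python
import statistics

# Table 14: month -> (mean average price in USD, mean listings effectiveness in %).
monthly = {
    1: (145640.42, 8.31),
    2: (148840.48, 8.78),
    3: (151136.54, 11.60),
    4: (151461.33, 12.53),
    5: (158235.03, 14.15),
    6: (161545.82, 14.24),
    7: (156881.00, 14.35),
    8: (156455.56, 14.19),
    9: (156522.32, 11.17),
    10: (155897.37, 11.19),
    11: (154233.00, 10.25),
    12: (154995.52, 11.73),
}
PRICE, EFFECTIVENESS = 0, 1


def extreme_month(column, pick=max):
    """Month with the highest (or, with pick=min, lowest) mean in a column."""
    return pick(monthly, key=lambda month: monthly[month][column])


peak_price_month = extreme_month(PRICE)
peak_eff_month = extreme_month(EFFECTIVENESS)
low_price_month = extreme_month(PRICE, min)
low_eff_month = extreme_month(EFFECTIVENESS, min)

summer = [monthly[m][EFFECTIVENESS] for m in (5, 6, 7, 8)]
rest = [monthly[m][EFFECTIVENESS] for m in monthly if m not in (5, 6, 7, 8)]
print(peak_price_month, peak_eff_month, low_price_month, low_eff_month)
print(f"May-Aug: {statistics.fmean(summer):.2f}%  other months: {statistics.fmean(rest):.2f}%")
```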
Below are the plots illustrating the conditional analysis of
listings_effectiveness and average_unit_price,
showing how these variables vary across city-year combinations and
city-month combinations. The graphs complement the tables by providing a
visual representation of trends, differences between cities, and
seasonal patterns throughout the year.
Comments on the graph above: the increasing trend in average property price over the years is clearly visible across all cities.
Comments on the graph above: the peak in June is not immediately visible in the graph, as the bars across months are quite similar. The peak is likely driven by smaller cities, which reduces the visual impact of the increase.
Comments on the graph above: the increase in listings effectiveness over time is very noticeable. Initially, Wichita Falls had the highest effectiveness, but over the years Bryan-College Station became clearly the most effective city.
Comments on the graph above: the peak in July is prominent, mainly due to Bryan-College Station, which consistently shows the highest effectiveness among the cities.
This chapter presents visualizations that highlight trends and patterns in the real estate market in our selected area of study. Boxplots are used to compare the distribution of median property prices across cities, bar charts illustrate the total sales by month and city, and line charts show how sales evolved over historical periods. These graphics provide a clear and intuitive overview of market performance and temporal dynamics.
The boxplots for median_price reveal notable differences
in central tendency across cities. The cities ranked by decreasing
median property price are: Bryan-College Station, Tyler, Beaumont, and
Wichita Falls. The IQRs do not overlap, indicating that the
distributions cover distinct market price ranges and highlighting a
clear difference in property prices between cities. Geography appears to
be a major determinant of price. Additionally, the IQR lengths are quite
similar, suggesting that price dispersion within each city is
comparable.
The total sales value (volume) does not mirror the ranking
observed for median property price. For total sales value, the
cities ranked in decreasing order are: Tyler, Bryan-College Station,
Beaumont, and Wichita Falls. This indicates that, despite Bryan-College
Station having higher property prices, the overall sales value is
greater in Tyler, likely due to a higher number of transactions.
Bryan-College Station shows a significantly wider interquartile range (IQR), followed by Tyler and Beaumont, while Wichita Falls has a notably smaller IQR. This indicates greater sales value ranges in the larger markets, particularly Bryan-College Station.
Looking at trends over the years, the general pattern is an increase in total sales value, except for Wichita Falls, where the yearly trend is less clear.
A clear seasonal trend is visible, with lower sales in January that steadily increase to a peak in June, then gradually decrease. This suggests a strong seasonal pattern in the real estate market. The trend is likely driven by buyer activity, but the underlying causes—such as weather conditions, school calendars, or other local factors—should be further investigated, as it’s not immediately clear which factors are the main drivers.
This chart emphasizes how the relative market share of each city fluctuates throughout the year. Tyler appears to account for the highest contribution to total sales, followed by Bryan-College Station, Beaumont, and Wichita Falls. An exception occurs during the peak months of June and July, where the contributions of Tyler and Bryan-College Station are roughly equal.
This plot represents the monthly value of total sales from 2010 to 2014, allowing for an assessment of the persistence and evolution of the seasonal patterns observed earlier.
From 2011 to 2014, a clear seasonal trend emerges, with sales values rising from January, peaking in late spring or summer, and then declining through the fall and winter. The yearly peaks occur in May (2010), June (2011 and 2014), July (2013), and August (2012).
The relative contribution of each city over time mirrors previous findings: Tyler and Bryan-College Station dominate total sales value, followed by Beaumont and Wichita Falls, whose contributions remain comparatively smaller.
The time series of the total sales value reveals a clear cyclical pattern with an annual periodicity, characterized by peaks typically occurring around the middle of the year, corresponding to the summer months. This seasonal trend is particularly evident in Bryan-College Station and Tyler, which display distinct and recurrent mid-year surges in sales activity.
In contrast, Wichita Falls exhibits a less pronounced seasonal cycle, suggesting a more stable or less responsive market throughout the year.
The time series visualization also confirms the relative ranking of cities in terms of overall sales volume, previously observed in the bar charts: Tyler consistently leads, followed by Bryan-College Station, Beaumont, and Wichita Falls. This pattern reinforces the notion of persistent differences in market scale and activity among the analyzed cities.
This time series plot shows that the four analyzed cities consistently occupy distinct ranges of median property prices throughout the entire observation period. The hierarchy remains remarkably stable, with Bryan-College Station showing the highest median prices, followed by Tyler, Beaumont, and Wichita Falls.
The fact that the lines representing different cities rarely intersect or overlap suggests a form of market segmentation, where each city operates within a relatively independent price range. This may reflect structural differences such as local economic conditions, demand dynamics, or differences in property characteristics and urban development.
Unlike the total sales value, no clear seasonal trend emerges in the median price series. This indicates that price levels are relatively stable over time and are less sensitive to short-term cyclical fluctuations. Instead, they appear to follow longer-term structural factors rather than temporal or seasonal variations in market activity.
The time series of sales by city confirms the
seasonal pattern already observed for volume (and absent from
median_price), with peaks occurring around the mid-year
months.
Additionally, this visualization shows that Tyler consistently records the highest number of sales over time, followed by Bryan-College Station. This finding aligns with the earlier observation that, although Bryan-College Station exhibits higher median property prices, Tyler consistently achieves a higher total sales value, which can therefore be attributed to its greater number of transactions. This suggests that it is the market volume, rather than price levels, that drives Tyler’s overall sales performance.
To complement the analysis of sales and listing performance, we now
examine the time series of active listings, which
represents the number of properties available on the market at a given
time. This variable provides insight into the supply side of the real
estate market, helping to better understand market liquidity and
potential imbalances between supply and demand across
cities and over time.
The plot indicates that Tyler consistently maintains the highest number of active listings, followed by Beaumont, Bryan-College Station, and Wichita Falls, highlighting the relative supply and market activity levels across these cities.
This chapter summarizes the main findings of the Texas real estate market analysis, highlighting insights relevant for investors, developers, and market professionals.
Trends in Average Property Prices
From 2010 to 2014, the average property price increased from $150,189 to $163,559, with a slight dip in 2011.
The coefficient of variation (CV) also slightly increased over the period, peaking at 19.41% in 2014, indicating modest growth in price variability.
Market Segmentation Across Cities
Bryan-College Station: Premium market, with the highest median and average prices.
Tyler: Slightly lower prices but highly dynamic, with a strong number of sales and total sales value.
Beaumont: Moderate pricing with balanced sales activity.
Wichita Falls: Most affordable market, with the lowest prices and small volumes.
Price ranges across cities rarely overlap, indicating clear geographic segmentation and market differentiation.
Seasonal Trends in Sales
Total sales value and number of sales exhibit a strong seasonal pattern, peaking in late spring and summer (May–August).
Median property prices, in contrast, do not show a clear seasonal pattern, suggesting that the timing of sales has little influence on price levels.
City-Level Sales Performance
Tyler: Consistently shows the highest total sales value and transaction volume, making it the most dynamic market. Its strong performance is driven primarily by volume rather than property price, as Bryan-College Station has higher median prices.
Bryan-College Station: Second in total sales value and transaction volume. Despite lower transaction numbers than Tyler, its high listings effectiveness (highest among the cities) makes it a dynamic and highly responsive market, attractive for premium listings and high-value investments.
Beaumont: Balanced in both price and sales activity, representing a stable mid-tier market.
Wichita Falls: Smallest market in terms of sales and volume, but with affordable pricing for entry-level buyers; limited market dynamism.
Listings Effectiveness
Average listings effectiveness remains below 20% for every city, year, and month, suggesting room for improvement via targeted campaigns.
Effectiveness has increased over time, from 9.97% in 2010 to a peak of 15.70% in 2014.
There is a seasonal effect, with higher effectiveness during the summer months, mirroring the total sales peak.
Bryan-College Station stands out with the highest effectiveness, reinforcing its status as a dynamic premium market.
Market Supply and Active Listings
The combination of high active listings and high transaction volumes confirms Tyler as the most liquid and dynamic market.
Bryan-College Station, while having fewer active listings than Tyler, achieves high sales efficiency through superior listings effectiveness, highlighting its market responsiveness.
Overall Market Insights for Investors
Bryan-College Station: Premium pricing, high listings effectiveness, dynamic market despite lower volume than Tyler—ideal for high-value investments.
Tyler: High total sales value and transaction volume, abundant active listings—ideal for investors seeking market activity and liquidity.
Beaumont: Balanced pricing and moderate dynamism—suitable for mid-tier investment strategies.
Wichita Falls: Low pricing, small volumes, and limited activity—suitable for entry-level buyers or small-scale investors, but with careful attention to the market’s limited dynamism.
This analysis was conducted using R version 4.5.1 and relied exclusively on the provided dataset.