Introduction

Power pylons at sunset. Photo by Matthew Henry

According to the U.S. Department of Energy, “energy efficiency is one of the easiest and most cost-effective ways to combat climate change, reduce energy costs for consumers, and improve the competitiveness of U.S. businesses.” (Energy efficiency: Buildings and industry) Therefore, Montgomery County’s Department of Environmental Protection is collecting information on how efficiently buildings are using energy. The county’s Building Energy Benchmarking Law, which was signed in 2022, requires the owners of buildings over 25,000 square feet to track energy use, report verified data to the county every year, and meet long-term performance standards. I chose to investigate this topic because I try to minimize my personal energy use and impact on the environment as much as possible, and I am curious about what companies and other large entities are doing about this.

This dataset was obtained from dataMontgomery. The data was collected by Montgomery County’s Department of Environmental Protection from building reports obtained between July 2022 and July 2023.

Variables

There are 14 variables in this dataset:

  • Property Name (categorical): Name of the building, if applicable.
  • Address (categorical): Street address of the building.
  • City (categorical)
  • Zip code (categorical)
  • Benchmark (categorical): Compliance status with the Building Energy Benchmarking Law.
  • Property ID (categorical): Property ID number assigned by the county.
  • Owner (categorical): Owner or business name of record.
  • Property Type (categorical): Description of general use of the building.
  • Floor Area (quantitative): Building square footage.
  • Site Energy Use Intensity (quantitative): The amount of energy consumed in kBtu per gross square foot of the property. This entry is self-reported.
  • Energystar Rating (quantitative): EPA Energy Star Rating, an external benchmark of energy efficiency. Scores range from 1-100; 0 indicates that the building was not rated.
  • Weather Normalized (quantitative): Site EUI values normalized for weather. Buildings within a certain region, for example, may be normalized differently from buildings in a different climate.
  • Year Built (quantitative): Year in which construction of the building was completed.
  • the_geom (geolocation, text): Longitude and latitude of the building.

I would like to see whether there are any relationships between city, property type, floor area, year built, and the different measures of energy use. Since there are three different measures of energy use (Site EUI, Weather Normalized EUI, and Energy Star Rating), I would like to compare them to see how they differ. In addition, the geolocation data will allow me to create a map of the results.

Energy Star scores appear to be the most reliable source of information about a building’s energy efficiency since they are calculated by a third party, evaluate actual metered energy use, normalize for business activity (e.g., building hours, number of workers, climate, etc.), compare buildings to peer groups and the national population, and provide a standardized measure of energy performance. (Energy Star, 2021) Therefore, I will include these scores in my final visualizations.

Data Cleaning/Wrangling

Load packages

Set the working directory and load the .csv file

Make the column names lowercase and remove spaces. Then view the data.

## # A tibble: 6 × 14
##   propertyname     address city  zipcode benchmark propertyid owner propertytype
##   <chr>            <chr>   <chr> <chr>   <chr>     <chr>      <chr> <chr>       
## 1 <NA>             22300 … Clar… 20871   No Repor… 00018631   <NA>  <NA>        
## 2 249 - CLARKSBUR… 22500 … CLAR… 20871   In Compl… 00017090   Mont… K-12 School 
## 3 <NA>             22300 … Clar… 20871   No Repor… 02841561   <NA>  <NA>        
## 4 GERMANTOWN LIBR… 19840 … Germ… 20874-… In Compl… 03271420   Mont… Library     
## 5 <NA>             20900 … Germ… 20874   No Repor… 03198807   <NA>  <NA>        
## 6 <NA>             12409 … Germ… 20874   Received… 03327726   <NA>  <NA>        
## # ℹ 6 more variables: floorarea <dbl>, siteenergyuseintensity <dbl>,
## #   energystarrating <dbl>, weathernormalized <dbl>, yearbuilt <dbl>,
## #   the_geom <chr>
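
The code chunk for the renaming step is not shown in this report. A minimal base R sketch, assuming the raw headers contain capital letters and spaces (the toy data frame and its values are made up for illustration), could look like this:

```r
# Toy data frame with headers styled like a raw export (hypothetical values).
raw <- data.frame(
  `Property Name` = c("Example School", NA),
  `Zip Code` = c("20871", "20874"),
  `Site Energy Use Intensity` = c(55.2, 0),
  check.names = FALSE
)

# Lowercase every column name, then strip all whitespace from it.
names(raw) <- gsub("[[:space:]]+", "", tolower(names(raw)))

# Preview the first rows, as the report does after renaming.
head(raw)
```

The same result could be reached with tidyverse helpers; base R is used here only to keep the sketch free of package dependencies.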

Split the “the_geom” column into two columns for latitude and longitude.

Each value in the_geom follows the format: POINT (longitude latitude). Separate POINT, longitude, and latitude into 3 different columns, then remove the POINT column from the dataframe.

## # A tibble: 6 × 15
##   propertyname     address city  zipcode benchmark propertyid owner propertytype
##   <chr>            <chr>   <chr> <chr>   <chr>     <chr>      <chr> <chr>       
## 1 <NA>             22300 … Clar… 20871   No Repor… 00018631   <NA>  <NA>        
## 2 249 - CLARKSBUR… 22500 … CLAR… 20871   In Compl… 00017090   Mont… K-12 School 
## 3 <NA>             22300 … Clar… 20871   No Repor… 02841561   <NA>  <NA>        
## 4 GERMANTOWN LIBR… 19840 … Germ… 20874-… In Compl… 03271420   Mont… Library     
## 5 <NA>             20900 … Germ… 20874   No Repor… 03198807   <NA>  <NA>        
## 6 <NA>             12409 … Germ… 20874   Received… 03327726   <NA>  <NA>        
## # ℹ 7 more variables: floorarea <dbl>, siteenergyuseintensity <dbl>,
## #   energystarrating <dbl>, weathernormalized <dbl>, yearbuilt <dbl>,
## #   long <chr>, lat <chr>
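
The split of the_geom described above can be sketched in base R as follows. This is one possible approach, not necessarily the one used for this report; the coordinates are made-up values, and the long and lat columns are left as character to match the printed output.

```r
# Toy rows in the documented format: POINT (longitude latitude).
bldgs <- data.frame(
  propertyid = c("A", "B"),
  the_geom = c("POINT (-77.27 39.23)", "POINT (-77.26 39.18)"),
  stringsAsFactors = FALSE
)

# Drop the leading "POINT (" and the trailing ")", then split on the space.
coords <- strsplit(gsub("^POINT \\(|\\)$", "", bldgs$the_geom), " ")

# The first piece is the longitude, the second is the latitude.
bldgs$long <- vapply(coords, `[`, character(1), 1)
bldgs$lat  <- vapply(coords, `[`, character(1), 2)

# The original column is no longer needed.
bldgs$the_geom <- NULL
bldgs
```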

Remove observations with incomplete data or where a report was not received.

This includes all observations where the value for Site Energy Use Intensity = 0.
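
As a rough sketch of this filter, the snippet below assumes that "incomplete" means a missing property type or a missing or zero Site EUI. The exact columns checked in the original analysis are not shown, so that rule is an assumption, and the rows are made up.

```r
# Toy rows: B has no report data, C reports a Site EUI of 0.
bldgs <- data.frame(
  propertyid = c("A", "B", "C", "D"),
  propertytype = c("Office", NA, "Library", "Hotel"),
  siteenergyuseintensity = c(80.6, NA, 0, 85.75),
  stringsAsFactors = FALSE
)

# Keep rows with a known property type and a positive Site EUI.
keep <- !is.na(bldgs$propertytype) &
  !is.na(bldgs$siteenergyuseintensity) &
  bldgs$siteenergyuseintensity > 0

clean <- bldgs[keep, ]
clean
```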

Ensure that all “city” entries are in regular case (not all caps)
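
One simple way to make the city spellings consistent is to lowercase them, which matches the lowercase city names in the count table later in this report. The sketch below assumes that was the approach; the values are illustrative.

```r
# The same city can appear in different cases in the raw data.
city <- c("ROCKVILLE", "Rockville", "Silver Spring ")

# Trim stray whitespace and lowercase so identical cities group together.
city_clean <- tolower(trimws(city))

# Both spellings of Rockville now fall into one group.
table(city_clean)
```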

Write the cleaned data to a new .csv file for Tableau use.

Data Analysis and Exploratory Visualizations

Create a boxplot to look at Energy Star Ratings statistics across different property types.

This plot shows that the median and range of Energy Star ratings differ across property types. It is difficult to see the differences across cities here since so many property types and cities are included. It will be helpful to filter this dataset further and look at only the most common property types.
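
For reference, a boxplot of this kind can be drawn with base R's boxplot(). The sketch below uses made-up ratings and is not the plotting code behind the figure in this report. The object returned by boxplot() also exposes the per-group medians that the box centre lines represent.

```r
# Toy ratings for two property types (hypothetical values).
ratings <- data.frame(
  propertytype = c("Office", "Office", "Office",
                   "K-12 School", "K-12 School", "K-12 School"),
  energystarrating = c(60, 70, 80, 70, 75, 90)
)

# Draw one box per property type; the summary statistics are returned invisibly.
b <- boxplot(energystarrating ~ propertytype, data = ratings,
             xlab = "Property type", ylab = "Energy Star rating")

# Row 3 of b$stats holds the median of each group.
medians <- setNames(b$stats[3, ], b$names)
medians
```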

Create a table of summary statistics

Means of Floor Area, EUI, Energy Star Rating, and Weather Normalized EUI
propertytype                           count    meanarea  meanEUI meanrating meanEUIweather
Office                                    40   349119.22    80.60      69.85         217.26
College/University                        24    54726.96   113.53       0.00         242.21
K-12 School                               22   307917.32    55.15      76.23         129.84
Hospital (General Medical & Surgical)      5   488536.00   265.44      29.20         430.88
Performing Arts                            5    89936.40   117.64       0.00         234.20
Hotel                                      4   332139.25    85.75      44.50         220.50
Library                                    4    60804.50    91.65       0.00         199.93
Pre-school/Daycare                         3     3781.00    61.17       0.00         137.00
Courthouse                                 2   245042.00   100.80      53.50         230.85
Enclosed Mall                              2  1728000.00    50.20       0.00         146.50
Fitness Center/Health Club/Gym             2    62006.00    96.40       0.00         200.50
Medical Office                             2   185899.00    74.85      66.50         220.15
Mixed Use Property                         2   305177.00    76.95      37.50         182.95
Other - Technology/Science                 2   296955.50   392.90       0.00         957.65
Data Center                                1   356179.00   407.40       1.00        1279.20
Energy/Power Station                       1    20406.00  7922.00       0.00       11101.80
Financial Office                           1   258204.00    85.30       0.00         263.90
Other                                      1   243966.00    77.10       0.00           0.00
Other - Recreation                         1    54022.00    54.90       0.00         168.00
Other - Services                           1     4720.00    90.20       0.00           0.00
Other - Specialty Hospital                 1    84051.00    98.90       0.00         273.70
Retail Store                               1   325000.00    62.40       0.00         162.60
Supermarket/Grocery Store                  1    58883.00   185.40      82.00         494.20
Wholesale Club/Supercenter                 1   152545.00   169.60      14.00         386.60

This table is helpful in determining which property types to focus on. Several property types have individual and mean Energy Star Ratings of 0. This means that their ratings were not calculated by the EPA due to insufficient data for statistical analysis. However, almost all of these buildings do have Site EUIs and Weather Normalized EUIs which we can still compare to the other property types.

Two buildings (one in “Other” and the other in “Other - Services”) do not have Weather Normalized EUIs even though they have Site EUIs.
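
A summary table with the same columns as the one above could be assembled along these lines. This is a base R sketch with made-up rows. The output column names mirror the table headers, and ratings of 0 are averaged as-is, which is what the 0.00 means in the table suggest was done.

```r
# Toy building records (hypothetical values).
bldgs <- data.frame(
  propertytype = c("Office", "Office", "Library"),
  floorarea = c(100000, 300000, 60000),
  siteenergyuseintensity = c(70, 90, 91.65),
  energystarrating = c(60, 80, 0),
  weathernormalized = c(200, 230, 199.93),
  stringsAsFactors = FALSE
)

# Mean of each numeric measure per property type
# (the formula interface drops rows with missing values).
means <- aggregate(
  cbind(floorarea, siteenergyuseintensity, energystarrating, weathernormalized) ~ propertytype,
  data = bldgs, FUN = mean
)
names(means)[-1] <- c("meanarea", "meanEUI", "meanrating", "meanEUIweather")

# Number of buildings per property type.
counts <- as.data.frame(table(propertytype = bldgs$propertytype),
                        responseName = "count", stringsAsFactors = FALSE)

# Combine counts and means, most common property type first.
summary_tbl <- merge(counts, means, by = "propertytype")
summary_tbl <- summary_tbl[order(-summary_tbl$count), ]
summary_tbl
```

Because a rating of 0 means "not rated", recoding those zeros to NA before averaging would be a reasonable alternative if the mean rating is meant to describe only rated buildings.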

Boxplot with the top 5 property types

This boxplot is a little easier to read now that fewer property types are included. Since so many cities are represented, it is still difficult to interpret. I will filter for the most common cities and recreate the boxplot with just their information.

## # A tibble: 6 × 2
##   city          count
##   <chr>         <int>
## 1 rockville        34
## 2 silver spring    20
## 3 bethesda         15
## 4 takoma park       9
## 5 germantown        6
## 6 chevy chase       4

I will focus my analysis on the top 6 cities represented in this dataset because, after Chevy Chase, each remaining city has only 1 or 2 observations.
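
Counting observations per city and keeping the most common ones can be sketched as follows. The city vector is a toy example, and the cutoff n_top is a hypothetical parameter set to 2 here; it would be 6 for the selection described above.

```r
# Toy city column after case cleanup (hypothetical counts).
city <- c(rep("rockville", 3), rep("bethesda", 2), "chevy chase")

# Count observations per city, largest first.
counts <- sort(table(city), decreasing = TRUE)

# Names of the n_top most common cities.
n_top <- 2
top_cities <- names(head(counts, n_top))

# Logical index that keeps only rows from those cities.
in_top <- city %in% top_cities
top_cities
```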

Boxplot with observations from the top 6 cities

Narrowing the cities down did not help make this plot much easier to read, so I will look at other statistical analysis methods.

Look at density curves and correlation output for predictor variables across all cities

The only strong correlation seen here is between EUI and weather normalized EUI (0.726), which is to be expected since one is calculated from the other.
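
The pairwise correlations behind a plot like this can also be inspected directly with cor(). The sketch below uses made-up values, so its coefficients are not the ones reported here; it only illustrates the call.

```r
# Toy numeric columns (hypothetical values).
bldgs <- data.frame(
  floorarea = c(60000, 150000, 300000, 350000, 490000),
  siteenergyuseintensity = c(92, 170, 55, 81, 265),
  weathernormalized = c(200, 387, 130, 217, 431),
  energystarrating = c(40, 14, 76, 70, 29)
)

# Pearson correlation matrix; pairwise deletion tolerates missing values.
cor_mat <- cor(bldgs, use = "pairwise.complete.obs")
round(cor_mat, 2)
```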

Do the correlations change when looking at just the top 6 cities in the dataset?

There are no significant changes in correlations here. Interestingly, the relationship between EUI and weather normalized EUI is not as strong.

Create scatterplot matrices for the top 6 cities

The relationships among the variables (city, property type, floor area, EUI, Energy Star Rating, Weather Normalized EUI, and year built) are illustrated here. Outliers are shown in blue. Significant correlations are indicated with asterisks. In addition to the expected strong correlation between EUI and Weather Normalized EUI, there are also relatively strong correlations between:

  • Property type and Energy Star Rating (0.59)
  • Floor area and Energy Star Rating (0.55)
  • Site EUI and Energy Star Rating (-0.52)

Although these correlations are only moderate (close to +/- 0.5), they are considerably stronger than any of the other correlations shown here.

Scatter plot of floor area, site EUI, and property type

I will use weather normalized EUI instead of site EUI to refine this visualization even though the correlation coefficient is smaller with weather normalized EUI. Since weather normalized EUI is supposed to account for weather as a confounding factor in determining energy efficiency, using site EUI would be a misrepresentation of what is actually happening with energy use in buildings.

Final Visualization 1: Interactive Scatterplot of Floor Area and Weather Normalized EUI by Property Type

This interactive visualization shows that the relationship between floor area and weather normalized EUI varies across individual properties as well as property types. For instance, K-12 schools are of similar size and have similar EUIs, whereas offices and hospitals vary widely in both size and EUI. The tooltip shows the specific square footage and EUI of a given building. The size of each circle corresponds to the Energy Star Rating. With more time, I would have liked to include the building name and Energy Star Rating in the tooltip.

Create a map of the top 5 property types in the top 6 cities in Montgomery County

Change lat and long column type to numeric
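
Because the split coordinates are stored as text, they need to be converted before mapping. A minimal sketch with made-up coordinates:

```r
# Coordinates stored as character after splitting the_geom (hypothetical values).
bldgs <- data.frame(
  long = c("-77.15", "-77.03"),
  lat = c("39.08", "38.99"),
  stringsAsFactors = FALSE
)

# Convert both columns to numeric so a mapping library can place the points.
bldgs$long <- as.numeric(bldgs$long)
bldgs$lat <- as.numeric(bldgs$lat)

str(bldgs)
```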

Create a map

Create a tooltip and add it to the map

Create a palette for Energy Star Ratings and add it to the map, along with a legend.
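
The map-building steps above can be sketched with the leaflet package. The package used for the original map is not stated, so leaflet is an assumption here, as are the toy buildings, the "YlGn" palette, and the radius scaling.

```r
library(leaflet)

# Toy buildings (hypothetical names, coordinates, and values).
bldgs <- data.frame(
  propertyname = c("Example Office", "Example School"),
  long = c(-77.15, -77.03),
  lat = c(39.08, 38.99),
  floorarea = c(349000, 308000),
  siteenergyuseintensity = c(80.6, 55.2),
  energystarrating = c(70, 76)
)

# Tooltip text shown when a marker is clicked.
bldgs$tip <- paste0(
  bldgs$propertyname,
  "<br>Site EUI: ", bldgs$siteenergyuseintensity,
  "<br>Energy Star rating: ", bldgs$energystarrating
)

# Continuous colour scale over the possible range of ratings.
pal <- colorNumeric(palette = "YlGn", domain = c(0, 100))

# Base map, one circle per building (sized by floor area, coloured by rating), and a legend.
m <- leaflet(bldgs) %>%
  addTiles() %>%
  addCircleMarkers(
    lng = ~long, lat = ~lat,
    radius = ~sqrt(floorarea) / 60,
    color = ~pal(energystarrating),
    fillOpacity = 0.8,
    popup = ~tip
  ) %>%
  addLegend(
    position = "bottomright", pal = pal,
    values = ~energystarrating, title = "Energy Star rating"
  )

# Printing m in an interactive session or an R Markdown chunk renders the map.
```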

Final Visualization 2

This map makes it easy to see a building’s Energy Star rating in relation to its size at a glance. Although a few buildings in this dataset do not have Energy Star ratings, I chose to use this measure because it is a holistic measure of energy efficiency and has been verified by a third party, whereas the EUI is self-reported by each building and is not always verified outside of the Energy Star review. EUI information is still available in the tooltip for buildings that do not have Energy Star ratings.

This visualization illustrates that energy efficiency varies widely across building types, building sizes, and other factors such as the year built, which is what makes the Energy Star rating system valuable. Initially, I assumed that certain types of buildings, or newer buildings, would be more energy efficient than others, but that is not the case according to these results. I would like to learn more about why certain property types (e.g., hospitals) tend to have lower Energy Star scores than others (e.g., schools).

References

Building Energy Benchmarking Results, electronic dataset, viewed 6 December 2023, <https://data.montgomerycountymd.gov/Environment/Building-Energy-Benchmarking-Results/izzs-2bn4/about_data>.

Energy efficiency: Buildings and industry. Office of Energy Efficiency & Renewable Energy. (n.d.). https://www.energy.gov/eere/energy-efficiency-buildings-and-industry

Energy Star, Energy Star Portfolio Manager Technical Reference 2 (2021). Environmental Protection Agency. Retrieved from https://portfoliomanager.energystar.gov/pdf/reference/ENERGY%20STAR%20Score.pdf.