US Energy Consumption

As a human, I am quite interested in the energy consumption habits of the United States, specifically what kinds of energy we are consuming. Especially in modern times, with the advancement of renewable energy, I am curious about the direction we are going as a nation in how we consume energy and what that has looked like over the years.

I would like to understand what types of energy are being consumed and to what extent. I am also curious about how the individual states consume this energy and how that relates to their population and various other factors.

The Data

The data used for my analysis was scraped from the Britannica ProCon website, which in turn pulled its data from the US Energy Information Administration. I scraped energy data from two separate Britannica pages and combined the tables on each page to create two separate data frames: one to analyze energy consumption at the state level as of 2018, and one to analyze energy consumption from 1960 to 2019. Below are the complete data sets used in the analysis. The data frames contain usage information for fossil fuels, nuclear energy, and renewable energy.

State Level

Units are in trillions of BTU

Summary Of The State Level Data

Variables in Data Set | Variable Type | Description
State…1 | character | US state
Population | number | Total population as of 2019
Total.Energy.Consumption | number | Total energy consumption
Total.Fossil.Fuel.Energy.Consumption | number | Total fossil fuel consumption
Total.Alternative.Energy.Consumption | number | Total alternative energy consumption, including nuclear energy
Coal | number | Coal consumption
Natural.Gas | number | Natural gas consumption
Petroleum | number | Petroleum consumption
Nuclear | number | Nuclear consumption
Hydroelectric | number | Hydroelectric consumption
Biomass | number | Biomass consumption
Geothermal | number | Geothermal consumption
solar | number | Solar consumption
Wind | number | Wind consumption
Total.Renewable.Energy | number | Total renewable energy consumption, excluding nuclear energy

Year Level

The values in this data set are percentages

Summary Of The Year Level Data

Variables in Data Set | Variable Type | Description
Year …1 | number | Years from 1960 to 2019
Coal | number | Percentage of coal used
Natural.Gas | number | Percentage of natural gas used
Petroleum | number | Percentage of petroleum used
Total.Fossil.Fuels | number | Total percentage of fossil fuel used
Nuclear | number | Percentage of nuclear used
Hydroelectric | number | Percentage of hydroelectric used
Geothermal | number | Percentage of geothermal used
Solar | number | Percentage of solar used
Wind | number | Percentage of wind used
Biomass | number | Percentage of biomass used
Total.Renewable.Energy | number | Total percentage of renewable used

Descriptive Analysis - State Level

The state level data looks at energy consumption in 2018. I want to understand how a state's population may affect its overall energy consumption and what types of energy are being consumed.

Energy Consumption By State Population

I find the first graph predictable: as population increases, so does energy consumption. Each dot represents a different state. I found it interesting that the state with the highest energy consumption was not the most populous state. It has more than 10,000,000 fewer people than the most populous state while consuming more than 7,500 more units of energy (trillions of BTU).
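One way to make this population comparison concrete is to divide each state's total consumption by its population. The following is a minimal sketch in Python, not the code behind the graph. The state names and numbers are made-up placeholders rather than values from the scraped data set, and the helper name `per_capita_million_btu` is hypothetical.

```python
# Hypothetical rows: (state, population, total energy in trillions of BTU).
# These numbers are placeholders, NOT values from the scraped data set.
rows = [
    ("State A", 30_000_000, 8_000.0),
    ("State B", 20_000_000, 14_000.0),
    ("State C", 5_000_000, 1_500.0),
]


def per_capita_million_btu(population, total_trillion_btu):
    """Convert a state total (trillions of BTU) to millions of BTU per person."""
    # 1 trillion BTU = 1,000,000 million BTU
    return total_trillion_btu * 1_000_000 / population


# Rank the states from highest to lowest per-person consumption.
ranked = sorted(
    ((state, per_capita_million_btu(pop, total)) for state, pop, total in rows),
    key=lambda pair: pair[1],
    reverse=True,
)

for state, value in ranked:
    print(f"{state}: {value:.1f} million BTU per person")
```

With these placeholder numbers, the smaller but more energy-intensive "State B" ranks first, which is the same pattern the scatter plot suggests for the highest-consuming state.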

Total Energy Consumption By State

This second graph provides more context for the first graph. We can now see who the culprits are for the highest energy consumption. I am curious why Texas has the highest energy consumption by such a wide margin while having approximately 10,000,000 fewer people than the most populous state. My speculation is that it comes down to more manufacturing, a hotter climate throughout the entire state, or a population that is more spread out (fewer apartment buildings).

Top 5 By Population Energy Consumption

I wanted to understand how the most populous states receive their energy. The graph is ordered from most populous to least populous, and each bar represents a state's total energy consumption. I did find it interesting that California and Texas consume almost the same amount of renewable energy, considering the politics of both states.

Descriptive Analysis - Year Level

I was very interested to understand the sources of our energy consumption over the years and how they have changed. The following graphs really put into perspective for me what types of energy we are using and how “far” we have come as a nation. Perhaps I was being naive, or maybe I was just watching the wrong channel on the television, or I may just be too involved in the production of a quality ranch dressing (that’s what we make at my manufacturing plant).

Overall Energy Use

I found this graph very interesting. The vast gap between fossil fuels on one side and nuclear and renewable energy on the other was surprising to me. But you can see fossil fuels starting to trend downward and renewables and nuclear beginning to increase.

Fossil Fuel Energy Use

Taking a deeper look into fossil fuels, I found the divergence of coal and natural gas in the mid-2000s very interesting. That was about the time the fracking boom took off, and it is quite interesting to see how it affected the consumption of coal shortly after the boom hit.

Renewable Energy Use

Taking a deeper look into renewable energy, I found the decrease of hydroelectric over time interesting. I would have expected that particular source of energy to increase over time.

Secondary Data Source

I scraped 2,000 tweets that included “renewable energy” and categorized those tweets under the same name. I also scraped 2,000 tweets that included “fossil fuel”, “coal”, or “natural gas” and categorized those tweets as “fossil fuel”. I then performed a sentiment analysis using the NRC lexicon and the Bing lexicon.

NRC Sentiment Analysis

The NRC sentiment analysis showed “renewable energy” to have a more positive sentiment than “fossil fuel”. Notably, “renewable energy” had higher scores in the sentiments of trust, surprise, positive, joy, and anticipation. “Fossil fuel” had higher scores in the sentiments of anger, disgust, negative, and sadness.

Bing Positive/Negative

Bing Sentiment Graph

The Bing sentiment analysis yielded some interesting results. It is likely not an accurate depiction of the true sentiment of each tweet, but it is interesting to see some of the most common words in the tweets. Specifically, the fossil fuel tweets have the words “safe” and “clean” counted as positive. Those words are likely being used in a negative context when mentioning fossil fuels.
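The context problem described above can be shown with a small word-level lookup. This is a minimal sketch in Python and not the analysis code used here. The four-word `LEXICON` is a hypothetical stand-in for the much larger Bing lexicon, and the tweet is made up.

```python
import re
from collections import Counter

# Tiny hypothetical lexicon for illustration only; the real Bing lexicon
# contains thousands of words.
LEXICON = {
    "clean": "positive",
    "safe": "positive",
    "dirty": "negative",
    "toxic": "negative",
}


def word_sentiments(text):
    """Count positive and negative lexicon hits, ignoring surrounding context."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(LEXICON[w] for w in words if w in LEXICON)


tweet = "Coal is not clean and never was safe"  # made-up example tweet
counts = word_sentiments(tweet)
print(counts)
```

Because each word is scored on its own, “clean” and “safe” both count as positive here even though the sentence negates them, which is the kind of distortion suspected in the fossil fuel tweets.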

Predictive Analysis

As a homeowner, I am often solicited to install solar panels on the roof of my home. Since this is a growing form of energy, I want to understand more about it and how it is affected by other sources of energy.

Exploratory Analysis

My first step is to run a correlation matrix on the energy sources in the year level data set. Solar has a high correlation with all of the other sources of energy.
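For readers unfamiliar with the term, a correlation matrix is just the Pearson correlation computed for every pair of columns. The following is a minimal sketch in Python using only the standard library. The three series are made-up placeholders, not the scraped yearly data, and the function name `pearson` is my own.

```python
from math import sqrt

# Made-up yearly shares for three sources (placeholders, not the scraped data).
data = {
    "Coal": [0.23, 0.22, 0.20, 0.17, 0.14],
    "Solar": [0.001, 0.002, 0.003, 0.006, 0.011],
    "Wind": [0.005, 0.009, 0.013, 0.018, 0.024],
}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Build the full matrix as a nested dict: matrix[row][column].
names = list(data)
matrix = {a: {b: pearson(data[a], data[b]) for b in names} for a in names}

for a in names:
    print(a, [round(matrix[a][b], 2) for b in names])
```

Any two series that trend steadily over the same years will correlate strongly, positively or negatively, so high values in such a matrix are expected for sources that are steadily growing or shrinking.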

Regression 1

The regression shows a high amount of correlation between the variables, and the model produced a high R2 (0.985). This makes sense because when one type of energy decreases, another type of energy increases to fill the void. I don’t believe the data set I have pulled is a good predictor of energy use. To understand the full impact of solar, I would need to find a data set that includes other variables like GDP, weather, or population. This is something I intend to investigate in the future.

## 
## Solar Regression 1
## ===============================================
##                         Dependent variable:    
##                     ---------------------------
##                                Solar           
## -----------------------------------------------
## Natural.Gas              -0.484*** (0.092)     
## Petroleum                -4.740*** (0.912)     
## Nuclear                  -0.508*** (0.091)     
## Hydroelectric            -0.435*** (0.087)     
## Geothermal                 0.392 (0.440)       
## Wind                     -0.265** (0.110)      
## Coal                     -0.503*** (0.083)     
## Biomass                  -0.627*** (0.091)     
## Constant                 0.487*** (0.089)      
## -----------------------------------------------
## Observations                    36             
## R2                             0.985           
## Adjusted R2                    0.980           
## Residual Std. Error      0.0004 (df = 27)      
## F Statistic           217.597*** (df = 8; 27)  
## ===============================================
## Note:               *p<0.1; **p<0.05; ***p<0.01
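The "fill the void" reasoning can be illustrated with synthetic data. When every year's shares sum to a fixed total, any one share is determined by the others, so regressing it on them gives a near-perfect fit regardless of any real-world relationship. The following is a minimal sketch in Python under that assumption. The data is randomly generated rather than taken from the data set, and the helpers `solve` and `ols_r2` are hypothetical names for a small hand-rolled least-squares fit. The reported fit above is not exact, so the real data evidently does not satisfy such an identity perfectly; the sketch only illustrates the mechanism.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols_r2(y, predictors):
    """Fit y on the predictors plus an intercept; return (coefficients, R^2)."""
    rows = [[1.0] + list(p) for p in zip(*predictors)]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1 - ss_res / ss_tot


random.seed(0)
# Synthetic shares for four hypothetical sources over 36 years; each year sums to 1.
years = []
for _ in range(36):
    raw = [random.uniform(0.5, 1.5) for _ in range(4)]
    total = sum(raw)
    years.append([v / total for v in raw])

target, *others = zip(*years)  # treat the first source as the response
beta, r2 = ols_r2(target, others)
print([round(b, 3) for b in beta], round(r2, 4))
```

In this synthetic case the intercept comes out close to 1 and each slope close to -1, with R2 essentially 1, even though the shares were drawn at random. That is why a high R2 on share data alone says little about what actually drives solar adoption.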