As a human, I am quite interested in the energy consumption habits of the United States - specifically what kinds of energy we are consuming. Especially in modern times, with the advancement of renewable energy, I am curious about the direction we are going as a nation in how we consume energy and what that consumption has looked like over the years.
I would like to understand what types of energy are being consumed and to what extent. I am also curious about how the individual states consume this energy and how that relates to their population and various other factors.
The data for my analysis was scraped from the Britannica ProCon website, which pulled its data from the US Energy Information Administration. I scraped energy data from two separate Britannica pages and combined the tables on each page to create two separate data frames: one to analyze energy consumption at the state level as of 2018 and one to analyze energy consumption from 1960-2019. Below are the complete data sets used in the analysis. The data frames contain usage information for fossil fuels, nuclear energy and renewable energy.
Units are in trillions of BTU
| Variables in Data Set | Variable Type | Description |
|---|---|---|
| State…1 | character | US State |
| Population | number | Total population as of 2019 |
| Total.Energy.Consumption | number | Total energy consumption |
| Total.Fossil.Fuel.Energy.Consumption | number | Total fossil fuel consumption |
| Total.Alternative.Energy.Consumption | number | Total alternative energy consumption including nuclear energy |
| Coal | number | Coal consumption |
| Natural.Gas | number | Natural gas consumption |
| Petroleum | number | Petroleum consumption |
| Nuclear | number | Nuclear consumption |
| Hydroelectric | number | Hydroelectric consumption |
| Biomass | number | Biomass consumption |
| Geothermal | number | Geothermal consumption |
| solar | number | Solar consumption |
| Wind | number | Wind consumption |
| Total.Renewable.Energy | number | Total renewable energy consumption excluding nuclear energy |
The values in this data set are percentages.
| Variables in Data Set | Variable Type | Description |
|---|---|---|
| Year …1 | number | Years from 1960 - 2019 |
| Coal | number | Percentage of coal used |
| Natural.Gas | number | Percentage of natural gas used |
| Petroleum | number | Percentage of petroleum used |
| Total.Fossil.Fuels | number | Total percentage of fossil fuel used |
| Nuclear | number | Percentage of nuclear used |
| Hydroelectric | number | Percentage of hydroelectric used |
| Geothermal | number | Percentage of geothermal used |
| Solar | number | Percentage of solar used |
| Wind | number | Percentage of wind used |
| Biomass | number | Percentage of biomass used |
| Total.Renewable.Energy | number | Total percentage of renewable used |
The state-level data looks at energy consumption in 2018. I want to understand how a state's population may impact its overall energy consumption and what types of energy are being consumed.
I find the first graph predictable: as the population increases, so does the energy consumption. Each dot represents a different state. I found it interesting that the state with the highest energy consumption was not the most populous state. It has more than 10,000,000 fewer people than the most populous state while consuming more than 7,500 more units of energy.
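One way to put a number on that observation is to divide each state's total consumption by its population. Below is a minimal Python sketch of that calculation. It is not the code behind the graph, the two rows are hypothetical placeholder values rather than figures from the data set, and the helper name `per_capita_million_btu` is my own invention.

```python
# Hypothetical placeholder rows, NOT values from the scraped data set.
# total_energy is in trillion BTU, matching the units of the state-level table.
states = [
    {"state": "State A", "population": 30_000_000, "total_energy": 9_000.0},
    {"state": "State B", "population": 20_000_000, "total_energy": 12_000.0},
]


def per_capita_million_btu(total_trillion_btu, population):
    """Convert a total in trillion BTU to million BTU per person."""
    # 1 trillion BTU equals 1,000,000 million BTU.
    return total_trillion_btu * 1_000_000 / population


for row in states:
    row["per_capita"] = per_capita_million_btu(row["total_energy"], row["population"])

# Rank states from highest to lowest per-person consumption.
ranked = sorted(states, key=lambda r: r["per_capita"], reverse=True)
for row in ranked:
    print(f'{row["state"]}: {row["per_capita"]:.0f} million BTU per person')
```

A per-person figure like this separates "consumes a lot because many people live there" from "consumes a lot for other reasons".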
This second graph provides more context for the first graph. We can now see which states are the culprits for the highest energy consumption. I am curious why Texas has the highest energy consumption by such a wide margin even though it is less populous by approximately 10,000,000 people. My speculation is that Texas has more manufacturing, a hotter climate throughout the entire state, or a more spread-out population (fewer apartment buildings).
I wanted to understand how the states with the most energy consumption receive their energy. The graph is ordered from most populous to least populous, and each bar represents a state's total energy consumption. I did find it interesting that California and Texas consume almost the same amount of renewable energy, considering the politics of both states.
I was very interested to understand the sources of our energy consumption over the years and how they have changed. The following graphs really put it into perspective for me in terms of what type of energy we are using and how “far” we have come as a nation. Perhaps I was being naive, or maybe I was just watching the wrong channel on the television, or I may just be too involved in the production of a quality ranch dressing (that’s what we make at my manufacturing plant).
I found this graph very interesting. The vast gap between fossil fuels and both nuclear and renewable energy was surprising to me. But you can see the trend of fossil fuels starting to decrease and renewables and nuclear beginning to increase.
Taking a deeper look into fossil fuels, I found the divergence of coal and natural gas in the mid-2000s very interesting. That was about the time the fracking boom took off, and it is quite interesting to see how it affected the consumption of coal shortly after the boom hit.
Taking a deeper look into renewable energy, I found the decrease of hydroelectric over time interesting. I would have expected that particular source of energy to increase over time.
I scraped 2000 tweets that included “renewable energy” and categorized those tweets under the same name. I also scraped 2000 tweets that included “fossil fuel”, “coal” and “natural gas” and categorized those tweets as “fossil fuel”. I then performed a sentiment analysis using the NRC lexicon and the Bing lexicon.
The NRC sentiment analysis showed “renewable energy” to have a more positive sentiment than “fossil fuel”. Notably, “renewable energy” had higher scores in the sentiments of trust, surprise, positive, joy, and anticipation. “Fossil fuel” had higher scores in the sentiments of anger, disgust, negative and sadness.
The Bing sentiment analysis yielded some interesting results. It is likely not an accurate depiction of the true sentiment of each tweet, but it is interesting to see some of the most common words in the tweets. Specifically, the fossil fuel tweets have the words “safe” and “clean” counted as positive. Those words are likely being used in a negative context when mentioning fossil fuels.
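The context problem is easy to reproduce with a small word-level scorer. The Python sketch below is only an illustration of how a lexicon lookup behaves, not the analysis that produced the results. `TOY_LEXICON` is a hypothetical five-word stand-in for a polarity lexicon such as Bing, and the three tweets are made up for the example.

```python
import re
from collections import Counter

# Hypothetical stand-in for a polarity lexicon; a real one has thousands of entries.
TOY_LEXICON = {
    "clean": "positive",
    "safe": "positive",
    "cheap": "positive",
    "dirty": "negative",
    "crisis": "negative",
}

# Made-up example tweets as (category, text) pairs, for illustration only.
tweets = [
    ("renewable energy", "Solar is clean and cheap"),
    ("fossil fuel", "There is no such thing as clean coal"),
    ("fossil fuel", "Natural gas is not safe and fuels the climate crisis"),
]


def score_words(text):
    """Return (word, label) for every word of the text found in the lexicon."""
    words = re.findall(r"[a-z']+", text.lower())
    return [(w, TOY_LEXICON[w]) for w in words if w in TOY_LEXICON]


# Tally labelled words per tweet category.
counts = {}
for category, text in tweets:
    counter = counts.setdefault(category, Counter())
    for word, label in score_words(text):
        counter[(word, label)] += 1

for category, counter in counts.items():
    print(category, dict(counter))
```

Because each word is looked up on its own, "clean" in "no such thing as clean coal" and "safe" in "not safe" are both tallied as positive, which matches the pattern seen in the fossil fuel tweets.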
As a homeowner, I am often solicited to install solar panels on the roof of my home. Since this is a growing form of energy, I want to understand more about it and how it is impacted by other sources of energy.
My first step is to run a correlation matrix of the energy sources in the year-level data set. Solar has a high correlation with all of the other sources of energy.
The regression shows a high amount of correlation between the variables, and the model produced a high R2. This makes sense because when one type of energy decreases, another type of energy increases to fill the void. I don’t believe the data set I have pulled is a good predictor of energy use. To understand the full impact of solar, I would need to find a data set that includes other variables like GDP, weather, or population. This is something I intend to investigate in the future.
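For readers who want to see what a correlation matrix computes, here is a minimal pure-Python sketch of pairwise Pearson correlations. It is an illustration only: the three series are hypothetical placeholder shares, not values from the year-level data set, and the `pearson` helper is written out by hand rather than taken from any library.

```python
from math import sqrt

# Hypothetical placeholder shares (percent of total consumption) for five
# consecutive years. These are NOT the values from the year-level data set.
shares = {
    "Coal": [23.0, 22.0, 20.0, 17.0, 14.0],
    "Solar": [0.1, 0.2, 0.4, 0.7, 1.0],
    "Wind": [0.5, 0.9, 1.4, 2.0, 2.5],
}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Build the full matrix as a nested dict: corr[row][column].
names = list(shares)
corr = {a: {b: pearson(shares[a], shares[b]) for b in names} for a in names}

for a in names:
    print(a.ljust(6), "  ".join(f"{corr[a][b]:+.2f}" for b in names))
```

One caution when reading such a matrix: series that all trend steadily over the same years can correlate strongly with each other even when neither one drives the other.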
Solar Regression 1 (dependent variable: Solar)

| Term | Estimate |
|---|---|
| Natural.Gas | -0.484*** (0.092) |
| Petroleum | -4.740*** (0.912) |
| Nuclear | -0.508*** (0.091) |
| Hydroelectric | -0.435*** (0.087) |
| Geothermal | 0.392 (0.440) |
| Wind | -0.265** (0.110) |
| Coal | -0.503*** (0.083) |
| Biomass | -0.627*** (0.091) |
| Constant | 0.487*** (0.089) |
| Observations | 36 |
| R2 | 0.985 |
| Adjusted R2 | 0.980 |
| Residual Std. Error | 0.0004 (df = 27) |
| F Statistic | 217.597*** (df = 8; 27) |

Note: *p<0.1; **p<0.05; ***p<0.01
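
The summary rows of the regression output above can be cross-checked against each other with a few lines of arithmetic. The Python sketch below uses only numbers printed in the output and the standard identities for an ordinary least squares model with an intercept. The variable names are my own.

```python
# Values copied from the regression output.
n_obs = 36          # Observations
n_predictors = 8    # slope coefficients, excluding the constant
f_stat = 217.597    # F Statistic

# Residual degrees of freedom: observations minus slopes minus the intercept.
residual_df = n_obs - n_predictors - 1

# For OLS with an intercept, F = (R2 / p) / ((1 - R2) / df),
# which rearranges to R2 = F * p / (F * p + df).
r2 = f_stat * n_predictors / (f_stat * n_predictors + residual_df)

# Adjusted R2 penalizes R2 for the number of predictors used.
adj_r2 = 1 - (1 - r2) * (n_obs - 1) / residual_df

print(f"residual df = {residual_df}")
print(f"R2          = {r2:.3f}")
print(f"adjusted R2 = {adj_r2:.3f}")
```

The recovered values agree with the reported df of 27, R2 of 0.985 and adjusted R2 of 0.980. With 8 predictors and only 36 observations, the gap between R2 and adjusted R2 is worth keeping an eye on.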