A: Time interval: Quarterly. The series shows strong seasonal fluctuations and a long-term trend. Brick production increases steadily from the late 1950s through the 1970s, peaks in the late 1970s and early 1980s, and then declines with increased volatility in later years. These patterns are consistent with construction cycles and broader economic conditions.
Use plot_series() to produce a time plot of each series.
For the last plot, modify the axis labels and title.
A: Time interval: Annual. The lynx series exhibits strong cyclical behavior, with repeated peaks and troughs over multi-year periods. The magnitude of these cycles varies through time, and there is no seasonal pattern because the data are annual.
Use plot_series() to produce a time plot of each series.
For the last plot, modify the axis labels and title.
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Parse the dates and keep only the lynx series
pelt["ds"] = pd.to_datetime(pelt["ds"], errors="coerce")
lynx = pelt[pelt["unique_id"] == "lynx"].dropna(subset=["ds", "y"]).copy()

plt.figure(figsize=(10, 4))
plt.plot(lynx["ds"], lynx["y"])
plt.title("Lynx Trappings")
plt.xlabel("Year")
plt.ylabel("Number of Lynx")

# One labelled tick every 10 years
ax = plt.gca()
ax.xaxis.set_major_locator(mdates.YearLocator(10))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))
plt.tight_layout()
plt.show()
GOOG_Close
Use info() to find out about the data in each series.
import pandas as pd

gafa = pd.read_csv("gafa_stock.csv")
gafa["ds"] = pd.to_datetime(gafa["ds"], errors="coerce")
gafa.info()
A: Time interval: Daily (trading days). The Google closing price shows a clear upward trend across the sample period, with noticeable periods of higher volatility. Large price movements tend to cluster in time, which is typical for financial series. There is no obvious seasonal pattern at the daily scale.
Use plot_series() to produce a time plot of each series.
For the last plot, modify the axis labels and title.
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

goog_close = gafa[gafa["unique_id"] == "GOOG_Close"].dropna(subset=["ds", "y"]).copy()

plt.figure(figsize=(10, 4))
plt.plot(goog_close["ds"], goog_close["y"])
plt.title("Google Daily Closing Price (GOOG_Close)")
plt.xlabel("Date")
plt.ylabel("Closing Price")

# One labelled tick per year
ax = plt.gca()
ax.xaxis.set_major_locator(mdates.YearLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))
plt.tight_layout()
plt.show()
Demand
Use info() to find out about the data in each series.
import pandas as pd

vic = pd.read_csv("vic_elec.csv")
vic["ds"] = pd.to_datetime(vic["ds"], errors="coerce")
vic.info()
A: Time interval: Half-hourly. Electricity demand exhibits strong and regular short-term patterns, reflecting daily and weekly seasonality. Demand fluctuates consistently within each day, with broader changes over longer horizons. Occasional spikes and dips suggest the influence of extreme weather, holidays, or other unusual events.
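One way to check the daily and weekly patterns described above is to average demand by half-hour slot and by day of the week. The sketch below uses a synthetic half-hourly series, not the vic_elec data; the series shape, the weekend drop, and the names `daily_profile` and `weekly_profile` are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd

# Synthetic half-hourly series (illustrative only, NOT the vic_elec data).
# 2012-01-02 is a Monday, so 14 days cover two full weeks.
idx = pd.date_range("2012-01-02", periods=48 * 14, freq="30min")
hour = idx.hour.to_numpy() + idx.minute.to_numpy() / 60
daily_shape = 1000 * np.sin(2 * np.pi * (hour - 6) / 24)      # within-day swing
weekend_drop = np.where(idx.dayofweek.to_numpy() >= 5, -500, 0)  # lower weekend demand
demand = pd.Series(4500 + daily_shape + weekend_drop, index=idx, name="y")

# Seasonal periods implied by a half-hourly interval
periods_per_day = 48
periods_per_week = periods_per_day * 7

# Average demand for each half-hour slot of the day (daily seasonality)
daily_profile = demand.groupby([demand.index.hour, demand.index.minute]).mean()

# Average demand for each day of the week, 0 = Monday (weekly seasonality)
weekly_profile = demand.groupby(demand.index.dayofweek).mean()

print(periods_per_day, periods_per_week)
print(weekly_profile)
```

Applied to the real series, the same two groupby calls on the `ds` column would give a rough picture of the within-day and within-week shape before fitting any model.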
Use plot_series() to produce a time plot of each series.
For the last plot, modify the axis labels and title.
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

demand = vic[vic["unique_id"] == "Demand"].dropna(subset=["ds", "y"]).copy()

plt.figure(figsize=(10, 4))
plt.plot(demand["ds"], demand["y"])
plt.title("Half-hourly Electricity Demand in Victoria")
plt.xlabel("Time")
plt.ylabel("Electricity Demand (MWh)")

# One labelled tick per year
ax = plt.gca()
ax.xaxis.set_major_locator(mdates.YearLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))
plt.tight_layout()
plt.show()
Exercise 2.2
Use query() to find what days corresponded to the peak closing price for each of the four stocks in gafa_stock.
The peak closing prices for AAPL, AMZN, FB, and GOOG occur at different points in time, reflecting stock-specific performance rather than a common market peak. While all four series show overall growth, the timing of their maximum values varies due to differences in business conditions and investor expectations. This highlights how individual stock behavior can diverge even within the same market sector.
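A minimal sketch of how query() could be used to find the peak days is given here. It runs on a tiny made-up sample rather than gafa_stock.csv, and it assumes the long format used earlier (columns unique_id, ds, y) with closing-price series named like "GOOG_Close"; the name "AAPL_Close" and all prices are placeholders.

```python
import pandas as pd

# Made-up sample in long format; prices are placeholders, not real data.
gafa_demo = pd.DataFrame({
    "unique_id": ["AAPL_Close"] * 3 + ["GOOG_Close"] * 3,
    "ds": pd.to_datetime(["2018-01-02", "2018-01-03", "2018-01-04"] * 2),
    "y": [172.3, 172.2, 173.0, 1065.0, 1082.5, 1080.0],
})

# Keep only the closing-price series (assumed naming convention)
closes = gafa_demo[gafa_demo["unique_id"].str.endswith("_Close")].copy()

# Attach each series' maximum to every row, then keep rows that hit it
closes["peak"] = closes.groupby("unique_id")["y"].transform("max")
peak_days = closes.query("y == peak")[["unique_id", "ds", "y"]]

print(peak_days)
```

With the full dataset, replacing `gafa_demo` with `gafa` should return one row per stock, or more than one if a peak price is repeated on several days.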
Exercise 2.3
Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.
You can read the data into Python with the following script:
import pandas as pd

tute1 = pd.read_csv("tute1.csv")
tute1.head()
      Quarter   Sales  AdBudget    GDP
0  1981-03-01  1020.2     659.2  251.8
1  1981-06-01   889.2     589.0  290.9
2  1981-09-01   795.0     512.5  290.8
3  1981-12-01  1003.9     614.1  292.4
4  1982-03-01  1057.7     647.2  279.1
Convert the data to time series
import pandas as pd

# Treat each date as a quarterly period, then convert to a timestamp
tute1["ds"] = pd.PeriodIndex(tute1["Quarter"], freq="Q").to_timestamp()
Construct time series plots of each of the three series
import matplotlib.pyplot as plt

plt.figure(figsize=(10, 4))
plt.plot(tute1["ds"], tute1["Sales"], label="Sales")
plt.plot(tute1["ds"], tute1["AdBudget"], label="AdBudget")
plt.plot(tute1["ds"], tute1["GDP"], label="GDP")
plt.title("Quarterly Sales, Advertising Budget, and GDP")
plt.xlabel("Time")
plt.ylabel("Value (inflation-adjusted)")
plt.legend()
plt.tight_layout()
plt.show()
The file tute1.csv contains quarterly data from 1981 to 2005, including sales for a small company, its advertising budget, and gross domestic product, all adjusted for inflation. After converting the time variable to a datetime format, time series plots were constructed for each variable to examine their overall behavior. These plots allow for visual comparison of long-term trends and variability across the three series.
Exercise 2.4
The file us_total.csv contains data on the demand for natural gas in the US.
Download us_total.csv from the book website and read in the CSV file using pd.read_csv().
import pandas as pd
import matplotlib.pyplot as plt

us_total = pd.read_csv("us_total.csv")
us_total.head()
     ds unique_id       y
0  1997   Alabama  324158
1  1998   Alabama  329134
2  1999   Alabama  337270
3  2000   Alabama  353614
4  2001   Alabama  332693
Create a dataframe from us_total with year as the index.
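A short sketch of one way to do this, assuming the ds column holds the year as shown in the output above. The demo frame reuses the Alabama rows from that output; the "Maine" rows are placeholder values added only so the example has two states, and the names `by_year` and `wide` are illustrative.

```python
import pandas as pd

# Alabama values come from the head() output above;
# the Maine values are PLACEHOLDERS, not real data.
us_total_demo = pd.DataFrame({
    "ds": [1997, 1998, 1999, 1997, 1998, 1999],
    "unique_id": ["Alabama"] * 3 + ["Maine"] * 3,
    "y": [324158, 329134, 337270, 100, 110, 120],
})

# Long format with the year as the index
by_year = us_total_demo.set_index("ds").rename_axis("year")

# Optional wide format: one row per year, one column per state
wide = us_total_demo.pivot(index="ds", columns="unique_id", values="y")

print(by_year.head())
print(wide)
```

The wide layout is convenient for plotting several states at once, while the long layout matches the unique_id/ds/y convention used elsewhere in these exercises.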
Plot the annual natural gas consumption by state for the New England area (comprising the states of Maine, Vermont, New Hampshire, Massachusetts, Connecticut and Rhode Island).
us_total = pd.read_csv("us_total.csv")
us_total.columns = us_total.columns.str.strip()  # remove stray spaces in column names

new_england_states = [
    "Maine", "Vermont", "New Hampshire",
    "Massachusetts", "Connecticut", "Rhode Island",
]
ne = us_total[us_total["unique_id"].isin(new_england_states)].copy()

plt.figure(figsize=(10, 4))
# One line per state
for state in new_england_states:
    s = ne[ne["unique_id"] == state]
    plt.plot(s["ds"], s["y"], label=state)
plt.title("Annual Natural Gas Consumption in New England")
plt.xlabel("Year")
plt.ylabel("Natural Gas Consumption")
plt.legend()
plt.tight_layout()
plt.show()
The plot shows annual natural gas consumption for New England states over time. Massachusetts and Connecticut consistently account for the largest share of regional consumption, with Massachusetts remaining the dominant consumer throughout the period. Connecticut shows a clear upward trend, particularly after the mid-2000s.
In contrast, Maine, New Hampshire, and Rhode Island exhibit more moderate usage levels with relatively smaller fluctuations, while Vermont remains the lowest consumer across all years. Overall, the region shows steady growth in natural gas demand, driven primarily by increases in the larger states.
Exercise 2.5
Download tourism.xlsx from the book website and read it in using pd.read_excel().
import pandas as pd
import matplotlib.pyplot as plt

tourism = pd.read_excel("tourism.xlsx", engine="calamine")
tourism.head()
The tourism data were grouped by Region and Purpose to compute the average number of overnight trips. The combination with the highest average corresponds to a large-population, travel-heavy region. The data were then aggregated by State to obtain total trips, which gives a simplified view of overall tourism demand at the state level.
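The grouping steps described above can be sketched as follows. The example runs on a small made-up frame, not tourism.xlsx, and assumes the columns are named Region, State, Purpose and Trips; the trip counts are placeholders chosen for illustration.

```python
import pandas as pd

# Made-up sample; column names are assumed and trip counts are placeholders.
tourism_demo = pd.DataFrame({
    "Region": ["Sydney", "Sydney", "Sydney", "Sydney", "Melbourne", "Melbourne"],
    "State": ["New South Wales"] * 4 + ["Victoria"] * 2,
    "Purpose": ["Visiting", "Visiting", "Business", "Business", "Visiting", "Visiting"],
    "Trips": [700.0, 800.0, 500.0, 600.0, 600.0, 650.0],
})

# Average trips for each Region/Purpose combination
avg_trips = tourism_demo.groupby(["Region", "Purpose"], as_index=False)["Trips"].mean()

# Row with the highest average
top = avg_trips.loc[avg_trips["Trips"].idxmax()]

# Total trips by State
state_totals = tourism_demo.groupby("State", as_index=False)["Trips"].sum()

print(top)
print(state_totals)
```

Running the same three steps on the `tourism` frame would name the actual Region and Purpose combination with the highest average.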
Exercise 2.8
Use the following graphics functions: plot_series(), seasonal_decompose(), lag_plot(), plot_acf() and explore features from the following time series: “Total Private” Employed from us_employment, Bricks from aus_production, Hare from pelt, “H02” Cost from PBS, and Barrels from us_gasoline.
Can you spot any seasonality, cyclicity and trend?
Yes. Total Private Employment shows a strong long-term upward trend with noticeable cyclical downturns during recession periods, but little seasonality. Bricks, H02 cost, and US gasoline barrels all display clear seasonality, with repeating within-year patterns. Hare does not exhibit seasonality, but instead shows pronounced multi-year cycles.
What do you learn about the series?
Each series exhibits different underlying structures. Employment is dominated by trend and business cycles, indicating strong persistence. Bricks and gasoline demand are strongly seasonal and autocorrelated, reflecting production and consumption patterns tied to the calendar. H02 cost shows both trend and seasonality, suggesting steadily rising costs with recurring annual behavior. The hare series is highly volatile and cyclical, with sharp rises and falls rather than steady growth.
What can you say about the seasonal patterns?
Seasonality is strongest and most regular in Bricks, H02 cost, and US gasoline barrels, as confirmed by the seasonal decomposition and ACF plots showing peaks at seasonal lags. These seasonal effects appear stable over time. Total Private Employment shows only weak seasonality relative to its trend, while Hare shows no seasonal pattern at all.
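To illustrate what "peaks at seasonal lags" means, the sketch below builds a synthetic monthly series with a 12-period cycle and computes its autocorrelation at a few lags. It uses pandas' Series.autocorr (a Pearson correlation between the series and its lagged copy) as a stand-in for the values that plot_acf() would draw; the series and its parameters are assumptions, not any of the datasets above.

```python
import numpy as np
import pandas as pd

# Synthetic series with a 12-period seasonal cycle plus noise (illustrative only)
rng = np.random.default_rng(0)
n = 120
t = np.arange(n)
seasonal = pd.Series(10 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 1, n))

# Autocorrelation at half a season, one season, and two seasons
acf_by_lag = {lag: seasonal.autocorr(lag=lag) for lag in (6, 12, 24)}

for lag, value in acf_by_lag.items():
    print(f"lag {lag:>2}: {value:.2f}")
```

A strongly seasonal series shows large positive autocorrelation at multiples of the seasonal period and negative autocorrelation near the half-period, which is the pattern to look for in the ACF plots of Bricks, H02 cost, and gasoline barrels.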
Can you identify any unusual years?
Yes. Total Private Employment shows sharp declines during major economic downturns, which stand out as unusual periods. US gasoline barrels displays abrupt drops during periods of reduced demand or disruption. Bricks shows irregular spikes and declines that deviate from its typical seasonal pattern. The Hare series contains extreme peak and trough years associated with population booms and crashes, making those years clearly unusual.