?fpp2
plot(melsyd[, "Economy.Class"], main = "Economy class passengers: Melbourne-Sydney", xlab = "Year", ylab = "Thousands")
Passenger traffic shows several notable anomalies. In 1989, no passengers were carried for a period due to an industrial dispute. In 1992, load factors were reduced during a trial that converted some economy seats to business class. A sharp increase in passenger numbers occurred in the second half of 1991, while holiday effects produced large dips around the start of each year.
The series also exhibits long-term fluctuations: rising through 1987, falling in 1989, and then increasing again across 1990–1991. Some observations are missing, and in certain periods, values even drop to zero.
Antidiabetic drug sales
The a10 dataset from the fpp2 package contains monthly sales of antidiabetic drugs in Australia from 1991 to 2008. The plot below shows the trend and the seasonal pattern in these sales.
plot(x = a10, ylab = "$ million", xlab = "Year", main = "Antidiabetic drug sales")
Interpretation of the Time Series Plot
Clear upward trend: Sales increase steadily from the early 1990s to the late 2000s, and the growth looks faster than linear (closer to exponential).
Strong seasonal pattern: There is a regular spike around the turn of each year followed by a sharp drop, consistent with patients stockpiling drugs at the end of the calendar year because of the government subsidy scheme.
Increasing seasonal amplitude: The size of the seasonal fluctuations grows with the level of the series, suggesting a multiplicative seasonal effect.
Heteroskedasticity: The variance increases over time, which violates the constant-variance assumption of many models; a log transformation may stabilize it.
Implications for Modeling
Use a multiplicative decomposition or fit a model on the log-transformed data.
Models like ETS(M,A,M) (multiplicative error, additive trend, multiplicative seasonality) are reasonable candidates.
Any forecasting model should account for both the nonlinear growth trend and the seasonality tied to policy-driven behavior.
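The effect of a log transformation can be checked with a few lines of base R. This is a minimal sketch, not the analysis of a10 itself: the built-in AirPassengers series is used as a stand-in (an assumption, chosen so the code runs without the fpp2 package) because its seasonal swings also grow with the level. The variable names are illustrative.

```r
# AirPassengers (built into R) stands in for a10 here; this substitution is
# an assumption made so the sketch runs without extra packages.
y <- AirPassengers

# Calendar year of each observation (the series starts in January).
year <- start(y)[1] + (seq_along(y) - 1) %/% frequency(y)

# Within-year spread on the raw and on the log scale.
sd_raw <- tapply(as.numeric(y), year, sd)
sd_log <- tapply(log(as.numeric(y)), year, sd)

# How much the within-year spread grows from the first to the last year.
growth_raw <- as.numeric(sd_raw[length(sd_raw)] / sd_raw[1])
growth_log <- as.numeric(sd_log[length(sd_log)] / sd_log[1])

# Two routes to a multiplicative seasonal pattern:
# a multiplicative decomposition, or an additive one on the log scale.
fit_mult <- decompose(y, type = "multiplicative")
fit_log  <- decompose(log(y), type = "additive")

# Seasonal factors from both routes, side by side.
seasonal_compare <- cbind(multiplicative = fit_mult$figure,
                          from_log = exp(fit_log$figure))
rownames(seasonal_compare) <- month.abb
print(round(seasonal_compare, 3))
```

If the spread grows much less on the log scale than on the raw scale, that supports modeling the logged series or using a multiplicative form.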
Seasonal Plot
A seasonal plot shows how a variable behaves across the same cycle (e.g., months of the year), repeated for multiple years. It helps you:
Compare patterns across different years
Identify consistent seasonality (e.g., always low in Feb, high in Dec)
Detect unusual years that deviate from typical patterns
Spot upward or downward trends within seasons
DATA: Monthly anti-diabetic drug subsidy in Australia from 1991 to 2008 (a10)
Monthly government expenditure (millions of dollars) as part of the Pharmaceutical Benefit Scheme for products falling under ATC code A10 as recorded by the Australian Health Insurance Commission. July 1991 - June 2008.
ggseasonplot(a10) +
  labs(title = "Seasonal Plot: Antidiabetic Drug Sales", x = "Month", y = "Sales ($ million)")
Interpreting seasonal plot:
X-axis: Months (Jan–Dec)
Y-axis: Sales in millions of dollars
Lines: Each colored line is a year (1991 to 2008)
Insight: Sales are consistently lowest in February, with a clear upward trend over time — especially steep in 2006–2008.
When to Use a Seasonal Plot
Seasonal plots show how different years compare month by month, which makes them useful for analyzing seasonal patterns in time series data. Use one when:
You have monthly or quarterly time series data
You suspect seasonal effects (e.g., sales, demand, prices)
You want to compare how each year behaves month by month
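The idea behind a seasonal plot can be sketched in base R by reshaping a monthly series into one column per year and drawing one line per column. This is an illustration under assumptions, not how ggseasonplot() is implemented: AirPassengers stands in for a10 so the code runs without fpp2, and the names by_year and low_month are hypothetical.

```r
# AirPassengers (built into R) is a stand-in for a10; an assumption so the
# sketch needs no extra packages.
y <- AirPassengers

# One row per month, one column per year. The series runs from January of
# its first year to December of its last, so it fills the matrix exactly.
by_year <- matrix(as.numeric(y), nrow = frequency(y))
rownames(by_year) <- month.abb
colnames(by_year) <- seq(start(y)[1], end(y)[1])

# One line per year, months on the x-axis.
line_cols <- rainbow(ncol(by_year))
matplot(by_year, type = "l", lty = 1, col = line_cols, xaxt = "n",
        xlab = "Month", ylab = "Passengers (thousands)",
        main = "Seasonal plot (base R sketch)")
axis(1, at = 1:12, labels = month.abb)
legend("topleft", legend = colnames(by_year), col = line_cols,
       lty = 1, ncol = 3, cex = 0.6)

# Which month is lowest in each year?
low_month <- month.abb[apply(by_year, 2, which.min)]
names(low_month) <- colnames(by_year)
print(low_month)
```

Tabulating the lowest (or highest) month per year is a quick numeric check on what the plot suggests visually.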
Month Plot
The monthplot() function in R is used to visualize seasonal effects in a time series, typically for monthly data. It helps you see how each month behaves on average across multiple years, highlighting patterns such as seasonality or outliers.
Purpose
Isolate and display seasonality — especially useful for monthly time series data.
Compare the average deviation of each month from the overall trend.
Visually assess which months are consistently higher or lower than the yearly average.
How It Works
monthplot() has methods for time series objects and fitted seasonal models (ts, stl, and StructTS).
Applied to a raw ts object it plots the observed values; applied to an stl fit it plots the seasonal component by default.
It draws a small line chart for each month across years, with a horizontal line marking that month's mean.
?monthplot
monthplot(a10, ylab = "$ million", xlab = "Month", xaxt = "n", main = "Seasonal deviation plot: antidiabetic drug sales")
axis(1, at = 1:12, labels = month.abb, cex.axis = 0.8)
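To see the difference between plotting a raw series and plotting an estimated seasonal component, the sketch below calls monthplot() on both. It assumes AirPassengers as a stand-in for a10 (so it runs with base R only) and an stl() fit with s.window = 13, which is an arbitrary illustrative choice.

```r
# AirPassengers is a stand-in for a10 (assumption: base R only).
y <- AirPassengers

# STL decomposition of the logged series; s.window = 13 is an arbitrary
# choice for illustration, not a recommendation.
fit <- stl(log(y), s.window = 13)

op <- par(mfrow = c(1, 2))
# Raw series: each month's values across years, bar at the monthly mean.
monthplot(y, ylab = "Passengers (thousands)", main = "Raw series")
# STL fit: the seasonal component only.
monthplot(fit, choice = "seasonal", ylab = "Seasonal (log scale)",
          main = "STL seasonal component")
par(op)

# The horizontal bars in the left panel are the per-month means.
month_means <- tapply(as.numeric(y), as.integer(cycle(y)), mean)
names(month_means) <- month.abb
print(round(month_means, 1))
```

The left panel mixes trend and seasonality, so every month drifts upward; the right panel removes the trend and shows only how the seasonal effect of each month changes over the years.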
Understand jitter() in a Plot
Sometimes in a scatterplot, multiple data points have the same or similar values on one axis — causing them to overlap and hide patterns. jitter() adds small noise to make these points visible.
Example: Students’ Scores in Two Subjects
Let’s simulate student scores, where many students got the same grade:
# Create example data
set.seed(123)
students <- data.frame(
  Math = sample(x = c(70, 75, 80, 85, 90), size = 100, replace = TRUE),
  English = sample(x = c(70, 75, 80, 85, 90), size = 100, replace = TRUE)
)
# Without jitter (many overlapping points)
plot(students$Math, students$English, xlab = "Math Score", ylab = "English Score", main = "Without jitter: overlapping points", pch = 19, col = "steelblue")
You’ll see clusters of points overplotted, especially where many students got the same pair of scores.
Jittering is the remedy when you want to see every observation even though values repeat.
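A jittered version of the simulated student scores might look like the sketch below. The data are regenerated so the block runs on its own; amount = 1 (uniform noise of at most one point in either direction) is an illustrative choice, and the names grades and jittered are hypothetical.

```r
# Recreate the simulated scores so this block is self-contained.
set.seed(123)
grades <- c(70, 75, 80, 85, 90)
students <- data.frame(
  Math = sample(grades, size = 100, replace = TRUE),
  English = sample(grades, size = 100, replace = TRUE)
)

# Add uniform noise in [-1, 1] to each coordinate. The noise is only for
# display; the underlying scores are unchanged.
jittered <- data.frame(
  Math = jitter(students$Math, amount = 1),
  English = jitter(students$English, amount = 1)
)

plot(jittered$Math, jittered$English,
     xlab = "Math Score", ylab = "English Score",
     main = "With jitter: overlapping points separated",
     pch = 19, col = "steelblue")

# Count distinct plotting positions before and after jittering.
n_positions_raw <- nrow(unique(students))
n_positions_jittered <- nrow(unique(jittered))
cat("Distinct positions without jitter:", n_positions_raw, "\n")
cat("Distinct positions with jitter:", n_positions_jittered, "\n")
```

Because the scores sit five points apart, noise of at most one point keeps each cloud clearly attached to its original grid position.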
# install.packages("fpp_0.5.tar.gz", repos = NULL, type = "source")
library(fpp) # data archived here
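# Without jitter: city mpg (column 5) against carbon footprint (column 8)
plot(x = fpp::fuel[, 5], y = fpp::fuel[, 8], xlab = "City mpg", ylab = "Carbon footprint")
# With jitter: small random noise separates overlapping points
plot(x = jitter(fpp::fuel[, 5]), y = jitter(fpp::fuel[, 8]), xlab = "City mpg", ylab = "Carbon footprint")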
Scatterplot Matrix: Exploring Fuel Efficiency and Emissions
To explore relationships between fuel efficiency, engine size, and emissions, we use a scatterplot matrix based on the fuel dataset. This dataset contains information about various vehicle models, including their engine specifications and environmental impact.
We focus on four key quantitative variables:
Litres – Engine size in litres
City – City fuel efficiency in miles per gallon (MPG)
Highway – Highway fuel efficiency in MPG
Carbon – Estimated CO₂ emissions in metric tons per year
# pairs(fpp::fuel[, -c(1:2, 4, 7)], pch = 19)
# Create scatterplot matrix with selected numeric variables
pairs(fuel[, c("Litres", "City", "Highway", "Carbon")], pch = 19, main = "Scatterplot Matrix: Engine Size, MPG, and CO₂ Emissions")
?pairs
Key Observations
Engine Size vs MPG
Vehicles with larger engines (Litres) tend to have lower city and highway MPG. This inverse relationship is expected: bigger engines typically consume more fuel.
Fuel Efficiency vs Emissions
Both City and Highway fuel economy are negatively correlated with Carbon. That is, fuel-efficient vehicles emit less CO₂ per year.
Engine Size vs Carbon Emissions
There is a strong positive correlation between Litres and Carbon. Larger engines result in higher carbon output due to more fuel burned.
City vs Highway MPG
These two metrics are strongly positively correlated—cars that are efficient in the city tend to be efficient on highways too.
Use a scatterplot matrix to:
Visualize pairwise relationships across multiple numeric variables
Detect correlations, outliers, and clustering
Quickly screen for non-linear relationships or redundancy
This tool is especially helpful during exploratory data analysis (EDA), where understanding the structure and interaction between variables is key before modeling.
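A correlation matrix is a natural numeric companion to the scatterplot matrix. The sketch below uses the built-in mtcars data as a stand-in for the fuel dataset (an assumption, so the code runs without the archived fpp package); the 0.8 cutoff for a "strong" correlation is an arbitrary illustrative threshold.

```r
# mtcars stands in for fpp::fuel (assumption: base R only).
# disp = engine displacement, hp = horsepower, wt = weight, mpg = fuel economy.
vars <- mtcars[, c("disp", "hp", "wt", "mpg")]

# Pairwise Pearson correlations, rounded for reading.
cor_mat <- round(cor(vars), 2)
print(cor_mat)

# List each pair once (upper triangle) whose |r| exceeds the cutoff.
cutoff <- 0.8  # arbitrary illustrative threshold
idx <- which(abs(cor_mat) > cutoff & upper.tri(cor_mat), arr.ind = TRUE)
strong_pairs <- data.frame(
  var1 = rownames(cor_mat)[idx[, "row"]],
  var2 = colnames(cor_mat)[idx[, "col"]],
  r = cor_mat[idx]
)
print(strong_pairs)
```

Pearson correlation only measures linear association, so the scatterplots remain the better tool for spotting curvature or clusters.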
Enhanced Pairwise Plot: Fuel Dataset
The kdepairs() function from the ResourceSelection package provides an enhanced version of the classic scatterplot matrix. It adds:
2D density contour plots (lower triangle)
Correlation values and smoothed trend lines (upper triangle)
Histograms with density overlays (diagonal)
This allows us to explore pairwise relationships between variables more effectively than with a standard pairs() plot.
We use the fuel dataset (from the archived fpp package), which includes data on vehicle engine size, fuel efficiency, and carbon emissions.
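A similar layout can be approximated with base pairs() and custom panel functions. This is a sketch under assumptions rather than a reproduction of the enhanced plot: mtcars stands in for the fuel data, the panel function names are hypothetical, and the density contours use MASS::kde2d() only if MASS is installed (otherwise plain points are drawn).

```r
# mtcars stands in for fpp::fuel (assumption: base R only).
vars <- mtcars[, c("disp", "hp", "wt", "mpg")]

# Upper triangle: points, a lowess trend line, and the correlation.
panel_smooth_cor <- function(x, y, ...) {
  points(x, y, col = "grey60", pch = 19, cex = 0.6)
  lines(lowess(x, y), col = "darkgreen")
  r <- cor(x, y)
  text(mean(range(x)), mean(range(y)), sprintf("%.2f", r), cex = 1 + abs(r))
}

# Diagonal: histogram rescaled to fit the panel.
panel_hist <- function(x, ...) {
  usr <- par("usr")
  on.exit(par(usr = usr))  # restore the panel's coordinate system
  par(usr = c(usr[1:2], 0, 1.5))
  h <- hist(x, plot = FALSE)
  rect(head(h$breaks, -1), 0, tail(h$breaks, -1),
       h$counts / max(h$counts), col = "lightblue")
}

# Lower triangle: 2D kernel density contours when MASS is available.
panel_density <- function(x, y, ...) {
  if (requireNamespace("MASS", quietly = TRUE)) {
    contour(MASS::kde2d(x, y, n = 25), add = TRUE, drawlabels = FALSE)
  } else {
    points(x, y)
  }
}

pairs(vars, upper.panel = panel_smooth_cor, diag.panel = panel_hist,
      lower.panel = panel_density,
      main = "Enhanced scatterplot matrix (base R sketch)")
```

Writing the panels by hand makes it easy to swap in a different smoother or a rank correlation if the relationships are clearly non-linear.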