Stuti_1nt23is221_program3

Author

Stuti Shamsundar Kulkarni

Implement an R function to generate a line graph depicting the trend of a time-series dataset, with separate lines for each group, utilizing ggplot2’s group aesthetic.

Introduction:

This document demonstrates how to create a time-series line graph using the built-in AirPassengers dataset in R.

The dataset contains monthly airline passenger counts from 1949 to 1960. We will use ggplot2 to visualize the trend, with a separate line for each year.

Step 1: Load necessary libraries.

library(tidyr)
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
library(ggplot2)

Step 2: Load the built-in AirPassengers dataset.

The AirPassengers dataset is a time series object in R.

ggplot2 works with data frames, so we first convert the series into one.

# inspect the time series, then convert it to a data frame
AirPassengers
     Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
1949 112 118 132 129 121 135 148 148 136 119 104 118
1950 115 126 141 135 125 149 170 170 158 133 114 140
1951 145 150 178 163 172 178 199 199 184 162 146 166
1952 171 180 193 181 183 218 230 242 209 191 172 194
1953 196 196 236 235 229 243 264 272 237 211 180 201
1954 204 188 235 227 234 264 302 293 259 229 203 229
1955 242 233 267 269 270 315 364 347 312 274 237 278
1956 284 277 317 313 318 374 413 405 355 306 271 306
1957 315 301 356 348 355 422 465 467 404 347 305 336
1958 340 318 362 348 363 435 491 505 404 359 310 337
1959 360 342 406 396 420 472 548 559 463 407 362 405
1960 417 391 419 461 472 535 622 606 508 461 390 432
as.numeric(AirPassengers)
  [1] 112 118 132 129 121 135 148 148 136 119 104 118 115 126 141 135 125 149
 [19] 170 170 158 133 114 140 145 150 178 163 172 178 199 199 184 162 146 166
 [37] 171 180 193 181 183 218 230 242 209 191 172 194 196 196 236 235 229 243
 [55] 264 272 237 211 180 201 204 188 235 227 234 264 302 293 259 229 203 229
 [73] 242 233 267 269 270 315 364 347 312 274 237 278 284 277 317 313 318 374
 [91] 413 405 355 306 271 306 315 301 356 348 355 422 465 467 404 347 305 336
[109] 340 318 362 348 363 435 491 505 404 359 310 337 360 342 406 396 420 472
[127] 548 559 463 407 362 405 417 391 419 461 472 535 622 606 508 461 390 432
as.numeric(time(AirPassengers))
  [1] 1949.000 1949.083 1949.167 1949.250 1949.333 1949.417 1949.500 1949.583
  [9] 1949.667 1949.750 1949.833 1949.917 1950.000 1950.083 1950.167 1950.250
 [17] 1950.333 1950.417 1950.500 1950.583 1950.667 1950.750 1950.833 1950.917
 [25] 1951.000 1951.083 1951.167 1951.250 1951.333 1951.417 1951.500 1951.583
 [33] 1951.667 1951.750 1951.833 1951.917 1952.000 1952.083 1952.167 1952.250
 [41] 1952.333 1952.417 1952.500 1952.583 1952.667 1952.750 1952.833 1952.917
 [49] 1953.000 1953.083 1953.167 1953.250 1953.333 1953.417 1953.500 1953.583
 [57] 1953.667 1953.750 1953.833 1953.917 1954.000 1954.083 1954.167 1954.250
 [65] 1954.333 1954.417 1954.500 1954.583 1954.667 1954.750 1954.833 1954.917
 [73] 1955.000 1955.083 1955.167 1955.250 1955.333 1955.417 1955.500 1955.583
 [81] 1955.667 1955.750 1955.833 1955.917 1956.000 1956.083 1956.167 1956.250
 [89] 1956.333 1956.417 1956.500 1956.583 1956.667 1956.750 1956.833 1956.917
 [97] 1957.000 1957.083 1957.167 1957.250 1957.333 1957.417 1957.500 1957.583
[105] 1957.667 1957.750 1957.833 1957.917 1958.000 1958.083 1958.167 1958.250
[113] 1958.333 1958.417 1958.500 1958.583 1958.667 1958.750 1958.833 1958.917
[121] 1959.000 1959.083 1959.167 1959.250 1959.333 1959.417 1959.500 1959.583
[129] 1959.667 1959.750 1959.833 1959.917 1960.000 1960.083 1960.167 1960.250
[137] 1960.333 1960.417 1960.500 1960.583 1960.667 1960.750 1960.833 1960.917
class(AirPassengers)
[1] "ts"
# build the month-start dates once and reuse them for both columns
dates <- seq(as.Date("1949-01-01"),
             by = "month",
             length.out = length(AirPassengers))

data <- data.frame(
  Date = dates,
  Passengers = as.numeric(AirPassengers),
  year = as.factor(format(dates, "%Y"))
)

#display first few rows
head(data,n=20)
         Date Passengers year
1  1949-01-01        112 1949
2  1949-02-01        118 1949
3  1949-03-01        132 1949
4  1949-04-01        129 1949
5  1949-05-01        121 1949
6  1949-06-01        135 1949
7  1949-07-01        148 1949
8  1949-08-01        148 1949
9  1949-09-01        136 1949
10 1949-10-01        119 1949
11 1949-11-01        104 1949
12 1949-12-01        118 1949
13 1950-01-01        115 1950
14 1950-02-01        126 1950
15 1950-03-01        141 1950
16 1950-04-01        135 1950
17 1950-05-01        125 1950
18 1950-06-01        149 1950
19 1950-07-01        170 1950
20 1950-08-01        170 1950
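
The conversion above hard-codes the start date. As a minimal sketch of an alternative, the same information can be read from the attributes of the ts object with the base R functions start() and frequency(). The names first, freq, pos and ap_df are illustrative only and are not used elsewhere in this document.

```r
# Read the series layout from the ts attributes instead of hard-coding it.
first <- start(AirPassengers)      # first year and first period of the series
freq  <- frequency(AirPassengers)  # number of observations per year

# Zero-based position of each observation, counted from period 1 of the first year.
pos <- seq_along(AirPassengers) - 1 + (first[2] - 1)

# Integer division gives the year offset; the remainder gives the month.
ap_df <- data.frame(
  Year       = factor(first[1] + pos %/% freq),
  Month      = factor(month.abb[pos %% freq + 1], levels = month.abb),
  Passengers = as.numeric(AirPassengers)
)

head(ap_df, n = 3)
```

Using integer arithmetic on the observation index avoids rounding the fractional values returned by time(). The Month lookup through month.abb assumes a monthly series, which holds for AirPassengers.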

Step 3: Define a function.

plot_time_series <- function(data, x_col, y_col, group_col, title = "Air Passenger Trends") {
  # column names arrive as strings, so look them up with the .data pronoun
  # (aes_string() is deprecated since ggplot2 3.0.0)
  ggplot(data, aes(x = .data[[x_col]], y = .data[[y_col]],
                   color = .data[[group_col]], group = .data[[group_col]])) +
    geom_line(linewidth = 1.2) +  # `size` for lines is deprecated since ggplot2 3.4.0
    geom_point(size = 2) +
    labs(title = title,
         x = "Year",
         y = "Number of passengers",
         color = "Year") +
    theme_minimal() +
    theme(legend.position = "top")
}

# call the function
plot_time_series(data, "Date", "Passengers", "year", "Trend of airline passengers over time")
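
In the plot above the x-axis is a continuous date, so mapping color to the year factor would already split the data into one line per year. The group aesthetic becomes essential when the x-axis is discrete. The following is a hedged sketch of that case, assuming a layout with the calendar month on the x-axis; the data frame monthly and the object p are illustrative names, not part of the original program.

```r
library(ggplot2)

# Assumed alternative layout: calendar month and year as separate factors.
monthly <- data.frame(
  Month      = factor(rep(month.abb, times = 12), levels = month.abb),
  Year       = factor(rep(1949:1960, each = 12)),
  Passengers = as.numeric(AirPassengers)
)

# With a discrete x-axis, ggplot2 groups by the combination of all discrete
# variables (here Month and Year), which leaves one point per group.
# Setting group = Year explicitly connects the twelve months of each year.
p <- ggplot(monthly, aes(x = Month, y = Passengers,
                         color = Year, group = Year)) +
  geom_line() +
  labs(title = "Monthly airline passengers by year",
       x = "Month",
       y = "Number of passengers",
       color = "Year") +
  theme_minimal()
```

Printing p draws twelve overlaid lines, one per year, which makes the seasonal pattern within each year easier to compare than in the date-based plot.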