
```
library(astsa, quietly=TRUE, warn.conflicts=FALSE) # data sets used below: gtemp, cmort, tempr, part
library(knitr)
library(ggplot2)
```

This lesson describes some of the important features to consider in time series analysis. Here we focus on a single time series; future lessons will incorporate more series.

- A **univariate** time series is a sequence of measurements of the same variable collected over time. Most often, the measurements are taken at regular time intervals.

The major difference from standard linear models is that the observations are not necessarily independent and identically distributed. The ordering matters, so there is often dependence between observations that are close together in time.
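To make the point about ordering concrete, here is a minimal sketch on simulated data (the AR coefficient of 0.8, the series length, and the seed are all assumptions chosen for illustration). It compares the lag-1 autocorrelation of a series in its original order with the same values after shuffling.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Simulated AR(1) series: each value depends on the previous one
x <- arima.sim(model = list(ar = 0.8), n = 200)

# Shuffling destroys the time ordering but keeps exactly the same values
x_shuffled <- sample(as.numeric(x))

# Lag-1 sample autocorrelation of each version (element 1 of $acf is lag 0)
r_ordered  <- acf(x, lag.max = 1, plot = FALSE)$acf[2]
r_shuffled <- acf(x_shuffled, lag.max = 1, plot = FALSE)$acf[2]

round(c(ordered = r_ordered, shuffled = r_shuffled), 2)
```

The ordered series should show a clearly positive lag-1 correlation, while the shuffled one should sit near zero, even though both contain identical values.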

The core objective of time series analysis is to build a model that describes the underlying pattern of the series. With a properly specified model we can:

- Describe important features of the time series pattern.
- Explain how the past affects the future.
- Explain how two time series interact.
- Forecast future values of the time series.
- Support other real-life applications, such as serving as a control standard for a variable that measures some manufacturing operation.

There are two basic types of “time domain” models:

- **ARIMA models** (Autoregressive Integrated Moving Average), which relate the present value of a series to past values and past prediction errors.
- **Ordinary regression models** that use the time index, or other time series, as `x` variables. As in classical statistics, these are often helpful for a first look at the data and serve as a starting point for some forecasting methods.
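As a rough illustration of the first model type, the sketch below simulates a series and fits an ARIMA(1, 0, 1) with base R's `arima()`. The coefficients (0.6 and 0.4), the series length, and the seed are assumptions for the example, not values taken from any data set in this lesson.

```r
set.seed(42)  # arbitrary seed for reproducibility

# Simulate an ARMA(1,1) process with assumed coefficients
y <- arima.sim(model = list(ar = 0.6, ma = 0.4), n = 500)

# ARIMA(1, 0, 1): present value ~ one past value + one past prediction error
fit_arma <- arima(y, order = c(1, 0, 1))
coef(fit_arma)  # estimates of ar1, ma1 and the intercept (series mean)

# Forecast the next five values together with their standard errors
predict(fit_arma, n.ahead = 5)
```

With 500 observations the estimates should land reasonably close to the simulated coefficients, but the exact values depend on the random draw.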

When first looking at a time series, it is important to ask:

- Is there an underlying **trend**?
- Is there **seasonality**, or a regular repeating pattern of highs and lows?
- Is there a **long-run cycle** or period unrelated to the seasonality?
- Are there **outliers**?
- Is there **constant variance** over time, or is it non-constant?
- Are there any **abrupt changes** to any aspects of the series?
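A decomposition can help with several of these questions at once. The sketch below is one possible approach, not the only one. It uses the `AirPassengers` series that ships with base R (an assumption, since it is not one of the data sets used in this lesson) and splits it into trend, seasonal, and remainder components with `stl()`.

```r
# AirPassengers ships with base R (monthly totals, 1949-1960)
ap <- log(AirPassengers)  # logs stabilise the growing seasonal swings

# Split the series into seasonal, trend and remainder parts using loess
parts <- stl(ap, s.window = "periodic")
plot(parts)

# Rough outlier screen: remainder values beyond 3 standard deviations
# (the threshold of 3 is a rule of thumb, not a formal test)
remainder <- parts$time.series[, "remainder"]
which(abs(remainder) > 3 * sd(remainder))
```

The trend and seasonal panels speak to the first two questions, and the remainder panel is a convenient place to look for outliers, changing variance, or abrupt shifts.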

Let’s look at some examples:

Linear models are applied frequently in time series analysis and in statistics generally. Time series can be described by simple linear models or, in more complex cases, by local regression models, polynomial models, and splines.
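For a feel of the more flexible alternatives, here is a minimal sketch that overlays a quadratic polynomial, a local regression, and a smoothing spline on one series. It uses `nhtemp` (yearly mean temperatures in New Haven, included with base R) purely as a stand-in, and the smoothing settings are defaults or assumed values rather than tuned choices.

```r
# nhtemp ships with base R; it is used here only as an illustrative series
t_nh <- as.numeric(time(nhtemp))
y_nh <- as.numeric(nhtemp)

fit_poly   <- lm(y_nh ~ poly(t_nh, 2))      # quadratic polynomial trend
fit_loess  <- lowess(t_nh, y_nh, f = 2/3)   # local regression, default span
fit_spline <- smooth.spline(t_nh, y_nh)     # smoothing spline, default penalty selection

plot(t_nh, y_nh, type = 'o', xlab = 'Year', ylab = 'Temperature')
lines(t_nh, fitted(fit_poly), col = 'red')
lines(fit_loess, col = 'blue')
lines(fit_spline, col = 'darkgreen')
legend('topleft', legend = c('Polynomial', 'Lowess', 'Spline'),
       col = c('red', 'blue', 'darkgreen'), lty = 1)
```

All three curves describe the same slow movement in the series. They differ mainly in how much local wiggle they allow.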

Let’s create a simple linear model for global temperature data. If you need a refresher on regression, I have compiled a modest tutorial in an IPython notebook.

`summary(fit<- lm(gtemp~time(gtemp)))`

```
##
## Call:
## lm(formula = gtemp ~ time(gtemp))
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -0.3195 -0.0972  0.0008  0.0825  0.2938
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.12e+01   5.69e-01   -19.7   <2e-16 ***
## time(gtemp)  5.75e-03   2.92e-04    19.6   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.125 on 128 degrees of freedom
## Multiple R-squared: 0.751, Adjusted R-squared: 0.749
## F-statistic: 386 on 1 and 128 DF, p-value: <2e-16
```

```
plot(gtemp, type='o', ylab='Global Temperature')
abline(fit, col='red') # Add the fitted regression line in red
```

- There is an apparent **constant trend** upwards, and the time series seems to slowly wander up and down along this upward mean. The series stays fairly centered around the fitted regression line (red line).
- There is no obvious **seasonality**.
- There are no obvious **outliers**.
- It is hard to judge visually whether the variance remains constant, but it appears fairly regular.
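The visual checks above can be backed up by looking at the residuals of the trend fit. This is a minimal sketch under the assumption that `nhtemp` (a temperature series included with base R) is an acceptable stand-in for `gtemp`. The same steps should apply to the `fit` object above.

```r
# nhtemp ships with base R and is used here only as a stand-in for gtemp
fit_nh <- lm(nhtemp ~ time(nhtemp))

# Put the residuals back on the original time axis
res <- ts(resid(fit_nh), start = start(nhtemp), frequency = frequency(nhtemp))

par(mfrow = c(2, 1))
plot(res, type = 'o', ylab = 'Residuals')  # look for changing spread or outliers
abline(h = 0, lty = 2)
acf(res)  # spikes outside the dashed bands suggest leftover dependence
```

If the spread of the residuals looks stable over time and the autocorrelations stay inside the bands, a straight-line trend may be an adequate description. Otherwise a richer model is worth considering.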

Now consider a more complex multiple regression model that predicts cardiovascular mortality from temperature and particulate matter pollution.

```
par(mfrow=c(3, 1))
plot(cmort, main='Cardiovascular Mortality')
plot(tempr, main='Temperature')
plot(part, main='Particulates')
```

`pairs(cbind(Mortality=cmort, Temperature=tempr, Particulates=part))`
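One way such a model could be specified is sketched below. The choice of predictors (a time trend, centered temperature, its square, and particulates) is an assumption made for illustration and is not necessarily the model the author goes on to fit. The helper names `trend`, `temp`, `temp2`, and `fit_mort` are likewise hypothetical.

```r
library(astsa)  # provides cmort, tempr and part

trend <- time(cmort)          # time index, to capture the long-run decline
temp  <- tempr - mean(tempr)  # centered temperature
temp2 <- temp^2               # squared term for a curved temperature effect

# na.action = NULL keeps the time series attributes of the variables
fit_mort <- lm(cmort ~ trend + temp + temp2 + part, na.action = NULL)
summary(fit_mort)
```

Centering temperature before squaring reduces the correlation between the linear and quadratic terms, which makes the coefficients easier to interpret.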