Introduction to Non-Linear Models
2025-04-03
Review of Log and Natural Log
Non-Linear Models
Example 1: Old Faithful Eruption Intervals
Example 2: Ferrari Acceleration Time
Example 3: BMW Fuel Efficiency
In-class Polling (Session ID: bua345s25)
Review Question from Lecture 19:
If the estimated log odds (Y’) of you making a late payment on your credit card is -0.257, what is the PERCENT CHANCE that you will submit a late payment? Round the percentage to one decimal place.
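As a minimal sketch of the conversion this review question asks for, the log odds can be turned into odds with exp() and then into a probability. The variable names below are illustrative and not from the lecture; base R's plogis() performs the same inverse-logit conversion and is used here only as a cross-check.

```r
# Estimated log odds from the review question
log_odds <- -0.257

# Odds = e^(log odds); probability = odds / (1 + odds)
odds <- exp(log_odds)
prob <- odds / (1 + odds)

# Cross-check with the built-in inverse-logit function
prob_check <- plogis(log_odds)

# Percent chance, rounded to one decimal place
pct <- round(100 * prob, 1)
pct
```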
Recall:
Why do these matter?
Up until now:
OR
We used a linearizing transformation to convert a curvilinear relationship into a straight-line relationship (MAS 261)
Transformations like LN(Y) are common and used effectively in Finance, Accounting, etc.
An alternative is to model the data as non-linear or curvilinear.
Pros:
No transformation and back-transforming of estimates
Model fits the data as shown
Common Simple Linear Regression models can be done in Excel (or R)
R functions can expedite process (next week)
Cons:
Requires trial and error (like transformation) to determine model
Interpretation must account for non-linear relationship
For Multiple Linear Regression, non-linear models would have to be fit in R
Suppose you are examining thermal energy for a new start-up company.
As part of your research, you take a trip to the most famous geyser in the US – Old Faithful
The park ranger explains that it has a highly predictable geothermal output
You decide to fit a model to this relationship based on one month of data.
Adding a trendline in Excel is very quick.
The provided worksheets will allow you to compare linear and non-linear options.
Based on our data, for an Old Faithful eruption of 2 minutes (X = 2), the average duration until the next eruption is 57 minutes (Y = 57).
Use this average to find the residual for the linear model:
Linear Model: \(Y = 34.347 + 10.537\times X\)
In Excel: = 57 - (34.347 + 10.537*X)
In R:
X <- 2
57 - (34.347 + 10.537*X)
Answer will be decimal minutes, not minutes and seconds.
Based on our data, for an Old Faithful eruption of 2 minutes (X = 2), the average duration until the next eruption is 57 minutes (Y = 57).
Use this average to find the residual for the power model:
Power Model: \(Y = 39.144 \times X^{0.487}\)
In Excel: = 57 - (39.144*X^(0.487))
In R:
X <- 2
57 - (39.144*X^(0.487))
Answer will be decimal minutes, not minutes and seconds.
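To compare the two residuals side by side, both calculations can be run together. This is only a sketch that reuses the coefficients given above; the variable names (fit_linear, resid_power, and so on) are illustrative.

```r
# Observed average wait (minutes) after a 2-minute eruption
X <- 2
Y <- 57

# Fitted values from the two models given in the lecture
fit_linear <- 34.347 + 10.537 * X
fit_power  <- 39.144 * X^0.487

# Residual = observed - fitted (decimal minutes)
resid_linear <- Y - fit_linear
resid_power  <- Y - fit_power

round(c(linear = resid_linear, power = resid_power), 3)
```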
From a tourism point of view for Old Faithful, a prediction error of less than 5 minutes does not matter.
Based on the Adjusted \(R^2\) for the Linear and Power models (Slide 15) and the residuals found in the two previous questions, which model would you choose?
A. Power model because it is more accurate.
B. Linear model because the difference in accuracy is negligible and the linear model is simpler.
Use R or Excel to calculate the estimated time in seconds it takes for the Ferrari to go from 0 to 100 mph (X = 100) for both the Exponential model and the Polynomial Model (shown below).
Exponential Model:
\(\hat{Y} = 0.9936 \times e^{0.0154X}\)
Polynomial Model:
\(\hat{Y} = 1.8123 - 0.0165X + 0.0005X^2\)
Select which statement(s) is/are true:
A. The polynomial model estimates a longer time for 0 to 100 mph acceleration than the exponential model.
B. The exponential model estimates a longer time for 0 to 100 mph acceleration than the polynomial model.
C. The two model estimates are within half a second of each other.
D. The two model estimates are within 1 second of each other.
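The two estimates needed to evaluate these statements can be computed as in the sketch below, which plugs X = 100 into the exponential and polynomial models given above. The names time_exp and time_poly are illustrative.

```r
# Target speed in mph
X <- 100

# Exponential model from the lecture
time_exp <- 0.9936 * exp(0.0154 * X)

# Polynomial (quadratic) model from the lecture
time_poly <- 1.8123 - 0.0165 * X + 0.0005 * X^2

# Estimated seconds for each model and the gap between them
c(exponential = time_exp, polynomial = time_poly)
abs(time_poly - time_exp)
```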
Given that the difference in Adjusted \(R^2\) between the Polynomial and Exponential models for the Ferrari data is negligible, you opt for the model that is easier to explain to someone without any quantitative analytical training.
Which model do you choose and why?
Note this is subjective and the answer depends on discipline.
As part of a new sales campaign for BMW, you want to model the fuel economy (MPG) of the BMW 430i based on speed.
Although BMW sells electric cars, you also have customers that want gas vehicles.
You have a small data set examining average fuel economy at 8 different speeds.
You notice the data definitely aren’t linear but want to fit the model as accurately as possible.
Which model provides the best fit?
It is clear from the provided plot that a linear model would be inappropriate for these data.
Other than the linear model, which model choice is ALWAYS inappropriate for concave down relationships like the BMW data?
Hint: If you are unsure, examine the trendlines in Excel or the R html file.
A. Exponential
B. Logarithmic
C. Polynomial
D. Power
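One way to check the curvature of a trendline numerically is with second differences: on an evenly spaced grid, negative second differences indicate a concave-down curve and positive ones a concave-up curve. The sketch below applies this check to the exponential form \(Y = a e^{bX}\) with hypothetical coefficients (a = 30 and b = ±0.02, chosen only for illustration and not fitted to the BMW data), assuming a > 0 as is the case for a positive response like MPG.

```r
# Evenly spaced speeds (illustrative grid, not the BMW data)
x <- seq(10, 100, by = 10)

# Hypothetical exponential curves with a positive multiplier
a <- 30
y_growth <- a * exp(0.02 * x)    # b > 0
y_decay  <- a * exp(-0.02 * x)   # b < 0

# Second differences approximate the sign of the second derivative
d2_growth <- diff(y_growth, differences = 2)
d2_decay  <- diff(y_decay, differences = 2)

# Check whether each curve bends upward everywhere on the grid
all(d2_growth > 0)
all(d2_decay > 0)
```

Analytically, the second derivative of \(a e^{bX}\) is \(a b^2 e^{bX}\), which is positive whenever \(a > 0\) and \(b \neq 0\), so the numerical check and the calculus agree.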
Based on the plots and Adjusted \(R^2\) values (see below), which model fits this relationship the best for the BMW fuel economy data?
Non-linear models are a useful and flexible alternative to linearizing transformations.
In R, models are specified with the transformations.
Excel is great for comparing multiple trendline options quickly.
For better information on model fit and residuals, software such as R is required.
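As a rough sketch of how transformations can be written directly into an R model formula, the example below uses R's built-in faithful data set (columns eruptions and waiting). This is an assumption for illustration: it is not the one-month data set used in the lecture, so the coefficients will differ from those on the slides, and the object names are hypothetical.

```r
# Built-in Old Faithful data: eruption length and wait to next eruption (minutes)
data(faithful)

# Linear model: waiting = b0 + b1 * eruptions
fit_linear <- lm(waiting ~ eruptions, data = faithful)

# Power model Y = a * X^b, fit on the log-log scale: ln(Y) = ln(a) + b * ln(X)
fit_power <- lm(log(waiting) ~ log(eruptions), data = faithful)

# Polynomial (quadratic) model; I() protects the arithmetic inside the formula
fit_poly <- lm(waiting ~ eruptions + I(eruptions^2), data = faithful)

# Back-transform the power-model intercept to recover the multiplier a
a_hat <- exp(coef(fit_power)[["(Intercept)"]])
b_hat <- coef(fit_power)[["log(eruptions)"]]

# Adjusted R-squared values (the power model's is on the log scale of Y,
# so it is not directly comparable to the other two)
sapply(list(linear = fit_linear, power = fit_power, poly = fit_poly),
       function(m) summary(m)$adj.r.squared)
```

Calling summary() on any of these fits gives the coefficient table, and resid() returns the residuals, which is the extra detail on fit that Excel trendlines do not report.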
Important to understand for BUA 345:
Lecture 23 will look at unconstrained optimization using data like the BMW data set.
Including today, there are six lectures and engagement questions remaining.
To submit an Engagement Question or Comment about material from Lecture 22: Submit it by midnight today (day of lecture).