This assignment steps through programming syntax with
ggplot2 and introduces the formula interface with the
lm() function. The grammar of graphics can help create
meaningful visualizations which communicate Exploratory Data Analysis
(EDA) findings. Visualizations will provide context to the model
results, which might be opaque and difficult to interpret. In this
assignment you will demonstrate basic syntax, go through a detailed EDA
of a common data set, and then begin to work through the syntax of
fitting a model.
If you need help with understanding the R syntax please see the R4DS book and/or the R tutorial videos and demos available on the Canvas site.
Certain code chunks are created for you. Each code chunk has
eval=FALSE set in the chunk options. You
MUST change it to be eval=TRUE in order
for the code chunks to be evaluated when rendering the document. Setting
unfinished blocks to eval=FALSE is helpful for rendering a
work-in-progress HTML document when some code is causing errors.
You are free to add more code chunks if you would like.
The tidyverse is loaded in for you in the code chunk
below. The visualization package, ggplot2, and the data
manipulation package, dplyr, are part of the “larger”
tidyverse. This homework also uses a dataset from the
Applied Predictive Modeling (Kuhn and Johnson) textbook, which will
require you to install the AppliedPredictiveModeling
package.
library(tidyverse)
## Warning: package 'tidyr' was built under R version 4.3.2
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.4 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(tidyr)
library(AppliedPredictiveModeling)
This problem introduces key syntax of the grammar of graphics in
ggplot2 with the abalone data set. The
simple_r_intro.html available on Canvas demonstrates
visualizations in ggplot2 with the iris. Going
through that document will help with this problem.
You do not need to modify the code below. It loads the
abalone dataset and selects the first 200 samples as a
training data. The response variable, “Rings,” tells the number of
visible rings in an abalone sample (a type of sea snail), which
corresponds to its age. The other columns in the dataframe are
measurements of the mollusks that can be used to estimate the age of a
specimen without counting the rings directly.
data("abalone") # from AppliedPredictiveModeling package
# Convert to a tibble object and keep only the first 100 rows
abalone <- abalone |>
as_tibble() |>
slice(1:200) |>
mutate(Rings = Rings + rnorm(n(),0,0.1)) # add a tiny bit of jitter to the integer age
Create a histogram for Diameter from
abalone with 15 bins.
ggplot(abalone, aes(x = Diameter)) +
geom_histogram(bins = 15) +
labs(title = "Abalone Diameters")
By default, a ggplot2 histogram does not show the lines
associated with each bin (as in a bar graph). The histogram effectively
looks like a discretized distribution. To adjust this, we need to
override the default fill and color arguments
to the geom_histogram() function. Note that within
ggplot2, color is applied to line-like objects and points
while fill is applied to whole areas of the graph (think “fill in an
area”). Thus, you can have substantial control over how color is used to
visually present information within a graphic.
Even though an aesthetic can be linked to a variable, some aesthetics
can be modified “manually” and not associated with any variables within
the dataset. We use the same type of argument, but we set that argument
outside of the aes() function.
To see how this works, type color = "red" within
the geom_histogram() call. Be careful about your
commas!
ggplot(abalone, aes(x = Diameter)) +
geom_histogram(bins = 15, color="red") +
labs(title = "Abalone Diameters")
ggplot2 has many “named” colors available for use, which
you can see by calling colors(). If you really want to fine
tune your colors you are free to use the hex color codes! In this
course, we will typically stick with common colors when we manually pick
a color and/or fill.
colors() |> head(10)
## [1] "white" "aliceblue" "antiquewhite" "antiquewhite1"
## [5] "antiquewhite2" "antiquewhite3" "antiquewhite4" "aquamarine"
## [9] "aquamarine1" "aquamarine2"
To make the difference between color and fill explicit within
the histogram, change the color to color = "steelblue" and
modify the histogram’s fill by setting
fill = "springgreen1".
ggplot(abalone, aes(x = Diameter)) +
geom_histogram(bins = 15, color="steelblue", fill = "springgreen1") +
labs(title = "Abalone Diameters")
We can alter the size or thickness of the lines around each bin with
the size argument.
Set size = 1.55 within the
geom_histogram() call (using the same color scheme from
Problem 1c)).
ggplot(abalone, aes(x = Diameter)) +
geom_histogram(bins = 15, size = 1.55, color="steelblue", fill = "springgreen1") +
labs(title = "Abalone Diameters")
Lastly, the transparency of geometric objects can be altered with the
alpha argument.
Set the transparency to alpha = 0.5 within the
geom_histogram() call.
ggplot(abalone, aes(x = Diameter)) +
geom_histogram(bins = 15, alpha = 0.5, size = 1.55, color="steelblue", fill = "springgreen1") +
labs(title = "Abalone Diameters")
Finally, modify the plot from 1e) to show the histogram of
WholeWeight instead of Diameter. You should
choose a different color scheme, setting the color and
fill arguments to be unique from the previous figure. Also,
add a title by adding a labs() component to your
ggplot.
Create a histogram for the WholeWeight variable
and include a title above the figure
ggplot(abalone, aes(x = WholeWeight)) +
geom_histogram(bins = 15, alpha = 0.5, size = 1.55, color="steelblue", fill = "springgreen1") +
labs(title = "Abalone Diameters")
In addition to setting things like color and fill manually, you can
also map variables in your dataset to those aesthetics to control
elements of the figure based on the samples themselves. In this problem,
you will create three different histograms in the same figure by setting
the fill aesthetic to be the Type column in
the data. This will display the distribution of
WholeWeight, but separately according to the three levels
of Type.
Pipe abalone into ggplot and set the mapping
argument such that the x aesthetic is mapped to
WholeWeight, and fill to Type.
Then add a histogram to the plot
ggplot(abalone, aes(x = WholeWeight, fill = Type)) +
geom_histogram(bins = 15, alpha = 0.5, size = 1.55, color="steelblue", fill = "springgreen1") +
labs(title = "Distribution of Whole Weight by Abalone Type")
Sometimes it is useful to plot multiple distributions in the same
figure, as in 1g), to see whether any obvious differences can be
observed. However, since the plots are added on top of each other it can
be difficult to make out the individual distributions. As an
alternative, we can use ggplot’s facet_wrap function to
create separate subplots for each level variable Type. The
first argument to facet_wrap is a formula, which includes a
tilde ~ character and, to its right, the name of the
categorical variable you want to facet based on,
e.g. facet_wrap(~cat_input). See Section 1.5.4 in R4DS for
more information on faceting.
To the code from 1g), add a call to facet_wrap
that uses the Type column to create subplots. Additionally,
in your call to geom_histogram, include the argument
show.legend = FALSE, since the subplots will already
contain the names of the factor levels.
ggplot(abalone, aes(x = WholeWeight, fill = Type)) +
geom_histogram(show.legend = FALSE, bins = 15, alpha = 0.5, size = 1.55, color="steelblue", fill = "springgreen1") +
facet_wrap(~Type)
In this problem, we will introduce another very important geom, the
boxplot, which provides a quick visual display of useful summary
statistics for continuous variables ("numeric"s). Compared
with the histogram which focuses on displaying the shape of the
distribution, the boxplot allows us to visually relate the median with
the 25th and 75th quantiles, as well as outliers. We get an idea about
the central tendency of the variable, as well as a rough guide for the
“meaningful” range.
To demonstrate the usefulness of the boxplot, we will use the
diamonds dataset from ggplot2.
Some packages in R include datasets as well as code, and these are
added to the Environment whenever you call library(...).
The command data() will display all datasets that currently
exist in the environment. The code example below demonstrates how you
can access any of these datasets by name and pipe them into other
functions. Datasets can be accessed without assigning them to a
variable, but doing so will make it appear in your Environment.
Pipe diamonds into the glimpse()
function to display the dimensions and datatypes associated with the
variables within the dataset.
diamonds |> glimpse()
The variable price gives the cost of a diamond in US
dollars, while the other variables provide attributes associated with a
diamond. A natural question to ask then is: What variables influence
the price? If you have purchased a diamond before, you may have
heard about the “4 C’s”: cut, color, clarity, and carat. We will explore
the behavior of price with respect to these 4
variables.
The first three of these are all factors (the "ord" data
type is a special ordered factor which as the name states has a natural
ordering of the categorical levels) within the diamonds
dataset, while carat is a "numeric" variable.
With a boxplot, we group a continuous variable based on a categorical
variable, and display summary statistics associated with the continuous
variable for each categorical level. Therefore, the boxplot geom is
another geometric object which performs multiple operations behind the
scenes. If you have not worked with boxplots before, I recommend Chapter
7 of the R for Data Science
book.
Let’s start by visualizing the relationship between
price and color.
Pipe diamonds into ggplot() and set
the x and y aesthetics to color
and price, respectively. Then, call
geom_boxplot(). What conclusion would you draw based on the
resulting figure?
diamonds |>
ggplot(aes(x = color, y = price)) +
geom_boxplot() +
labs(title = "Diamond Price by Color",
x = "Diamond Color",
y = "Price (USD)")
Next, we will include the influence of the cut variable
breaking up the graphic into separate subplots based on the levels of
cut.
Add the facet_wrap() call to the code from
Problem 2b), and set the facetting variable to be
cut.
diamonds |>
ggplot(aes(x = color, y = price)) +
geom_boxplot() +
labs(title = "Diamond Price by Color",
x = "Diamond Color",
y = "Price (USD)") +
facet_wrap(~cut)
In addition to the facet_wrap() function, we can create
subplots with the facet_grid() function. As the name
suggests, facet_grid() creates a 2D grid layout where each
subplot corresponds to a combination of two facetting variables. As with
facet_wrap(), the syntax uses the formula interface:
facet_grid(<vertical variable> ~ <horizontal variable>).
The variable provided to the left of the ~ varies
top-to-bottom (vertically), while the variable to the right of the
~ changes left-to-right (horizontally).
To see how this works, use facet_grid() instead
of facet_wrap() and set the facetting variables to be
clarity and cut for the vertical and
horizontal directions, respectively.
diamonds |>
ggplot(aes(x = color, y = price)) +
geom_boxplot() +
labs(title = "Diamond Price by Color",
x = "Diamond Color",
y = "Price (USD)") +
facet_grid(clarity ~ cut)
The resulting figure in Problem 2d) includes 3 out of the 4 C’s. The
remaining variable, carat, is not categorical. To include
carat in our figure, let’s discretize it and compare two
boxplots side-by-side at each color level within each
clarity and cut subplot combination. For now,
we will keep things simple and break up carat based on if
an observation has a value greater than the median carat
value.
Within the geom_boxplot() function, set the
fill aesthetic to be a conditional test:
carat > median(carat). As shown in the supplemental
reading material, use the theme() function to move the
legend position to the top of the graphic.
Note: It might be difficult to see everything within the
graphic window displayed in the result after the code chunk, when
working within the .Rmd file in the RStudio IDE. You can zoom in by
clicking on the “Show in New Window” icon which is displayed as the
small “arrow over paper” icon to the right hand size of the output
portion. Alternatively, the figure dimensions can be modified by the
code chunk parameters fig.width and
fig.height. For this assignment, it is ok to use
the default figure dimensions.
diamonds |>
ggplot(aes(x = color, y = price)) +
geom_boxplot(aes(fill = carat > median(carat))) +
facet_grid(clarity ~ cut) +
labs(title = "Diamond Price by Color, Faceted by Clarity and Cut",
x = "Diamond Color",
y = "Price (USD)",
fill = "Carat > Median") +
theme(legend.position = "top")
Due to the large number of subplots, the individual facets are quite
small with the default figure size. Let’s focus on the case with
cut == "Ideal" and clarity == "IF" by calling
filter() before piping the dataset into the
ggplot().
Pipe diamonds into filter() and
perform the necessary operation. Pipe the resulting dataset into the
same ggplot2 function calls used in Problem 2e), except for
one important change. The filter() call will reduce the
dataset, and thus our conditional test will be comparing
carat to the median value associated with the smaller
dataset. To force the conditional test to still be applied to the median
based on the complete dataset use median(diamonds$carat)
within the conditional test instead of
median(carat).
diamonds |>
filter(cut == "Ideal", clarity == "IF") |>
ggplot(aes(x = color, y = price)) +
geom_boxplot(aes(fill = carat > median(diamonds$carat))) +
labs(title = "Diamond Price by Color (Cut = Ideal, Clarity = IF)",
x = "Diamond Color",
y = "Price (USD)",
fill = "Carat > Median") +
theme(legend.position = "top")
Discuss the differences between the trends shown in the resulting figure in Problem 2f) with the trends shown in the figure in Problem 2b).
In Problem 2b, the boxplots showed that diamond price seemed to increase with lower color quality, but the relationship was noisy and confounded by other variables. In Problem 2f, after filtering to only Ideal cut and IF clarity diamonds, and splitting by carat relative to the overall median, it becomes clear that carat size is the dominant driver of price. Large-carat stones are consistently much more expensive, while color plays only a secondary role within this high-quality subset.
Let’s now start to introduce model fitting. Within the R
ecosystem the workhorse of any modeling exercise is the
lm() function. We will learn what goes on behind the scenes
of lm() later in the semester. For now, let’s just get some
practice using lm(). It’s always helpful to visualize model
behavior and so we will first introduce lm() through the
ggplot2 geom_smooth() geom.
geom_smooth() is a way to add a “smoothing” trend to a
visualization. Chapter 3 of the R4DS
book provides an excellent overview of geom_smooth().
Reading that chapter will help you with this problem.
However, Ch. 3 in R4DS focused on the default approach of
geom_smooth() which applies a non-linear smoothing
function. We will instead force geom_smooth() to display a
linear trend by setting the method argument in
geom_smooth() to be lm. You do not have to put
quotes around lm. You are telling
geom_smooth() to use the lm() function. It is
very important to note, you should NOT
type lm() with parentheses when you set
method = lm.
You will return to using the abalone dataset in this
problem.
As already stated, geom_smooth() allows a trend to be
added as a layer to the graphic. In this problem, you will create a
scatter plot of Rings vs Height, then you will
then layer a smooth trend line on top of that that scatter plot. For
now, you will just use the default parameters for
geom_smooth(), don’t worry about the shape of the
curve.
Pipe abalone into ggplot and map the
x and y aesthetics to Height and
Rings respectively. Add geom_point AND
geom_smooth components to the plot. You do not need to
specify any arguments to geom_smooth() in this
problem.
NOTE: You may seem some warning messages appear. That’s ok for now.
abalone |> ggplot(aes(x = Height, y = Rings)) +
geom_point() +
geom_smooth()
Let’s now visualize a linear trend instead of the default non-linear smoother.
Specify the method argument within
geom_smooth() to be equal to lm.
abalone |> ggplot(aes(x = Height, y = Rings)) +
geom_point() +
geom_smooth(method = 'lm')
By default, geom_smooth() assumes that the input is the
variable mapped to the x aesthetic and the response is the
variable mapped to the y aesthetic. You can specify an
alternative formula to be used by the smoother through the
formula argument to geom_smooth(). For now,
just type the linear relationship formula, y ~ x, for the
formula argument.
In R’s formula interface, the variable to the left of
the ~ is the response and all variables to the right of the
~ are the inputs/predictors/features. So by setting the
formula to be y ~ x you are telling R that “y
is a function of x”.
It’s important to note that when specifying the formula
argument to geom_smooth() you can use x and
y because they are “local” to geom_smooth().
You do not have to specify the original variable names because of the
aesthetic mappings.
Explicitly set the formula argument to be “y is
a function of x”. Keep the method argument set to
lm.
abalone |> ggplot(aes(x = Height, y = Rings)) +
geom_point() +
geom_smooth(method = 'lm', formula = 'y ~ x')
geom_smooth() has many of the same aesthetics as other
geoms like geom_point(). You can tell
geom_smooth() to fit separate trend lines to different
groups several ways. First, you can map a discrete variable to the
color aesthetic to produce separate trend lines with
different colors. For example, you can fit separate trend lines to male,
female, and juvenile abalone for Height vs
Rings.
Use the same code setup that you used to answer Problem 3c).
This time set the color aesthetic within
geom_smooth() to be equal to
Type.
abalone |>
ggplot(aes(x = Height, y = Rings)) +
geom_point() +
geom_smooth(method = lm, formula = y ~ x, aes(color = Type))
When you map a discrete variable in geom_smooth(),
ggplot2 first groups the data associated with the unique
values (or levels in R terminology) together. Then separate
trend lines are fit and displayed for the separate groups. We can force
the grouping operation to occur without assigning specific colors to the
groups with the group aesthetic.
Use the same code setup that you used to answer Problem 3d).
This time, set the group aesthetic within
geom_smooth() to be equal to Type. Do not map
any variable to the color aesthetic.
abalone |>
ggplot(aes(x = Height, y = Rings)) +
geom_point() +
geom_smooth(method = lm, formula = y ~ x, aes(group = Type))
As useful as it is to include trend lines on figures, we do not have
access to such models outside of ggplot2. In other words,
we can’t study their behavior, or make predictions with the models on
new data. To do so, we need to fit a model ourselves outside
ggplot2. In this problem you will use lm()
directly to fit simple models to the mpg data.
Fit a linear relationship between Rings and
Height using R’s formula interface. Assign the
model to the variable mod1 below.
mod1 <- lm(Rings ~ Height, data = abalone)
We could use the summary() function to inspect the
results, but instead I want you to visualize the coefficients estimates
and confidence intervals. To do so, you will use the
coefplot() function from the coefplot package.
If you have not downloaded and installed coefplot please do
so now.
Use coefplot::coefplot() to visualize the
coefficients for mod1. Based on your visualization, is the
Height variable “significant”?
coefplot::coefplot(mod1)
Let’s now fit a slightly more complex model which accounts for the
influence of the Type variable. You will use an additive
relationship and so in your formula you only need to separate the two
input variables with the + operator.
Fit a linear relationship between the Rings and
the inputs WholeWeight and Height. Treat the
inputs as additive. Assign the result to the variable
mod2.
mod2 <- lm(Rings ~ WholeWeight + Height, data = abalone)
Use the coefplot::coefplot() function again to
visualize the coefficients associated with mod2. Are the
variables all “significant”? How many coefficients are
displayed?
coefplot::coefplot(mod2)
This question gives you practice working with LaTeX to write math expressions and equations. The sub-parts have mathematical expressions described in words or written in text. You will need to “code up” those expressions in LaTeX within the provided equation blocks.
We will denote vectors as bold face font lower case letters and
matrix with bold face font upper case letters. Thus, the vector \(\mathbf{z}\) is written as
$\mathbf{z}$ to create an in-line LaTeX expression.
The equation block is started for you below. Write a system of linear equations in matrix form such that the matrix A multiplied by the vector x equals the vector b. The equation should read as Ax=b, but must be written in bold face font below.
\[ \mathbf{A}\mathbf{x} = \mathbf{b} \]
Parentheses can be created several ways in LaTeX. Using the “basic”
( ) characters will create parentheses around an
expression. However, the parentheses are fixed size and do not “grow” as
the size of the expression grows. Dynamic parentheses are created by
using \left( \right) instead of ( ). You can
also create dynamic square and curly braces with
\left[ \right] and \left{ \right},
respectively.
Throughout the semester we will frequently need to use subscripts and
superscripts. Subscripts are “attached” with the underscore
_ and superscripts are “attached” with the ^.
For example, $7_{3}$ would set 3 as the subscript to 7.
Using the curly braces { } next to the underscore is a
formal way of denoting that everything contained within the curly braces
will be used as the subscript.
The equation block is created for you below. Place within dynamically sized parentheses the variable capital X with a subscript of 1 and a superscript of capital Z. The Z superscript itself must have a superscript of capital Y.
\[ \left( X_{1}^{Z^{Y}} \right) \]
We will often need to write complicated mathematical expressions
which are easiest to express in multiple lines. The easiest way to
create multi-line equation is by using the align LaTeX
math block environments.
The & character is the alignment marker and it tells
LaTeX which characters should be matched with each other between lines.
In the following example, you can see that all three = are
preceded by a &, which makes them all aligned. The
double slash \\ marks the end of a line. It is not
necessary on the last line of an equation block, and can cause some
problems in LaTeX compilers.
\[ \begin{align*} \exp(5a^3) - \frac{3}{4} &= \omega\\ 4 &= 2 + 2\\ &= \sum_{n=1}^Nn^2 \end{align*} \]
Four algebraic expressions are given below. Create an
align* block with one line for each expression, all of
which should be aligned according to the equals sign. Hint: in LaTeX
math, Greek characters can be simply referred to with a backslash and
their name, e.g. \epsilon
\[ \begin{align*} x + y &= 2z^2 \\ x^2 + 3y &= 20 \\ 27\zeta &= x^2 y^3 \\ x^3 - 5z^2 &= 12 \end{align*} \]
This question is designed to give graduate students further experience with Latex, as well as practice basic calculus skills that will provide deeper insight into techniques we use in the coming weeks.
Consider two functions \(f(x)\) and \(g(x)\). Let \(f(x) = 3x^2 + 2x - 8\) and \(g(x) = \log_e{f(x)}=\log_e[3x^2 + 2x - 8]\) - that is, the natural logarithm of \(f(x)\). In this class \(\log\) will always connote the natural (base \(e\)) logarithm, unless otherwise specified.
A major theme in machine learning is optimization, or choosing variable values that maximize/minimize some function. When that function is smooth and differentiable, we can find the local optima of the function by calculating the zero-points of its first derivative. For a polynomial of degree \(d\), there are \(d-1\) complex roots of the first derivative. Quadratic (degree 2) functions, like \(f(x)\) in this Problem, have the convenient property of only having one real optimum.
The code chunk below creates a tibble with values of \(f(x)\) calculated for \(x\) between -3 and +3. You don’t need to make any changes to this chunk, this question asks you to plot the function over this domain of \(x\).
f_function_data <- tibble(x = seq(-3,3,by=.1)) |>
mutate(fx = 3*x^2 + 2*x -8)
Pipe f_function_data into ggplot()
then add a geom_line object with the x aesthetic set to
x and the y aesthetic to fx. You do not need
to plot the individual points. Additionally please answer the following
question above your code chunk: is the optimum of this function a
minimum or maximum and what is its approximate value? You can give a
rough estimate based on the visual alone.
YOUR ANSWER HERE.
f_function_data |>
ggplot(aes(x = x, y = fx)) +
geom_line() +
labs(title = "Quadratic Function f(x) = 3x^2 + 2x - 8",
x = "x", y = "f(x)")
Within the {align} environment started for you
below, calculate the first derivative of \(f(x)\). You may add more lines if you need
to.
\[ \begin{align} f(x) &= 3x^2 + 2x - 8 \\ \frac{df}{dx} = f^\prime(x) &= 6x + 2 \end{align} \]
## # A tibble: 1 × 2
## min_index_fx min_index_logf
## <int> <int>
## 1 28 28
There is no programming required for this question, only
math. Complete the {align} block below with the first
derivative of \(g(x)\)
\[ \begin{align} g(x) &= \log\left(3x^2 + 2x - 8\right) \\ g^\prime(x) &= \frac{6x + 2}{3x^2 + 2x - 8} \end{align} \]
Now that you have calculated the derivatives of both functions, you
will verify what the figure in 6c) suggests: both \(f(x)\) and \(g(x)\) reach their minimum values for the
same value of x. To do so, you should set each derivative
equal to zero and solve for x, then show that both
functions have the same solution. An important caveat - there is a
trivial solution when \(x=\infty\)
which only applies to one function.
In this question you are proving a specific case of a general theorem: if a value \(\hat{x}\) is a local optimum of a function, then it is also a local optimum of its natural logarithm. [Notice the implication only goes in one direction]. This is actually quite easy to prove in general for yourself, and the consequences turn out to be quite powerful owing to certain convenient properties of logarithms.
If you’re wondering how we know if a local optimum is a maximum or minimum value, here is a link to a good Youtube video that visualizes the second derivative test for concavity. We’ll cover this in class later.
Show that \(f(x)\) and \(g(x)\) have the same non-trivial (i.e. not infinity) minima by setting each derivative equal to zero and solving for \(x\) hint: the algebra you need is very basic if you’ve calculated the derivatives correctly.
\[ \begin{align} f^\prime(x) = 0 &\implies 6x + 2 = 0 \implies x = -\tfrac{1}{3} \\ g^\prime(x) = 0 &\implies \frac{6x + 2}{3x^2 + 2x - 8} = 0 \implies x = -\tfrac{1}{3} \end{align} \]