Exa math v6

E1.

Describe the values \(x\) plotted in this graph using mathematical notation with inequalities and absolute value.


E2.

For the set of functions below, find \((f \circ g \circ h)(x) = f(g(h(x)))\) and evaluate the result at \(x = 1\). \[f(x) = e^x + x^2, \quad g(x) = \sqrt{x + 4}, \quad h(x) = x^3 - 5\]
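A hand-computed value can be checked numerically. The following is a minimal sketch, assuming plain Python with the standard `math` module; the helper name `composed` is hypothetical and not part of the exercise.

```python
import math

def f(x):
    return math.exp(x) + x**2

def g(x):
    return math.sqrt(x + 4)

def h(x):
    return x**3 - 5

def composed(x):
    # (f o g o h)(x): apply h first, then g, then f
    return f(g(h(x)))

value_at_one = composed(1.0)
print(value_at_one)
```

Note that `math.sqrt` raises a `ValueError` for negative arguments, so this sketch only works for inputs with \(h(x) + 4 \ge 0\).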


E3.

Find functions \(f(x)\), \(g(x)\) and \(h(x)\) so that the function \(y(x)\) can be expressed as \(y(x) = (f \circ g \circ h)(x) = f(g(h(x)))\) (there may be more than one solution). \[y(x) = \frac{1}{\log(x^2+1)}\]


E4.

Write the slope-intercept form \(y = mx+b\), the point-slope form \((y - y_1) = m(x-x_1)\) and the standard form \(ax + by = c\) of the line in the graph.

E5.

Solve the equations below for \(x\). Do not evaluate the logarithms and exponentials numerically in the final solutions, e.g. leave a solution as \(x = \log(4)\) instead of calculating the value \(\log(4) \approx 1.386\). \[\log(4) + 3\log(x) = \log(32)\\ \frac{e^{3x + 2}}{e^2} - \frac{8^3}{2^3} = 0\]


E6.

For a continuous random variable \(X\) with Laplace distribution \(\text{Laplace}(\mu, b)\) the pdf is \[f(x) = \frac{1}{2b} e^{-\frac{|x-\mu|}{b}} \enspace .\] For a random sample \(\mathbf{x}=(x_1, x_2, \ldots, x_n)\) of size \(n\) from this distribution the likelihood function is \[l(\mathbf{x}) = \prod_{i=1}^n \frac{1}{2b} e^{-\frac{|x_i-\mu|}{b}} \enspace .\] Find the log likelihood function \(\log l(\mathbf{x})\) and simplify it to a sum of terms without any exponents.


E7.

For the function \(f\) below, find \(\text{argmax } f(x)\) and \(\max f(x)\).

\[f(x) = (2x+3)(-0.5x-1)\]
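An analytic answer can be sanity-checked with a coarse grid search. The sketch below assumes a hypothetical search window of \([-5, 5]\) with step \(0.001\); neither is given in the exercise, and the grid only approximates the true maximizer to within the step size.

```python
def f(x):
    return (2 * x + 3) * (-0.5 * x - 1)

# Hypothetical search window and resolution (assumptions, not from the exercise)
lower, upper, n_points = -5.0, 5.0, 10001
step = (upper - lower) / (n_points - 1)
grid = [lower + i * step for i in range(n_points)]

# The grid point with the largest function value approximates argmax f(x)
x_best = max(grid, key=f)
f_best = f(x_best)
print(x_best, f_best)
```

Because \(f\) is a downward-opening parabola, the grid maximum should sit next to the vertex found by hand.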


E8.

Find the derivative of the function \[f(x) = \sigma( 6x^2 - 5x + 3 ) \enspace ,\] where \(\sigma(x)\) is the logistic sigmoid \(\sigma(x) = \frac{1}{1 + e^{-x}}\) with derivative \(\sigma'(x) = \sigma(x) (1 - \sigma(x))\).

Evaluate the derivative at \(x = 1\).

Use the finite difference approximation of the derivative to verify your result above.
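One possible way to carry out the finite difference check is sketched below. It assumes a central difference \(\frac{f(x+\epsilon) - f(x-\epsilon)}{2\epsilon}\) with a hypothetical step \(\epsilon = 10^{-5}\); a forward difference would serve equally well, only with a larger error.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def f(x):
    return sigmoid(6 * x**2 - 5 * x + 3)

def central_difference(func, x, step=1e-5):
    # Symmetric finite difference approximation of func'(x)
    return (func(x + step) - func(x - step)) / (2 * step)

approx_derivative = central_difference(f, 1.0)
print(approx_derivative)
```

The printed value should agree with the hand-derived derivative at \(x = 1\) to several decimal places.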


E9.

In the following exercise all vectors \(\mathbf{x}_i \in \mathbb{R}^2\) and scalars \(y_i \in \mathbb{R}\) are observations of data (known and fixed constants) and \(\mathbf{w} = (w_1, w_2)\) is a vector of parameters of the function \(f(\mathbf{w})\).

  1. Find the partial derivatives of the function \(f\) with respect to the elements of the vector \(\mathbf{w} = (w_1, w_2)\).

\[f(\mathbf{w}) = \sum_{i=1}^3 \big( \mathbf{w}^T \mathbf{x}_i - y_i \big)^2 + || \mathbf{w} ||_2^2\] (Note: \(\mathbf{w}^T \mathbf{x}_i\) is the inner product of the vectors \(\mathbf{w}\) and \(\mathbf{x_i}\) and \(||\mathbf{w}||_2\) is the \(\ell_2\) norm of the vector. Check your notes on linear algebra from the sister course!)

Find the partial derivatives for these specific values of observations \[\mathbf{x}_1 = (1, 2), \quad y_1 = 2\\ \mathbf{x}_2 = (0, 3), \quad y_2 = -1\\ \mathbf{x}_3 = (-1, 0), \quad y_3 = -2\]

  2. Find the gradient \(\nabla f(\mathbf{w})\) of the function at the point \(\mathbf{w} = (0.5,0.2)\).

  3. Evaluate the function at the point \(\mathbf{w} = (0.5, 0.2)\) and the point \(\mathbf{w} - 0.1 \nabla f(\mathbf{w})\). Which of these is bigger?
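The hand-derived gradient and the two function values can be cross-checked numerically. This is a minimal sketch in plain Python, assuming a central finite difference for the gradient; the names `objective` and `numerical_gradient` and the step size are hypothetical choices, not part of the exercise.

```python
# Observations from the exercise
X = [(1.0, 2.0), (0.0, 3.0), (-1.0, 0.0)]
Y = [2.0, -1.0, -2.0]

def objective(w):
    # Sum of squared residuals plus the squared l2 norm of w
    residuals = [w[0] * x[0] + w[1] * x[1] - y for x, y in zip(X, Y)]
    return sum(r * r for r in residuals) + w[0] ** 2 + w[1] ** 2

def numerical_gradient(func, w, step=1e-6):
    # Central finite difference in each coordinate (assumed step size)
    grad = []
    for k in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[k] += step
        w_minus[k] -= step
        grad.append((func(w_plus) - func(w_minus)) / (2 * step))
    return grad

w = [0.5, 0.2]
grad = numerical_gradient(objective, w)
# One step against the gradient with step length 0.1
w_next = [wk - 0.1 * gk for wk, gk in zip(w, grad)]
print(grad, objective(w), objective(w_next))
```

Comparing the two printed function values shows how moving against the gradient changes \(f\).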