Exa math v3

E1.

Describe all function values \(f(x)\) whose distance from the number \(2\) is at most \(4\). Use mathematical notation with inequalities and absolute value.


E2.

For the functions below, find \((f \circ g \circ h)(x) = f(g(h(x)))\) and evaluate the result at \(x = -1\). \[f(x) = \frac{e^{x}}{x}, \quad g(x) = x+3, \quad h(x) = x^2 + 1\]
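A hand-derived composition can be checked numerically. The sketch below defines the three functions from this exercise in Python and composes them with a small helper, `compose`, which is a hypothetical name introduced here for illustration only. It applies the rightmost function first, matching the definition of \(f \circ g \circ h\).

```python
import math

def f(x):
    return math.exp(x) / x

def g(x):
    return x + 3

def h(x):
    return x**2 + 1

def compose(*funcs):
    """Return funcs[0] ∘ funcs[1] ∘ ...; the rightmost function is applied first."""
    def composed(x):
        for fn in reversed(funcs):
            x = fn(x)
        return x
    return composed

fgh = compose(f, g, h)

# Numerical value of (f ∘ g ∘ h)(-1), to compare with the hand-derived result.
value_at_minus_one = fgh(-1)
print(value_at_minus_one)
```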


E3.

Find functions \(f(x)\), \(g(x)\) and \(h(x)\) such that the function \(y(x)\) can be expressed as \(y(x) = (f \circ g \circ h)(x) = f(g(h(x)))\) (there may be more than one solution). \[y(x) = e^{\frac{1}{(x-2)^2}}\]


E4.

Write the slope-intercept \(y = mx+b\), point-slope \((y - y_1) = m(x-x_1)\) and standard form \(ax + by = c\) of the line in the graph.

E5.

Solve the equations below for \(x\). Do not evaluate the logarithms and exponentials numerically in the final solutions; e.g. leave a solution as \(x = \log(4)\) instead of computing the value \(\log(4) \approx 1.386\). \[\log(5) = \log(x^3 - 4x^2) - 2\log(x)\\ e^{x-1}e^2 = 9\]
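A closed-form solution can be sanity-checked by locating the root of each equation numerically. The sketch below is one possible check, not part of the exercise: it writes each equation as a residual (left-hand side minus right-hand side) and applies plain bisection. The helper names and the bracketing intervals \([5, 20]\) and \([0, 3]\) are assumptions chosen for illustration; the first interval respects the domain \(x > 4\) required by \(\log(x^3 - 4x^2)\).

```python
import math

def residual_log(x):
    # Left-hand side minus right-hand side of the first equation.
    # Only defined for x > 4, where x**3 - 4*x**2 > 0.
    return math.log(5) - (math.log(x**3 - 4 * x**2) - 2 * math.log(x))

def residual_exp(x):
    # Left-hand side minus right-hand side of the second equation.
    return math.exp(x - 1) * math.exp(2) - 9

def bisect(func, lo, hi, tol=1e-12):
    """Find a root of func in [lo, hi], assuming func changes sign on the interval."""
    f_lo = func(lo)
    if f_lo * func(hi) > 0:
        raise ValueError("func must change sign on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid              # root lies in the left half
        else:
            lo, f_lo = mid, f_mid  # root lies in the right half
    return 0.5 * (lo + hi)

root_log = bisect(residual_log, 5.0, 20.0)
root_exp = bisect(residual_exp, 0.0, 3.0)
print(root_log, root_exp)
```

Comparing the printed values with a numerical evaluation of the closed-form answers shows whether the algebra was carried out correctly.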


E6.

For a binary random variable \(X\) with Bernoulli distribution \(Ber(\mu)\), the pmf is \[p(x) = \mu^x \, (1 - \mu)^{1-x} \enspace .\] For a random sample \(\mathbf{x}=(x_1, x_2, \ldots, x_n)\) of size \(n\) from this distribution, the likelihood function is \[l(\mathbf{x}) = \prod_{i=1}^n \mu^{x_i} \, (1 - \mu)^{1-x_i} \enspace .\] Find the log-likelihood function \(\log l(\mathbf{x})\) and simplify it to a sum of terms without any exponents.
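A simplified log-likelihood can be checked against the logarithm of the product computed directly. The sketch below does this in Python for one candidate simplification, obtained from the rules \(\log(ab) = \log a + \log b\) and \(\log(a^k) = k \log a\). The sample and the value of \(\mu\) are made up for illustration and are not part of the exercise.

```python
import math

def log_of_product(xs, mu):
    """Logarithm of the likelihood, computed directly from the product."""
    likelihood = 1.0
    for x in xs:
        likelihood *= mu**x * (1 - mu)**(1 - x)
    return math.log(likelihood)

def sum_of_logs(xs, mu):
    """Candidate simplified form: a sum of terms with no exponents."""
    return sum(x * math.log(mu) + (1 - x) * math.log(1 - mu) for x in xs)

sample = [1, 0, 1, 1, 0]  # hypothetical binary observations
mu = 0.3                  # arbitrary parameter value in (0, 1)

direct = log_of_product(sample, mu)
simplified = sum_of_logs(sample, mu)
print(direct, simplified)
```

For large \(n\) the direct product underflows to zero in floating point, which is one practical reason for working with the sum form.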

E7.

For the function \(f\), find \(\text{argmin } f(x)\) and \(\min f(x)\).

\[f(x) = (2x - 4)(x+2)\]
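An analytically derived minimizer can be cross-checked with a brute-force grid search. The sketch below evaluates \(f\) on a grid over \([-5, 5]\) with step \(0.001\); both the interval and the step are assumptions chosen for illustration, and the search only finds the minimum if it lies inside the grid.

```python
def f(x):
    return (2 * x - 4) * (x + 2)

# Grid of candidate points from -5 to 5 with step 0.001.
grid = [i / 1000 for i in range(-5000, 5001)]

x_best = min(grid, key=f)  # grid point with the smallest function value
f_best = f(x_best)         # function value at that point
print(x_best, f_best)
```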


E8.

Find the derivative of the function \[f(x) = \sigma(2\sqrt{x} - 5x) \enspace ,\] where \(\sigma(x)\) is the softplus function \(\sigma(x) = \log(1 + e^x)\) with derivative \(\sigma'(x) = \frac{1}{1 + e^{-x}}\).

Evaluate the derivative at \(x = 1\).

Use the finite difference approximation of the derivative to verify your result above.
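One way to carry out this check is sketched below in Python. It uses the central difference \(\frac{f(x+h) - f(x-h)}{2h}\) with step \(h = 10^{-5}\); the choice of the central variant and of the step size are assumptions, and a forward difference \(\frac{f(x+h) - f(x)}{h}\) would work as well with slightly lower accuracy. The function `f_prime` is a chain-rule candidate, \(\sigma'(u)\,u'(x)\) with \(u(x) = 2\sqrt{x} - 5x\), included only so that the two numbers can be compared.

```python
import math

def softplus(u):
    return math.log(1 + math.exp(u))

def f(x):
    return softplus(2 * math.sqrt(x) - 5 * x)

def central_difference(func, x, h=1e-5):
    """Approximate func'(x) by (func(x + h) - func(x - h)) / (2h)."""
    return (func(x + h) - func(x - h)) / (2 * h)

def f_prime(x):
    # Chain rule: sigma'(u) * u'(x), with u(x) = 2*sqrt(x) - 5*x.
    u = 2 * math.sqrt(x) - 5 * x
    return (1 / (1 + math.exp(-u))) * (1 / math.sqrt(x) - 5)

numeric = central_difference(f, 1.0)
analytic = f_prime(1.0)
print(numeric, analytic)
```

The two values should agree to several decimal places; a large discrepancy points to an error in the hand-derived derivative.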


E9.

In the following exercise, all vectors \(\mathbf{x}_i \in \mathbb{R}^2\) and scalars \(y_i \in \mathbb{R}\) are observations of data (known and fixed constants) and \(\mathbf{w} = (w_1, w_2)\) is a vector of parameters of the function \(f(\mathbf{w})\).

  1. Find the partial derivatives of the function \(f\) with respect to the elements of the vector \(\mathbf{w} = (w_1, w_2)\).

\[f(\mathbf{w}) = \sum_{i=1}^3 \big( \mathbf{w}^T \mathbf{x}_i - y_i \big)^2 + || \mathbf{w} ||_2^2\] (Note: \(\mathbf{w}^T \mathbf{x}_i\) is the inner product of the vectors \(\mathbf{w}\) and \(\mathbf{x}_i\), and \(||\mathbf{w}||_2\) is the \(\ell_2\) norm of the vector. Check your notes on linear algebra from the sister course!)

Find the partial derivatives for these specific values of the observations: \[\mathbf{x}_1 = (1, 2), \quad y_1 = 2\\ \mathbf{x}_2 = (0, 3), \quad y_2 = -1\\ \mathbf{x}_3 = (-1, 0), \quad y_3 = -2\]

  2. Find the gradient \(\nabla f(\mathbf{w})\) of the function at the point \(\mathbf{w} = (0.5, 0.2)\).

  3. Evaluate the function at the point \(\mathbf{w} = (0.5, 0.2)\) and at the point \(\mathbf{w} - 0.1 \nabla f(\mathbf{w})\). Which of the two function values is larger?
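The computations in this exercise can be cross-checked numerically. The sketch below, in plain Python, evaluates \(f\) for the given observations, computes a candidate gradient \(2 \sum_i (\mathbf{w}^T \mathbf{x}_i - y_i)\,\mathbf{x}_i + 2\mathbf{w}\), and takes the single step \(\mathbf{w} - 0.1 \nabla f(\mathbf{w})\). The helper names `dot`, `f` and `grad_f` are introduced here for illustration only.

```python
# Observations from the exercise.
xs = [(1.0, 2.0), (0.0, 3.0), (-1.0, 0.0)]
ys = [2.0, -1.0, -2.0]

def dot(a, b):
    """Inner product of two vectors given as sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def f(w):
    """Sum of squared residuals plus the squared l2 norm of w."""
    residuals = [dot(w, x) - y for x, y in zip(xs, ys)]
    return sum(r**2 for r in residuals) + dot(w, w)

def grad_f(w):
    """Candidate gradient: 2 * sum_i (w.x_i - y_i) * x_i + 2 * w."""
    grad = [2.0 * wj for wj in w]
    for x, y in zip(xs, ys):
        r = dot(w, x) - y
        for j in range(len(w)):
            grad[j] += 2.0 * r * x[j]
    return grad

w = [0.5, 0.2]
g = grad_f(w)
w_next = [wj - 0.1 * gj for wj, gj in zip(w, g)]  # one step against the gradient

f_before, f_after = f(w), f(w_next)
print(g, f_before, f_after)
```

Comparing `f_before` with `f_after` answers the last question and illustrates why stepping against the gradient is the basic move in gradient descent.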