The domain of a function is the set of all possible inputs. The co-domain is the set of all possible outputs. The image of a function is the set of all actual outputs.
If the image of a function is the same as the co-domain, the function is onto. This means that for every element in the co-domain, there is at least one element in the domain that maps to it.
If each element in the image is mapped to by at most one element in the domain, the function is one-to-one.
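For example (a made-up illustration, not from the sources cited below): take \(f(x) = x^2\) with both the domain and the co-domain equal to the real numbers. The image is only the non-negative reals, so \(f\) is not onto; and since \(f(-2) = f(2) = 4\), it is not one-to-one either.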
A function is linear if it satisfies \(f(\vec{u} + \vec{v}) = f(\vec{u}) + f(\vec{v})\) and \(f(\alpha\vec{u}) = \alpha f(\vec{u})\); a single-variable polynomial of degree 1 or 0 is linear in this sense only when its constant term is zero. A linear function is equivalent to multiplying some matrix by a vector in the domain. The kernel of the linear function is the same as the nullspace of the matrix.
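As a quick illustration (my own sketch in R, using an arbitrary matrix), a function of the form \(f(\vec{x}) = A\vec{x}\) satisfies both linearity conditions:

# a linear function f(x) = A x, for an arbitrary 2x2 matrix A
A = matrix(c(2,0,1,3),nrow=2,byrow=T)
f = function(x) A %*% x
u = c(1,2); v = c(-1,4); alpha = 5
all.equal(f(u + v), f(u) + f(v))        # TRUE: additivity
all.equal(f(alpha*u), alpha*f(u))       # TRUE: homogeneity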
A collection of vectors is either linearly independent or linearly dependent. The vectors \(\{\vec{v}_1, \vec{v}_2,\dots, \vec{v}_n\}\) are linearly independent if the equation involving linear combinations \(a_1\vec{v}_1 + a_2\vec{v}_2+\dots +a_n\vec{v}_n = \vec{0}\) is true only when the scalars \(a_i\) are all equal to zero. (This is known as the trivial solution.) The vectors are linearly dependent if the equation has a solution in which at least one of the scalars is not zero.
If a set of vectors is linearly dependent, then there are some redundant vectors. These can fall into two categories. The first is that two of the vectors are either identical, or are scalar multiples of each other. The second is that some linear combination of two or more of the vectors can produce another of the vectors.
Consider the following vectors, based on:
https://www.youtube.com/watch?v=yLi8RxqfowA
\[ \vec{v}_1 = \begin{bmatrix}1\\-2\\0\\\end{bmatrix}, \vec{v}_2= \begin{bmatrix}4\\0\\8\\\end{bmatrix}, \vec{v}_3 = \begin{bmatrix}3\\-1\\5\\\end{bmatrix} \]
Note that the vectors map to the unknowns (the columns), not to the equations on which the rays lie (the rows). So the equations would be:
\[1x_1 + 4x_2 + 3x_3 = 0\] \[-2x_1 + 0x_2 - 1x_3 = 0\] \[0x_1 + 8x_2 + 5x_3 = 0\]
The vectors correspond to the unknowns because we have a coefficient for each unknown, in this case \(a_1, a_2, a_3\), and we want to multiply \(a_1\) by the first unknown (the first column vector), \(a_2\) by the second, and so on (as shown below). But from a visual perspective, we want to know if we can take any combination of expanded or contracted rays and connect them end-to-end in order to reach the origin. If we can, the vectors corresponding to the rays are linearly dependent; otherwise they are independent. Here is an example:
No combination of the two rays above can be used to reach the origin, except for the trivial solution, i.e., multiplying both vectors by zero. But if we add a third ray:
Now we can use combinations of these rays to reach the origin:
And thus this set of vectors is linearly dependent. This is why, to my mind, we look to see if the vectors based on the rays (i.e., the rows in the matrix) are linearly independent, rather than the vectors based on the unknowns (the columns).
The key is that the rays or hyperplanes correspond to the rows, and the unknowns correspond to the columns.
We want to look for linear combinations that result in a zero vector:
\[ a_1 \begin{bmatrix}1\\-2\\0\\\end{bmatrix} + a_2 \begin{bmatrix}4\\0\\8\end{bmatrix} + a_3 \begin{bmatrix}3\\-1\\5\end{bmatrix} = \begin{bmatrix}0\\0\\0\end{bmatrix} \]
This demonstrates the previous point. The first coefficient, \(a_1\), is multiplied against the first element (or unknown) of each ray’s vector, the second coefficient, \(a_2\), is multiplied against the second element/unknown, and so on.
We can write these as an augmented matrix:
\[ \begin{bmatrix} 1 & 4 & 3 & | & 0\\ -2 & 0 & -1 & | & 0\\ 0 & 8 & 5 & | & 0\\ \end{bmatrix} \]
Now we can reduce to reduced row echelon form:
\[ \begin{eqnarray} R_1\\ 2R_1 + R_2 \rightarrow R_2\\ R_3\\ \end{eqnarray} \begin{bmatrix} 1 & 4 & 3 & | & 0\\ 0 & 8 & 5 & | & 0\\ 0 & 8 & 5 & | & 0\\ \end{bmatrix} \]
\[ \begin{eqnarray} R_1\\ R_2\\ -1R_2 + R_3 \rightarrow R_3\\ \end{eqnarray} \begin{bmatrix} 1 & 4 & 3 & | & 0\\ 0 & 8 & 5 & | & 0\\ 0 & 0 & 0 & | & 0\\ \end{bmatrix} \]
Note that, as we will see in the second example, a row of all zeroes does not automatically mean we have non-trivial solutions. But for now, we continue:
\[ \begin{eqnarray} R_1\\ 1/8R_2 \rightarrow R_2\\ R_3\\ \end{eqnarray} \begin{bmatrix} 1 & 4 & 3 & | & 0\\ 0 & 1 & 5/8 & | & 0\\ 0 & 0 & 0 & | & 0\\ \end{bmatrix} \]
\[ \begin{eqnarray} R_1 - 4R_2 \rightarrow R_1\\ R_2\\ R_3\\ \end{eqnarray} \begin{bmatrix} 1 & 0 & 1/2 & | & 0\\ 0 & 1 & 5/8 & | & 0\\ 0 & 0 & 0 & | & 0\\ \end{bmatrix} \]
Now we have: \[1a_1 + 1/2a_3 = 0\] \[1a_2 + 5/8a_3 = 0\]
The last variable, \(a_3\), is the free variable; it can be any real number \(k\):
\[a_3 = k\]
Therefore:
\[1a_1 + 1/2k = 0\] \[1a_2 + 5/8k = 0\]
And:
\[a_1 = -1/2k\] \[a_2 = -5/8k\] \[a_3 = k\]
So there are infinitely many non-trivial solutions. Any real number can be substituted for \(k\), and there will be a solution based on that value, e.g., if \(k=1\), then \(a_1=-1/2, a_2=-5/8, a_3=1\). And thus the set of vectors is linearly dependent.
To reiterate a point made above, the vectors shown map to the unknowns. So the first value from each vector is 1, 4 and 3, and \(-1/2(1) - 5/8(4) + 1(3) = -1/2 - 20/8 + 3 = 0\). I think the unknowns are shown grouped into column vectors because it is assumed by default that a matrix is composed of column vectors. So the unknowns are grouped by column, and the actual vectors (which would map to equations) are grouped by row. In this form, the matrix can be multiplied by a column vector:
\[ \begin{bmatrix} 1 & 4 & 3\\ -2 & 0 & -1\\ 0 & 8 & 5\\ \end{bmatrix} \times \begin{bmatrix} -1/2\\-5/8\\1\\ \end{bmatrix} = \begin{bmatrix} -1/2(1) - 5/8(4) + 1(3)\\ -1/2(-2) - 5/8(0) + 1(-1)\\ -1/2(0) - 5/8(8) + 1(5)\\ \end{bmatrix} = \begin{bmatrix} 0\\0\\0\\ \end{bmatrix} \]
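We can verify this in R (my own check, written in the same style as the solve() example further down):

A = matrix(c(1,4,3,-2,0,-1,0,8,5),nrow=3,byrow=T)
A %*% c(-1/2,-5/8,1)   # the zero vector
qr(A)$rank             # 2, less than 3, so the vectors are linearly dependent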
Now consider a second example with these vectors, from:
https://www.youtube.com/watch?v=SOzO9EcQdQc
Again, the column vectors map to the unknowns.
\[ \vec{v}_1 = \begin{bmatrix}1\\2\\1\\4\\\end{bmatrix}, \vec{v}_2= \begin{bmatrix}0\\1\\1\\1\\\end{bmatrix}, \vec{v}_3 = \begin{bmatrix}2\\0\\1\\7\\\end{bmatrix} \]
We want to look for linear combinations that result in a zero vector:
\[ a_1 \begin{bmatrix}1\\2\\1\\4\\\end{bmatrix} + a_2 \begin{bmatrix}0\\1\\1\\1\\\end{bmatrix} + a_3 \begin{bmatrix}2\\0\\1\\7\\\end{bmatrix} = \begin{bmatrix}0\\0\\0\\0\\\end{bmatrix} \]
We can write these as an augmented matrix:
\[ \begin{bmatrix} 1 & 0 & 2 & | & 0\\ 2 & 1 & 0 & | & 0\\ 1 & 1 & 1 & | & 0\\ 4 & 1 & 7 & | & 0\\ \end{bmatrix} \]
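Row reducing this system (the reduction is not shown here) leaves a pivot in every column of the coefficient part; the fourth row becomes all zeroes simply because there are four equations but only three unknowns. This is the point made earlier: a row of all zeroes does not automatically mean there are non-trivial solutions. Here the only solution is \(a_1 = a_2 = a_3 = 0\), so this set of vectors is linearly independent. A quick rank check in R (my own verification, not from the video):

A = matrix(c(1,0,2,2,1,0,1,1,1,4,1,7),nrow=4,byrow=T)
qr(A)$rank   # 3, equal to the number of unknowns, so only the trivial solution exists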
The first example above showed the way to find any solution for a linearly-dependent system of vectors when there are three unknowns and three vectors. That involves a single free variable. It is possible in \(R^n\) space to have \(n\) vectors in which two or more of the vectors can combine to form a third vector, but only when \(n>2\). If \(n=2\), then clearly we only have two vectors to work with, and so we can’t combine two to form a third. (We can have two vectors that point in the same direction, but then they’re linearly dependent exactly because of that reason.)
This brings up a key point: when \(n>2\), having \(n\) vectors in \(R^n\) (none of them the zero vector, and no two of them collinear) does not guarantee that the set is linearly independent. Two or more of the vectors might be able to combine to form another.
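For instance (a made-up example, not from the videos): in \(R^3\), the vectors \((1,0,1)\), \((0,1,1)\) and \((1,1,2)\) are pairwise non-collinear, yet the third is the sum of the first two, so the set is linearly dependent. A quick rank check in R:

A = matrix(c(1,0,1,0,1,1,1,1,2),nrow=3,byrow=T)   # one vector per row
qr(A)$rank   # 2, less than 3, so the set is linearly dependent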
But what if the number of vectors is greater than \(n\)? (And by “vectors” here, I mean the number of rays, not the number of unknowns.) For example:
\[x_1 + 3x_2 = 0\] \[5x_1 + x_2 = 0\] \[2x_1 + x_2 = 0\]
Clearly, none of these can be scaled to equal any other. We know, because there are more vectors than unknowns, that the system must be linearly dependent. But how to find a solution? You cannot do it with the method shown for the first example above, and here is why. First, create an augmented matrix:
\[ \begin{bmatrix} 1 & 3 & | & 0\\ 5 & 1 & | & 0\\ 2 & 1 & | & 0\\ \end{bmatrix} \]
Now reduce to reduced row echelon form.
\[ \begin{eqnarray} R_1\\ R_2 - 5R_1 \rightarrow R_2\\ R_3 - 2R_1 \rightarrow R_3\\ \end{eqnarray} \begin{bmatrix} 1 & 3 & | & 0\\ 0 & -14 & | & 0\\ 0 & -5 & | & 0\\ \end{bmatrix} \]
\[ \begin{eqnarray} R_1\\ -(1/14)R_2 \rightarrow R_2\\ R_3\\ \end{eqnarray} \begin{bmatrix} 1 & 3 & | & 0\\ 0 & 1 & | & 0\\ 0 & -5 & | & 0\\ \end{bmatrix} \]
\[ \begin{eqnarray} R_1 - 3R_2 \rightarrow R_1\\ R_2\\ R_3 + 5R_2 \rightarrow R_3\\ \end{eqnarray} \begin{bmatrix} 1 & 0 & | & 0\\ 0 & 1 & | & 0\\ 0 & 0 & | & 0\\ \end{bmatrix} \]
Because we have a row of all zeroes, we know that the system is linearly dependent. But we don't have a free variable this time. We can't say that \(x_2\) is \(k\), because it isn't: in the second vector, it's 1. So how do we find a solution? This is the best I've come up with so far. We can solve the system formed by the first two vectors, with the third vector as the target:
\[\begin{bmatrix}2\\1\\\end{bmatrix}\]
So:
\[ \begin{bmatrix} 1 & 3 & | & 2\\ 5 & 1 & | & 1\\ \end{bmatrix} \]
\[ \begin{eqnarray} R_1\\ R_2 - 5R_1 \rightarrow R_2\\ \end{eqnarray} \begin{bmatrix} 1 & 3 & | & 2\\ 0 & -14 & | & -9\\ \end{bmatrix} \]
\[ \begin{eqnarray} R_1\\ -(1/14)R_2 \rightarrow R_2\\ \end{eqnarray} \begin{bmatrix} 1 & 3 & | & 2\\ 0 & 1 & | & 9/14\\ \end{bmatrix} \]
\[ \begin{eqnarray} R_1 - 3R_2 \rightarrow R_1\\ R_2\\ \end{eqnarray} \begin{bmatrix} 1 & 0 & | & 1/14\\ 0 & 1 & | & 9/14\\ \end{bmatrix} \]
So in other words:
\[1/14(1) + 9/14(3) = 2\] \[1/14(5) + 9/14(1) = 1\]
Or:
\[ \begin{bmatrix} 1 & 3\\ 5 & 1\\ \end{bmatrix} \times \begin{bmatrix} 1/14\\9/14\\ \end{bmatrix} = \begin{bmatrix} 2\\1\\ \end{bmatrix} \]
But two things should be noted. First, this gets us to the point (2,1), not the origin; we then have to add the vector [2,1] times -1 to get back to the origin. Second, while there is no handy general formula here, reaching another scalar multiple of [2,1] is easy: we just scale the coefficients. If we want to reach [4,2], for example, we double them:
\[ \begin{bmatrix} 1 & 3\\ 5 & 1\\ \end{bmatrix} \times \begin{bmatrix} 1/7\\9/7\\ \end{bmatrix} = \begin{bmatrix} 4\\2\\ \end{bmatrix} \]
I suppose that means, therefore, that if \(k=1\), then \(a_1=1/14\) and \(a_2=9/14\). But that doesn't really fit the earlier example, because there \(k\) was a free variable, which we don't have here. There must be a linear combination that can be applied to the three vectors (or rather, to the two vectors of unknowns), but I don't know how to find it.
I have found that I can solve using R, if my target vector is [2,1]:
options(digits=7)
# rows are the first two vectors, (1,3) and (5,1)
data = matrix(c(1,3,5,1),nrow=2,byrow=T)
# solve for x such that data %*% x = (2,1)
solve(a=data,b=c(2,1))
## [1] 0.07142857 0.64285714
These values are equal to 1/14 and 9/14. But you can’t solve for all three equations; the matrix isn’t square in that case.
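One possible way to get a combination of all three row vectors directly (this is my own sketch using base R's QR decomposition, not a method from the videos): the coefficients \(c_1, c_2, c_3\) we are after satisfy \(c_1(1,3) + c_2(5,1) + c_3(2,1) = (0,0)\), i.e., they lie in the null space of the transposed matrix, and the columns of the complete Q factor beyond the rank span exactly that space:

A = matrix(c(1,3,5,1,2,1),nrow=3,byrow=T)   # the three row vectors
qrA = qr(A)
# the columns of Q beyond the rank are orthogonal to the columns of A,
# so they solve t(A) %*% c = 0
cvec = qr.Q(qrA,complete=T)[,(qrA$rank+1):nrow(A)]
cvec/cvec[3]*14   # roughly (-3, -5, 14)
t(A) %*% cvec     # essentially the zero vector

So one non-trivial combination is \(-3(1,3) - 5(5,1) + 14(2,1) = (0,0)\).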
My next question is: What if I have four equations in \(R^2\), can I find a combination of three that gives me the fourth? I know that it's possible. I can visualize making expanded or contracted copies of the three, placing them end-to-end, and finding the fourth. But how to do this? The matrix would have three rows, but in \(R^2\), the solution has only two values. So how do I calculate this?
# plot the four vectors placed end-to-end, returning to the origin
plot(0,0,xlim=c(0,5),ylim=c(0,5))
arrows(0,0,3,2,col="red")     # (3,2)
arrows(3,2,4,3,col="blue")    # (1,1)
arrows(4,3,3,4,col="green")   # (-1,1)
arrows(3,4,0,0,col="black")   # (-3,-4), i.e., -1 times (3,4)
The equations are:
\[3x_1 + 2x_2 = 0\] \[1x_1 + 1x_2 = 0\] \[-1x_1 + 1x_2 = 0\] \[3x_1 + 4x_2 = 0\]
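Note that the plot itself already shows one answer for this particular set: \((3,2) + (1,1) + (-1,1) = (3,4)\), i.e., a coefficient of 1 on each of the first three vectors (and \(-1\) on the fourth to return to the origin). The question is how to calculate such coefficients in general.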
If I have three equations in \(R^2\) I can solve. So I can add two of the vectors together (the first two), and then use that, and one other vector, to solve for the last one:
\[ \begin{bmatrix} 4 & 3 & | & 3\\ -1 & 1 & | & 4\\ \end{bmatrix} \]
\[ \begin{eqnarray} R_1 + 3R_2 \rightarrow R_1\\ R_2 \end{eqnarray} \begin{bmatrix} 1 & 6 & | & 15\\ -1 & 1 & | & 4\\ \end{bmatrix} \]
\[ \begin{eqnarray} R_1\\ R_1 + R_2 \rightarrow R_2\\ \end{eqnarray} \begin{bmatrix} 1 & 6 & | & 15\\ 0 & 7 & | & 19\\ \end{bmatrix} \]
\[ \begin{eqnarray} R_1\\ (1/7)R_2 \rightarrow R_2\\ \end{eqnarray} \begin{bmatrix} 1 & 6 & | & 15\\ 0 & 1 & | & 19/7\\ \end{bmatrix} \]
\[ \begin{eqnarray} R_1 - 6R_2 \rightarrow R_1\\ R_2\\ \end{eqnarray} \begin{bmatrix} 1 & 0 & | & -9/7\\ 0 & 1 & | & 19/7\\ \end{bmatrix} \]
So now we have our coefficients:
\[ \begin{bmatrix} 4 & 3\\ -1 & 1\\ \end{bmatrix} \times \begin{bmatrix} -9/7\\ 19/7\\ \end{bmatrix} = \begin{bmatrix} 3\\ 4\\ \end{bmatrix} \]
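As with the earlier example, we can check this with solve() in R:

data = matrix(c(4,3,-1,1),nrow=2,byrow=T)
solve(a=data,b=c(3,4))   # -9/7 and 19/7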
This will work so long as the two vectors are not collinear. So if we have a set of four vectors in \(R^2\), and two of them are collinear, the first thing to do is add them together, and proceed as before with three vectors. If a set of four vectors in \(R^2\) has three collinear vectors, then we can simply add all three together, and now we have a linearly independent set of two vectors with only the trivial solution.
Let \(V\) be a non-empty set of elements called vectors. We define two operations on the set \(V\): vector addition and scalar multiplication. Scalars are real numbers.
Let \(u, v\) and \(w\) be vectors in the set \(V\). The set \(V\) is called a vector space if it satisfies the usual ten axioms: closure under addition and under scalar multiplication; commutativity and associativity of addition; the existence of a zero vector and of additive inverses; the two distributive laws; the associativity of scalar multiplication (\(\alpha(\beta u) = (\alpha\beta)u\)); and \(1u = u\).
The null space of a matrix is the vector space consisting of all the vectors that, when multiplied by the matrix, produce the zero vector. So if the matrix is \(A\), then the null space \(u\) is the set of vectors such that:
\[Au = 0\]
If the only member of \(u\) is the trivial solution (the zero vector), then the column vectors of the matrix are linearly independent, and for a square matrix so are the rows. For a square matrix, this also means the matrix is invertible.
If there is more than one member in \(u\) then the matrix is singular/degenerate.
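A quick illustration in R (my own sketch): the matrix from the first example has a non-trivial null space, so it cannot be inverted, while a full-rank matrix can:

A = matrix(c(1,4,3,-2,0,-1,0,8,5),nrow=3,byrow=T)   # singular: non-trivial null space
# solve(A)   # would fail, because A has no inverse
B = diag(3)   # the 3x3 identity matrix: only the trivial null space
solve(B)      # succeeds, returning the inverse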
Note that a null space is a vector space: it is a subspace of the domain, because (1) it contains the zero vector, (2) it is closed under vector addition, and (3) it is closed under scalar multiplication. (Point 1 holds because \(A\vec{0} = \vec{0}\).)
Point 2 holds because if there are two vectors, \(u_1\) and \(u_2\), in \(u\), we know, because they are in the null space, that \(Au_1=0\) and \(Au_2=0\). Therefore, \(A(u_1 + u_2) = Au_1 + Au_2 = 0\).
Point 3 holds because if \(Au=0\), then \(A(\alpha u) = \alpha(Au) = 0\).