1.1.2 System of linear equations

A finite number of linear equations with a finite number of unknowns is called a system of linear equations or a linear system.

1.1.3 Types of solutions

A linear system that has no solution is called inconsistent. If a system has at least one solution, it is consistent.

Any linear system has either zero, one, or infinitely many solutions.

Theorem 1.12. Properties of matrix addition

Let A, B, and C be matrices of the same size \(m \times n\), and let \(k\) be a scalar. Then (see the quick R check after the list):

  1. A + B = B + A
  2. (A + B) + C = A + (B + C)
  3. A + A + … + A (\(k\) times) = \(k\)A
  4. There is an \(m \times n\) matrix called the zero matrix 0 which has the property A + 0 = A
  5. There is a matrix denoted -A such that A+(-A) = A - A = 0
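
A quick numerical check of these properties in R, with matrices chosen arbitrarily for illustration:

```r
# Arbitrary 2 x 3 matrices to check Theorem 1.12
A <- matrix(1:6,  nrow = 2, ncol = 3)
B <- matrix(7:12, nrow = 2, ncol = 3)
Z <- matrix(0,    nrow = 2, ncol = 3)   # the zero matrix

isTRUE(all.equal(A + B, B + A))      # property 1: A + B = B + A
isTRUE(all.equal(A + A + A, 3 * A))  # property 3 with k = 3
isTRUE(all.equal(A + Z, A))          # property 4: adding the zero matrix changes nothing
isTRUE(all.equal(A + (-A), Z))       # property 5: A + (-A) = 0
```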

Theorem 1.13. Properties of scalar multiplication

Let A and B both be \(m \times n\) matrices, and let \(c\) and \(k\) be scalars.

  1. \((ck)\)A = \(c(k\)A\()\)
  2. \(k(\)A + B\()\) = \(k\)A + \(k\)B
  3. \((c+k)\)A = \(c\)A + \(k\)A

Proposition 1.16. Properties of matrix multiplication

(Why a proposition and not a theorem?)

Let A, B and C be matrices of appropriate sizes so that the following arithmetic operations can be carried out (a quick R check follows the list).

  1. (AB)C = A(BC)
  2. A(B+C) = AB + AC
  3. (B+C)A = BA + CA
  4. A\(\times\)0 = 0\(\times\)A = 0
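
A quick numerical check in R, with conformable matrices chosen arbitrarily (D is an extra matrix introduced here just so the associativity check has a third factor):

```r
A <- matrix(1:6,   nrow = 2, ncol = 3)
B <- matrix(1:12,  nrow = 3, ncol = 4)
C <- matrix(13:24, nrow = 3, ncol = 4)
D <- matrix(1:8,   nrow = 4, ncol = 2)

isTRUE(all.equal((A %*% B) %*% D, A %*% (B %*% D)))   # property 1: associativity
isTRUE(all.equal(A %*% (B + C), A %*% B + A %*% C))   # property 2: left distributivity
isTRUE(all.equal((B + C) %*% D, B %*% D + C %*% D))   # property 3: right distributivity
```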

Theorem 1.19. Properties of matrix transpose

Let A and B be matrices of appropriate sizes so that the operations below can be carried out (checked in R after the list).

  1. (A\(^T\))\(^T\) = A
  2. (\(k\)A)\(^T\) = \(k\)A\(^T\)
  3. (A + B)\(^T\) = A\(^T\) + B\(^T\)
  4. (AB)\(^T\) = B\(^T\)A\(^T\)
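
Checked numerically in R with arbitrary matrices; note the reversed order in property 4:

```r
A <- matrix(1:6,  nrow = 2, ncol = 3)
B <- matrix(1:12, nrow = 3, ncol = 4)

isTRUE(all.equal(t(t(A)), A))                  # property 1: (A^T)^T = A
isTRUE(all.equal(t(2 * A), 2 * t(A)))          # property 2 with k = 2
isTRUE(all.equal(t(A %*% B), t(B) %*% t(A)))   # property 4: the order reverses
```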

Definition 1.20. Identity Matrix

AI = A for any matrix A

Definition 1.21

An identity matrix is a square matrix, denoted by I, whose entries are all zeroes except for those on the main diagonal (top left to bottom right), which are all ones.

Proposition 1.22

Let I be the identity matrix. Then I\(^T\) = I.

Definition 1.23

A square matrix A is said to be invertible or non-singular if there is a matrix B of the same size such that AB = BA = I

Matrix B is known as the multiplicative inverse of A and is denoted by A\(^{-1}\).

Definition 1.24

A square matrix A is said to be non-invertible or singular if there is no matrix B such that AB = I.

Proposition 1.25 Matrix inverse is unique

The inverse of an invertible matrix is unique.

Proposition 1.26 Inverse of matrix inverse is original matrix

If A is an invertible matrix then A\(^{-1}\) is invertible and (A\(^{-1}\))\(^{-1}\) = A.

Proposition 1.27. Inverse of product of invertible matrices

Let A and B be invertible matrices. Then AB is invertible and (AB)\(^{-1}\) = B\(^{-1}\)A\(^{-1}\)

Proposition 1.28. Scalar and product of matrix inverse

Let A be an invertible matrix and \(k\) be a non-zero scalar. Then (\(k\)A)\(^{-1}\) = \(\frac{1}{k}\)A\(^{-1}\).

Proposition 1.29.

Let A be an invertible matrix. Then the transpose of the matrix, A\(^T\) is also invertible and (A\(^T\))\(^{-1}\) = (A\(^{-1}\))\(^T\).

Proposition 1.30.

If a linear system is described by the augmented matrix (A|b) and it is row equivalent to (R|b’) then both linear systems have the same solution set. They are equivalent systems.

1.7.3 Homogeneous linear systems

A homogeneous linear system is defined as Ax=0 where A is an \(m \times n\) matrix and x is a vector.

Proposition 1.31

Let a homogeneous linear system Ax = 0, whose augmented matrix is (A|0), be row equivalent to (R|0), where R is an equivalent matrix in reduced row echelon form. Let there be n unknowns and r non-zero rows in R. If r<n (the number of non-zero equations or rows is less than the number of unknowns) then the linear system (A|0) has an infinite number of solutions.

Short version: if the reduced matrix has fewer non-zero rows than unknowns, the homogeneous system has infinitely many solutions.
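
A rough R check of Proposition 1.31. Base R has no rref function, but the rank reported by qr() equals the number of non-zero rows in the reduced row echelon form, so it can stand in for r (coefficients chosen arbitrarily):

```r
# 2 equations in 3 unknowns
A <- matrix(c(1, 2, 3,
              2, 4, 7), nrow = 2, byrow = TRUE)

n <- ncol(A)      # number of unknowns
r <- qr(A)$rank   # rank of A = number of non-zero rows in its rref

r < n             # TRUE, so Ax = 0 has infinitely many solutions
```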

1.7.4 Non-homogeneous linear systems

A non-homogeneous system is defined as Ax=b where b\(\ne\)0.

Proposition 1.32

Let a consistent, non-homogeneous linear system, Ax=b, where b\(\ne\)0, be row equivalent to the augmented matrix (R|b’), where R is in reduced row echelon form and there are n unknowns and r non-zero rows in R. If r = n, the system has a unique solution; if r < n, it has an infinite number of solutions.

1.8.1 Elementary matrices

An elementary matrix is one that can be obtained by a single row operation on the identity matrix I. It therefore differs from I only in the row(s) affected by that one operation.

Definition 1.33

A matrix B is row equivalent to matrix A if and only if there is a finite number of elementary matrices E\(_1\), E\(_2\), …, E\(_n\) such that B = E\(_n\)E\(_{n-1}\)…E\(_1\)A.

If matrix A is row equivalent to matrix B, then matrix B is row equivalent to matrix A.

Proposition 1.34

An elementary matrix is invertible (non-singular) and its inverse is also an elementary matrix.

Theorem 1.35

Let A be an n by n matrix. The following four statements are equivalent:

  1. The matrix A is invertible
  2. The linear system Ax=0 has only the trivial solution 0.
  3. The reduced row echelon form of A is the identity matrix.
  4. A is a product of elementary matrices

(This gets expanded later on with “columns are linearly independent” and “Ax=b has a single solution”.)

1.8.3 Determining the inverse matrix

A\(^{-1}\)(A|I) = (A\(^{-1}\)A | A\(^{-1}\)I) = (I|A\(^{-1}\))

This means that we can start with the augmented matrix (A|I) and use row operations, if possible, until we arrive at (I|A\(^{-1}\)).

(Or just use R’s solve function on A!)
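
For example, with an arbitrary invertible matrix:

```r
A <- matrix(c(2, 1,
              5, 3), nrow = 2, byrow = TRUE)

A_inv <- solve(A)                         # inverse of A (errors if A is singular)
A_inv
isTRUE(all.equal(A %*% A_inv, diag(2)))   # A %*% A^{-1} = I
isTRUE(all.equal(A_inv %*% A, diag(2)))   # A^{-1} %*% A = I
```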

1.8.4 Solving linear systems

If we have a general linear system Ax=b and A is invertible, then x=A\(^{-1}\)b.
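
In R, solve(A, b) solves the system directly, which is usually preferable to forming A\(^{-1}\) and multiplying (example values chosen arbitrarily):

```r
A <- matrix(c(2, 1,
              5, 3), nrow = 2, byrow = TRUE)
b <- c(4, 7)

x <- solve(A, b)                                  # solves Ax = b directly
x
isTRUE(all.equal(as.vector(solve(A) %*% b), x))   # same answer as x = A^{-1} b
```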

Proposition 1.37

The linear system Ax=b has a unique solution \(\Leftrightarrow\) A is invertible.

Theorem 1.38

Let A be an n by n matrix. Then the following five statements are equivalent:

  1. The matrix A is invertible
  2. The linear system Ax=0 has only the trivial solution 0.
  3. The reduced row echelon form of A is the identity matrix.
  4. A is a product of elementary matrices
  5. Ax=b has a unique solution.

Proposition 1.39

Let A be a square matrix and R be the reduced row echelon form of A. Then R has at least one zero row \(\Leftrightarrow\) A is non-invertible.

Proposition 2.1

Let u, v and w be vectors in \(\mathbb{R}^n\) and k, c be real numbers (scalars).

  1. u+v = v + u Commutative law for vector addition
  2. (u + v) + w = u + (v + w) Associative law for vector addition
  3. There exists a zero vector 0 such that u + 0 = u Neutral element
  4. For every vector u there is a vector -u such that u + (-u) = 0 Additive inverse
  5. k(u+v) = ku + kv Distributive law for scalar vector multiplication
  6. (k+c)u = ku + cu Distributive law for scalar vector multiplication
  7. (kc)u = k(cu) Associative law for scalar multiplication
  8. For every vector u we have u = 1u Identity (book says neutral element?)

Proposition 2.2

Let u be a vector in \(\mathbb{R}^n\). Then the vector -u which satisfies property (iv) in Proposition 2.1 is unique: u + (-u) = 0

I find it difficult to understand why something like the above needs a proposition, and a proof. It’s like saying, for any real number, there is only one number that is its negative.

Proposition 2.3

Let u be a vector in \(\mathbb{R}^n\). Then (-1)u = -u.

2.1.2 Orthogonal vectors

This section discusses orthogonality. Want to capture that.

These two statements are equivalent:

  1. The dot product of two vectors u and v is 0.
  2. Two vectors u and v are orthogonal.
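
A quick R check of this equivalence, with a pair of vectors chosen so their dot product is zero:

```r
u <- c(1, 2, 0)
v <- c(-2, 1, 5)      # chosen so that u . v = (1)(-2) + (2)(1) + (0)(5) = 0

sum(u * v)            # dot product: 0, so u and v are orthogonal
drop(crossprod(u, v)) # same dot product computed as t(u) %*% v
```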

Proposition 2.6

Let u, v and w be vectors in \(\mathbb{R}^n\) and k be a real scalar.

  1. (u + v) \(\cdot\) w = u \(\cdot\) w + v \(\cdot\) w Distributive law for the dot product
  2. u \(\cdot\) v = v \(\cdot\) u Commutative law of the dot product
  3. (ku) \(\cdot\) v = k (u \(\cdot\) v) = u \(\cdot\) kv
  4. u \(\cdot\) u \(\ge 0\), u \(\cdot\) u=0 \(\Leftrightarrow\) u = 0

2.1.4 Norm or length of a vector

Let u be a vector in \(\mathbb{R}^n\). The length or norm of u is denoted by ||u||. It is:

\(||\overrightarrow{u}|| = \sqrt{\overrightarrow{u} \cdot \overrightarrow{u}}\)

Also known as the Euclidean norm.

Definition 2.9

Distance function \(d\) between two vectors:

\(d(\overrightarrow{u}, \overrightarrow{v}) = ||\overrightarrow{u} - \overrightarrow{v}||\)
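
Both quantities are easy to compute in R (vectors chosen arbitrarily):

```r
u <- c(3, 4)
v <- c(0, 1)

sqrt(sum(u * u))               # ||u|| = 5
sqrt(sum((u - v)^2))           # d(u, v) = ||u - v||
as.numeric(dist(rbind(u, v)))  # same distance via the built-in dist()
```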

Proposition 2.10

Let u be a vector in \(\mathbb{R}^n\) and k be a real scalar. Then:

  1. ||u|| \(\ge 0\) and ||u|| = 0 \(\Leftrightarrow\) u = 0
  2. ||ku|| = |k| ||u||

Definition 2.11

u \(\cdot\) v = ||u|| ||v|| \(\cos{\theta}\)

Definition 2.12

\[\cos{\theta} = \frac{\overrightarrow{u} \cdot \overrightarrow{v}}{||\overrightarrow{u}||\text{ }||\overrightarrow{v}||}\]

Definition 2.13

Definition 2.12 can be extended to any two non-zero vectors in \(\mathbb{R}^n\), where \(0 \le \theta \le \pi\) radians.

Definition 2.14

Cauchy-Schwarz inequality:

\[|\overrightarrow{u} \cdot \overrightarrow{v}| \le ||\overrightarrow{u}|| \text{ } ||\overrightarrow{v}||\] We know that \(|\overrightarrow{u}\cdot\overrightarrow{v}|=0\) whenever u and v are orthogonal, or when one or both are zero vectors. But \(||\overrightarrow{u}||\text{ }||\overrightarrow{v}||\) can be zero only if u or v is a zero vector.

In simple terms, the Cauchy-Schwarz inequality means the absolute value of the dot product of two vectors is less than or equal to the product of their norms/lengths.
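
A small R sanity check of the inequality, and of the angle formula in Definition 2.12, with arbitrary vectors:

```r
u <- c(1, 2, 2)
v <- c(3, 0, 4)

dot    <- sum(u * v)          # 11
norm_u <- sqrt(sum(u * u))    # 3
norm_v <- sqrt(sum(v * v))    # 5

abs(dot) <= norm_u * norm_v       # TRUE: |u . v| <= ||u|| ||v||
acos(dot / (norm_u * norm_v))     # angle between u and v in radians (Definition 2.12)
```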

Definition 2.15

Minkowski (triangular) inequality

Let u and v be vectors in \(\mathbb{R}^n\).

\[||\overrightarrow{u} + \overrightarrow{v}|| \le ||\overrightarrow{u}|| + ||\overrightarrow{v}||\]

What it means: The norm of the sum of two vectors equals the sum of the norms only when one vector is a non-negative scalar multiple of the other (they point in the same direction). Otherwise, the norm of the sum is strictly less than the sum of the norms.

2.2.4 Unit vectors

A vector of length 1 is a unit vector. It can be derived from any non-zero vector u by scaling u by the reciprocal of its norm, since

\[1 = || \frac{1}{||\overrightarrow{u}||} \overrightarrow{u} ||\]

If \(\overrightarrow{u}\) is (3, 4) then \(||\overrightarrow{u}||=5\), so the unit vector \(\hat{u}\) for \(\overrightarrow{u}\) is:

\[\hat{u} = \frac{1}{||\overrightarrow{u}||} \overrightarrow{u} = \frac{1}{5}\overrightarrow{u} = (\frac{3}{5}, \frac{4}{5})\]
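
The same computation in R:

```r
u <- c(3, 4)
u_hat <- u / sqrt(sum(u * u))   # scale u by 1 / ||u||

u_hat                 # 0.6 0.8, i.e. (3/5, 4/5)
sqrt(sum(u_hat^2))    # 1, confirming u_hat is a unit vector
```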

Proposition 2.17

Let \(\overrightarrow{u} = (x_1, x_2...x_n)^T\) be any vector in \(\mathbb{R}^n\), then

\[\overrightarrow{u} = x_1 \overrightarrow{e}_1 + x_2 \overrightarrow{e}_2 + ... + x_n \overrightarrow{e}_n\]

where the \(x_i\) are scalars (the components of \(\overrightarrow{u}\)) and the \(\overrightarrow{e}_i\) are the standard unit vectors.

Proposition 2.18

Let \(\overrightarrow{u}\) be any vector in \(\mathbb{R}^n\), then the linear combination

\[\overrightarrow{u} = x_1\overrightarrow{e}_1 + x_2\overrightarrow{e}_2 +...+ x_n\overrightarrow{e}_n\]

is unique.

Definition 2.19

We say vectors \(\overrightarrow{v}_1, \overrightarrow{v}_2,...\overrightarrow{v}_n\) are linearly independent \(\Leftrightarrow\) the only real scalars \(k_1, k_2,...k_n\) which satisfy

\[k_1\overrightarrow{v}_1+k_2\overrightarrow{v}_2+...+k_n\overrightarrow{v}_n=\overrightarrow{0}\]

are

\[k_1=k_2=...=k_n=0\]

Definition 2.20

Conversely, the vectors \(\overrightarrow{v}_1, \overrightarrow{v}_2,...\overrightarrow{v}_n\) are linearly dependent \(\Leftrightarrow\) there are scalars \(k_1, k_2,...k_n\), not all zero, which satisfy

\[k_1\overrightarrow{v}_1+k_2\overrightarrow{v}_2+...+k_n\overrightarrow{v}_n=\overrightarrow{0}\]

Proposition 2.21

Let \(\overrightarrow{v}_1, \overrightarrow{v}_2,...\overrightarrow{v}_n\) be any vectors in \(\mathbb{R}^n\). If at least one of these vectors is the zero vector, then the set of vectors is linearly dependent.

Proposition 2.22

Let \(\overrightarrow{v}_1, \overrightarrow{v}_2,...\overrightarrow{v}_m\) be \(m\) distinct vectors in \(\mathbb{R}^n\). If \(n < m\), that is, the number of unknowns is less than the number of vectors, then the set of vectors is linearly dependent.

(Note that this is true whether any of the vectors is a zero vector or not.)

If the set is linearly dependent, the homogeneous system \(k_1\overrightarrow{v}_1+...+k_m\overrightarrow{v}_m=\overrightarrow{0}\) represented by the set has an infinite number of solutions.

Proposition 2.23

Let \(S = \{\overrightarrow{v}_1, \overrightarrow{v}_2,...\overrightarrow{v}_n\}\) be \(n\) vectors in \(\mathbb{R}^n\). Let \(A\) be the \(n\) by \(n\) matrix whose columns are given by the vectors in \(S\):

\[\mathbf{A} = (\overrightarrow{v}_1\text{ }\overrightarrow{v}_2\text{ }...\text{ }\overrightarrow{v}_n)\]

Then the vectors in \(S\) are linearly independent \(\Leftrightarrow\) matrix A is invertible.
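
An R sketch of this check, using three arbitrary vectors as the columns of A:

```r
v1 <- c(1, 0, 1)
v2 <- c(0, 1, 1)
v3 <- c(2, 1, 0)
A  <- cbind(v1, v2, v3)    # matrix whose columns are the vectors in S

det(A) != 0                # TRUE: A is invertible, so v1, v2, v3 are linearly independent
qr(A)$rank == ncol(A)      # equivalent check via the rank
```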

Theorem 2.24

Let A be any \(n\) by \(n\) matrix. The following six statements are equivalent (if any is true, all are true):

  1. A is an invertible matrix
  2. The rref form of A is the identity matrix
  3. The columns of A are linearly independent
  4. A is the product of elementary matrices
  5. Ax=0 has only the trivial solution
  6. Ax=b has a unique solution

Definition 2.25

Consider the \(n\) vectors in the set \(S = \{\overrightarrow{v}_1, \overrightarrow{v}_2,...\overrightarrow{v}_n\}\) in the n-space \(\mathbb{R}^n\). If every vector in \(\mathbb{R}^n\) can be produced by a linear combination of these vectors, then we say that these vectors span or generate the n-space \(\mathbb{R}^n\).

Definition 2.26

Consider the n vectors \(\overrightarrow{v}_1, \overrightarrow{v}_2,...\overrightarrow{v}_n\) in the n space \(\mathbb{R}^n\). These vectors form a basis for \(\mathbb{R}^n \Leftrightarrow\)

  1. \(\overrightarrow{v}_1, \overrightarrow{v}_2,...\overrightarrow{v}_n\) span \(\mathbb{R}^n\)
  2. \(\overrightarrow{v}_1, \overrightarrow{v}_2,...\overrightarrow{v}_n\) are linearly independent

The standard unit vectors in \(\mathbb{R}^n\) form the natural or standard basis for \(\mathbb{R}^n\).

Proposition 2.27

Any set of \(n\) linearly independent vectors in \(\mathbb{R}^n\) forms a basis for \(\mathbb{R}^n\).

Proposition 2.28

Any \(n\) vectors which span \(\mathbb{R}^n\) form a basis for \(\mathbb{R}^n\).

Proposition 2.29

Let the vectors \(\{\overrightarrow{v}_1, \overrightarrow{v}_2,...\overrightarrow{v}_n\}\) be a basis for \(\mathbb{R}^n\). Every vector in \(\mathbb{R}^n\) can be written as a linear combination of the vectors in this basis.
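
In R, the coefficients of that linear combination can be found with solve(), since the basis matrix is invertible (basis and target vector chosen arbitrarily, reusing the vectors from the previous sketch):

```r
# Basis vectors as the columns of A
v1 <- c(1, 0, 1)
v2 <- c(0, 1, 1)
v3 <- c(2, 1, 0)
A  <- cbind(v1, v2, v3)
x  <- c(3, 2, 1)                # an arbitrary vector in R^3

k <- solve(A, x)                # coefficients with k[1]*v1 + k[2]*v2 + k[3]*v3 = x
k
isTRUE(all.equal(as.vector(A %*% k), x))   # reconstructs x from the basis
```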

Lemma 2.30

Let \(S=\{\overrightarrow{v}_1, \overrightarrow{v}_2,...\overrightarrow{v}_m\}\) be a set of \(m\) vectors that are linearly independent in \(\mathbb{R}^n\). Then \(m \le n\).

What this means: There can be at most \(n\) independent vectors in \(\mathbb{R}^n\). If there are more, the set is not a basis (minimum spanning set), and therefore the set is linearly dependent.

Further, if \(m < n\), the vectors may (or may not) be linearly independent, but they do not span \(\mathbb{R}^n\); they are not a basis.

Proposition 2.31

Every basis of \(\mathbb{R}^n\) contains exactly \(n\) vectors.

Proposition 2.32

Any \(n\) non-zero orthogonal vectors in \(\mathbb{R}^n\) form a basis for \(\mathbb{R}^n\).

3.1.1 Vector space

Let \(V\) be a non-empty set of elements called vectors. We define two operations on the set: vector addition and scalar-vector multiplication. Scalars are real numbers.

Let \(\overrightarrow{u}\), \(\overrightarrow{v}\), and \(\overrightarrow{w}\) be vectors in \(V\), and let \(k\) and \(c\) be real scalars. The set \(V\) is called a vector space if it satisfies the following ten axioms.

  1. Closure under vector addition: \(\overrightarrow{u} + \overrightarrow{v}\) is in \(V\).
  2. Commutative law for vector addition: \(\overrightarrow{u} + \overrightarrow{v} = \overrightarrow{v} + \overrightarrow{u}\)
  3. Associative law for vector addition: \((\overrightarrow{u} + \overrightarrow{v}) + \overrightarrow{w} = \overrightarrow{u} + (\overrightarrow{v} + \overrightarrow{w})\)
  4. Neutral element: \(\overrightarrow{u} + \overrightarrow{0} = \overrightarrow{u}\) for every vector in \(V\)
  5. Additive inverse: \(\overrightarrow{u} + -(\overrightarrow{u}) = \overrightarrow{0}\)
  6. Closure under scalar-vector multiplication: \(k\overrightarrow{u}\) is in \(V\)
  7. Associative law for scalar multiplication: \(k(c\overrightarrow{u}) = (kc)\overrightarrow{u}\)
  8. Distributive law for vectors: \(k(\overrightarrow{u} + \overrightarrow{v}) = k\overrightarrow{u} + k\overrightarrow{v}\)
  9. Distributive law for scalars: \((k+c)\overrightarrow{u} = k\overrightarrow{u} + c\overrightarrow{u}\)
  10. Identity element: \(\overrightarrow{u} = 1\overrightarrow{u}\)

Proposition 3.1

Let \(V\) be a vector space and \(k\) be a real scalar. Then we have:

  1. \(k\overrightarrow{0} = \overrightarrow{0}\)
  2. \(0\overrightarrow{u} = \overrightarrow{0}\)
  3. \((-1)\overrightarrow{u} = -\overrightarrow{u}\)
  4. If \(k\overrightarrow{u}=\overrightarrow{0}\), then \(k=0\), \(\overrightarrow{u}=\overrightarrow{0}\), or both

The above seem so obvious and tautological to me that I don’t see any point in stating them.

Proposition 3.2

Let \(V\) be a vector space. The zero vector \(\overrightarrow{0}\) of \(V\) is unique.

Again…I don’t see the point.

Proposition 3.3

Let \(V\) be a vector space and \(\overrightarrow{u}\) be a vector in \(V\). The vector \(-\overrightarrow{u}\) which satisfies axiom 5 is unique:

\[\overrightarrow{u}+(-\overrightarrow{u}) = \overrightarrow{0}\]

Definition 3.4

A non-empty subset \(S\) of a vector space \(V\) is called a subspace of \(V\) if it is also a vector space with respect to the same vector addition and scalar-vector multiplication as \(V\).

The singleton set containing only the zero vector, \(\{\overrightarrow{0}\}\), is an example of a subspace.

Proposition 3.5

Let \(S\) be a non-empty subset of vector space \(V\). Then \(S\) is a subspace of \(V \Leftrightarrow\):

  1. if \(\overrightarrow{u}\) and \(\overrightarrow{v}\) are vectors in the set \(S\) then \(\overrightarrow{u} + \overrightarrow{v}\) is also in \(S\)
  2. if \(\overrightarrow{u}\) is a vector in \(S\) then for every real scalar, \(k\overrightarrow{u}\) is also in \(S\).

Definition 3.6

Let \(\overrightarrow{v}_1, \overrightarrow{v}_2,...\overrightarrow{v}_n\) be vectors in a vector space. If a vector \(\overrightarrow{x}\) can be expressed as

\[\overrightarrow{x} = k_1\overrightarrow{v}_1 + k_2\overrightarrow{v}_2 + ... + k_n\overrightarrow{v}_n\] where the \(k_i\) are scalars,

then we say x is a linear combination of the vectors \(\overrightarrow{v}_1, \overrightarrow{v}_2,...\overrightarrow{v}_n\)

Proposition 3.7

A non-empty subset \(S\) of a vector space \(V\) is a subspace of \(V \Leftrightarrow\) for any vectors \(\overrightarrow{u}\) and \(\overrightarrow{v}\) in \(S\) and any scalars \(k\) and \(c\), the linear combination \(k\overrightarrow{u} + c\overrightarrow{v}\) is also in \(S\).

Definition 3.8

If every vector in \(V\) can be produced by a linear combination of vectors \(\overrightarrow{v}_1, \overrightarrow{v}_2,...\overrightarrow{v}_n\), then these vectors span or generate the vector space \(V\).

Proposition 3.9

Let \(S\) be a non-empty subset of a vector space \(V\). The set \(\text{span}\{S\}\) is a subspace of the vector space \(V\).