113

From Mathsoc wiki

This page is far from complete! Can you help by adding anything to it (such as notes or links to useful sites)?
Course Name: 1111 Linear Algebra
Lecturer: Dr. Vladimir Dotsenko
Lecturer's Page: http://www.maths.tcd.ie/~vdots/
Course Page: http://www.maths.tcd.ie/~vdots/indexLinearAlgebra.html
Course Description: http://www.maths.tcd.ie/pub/official/Courses08-09/113.html
  • Notes on course 111 (abstract + linear algebra), 2006-2007 are here
  • A page with some worked examples of general Linear Algebra questions can be found here
  • Strang's Linear Algebra is a great book for revision or just looking at things from a different (slightly more applied) perspective.

Contents

Matrices

Inroduction

  • A finite set of linear equations in the variables x_{1}, x_{2}, \dots, x_{n} is defined as a System of Linear Equations. The general form for a System of Linear Equations is given by:
a_{11}x_{1}+a_{12}x_{2}+\dots+a_{1n}x_{n}=b_{1}
a_{21}x_{1}+a_{22}x_{2}+\dots+a_{2n}x_{n}=b_{2}
\vdots
a_{m1}x_{1}+a_{m2}x_{2}+\dots+a_{mn}x_{n}=b_{m}
  • A Solution Set of a System of Linear Equations is given by \left\lbrace s_{1}, s_{2}, s_{3}, \dots, s_{n} \right\rbrace where x_{1}=s_{1}, x_{2}=s_{2}, x_{3}=s_{3}, \dots, x_{n}=s_{n}.
  • Lemma: Given any System of Linear Equations, exactly one of the following is true:
  1. The system has no solutions,
  2. The system has exactly one solution set, or
  3. The system has infinitely many solution sets.
  • Given a System of Linear Equations, the corresponding Coefficient Matrix is given by:

A=  \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix}

  • Given a System of Linear Equations, the corresponding Augmented or Extended Matrix is given by:

A^{+} =  \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} & b_{1} \\ a_{21} & a_{22} & \dots & a_{2n} & b_{2} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} & b_{m} \end{pmatrix}

  • Given any Matrix, there are three operations that we may perform on the system or on the augmented matrix that will not alter the solution set of the matrix. They are called Elementary Row Operations and are:
  1. Multiplying a row by a non zero constant,
  2. Interchanging two rows, and
  3. Adding a multiple of one row to another.
  • In any non-zero row of a matrix, if we multiply the row by the inverse of the first non-zero term, we call the first 1 in the row the leading 1.
  • For any matrix to be in Row echelon form the following properties must hold:
  1. All non-zero rows must have a Leading 1,
  2. For any row, call it a, the leading 1 in the row beneath a must be further to the right than the Leading 1 in a, and
  3. All rows consisting entirely of zero elements are grouped at the bottom, and essentially can be ignored.
  • As well as the above properties, for any matrix to be in Reduced Row Echelon Form, for each column containing a Leading 1, the Leading 1 must be the only element in that column.
  • By bringing the augmented matrix of a system of linear equations to reduced row echelon form and by comparing it to the original system, we can see that the variables corresponding to the Leading 1's can be solved. This is called Gauss-Jordan Elimination.
  • By bringing the augmented matrix of a system of linear equations to row echelon form, we solve the system by solving the lowermost equations, and working upwards, substituting where applicable. This is called Gaussian Elimination with Back Substitution. As with Gauss-Jordan Elimination, the solution set may be in parametric form.
  • A Homogeneous System of Linear Equations is given by:
a_{11}x_{1}+a_{12}x_{2}+\dots+a_{1n}x_{n}=0
a_{21}x_{1}+a_{22}x_{2}+\dots+a_{2n}x_{n}=0
\vdots
a_{m1}x_{1}+a_{m2}x_{2}+\dots+a_{mn}x_{n}=0
  • The Trivial Solution is where every element of the solutions set is equal to 0, i.e.

s_{1}=0, s_{2}=0, s_{3}=0, \dots, s_{n}=0

  • Theorem: A Homogeneous System of Linear Equations with more variables than equations has infinitely many solutions.


Matrix Operations

  • Given any matrix [A]_{nn} we define the main diagonal to be the line containing the elements

a_{11}, a_{22}, \dots a_{nn}.

  • Given any two matrices [A]_{nm} and [B]_{nm}, (noting the same sizes), we define the sum of the matrices by adding the corresponding entries of the two matrices, i.e.:

\begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix} + \begin{pmatrix} b_{11} & b_{12} & \dots & b_{1n} \\ b_{21} & b_{22} & \dots & b_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ b_{m1} & b_{m2} & \dots & b_{mn} \end{pmatrix} = \begin{pmatrix} a_{11}+b_{11} & a_{12}+b_{12} & \dots & a_{1n}+b_{1n} \\ a_{21}+b_{21} & a_{22}+b_{22} & \dots & a_{2n}+b_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1}+b_{m1} & a_{m2}+b_{m2} & \dots & a_{mn}+b_{mn} \end{pmatrix}

  • Given any matrix [A]_{nm} a constant number c, the product cA is given by

cA=Ac= c\begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix} = \begin{pmatrix} ca_{11} & ca_{12} & \dots & ca_{1n} \\ ca_{21} & ca_{22} & \dots & ca_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ ca_{m1} & ca_{m2} & \dots & ca_{mn} \end{pmatrix}

  • Given any two matrices [A]_{mn} and [B]_{nr}, (noting the common dimension of n), we define the ij entry of the product AB as follows

[AB]_{ij}=a_{i1}b_{1j}+a_{i2}b_{2j}+\dots+a_{in}b_{nj}= \sum^{n}_{k=1}a_{ik}b_{kj}

  • The transpose of a matrix [A]_{nm} is denoted by [A^{T}]_{nm} and has the following property (note the reversal of i and j): ([A]_{ij})^T =[A]_{ji}
  • Lemma: For any invertible matrices A and B with n\in\mathbb{Z} where n\neq 0:
  1. (A^{T})^{T}=A
  2. (A+B)^{T}=(A^{T}+B^{T})
  3. (nA)^{T}=(nA^{T})
  • The trace of a matrix [A]_{nm} is defined as tr(A)=a_{11}+a_{22}+\dots+a_{nn}
  • Theorem: For all matrices and constants where defined, the following laws hold:
  1. Commutativity for addition,
  2. Associativity for addition and multiplication, and
  3. Distributive Laws, on both the left and the right sides.


Types of Matrices

  • Any matrix that consists entirely of zeroes is called the Zero Matrix and is denoted by 0_{nm}. It acts as an additive identity.
  • Any square matrix consisting of all elements on the main diagonal being 1 and zeroes elsewhere is called the Identity Matrix and is denoted by I_{n}. It acts as a multiplicative identity.
  • For any square matrix, denoted A, if there exists a matrix B such that AB=BA=I, where I is the Identity Matrix, then A is said to be invertible, and B is the inverse of A, more commonly denoted by A^{-1}.
  • Lemma: Inverses are unique.
  • Theorem: For any number of invertible matrices, the inverse of the product is equal to the product of the inverses in reverse order.
  • Lemma: For any square matrix A, the identity I and n, m\in\mathbb{Z}:
  1. A^{0}=I
  2. A^{n}=AA\dots A n times
  3. A^{-1}=(A^{-1})^{n} if A is invertible
  4. A^{n}A^{m}=A^{n+m}
  5. (A^{n})^{m}=A^{nm}
  • Lemma: For any invertible matrix A and n\in\mathbb{Z} where n\neq 0:
  1. (A^{-1})^{-1}=A
  2. (A^{n})^{-1}=(A^{-1})^{n}
  3. (nA)^{-1}=\frac{1}{n}(A^{-1})
  4. (A^{T})^{-1}=(A^{-1})^{T}
  • Lemma: For any number of matrices, the transpose of the product is equal to the product of the transposes in reverse order.
  • An elementary matrix is any square matrix from which the identity matrix may be derived from exactly one elementary row operation. An elementary matrix is usually denoted by E.

If E is an elementary matrix, and A is any square matrix, then EA is the matrix A under the elementary operation that is performed on E to make it elementary.

  • Theorem: The following statements are equivalent:
  1. A is an invertible matrix.
  2. Ax=0 only has the trivial solution for any column matrix x.
  3. The reduced row-echelon form of A is the identity matrix.
  4. A can be expressed as a product of elementary Matrices.
  5. Ax=b is consistent for every column matrix [b]^{n1}.
  6. Ax=b has exactly one solution for every column matrix [b]^{n1}.


Vectors

Introduction

  • Let u=(u_{1},u_{2},\dots,u_{n}), v=(v_{1},v_{2},\dots,v_{n}) \in\mathbb{R}^{n}. These vectors have starting points in the origin. The vector that connects the endpoint of u to the endpoint of v is given by \overrightarrow{uv}=( v_{1}-u_{1},v_{2}-u_{2},\dots,v_{n}-u_{n}). The vectors u and v are said to be equal iff u_{i}=v_{i} for all 1\leq i \leq n.
  • Addition and scalar multiplication are defined by u+v=( u_{1}+v_{1}, u_{2}+v_{2}, \dots,u_{n}+v_{n}) and cu=( cu_{1}, cu_{2}, \dots,cu_{n}) where c\in\mathbb{R}.

Also, note that for vectors and multiplicative scalars, the following properties hold:

  1. Commutativity for addition,
  2. Associativity for addition and multiplication,
  3. Additive and multiplicative identities exist, and
  4. Distributive Laws, on both the left and the right sides.
  • Given v=( v_{1},v_{2},\dots,v_{n}) \in\mathbb{R}^{n}, the Norm of the vector v is given by ||v|| = \sqrt{\sum^{n}_{i=1}(v_{i})^{2}}
  • Let u=(u_{1},u_{2},\dots,u_{n}), v=(v_{1},v_{2},\dots,v_{n}) \in\mathbb{R}^{n}. The distance between these vectors is given by ||\overrightarrow{uv}|| = ||\overrightarrow{vu}|| = \sqrt{\sum^{n}_{i=1}(u_{i}-v_{i})^{2}}
  • Lemma: Given v=( v_{1},v_{2},\dots,v_{n}) \in\mathbb{R}^n and k \in \mathbb{R}, ||ku||= |k|\cdot ||u||.
  • Given any two vectors in \mathbb{R}^2 or \mathbb{R}^3, denoted as u and v the dot product of the two vectors is a scalar defined as:
u \cdot v = ||u|| ||v|| \cos\theta if a\neq 0 and b\neq 0
0 if a=0 or b=0

Where \theta is the angle formed between the two vectors.

  • Theorem: Given any u=(u_{1},u_{2},\dots,u_{n}), v=(v_{1},v_{2},\dots,v_{n}) \in\mathbb{R}^n, u\cdot v=\sum^{n}_{i=1}u_{i}v_{i}
  • Theorem: Given any u=( u_{1},u_{2},\dots,u_{n}),v=( v_{1},v_{2},\dots,v_{n}), w=( w_{1},w_{2},\dots,w_{n}) \in\mathbb{R}^n, and k\in \mathbb{R}The following hold:
  1. u\cdot v = v\cdot u
  2. u\cdot (v+w)=u\cdot v + u\cdot w
  3. k(u\cdot v)=(ku)\cdot v=u\cdot (kv)
  4. v \neq 0 \Rightarrow v\cdot v > 0, and v=0 \Rightarrow v\cdot v = 0
  • If u and v are vectors in \mathbb{R}^n and v \neq 0, then the vector component of u parallel to v is given by \mathrm{proj}_{v}u=\frac{u\cdot v}{||v||^2} and the vector component of u orthogonal or perpendicular to v is given by u-\mathrm{proj}_{v}u= u -\frac{u\cdot v}{||v||^2}
  • Lemma: If u and v are vectors in \mathbb{R}^n and v \neq 0, then ||\mathrm{proj}_{v}u||=||u|| \cdot |\cos\theta|
  • Lemma: Given a line ax+by+c and a point (x_1,y_1), the straight line distance from the line to the point is given by d=\frac{|ax_1+by_1+c|}{\sqrt{a^2+b^2}}.


Vector Spaces

  • We define \mathbb{R}^{n} to be the set of all the column vectors (matrices) that are of the form:

v= \begin{pmatrix} v_{1} \\ v_{2} \\ \vdots \\ v_{n} \end{pmatrix} where v_{i}\in\mathbb{R}

  • Lemma: Given a vector space V, the additive identity vector, namely the zero vector, is unique.
  • For any vectors u,v\in\mathbb{R}^{n} and constant c\in\mathbb{R}, we have u+v\in\mathbb{R}^{n}, cv\in\mathbb{R}^{n}. Combining the above two makes what's called a linear combination:

if v_{1},v_{2},\dots,v_{k},\in\mathbb{R}^{n}, and c_{1},c_{2},\dots,c_{k}\in\mathbb{R}, then c_{1}v_{1}+c_{2}v_{2}+\dots+c_{k}v_{k}\in\mathbb{R}^{n}.

  • For any set of vectors, v_{1},v_{2},\dots,v_{k},\in\mathbb{R}^{n}, if there exists a set of constants, c_{1},c_{2},\dots,c_{k}\in\mathbb{R}, that are not entirely equal to 0, such that c_{1}v_{1}+c_{2}v_{2}+\dots+c_{k}v_{k} = 0, we say that the set of vectors are linearly dependent. If such a set of constants do not exist, then we say that the set of vectors is linearly independent.
  • For any set of vectors, M=\left\lbrace v_{1},v_{2},\dots,v_{k}\right\rbrace \in\mathbb{R}^{n}, we define the rank of the set to be the maximum number of linearly independent vectors in M, and we denote it by \mathrm{rk}(M).
  • Theorem: Elementary row operations on a matrix do not change the rank of that matrix.
  • A set of vectors M=\left\lbrace v_{1},v_{2},\dots,v_{k}\right\rbrace \subseteq\mathbb{R}^{n} is said to be complete if any vector u\in\mathbb{R}^{n} can be represented as a linear combination of that set, i.e. u=c_{1}v_{1}+c_{2}v_{2}+\dots+c_{k}v_{k} for a set of constants, c_{1},c_{2},\dots,c_{k}\in\mathbb{R}.
  • A set of vectors M=\left\lbrace v_{1},v_{2},\dots,v_{k}\right\rbrace \subseteq\mathbb{R}^{n} is called a basis if it is both linearly independent and complete.
  • Lemma: Given any basis in \mathbb{R}^{n} denoted by e=\left\lbrace e_{1},e_{2},\dots,e_{k} \right\rbrace and any vector u\in\mathbb{R}^{n}, there exists exactly one expression of v as a linear combination of the basis.
  • Lemma: Any set of more than n vectors in \mathbb{R}^{n} is linearly dependent.
  • Theorem: For a set of vectors M=\left\lbrace v_{1},v_{2},\dots,v_{k}\right\rbrace if there exists an l such that 0 \leq l \leq k and if \left\lbrace v_{l+1},v_{l+2},\dots,v_{k}\right\rbrace can be expressed as a linear combination of the linearly independent set M=\left\lbrace v_{1},v_{2},\dots,v_{l}\right\rbrace, then \mathrm{rk}(M)=l.
  • The Kronecker Capelli Theorem: Given a system of linear equations described by the coefficient matrix A and the augmented matrix A^+, the system of equations is consistent iff \mathrm{rk}(A) = \mathrm{rk}(A^+).
  • For a vector space V, a subset U of V is said to be a Vector Subspace of V if it is closed under addition and scalar multiplication.
  • The span of a set of vectors v_1, v_2, \dots ,v_n in \mathbb{R}^n is denoted \mathrm{span}(v_1, v_2, \dots ,v_n) and is defined as the smallest subspace of \mathbb{R}^n that contains all these vectors.
  • Given a vector space V, the dimension of V is denoted \dim (V) and is defined as the number of vectors that form a basis for V. If \dim (V) is finite, V is said to be finite dimensional. V is called infinite dimensional otherwise.
  • Lemma: Given a vector space V, \dim (v) is well defined, i.e., any bases of V have the same number of elements.
  • A set of vectors F is called a field over \mathbb{R} if addition and multiplication of vectors is defined on this set, and if it has the following properties:
  1. Addition and multiplication are both associative,
  2. Addition is also commutative,
  3. The left and right distributive laws hold over addition/multiplication,
  4. Additive and multiplicative identities exist, and
  5. Additive inverses exist.

Explicitly, the following properties must hold for all u,v,w\in F and c_1, c_2\in\mathbb{R} :

  1. u+v = v+u,
  2. u+(v+w) = (u+v)+w,
  3. There exists a vector 0 such that v+0 = 0,
  4. For all v\in V there exists a -v\in V such that v+(-v)=0,
  5. c_1(u+v)=c_1u+c_1v
  6. (c_1+c_2)v = c_1v+c_2v
  7. (c_1c_2)v = c_1(c_2v)
  8. There exists a vector 1 such that 1v = v


Lines and Points

  • Given any plane in \mathbb{R}^3 the normal is defined as a non-zero vector that is perpendicular to the plane.
  • Lemma: Given any plane in \mathbb{R}^3 if the normal is given by n= \left( a,b,c \right) and the normal intersects the plane at P_0 = \left( x_0, y_0, z_0 \right) then the general equation of the plane, called the point-normal form of the equation of the plane, is given by a(x-x_0)+b(y-y_0)+c(z-z_0)=0 which may also be written as ax+by+cz+d=0 where d is a constant.
  • Lemma: Given any plane in \mathbb{R}^3 if the normal is given by n= \left( a,b,c \right) and the normal intersects the plane at P_0 = \left( x_0, y_0, z_0 \right), the equation of the plane may be written in terms of the vectors from the origin to P_0, denoted by r_0 and to any point in that plane, denoted r. The vector form of the plane is thus given by n\cdot (r-r_0).
  • In \mathbb{R}^3, given the point P_0 = \left( x_0, y_0, z_0 \right) and the vector v= \left( a,b,c \right), the parametric equations of all lines passing through the point P_0 that are parallel to the vector v are given by:\\
x = x_0+ta
y = y_0+tb
z = z_0+tc

where t\in\mathbb{R}

  • The parametric vector equation of the line passing through P_0 = \left( x_0, y_0, z_0 \right) and parallel to v= \left( a,b,c \right), where r_0 is the vector given by r_0= \left( x_0,y_0,z_0 \right) and r is the vector to any point on the line, is given by r = r_0 + tv where t\in (-\infty, +\infty).
  • Lemma: Given a line ax+by+cz+d and a point (x_1,y_1, z_1), the straight line distance from the line to the point is given by: d=\frac{|ax_1+by_1+cz_1+d|}{\sqrt{a^2+b^2+c^2}}.


The Cross Product

  • Given any two vectors in \mathbb{R}^3 denoted a=(a_1, a_2, a_3) and b=(b_1, b_2, b_3) the cross product of a and b is a vector defined as

a\times b = \left(  a_2b_3 - a_3b_2, a_3b_1 - a_1b_3 , a_1b_2 - a_2b_1  \right)

  • Theorem: Given any three vectors in \mathbb{R}^3 denoted u, v and w, and scalar k \in \mathbb{R}:
  1. u\cdot (u\times v) = 0 if u\times v is orthogonal to u
  2. v\cdot (u\times v) = 0 if u\times v is orthogonal to v
  3. ||u\times v|| = ||u||^2 ||v||^2-(u\cdot v)^2, Lagrange's Identity
  4. u\times (v\times w) = (u\cdot w)v - (u\cdot v)w
  5. u\times (v\times w) = (u\cdot w)v - (v\cdot w)u
  6. u\times v = -v\times u
  7. u\times (v+w) = (u\times v)+(u\times w)
  8. (u+v)\times w = (u\times w)+(v\times v)
  9. k(u\times v) = (ku)\times v = u\times (kv)
  10. v\times v = v\times 0 = 0\times v = 0
  • The basis for \mathbb{R}^3 is given by the three vectors \textbf{i} = (1,0,0), \textbf{j} = (0,1,0) and \textbf{k} = (0, 0, 1) are called the unit vectors for \mathbb{R}^3. Note that the Cross Product may be written as:

a\times b =  \det \begin{pmatrix} \textbf{i} & \textbf{j} & \textbf{k}\\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ \end{pmatrix}

  • Lemma: Given two vectors in \mathbb{R}^3 denoted u and v, ||u\times v||=||u||\cdot||v||\sin\theta, where \theta is the angle between the vectors.
  • Theorem: Given three vectors u, v, w\in\mathbb{R}^3, the scalar triple product is defined as

u\cdot (v\times w) and is given by u\cdot(v\times w)= \det \begin{pmatrix} u_1 & u_2 & u_3\\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3  \end{pmatrix}

  • Lemma: Given two vectors u, v\in\mathbb{R}^2, then the area of the parallelogram determined by the two vectors is given by: \mathrm{Area} =  \left| \det \begin{pmatrix} u_1 & u_2 \\ v_1 & v_2  \end{pmatrix}  \right|
  • Lemma: Given three vectors u, v, w\in\mathbb{R}^3, the volume of the parallelepiped determined by there three vectors is given by:

\mathrm{Volume} = \left|  \begin{pmatrix} u_1 & u_2 & u_3\\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3  \end{pmatrix}  \right| = |u\cdot (v\times w) |


Linear Transformations

  • A Linear Transformation is a function that maps a set of linear equations from \mathbb{R}^n to \mathbb{R}^m. Often, this function can be expressed by multiplication of a column vector and a matrix. For example, let the function map f:\mathbb{R}^n\rightarrow\mathbb{R}^m, which can be written as y = Ax, where x\in\mathbb{R}^n and y\in\mathbb{R}^m. The matrix A is called the transition matrix between the domain ( \mathbb{R}^n ) and the codomain ( \mathbb{R}^m ). In this case, we say that the mapping f is multiplication by A.
  • A Linear Transformation T:\mathbb{R}^n\rightarrow\mathbb{R}^m is said to be one-to-one if T maps distinct points in \mathbb{R}^n to distinct points in \mathbb{R}^m .
  • If A_{n\times n} is a matrix and T:\mathbb{R}^n\rightarrow\mathbb{R}^n is multiplication by A, then the following statements are equivalent:
  1. A is an invertible matrix.
  2. Ax=0 only has the trivial solution for any column matrix x.
  3. The reduced row-echelon form of A is the identity matrix.
  4. A can be expressed as a product of elementary Matrices.
  5. Ax=b is consistent for every column matrix [b]^{n1}.
  6. Ax=b has exactly one solution for every column matrix [b]^{n1}.
  7. \det (A) \neq 0.
  8. The smallest subset of \mathbb{R}^m that contains all possible values of T is \mathbb{R}^n.
  9. T is one-to-one.
  • A transformation T:\mathbb{R}^n\rightarrow\mathbb{R}^m is called linear if the following properties hold for all u,v\in\mathbb{R}^n and scalar c\in\mathbb{R} T(u+v) = T(u)+T(v) and T(c\cdot v) = c\cdot T(v)
  • If T:\mathbb{R}^n\rightarrow\mathbb{R}^m is a linear transformation and e = \left \lbrace e_1,e_2,\dots ,e_n \right \rbrace are the standard basis vectors for \mathbb{R}^n, then the standard matrix form for T is T = \left[ T(e_1)|T(e_2)|\dots|T(e_n) \right]
  • Given a linear transformation A:V\rightarrow W, where V and W are vector spaces, the kernel of A is denoted \ker(A) and is defined as \ker(A) = \left\lbrace v\in V | Av = 0 \right\rbrace. The image of A is denoted \mathrm{im}(A) and is defined as \mathrm{im}(A) = \left\lbrace w\in W | w = Av\right\rbrace. The nullity of A is denoted \mathrm{nullity}(A), and is defined as the dimension of the kernel of V.
  • Lemma: Note, by their definitions, if A is one-to-one, then \ker (A) has only one element, and if A is onto, then \mathrm{im}(A) = W.
  • Lemma: Given a linear transformation A:V\rightarrow W , \ker(A) and \mathrm{im}(A) are subspaces of V and W respectively.
  • Lemma: For any linear transformation A:V\rightarrow W, \mathrm{rk}(A) = \dim(\mathrm{im}(A)).
  • Lemma: For any linear transformation A:V\rightarrow W, \mathrm{rk}(A) + \mathrm{nullity}(A) = \dim(W) .


Other Spaces

Introduction to Eigenspace

  • Let A be a square matrix. If x is non-zero a column vector such that Ax=\lambda x, then we say that x is an eigenvector of A, and \lambda is its corresponding eigenvalue.
  • Given equation Ax=\lambda x, we can rewrite this as (A-\lambda I)x = 0, where I is the identity matrix. The Characteristic Polynomial of A is denoted \chi_A(\lambda) and is defined as \chi_A(\lambda) = \det(A -\lambda I). As x is defined to be non-zero, there are non-trivial solutions, and thus \det(A -\lambda I) = 0.
  • Theorem: The roots of the characteristic polynomial are the eigenvalues of A.
  • A matrix A is said to be similar to a matrix B if there exists an invertible matrix C such that A=C^{-1}BC.
  • Lemma: If two matrices A and B are similar, then \chi_A(\lambda) = \chi_B(\lambda).
  • Theorem: A_{n\times n} is similar to a diagonal matrix iff it has n linearly independent eigenvectors.
  • Lemma: Eigenvectors of a matrix that correspond to different eigenvalues are linearly independent.
  • Corollary: If all eigenvalues of a matrix A are pairwise distinct, then A is similar to a diagonal matrix.
  • The Cayley-Hamilton Theorem: For any matrix A, \chi_A(A) = 0 .


Jordan Normal Form

  • A Jordan Block is a square matrix which is composed of an eigenvalue along the main diagonal, and 1's on the superdiagonal.

The Jordan Normal Form of a matrix A is a matrix of the form J =  \begin{pmatrix} J_1 & 0 & \dots & 0 \\ 0 & J_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & J_k \end{pmatrix}

where J_i are jordan blocks. If all the eigenvalues of A are not distinct, the repeated eigenvalues are grouped together in one Jordan Block. For example, if the eigenvalue \lambda_1 has t corresponding eigenvalues, then the Jordan Block for \lambda_1 will be of size t.

  • Proposition: Every matrix is similar to its jordan normal form. The transition matrix P such that A = P^{-1}JP is the matrix whose columns are the eigenvectors of A. One method of calculating these columns is to note that if the columns are p_1, p_2, \dots, p_s, then (A-\lambda I)p_1 = 0, and (A-\lambda I)p_i = p_{i-1}.
  • Lemma: Let A be a matrix, with distinct eigenvalues \lambda_1, \lambda_2, \dots, \lambda_n. Assume that there exists a polynomial f such that f(A) = 0 and f(t) = (\lambda_1-t)(\lambda_2-t)\cdots(\lambda_n-t). A is similar to a diagonal matrix with entries on the main diagonal equal to the roots of f.
  • A matrix U is said to be strictly upper triangular if it has 1's on the superdiagonal (the diagonal above the main diagonal) and 0's elsewhere. L, similarly, is strictly lower triangular if it has 1's on the subdiagonal (the diagonal below the main diagonal) and 0's elsewhere.
  • Lemma: Strictly triangular matrices shift when raised to integral powers. This is illustrated by the above example of U:

U^1 =  \begin{pmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}

U^2 = \begin{pmatrix} 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}

U^3 =  \begin{pmatrix} 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}

U^4 =  \begin{pmatrix} 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}

U^5 =  \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}

In other words, for the matrix U^k, the diagonal of 1's is k diagonals from the main diagonal. Consequently, for k \geq 5 in this example, U^k = 0.

  • Computing the powers of a jordan matrix is relatively straightforward. Considering the general Jordan Normal Form,

J =  \begin{pmatrix} J_1 & 0 & \dots & 0 \\ 0 & J_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & J_k \end{pmatrix}

As this matrix is diagonal, we can say that

J^t =  \begin{pmatrix} (J_1)^t & 0 & \dots & 0 \\ 0 & (J_2)^t & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & (J_k)^t \end{pmatrix}.

  • The above reduces the problem to evaluating the powers of a jordan block. A Jordan block J can be expressed as J = D+N where D is the diagonal matrix containing the eigenvalues of J and N is a strictly upper triangular matrix. By the binomial theorem,

J^t = (D+N)^t = D^k+{t\choose 1}D^{t-1}N + {t\choose 2}D^{t-2}N^2 + \dots + N^t = \sum^t_{i=0} {t\choose i}D^{t-i}N^i.

As D is a diagonal matrix, its powers are easily computed. The powers of N, as shown above, shift the diagonal of 1's. As D = \lambda I, the term D^{t-i}N^i is a matrix with {t\choose i}\lambda^{t-i} in the diagonal i away from the main diagonal. This can be shown in the following (god-awful) matrix:

J =  \begin{pmatrix} \lambda^t & {t\choose 1}\lambda^{t-1} & {t\choose 2}\lambda^{t-2} & {t\choose 3}\lambda^{t-3} & {t\choose 4}\lambda^{t-4} & \dots & 1 \\  0 & \lambda^t & {t\choose 1}\lambda^{t-1} & {t\choose 2}\lambda^{t-2} & {t\choose 3}\lambda^{t-3} & \dots & {t\choose t-1}\lambda \\  0 & 0 & \lambda^t & {t\choose 1}\lambda^{t-1} & {t\choose 2}\lambda^{t-2} & \dots & {t\choose t-2}\lambda^{2}\\  0 & 0 & 0 & \lambda^t & {t\choose 1}\lambda^{t-1} & \dots & {t\choose t-3}\lambda^{3} \\  0 & 0 & 0 & 0 & \lambda^t & \dots & {t\choose t-4}\lambda^{4} \\  \vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\  0 & 0 & 0 & 0 & 0 & 0 & \lambda^t  \end{pmatrix}

  • Given the power series of a function f, e.g. f(x) = e^x = \sum^\infty_{k=0} \frac{x^k}{k!}, we can determine the value of f(A) where A is any square matrix by using the above theorem about Jordan Normal Forms. For example, in the above equation, instead of having to work out e^A = \sum^\infty_{k=0} \frac{A^k}{k!}, we can replace A by P^{-1}JP, as A is similar to J, which reduced the problem to e^A = \sum^\infty_{k=0} \frac{(P^{-1}JP)^k}{k!} = P^{-1}\left\lbrace\sum^\infty_{k=0} \frac{J^k}{k!} \right \rbrace P, noting that (P^{-1}JP)^n = P^{-1}J^nP and that P and P^{-1} are independent of the sum.
  • For a vector space V and an operator A:V\rightarrow V, a subspace U of V is called invariant if A(U)\subset U.
  • Let V_1, V_2\subset V and V_1\cap V_2 = \emptyset. The Direct Sum of V_1 and V_2 is denoted V_1 \bigoplus V_2 and is defined as the set V_1 \bigoplus V_2 = \left \lbrace v_1+v_2 \quad: \forall\;v_1\in V_1, v_2\in V_2 \right \rbrace.
  • Lemma: Given the direct sum of two subspaces V_1 \bigoplus V_2, every v\in V_1 \bigoplus V_2 can be uniquely represented as v_1+v_2, for v_1\in V_1 and v_2\in V_2 .
  • Given a set of unknown functions x_1(t),x_2(t),\dots,x_n(t) that have the property
x_1^\prime(t) = a_{11}x_1(t)+a_{12}x_2(t)+\dots+a_{1n}x_n(t)
x_2^\prime(t) = a_{21}x_1(t)+a_{22}x_2(t)+\dots+a_{2n}x_n(t)
\vdots
x_n^\prime(t) = a_{n1}x_1(t)+a_{n2}x_2(t)+\dots+a_{nn}x_n(t)

Or more compactly, x^\prime(t) = A\cdot x(t), where A is the matrix of the coefficients a_{ij}. This system of equations is known as a system of differential equations.

  • Lemma: Given initial conditions, namely x(0), there exists a unique solution to this system, namely x(t) = e^{At}x(0).


Euclidean Space

  • A Euclidean Vector space is a vector space on which an inner product (as defined below) is defined.
  • Given a real vector space V, a mapping f:V\times V\rightarrow\mathbb{R} is called an inner product on V if the following properties hold for all u, v, w\in V and c\in\mathbb{R} (note that for u,v\in V, the inner product f(u,v) is simply denoted (u,v) or sometimes (u|v) or \left\langle u,v \right\rangle)
  1. Symmetry: (v, u) = (u, v),
  2. Distribution: (v + w, u) = (v, u) + (w, u),
  3. Homogeneity: (cv, u) = c(v, u), and
  4. Positivity: (v, v)\geq 0 and (v, v) = 0 iff v = 0
  • If an equation has, at most, a degree of 2, then it is called a quadratic form. In a vector space V, this can be thought of as a mapping from a vector to a real number. If this quadratic form is always non-negative, it is called a positive semi-definite quadratic form. If the quadratic form is equal to zero only when the vector is the zero vector, then it is called a positive definite quadratic form.
  • Let V be a Euclidean vector space. For any v\in V, the length of v is denoted |v| and is defined as |v| = \sqrt{(v, v)}, where (v, v) is an inner product on V. Sometimes, this value is denoted ||v||. Also, given a vector v, the vector \frac{v}{||v||} is the normalized form of v, which has a length of 1.
  • Lemma: Given two vectors u, v\in V where V is a Euclidean vector space, the angle formed between these two vectors is given by \theta = \arccos \left( \frac{(u, v)}{|u|\cdot |v|} \right)
  • Let V be an n-dimensional Euclidean vector space. A sequence of vectors e = \left \lbrace e_1, e_2, \dots, e_n \right \rbrace \subseteq V is an orthogonal basis of V if:
  1. the sequence e does not comprise entirely of the zero vector, and
  2. (e_i, e_j) = 0 for all i \neq j

If (e_i, e_i) = 1 for all 0\leq i\leq n in addition to the above properties, then e is an orthonormal basis of V

  • Lemma: Any orthogonal basis in a Euclidean vector space V is a basis for V.
  • Theorem: Any n-dimensional Euclidean vector space contains orthonormal bases.
  • The process of finding an orthogonal basis is called the Gram-Schmidt Orthogonalisation process. The final step is to normalise the orthogonal basis to form an orthonormal basis. Given a basis u = {u_1, u_2, \dots u_n} of an n-dimensional Euclidean vector space V, we form an orthogonal basis v = {v_1, v_2, \dots, v_n} by the following method.
  1. Let v_1 = u_1,
  2. Let v_r = u_r - \sum^{r-1}_{i=0} \frac{(u_{r}, v_i)}{(v_i, v_i)} u_i for all 2\leq r \leq n

This gives an orthogonal basis for V. To find an orthonormal basis, we normalise the orthogonal basis. Thus \left \lbrace \frac{v_1}{|v_1|}, \frac{v_2}{|v_2|}, \dots, \frac{v_n}{|v_n|} \right \rbrace is an orthonormal basis.


Linear and Bilinear Forms

  • A function f: V\rightarrow\mathbb{R}, where V is a vector space, is a linear form if:
  1. f(x+y) = f(x) + f(y) for all x,y\in V, and
  2. f(c\cdot x) = c\cdot f(x) for all x\in V
  • A function f: V\times V\rightarrow\mathbb{R}, where V is a vector space, is a bilinear form if:
  1. f(x_1+x_2, y) = f(x_1,y) + f(x_2,y) for all x_1, x_2,y \in V,
  2. f(x, y_1+y_2) = f(x,y_1) + f(x,y_2) for all x, y_1,y_2 \in V, and
  3. f(c\cdot x, k\cdot y) = ck\cdot f(x,y) for all x,y\in V, and c,k\in\mathbb{R}

In other words, the function A must be a linear function in both x and y.

  • Let x = ( x_1, x_2, \dots, x_n ) and y = ( y_1, y_2, \dots, y_n ) be two vectors in a vector space V. Given a basis e = \left \lbrace e_1, e_2, \dots, e_n \right \rbrace of V, and a bilinear mapping A:V\times V\rightarrow\mathbb{R}, we note the following:
A(x_1e_1+x_2e_2+\dots+x_ne_n, y_1e_1+y_2e_2+\dots+y_ne_n)=

A(x_1e_1, y_1e_1+y_2e_2+\dots+y_ne_n)+ A(x_2e_2,y_1e_1+y_2e_2+\dots+y_ne_n) + \dots + A(x_ne_2, y_1e_1+y_2e_2+\dots+y_ne_n)=

\begin{pmatrix} A(x_1e_1, y_1e_1)+ & A(x_1e_1, y_2e_2)+ & \dots+ & A(x_1e_1, y_ne_n)+ \\ A(x_2e_2, y_1e_1)+ & A(x_2e_2, y_2e_2)+ & \dots+ & A(x_2e_2, y_ne_n)+ \\ \vdots \\ A(x_ne_n, y_1e_1)+ & A(x_ne_n, y_2e_2)+ & \dots+ & A(x_ne_n, y_ne_n) \end{pmatrix}

\begin{pmatrix} x_1y_1A(e_1, e_1)+ & x_1y_2A(e_1, e_2)+ & \dots+ & x_1y_nA(e_1, e_n)+ \\ x_2y_1A(e_2, e_1)+ & x_2y_2A(e_2, e_2)+ & \dots+ & x_2y_nA(e_2, e_n)+ \\ \vdots \\ x_ny_1A(e_n, e_1)+ & x_ny_2A(e_n, e_2)+ & \dots+ & x_ny_nA(e_n, e_n) \end{pmatrix}

Thus, if we consider the matrix A, where a_{ij} = A(e_i, e_j), then we say that the matrix A is the matrix of the bilinear form A relative to the basis e. Thus, the bilinear form of is given by A(x,y) = x^TAy.

  • Lemma: For any symmetric bilinear form, there exists a basis of this form where the matrix is diagonal.
  • Given a bilinear form A on a vector space V, we denote the quadratic form of A by q_A(x) and define q_A: V\rightarrow \mathbb{R} as q_A(x) = A(x,x).
  • Theorem: Any symmetric bilinear form can be uniquely determined from its quadratic form, namely that A(x,y) = \frac{1}{2}[q_A(x+y)-q_A(x)-q_A(y)].
  • Theorem: For any quadratic form q_A(x) = \sum^n_{i=0} \sum^n_{j=0} a_{ij} x_i x_j, there exists a change of coordinates \left\lbrace y_1,y_2\dots, y_n \right\rbrace where
y_1 = c_{11}x_1 + c_{12}x_2 + \dots c_{1n}x_n
y_2 = c_{21}x_1 + c_{22}x_2 + \dots c_{2n}x_n
\vdots
y_n = c_{n1}x_1 + c_{n2}x_2 + \dots c_{nn}x_n

Such that q_A(y) = b_1y^2_1+b_2y^2_2+\dots+b_ny^2_n. This form is known as the signed sum of squares form.

  • Bessel's Inequality: Let \left\lbrace e_1,e_2,\dots,e_n \right\rbrace be an orthonormal set of vectors in a Euclidian vector space V. For all v\in V, (v,v)\geq (v,e_1)+(v,e_2)+\dots +(v,e_n).


  • Lemma: The set of continuous functions \left\lbrace \cos 0, \cos t, \sin t, \cos 2t, \sin 2t, \dots, \cos nt, \sin nt \right\rbrace is pairwise orthogonal with respect to the inner product (f(t), g(t)) = \int^\pi_{-\pi} f(t)g(t) dt.
  • Given a quadratic form in the signed sum of squares form, the signature of this form is given by (p,q) where p is the number of positive coefficients, and q the number of negative coefficients in the form. It is also referred to as inertia.
  • Sylvester's Law of Inertia: The signature of a quadratic form does not depend on the method employed to bring this form to the signed sum of squares form. Namely, the signature of a quadratic orm does not depend on the basis chosen.
  • Lemma: For any square symmetric matrix, these exists an orthonormal basis of eigenvectors.
  • Given a Euclidian vector space V, the orthogonal complement of a vector v\in V is given by v^\perp = \left\lbrace w\;:\;w\in V, (u,w)=0 \right\rbrace.
  • Theorem: Let V be a vector space and let A,B be two symmetric bilinear forms on V. If A is positive definite, then there exists a basis of V such that both forms have diagonal matrices.


Complex Spaces

  • A basic knowledge of complex numbers is required. For example, a complex number z = a+bi, the conjugate of z is given by \bar{z} = a-bi, where a,b\in\mathbb{R} and i^2=-1.
  • Let V be a complex vector space and let A:V\times V\rightarrow \mathbb{C}. A(x,y) is a sesquilinear form if, for all x,y\in V, and c\in\mathbb{C}.
  1. A(x_1+x_2, y) = A(x_1,y)+A(x_2,y),
  2. A(x, y_1+y_2) = A(x,y_1)+A(x,y_2),
  3. A(c\cdot x, y) = c\cdot A(x,y), and
  4. A(x, c\cdot y) = \bar{c}A(x,y),
  • A sesquilinear form A is Hermitan if, A(x,y) = \overline{A(y,x)}. In parallel with the definition for a bilinear form, we can express a Hermitan form given a basis \left\lbrace e_1, e_2, \dots, e_n \right\rbrace. In particular, if z and w are vectors in a complex Vector space, then A(z,w) = z^TAw where A is the matrix where the a_{ij} entry is A(e_i, e_j).
  • A complex vector space V is called a Hilbert space if it is equipped with a positive definite Hermitan form.
  • The transpose of the conjugate matrix of a matrix A is called the adjoint matrix of A. Not to be confused with the matrix of cofactors. The adjoint of a matrix is written A^*. If A = A^*, then A is a self adjoint matrix.
  • Lemma: For all matrices A and B with complex entries, (AB)^* = B^*A^*
  • If a matrix U has the property U^{-1} = U^*, then we say that U is unitary.
  • Given a matrix A, if A^*A=AA^*, then A is called a normal matrix.
  • Lemma: Any two commutative operators A and B on a complex vector space V have a common eigenvector.
  • Theorem: Any normal operator on a complex Hilbert space has an orthonormal basis of eigenvectors.
  • Theorem: Given a vector space V, on which a Hermitan operator A is defined, there exists an orthonormal basis of V consisting of eigenvectors, where the corrsponding eigenvalues are real.
  • Lemma: If Av=\lambda v and A^*v = \mu v, then \lambda = \overline{\mu}.
  • Theorem: If a matrix A with real coefficients satisfies the property that AA^T = A^TA, then A is diagonalisable over \mathbb{C}.