111

From Mathsoc wiki

Ma111 Algebra

Lecturer: Dr. Sinead Ryan

Website: Link


Contents

Set Theory

This was just a grounding with terminology in order to prepare for Group theory. Naive Set Theory was thus chosen over Axiomatic Set Theory.

Basic set theory

  • Set: A set is a collection of objects.
  • Subset: a set A is a subset of a set S if every element of A is an element of S
  • Notation: Union, intersection, difference, Cartesian product.


Relations, ordering

  • Relations: Have a non-empty set A, and non-empty subset R of A × A. If (a,b) is in R, then a is related to b, written a~b.

There are 4 properties of relations.

  • 1)If a in A, then a~a (reflexive)
  • 2)If a~b, then b~a (symmetric)
  • 2') If a~b, and b~a, then a=b (anti-symmetric)
  • 3)If a~b, and b~c, then a~c (transitive)
  • Partial Ordering: A relation that is reflexive, anti-symmetric and transitive is a partial ordering, and is written a~b as "a is less than or equal to b".
  • Linear Ordering: A linear ordering is a partial ordering such that if a, b are in A, then either a is less than or equal to b, or b is less than or equal to b.
  • Hausdorff Maximal Principle: In any partially ordered set, every linearly ordered subset is contained in a maximal linearly ordered subset (i.e. if the subset was made any bigger, it would cease to be linearly ordered). This is equivalent to Zorn's Lemma, and so to the Axiom of Choice.
  • Equivalence Relation: A relation that is reflexive, symmetric and transitive is an equivalence relation.
  • Equivalence Class: The equivalence class of an element a is defined by cl(a) = \{x \in A \mid a \sim x \}

Mappings

  • Mapping: A mapping is a relationship or rule that assigns to each element of one set, a uniquely determined element of another set.

Types of Mapping:

  • Injective (one-to-one): A mapping is one to one when distinct elements of the first set are mapped to distinct elements of the second.
  • Surjective (onto): A mapping is onto when for every element of the second set, there is at least one element of the first set that maps to it.
  • Bijective: A mapping is a bijection when it is both one-to-one and onto; every element of the first set is mapped to every element of the second set, and these mappings are distinct.
  • Composition of mappings: Mappings can be composed. Various relationships were stated or proved in class.


Integers, LIP, Euclidean Algorithm, gcd, prime stuff, congruence

  • ... -2,-1,0,1,2....
  • LIP (Least Integer Principle): A non-empty set of positive integers contains a least integer, which is unique.
  • gcd (Greatest Common Divisor): A positive integer c is the greatest common divisor of a and b if: 1) c|a and c|b, and 2) any divisor of a and b is a divisor of c. Note here the notation c|a signifying "c divides a". Also, if gcd(a,b) exists it can be written in the form c = ma+ nb, for some integers m,n.
  • Euclidean Division Algorithm
  • Relatively prime: a,b relatively prime if gcd(a, b) = 1. In other words, 1 is the only number that divides both a and b.
  • Prime number: One that is divided only by +/-1 and +/- itself.
  • Prime stuff: Sirloin.
  • Unique Factorisation Theorem: Every integer greater than two can be factored as a product of primes in a unique way.
  • Congruence Modulo: Two integers are said to be "congruent modulo n" ("congruent mod n") if they both have the same remainder when divided by n. For example, 5 is congruent to 9 mod 2, as they both have remainder 1 when divided by 2.

Group Theory

Basic group properties

Definition: A non-empty set of elements G forms a group if on it there is defined a binary operation * such that the following four properties are satisfied:

  1. If a,b are in G, then a*b is in G (closure)
  2. If a,b,c are in G, then a*(b*c)=(a*b)*c (associativity)
  3. There is an element e in G such that a*e = e*a = a (identity)
  4. For every a in G, there is an a^{-1} such that a*a^{-1} = a^{-1}*a = e (inverse)

More group properties

  • Abelian: A group is Abelian if its binary operation is commutative. Another way,a*b=b*a \forall a,b  \epsilon G
  • Finite Group: A group with a finite number of elements is called a finite group. A finite group may be expressed using a [Cayley Table]. A Cayley Table is a table of all possible combinations of elements in a group under the defined operation.
  • Order: The order of a group or subgroup is the number of elements it contains, denoted o(G).
  • Some Notation: For a in G, write a^0=e, a^1=a, a^2=a*a and so on. Note that that the use of power notation does not mean the group operation is multiplication; in a group under addition a^2=a+a
  • Cyclic: A cyclic group,G, is one such that if it has order n, G = \lbrace e,a,..,a^{n-1} \rbrace for some a an element of G. In other words G is generated by some element of G
  • Order of an element: The order of an element of a group, a, is the order of the cyclic subgroup it generates, denoted o(a).
Examples Of Groups
  • Z_n: The integers mod n form a group under addition mod n. E.g. Z_3 = \lbrace [0],[1],[2] \rbrace
  • S_n: The symmetry group S_n of permutations of a set of n objects forms a group under composition of permutations.

E.g. S_3 = \lbrace (1) (1,2) (1,3) (2,3) (1,2,3) (1,3,2) \rbrace

Subgroups, Cosets, Lagrange's Theorem

  • Subgroup: A non-empty subset H of a group G is a subgroup if, under the binary operation in G, H is itself a group (i.e., it is closed, contains an identity element, and all elements in it have inverses. Associativity is generally assumed.)
  • The centre of a group is an abelian subgroup in that the centre of a group G is the set Z(G) of all elements in G which commute with all the elements of G.
  • Normal Subgroup: A subgroup N of a group G is a normal subgroup if for every g \in G and n \in N, gng^{-1} \in N.
  • Coset: If G is a group, H a subgroup of G, and g \in G, then a left coset of H in G is defined by gH = \lbrace gh : h \in H \rbrace. Intuitively, a right coset is of the form Hg.
  • Quotient Group: If N a normal subgroup of G, then its cosets form a group under multiplication of cosets. This is called a quotient group, denoted G|N.
  • Lagrange's Theorem: G a finite group, H a subgroup, then the order of H divides the order of G.

Permutations, transpositions, cycles

  • Permutation: A permutation is a bijective mapping from a set to itself (Laytalk: Rearrangement). These can be written as transpositions, n-cycles, and can be either odd or even.
  • Determining whether it is odd or even appears to be confusing as different people write permutations from left to right, or from right to left, (Sinead Ryan reads permutations from left to right, whereas Durbin reads from right to left). The nicest method I've found is in Anton's Elementary Linear Algebra,which involves inversions.
  • An inversion is when a larger integer precedes a smaller one in a permutation. Reading from left to right, you count the number of inversions for each integer and add these up - if the sum is even, the permuation is even. If the sum is odd, the permuation is odd.
  • e.g. (4132) would be even as it has 3 + 0 + 1 = 4 inversions required to put it into the standard form of (1234).

Isomorphisms and Homomorphisms

  • Isomorphism: An isomorphism from a group G to a group \bar{G} is a map f:G \to  \bar{G} which has the following properties:
  1. f is a bijection, and
  2. f\lbrace x*y\rbrace = f\lbrace x \rbrace*f\lbrace y\rbrace, where the first * denotes the binary operation in G, and the second * denotes the binary operation in \bar{G}.

Thus, the mapping preserves the structure of the groups, and they are said to be isomorphic. Essentially, for algebraic purposes, there is no real difference between them.

  • Homomorphism: A mapping f from a group G to a group \bar{G} is a homomorphism if for all x,y \inG,

f\lbrace x*y\rbrace = f\lbrace x \rbrace*f\lbrace y\rbrace,where the first * denotes the binary operation in G, and the second * denotes the binary operation in \bar{G}.

  • Kernel: If f a homomorphism from G to \bar{G}, then the kernel of f is K_f = \lbrace x \in G : f\lbrace x\rbrace = \bar{e} \rbrace. In other words, the kernel contains all elements of the first group that are mapped to the identity element of the second group.
  • The Fundamental Homomorphism Theorem

The Fundamental Homomorphism Theorem features a diagram in the shape of a triangle.

FundHomDiag.png

Attempts in class to explain how one might possibly actually use this were futile and went on for ages, with lots of pointing at a similar diagram. In the diagram shown, K is the Kernel of f. This means the picture actually depicts the First Isomorphism Theorem. The Fundamental Homomorphism Theorem is technically stated for G|N, N a normal subgroup with N a subset of the Kernel.

Rings, fields

  • Rings: Are circular. Newton has some. Sauron lost his. Johnny Cash had a burning one.


A ring is a set R equipped with two binary operations called addition and multiplication, such that:

  • (R, +) is an abelian group with the identity element 0:
    • (a + b) + c = a + (b + c)
    • 0 + a = a + 0 = a
    • a + b = b + a
    • For every a in R, there exists an element denoted −a, such that a + −a = −a + a = 0
  • (R, ·) is a monoid with identity element 1:
    • (a·bc = a·(b·c)
    • a = a·1 = a
  • Multiplication distributive law|distributes over addition:
    • a·(b + c) = (a·b) + (a·c)
    • (a + bc = (a·c) + (b·c)


  • Fields: Contain cows.

A field is a commutative division ring.

Vector Spaces

Take a vector space... define another basis... Take a vector space.... define another basis...

I'm gonna take you to another dimension!

Properties, subspaces

  • Vector Space: A vector space V is a set of objects on which 2 operations are defined: 1) vector addition and 2) scalar multiplication, and which satisfy the following axioms for u,v,w  \in V and k,m scalars from the field F over which the vector space is defined:


1.u,v \in V \Rightarrow (u+v) \in V,

2.u+v=v+u,

3.(u+v)+w=u+(v+w),

4.0 \in V,  0+v=v+0=v,

5.\forall  u \in V,  \exists  -u  \in  V : u+(-u)=(-u)+u=0

6.k \cdot u  \in V,

7.k \cdot ( u+ v )= k \cdot u + k \cdot v,

8.(k+m) \cdot u=k \cdot u+m \cdot u,

9.k(m)u=(k \cdot m)(u),

10.1 \cdot u=u

  • Subspace: A subset W of a vector space V is called a subspace of V if W is itself a vector space under the addition and scalar multiplication defined on V. In practice, to check if a subset is a subspace, you just need to check it is closed under addition of vectors and multiplication by scalars.

Linear combinations, span, linear independence, basis, dimension

  • Linear Combination: A vector w \in V is a linear combination of vectors \lbrace v_1, v_2 ... v_n \rbrace \in V if there exist scalars c_1, c_2... c_n such that

w = c_1v_1 + c_2v_2 + ... c_nv_n

  • Span: If S is a set of vectors \lbrace v_1, v_2 ... v_n \rbrace \in V, then the span of S is the set of all linear combinations of the vectors \lbrace v_1, v_2 ... v_n \rbrace
  • Linear Independence: Consider S=\lbrace v_1, v_2 ... v_n \rbraceand the equation w = c_1v_1 + c_2v_2 + ... c_nv_n = 0.

If the only solution to this equation is:

c_1 = c_2 = ... = c_n =0, then the vectors v_i are said to be linearly independent.

  • Basis: A set of vectors from a vector space form a basis for that vector space if 1) they span the vector space, and 2) they are linearly independent.
  • Dimension: V a vector space, S a basis. If S contains a finite number of elements, then V is finite dimensional and its dimension is the the number of vectors in S. Otherwise, V is infinite dimensional.
  • Lots of Interesting Theorems: See Chapter 5, General Vector Spaces, of Elementary Linear Algebra, by Anton. I might type some of them up eventually.

Changing bases, transition matrices

  • Coordinate Vector: S= \lbrace v_1,v_2 ... v_n \rbrace a basis for vector space V, and a \in V. Then a=c_1v_1 + c_2v_2 + ... + c_nv_n and the scalars c_i are the coordinates of a relative to S. We write the coordinate vector ( a )_S  = ( c_1, c_2,...c_n ).
  • Transition Matrix: V a finite dimensional vector space and B = \lbrace v_1,v_2 ... v_n \rbrace, C = \lbrace w_1,w_2 ... w_n \rbrace both bases for V.

Then the transition matrix from C to B is:

P = [(w_1)_B,(w_2)_B,... (w_n)_B]

where the i^{th} column is the coordinates of w_i relative to B. Also, P is invertible and P^{-1} is the transition matrix from B to C.


Inner Product Spaces

  • Inner Product Space: A vector space V over a field F is an inner product space if for u,w,y \epsilon V there is defined an operation known as an "inner product" denoted \langle u,w \rangle or (u,w), an element of F, such that: 1)\langle u,w \rangle = \langle \bar{w},u \rangle, 2) \langle u,u \rangle \ge 0 with equality iff u=0, 3)\langle au+bw,y \rangle =a \langle u,y \rangle + b \langle w,y \rangle.

As an example, the Euclidean Inner Product or dot product.

  • Norm: The norm of u \epsilon V, an inner product space, is denoted ||u|| and equals \sqrt{ \langle u,u \rangle }. It is analogous to the length of a vector.
  • Properties of Norm: ||u|| \ge 0; ||u||=0 iff u=0; ||au||=a||u||, a \in F; ||u+v|| \le ||u|| + ||v||
  • Distance: The distance between u,w \in V is given by d(u,w)=||u-w|| = \sqrt \langle (u-w),(u-w) \rangle
  • Cauchy-Schwarz Inequality: If u,v \in V, then |\langle u,v \rangle | \le ||u||||v||
  • Orthogonality: u,v \in V, then u orthogonal to v if \langle u,v \rangle = \langle \bar{w},u \rangle =0.
  • Orthogonal complement: W a subspace of V, the orthogonal complement of W is W^\perp = \lbrace x \in V : \langle x,w \rangle =0 \forall w \in W \rbrace. In other words, the orthogonal complement of a subspace consists of all vectors orthogonal to every element in the subspace.
  • Orthonormal set: A set of vectors is an orthonormal set if each has norm=1, and if each is orthogonal to all the others. In addition, this means the vectors are linearly independent. Also, any finite dimensional inner product space has an orthonormal set as a basis.

Cross product

As demonstrated by MP Hammer.

Linear transformations, induced matrix

  • Linear Transformation: A transformation (mapping) from one vector space X to another Y is linear if for all x,y \in X, and \alpha a scalar, 1)f(x+y) = f(x) + f(y), and 2) f(\alpha x) = \alpha f(x)... in other words, a map is linear iff it preserves linear combinations.
  • Induced matrix: Every linear map is uniquely specified by its effect on any basis. Consider \lbrace x_1, x_2...x_n \rbrace, basis for vector space X, and \lbrace y_1, y_2...y_n \rbrace, basis for vector space Y. Then for any linear map f:X \to Y, there exist unique scalars a_{ij} 1 \le i \le m, 1 \le j \le n such that f(x_j) = \sum a_{ij}y_{i}. The a_{ij} form the induced matrix of the linear map f with respect to the two bases.

Linear Algebra

Matrix multiplication

See Leaving Certificate.

Systems of linear eqns, augmented matrix, EROs, Gauss and Gauss-Jordan Elimination

  • Linear Equation: A linear equations is an equation in which the unknowns appear to the first power only, and only as linear combinations. A system of linear equations is a collection of 2 or more linear equations. Essentially, these are just simultaneous equations, except now we will solve them using matrices.
  • Augmented Matrix: Consider n linear equations, m unknowns:

a_{11} + a_{12} + ...+a_{1m} = b_1

a_{21} + a_{22} + ... +a_{2m} = b_2

...

a_{n1} + a_{n2} + ... +a_{nm} = b_n

Then one can form the augmented matrix A_g: \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1m} & b_1   \\ a_{21} & a_{22} & \cdots & a_{2m} & b_2   \\ \vdots  & & & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} & b_n    \end{pmatrix}


  • Solving Linear Systems: Having formed the augmented matrix, one can then solve the system of linear equations by performing elementary row operations (EROs) on the matrix. These are: 1) scaling a row by a non-zero constant, 2) Swapping one row with another, and 3)Adding c times one row to another row. These EROs do not change the solution and are invertible.

The purpose of the EROs is to transform the matrix into either row-echelon form (Gaussian elimination) or reduced row-echelon form (Gauss-Jordan Elimination).

  • In row-echelon form 1) any row of all zeros is at the bottom, 2)if a row is non-zero the first entry is a 1 (known as a leading 1) and 3), in any 2 non-zero rows, the leading 1 of the lower row is to the right of the leading 1 of the upper row.
  • In reduced row-echelon form the above 3 properties hold but with the additional stipulation that if a column contains a leading 1, all other entries in the column are zero.
  • Gaussian Elimination: In Gaussian Elimination, you reduce the matrix to row-echelon form, then use the coefficients in the matrix to solve for the x_i by back-substitution.
  • Gauss-Jordan Elimination: In Gauss-Jordan Elimination you reduce the matrix to reduced row-echelon form, then use the leading ones to read off the values of the x_i.

Diagonal, triangular, symmetric matrices

  • Diagonal matrix: A diagonal matrix is one with non-zero entries only on the main diagonal.
    • e.g. of Diagonal Mx:\begin{pmatrix} 1 & 0 & 0\\ 0 & 7 & 0\\ 0 & 0 & -5\end{pmatrix}
  • Triangular matrix: An upper triangular matrix is one with non-zero entries only on the main diagonal and above it. A lower triangular matrix is one with non-zero entries only on the main diagonal and below.
    • e.g of Lower\triangle Mx.:\begin{pmatrix} 1 & 0 & 0\\ 2 & 7 & 0\\ 5 & 0 & -5\end{pmatrix}
  • Symmetric Matrix: A Symmetric matrix is a square matrix, A, that is equal to its transpose.
  • Skew-symmetric matrix: A Skew-Symmetric matrix is a square matrix A, that is equal to the negative of its inverse.

Finding inverse by EROs

Consider an invertible nxn matrix A. Then it is row equivalent to the identity matrix, and so can be transformed into the identity matrix by a series of elementary row operations. This is the same thing as multiplying the matrix by a series of elementary matrices. Hence, Ek....E2E1A = 1. Now we can multiply across by A-1 to get Ek....E2E11 = A-1. Hence, performing the same elementary row operations on the identity matrix transforms it into A-1. In practice form a matrix [ A | 1 ] and perform EROs on it until A has been transformed into the identity, then you will have performed the following transformation [ 1 | A-1 ].


Diagonal, triangular, symmetric matrices

  • Diagonal matrix: A diagonal matrix is one with non-zero entries only on the main diagonal.
  • Triangular matrix: An upper triangular matrix is one with non-zero entries only on the main diagonal and above it. A lower triangular matrix is one with non-zero entries only on the main diagonal and below.
  • Symmetric Matrix: A Symmetric matrix is a square matrix, A, that is equal to its transpose.
  • Skew-symmetric matrix: A Skew-Symmetric matrix is a square matrix A, that is equal to the negative of its inverse.


LU Decomposition

The idea is to decompose a given matrix A into a product of two matrices, one lower triangular and one upper triangular, so that A = LU. The relevant background theory is that if a matrix can be reduced to row echelon form U by elementary row operations (excluding row interchanges - this is important), then you can write U as a product of elementary matrices: Ek....E2E1A = U. Now these Ei are invertible, and all lower triangular. So we can multiply across by the inverses to get: A = E1-1E2-1....Ek-1U

As the inverse of a lower triangular matrix is a lower triangular matrix, and the product of lower triangular matrices is a lower triangular matrix, we get: A = LU.

In practice, you reduce A to row echelon form U by elementary row operations (but no row interchanges), and then can form the matrix L by inspection. The diagonal entries of L are one divided by the multiplying factor that introduced the corresponding one on the diagonal of U, and the non-diagonal entries are the negative of the multiplying factor used to introduce the corresponding zero in U.

For Ax = b, we have LUx = b. Let Ux = y, solve Ly = b for y, and then solve Ux = y for x.


Row space, column space, null space

  • Row/Column Vectors: For an mxn matrix A, the vectors formed from its rows are the row vectors of A. The vectors formed from the its columns are the column vectors of A.
  • Row space: The subspace of Rn spanned by the row vectors of A is the row space of A, R(A).
  • Column space: The subspace of Rm spanned by the column vectors of A is the column space of A, C(A).
  • Nullspace: The subspace of Rn spanned by the solution vectors x of Ax=0 is the nullspace of A.

Useful Facts About These Subspaces

  • Elementary row operations do not change the nullspace or rowspace of a matrix. They can, however, changed the column space.
  • If A and B are row equivalent matrices, then a set of column vectors that are linearly independent in A iff the same set of column vectors in B are linearly independent. Similarly, a set column vectors of A form a basis for the column space iff the same vectors in B do.
  • If a matrix R is in row-echelon form, the row vectors with the leading ones form a basis for the row space of R, and the column vectors that contain these leading ones form a basis for the column space of R.
  • The row space and column space of a matrix A have the same dimension. This is known as the rank of A, rank(A).
  • The dimension of the nullspace is called the nullity of A, denoted nullity(A).
  • If a matrix A has n columns, then rank(A) + nullity(A) = n.
  • Fundamental System of Linear Systems: Consider m equations, n unknowns, written Ax=b, with augmented matrix Ag. Then Ax=b has a solution iff rank(A)=rank(Ag). It has a unique solution iff rank(A)=rank(Ag)=n. It has infinite solutions iff rank(A)=rank(Ag)<n.
  • The nullspace of A and the rowspace of A are orthogonal complements in Rn with respect to the Euclidean Inner Product. This also implies the nullspace of AT and the column space of A are orthogonal complements in Rm.


Determinants, Cramer’s Rule

  • Determinant: A determinant is a function that maps every n x n matrix to a real number. It is defined as the sum of the signed elementary products of the matrix. (Not a very helpful definition really).
  • Cramer's Rule: Cramer's rule states that for a linear system Ax=b,

x_i = \frac{det(A_i)}{det(A)}, where A_i is the matrix resulting from replacing the i^{th} column of A with the column vector b

Eigenvalues, eigenvectors

  • Eigenvector: An eigenvector is a vector whose direction is unchanged when multiplied by a matrix. Thus for a vector x and matrix A, Ax = \lambda x with \lambda a scalar.

Now, rewriting this gives (A - \lambda I)x = 0, where I is the identity matrix. We want this equation to have non-trivial solution, i.e. for x to not be zero. In other words, we want the matrix (A - \lambda I) to not be invertible, or that det(A - \lambda I) = 0 - this is known as the characteristic equation of A. Solving it gives the values of \lambda for which Ax = \lambda x, these are known as the eigenvalues of A. Subbing the eigenvalues back into (A - \lambda I)x = 0 allows you to determine the corresponding eigenvector for each eigenvalue.

nxn matrices

  • For an nxn matrix, the following are equivalent:
  • A is invertible
  • Ax=0 has only the trivial solution, x=0
  • The reduced row-echelon form of A is the identity
  • A is expressible as a product of elementary matrices
  • Ax=b is consistent for all nx1 matrices b, and has exactly one solution for each b
  • det(A) is non-zero
  • The column vectors of A are linearly independent and span Rn; so they form a basis for Rn
  • The same is true for the row vectors of A
  • The rank of A is n, and the nullity is zero
  • The orthogonal complement of the nullspace is Rn, that of the rowspace is {0}
  • ATA is invertible
  • 0 is not an eigenvalue of A

Note: the phrase "are equivalent" here means that if one of these statements holds for a particular nxn matrix A, then so do all the others... it does not mean they hold for all nxn matrices...

Diagonalising matrices, orthogonal diagonalisation, Gram-Schmidt Process

  • Diagonalisation: An nxn matrix A that has n linearly independent eigenvectors is diagonalisable; that is, there exists an nxn matrix P such that P^{-1}AP=D, D a diagonal matrix. We say that P diagonalises A.

In practice, one forms the matrix P (called the modal matrix) consisting of the eigenvectors of A, and then D is a diagonal matrix with the eigenvectors of A on the diagonal.

  • Orthogonal Matrix: An orthogonal matrix is one such that A^T=A^{-1}. Orthogonal matrices preserve norms and dot products; ||Ax||=||x||, Au \cdot Aw = u \cdot w. They are also used to diagonalise symmetric matrices.
  • Orthogonal Diagonalisation: If A is a symmetric (A=A^T) matrix, then A can be orthogonally diagonalised. That is, there exists a matrix P such that P^TAP=D.

In practice, one determines the eigenvalues and eigenvectors of A, then uses Gram-Schmidt Orthogonalisation to orthogonalise the eigenvectors (note that distinct eigenvectors corresponding to distinct eigenvalues of a symmetric matrix are already orthogonal). The next set is to form an orthonormal set by dividing each newly-orthogonal vector by its norm. Then one forms the matrix P with these orthonormal vectors as columns. This P is such that P^TAP=D, and D has the eigenvalues of A on the diagonal.

  • Gram-Schmidt Orthogonalisation: This is a process that orthogonalises vectors by subtracting off the projection of all previous vectors onto each vector in question. Given a set of vectors \lbrace v_1,v_2...v_n \rbrace one can construct an orthogonal set \lbrace w_1,w_2... w_n \rbrace using the following procedure:


w_1=v_1

w_2=v_2 - \frac{\langle w_1,v_2 \rangle}{||w_1||^2} w_1

w_3=v_3 - \frac{\langle w_2,v_3 \rangle}{||w_2||^2} w_2 - \frac{\langle w_1,v_3 \rangle}{||w_1||^2} w_1

and so on.

QR Decomposition

  • QR Decomposition: If A is an nxm matrix with linearly independent column vectors, then A can be factored as A = QR, with Q an mxn matrix with orthonormal column vectors, and R an nxn invertible upper triangular matrix. This can be shown by construction; Sinead Ryan made some mistakes doing it in class so learn it out of a book so that you get the correct version.

The basic idea is to use the Gram-Schmidt process to orthogonalise the column vectors of A and so to obtain an orthonormal set of qi, with which you form the matrix Q. Then you can form a matrix R, whose entries are obtained from a series of dot products of ui and qi. I won't attempt to notate it here in depth. Go read Anton.

A= [q_1 q_2 ... q_n]  \begin{bmatrix}   \langle u_1,q_1 \rangle & \langle u_2,q_1 \rangle & \cdots & \langle u_n,q_1 \rangle    \\   0 & \langle u_2,q_2 \rangle & \cdots   \\    \vdots &  & \ddots & \\    0      & \cdots & \cdots &  \langle u_n,q_n \rangle \end{bmatrix}


Least squares

Figure: Orthogonal projection, by MS Paint
Enlarge
Figure: Orthogonal projection, by MS Paint

The least squares problem involves a linear system Ax=b that is not consistent... a solution x cannot be found that accurately fits with A and b. Instead a "best fit" x can be found. Essentially, one can think of it as follows: Let W denote the column space of A. Then as the vector x ranges over all R^n, the vector Ax varies over all of W (as Ax is a linear combination of the columns of A). We wish to find a vector x such that Ax is as close as possible to b. Rather intuitively, the vector Ax that is closest to b is the orthogonal projection of b on W.

Thus, we have Ax=projWb. Now, note that b - projWb = b - Ax is orthogonal to the column space W. This means that b - Ax is in the nullspace of AT.

Hence, A^T (b - Ax) =0

Or, A^T Ax = A^Tb.

This is known as the normal system associated with Ax = b. A solution x of this system is known as a least squares solution of Ax = b. If A has linearly independent columns, then the normal system has a unique solution.

Band and block matrices

  • Band Matrix: A Band matrix is a matrix A with elements that are non-zero only on the main diagonal & adjacent diagionals.
  • Block Matrix: A Block matrix A element of R(mxn) is a qxr block matrix if can be written as a qxr collection of blocks.


Jordan form

  • Jordan Form: An nxn matrix A that does not have n linearly indepedent eigenvectors cannot be diagonalised. However, it can be transformed into an almost diagonal form, known as Jordan form, or canonical form, J. This matrix J has the eigenvalues of A on the main diagonal, and some ones on the superdiagonal (just above the main diagonal).


Matrix Norms

  • Matrix Norm: a mapping f:R^{mxn} \to R is a norm of an mxn matrix A if for R^{m} and R^{n} there is a vector norm, and 1)f(A) \ge 0,

2) f(A+B) \le f(A) + f(B), and 3)f(cA) = |c|f(A). The norm is denotedf(A) = ||A||

  • Frobenius norm: ||A||_F = \sqrt {\sum_{i=1}^m \sum_{j=1}^n |a_{ij}|^2}.

In words, the square root of the sum of the squares of each value.

  • p-norms: ||A||_p = max \frac{||Ax||_p}{||x||_p}, x \ne 0, p \ge 1.The p-norm for vectors is ||x||_p = (|x_1|^p + |x_2|^p + .... )^{ \frac {1}{p}}

For matrices, it is common to use ||A||_1 which works out as just the maximum absolute column sum (i.e. for a column, add up the absolute value of each entry. Do this for all columns and take the maximum), or ||A||_{\infty}, which is the maximum absolute row sum.

  • Condition number: The condition number of a matrix A is given by \kappa(A) = ||A||||A^{-1}|| for a given norm. If the condition number is large compared to the size of the entries in A, then the matrix is said to be badly conditioned, and a small change in the value of one of the entries can cause large changes in the solution set of the matrix.

Cayley-Hamilton Theorem And Applications

  • Cayley-Hamilton Theorem: The Cayley-Hamilton Theorem states that every square matrix A satisfies its own characteristic polynomial.


Applications

  • Reducing Order of a Polynomial:

Given a polynomial P(S), you can reduce its order using the C-H Theorem. Denote the characteristic polynomial of nxn matrix A by \Delta(S).

Then P(S) = Q(S) \Delta (S) + R(S), where Q(S) is the quotient formed from dividing \Delta(S) into P(S) by polynomial division, and R(S) is the remainder from the same operation.

Now, write this for the matrix A. P(A) = Q(A)\Delta(A) + R(A).

But as A satisfies its own characteristic polynomial, \Delta(A) = 0. Hence, P(A) = R(A). Note that R(S) is always of degree n-1 or less.

  • Determining Analytic Functions of a Matrix: Consider a scalar function f(s) analytic in some area of the complex plane. Then in that region we can express the function as a polynomial:

f(s)= \sum_{k=0}^{\infty} p_ks^k

Consider now the nxn matrix A with characteristic polynomial \Delta (S) and eigenvalues \lambda_i i=1,2...n

Then we can write f(S) = Q(S) \Delta (S) + R(S)

And so f( \lambda_i ) = R( \lambda_i ) since \Delta( \lambda_i ) = 0

And as R(S) is of degree n-1 or less, this means f( \lambda_i )= \sum_{k=0}^{n-1} \alpha_k \lambda_i^k and since we know the \lambda_i this defines a set of simultaneous equations which we can solve for the \alpha_k

Then f(A)= Q(A) \Delta (A) + R(A)=R(A)

So f(A)= \sum_{k=0}^{n-1} \alpha_k A^k and as we now know the \alpha_k we can work out f(A).

  • Determining Exponential of a Matrix:

Hermitian Matrices

  • A Hermitian Matrix (or self-adjoint matrix) is a square matrix which is equal to its own conjugate transpose.
    • i.e A=\bar{A}^T.
  • A skew-hermitian matrix is one which is equal to the negative of its conjugate transpose ie A=-\bar{A}^T.
  • NB Sinead Ryan denoted the operation of transposing the conjugate of a matrix A as A^H but this can be quite confusing because A^H usually denotes a hermitian matrix itself, not the operation.

External Links

Gilbert Strang's Video Lectures Well-recommended. Short on proofs but full of concepts.

Sinead Ryan's 111 Website (06/07)

O'Dunlaing's 111 Website (05/06)

David Wilkins' 111 Website (96/97)

References

My notes from Sinead Ryan's lectures and my own vague understanding of algebra.

Modern Algebra by John Durbin, 5th Edition

Elementary Linear Algebra, by Howard Anton, 9th Edition

Linear Algebra and Its Applications by Gilbert Strang, 8th Edition