Orthonormal Bases and Gram-Schmidt Orthonormalization
Having it all: Diagonalization using an Orthonormal Basis
Problems
6.1/6.2. Geometry and dot products.
You only need to read part of 6.2, and none of
6.1. We will use only the dot product you learned in Math 205, not
the more general inner products discussed in the text. You already
know practically everything you need to know from 6.1 and 6.2, but
here's an outline of the key points.
Geometry in Euclidean Rn is given by the dot
product learned in Math 205 (also called the Euclidean inner product).
- The length of a vector u, ||u||, is the square
root of u*u.
- The distance between two points P and Q is the
length of the vector from P to Q (or vice versa).
- The angle θ between two nonzero vectors u and v is
characterized by the equation cos θ = (u*v)/(||u|| ||v||).
- Two nonzero vectors u and v are perpendicular
to each other (or orthogonal) iff their dot product is zero.
- More generally, we'll say that two vectors are
orthogonal iff their dot product is zero (so the zero vector is
orthogonal to every vector in the space.)
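To make these formulas concrete, here is a minimal numerical sketch using Python with numpy; the particular vectors and points are made-up examples, not anything from the text.

    import numpy as np

    u = np.array([3.0, 4.0, 0.0])    # made-up vectors in Euclidean R3
    v = np.array([4.0, -3.0, 1.0])

    length_u = np.sqrt(u @ u)                       # ||u|| = square root of u*u
    P = np.array([1.0, 2.0, 3.0])                   # made-up points
    Q = np.array([4.0, 6.0, 3.0])
    distance = np.linalg.norm(Q - P)                # length of the vector from P to Q

    cos_theta = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
    theta = np.arccos(cos_theta)                    # angle between u and v

    print(length_u)   # 5.0
    print(distance)   # 5.0
    print(u @ v)      # 0.0, so u and v are orthogonal
    print(theta)      # about 1.5708, i.e. a right angle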
This dot product (a real-valued function that accepts two vectors
from Rn as inputs) also has important algebraic properties, the
properties used in making the geometric connections above.
- Symmetry: u*v = v*u
- Additivity: (u+v)*w = u*w+v*w
- Homogeneity: (ku)*v = u*(kv) =
k(u*v)
- Positivity:
- u*u is never negative (so is either
positive or zero)
- u = 0 ==> u*u = 0
- u*u = 0 ==> u = 0
A general inner product on an abstract real vector
space V is a real-valued function that accepts two vectors (u and v,
say) from V as inputs, returns a real number <u,v> as output,
and has the same four properties described above.
- Symmetry: <u,v> =
<v,u>
- Additivity: <u+v,w> =
<u,w>+<v,w>
- Homogeneity: <ku,v> = <u,kv> =
k<u,v>
- Positivity:
- <u,u> is never negative (so is either
positive or zero)
- u = 0 ==> <u,u> = 0
- <u,u> = 0 ==> u = 0
One important example weights the individual entries
differently, so that not all entries carry the same weight. (See
Example 2 p277 and Example 4 p278, if you're interested.)
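As a rough sketch of the idea (the weights below are made-up values, not the ones in the text's examples), a weighted inner product on R3 and a spot-check of two of the four properties might look like this:

    import numpy as np

    w = np.array([2.0, 1.0, 3.0])    # positive weights, chosen arbitrarily for illustration

    def weighted_inner(u, v):
        # <u, v> = sum of w_i * u_i * v_i; reduces to the usual dot product when every w_i = 1
        return float(np.sum(w * u * v))

    u = np.array([1.0, -2.0, 0.5])
    v = np.array([3.0, 1.0, 2.0])

    print(weighted_inner(u, v) == weighted_inner(v, u))   # symmetry
    print(weighted_inner(u, u) >= 0)                      # positivity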
It turns out that these 4 conditions are exactly
what's needed to give us meaningful notions of length, distance,
angle, and orthogonality in a general inner product space.
Browse 6.1 and 6.2, if you like, to see some other
possible inner products and properties of inner products (for
instance, the Pythagorean theorem continues to hold in inner product
spaces). However, we will use only the usual Euclidean inner product,
in other words the dot product that is familiar to you from Math
205.
Notes for 6.3. Orthonormal Bases. Gram-Schmidt
Process.
- What is an orthonormal basis (for Euclidean
Rn)?
- It's a basis in which all vectors are
orthogonal to each other and each vector has length one.
See the definitions and
examples on pp298-299.
- So {u1, u2, ..., un} is an orthonormal basis
iff ui*uj = 0 for distinct i and j, and ui*ui = 1 for each i.
- A set is said to be orthogonal if any two distinct vectors
chosen from the set are orthogonal to each other. So an
orthonormal basis is a basis that's an orthogonal set and in
which each vector has length one.
- What makes orthonormal bases so nice? And how
do we find them?
- When the basis is orthonormal, it's very
easy to find coordinates. One just has to compute some dot
products, not solve systems of equations.
- See Thm 6.3.1 on p299 and Example 3 p300. Say {u1, ..., un}
is an orthonormal basis for Euclidean Rn. Let v be a vector
in Rn. Then we have
- v = (v*u1)u1 + (v*u2)u2 + ... + (v*un)un.
- In other words, the various coordinates for v are just the
dot products of v with the respective basis vectors. (A short
numerical sketch appears after this list.)
- This formula is obvious if the basis is the standard basis.
- This formula holds no matter what the basis is, as long as
it's an orthonormal basis.
- Orthogonal ==> independent
(almost).
- See Theorem 6.3
p301. It says that every finite
orthogonal set, unless it contains the zero vector, is
automatically independent.
- So an orthogonal set of nonzero vectors whose size equals the
dimension of the space (or subspace) in question will
automatically be an orthogonal basis.
- From an orthogonal basis, it's very easy to
create an orthonormal basis. Just "normalize" the basis
vectors. In other words, replace each vector u in the basis by
a scalar multiple of length one, namely by u/||u||.
- So the question becomes: how to create an
orthogonal basis?
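Here is the numerical sketch promised above for Thm 6.3.1: with an orthonormal basis for Euclidean R3 (the basis below is just a made-up example), the coordinates of a vector v are simply its dot products with the basis vectors.

    import numpy as np

    # A made-up orthonormal basis for Euclidean R3
    u1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
    u2 = np.array([1.0, -1.0, 0.0]) / np.sqrt(2)
    u3 = np.array([0.0, 0.0, 1.0])

    v = np.array([2.0, 4.0, 5.0])

    # Coordinates via dot products; no system of equations to solve
    c1, c2, c3 = v @ u1, v @ u2, v @ u3

    # Check that v = (v*u1)u1 + (v*u2)u2 + (v*u3)u3
    print(np.allclose(c1 * u1 + c2 * u2 + c3 * u3, v))   # True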
6.3. Gram-Schmidt Process.
- So if orthonormal bases are so nice, how do we
find them? In other words, given some subspace of Euclidean Rn,
how do we find an orthonormal basis for it? That's where
Gram-Schmidt comes in.
- So what does this Gram-Schmidt process do? It
takes a given basis (for some subspace of some Euclidean Rn) and
creates from it an orthonormal basis.
- So if this process always works, which it
does, then every subspace of Euclidean Rn must have an
orthonormal basis.
- More generally, this process gives us that
every finite-dimensional inner product space (whether using the
usual dot product or not) has an orthonormal basis. This makes
inner product spaces particularly nice to work with. We not
only have some geometry (length, distance, angle,
orthogonality), but we also have the possibility of working
with a really nice basis for the space.
- And how does Gram-Schmidt work? Here's a
description of the process, with some of the reasoning for why it
works as advertised. (A short code sketch of the whole process
appears after these steps.)
- Say we start with a particular basis {u1,
u2, ..., un} for V. We will first create an orthogonal basis
{v1, v2, ..., vn}, one vector at a time. Then we will normalize
to get the desired orthonormal basis.
- Step 1. Keep u1. In other words, set
v1=u1.
- Step 2. Obtain a vector v2 that is orthogonal to v1 as follows.
Realize that the vector [(u2*v1)/(||v1||^2)]v1 lies in span{v1}.
From Math 205 we can think of this as the projection of u2 onto v1.
Subtract this from u2 and the result will be orthogonal to v1.
In other words, set
v2 = u2 - [(u2*v1)/(||v1||^2)]v1.
- Note that v2 cannot be zero; if it were, u2 would be a scalar
multiple of v1 = u1, contradicting the independence of the original
basis. It is easy to check that v2 and v1 are orthogonal. So {v1, v2}
is an orthogonal basis for span{u1,u2}.
- Step 3. Now obtain a vector v3 that is orthogonal to each of v1
and v2 as follows. The vector [(u3*v1)/(||v1||^2)]v1 lies in span{v1}.
From Math 205 think of this as the projection of u3 onto v1. Similarly
think of [(u3*v2)/(||v2||^2)]v2 as the projection of u3 onto v2. The
sum of these two projections gives what we'll eventually call the
projection of u3 onto span{v1,v2}. Subtract this sum from u3 and the
result will be orthogonal to both v1 and v2. In other words, set
v3 = u3 - [(u3*v1)/(||v1||^2)]v1 - [(u3*v2)/(||v2||^2)]v2
- The vector v3 cannot be zero. A direct
computation shows that v3 is orthogonal to both v1 and v2.
By now, we've built an orthogonal basis for span{u1, u2,
u3}.
- Continue in this fashion, obtaining, after n steps, an orthogonal
basis {v1, v2, ..., vn} for span{u1, u2, ..., un}.
- Step n+1. Normalize each vector, as
necessary, to obtain an orthonormal basis. In other words,
multiply each vi by the scalar 1/(length of vi), and the
resulting set will be the desired orthonormal
basis.
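Here is the code sketch promised above: a bare-bones Gram-Schmidt routine using the usual dot product. The starting basis is a made-up example, and the routine assumes its input vectors are independent.

    import numpy as np

    def gram_schmidt(basis):
        # Turn a list of independent vectors into an orthonormal basis for their span.
        orthogonal = []
        for u in basis:
            v = u.astype(float)
            # Subtract the projection of u onto each previously constructed vector
            for w in orthogonal:
                v = v - ((u @ w) / (w @ w)) * w
            orthogonal.append(v)
        # Final step: normalize each vector to length one
        return [v / np.linalg.norm(v) for v in orthogonal]

    # Made-up starting basis (here for all of R3)
    u1 = np.array([1.0, 1.0, 0.0])
    u2 = np.array([1.0, 0.0, 1.0])
    u3 = np.array([0.0, 1.0, 1.0])

    q1, q2, q3 = gram_schmidt([u1, u2, u3])
    print(q1 @ q2, q1 @ q3, q2 @ q3)   # pairwise dot products: each (approximately) 0
    print(q1 @ q1, q2 @ q2, q3 @ q3)   # each vector has length 1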
Study Example 7 pp 304-305 for an
example illustrating Gram-Schmidt. A general outline is given on pp
304-305 as the proof of Thm 6.3.6. (Every nonzero finite-dimensional
inner product space has an orthonormal basis.)
6.5. Orthogonal Matrices and orthogonal
operators
- What is an orthogonal matrix and what do
orthogonal matrices have to do with orthonormal bases?
- A square matrix is said to be an orthogonal
matrix iff it is invertible and its transpose is its inverse.
Note how much easier it is to transpose than to find inverses.
So knowing a matrix is orthogonal can be very useful.
See the definition on
p320 and the examples on p321.
- A square matrix (say n x n) is orthogonal
iff its row vectors form an orthonormal set (in Euclidean Rn).
Read Theorem 6.5.1 p321
and its proof.
- Similarly, a square matrix (n x n) is
orthogonal iff its column vectors form an orthonormal
set.
- Therefore a square matrix (n x n) is
orthogonal iff its row vectors form an orthonormal basis for
Euclidean Rn. In other words, orthogonal n x n matrices are the
same as matrices whose rows form an orthonormal basis for Euclidean
Rn. (A quick numerical check of these facts appears after this list.)
- Transition matrices used to convert from
one orthonormal basis to another are always orthogonal
matrices, another reason why orthonormal bases are particularly
nice.
- Orthogonal matrices have other nice
properties. See Theorem 6.5.2 p322.
- Not surprisingly, a linear transformation from
Rn to Rn that is accomplished as multiplication by an orthogonal
matrix (called an orthogonal operator) has nice properties. In
particular, multiplication by an orthogonal matrix does not change
the length of the vector being multiplied. See Theorem 6.5.3 pp
322-323.
- 6.5 also includes a discussion of what we
called transition matrices (the matrices that are used to
translate coordinates relative to one basis into coordinates
relative to some other basis). We've already covered this
material in connection with linear transformations in 8.4, but you
might want to read it here (pp 324 - top of 327).
- As already mentioned above, transition
matrices that convert from one orthonormal basis to another are
always orthogonal. Read
Theorem 6.5.5 p327 and the first rotation example (Example 6) that
follows it.
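Here is the quick numerical check promised above, using a 2 x 2 rotation matrix, one standard example of an orthogonal matrix.

    import numpy as np

    theta = np.pi / 6
    A = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])   # a rotation matrix

    print(np.allclose(A.T @ A, np.eye(2)))   # the transpose is the inverse (columns orthonormal)
    print(np.allclose(A @ A.T, np.eye(2)))   # the rows form an orthonormal set as well

    x = np.array([3.0, -4.0])
    print(np.isclose(np.linalg.norm(A @ x), np.linalg.norm(x)))   # multiplication preserves length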
7.3. Can we have it all? Diagonalization using
an orthonormal basis.
- To represent a particular linear
transformation we often want to use a diagonal matrix.
- To make it easy to find coordinates and to
find transition matrices we often want to use orthonormal
bases.
- So given some linear operator from Euclidean
Rn to Euclidean Rn, one often wants to represent this operator by a
diagonal matrix relative to an orthonormal basis.
- Read 7.3, with primary
attention to Theorem 7.3.1 and to the Diagonalization procedure
for symmetric matrices described on p359 and illustrated in
Example 1 pp359-360.
- Theorem 7.3.1 gives us a nice answer to three
related questions.
- For which n x n matrices A will there exist
an orthonormal basis for Euclidean Rn consisting entirely of
eigenvectors for A? (Text says A has an orthonormal set of n
eigenvectors.)
- For which n x n matrices A will there exist
an orthogonal matrix P (with inverse Q equal to the transpose
of P) such that QAP is a diagonal matrix? (Text says A is
orthogonally diagonalizable.)
- Viewing A as the standard matrix for some
linear operator on Euclidean Rn, for which n x n matrices A
will it be possible to find an orthonormal basis that will let
us represent the operator by a diagonal matrix?
- From our earlier analysis of diagonalization
problems, we know that all three questions will have the same
answer. In other words, these three questions are essentially
equivalent.
- So what is the answer? The matrices in
question are the n x n symmetric matrices. In other words, if A is
symmetric, then there are n orthonormal eigenvectors for A and A
can be orthogonally diagonalized. And vice versa.
- Assuming A is symmetric, how do we accomplish
the diagonalization?
- In other words, how do we find the
orthogonal matrix P, with inverse Q equal to the transpose of
P, such that QAP is a diagonal matrix D? And what is the
corresponding diagonal D?
- It's pretty straightforward, if a bit
time-consuming.
- The technique combines two procedures
already studied: the general procedure for diagonalizing a
matrix and the Gram-Schmidt procedure for creating orthonormal
bases.
- First find the eigenvalues.
- For each eigenvalue, find a basis for
the corresponding eigenspace. (Do this by solving the
relevant homogeneous system).
- Apply Gram-Schmidt to each of these
eigenspace bases, thereby obtaining an orthonormal basis for
each eigenspace.
- Then P is the matrix whose columns are
the various basis vectors identified by this Gram-Schmidt
process. (For a symmetric A, eigenvectors from different
eigenspaces are automatically orthogonal, so these columns
taken together form an orthonormal basis for Euclidean Rn.)
- D is the diagonal matrix that uses the
corresponding eigenvalues along the diagonal.
- Q, to repeat, is just the transpose of
P.
- Study Example 1 p359 for an
illustration of this process of orthogonal diagonalization. A short
code sketch of the same procedure appears below.
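A minimal sketch of the whole procedure, assuming the made-up symmetric matrix A below. Here numpy's eigh (which is designed for symmetric matrices and returns orthonormal eigenvectors) stands in for the find-eigenspace-bases-then-Gram-Schmidt steps described above.

    import numpy as np

    # A made-up symmetric matrix
    A = np.array([[2.0, 1.0, 0.0],
                  [1.0, 2.0, 0.0],
                  [0.0, 0.0, 3.0]])

    # eigh returns the eigenvalues and an orthogonal matrix P whose columns are
    # orthonormal eigenvectors of A, playing the role of the eigenspace bases
    # produced by Gram-Schmidt in the procedure above.
    eigenvalues, P = np.linalg.eigh(A)

    Q = P.T             # for an orthogonal P, the inverse is just the transpose
    D = Q @ A @ P       # should be diagonal, with the eigenvalues on the diagonal

    print(np.allclose(P.T @ P, np.eye(3)))        # P is orthogonal
    print(np.allclose(D, np.diag(eigenvalues)))   # QAP is the diagonal matrix D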
- Alexia Sontag, Mathematics
- Wellesley College
- Date Created: January 4, 2001
- Last Modified: May 8, 2002