Orthonormal Bases and Gram-Schmidt Orthonormalization
Having it all: Diagonalization using an Orthonormal Basis
Problems
6.1/6.2. Geometry and dot products.
You only need to read part of 6.2, and none of
6.1. We will use only the dot product you learned in Math 205, not
the more general inner products discussed in the text. You already
know practically everything you need to know from 6.1 and 6.2, but
here's an outline of the key points.
Geometry in Euclidean Rn is given by the dot
product learned in Math 205 (also called the Euclidean inner product).
- The length of a vector u, ||u||, is the square
root of u*u.
- The distance between two points P and Q is the
length of the vector from P to Q (or vice versa).
- The angle θ between two nonzero vectors u and v is
characterized by the equation cos θ = (u*v)/(||u|| ||v||).
- Two nonzero vectors u and v are perpendicular
to each other (or orthogonal) iff their dot product is zero.
- More generally, we'll say that two vectors are
orthogonal iff their dot product is zero (so the zero vector is
orthogonal to every vector in the space.)
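To make these formulas concrete, here is a minimal numerical sketch using Python with numpy; the particular vectors and points are made-up examples, not anything from the text.

    import numpy as np

    u = np.array([3.0, 4.0, 0.0])    # made-up vectors in Euclidean R3
    v = np.array([4.0, -3.0, 1.0])

    length_u = np.sqrt(u @ u)                       # ||u|| = square root of u*u
    P = np.array([1.0, 2.0, 3.0])                   # made-up points
    Q = np.array([4.0, 6.0, 3.0])
    distance = np.linalg.norm(Q - P)                # length of the vector from P to Q

    cos_theta = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
    theta = np.arccos(cos_theta)                    # angle between u and v

    print(length_u)   # 5.0
    print(distance)   # 5.0
    print(u @ v)      # 0.0, so u and v are orthogonal
    print(theta)      # about 1.5708, i.e. a right angle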
This dot product (a real-valued function that accepts two vectors
from Rn as inputs) also has important algebraic properties, the
properties used in making the geometric connections above.
- Symmetry: u*v = v*u
- Additivity: (u+v)*w = u*w+v*w
- Homogeneity: (ku)*v = u*(kv) =
k(u*v)
- Positivity:
- u*u is never negative (so is either
positive or zero)
- u = 0 ==> u*u = 0
- u*u = 0 ==> u = 0
A general inner product on an abstract real vector
space V is a real-valued function that accepts two vectors (u and v,
say) from V as inputs, returns a real number <u,v> as output,
and has the same four properties described above.
- Symmetry: <u,v> =
<v,u>
- Additivity: <u+v,w> =
<u,w>+<v,w>
- Homogeneity: <ku,v> = <u,kv> =
k<u,v>
- Positivity:
- <u,u> is never negative (so is either
positive or zero)
- u = 0 ==> <u,u> = 0
- <u,u> = 0 ==> u = 0
One important example weights the individual entries
differently, so that not all entries carry the same weight. (See
Example 2 p277 and Example 4 p278, if you're interested.)
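As a rough sketch of the idea (the weights below are made-up values, not the ones in the text's examples), a weighted inner product on R3 and a spot-check of two of the four properties might look like this:

    import numpy as np

    w = np.array([2.0, 1.0, 3.0])    # positive weights, chosen arbitrarily for illustration

    def weighted_inner(u, v):
        # <u, v> = sum of w_i * u_i * v_i; reduces to the usual dot product when every w_i = 1
        return float(np.sum(w * u * v))

    u = np.array([1.0, -2.0, 0.5])
    v = np.array([3.0, 1.0, 2.0])

    print(weighted_inner(u, v) == weighted_inner(v, u))   # symmetry
    print(weighted_inner(u, u) >= 0)                      # positivity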
It turns out that these 4 conditions are exactly
what's needed to give us meaningful notions of length, distance,
angle, and orthogonality in a general inner product space.
Browse 6.1 and 6.2, if you like, to see some other
possible inner products and properties of inner products (for
instance, the Pythagorean theorem continues to hold in inner product
spaces). However, we will use only the usual Euclidean inner product,
in other words the dot product that is familiar to you from Math
205.
Notes for 6.3. Orthonormal Bases. Gram-Schmidt
Process.
- What is an orthonormal basis (for Euclidean
Rn)?
- It's a basis in which all vectors are
orthogonal to each other and each vector has length one.
See the definitions and
examples on pp298-299.
- So {u1, u2, ..., un} is an orthonormal basis
iff ui*uj = 0 for distinct i and j, and ui*ui = 1 for each i.
- A set is said to be orthogonal if any two distinct vectors
chosen from the set are orthogonal to each other. So an
orthonormal basis is a basis that's an orthogonal set and in
which each vector has length one.
- What makes orthonormal bases so nice? And how
do we find them?
- When the basis is orthonormal, it's very
easy to find coordinates. One just has to compute some dot
products, not solve systems of equations.
- See Thm 6.3.1 on p299 and Example 3 p300. Say {u1, ..., un}
is an orthonormal basis for Euclidean Rn. Let v be a vector
in Rn. Then we have
- v = (v*u1)u1 + (v*u2)u2 + ... + (v*un)un.
- In other words, the various coordinates for v are just the
dot products of v with the respective basis vectors. (A short
numerical sketch appears after this list.)
- This formula is obvious if the basis is the standard basis.
- This formula holds no matter what the basis is, as long as
it's an orthonormal basis.
- Orthogonal ==> independent
(almost).
- See Theorem 6.3
p301. It says that every finite
orthogonal set, unless it contains the zero vector, is
automatically independent.
- So an orthogonal set of nonzero vectors whose size equals the
dimension of the space (or subspace) in question will
automatically be an orthogonal basis.
- From an orthogonal basis, it's very easy to
create an orthonormal basis. Just "normalize" the basis
vectors. In other words, replace each vector u in the basis by
a scalar multiple of length one, namely by u/||u||.
- So the question becomes: how to create an
orthogonal basis?
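Here is the numerical sketch promised above for Thm 6.3.1: with an orthonormal basis for Euclidean R3 (the basis below is just a made-up example), the coordinates of a vector v are simply its dot products with the basis vectors.

    import numpy as np

    # A made-up orthonormal basis for Euclidean R3
    u1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
    u2 = np.array([1.0, -1.0, 0.0]) / np.sqrt(2)
    u3 = np.array([0.0, 0.0, 1.0])

    v = np.array([2.0, 4.0, 5.0])

    # Coordinates via dot products; no system of equations to solve
    c1, c2, c3 = v @ u1, v @ u2, v @ u3

    # Check that v = (v*u1)u1 + (v*u2)u2 + (v*u3)u3
    print(np.allclose(c1 * u1 + c2 * u2 + c3 * u3, v))   # True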
6.3. Gram-Schmidt Process.
- So if orthonormal bases are so nice, how do we
find them? In other words, given some subspace of Euclidean Rn,
how do we find an orthonormal basis for it? That's where
Gram-Schmidt comes in.
- So what does this Gram-Schmidt process do? It
takes a given basis (for some subspace of some Euclidean Rn) and
creates from it an orthonormal basis.
- So if this process always works, which it
does, then every subspace of Euclidean Rn must have an
orthonormal basis.
- More generally, this process gives us that
every finite-dimensional inner product space (whether using the
usual dot product or not) has an orthonormal basis. This makes
inner product spaces particularly nice to work with. We not
only have some geometry (length, distance, angle,
orthogonality), but we also have the possibility of working
with a really nice basis for the space.
- And how does Gram-Schmidt work? Here's a
description of the process, with some of the reasoning for why it
works as advertised. (A short code sketch of the whole process
appears after these steps.)
- Say we start with a particular basis {u1,
u2, ..., un} for V. We will first create an orthogonal basis
{v1, v2, ..., vn}, one vector at a time. Then we will normalize
to get the desired orthonormal basis.
- Step 1. Keep u1. In other words, set
v1=u1.
- Step 2. Obtain a vector v2 that is orthogonal to v1 as follows.
Realize that the vector [(u2*v1)/(||v1||^2)]v1 lies in span{v1}.
From Math 205 we can think of this as the projection of u2 onto v1.
Subtract this from u2 and the result will be orthogonal to v1.
In other words, set
v2 = u2 - [(u2*v1)/(||v1||^2)]v1.
- Note that v2 cannot be zero; if it were, u2 would be a scalar
multiple of v1 = u1, contradicting the independence of the original
basis. It is easy to check that v2 and v1 are orthogonal. So {v1, v2}
is an orthogonal basis for span{u1,u2}.
- Step 3. Now obtain a vector v3 that is orthogonal to each of v1
and v2 as follows. The vector [(u3*v1)/(||v1||^2)]v1 lies in span{v1}.
From Math 205 think of this as the projection of u3 onto v1. Similarly
think of [(u3*v2)/(||v2||^2)]v2 as the projection of u3 onto v2. The
sum of these two projections gives what we'll eventually call the
projection of u3 onto span{v1,v2}. Subtract this sum from u3 and the
result will be orthogonal to both v1 and v2. In other words, set
v3 = u3 - [(u3*v1)/(||v1||^2)]v1 - [(u3*v2)/(||v2||^2)]v2
- The vector v3 cannot be zero. A direct
computation shows that v3 is orthogonal to both v1 and v2.
By now, we've built an orthogonal basis for span{u1, u2,
u3}.
- Continue in this fashion, obtaining, after n steps, an orthogonal
basis {v1, v2, ..., vn} for span{u1, u2, ..., un}.
- Step n+1. Normalize each vector, as
necessary, to obtain an orthonormal basis. In other words,
multiply each vi by the scalar 1/(length of vi), and the
resulting set will be the desired orthonormal
basis.
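Here is the code sketch promised above: a bare-bones Gram-Schmidt routine using the usual dot product. The starting basis is a made-up example, and the routine assumes its input vectors are independent.

    import numpy as np

    def gram_schmidt(basis):
        # Turn a list of independent vectors into an orthonormal basis for their span.
        orthogonal = []
        for u in basis:
            v = u.astype(float)
            # Subtract the projection of u onto each previously constructed vector
            for w in orthogonal:
                v = v - ((u @ w) / (w @ w)) * w
            orthogonal.append(v)
        # Final step: normalize each vector to length one
        return [v / np.linalg.norm(v) for v in orthogonal]

    # Made-up starting basis (here for all of R3)
    u1 = np.array([1.0, 1.0, 0.0])
    u2 = np.array([1.0, 0.0, 1.0])
    u3 = np.array([0.0, 1.0, 1.0])

    q1, q2, q3 = gram_schmidt([u1, u2, u3])
    print(q1 @ q2, q1 @ q3, q2 @ q3)   # pairwise dot products: each (approximately) 0
    print(q1 @ q1, q2 @ q2, q3 @ q3)   # each vector has length 1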
Study Example 7 pp 304-305 for an
example illustrating Gram-Schmidt. A general outline is given on pp
304-305 as the proof of Thm 6.3.6. (Every nonzero finite-dimensional
inner product space has an orthonormal basis.)
6.5. Orthogonal Matrices and orthogonal
operators
- What is an orthogonal matrix and what do
orthogonal matrices have to do with orthonormal bases?
- A square matrix is said to be an orthogonal
matrix iff it is invertible and its transpose is its inverse.
Note how much easier it is to transpose than to find inverses.
So knowing a matrix is orthogonal can be very useful.
See the definition on
p320 and the examples on p321.
- A square matrix (say n x n) is orthogonal
iff its row vectors form an orthonormal set (in Euclidean Rn).
Read Theorem 6.5.1 p321
and its proof.
- Similarly, a square matrix (n x n) is
orthogonal iff its column vectors form an orthonormal
set.
- Therefore a square matrix (n x n) is
orthogonal iff its row vectors form an orthonormal basis for
Euclidean Rn. In other words, orthogonal n x n matrices are the
same as matrices whose rows form an orthonormal basis for Euclidean
Rn. (A quick numerical check of these facts appears after this list.)
- Transition matrices used to convert from
one orthonormal basis to another are always orthogonal
matrices, another reason why orthonormal bases are particularly
nice.
- Orthogonal matrices have other nice
properties. See Theorem 6.5.2 p322.
- Not surprisingly, a linear transformation from
Rn to Rn that is accomplished as multiplication by an orthogonal
matrix (called an orthogonal operator) has nice properties. In
particular, multiplication by an orthogonal matrix does not change
the length of the vector being multiplied. See Theorem 6.5.3 pp
322-323.
- 6.5 also includes a discussion of what we
called transition matrices (the matrices that are used to
translate coordinates relative to one basis into coordinates
relative to some other basis). We've already covered this
material in connection with linear transformations in 8.4, but you
might want to read it here (pp 324 - top of 327).
- As already mentioned above, transition
matrices that convert from one orthonormal basis to another are
always orthogonal. Read
Theorem 6.5.5 p327 and the first rotation example (Example 6) that
follows it.
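Here is the quick numerical check promised above, using a 2 x 2 rotation matrix, one standard example of an orthogonal matrix.

    import numpy as np

    theta = np.pi / 6
    A = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])   # a rotation matrix

    print(np.allclose(A.T @ A, np.eye(2)))   # the transpose is the inverse (columns orthonormal)
    print(np.allclose(A @ A.T, np.eye(2)))   # the rows form an orthonormal set as well

    x = np.array([3.0, -4.0])
    print(np.isclose(np.linalg.norm(A @ x), np.linalg.norm(x)))   # multiplication preserves length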
7.3. Can we have it all? Diagonalization using
an orthonormal basis.
- To represent a particular linear
transformation we often want to use a diagonal matrix.
- To make it easy to find coordinates and to
find transition matrices we often want to use orthonormal
bases.
- So given some linear operator from Euclidean
Rn to Euclidean Rn, one often wants to represent this operator by a
diagonal matrix relative to an orthonormal basis.
- Read 7.3, with primary
attention to Theorem 7.3.1 and to the Diagonalization procedure
for symmetric matrices described on p359 and illustrated in
Example 1 pp359-360.
- Theorem 7.3.1 gives us a nice answer to three
related questions.
- For which n x n matrices A will there exist
an orthonormal basis for Euclidean Rn consisting entirely of
eigenvectors for A? (Text says A has an orthonormal set of n
eigenvectors.)
- For which n x n matrices A will there exist
an orthogonal matrix P (with inverse Q equal to the transpose
of P) such that QAP is a diagonal matrix? (Text says A is
orthogonally diagonalizable.)
- Viewing A as the standard matrix for some
linear operator on Euclidean Rn, for which n x n matrices A
will it be possible to find an orthonormal basis that will let
us represent the operator by a diagonal matrix?
- From our earlier analysis of diagonalization
problems, we know that all three questions will have the same
answer. In other words, these three questions are essentially
equivalent.
- So what is the answer? The matrices in
question are the n x n symmetric matrices. In other words, if A is
symmetric, then there are n orthonormal eigenvectors for A and A
can be orthogonally diagonalized. And vice versa.
- Assuming A is symmetric, how do we accomplish
the diagonalization?
- In other words, how do we find the
orthogonal matrix P, with inverse Q equal to the transpose of
P, such that QAP is a diagonal matrix D? And what is the
corresponding diagonal D?
- It's pretty straightforward, if a bit
time-consuming.
- The technique combines two procedures
already studied: the general procedure for diagonalizing a
matrix and the Gram-Schmidt procedure for creating orthonormal
bases.
- First find the eigenvalues.
- For each eigenvalue, find a basis for
the corresponding eigenspace. (Do this by solving the
relevant homogeneous system).
- Apply Gram-Schmidt to each of these
eigenspace bases, thereby obtaining an orthonormal basis for
each eigenspace.
- Then P is the matrix whose columns are
the various basis vectors identified by this Gram-Schmidt
process. (For a symmetric A, eigenvectors from different
eigenspaces are automatically orthogonal, so these columns
taken together form an orthonormal basis for Euclidean Rn.)
- D is the diagonal matrix that uses the
corresponding eigenvalues along the diagonal.
- Q, to repeat, is just the transpose of
P.
- Study Example 1 p359 for an
illustration of this process of orthogonal diagonalization. A short
code sketch of the same procedure appears below.
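A minimal sketch of the whole procedure, assuming the made-up symmetric matrix A below. Here numpy's eigh (which is designed for symmetric matrices and returns orthonormal eigenvectors) stands in for the find-eigenspace-bases-then-Gram-Schmidt steps described above.

    import numpy as np

    # A made-up symmetric matrix
    A = np.array([[2.0, 1.0, 0.0],
                  [1.0, 2.0, 0.0],
                  [0.0, 0.0, 3.0]])

    # eigh returns the eigenvalues and an orthogonal matrix P whose columns are
    # orthonormal eigenvectors of A, playing the role of the eigenspace bases
    # produced by Gram-Schmidt in the procedure above.
    eigenvalues, P = np.linalg.eigh(A)

    Q = P.T             # for an orthogonal P, the inverse is just the transpose
    D = Q @ A @ P       # should be diagonal, with the eigenvalues on the diagonal

    print(np.allclose(P.T @ P, np.eye(3)))        # P is orthogonal
    print(np.allclose(D, np.diag(eigenvalues)))   # QAP is the diagonal matrix D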
- Alexia Sontag, Mathematics
- Wellesley College
- Date Created: January 4, 2001
- Last Modified: May 8, 2002