Definition

A set of vectors $\{u_1,u_2,u_3,\cdots,u_n\}$ of an inner product space $V$ is an orthonormal basis of $V$ if it is a basis of $V$ and
1. $<u_i,u_j>=0\ \forall\ i \neq j$ (corresponding to orthogonal property)
2. $||u_i||=1\ \forall\ i$ (corresponding to normalisation property)
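The two conditions can be checked numerically. The following is a minimal sketch, assuming vectors in $\mathbb{R}^n$ with the standard dot product as the inner product; the helper names `dot` and `is_orthonormal` and the tolerance `tol` are illustrative choices, not part of the definition. Note that it only tests conditions 1 and 2; for the set to be a basis of an $n$-dimensional space it must also contain $n$ vectors.

```python
import math

def dot(u, v):
    """Standard dot product on R^n (the assumed inner product)."""
    return sum(a * b for a, b in zip(u, v))

def is_orthonormal(vectors, tol=1e-12):
    """True if <u_i, u_j> = 0 for i != j and <u_i, u_i> = 1 for all i."""
    for i, u in enumerate(vectors):
        for j, w in enumerate(vectors):
            expected = 1.0 if i == j else 0.0
            if abs(dot(u, w) - expected) > tol:
                return False
    return True

s = 1 / math.sqrt(2)
standard = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
rotated = [(s, s, 0), (-s, s, 0), (0, 0, 1)]
not_unit = [(1, 1, 0), (-1, 1, 0), (0, 0, 1)]  # orthogonal, but two lengths are sqrt(2)

print(is_orthonormal(standard))  # True
print(is_orthonormal(rotated))   # True
print(is_orthonormal(not_unit))  # False
```

The comparison uses a small tolerance because $\frac{1}{\sqrt{2}}$ cannot be represented exactly in floating point.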


Motivation

Let us consider a vector space $V$ of dimension 3. Let $\{u_1,u_2,u_3\}$ be a basis of $V$ and let $v\in V$. This means that we can write $v$ as a linear combination of the basis vectors $\rightarrow$

$v = a_{1} u_{1} + a_{2} u_{2} + a_{3} u_{3}$

Finding $a_1,a_2,a_3$ usually means solving a system of linear equations, which is a long process (especially when we deal with high dimensional vector spaces). This is where orthonormal bases are used. With an orthonormal basis, each of $a_1,a_2,a_3$ can be found from a single inner product, and hence orthonormal bases help in computation.


Bird's Eye View

We are aware that a vector space can have infinitely many bases. An orthonormal basis is a basis whose vectors are mutually orthogonal and each of length 1 (i.e. all vectors are unit vectors). Because the vectors are orthogonal and of length 1, orthonormal bases are usually the most convenient bases to work with. An example of an orthonormal basis is the standard basis (See video 1).


Video 1: Example of Orthonormal bases

Context Of Definition

We say that two vectors $u,v$ are orthogonal if they are perpendicular to each other, i.e. $u.v = 0$, or more generally $<u,v> = 0$ (if you want to read more about inner product spaces, you may refer to the lecture notes on Inner Product Spaces). A set of vectors $\{v_1,v_2,v_3,\cdots,v_n\}$ is called mutually orthogonal if $\rightarrow$

$<v_i,v_j>=0\ \forall\ i \neq j$.

Mutually orthogonal vectors, provided none of them is the zero vector, are also linearly independent. If we divide each vector of this set by its length, we get a set of unit vectors. These unit vectors are orthonormal vectors.


The difference between the orthogonal vectors and corresponding orthonormal vectors is only in terms of the magnitude of vectors. If we have a set of orthogonal vectors and we normalize it, we get the set of corresponding orthonormal vectors.
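Normalizing an orthogonal set can be sketched as follows. This is a minimal illustration, assuming vectors in $\mathbb{R}^3$ with the standard dot product; the helper names `dot` and `normalize` and the sample vectors are chosen only for this example.

```python
import math

def dot(u, v):
    """Standard dot product on R^n (the assumed inner product)."""
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Divide a nonzero vector by its length to obtain a unit vector."""
    length = math.sqrt(dot(v, v))
    if length == 0:
        raise ValueError("the zero vector cannot be normalized")
    return [a / length for a in v]

# A mutually orthogonal set whose vectors are not of length 1.
orthogonal = [[1, 1, 0], [-1, 1, 0], [0, 0, 2]]
orthonormal = [normalize(v) for v in orthogonal]

for u in orthonormal:
    # Each direction is unchanged; only the length becomes 1 (up to rounding).
    print(u, math.sqrt(dot(u, u)))
```

Dividing by the length only rescales each vector, so the pairwise inner products stay zero and the set remains mutually orthogonal.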

Since mutually orthogonal nonzero vectors are linearly independent, the corresponding orthonormal vectors are linearly independent as well. Hence, a set of $n$ orthonormal vectors in an $n$-dimensional vector space forms a basis of that space. Orthonormal bases are easy to work with because of the following property $\rightarrow$

If $V$ is a vector space of dimension $n$ with orthonormal basis $\{ v_1,v_2,v_3, \cdots, v_n\}$ and $v\in V$, then:

$v= <v,v_1> v_1 + <v,v_2> v_2 +<v,v_3> v_3 +\cdots+<v,v_n> v_n$

The property $v= <v,v_1> v_1 + <v,v_2> v_2 +<v,v_3> v_3 +\cdots+<v,v_n> v_n$ simply means that adding the projections of $v$ onto the vectors of an orthonormal basis gives back $v$ itself. Let us see this visually (See video 2). The same property is used to obtain an orthonormal basis from any basis of a vector space $V$. We will see this in the next lecture notes (i.e. Gram-Schmidt Orthogonalization Process).


Video 2: Adding the projections of a vector on orthonormal basis will produce the same vector.

Let us see why this property is so important with an example $\rightarrow$

Say, $v=\left( \begin{array} {c} 3 \\ 4 \\ 5 \end{array} \right)$ and let the basis be $v_1 = \left(\begin{array} {c} 1 \\ 2 \\ 3 \end{array} \right), v_2 = \left(\begin{array} {c} 0 \\ 1 \\ 3 \end{array} \right), v_3 = \left(\begin{array} {c} 2 \\ 0 \\ -4 \end{array} \right)$.

Clearly, these basis vectors are not mutually orthogonal. If $v = a_1v_1+a_2v_2+a_3v_3$, and we want to calculate $a_1,a_2,a_3$ then $\rightarrow$

$\left( \begin{array} {c} 3 \\ 4 \\ 5 \end{array} \right) = a_1 \left(\begin{array} {c} 1 \\ 2 \\ 3 \end{array} \right) + a_2 \left(\begin{array} {c} 0 \\ 1 \\ 3 \end{array} \right) + a_3 \left(\begin{array} {c} 2 \\ 0 \\ -4 \end{array} \right)$.

Now, we get this system of linear equations which we have to solve to get the values of $a_1,a_2,a_3$.

$1\cdot a_1 + 0\cdot a_2 + 2\cdot a_3 = 3$

$2\cdot a_1 + 1\cdot a_2 + 0\cdot a_3 = 4$

$3\cdot a_1 + 3\cdot a_2 - 4\cdot a_3 = 5$
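To make the cost of this route concrete, the system above can be solved by elimination. The sketch below uses Gauss-Jordan elimination with exact fractions; the function name `solve` is hypothetical, and the code assumes the coefficient matrix is invertible (which holds whenever the columns form a basis).

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with exact fractions.

    Assumes A is square and invertible.
    """
    n = len(A)
    # Build the augmented matrix [A | b].
    M = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and swap it into place.
        pivot_row = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot_row] = M[pivot_row], M[col]
        # Scale the pivot row so that the pivot becomes 1.
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[-1] for row in M]

# Columns of A are the three basis vectors; b is the vector v.
A = [[1, 0, 2],
     [2, 1, 0],
     [3, 3, -4]]
b = [3, 4, 5]

a1, a2, a3 = solve(A, b)
print(a1, a2, a3)  # 1 2 1
```

Substituting back confirms the result: $1\cdot(1,2,3) + 2\cdot(0,1,3) + 1\cdot(2,0,-4) = (3,4,5)$. Elimination of this kind needs on the order of $n^3$ arithmetic operations for an $n$-dimensional space.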

But suppose the basis were orthonormal, for example $\rightarrow$

$v_1 = \left(\begin{array} {c} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \\ 0 \end{array} \right), v_2 = \left(\begin{array} {c} \frac{-1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \\ 0 \end{array} \right), v_3 = \left(\begin{array} {c} 0 \\ 0 \\ 1 \end{array} \right)$

Using the property $v= <v,v_1> v_1 + <v,v_2> v_2 +<v,v_3> v_3 +\cdots+<v,v_n> v_n$ (here with $n=3$), we can find the scalars corresponding to the basis vectors easily.

$<v,v_1> = \frac{3}{\sqrt{2}} + \frac{4}{\sqrt{2}} + 0 = \frac{7}{\sqrt{2}}$

$<v,v_2> = \frac{-3}{\sqrt{2}} + \frac{4}{\sqrt{2}} + 0 =\frac{1}{\sqrt{2}}$

$<v,v_3> = 0 + 0 + 5 =5$

i.e. $\left( \begin{array} {c} 3 \\ 4 \\ 5 \end{array} \right) = \frac{7}{\sqrt{2}} \left(\begin{array} {c} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \\ 0 \end{array} \right) + \frac{1}{\sqrt{2}} \left(\begin{array} {c} \frac{-1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \\ 0 \end{array} \right)+5 \left(\begin{array} {c} 0 \\ 0 \\ 1 \end{array} \right)$
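The same computation can be sketched in a few lines of code. This is an illustration only, assuming the standard dot product on $\mathbb{R}^3$; the helper names `dot`, `coordinates` and `reconstruct` are made up for this example.

```python
import math

def dot(u, v):
    """Standard dot product on R^n (the assumed inner product)."""
    return sum(a * b for a, b in zip(u, v))

def coordinates(v, basis):
    """Coefficients of v in an orthonormal basis: one inner product each."""
    return [dot(v, u) for u in basis]

def reconstruct(coeffs, basis):
    """Add up coeff * basis vector, i.e. the sum of the projections."""
    n = len(basis[0])
    return [sum(c * u[k] for c, u in zip(coeffs, basis)) for k in range(n)]

s = 1 / math.sqrt(2)
basis = [(s, s, 0), (-s, s, 0), (0, 0, 1)]
v = (3, 4, 5)

coeffs = coordinates(v, basis)
print(coeffs)                      # approximately [7/sqrt(2), 1/sqrt(2), 5]
print(reconstruct(coeffs, basis))  # approximately [3, 4, 5]
```

No system of equations is solved here: each coefficient costs one inner product, so all $n$ coefficients need on the order of $n^2$ arithmetic operations. The formula is only valid when the basis is orthonormal.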

Let us consider this example again with respect to the property $v= <v,v_1> v_1 + <v,v_2> v_2 +<v,v_3> v_3 +\cdots+<v,v_n> v_n$, which says that adding the projections of $v$ onto the vectors of an orthonormal basis gives back the same vector (See video 3).


Video 3: Relating the example and the property

This was just a three-dimensional vector space. When we deal with high dimensional vector spaces, where solving a system of linear equations becomes expensive, it is a lot more convenient to use an orthonormal basis.


Note that the property $v= <v,v_1> v_1 + <v,v_2> v_2 +<v,v_3> v_3 +\cdots+<v,v_n> v_n$ is true for every orthonormal basis of any inner product space, not only for vectors in $\mathbb{R}^n$. For example, let us consider the vector space $\mathbb{P}_2$ of polynomials of degree at most 2, with the inner product $<p,q> = \int_{-1}^{1} p(x)q(x) dx$ for $p,q \in \mathbb{P}_2$, and let the basis be $\{ \frac{1}{\sqrt{2}},\sqrt{\frac{3}{2}}x,\frac{3\sqrt{10}}{4}(x^{2}-\frac{1}{3})\}$.

Notice that $\{ \frac{1}{\sqrt{2}},\sqrt{\frac{3}{2}}x,\frac{3\sqrt{10}}{4}(x^{2}-\frac{1}{3})\}$ is an orthonormal basis (Why?). Let $v=x^{2}+2x+1$ and $v_1 = \frac{1}{\sqrt{2}}, v_2 = \sqrt{\frac{3}{2}}x$ and $v_3=\frac{3\sqrt{10}}{4}(x^{2}-\frac{1}{3})$ then:

$<v,v_1> = \int_{-1}^{1} \frac{1}{\sqrt{2}}(x^{2}+2x+1) dx = \frac{8}{3\sqrt{2}}$

$<v,v_2> = \int_{-1}^{1} \sqrt{\frac{3}{2}}x(x^{2}+2x+1) dx = \frac{4}{3}\sqrt{\frac{3}{2}}$

$<v,v_3> = \int_{-1}^{1} \frac{3\sqrt{10}}{4}(x^{2}-\frac{1}{3})(x^{2}+2x+1) dx = \frac{2\sqrt{10}}{15}$

$<v,v_1> v_1 + <v,v_2> v_2 +<v,v_3> v_3 = \frac{8}{3\sqrt{2}} \times \frac{1}{\sqrt{2}} + \frac{4}{3}\sqrt{\frac{3}{2}} \times \sqrt{\frac{3}{2}}x + \frac{2\sqrt{10}}{15} \times \frac{3\sqrt{10}}{4}(x^{2}-\frac{1}{3})$

$= \frac{4}{3} + 2x + (x^{2}-\frac{1}{3})$

$= x^{2}+2x+1$

$= v$
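The polynomial example can be checked numerically as well. The sketch below is an illustration under one assumption of its own: a polynomial is stored as a list of coefficients `[c0, c1, c2, ...]` meaning $c_0 + c_1x + c_2x^2 + \cdots$. The helper names `poly_mul` and `inner` are hypothetical.

```python
import math

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists [c0, c1, ...]."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def inner(p, q):
    """<p, q> = integral of p(x) q(x) over [-1, 1], computed term by term."""
    # The integral of x^k over [-1, 1] is 2/(k+1) for even k and 0 for odd k.
    return sum(c * 2 / (k + 1) for k, c in enumerate(poly_mul(p, q)) if k % 2 == 0)

c3 = 3 * math.sqrt(10) / 4
basis = [
    [1 / math.sqrt(2)],     # v1 = 1/sqrt(2)
    [0, math.sqrt(3 / 2)],  # v2 = sqrt(3/2) x
    [-c3 / 3, 0, c3],       # v3 = (3 sqrt(10)/4)(x^2 - 1/3)
]
v = [1, 2, 1]               # v = 1 + 2x + x^2

# One inner product per basis polynomial gives the coefficients.
coeffs = [inner(v, u) for u in basis]

# Rebuild v by adding up coefficient * basis polynomial.
rebuilt = [0.0, 0.0, 0.0]
for c, u in zip(coeffs, basis):
    for k, a in enumerate(u):
        rebuilt[k] += c * a

print(coeffs)   # approximately [8/(3 sqrt(2)), (4/3) sqrt(3/2), 2 sqrt(10)/15]
print(rebuilt)  # approximately [1, 2, 1], i.e. 1 + 2x + x^2
```

Evaluating `inner(u, w)` for every pair of basis polynomials gives values close to 1 when `u` and `w` are the same polynomial and close to 0 otherwise, which is a numerical check (not a proof) that the basis is orthonormal.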


Applications

• Orthonormal bases are used for ease of computations. Refer to the example in $\mathbb{R}^3$ above, where we found $a_1,a_2,a_3$ first with a non-orthonormal basis and then with an orthonormal basis, and saw how much easier the second computation was. For high dimensional vector spaces, this difference may decide whether a computation is infeasible or feasible.
• The method of finding an orthonormal basis (i.e. Gram-Schmidt Orthogonalization Process) is the very foundation of QR decomposition. QR decomposition is further used to solve linear least squares problems and forms the basis of the QR algorithm. You can understand this better after you go through the next lecture notes i.e. Gram-Schmidt Orthogonalization Process.
• Orthonormal bases are used in many derivations. For example, the proof that a hermitian matrix $A_{n\times n}$ is unitarily diagonalisable uses orthonormal bases. An idea of how they are used is given below. Let $\{u_1, u_2, u_3, \cdots, u_n\}$ be an orthonormal basis of column vectors. Stacking the $u_i^T$ as rows and the $u_i$ as columns (for complex vectors, use the conjugate transpose in place of the transpose) gives $\rightarrow$

$\left( \begin{array} {c} u_1^T \\ u_2^T \\ u_3^T \\ \vdots \\ u_n^T \end{array} \right) \left( \begin{array} {c c c c c} u_1 & u_2 & u_3 & \cdots & u_n \end{array} \right) = \left( \begin{array} {c c c c c} u_1.u_1 & u_1.u_2 & u_1.u_3 & \cdots & u_1.u_n \\ u_2.u_1 & u_2.u_2 & u_2.u_3 & \cdots & u_2.u_n \\ \vdots &\vdots &\vdots & \ddots &\vdots \\ u_n.u_1 & u_n.u_2 & u_n.u_3 & \cdots & u_n.u_n \end{array} \right) = \left( \begin{array} {c c c c c} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ \vdots &\vdots &\vdots & \ddots &\vdots \\ 0 & 0 & 0 & \cdots & 1 \end{array} \right) = I_n$

As $I_n$ is also a diagonal matrix, you may get a feeling of how orthonormal bases are used when we prove the theorem stated above.

• Orthonormal bases are also used to make calculations efficient in the statistical analysis of large data sets. Without orthonormal bases, these computations could take an infeasible amount of time.
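The matrix identity in the third application can be sketched numerically. This is a minimal illustration, assuming real vectors with the standard dot product; the orthonormal set `U` is just a sample, and `skewed` is a non-orthonormal basis included for contrast.

```python
import math

def gram(vectors):
    """Matrix whose (i, j) entry is the dot product u_i . u_j."""
    return [[sum(a * b for a, b in zip(ui, uj)) for uj in vectors] for ui in vectors]

s = 1 / math.sqrt(2)
U = [(s, s, 0.0), (-s, s, 0.0), (0.0, 0.0, 1.0)]         # orthonormal
skewed = [(1.0, 2.0, 3.0), (0.0, 1.0, 3.0), (2.0, 0.0, -4.0)]  # a basis, not orthonormal

for row in gram(U):
    print([round(x, 12) for x in row])  # rows of the identity matrix, up to rounding

for row in gram(skewed):
    print(row)                          # off-diagonal entries are not zero
```

For an orthonormal set the product is the identity, so the matrix with the $u_i$ as columns has its transpose as its inverse. For a general basis no such shortcut is available.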


Pause And Ponder

• Think over the fact that mutually orthogonal nonzero vectors are always linearly independent.
• Try to come up with a formal proof of this property of an orthonormal basis: $v= <v,v_1> v_1 + <v,v_2> v_2 +<v,v_3> v_3 +\cdots+<v,v_n> v_n$
• What do you think an orthonormal basis would look like in different vector spaces?


History

Erhard Schmidt got his doctorate in 1905 under David Hilbert for a problem involving integral equations. He also published a paper on integral equations in which he gave the Gram-Schmidt process for obtaining orthonormal bases for certain sets of functions. This generalized results of J.P. Gram, who had considered the problem while studying least squares. However, Laplace had presented the process earlier than either Gram or Schmidt.


References

1. Linear Algebra and its applications - By Gilbert Strang (Fourth Edition)
2. https://www-users.math.umn.edu/~olver/ln_/qr.pdf
3. https://www.ucl.ac.uk/~ucahmdl/LessonPlans/Lesson10.pdf
4. Numerical Analysis - By J. Douglas Faires and Richard L. Burden (Eighth Edition, Pg 500)
5. Rowland, Todd. "Orthonormal Basis." From MathWorld--A Wolfram Web Resource, created by Eric W. Weisstein. https://mathworld.wolfram.com/OrthonormalBasis.html
6. Orthogonality and Orthonormality. Brilliant.org. Retrieved 00:20, June 18, 2020, from https://brilliant.org/wiki/orthogonality-and-orthonormality/
7. https://math.berkeley.edu/~mcivor/math110s12/WS/WS3.14soln.pdf
