The Cauchy-Schwarz Inequality

$$ \newcommand{\bra}[1]{\langle #1 \rvert} \newcommand{\ket}[1]{\lvert #1 \rangle} \newcommand{\inner}[2]{\langle #1 , #2 \rangle} \newcommand{\abs}[1]{\lvert #1 \rvert} \newcommand{\norm}[1]{\lVert #1 \rVert} \newcommand{\xv}[0]{\mathbf x} \newcommand{\yv}[0]{\mathbf y} $$


Statement


For a vector space with an inner product, the Cauchy-Schwarz inequality states that

$$\abs{\inner{\xv}{\yv}}^2 \leq \inner{\xv}{\xv}\inner{\yv}{\yv}$$

The inner product induces the so-called canonical norm:

$$\norm{\xv} = \sqrt{\inner{\xv}{\xv}}$$

We can write the inequality in terms of this norm by taking the square root of both sides:

$$\abs{\inner{\xv}{\yv}} \leq \norm{\xv}\,\norm{\yv}$$

Further, for nonzero \(\xv\) and \(\yv,\) dividing by \(\norm{\xv}\,\norm{\yv}\) gives

$$\abs{\inner{\hat\xv}{\hat\yv}} \leq 1$$

where \(\hat\xv = \xv/\norm{\xv}\) indicates a unit vector. Thus, the Cauchy-Schwarz inequality can also be viewed as a statement about the inner product of unit vectors. This certainly makes sense in the familiar setting of \(\mathbb R^n.\)

All three forms are equivalent to one another.
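The equivalence is easy to check numerically. Below is a minimal sketch using the dot product on \(\mathbb R^3\); the helper names `inner` and `norm` are mine, not anything from the text.

```python
import math

# Hypothetical helpers for the standard inner product on R^n.
def inner(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x))

x, y = [1.0, 2.0, 3.0], [-2.0, 0.5, 4.0]

# Form 1: |<x, y>|^2 <= <x, x><y, y>
assert inner(x, y) ** 2 <= inner(x, x) * inner(y, y)

# Form 2: |<x, y>| <= ||x|| ||y||   (square root of form 1)
assert abs(inner(x, y)) <= norm(x) * norm(y)

# Form 3: |<x_hat, y_hat>| <= 1 for the unit vectors x/||x||, y/||y||
x_hat = [xi / norm(x) for xi in x]
y_hat = [yi / norm(y) for yi in y]
assert abs(inner(x_hat, y_hat)) <= 1.0
```

Any other pair of nonzero vectors works just as well; the three assertions succeed or fail together.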

Importance


The Cauchy-Schwarz inequality is important because it lets us extend the notion of the angle between vectors to any space with a real inner product. This is done by defining

$$\cos\theta = \frac{\langle \xv, \yv\rangle}{\sqrt{\langle \xv, \xv\rangle\langle \yv, \yv\rangle}} = \inner{\hat\xv}{\hat\yv}$$

The Cauchy-Schwarz inequality ensures the right-hand side of this equation is in \([-1, 1]\) and, as such, matches the domain of \(\arccos.\)

Consider the familiar vectors in \(\mathbb R^n\), for which the inner product is the usual scalar (dot) product:

$$\inner{\xv}{\yv} = \sum_{i}x_i y_i$$

Here we know that \(\inner{\xv}{\yv} = \xv\cdot\yv = \norm{\xv}\,\norm{\yv}\,\cos(\theta)\). This motivates the generalization of angle to arbitrary inner product spaces.
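As a quick sanity check of the angle formula, here is a sketch in \(\mathbb R^2\) with vectors whose angle we know geometrically; the `inner` helper is my own name for the dot product.

```python
import math

# The vectors (1, 0) and (1, 1) meet at 45 degrees.
def inner(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

x, y = [1.0, 0.0], [1.0, 1.0]
cos_theta = inner(x, y) / math.sqrt(inner(x, x) * inner(y, y))

# Cauchy-Schwarz guarantees cos_theta lies in [-1, 1], so acos is defined.
theta = math.acos(cos_theta)
print(math.degrees(theta))  # approximately 45.0
```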

Proof


Decompose \(\hat\xv\) into a component parallel to \(\hat\yv\) and a component perpendicular to \(\hat\yv:\)

$$\hat\xv = \hat\xv_\parallel + \hat\xv_\perp = a\hat\yv + \hat\xv_\perp$$

We can calculate \(a\) using the requirement that \(\inner{\hat\yv}{\hat\xv_\perp} = 0:\)

$$\inner{\hat\yv}{\hat\xv_\perp} = \inner{\hat\yv}{\hat\xv - a\hat\yv} = \inner{\hat\yv}{\hat\xv} - a\inner{\hat\yv}{\hat\yv} = \inner{\hat\yv}{\hat\xv} - a = 0$$

Thus \(a = \inner{\hat\yv}{\hat\xv}.\) Because we now have two perpendicular components of \(\hat\xv,\) we can apply Pythagoras’ theorem:

$$\begin{align} \norm{\hat\xv}^2 &= \norm{\hphantom{.}\hat\xv_\parallel}^2 + \norm{\hphantom{.}\hat\xv_\perp}^2 \\ 1 &= \norm{\inner{\hat\xv}{\hat\yv}\hat\yv}^2 + \norm{\hphantom{.}\hat\xv_\perp}^2 \\ 1 &= |\inner{\hat\xv}{\hat\yv}|^2\,\norm{\hat\yv}^2 + \norm{\hphantom{.}\hat\xv_\perp}^2 \\ 1 - \norm{\hphantom{.}\hat\xv_\perp}^2 &= |\inner{\hat\xv}{\hat\yv}|^2 \\ \end{align}$$

but \(\norm{\hphantom{.}\hat\xv_\perp}^2 \geq 0,\) thus the final result

$$|\inner{\hat\xv}{\hat\yv}| \leq 1\qquad\square$$

I have shown the proof using unit vectors. However, nothing changes if you use unnormalized (nonzero) vectors: you will just see a different value for \(a,\) and have \(\norm{\xv}\) instead of \(\norm{\hat\xv},\) resulting in the alternative form above.
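The steps of the proof can be replayed numerically. The sketch below, with helper names of my own choosing, computes \(a = \inner{\hat\yv}{\hat\xv},\) checks that the remainder is orthogonal to \(\hat\yv,\) and verifies the Pythagorean identity up to floating-point error.

```python
import math

def inner(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def normalize(x):
    n = math.sqrt(inner(x, x))
    return [xi / n for xi in x]

x_hat = normalize([3.0, 1.0, -2.0])
y_hat = normalize([1.0, 4.0, 2.0])

a = inner(y_hat, x_hat)                        # coefficient of the parallel part
x_par = [a * yi for yi in y_hat]               # component along y_hat
x_perp = [xi - pi for xi, pi in zip(x_hat, x_par)]

# The perpendicular component really is orthogonal to y_hat ...
assert abs(inner(y_hat, x_perp)) < 1e-12

# ... and Pythagoras gives 1 = |<x_hat, y_hat>|^2 + ||x_perp||^2 ...
assert abs(1.0 - (a ** 2 + inner(x_perp, x_perp))) < 1e-12

# ... hence |<x_hat, y_hat>| <= 1.
assert abs(a) <= 1.0
```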

Intuition


The parallel vector used in the proof above is an example of the vector projection. For arbitrary (i.e. not necessarily unit) vectors it looks like

$$\begin{align} \xv_\parallel &= \inner{\xv}{\hat{\yv}}\,\hat{\yv} = \norm{\xv}\,\hat{\xv}_\parallel\\ \yv_\parallel &= \inner{\yv}{\hat{\xv}}\,\hat{\xv} = \norm{\yv}\,\hat{\yv}_\parallel \end{align}$$
It is easy to show this projection is idempotent (as any projection must be) and that it is, moreover, an orthogonal projection.

The Cauchy-Schwarz inequality tells us that these projections are never longer than the original vectors, with equality only when the vectors are aligned (including being antiparallel). That is

$$\norm{\,\mathbf x_\parallel} \leq \norm{\mathbf x} \quad\text{and}\quad \norm{\,\mathbf y_\parallel} \leq \norm{\mathbf y}$$

thus

$$\norm{\,\mathbf x_\parallel}\cdot\norm{\,\mathbf y_\parallel} \leq \norm{\mathbf x}\cdot\norm{\mathbf y}$$
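These two facts, and the idempotence claimed above, can be checked with a short sketch; the `project` helper is my own name for projection onto the line spanned by a vector.

```python
import math

def inner(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x))

def project(x, y):
    """Orthogonal projection of x onto the line spanned by y."""
    c = inner(x, y) / inner(y, y)
    return [c * yi for yi in y]

x, y = [2.0, 1.0], [1.0, 3.0]
x_par = project(x, y)
y_par = project(y, x)

# Each projection is no longer than the original vector ...
assert norm(x_par) <= norm(x)
assert norm(y_par) <= norm(y)

# ... so the product of the projected lengths is bounded too.
assert norm(x_par) * norm(y_par) <= norm(x) * norm(y)

# Idempotence: projecting a second time changes nothing.
again = project(x_par, y)
assert all(abs(a - b) < 1e-12 for a, b in zip(again, x_par))
```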

Visualization


This interactive visualization shows two vectors in \(\mathbb R^2\) and their projections onto each other. Try dragging the ends of the yellow and blue vectors.

The rectangles represent the product of the magnitudes of the two vectors and the product of the magnitudes of their projections, respectively. You can see the area of the orange rectangle is always less than or equal to that of the purple rectangle.


