Quantum State Spaces
These are some old, and pretty raw, notes I took while studying quantum mechanics.
Mathematical Spaces
Vector Space
As a reminder, the definition of a vector space is as follows. A vector space is a set $V$ over a field $\mathbb F$ (e.g. $\mathbb R$ or $\mathbb C$) together with two operations
- Vector addition $+:V\times V \to V$
- Scalar multiplication $\cdot:\mathbb F\times V \to V$
that satisfy the following properties
| Operation | Property | Definition |
|---|---|---|
| Addition | Associative | $(\mathbf u + \mathbf v) + \mathbf w = \mathbf u + (\mathbf v + \mathbf w)$ |
| | Commutative | $\mathbf u + \mathbf v = \mathbf v + \mathbf u$ |
| | Identity element | $\exists\,\,\mathbf 0\in V : \,\, \mathbf v + \mathbf 0 = \mathbf v$ |
| | Inverse element | $\exists\,\, (-\mathbf v)\in V:\,\, \mathbf v + (-\mathbf v):=\mathbf v - \mathbf v = \mathbf 0$ |
| Multiplication | Compatibility | $a(b\mathbf v) = (ab)\mathbf v$ |
| | Identity element | $\exists\,\, 1\in \mathbb F:\,\,1\cdot\mathbf v = \mathbf v$ |
| | Distributivity wrt vector addition | $a(\mathbf v + \mathbf u) = a\mathbf v + a\mathbf u$ |
| | Distributivity wrt field addition | $(a + b)\mathbf v = a\mathbf v + b\mathbf v$ |
Inner-product Space
An inner-product space is a vector space equipped with a binary function $\inp \cdot \cdot: V\times V \to \mathbb F$, called the inner product, that satisfies the following properties
| Property | Definition |
|---|---|
| Linearity | $\inp{a \mathbf x}{\mathbf y} = a\inp{\mathbf x}{\mathbf y}$ |
| | $\inp{\mathbf x + \mathbf y}{\mathbf z} = \inp{\mathbf x}{\mathbf z} + \inp{\mathbf y}{\mathbf z}$ |
| Conjugate symmetry | $\inp{\mathbf x}{\mathbf y}= \inp{\mathbf y}{\mathbf x}^*$ |
| Positive definiteness | $\inp{\mathbf x}{\mathbf x} > 0$ for \(\mathbf x \neq \mathbf 0\) |
For the last property, conjugate symmetry ensures that $\inp{\mathbf x}{\mathbf x}$ is real, since $\inp {\mathbf x} {\mathbf x} = \inp{\mathbf x}{\mathbf x}^*$, so comparing it with zero makes sense. These properties also imply that \(\inp{\mathbf 0}{\mathbf 0} = 0\). This follows from the linearity of the inner product: \(\inp{\mathbf 0}{\mathbf 0} = \inp{0\cdot\mathbf x}{\mathbf 0} = 0\,\inp{\mathbf x}{\mathbf 0} = 0\).
It is also easy to show that the following properties hold
- $\langle \mathbf x, \mathbf y + \mathbf z\rangle = \langle \mathbf x, \mathbf y \rangle + \langle\mathbf x, \mathbf z\rangle$
- $\langle \mathbf x, a \mathbf y \rangle = a^*\langle \mathbf x, \mathbf y \rangle$
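As a quick numerical sanity check, here is a minimal sketch that assumes the standard inner product on $\mathbb C^n$, $\langle \mathbf x, \mathbf y\rangle = \sum_i x_i y_i^*$ (linear in the first argument, matching the table above). The helper names `inner`, `scale`, `add` and `close` are made up for this illustration.

```python
def inner(x, y):
    """Standard inner product on C^n, linear in the FIRST argument."""
    return sum(xi * yi.conjugate() for xi, yi in zip(x, y))

def scale(c, v):
    return [c * vi for vi in v]

def add(u, v):
    return [ui + vi for ui, vi in zip(u, v)]

def close(p, q, tol=1e-12):
    return abs(p - q) < tol

# Arbitrary example vectors and scalar (assumptions, not from the notes)
x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 0.5j]
z = [-1j, 4 + 0j]
a = 2 - 3j

checks = {
    "linear in first argument": close(inner(scale(a, x), y), a * inner(x, y)),
    "conjugate symmetry": close(inner(x, y), inner(y, x).conjugate()),
    "additive in second argument": close(inner(x, add(y, z)), inner(x, y) + inner(x, z)),
    "conjugate-linear in second argument": close(inner(x, scale(a, y)), a.conjugate() * inner(x, y)),
    "positive and real on the diagonal": inner(x, x).real > 0 and close(inner(x, x).imag, 0),
}

for name, ok in checks.items():
    print(f"{name}: {ok}")
```

A check on a few vectors is of course not a proof, it only guards against sign and conjugation slips.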
Cauchy-Schwarz Inequality
The Cauchy-Schwarz inequality states that
$$|\langle \mathbf x, \mathbf y\rangle|^2 \leq \langle \mathbf x, \mathbf x\rangle\langle \mathbf y, \mathbf y\rangle$$
Taking square roots of both sides (and preempting the next section slightly) we can see that this is effectively saying that the magnitude of the inner product of two vectors is at most the product of their magnitudes (equivalently, the geometric mean of the squared magnitudes). Intuitively, think of the scalar product of ordinary $\mathbb R^n$ vectors: if we increase the length of $\mathbf x$ or $\mathbf y$ the dot product can get larger, but never larger than the product of the two lengths, with equality only when the vectors are parallel.
This lets us extend the notion of angles between vectors to abstract inner-product spaces with real inner products as
$$\cos\theta = \frac{\langle \mathbf x, \mathbf y\rangle}{\sqrt{\langle \mathbf x, \mathbf x\rangle\langle \mathbf y, \mathbf y\rangle}}$$where the Cauchy-Schwarz inequality ensures the right-hand side of this equation is in $[-1, 1]$.
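To make the angle formula concrete, here is a small sketch for ordinary real vectors, assuming the usual dot product as the inner product. The function names are illustrative only, and the clamp just guards against floating-point rounding pushing the ratio slightly outside $[-1, 1]$.

```python
import math

def dot(x, y):
    """Real inner product on R^n."""
    return sum(xi * yi for xi, yi in zip(x, y))

def angle(x, y):
    """Angle between x and y from cos(theta) = <x,y> / sqrt(<x,x><y,y>)."""
    cos_theta = dot(x, y) / math.sqrt(dot(x, x) * dot(y, y))
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding error
    return math.acos(cos_theta)

x = [1.0, 0.0, 0.0]
y = [1.0, 1.0, 0.0]

theta = angle(x, y)  # 45 degrees, i.e. pi/4
cauchy_schwarz_holds = dot(x, y) ** 2 <= dot(x, x) * dot(y, y)
print(math.degrees(theta), cauchy_schwarz_holds)
```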
Normed Vector Space
The square-root of the inner-product of a vector with itself defines a norm
$$||\mathbf x|| := \sqrt{\langle \mathbf x, \mathbf x\rangle}$$and thus we also have a normed vector space.
The norm must obey the following properties
| Property | Definition | Note |
|---|---|---|
| Non-negative | $\|\mathbf x\| \geq 0$ | $\|\mathbf x\|=0 \Leftrightarrow \mathbf x=\mathbf 0$ |
| Absolutely homogeneous | $\|a\mathbf x\| = \lvert a\rvert\cdot\|\mathbf x\|\quad$ | e.g. $\|-2\mathbf x\|=2\|\mathbf x\|\neq -2\|\mathbf x\|$ |
| Subadditive | $\|\mathbf x + \mathbf y\| \leq \|\mathbf x\| + \|\mathbf y\|\quad$ | i.e. the triangle inequality |
Properties 1 and 2 follow simply from the properties of the inner product, and property 3 can be proven using the Cauchy-Schwarz inequality.
Proof
Both sides of the inequality are real and non-negative, so it is equivalent to the squared inequality, which we simplify step by step
$$\begin{align} \|x+y\|^2 &\leq (\|x\| + \|y\|)^2 \\ \inp {x+y} {x+y} &\leq \|x\|^2 + \|y\|^2 + 2\|x\|\,\|y\| \\ \inp x x + \inp y y + \inp x y + \inp y x &\leq \|x\|^2 + \|y\|^2 + 2\sqrt{\inp x x \inp y y} \\ \|x\|^2 + \|y\|^2 + \inp x y + {\inp x y}^* &\leq \|x\|^2 + \|y\|^2 + 2\sqrt{\inp x x \inp y y} \\ \inp x y + {\inp x y}^* &\leq 2\sqrt{\inp x x \inp y y} \end{align}$$Since $z + z^* = 2\,\mathrm{Re}\,z \leq 2|z|$ for any complex number $z$, we know that
$$\inp x y + {\inp x y}^* \leq 2|\inp x y|$$holds. This means the inequality above holds if
$$|\inp x y| \leq \sqrt{\inp x x \inp y y}$$which is true because this is just the square root of the Cauchy-Schwarz inequality.
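The proof can be spot-checked numerically. The sketch below assumes the norm induced by the standard inner product on $\mathbb C^4$ and tests absolute homogeneity and the triangle inequality on random vectors. The seed, dimension and tolerances are arbitrary choices, and passing such a test is evidence, not a proof.

```python
import math
import random

def norm(x):
    """Norm induced by the standard inner product on C^n."""
    return math.sqrt(sum(abs(xi) ** 2 for xi in x))

rng = random.Random(0)  # fixed seed so the run is reproducible

def random_vector(n):
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]

triangle_ok = True
homogeneous_ok = True
for _ in range(1000):
    x, y = random_vector(4), random_vector(4)
    a = complex(rng.uniform(-2, 2), rng.uniform(-2, 2))
    x_plus_y = [xi + yi for xi, yi in zip(x, y)]
    # Subadditivity, with a tiny slack for floating-point rounding
    triangle_ok &= norm(x_plus_y) <= norm(x) + norm(y) + 1e-12
    # Absolute homogeneity: ||a x|| = |a| ||x||
    homogeneous_ok &= math.isclose(
        norm([a * xi for xi in x]), abs(a) * norm(x), rel_tol=1e-9, abs_tol=1e-12
    )

print(triangle_ok, homogeneous_ok)
```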
Metric Space
Since we have a norm we can also define a metric $\delta: V\times V \to \mathbb R_{\geq 0}$
$$\delta(\mathbf x, \mathbf y) := ||\mathbf x - \mathbf y||$$that satisfies all the required properties of a metric
| Property | Definition |
|---|---|
| Identity of indiscernibles | $\,\,\,\delta(\mathbf x, \mathbf y) = 0 \Leftrightarrow \mathbf x = \mathbf y$ |
| Symmetry | $\,\,\,\delta(\mathbf x, \mathbf y) = \delta(\mathbf y, \mathbf x)$ |
| Triangle inequality | $\,\,\,\delta(\mathbf x, \mathbf y) \leq \delta(\mathbf x, \mathbf z) + \delta(\mathbf z, \mathbf y)$ |
Property 3 follows directly from the definition of $\delta$ and the fact that the norm obeys the triangle inequality, applied to $\mathbf x - \mathbf y = (\mathbf x - \mathbf z) + (\mathbf z - \mathbf y)$.
Proof of 1:
The direction $\mathbf x = \mathbf y \Rightarrow \delta(\mathbf x, \mathbf y) = 0$ is clear;
$$\delta(\mathbf x, \mathbf x) = ||\mathbf x-\mathbf x|| = ||\mathbf 0|| = \sqrt{\langle \mathbf 0, \mathbf 0 \rangle} = 0$$In the other direction, positive definiteness of the inner product gives
$$0 = \delta(\mathbf x, \mathbf y)^2 = \langle \mathbf x - \mathbf y, \mathbf x - \mathbf y \rangle \,\,\Rightarrow\,\, (\mathbf x - \mathbf y) = \mathbf 0 \,\,\Rightarrow\,\, \mathbf x = \mathbf y$$
Proof of 2:
Use the properties of the inner product to pull a minus sign out of each argument
$$\delta(\mathbf x, \mathbf y)^2 = \langle \mathbf x - \mathbf y, \mathbf x - \mathbf y \rangle = -\langle \mathbf y - \mathbf x, \mathbf x - \mathbf y \rangle = \langle \mathbf y - \mathbf x, \mathbf y - \mathbf x \rangle = \delta(\mathbf y, \mathbf x)^2$$where we can also ignore the squares since $\delta(\mathbf x, \mathbf y) \geq 0$.
Interestingly, non-negativity can be proved from the three properties above
$$0 = \delta(\mathbf x, \mathbf x) \leq \delta(\mathbf x, \mathbf y) + \delta(\mathbf y, \mathbf x) = 2\delta(\mathbf x, \mathbf y)$$which implies $\delta(\mathbf x, \mathbf y) \geq 0$.
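A concrete instance, assuming the Euclidean metric on $\mathbb R^2$ and a 3-4-5 right triangle so that every distance is exact in floating point. The function name `distance` is chosen just for this example.

```python
import math

def distance(x, y):
    """Metric induced by the Euclidean norm: delta(x, y) = ||x - y||."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

x, y, z = (0.0, 0.0), (3.0, 4.0), (3.0, 0.0)

d_xy = distance(x, y)  # 5.0
d_xz = distance(x, z)  # 3.0
d_zy = distance(z, y)  # 4.0

identity_ok = distance(x, x) == 0.0
symmetry_ok = distance(x, y) == distance(y, x)
triangle_ok = d_xy <= d_xz + d_zy  # 5 <= 3 + 4
print(d_xy, d_xz, d_zy, identity_ok, symmetry_ok, triangle_ok)
```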
Cauchy Sequences & Completeness
In a metric space $M$ a sequence $x_1, x_2, ...$ is ‘Cauchy’ if for every real $\epsilon > 0$ there exists an $N \in \mathbb N$ such that $\delta(x_m, x_n) < \epsilon$ for all $m, n > N$. Roughly, the terms of the sequence get arbitrarily close to each other. Note however that such a sequence need not have a limit in $M$; for example, a sequence of rational numbers approaching $\sqrt 2$ is Cauchy in $\mathbb Q$ but has no limit there. If every Cauchy sequence of elements of $M$ has a limit that itself lies in $M$, then $M$ is said to be complete.
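A standard illustration of an incomplete space is the rationals $\mathbb Q$ with $\delta(x, y) = |x - y|$. The sketch below uses Newton's iteration for $\sqrt 2$ (my choice of example, not something from the notes) with exact rational arithmetic: every term is rational and the gaps between successive terms shrink rapidly, yet no term can square to 2, because the only candidate limit, $\sqrt 2$, is irrational.

```python
from fractions import Fraction

def newton_sqrt2(n_terms):
    """First n_terms of x_{k+1} = (x_k + 2 / x_k) / 2, starting from x_1 = 1."""
    x = Fraction(1)
    terms = [x]
    for _ in range(n_terms - 1):
        x = (x + 2 / x) / 2
        terms.append(x)
    return terms

terms = newton_sqrt2(6)
gaps = [abs(b - a) for a, b in zip(terms, terms[1:])]

for term, gap in zip(terms[1:], gaps):
    print(term, float(gap))
```

Shrinking consecutive gaps alone do not prove a sequence is Cauchy in general; here they are only meant to make the behaviour visible.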
Hilbert Spaces
A Hilbert space is a complete inner-product space. We can see that being an inner-product space is already quite a strong statement since it implies the space is also a normed vector space and a metric space. Despite being completely abstract we can also discuss things such as the angle between vectors, their lengths, and distances between them.
Some usual examples of Hilbert spaces are
Euclidean space: The vectors are just ordinary vectors in $\mathbb R^N$, the inner product is the scalar product.
$$\langle \mathbf x, \mathbf y \rangle = \sum_{\alpha=1}^N x_\alpha y_\alpha $$
Function spaces: The vectors are (square-integrable) functions, which can be expanded in a suitable basis (e.g. Fourier decomposition); the inner product can be defined as an integral
$$\langle f, g \rangle = \int_\mathbb{R} \!f(x)g^*(x) \,dx$$
Dual Space & Bra-Ket Notation
A linear function $\phi : V \to \mathbb F$ has the property
$$\phi\left(a\mathbf v + b \mathbf u\right) = a\phi(\mathbf v) + b\phi(\mathbf u)$$The set of all such linear functions itself forms a vector space where the vectors are the functions themselves;
- Scalar multiplication: $a\phi(\cdot)$
- Vector addition: $\phi(\cdot) + \rho(\cdot)$
It is clear that all the required properties for a vector space are fulfilled by linear functions. This space of linear functions is the so-called “dual” vector space to $V$, which is denoted by $V^*$. In the cases considered here we can also identify every linear function $V \to \mathbb F$ with a vector in $V$.
For something like $\mathbb R^N$ Euclidean space the identification is clear since every linear function is just a linear combination of the components of $\mathbf v$:
$$\phi(\mathbf v) = \sum_{i=1}^N \phi_i v_i = \boldsymbol \phi \cdot \mathbf v$$This is just the dot product of $\mathbf v$ with some vector $\boldsymbol \phi$, with components $\phi_i$. The situation is very similar for functions; now we can just use the inner product defined above (an integral is kinda like a big sum). For more abstract vector spaces it is less obvious that each linear function can be identified with a vector, and the argument is quite technical; for Hilbert spaces it is guaranteed, for continuous linear functions, by the Riesz representation theorem.
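For $\mathbb R^N$ the identification can be made explicit: the components $\phi_i$ are recovered by applying the linear function to the standard basis vectors $\mathbf e_i$. A minimal sketch, with `make_functional` and `components` being hypothetical helper names:

```python
def make_functional(phi):
    """Return the linear function v -> sum_i phi_i * v_i."""
    def functional(v):
        return sum(p * vi for p, vi in zip(phi, v))
    return functional

def components(functional, n):
    """Recover phi_i = functional(e_i) using the standard basis of R^n."""
    basis = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    return [functional(e) for e in basis]

phi = make_functional([2.0, -1.0, 0.5])
v = [1.0, 4.0, 2.0]

value = phi(v)                  # 2*1 - 1*4 + 0.5*2 = -1.0
recovered = components(phi, 3)  # the vector identified with phi
print(value, recovered)
```

The function `phi` is treated as a black box in `components`, which is the point: knowing its values on a basis is enough to reconstruct the vector it corresponds to.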
The inner product is linear in its first argument, so for each fixed $\mathbf y$ we can identify $\langle\cdot,\mathbf y\rangle$ with some element of $V^*$; this is the element associated with the vector $\mathbf y$. In Dirac’s bra-ket notation the vector $\mathbf x$ is denoted $|\mathbf x\rangle\in V$ (a ‘ket’) and the corresponding dual vector is denoted $\langle \mathbf x|\in V^*$ (a ‘bra’). Then the application of the linear function to a vector is represented by juxtaposition, but we elide the extra $|$
$$\langle \mathbf x || \mathbf y\rangle := \langle \mathbf x | \mathbf y\rangle = \langle \mathbf x, \mathbf y \rangle $$A caveat on conventions: a bra-ket is linear in the ket and conjugate-linear in the bra, so the right-hand side here is the inner product in the physics convention (linear in the second argument). In the convention used in the table above (linear in the first argument) the same number is written $\langle \mathbf y, \mathbf x \rangle$. With the physics convention, if we have some operator that acts on the states then
$$\langle \mathbf x | L | \mathbf y\rangle = \langle \mathbf x, L \mathbf y\rangle$$
Important!
Do not think of the contents of bra- or ket notation as necessarily numbers or vectors. They are simply labels for vectors in Hilbert space.
Often the labels will correspond to numbers (e.g. for numerical observations such as position or momentum) so we do sometimes see algebraic operations happening on the labels themselves, but do not be fooled.
The canonical example of this is a qubit realized by a spin-\(\tfrac{1}{2}\) system. Such systems have the basis vectors \(\ket\uparrow\) and \(\ket\downarrow\). It makes sense to add & multiply the vectors, e.g. \(\ket\psi = (\ket\uparrow + \ket\downarrow)/\sqrt 2,\) but not the labels \(\uparrow\) and \(\downarrow.\)
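One way to see the label/vector distinction in code is to store a ket as a mapping from basis labels to amplitudes. This representation is just an assumption made for illustration: the labels are dictionary keys, and all arithmetic happens on the amplitudes, never on the labels.

```python
import math

# Basis kets: the strings "up" and "down" are labels, not numbers.
ket_up = {"up": 1 + 0j, "down": 0j}
ket_down = {"up": 0j, "down": 1 + 0j}

def add(ket1, ket2):
    """Vector addition acts on amplitudes, label by label."""
    return {label: ket1[label] + ket2[label] for label in ket1}

def scale(c, ket):
    """Scalar multiplication acts on amplitudes only."""
    return {label: c * amp for label, amp in ket.items()}

# psi = (|up> + |down>) / sqrt(2)
psi = scale(1 / math.sqrt(2), add(ket_up, ket_down))
squared_norm = sum(abs(amp) ** 2 for amp in psi.values())
print(psi, squared_norm)
```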
Classical vs. Quantum State-spaces
The canonical write-up about the tensor product, and its relation to quantum state-spaces, is The Tensor Product, Demystified by Tai-Danae Bradley.
In classical physics we often describe the phase-space or state-space. What is different from the quantum mechanical view in this case?
Example: Two States
To answer this it’s easiest to step back somewhat and consider the simplest possible system: a classical system that can only be in two states, labelled $a$ and $b$. In this case, our state space is simply $\Omega = \{a, b\}$. It isn’t even 1-dimensional, it’s just a finite set. But since the system is classical we know it is definitely in one and only one of the two states. To fully describe the state of the system we need one piece of information; “$a$ or $b$?”.
Now consider the quantum analog; a system that can be measured to be in either of two states, also labelled $a$ and $b$. In this case we have a probability to measure $a$, and a probability to measure $b$. This is clearly different from the classical case where the system is in some definite state. The states $a$ and $b$ are actually the (orthogonal) axes of our Hilbert state-space, so our state space is 2-dimensional. The position in $\mathcal H$ tells us the relative probabilities of observing state $a$ or state $b$.
Compare the classical and quantum cases; in both we have a system that can be measured to be in one of two states. However, in the classical case our state-space is a single binary bit, whereas in the quantum case it is a 2-dimensional (complex) space.
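The contrast can be sketched in a few lines. The classical state is one label; the quantum state is a pair of complex amplitudes. The sketch assumes the usual Born rule (the probability of an outcome is the squared magnitude of its normalised amplitude), which these notes use implicitly, and the particular amplitudes are made up.

```python
import math

# Classical: the whole state is a single label, i.e. one bit of information.
classical_state = "a"

# Quantum: one complex amplitude per measurable state (values are arbitrary).
alpha, beta = 3 + 0j, 4j

# Normalise so that the probabilities sum to one.
length = math.sqrt(abs(alpha) ** 2 + abs(beta) ** 2)
alpha, beta = alpha / length, beta / length

p_a = abs(alpha) ** 2  # probability of measuring "a": 9/25
p_b = abs(beta) ** 2   # probability of measuring "b": 16/25
print(classical_state, p_a, p_b)
```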
Example: Position in 1-dimension
Consider a particle that is located at some point in one dimension. Classically we need just one real number to describe the whole state, namely the position of the particle. For a quantum system we have a probability to measure the particle at every position, and this probability is set by some (complex) number, i.e. the state is specified by a complex function on the real line. Every position therefore has an associated complex number. Perhaps somewhat counter-intuitively, each position we can measure is an orthogonal axis in our Hilbert space, meaning that in this case our quantum state-space is uncountably infinite dimensional!
Comparing the quantum and the classical cases here, we go from one (real) number describing the whole system classically, to uncountably infinitely many (complex) numbers describing the whole quantum system.
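A computer cannot hold uncountably many numbers, but a discretised sketch shows the same jump in size: one float classically versus one complex amplitude per grid point. The grid, the Gaussian wave packet and its parameters below are all assumptions chosen for illustration.

```python
import cmath
import math

# Classical state: a single real number.
classical_position = 0.3

# Discretised quantum state: one complex amplitude for every grid point.
n_points = 201
xs = [-5 + 10 * i / (n_points - 1) for i in range(n_points)]
dx = xs[1] - xs[0]

# Hypothetical wave packet: Gaussian envelope times a plane-wave phase.
amplitudes = [math.exp(-x * x / 2) * cmath.exp(2j * x) for x in xs]

# Normalise so the total probability over the grid is one.
total = sum(abs(a) ** 2 for a in amplitudes) * dx
amplitudes = [a / math.sqrt(total) for a in amplitudes]
probabilities = [abs(a) ** 2 * dx for a in amplitudes]

print(len(amplitudes), sum(probabilities))
```

Refining the grid increases the number of amplitudes without bound, which is the finite shadow of the infinite-dimensional statement above.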
These two examples hopefully make clear quite how much bigger quantum state-spaces are than classical ones. But they mainly serve to illustrate the fact that the possible states we can measure a system to be in are orthogonal. In the classical case this simply does not matter, since only one state ever has non-zero probability of being observed (i.e. probability 1) and thus we can just identify the labels of the states with the state-space. This is not the case in quantum mechanics.
Composite Systems
Classical Systems $\Rightarrow$ Direct Sum
In classical mechanics, if we compose two systems $A$ and $B$, the possible states of the combined system are given by the cartesian product of the component state spaces, $A \times B$. Thus the number of possible states is $|A||B|$. However, to fully describe which state we are in we need only $\mathrm{dim}(A) + \mathrm{dim}(B)$ numbers: the state-space is the direct sum $A\oplus B$, where $\mathrm{dim}(A\oplus B) = \mathrm{dim}(A) + \mathrm{dim}(B)$.
Quantum Systems $\Rightarrow$ Tensor Product
In quantum mechanics the composition of systems is given by the tensor product of their state spaces. If system A has measurable states $A$ and system B has measurable states $B$ then the state spaces of the isolated systems are $\mathcal{H}_A$ and $\mathcal{H}_B$, with dimensions $\mathrm{dim}(\mathcal{H}_A) = |A|$ and $\mathrm{dim}(\mathcal{H}_B)=|B|$. As for the classical system, the combined system has possible observable states in $A \times B$, but until we observe it we do not know exactly which state it is in, so we must track the probabilities of being in each of the $|A||B|$ possible states. Thus the state space is the tensor product of the Hilbert spaces $\mathcal{H}_A \otimes \mathcal{H}_B$, which has $\mathrm{dim}(\mathcal{H}_A \otimes \mathcal{H}_B) = \mathrm{dim}(\mathcal{H}_A)\,\mathrm{dim}(\mathcal{H}_B)$.
Example
Let $A = \{a_1, a_2, a_3\}$, $B=\{b_1, b_2, b_3\}$. The possible states for a classical system to be in are
$$\begin{align} A\times B = \{&a_1b_1,\,\, a_1b_2,\,\, a_1b_3, \\ &a_2b_1,\,\, a_2b_2,\,\, a_2b_3, \\ &a_3b_1,\,\, a_3b_2,\,\, a_3b_3\} \end{align}$$
It has nine possible values, $|A||B|=3\times 3 = 9$, but we need only two ternary digits to specify which state we’re in, $\mathrm{dim}(A) + \mathrm{dim}(B) = 1 + 1 = 2$.
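The classical counting can be reproduced directly. In this sketch the labels are plain strings and a joint state is stored as a pair of indices, one per subsystem; this representation is an assumption made for illustration.

```python
from itertools import product

A = ["a1", "a2", "a3"]
B = ["b1", "b2", "b3"]

# All joint states: the cartesian product A x B, nine labels in total.
joint_states = [a + b for a, b in product(A, B)]

# A definite classical state needs only one index into A and one into B.
state = (1, 2)  # second element of A, third element of B
label = A[state[0]] + B[state[1]]

print(len(joint_states), joint_states)
print(len(state), label)
```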
For the analogous quantum system the possible measurable states are also nine, but we need to track the coefficients of all of them to know their relative probabilities
$$\begin{align} |\psi\rangle =\, &\gamma_{11} |a_1b_1\rangle + \gamma_{12} |a_1b_2\rangle + \gamma_{13} |a_1b_3\rangle \,+ \\ &\gamma_{21} |a_2b_1\rangle + \gamma_{22} |a_2b_2\rangle + \gamma_{23} |a_2b_3\rangle \,+ \\ &\gamma_{31} |a_3b_1\rangle + \gamma_{32} |a_3b_2\rangle + \gamma_{33} |a_3b_3\rangle \hphantom{\,+}= \sum_{ij} \gamma_{ij} |a_i b_j\rangle \end{align}$$with $\gamma_{ij} \in \mathbb{C}$, i.e. one complex coefficient for each of the nine measurable states.
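For the special case where the two subsystems are prepared independently, the coefficients factorise as $\gamma_{ij} = \alpha_i \beta_j$, which is the Kronecker (tensor) product of the two amplitude vectors. The sketch below builds the nine coefficients this way; the amplitudes are made up, and a general state need not factorise like this.

```python
from itertools import product

def kron(alpha, beta):
    """Tensor product of two amplitude vectors: gamma_ij = alpha_i * beta_j."""
    return [a * b for a, b in product(alpha, beta)]

alpha = [1 + 0j, 0j, 0j]     # system A certainly measured as a1
beta = [0.6 + 0j, 0.8j, 0j]  # system B: 36% b1, 64% b2, 0% b3

gamma = kron(alpha, beta)
labels = [f"a{i}b{j}" for i, j in product(range(1, 4), repeat=2)]
probabilities = {label: abs(g) ** 2 for label, g in zip(labels, gamma)}

# dim(H_A tensor H_B) = dim(H_A) * dim(H_B)
print(len(gamma) == len(alpha) * len(beta))
print(probabilities)
```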