Equation Covariance under Coordinate Transforms

April 2025

The laws of physics are symmetric under various transformations (for example rotation, translation, etc.). This means that after performing such a transformation we cannot do any experiment which will tell us it has happened. The equations of motion are left unchanged! What does this look like mathematically? Read on to find out!

Preliminaries

Coordinate Transforms

Consider a coordinate system $S$ with the current coordinate denoted as $\mathbf x\in\mathbb X$. We can convert into another coordinate system $S^\prime$ by a bijective mapping $M$ (for map), where $M:\mathbb X \mapsto\mathbb X:$

$$\mathbf x^\prime = M\mathbf x \label{eq:xprime_defn}\tag{1}$$

Since it’s a bijection the inverse $M^{-1}$ also exists. Additionally, because we are interested in differential equations we will also require that $M$ be differentiable.

For physical systems the coordinate will often include time as its zeroth component, but that does not change any of the discussion or results.

Scalar Invariance

For a scalar property invariance means the value does not change if we use the new coordinates in place of the old. Mathematically

$$s = f(\mathbf x_1, \mathbf x_2, \dots) = f(\mathbf x_1^\prime, \mathbf x_2^\prime, \dots)$$

So, for example the Euclidean distance remains unchanged under translation

$$r^2 = (x_2 - x_1)^2 = (x_2 - L - x_1 + L)^2 = (x^\prime_2 - x^\prime_1)^2$$

or the acceleration is invariant under Galilean transformations

$$a = \frac{dv}{dt} = \frac{d(v - u + u)}{dt} = \frac{d(v^\prime+u)}{dt} = \frac{dv^\prime}{dt}$$
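As a quick sanity check, the two scalar examples above can be verified numerically. This is only a sketch: the shift `L`, the boost velocity `u`, the sample positions and the velocity profile `v` are arbitrary values chosen for illustration, and the acceleration is estimated by a central finite difference.

```python
# Numerical check of the two scalar invariance examples.
# L, u, x1, x2 and v(t) are arbitrary illustrative choices.

def translate(x, L):
    """Translation x' = x - L, matching the distance example."""
    return x - L

def squared_distance(x1, x2):
    return (x2 - x1) ** 2

def acceleration(v, t, h=1e-3):
    """Central finite-difference estimate of dv/dt at time t."""
    return (v(t + h) - v(t - h)) / (2 * h)

L, u = 3.7, 12.0
x1, x2 = 1.5, 4.0

# Distance before and after translating both points.
r2 = squared_distance(x1, x2)
r2_prime = squared_distance(translate(x1, L), translate(x2, L))

def v(t):
    """Velocity in frame S."""
    return 2.0 * t**2 + 5.0

def v_prime(t):
    """Velocity in the boosted frame: v' = v - u."""
    return v(t) - u

a = acceleration(v, 1.0)
a_prime = acceleration(v_prime, 1.0)

print(r2, r2_prime)   # the two distances should agree
print(a, a_prime)     # the two accelerations should agree
```

The individual velocities differ between the frames, but the derived scalar does not, which is exactly what invariance means here.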

Equations are somewhat different because the form of the equation can stay the same but the actual numerical values we see can change, hence ‘covariance’.

Equation Covariance

Function Transformation

Consider an arbitrary function. In frame $S$ let’s call it $\phi(\mathbf x)$ where $\phi:\mathbb X \mapsto \mathbb F.$ What does this function look like when ‘viewed’ from frame $S^\prime?$ Denote the function as it appears in $S^\prime$ as $\phi^\prime(\mathbf x^\prime).$ Because the system $S^\prime$ is nothing more than a different coordinate system for the same space, we are merely referring to the same underlying function with different coordinates. Meaning we must have

$$\phi^\prime(\mathbf x^\prime) = \phi(\mathbf x) \label{eq:xformed_func}\tag{2}$$

This is visualized in the figure below:

Figure: Coordinate Mapping Overlaid on a Function

The gradients of $\phi$ clearly also get changed by this mapping. Using equation \eqref{eq:xformed_func} we can easily derive a mapping from gradients in the original coordinate system $S$ to (combinations of) gradients in the transformed system $S^\prime.$ Taking derivatives of $\phi(\mathbf x)$ directly gives us

$$\begin{align} \frac{\partial\phi(\mathbf x)}{\partial x_\mu} &= \vphantom{\sum_\nu}\frac{\partial \phi^\prime(\mathbf x^\prime)}{\partial x_\mu} &\qquad\text{Equation \eqref{eq:xformed_func}} \\ &= \sum_{\nu} \frac{\partial \phi^\prime(\mathbf x^\prime)}{\partial x^\prime_\nu} \frac{\partial x^\prime_{\nu}(\mathbf x)}{\partial x_\mu} &\qquad\text{Chain rule + Total} \\ &= \sum_{\nu} \frac{\partial \phi^\prime(\mathbf x^\prime)}{\partial x_\nu^\prime} \frac{\partial (M\mathbf x)_\nu}{ \partial x_\mu} &\qquad\text{Definition \eqref{eq:xprime_defn}} \end{align}$$

On line two we have to take care because in general all components of $\mathbf x^\prime$ can depend on $x_\mu$ and so we have to use the total derivative in combination with the chain rule.

This result tells us how to map first derivatives from $S$ to $S^\prime,$ and we can use composition to get higher order derivatives. Note there is nothing special about $\phi$ here, it is a completely arbitrary differentiable function in frame $S.$
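The gradient mapping can be checked numerically for a concrete case. The sketch below assumes $M$ is a 2d rotation by an arbitrary angle `THETA` and uses a made-up test function `phi`; neither comes from the derivation above, and all derivatives are central finite-difference estimates rather than exact values.

```python
import math

THETA = 0.6  # arbitrary rotation angle (illustrative assumption)

def M(x, y):
    """Map S -> S': here a rotation by THETA."""
    c, s = math.cos(THETA), math.sin(THETA)
    return (c * x - s * y, s * x + c * y)

def M_inv(xp, yp):
    """Inverse map S' -> S."""
    c, s = math.cos(THETA), math.sin(THETA)
    return (c * xp + s * yp, -s * xp + c * yp)

def phi(x, y):
    """An arbitrary smooth test function in frame S."""
    return math.sin(x) * math.exp(0.5 * y) + x * y**2

def phi_prime(xp, yp):
    """The same function viewed from S': phi'(x') = phi(M^{-1} x')."""
    return phi(*M_inv(xp, yp))

def partial(f, point, i, h=1e-5):
    """Central finite difference of f along coordinate i at point."""
    up, down = list(point), list(point)
    up[i] += h
    down[i] -= h
    return (f(*up) - f(*down)) / (2 * h)

def chain_rule_sides(point, mu):
    """Return (d phi / d x_mu, sum_nu d phi'/d x'_nu * d x'_nu / d x_mu)."""
    x_prime = M(*point)
    lhs = partial(phi, point, mu)
    rhs = sum(
        partial(phi_prime, x_prime, nu)
        * partial(lambda x, y: M(x, y)[nu], point, mu)
        for nu in range(2)
    )
    return lhs, rhs

for mu in range(2):
    print(mu, chain_rule_sides((0.8, -0.3), mu))
```

Both entries of each printed pair should agree up to finite-difference error, for either value of $\mu$.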

Transformed Equations & Solutions

Say $\phi(\mathbf x)$ is the solution to the equation

$$F\phi(\mathbf x) = 0 \tag{3}\label{eq:generic_eqn}$$

for some operator $F: (\mathbb X \mapsto \mathbb F)\mapsto(\mathbb X \mapsto \mathbb F).$ In general $F$ will involve various partial derivatives of $\phi$. We can transform into frame $S^\prime$, updating the derivatives inside $F$ appropriately to give

$$F^\prime\phi^\prime(\mathbf x^\prime) = 0$$

Note that this equation necessarily holds true for $\phi^\prime(\mathbf x^\prime)$ since it is just a transformation of equation \eqref{eq:generic_eqn}.

If the mapped operator $F^\prime$ is unchanged from the original then we say it is covariant under $M.$ An equivalent statement is that the transformed solution $\phi^\prime$ itself is a solution to the original equation:

$$\begin{align} F\phi(\mathbf x) &= 0 & \\ F^\prime\phi^\prime(\mathbf x^\prime) &= 0 &\qquad\text{Map }S \mapsto S^\prime\\ F\phi^\prime(\mathbf x^\prime) &= 0 &\qquad\text{Covariance Defn.} \end{align}$$

The two statements of covariance are equivalent because we can validly run these steps in reverse.

For a covariant equation the transformed solution $\phi^\prime$ is still a solution to the equation but it is not the same solution as $\phi$.

In practice I guess we care more about the 'equation first' perspective, for two reasons. The first is simply that we will always have an equation to work with, whereas we may not have a solution. Secondly, consider actually performing an experiment to attempt to distinguish between coordinate systems. If we have the same set-up in our two different systems we will see the same behaviour in each frame; this is equivalent to having $F\phi = 0$ and $F\phi^\prime = 0.$ The ‘solution first’ perspective corresponds to viewing somebody in a different coordinate system perform their experiment, i.e. $\phi\circ M^{-1}.$

Interpretation

What does the transformed operator look like in general? To answer that note that we can write equation \eqref{eq:xformed_func} using function composition:

$$\phi(\mathbf x) = \phi^\prime( M\mathbf x) = (\phi^\prime\circ M)(\mathbf x)$$

This lets us write an explicit formula for the transformed operator. We do this by starting with the operator in frame $S$, then rewriting it using only references to $\phi^\prime$ and $\mathbf x^\prime.$

$$\begin{align} F \phi (\mathbf x) &= (F\,\,\phi^\prime\circ M)(\mathbf x) \\ &= (F\,\,\phi^\prime\circ M)(M^{-1} \mathbf x^\prime) \\ &= ((F\,\,\phi^\prime\circ M)\circ M^{-1})(\mathbf x^\prime) \\ &= F^\prime\phi^\prime(\mathbf x^\prime) \end{align}$$

This equation is completely general and does not depend on the mapping of derivatives discussed above. But note if $F$ is some derivative and $M$ is differentiable then this is where the chain rule appears.
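To make the chain rule's role concrete, here is a small worked case. It is an illustration only, assuming a 1d rescaling $x^\prime = Mx = ax$ with constant $a \neq 0$ and the operator $F = \frac{d}{dx} - 1$ (i.e. the equation $\frac{d\phi}{dx} = \phi$); neither appears elsewhere in this post.

$$\begin{align} F\phi(x) &= \frac{d}{dx}\phi^\prime(ax) - \phi^\prime(ax) \\ &= a\,\frac{d\phi^\prime}{dx^\prime}(ax) - \phi^\prime(ax) \\ &= \Big(a\,\frac{d}{dx^\prime} - 1\Big)\phi^\prime(x^\prime) \end{align}$$

So in this case $F^\prime = a\,\frac{d}{dx^\prime} - 1,$ which differs from $F$ unless $a = 1.$ Consistently, the solution $\phi(x) = Ce^{x}$ maps to $\phi^\prime(x^\prime) = Ce^{x^\prime/a},$ which satisfies $F^\prime\phi^\prime = 0$ but gives $F\phi^\prime = (1/a - 1)\,\phi^\prime \neq 0.$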

Covariance of $F$ under $M$ is the requirement that $F^\prime\phi^\prime = F \phi^\prime$, i.e.

$$(F\,\,\phi^\prime\circ M)\circ M^{-1} = F \phi^\prime$$

Now compose both sides of the equation with $M$, remembering $I = M^{-1} \circ M$, thus

$$\boxed{ F(\phi^\prime\circ M) = (F \phi^\prime)\circ M }$$

This equation has the nice interpretation that applying $F$ to the transformed input function is the same as applying $F$ and then transforming the output function.
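The boxed condition lends itself to a numerical test. The sketch below assumes $M$ is a 2d rotation and compares two operators chosen for illustration: the Laplacian, which should commute with the rotation, and $\partial_x$ on its own, which should not. The test function `phi_prime`, the angle and the evaluation point are arbitrary, and the operators are finite-difference approximations.

```python
import math

THETA = 0.6  # arbitrary rotation angle (illustrative assumption)

def M(x, y):
    """Map S -> S': rotation by THETA."""
    c, s = math.cos(THETA), math.sin(THETA)
    return (c * x - s * y, s * x + c * y)

def compose(f, g):
    """(f o g)(x, y) = f(g(x, y)), where g returns a coordinate pair."""
    return lambda x, y: f(*g(x, y))

def d_dx(f, h=1e-4):
    """Operator F = d/dx, as a central finite difference."""
    return lambda x, y: (f(x + h, y) - f(x - h, y)) / (2 * h)

def laplacian(f, h=1e-3):
    """Operator F = d^2/dx^2 + d^2/dy^2, as a five-point stencil."""
    def Ff(x, y):
        return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h)
                - 4 * f(x, y)) / h**2
    return Ff

def phi_prime(xp, yp):
    """An arbitrary smooth test function in frame S'."""
    return math.sin(xp) * math.exp(0.5 * yp) + xp * yp**2

def covariance_gap(F, point):
    """|F(phi' o M) - (F phi') o M| evaluated at a point in S."""
    lhs = F(compose(phi_prime, M))(*point)
    rhs = compose(F(phi_prime), M)(*point)
    return abs(lhs - rhs)

point = (0.8, -0.3)
print("laplacian gap:", covariance_gap(laplacian, point))  # expected: tiny
print("d/dx gap:     ", covariance_gap(d_dx, point))       # expected: not small
```

A gap at the level of finite-difference error is consistent with covariance under the rotation; a gap of order one, as for $\partial_x,$ means applying $F$ and transforming do not commute.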

Examples

Wave Equation + Translation

Consider the 1d wave equation

$$\partial_{tt} \psi = c^2 \partial_{xx}\psi$$

which has solutions

$$\psi(x, t) = A f(x - ct) + B g(x + ct)$$

Solution First

Translating $x \mapsto x^\prime = x + L$ gives

$$\psi^\prime(x^\prime, t^\prime) = \psi(x^\prime - L, t^\prime) = A f(x^\prime - L - ct^\prime) + B g(x^\prime - L + ct^\prime) $$

By substituting $\psi^\prime$ into $\partial_{t^\prime t^\prime}\psi^\prime = c^2\partial_{x^\prime x^\prime}\psi^\prime$ we can verify it is still a solution of the wave equation. This shows that the wave equation is covariant under translation.

Proof

For brevity, let’s begin by defining $y = x^\prime - L - ct^\prime$ and $z = x^\prime - L + ct^\prime.$ Then

$$\begin{align} \partial_{t^\prime t^\prime}\psi^\prime &= \partial_{t^\prime t^\prime} \big( A f(y) + B g(z) \big) \\ &= A\, \partial_{t^\prime t^\prime} f(y) + B\, \partial_{t^\prime t^\prime} g(z) \\ &= A\, \partial_{t^\prime} (\partial_y f(y)\,\partial_{t^\prime}y) + B\, \partial_{t^\prime} (\partial_z g(z)\,\partial_{t^\prime}z) \\ &= -c A\, \partial_{t^\prime} (\partial_y f(y)) + c B\, \partial_{t^\prime} (\partial_z g(z)) \\ &= -c A\, \partial_{yy} f(y) \,\partial_{t^\prime}y + c B\, \partial_{zz}g(z)\,\partial_{t^\prime}z \\ &= c^2 A\, \partial_{yy} f(y) + c^2 B\, \partial_{zz}g(z) \\ &= c^2 \partial_{x^\prime x^\prime} \big(Af(y) + Bg(z)\big) \\ &= c^2 \partial_{x^\prime x^\prime}\psi^\prime\quad\square \end{align}$$

The penultimate line follows because

$$\partial_{x^\prime} \phi(y) = \partial_y\phi(y)\cancelto{1}{\partial_{x^\prime} y} = \partial_y \phi(y)$$

and similarly for $z.$
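The same conclusion can be spot-checked numerically. This sketch assumes particular profiles $f = \sin$ and a Gaussian $g,$ plus arbitrary values for $c,$ $L,$ $A$ and $B$; it estimates the second derivatives by finite differences, so the residuals are only approximately zero.

```python
import math

c, L = 2.0, 1.3      # wave speed and translation (arbitrary values)
A, B = 0.7, -1.2     # amplitudes (arbitrary values)

f = math.sin         # illustrative right-moving profile

def g(s):
    """Illustrative left-moving profile (a Gaussian)."""
    return math.exp(-s**2)

def psi(x, t):
    """d'Alembert solution in frame S."""
    return A * f(x - c * t) + B * g(x + c * t)

def psi_prime(xp, tp):
    """Translated solution: psi'(x', t') = psi(x' - L, t')."""
    return psi(xp - L, tp)

def second_derivative(func, x, t, axis, h=1e-3):
    """Central second difference along 'x' or 't'."""
    if axis == "x":
        return (func(x + h, t) - 2 * func(x, t) + func(x - h, t)) / h**2
    return (func(x, t + h) - 2 * func(x, t) + func(x, t - h)) / h**2

def wave_residual(func, x, t):
    """Left side minus right side of the wave equation."""
    return (second_derivative(func, x, t, "t")
            - c**2 * second_derivative(func, x, t, "x"))

for point in [(0.4, 0.25), (-1.0, 0.1)]:
    print(wave_residual(psi, *point), wave_residual(psi_prime, *point))
```

Both residuals should be close to zero, even though $\psi$ and $\psi^\prime$ are different functions of their arguments.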

Equation First

The mapping ends up giving identity gradient transformations

$$\begin{align} \partial_x = \partial_x x^\prime\,\partial_{x^\prime} + \partial_x t^\prime\,\partial_{t^\prime} &= \partial_x(x + L)\partial_{x^\prime} + \partial_x(t)\partial_{t^\prime} = \partial_{x^\prime} \\ \partial_t = \partial_t x^\prime\,\partial_{x^\prime} + \partial_t t^\prime\,\partial_{t^\prime} &= \partial_t(x + L)\partial_{x^\prime} + \partial_t(t)\partial_{t^\prime} = \partial_{t^\prime} \end{align}$$

Given the wave equation only involves derivatives of $\psi$ it must therefore be covariant. In fact this shows that any equation which does not involve any explicit terms in $\mathbf x$ will be translationally covariant.
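For contrast, here is a hypothetical equation (not one discussed elsewhere in this post) that does contain an explicit coordinate term, $\partial_x\phi = x\,\phi.$ Under the same translation $x^\prime = x + L$ we have $\partial_x = \partial_{x^\prime}$ and $x = x^\prime - L,$ so

$$\begin{align} \partial_x \phi(x) - x\,\phi(x) &= 0 \\ \partial_{x^\prime}\phi^\prime(x^\prime) - (x^\prime - L)\,\phi^\prime(x^\prime) &= 0 \end{align}$$

The transformed operator picks up the shift $L$ and so is not the same as the original: this equation is not translationally covariant. Equivalently, the solution $\phi(x) = Ce^{x^2/2}$ maps to $\phi^\prime(x^\prime) = Ce^{(x^\prime - L)^2/2},$ which does not satisfy $\partial_{x^\prime}\phi^\prime = x^\prime\phi^\prime$ unless $L = 0.$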

Continuity Equation + Rotation

In two dimensions the continuity equation reads

$$\partial_t \rho + \partial_x (\rho v_x) + \partial_y (\rho v_y) = 0$$

Rotation can be represented by a rotation matrix

$$R = \bigg[\,\begin{matrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{matrix}\,\bigg]$$

thus

$$\begin{align} x^\prime &= x\cos\theta - y\sin\theta \\ y^\prime &= x\sin\theta + y\cos\theta \end{align}$$

and

$$\begin{align} \partial_x &= \cos\theta\,\partial_{x^\prime} + \sin\theta\,\partial_{y^\prime} \\ \partial_y &= \cos\theta\,\partial_{y^\prime} - \sin\theta\,\partial_{x^\prime} \end{align}$$

With this equation there is an additional wrinkle compared to the previous example: the velocity vector will also transform under rotation. It transforms exactly as the position vector (i.e. contravariantly), meaning

$$\begin{align} v_x^\prime &= v_x\cos\theta - v_y\sin\theta \\ v_y^\prime &= v_y\cos\theta + v_x\sin\theta \end{align}$$

Putting it all together gives

$$\begin{align} 0 = \partial_t\rho &+ (c\,\partial_{x^\prime} + s\,\partial_{y^\prime})\rho(v_x^\prime c + v_y^\prime s) \\ \vphantom{+} &+ (c\,\partial_{y^\prime} - s\,\partial_{x^\prime})\rho(v_y^\prime c - v_x^\prime s) \\ = \partial_t\rho &+ c^2\,\partial_{x^\prime}\rho v_x^\prime + cs\,\partial_{x^\prime}\rho v_y^\prime + cs\,\partial_{y^\prime} \rho v_x^\prime + s^2\,\partial_{y^\prime} \rho v_y^\prime \\ &+ c^2\,\partial_{y^\prime}\rho v_y^\prime - cs\,\partial_{y^\prime} \rho v_x^\prime - cs\,\partial_{x^\prime}\rho v_y^\prime + s^2\,\partial_{x^\prime}\rho v_x^\prime \\ = \partial_t\rho &+ \partial_{x^\prime} (\rho v^\prime_x)+ \partial_{y^\prime} (\rho v^\prime_y)\quad\square \end{align}$$

where I’ve used $c:=\cos\theta$ and $s:=\sin\theta$ for brevity.

This has a slightly different feel because the equation involved the velocity, which is also a physical quantity whose coordinates transform under rotation. (The velocity vector itself is of course invariant under rotation, only the coordinates really change.)
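The rotation result can also be checked numerically. The sketch below uses made-up density and velocity fields (`rho` and `vel` are arbitrary and do not satisfy the continuity equation), so the left-hand side is not zero; the point is only that it should take the same value whether it is computed in $S$ or, with the rotated coordinates and rotated velocity components, in $S^\prime.$

```python
import math

THETA = 0.6  # arbitrary rotation angle (illustrative assumption)
c, s = math.cos(THETA), math.sin(THETA)

def rot(x, y):
    """Rotate components S -> S' (used for positions and velocities)."""
    return (c * x - s * y, s * x + c * y)

def rot_inv(xp, yp):
    """Rotate components S' -> S."""
    return (c * xp + s * yp, -s * xp + c * yp)

def rho(x, y, t):
    """Arbitrary smooth density field in S."""
    return 1.0 + 0.3 * math.sin(x + 2 * y - t)

def vel(x, y, t):
    """Arbitrary smooth velocity field (v_x, v_y) in S."""
    return (math.cos(y) + 0.5 * t, x * y)

def rho_p(xp, yp, t):
    """Density viewed from S': same value at the same physical point."""
    return rho(*rot_inv(xp, yp), t)

def vel_p(xp, yp, t):
    """Velocity viewed from S': same point, components rotated."""
    return rot(*vel(*rot_inv(xp, yp), t))

def continuity_lhs(density, velocity, x, y, t, h=1e-5):
    """d_t rho + d_x(rho v_x) + d_y(rho v_y) via central differences."""
    def flux(a, b, i):
        return density(a, b, t) * velocity(a, b, t)[i]
    d_t = (density(x, y, t + h) - density(x, y, t - h)) / (2 * h)
    d_x = (flux(x + h, y, 0) - flux(x - h, y, 0)) / (2 * h)
    d_y = (flux(x, y + h, 1) - flux(x, y - h, 1)) / (2 * h)
    return d_t + d_x + d_y

x, y, t = 0.8, -0.3, 0.5
lhs_S = continuity_lhs(rho, vel, x, y, t)
lhs_S_prime = continuity_lhs(rho_p, vel_p, *rot(x, y), t)
print(lhs_S, lhs_S_prime)   # the two values should agree
```

If `vel_p` returned the unrotated components instead, the two values would generally disagree, which is the numerical counterpart of the extra wrinkle noted above.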
