2013-03-18

A Less-Seen View of Angular Momentum

Many people learn in basic physics classes that angular momentum is a scalar quantity that describes the magnitude and direction of rotation, such that its rate of change is equal to the sum of all torques $\tau = \dot{L}$, akin to Newton's equation of motion $\vec{F} = \dot{\vec{p}}$. People who take more advanced physics classes, such as 8.012 — Physics I, learn that in fact angular momentum and torque are vectors; in the case of fixed-axis rotation, the moment of inertia (the rotational equivalent to mass) is a scalar so $\vec{L} = I\vec{\omega}$ means that angular momentum points in the same direction as angular velocity. By contrast, in general rigid body motion, the moment of inertia becomes anisotropic and becomes a tensor, so \[\vec{L} = \stackrel{\leftrightarrow}{I} \cdot \vec{\omega}\] implies that angular momentum is no longer parallel to angular velocity, but instead the components are related (using Einstein summation for convenience) by \[L_i = I_{ij} \omega_{j}.\] This becomes important in the analysis of situations like gyroscopes and torque-induced precession, torque-free precession, and nutation.

There is one problem though: there is nothing particularly vector-like about angular momentum. It is constructed as a vector essentially for mathematical convenience. The definition $\vec{L} = \vec{x} \times \vec{p}$ only works in 3 dimensions. Why is this? Let's look at the definition of the cross product components: in 3 dimensions, the permutation tensor has 3 indices, so contracting it with 2 vectors produces a third vector $\vec{c} = \vec{a} \times \vec{b}$ such that $c_i = \varepsilon_{ijk} a_{j} b_{k}$. One trick that is commonly taught to make the cross product easier is to turn the first vector into a matrix and then perform matrix multiplication with the column representation of the second vector to get the column representation of the resulting vector: the details of this rule are hard to remember, but the source is simple, as it is just $a_{ij} = \varepsilon_{ijk} a_{k}$. Now let us see what happens to angular velocity and angular momentum using this definition. Angular velocity was previously defined as a vector through $\vec{v} = \vec{\omega} \times \vec{x}$. We know that $\vec{x}$ and $\vec{v}$ are true vectors, while $\vec{\omega}$ is a pseudovector (defined by it flipping direction when the coordinate system undergoes reflection), so $\vec{\omega}$ is vector to be made into a tensor. Using the previous definition that in 3 dimensions $\omega_{ij} = \varepsilon_{ijk} \omega_{k}$, then \[v_i = \omega_{ij} x_{j}\] now defines the angular velocity tensor. Similarly, angular momentum is a pseudovector, so it can be made into a tensor through $L_{ij} = \varepsilon_{ijk} L_{k}$. Substituting this into the equation relating angular momenta and angular velocities yields \[L_{ij} = I_{ik} \omega_{kj}\] meaning the matrix representation of the angular momentum tensor is now the matrix multiplication of the matrices representing the moment of inertia and angular velocity tensors.

This has another consequence: the meaning of the components of the angular velocity and angular momentum become much more clear. Previously, $L_{j}$ was the generator of rotation in the plane perpendicular to the $j$-axis, and $\omega_{j}$ described the rate of this rotation: for instance, $L_z$ and $\omega_z$ relate to rotation in the $xy$-plane. This is somewhat counterintuitive. On the other hand, the tensor definitions $L_{ij}$ and $\omega_{ij}$ deal with rotations in the $ij$-plane: for example, $L_{xy}$ generates and $\omega_{xy}$ describes rotations in the $xy$-plane, which seem much more intuitive. Also, with this, $L_{ij} = x_{i} p_{j} - p_{i} x_{j}$ becomes a definition (though there may be a numerical coefficient that I am missing, so forgive me).

The nice thing about this formulation of angular velocities and momenta as tensor quantities is that this is generalizable to 4 dimensions, be it 4 spatial dimensions or 3 spatial and 1 temporal dimension (as in relativity). $L_{\mu \nu} = x_{\mu} p_{\nu} - p_{\mu} x_{\nu}$ now defines the generator of rotation in the $\mu\nu$-plane. Similarly, $\omega_{\mu \nu}$ defined in $L_{\mu \nu} = I_{\mu}^{\; \xi} \omega^{\xi}_{\; \nu}$ describes the rate of rotation in that plane. The reason why these cannot be vectors any more is that the permutation tensor gains an additional index, so contracting it with two vectors yields a tensor with 2 indices; this means that the cross product as laid out in 3 dimensions does not work in any other number of dimensions (except, interestingly enough, for 7, and that is because a 7-dimensional Cartesian vector space can be described through the algebra of octonions which does have a cross product, just as 2-dimensional vectors can be described by complex numbers and 3-dimensional vectors can be described by quaternions).

This has further nice consequence for special relativity. The Lorentz transformation as given in $x^{\mu'} = \Lambda^{\mu'}_{\; \mu} x^{\mu}$ is a hyperbolic rotation through an angle $\alpha$, equal to the rapidity defined as $\alpha = \tanh(\beta)$. A hyperbolic rotation is basically just a normal rotation through an imaginary angle. This can actually be seen by transforming to coordinates with imaginary time (called a Wick rotation, which may come back up in a post in the near future): $x^{\mu} = (ct, x^{j}) \rightarrow (ict, x^{j})$, allowing the metric to change as $\eta_{\mu \nu} = \mathrm{diag}(-1, 1, 1, 1) \rightarrow \delta_{\mu \nu}$. This changes the rapidity to just be a real angle, and the Lorentz transformation becomes a real rotation. Because only the temporal coordinate has been made imaginary while the spatial coordinates have been left untouched, because the Lorentz transformation is now a real rotation, and because angular momentum generates real rotations, then it can be said that the angular momentum components $L_{(0, j)}$ generate Lorentz boosts along the $j$-axis. This fact remains true even if the temporal coordinate is not made imaginary and the metric remains with an opposite sign for the temporal component, though the math of Lorentz boost generation becomes a little more tricky. That said, typically the conservation of angular momentum implies symmetry of the system under rotation, thanks to the Noether theorem. Naïvely, this would imply that conservation of $L_{(0, j)}$ is associated with symmetry under the Lorentz transformation. The truth is a little more complicated (but not by too much), as my advisor and I found from a few Internet searches. Basically, in nonrelativistic mechanics, just as momentum is the generator of spatial translation, position is the generator of (Galilean) momentum boosting: this can be seen in the quantum mechanical representation of momentum in the position basis $\hat{p} = -i\hbar \frac{\partial}{\partial x}$, and the analogous representation of position in the momentum basis $\hat{x} = i\hbar \frac{\partial}{\partial p}$. If the system is invariant under translation, then the momentum is conserved and the system is inertial, whereas if the system is invariant under boosting, then the position is conserved and the system is fixed at a given point in space. In relativity, the analogue to a Galilean momentum boost is exactly the Lorentz transformation, so conservation of $L_{(0, j)}$ corresponds to the system being fixed at its initial spacetime coordinate; this is OK even in relativity because spacetime coordinates are invariant geometric objects, even if their components transform covariantly.

There are a few remaining issues with this analysis. One is that rotations in 3 dimensions are just sums of pairs of rotations in planes, and rotations in 4 dimensions are just sums of pairs of rotations in 3 dimensions. This relates in some way (that I am not really sure of) to symmetries under special orthogonal/unitary transformations in those dimensions. In dimensions higher than 4, things get a lot more hairy, and I'm not sure if any of this continues to hold. Also, one remaining issue is that in special relativity, because the speed of light is fixed and finite, rigid bodies cease to exist except as an approximation, so the description of such dynamics using a moment of inertia tensor generalized to special relativity may not work anymore (though the description of angular momentum as a tensor should still work anyway). Finally, note that the generalization of particle momentum $p_{\mu}$ to a distribution of energy lies in the stress-energy tensor $T_{\mu \nu}$, so the angular momentum of such a distribution becomes a tensor with 3 indices that looks something like (though maybe not exactly like) $L_{\mu \nu \xi} = x_{\mu} T_{\nu \xi} - x_{\nu} T_{\mu \xi}$. In addition, stress-energy tensors with relativistic angular momenta may change the metric itself, so that would need to be accounted for through the Einstein field equations. Anyway, I just wanted to further explore the formulations and generalizations of angular momentum, and I hope this helped in that regard.