2022-12-02

Fundamental Theorem of Calculus for Functionals

I happened to think more about the idea of recovering a functional by somehow integrating its functional derivative. In the process, I realized that some of the ideas involved make this post a natural follow-up to a recent post [LINK] about mapping scalars to functions. The connection will become clear later in this post.

For a single variable, a function \( f(x) \) has an antiderivative \( F(x) \) such that \( f(x) = \frac{\mathrm{d}F}{\mathrm{d}x} \). One statement of the fundamental theorem of calculus is that this implies that \[ \int_{a}^{b} f(x)~\mathrm{d}x = F(b) - F(a) \] for these functions. In turn, this means \( F(x) \) can be extracted directly from \( f(x) \) through \[ F(x) = \int_{x_{0}}^{x} f(x')~\mathrm{d}x' \] in which \( x_{0} \) is chosen such that \( F(x_{0}) = 0 \).

For multiple variables, a conservative vector field \( \mathbf{f}(\mathbf{x}) \) in which \( \mathbf{f} \) must have the same number of components as \( \mathbf{x} \) can be said to have a scalar antiderivative \( F(\mathbf{x}) \) in the sense that \( \mathbf{f} \) is the gradient of \( F \), meaning \( \mathbf{f}(\mathbf{x}) = \nabla F(\mathbf{x}) \); more precisely, \( f_{i}(x_{1}, x_{2}, \ldots, x_{N}) = \frac{\partial F}{\partial x_{i}} \) for all \( i \in \{1, 2, \ldots, N\} \). (Note that if \( \mathbf{f} \) is not conservative, then it by definition cannot be written as the gradient of a scalar function! This is an important point to which I will return later in this post.) For a conservative field, a line integral (which, as I will emphasize again later in this post, is distinct from a functional path integral) of \( \mathbf{f} \) from the point \( \mathbf{a} \) to the point \( \mathbf{b} \) can be computed as \( \int_{\mathbf{a}}^{\mathbf{b}} \mathbf{f}(\mathbf{x}) \cdot \mathrm{d}\mathbf{x} = F(\mathbf{b}) - F(\mathbf{a}) \); more precisely, this equality holds along any contour: if a contour is defined as \( \mathbf{x}(s) \) for \( s \in [0, 1] \), then no matter what \( \mathbf{x}(s) \) actually is, as long as \( \mathbf{x}(0) = \mathbf{a} \) and \( \mathbf{x}(1) = \mathbf{b} \) hold, \[ \sum_{i = 1}^{N} \int_{0}^{1} f_{i}(x_{1}(s), x_{2}(s), \ldots, x_{N}(s)) \frac{\mathrm{d}x_{i}}{\mathrm{d}s}~\mathrm{d}s = F(\mathbf{b}) - F(\mathbf{a}) \] must also hold. This therefore suggests that \( F(\mathbf{x}) \) can be extracted from \( \mathbf{f}(\mathbf{x}) \) by relabeling \( \mathbf{x}(s) \to \mathbf{x}'(s) \), choosing \( \mathbf{a} \) to be a point such that \( F(\mathbf{a}) = 0 \), and setting \( \mathbf{b} \to \mathbf{x} \), which gives \[ F(\mathbf{x}) = \sum_{i = 1}^{N} \int_{0}^{1} f_{i}(x'_{1}(s), x'_{2}(s), \ldots, x'_{N}(s)) \frac{\mathrm{d}x'_{i}}{\mathrm{d}s}~\mathrm{d}s \] for any contour with \( \mathbf{x}'(0) = \mathbf{a} \) and \( \mathbf{x}'(1) = \mathbf{x} \).
Once again, if \( \mathbf{f}(\mathbf{x}) \) is not conservative, then it cannot be written as the gradient of a scalar field \( F \), and the integral \( \sum_{i = 1}^{N} \int_{0}^{1} f_{i}(x_{1}(s), x_{2}(s), \ldots, x_{N}(s)) \frac{\mathrm{d}x_{i}}{\mathrm{d}s} \mathrm{d}s \) will depend on the specific choice of \( \mathbf{x}(s) \), not just the endpoints \( \mathbf{a} \) and \( \mathbf{b} \).
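The contrast between conservative and non-conservative fields can be checked numerically. The following is a minimal sketch for \( N = 2 \), assuming NumPy; the potential \( F = x_{1}^{2} x_{2} + x_{2}^{3} \), the rotational field \( (-x_{2}, x_{1}) \), the two contours, and all function names are illustrative choices rather than anything taken from the discussion above.

```python
import numpy as np

def line_integral(field, path, n_segments=2000):
    """Approximate sum_i int_0^1 f_i(x(s)) (dx_i/ds) ds segment by segment."""
    s = np.linspace(0.0, 1.0, n_segments + 1)
    x = path(s)                            # shape (N, n_segments + 1)
    dx = np.diff(x, axis=1)                # increment of x along each segment
    x_mid = 0.5 * (x[:, 1:] + x[:, :-1])   # midpoint of each segment
    return float(np.sum(field(x_mid) * dx))

def conservative(x):
    # Gradient of the assumed potential F(x1, x2) = x1^2 x2 + x2^3.
    return np.array([2.0 * x[0] * x[1], x[0]**2 + 3.0 * x[1]**2])

def rotational(x):
    # Not a gradient: d f_1 / d x_2 = -1 while d f_2 / d x_1 = +1.
    return np.array([-x[1], x[0]])

def straight(s):
    # Contour from a = (0, 0) to b = (1, 1) along a straight line.
    return np.array([s, s])

def curved(s):
    # Different contour with the same endpoints a = (0, 0) and b = (1, 1).
    return np.array([s, s**2])

for name, field in [("conservative", conservative), ("rotational", rotational)]:
    print(name, line_integral(field, straight), line_integral(field, curved))
```

For the conservative field, both contours should reproduce \( F(\mathbf{b}) - F(\mathbf{a}) \) up to discretization error, while for the rotational field the two contours give different values even though the endpoints agree.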

For continuous functions, the generalization of a vector \( \mathbf{x} \), or more precisely \( x_{i} \) for \( i \in \{1, 2, \ldots, N\} \), is a function \( x(t) \) where \( t \) is a continuous dummy index or parameter analogous to the discrete index \( i \). This means the generalization of a scalar field \( F(\mathbf{x}) \) is the scalar functional \( F[x] \). What is the generalization of a vector field \( \mathbf{f}(\mathbf{x}) \)? To be precise, a vector field is a collection of functions \( f_{i}(x_{1}, x_{2}, \ldots, x_{N}) \) for all \( i \in \{1, 2, \ldots, N \} \). This suggests that its generalization should be a function of \( t \) and must somehow depend on \( x(t) \) as well. It is tempting therefore to write this as \( f(t, x(t)) \) for all \( t \). However, although this is a valid subset of the generalization, it is not the whole generalization, because vector fields of the form \( f_{i}(x_{i}) \) are collections of single-variable functions that do not fully capture all vector fields of the form \( f_{i}(x_{1}, x_{2}, \ldots, x_{N}) \) for all \( i \in \{1, 2, \ldots, N \} \). As a specific example, for \( N = 2 \), the vector field with components \( f_{1}(x_{1}, x_{2}) = (x_{1} - x_{2})^{2} \) and \( f_{2}(x_{1}, x_{2}) = (x_{1} + x_{2})^{3} \) cannot be written as just \( f_{1}(x_{1}) \) and \( f_{2}(x_{2}) \), as \( f_{1} \) depends on \( x_{2} \) and \( f_{2} \) depends on \( x_{1} \) as well. Similarly, in the generalization, one could imagine a function of the form \( f = \frac{x(t)}{x(t - t_{0})} \mathrm{exp}(-(t - t_{0})^{2}) \); in this case, it is not correct to write it as \( f(t, x(t)) \) because the dependence of \( f \) on \( x \) at a given dummy index value \( t \) comes through not only \( x(t) \) but also \( x(t - t_{0}) \) for some fixed parameter \( t_{0} \). 
Additionally, the function may depend not only on \( x \) per se but also on derivatives \( \frac{\mathrm{d}^{n} x}{\mathrm{d}t^{n}} \); the case of the first derivative \( \frac{\mathrm{d}x}{\mathrm{d}t} = \lim_{t_{0} \to 0} \frac{x(t) - x(t - t_{0})}{t_{0}} \) illustrates the connection to the aforementioned example. Therefore, the most generic way to write such a function is effectively as a functional \( f[x; t] \) with a dummy index \( t \). The example \( f = \frac{x(t)}{x(t - t_{0})} \mathrm{exp}(-(t - t_{0})^{2}) \) can be formalized as \( f[x; t] = \int_{-\infty}^{\infty} \frac{x(t')}{x(t' - t_{0})} \mathrm{exp}(-(t' - t_{0})^{2}) \delta(t - t')~\mathrm{d}t' \) where the dummy index \( t' \) is the integration variable while the dummy index \( t \) is free. (For \( N = 3 \), the condition of a vector field being conservative is often written as \( \nabla \times \mathbf{f}(\mathbf{x}) = 0 \). I have not used that condition in this post because the curl operator does not easily generalize to \( N \neq 3 \).)

If a functional \( f[x; t] \) is conservative, then there exists a functional \( F[x] \) (with no free dummy index) such that \( f \) is the functional derivative \( f[x; t] = \frac{\delta F}{\delta x(t)} \). Comparing the notation between scalar fields and functionals gives the correspondences \( \sum_{i} A_{i} \to \int A(t)~\mathrm{d}t \) and \( \mathrm{d}x_{i} \to \delta x(t) \), in which \( \delta x(t) \) is a small variation in a function \( x \) specifically at the index value \( t \) and nowhere else. This suggests a generalization of the fundamental theorem of calculus to functionals as follows. If \( a(t) \) and \( b(t) \) are fixed functions, then \( \int_{-\infty}^{\infty} \int f[x; t]~\delta x(t)~\mathrm{d}t = F[b] - F[a] \). More precisely, a path from the function \( a(t) \) to the function \( b(t) \) at every index value \( t \) can be parameterized by \( s \in [0, 1] \) by the map \( s \to x(t, s) \) which is a function of \( t \) for each \( s \) such that \( x(t, 0) = a(t) \) and \( x(t, 1) = b(t) \); this is why I linked this post to the most recent post on this blog. With this in mind, the fundamental theorem of calculus becomes \[ \int_{-\infty}^{\infty} \int_{0}^{1} f[x(s); t] \frac{\partial x}{\partial s}~\mathrm{d}s~\mathrm{d}t = F[b] - F[a] \] where, in the integrand, the argument \( x \) in \( f \) has the parameter \( s \) explicit but the dummy index \( t \) implicit; the point is that this equality holds regardless of the specific parameterization \( x(t, s) \) as long as \( x \) at the endpoints of \( s \) satisfies \( x(t, 0) = a(t) \) and \( x(t, 1) = b(t) \). This also means that \( F[x] \) can be recovered if \( b(t) = x(t) \) and \( a(t) \) is chosen such that \( F[a] = 0 \), in which case \[ F[x] = \int_{-\infty}^{\infty} \int_{0}^{1} f[x'(s); t]~\frac{\partial x'}{\partial s}~\mathrm{d}s~\mathrm{d}t \] (where \( x(t, s) \) has been renamed to \( x'(t, s) \) to avoid confusion with \( x(t) \)).
If \( f[x; t] \) is not conservative, then there is no functional \( F[x] \) whose functional derivative with respect to \( x(t) \) would yield \( f[x; t] \); in that case, with \( x(t, 0) = a(t) \) and \( x(t, 1) = b(t) \), the integral \( \int_{-\infty}^{\infty} \int_{0}^{1} f[x(s); t] \frac{\partial x}{\partial s}~\mathrm{d}s~\mathrm{d}t \) does depend on the specific choice of parameterization \( x(t, s) \) with respect to \( s \) and not just on the functions \( a(t) \) and \( b(t) \) at the endpoints of \( s \).
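The path dependence in the non-conservative case can be made concrete on a grid. The following is a minimal sketch, assuming NumPy, a finite window in \( t \) in place of the whole real line, and two illustrative choices that are not from the text above: the local functional \( f[x; t] = x(t) \), which is the functional derivative of \( F[x] = \frac{1}{2} \int x(t)^{2}~\mathrm{d}t \), and the delayed functional \( f[x; t] = x(t - t_{0}) \) with \( t_{0} = 1 \), which is not the functional derivative of any \( F[x] \) because its dependence on \( x(t') \) is not symmetric under exchanging \( t \) and \( t' \).

```python
import numpy as np

t = np.linspace(-8.0, 8.0, 1601)     # finite window standing in for all t
dt = t[1] - t[0]
shift = int(round(1.0 / dt))         # grid offset for the assumed delay t0 = 1

b = np.exp(-t**2)                    # endpoint function b(t); the start is a(t) = 0
g = t * np.exp(-t**2 / 2.0)          # arbitrary detour direction

def f_local(x):
    # f[x; t] = x(t), the functional derivative of F[x] = (1/2) int x(t)^2 dt.
    return x

def f_delayed(x):
    # f[x; t] = x(t - t0). np.roll wraps around the window, which is harmless
    # here because every function used is negligible at the window edges.
    return np.roll(x, shift)

def straight(s):
    return s * b                           # x(t, 0) = 0 and x(t, 1) = b(t)

def bent(s):
    return s**2 * b + s * (1.0 - s) * g    # same endpoints, different route

def integrate_along_path(f, path, n_s=400):
    """Approximate int dt int_0^1 ds f[x(s); t] dx/ds, one s-segment at a time."""
    s = np.linspace(0.0, 1.0, n_s + 1)
    total = 0.0
    for s0, s1 in zip(s[:-1], s[1:]):
        x0, x1 = path(s0), path(s1)
        total += np.sum(f(0.5 * (x0 + x1)) * (x1 - x0)) * dt
    return float(total)

local_straight = integrate_along_path(f_local, straight)
local_bent = integrate_along_path(f_local, bent)
delayed_straight = integrate_along_path(f_delayed, straight)
delayed_bent = integrate_along_path(f_delayed, bent)
print(local_straight, local_bent)
print(delayed_straight, delayed_bent)
```

The two values for the local functional should agree with each other and with \( F[b] - F[a] \), while the two values for the delayed functional should differ even though both parameterizations start at \( a(t) = 0 \) and end at \( b(t) \).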

As an example, consider from a previous post [LINK] the nonrelativistic Newtonian action \[ S[x] = \int_{-\infty}^{\infty} \left(\frac{m}{2} \left(\frac{\mathrm{d}x}{\mathrm{d}t}\right)^{2} + F_{0} x(t) \right)~\mathrm{d}t \] for a particle under the influence of a uniform force \( F_{0} \) (which may vanish). The first functional derivative is \[ f[x; t] = \frac{\delta S}{\delta x(t)} = F_{0} - m\frac{\mathrm{d}^{2} x}{\mathrm{d}t^{2}} \] and its vanishing would yield the usual equation of motion. The action itself vanishes for \( x(t) = 0 \), which will be helpful when using the fundamental theorem of calculus to recover the action from the equation of motion. In particular, one can parameterize \( x'(t, s) = sx(t) \) such that \( x'(t, 0) = 0 \) and \( x'(t, 1) = x(t) \), so that \( \frac{\partial x'}{\partial s} = x(t) \) and \( f[x'(s); t] = F_{0} - ms\frac{\mathrm{d}^{2} x}{\mathrm{d}t^{2}} \). This gives the integral \( \int_{0}^{1} \left(F_{0} - ms\frac{\mathrm{d}^{2} x}{\mathrm{d}t^{2}}\right)x(t)~\mathrm{d}s = F_{0} x(t) - \frac{m}{2} x(t) \frac{\mathrm{d}^{2} x}{\mathrm{d}t^{2}} \). This is then integrated over all \( t \), so the first term is identical to the corresponding term in the definition of \( S[x] \), and the second term becomes the same as the corresponding term in the definition of \( S[x] \) after integrating over \( t \) by parts and setting the boundary conditions that \( x(t) \to 0 \) for \( |t| \to \infty \), because then \( -\frac{m}{2} \int_{-\infty}^{\infty} x(t) \frac{\mathrm{d}^{2} x}{\mathrm{d}t^{2}}~\mathrm{d}t = \frac{m}{2} \int_{-\infty}^{\infty} \left(\frac{\mathrm{d}x}{\mathrm{d}t}\right)^{2}~\mathrm{d}t \). (Other boundary conditions may require more care.) In any case, the parameterization \( x'(t, s) = sx(t) \) is not the only choice that could fulfill the boundary conditions; the salient point is that any parameterization fulfilling the boundary conditions would yield the correct action \( S[x] \).
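The recovery of the action from its functional derivative can also be checked numerically. The following is a minimal sketch, assuming NumPy, a finite window in \( t \), finite-difference derivatives, and illustrative values \( m = 1 \), \( F_{0} = 0.5 \), and \( x(t) = e^{-t^{2}} \) (all of which are assumptions made only for this check). It compares the action computed directly from its definition with the double integral of \( f[x'(s); t]~\frac{\partial x'}{\partial s} \) for the linear parameterization \( x'(t, s) = sx(t) \) and for a second, arbitrarily bent parameterization with the same endpoints.

```python
import numpy as np

m, F0 = 1.0, 0.5                       # assumed illustrative values
t = np.linspace(-8.0, 8.0, 4001)       # finite window standing in for all t
dt = t[1] - t[0]
x = np.exp(-t**2)                      # test trajectory with x(t) -> 0 at the edges

def second_derivative(y):
    # Central second difference; the edge values are left at zero because
    # every function used here is negligible at the window edges.
    d2 = np.zeros_like(y)
    d2[1:-1] = (y[2:] - 2.0 * y[1:-1] + y[:-2]) / dt**2
    return d2

def f(y):
    """Discretized functional derivative f[y; t] = F0 - m y''(t)."""
    return F0 - m * second_derivative(y)

def action_direct(y):
    """S[y] from its definition, using a finite-difference velocity."""
    v = np.gradient(y, dt)
    return float(np.sum(0.5 * m * v**2 + F0 * y) * dt)

def action_from_derivative(path, n_s=200):
    """Approximate int dt int_0^1 ds f[x'(s); t] dx'/ds, one s-segment at a time."""
    s = np.linspace(0.0, 1.0, n_s + 1)
    total = 0.0
    for s0, s1 in zip(s[:-1], s[1:]):
        y0, y1 = path(s0), path(s1)
        total += np.sum(f(0.5 * (y0 + y1)) * (y1 - y0)) * dt
    return float(total)

def linear_path(s):
    return s * x                       # x'(t, s) = s x(t)

def bent_path(s):
    # Same endpoints (0 at s = 0, x(t) at s = 1) but a different route.
    return s**2 * x + s * (1.0 - s) * t * np.exp(-t**2 / 2.0)

S_direct = action_direct(x)
S_linear = action_from_derivative(linear_path)
S_bent = action_from_derivative(bent_path)
print(S_direct, S_linear, S_bent)
```

The three printed values should agree up to finite-difference error, which illustrates that the presence of \( \frac{\mathrm{d}^{2} x}{\mathrm{d}t^{2}} \) in \( f[x; t] \) requires no special treatment beyond evaluating \( f \) along the chosen parameterization.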

I considered that example because I wondered whether any special formulas need to be considered if \( f[x; t] \) depends explicitly on first or second derivatives of \( x(t) \), as might be the case in nonrelativistic Newtonian mechanics. That example shows that no special formulas are needed because even if the Lagrangian explicitly depends on the velocity \( \frac{\mathrm{d}x}{\mathrm{d}t} \), the action \( S \) only explicitly depends as a functional on \( x(t) \), so proper application of functional differentiation and regular integration by parts will ensure proper accounting of each piece.

This post has been about the fundamental theorem of calculus saying that the 1-dimensional integral of a vector function in \( N \) dimensions along a contour, if that function is conservative, is equal to the difference between the values of its scalar antiderivative at the two endpoints. This generalizes easily to infinite dimensions and continuous functions instead of finite-dimensional vectors. There is another fundamental theorem of calculus saying that the \( N \)-dimensional integral in a finite volume of the scalar divergence of an \( N \)-dimensional vector function, if that volume has a closed orientable surface, is equal to the \( (N - 1) \)-dimensional integral across the whole surface of the inner product of that function with the outward normal vector (of unit 2-norm) at every point on the surface, meaning \[ \int_{V} \sum_{i = 1}^{N} \frac{\partial f_{i}}{\partial x_{i}}~\mathrm{d}V = \oint_{\partial V} \sum_{i = 1}^{N} f_{i}(x_{1}, x_{2}, \ldots, x_{N}) n_{i}(x_{1}, x_{2}, \ldots, x_{N})~\mathrm{d}S \] where \( \sum_{i = 1}^{N} |n_{i}(x_{1}, x_{2}, \ldots, x_{N})|^{2} = 1 \) for every \( \mathbf{x} \) on the surface. From a purely formal perspective, this could generalize to something like \( \int_{V} \int_{-\infty}^{\infty} \frac{\delta f[x; t]}{\delta x(t)}~\mathrm{d}t~\mathcal{D}x = \oint_{\partial V} \int_{-\infty}^{\infty} f[x; t]n[x; t]~\mathrm{d}t~\mathcal{D}x \) having generalized \( \frac{\partial}{\partial x_{i}} \to \frac{\delta}{\delta x(t)} \), \( \prod_{i} \mathrm{d}x_{i} \to \mathcal{D}x \), and \( n_{i}(\mathbf{x}) \to n[x; t] \) where \( n[x; t] \) is normalized such that \( \int_{-\infty}^{\infty} |n[x; t]|^{2}~\mathrm{d}t = 1 \) for all \( x(t) \) on the surface. However, this formalism may be hard to further develop because the space has infinite dimensions. Even when working in a countable basis, it might not be possible to characterize an orientable surface enclosing a volume in an infinite-dimensional space; the surface is also infinite-dimensional. 
While the choice of basis is arbitrary, things become even less intuitive when choosing to work in an uncountable basis.
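The finite-dimensional divergence theorem quoted above can at least be checked numerically. The following is a minimal sketch for \( N = 2 \), assuming NumPy, the unit square as the volume, and an arbitrarily chosen field \( f_{1} = x_{1}^{2} \), \( f_{2} = x_{1} x_{2} \) (none of which come from the discussion above); it says nothing about the infinite-dimensional case.

```python
import numpy as np

def f(x1, x2):
    # Assumed illustrative field with components f_1 = x1^2 and f_2 = x1 x2.
    return x1**2, x1 * x2

def div_f(x1, x2):
    # d(x1^2)/dx1 + d(x1 x2)/dx2 = 2 x1 + x1
    return 3.0 * x1

n = 400
h = 1.0 / n
c = (np.arange(n) + 0.5) * h               # cell midpoints along each axis

# Volume side: midpoint rule for the divergence over the unit square.
X1, X2 = np.meshgrid(c, c, indexing="ij")
volume_integral = float(np.sum(div_f(X1, X2)) * h * h)

# Surface side: flux through the four edges with outward unit normals.
ones, zeros = np.ones(n), np.zeros(n)
surface_integral = float((
    np.sum(f(ones, c)[0])      # edge x1 = 1, normal (+1, 0)
    - np.sum(f(zeros, c)[0])   # edge x1 = 0, normal (-1, 0)
    + np.sum(f(c, ones)[1])    # edge x2 = 1, normal (0, +1)
    - np.sum(f(c, zeros)[1])   # edge x2 = 0, normal (0, -1)
) * h)

print(volume_integral, surface_integral)
```

Both printed values should agree, since each side approximates the same quantity for this field and region.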