Introduction to Partial Differential Equations

John Stalker

2025-09-30

Chapter 1 Introduction

Differential equations are equations involving an unknown function and its derivatives, like \[\tag{1.0.1} m \frac { d ^ 2 x } { d t ^ 2 } + k x = 0 ,\] \[\tag{1.0.2} \frac { d ^ 2 x } { d t ^ 2 } - \mu ( 1 - x ^ 2 ) \frac { d x } { d t } + x = 0\] \[\tag{1.0.3} \frac { \partial ^ 2 u } { \partial x ^ 2 } + \frac { \partial ^ 2 u } { \partial y ^ 2 } = 0\] \[\tag{1.0.4} \frac { \partial u } { \partial t } - k \frac { \partial ^ 2 u } { \partial x ^ 2 } = 0\] \[\tag{1.0.5} \frac { \partial ^ 2 u } { \partial t ^ 2 } - c ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 } = 0\] \[\tag{1.0.6} \frac { \partial u } { \partial t } + u \frac { \partial u } { \partial x } = 0\] \[\tag{1.0.7} \frac { \partial u } { \partial t } + \frac { \partial ^ 3 u } { \partial x ^ 3 } + 6 u \frac { \partial u } { \partial x } = 0\] \[\tag{1.0.8} \frac { \partial v } { \partial \tau } + \frac 1 2 \sigma ^ 2 s ^ 2 \frac { \partial ^ 2 v } { \partial s ^ 2 } + r s \frac { \partial v } { \partial s } - r v = 0\]

All of these equations have names. Equation (1.0.1) is the simple harmonic oscillator equation. It appears in mechanics and many other places. Equation (1.0.2) is the Van der Pol equation. It first appeared in electrical engineering. Equation (1.0.3) is the Laplace equation. It appears in the study of gravitational and electrostatic fields. Equation (1.0.4) is the diffusion equation, also known as the heat equation. It appears in the study of diffusion, originally the diffusion of heat, but it also applies to, for example, the diffusion of chemical solutions. Equation (1.0.5) is the wave equation. It appears in the study of various kinds of waves, e.g. electromagnetic, acoustic, etc., but not generally water waves. Equation (1.0.6) is Burgers' equation. Unlike the wave equation, Burgers' equation is often used as a simple model for water waves. Equation (1.0.7) is the Korteweg–De Vries equation. It is a more refined model of water waves, but still simpler than real water waves. Equation (1.0.8) is the Black-Scholes equation. It appears in mathematical finance.

Some terminology is useful for describing these. We say that variables which are differentiated are dependent variables, variables with respect to which we differentiate them are independent variables, and variables which don't appear in derivatives at all are parameters. Note that which variables play which role varies from equation to equation. $x$, for example, is a dependent variable in the first two equations, an independent variable in the next five equations, and doesn't appear in the last equation. If there's only one independent variable, i.e. if we only differentiate with respect to one variable, then the derivatives are ordinary derivatives and so the equation is called an ordinary differential equation. If there is more than one then the derivatives are partial derivatives and so the equation is called a partial differential equation, which is the subject of these notes. In the list above equations (1.0.1) and (1.0.2) are ordinary, while the others are all partial differential equations. Some knowledge of ordinary differential equations is useful for studying partial differential equations, but for what we'll do here it's not essential. The order of a differential equation is the order of the highest derivative appearing in it. Equation (1.0.6) is first order and equation (1.0.7) is third order while all the other equations above are second order. Second order equations seem to be pervasive in mathematical physics.

The most important distinction in the theory of differential equations is between linear and nonlinear equations. The distinction is somewhat subtle though. A linear differential equation is a linear equation in the unknown function, i.e. the dependent variable, and its derivatives, the coefficients of which are allowed to be functions of the independent variables and parameters, but not of the dependent variable. It's this last bit which tends to cause confusion. In the list above equations (1.0.1), (1.0.3), (1.0.4), (1.0.5) and (1.0.8) are linear, while (1.0.2), (1.0.6) and (1.0.7) are nonlinear. Equation (1.0.8), for example, is linear because the coefficients of the derivatives, $\partial v / \partial \tau$, $\partial ^ 2 v / \partial s ^ 2$, $\partial v / \partial s$ and $v$, the last of these being considered as a zeroeth order derivative of the unknown function $v$, are $1$, $\frac 1 2 \sigma ^ 2 s ^ 2$, $r s$ and $- r$, all of which are functions of the independent variables $\tau$ and $s$ and the parameters $r$ and $\sigma$, but don't depend on the dependent variable $v$. Equation (1.0.7), by contrast, is nonlinear because if we try to write it as a linear equation for the derivatives $ \partial u / \partial t $, $ \partial ^ 3 u / \partial x ^ 3 $ and $ \partial u / \partial x $ with coefficients $ 1 $, $ 1 $ and $ 6 u $ then the first two are okay but $ 6 u $ is a function of the dependent variable $u$, which is not allowed. Note that constant functions are functions, so $ 1 $ is a function of the independent variables $t$ and $x$, it's just a constant function. $ 6 u $, by contrast, is not a function of $t$ and $x$.
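As a quick illustration of the distinction, here is a small sympy sketch which checks that the sum of two solutions of the linear Laplace equation (1.0.3) is again a solution, while the sum of two solutions of the nonlinear Burgers' equation (1.0.6) need not be. The particular solutions used are just convenient examples which are easy to verify by hand.

```python
# Illustrative sympy sketch: linearity of (1.0.3) versus nonlinearity of (1.0.6).
import sympy as sp

t, x, y = sp.symbols('t x y')

laplace = lambda u: sp.diff(u, x, 2) + sp.diff(u, y, 2)
burgers = lambda u: sp.diff(u, t) + u * sp.diff(u, x)

u1, u2 = x**2 - y**2, x * y            # two solutions of the Laplace equation
print(laplace(u1), laplace(u2), sp.simplify(laplace(u1 + u2)))   # 0 0 0

v1, v2 = sp.Integer(1), x / (1 + t)    # two solutions of Burgers' equation
print(sp.simplify(burgers(v1)), sp.simplify(burgers(v2)), sp.simplify(burgers(v1 + v2)))
# 0 0 1/(t + 1): the sum of solutions is not a solution
```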

Chapter 2 Wave Equation

Section 2.1 D'Alembert Solution

Suppose that $u$ is a twice continuously differentiable function on $\mathbf R ^ 2$. For this chapter we'll label the coordinates on $\mathbf R ^ 2$ as $t$ and $x$, written in that order, and in diagrams the $t$ axis will be vertical and the $x$ axis will be horizontal. This is slightly awkward since we are used to coordinate systems in the plane where the first coordinate corresponds to the horizontal axis and the second one corresponds to the vertical. The letters $ t $ and $ x $ for time and space coordinates are far too well established to consider changing. Similarly the convention that in space-time diagrams time corresponds to the vertical direction is fairly universal. The only other option to avoid listing the vertical coordinate before the horizontal one would be to reverse the order of $ t $ and $ x $. Some authors do this, but listing coordinates in alphabetical order is standard nearly everywhere else and physicists largely switched from the convention of listing time last to listing time first more than half a century ago, so the conventions we're using here are probably the least bad option.

It's helpful to introduce the auxiliary functions \[\tag{2.1.1} v = \frac { \partial u } { \partial t } + c \frac { \partial u } { \partial x } , \quad w = \frac { \partial u } { \partial t } - c \frac { \partial u } { \partial x }.\] Since $u$ is twice continuously differentiable, $v$ and $w$ are once continuously differentiable.

Suppose that $ ( t _ 1 , x _ 1 ) $ and $ ( t _ 2 , x _ 2 ) $ are points such that \[\tag{2.1.2} x _ 1 - c t _ 1 = x _ 2 - c t _ 2 .\] and set \[\tag{2.1.3} \tau ( r ) = t _ 1 + r ( t _ 2 - t _ 1 ) , \quad \xi ( r ) = x _ 1 + r ( x _ 2 - x _ 1 ) .\] Then, by the chain rule \[\tag{2.1.4} \begin{split} \frac d { dr } z ( \tau ( r ) , \xi ( r ) ) & = \left ( t _ 2 - t _ 1 \right ) \frac { \partial z } { \partial t } ( \tau ( r ) , \xi ( r ) ) \\ & \quad {} + \left ( x _ 2 - x _ 1 \right ) \frac { \partial z } { \partial x } ( \tau ( r ) , \xi ( r ) ) \\ & = \left ( t _ 2 - t _ 1 \right ) \left ( \frac { \partial z } { \partial t } + c \frac { \partial z } { \partial x } \right ) ( \tau ( r ) , \xi ( r ) ) \end{split}\] for any function $z$ which is at least once continuously differentiable. Integrating from $r = 0$ to $r = 1$ and using the fundamental theorem of calculus we see that \[\tag{2.1.5} \begin{split} & z ( t _ 2 , x _ 2 ) = z ( t _ 1 , x _ 1 ) \\ & \qquad {} + ( t _ 2 - t _ 1 ) \int _ 0 ^ 1 \left ( \frac { \partial z } { \partial t } + c \frac { \partial z } { \partial x } \right ) \left ( t _ 1 + r ( t _ 2 - t _ 1 ) , x _ 1 + r ( x _ 2 - x _ 1 ) \right ) \, d r . \end{split}\] This holds in particular for $z = u$ and $z = w$. In the latter case note that \[\tag{2.1.6} \frac { \partial w } { \partial t } + c \frac { \partial w } { \partial x } = \frac { \partial ^ 2 u } { \partial t ^ 2 } - c ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 }\] so if, as we will assume from now until the end of this section, $u$ satisfies the wave equation, the integrand vanishes throughout the interval of integration. We then conclude that \[\tag{2.1.7} u ( t _ 2 , x _ 2 ) = u ( t _ 1 , x _ 1 ) + ( t _ 2 - t _ 1 ) \int _ 0 ^ 1 v \left ( t _ 1 + r ( t _ 2 - t _ 1 ) , x _ 1 + r ( x _ 2 - x _ 1 ) \right ) \, d r .\] and \[\tag{2.1.8} w ( t _ 2 , x _ 2 ) = w ( t _ 1 , x _ 1 ) .\] The preceding calculation was carried out under the assumption that $ x _ 1 - c t _ 1 = x _ 2 - c t _ 2 $, which is certainly true if \[\tag{2.1.9} x _ 1 = x _ 2 + c t _ 1 - c t _ 2 ,\] so we can rewrite the preceding equations as \[\tag{2.1.10} \begin{split} u ( t _ 2 , x _ 2 ) & = u ( t _ 1 , x _ 2 + c t _ 1 - c t _ 2 ) \\ & \quad {} + ( t _ 2 - t _ 1 ) \int _ 0 ^ 1 v \left ( t _ 1 + r ( t _ 2 - t _ 1 ) , x _ 2 - ( 1 - r ) c ( t _ 2 - t _ 1 ) \right ) \, d r . \end{split}\] and \[\tag{2.1.11} w ( t _ 2 , x _ 2 ) = w ( t _ 1 , x _ 2 + c t _ 1 - c t _ 2 ) .\] If instead we assume that \[\tag{2.1.12} x _ 1 + c t _ 1 = x _ 2 + c t _ 2\] then a very similar calculation leads to \[\tag{2.1.13} \begin{split} u ( t _ 2 , x _ 2 ) & = u ( t _ 1 , x _ 2 - c t _ 1 + c t _ 2 ) \\ & \quad {} + ( t _ 2 - t _ 1 ) \int _ 0 ^ 1 w \left ( t _ 1 + r ( t _ 2 - t _ 1 ) , x _ 2 + ( 1 - r ) c ( t _ 2 - t _ 1 ) \right ) \, d r . \end{split}\] and \[\tag{2.1.14} v ( t _ 2 , x _ 2 ) = v ( t _ 1 , x _ 2 - c t _ 1 + c t _ 2 ) .\]

Since $ ( t _ 2 , x _ 2 ) $ is an arbitrary point in the equations above we can substitute any other point for it. In particular we can substitute \[\tag{2.1.15} ( t _ 1 + r ( t _ 2 - t _ 1 ) , x _ 2 + ( r - 1 ) c ( t _ 2 - t _ 1 ) )\] for it in (2.1.14), which gives \[\tag{2.1.16} v ( t _ 1 + r ( t _ 2 - t _ 1 ) , x _ 1 + r c ( t _ 2 - t _ 1 ) ) = v ( t _ 1 , x _ 2 + ( 2 r - 1 ) c ( t _ 2 - t _ 1 ) ) ,\] which we can substitute into (2.1.10) to obtain \[\tag{2.1.17} \begin{split} u ( t _ 2 , x _ 2 ) & = u ( t _ 1 , x _ 2 + c t _ 1 - c t _ 2 ) \\ & \quad {} + ( t _ 2 - t _ 1 ) \int _ 0 ^ 1 v \left ( t _ 1 , x _ 2 + ( 2 r - 1 ) c ( t _ 2 - t _ 1 ) \right ) \, d r . \end{split}\] The substitution \[\tag{2.1.18} y = x _ 2 + ( 2 r - 1 ) c ( t _ 2 - t _ 1 )\] converts this into \[\tag{2.1.19} \begin{split} u ( t _ 2 , x _ 2 ) & = u ( t _ 1 , x _ 2 + c t _ 1 - c t _ 2 ) \\ & \quad {} + \frac 1 { 2 c } \int _ { x _ 2 - c ( t _ 2 - t _ 1 ) } ^ { x _ 2 + c ( t _ 2 - t _ 1 ) } v ( t _ 1 , y ) \, d y . \end{split}\] Similarly, we could substitute \[\tag{2.1.20} ( t _ 1 + r ( t _ 2 - t _ 1 ) , x _ 2 + ( 1 - r ) c ( t _ 2 - t _ 1 ) )\] for $ ( t _ 2 , x _ 2 ) $ in (2.1.11), substitute the result into (2.1.13), and make the substitution \[\tag{2.1.21} y = x _ 2 + ( 1 - 2 r ) c ( t _ 2 - t _ 1 )\] into the resulting integral to obtain \[\tag{2.1.22} \begin{split} u ( t _ 2 , x _ 2 ) & = u ( t _ 1 , x _ 2 - c t _ 1 + c t _ 2 ) \\ & \quad {} + \frac 1 { 2 c } \int _ { x _ 2 - c ( t _ 2 - t _ 1 ) } ^ { x _ 2 + c ( t _ 2 - t _ 1 ) } w ( t _ 1 , y ) \, d y . \end{split}\] Averaging (2.1.19) and (2.1.22) and noting that \[\tag{2.1.23} v + w = 2 \frac { \partial u } { \partial t }\] gives the equation \[\tag{2.1.24} \begin{split} u ( t _ 2 , x _ 2 ) & = \frac 1 2 u ( t _ 1 , x _ 2 + c t _ 1 - c t _ 2 ) \\ & \quad {} + \frac 1 2 u ( t _ 1 , x _ 2 - c t _ 1 + c t _ 2 ) \\ & \quad {} + \frac 1 { 2 c } \int _ { x _ 2 - c ( t _ 2 - t _ 1 ) } ^ { x _ 2 + c ( t _ 2 - t _ 1 ) } \frac { \partial u } { \partial t } ( t _ 1 , y ) \, d y . \end{split}\] Relabeling the variables gives D'Alembert's formula \[\tag{2.1.25} u ( t , x ) = \frac 1 2 f ( x + c s - c t ) + \frac 1 2 f ( x - c s + c t ) + \frac 1 { 2 c } \int _ { x + c s - c t } ^ { x - c s + c t } g ( y ) \, d y ,\] where \[\tag{2.1.26} f ( y ) = u ( s , y ) , \quad g ( y ) = \frac { \partial u } { \partial t } ( s , y ) .\] This gives the solution at time $ t $ in terms of its values and the values of its first derivatives at time $ s $. Note that we haven't assumed $ s < t $, although the formula is usually applied in this case. We haven't even assumed $ s \neq t $, although the formula doesn't give us any useful information when $ s = t $.
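As a sanity check, here is a small numerical sketch of D'Alembert's formula. The choices $ c = 1 $, $ s = 0 $, $ f ( y ) = \sin y $ and $ g ( y ) = \cos y $ are arbitrary smooth data picked purely for the illustration; for these data the solution happens to be $ u ( t , x ) = \sin ( x + t ) $, so the two numbers printed should agree up to quadrature error.

```python
# A numerical evaluation of D'Alembert's formula (2.1.25), compared against
# the known closed form for the sample data c = 1, s = 0, f = sin, g = cos.
import numpy as np
from scipy.integrate import quad

c, s = 1.0, 0.0
f, g = np.sin, np.cos

def dalembert(t, x):
    lo, hi = x + c * s - c * t, x - c * s + c * t   # the limits in (2.1.25)
    integral, _ = quad(g, lo, hi)
    return 0.5 * f(lo) + 0.5 * f(hi) + integral / (2.0 * c)

t0, x0 = 0.7, 0.3
print(dalembert(t0, x0), np.sin(x0 + t0))   # both approximately 0.8415
```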

Section 2.2 Existence and Uniqueness

D'Alembert's formula has a natural interpretation in terms of the initial value problem for the wave equation, i.e. the problem of finding a classical solution to the wave equation with initial conditions at time $s$ given by \[\tag{2.2.1} u ( s , y ) = f ( y ) , \quad \frac { \partial u } { \partial t } ( s , y ) = g ( y ) .\] A classical solution is just a twice continuously differentiable function. It's natural to assume two derivatives because the equation is of second order. In other words, second derivatives are the highest ones which appear. The equation doesn't have any obvious interpretation if we assume much less differentiability than this. It is possible, and indeed useful, to give it less obvious interpretations which assume less differentiability, but that would be a topic for a more advanced text. Here we'll only consider classical solutions. Since $u$ should be twice continuously differentiable the initial conditions force $ f $ to be twice continuously differentiable as well and $ g $ to be continuously differentiable. So a more precise formulation of the initial value problem is, given a twice continuously differentiable function $ f $ and a continuously differentiable function $ g $, to find a classical solution $ u $ of the wave equation such that the initial conditions (2.2.1) are satisfied, i.e. a twice continuously differentiable function $ u $ satisfying \[\tag{2.2.2} \frac { \partial ^ 2 u } { \partial t ^ 2 } - c ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 } = 0 .\]

In terms of the initial value problem of the preceding paragraph the calculation of the preceding section provides a proof of the following theorem.

Theorem 2.2.A For given $ f $ and $ g $, defined on the real line $ \mathbf R $, there is at most one solution to the initial value problem in $ \mathbf R ^ 2 $, and any such solution is given by D'Alembert's formula (2.1.25).
A closer inspection of the proof shows that it also proves the following local version.
Theorem 2.2.B For given $ f $ and $ g $, defined on the interval $ [ a , b ] $, there is at most one solution to the initial value problem in the parallelogram in $ \mathbf R ^ 2 $ with vertices $ \left ( s , a \right ) $, $ \left ( s - \frac { b - a } { 2 c } , \frac { a + b } 2 \right ) $, $ \left ( s , b \right ) $, and $ \left ( s + \frac { b - a } { 2 c } , \frac { a + b } 2 \right ) $, and any such solution is given by D'Alembert's formula (2.1.25).
To verify this you can go back through the calculation and see where the various functions need to be defined in order to justify our uses of the chain rule and fundamental theorem of calculus. The global theorem is actually a consequence of the local theorem since every point in $ \mathbf R ^ 2 $ belongs to such a parallelogram for some choice of interval $ [ a , b ] $.

Both theorems above are pure uniqueness theorems. They assert that there is at most one solution of the initial value problem, without guaranteeing that there is at least one. And indeed there is no way to turn the calculation of the preceding section into an existence proof, since we assumed at a very early stage that we had a classical solution $ u $. The nice thing about an explicit formula though is that it naturally suggests a way of proving existence: we just need to take the formula and verify that what it gives is indeed a solution. Fortunately this works for D'Alembert's formula. The most naive way of doing this is rather awkward though, for two reasons. First, we'd need to differentiate under the integral, and the variables with respect to which we want to differentiate appear in the limits of the integral. Second, we need to take two derivatives and the function $ g $ appearing in the integrand is only known to have one derivative. Neither of these problems is insurmountable but it's possible to avoid facing either of them directly.

We start by choosing some point $ p $ in the interval in which the functions $ f $ and $ g $ are defined and set \[\tag{2.2.3} \begin{split} \varphi ( z ) & = \frac 1 2 f ( z ) + \frac 1 { 2 c } \int _ z ^ p g ( y ) \, d y , \\ \psi ( z ) & = \frac 1 2 f ( z ) + \frac 1 { 2 c } \int _ p ^ z g ( y ) \, d y \end{split}\] with the usual convention that if the lower limit of an integral is greater than the upper limit then the limits should be swapped and the sign should be changed. Then $ \varphi $ and $ \psi $ are twice continuously differentiable functions, defined on the same interval that $ f $ and $ g $ were. Indeed the fundamental theorem of calculus gives \[\tag{2.2.4} \begin{split} \varphi ' ( z ) & = \frac 1 2 f ' ( z ) - \frac 1 { 2 c } g ( z ) , \\ \psi ' ( z ) & = \frac 1 2 f ' ( z ) + \frac 1 { 2 c } g ( z ) \end{split}\] and then taking an additional derivative gives \[\tag{2.2.5} \begin{split} \varphi '' ( z ) & = \frac 1 2 f '' ( z ) - \frac 1 { 2 c } g ' ( z ) , \\ \psi '' ( z ) & = \frac 1 2 f '' ( z ) + \frac 1 { 2 c } g ' ( z ) . \end{split}\] By assumption $ f $ is twice continuously differentiable and $ g $ is continuously differentiable in the initial value problem, so the right hand sides are continuous. Let \[\tag{2.2.6} u ( t , x ) = \varphi ( x + c s - c t ) + \psi ( x - c s + c t )\] and note that $ u $ is twice continuously differentiable. Then \[\tag{2.2.7} u ( s , x ) = \varphi ( x ) + \psi ( x ) = f ( x ) .\] Also, \[\tag{2.2.8} \frac { \partial u } { \partial t } ( t , x ) = - c \varphi ' ( x + c s - c t ) + c \psi ' ( x - c s + c t )\] so \[\tag{2.2.9} \frac { \partial u } { \partial t } ( s , x ) = - c \varphi ' ( x ) + c \psi ' ( x ) = g ( x ) .\] Thus $ u $ satisfies the initial conditions. Does it also satisfy the wave equation? Taking another derivative, \[\tag{2.2.10} \frac { \partial ^ 2 u } { \partial t ^ 2 } ( t , x ) = c ^ 2 \varphi '' ( x + c s - c t ) + c ^ 2 \psi '' ( x - c s + c t ) .\] Similarly, \[\tag{2.2.11} \frac { \partial u } { \partial x } ( t , x ) = \varphi ' ( x + c s - c t ) + \psi ' ( x - c s + c t )\] and \[\tag{2.2.12} \frac { \partial ^ 2 u } { \partial x ^ 2 } ( t , x ) = \varphi '' ( x + c s - c t ) + \psi '' ( x - c s + c t )\] so \[\tag{2.2.13} \frac { \partial ^ 2 u } { \partial t ^ 2 } ( t , x ) - c ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 } ( t , x ) = 0\] and so $ u $ is a classical solution of the wave equation. In particular, since we've already checked that the initial conditions are satisfied, the initial value problem has at least one solution. We've already seen that it has at most one solution so there is exactly one solution. The uniqueness theorems tell us that this solution must be given by D'Alembert's formula so there is no need to check separately that the expression for $ u $ given above is equal to the one in D'Alembert's formula, although it is not difficult to do so.
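The construction above is easy to reproduce symbolically. The following sympy sketch builds $ \varphi $ and $ \psi $ from (2.2.3) for placeholder data $ f $ and $ g $, with $ p $ and $ s $ left symbolic, forms $ u $ as in (2.2.6), and checks the initial conditions and the wave equation. The integral $ \int _ z ^ p $ has been rewritten as $ - \int _ p ^ z $ so that the two integrals cancel syntactically at $ t = s $.

```python
# A sympy sketch of the existence argument; f and g are arbitrary placeholder
# functions and p, s, c are left as symbols.
import sympy as sp

t, x, y, z, s, p, c = sp.symbols('t x y z s p c')
f, g = sp.Function('f'), sp.Function('g')

phi = f(z) / 2 - sp.Integral(g(y), (y, p, z)) / (2 * c)   # (2.2.3), first line
psi = f(z) / 2 + sp.Integral(g(y), (y, p, z)) / (2 * c)   # (2.2.3), second line

u = phi.subs(z, x + c * s - c * t) + psi.subs(z, x - c * s + c * t)   # (2.2.6)

print(sp.simplify(u.subs(t, s) - f(x)))                         # 0: u(s, x) = f(x)
print(sp.simplify(sp.diff(u, t).subs(t, s) - g(x)))             # 0: u_t(s, x) = g(x)
print(sp.simplify(sp.diff(u, t, 2) - c**2 * sp.diff(u, x, 2)))  # 0: wave equation
```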

We've now proved existence theorems, complementary to our earlier uniqueness theorems:

Theorem 2.2.C For given $ f $ and $ g $, defined on the real line $ \mathbf R $, there is at least one solution to the initial value problem in $ \mathbf R ^ 2 $, which is given by D'Alembert's formula (2.1.25).
Theorem 2.2.D For given $ f $ and $ g $, defined on the interval $ [ a , b ] $, there is at least one solution to the initial value problem in the parallelogram in $ \mathbf R ^ 2 $ with vertices $ \left ( s , a \right ) $, $ \left ( s - \frac { b - a } { 2 c } , \frac { a + b } 2 \right ) $, $ \left ( s , b \right ) $, and $ \left ( s + \frac { b - a } { 2 c } , \frac { a + b } 2 \right ) $, which is given by D'Alembert's formula (2.1.25).

Section 2.3 Energy

We saw in section 2.1 that when $u$ is a classical solution to the wave equation the functions \[\tag{2.3.1} v = \frac { \partial u } { \partial t } + c \frac { \partial u } { \partial x } , \quad w = \frac { \partial u } { \partial t } - c \frac { \partial u } { \partial x } \] satisfy the relations \[\tag{2.3.2} v ( t , x ) = v ( s , x - c s + c t ) , \quad w ( t , x ) = w ( s , x + c s - c t ) . \] The quantity \[\tag{2.3.3} E = \frac 1 4 v ^ 2 + \frac 1 4 w ^ 2 = \frac 1 2 \left ( \frac { \partial u } { \partial t } \right ) ^ 2 + \frac { c ^ 2 } 2 \left ( \frac { \partial u } { \partial x } \right ) ^ 2 \] has a physical interpretation as the energy density, so the integral \[\tag{2.3.4} I = \frac 1 4 \int _ a ^ b \left ( v ( t , x ) ^ 2 + w ( t , x ) ^ 2 \right) \, d x \] represents the energy present in the interval $ [ a , b ] $ at time $t$. We can split this into two parts, \[\tag{2.3.5} I = \frac 1 4 \int _ a ^ b v ( t , x ) ^ 2 \, d x + \frac 1 4 \int _ a ^ b w ( t , x ) ^ 2 \, d x \] and use the relations above to get \[\tag{2.3.6} I = \frac 1 4 \int _ a ^ b v ( s , x - c s + c t ) ^ 2 \, d x + \frac 1 4 \int _ a ^ b w ( s , x + c s - c t ) ^ 2 \, d x \] or, changing variables in the integrals, \[\tag{2.3.7} I = \frac 1 4 \int _ { a - c s + c t } ^ { b - c s + c t } v ( s , x ) ^ 2 \, d x + \frac 1 4 \int _ { a + c s - c t } ^ { b + c s - c t } w ( s , x ) ^ 2 \, d x . \] Now the integral of a non-negative integrand over an interval is at least as large as the integral over a smaller interval and at most as large as the integral over a larger interval so we see that \[\tag{2.3.8} I \ge \frac 1 4 \int _ { \max ( a + c s - c t , a - c s + c t ) } ^ { \min ( b + c s - c t , b - c s + c t ) } v ( s , x ) ^ 2 \, d x + \frac 1 4 \int _ { \max ( a + c s - c t , a - c s + c t ) } ^ { \min ( b + c s - c t , b - c s + c t ) } w ( s , x ) ^ 2 \, d x \] \[\tag{2.3.9} I \le \frac 1 4 \int _ { \min ( a + c s - c t , a - c s + c t ) } ^ { \max ( b + c s - c t , b - c s + c t ) } v ( s , x ) ^ 2 \, d x + \frac 1 4 \int _ { \min ( a + c s - c t , a - c s + c t ) } ^ { \max ( b + c s - c t , b - c s + c t ) } w ( s , x ) ^ 2 \, d x . \] Combining the integrals, and writing the limits in a slightly cleaner form, we find \[\tag{2.3.10} \int _ { a + c | s - t | } ^ { b - c | s - t | } E ( s , x ) \, d x \le I \le \int _ { a - c | s - t | } ^ { b + c | s - t | } E ( s , x ) \, d x . \] If $ 2 c | s - t | > b - a $ then the lower limit of the integral on the left is greater than its upper limit but that's okay. We continue to follow the convention that in such cases the limits are to be swapped and the sign is to be changed and we still get a valid inequality in that case.

Suppose that the integral \[\tag{2.3.11} \int _ { - \infty } ^ { + \infty } E ( s , x ) \, d x\] is finite, i.e. that the total energy in all of space at time $s$ is finite. This is true if and only if the integral over a finite interval, \[\tag{2.3.12} \int _ { \alpha } ^ \beta E ( s , x ) \, d x\] tends to a finite limit as $\alpha$ tends to $- \infty$ and $\beta$ tends to $+ \infty$, in which case the integrals in the upper and lower bounds of the inequality above tend to that same limit as $a$ tends to $- \infty$ and $b$ tends to $+ \infty$. It then follows from the squeeze principle from real analysis that the limit of $I$ as $a$ tends to $- \infty$ and $b$ tends to $+ \infty$ exists and is equal to the other limits considered. In other words, \[\tag{2.3.13} \int _ { - \infty } ^ { + \infty } E ( t , x ) \, d x = \int _ { - \infty } ^ { + \infty } E ( s , x ) \, d x \] in the sense that if the integral on the right is finite then so is the integral on the left and both are equal. Since $s$ and $t$ are arbitrary we therefore have the following theorem.

Theorem 2.3.A Suppose $u$ is a classical solution to the wave equation. If the total energy \[\tag{2.3.14} \int _ { - \infty } ^ { + \infty } E ( t , x ) \, d x \] is finite for some value of $t$ then it is finite for all values of $t$ and is independent of $t$.
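Here is a rough numerical illustration of this theorem, assuming $ c = 1 $, initial time $ s = 0 $ and the sample data $ f ( y ) = e ^ { - y ^ 2 } $, $ g ( y ) = 0 $, for which the solution is simply $ u ( t , x ) = \frac 1 2 \left ( f ( x - c t ) + f ( x + c t ) \right ) $. The total energy, approximated by a Riemann sum over a large interval, comes out the same at each of the sampled times.

```python
# A crude numerical check of energy conservation for Gaussian bump data.
import numpy as np

c = 1.0
fp = lambda y: -2.0 * y * np.exp(-y**2)     # f'(y) for f(y) = exp(-y**2)

def total_energy(t, xs):
    # u(t, x) = (f(x - c t) + f(x + c t)) / 2, differentiated exactly.
    u_t = 0.5 * (-c * fp(xs - c * t) + c * fp(xs + c * t))
    u_x = 0.5 * (fp(xs - c * t) + fp(xs + c * t))
    E = 0.5 * u_t**2 + 0.5 * c**2 * u_x**2
    return np.sum(E) * (xs[1] - xs[0])      # Riemann sum approximation

xs = np.linspace(-60.0, 60.0, 120001)
print([total_energy(t, xs) for t in (0.0, 1.0, 5.0, 25.0)])
# the four printed totals agree (about 0.6267 each)
```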

Section 2.4 Symmetries

Symmetries of a differential equation are transformations of a function with the property that the transformed function satisfies the differential equation if and only if the original function does. Symmetries often depend on one or more parameters. For example, the wave equation has the scaling symmetry \[\tag{2.4.1} ( S _ \alpha u ) ( t , x ) = u ( t / \alpha , x / \alpha ) ,\] where $\alpha$ is non-zero. To verify that this is indeed a symmetry we need to check that if $ \tilde u = S _ \alpha u $ for some non-zero value of $\alpha$ then $ u $ is a solution of the wave equation if and only if $ \tilde u $ is. This is completely straightforward because the chain rule gives \[\tag{2.4.2} \frac { \partial \tilde u } { \partial t } ( t , x ) = \frac 1 \alpha \frac { \partial u } { \partial t } ( t / \alpha , x / \alpha ) ,\] \[\tag{2.4.3} \frac { \partial \tilde u } { \partial x } ( t , x ) = \frac 1 \alpha \frac { \partial u } { \partial x } ( t / \alpha , x / \alpha ) ,\] \[\tag{2.4.4} \frac { \partial ^ 2 \tilde u } { \partial t ^ 2 } ( t , x ) = \frac 1 { \alpha ^ 2 } \frac { \partial ^ 2 u } { \partial t ^ 2 } ( t / \alpha , x / \alpha ) ,\] and \[\tag{2.4.5} \frac { \partial ^ 2 \tilde u } { \partial x ^ 2 } ( t , x ) = \frac 1 { \alpha ^ 2 } \frac { \partial ^ 2 u } { \partial x ^ 2 } ( t / \alpha , x / \alpha )\] so \[\tag{2.4.6} \frac { \partial ^ 2 \tilde u } { \partial t ^ 2 } ( t , x ) - c ^ 2 \frac { \partial ^ 2 \tilde u } { \partial x ^ 2 } ( t , x ) = \frac 1 { \alpha ^ 2 } \left [ \frac { \partial ^ 2 u } { \partial t ^ 2 } ( t / \alpha , x / \alpha ) - c ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 } ( t / \alpha , x / \alpha ) \right ]\] and hence $ \tilde u $ satisfies \[\tag{2.4.7} \frac { \partial ^ 2 \tilde u } { \partial t ^ 2 } - c ^ 2 \frac { \partial ^ 2 \tilde u } { \partial x ^ 2 } = 0\] wherever it's defined if and only if $ u $ satisfies \[\tag{2.4.8} \frac { \partial ^ 2 u } { \partial t ^ 2 } - c ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 } = 0\] wherever it's defined. The domains of definition need not be the same, although they could be.
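A quick sympy spot check: the functions $ F $ and $ G $ below are arbitrary placeholders, and $ F ( x - c t ) + G ( x + c t ) $ is used as a generic solution of the wave equation; applying $ S _ \alpha $ to it should, and does, give another solution.

```python
# A sympy spot check that S_alpha maps solutions to solutions.
import sympy as sp

t, x, c, alpha = sp.symbols('t x c alpha', nonzero=True)
F, G = sp.Function('F'), sp.Function('G')

u = lambda tt, xx: F(xx - c * tt) + G(xx + c * tt)   # a generic solution
utilde = u(t / alpha, x / alpha)                     # (S_alpha u)(t, x)

print(sp.simplify(sp.diff(utilde, t, 2) - c**2 * sp.diff(utilde, x, 2)))   # 0
```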

Although the preceding proof is straightforward there is one point where the notation could be confusing. In (2.4.2) the expression \[\tag{2.4.9} \frac { \partial u } { \partial t } ( t / \alpha , x / \alpha )\] means that we take the function $ u $, differentiate it with respect to its first argument, and then evaluate the resulting function at the point $ ( t / \alpha , x / \alpha ) $, not that we differentiate the function obtained by mapping $ ( t , x ) $ to $ u ( t / \alpha , x / \alpha ) $ with respect to its first argument. Either of these would be a plausible interpretation of the expression but they are unfortunately not equal in general. In these notes we'll be consistent in interpreting partial derivatives as in this example.

Another symmetry of the wave equation is the spatial reflection symmetry \[\tag{2.4.10} ( R u ) ( t , x ) = u ( t , - x ) .\] The proof that this is indeed a symmetry is straightforward. Temporal reflection is also a symmetry, but we don't need to check this separately, since we can write a temporal reflection as the composition of a spatial reflection and a scaling by a factor of $- 1$, in either order: \[\tag{2.4.11} ( R S _ { - 1 } u ) ( t , x ) = ( S _ { - 1 } u ) ( t , - x ) = u ( - t , x )\] and \[\tag{2.4.12} ( S _ { - 1 } R u ) ( t , x ) = ( R u ) ( - t , - x ) = u ( - t , x ) .\] Although scaling and spatial reflection happen to commute this isn't true of symmetries in general. For example, we also have spacetime translational symmetries \[\tag{2.4.13} ( T _ { \tau , \xi } u ) ( t , x ) = u ( t - \tau , x - \xi ) .\] Again, the proof that this is a symmetry for any real numbers $ \tau $ and $ \xi $ is straightforward. Note that \[\tag{2.4.14} ( T _ { \tau , \xi } R u ) ( t , x ) = ( R u ) ( t - \tau , x - \xi ) = u ( t - \tau , \xi - x )\] while \[\tag{2.4.15} ( R T _ { \tau , \xi } u ) ( t , x ) = ( T _ { \tau , \xi } u ) ( t , - x ) = u ( t - \tau , - x - \xi ) ,\] so although the compositions $ T _ { \tau , \xi } R $ and $ R T _ { \tau , \xi } $ are both symmetries they are not the same symmetry unless $ \xi = 0 $.
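The failure to commute is easy to see concretely. In the tiny sympy sketch below $ u $ is an arbitrary placeholder function and the transformations act on expressions by substitution; the two printed results reproduce (2.4.14) and (2.4.15) and differ unless $ \xi = 0 $.

```python
# Composing the translation T_{tau, xi} and the reflection R in the two orders.
import sympy as sp

t, x, tau, xi = sp.symbols('t x tau xi')
u = sp.Function('u')

R = lambda e: e.subs(x, -x)                        # (R u)(t, x) = u(t, -x)
T = lambda e: e.subs([(t, t - tau), (x, x - xi)])  # (T u)(t, x) = u(t - tau, x - xi)

print(T(R(u(t, x))))   # u(t - tau, xi - x), as in (2.4.14)
print(R(T(u(t, x))))   # u(t - tau, -x - xi), as in (2.4.15)
```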

Not all symmetries of the wave equation are as easy to verify as the ones above. Another important class of symmetries are the Lorentz transformations \[\tag{2.4.16} ( L _ { \kappa } u ) ( t , x ) = u \left ( \cosh \kappa \, t + \frac 1 c \sinh \kappa \, x , c \sinh \kappa \, t + \cosh \kappa \, x \right ) .\] If $ \tilde u = L _ { \kappa } u $ then \[\tag{2.4.17} \begin{split} \frac { \partial \tilde u } { \partial t } ( t , x ) & = \cosh \kappa \, \frac { \partial u } { \partial t } \left ( \cosh \kappa \, t + \frac 1 c \sinh \kappa \, x , c \sinh \kappa \, t + \cosh \kappa \, x \right ) \\ & \quad {} + c \sinh \kappa \, \frac { \partial u } { \partial x } \left ( \cosh \kappa \, t + \frac 1 c \sinh \kappa \, x , c \sinh \kappa \, t + \cosh \kappa \, x \right ) , \end{split}\] \[\tag{2.4.18} \begin{split} \frac { \partial \tilde u } { \partial x } ( t , x ) & = \frac 1 c \sinh \kappa \, \frac { \partial u } { \partial t } \left ( \cosh \kappa \, t + \frac 1 c \sinh \kappa \, x , c \sinh \kappa \, t + \cosh \kappa \, x \right ) \\ & \quad {} + \cosh \kappa \, \frac { \partial u } { \partial x } \left ( \cosh \kappa \, t + \frac 1 c \sinh \kappa \, x , c \sinh \kappa \, t + \cosh \kappa \, x \right ) , \end{split}\] \[\tag{2.4.19} \begin{split} \frac { \partial ^ 2 \tilde u } { \partial t ^ 2 } ( t , x ) & = \cosh ^ 2 \kappa \, \frac { \partial ^ 2 u } { \partial t ^ 2 } \left ( \cosh \kappa \, t + \frac 1 c \sinh \kappa \, x , c \sinh \kappa \, t + \cosh \kappa \, x \right ) \\ & \quad {} + 2 c \sinh \kappa \, \cosh \kappa \, \frac { \partial ^ 2 u } { \partial t \partial x } \left ( \cosh \kappa \, t + \frac 1 c \sinh \kappa \, x , c \sinh \kappa \, t + \cosh \kappa \, x \right ) \\ & \quad {} + c ^ 2 \sinh ^ 2 \kappa \, \frac { \partial ^ 2 u } { \partial x ^ 2 } \left ( \cosh \kappa \, t + \frac 1 c \sinh \kappa \, x , c \sinh \kappa \, t + \cosh \kappa \, x \right ) , \end{split}\] \[\tag{2.4.20} \begin{split} \frac { \partial ^ 2 \tilde u } { \partial x ^ 2 } ( t , x ) & = \frac 1 { c ^ 2 } \sinh ^ 2 \kappa \, \frac { \partial ^ 2 u } { \partial t ^ 2 } \left ( \cosh \kappa \, t + \frac 1 c \sinh \kappa \, x , c \sinh \kappa \, t + \cosh \kappa \, x \right ) \\ & \quad {} + \frac 2 c \sinh \kappa \, \cosh \kappa \, \frac { \partial ^ 2 u } { \partial t \partial x } \left ( \cosh \kappa \, t + \frac 1 c \sinh \kappa \, x , c \sinh \kappa \, t + \cosh \kappa \, x \right ) \\ & \quad {} + \cosh ^ 2 \kappa \, \frac { \partial ^ 2 u } { \partial x ^ 2 } \left ( \cosh \kappa \, t + \frac 1 c \sinh \kappa \, x , c \sinh \kappa \, t + \cosh \kappa \, x \right ) , \end{split}\] and \[\tag{2.4.21} \begin{split} \frac { \partial ^ 2 \tilde u } { \partial t ^ 2 } ( t , x ) - c ^ 2 \frac { \partial ^ 2 \tilde u } { \partial x ^ 2 } ( t , x ) & = \frac { \partial ^ 2 u } { \partial t ^ 2 } \left ( \cosh \kappa \, t + \frac 1 c \sinh \kappa \, x , c \sinh \kappa \, t + \cosh \kappa \, x \right ) \\ & \quad {} - c ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 } \left ( \cosh \kappa \, t + \frac 1 c \sinh \kappa \, x , c \sinh \kappa \, t + \cosh \kappa \, x \right ) . \end{split}\] Here we've used the identity $ \cosh ^ 2 \kappa - \sinh ^ 2 \kappa = 1 $ from the theory of hyperbolic functions. 
So $ \tilde u $ is a solution of \[\tag{2.4.22} \frac { \partial ^ 2 \tilde u } { \partial t ^ 2 } - c ^ 2 \frac { \partial ^ 2 \tilde u } { \partial x ^ 2 } = 0\] if and only if $ u $ is a solution of \[\tag{2.4.23} \frac { \partial ^ 2 u } { \partial t ^ 2 } - c ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 } = 0 ,\] and so $ L _ \kappa $ is indeed a symmetry.

Even the calculation for $ L _ \kappa $ is relatively tame though compared to the one required to show that spacetime inversion \[\tag{2.4.24} ( I u ) ( t , x ) = u \left ( \frac { c ^ 2 t } { c ^ 2 t ^ 2 - x ^ 2 } , \frac { c ^ 2 x } { c ^ 2 t ^ 2 - x ^ 2 } \right )\] is a symmetry. Set $ \tilde u = I u $, \[\tag{2.4.25} \tau ( t , x ) = \frac { c ^ 2 t } { c ^ 2 t ^ 2 - x ^ 2 }, \] and \[\tag{2.4.26} \xi ( t , x ) = \frac { c ^ 2 x } { c ^ 2 t ^ 2 - x ^ 2 } \] so that \[\tag{2.4.27} \tilde u ( t , x ) = u ( \tau ( t , x ) , \xi ( t , x ) ) .\] By the chain rule we have \[\tag{2.4.28} \frac { \partial \tilde u } { \partial t } ( t , x ) = \frac { \partial \tau } { \partial t } ( t , x ) \frac { \partial u } { \partial t } ( \tau ( t , x ) , \xi ( t , x ) ) + \frac { \partial \xi } { \partial t } ( t , x ) \frac { \partial u } { \partial x } ( \tau ( t , x ) , \xi ( t , x ) ) ,\] \[\tag{2.4.29} \frac { \partial \tilde u } { \partial x } ( t , x ) = \frac { \partial \tau } { \partial x } ( t , x ) \frac { \partial u } { \partial t } ( \tau ( t , x ) , \xi ( t , x ) ) + \frac { \partial \xi } { \partial x } ( t , x ) \frac { \partial u } { \partial x } ( \tau ( t , x ) , \xi ( t , x ) ) ,\] \[\tag{2.4.30} \begin{split} \frac { \partial ^ 2 \tilde u } { \partial t ^ 2 } ( t , x ) & = \left ( \frac { \partial \tau } { \partial t } ( t , x ) \right ) ^ 2 \frac { \partial ^ 2 u } { \partial t ^ 2 } ( \tau ( t , x ) , \xi ( t , x ) ) \\ & \quad {} + 2 \frac { \partial \tau } { \partial t } ( t , x ) \frac { \partial \xi } { \partial t } ( t , x ) \frac { \partial ^ 2 u } { \partial t \partial x } ( \tau ( t , x ) , \xi ( t , x ) ) \\ & \quad {} + \left ( \frac { \partial \xi } { \partial t } ( t , x ) \right ) ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 } ( \tau ( t , x ) , \xi ( t , x ) ) \\ & \quad {} + \frac { \partial ^ 2 \tau } { \partial t ^ 2 } ( t , x ) \frac { \partial u } { \partial t } ( \tau ( t , x ) , \xi ( t , x ) ) \\ & \quad {} + \frac { \partial ^ 2 \xi } { \partial t ^ 2 } ( t , x ) \frac { \partial u } { \partial x } ( \tau ( t , x ) , \xi ( t , x ) ) , \end{split}\] and \[\tag{2.4.31} \begin{split} \frac { \partial ^ 2 \tilde u } { \partial x ^ 2 } ( t , x ) & = \left ( \frac { \partial \tau } { \partial x } ( t , x ) \right ) ^ 2 \frac { \partial ^ 2 u } { \partial t ^ 2 } ( \tau ( t , x ) , \xi ( t , x ) ) \\ & \quad {} + 2 \frac { \partial \tau } { \partial x } ( t , x ) \frac { \partial \xi } { \partial x } ( t , x ) \frac { \partial ^ 2 u } { \partial t \partial x } ( \tau ( t , x ) , \xi ( t , x ) ) \\ & \quad {} + \left ( \frac { \partial \xi } { \partial x } ( t , x ) \right ) ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 } ( \tau ( t , x ) , \xi ( t , x ) ) \\ & \quad {} + \frac { \partial ^ 2 \tau } { \partial x ^ 2 } ( t , x ) \frac { \partial u } { \partial t } ( \tau ( t , x ) , \xi ( t , x ) ) \\ & \quad {} + \frac { \partial ^ 2 \xi } { \partial x ^ 2 } ( t , x ) \frac { \partial u } { \partial x } ( \tau ( t , x ) , \xi ( t , x ) ) . 
\end{split}\] We can then compute the partial derivatives \[\tag{2.4.32} \frac { \partial \tau } { \partial t } ( t , x ) = - c ^ 2 \frac { c ^ 2 t ^ 2 + x ^ 2 } { \left ( c ^ 2 t ^ 2 - x ^ 2 \right ) ^ 2 } ,\] \[\tag{2.4.33} \frac { \partial \tau } { \partial x } ( t , x ) = \frac { 2 c ^ 2 t x } { \left ( c ^ 2 t ^ 2 - x ^ 2 \right ) ^ 2 } ,\] \[\tag{2.4.34} \frac { \partial \xi } { \partial t } ( t , x ) = - \frac { 2 c ^ 4 t x } { \left ( c ^ 2 t ^ 2 - x ^ 2 \right ) ^ 2 } ,\] \[\tag{2.4.35} \frac { \partial \xi } { \partial x } ( t , x ) = c ^ 2 \frac { c ^ 2 t ^ 2 + x ^ 2 } { \left ( c ^ 2 t ^ 2 - x ^ 2 \right ) ^ 2 } ,\] \[\tag{2.4.36} \frac { \partial ^ 2 \tau } { \partial t ^ 2 } ( t , x ) = c ^ 4 t \frac { 2 c ^ 2 t ^ 2 + 6 x ^ 2 } { \left ( c ^ 2 t ^ 2 - x ^ 2 \right ) ^ 3 } ,\] \[\tag{2.4.37} \frac { \partial ^ 2 \tau } { \partial x ^ 2 } ( t , x ) = c ^ 2 t \frac { 2 c ^ 2 t ^ 2 + 6 x ^ 2 } { \left ( c ^ 2 t ^ 2 - x ^ 2 \right ) ^ 3 } ,\] \[\tag{2.4.38} \frac { \partial ^ 2 \xi } { \partial t ^ 2 } ( t , x ) = c ^ 4 x \frac { 6 c ^ 2 t ^ 2 + 2 x ^ 2 } { \left ( c ^ 2 t ^ 2 - x ^ 2 \right ) ^ 3 } ,\] and \[\tag{2.4.39} \frac { \partial ^ 2 \xi } { \partial x ^ 2 } ( t , x ) = c ^ 2 x \frac { 6 c ^ 2 t ^ 2 + 2 x ^ 2 } { \left ( c ^ 2 t ^ 2 - x ^ 2 \right ) ^ 3 } .\] Substituting, and using the identity $ ( c ^ 2 t ^ 2 + x ^ 2 ) ^ 2 - 4 c ^ 2 t ^ 2 x ^ 2 = ( c ^ 2 t ^ 2 - x ^ 2 ) ^ 2 $, \[\tag{2.4.40} \frac { \partial ^ 2 \tilde u } { \partial t ^ 2 } ( t , x ) - c ^ 2 \frac { \partial ^ 2 \tilde u } { \partial x ^ 2 } ( t , x ) = \frac { c ^ 4 } { \left ( c ^ 2 t ^ 2 - x ^ 2 \right ) ^ 2 } \left ( \frac { \partial ^ 2 u } { \partial t ^ 2 } ( \tau ( t , x ) , \xi ( t , x ) ) - c ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 } ( \tau ( t , x ) , \xi ( t , x ) ) \right ) .\] It follows that $ \tilde u $ satisfies the wave equation if and only if $ u $ does, i.e. that $ I $ is a symmetry of the wave equation.
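Rather than redoing this computation by hand, one can spot check it with sympy. The sketch below applies the inversion to a generic solution of the form $ F ( x - c t ) + G ( x + c t ) $, with $ F $ and $ G $ arbitrary placeholder functions, and confirms that the result still satisfies the wave equation away from the light cone $ c ^ 2 t ^ 2 = x ^ 2 $.

```python
# A sympy spot check that the spacetime inversion I is a symmetry.
import sympy as sp

t, x, c = sp.symbols('t x c', positive=True)
F, G = sp.Function('F'), sp.Function('G')

D = c**2 * t**2 - x**2
tau, xi = c**2 * t / D, c**2 * x / D                 # as in (2.4.25), (2.4.26)

u = lambda tt, xx: F(xx - c * tt) + G(xx + c * tt)   # a generic solution
utilde = u(tau, xi)                                  # (I u)(t, x)

print(sp.simplify(sp.diff(utilde, t, 2) - c**2 * sp.diff(utilde, x, 2)))   # 0
```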

It follows immediately from the definition of a symmetry that the composition of two symmetries is a symmetry and the inverse of a symmetry is a symmetry. In other words the symmetries of a differential equation form a group. The problem of determining the full symmetry group of a differential equation is in general quite a difficult one and indeed the theory of Lie groups was originally developed as a technique for solving this problem.

There are two particularly simple symmetries which are valid not just for the wave equation but for all linear homogeneous differential equations. One is scaling of the dependent variable \[\tag{2.4.41} ( M _ \lambda u ) ( t , x ) = \lambda u ( t , x )\] and the other is the addition of another solution of the equation \[\tag{2.4.42} ( A _ \varphi u ) ( t , x ) = u ( t , x ) + \varphi ( t , x ) ,\] where $ \lambda $ is a non-zero real number and $ \varphi $ is a solution of the wave equation. The fact that $ M _ { - 1 } $ is a symmetry means that $ \varphi $ is a solution if and only if $ - \varphi $ is and so we could equally well have said that subtraction of another solution is a symmetry rather than addition.

In general, a symmetry will take a solution to the differential equation and give another, different, solution, but it may happen to give the same solution, in which case we say that it is a symmetry not just of the differential equation but of the solution. It is often useful to find the set of solutions with a given symmetry or group of symmetries. Of course if we ask for too many symmetries we are unlikely to get any solutions. For example, the only solutions of the wave equation symmetric under the full group of spacetime translations are the constant solutions. Interestingly, as long as we restrict our attention to classical solutions defined on all of $ \mathbf R ^ 2 $, the only scaling invariant solutions are also constant. Indeed, if $ u $ is scaling invariant and $ \ell $ is a line through the origin then $ u $ must take the same value at all points of $ \ell $, except possibly the origin itself. But classical solutions are continuous, so it must take the same value at the origin as well. Every point is on some line through the origin so the value at that point is therefore equal to the value at the origin. What's interesting about this argument is we never actually needed the fact that $ u $ satisfies the wave equation!

If we impose less symmetry then we get more solutions. We could, for example, consider the set of solutions invariant under spatial reflection, i.e. the ones satisfying $ R u = u $, i.e. \[\tag{2.4.43} u ( t , - x ) = u ( t , x ) .\] These are just the solutions which are even functions of the spatial variable for each fixed value of the temporal variable. Note that differentiating the equation above gives \[\tag{2.4.44} - \frac{ \partial u } { \partial x } ( t , - x ) = \frac { \partial u } { \partial x } ( t , x )\] and so \[\tag{2.4.45} - \frac{ \partial u } { \partial x } ( t , 0 ) = \frac { \partial u } { \partial x } ( t , 0 ) ,\] from which it follows that \[\tag{2.4.46} \frac{ \partial u } { \partial x } ( t , 0 ) = 0 .\] More generally, any derivative of odd degree in $x$, if it exists, is zero on the time axis.

Similarly we could look at the solutions symmetric under the symmetry $ M _ { - 1 } R $. These are the solutions which are odd, considered as a function of the spatial variable for fixed value of the temporal variable, and they have the property that any derivative of even degree in $x$, if it exists, is zero on the time axis. In particular the zeroeth and second derivatives, which both certainly exist for any classical solution, are zero: \[\tag{2.4.47} u ( t , 0 ) = 0 .\] and \[\tag{2.4.48} \frac{ \partial ^ 2 u } { \partial x ^ 2 } ( t , 0 ) = 0 .\]

Symmetries, linearity and uniqueness interact in interesting ways. Suppose, for example, that $ u $ is a solution to the initial value problem considered earlier, the one with initial conditions given by (2.2.1). If $ u $ has $ R $ as a symmetry, i.e. if it is even as a function of the spatial variable, then $ f $ and $ g $ must also both be even. More interestingly, suppose $ f $ and $ g $ are even. Since $ u $ is a solution and $ R $ is a symmetry it follows that $ R u $ is also a solution. From linearity it then follows that $ u - R u $ is a solution. But the initial data for $ u - R u $ are identically zero and we already know a solution with zero initial data, namely the zero solution, so the uniqueness theorem implies that $ u - R u $ is the zero solution. In other words, $ u = R u $, or $ R $ is a symmetry of $ u $. So not only does an even solution necessarily have even initial data but even initial data can only give rise to an even solution. Similar remarks apply to odd solutions and odd initial data.
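The following short sympy sketch illustrates the last point with concrete even data; the choices $ s = 0 $, $ f ( y ) = \cos y $ and $ g ( y ) = \cos 2 y $ are made purely for the example, with $ c $ left symbolic.

```python
# Even initial data produce an even solution: a concrete check via (2.1.25).
import sympy as sp

t, x, y, c = sp.symbols('t x y c', positive=True)
f = lambda z: sp.cos(z)          # even data
g = lambda z: sp.cos(2 * z)      # even data

# D'Alembert's formula (2.1.25) with s = 0.
u = (f(x - c * t) + f(x + c * t)) / 2 \
    + sp.integrate(g(y), (y, x - c * t, x + c * t)) / (2 * c)

print(sp.simplify(u.subs(x, -x) - u))   # 0: u is even in x for every t
```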

Section 2.5 Energy and Uniqueness for Boundary Value Problems

Up to now we've been trying to solve the wave equation in the whole of $ \mathbf R ^ 2 $ with initial data given on the whole of $ \mathbf R $ but often one wants to solve the equation in a region where the spatial variable is restricted to an interval, either bounded or semi-infinite, and the initial data are given in this interval. This, by itself, is not a problem that admits unique solutions, but it becomes one if we impose appropriate boundary conditions at the endpoint or endpoints of the interval. The two most important boundary conditions are the Dirichlet condition, $ u = 0 $, and the Neumann condition, $ \partial u / \partial x = 0 $.

To start with, let's consider the case of a finite interval, $ [ a , b ] $, with a Dirichlet condition at the left endpoint and a Neumann condition at the right endpoint: \[\tag{2.5.1} u ( t , a ) = 0 , \quad \frac { \partial u } { \partial x } ( t , b ) = 0 .\] These equations are to hold for all values of $t$. The initial conditions will be specified as usual by (2.2.1), where the functions $f$ and $g$ are defined on the interval $ [ a , b ] $ and the solution $u$ should be defined and twice continuously differentiable on $ \mathbf R \times [ a , b ] $, and should of course satisfy the wave equation there.

We see immediately that some additional restrictions are needed on $f$ and $g$. If we set $t = s$ in the equations above we get \[\tag{2.5.2} f ( a ) = 0 , \quad f ' ( b ) = 0\] since taking a partial derivative with respect to $ x $ and then fixing $ t $ is the same as fixing $ t $ and then taking an ordinary derivative. We could also take a $ t $ derivative of the boundary conditions, obtaining \[\tag{2.5.3} \frac { \partial u } { \partial t } ( t , a ) = 0 , \quad \frac { \partial ^ 2 u } { \partial t \partial x } ( t , b ) = 0 .\] Since $ u $ is twice continuously differentiable the mixed partial derivatives are equal so we can replace $ \partial ^ 2 u / \partial t \partial x $ with $ \partial ^ 2 u / \partial x \partial t $. If we do so and then set $ t = s $ then we get \[\tag{2.5.4} g ( a ) = 0 , \quad g ' ( b ) = 0 .\] There is one more restriction. Differentiating the Dirichlet condition once again gives \[\tag{2.5.5} \frac { \partial ^ 2 u } { \partial t ^ 2 } ( t , a ) = 0 ,\] but $ u $ satisfies the wave equation, so \[\tag{2.5.6} c ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 } ( t , a ) = 0 .\] Setting $ t = s $ gives $ c ^ 2 f '' ( a ) = 0 $ and therefore \[\tag{2.5.7} f '' ( a ) = 0 .\] There is no similar condition for $ g $ and there is no analogous condition at $b$. To summarise, we've found that there can be no solution to the initial value problem with these boundary conditions unless the initial data satisfy the constraints \[\tag{2.5.8} f ( a ) = 0 , \quad g ( a ) = 0 , \quad f '' ( a ) = 0 , \quad f ' ( b ) = 0 , \quad g ' ( b ) = 0 .\]

Before considering existence and uniqueness let's examine energy conservation. It's useful to prove the following lemma.

Lemma 2.5.A Suppose $ p $ and $ q $ are continuously differentiable functions on the rectangle $ [ t _ 1 , t _ 2 ] \times [ x _ 1 , x _ 2 ] $. Then \[\tag{2.5.9} \begin{split} & \int _ { x _ 1 } ^ { x _ 2 } p ( t _ 1 , x ) \, d x + \int _ { t _ 1 } ^ { t _ 2 } q ( t , x _ 2 ) \, d t + \int _ { x _ 2 } ^ { x _ 1 } p ( t _ 2 , x ) \, d x + \int _ { t _ 2 } ^ { t _ 1 } q ( t , x _ 1 ) \, d t \\ & \qquad {} = \iint _ { [ t _ 1 , t _ 2 ] \times [ x _ 1 , x _ 2 ] } \left ( \frac { \partial q } { \partial x } - \frac { \partial p } { \partial t } \right ) \, d A . \end{split}\]
As usual we interpret integrals where the lower limit is greater than the upper limit by swapping the limits and reversing the sign. We could, therefore, rewrite the equation above as \[\tag{2.5.10} \begin{split} & \int _ { x _ 1 } ^ { x _ 2 } p ( t _ 1 , x ) \, d x + \int _ { t _ 1 } ^ { t _ 2 } q ( t , x _ 2 ) \, d t - \int _ { x _ 1 } ^ { x _ 2 } p ( t _ 2 , x ) \, d x - \int _ { t _ 1 } ^ { t _ 2 } q ( t , x _ 1 ) \, d t \\ & \qquad {} = \iint _ { [ t _ 1 , t _ 2 ] \times [ x _ 1 , x _ 2 ] } \left ( \frac { \partial q } { \partial x } - \frac { \partial p } { \partial t } \right ) \, d A . \end{split}\] and indeed for purposes of this section it would be simpler to do so, but the lemma is stated in the way that it is so that we can see it in the next section as a special case of a more general theorem which we will need repeatedly in these notes.

The proof of the lemma is very simple. By the fundamental theorem of calculus \[\tag{2.5.11} q ( t , x _ 2 ) - q ( t , x _ 1 ) = \int _ { x _ 1 } ^ { x _ 2 } \frac { \partial q } { \partial x } ( t , x ) \, d x .\] Integrating this equation over the interval $ [ t _ 1 , t _ 2 ] $ gives \[\tag{2.5.12} \int _ { t _ 1 } ^ { t _ 2 } \left [ q ( t , x _ 2 ) - q ( t , x _ 1 ) \right ] \, d t = \int _ { t _ 1 } ^ { t _ 2 } \int _ { x _ 1 } ^ { x _ 2 } \frac { \partial q } { \partial x } ( t , x ) \, d x \, d t .\] On the left hand side we write the integral of the difference as a difference of integrals and on the right hand side we write the repeated integral as an area integral, using Fubini's theorem. \[\tag{2.5.13} \int _ { t _ 1 } ^ { t _ 2 } q ( t , x _ 2 ) \, d t - \int _ { t _ 1 } ^ { t _ 2 } q ( t , x _ 1 ) \, d t = \iint _ { [ t _ 1 , t _ 2 ] \times [ x _ 1 , x _ 2 ] } \frac { \partial q } { \partial x } \, d A .\] Similarly, \[\tag{2.5.14} \int _ { x _ 1 } ^ { x _ 2 } p ( t _ 2 , x ) \, d x - \int _ { x _ 1 } ^ { x _ 2 } p ( t _ 1 , x ) \, d x = \iint _ { [ t _ 1 , t _ 2 ] \times [ x _ 1 , x _ 2 ] } \frac { \partial p } { \partial t } \, d A .\] Subtracting this from the previous equation and using again the fact that the integral of a difference is the difference of the integrals we get (2.5.10).

Now that we've proved the lemma we can apply it to $ [ x _ 1 , x _ 2 ] = [ a , b ] $ and an arbitrary time interval $ [ t _ 1 , t _ 2 ] $, with \[\tag{2.5.15} p = E = \frac 1 2 \left ( \frac { \partial u } { \partial t } \right ) ^ 2 + \frac { c ^ 2 } 2 \left ( \frac { \partial u } { \partial x } \right ) ^ 2 , \quad q = c ^ 2 \frac { \partial u } { \partial t } \frac { \partial u } { \partial x }\] and note that \[\tag{2.5.16} \frac { \partial q } { \partial x } - \frac { \partial p } { \partial t } = - \left ( \frac { \partial ^ 2 u } { \partial t ^ 2 } - c ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 } \right ) \frac { \partial u } { \partial t } ,\] so the right hand side of the equation in the lemma is zero when $ u $ satisfies the wave equation. The terms on the left hand side with a $ q $ in them are also zero if $ u $ satisfies the boundary conditions. When $ x = a $ this happens because $ \partial u / \partial t = 0 $ there and when $ x = b $ this happens because $ \partial u / \partial x = 0 $ there. The only terms which are left then are the ones with a $ p $, which is the same as $ E $, on the left hand side, so we have \[\tag{2.5.17} \int _ a ^ b E ( t _ 2 , x ) \, d x - \int _ a ^ b E ( t _ 1 , x ) \, d x = 0 .\] In other words, the energy in the interval $ [ a , b ] $ at time $ t _ 2 $ is the same as the energy at time $ t _ 1 $. We therefore have an energy conservation theorem for solutions on a bounded interval, just as we did in the case of an infinite interval.
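The identity (2.5.16) is a short computation, but it is also easy to have sympy confirm it; in the sketch below $ u $ is an arbitrary smooth placeholder function, not assumed to solve anything.

```python
# Checking (2.5.16) with p = E and q = c^2 u_t u_x as in (2.5.15).
import sympy as sp

t, x, c = sp.symbols('t x c')
u = sp.Function('u')(t, x)

p = sp.diff(u, t)**2 / 2 + c**2 * sp.diff(u, x)**2 / 2
q = c**2 * sp.diff(u, t) * sp.diff(u, x)

lhs = sp.diff(q, x) - sp.diff(p, t)
rhs = -(sp.diff(u, t, 2) - c**2 * sp.diff(u, x, 2)) * sp.diff(u, t)
print(sp.simplify(lhs - rhs))   # 0
```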

Theorem 2.5.B Suppose $u$ is a classical solution to the wave equation for $ x $ in the interval $ [ a , b ] $ and that at each endpoint of this interval either the Dirichlet or Neumann boundary condition is satisfied. If the total energy \[\tag{2.5.18} \int _ a ^ b E ( t , x ) \, d x \] is finite for some value of $t$ then it is finite for all values of $t$ and is independent of $t$.
Note that previously we assumed a Dirichlet condition at the left endpoint and a Neumann condition at the right endpoint but the argument works fine no matter which condition is assumed at which endpoint.

We can use this energy conservation theorem to get a uniqueness theorem even though we don't yet have an explicit solution. Suppose that $ u _ 1 $ and $ u _ 2 $ are solutions of the initial value problem which satisfy the same boundary conditions, either Dirichlet or Neumann at each endpoint. Then their difference $ u = u _ 1 - u _ 2 $ satisfies the initial value problem with the same boundary conditions and zero initial data. Its energy at time $ s $ is therefore zero, so by Theorem 2.5.B its energy is zero at every time, and since the energy density is continuous and non-negative it must vanish everywhere. Looking at the definition of the energy density we see this means that the partial derivatives of $ u $ are everywhere zero. It must therefore be locally constant. Since $ \mathbf R \times [ a , b ] $ is connected it follows that $ u $ is constant. It's zero initially, since $ u $ has zero initial data, and hence is zero everywhere. So we've proved the following uniqueness theorem.

Theorem 2.5.C For given $ f $ and $ g $, defined on the interval $ [ a , b ] $, there is at most one solution to the initial value problem in $ \mathbf R \times [ a , b ] $ with given boundary conditions, Dirichlet or Neumann, at the endpoints.

Section 2.6 Existence for Boundary Value Problems

We haven't yet proved the existence of solutions. For the moment we'll return to the setting where we have a Dirichlet condition at $ x = a $ and Neumann at $ x = b $. In this setting we saw that there can be no solution unless $ f $ is twice continuously differentiable, $ g $ is continuously differentiable, and $ f ( a ) $, $ g ( a ) $, $ f '' ( a ) $, $ f ' ( b ) $, and $ g ' ( b ) $ are all zero, so we'll assume those conditions are satisfied. We then extend $ f $ and $ g $ to functions on all of $ \mathbf R $ as follows. Any real number can be written uniquely as $ n + r $ where $ n $ is an integer and $ r $ belongs to the half-open interval $ [ 0 , 1 ) $ and every integer $ n $ can be written as $ 4 m + l $ where $ m $ is an integer and $ l $ is $ 0 $, $ 1 $, $ 2 $, or $ 3 $. We use this to write \[\tag{2.6.1} \frac { x - a } { b - a } = 4 m ( x ) + l ( x ) + r ( x )\] and then define \[\tag{2.6.2} f ( x ) = \begin{cases} f ( a + ( b - a ) r ( x ) ) & \mbox{ if ${ l ( x ) = 0 }$,} \\ f ( b - ( b - a ) r ( x ) ) & \mbox{ if ${ l ( x ) = 1 }$,} \\ - f ( a + ( b - a ) r ( x ) ) & \mbox{ if ${ l ( x ) = 2 }$,} \\ - f ( b - ( b - a ) r ( x ) ) & \mbox{ if ${ l ( x ) = 3 }$.} \\ \end{cases}\] The intended interpretation of this equation is that the $ f $'s on the right hand side refer to the function we were originally given, while the $ f $ on the left hand side is the new function we are defining. We'll also define an extension of $ g $ via the same equation, except with all $ f $'s replaced by $ g $'s. Note that $ m ( x ) $ does not appear in the equations above, which implies that the extended functions are periodic of period $ 4 ( b - a ) $.

There are a number of things to check. One is that these really are extensions of the original functions, i.e. that they give the same values when evaluated at a point in the original interval $ [ a , b ] $. Suppose $ x \in [ a , b ) $. Then $ m ( x ) = 0 $, $ l ( x ) = 0 $, and $ r ( x ) = ( x - a ) / ( b - a ) $ so the right hand side above gives the value \[\tag{2.6.3} f ( a + ( b - a ) r ( x ) ) = f \left ( a + ( b - a ) \frac { x - a } { b - a } \right ) = f ( x ) ,\] as it should. On the other hand, if $ x = b $ then $ m ( x ) = 0 $, $ l ( x ) = 1 $ and $ r ( x ) = 0 $ and the right hand side gives \[\tag{2.6.4} f ( b - ( b - a ) r ( x ) ) = f ( b ) = f ( x ) ,\] and we again get the correct value. So the new $ f $ is indeed an extension of the old one, which is fortunate because otherwise we wouldn't know whether $ f ( x ) $ for $ x $ in $ [ a , b ] $ referred to the old function or the new one. Of course the same argument works equally well for $ g $.
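For concreteness, here is a minimal implementation of the extension rule (2.6.1) and (2.6.2). The choices $ a = 0 $, $ b = 1 $ and $ f ( x ) = \sin 3 x $ are made just for the example; the printed values confirm that the extension is odd about $ a $, even about $ b $, and periodic with period $ 4 ( b - a ) $.

```python
# The extension of f from [a, b] to all of R defined by (2.6.1) and (2.6.2).
import math

a, b = 0.0, 1.0
f = lambda x: math.sin(3 * x)      # only ever evaluated on [a, b]

def extend(f, a, b, x):
    # Write (x - a) / (b - a) = 4 m + l + r with l in {0, 1, 2, 3}, r in [0, 1).
    n, r = divmod((x - a) / (b - a), 1.0)
    l = int(n) % 4
    if l == 0:
        return f(a + (b - a) * r)
    if l == 1:
        return f(b - (b - a) * r)
    if l == 2:
        return -f(a + (b - a) * r)
    return -f(b - (b - a) * r)

F = lambda x: extend(f, a, b, x)
y = 0.3
print(F(a - y) + F(a + y))             # odd about a:  0 (up to rounding)
print(F(b + y) - F(b - y))             # even about b: 0 (up to rounding)
print(F(y + 4 * (b - a)) - F(y))       # periodic:     0 (up to rounding)
```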

We'd like to know, in addition, that the extended functions have the same differentiability properties as the old functions, i.e. that the extended $ f $ is twice continuously differentiable and the extended $ g $ is continuously differentiable. This might seem obvious from the definition since the various pieces of which they are composed have this property, but there's clearly something wrong with that line of reasoning since it would imply that the absolute value function \[\tag{2.6.5} | x | = \begin{cases} x & \mbox{ if ${ x > 0 }$,} \\ 0 & \mbox{ if ${ x = 0 }$,} \\ - x & \mbox{ if ${ x < 0 }$,} \end{cases}\] which is also composed of continuously differentiable pieces, is continuously differentiable, when it is in fact not differentiable at $ x = 0 $. We need a criterion for a function defined by different expressions on the two different sides of a point to have some number of continuous derivatives. This is supplied by the following lemma.

Lemma 2.6.A Suppose $ p $, $ q $ are $ k $ times continuously differentiable functions on the intervals $ ( \alpha , \beta ) $ and $ ( \beta , \gamma ) $. Suppose $ f $ is defined on $ ( \alpha , \gamma ) $ by \[\tag{2.6.6} f ( x ) = \begin{cases} p ( x ) & \mbox{ if ${ \alpha < x < \beta }$,} \\ c & \mbox{ if ${ x = \beta }$,} \\ q ( x ) & \mbox{ if ${ \beta < x < \gamma }$} \\ \end{cases}\] for some $ c $. Then $ f $ is $ k $ times continuously differentiable if and only if \[\tag{2.6.7} \lim _ { x \to \beta ^ - } p ( x ) = c = \lim _ { x \to \beta ^ + } q ( x )\] and \[\tag{2.6.8} \lim _ { x \to \beta ^ - } p ^ { ( j ) } ( x ) = \lim _ { x \to \beta ^ + } q ^ { ( j ) } ( x )\] for all $ j \le k $.
As usual, a parenthesised superscript indicates the number of derivatives, so $ p ^ { ( 0 ) } = p $, $ p ^ { ( 1 ) } = p ' $, $ p ^ { ( 2 ) } = p '' $, etc.

To prove the lemma we first introduce the functions \[\tag{2.6.9} f _ j ( x ) = \begin{cases} p ^ { ( j ) } ( x ) & \mbox{ if ${ \alpha < x < \beta }$,} \\ c _ j & \mbox{ if ${ x = \beta }$,} \\ q ^ { ( j ) } ( x ) & \mbox{ if ${ \beta < x < \gamma }$,} \end{cases}\] where $ c _ j $ is the common value of $ \lim _ { x \to \beta ^ - } p ^ { ( j ) } ( x ) $ and $ \lim _ { x \to \beta ^ + } q ^ { ( j ) } ( x ) $. It follows immediately from the definition of continuity that $ f _ j $ is continuous. Also, \[\tag{2.6.10} f _ j ' ( x ) = f _ { j + 1 } ( x )\] if $ j < k $ and $ \alpha < x < \beta $ or $ \beta < x < \gamma $. We would like to know that this is true also when $ x = \beta $. If $ \beta < z < \gamma $ then \[\tag{2.6.11} \int _ \beta ^ z f _ { j + 1 } ( y ) \, d y = \int _ \beta ^ w f _ { j + 1 } ( y ) \, d y + \int _ w ^ z f _ { j + 1 } ( y ) \, d y\] for any $ w $ in the interval $ ( \beta , z ) $. Now \[\tag{2.6.12} \lim _ { w \to \beta ^ + } \int _ \beta ^ w f _ { j + 1 } ( y ) \, d y = 0\] because integrals of continuous functions depend continuously on their limits of integration and \[\tag{2.6.13} \int _ w ^ z f _ { j + 1 } ( y ) \, d y = f _ j ( z ) - f _ j ( w )\] by the fundamental theorem of calculus. So \[\tag{2.6.14} \int _ \beta ^ z f _ { j + 1 } ( y ) \, d y = \lim _ { w \to \beta ^ + } \left ( f _ j ( z ) - f _ j ( w ) \right ) = f _ j ( z ) - f _ j ( \beta ) ,\] where we've used the continuity of $ f _ j $ to evaluate $ \lim _ { w \to \beta ^ + } f _ j ( w ) $. Now the change of variable $ r = ( y - \beta ) / ( z - \beta ) $ gives \[\tag{2.6.15} \int _ \beta ^ z f _ { j + 1 } ( y ) \, d y = ( z - \beta ) \int _ 0 ^ 1 f _ { j + 1 } ( \beta + r ( z - \beta ) ) \, d r\] and so \[\tag{2.6.16} \frac { f _ j ( z ) - f _ j ( \beta ) } { z - \beta } = \int _ 0 ^ 1 f _ { j + 1 } ( \beta + r ( z - \beta ) ) \, d r .\] This was proved for $ z $ in the interval $ ( \beta , \gamma ) $ but an almost identical argument gives the same equation when $ z $ is in the interval $ ( \alpha , \beta ) $. Taking limits inside the integral and using the continuity of $ f _ { j + 1 } $ gives \[\tag{2.6.17} \begin{split} \lim _ { z \to \beta } \frac { f _ j ( z ) - f _ j ( \beta ) } { z - \beta } & = \int _ 0 ^ 1 \lim _ { z \to \beta } f _ { j + 1 } ( \beta + r ( z - \beta ) ) \, d r \\ & = \int _ 0 ^ 1 f _ { j + 1 } \left ( \lim _ { z \to \beta } \left ( \beta + r ( z - \beta ) \right ) \right ) \, d r \\ & = \int _ 0 ^ 1 f _ { j + 1 } ( \beta ) \, d r = f _ { j + 1 } ( \beta ) . \end{split}\] Taking the limit inside the integral is justified in this case because we have continuous integrands with uniform convergence on a bounded interval. The equation above just says that \[\tag{2.6.18} f _ j ' ( x ) = f _ { j + 1 } ( x )\] when $ x = \beta $ though. Since we already had the same equation for $ x $ in the intervals $ ( \alpha , \beta ) $ and $ ( \beta , \gamma ) $ we now have it in the full interval $ ( \alpha , \gamma ) $. Now $ f ^ { ( 0 ) } = f = f _ 0 $ so we see by induction on $ j $ that $ f ^ { ( j ) } = f _ j $ for $ j \le k $. The $ f _ j $'s are already known to be continuous, so $ f $ is $ k $ times continuously differentiable.

Now that we have the lemma it's straightforward, if slightly tedious, to check that the extended $ f $ and $ g $ defined previously are twice and once continuously differentiable, respectively. We'll do this just at the point $ b $ as an illustration. To the left of $ b $ we have $ m ( x ) = 0 $, $ l ( x ) = 0 $ and $ r ( x ) = ( x - a ) / ( b - a ) $ so \[\tag{2.6.19} f ( x ) = f ( a + ( b - a ) r ( x ) ) = f ( x )\] and to the right of $ b $ we have $ m ( x ) = 0 $, $ l ( x ) = 1 $ and $ r ( x ) = ( x - b ) / ( b - a ) $ so \[\tag{2.6.20} f ( x ) = f ( b - ( b - a ) r ( x ) ) = f ( 2 b - x ) .\] The first and second derivatives to the left and right of $ b $ are \[\tag{2.6.21} f ' ( x ) = \begin{cases} f ' ( x ) = f ' ( x ) & \mbox{if ${ a < x < b }$,} \\ f ' ( x ) = - f ' ( 2 b - x ) & \mbox{if ${ b < x < 2 b - a }$} \end{cases}\] and \[\tag{2.6.22} f '' ( x ) = \begin{cases} f '' ( x ) = f '' ( x ) & \mbox{if ${ a < x < b }$,} \\ f '' ( x ) = f '' ( 2 b - x ) & \mbox{if ${ b < x < 2 b - a }$.} \end{cases}\] It's clear that the second derivatives approach a common limit as $ x $ tends to $ b $ from either side. It's less clear that the first derivatives do but here we have to remember that our original $ f $ was assumed to satisfy $ f ' ( b ) = 0 $. Without this assumption the extended $ f $ would not be continuously differentiable. The argument for $ g $ at $ b $ is similar, but we only need to worry about the first derivative.
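This kind of gluing is easy to experiment with numerically. The short Python sketch below is purely illustrative and not part of the argument: it takes an arbitrary sample $ f $ on $ [ a , b ] $ with $ f ' ( b ) = 0 $, extends it past $ b $ by the even reflection $ f ( 2 b - x ) $, and watches the one-sided values of $ f $, $ f ' $ and $ f '' $ as $ x $ approaches $ b $. As Lemma 2.6.A predicts, each pair of one-sided limits agrees.

```python
import numpy as np

# Illustrative numerical check, not part of the argument in the text: take an
# f on [a, b] with f'(b) = 0, extend it past b by the even reflection
# f(x) := f(2b - x), and watch the one-sided values of f, f' and f'' at b.
# The particular f below is just a convenient sample.
a, b = 0.0, 1.0
f = lambda x: np.cos(np.pi * (x - a) / (b - a))              # satisfies f'(b) = 0
f_ext = lambda x: f(x) if x <= b else f(2 * b - x)           # even reflection at b

h = 1e-6                                                     # finite difference step
d1 = lambda x: (f_ext(x + h) - f_ext(x - h)) / (2 * h)
d2 = lambda x: (f_ext(x + h) - 2 * f_ext(x) + f_ext(x - h)) / h ** 2

for delta in [1e-2, 1e-3, 1e-4]:
    xl, xr = b - delta, b + delta
    print(f"delta={delta:.0e}  f: {f_ext(xl):+.6f}/{f_ext(xr):+.6f}  "
          f"f': {d1(xl):+.6f}/{d1(xr):+.6f}  f'': {d2(xl):+.3f}/{d2(xr):+.3f}")
```

The second derivatives match automatically, while the first derivatives only approach a common limit because $ f ' ( b ) = 0 $; dropping that assumption makes the two columns for $ f ' $ disagree.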

Let $ u $ be the solution of the initial value problem for the wave equation in $ \mathbf R ^ 2 $ with initial data given by the extended $ f $ and $ g $. We can check directly that \[\tag{2.6.23} ( O _ a u ) ( t , x ) = - u ( t , 2 a - x )\] and \[\tag{2.6.24} ( E _ b u ) ( t , x ) = u ( t , 2 b - x )\] are symmetries of the wave equation, or we can note that $ O _ a = M _ { - 1 } T _ { 0 , a } R T _ { 0 , - a } $ and $ E _ b = T _ { 0 , b } R T _ { 0 , - b } $ so each of these is a composition of symmetries and hence a symmetry. Indeed $ O _ a $ and $ E _ b $ are symmetries not just of the wave equation in general but of our particular solution. To see this it suffices to check that the initial data is unchanged by each of these and then apply the uniqueness theorem. Then we note that \[\tag{2.6.25} u ( t , a ) = ( O _ a u ) ( t , a ) = - u ( t , a )\] so $ u ( t , a ) = 0 $ and $ u $ satisfies the Dirichlet boundary condition at the left endpoint. Similarly \[\tag{2.6.26} \frac { \partial u } { \partial x } ( t , b ) = \frac { \partial E _ b u } { \partial x } ( t , b ) = - \frac { \partial u } { \partial x } ( t , b )\] so $ \partial u / \partial x ( t , b ) = 0 $ and $ u $ satisfies the Neumann boundary condition at the right endpoint. We now have a solution to our initial value problem with the given boundary conditions. We did this with a Dirichlet condition at the left endpoint and a Neumann condition at the right endpoint but the same technique works for any of the other three combinations of boundary conditions; we just need to choose the appropriate extension of the initial data $ f $ and $ g $, which will be the one which has the correct symmetry properties on spatial reflection through $ a $ and $ b $ to ensure the boundary conditions are satisfied. We therefore have the following complement to our earlier uniqueness theorem.

Theorem 2.6.B For given $ f $ and $ g $, defined on the interval $ [ a , b ] $, there is at least one solution to the initial value problem in $ \mathbf R \times [ a , b ] $ with given boundary conditions, Dirichlet or Neumann, at the endpoints.

The solution we found above has two interesting properties. First, it is periodic in the spatial variable, with period $ 4 ( b - a ) $. To see this, recall that $ O _ a $ and $ E _ b $ are symmetries of $ u $ and note that \[\tag{2.6.27} ( E _ b O _ a u ) ( t , x ) = ( O _ a u ) ( t , 2 b - x ) = - u ( t , 2 a - ( 2 b - x ) ) = - u ( t , x - 2 ( b - a ) )\] so \[\tag{2.6.28} ( E _ b O _ a E _ b O _ a u ) ( t , x ) = u ( t , x - 4 ( b - a ) ) = ( T _ { 0 , 4 ( b - a ) } u ) ( t , x ) .\] Similar arguments apply to the other three combinations of boundary conditions. With a Neumann condition at the left endpoint and a Dirichlet condition at the right endpoint we again get a solution which is periodic with period $ 4 ( b - a ) $. If we have Dirichlet conditions at both endpoints or Neumann conditions at both endpoints then we get solutions which are periodic with period $ 2 ( b - a ) $. Of course the original problem was to solve the initial value problem in $ \mathbf R \times [ a , b ] $, so when we say that the solution is spatially periodic what we really mean is that the natural extension of the solution to $ \mathbf R ^ 2 $ is periodic.

The second interesting property of our solution is that it is periodic in time, with period $ 4 ( b - a ) / c $. To see this, note that in terms of our extended initial data the solution is given by the D'Alembert formula (2.1.25). Substituting $ t + 4 ( b - a ) / c $ for $ t $ in this formula gives \[\tag{2.6.29} \begin{split} u ( t + 4 ( b - a ) / c , x ) & = \frac 1 2 f ( x + c s - c t - 4 ( b - a ) ) \\ & \quad {} + \frac 1 2 f ( x - c s + c t + 4 ( b - a ) ) \\ & \quad {} + \frac 1 { 2 c } \int _ { x + c s - c t - 4 ( b - a ) } ^ { x - c s + c t + 4 ( b - a ) } g ( y ) \, d y . \end{split}\] Adding or subtracting $ 4 ( b - a ) $ from the argument of $ f $ has no effect, as we just saw when we discussed spatial periodicity, so the first two terms are equal to the corresponding terms in the formula for $ u ( t , x ) $. For the last term we split the integral into five pieces, corresponding to the subintervals \[\tag{2.6.30} [ x + c s - c t - 4 ( b - a ) , x + c s - c t - 2 ( b - a ) ] \], \[\tag{2.6.31} [ x + c s - c t - 2 ( b - a ) , x + c s - c t ] \], \[\tag{2.6.32} [ x + c s - c t , x - c s + c t ] \], \[\tag{2.6.33} [ x - c s + c t , x - c s + c t + 2 ( b - a ) ] \], and \[\tag{2.6.34} [ x - c s + c t + 2 ( b - a ) , x - c s + c t + 4 ( b - a ) ] \]. The middle one corresponds to the corresponding terms in the formula for $ u ( t , x ) $. The first two cancel each other out because translation by $ 2 ( b - a ) $ changes the sign of $ g $, and the last two cancel for the same reason. So we find that $ u ( t + 4 ( b - a ) / c , x ) $ and $ u ( t , x ) $ are equal, as claimed.

The argument above works without changes in the case of a Neumann left endpoint and Dirichlet right endpoint. It also works when both endpoints satisfy Dirichlet boundary conditions. In fact a somewhat more careful argument gives periodicity with period $ 2 ( b - a ) / c $ in that case. No variant of the argument above can prove periodicity in time in the case of Neumann conditions at both endpoints though, because this is not true in general. Indeed the wave equation on any interval with Neumann conditions at both endpoints and initial data $ f ( x ) = 0 $, $ g ( x ) = 1 $ has as its solution $ u ( t , x ) = t $, which is not periodic in $ t $.

Section 2.7 Green's Theorem

A number of results in earlier sections of this chapter are proved by evaluating double integrals by repeated integration. A more systematic approach is to use Green's theorem, which is the following generalisation of Lemma 2.5.A.

Theorem 2.7.A Suppose the boundary of the closed bounded region $ R $ in $ \mathbf R ^ 2 $ consists of a sequence of finitely many continuously differentiable curves $ C _ 1 $, $ C _ 2 $, ..., $ C _ k $. Suppose that $ p $ and $ q $ are continuously differentiable functions on $ R $. Then \[\tag{2.7.1} \sum _ { j = 1 } ^ k \int _ { C _ j } \left ( p ( t , x ) \, d x + q ( t , x ) \, d t \right ) = \int _ R \left ( \frac { \partial q } { \partial x } - \frac { \partial p } { \partial t } \right ) \, d A ,\] where the curves $ C _ j $ are traversed in such a direction that the region $ R $ is on our left.
For simple regions, those without holes, the condition that the region is on our left means the anticlockwise direction around $ R $. The anticlockwise orientation of the curves is linked to our choice to make the $ t $ variable the vertical and the $ x $ variable the horizontal. With the reverse convention the orientation of the curves would need to be clockwise. If there are holes then boundary curves around the holes have the opposite orientation, but it will be quite a while before we need to consider such a region.

In most applications of the theorem the functions $ p $ and $ q $ are chosen so as to make the integrand $ \partial q / \partial x - \partial p / \partial t $ on the right hand side equal to zero. As a particular example, consider \[\tag{2.7.2} p = \frac { \partial u } { \partial t } , \quad q = c ^ 2 \frac { \partial u } { \partial x }\] where $ u $ is a classical solution of the wave equation. Then \[\tag{2.7.3} \frac { \partial q } { \partial x } - \frac { \partial p } { \partial t } = 0\] is just the wave equation with the signs reversed. In addition to the functions $ p $ and $ q $ we also need to choose a region $ R $. In this case we'll choose the triangle with vertices $ ( t _ 2 , x _ 2 ) $, $ ( t _ 1 , x _ 2 - c ( t _ 2 - t _ 1 ) ) $, and $ ( t _ 1 , x _ 2 + c ( t _ 2 - t _ 1 ) ) $. Let $ C _ 1 $, $ C _ 2 $ and $ C _ 3 $ be the edges from $ ( t _ 1 , x _ 2 + c ( t _ 2 - t _ 1 ) ) $ to $ ( t _ 2 , x _ 2 ) $, from $ ( t _ 2 , x _ 2 ) $ to $ ( t _ 1 , x _ 2 - c ( t _ 2 - t _ 1 ) ) $, and from $ ( t _ 1 , x _ 2 - c ( t _ 2 - t _ 1 ) ) $ to $ ( t _ 1 , x _ 2 + c ( t _ 2 - t _ 1 ) ) $.

Figure 2.7.1 Triangle for proof of D'Alembert
We have, by Green's theorem, \[\tag{2.7.4} \sum _ { j = 1 } ^ 3 \int _ { C _ j } \left ( \frac { \partial u } { \partial t } \, d x + c ^ 2 \frac { \partial u } { \partial x } \, d t \right ) = 0 .\] We can parameterise $ C _ 1 $ by \[\tag{2.7.5} t = t _ 1 + r ( t _ 2 - t _ 1 ) , \quad x = x _ 2 + c ( 1 - r ) ( t _ 2 - t _ 1 ) .\] With this parameterisation the integral over $ C _ 1 $ becomes \[\tag{2.7.6} \int _ 0 ^ 1 \left ( \frac { \partial u } { \partial t } \frac { d x } { d r } + c ^ 2 \frac { \partial u } { \partial x } \frac { d t } { d r } \right ) \, d r .\] Since \[\tag{2.7.7} \frac { d x } { d r } = - c ( t _ 2 - t _ 1 ) = - c \frac { d t } { d r }\] we can rewrite the integral as \[\tag{2.7.8} - c \int _ 0 ^ 1 \left ( \frac { \partial u } { \partial t } \frac { d t } { d r } + \frac { \partial u } { \partial x } \frac { d x } { d r } \right ) \, d r .\] By the fundamental theorem of calculus the integral above is just the value of $ u $ at the upper endpoint minus the value at the lower endpoint, so \[\tag{2.7.9} \int _ { C _ 1 } \left ( \frac { \partial u } { \partial t } \, d x + c ^ 2 \frac { \partial u } { \partial x } \, d t \right ) = - c \left [ u ( t _ 2 , x _ 2 ) - u ( t _ 1 , x _ 2 + c ( t _ 2 - t _ 1 ) ) \right ] .\] A similar calculation shows that \[\tag{2.7.10} \int _ { C _ 2 } \left ( \frac { \partial u } { \partial t } \, d x + c ^ 2 \frac { \partial u } { \partial x } \, d t \right ) = - c \left [ u ( t _ 2 , x _ 2 ) - u ( t _ 1 , x _ 2 - c ( t _ 2 - t _ 1 ) ) \right ] .\] For $ C _ 3 $, which is horizontal, we just use $ x $ as the parameter, finding \[\tag{2.7.11} \int _ { C _ 3 } \left ( \frac { \partial u } { \partial t } \, d x + c ^ 2 \frac { \partial u } { \partial x } \, d t \right ) = \int _ { x _ 2 - c ( t _ 2 - t _ 1 ) } ^ { x _ 2 + c ( t _ 2 - t _ 1 ) } \frac { \partial u } { \partial t } ( t _ 1 , x ) \, d x .\] Adding the three together we find that \[\tag{2.7.12} \begin{split} & {} - 2 c u ( t _ 2 , x _ 2 ) + c u ( t _ 1 , x _ 2 - c ( t _ 2 - t _ 1 ) ) + c u ( t _ 1 , x _ 2 + c ( t _ 2 - t _ 1 ) ) \\ & \quad {} + \int _ { x _ 2 - c ( t _ 2 - t _ 1 ) } ^ { x _ 2 + c ( t _ 2 - t _ 1 ) } \frac { \partial u } { \partial t } ( t _ 1 , x ) \, d x = 0 \end{split}\] or \[\tag{2.7.13} \begin{split} u ( t _ 2 , x _ 2 ) & = \frac 1 2 u ( t _ 1 , x _ 2 - c ( t _ 2 - t _ 1 ) ) + \frac 1 2 u ( t _ 1 , x _ 2 + c ( t _ 2 - t _ 1 ) ) \\ & \quad {} + \frac 1 { 2 c } \int _ { x _ 2 - c ( t _ 2 - t _ 1 ) } ^ { x _ 2 + c ( t _ 2 - t _ 1 ) } \frac { \partial u } { \partial t } ( t _ 1 , x ) \, d x . \end{split}\] Taking $ t _ 1 = s $, $ t _ 2 = t $, $ x _ 2 = x $ we get \[\tag{2.7.14} \begin{split} u ( t , x ) & = \frac 1 2 u ( s , x - c ( t - s ) ) + \frac 1 2 u ( s , x + c ( t - s ) ) \\ & \quad {} + \frac 1 { 2 c } \int _ { x - c ( t - s ) } ^ { x + c ( t - s ) } \frac { \partial u } { \partial t } ( s , y ) \, d y . \end{split}\] Here we needed to rename the variable of integration to avoid a clash with the variable $ x $ we substituted for $ x _ 2 $. For a solution of the initial value problem the equation above is just D'Alembert's formula (2.1.25).
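Formula (2.7.14) is easy to test numerically. The following Python sketch is an aside, not part of the argument: it uses an arbitrarily chosen solution of the form $ F ( x - c t ) + G ( x + c t ) $ and arbitrary numerical values, and simply compares the two sides of (2.7.14).

```python
import numpy as np
from scipy.integrate import quad

# Numerical spot check (an aside): take a wave equation solution of the form
# F(x - c t) + G(x + c t) and compare u(t, x) with the right hand side of the
# formula just derived.  All choices below are arbitrary sample values.
c = 2.0
F = lambda z: np.exp(-z**2)
G = lambda z: np.sin(z)
u  = lambda t, x: F(x - c*t) + G(x + c*t)
ut = lambda t, x: 2*c*(x - c*t)*np.exp(-(x - c*t)**2) + c*np.cos(x + c*t)

s, t, x = 0.3, 1.1, 0.4
integral, _ = quad(lambda y: ut(s, y), x - c*(t - s), x + c*(t - s))
rhs = 0.5*u(s, x - c*(t - s)) + 0.5*u(s, x + c*(t - s)) + integral/(2*c)
print(u(t, x), rhs)   # the two numbers should agree to quadrature accuracy
```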

We can also get energy conservation from Green's theorem using the pair of functions \[\tag{2.7.15} p = \frac 1 2 \left ( \frac { \partial u } { \partial t } \right ) ^ 2 + \frac { c ^ 2 } 2 \left ( \frac { \partial u } { \partial x } \right ) ^ 2 , \quad q = c ^ 2 \frac { \partial u } { \partial t } \frac { \partial u } { \partial x } .\] In fact this is essentially how we proved energy conservation in the case of a bounded interval, with $ R $ chosen to be a rectangle. We could also reprove the original version by the same method, taking $ R $ to be a trapezoid. Here, instead, we'll prove energy conservation for a semi-infinite interval, which we haven't treated yet.
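It is straightforward, if slightly tedious, to check by hand that this pair satisfies $ \partial q / \partial x - \partial p / \partial t = 0 $ for solutions of the wave equation. The following short sympy sketch is an aside which performs the same check symbolically, using a general d'Alembert solution as the test function.

```python
import sympy as sp

# Symbolic check (an aside): for the pair p, q above, dq/dx - dp/dt vanishes
# whenever u solves the wave equation.  We substitute a general d'Alembert
# solution u = F(x - c t) + G(x + c t).
t, x, c = sp.symbols('t x c', positive=True)
F, G = sp.Function('F'), sp.Function('G')
u = F(x - c*t) + G(x + c*t)

p = sp.Rational(1, 2)*sp.diff(u, t)**2 + c**2/2*sp.diff(u, x)**2
q = c**2*sp.diff(u, t)*sp.diff(u, x)

print(sp.simplify(sp.diff(q, x) - sp.diff(p, t)))   # prints 0
```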

Suppose then that $ u $ is a classical solution of the wave equation on $ \mathbf R \times [ a , + \infty ) $ which satisfies either the Dirichlet or Neumann boundary condition at $ a $. The region $ R $ we will choose is a quadrilateral with four sides, $ C _ 1 $ from $ ( t _ 1 , a ) $ to $ ( t _ 1 , x _ 1 ) $, $ C _ 2 $ from $ ( t _ 1 , x _ 1 ) $ to $ ( t _ 2 , x _ 2 ) $, $ C _ 3 $ from $ ( t _ 2 , x _ 2 ) $ to $ ( t _ 2 , a ) $, and $ C _ 4 $ from $ ( t _ 2 , a ) $ back to $ ( t _ 1 , a ) $, where \[\tag{2.7.16} x _ 2 = x _ 1 \pm c ( t _ 2 - t _ 1 ) .\]

Figure 2.7.2 Quadrilateral for Energy Conservation
Then \[\tag{2.7.17} \frac { \partial q } { \partial x } - \frac { \partial p } { \partial t } = 0\] since $ u $ satisfies the wave equation and so Green's theorem gives \[\tag{2.7.18} \sum _ { j = 1 } ^ 4 \int _ { C _ j } \left ( p \, d x + q \, d t \right ) = 0 .\] The integrals over $ C _ 1 $, $ C _ 3 $ and $ C _ 4 $ are straightforward. \[\tag{2.7.19} \begin{split} \int _ { C _ 1 } \left ( p \, d x + q \, d t \right ) & = \int _ { a } ^ { x _ 1 } p ( t _ 1 , x ) \, d x , \\ \int _ { C _ 3 } \left ( p \, d x + q \, d t \right ) & = - \int _ { a } ^ { x _ 2 } p ( t _ 2 , x ) \, d x , \\ \int _ { C _ 4 } \left ( p \, d x + q \, d t \right ) & = - \int _ { t _ 1 } ^ { t _ 2 } q ( t , a ) \, d t = 0 . \end{split}\] Note that on $ C _ 4 $ the integrand $ q $ vanishes identically because of the boundary conditions and note the negative sign in the integral for $ C _ 3 $, coming from the fact that $ C _ 3 $ is traversed from right to left. The interesting integral is the one over $ C _ 2 $. There $ d x = \pm c \, d t $ so \[\tag{2.7.20} p \, d x + q \, d t = \pm \frac c 2 \left ( \frac { \partial u } { \partial t } \pm c \frac { \partial u } { \partial x } \right ) ^ 2 \, d t .\] So the sign of the integrand, and hence the sign of the integral, is the same as the sign in $ x _ 2 = x _ 1 \pm c ( t _ 2 - t _ 1 ) $. Since the sum of the four integrals is zero the sum of the remaining three has the opposite sign to that in $ x _ 2 = x _ 1 \pm c ( t _ 2 - t _ 1 ) $. In other words, \[\tag{2.7.21} \int _ { a } ^ { x _ 2 + c ( t _ 2 - t _ 1 ) } p ( t _ 1 , x ) \, d x - \int _ { a } ^ { x _ 2 } p ( t _ 2 , x ) \, d x \ge 0\] while \[\tag{2.7.22} \int _ { a } ^ { x _ 2 - c ( t _ 2 - t _ 1 ) } p ( t _ 1 , x ) \, d x - \int _ { a } ^ { x _ 2 } p ( t _ 2 , x ) \, d x \le 0 .\] We can combine these into \[\tag{2.7.23} \int _ { a } ^ { x _ 2 - c ( t _ 2 - t _ 1 ) } p ( t _ 1 , x ) \, d x \le \int _ { a } ^ { x _ 2 } p ( t _ 2 , x ) \, d x \le \int _ { a } ^ { x _ 2 + c ( t _ 2 - t _ 1 ) } p ( t _ 1 , x ) \, d x .\] Now we can use the squeeze principle in the same way we did previously to conclude that if \[\tag{2.7.24} \int _ a ^ { + \infty } E ( t , x ) \, d x = \int _ a ^ { + \infty } p ( t , x ) \, d x\] converges for $ t = t _ 1 $ then it also converges for $ t = t _ 2 $ and has the same value there. Unlike the earlier argument $ t _ 1 $ and $ t _ 2 $ are not arbitrary here but have been assumed to satisfy $ t _ 1 < t _ 2 $. To treat the case $ t _ 1 > t _ 2 $ we could use a similar argument but it's simpler to note that the wave equation is symmetric under temporal reflections and apply the result we already have to \[\tag{2.7.25} \tilde u ( t , x ) = u ( t _ 1 + t _ 2 - t , x ) , \] which has the same energy at $ t = t _ 1 $ as $ u $ does at $ t = t _ 2 $ and vice versa. Thus we've proved the following energy conservation theorem for a semi-infinite interval.
Theorem 2.7.B Suppose $u$ is a classical solution to the wave equation on $ \mathbf R \times [ a , + \infty ) $ satisfying either the Dirichlet or Neumann condition at the boundary. If the total energy \[\tag{2.7.26} \int _ { a } ^ { + \infty } E ( t , x ) \, d x \] is finite for some value of $t$ then it is finite for all values of $t$ and is independent of $t$.

Section 2.8 Klein-Gordon and Sine-Gordon Equations

One reason for developing multiple techniques for proving results about the wave equation in one spatial dimension is that some generalise better to higher dimensions or related equations than others. We won't consider higher dimensions here but we will briefly consider two related equations, the Klein-Gordon and Sine-Gordon equations.

The Klein-Gordon equation \[\tag{2.8.1} \frac { \partial ^ 2 u } { \partial t ^ 2 } - c ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 } + m ^ 2 u = 0\] plays a fundamental role in relativistic quantum mechanics. It shares some, but not all, of the symmetries of the wave equation. Of the ones we considered earlier it has the spatial reflection symmetry, spacetime translation symmetry, Lorentz symmetry and scaling symmetry in the dependent variable, but not the scaling symmetry in the independent variables or the spacetime inversion symmetry. It also has temporal reflection symmetry. In the case of the wave equation we got this from spatial reflection symmetry and scaling symmetry in the independent variables but for Klein-Gordon this needs to be checked separately because we don't have scaling symmetry in the independent variables.

We can still prove energy conservation, but with the energy density \[\tag{2.8.2} E = \frac 1 2 \left ( \frac { \partial u } { \partial t } \right ) ^ 2 + \frac { c ^ 2 } 2 \left ( \frac { \partial u } { \partial x } \right ) ^ 2 + \frac { m ^ 2 } 2 u ^ 2 .\] The first argument we used for the wave equation, using the auxiliary functions $ v $ and $ w $, does not generalise but the argument using Green's theorem does. A solution with zero initial data has zero initial energy and so has zero energy for all time. Looking at the form of the energy density this means the solution can only be the zero solution. The equation is linear, so the difference of two solutions is also a solution. If the two solutions have the same initial data then the difference has zero initial data and hence is zero, so the two solutions are the same. Thus we see that analogues of the uniqueness theorems for the wave equation also hold for the Klein-Gordon equation. The technique above is the one we used for boundary value problems for the wave equation, but not the one we originally used in $ \mathbf R ^ 2 $, which relied on having an explicit solution. Here we haven't used an explicit solution. There is one, but it involves special functions and isn't nearly as convenient to work with as the D'Alembert formula.
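The underlying local conservation law can be checked symbolically. The sympy sketch below is an aside: it takes $ \partial E / \partial t - \partial ( c ^ 2 \, \partial u / \partial t \, \partial u / \partial x ) / \partial x $, where the flux $ c ^ 2 \, \partial u / \partial t \, \partial u / \partial x $ is the natural analogue of the wave equation case rather than something quoted from the text, and confirms that it equals $ \partial u / \partial t $ times the Klein-Gordon operator applied to $ u $, so it vanishes on solutions.

```python
import sympy as sp

# Symbolic sanity check (an aside): for the Klein-Gordon energy density E
# above, dE/dt - d(c^2 u_t u_x)/dx equals u_t times the Klein-Gordon operator
# applied to u, so it vanishes on solutions.
t, x, c, m = sp.symbols('t x c m')
u = sp.Function('u')(t, x)

E = sp.Rational(1, 2)*sp.diff(u, t)**2 + c**2/2*sp.diff(u, x)**2 + m**2/2*u**2
flux = c**2*sp.diff(u, t)*sp.diff(u, x)
kg = sp.diff(u, t, 2) - c**2*sp.diff(u, x, 2) + m**2*u

print(sp.simplify(sp.diff(E, t) - sp.diff(flux, x) - sp.diff(u, t)*kg))  # 0
```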

The sine-Gordon equation \[\tag{2.8.3} \frac { \partial ^ 2 u } { \partial t ^ 2 } - \frac { \partial ^ 2 u } { \partial x ^ 2 } + \sin u = 0\] arises in a variety of geometric and physical contexts. It is reasonable to guess that small solutions should behave like solutions of the Klein-Gordon equation with $ c = m = 1 $ because $ \sin u \approx u $ for small $ u $, and this is at least somewhat correct, but it has many peculiarities of its own.

The sine-Gordon equation shares most, but not all, of the symmetries of the Klein-Gordon equation. Of the ones we've considered it has spatial and temporal reflection symmetry, spacetime translation symmetry, and Lorentz symmetry, but not scaling symmetry in the dependent variable.

Energy conservation also holds for the sine-Gordon equation, with energy density \[\tag{2.8.4} E = \frac 1 2 \left ( \frac { \partial u } { \partial t } \right ) ^ 2 + \frac 1 2 \left ( \frac { \partial u } { \partial x } \right ) ^ 2 + 1 - \cos u .\] As for Klein-Gordon, any solution with zero initial data must be zero for all time, but sine-Gordon is non-linear, so the difference of two solutions is not, in general, a solution, and so we can't obtain a uniqueness theorem in the same way. Perhaps unsurprisingly there is also no explicit solution formula for sine-Gordon.

Chapter 3 Diffusion Equation

Section 3.1 Symmetries

We met the diffusion equation (1.0.4) \[\tag{3.1.1} \frac { \partial u } { \partial t } - k \frac { \partial ^ 2 u } { \partial x ^ 2 } = 0\] earlier. The natural differentiability assumption is that $ u $ is continuously differentiable in $ t $ and twice continuously differentiable in $ x $.

Like the wave equation, the diffusion equation has a spatial reflection symmetry, but unlike the wave equation it does not have temporal reflection symmetry, and indeed behaves very differently in the positive and negative time directions. It's also symmetric under spatial and temporal translation. It also has a scaling symmetry, in both the dependent and independent variables, but the scaling in the independent variables is different from the one we had for the wave equation. Let \[\tag{3.1.2} ( S _ { \alpha , \lambda } u ) ( t , x ) = \lambda u ( t / \alpha ^ 2 , x / \alpha ) ,\] where $ \alpha $ and $ \lambda $ are non-zero. A simple calculation shows that \[\tag{3.1.3} \begin{split} & \frac { \partial S _ { \alpha , \lambda } u } { \partial t } ( t , x ) - k \frac { \partial ^ 2 S _ { \alpha , \lambda } u } { \partial x ^ 2 } ( t , x ) \\ \quad & = \frac { \lambda } { \alpha ^ 2 } \left [ \frac { \partial u } { \partial t } ( t / \alpha ^ 2 , x / \alpha ) - k \frac { \partial ^ 2 u } { \partial x ^ 2 } ( t / \alpha ^ 2 , x / \alpha ) \right ] \end{split}\] so $ S _ { \alpha , \lambda } u $ satisfies the diffusion equation if and only if $ u $ does. The diffusion equation also has some less obvious symmetries. Let \[\tag{3.1.4} ( G _ v u ) ( t , x ) = \exp \left ( - \frac { v x } { 2 k } + \frac { v ^ 2 t } { 4 k } \right ) u ( t , x - v t ) .\] Then $ G _ v $ is a symmetry for any value of $ v $. To verify this we set $ \tilde u = G _ v u $ and compute partial derivatives. \[\tag{3.1.5} \frac { \partial \tilde u } { \partial t } ( t , x ) = \exp \left ( - \frac { v x } { 2 k } + \frac { v ^ 2 t } { 4 k } \right ) \left [ \frac { \partial u } { \partial t } - v \frac { \partial u } { \partial x } + \frac { v ^ 2 } { 4 k } u \right ] ,\] \[\tag{3.1.6} \frac { \partial \tilde u } { \partial x } ( t , x ) = \exp \left ( - \frac { v x } { 2 k } + \frac { v ^ 2 t } { 4 k } \right ) \left [ \frac { \partial u } { \partial x } - \frac { v } { 2 k } u \right ] ,\] and \[\tag{3.1.7} \frac { \partial ^ 2 \tilde u } { \partial x ^ 2 } = \exp \left ( - \frac { v x } { 2 k } + \frac { v ^ 2 t } { 4 k } \right ) \left [ \frac { \partial ^ 2 u } { \partial x ^ 2 } - \frac { v } { k } \frac { \partial u } { \partial x } + \frac { v ^ 2 } { 4 k ^ 2 } u \right ] .\] On the right hand side $ u $ and its various derivatives are always evaluated at $ ( t , x - v t ) $ but this has not been written explicitly to prevent the equations from becoming too long. The same will be done in the next equation, \[\tag{3.1.8} \frac { \partial \tilde u } { \partial t } ( t , x ) - k \frac { \partial ^ 2 \tilde u } { \partial x ^ 2 } ( t , x ) = \exp \left ( - \frac { v x } { 2 k } + \frac { v ^ 2 t } { 4 k } \right ) \left [ \frac { \partial u } { \partial t } - k \frac { \partial ^ 2 u } { \partial x ^ 2 } \right ] ,\] which follows immediately from the preceding equations. The left hand side is zero if and only if the right hand side is zero and the exponential factor is non-zero so $ \tilde u $ satisfies the diffusion equation if and only if $ u $ does.
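Identity (3.1.8) holds for an arbitrary $ u $, not just for solutions, so it can be checked symbolically on any concrete test function. The sympy sketch below is an aside; the test function is an arbitrary choice.

```python
import sympy as sp

# Symbolic check (an aside) of the identity (3.1.8).  Since it is an identity
# in u, it can be tested on any concrete smooth u; the u below is an arbitrary
# choice and is not a solution of anything.
t, x, v, k = sp.symbols('t x v k', positive=True)
u = sp.sin(x)*sp.exp(t) + x**3*t**2          # arbitrary test function

phi = sp.exp(-v*x/(2*k) + v**2*t/(4*k))
u_tilde = phi * u.subs(x, x - v*t)           # (G_v u)(t, x)

lhs = sp.diff(u_tilde, t) - k*sp.diff(u_tilde, x, 2)
rhs = phi * (sp.diff(u, t) - k*sp.diff(u, x, 2)).subs(x, x - v*t)
print(sp.simplify(lhs - rhs))                # prints 0
```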

Section 3.2 Special Solutions

One thing we can do with symmetries of a differential equation is to look for solutions invariant under the symmetry. We may or may not get something interesting, depending on the equation and the symmetry. Spatial and temporal translations, for example, don't give very interesting solutions of the diffusion equation. A function which is invariant under spatial translations is just a $ u ( t , x ) $ which is independent of $ x $, but then $ \partial ^ 2 u / \partial x ^ 2 = 0 $ and so if $ u $ is a solution of the diffusion equation then $ \partial u / \partial t = 0 $ as well, so $ u $ is independent of $ t $ too, and hence constant. So the only solutions of the diffusion equation invariant under spatial translations are the constant solutions. The solutions which are invariant under temporal translations are just the linear functions of $ x $.

The only solution to the diffusion equation which is invariant under scaling in the dependent variable is the zero solution, which is also not very interesting. Scaling in the independent variables is more interesting. A scale invariant solution is one which satisfies $ S _ { \alpha , 1 } u = u $ for all $ \alpha $. In other words \[\tag{3.2.1} u ( t , x ) = u ( t / \alpha ^ 2 , x / \alpha ) .\] Since this holds for all $ \alpha $ it must hold in particular for $ \alpha = \sqrt { k t } $, so \[\tag{3.2.2} u ( t , x ) = \varphi ( x / \sqrt { k t } ) ,\] where \[\tag{3.2.3} \varphi ( y ) = u ( 1 / k , y ) .\] Here we've implicitly assumed that $ t > 0 $, so we can only expect this procedure to give us an invariant solution defined there. Of course we also need $ u $ to satisfy the diffusion equation. Taking partial derivatives, \[\tag{3.2.4} \frac { \partial u } { \partial t } ( t , x ) = - \frac 1 2 \frac x { \sqrt { k t ^ 3 } } \varphi ' ( x / \sqrt { k t } ) ,\] \[\tag{3.2.5} \frac { \partial u } { \partial x } ( t , x ) = \frac 1 { \sqrt { k t } } \varphi ' ( x / \sqrt { k t } ) ,\] and \[\tag{3.2.6} \frac { \partial ^ 2 u } { \partial x ^ 2 } ( t , x ) = \frac 1 { k t } \varphi '' ( x / \sqrt { k t } ) ,\] so \[\tag{3.2.7} \frac { \partial u } { \partial t } ( t , x ) - k \frac { \partial ^ 2 u } { \partial x ^ 2 } ( t , x ) = - \frac 1 t \left [ \varphi '' ( x / \sqrt { k t } ) + \frac 1 2 \frac x { \sqrt { k t } } \varphi ' ( x / \sqrt { k t } ) \right ] .\] The right hand side is zero for all $ t > 0 $ and all $ x $ if and only if $ \varphi $ satisfies the ordinary differential equation \[\tag{3.2.8} \varphi '' ( y ) + \frac y 2 \varphi ' ( y ) = 0 .\] This can be solved as follows. Let \[\tag{3.2.9} \psi ( y ) = \exp ( y ^ 2 / 4 ) \varphi ' ( y ) .\] Then \[\tag{3.2.10} \psi ' ( y ) = \exp ( y ^ 2 / 4 ) \left [ \varphi '' ( y ) + \frac y 2 \varphi ' ( y ) \right ]\] so $ \varphi $ satisfies the differential equation above if and only if $ \psi $ is constant. Calling that constant $ c _ 1 $ we then have \[\tag{3.2.11} \varphi ' ( y ) = c _ 1 \exp ( - y ^ 2 / 4 )\] and so \[\tag{3.2.12} \varphi ( y ) = c _ 1 \int _ 0 ^ y \exp ( - z ^ 2 / 4 ) \, d z + c _ 2\] for some other constant $ c _ 2 $. The integral above can't be expressed in terms of elementary functions. It can be expressed in terms of what's called the error function, named because of its interpretation in probability theory, but the error function is defined in terms of this integral so that doesn't really provide any new information. In any case, we conclude that the scale invariant solutions to the diffusion equation are the two parameter family given by the equation above.
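Writing $ \int _ 0 ^ y \exp ( - z ^ 2 / 4 ) \, d z = \sqrt \pi \operatorname { erf } ( y / 2 ) $, a computer algebra system can confirm that the resulting $ u $ really does satisfy the diffusion equation for $ t > 0 $. The following sympy sketch is an aside which does exactly that.

```python
import sympy as sp

# Symbolic check (an aside): the scale invariant solution built from the error
# function satisfies the diffusion equation for t > 0.  Here
# int_0^y exp(-z**2/4) dz = sqrt(pi)*erf(y/2), and c1, c2 are the two constants.
t, x, k = sp.symbols('t x k', positive=True)
c1, c2 = sp.symbols('c1 c2')
y = x / sp.sqrt(k * t)
u = c1 * sp.sqrt(sp.pi) * sp.erf(y / 2) + c2

print(sp.simplify(sp.diff(u, t) - k * sp.diff(u, x, 2)))   # prints 0
```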

Next we look for solutions of the diffusion equation which are invariant under the transformations $ G _ v $ defined earlier, i.e. those that satisfy \[\tag{3.2.13} u ( t , x ) = \exp \left ( - \frac { v x } { 2 k } + \frac { v ^ 2 t } { 4 k } \right ) u ( t , x - v t )\] for all $ v $. Again we'll look for solutions valid for $ t > 0 $. Since the equation above holds for all $ v $ it holds in particular for $ v = x / t $, which gives \[\tag{3.2.14} u ( t , x ) = \exp \left ( - \frac { x ^ 2 } { 4 k t } \right ) \varphi ( t )\] where $ \varphi ( t ) = u ( t , 0 ) $. We still need to impose the condition that $ u $ satisfies the diffusion equation. Computing partial derivatives, \[\tag{3.2.15} \frac { \partial u } { \partial t } ( t , x ) = \exp \left ( - \frac { x ^ 2 } { 4 k t } \right ) \left [ \varphi ' ( t ) + \frac { x ^ 2 } { 4 k t ^ 2 } \varphi ( t ) \right ] ,\] \[\tag{3.2.16} \frac { \partial u } { \partial x } ( t , x ) = \exp \left ( - \frac { x ^ 2 } { 4 k t } \right ) \left [ - \frac { x } { 2 k t } \varphi ( t ) \right ] ,\] and \[\tag{3.2.17} \frac { \partial ^ 2 u } { \partial x ^ 2 } ( t , x ) = \exp \left ( - \frac { x ^ 2 } { 4 k t } \right ) \left [ \frac { x ^ 2 } { 4 k ^ 2 t ^ 2 } \varphi ( t ) - \frac 1 { 2 k t } \varphi ( t ) \right ] ,\] so \[\tag{3.2.18} \frac { \partial u } { \partial t } ( t , x ) - k \frac { \partial ^ 2 u } { \partial x ^ 2 } ( t , x ) = \exp \left ( - \frac { x ^ 2 } { 4 k t } \right ) \left [ \varphi ' ( t ) + \frac 1 { 2 t } \varphi ( t ) \right ] ,\] and $ u $ satisfies the diffusion equation if and only if $ \varphi $ satisfies the ordinary differential equation \[\tag{3.2.19} \varphi ' ( t ) + \frac 1 { 2 t } \varphi ( t ) = 0 .\] The solutions of this equation are precisely the constant multiples of $ t ^ { - 1 / 2 } $. In this way we see that every solution of the diffusion equation invariant under the transformations $ G _ v $ is a constant multiple of \[\tag{3.2.20} K ( t , x ) = \frac 1 { \sqrt { 4 \pi k t } } \exp \left ( - \frac { x ^ 2 } { 4 k t } \right ) .\] This solution turns out to be so important to the theory of the diffusion equation that it is known as the fundamental solution. The extra factor $ 1 / \sqrt { 4 \pi k } $ is chosen to simplify the form of various equations which will appear later in the chapter.
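Two properties of $ K $ worth checking, symbolically or otherwise, are that it satisfies the diffusion equation for $ t > 0 $ and that its integral in $ x $ is $ 1 $, which is exactly what the normalising factor achieves. The sympy sketch below is an aside which confirms both.

```python
import sympy as sp

# Two quick checks on the fundamental solution (an aside): it satisfies the
# diffusion equation for t > 0, and its total integral in x is 1, which is
# what the normalising factor 1/sqrt(4 pi k t) is for.
t, x, k = sp.symbols('t x k', positive=True)
K = sp.exp(-x**2/(4*k*t)) / sp.sqrt(4*sp.pi*k*t)

print(sp.simplify(sp.diff(K, t) - k*sp.diff(K, x, 2)))   # prints 0
print(sp.integrate(K, (x, -sp.oo, sp.oo)))               # prints 1
```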

Section 3.3 Positivity

For the diffusion equation we will consider the initial value problem \[\tag{3.3.1} u ( s , x ) = f ( x ) .\] Note that unlike the case of the wave equation, for the diffusion equation we only specify the initial value and not the initial time derivative as data. One other difference is that we will only look for a solution for $ t \ge s $.

The diffusion equation is called the diffusion equation because it describes various diffusion processes, for chemicals, heat, etc. The last of these also explains why it is often also called the heat equation. This physical origin suggests a conjecture about the initial value problem, namely that if $ f $ is positive, or non-negative, then $ u $ should be as well. This conjecture is, unfortunately, false but it is true for bounded solutions. In fact most things we want to prove about the diffusion equation turn out to be false in general but true when we restrict our attention to bounded solutions and initial data. In fact it's possible to replace boundedness with considerably weaker growth conditions but we won't bother doing this. Of course there is no hope of $ u $ being bounded unless $ f $ is, so for the remainder of the chapter every time $ u $ and $ f $ appear there will be an implicit assumption that both are bounded. That assumption will be made explicit though in the statements of theorems or in those parts of their proofs where we use it.

Suppose $ u $ is a bounded solution of the initial value problem for the diffusion equation with non-negative initial data. For $ \epsilon > 0 $ define $ w $ by \[\tag{3.3.2} w ( t , x ) = u ( t , x ) + 3 k \epsilon t + \epsilon x ^ 2 .\] Note that $ w $ doesn't satisfy the diffusion equation, but instead satisfies the related equation \[\tag{3.3.3} \frac { \partial w } { \partial t } - k \frac { \partial ^ 2 w } { \partial x ^ 2 } = k \epsilon .\] Since $ w $ is a continuous function it must have a minimum on the rectangle $ [ s , T ] \times [ - L , L ] $ for any $ T > s $ and $ L > 0 $. It does not have a minimum in the interior $ ( s , T ) \times ( - L , L ) $ of the rectangle. If it did then the first partial derivatives would be zero there, which would then imply that $ \partial ^ 2 w / \partial x ^ 2 = - \epsilon $ there, but the second partial derivative can't be negative at an interior minimum. A similar, but more careful, argument shows that there is also no minimum on $ \{ T \} \times ( - L , L ) $, the interior of the top of the rectangle. At such a minimum $ \partial w / \partial t $ would have to be non-positive, since otherwise we would have points just below it where the value is smaller. Similarly, $ \partial w / \partial x $ would have to be non-negative since otherwise we'd have points just to its right where the value is smaller. But $ \partial w / \partial x $ would also have to be non-positive or we'd have points to its left where the value is smaller, so in fact $ \partial w / \partial x $ must be zero. It then follows that $ \partial ^ 2 w / \partial x ^ 2 $ must be non-negative at the minimum, because otherwise we'd have points on either side where the value is smaller. So $ \partial w / \partial t - k \partial ^ 2 w / \partial x ^ 2 $ would be non-positive at the minimum. But we've already seen that this is equal to $ k \epsilon $, the product of two positive numbers, so the assumption that there is a minimum on the interior of the top of the rectangle leads to a contradiction.

On the sides $ [ s , T ] \times \{ - L \} $ and $ [ s , T ] \times \{ L \} $ we have \[\tag{3.3.4} w ( t , x ) = u ( t , x ) + 3 k \epsilon t + \epsilon L ^ 2 \ge \inf u + 3 k \epsilon s + \epsilon L ^ 2 > 3 k \epsilon s\] if \[\tag{3.3.5} L > \sqrt { \frac { \max ( 0 , - \inf u ) } \epsilon } .\] Here we've used our boundedness assumption on $ u $.

The only possibility not considered so far is that the minimum of $ w $ is located on $ \{ s \} \times ( - L , L ) $, the interior of the bottom of the rectangle. If so then the definition of $ w $ and our assumption that $ f $ is non-negative imply that this minimum value of $ w $ is at least $ 3 k \epsilon s $.

What we have shown is that the minimum of $ w $ in the rectangle is attained at a point on the sides or bottom and is at least $ 3 k \epsilon s $, provided $ L $ is sufficiently large. In particular, \[\tag{3.3.6} w ( t , x ) \ge 3 k \epsilon s\] for any $ ( t , x ) $ in the rectangle, and therefore \[\tag{3.3.7} u ( t , x ) \ge - 3 k \epsilon ( t - s ) - \epsilon x ^ 2 .\] For any $ t > s $ there is a $ T > t $ and an \[\tag{3.3.8} L > \max ( | x | , \sqrt { \frac { \max ( 0 , - \inf u ) } \epsilon } ) ,\] so we have \[\tag{3.3.9} u ( t , x ) \ge - 3 k \epsilon ( t - s ) - \epsilon x ^ 2 .\] No assumptions other than positivity were made on $ \epsilon $ so this inequality holds for all positive $ \epsilon $, and we can therefore let $ \epsilon $ tend to zero. This then gives \[\tag{3.3.10} u ( t , x ) \ge 0 .\] In other words we've proved the following theorem.

Theorem 3.3.A Suppose $ u $ is a bounded classical solution of the initial value problem for the diffusion equation with initial data $ f $ which is non-negative. Then $ u $ is non-negative.
In fact we proved something slightly stronger since we only used the fact that $ u $ has a lower bound, not that it has an upper bound. A more precise result is that if the initial data is non-negative and is positive at at least one point then the solution is positive for all later times. We will prove this in the next section.

Section 3.4 Uniqueness

If $ u $ is a solution of the diffusion equation then so is $ - u $ so the positivity theorem at the end of the last section also shows that if the initial data for the initial value problem are non-positive then the solution is non-positive. Combining this with the original version we see that if the initial data are zero then the solution is also zero. The equation is linear so the difference of two solutions is also a solution. Considering the difference of two solutions then we see that if the difference of their initial data is zero then the difference of the solutions is zero. Put more simply, if they have the same initial data then they are the same solution. In this way we obtain the following uniqueness theorem.

Theorem 3.4.A There is at most one bounded classical solution to the initial value problem for the diffusion equation in the region $ [ s , + \infty ) \times \mathbf R $.
The method of proof we've used unfortunately gives no information about what this solution might be.

We had several different proofs of uniqueness for the wave equation. There was a more or less direct proof based on a pair of auxiliary functions we defined there, there was a proof based on Green's theorem, and there was a proof using energy conservation and linearity. The first two both led to D'Alembert's formula while the last one didn't give any explicit form for the solution. The first of these didn't generalise even to closely related equations like Klein-Gordon and so we can't expect to find anything similar for the diffusion equation. The last does have a generalisation to the diffusion equation, which we'll see later, but it gives us a pure uniqueness theorem, not an explicit formula, and we already have that, so it seems natural to look for an alternate uniqueness proof for the diffusion equation using Green's theorem. This works, but the choice of functions to apply Green's theorem to is much less obvious than it was for the wave equation.

We apply Green's theorem with \[\tag{3.4.1} p = - u v , \quad q = k u \frac { \partial v } { \partial x } - k v \frac { \partial u } { \partial x } ,\] where $ u $ and $ v $ are to be chosen later. The integrand on the right hand side in Green's theorem is then \[\tag{3.4.2} \frac { \partial q } { \partial x } - \frac { \partial p } { \partial t } = v \left ( \frac { \partial u } { \partial t } - k \frac { \partial ^ 2 u } { \partial x ^ 2 } \right ) + u \left ( \frac { \partial v } { \partial t } + k \frac { \partial ^ 2 v } { \partial x ^ 2 } \right ) .\] The first term on the right hand side will vanish if $ u $ satisfies the diffusion equation. The second term will vanish if $ v $ satisfies the time reversed version of the diffusion equation. Roughly the idea will be to let $ u $ be an arbitrary solution of the diffusion equation and let $ v $ be a particular solution of the time reversed equation, chosen so as to provide useful information about $ u $. In the end this isn't quite what we want, but for the moment it's a useful guide. Which particular solution should we take? We have a variety of particular solutions to the diffusion equation which we found earlier when we looked for solutions symmetric under particular symmetries and we can get a solution to the time reversed diffusion equation simply by reversing time in one of those. The most interesting of the solutions there was the fundamental solution, so we'll choose that one. We can get a bit more information though by applying an arbitrary space-time translation to the fundamental solution and then reversing time, so we'll choose \[\tag{3.4.3} v ( t , x ) = K ( t _ 3 - t , x - x _ 3 )\] for some point $ ( t _ 3 , x _ 3 ) $. Eventually we will need to modify this choice but first let's see what happens when we choose this $ v $.

Now that we have our functions $ p $ and $ q $ we need to choose a region $ R $. We will choose a strip $ [ t _ 1 , t _ 2 ] \times \mathbf R $, where $ t _ 1 < t _ 2 < t _ 3 $. Unfortunately this region is not bounded, so the hypotheses of Green's theorem are not satisfied, but we will temporarily ignore this problem and see what happens. The boundary of the strip consists of the line $ \{ t _ 1 \} \times \mathbf R $, traversed from left to right and the line $ \{ t _ 2 \} \times \mathbf R $, traversed from right to left. On each of these $ d t = 0 $ so we are integrating $ p \, d x $, i.e. $ - u v \, d x $. So Green's theorem, if it applied, would give \[\tag{3.4.4} \begin{split} & \int _ { - \infty } ^ { + \infty } K ( t _ 3 - t _ 2 , x - x _ 3 ) u ( t _ 2 , x ) \, d x \\ & \quad {} - \int _ { - \infty } ^ { + \infty } K ( t _ 3 - t _ 1 , x - x _ 3 ) u ( t _ 1 , x ) \, d x = 0 . \end{split}\] The two integrals are both of the form \[\tag{3.4.5} \begin{split} & \int _ { - \infty } ^ { + \infty } K ( t _ 3 - t _ j , x - x _ 3 ) u ( t _ j , x ) \, d x \\ & \quad {} = \frac 1 { \sqrt { 4 \pi k ( t _ 3 - t _ j ) } } \int _ { - \infty } ^ { + \infty } \exp \left ( - \frac { ( x - x _ 3 ) ^ 2 } { 4 k ( t _ 3 - t _ j ) } \right ) u ( t _ j , x ) \, d x . \end{split}\] Making the change of variable \[\tag{3.4.6} y = \frac { x - x _ 3 } { \sqrt { 4 \pi k ( t _ 3 - t _ j ) } }\] converts this integral to \[\tag{3.4.7} \int _ { - \infty } ^ { + \infty } \exp ( - \pi y ^ 2 ) u \left ( t _ j , x _ 3 + y \sqrt { 4 \pi k ( t _ 3 - t _ j ) } \right ) \, d y .\] We want to make this change of variable in the integral with $ j = 2 $ but not the one with $ j = 1 $. This gives us the equation \[\tag{3.4.8} \begin{split} & \int _ { - \infty } ^ { + \infty } \exp ( - \pi y ^ 2 ) u \left ( t _ 2 , x _ 3 + y \sqrt { 4 \pi k ( t _ 3 - t _ 2 ) } \right ) \, d y \\ & \quad {} = \frac 1 { \sqrt { 4 \pi k ( t _ 3 - t _ 1 ) } } \int _ { - \infty } ^ { + \infty } \exp \left ( - \frac { ( x - x _ 3 ) ^ 2 } { 4 k ( t _ 3 - t _ 1 ) } \right ) u ( t _ 1 , x ) \, d x . \end{split}\] Next we take limits as $ t _ 3 $ tends to $ t _ 2 $ from above, simply taking the limit inside the integral without worrying for the moment whether this is justified. On the left hand side the argument of $ u $ tends to $ ( t _ 2 , x _ 3 ) $ and $ u $ is continuous so the integrand tends to $ \exp ( - \pi y ^ 2 ) u ( t _ 2 , x _ 3 ) $. We can pull the constant outside the integral and the remaining integral is a well known definite integral with value 1 so on the left hand side of the equation we just get $ u ( t _ 2 , x _ 3 ) $. On the right hand side, using the continuity of $ u $ again, we just get the result of substituting $ t _ 2 $ for $ t _ 3 $ in the integral we had previously. 
In other words, the limit of the equation above is \[\tag{3.4.9} u ( t _ 2 , x _ 3 ) = \frac 1 { \sqrt { 4 \pi k ( t _ 2 - t _ 1 ) } } \int _ { - \infty } ^ { + \infty } \exp \left ( - \frac { ( x - x _ 3 ) ^ 2 } { 4 k ( t _ 2 - t _ 1 ) } \right ) u ( t _ 1 , x ) \, d x .\] Changing the names of various variables we see that if $ s < t $ then \[\tag{3.4.10} u ( t , x ) = \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \int _ { - \infty } ^ { + \infty } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) u ( s , y ) \, d y .\] In other words, if $ u $ satisfies the initial value problem for the diffusion equation then \[\tag{3.4.11} u ( t , x ) = \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \int _ { - \infty } ^ { + \infty } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) f ( y ) \, d y\] for all $ t > s $ and all $ x $. This formula, if correct, gives us an alternate proof of the uniqueness of solutions to the initial value problem, one which gives us an idea for how to prove existence as well: we just check that the formula above does indeed give a solution to the initial value problem. Unfortunately there are two gaps in the proof above. We applied Green's theorem improperly and we exchanged limits and integrals without justification. We need to fix that, but in fact the equation above is correct.
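As a quick sanity check on the formula, an aside not needed for the argument, one can test it numerically against a known bounded solution of the diffusion equation such as $ u ( t , x ) = e ^ { - k t } \cos x $. The following Python sketch does this with arbitrarily chosen numerical values.

```python
import numpy as np
from scipy.integrate import quad

# Numerical spot check (an aside) of the representation formula above, using
# the bounded solution u(t, x) = exp(-k t) cos(x) of the diffusion equation.
# The numbers below are arbitrary sample values with s < t.
k, s, t, x = 0.7, 0.2, 1.5, 0.9
u = lambda tt, xx: np.exp(-k*tt)*np.cos(xx)

kernel = lambda y: np.exp(-(y - x)**2/(4*k*(t - s))) / np.sqrt(4*np.pi*k*(t - s))
integral, _ = quad(lambda y: kernel(y)*u(s, y), -np.inf, np.inf)
print(u(t, x), integral)   # the two values should agree closely
```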

Usually the way to prove anything involving integrals over an infinite interval is to take limits in a finite interval. This nearly works in our current situation, but as we'll see it doesn't quite do everything we want. Let's see what happens if we apply Green's theorem with $ p $ and $ q $ as before to the rectangle $ [ t _ 1 , t _ 2 ] \times [ x _ 3 - L , x _ 3 + L ] $. This region does, of course, satisfy the hypotheses of Green's theorem. Its boundary consists of four straight segments: $ C _ 1 $ from $ ( t _ 2 , x _ 3 + L ) $ to $ ( t _ 2 , x _ 3 - L ) $, $ C _ 2 $ from $ ( t _ 2 , x _ 3 - L ) $ to $ ( t _ 1 , x _ 3 - L ) $, $ C _ 3 $ from $ ( t _ 1 , x _ 3 - L ) $ to $ ( t _ 1 , x _ 3 + L ) $, and $ C _ 4 $ from $ ( t _ 1 , x _ 3 + L ) $ to $ ( t _ 2 , x _ 3 + L ) $. As before the integrand in the area integral is zero so we have \[\tag{3.4.12} \sum _ { j = 1 } ^ 4 \int _ { C _ j } ( p \, d x + q \, d t ) = 0 .\] On the first and third boundary curves we have \[\tag{3.4.13} \int _ { C _ 1 } ( p \, d x + q \, d t ) = \int _ { x _ 3 - L } ^ { x _ 3 + L } u ( t _ 2 , x ) v ( t _ 2 , x ) \, d x\] and \[\tag{3.4.14} \int _ { C _ 3 } ( p \, d x + q \, d t ) = - \int _ { x _ 3 - L } ^ { x _ 3 + L } u ( t _ 1 , x ) v ( t _ 1 , x ) \, d x .\] When we take limits as $ L $ goes to infinity these will just tend to the corresponding integrals over $ ( - \infty , + \infty ) $, which is what we want. The integral over the right side of the rectangle is \[\tag{3.4.15} \begin{split} \int _ { C _ 4 } ( p \, d x + q \, d t ) & = k \int _ { t _ 1 } ^ { t _ 2 } u ( t , x _ 3 + L ) \frac { \partial v } { \partial x } ( t , x _ 3 + L ) \, d t \\ & \quad {} - k \int _ { t _ 1 } ^ { t _ 2 } v ( t , x _ 3 + L ) \frac { \partial u } { \partial x } ( t , x _ 3 + L ) \, d t . \end{split}\] The first of these integrals will tend to zero as $ L $ tends to infinity, a fact which we will now prove.

Our $ v $ was defined in terms of the fundamental solution $ K $ so we need an $ x $ derivative of $ K $. In fact for later purposes we will need higher order derivatives in both $ x $ and $ t $ so we go ahead and compute them now. Let \[\tag{3.4.16} w _ j ( t , x ) = ( - 2 k t ) ^ j \frac { \frac { \partial ^ j K } { \partial x ^ j } ( t , x ) } { K ( t , x ) } .\] Then $ w _ 0 = 1 $ and \[\tag{3.4.17} \begin{split} \frac { \partial ^ { j + 1 } K } { \partial x ^ { j + 1 } } ( t , x ) & = \frac { \partial } { \partial x } \frac { \partial ^ j K } { \partial x ^ j } ( t , x ) \\ & = ( - 2 k t ) ^ { - j } \frac { \partial } { \partial x } \left [ w _ j ( t , x ) K ( t , x ) \right ] \\ & = ( - 2 k t ) ^ { - j } \left [ K ( t , x ) \frac { \partial w _ j } { \partial x } ( t , x ) + w _ j ( t , x ) \frac { \partial K } { \partial x } ( t , x ) \right ] \\ & = ( - 2 k t ) ^ { - j } \left [ K ( t , x ) \frac { \partial w _ j } { \partial x } ( t , x ) - w _ j ( t , x ) \frac { x } { 2 k t } K ( t , x ) \right ] \end{split}\] and so \[\tag{3.4.18} w _ { j + 1 } ( t , x ) = x w _ j ( t , x ) - 2 k t \frac { \partial w _ j } { \partial x } ( t , x ) .\] We can use this to compute successively \[\tag{3.4.19} w _ 0 ( t , x ) = 1 , \quad w _ 1 ( t , x ) = x , \quad w _ 2 ( t , x ) = x ^ 2 - 2 k t , \quad w _ 3 ( t , x ) = x ^ 3 - 6 k t x\] and so forth. These are the only ones we'll actually need though. You'll notice, and can easily prove by induction, that $ w _ j $ is always a polynomial of degree $ j $. Once we have the $ w $'s we can easily get the $ x $ derivatives of $ K $, \[\tag{3.4.20} \frac { \partial ^ j K } { \partial x ^ j } ( t , x ) = ( - 2 k t ) ^ { - j } w _ j ( t , x ) K ( t , x ) .\] We could get $ t $ derivatives by a similar method but it's simpler just to note that $ K $ satisfies the diffusion equation so one $ t $ derivative is the same as two $ x $ derivatives and a factor of $ k $. In this way we find that \[\tag{3.4.21} \frac { \partial ^ { i + j } K } { \partial t ^ i \partial x ^ j } ( t , x ) = ( - 1 ) ^ { j } 2 ^ { - 2 i - j } k ^ { - i - j } t ^ { - 2 i - j } w _ { 2 i + j } ( t , x ) K ( t , x ) .\]
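The recursion for the $ w _ j $'s and the mixed derivative formula are easy to check with a computer algebra system. The following short sympy sketch is an aside; the case $ i = j = 1 $ is an arbitrary choice.

```python
import sympy as sp

# Symbolic check (an aside) of the recursion for w_j and of the mixed
# derivative formula above, here for the arbitrarily chosen case i = j = 1.
t, x, k = sp.symbols('t x k', positive=True)
K = sp.exp(-x**2/(4*k*t)) / sp.sqrt(4*sp.pi*k*t)

w = [sp.Integer(1)]
for j in range(3):
    w.append(sp.expand(x*w[j] - 2*k*t*sp.diff(w[j], x)))
print(w)   # [1, x, x**2 - 2*k*t, x**3 - 6*k*t*x]

i, j = 1, 1
lhs = sp.diff(K, t, i, x, j)
rhs = (-1)**j * 2**(-2*i - j) * k**(-i - j) * t**(-2*i - j) * w[2*i + j] * K
print(sp.simplify(lhs - rhs))   # prints 0
```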

The preceding calculation gives us various useful properties of $ \partial K / \partial x ( t , x ) $. First of all, it always has the opposite sign to that of $ x $. Second, by looking at its $ t $ derivative we see that as a function of $ t $ for fixed negative $ x $ it increases until it reaches a maximum at $ t = x ^ 2 / 6 k $ and then decreases again, while for fixed positive $ x $ it decreases until it reaches a minimum at $ t = x ^ 2 / 6 k $ and then increases again. What this tells us about $ \partial v / \partial x ( t , x _ 3 + L ) = \partial K / \partial x ( t _ 3 - t , L ) $ is that it is negative for all $ t $ in the interval $ [ t _ 1 , t _ 2 ] $ and that its minimum in that interval is attained when $ t = t _ 1 $ provided that $ L $ is sufficiently large, specifically $ L \ge \sqrt { 6 k ( t _ 3 - t _ 1 ) } $. At that minimum $ \partial v / \partial x $ is equal to \[\tag{3.4.22} - \frac { L } { 4 \pi ^ { 1 / 2 } k ^ { 3 / 2 } ( t _ 3 - t _ 1 ) ^ { 3 / 2 } } \exp \left ( - \frac { L ^ 2 } { 4 k ( t _ 3 - t _ 1 ) } \right ) .\] The integral \[\tag{3.4.23} \begin{split} \int _ { t _ 1 } ^ { t _ 2 } u ( t , x _ 3 + L ) \frac { \partial v } { \partial x } ( t , x _ 3 + L ) \, d t \end{split}\] which we met earlier has an absolute value less than or equal to the integral of the absolute value of the integrand, which in turn is less than or equal to the length of the interval, $ t _ 2 - t _ 1 $, times the maximum value of the absolute value of the integrand. This in turn is bounded by the supremum of the absolute value of $ u $, which exists by our assumption that $ u $ is bounded, times the maximum value of the absolute value of $ \partial v / \partial x $, which we just computed. This is the only factor which depends on $ L $ and it clearly tends to zero as $ L $ tends to infinity, so the integral tends to zero, as promised.

We've now treated one of the two terms in the integral over $ C _ 4 $. The other term is the one involving the integral \[\tag{3.4.24} \begin{split} \int _ { t _ 1 } ^ { t _ 2 } v ( t , x _ 3 + L ) \frac { \partial u } { \partial x } ( t , x _ 3 + L ) \, d t . \end{split}\] If we try to apply a similar argument to this integral then we run into a problem. We assumed $ u $ was bounded so we had an upper bound for the absolute value of the factor $ u ( t , x _ 3 + L ) $ which was independent of $ L $. We haven't assumed that $ \partial u / \partial x $ is bounded though, so we don't have an upper bound for the absolute value of the factor $ \partial u / \partial x ( t , x _ 3 + L ) $ in the integral above, or at least not one which is independent of $ L $. At this point there are two options. One is just to add the boundedness of $ \partial u / \partial x $ as an additional hypothesis and the other is to look for a cleverer argument. Adding an additional hypothesis might seem reasonable. After all, we already added the hypothesis that $ u $ is bounded so why not just add another hypothesis? The situation here is different though. There are known counter-examples to the uniqueness theorem with the boundedness assumption removed so we had to add it, or possibly some weaker version of it. There are no counter-examples to the version of the uniqueness theorem which assumes boundedness of $ u $ but not of $ \partial u / \partial x $. We know that because we gave a proof of that theorem at the start of this section! So this is an assumption made purely for convenience, not from logical necessity. Mathematicians do sometimes make unnecessary hypotheses in order to simplify proofs but it's something we generally prefer to avoid so in this case we will not make any additional hypothesis and will instead look for a cleverer argument.

The problem came from the $ v \partial u / \partial x $ term so that is what we somehow have to eliminate. A simple way to kill this term is to choose a $ v $ which is zero on the left and right sides of the rectangle. The $ v $ we chose previously did not have this property. Indeed that $ v $ is positive everywhere on the boundary of the rectangle. It's easy to find $ v $'s which are zero on the left and right sides but the trick is to find one which doesn't spoil the rest of the argument. It's reasonable to try multiplying our previous $ v $ by a factor which vanishes on the left and right sides, for example \[\tag{3.4.25} v ( t , x ) = \rho \left ( \frac { x - x _ 3 } L \right ) K ( t _ 3 - t , x - x _ 3 ) ,\] where \[\tag{3.4.26} \rho ( r ) = \begin{cases} 0 & \mbox{ if ${ r < - 1 }$,} \\ 192 r ^ 5 + 720 r ^ 4 + 1040 r ^ 3 + 720 r ^ 2 + 240 r + 32 & \mbox{ if ${ - 1 \le r \le - 1 / 2 }$,} \\ 1 & \mbox{ if ${ - 1 / 2 < r < 1 / 2 }$,} \\ - 192 r ^ 5 + 720 r ^ 4 - 1040 r ^ 3 + 720 r ^ 2 - 240 r + 32 & \mbox{ if ${ 1 / 2 \le r \le 1 }$,} \\ 0 & \mbox{ if ${ r > 1 }$.} \\ \end{cases}\] This is not the only choice we could have made for $ \rho $ but it is relatively straightforward to check, using Lemma 2.6.A, that it is twice continuously differentiable. This implies in particular that it and its first two derivatives are zero at $ r = \pm 1 $. These properties ensure that our $ v $ is twice continuously differentiable and that our $ q $ is zero when $ x = x _ 3 \pm L $, which includes the boundary segments $ C _ 2 $ and $ C _ 4 $, so the corresponding integrals in Green's theorem are zero.
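If you want to convince yourself that this particular $ \rho $ really is twice continuously differentiable without grinding through Lemma 2.6.A by hand, the short Python sketch below, an aside, evaluates the two polynomial pieces and their first two derivatives at the break points; the function values match the adjacent constants and all of the printed derivative values are zero.

```python
import numpy as np

# Illustrative check (an aside) that rho above is C^2: at each break point the
# polynomial piece matches the adjacent constant (0 or 1) and has vanishing
# first and second derivatives there.
left  = np.poly1d([ 192, 720,  1040, 720,  240, 32])   # piece on [-1, -1/2]
right = np.poly1d([-192, 720, -1040, 720, -240, 32])   # piece on [ 1/2,  1]

for p, r0, r1 in [(left, -1.0, -0.5), (right, 0.5, 1.0)]:
    print("values  ", p(r0), p(r1))                     # 0 and 1 (or 1 and 0)
    print("1st der ", p.deriv(1)(r0), p.deriv(1)(r1))   # 0, 0
    print("2nd der ", p.deriv(2)(r0), p.deriv(2)(r1))   # 0, 0
```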

Modifying $ v $ as we did above fixes one problem but creates another. Our new $ v $ no longer satisfies the time reversed diffusion equation. Instead \[\tag{3.4.27} \begin{split} \frac { \partial v } { \partial t } ( t , x ) + k \frac { \partial ^ 2 v } { \partial x ^ 2 } ( t , x ) & = \frac k { L ^ 2 } \rho '' \left ( \frac { x - x _ 3 } L \right ) K ( t _ 3 - t , x - x _ 3 ) \\ & \quad {} + \frac { 2 k } { L } \rho ' \left ( \frac { x - x _ 3 } L \right ) \frac { \partial K } { \partial x } ( t _ 3 - t , x - x _ 3 ) . \end{split}\] This means the right hand side in Green's identity is no longer zero. It is important to note though that all of the terms on the right hand side of the equation above have at least one derivative of $ \rho $ and $ \rho $ is constant in the interval $ [ - 1 / 2 , 1 / 2 ] $ so the right hand side is zero for $ x $ in the interval $ [ x _ 3 - L / 2 , x _ 3 + L / 2 ] $. So Green's Theorem now gives us \[\tag{3.4.28} \begin{split} \int _ { x _ 3 - L } ^ { x _ 3 + L } u ( t _ 2 , x ) v ( t _ 2 , x ) \, d x & = \int _ { x _ 3 - L } ^ { x _ 3 + L } u ( t _ 1 , x ) v ( t _ 1 , x ) \, d x \\ & \quad {} + \int _ { R _ - } u \left ( \frac { \partial v } { \partial t } + k \frac { \partial ^ 2 v } { \partial x ^ 2 } \right ) \, d A \\ & \quad {} + \int _ { R _ + } u \left ( \frac { \partial v } { \partial t } + k \frac { \partial ^ 2 v } { \partial x ^ 2 } \right ) \, d A , \end{split}\] where $ R _ - $ is the rectangle $ [ t _ 1 , t _ 2 ] \times [ x _ 3 - L , x _ 3 - L / 2 ] $ and $ R _ + $ is the rectangle $ [ t _ 1 , t _ 2 ] \times [ x _ 3 + L / 2 , x _ 3 + L ] $. Compared to our previous calculation we've lost the line integrals over $ C _ 2 $ and $ C _ 4 $ and gained two area integrals over the rectangles $ R _ - $ and $ R _ + $. These are what remain of the original integral over the full rectangle when we remove the part over the rectangle $ [ t _ 1 , t _ 2 ] \times [ x _ 3 - L / 2 , x _ 3 + L / 2 ] $, where, as we've seen, $ v $ satisfies the time reversed diffusion equation and so the integrand vanishes there.

We need to show that the area integrals above are harmless, i.e. that they tend to zero as $ L $ tends to infinity. The two are similar so here we'll only consider the integral over $ R _ - $. The $ u $ factor has upper and lower bounds independent of $ L $ by assumption. We've computed $ \partial v / \partial t + k \partial ^ 2 v / \partial x ^ 2 $ in terms of derivatives of $ \rho $ and $ K $. The derivatives of $ \rho $ must be bounded in the interval $ [ - 1 , - 1 / 2 ] $ because polynomials are continuous functions. We don't really care what the precise bounds are but they're not hard to obtain. The minimum and maximum of $ \rho '' $ are $ - 40 / \sqrt 3 $ and $ 40 / \sqrt 3 $ while the minimum and maximum of $ \rho ' $ are $ 0 $ and $ 15 / 4 $. Bounding $ K $ and its $ x $ derivative is more interesting, but we already have experience with this problem from our earlier attempt. As long as $ L $ is sufficiently large, which in this case means $ L \ge 2 \sqrt { 6 k ( t _ 3 - t _ 1 ) } $, both terms will be positive in $ R _ - $, increasing in $ x $ and decreasing in $ t $ there, so the values in the rectangle lie between zero and the values at the corner $ ( t _ 1 , x _ 3 - L / 2 ) $: \[\tag{3.4.29} \begin{split} 0 \le K ( t _ 3 - t , x - x _ 3 ) & \le K ( t _ 3 - t _ 1 , - L / 2 ) \\ & {} = \frac 1 { \sqrt { 4 \pi k ( t _ 3 - t _ 1 ) } } \exp \left ( - \frac { ( L / 2 ) ^ 2 } { 4 k ( t _ 3 - t _ 1 ) } \right ) \end{split}\] and \[\tag{3.4.30} \begin{split} 0 \le \frac { \partial K } { \partial x } ( t _ 3 - t , x - x _ 3 ) & \le \frac { \partial K } { \partial x } ( t _ 3 - t _ 1 , - L / 2 ) \\ & {} = \frac { \pi L } { \left [ 4 \pi k ( t _ 3 - t _ 1 ) \right ] ^ { 3 / 2 } } \exp \left ( - \frac { ( L / 2 ) ^ 2 } { 4 k ( t _ 3 - t _ 1 ) } \right ) . \end{split}\] These bounds are quite messy but the important point is that our integrand is bounded in absolute value by a factor independent of $ L $ times \[\tag{3.4.31} \exp \left ( - \frac { ( L / 2 ) ^ 2 } { 4 k ( t _ 3 - t _ 1 ) } \right )\] and so our integral, which is over a rectangle of area $ L ( t _ 2 - t _ 1 ) / 2 $, is bounded in absolute value by a factor independent of $ L $ times \[\tag{3.4.32} L \exp \left ( - \frac { ( L / 2 ) ^ 2 } { 4 k ( t _ 3 - t _ 1 ) } \right ) ,\] and so tends to zero as $ L $ tends to infinity, as we wanted.

There are still the integrals over $ C _ 1 $ and $ C _ 3 $ to be dealt with, but these are the same integrals as before except for an extra factor of $ \rho ( ( x - x _ 3 ) / L ) $ in the integrands, which tends to 1 as $ L $ tends to infinity, and so is harmless. Of course we glossed over the interchange of limits and integrals in our previous, unsuccessful, argument and we are still doing so here, but other than that we have a new proof of our earlier uniqueness theorem, and this one gives an actual solution formula. From this formula we can extract a lot of useful information. For example, if $ f $ is non-negative everywhere and positive somewhere then the same will be true of the integrand in the integral formula for $ u ( t , x ) $ for $ t > s $, and so $ u ( t , x ) $ will be positive. This is the strengthened version of the non-negativity theorem mentioned earlier.

Section 3.5 Regularity

What we have shown above is that if there is a classical solution to the initial value problem for the diffusion equation then it must be \[\tag{3.5.1} u ( t , x ) = \int _ { - \infty } ^ { + \infty } \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) f ( y ) \, d y .\] We haven't yet shown that this is a solution though. The first thing we'll show is that it does indeed satisfy the diffusion equation, at least when $ t > s $. For this, and to fill a gap in our earlier proof of uniqueness, we need some multivariable calculus.
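Before doing so, here is a small numerical sanity check, which is no part of the proof: we can evaluate the integral in the formula by quadrature and compare $ \partial u / \partial t $ with $ k \, \partial ^ 2 u / \partial x ^ 2 $ by finite differences. The data $ f $, the constants and the test point below are arbitrary illustrative choices, as is the use of Python with scipy.

```python
# Illustrative check that the formula (3.5.1) satisfies u_t = k u_xx for t > s.
import numpy as np
from scipy.integrate import quad

k, s = 0.7, 0.0
f = lambda y: np.cos(y) / (1.0 + y * y)   # any bounded continuous data will do

def u(t, x):
    # the integrand of (3.5.1); quad handles the infinite interval
    def g(y):
        return np.exp(-(y - x)**2 / (4*k*(t - s))) * f(y) / np.sqrt(4*np.pi*k*(t - s))
    return quad(g, -np.inf, np.inf, epsabs=1e-12, epsrel=1e-12)[0]

t0, x0, h = 1.3, 0.4, 1e-3
u_t  = (u(t0 + h, x0) - u(t0 - h, x0)) / (2*h)
u_xx = (u(t0, x0 + h) - 2*u(t0, x0) + u(t0, x0 - h)) / h**2
print(u_t - k * u_xx)   # small; limited by quadrature and finite-difference error
```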

We have the following standard theorems from multivariable calculus.

Theorem 3.5.A Suppose that $ f $ is continuous on the product of closed intervals \[\tag{3.5.2} R = [ a _ 1 , b _ 1 ] \times \cdots \times [ a _ m , b _ m ]\] in $ \mathbf R ^ { m } $ and $ \sigma $ is a permutation of $ 1 , \ldots , m $. Then \[\tag{3.5.3} \begin{split} & \int _ { a _ { \sigma ( m ) } } ^ { b _ { \sigma ( m ) } } \ldots \int _ { a _ { \sigma ( 1 ) } } ^ { b _ { \sigma ( 1 ) } } f ( x _ 1 , \ldots , x _ m ) \, d x ^ { \sigma ( 1 ) } \cdots d x ^ { \sigma ( m ) } \\ & \qquad = \int _ { a _ m } ^ { b _ m } \ldots \int _ { a _ 1 } ^ { b _ 1 } f ( x _ 1 , \ldots , x _ m ) \, d x ^ 1 \cdots d x ^ m . \end{split}\] The cleanest way to prove this is to define integration over sufficiently general sets in $ \mathbf R ^ m $, for example over polyhedral regions, and then show that each of the repeated integrals above is equal to the integral over the whole region.
Theorem 3.5.B Suppose that $ f $ is continuous on the product of closed intervals \[\tag{3.5.4} R = [ a _ 1 , b _ 1 ] \times \cdots \times [ a _ m , b _ m ] \times [ c _ 1 , d _ 1 ] \times \cdots \times [ c _ n , d _ n ]\] in $ \mathbf R ^ { m + n } $. Then \[\tag{3.5.5} g ( x _ 1 , \ldots , x _ m ) = \int _ { c _ n } ^ { d _ n } \cdots \int _ { c _ 1 } ^ { d _ 1 } f ( x _ 1 , \ldots , x _ m , y _ 1 , \ldots , y _ n ) \, d y _ 1 \cdots d y _ n\] is continuous on $ [ a _ 1 , b _ 1 ] \times \cdots \times [ a _ m , b _ m ] $.
Theorem 3.5.B can be combined with the fundamental theorem of calculus to give a criterion for differentiation under the integral sign. Suppose that $ f $ is continuously differentiable in $ R $. Then the difference quotient \[\tag{3.5.6} \frac { g ( x _ 1 , \ldots , x _ j + h , \ldots , x _ m ) - g ( x _ 1 , \ldots , x _ j , \ldots , x _ m ) } h\] is equal to \[\tag{3.5.7} \int _ { c _ n } ^ { d _ n } \cdots \int _ { c _ 1 } ^ { d _ 1 } \frac { f ( x _ 1 , \ldots , x _ j + h , \ldots , x _ m , y _ 1 , \ldots , y _ n ) - f ( x _ 1 , \ldots , x _ j , \ldots , x _ m , y _ 1 , \ldots , y _ n ) } h \, d y _ 1 \cdots d y _ n\] and this, by the fundamental theorem of calculus, is equal to the integral \[\tag{3.5.8} \int _ { c _ n } ^ { d _ n } \cdots \int _ { c _ 1 } ^ { d _ 1 } \int _ { 0 } ^ 1 \frac { \partial f } { \partial x _ j } ( x _ 1 , \ldots , x _ j + r h , \ldots , x _ m , y _ 1 , \ldots , y _ n ) \, d r \, d y _ 1 \cdots d y _ n .\] It doesn't matter in which order we perform the integrals so we can also write this as \[\tag{3.5.9} \int _ { 0 } ^ 1 \int _ { c _ n } ^ { d _ n } \cdots \int _ { c _ 1 } ^ { d _ 1 } \frac { \partial f } { \partial x _ j } ( x _ 1 , \ldots , x _ j + r h , \ldots , x _ m , y _ 1 , \ldots , y _ n ) \, d y _ 1 \cdots d y _ n \, d r .\] The integrand is a continuous function on a product of closed intervals so the limit of the integral as $ h $ tends to zero exists and is equal to the integral of the limit, i.e. \[\tag{3.5.10} \int _ { 0 } ^ 1 \int _ { c _ n } ^ { d _ n } \cdots \int _ { c _ 1 } ^ { d _ 1 } \frac { \partial f } { \partial x _ j } ( x _ 1 , \ldots , x _ j , \ldots , x _ m , y _ 1 , \ldots , y _ n ) \, d y _ 1 \cdots d y _ n \, d r ,\] which is the same as \[\tag{3.5.11} \int _ { c _ n } ^ { d _ n } \cdots \int _ { c _ 1 } ^ { d _ 1 } \frac { \partial f } { \partial x _ j } ( x _ 1 , \ldots , x _ m , y _ 1 , \ldots , y _ n ) \, d y _ 1 \cdots d y _ n .\] The $ j $'th partial derivative of $ g $ is defined as the limit of this difference quotient so what we've just found is that $ \partial g / \partial x _ j $ exists and \[\tag{3.5.12} \frac { \partial g } { \partial x _ j } ( x _ 1 , \ldots , x _ m ) = \int _ { c _ n } ^ { d _ n } \cdots \int _ { c _ 1 } ^ { d _ 1 } \frac { \partial f } { \partial x _ j } ( x _ 1 , \ldots , x _ m , y _ 1 , \ldots , y _ n ) \, d y _ 1 \cdots d y _ n ,\] which is just what we would get by formally differentiating under the integral sign. From Theorem 3.5.B we see that this partial derivative is in fact continuous. So we have the following theorem.
Theorem 3.5.C Suppose that $ f $ is continuously differentiable on the product of intervals \[\tag{3.5.13} R = [ a _ 1 , b _ 1 ] \times \cdots \times [ a _ m , b _ m ] \times [ c _ 1 , d _ 1 ] \times \cdots \times [ c _ n , d _ n ]\] in $ \mathbf R ^ { m + n } $. Then \[\tag{3.5.14} g ( x _ 1 , \ldots , x _ m ) = \int _ { c _ n } ^ { d _ n } \cdots \int _ { c _ 1 } ^ { d _ 1 } f ( x _ 1 , \ldots , x _ m , y _ 1 , \ldots , y _ n ) \, d y _ 1 \cdots d y _ n\] is continuously differentiable on $ [ a _ 1 , b _ 1 ] \times \cdots \times [ a _ m , b _ m ] $ and its partial derivatives can be obtained by formally exchanging the partial derivative and integral.

Without additional hypotheses none of these theorems are valid if the closed intervals are replaced by open or half-open intervals. For example, \[\tag{3.5.15} f ( x , y ) = \frac { 8 x y ( x ^ 2 - y ^ 2 ) } { ( x ^ 2 + y ^ 2 ) ^ 3 }\] is continuous on the product of open intervals $ ( 0 , 1 ) \times ( 0 , 1 ) $ but \[\tag{3.5.16} \int _ 0 ^ 1 \int _ 0 ^ 1 f ( x , y ) \, d x \, d y = - 1\] while \[\tag{3.5.17} \int _ 0 ^ 1 \int _ 0 ^ 1 f ( x , y ) \, d y \, d x = 1 ,\] which would be a counter-example to 3.5.A if it applied to products of open intervals. This example has the property that \[\tag{3.5.18} \int _ 0 ^ 1 \int _ 0 ^ 1 \left | f ( x , y ) \right | \, d x \, d y = \infty .\] This is not an accident. One can, in fact, show the following.
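As a quick check of these values, one can compute the two iterated integrals numerically. The following Python sketch is purely illustrative; it uses nested scipy quadratures with a break point on the diagonal, where the inner integrand has its sharp feature.

```python
# Illustrative computation of the two iterated integrals of (3.5.15).
import numpy as np
from scipy.integrate import quad

f = lambda x, y: 8*x*y*(x*x - y*y) / (x*x + y*y)**3

dx_then_dy = quad(lambda y: quad(lambda x: f(x, y), 0, 1, points=[y])[0], 0, 1)[0]
dy_then_dx = quad(lambda x: quad(lambda y: f(x, y), 0, 1, points=[x])[0], 0, 1)[0]
print(dx_then_dy, dy_then_dx)   # approximately -1 and 1
```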

Theorem 3.5.D Suppose that $ f $ is continuous on the product of intervals \[\tag{3.5.19} R = I _ 1 \times \cdots \times I _ m\] in $ \mathbf R ^ { m } $, and let $ a _ j = \inf I _ j $ and $ b _ j = \sup I _ j $. If \[\tag{3.5.20} \int _ { a _ m } ^ { b _ m } \ldots \int _ { a _ 1 } ^ { b _ 1 } \left | f ( x _ 1 , \ldots , x _ m ) \right | \, d x ^ 1 \cdots d x ^ m < \infty\] then \[\tag{3.5.21} \begin{split} & \int _ { a _ { \sigma ( m ) } } ^ { b _ { \sigma ( m ) } } \ldots \int _ { a _ { \sigma ( 1 ) } } ^ { b _ { \sigma ( 1 ) } } f ( x _ 1 , \ldots , x _ m ) \, d x ^ { \sigma ( 1 ) } \cdots d x ^ { \sigma ( m ) } \\ & \qquad = \int _ { a _ m } ^ { b _ m } \ldots \int _ { a _ 1 } ^ { b _ 1 } f ( x _ 1 , \ldots , x _ m ) \, d x ^ 1 \cdots d x ^ m . \end{split}\] for every permutation $ \sigma $ of $ 1 , \ldots , m $.
The proof is fairly simple. Both integrals in the equation above are defined by a limit of integrals over products of closed intervals $ J _ 1 \times \cdots \times J _ m $ where $ J _ j = [ \alpha _ j , \beta _ j ] $ with $ a _ j < \alpha _ j < \beta _ j < b _ j $, the limit being taken as $ \alpha _ j \to a _ j ^ + $ and $ \beta _ j \to b _ j ^ - $, so what we need to do is to show that these limits for the two integrals exist and are equal. We do this by noting that the integral of the absolute value, defined by a similar limit of integrals, is assumed to be convergent and hence Cauchy. Using the fact that the absolute value of an integral is less than or equal to the integral of the absolute values we can then show that the other two limits are also Cauchy and hence convergent. Theorem 3.5.A applies to the integrals over $ J _ 1 \times \cdots \times J _ m $ so those are independent of the order of integration and therefore the same applies to their limit.

If you remember the proof that the limit of an absolutely convergent double sum is independent of the order of summation you may notice that it follows exactly the same lines, just with finite sums in place of integrals over finite intervals.

There was no assumption, either in the statement of the theorem above or in its proof, that $ a _ j $ or $ b _ j $ is finite. The theorem applies to finite intervals, semi-infinite intervals, infinite intervals, or any combination of them.

The same approach, using an integrable upper bound to write an integral over a product of intervals as an integral over a product of closed intervals plus a small error, and then applying the corresponding theorem for products of closed intervals, also gives analogues of the other two theorems.

Theorem 3.5.E Suppose that $ f $ is continuous on the product of intervals \[\tag{3.5.22} R = I _ 1 \times \cdots \times I _ m \times J _ 1 \times \cdots \times J _ n\] in $ \mathbf R ^ { m + n } $ and let $ c _ k = \inf J _ k $ and $ d _ k = \sup J _ k $. Suppose also that there is a non-negative continuous $ h $ on $ J _ 1 \times \cdots \times J _ n $ such that \[\tag{3.5.23} \left | f ( x _ 1 , \ldots , x _ m , y _ 1 , \ldots , y _ n ) \right | \le h ( y _ 1 , \ldots , y _ n )\] and \[\tag{3.5.24} \int _ { c _ n } ^ { d _ n } \cdots \int _ { c _ 1 } ^ { d _ 1 } h ( y _ 1 , \ldots , y _ n ) \, d y _ 1 \cdots d y _ n < \infty .\] Then \[\tag{3.5.25} g ( x _ 1 , \ldots , x _ m ) = \int _ { c _ n } ^ { d _ n } \cdots \int _ { c _ 1 } ^ { d _ 1 } f ( x _ 1 , \ldots , x _ m , y _ 1 , \ldots , y _ n ) \, d y _ 1 \cdots d y _ n\] is continuous on $ I _ 1 \times \cdots \times I _ m $.
Theorem 3.5.F Suppose that $ f $ is continuously differentiable on the product of intervals \[\tag{3.5.26} R = I _ 1 \times \cdots \times I _ m \times J _ 1 \times \cdots \times J _ n\] in $ \mathbf R ^ { m + n } $ and let $ c _ k = \inf J _ k $ and $ d _ k = \sup J _ k $. Suppose also that there are non-negative continuous functions $ h _ j $ on $ J _ 1 \times \cdots \times J _ n $ such that \[\tag{3.5.27} \left | \frac { \partial f } { \partial x _ j } ( x _ 1 , \ldots , x _ m , y _ 1 , \ldots , y _ n ) \right | \le h _ j ( y _ 1 , \ldots , y _ n )\] and \[\tag{3.5.28} \int _ { c _ n } ^ { d _ n } \cdots \int _ { c _ 1 } ^ { d _ 1 } h _ j ( y _ 1 , \ldots , y _ n ) \, d y _ 1 \cdots d y _ n < \infty .\] Then \[\tag{3.5.29} g ( x _ 1 , \ldots , x _ m ) = \int _ { c _ n } ^ { d _ n } \cdots \int _ { c _ 1 } ^ { d _ 1 } f ( x _ 1 , \ldots , x _ m , y _ 1 , \ldots , y _ n ) \, d y _ 1 \cdots d y _ n\] is continuously differentiable on $ I _ 1 \times \cdots \times I _ m $ and its partial derivatives can be obtained by formally exchanging the partial derivative and integral.

Once we have these theorems it's easy to see that Theorem 3.5.E is exactly what we need to justify the exchange of limits and integrals in the proof of the integral representation above. There we had an integral of the form \[\tag{3.5.30} \int _ { - \infty } ^ { + \infty } \exp ( - \pi y ^ 2 ) u \left ( x _ 3 + y \sqrt { 4 \pi k ( t _ 3 - t _ 2 ) } \right ) \, d y\] and we wanted to take the limit as $ t _ 3 $ tended to $ t _ 2 $ from above. The theorem says we can do that if we can find an integrable non-negative function $ h $ such that \[\tag{3.5.31} \exp ( - \pi y ^ 2 ) \left | u \left ( x _ 3 + y \sqrt { 4 \pi k ( t _ 3 - t _ 2 ) } \right ) \right | \le h ( y )\] for all $ y $ and it's clear that $ h ( y ) = M \exp ( - \pi y ^ 2 ) $ works, where $ M $ is a uniform bound on $ | u | $, which we assumed exists.

We can also use Theorem 3.5.F to show that the required derivatives of $ u $ exist and are continuous for $ t > s $. Formal differentiation of the solution formula gives \[\tag{3.5.32} \frac { \partial u } { \partial t } ( t , x ) = \int _ { - \infty } ^ { + \infty } \frac { \partial } { \partial t } \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) f ( y ) \, d y\] and \[\tag{3.5.33} \frac { \partial u } { \partial x } ( t , x ) = \int _ { - \infty } ^ { + \infty } \frac { \partial } { \partial x } \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) f ( y ) \, d y .\] The partial derivatives in question were computed earlier. For example, \[\tag{3.5.34} \frac { \partial } { \partial x } \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) = \frac { y - x } { \sqrt { 16 \pi k ^ 3 ( t - s ) ^ 3 } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) .\] Differentiating, we see that as a function of $ t $ for fixed values of the other variables the absolute value of the right hand side increases from zero to a maximum of \[\tag{3.5.35} \sqrt { \frac { 27 } { 2 \pi } } \frac { \exp ( - 3 / 2 ) } { ( y - x ) ^ 2 }\] at \[\tag{3.5.36} t = s + \frac { ( y - x ) ^ 2 } { 6 k }\] and then decreases to zero again. If we restrict our attention to $ t \in [ t _ 1 , t _ 2 ] $ for some $ t _ 2 > t _ 1 > s $ then there are three different cases, depending on the size of $ | x - y | $. For small values of $ | x - y | $, specifically when \[\tag{3.5.37} | x - y | \le \sqrt { 6 k ( t _ 1 - s ) }\] the maximum occurs when $ t = t _ 1 $, so we have \[\tag{3.5.38} \left | \frac { \partial } { \partial x } \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) \right | \le \frac { | x - y | } { \sqrt { 16 \pi k ^ 3 ( t _ 1 - s ) ^ 3 } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t _ 1 - s ) } \right ) .\] For large values of $ | x - y | $, specifically when \[\tag{3.5.39} | x - y | \ge \sqrt { 6 k ( t _ 2 - s ) }\] the maximum occurs when $ t = t _ 2 $, so we have \[\tag{3.5.40} \left | \frac { \partial } { \partial x } \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) \right | \le \frac { | x - y | } { \sqrt { 16 \pi k ^ 3 ( t _ 2 - s ) ^ 3 } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t _ 2 - s ) } \right ) .\] For values in between those ranges the maximum occurs inside the interval $ ( t _ 1 , t _ 2 ) $ and so we have \[\tag{3.5.41} \left | \frac { \partial } { \partial x } \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) \right | \le \sqrt { \frac { 27 } { 2 \pi } } \frac { \exp ( - 3 / 2 ) } { ( y - x ) ^ 2 } ,\] and on that middle range $ ( y - x ) ^ { - 2 } \le 1 / ( 6 k ( t _ 1 - s ) ) $, so this bound is finite. The bounds above on the absolute value of the partial derivative are ugly, but they combine to give an integrable function of $ y $ because of the exponential decay at infinity. It remains integrable after we multiply by $ M $, a uniform bound for $ | f | $, which we've assumed exists, so differentiation under the integral sign is justified by Theorem 3.5.F. Strictly speaking Theorem 3.5.F asks for a bound depending only on $ y $; restricting $ x $ to a bounded interval and taking the supremum of the bounds above over such $ x $ gives one, still integrable in $ y $, and, as with $ t $, this is enough because every point has a neighbourhood of this form. It may seem that we've only proved differentiability for $ t $ in the interval $ ( t _ 1 , t _ 2 ) $ but for any $ t > s $ we can choose $ t _ 1 $ and $ t _ 2 $ such that $ t _ 2 > t > t _ 1 > s $ so in fact we've proved it for all $ t > s $.
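The location and size of this maximum are easy to confirm numerically. The following short sketch is an illustration only, not part of the argument; the values of $ k $, $ s $, $ x $ and $ y $ are arbitrary.

```python
# Illustrative check of the maximum of the absolute value of (3.5.34) over t.
import numpy as np

k, s, x, y = 0.7, 0.0, 0.0, 2.0

def dKdx(t):
    # absolute value of (3.5.34)
    return np.abs(y - x) / np.sqrt(16*np.pi*k**3*(t - s)**3) \
        * np.exp(-(y - x)**2 / (4*k*(t - s)))

t = s + np.linspace(1e-3, 20.0, 2_000_000)
i = np.argmax(dKdx(t))
print(t[i], s + (y - x)**2 / (6*k))                              # nearly equal
print(dKdx(t[i]), np.sqrt(27/(2*np.pi)) * np.exp(-1.5) / (y - x)**2)
```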

An argument similar to the one we just applied to the $ x $ derivative applies to the $ t $ derivative as well. Furthermore we can apply our argument for the $ x $ derivative to $ \partial u / \partial x $ instead of $ u $ to get the second derivative $ \partial ^ 2 u / \partial x ^ 2 $, which is seen to be equal to the result of formally differentiating under the integral in the solution formula. Combining this with the result already obtained for $ \partial u / \partial t $ we see that we can apply the differential operator \[\tag{3.5.42} \frac { \partial } { \partial t } - k \frac { \partial ^ 2 } { \partial x ^ 2 }\] to the integral \[\tag{3.5.43} \int _ { - \infty } ^ { + \infty } \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) f ( y ) \, d y\] by formally bringing it inside the integral, where it will just hit the fundamental solution since the $ f $ factor is constant as far as the integration is concerned. Since the fundamental solution is a solution we find that our solution formula does indeed give a solution. We suspected this to be the case, and wouldn't have gone to all of this effort in analysing it if we didn't, but we didn't have any certainty until now. It's worth expressing this as a theorem.

Theorem 3.5.G Suppose $ f $ is a bounded continuous function on $ \mathbf R $ and \[\tag{3.5.44} u ( t , x ) = \int _ { - \infty } ^ { + \infty } \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) f ( y ) \, d y .\] Then $ u $ is continuously differentiable in $ t $ and twice continuously differentiable in $ x $ for $ t > s $ and is a solution of the diffusion equation there.

We don't need to stop after taking one $ t $ derivative or two space derivatives. We can take any number of derivatives of either type. The result will be the same as taking those derivatives inside the integral where they will hit the fundamental solution.

Theorem 3.5.H Suppose $ f $ is a bounded continuous function on $ \mathbf R $ and \[\tag{3.5.45} u ( t , x ) = \int _ { - \infty } ^ { + \infty } \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) f ( y ) \, d y .\] Then $ u $ is infinitely differentiable in $ t $ and $ x $ for $ t > s $.

We've only ever considered solving the diffusion equation forward in time. This was largely motivated by applications but the preceding theorem shows that there are deeper reasons why we shouldn't try to solve the diffusion equation backward in time. If for some initial data $ f $ we can solve the diffusion equation backwards from $ t = s $ to some $ t = r $ then solving the initial value problem forward from $ t = r $ with initial data $ u ( r , x ) $ would give us back $ f $ and by the preceding theorem this $ f $ would be infinitely differentiable. Most functions, even most twice continuously differentiable functions, are not infinitely differentiable. The function $ \rho $ we met earlier is an example of a function which is twice continuously differentiable but not three times differentiable. By what we've just shown the backwards initial value problem for this function, or any other function which is not infinitely differentiable, cannot have a solution. A more careful argument would show that even among infinitely differentiable functions the ones which can be evolved backwards in time by the diffusion equation are highly unusual.

Section 3.6 Existence

We've now filled the gap in our earlier uniqueness proof and partially proved existence of solutions to the initial value problem. More precisely, we've proved that the equation \[\tag{3.6.1} u ( t , x ) = \int _ { - \infty } ^ { + \infty } \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) f ( y ) \, d y\] gives a solution to the diffusion equation for $ t > s $. We haven't shown that it satisfies the initial conditions though.

Normally the easiest part of proving that an explicit solution to an initial value problem is valid is checking the initial conditions, but if we do this in the naive way here we are in for a shock. First of all, the factor $ \sqrt { 4 \pi k ( t - s ) } $ in the denominator means the integrand can't even be evaluated at $ t = s $. Taking limits doesn't help much. \[\tag{3.6.2} \lim _ { t \to s ^ + } \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) = 0\] so if we just exchange the limit and integral we appear to get the integral of zero, which is zero. If we're slightly more careful then we might note that the limit above only works when $ x \neq y $, but changing the integrand at a single point has no effect on the integral, so we appear to have a problem.

When you first see theorems about interchanging limits and integrals it's easy to get the impression that formal calculations generally give correct results and that proving the correctness of those results is simply a matter of selecting an appropriate theorem to justify the calculation. Here we see a practical example where the formal calculation definitely gives the wrong result. We can verify that it's wrong by considering, for example, the constant initial data $ f ( y ) = 1 $, for which the solution formula correctly gives us the solution $ u ( t , x ) = 1 $, while the argument above would incorrectly give $ \lim _ { t \to s ^ + } u ( t , x ) = 0 $. So what we need to do is not to find a convergence theorem to justify the formal calculation above, because there can be no such theorem, but rather to find a different formal calculation, giving a different result, and find a convergence theorem to justify that calculation.

We've actually seen a variant of the formal argument we require once before. The way we got the left hand side of our solution formula was to perform a change of variable before taking a limit. We do the same thing here, with a very similar change of variable, \[\tag{3.6.3} z = \frac { y - x } { \sqrt { 4 \pi k ( t - s ) } } .\] This gives \[\tag{3.6.4} u ( t , x ) = \int _ { - \infty } ^ { + \infty } \exp ( - \pi z ^ 2 ) f ( x + z \sqrt { 4 \pi k ( t - s ) } ) \, d z .\] This now does give the correct value when $ t = s $. Furthermore, Theorem 3.5.E shows that this function is continuous for all $ t \ge s $. This is more important than it might seem. Indeed if the goal were simply to find a function which matches the initial data at $ t = s $ and solves the diffusion equation for $ t > s $ then we could simply have chosen the function which is equal to $ f $ for $ t = s $ and $ 0 $ for $ t > s $. Continuity is the condition which prevents us from doing this and forces us to find a solution whose values for $ t > s $ are related to the values at $ t = s $.
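This behaviour is easy to observe numerically from the substituted form of the formula. The sketch below is an illustration only; the data $ f $, the constants and the evaluation point are arbitrary choices.

```python
# Illustrative check that (3.6.4) approaches the initial data as t -> s+.
import numpy as np
from scipy.integrate import quad

k, s, x = 0.7, 0.0, 0.3
f = lambda y: np.arctan(y)               # bounded continuous data

def u(t, x):
    def g(z):
        return np.exp(-np.pi*z*z) * f(x + z*np.sqrt(4*np.pi*k*(t - s)))
    return quad(g, -np.inf, np.inf)[0]

for t in [1.0, 0.1, 0.01, 0.001]:
    print(t, u(t, x), f(x))              # u(t, x) approaches f(x)
```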

We have an unfortunate mismatch between our uniqueness results and our existence results. In proving uniqueness we assumed that $ u $, $ \partial u / \partial t $, $ \partial u / \partial x $ and $ \partial ^ 2 u / \partial x ^ 2 $ are all continuous in the region $ t \ge s $. We now have existence of a solution for which $ u $ is continuous for $ t \ge s $ but its various partial derivatives are only known to be continuous, or indeed to exist, for $ t > s $. We'd like the differentiability conditions in our existence and uniqueness theorems to be the same. To accomplish this we need to weaken the hypotheses of our uniqueness theorem or strengthen the conclusion of our existence theorem. In fact both of these are possible.

The option of weakening the hypotheses in the uniqueness theorem turns out to be both easier and more useful in applications. We had two proofs of the uniqueness theorem. The first one was relatively simple but didn't give us an explicit solution formula. Supposing there were two solutions with the same initial data, we looked at their difference, which is still a solution, by linearity, and has zero initial data. Our non-negativity result, Theorem 3.3.A, then shows that the difference remains non-negative. By taking the difference in the opposite order we see that it also remains non-positive, so it's zero everywhere. In other words any two solutions with the same initial data are the same solution. This was for classical solutions, but if we can prove the following variant then we can use it to get a uniqueness theorem whose differentiability assumptions match those of our existence theorem.

Theorem 3.6.A Suppose $ u $ is a bounded continuous function on $ [ s , + \infty ) \times \mathbf R $ and on $ ( s , + \infty ) \times \mathbf R $ it is continuously differentiable in $ t $ and twice continuously differentiable in $ x $ and satisfies the diffusion equation there. Suppose also \[\tag{3.6.5} u ( s , x ) = f ( x )\] for all $ x $, where $ f $ is a non-negative continuous function. Then $ u $ is non-negative.
In fact a careful examination of the proof given for Theorem 3.3.A shows that the only places where we used the differentiability of $ u $ or the diffusion equation were in the region $ t > s $, so exactly the same proof gives Theorem 3.6.A.

As explained earlier, from Theorem 3.6.A we get an analogue of Theorem 3.4.A.

Theorem 3.6.B There is at most one bounded continuous function $ u $ on $ [ s , + \infty ) \times \mathbf R $ which on $ ( s , + \infty ) \times \mathbf R $ is continuously differentiable in $ t $ and twice continuously differentiable in $ x $ and satisfies the diffusion equation there, and which satisfies the initial condition \[\tag{3.6.6} u ( s , x ) = f ( x )\] for all $ x $.

As before, this only gives uniqueness, not a solution formula, but since we already have an existence theorem under the same conditions which does feature an explicit solution formula it follows from the theorem above that any solution must be given by that formula.

We won't pursue the other option here, namely showing that when $ f $ is twice continuously differentiable the solution formula gives a classical solution for $ t \ge s $, but I will give a quick sketch of the argument. One needs to start from the alternate form of the solution formula, \[\tag{3.6.7} u ( t , x ) = \int _ { - \infty } ^ { + \infty } \exp ( - \pi z ^ 2 ) f ( x + z \sqrt { 4 \pi k ( t - s ) } ) \, d z .\] If $ f ' $ and $ f '' $ are bounded then our theorem on differentiation under the integral sign shows that $ u $ is twice continuously differentiable in $ x $ and that the derivatives are obtained by formal differentiation under the integral sign. This doesn't quite work for the $ t $ derivative but there is a variant of our theorem on differentiation under the integral which does work. We still have the problem though that this requires $ f ' $ and $ f '' $ to be bounded, while we've only assumed that $ f $ itself is bounded. We met this problem once before, when deriving the solution formula from Green's theorem, and the solution here is similar. We need to multiply our initial data by $ \rho ( x / L ) $, where $ \rho $ is the function defined there. The new function will have derivatives which are non-zero only in the interval $ [ - L , L ] $ and continuous functions on a closed interval are always bounded so the argument described above applies to the modified initial data. Of course we want a solution with the original initial data. The modified initial data agree with the original initial data in the interval $ ( - L / 2 , L / 2 ) $ though. Using the usual form of the solution formula we can show that when the initial data is zero in an open interval the solution is a classical one for $ t \ge s $, not just $ t > s $, and $ x $ in that interval. Combining this with what we already have gives the improved existence theorem.

Section 3.7 Boundary Value Problems

The same method we used for the wave equation, the method of reflection, can be used to treat boundary value problems for the diffusion equation, provided the boundary conditions are of Dirichlet or Neumann type. Suppose, for example, that we are given data $ f $ which are continuous on the closed interval $ [ a , b ] $ and are looking for a solution to the initial value problem in the region $ [ s , \infty ) \times [ a , b ] $ satisfying a Dirichlet condition at the left endpoint and a Neumann condition at the right endpoint, \[\tag{3.7.1} u ( t , a ) = 0 , \quad \frac { \partial u } { \partial x } ( t , b ) = 0 ,\] just as we did for the wave equation. We'll need $ f ( a ) = 0 $ in order to have a chance of solving this equation. As long as we're looking for a solution which is merely continuous for $ t \ge s $ and not one which is continuously differentiable in $ t $ and twice continuously differentiable in $ x $ there we don't need to impose the conditions $ f ' ( b ) = 0 $ or $ f '' ( a ) = 0 $ which we imposed for the wave equation, and indeed it wouldn't make sense to impose them since we're merely assuming that $ f $ is continuous.

We can extend $ f $ to all of $ \mathbf R $ in the same way as we did for the wave equation, namely \[\tag{3.7.2} f ( x ) = \begin{cases} f ( a + ( b - a ) r ( x ) ) & \mbox{ if ${ l ( x ) = 0 }$,} \\ f ( b - ( b - a ) r ( x ) ) & \mbox{ if ${ l ( x ) = 1 }$,} \\ - f ( a + ( b - a ) r ( x ) ) & \mbox{ if ${ l ( x ) = 2 }$,} \\ - f ( b - ( b - a ) r ( x ) ) & \mbox{ if ${ l ( x ) = 3 }$,} \\ \end{cases}\] where \[\tag{3.7.3} \frac { x - a } { b - a } = 4 m ( x ) + l ( x ) + r ( x )\] with $ r ( x ) \in [ 0 , 1 ) $ and $ m ( x ) $ and $ l ( x ) $ integers such that $ 0 \le l ( x ) < 4 $. We then define \[\tag{3.7.4} u ( t , x ) = \int _ { - \infty } ^ { + \infty } \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) f ( y ) \, d y ,\] where the $ f $ in the integrand is this extended $ f $. To prevent subsequent equations from getting very messy we will write this in terms of the fundamental solution: \[\tag{3.7.5} u ( t , x ) = \int _ { - \infty } ^ { + \infty } K ( t - s , x - y ) f ( y ) \, d y .\] The argument that this is a solution to the initial value problem with the given boundary conditions, and is the only solution, is essentially the same as for the wave equation.
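As an illustration of the extension (3.7.2)-(3.7.3) and the resulting formula, here is a short Python sketch. The interval, the diffusivity and the data are arbitrary choices, except that the data satisfies $ f ( a ) = 0 $ as required; the final two lines check the Dirichlet condition at $ a $ and the Neumann condition at $ b $ numerically.

```python
# Illustrative implementation of the reflected extension and of (3.7.4).
import numpy as np
from scipy.integrate import quad

a, b, k, s = 0.0, 1.0, 0.7, 0.0
f0 = lambda x: np.sin(0.5*np.pi*(x - a)/(b - a))     # original data, f0(a) = 0

def f_ext(x):
    # decompose (x - a)/(b - a) = 4 m + l + r as in (3.7.3)
    q, r = divmod((x - a)/(b - a), 1.0)
    l = int(q) % 4
    if l == 0:
        return f0(a + (b - a)*r)
    if l == 1:
        return f0(b - (b - a)*r)
    if l == 2:
        return -f0(a + (b - a)*r)
    return -f0(b - (b - a)*r)

def u(t, x):
    def g(y):
        K = np.exp(-(y - x)**2/(4*k*(t - s))) / np.sqrt(4*np.pi*k*(t - s))
        return K * f_ext(y)
    return quad(g, -np.inf, np.inf, limit=200)[0]

t, h = 0.05, 1e-4
print(u(t, a))                                # ~ 0, the Dirichlet condition at a
print((u(t, b + h) - u(t, b - h)) / (2*h))    # ~ 0, the Neumann condition at b
```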

It's also possible to write the solution in terms of the original, unextended, $ f $ as follows. We first split the integral into pieces: \[\tag{3.7.6} u ( t , x ) = \sum _ { l = 0 } ^ 3 \sum _ { m = - \infty } ^ { + \infty } \int _ { a + ( 4 m + l ) ( b - a ) } ^ { b + ( 4 m + l ) ( b - a ) } K ( t - s , x - y ) f ( y ) \, d y ,\] or, after a linear change of variable, \[\tag{3.7.7} u ( t , x ) = \sum _ { l = 0 } ^ 3 \sum _ { m = - \infty } ^ { + \infty } \int _ { a } ^ { b } K ( t - s , x - y - ( 4 m + l ) ( b - a ) ) f ( y + ( 4 m + l ) ( b - a ) ) \, d y .\] Now \[\tag{3.7.8} f ( y + ( 4 m + l ) ( b - a ) ) = \begin{cases} f ( y ) & \mbox { if ${ l = 0 }$,} \\ f ( a + b - y ) & \mbox { if ${ l = 1 }$,} \\ - f ( y ) & \mbox { if ${ l = 2 }$,} \\ - f ( a + b - y ) & \mbox { if ${ l = 3 }$.} \end{cases}\] We make a further change of variable in the odd cases, replacing $ y $ by $ a + b - y $, obtaining \[\tag{3.7.9} \begin{split} u ( t , x ) & = \sum _ { m = - \infty } ^ { + \infty } \int _ a ^ b K ( t - s , x - y - 4 m ( b - a ) ) f ( y ) \, d y \\ & \quad{} + \sum _ { m = - \infty } ^ { + \infty } \int _ a ^ b K ( t - s , x + y - 2 a - ( 4 m + 2 ) ( b - a ) ) f ( y ) \, d y \\ & \quad {} - \sum _ { m = - \infty } ^ { + \infty } \int _ a ^ b K ( t - s , x - y - ( 4 m + 2 ) ( b - a ) ) f ( y ) \, d y \\ & \quad {} - \sum _ { m = - \infty } ^ { + \infty } \int _ a ^ b K ( t - s , x + y - 2 a - ( 4 m + 4 ) ( b - a ) ) f ( y ) \, d y . \end{split}\] Here the $ f $'s in the integrands all refer to the original, unextended, $ f $.

Section 3.8 Conservation and Monotonicity

The diffusion equation has a conservation law which applies when either there is no boundary or the boundary conditions are all Neumann conditions. First we consider the case without boundary. Suppose that \[\tag{3.8.1} \int _ { - \infty } ^ { + \infty } | f ( y ) | \, d y < \infty .\] Then \[\tag{3.8.2} \begin{split} \int _ { - \infty } ^ { + \infty } u ( t , x ) \, d x & = \int _ { - \infty } ^ { + \infty } \int _ { - \infty } ^ { + \infty } K ( t - s , x - y ) f ( y ) \, d y \, d x \\ & = \int _ { - \infty } ^ { + \infty } \int _ { - \infty } ^ { + \infty } K ( t - s , x - y ) f ( y ) \, d x \, d y \\ & = \int _ { - \infty } ^ { + \infty } f ( y ) \int _ { - \infty } ^ { + \infty } K ( t - s , x - y ) \, d x \, d y \\ & = \int _ { - \infty } ^ { + \infty } f ( y ) \, d y . \end{split}\] The interchange of the two integrals is justified by Theorem 3.5.D. So the quantity \[\tag{3.8.3} \int _ { - \infty } ^ { + \infty } u ( t , x ) \, d x\] is independent of $ t $.
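A numerical illustration of this conservation law, with an arbitrary integrable and bounded choice of data and an arbitrary diffusivity, might look as follows; the three printed numbers should agree up to quadrature error.

```python
# Illustrative check that the integral of u over the line does not change with t.
import numpy as np
from scipy.integrate import quad

k, s = 0.7, 0.0
f = lambda y: np.exp(-np.abs(y)) * (1 + np.cos(3*y))

def u(t, x):
    def g(y):
        K = np.exp(-(y - x)**2/(4*k*(t - s))) / np.sqrt(4*np.pi*k*(t - s))
        return K * f(y)
    return quad(g, -np.inf, np.inf)[0]

mass = lambda t: quad(lambda x: u(t, x), -np.inf, np.inf, limit=200)[0]
print(quad(f, -np.inf, np.inf)[0], mass(0.5), mass(2.0))   # all roughly equal
```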

Next we consider the case of a finite interval $ [ a , b ] $ with Neumann conditions at both endpoints. We apply Green's theorem with \[\tag{3.8.4} p = - u , \quad q = - k \frac { \partial u } { \partial x } .\] For the region we take the rectangle $ R = [ t _ 1 , t _ 2 ] \times [ a , b ] $, where $ t _ 2 > t _ 1 > s $. Green's theorem gives \[\tag{3.8.5} \sum _ { j = 1 } ^ 4 \int _ { C _ j } ( p ( t , x ) \, d x + q ( t , x ) \, d t ) = \int _ R \frac { \partial q } { \partial x } - \frac { \partial p } { \partial t } \, d A .\] The curves $ C _ 1 $, $ C _ 2 $, $ C _ 3 $ and $ C _ 4 $ will be the line segments from $ ( t _ 1 , a ) $ to $ ( t _ 1 , b ) $, $ ( t _ 1 , b ) $ to $ ( t _ 2 , b ) $, $ ( t _ 2 , b ) $ to $ ( t _ 2 , a ) $, and $ ( t _ 2 , a ) $ to $ ( t _ 1 , a ) $, respectively. The integrals along $ C _ 2 $ and $ C _ 4 $ are zero because of the Neumann condition. The right hand side will be zero for solutions of the diffusion equation. What we are left with is \[\tag{3.8.6} \int _ a ^ b u ( t _ 2 , x ) \, d x - \int _ a ^ b u ( t _ 1 , x ) \, d x = 0 .\] So \[\tag{3.8.7} \int _ a ^ b u ( t , x ) \, d x\] is independent of $ t $. This argument doesn't work for $ t = s $ because Green's theorem requires $ p $ and $ q $ to be continuously differentiable in all of $ R $ but we can take the limit as $ t _ 1 $ tends to $ s $ from above since the hypotheses of Theorem 3.5.B are satisfied.

In the original application of the diffusion equation to heat conduction Dirichlet boundary conditions correspond to conducting boundaries and Neumann boundary conditions correspond to insulating boundary conditions. The integral of $ u $ corresponds to the total energy. The theorem we've just proved is then conservation of energy for a thermally isolated system. We can't expect the theorem to apply with Dirichlet boundary conditions because energy can enter or leave the system through the conducting boundary.

Another important physical quantity in the original physical application is entropy, given by the integral \[\tag{3.8.8} - \int u ( t , x ) \log u ( t , x ) \, d x .\] Of course this only makes sense if $ u $ is positive, which it always is in the study of heat conduction. We've already seen that if the initial data are positive then the solution will remain positive forever. Unlike energy, we don't expect entropy to be conserved. The second law of thermodynamics says that it should be increasing, or at least non-decreasing. Again, we expect this only for isolated systems, so it should hold either when there is no boundary or when all boundary conditions are of Neumann type.

This time we'll treat the case of a finite interval first. Taking $ R $ as before we set \[\tag{3.8.9} p = u \log u , \quad q = k \frac { \partial u } { \partial x } ( 1 + \log u ) .\] Then \[\tag{3.8.10} \frac { \partial q } { \partial x } - \frac { \partial p } { \partial t } = \frac { k \left ( \partial u / \partial x \right ) ^ 2 } u - ( 1 + \log u ) \left ( \frac { \partial u } { \partial t } - k \frac { \partial ^ 2 u } { \partial x ^ 2 } \right ) .\] The second term on the right hand side is zero for solutions of the diffusion equation. As with our proof of energy conservation, the integrals over $ C _ 2 $ and $ C _ 4 $ vanish because of the Neumann boundary condition, since each contains a factor of $ \partial u / \partial x $, and we are left with \[\tag{3.8.11} \int _ a ^ b u ( t _ 1 , x ) \log u ( t _ 1 , x ) \, d x - \int _ a ^ b u ( t _ 2 , x ) \log u ( t _ 2 , x ) \, d x = \int _ R \frac { k \left ( \partial u / \partial x \right ) ^ 2 } u \, d A .\] Since $ u $ is positive the integrand on the right hand side is non-negative everywhere and so the integral is non-negative. It follows that \[\tag{3.8.12} - \int _ a ^ b u ( t _ 2 , x ) \log u ( t _ 2 , x ) \, d x \ge - \int _ a ^ b u ( t _ 1 , x ) \log u ( t _ 1 , x ) \, d x ,\] i.e. the entropy at time $ t _ 2 $ is at least the entropy at time $ t _ 1 $. As before, the use of Green's theorem presupposes $ t _ 2 > t _ 1 > s $ but we can use continuity to get the same result for $ t _ 2 > t _ 1 \ge s $.

The argument for the case without boundaries is more subtle. Let \[\tag{3.8.13} \varphi ( z ) = z \log z\] and \[\tag{3.8.14} w ( t , x , y ) = \varphi ( f ( y ) ) - \varphi ( u ( t , x ) ) - \varphi ' ( u ( t , x ) ) ( f ( y ) - u ( t , x ) ) .\] Multiplying by $ K ( t - s , x - y ) $ and integrating with respect to both $ x $ and $ y $ we get \[\tag{3.8.15} \begin{split} & \int _ { - \infty } ^ { + \infty } \int _ { - \infty } ^ { + \infty } K ( t - s , x - y ) w ( t , x , y ) \, d x \, d y \\ & \quad{} = \int _ { - \infty } ^ { + \infty } \int _ { - \infty } ^ { + \infty } K ( t - s , x - y ) \varphi ( f ( y ) ) \, d x \, d y \\ & \qquad {} - \int _ { - \infty } ^ { + \infty } \int _ { - \infty } ^ { + \infty } K ( t - s , x - y ) \varphi ( u ( t , x ) ) \, d y \, d x \\ & \qquad {} - \int _ { - \infty } ^ { + \infty } \int _ { - \infty } ^ { + \infty } K ( t - s , x - y ) \varphi ' ( u ( t , x ) ) f ( y ) \, d y \, d x \\ & \qquad {} + \int _ { - \infty } ^ { + \infty } \int _ { - \infty } ^ { + \infty } K ( t - s , x - y ) \varphi ' ( u ( t , x ) ) u ( t , x ) \, d y \, d x . \end{split}\] Here we've used Theorem 3.5.D to change the order of integration in some, but not all, cases. Performing the inner integration in each of the integrals on the right we have \[\tag{3.8.16} \begin{split} \int _ { - \infty } ^ { + \infty } \int _ { - \infty } ^ { + \infty } K ( t - s , x - y ) w ( t , x , y ) \, d x \, d y & = \int _ { - \infty } ^ { + \infty } \varphi ( f ( y ) ) \, d y \\ & \quad {} - \int _ { - \infty } ^ { + \infty } \varphi ( u ( t , x ) ) \, d x \\ & \quad {} - \int _ { - \infty } ^ { + \infty } \varphi ' ( u ( t , x ) ) u ( t , x ) \, d x \\ & \quad {} + \int _ { - \infty } ^ { + \infty } \varphi ' ( u ( t , x ) ) u ( t , x ) \, d x . \end{split}\] The last two cancel so we are left with \[\tag{3.8.17} \begin{split} \int _ { - \infty } ^ { + \infty } \int _ { - \infty } ^ { + \infty } K ( t - s , x - y ) w ( t , x , y ) \, d x \, d y & = \int _ { - \infty } ^ { + \infty } \varphi ( f ( y ) ) \, d y \\ & \quad {} - \int _ { - \infty } ^ { + \infty } \varphi ( u ( t , x ) ) \, d x . \end{split}\] The fundamental solution is positive everywhere and $ w ( t , x , y ) $ is non-negative everywhere as a result of the convexity of $ \varphi $ so the integrand on the left hand side is non-negative and therefore so is its integral. It follows that \[\tag{3.8.18} \int _ { - \infty } ^ { + \infty } \varphi ( u ( t , x ) ) \, d x \le \int _ { - \infty } ^ { + \infty } \varphi ( f ( y ) ) \, d y\] or, equivalently, \[\tag{3.8.19} \int _ { - \infty } ^ { + \infty } \varphi ( u ( t , x ) ) \, d x \le \int _ { - \infty } ^ { + \infty } \varphi ( u ( s , y ) ) \, d y .\] In other words, the entropy at later times is always greater than or equal to the initial entropy. Of course if we have $ t _ 2 > t _ 1 \ge s $ then we can just view the solution restricted to $ [ t _ 1 , \infty ) $ as the solution to an initial value problem with data prescribed at time $ t _ 1 $, so we find that the entropy at time $ t _ 2 $ is greater than or equal to the entropy at time $ t _ 1 $, so entropy is non-decreasing.

We can sharpen the result above by noting that $ \varphi $ is strictly convex, so $ w ( t , x , y ) $ is positive except when $ u ( t , x ) = f ( y ) $. Both $ w $ and $ K $ are continuous, so if $ K w $ is positive anywhere then it's positive on an open set and so the integral is positive. It follows that the entropy is strictly increasing unless $ u ( t , x ) = f ( y ) $ for all $ t $, $ x $ and $ y $, which happens only if $ u $ is constant.

The argument we used for entropy applies to prove monotonicity of other interesting integrals. If $ \varphi $ is convex and differentiable then \[\tag{3.8.20} \int \varphi ( u ( t , x ) ) \, d x\] is a non-increasing function of $ t $, and strictly decreasing if $ \varphi $ is strictly convex and $ u $ is not constant. This applies, for example, to \[\tag{3.8.21} \int u ( t , x ) ^ 2 \, d x\] or, more generally, to \[\tag{3.8.22} \int | u ( t , x ) | ^ p \, d x\] for $ p > 1 $. A slight variant of the argument applies when $ \varphi $ is convex but not necessarily differentiable, and so includes the case $ p = 1 $ above.
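Here is a small illustrative check of this monotonicity for $ \varphi ( u ) = u ^ 2 $, with an arbitrary choice of data and diffusivity.

```python
# Illustrative check that the integral of u^2 does not increase with t.
import numpy as np
from scipy.integrate import quad

k, s = 0.7, 0.0
f = lambda y: np.exp(-y*y) * (2 + np.sin(4*y))

def u(t, x):
    def g(y):
        K = np.exp(-(y - x)**2/(4*k*(t - s))) / np.sqrt(4*np.pi*k*(t - s))
        return K * f(y)
    return quad(g, -np.inf, np.inf)[0]

phi_int = lambda t: quad(lambda x: u(t, x)**2, -np.inf, np.inf, limit=200)[0]
print(quad(lambda y: f(y)**2, -np.inf, np.inf)[0])            # the initial value
print([round(phi_int(t), 4) for t in (0.1, 0.5, 1.0, 2.0)])   # non-increasing
```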

Section 3.9 Black-Scholes Equation

We mentioned the Black-Scholes equation, (1.0.8), in the introduction. As a reminder, it was \[\tag{3.9.1} \frac { \partial v } { \partial \tau } + \frac 1 2 \sigma ^ 2 s ^ 2 \frac { \partial ^ 2 v } { \partial s ^ 2 } + r s \frac { \partial v } { \partial s } - r v = 0 .\] It describes the evolution of the value of a derivative, although neither value nor derivative means what it usually does in mathematics. Value means what you would expect in a financial context: the price at which an asset can be bought or sold. Derivative means an asset whose value depends on the value of some other asset. Usually this means an option on a stock, i.e. a contract giving one the right to buy or sell a stock at a given price on a given date. In the equation above $ v $ is the value of the derived asset, $ s $ is the price of the underlying asset, $ r $ is the rate of return on a risk free asset, $ \sigma $ is the volatility of the price of the underlying asset, and $ \tau $ is time.

For now let's assume the derivative is an option on a stock which allows us to buy or sell it at a given price $ K $, usually called the strike price, at time $ T $, usually called the expiry date of the option. If we make the changes of variable \[\tag{3.9.2} t = T - \tau , \quad u = v \exp ( r t ) , \quad x = \log ( s / K ) + \left ( r - \frac 1 2 \sigma ^ 2 \right ) t , \quad \sigma = \sqrt { 2 k }\] in the Black-Scholes equation we get the diffusion equation for $ u $ as a function of $ t $ and $ x $. Note that $ t $ is the time remaining until expiry, so $ t = 0 $ corresponds to expiry and positive values of $ t $ correspond to times before the option expires, which are the times at which we'd like to compute its value. The value at expiry is a known function of $ s $ and therefore of $ x $. The specific function depends on whether our option is an option to sell, usually called a put, or an option to buy, usually called a call. So the problem of computing the value of the option at earlier times is an initial value problem for the diffusion equation, although it's really a final value problem for the Black-Scholes equation due to the time reversal in our change of variables.
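It's straightforward to check this change of variables numerically: take any solution of the diffusion equation, map it back to a function of $ \tau $ and $ s $, and verify by finite differences that the Black-Scholes operator nearly vanishes. The sketch below does this with a shifted copy of the fundamental solution; the strike, expiry, rate and diffusivity are arbitrary illustrative values, and the strike is called Kp in the code to avoid clashing with the name of the fundamental solution.

```python
# Illustrative check that (3.9.2) turns the diffusion equation into Black-Scholes.
import numpy as np

k, r, Kp, T = 0.08, 0.05, 100.0, 1.0
sigma = np.sqrt(2*k)

# a solution of u_t = k u_xx (a shifted multiple of the fundamental solution)
u = lambda t, x: np.exp(-x*x / (4*k*(t + 1))) / np.sqrt(t + 1)

def v(tau, s_):
    t = T - tau
    x = np.log(s_/Kp) + (r - 0.5*sigma**2)*t
    return np.exp(-r*t) * u(t, x)

tau0, s0, h = 0.4, 90.0, 1e-3
v_tau = (v(tau0 + h, s0) - v(tau0 - h, s0)) / (2*h)
v_s   = (v(tau0, s0 + h) - v(tau0, s0 - h)) / (2*h)
v_ss  = (v(tau0, s0 + h) - 2*v(tau0, s0) + v(tau0, s0 - h)) / h**2
print(v_tau + 0.5*sigma**2*s0**2*v_ss + r*s0*v_s - r*v(tau0, s0))   # ~ 0
```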

While the Black-Scholes equation applies to the value of any option, the example described above, an option which can be exercised only at expiry, is only one particular type of option, usually called a European option. The more common type of option, even in Europe, is what's called an American option, where the option can be exercised at any time. The effect of this is to convert our pure initial value problem into a boundary value problem, but this boundary value problem is of a very different type from the ones we've considered previously. The boundary conditions, in terms of the original variables, are \[\tag{3.9.3} v ( \tau , g ( \tau ) ) = \max ( 0 , v ( T , g ( \tau ) ) )\] and \[\tag{3.9.4} \frac { \partial v } { \partial s } ( \tau , g ( \tau ) ) = 1 .\] What is $ g ( \tau ) $? It is the price at which one should choose to exercise the option early at time $ \tau $. This is not a given function. Rather, finding this function is part of solving the problem. So, unlike the boundary value problems we've considered previously, the location of the boundary is not known in advance but rather has to be solved for. In some sense the lack of information about the location of the boundary is compensated for by the fact that we have two boundary conditions to be satisfied on the boundary rather than one.

Boundaries whose location is not known in advance are called free boundaries. They don't just arise in financial mathematics and indeed didn't first arise there. The classical example of a free boundary problem is the flow of a fluid with a free surface, for example a bubble within the fluid region or the top surface of a water wave. The peculiarity of free boundary problems is that even when the equation is linear, as the Black-Scholes equation and the equations for irrotational incompressible fluid flow are, the methods needed to study them look much more like those of the theory of nonlinear differential equations.

Chapter 4 Burgers' Equation

We've already seen Burgers' equation \[\tag{4.0.1} \frac { \partial u } { \partial t } + u \frac { \partial u } { \partial x } = 0 .\] Burgers' equation is a vastly simplified model of the evolution of the free boundary for fluid flow without viscosity.

There is a general theory which applies to first order scalar differential equations, of which this is one, but here we'll just do everything by hand in this special case.

Section 4.1 Explicit Solution

Suppose $ u $ is a continuously differentiable solution to this equation in a neighbourhood of the point $ ( t _ 0 , x _ 0 ) $ and set \[\tag{4.1.1} p ( t ) = u ( t , x _ 0 + v t - v t _ 0 ) - v\] where $ v = u ( t _ 0 , x _ 0 ) $. Then $ p ( t _ 0 ) = 0 $ and the chain rule gives \[\tag{4.1.2} p ' ( t ) = \frac { \partial u } { \partial t } ( t , x _ 0 + v t - v t _ 0 ) + v \frac { \partial u } { \partial x } ( t , x _ 0 + v t - v t _ 0 )\] or, using the fact that $ u $ satisfies the differential equation, \[\tag{4.1.3} p ' ( t ) = - p ( t ) \frac { \partial u } { \partial x } ( t , x _ 0 + v t - v t _ 0 ) .\] Defining \[\tag{4.1.4} q ( t ) = p ( t ) \exp \left ( \int _ { t _ 0 } ^ t \frac { \partial u } { \partial x } ( s , x _ 0 + v s - v t _ 0 ) \, d s \right )\] we find $ q ( t _ 0 ) = 0 $ and using the product rule and the fundamental theorem of calculus we find that \[\tag{4.1.5} q ' ( t ) = 0\] so $ q $ is zero everywhere. It follows that $ p $ is also zero everywhere and \[\tag{4.1.6} u ( t , x _ 0 + v t - v t _ 0 ) = v .\] In other words, $ u $ is constant on the line $ x - v t = x _ 0 - v t _ 0 $. So to solve the initial value problem \[\tag{4.1.7} u ( t _ 0 , x ) = f ( x )\] it suffices to eliminate $ x _ 0 $ from the system of equations \[\tag{4.1.8} x - u t = x _ 0 - u t _ 0 , \quad u = f ( x _ 0 ) .\]

As a simple example consider linear initial conditions \[\tag{4.1.9} u ( t _ 0 , x ) = c x .\] Eliminating $ x _ 0 $ from \[\tag{4.1.10} x - u t = x _ 0 - u t _ 0 , \quad u = c x _ 0 .\] gives \[\tag{4.1.11} u ( t , x ) = \frac { c x } { 1 + c ( t - t _ 0 ) } .\] That this satisfies the differential equation and initial conditions is easy to check directly. The behaviour depends on the sign of $ c $. If $ c $ is nonnegative then the solution exists for all $ t \ge t _ 0 $ while if $ c $ is negative the solution exists up until $ t = t _ 0 - 1 / c $ but there is no continuously differentiable solution afterwards. Unlike the wave or diffusion equations, we therefore cannot expect global solutions for Burgers' equation, even for very nice initial data.
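For more general data the elimination can be carried out numerically, solving the implicit equation $ u = f ( x - u ( t - t _ 0 ) ) $ for $ u $ by root finding. The sketch below is illustrative only; the data and the test point are arbitrary, and the time is kept small enough that the characteristics haven't yet crossed. It also checks the initial condition and the equation itself by finite differences.

```python
# Illustrative characteristic-based evaluation of the solution of (4.1.7)-(4.1.8).
import numpy as np
from scipy.optimize import brentq

t0 = 0.0
f = lambda x: np.exp(-x*x)           # smooth bounded data with sup|f'| < 1

def u(t, x):
    # solve u = f(x - u*(t - t0)); unique root while (t - t0)*sup|f'| < 1
    F = lambda w: w - f(x - w*(t - t0))
    return brentq(F, -0.1, 1.1)      # a bracket containing the range of f

t1, x1, h = 0.5, 0.8, 1e-5
u_t = (u(t1 + h, x1) - u(t1 - h, x1)) / (2*h)
u_x = (u(t1, x1 + h) - u(t1, x1 - h)) / (2*h)
print(u(t0, x1) - f(x1))             # initial condition, ~ 0
print(u_t + u(t1, x1)*u_x)           # Burgers' equation residual, ~ 0
```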

Section 4.2 Shock Formation

If you know something about the existence and uniqueness theorems for ordinary differential equations this should not surprise you. The existence results for linear ordinary differential equations give global existence, while the ones for nonlinear ordinary differential equations only give existence in a finite time interval, whose length depends on the choice of initial data. Since the wave and diffusion equations are linear while Burgers' is nonlinear it isn't particularly unexpected that we get global existence for the first two and only existence in a finite interval for the last one. In fact the situation is worse than that though. Consider the initial conditions \[\tag{4.2.1} u ( 0 , x ) = \cos \left ( \pi x ^ 2 \right ) .\] The solution, for as long as it exists, should be equal to $ ( - 1 ) ^ k $ on the lines \[\tag{4.2.2} x = \sqrt { k } + ( - 1 ) ^ k t ,\] where $ k $ is a nonnegative integer. Considering the cases $ k = 2 j $ and $ k = 2 j + 1 $, where $ j $ is a positive integer, we see that at the point \[\tag{4.2.3} ( t , x ) = \left ( \frac { \sqrt { 2 j + 1 } - \sqrt { 2 j } } 2 , \frac { \sqrt { 2 j + 1 } + \sqrt { 2 j } } 2 \right )\] $ u ( t , x ) $ should be equal to both $ + 1 $ and $ - 1 $, so the solution cannot extend as far forward in time as \[\tag{4.2.4} t = \frac { \sqrt { 2 j + 1 } - \sqrt { 2 j } } 2 .\] Similarly, considering the cases $ k = 2 j - 1 $ and $ k = 2 j $ we see that at the point \[\tag{4.2.5} ( t , x ) = \left ( - \frac { \sqrt { 2 j } - \sqrt { 2 j - 1 } } 2 , \frac { \sqrt { 2 j - 1 } + \sqrt { 2 j } } 2 \right )\] $ u ( t , x ) $ should again be equal to both $ + 1 $ and $ - 1 $, so the solution cannot extend as far backward in time as \[\tag{4.2.6} t = - \frac { \sqrt { 2 j } - \sqrt { 2 j - 1 } } 2 .\] But these remarks apply to all integers $ j $, and both $ \sqrt { 2 j + 1 } - \sqrt { 2 j } $ and $ \sqrt { 2 j } - \sqrt { 2 j - 1 } $ tend to zero as $ j $ tends to infinity, so there is no time interval of positive length on which we have a continuously differentiable solution to this initial value problem, even though the initial data is bounded and infinitely differentiable!

There is much more to be said about Burgers' equation. It was originally introduced to model fluid flow and, in particular, shock formation. There is a natural way to extend solutions beyond the singularities we've seen above, although not as a continuously differentiable, or even continuous, function. This is true also for the more complicated, but also more physically relevant, Euler equations. That is a topic for a more advanced text though.

Chapter 5 Laplace Equation

Section 5.1 Symmetries

Section 5.2 Poisson Solution in Half-plane

Section 5.3 Regularity

Section 5.4 Harmonic Conjugate

Section 5.5 More Symmetries

Section 5.6 Poisson Solution in Disc

Section 5.7 Mean Value Property

Section 5.8 Bounded Regions