Introduction to Partial Differential Equations

John Stalker

2025-09-30

Chapter 1 Introduction

Differential equations are equations involving an unknown function and its derivatives, like \[\tag{1.0.1} m \frac { d ^ 2 x } { d t ^ 2 } + k x = 0 ,\] \[\tag{1.0.2} \frac { d ^ 2 x } { d t ^ 2 } - \mu ( 1 - x ^ 2 ) \frac { d x } { d t } + x = 0\] \[\tag{1.0.3} \frac { \partial ^ 2 u } { \partial x ^ 2 } + \frac { \partial ^ 2 u } { \partial y ^ 2 } = 0\] \[\tag{1.0.4} \frac { \partial u } { \partial t } - k \frac { \partial ^ 2 u } { \partial x ^ 2 } = 0\] \[\tag{1.0.5} \frac { \partial ^ 2 u } { \partial t ^ 2 } - c ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 } = 0\] \[\tag{1.0.6} \frac { \partial u } { \partial t } + u \frac { \partial u } { \partial x } = 0\] \[\tag{1.0.7} \frac { \partial u } { \partial t } + \frac { \partial ^ 3 u } { \partial x ^ 3 } + 6 u \frac { \partial u } { \partial x } = 0\] \[\tag{1.0.8} \frac { \partial v } { \partial \tau } + \frac 1 2 \sigma ^ 2 s ^ 2 \frac { \partial ^ 2 v } { \partial s ^ 2 } + r s \frac { \partial v } { \partial s } - r v = 0\]

All of these equations have names. Equation (1.0.1) is the simple harmonic oscillator equation. It appears in mechanics and many other places. Equation (1.0.2) is the Van der Pol equation. It first appeared in electrical engineering. Equation (1.0.3) is the Laplace equation. It appears in the study of gravitational and electrostatic fields. Equation (1.0.4) is the diffusion equation, also known as the heat equation. It appears in the study of diffusion, originally the diffusion of heat, but it also applies to, for example, the diffusion of chemical solutions. Equation (1.0.5) is the wave equation. It appears in the study of various kinds of waves, e.g. electromagnetic, acoustic, etc., but not generally water waves. Equation (1.0.6) is Burgers' equation. Unlike the wave equation, Burgers' equation is often used as a simple model for water waves. Equation (1.0.7) is the Korteweg–De Vries equation. It is a more refined model of water waves, but still simpler than real water waves. Equation (1.0.8) is the Black-Scholes equation. It appears in mathematical finance.

Some terminology is useful for describing these. We say that variables which are differentiated are dependent variables, variables with respect to which we differentiate them are independent variables, and variables which don't appear in derivatives at all are parameters. Note that which variables play which role varies from equation to equation. $x$, for example, is a dependent variable in the first two equations, an independent variable in the next five equations, and doesn't appear in the last equation. If there's only one independent variable, i.e. if we only differentiate with respect to one variable, then the derivatives are ordinary derivatives and so the equation is called an ordinary differential equation. If there is more than one then the derivatives are partial derivatives and so the equation is called a partial differential equation, which is the subject of these notes. In the list above equations (1.0.1) and (1.0.2) are ordinary, while the others are all partial differential equations. Some knowledge of ordinary differential equations is useful for studying partial differential equations, but for what we'll do here it's not essential. The order of a differential equation is the order of the highest derivative appearing in it. Equation (1.0.6) is first order and equation (1.0.7) is third order while all the other equations above are second order. Second order equations seem to be pervasive in mathematical physics.

The most important distinction in the theory of differential equations is between linear and nonlinear equations. The distinction is somewhat subtle though. A linear differential equation is a linear equation in the unknown function, i.e. the dependent variable, and its derivatives, the coefficients of which are allowed to be functions of the independent variables and parameters, but not of the dependent variable. It's this last bit which tends to cause confusion. In the list above equations (1.0.1), (1.0.3), (1.0.4), (1.0.5) and (1.0.8) are linear, while (1.0.2), (1.0.6) and (1.0.7) are nonlinear. Equation (1.0.8), for example, is linear because the coefficients of the derivatives, $\partial v / \partial \tau$, $\partial ^ 2 v / \partial s ^ 2$, $\partial v / \partial s$ and $v$, the last of these being considered as a zeroeth order derivative of the unknown function $v$, are $1$, $\frac 1 2 \sigma ^ 2 s ^ 2$, $r s$ and $- r$, all of which are functions of the independent variables $\tau$ and $s$ and the parameters $r$ and $\sigma$, but don't depend on the dependent variable $v$. Equation (1.0.7), by contrast, is nonlinear because if we try to write it as a linear equation for the derivatives $ \partial u / \partial t $, $ \partial ^ 3 u / \partial x ^ 3 $ and $ \partial u / \partial x $ with coefficients $ 1 $, $ 1 $ and $ 6 u $ then the first two are okay but $ 6 u $ is a function of the dependent variable $u$, which is not allowed. Note that constant functions are functions, so $ 1 $ is a function of the independent variables $t$ and $x$, it's just a constant function. $ 6 u $, by contrast, is not a function of $t$ and $x$.
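As a quick illustration of the distinction, here is a small sympy sketch which checks that the sum of two solutions of the linear Laplace equation (1.0.3) is again a solution, while the sum of two solutions of the nonlinear Burgers' equation (1.0.6) need not be. The particular solutions used are just convenient examples which are easy to verify by hand.

```python
# Illustrative sympy sketch: linearity of (1.0.3) versus nonlinearity of (1.0.6).
import sympy as sp

t, x, y = sp.symbols('t x y')

laplace = lambda u: sp.diff(u, x, 2) + sp.diff(u, y, 2)
burgers = lambda u: sp.diff(u, t) + u * sp.diff(u, x)

u1, u2 = x**2 - y**2, x * y            # two solutions of the Laplace equation
print(laplace(u1), laplace(u2), sp.simplify(laplace(u1 + u2)))   # 0 0 0

v1, v2 = sp.Integer(1), x / (1 + t)    # two solutions of Burgers' equation
print(sp.simplify(burgers(v1)), sp.simplify(burgers(v2)), sp.simplify(burgers(v1 + v2)))
# 0 0 1/(t + 1): the sum of solutions is not a solution
```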

Chapter 2 Wave Equation

Section 2.1 D'Alembert Solution

Suppose that $u$ is a twice continuously differentiable function on $\mathbf R ^ 2$. For this chapter we'll label the coordinates on $\mathbf R ^ 2$ as $t$ and $x$, written in that order, and in diagrams the $t$ axis will be vertical and the $x$ axis will be horizontal. This is slightly awkward since we are used to coordinate systems in the plane where the first coordinate corresponds to the horizontal axis and the second one corresponds to the vertical. The letters $ t $ and $ x $ for time and space coordinates are far too well established to consider changing. Similarly the convention that in space-time diagrams time corresponds to the vertical direction is fairly universal. The only other option to avoid listing the vertical coordinate before the horizontal one would be to reverse the order of $ t $ and $ x $. Some authors do this, but listing coordinates in alphabetical order is standard nearly everywhere else and physicists largely switched from the convention of listing time last to listing time first more than half a century ago, so the conventions we're using here are probably the least bad option.

It's helpful to introduce the auxiliary functions \[\tag{2.1.1} v = \frac { \partial u } { \partial t } + c \frac { \partial u } { \partial x } , \quad w = \frac { \partial u } { \partial t } - c \frac { \partial u } { \partial x }.\] Since $u$ is twice continuously differentiable, $v$ and $w$ are once continuously differentiable.

Suppose that $ ( t _ 1 , x _ 1 ) $ and $ ( t _ 2 , x _ 2 ) $ are points such that \[\tag{2.1.2} x _ 1 - c t _ 1 = x _ 2 - c t _ 2 .\] and set \[\tag{2.1.3} \tau ( r ) = t _ 1 + r ( t _ 2 - t _ 1 ) , \quad \xi ( r ) = x _ 1 + r ( x _ 2 - x _ 1 ) .\] Then, by the chain rule \[\tag{2.1.4} \begin{split} \frac d { dr } z ( \tau ( r ) , \xi ( r ) ) & = \left ( t _ 2 - t _ 1 \right ) \frac { \partial z } { \partial t } ( \tau ( r ) , \xi ( r ) ) \\ & \quad {} + \left ( x _ 2 - x _ 1 \right ) \frac { \partial z } { \partial x } ( \tau ( r ) , \xi ( r ) ) \\ & = \left ( t _ 2 - t _ 1 \right ) \left ( \frac { \partial z } { \partial t } + c \frac { \partial z } { \partial x } \right ) ( \tau ( r ) , \xi ( r ) ) \end{split}\] for any function $z$ which is at least once continuously differentiable. Integrating from $r = 0$ to $r = 1$ and using the fundamental theorem of calculus we see that \[\tag{2.1.5} \begin{split} & z ( t _ 2 , x _ 2 ) = z ( t _ 1 , x _ 1 ) \\ & \qquad {} + ( t _ 2 - t _ 1 ) \int _ 0 ^ 1 \left ( \frac { \partial z } { \partial t } + c \frac { \partial z } { \partial x } \right ) \left ( t _ 1 + r ( t _ 2 - t _ 1 ) , x _ 1 + r ( x _ 2 - x _ 1 ) \right ) \, d r . \end{split}\] This holds in particular for $z = u$ and $z = w$. In the latter case note that \[\tag{2.1.6} \frac { \partial w } { \partial t } + c \frac { \partial w } { \partial x } = \frac { \partial ^ 2 u } { \partial t ^ 2 } - c ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 }\] so if, as we will assume from now until the end of this section, $u$ satisfies the wave equation, the integrand vanishes throughout the interval of integration. We then conclude that \[\tag{2.1.7} u ( t _ 2 , x _ 2 ) = u ( t _ 1 , x _ 1 ) + ( t _ 2 - t _ 1 ) \int _ 0 ^ 1 v \left ( t _ 1 + r ( t _ 2 - t _ 1 ) , x _ 1 + r ( x _ 2 - x _ 1 ) \right ) \, d r .\] and \[\tag{2.1.8} w ( t _ 2 , x _ 2 ) = w ( t _ 1 , x _ 1 ) .\] The preceding calculation was carried out under the assumption that $ x _ 1 - c t _ 1 = x _ 2 - c t _ 2 $, which is certainly true if \[\tag{2.1.9} x _ 1 = x _ 2 + c t _ 1 - c t _ 2 ,\] so we can rewrite the preceding equations as \[\tag{2.1.10} \begin{split} u ( t _ 2 , x _ 2 ) & = u ( t _ 1 , x _ 2 + c t _ 1 - c t _ 2 ) \\ & \quad {} + ( t _ 2 - t _ 1 ) \int _ 0 ^ 1 v \left ( t _ 1 + r ( t _ 2 - t _ 1 ) , x _ 2 - ( 1 - r ) c ( t _ 2 - t _ 1 ) \right ) \, d r . \end{split}\] and \[\tag{2.1.11} w ( t _ 2 , x _ 2 ) = w ( t _ 1 , x _ 2 + c t _ 1 - c t _ 2 ) .\] If instead we assume that \[\tag{2.1.12} x _ 1 + c t _ 1 = x _ 2 + c t _ 2\] then a very similar calculation leads to \[\tag{2.1.13} \begin{split} u ( t _ 2 , x _ 2 ) & = u ( t _ 1 , x _ 2 - c t _ 1 + c t _ 2 ) \\ & \quad {} + ( t _ 2 - t _ 1 ) \int _ 0 ^ 1 w \left ( t _ 1 + r ( t _ 2 - t _ 1 ) , x _ 2 + ( 1 - r ) c ( t _ 2 - t _ 1 ) \right ) \, d r . \end{split}\] and \[\tag{2.1.14} v ( t _ 2 , x _ 2 ) = v ( t _ 1 , x _ 2 - c t _ 1 + c t _ 2 ) .\]

Since $ ( t _ 2 , x _ 2 ) $ is an arbitrary point in the equations above we can substitute any other point for it. In particular we can substitute \[\tag{2.1.15} ( t _ 1 + r ( t _ 2 - t _ 1 ) , x _ 2 + ( r - 1 ) c ( t _ 2 - t _ 1 ) )\] for it in (2.1.14), which gives \[\tag{2.1.16} v ( t _ 1 + r ( t _ 2 - t _ 1 ) , x _ 1 + r c ( t _ 2 - t _ 1 ) ) = v ( t _ 1 , x _ 2 + ( 2 r - 1 ) c ( t _ 2 - t _ 1 ) ) ,\] which we can substitute into (2.1.10) to obtain \[\tag{2.1.17} \begin{split} u ( t _ 2 , x _ 2 ) & = u ( t _ 1 , x _ 2 + c t _ 1 - c t _ 2 ) \\ & \quad {} + ( t _ 2 - t _ 1 ) \int _ 0 ^ 1 v \left ( t _ 1 , x _ 2 + ( 2 r - 1 ) c ( t _ 2 - t _ 1 ) \right ) \, d r . \end{split}\] The substitution \[\tag{2.1.18} y = x _ 2 + ( 2 r - 1 ) c ( t _ 2 - t _ 1 )\] converts this into \[\tag{2.1.19} \begin{split} u ( t _ 2 , x _ 2 ) & = u ( t _ 1 , x _ 2 + c t _ 1 - c t _ 2 ) \\ & \quad {} + \frac 1 { 2 c } \int _ { x _ 2 - c ( t _ 2 - t _ 1 ) } ^ { x _ 2 + c ( t _ 2 - t _ 1 ) } v ( t _ 1 , y ) \, d y . \end{split}\] Similarly, we could substitute \[\tag{2.1.20} ( t _ 1 + r ( t _ 2 - t _ 1 ) , x _ 2 + ( 1 - r ) c ( t _ 2 - t _ 1 ) )\] for $ ( t _ 2 , x _ 2 ) $ in (2.1.11), substitute the result into (2.1.13), and make the substitution \[\tag{2.1.21} y = x _ 2 + ( 1 - 2 r ) c ( t _ 2 - t _ 1 )\] into the resulting integral to obtain \[\tag{2.1.22} \begin{split} u ( t _ 2 , x _ 2 ) & = u ( t _ 1 , x _ 2 - c t _ 1 + c t _ 2 ) \\ & \quad {} + \frac 1 { 2 c } \int _ { x _ 2 - c ( t _ 2 - t _ 1 ) } ^ { x _ 2 + c ( t _ 2 - t _ 1 ) } w ( t _ 1 , y ) \, d y . \end{split}\] Averaging (2.1.19) and (2.1.22) and noting that \[\tag{2.1.23} v + w = 2 \frac { \partial u } { \partial t }\] gives the equation \[\tag{2.1.24} \begin{split} u ( t _ 2 , x _ 2 ) & = \frac 1 2 u ( t _ 1 , x _ 2 + c t _ 1 - c t _ 2 ) \\ & \quad {} + \frac 1 2 u ( t _ 1 , x _ 2 - c t _ 1 + c t _ 2 ) \\ & \quad {} + \frac 1 { 2 c } \int _ { x _ 2 - c ( t _ 2 - t _ 1 ) } ^ { x _ 2 + c ( t _ 2 - t _ 1 ) } \frac { \partial u } { \partial t } ( t _ 1 , y ) \, d y . \end{split}\] Relabeling the variables gives D'Alembert's formula \[\tag{2.1.25} u ( t , x ) = \frac 1 2 f ( x + c s - c t ) + \frac 1 2 f ( x - c s + c t ) + \frac 1 { 2 c } \int _ { x + c s - c t } ^ { x - c s + c t } g ( y ) \, d y ,\] where \[\tag{2.1.26} f ( y ) = u ( s , y ) , \quad g ( y ) = \frac { \partial u } { \partial t } ( s , y ) .\] This gives the solution at time $ t $ in terms of its values and the values of its first derivatives at time $ s $. Note that we haven't assumed $ s < t $, although the formula is usually applied in this case. We haven't even assumed $ s \neq t $, although the formula doesn't give us any useful information when $ s = t $.
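As a sanity check, here is a small numerical sketch of D'Alembert's formula. The choices $ c = 1 $, $ s = 0 $, $ f ( y ) = \sin y $ and $ g ( y ) = \cos y $ are arbitrary smooth data picked purely for the illustration; for these data the solution happens to be $ u ( t , x ) = \sin ( x + t ) $, so the two numbers printed should agree up to quadrature error.

```python
# A numerical evaluation of D'Alembert's formula (2.1.25), compared against
# the known closed form for the sample data c = 1, s = 0, f = sin, g = cos.
import numpy as np
from scipy.integrate import quad

c, s = 1.0, 0.0
f, g = np.sin, np.cos

def dalembert(t, x):
    lo, hi = x + c * s - c * t, x - c * s + c * t   # the limits in (2.1.25)
    integral, _ = quad(g, lo, hi)
    return 0.5 * f(lo) + 0.5 * f(hi) + integral / (2.0 * c)

t0, x0 = 0.7, 0.3
print(dalembert(t0, x0), np.sin(x0 + t0))   # both approximately 0.8415
```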

Section 2.2 Existence and Uniqueness

D'Alembert's formula has a natural interpretation in terms of the initial value problem for the wave equation, i.e. the problem of finding a classical solution to the wave equation with initial conditions at time $s$ given by \[\tag{2.2.1} u ( s , y ) = f ( y ) , \quad \frac { \partial u } { \partial t } ( s , y ) = g ( y ) .\] A classical solution is just a twice continuously differentiable function. It's natural to assume two derivatives because the equation is of second order. In other words, second derivatives are the highest ones which appear. The equation doesn't have any obvious interpretation if we assume much less differentiability than this. It is possible, and indeed useful, to give it less obvious interpretations which assume less differentiability, but that would be a topic for a more advanced text. Here we'll only consider classical solutions. Since $u$ should be twice continuously differentiable the initial conditions force $ f $ to be twice continuously differentiable as well and $ g $ to be continuously differentiable. So a more precise formulation of the initial value problem is, given a twice continuously differentiable function $ f $ and a continuously differentiable function $ g $, to find a classical solution $ u $ of the wave equation such that the initial conditions (2.2.1) are satisfied, i.e. a twice continuously differentiable function $ u $ satisfying \[\tag{2.2.2} \frac { \partial ^ 2 u } { \partial t ^ 2 } - c ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 } = 0 .\]

In terms of the initial value problem of the preceding paragraph the calculation of the preceding section provides a proof of the following theorem.

Theorem 2.2.A For given $ f $ and $ g $, defined on the real line $ \mathbf R $, there is at most one solution to the initial value problem in $ \mathbf R ^ 2 $, and any such solution is given by D'Alembert's formula (2.1.25).
A closer inspection of the proof shows that it also proves the following local version.
Theorem 2.2.B For given $ f $ and $ g $, defined on the interval $ [ a , b ] $, there is at most one solution to the initial value problem in the parallelogram in $ \mathbf R ^ 2 $ with vertices $ \left ( s , a \right ) $, $ \left ( s - \frac { b - a } { 2 c } , \frac { a + b } 2 \right ) $, $ \left ( s , b \right ) $, and $ \left ( s + \frac { b - a } { 2 c } , \frac { a + b } 2 \right ) $, and any such solution is given by D'Alembert's formula (2.1.25).
To verify this you can go back through the calculation and see where the various functions need to be defined in order to justify our uses of the chain rule and fundamental theorem of calculus. The global theorem is actually a consequence of the local theorem since every point in $ \mathbf R ^ 2 $ belongs to such a parallelogram for some choice of interval $ [ a , b ] $.

Both theorems above are pure uniqueness theorems. They assert that there is at most one solution of the initial value problem, without guaranteeing that there is at least one. And indeed there is no way to turn the calculation of the preceding section into an existence proof, since we assumed at a very early stage that we had a classical solution $ u $. The nice thing about an explicit formula though is that it naturally suggests a way of proving existence: we just need to take the formula and verify that what it gives is indeed a solution. Fortunately this works for D'Alembert's formula. The most naive way of doing this is rather awkward though, for two reasons. First, we'd need to differentiate under the integral, and the variables with respect to which we want to differentiate appear in the limits of the integral. Second, we need to take two derivatives and the function $ g $ appearing in the integrand is only known to have one derivative. Neither of these problems is insurmountable but it's possible to avoid facing either of them directly.

We start by choosing some point $ p $ in the interval in which the functions $ f $ and $ g $ are defined and set \[\tag{2.2.3} \begin{split} \varphi ( z ) & = \frac 1 2 f ( z ) + \frac 1 { 2 c } \int _ z ^ p g ( y ) \, d y , \\ \psi ( z ) & = \frac 1 2 f ( z ) + \frac 1 { 2 c } \int _ p ^ z g ( y ) \, d y \end{split}\] with the usual convention that if the lower limit of an integral is greater than the upper limit then the limits should be swapped and the sign should be changed. Then $ \varphi $ and $ \psi $ are twice continuously differentiable functions, defined on the same interval that $ f $ and $ g $ were. Indeed the fundamental theorem of calculus gives \[\tag{2.2.4} \begin{split} \varphi ' ( z ) & = \frac 1 2 f ' ( z ) - \frac 1 { 2 c } g ( z ) , \\ \psi ' ( z ) & = \frac 1 2 f ' ( z ) + \frac 1 { 2 c } g ( z ) \end{split}\] and then taking an additional derivative gives \[\tag{2.2.5} \begin{split} \varphi '' ( z ) & = \frac 1 2 f '' ( z ) - \frac 1 { 2 c } g ' ( z ) , \\ \psi '' ( z ) & = \frac 1 2 f '' ( z ) + \frac 1 { 2 c } g ' ( z ) . \end{split}\] By assumption $ f $ is twice continuously differentiable and $ g $ is continuously differentiable in the initial value problem, so the right hand sides are continuous. Let \[\tag{2.2.6} u ( t , x ) = \varphi ( x + c s - c t ) + \psi ( x - c s + c t )\] and note that $ u $ is twice continuously differentiable. Then \[\tag{2.2.7} u ( s , x ) = \varphi ( x ) + \psi ( x ) = f ( x ) .\] Also, \[\tag{2.2.8} \frac { \partial u } { \partial t } ( t , x ) = - c \varphi ' ( x + c s - c t ) + c \psi ' ( x - c s + c t )\] so \[\tag{2.2.9} \frac { \partial u } { \partial t } ( s , x ) = - c \varphi ' ( x ) + c \psi ' ( x ) = g ( x ) .\] Thus $ u $ satisfies the initial conditions. Does it also satisfy the wave equation? Taking another derivative, \[\tag{2.2.10} \frac { \partial ^ 2 u } { \partial t ^ 2 } ( t , x ) = c ^ 2 \varphi '' ( x + c s - c t ) + c ^ 2 \psi '' ( x - c s + c t ) .\] Similarly, \[\tag{2.2.11} \frac { \partial u } { \partial x } ( t , x ) = \varphi ' ( x + c s - c t ) + \psi ' ( x - c s + c t )\] and \[\tag{2.2.12} \frac { \partial ^ 2 u } { \partial x ^ 2 } ( t , x ) = \varphi '' ( x + c s - c t ) + \psi '' ( x - c s + c t )\] so \[\tag{2.2.13} \frac { \partial ^ 2 u } { \partial t ^ 2 } ( t , x ) - c ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 } ( t , x ) = 0\] and so $ u $ is a classical solution of the wave equation. In particular, since we've already checked that the initial conditions are satisfied, the initial value problem has at least one solution. We've already seen that it has at most one solution so there is exactly one solution. The uniqueness theorems tell us that this solution must be given by D'Alembert's formula so there is no need to check separately that the expression for $ u $ given above is equal to the one in D'Alembert's formula, although it is not difficult to do so.
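The construction above is easy to reproduce symbolically. The following sympy sketch builds $ \varphi $ and $ \psi $ from (2.2.3) for placeholder data $ f $ and $ g $, with $ p $ and $ s $ left symbolic, forms $ u $ as in (2.2.6), and checks the initial conditions and the wave equation. The integral $ \int _ z ^ p $ has been rewritten as $ - \int _ p ^ z $ so that the two integrals cancel syntactically at $ t = s $.

```python
# A sympy sketch of the existence argument; f and g are arbitrary placeholder
# functions and p, s, c are left as symbols.
import sympy as sp

t, x, y, z, s, p, c = sp.symbols('t x y z s p c')
f, g = sp.Function('f'), sp.Function('g')

phi = f(z) / 2 - sp.Integral(g(y), (y, p, z)) / (2 * c)   # (2.2.3), first line
psi = f(z) / 2 + sp.Integral(g(y), (y, p, z)) / (2 * c)   # (2.2.3), second line

u = phi.subs(z, x + c * s - c * t) + psi.subs(z, x - c * s + c * t)   # (2.2.6)

print(sp.simplify(u.subs(t, s) - f(x)))                         # 0: u(s, x) = f(x)
print(sp.simplify(sp.diff(u, t).subs(t, s) - g(x)))             # 0: u_t(s, x) = g(x)
print(sp.simplify(sp.diff(u, t, 2) - c**2 * sp.diff(u, x, 2)))  # 0: wave equation
```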

We've now proved existence theorems, complementary to our earlier uniqueness theorems:

Theorem 2.2.C For given $ f $ and $ g $, defined on the real line $ \mathbf R $, there is at least one solution to the initial value problem in $ \mathbf R ^ 2 $, which is given by D'Alembert's formula (2.1.25).
Theorem 2.2.D For given $ f $ and $ g $, defined on the interval $ [ a , b ] $, there is at least one solution to the initial value problem in the parallelogram in $ \mathbf R ^ 2 $ with vertices $ \left ( s , a \right ) $, $ \left ( s - \frac { b - a } { 2 c } , \frac { a + b } 2 \right ) $, $ \left ( s , b \right ) $, and $ \left ( s + \frac { b - a } { 2 c } , \frac { a + b } 2 \right ) $, which is given by D'Alembert's formula (2.1.25).

Section 2.3 Energy

We saw in section 2.1 that when $u$ is a classical solution to the wave equation the functions \[\tag{2.3.1} v = \frac { \partial u } { \partial t } + c \frac { \partial u } { \partial x } , \quad w = \frac { \partial u } { \partial t } - c \frac { \partial u } { \partial x } \] satisfy the relations \[\tag{2.3.2} v ( t , x ) = v ( s , x - c s + c t ) , \quad w ( t , x ) = w ( s , x + c s - c t ) . \] The quantity \[\tag{2.3.3} E = \frac 1 4 v ^ 2 + \frac 1 4 w ^ 2 = \frac 1 2 \left ( \frac { \partial u } { \partial t } \right ) ^ 2 + \frac { c ^ 2 } 2 \left ( \frac { \partial u } { \partial x } \right ) ^ 2 \] has a physical interpretation as the energy density, so the integral \[\tag{2.3.4} I = \frac 1 4 \int _ a ^ b \left ( v ( t , x ) ^ 2 + w ( t , x ) ^ 2 \right) \, d x \] represents the energy present in the interval $ [ a , b ] $ at time $t$. We can split this into two parts, \[\tag{2.3.5} I = \frac 1 4 \int _ a ^ b v ( t , x ) ^ 2 \, d x + \frac 1 4 \int _ a ^ b w ( t , x ) ^ 2 \, d x \] and use the relations above to get \[\tag{2.3.6} I = \frac 1 4 \int _ a ^ b v ( s , x - c s + c t ) ^ 2 \, d x + \frac 1 4 \int _ a ^ b w ( s , x + c s - c t ) ^ 2 \, d x \] or, changing variables in the integrals, \[\tag{2.3.7} I = \frac 1 4 \int _ { a - c s + c t } ^ { b - c s + c t } v ( s , x ) ^ 2 \, d x + \frac 1 4 \int _ { a + c s - c t } ^ { b + c s - c t } w ( s , x ) ^ 2 \, d x . \] Now the integral of a non-negative integrand over an interval is at least as large as the integral over a smaller interval and at most as large as the integral over a larger interval so we see that \[\tag{2.3.8} I \ge \frac 1 4 \int _ { \max ( a + c s - c t , a - c s + c t ) } ^ { \min ( b + c s - c t , b - c s + c t ) } v ( s , x ) ^ 2 \, d x + \frac 1 4 \int _ { \max ( a + c s - c t , a - c s + c t ) } ^ { \min ( b + c s - c t , b - c s + c t ) } w ( s , x ) ^ 2 \, d x \] \[\tag{2.3.9} I \le \frac 1 4 \int _ { \min ( a + c s - c t , a - c s + c t ) } ^ { \max ( b + c s - c t , b - c s + c t ) } v ( s , x ) ^ 2 \, d x + \frac 1 4 \int _ { \min ( a + c s - c t , a - c s + c t ) } ^ { \max ( b + c s - c t , b - c s + c t ) } w ( s , x ) ^ 2 \, d x . \] Combining the integrals, and writing the limits in a slightly cleaner form, we find \[\tag{2.3.10} \int _ { a + c | s - t | } ^ { b - c | s - t | } E ( s , x ) \, d x \le I \le \int _ { a - c | s - t | } ^ { b + c | s - t | } E ( s , x ) \, d x . \] If $ 2 c | s - t | > b - a $ then the lower limit of the integral on the left is greater than its upper limit but that's okay. We continue to follow the convention that in such cases the limits are to be swapped and the sign is to be changed and we still get a valid inequality in that case.

Suppose that the integral \[\tag{2.3.11} \int _ { - \infty } ^ { + \infty } E ( s , x ) \, d x\] is finite, i.e. that the total energy in all of space at time $s$ is finite. This is true if and only if the integral over a finite interval, \[\tag{2.3.12} \int _ { \alpha } ^ \beta E ( s , x ) \, d x\] tends to a finite limit as $\alpha$ tends to $- \infty$ and $\beta$ tends to $+ \infty$, in which case the integrals in the upper and lower bounds of the inequality above tend to that same limit as $a$ tends to $- \infty$ and $b$ tends to $+ \infty$. It then follows from the squeeze principle from real analysis that the limit of $I$ as $a$ tends to $- \infty$ and $b$ tends to $+ \infty$ exists and is equal to the other limits considered. In other words, \[\tag{2.3.13} \int _ { - \infty } ^ { + \infty } E ( t , x ) \, d x = \int _ { - \infty } ^ { + \infty } E ( s , x ) \, d x \] in the sense that if the integral on the right is finite then so is the integral on the left and both are equal. Since $s$ and $t$ are arbitrary we therefore have the following theorem.

Theorem 2.3.A Suppose $u$ is a classical solution to the wave equation. If the total energy \[\tag{2.3.14} \int _ { - \infty } ^ { + \infty } E ( t , x ) \, d x \] is finite for some value of $t$ then it is finite for all values of $t$ and is independent of $t$.
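Here is a rough numerical illustration of this theorem, assuming $ c = 1 $, initial time $ s = 0 $ and the sample data $ f ( y ) = e ^ { - y ^ 2 } $, $ g ( y ) = 0 $, for which the solution is simply $ u ( t , x ) = \frac 1 2 \left ( f ( x - c t ) + f ( x + c t ) \right ) $. The total energy, approximated by a Riemann sum over a large interval, comes out the same at each of the sampled times.

```python
# A crude numerical check of energy conservation for Gaussian bump data.
import numpy as np

c = 1.0
fp = lambda y: -2.0 * y * np.exp(-y**2)     # f'(y) for f(y) = exp(-y**2)

def total_energy(t, xs):
    # u(t, x) = (f(x - c t) + f(x + c t)) / 2, differentiated exactly.
    u_t = 0.5 * (-c * fp(xs - c * t) + c * fp(xs + c * t))
    u_x = 0.5 * (fp(xs - c * t) + fp(xs + c * t))
    E = 0.5 * u_t**2 + 0.5 * c**2 * u_x**2
    return np.sum(E) * (xs[1] - xs[0])      # Riemann sum approximation

xs = np.linspace(-60.0, 60.0, 120001)
print([total_energy(t, xs) for t in (0.0, 1.0, 5.0, 25.0)])
# the four printed totals agree (about 0.6267 each)
```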

Section 2.4 Symmetries

Symmetries of a differential equation are transformations of a function with the property that the transformed function satisfies the differential equation if and only if the original function does. Symmetries often depend on one or more parameters. For example, the wave equation has the scaling symmetry \[\tag{2.4.1} ( S _ \alpha u ) ( t , x ) = u ( t / \alpha , x / \alpha ) ,\] where $\alpha$ is non-zero. To verify that this is indeed a symmetry we need to check that if $ \tilde u = S _ \alpha u $ for some non-zero value of $\alpha$ then $ u $ is a solution of the wave equation if and only if $ \tilde u $ is. This is completely straightforward because the chain rule gives \[\tag{2.4.2} \frac { \partial \tilde u } { \partial t } ( t , x ) = \frac 1 \alpha \frac { \partial u } { \partial t } ( t / \alpha , x / \alpha ) ,\] \[\tag{2.4.3} \frac { \partial \tilde u } { \partial x } ( t , x ) = \frac 1 \alpha \frac { \partial u } { \partial x } ( t / \alpha , x / \alpha ) ,\] \[\tag{2.4.4} \frac { \partial ^ 2 \tilde u } { \partial t ^ 2 } ( t , x ) = \frac 1 { \alpha ^ 2 } \frac { \partial ^ 2 u } { \partial t ^ 2 } ( t / \alpha , x / \alpha ) ,\] and \[\tag{2.4.5} \frac { \partial ^ 2 \tilde u } { \partial x ^ 2 } ( t , x ) = \frac 1 { \alpha ^ 2 } \frac { \partial ^ 2 u } { \partial x ^ 2 } ( t / \alpha , x / \alpha )\] so \[\tag{2.4.6} \frac { \partial ^ 2 \tilde u } { \partial t ^ 2 } ( t , x ) - c ^ 2 \frac { \partial ^ 2 \tilde u } { \partial x ^ 2 } ( t , x ) = \frac 1 { \alpha ^ 2 } \left [ \frac { \partial ^ 2 u } { \partial t ^ 2 } ( t / \alpha , x / \alpha ) - c ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 } ( t / \alpha , x / \alpha ) \right ]\] and hence $ \tilde u $ satisfies \[\tag{2.4.7} \frac { \partial ^ 2 \tilde u } { \partial t ^ 2 } - c ^ 2 \frac { \partial ^ 2 \tilde u } { \partial x ^ 2 } = 0\] wherever it's defined if and only if $ u $ satisfies \[\tag{2.4.8} \frac { \partial ^ 2 u } { \partial t ^ 2 } - c ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 } = 0\] wherever it's defined. The domains of definition need not be the same, although they could be.
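A quick sympy spot check: the functions $ F $ and $ G $ below are arbitrary placeholders, and $ F ( x - c t ) + G ( x + c t ) $ is used as a generic solution of the wave equation; applying $ S _ \alpha $ to it should, and does, give another solution.

```python
# A sympy spot check that S_alpha maps solutions to solutions.
import sympy as sp

t, x, c, alpha = sp.symbols('t x c alpha', nonzero=True)
F, G = sp.Function('F'), sp.Function('G')

u = lambda tt, xx: F(xx - c * tt) + G(xx + c * tt)   # a generic solution
utilde = u(t / alpha, x / alpha)                     # (S_alpha u)(t, x)

print(sp.simplify(sp.diff(utilde, t, 2) - c**2 * sp.diff(utilde, x, 2)))   # 0
```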

Although the preceding proof is straightforward there is one point where the notation could be confusing. In (2.4.2) the expression \[\tag{2.4.9} \frac { \partial u } { \partial t } ( t / \alpha , x / \alpha )\] means that we take the function $ u $, differentiate it with respect to its first argument, and then evaluate the resulting function at the point $ ( t / \alpha , x / \alpha ) $, not that we differentiate the function obtained by mapping $ ( t , x ) $ to $ u ( t / \alpha , x / \alpha ) $ with respect to its first argument. Either of these would be a plausible interpretation of the expression but they are unfortunately not equal in general. In these notes we'll be consistent in interpreting partial derivatives as in this example.

Another symmetry of the wave equation is the spatial reflection symmetry \[\tag{2.4.10} ( R u ) ( t , x ) = u ( t , - x ) .\] The proof that this is indeed a symmetry is straightforward. Temporal reflection is also a symmetry, but we don't need to check this separately, since we can write a temporal reflection as the composition of a spatial reflection and a scaling by a factor of $- 1$, in either order: \[\tag{2.4.11} ( R S _ { - 1 } u ) ( t , x ) = ( S _ { - 1 } u ) ( t , - x ) = u ( - t , x )\] and \[\tag{2.4.12} ( S _ { - 1 } R u ) ( t , x ) = ( R u ) ( - t , - x ) = u ( - t , x ) .\] Although scaling and spatial reflection happen to commute this isn't true of symmetries in general. For example, we also have spacetime translational symmetries \[\tag{2.4.13} ( T _ { \tau , \xi } u ) ( t , x ) = u ( t - \tau , x - \xi ) .\] Again, the proof that this is a symmetry for any real numbers $ \tau $ and $ \xi $ is straightforward. Note that \[\tag{2.4.14} ( T _ { \tau , \xi } R u ) ( t , x ) = ( R u ) ( t - \tau , x - \xi ) = u ( t - \tau , \xi - x )\] while \[\tag{2.4.15} ( R T _ { \tau , \xi } u ) ( t , x ) = ( T _ { \tau , \xi } u ) ( t , - x ) = u ( t - \tau , - x - \xi ) ,\] so although the compositions $ T _ { \tau , \xi } R $ and $ R T _ { \tau , \xi } $ are both symmetries they are not the same symmetry unless $ \xi = 0 $.
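The failure to commute is easy to see concretely. In the tiny sympy sketch below $ u $ is an arbitrary placeholder function and the transformations act on expressions by substitution; the two printed results reproduce (2.4.14) and (2.4.15) and differ unless $ \xi = 0 $.

```python
# Composing the translation T_{tau, xi} and the reflection R in the two orders.
import sympy as sp

t, x, tau, xi = sp.symbols('t x tau xi')
u = sp.Function('u')

R = lambda e: e.subs(x, -x)                        # (R u)(t, x) = u(t, -x)
T = lambda e: e.subs([(t, t - tau), (x, x - xi)])  # (T u)(t, x) = u(t - tau, x - xi)

print(T(R(u(t, x))))   # u(t - tau, xi - x), as in (2.4.14)
print(R(T(u(t, x))))   # u(t - tau, -x - xi), as in (2.4.15)
```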

Not all symmetries of the wave equation are as easy to verify as the ones above. Another important class of symmetries are the Lorentz transformations \[\tag{2.4.16} ( L _ { \kappa } u ) ( t , x ) = u \left ( \cosh \kappa \, t + \frac 1 c \sinh \kappa \, x , c \sinh \kappa \, t + \cosh \kappa \, x \right ) .\] If $ \tilde u = L _ { \kappa } u $ then \[\tag{2.4.17} \begin{split} \frac { \partial \tilde u } { \partial t } ( t , x ) & = \cosh \kappa \, \frac { \partial u } { \partial t } \left ( \cosh \kappa \, t + \frac 1 c \sinh \kappa \, x , c \sinh \kappa \, t + \cosh \kappa \, x \right ) \\ & \quad {} + c \sinh \kappa \, \frac { \partial u } { \partial x } \left ( \cosh \kappa \, t + \frac 1 c \sinh \kappa \, x , c \sinh \kappa \, t + \cosh \kappa \, x \right ) , \end{split}\] \[\tag{2.4.18} \begin{split} \frac { \partial \tilde u } { \partial x } ( t , x ) & = \frac 1 c \sinh \kappa \, \frac { \partial u } { \partial t } \left ( \cosh \kappa \, t + \frac 1 c \sinh \kappa \, x , c \sinh \kappa \, t + \cosh \kappa \, x \right ) \\ & \quad {} + \cosh \kappa \, \frac { \partial u } { \partial x } \left ( \cosh \kappa \, t + \frac 1 c \sinh \kappa \, x , c \sinh \kappa \, t + \cosh \kappa \, x \right ) , \end{split}\] \[\tag{2.4.19} \begin{split} \frac { \partial ^ 2 \tilde u } { \partial t ^ 2 } ( t , x ) & = \cosh ^ 2 \kappa \, \frac { \partial ^ 2 u } { \partial t ^ 2 } \left ( \cosh \kappa \, t + \frac 1 c \sinh \kappa \, x , c \sinh \kappa \, t + \cosh \kappa \, x \right ) \\ & \quad {} + 2 c \sinh \kappa \, \cosh \kappa \, \frac { \partial ^ 2 u } { \partial t \partial x } \left ( \cosh \kappa \, t + \frac 1 c \sinh \kappa \, x , c \sinh \kappa \, t + \cosh \kappa \, x \right ) \\ & \quad {} + c ^ 2 \sinh ^ 2 \kappa \, \frac { \partial ^ 2 u } { \partial x ^ 2 } \left ( \cosh \kappa \, t + \frac 1 c \sinh \kappa \, x , c \sinh \kappa \, t + \cosh \kappa \, x \right ) , \end{split}\] \[\tag{2.4.20} \begin{split} \frac { \partial ^ 2 \tilde u } { \partial x ^ 2 } ( t , x ) & = \frac 1 { c ^ 2 } \sinh ^ 2 \kappa \, \frac { \partial ^ 2 u } { \partial t ^ 2 } \left ( \cosh \kappa \, t + \frac 1 c \sinh \kappa \, x , c \sinh \kappa \, t + \cosh \kappa \, x \right ) \\ & \quad {} + \frac 2 c \sinh \kappa \, \cosh \kappa \, \frac { \partial ^ 2 u } { \partial t \partial x } \left ( \cosh \kappa \, t + \frac 1 c \sinh \kappa \, x , c \sinh \kappa \, t + \cosh \kappa \, x \right ) \\ & \quad {} + \cosh ^ 2 \kappa \, \frac { \partial ^ 2 u } { \partial x ^ 2 } \left ( \cosh \kappa \, t + \frac 1 c \sinh \kappa \, x , c \sinh \kappa \, t + \cosh \kappa \, x \right ) , \end{split}\] and \[\tag{2.4.21} \begin{split} \frac { \partial ^ 2 \tilde u } { \partial t ^ 2 } ( t , x ) - c ^ 2 \frac { \partial ^ 2 \tilde u } { \partial x ^ 2 } ( t , x ) & = \frac { \partial ^ 2 u } { \partial t ^ 2 } \left ( \cosh \kappa \, t + \frac 1 c \sinh \kappa \, x , c \sinh \kappa \, t + \cosh \kappa \, x \right ) \\ & \quad {} - c ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 } \left ( \cosh \kappa \, t + \frac 1 c \sinh \kappa \, x , c \sinh \kappa \, t + \cosh \kappa \, x \right ) . \end{split}\] Here we've used the identity $ \cosh ^ 2 \kappa - \sinh ^ 2 \kappa = 1 $ from the theory of hyperbolic functions. 
So $ \tilde u $ is a solution of \[\tag{2.4.22} \frac { \partial ^ 2 \tilde u } { \partial t ^ 2 } - c ^ 2 \frac { \partial ^ 2 \tilde u } { \partial x ^ 2 } = 0\] if and only if $ u $ is a solution of \[\tag{2.4.23} \frac { \partial ^ 2 u } { \partial t ^ 2 } - c ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 } = 0 ,\] and so $ L _ \kappa $ is indeed a symmetry.

Even the calculation for $ L _ \kappa $ is relatively tame though compared to the one required to show that spacetime inversion \[\tag{2.4.24} ( I u ) ( t , x ) = u \left ( \frac { c ^ 2 t } { c ^ 2 t ^ 2 - x ^ 2 } , \frac { c ^ 2 x } { c ^ 2 t ^ 2 - x ^ 2 } \right )\] is a symmetry. Set $ \tilde u = I u $, \[\tag{2.4.25} \tau ( t , x ) = \frac { c ^ 2 t } { c ^ 2 t ^ 2 - x ^ 2 }, \] and \[\tag{2.4.26} \xi ( t , x ) = \frac { c ^ 2 x } { c ^ 2 t ^ 2 - x ^ 2 } \] so that \[\tag{2.4.27} \tilde u ( t , x ) = u ( \tau ( t , x ) , \xi ( t , x ) ) .\] By the chain rule we have \[\tag{2.4.28} \frac { \partial \tilde u } { \partial t } ( t , x ) = \frac { \partial \tau } { \partial t } ( t , x ) \frac { \partial u } { \partial t } ( \tau ( t , x ) , \xi ( t , x ) ) + \frac { \partial \xi } { \partial t } ( t , x ) \frac { \partial u } { \partial x } ( \tau ( t , x ) , \xi ( t , x ) ) ,\] \[\tag{2.4.29} \frac { \partial \tilde u } { \partial x } ( t , x ) = \frac { \partial \tau } { \partial x } ( t , x ) \frac { \partial u } { \partial t } ( \tau ( t , x ) , \xi ( t , x ) ) + \frac { \partial \xi } { \partial x } ( t , x ) \frac { \partial u } { \partial x } ( \tau ( t , x ) , \xi ( t , x ) ) ,\] \[\tag{2.4.30} \begin{split} \frac { \partial ^ 2 \tilde u } { \partial t ^ 2 } ( t , x ) & = \left ( \frac { \partial \tau } { \partial t } ( t , x ) \right ) ^ 2 \frac { \partial ^ 2 u } { \partial t ^ 2 } ( \tau ( t , x ) , \xi ( t , x ) ) \\ & \quad {} + 2 \frac { \partial \tau } { \partial t } ( t , x ) \frac { \partial \xi } { \partial t } ( t , x ) \frac { \partial ^ 2 u } { \partial t \partial x } ( \tau ( t , x ) , \xi ( t , x ) ) \\ & \quad {} + \left ( \frac { \partial \xi } { \partial t } ( t , x ) \right ) ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 } ( \tau ( t , x ) , \xi ( t , x ) ) \\ & \quad {} + \frac { \partial ^ 2 \tau } { \partial t ^ 2 } ( t , x ) \frac { \partial u } { \partial t } ( \tau ( t , x ) , \xi ( t , x ) ) \\ & \quad {} + \frac { \partial ^ 2 \xi } { \partial t ^ 2 } ( t , x ) \frac { \partial u } { \partial x } ( \tau ( t , x ) , \xi ( t , x ) ) , \end{split}\] and \[\tag{2.4.31} \begin{split} \frac { \partial ^ 2 \tilde u } { \partial x ^ 2 } ( t , x ) & = \left ( \frac { \partial \tau } { \partial x } ( t , x ) \right ) ^ 2 \frac { \partial ^ 2 u } { \partial t ^ 2 } ( \tau ( t , x ) , \xi ( t , x ) ) \\ & \quad {} + 2 \frac { \partial \tau } { \partial x } ( t , x ) \frac { \partial \xi } { \partial x } ( t , x ) \frac { \partial ^ 2 u } { \partial t \partial x } ( \tau ( t , x ) , \xi ( t , x ) ) \\ & \quad {} + \left ( \frac { \partial \xi } { \partial x } ( t , x ) \right ) ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 } ( \tau ( t , x ) , \xi ( t , x ) ) \\ & \quad {} + \frac { \partial ^ 2 \tau } { \partial x ^ 2 } ( t , x ) \frac { \partial u } { \partial t } ( \tau ( t , x ) , \xi ( t , x ) ) \\ & \quad {} + \frac { \partial ^ 2 \xi } { \partial x ^ 2 } ( t , x ) \frac { \partial u } { \partial x } ( \tau ( t , x ) , \xi ( t , x ) ) . 
\end{split}\] We can then compute the partial derivatives \[\tag{2.4.32} \frac { \partial \tau } { \partial t } ( t , x ) = - c ^ 2 \frac { c ^ 2 t ^ 2 + x ^ 2 } { \left ( c ^ 2 t ^ 2 - x ^ 2 \right ) ^ 2 } ,\] \[\tag{2.4.33} \frac { \partial \tau } { \partial x } ( t , x ) = \frac { 2 c ^ 2 t x } { \left ( c ^ 2 t ^ 2 - x ^ 2 \right ) ^ 2 } ,\] \[\tag{2.4.34} \frac { \partial \xi } { \partial t } ( t , x ) = - \frac { 2 c ^ 4 t x } { \left ( c ^ 2 t ^ 2 - x ^ 2 \right ) ^ 2 } ,\] \[\tag{2.4.35} \frac { \partial \xi } { \partial x } ( t , x ) = c ^ 2 \frac { c ^ 2 t ^ 2 + x ^ 2 } { \left ( c ^ 2 t ^ 2 - x ^ 2 \right ) ^ 2 } ,\] \[\tag{2.4.36} \frac { \partial ^ 2 \tau } { \partial t ^ 2 } ( t , x ) = c ^ 4 t \frac { 2 c ^ 2 t ^ 2 + 6 x ^ 2 } { \left ( c ^ 2 t ^ 2 - x ^ 2 \right ) ^ 3 } ,\] \[\tag{2.4.37} \frac { \partial ^ 2 \tau } { \partial x ^ 2 } ( t , x ) = c ^ 2 t \frac { 2 c ^ 2 t ^ 2 + 6 x ^ 2 } { \left ( c ^ 2 t ^ 2 - x ^ 2 \right ) ^ 3 } ,\] \[\tag{2.4.38} \frac { \partial ^ 2 \xi } { \partial t ^ 2 } ( t , x ) = c ^ 4 x \frac { 6 c ^ 2 t ^ 2 + 2 x ^ 2 } { \left ( c ^ 2 t ^ 2 - x ^ 2 \right ) ^ 3 } ,\] and \[\tag{2.4.39} \frac { \partial ^ 2 \xi } { \partial x ^ 2 } ( t , x ) = c ^ 2 x \frac { 6 c ^ 2 t ^ 2 + 2 x ^ 2 } { \left ( c ^ 2 t ^ 2 - x ^ 2 \right ) ^ 3 } .\] Substituting, and using the identity $ ( c ^ 2 t ^ 2 + x ^ 2 ) ^ 2 - 4 c ^ 2 t ^ 2 x ^ 2 = ( c ^ 2 t ^ 2 - x ^ 2 ) ^ 2 $, \[\tag{2.4.40} \frac { \partial ^ 2 \tilde u } { \partial t ^ 2 } ( t , x ) - c ^ 2 \frac { \partial ^ 2 \tilde u } { \partial x ^ 2 } ( t , x ) = \frac { c ^ 4 } { \left ( c ^ 2 t ^ 2 - x ^ 2 \right ) ^ 2 } \left ( \frac { \partial ^ 2 u } { \partial t ^ 2 } ( \tau ( t , x ) , \xi ( t , x ) ) - c ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 } ( \tau ( t , x ) , \xi ( t , x ) ) \right ) .\] It follows that $ \tilde u $ satisfies the wave equation if and only if $ u $ does, i.e. that $ I $ is a symmetry of the wave equation.
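Rather than redoing this computation by hand, one can spot check it with sympy. The sketch below applies the inversion to a generic solution of the form $ F ( x - c t ) + G ( x + c t ) $, with $ F $ and $ G $ arbitrary placeholder functions, and confirms that the result still satisfies the wave equation away from the light cone $ c ^ 2 t ^ 2 = x ^ 2 $.

```python
# A sympy spot check that the spacetime inversion I is a symmetry.
import sympy as sp

t, x, c = sp.symbols('t x c', positive=True)
F, G = sp.Function('F'), sp.Function('G')

D = c**2 * t**2 - x**2
tau, xi = c**2 * t / D, c**2 * x / D                 # as in (2.4.25), (2.4.26)

u = lambda tt, xx: F(xx - c * tt) + G(xx + c * tt)   # a generic solution
utilde = u(tau, xi)                                  # (I u)(t, x)

print(sp.simplify(sp.diff(utilde, t, 2) - c**2 * sp.diff(utilde, x, 2)))   # 0
```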

It follows immediately from the definition of a symmetry that the composition of two symmetries is a symmetry and the inverse of a symmetry is a symmetry. In other words the symmetries of a differential equation form a group. The problem of determining the full symmetry group of a differential equation is in general quite a difficult one and indeed the theory of Lie groups was originally developed as a technique for solving this problem.

There are two particularly simple symmetries which are valid not just for the wave equation but for all linear homogeneous differential equations. One is scaling of the dependent variable \[\tag{2.4.41} ( M _ \lambda u ) ( t , x ) = \lambda u ( t , x )\] and the other is the addition of another solution of the equation \[\tag{2.4.42} ( A _ \varphi u ) ( t , x ) = u ( t , x ) + \varphi ( t , x ) ,\] where $ \lambda $ is a non-zero real number and $ \varphi $ is a solution of the wave equation. The fact that $ M _ { - 1 } $ is a symmetry means that $ \varphi $ is a solution if and only if $ - \varphi $ is and so we could equally well have said that subtraction of another solution is a symmetry rather than addition.

In general, a symmetry will take a solution to the differential equation and give another, different, solution, but it may happen to give the same solution, in which case we say that it is a symmetry not just of the differential equation but of the solution. It is often useful to find the set of solutions with a given symmetry or group of symmetries. Of course if we ask for too many symmetries we are unlikely to get any solutions. For example, the only solutions of the wave equation symmetric under the full group of spacetime translations are the constant solutions. Interestingly, as long as we restrict our attention to classical solutions defined on all of $ \mathbf R ^ 2 $, the only scaling invariant solutions are also constant. Indeed, if $ u $ is scaling invariant and $ \ell $ is a line through the origin then $ u $ must take the same value at all points of $ \ell $, except possibly the origin itself. But classical solutions are continuous, so it must take the same value at the origin as well. Every point is on some line through the origin so the value at that point is therefore equal to the value at the origin. What's interesting about this argument is we never actually needed the fact that $ u $ satisfies the wave equation!

If we impose less symmetry then we get more solutions. We could, for example, consider the set of solutions invariant under spatial reflection, i.e. the ones satisfying $ R u = u $, i.e. \[\tag{2.4.43} u ( t , - x ) = u ( t , x ) .\] These are just the solutions which are even functions of the spatial variable for each fixed value of the temporal variable. Note that differentiating the equation above gives \[\tag{2.4.44} - \frac{ \partial u } { \partial x } ( t , - x ) = \frac { \partial u } { \partial x } ( t , x )\] and so \[\tag{2.4.45} - \frac{ \partial u } { \partial x } ( t , 0 ) = \frac { \partial u } { \partial x } ( t , 0 ) ,\] from which it follows that \[\tag{2.4.46} \frac{ \partial u } { \partial x } ( t , 0 ) = 0 .\] More generally, any derivative of odd degree in $x$, if it exists, is zero on the time axis.

Similarly we could look at the solutions symmetric under the symmetry $ M _ { - 1 } R $. These are the solutions which are odd, considered as a function of the spatial variable for fixed value of the temporal variable, and they have the property that any derivative of even degree in $x$, if it exists, is zero on the time axis. In particular the zeroeth and second derivatives, which both certainly exist for any classical solution, are zero: \[\tag{2.4.47} u ( t , 0 ) = 0 .\] and \[\tag{2.4.48} \frac{ \partial ^ 2 u } { \partial x ^ 2 } ( t , 0 ) = 0 .\]

Symmetries, linearity and uniqueness interact in interesting ways. Suppose, for example, that $ u $ is a solution to the initial value problem considered earlier, the one with initial conditions given by (2.2.1). If $ u $ has $ R $ as a symmetry, i.e. if it is even as a function of the spatial variable, then $ f $ and $ g $ must also both be even. More interestingly, suppose $ f $ and $ g $ are even. Since $ u $ is a solution and $ R $ is a symmetry it follows that $ R u $ is also a solution. From linearity it then follows that $ u - R u $ is a solution. But the initial data for $ u - R u $ are identically zero and we already know a solution with zero initial data, namely the zero solution, so the uniqueness theorem implies that $ u - R u $ is the zero solution. In other words, $ u = R u $, or $ R $ is a symmetry of $ u $. So not only does an even solution necessarily have even initial data but even initial data can only give rise to an even solution. Similar remarks apply to odd solutions and odd initial data.
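The following short sympy sketch illustrates the last point with concrete even data; the choices $ s = 0 $, $ f ( y ) = \cos y $ and $ g ( y ) = \cos 2 y $ are made purely for the example, with $ c $ left symbolic.

```python
# Even initial data produce an even solution: a concrete check via (2.1.25).
import sympy as sp

t, x, y, c = sp.symbols('t x y c', positive=True)
f = lambda z: sp.cos(z)          # even data
g = lambda z: sp.cos(2 * z)      # even data

# D'Alembert's formula (2.1.25) with s = 0.
u = (f(x - c * t) + f(x + c * t)) / 2 \
    + sp.integrate(g(y), (y, x - c * t, x + c * t)) / (2 * c)

print(sp.simplify(u.subs(x, -x) - u))   # 0: u is even in x for every t
```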

Section 2.5 Energy and Uniqueness for Boundary Value Problems

Up to now we've been trying to solve the wave equation in the whole of $ \mathbf R ^ 2 $ with initial data given on the whole of $ \mathbf R $ but often one wants to solve the equation in a region where the spatial variable is restricted to an interval, either bounded or semi-infinite, and the initial data are given in this interval. This, by itself, is not a problem that admits unique solutions, but it becomes one if we impose appropriate boundary conditions at the endpoint or endpoints of the interval. The two most important boundary conditions are the Dirichlet condition, $ u = 0 $, and the Neumann condition, $ \partial u / \partial x = 0 $.

To start with, let's consider the case of a finite interval, $ [ a , b ] $, with a Dirichlet condition at the left endpoint and a Neumann condition at the right endpoint: \[\tag{2.5.1} u ( t , a ) = 0 , \quad \frac { \partial u } { \partial x } ( t , b ) = 0 .\] These equations are to hold for all values of $t$. The initial conditions will be specified as usual by (2.2.1), where the functions $f$ and $g$ are defined on the interval $ [ a , b ] $ and the solution $u$ should be defined and twice continuously differentiable on $ \mathbf R \times [ a , b ] $, and should of course satisfy the wave equation there.

We see immediately that some additional restrictions are needed on $f$ and $g$. If we set $t = s$ in the equations above we get \[\tag{2.5.2} f ( a ) = 0 , \quad f ' ( b ) = 0\] since taking a partial derivative with respect to $ x $ and then fixing $ t $ is the same as fixing $ t $ and then taking an ordinary derivative. We could also take a $ t $ derivative of the boundary conditions, obtaining \[\tag{2.5.3} \frac { \partial u } { \partial t } ( t , a ) = 0 , \quad \frac { \partial ^ 2 u } { \partial t \partial x } ( t , b ) = 0 .\] Since $ u $ is twice continuously differentiable the mixed partial derivatives are equal so we can replace $ \partial ^ 2 u / \partial t \partial x $ with $ \partial ^ 2 u / \partial x \partial t $. If we do so and then set $ t = s $ then we get \[\tag{2.5.4} g ( a ) = 0 , \quad g ' ( b ) = 0 .\] There is one more restriction. Differentiating the Dirichlet condition once again gives \[\tag{2.5.5} \frac { \partial ^ 2 u } { \partial t ^ 2 } ( t , a ) = 0 ,\] but $ u $ satisfies the wave equation, so \[\tag{2.5.6} c ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 } ( t , a ) = 0 .\] Setting $ t = s $ gives $ c ^ 2 f '' ( a ) = 0 $ and therefore \[\tag{2.5.7} f '' ( a ) = 0 .\] There is no similar condition for $ g $ and there is no analogous condition at $b$. To summarise, we've found that there can be no solution to the initial value problem with these boundary conditions unless the initial data satisfy the constraints \[\tag{2.5.8} f ( a ) = 0 , \quad g ( a ) = 0 , \quad f '' ( a ) = 0 , \quad f ' ( b ) = 0 , \quad g ' ( b ) = 0 .\]

Before considering existence and uniqueness let's examine energy conservation. It's useful to prove the following lemma.

Lemma 2.5.A Suppose $ p $ and $ q $ are continuously differentiable functions on the rectangle $ [ t _ 1 , t _ 2 ] \times [ x _ 1 , x _ 2 ] $. Then \[\tag{2.5.9} \begin{split} & \int _ { x _ 1 } ^ { x _ 2 } p ( t _ 1 , x ) \, d x + \int _ { t _ 1 } ^ { t _ 2 } q ( t , x _ 2 ) \, d t + \int _ { x _ 2 } ^ { x _ 1 } p ( t _ 2 , x ) \, d x + \int _ { t _ 2 } ^ { t _ 1 } q ( t , x _ 1 ) \, d t \\ & \qquad {} = \iint _ { [ t _ 1 , t _ 2 ] \times [ x _ 1 , x _ 2 ] } \left ( \frac { \partial q } { \partial x } - \frac { \partial p } { \partial t } \right ) \, d A . \end{split}\]
As usual we interpret integrals where the lower limit is greater than the upper limit by swapping the limits and reversing the sign. We could, therefore, rewrite the equation above as \[\tag{2.5.10} \begin{split} & \int _ { x _ 1 } ^ { x _ 2 } p ( t _ 1 , x ) \, d x + \int _ { t _ 1 } ^ { t _ 2 } q ( t , x _ 2 ) \, d t - \int _ { x _ 1 } ^ { x _ 2 } p ( t _ 2 , x ) \, d x - \int _ { t _ 1 } ^ { t _ 2 } q ( t , x _ 1 ) \, d t \\ & \qquad {} = \iint _ { [ t _ 1 , t _ 2 ] \times [ x _ 1 , x _ 2 ] } \left ( \frac { \partial q } { \partial x } - \frac { \partial p } { \partial t } \right ) \, d A . \end{split}\] and indeed for purposes of this section it would be simpler to do so, but the lemma is stated in the way that it is so that we can see it in the next section as a special case of a more general theorem which we will need repeatedly in these notes.

The proof of the lemma is very simple. By the fundamental theorem of calculus \[\tag{2.5.11} q ( t , x _ 2 ) - q ( t , x _ 1 ) = \int _ { x _ 1 } ^ { x _ 2 } \frac { \partial q } { \partial x } ( t , x ) \, d x .\] Integrating this equation over the interval $ [ t _ 1 , t _ 2 ] $ gives \[\tag{2.5.12} \int _ { t _ 1 } ^ { t _ 2 } \left [ q ( t , x _ 2 ) - q ( t , x _ 1 ) \right ] \, d t = \int _ { t _ 1 } ^ { t _ 2 } \int _ { x _ 1 } ^ { x _ 2 } \frac { \partial q } { \partial x } ( t , x ) \, d x \, d t .\] On the left hand side we write the integral of the difference as a difference of integrals and on the right hand side we write the repeated integral as an area integral, using Fubini's theorem. \[\tag{2.5.13} \int _ { t _ 1 } ^ { t _ 2 } q ( t , x _ 2 ) \, d t - \int _ { t _ 1 } ^ { t _ 2 } q ( t , x _ 1 ) \, d t = \iint _ { [ t _ 1 , t _ 2 ] \times [ x _ 1 , x _ 2 ] } \frac { \partial q } { \partial x } \, d A .\] Similarly, \[\tag{2.5.14} \int _ { x _ 1 } ^ { x _ 2 } p ( t _ 2 , x ) \, d x - \int _ { x _ 1 } ^ { x _ 2 } p ( t _ 1 , x ) \, d x = \iint _ { [ t _ 1 , t _ 2 ] \times [ x _ 1 , x _ 2 ] } \frac { \partial p } { \partial t } \, d A .\] Subtracting this from the previous equation and using again the fact that the integral of a difference is the difference of the integrals we get (2.5.10).

Now that we've proved the lemma we can apply it to $ [ x _ 1 , x _ 2 ] = [ a , b ] $ and an arbitrary time interval $ [ t _ 1 , t _ 2 ] $, with \[\tag{2.5.15} p = E = \frac 1 2 \left ( \frac { \partial u } { \partial t } \right ) ^ 2 + \frac { c ^ 2 } 2 \left ( \frac { \partial u } { \partial x } \right ) ^ 2 , \quad q = c ^ 2 \frac { \partial u } { \partial t } \frac { \partial u } { \partial x }\] and note that \[\tag{2.5.16} \frac { \partial q } { \partial x } - \frac { \partial p } { \partial t } = - \left ( \frac { \partial ^ 2 u } { \partial t ^ 2 } - c ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 } \right ) \frac { \partial u } { \partial t } ,\] so the right hand side of the equation in the lemma is zero when $ u $ satisfies the wave equation. The terms on the left hand side with a $ q $ in them are also zero if $ u $ satisfies the boundary conditions. When $ x = a $ this happens because $ \partial u / \partial t = 0 $ there and when $ x = b $ this happens because $ \partial u / \partial x = 0 $ there. The only terms which are left then are the ones with a $ p $, which is the same as $ E $, on the left hand side, so we have \[\tag{2.5.17} \int _ a ^ b E ( t _ 2 , x ) \, d x - \int _ a ^ b E ( t _ 1 , x ) \, d x = 0 .\] In other words, the energy in the interval $ [ a , b ] $ at time $ t _ 2 $ is the same as the energy at time $ t _ 1 $. We therefore have an energy conservation theorem for solutions on a bounded interval, just as we did in the case of an infinite interval.
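The identity (2.5.16) is a short computation, but it is also easy to have sympy confirm it; in the sketch below $ u $ is an arbitrary smooth placeholder function, not assumed to solve anything.

```python
# Checking (2.5.16) with p = E and q = c^2 u_t u_x as in (2.5.15).
import sympy as sp

t, x, c = sp.symbols('t x c')
u = sp.Function('u')(t, x)

p = sp.diff(u, t)**2 / 2 + c**2 * sp.diff(u, x)**2 / 2
q = c**2 * sp.diff(u, t) * sp.diff(u, x)

lhs = sp.diff(q, x) - sp.diff(p, t)
rhs = -(sp.diff(u, t, 2) - c**2 * sp.diff(u, x, 2)) * sp.diff(u, t)
print(sp.simplify(lhs - rhs))   # 0
```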

Theorem 2.5.B Suppose $u$ is a classical solution to the wave equation for $ x $ in the interval $ [ a , b ] $ and that at each endpoint of this interval either the Dirichlet or Neumann boundary condition is satisfied. If the total energy \[\tag{2.5.18} \int _ a ^ b E ( t , x ) \, d x \] is finite for some value of $t$ then it is finite for all values of $t$ and is independent of $t$.
Note that previously we assumed a Dirichlet condition at the left endpoint and a Neumann condition at the right endpoint but the argument works fine no matter which condition is assumed at which endpoint.

We can use this energy conservation theorem to get a uniqueness theorem even though we don't yet have an explicit solution. Suppose that $ u _ 1 $ and $ u _ 2 $ are solutions of the initial value problem which satisfy the same boundary conditions, either Dirichlet or Neumann at each endpoint. Then their difference $ u = u _ 1 - u _ 2 $ satisfies the initial value problem with the same boundary conditions and zero initial data. Its energy at time $ s $ is therefore zero, so by Theorem 2.5.B its energy is zero at every time, and since the energy density is continuous and non-negative it must vanish everywhere. Looking at the definition of the energy density we see this means that the partial derivatives of $ u $ are everywhere zero. It must therefore be locally constant. Since $ \mathbf R \times [ a , b ] $ is connected it follows that $ u $ is constant. It's zero initially, since $ u $ has zero initial data, and hence is zero everywhere. So we've proved the following uniqueness theorem.

Theorem 2.5.C For given $ f $ and $ g $, defined on the interval $ [ a , b ] $, there is at most one solution to the initial value problem in $ \mathbf R \times [ a , b ] $ with given boundary conditions, Dirichlet or Neumann, at the endpoints.

Section 2.6 Existence for Boundary Value Problems

We haven't yet proved the existence of solutions. For the moment we'll return to the setting where we have a Dirichlet condition at $ x = a $ and Neumann at $ x = b $. In this setting we saw that there can be no solution unless $ f $ is twice continuously differentiable, $ g $ is continuously differentiable, and $ f ( a ) $, $ g ( a ) $, $ f '' ( a ) $, $ f ' ( b ) $, and $ g ' ( b ) $ are all zero, so we'll assume those conditions are satisfied. We then extend $ f $ and $ g $ to functions on all of $ \mathbf R $ as follows. Any real number can be written uniquely as $ n + r $ where $ n $ is an integer and $ r $ belongs to the half-open interval $ [ 0 , 1 ) $ and every integer $ n $ can be written as $ 4 m + l $ where $ m $ is an integer and $ l $ is $ 0 $, $ 1 $, $ 2 $, or $ 3 $. We use this to write \[\tag{2.6.1} \frac { x - a } { b - a } = 4 m ( x ) + l ( x ) + r ( x )\] and then define \[\tag{2.6.2} f ( x ) = \begin{cases} f ( a + ( b - a ) r ( x ) ) & \mbox{ if ${ l ( x ) = 0 }$,} \\ f ( b - ( b - a ) r ( x ) ) & \mbox{ if ${ l ( x ) = 1 }$,} \\ - f ( a + ( b - a ) r ( x ) ) & \mbox{ if ${ l ( x ) = 2 }$,} \\ - f ( b - ( b - a ) r ( x ) ) & \mbox{ if ${ l ( x ) = 3 }$.} \\ \end{cases}\] The intended interpretation of this equation is that the $ f $'s on the right hand side refer to the function we were originally given, while the $ f $ on the left hand side is the new function we are defining. We'll also define an extension of $ g $ via the same equation, except with all $ f $'s replaced by $ g $'s. Note that $ m ( x ) $ does not appear in the equations above, which implies that the extended functions are periodic of period $ 4 ( b - a ) $.

There are a number of things to check. One is that these really are extensions of the original functions, i.e. that they give the same values when evaluated at a point in the original interval $ [ a , b ] $. Suppose $ x \in [ a , b ) $. Then $ m ( x ) = 0 $, $ l ( x ) = 0 $, and $ r ( x ) = ( x - a ) / ( b - a ) $ so the right hand side above gives the value \[\tag{2.6.3} f ( a + ( b - a ) r ( x ) ) = f \left ( a + ( b - a ) \frac { x - a } { b - a } \right ) = f ( x ) ,\] as it should. On the other hand, if $ x = b $ then $ m ( x ) = 0 $, $ l ( x ) = 1 $ and $ r ( x ) = 0 $ and the right hand side gives \[\tag{2.6.4} f ( b - ( b - a ) r ( x ) ) = f ( b ) = f ( x ) ,\] and we again get the correct value. So the new $ f $ is indeed an extension of the old one, which is fortunate because otherwise we wouldn't know whether $ f ( x ) $ for $ x $ in $ [ a , b ] $ referred to the old function or the new one. Of course the same argument works equally well for $ g $.
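For concreteness, here is a minimal implementation of the extension rule (2.6.1) and (2.6.2). The choices $ a = 0 $, $ b = 1 $ and $ f ( x ) = \sin 3 x $ are made just for the example; the printed values confirm that the extension is odd about $ a $, even about $ b $, and periodic with period $ 4 ( b - a ) $.

```python
# The extension of f from [a, b] to all of R defined by (2.6.1) and (2.6.2).
import math

a, b = 0.0, 1.0
f = lambda x: math.sin(3 * x)      # only ever evaluated on [a, b]

def extend(f, a, b, x):
    # Write (x - a) / (b - a) = 4 m + l + r with l in {0, 1, 2, 3}, r in [0, 1).
    n, r = divmod((x - a) / (b - a), 1.0)
    l = int(n) % 4
    if l == 0:
        return f(a + (b - a) * r)
    if l == 1:
        return f(b - (b - a) * r)
    if l == 2:
        return -f(a + (b - a) * r)
    return -f(b - (b - a) * r)

F = lambda x: extend(f, a, b, x)
y = 0.3
print(F(a - y) + F(a + y))             # odd about a:  0 (up to rounding)
print(F(b + y) - F(b - y))             # even about b: 0 (up to rounding)
print(F(y + 4 * (b - a)) - F(y))       # periodic:     0 (up to rounding)
```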

We'd like to know, in addition, that the extended functions have the same differentiability properties as the old functions, i.e. that the extended $ f $ is twice continuously differentiable and the extended $ g $ is continuously differentiable. This might seem obvious from the definition since the various pieces of which they are composed have this property, but there's clearly something wrong with that line of reasoning since it would imply that the absolute value function \[\tag{2.6.5} | x | = \begin{cases} x & \mbox{ if ${ x > 0 }$,} \\ 0 & \mbox{ if ${ x = 0 }$,} \\ - x & \mbox{ if ${ x < 0 }$,} \end{cases}\] which is also composed of continuously differentiable pieces, is continuously differentiable, when it is in fact not differentiable at $ x = 0 $. We need a criterion for a function defined by different expressions on the two different sides of a point to have some number of continuous derivatives. This is supplied by the following lemma.

Lemma 2.6.A Suppose $ p $, $ q $ are $ k $ times continuously differentiable functions on the intervals $ ( \alpha , \beta ) $ and $ ( \beta , \gamma ) $. Suppose $ f $ is defined on $ ( \alpha , \gamma ) $ by \[\tag{2.6.6} f ( x ) = \begin{cases} p ( x ) & \mbox{ if ${ \alpha < x < \beta }$,} \\ c & \mbox{ if ${ x = \beta }$,} \\ q ( x ) & \mbox{ if ${ \beta < x < \gamma }$} \\ \end{cases}\] for some $ c $. Then $ f $ is $ k $ times continuously differentiable if and only if \[\tag{2.6.7} \lim _ { x \to \beta ^ - } p ( x ) = c = \lim _ { x \to \beta ^ + } q ( x )\] and \[\tag{2.6.8} \lim _ { x \to \beta ^ - } p ^ { ( j ) } ( x ) = \lim _ { x \to \beta ^ + } q ^ { ( j ) } ( x )\] for all $ j \le k $.
As usual, a parenthesised superscript indicates the number of derivatives, so $ p ^ { ( 0 ) } = p $, $ p ^ { ( 1 ) } = p ' $, $ p ^ { ( 2 ) } = p '' $, etc.

To prove the lemma we first introduce the functions \[\tag{2.6.9} f _ j ( x ) = \begin{cases} p ^ { ( j ) } ( x ) & \mbox{ if ${ \alpha < x < \beta }$,} \\ c _ j & \mbox{ if ${ x = \beta }$,} \\ q ^ { ( j ) } ( x ) & \mbox{ if ${ \beta < x < \gamma }$,} \end{cases}\] where $ c _ j $ is the common value of $ \lim _ { x \to \beta ^ - } p ^ { ( j ) } ( x ) $ and $ \lim _ { x \to \beta ^ + } q ^ { ( j ) } ( x ) $. It follows immediately from the definition of continuity that $ f _ j $ is continuous. Also, \[\tag{2.6.10} f _ j ' ( x ) = f _ { j + 1 } ( x )\] if $ j < k $ and $ \alpha < x < \beta $ or $ \beta < x < \gamma $. We would like to know that this is true also when $ x = \beta $. If $ \beta < z < \gamma $ then \[\tag{2.6.11} \int _ \beta ^ z f _ { j + 1 } ( y ) \, d y = \int _ \beta ^ w f _ { j + 1 } ( y ) \, d y + \int _ w ^ z f _ { j + 1 } ( y ) \, d y\] for any $ w $ in the interval $ ( \beta , z ) $. Now \[\tag{2.6.12} \lim _ { w \to \beta ^ + } \int _ \beta ^ w f _ { j + 1 } ( y ) \, d y = 0\] because integrals of continuous functions depend continuously on their limits of integration and \[\tag{2.6.13} \int _ w ^ z f _ { j + 1 } ( y ) \, d y = f _ j ( z ) - f _ j ( w )\] by the fundamental theorem of calculus. So \[\tag{2.6.14} \int _ \beta ^ z f _ { j + 1 } ( y ) \, d y = \lim _ { w \to \beta ^ + } \left ( f _ j ( z ) - f _ j ( w ) \right ) = f _ j ( z ) - f _ j ( \beta ) ,\] where we've used the continuity of $ f _ j $ to evaluate $ \lim _ { w \to \beta ^ + } f _ j ( w ) $. Now the change of variable $ r = ( y - \beta ) / ( z - \beta ) $ gives \[\tag{2.6.15} \int _ \beta ^ z f _ { j + 1 } ( y ) \, d y = ( z - \beta ) \int _ 0 ^ 1 f _ { j + 1 } ( \beta + r ( z - \beta ) ) \, d r\] and so \[\tag{2.6.16} \frac { f _ j ( z ) - f _ j ( \beta ) } { z - \beta } = \int _ 0 ^ 1 f _ { j + 1 } ( \beta + r ( z - \beta ) ) \, d r .\] This was proved for $ z $ in the interval $ ( \beta , \gamma ) $ but an almost identical argument gives the same equation when $ z $ is in the interval $ ( \alpha , \beta ) $. Taking limits inside the integral and using the continuity of $ f _ { j + 1 } $ gives \[\tag{2.6.17} \begin{split} \lim _ { z \to \beta } \frac { f _ j ( z ) - f _ j ( \beta ) } { z - \beta } & = \int _ 0 ^ 1 \lim _ { z \to \beta } f _ { j + 1 } ( \beta + r ( z - \beta ) ) \, d r \\ & = \int _ 0 ^ 1 f _ { j + 1 } \left ( \lim _ { z \to \beta } \left ( \beta + r ( z - \beta ) \right ) \right ) \, d r \\ & = \int _ 0 ^ 1 f _ { j + 1 } ( \beta ) \, d r = f _ { j + 1 } ( \beta ) . \end{split}\] Taking the limit inside the integral is justified in this case because we have continuous integrands with uniform convergence on a bounded interval. The equation above just says that \[\tag{2.6.18} f _ j ' ( x ) = f _ { j + 1 } ( x )\] when $ x = \beta $ though. Since we already had the same equation for $ x $ in the intervals $ ( \alpha , \beta ) $ and $ ( \beta , \gamma ) $ we now have it in the full interval $ ( \alpha , \gamma ) $. Now $ f ^ { ( 0 ) } = f = f _ 0 $ so we see by induction on $ j $ that $ f ^ { ( j ) } = f _ j $ for $ j \le k $. The $ f _ j $'s are already known to be continuous, so $ f $ is $ k $ times continuously differentiable.

Now that we have the lemma it's straightforward, if slightly tedious, to check that the extended $ f $ and $ g $ defined previously are twice and once continuously differentiable, respectively. We'll do this just at the point $ b $ as an illustration. To the left of $ b $ we have $ m ( x ) = 0 $, $ l ( x ) = 0 $ and $ r ( x ) = ( x - a ) / ( b - a ) $ so \[\tag{2.6.19} f ( x ) = f ( a + ( b - a ) r ( x ) ) = f ( x )\] and to the right of $ b $ we have $ m ( x ) = 0 $, $ l ( x ) = 1 $ and $ r ( x ) = ( x - b ) / ( b - a ) $ so \[\tag{2.6.20} f ( x ) = f ( b - ( b - a ) r ( x ) ) = f ( 2 b - x ) .\] The first and second derivatives to the left and right of $ b $ are \[\tag{2.6.21} f ' ( x ) = \begin{cases} f ' ( x ) = f ' ( x ) & \mbox{if ${ a < x < b }$,} \\ f ' ( x ) = - f ' ( 2 b - x ) & \mbox{if ${ b < x < 2 b - a }$} \end{cases}\] and \[\tag{2.6.22} f '' ( x ) = \begin{cases} f '' ( x ) = f '' ( x ) & \mbox{if ${ a < x < b }$,} \\ f '' ( x ) = f '' ( 2 b - x ) & \mbox{if ${ b < x < 2 b - a }$.} \end{cases}\] It's clear that the second derivatives approach a common limit as $ x $ tends to $ b $ from either side. It's less clear that the first derivatives do but here we have to remember that our original $ f $ was assumed to satisfy $ f ' ( b ) = 0 $. Without this assumption the extended $ f $ would not be continuously differentiable. The argument for $ g $ at $ b $ is similar, but we only need to worry about the first derivative.
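This kind of gluing is easy to experiment with numerically. The short Python sketch below is purely illustrative and not part of the argument: it takes an arbitrary sample $ f $ on $ [ a , b ] $ with $ f ' ( b ) = 0 $, extends it past $ b $ by the even reflection $ f ( 2 b - x ) $, and watches the one-sided values of $ f $, $ f ' $ and $ f '' $ as $ x $ approaches $ b $. As Lemma 2.6.A predicts, each pair of one-sided limits agrees.

```python
import numpy as np

# Illustrative numerical check, not part of the argument in the text: take an
# f on [a, b] with f'(b) = 0, extend it past b by the even reflection
# f(x) := f(2b - x), and watch the one-sided values of f, f' and f'' at b.
# The particular f below is just a convenient sample.
a, b = 0.0, 1.0
f = lambda x: np.cos(np.pi * (x - a) / (b - a))              # satisfies f'(b) = 0
f_ext = lambda x: f(x) if x <= b else f(2 * b - x)           # even reflection at b

h = 1e-6                                                     # finite difference step
d1 = lambda x: (f_ext(x + h) - f_ext(x - h)) / (2 * h)
d2 = lambda x: (f_ext(x + h) - 2 * f_ext(x) + f_ext(x - h)) / h ** 2

for delta in [1e-2, 1e-3, 1e-4]:
    xl, xr = b - delta, b + delta
    print(f"delta={delta:.0e}  f: {f_ext(xl):+.6f}/{f_ext(xr):+.6f}  "
          f"f': {d1(xl):+.6f}/{d1(xr):+.6f}  f'': {d2(xl):+.3f}/{d2(xr):+.3f}")
```

The second derivatives match automatically, while the first derivatives only approach a common limit because $ f ' ( b ) = 0 $; dropping that assumption makes the two columns for $ f ' $ disagree.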

Let $ u $ be the solution of the initial value problem for the wave equation in $ \mathbf R ^ 2 $ with initial data given by the extended $ f $ and $ g $. We can check directly that \[\tag{2.6.23} ( O _ a u ) ( t , x ) = - u ( t , 2 a - x )\] and \[\tag{2.6.24} ( E _ b u ) ( t , x ) = u ( t , 2 b - x )\] are symmetries of the wave equation, or we can note that $ O _ a = M _ { - 1 } T _ { 0 , a } R T _ { 0 , - a } $ and $ E _ b = T _ { 0 , b } R T _ { 0 , - b } $ so each of these is a composition of symmetries and hence a symmetry. Indeed $ O _ a $ and $ E _ b $ are symmetries not just of the wave equation in general but of our particular solution. To see this it suffices to check that the initial data is unchanged by each of these and then apply the uniqueness theorem. Then we note that \[\tag{2.6.25} u ( t , a ) = ( O _ a u ) ( t , a ) = - u ( t , a )\] so $ u ( t , a ) = 0 $ and $ u $ satisfies the Dirichlet boundary condition at the left endpoint. Similarly \[\tag{2.6.26} \frac { \partial u } { \partial x } ( t , b ) = \frac { \partial E _ b u } { \partial x } ( t , b ) = - \frac { \partial u } { \partial x } ( t , b )\] so $ \partial u / \partial x ( t , b ) = 0 $ and $ u $ satisfies the Neumann boundary condition at the right endpoint. We now have a solution to our initial value problem with the given boundary conditions. We did this with a Dirichlet condition at the left endpoint and a Neumann condition at the right endpoint but the same technique works for any of the other three combinations of boundary conditions; we just need to choose the appropriate extension of the initial data $ f $ and $ g $, which will be the one which has the correct symmetry properties on spatial reflection through $ a $ and $ b $ to ensure the boundary conditions are satisfied. We therefore have the following complement to our earlier uniqueness theorem.

Theorem 2.6.B For given $ f $ and $ g $, defined on the interval $ [ a , b ] $, there is at least one solution to the initial value problem in $ \mathbf R \times [ a , b ] $ with given boundary conditions, Dirichlet or Neumann, at the endpoints.

The solution we found above has two interesting properties. First, it is periodic in the spatial variable, with period $ 4 ( b - a ) $. To see this, recall that $ O _ a $ and $ E _ b $ are symmetries of $ u $ and note that \[\tag{2.6.27} ( E _ b O _ a u ) ( t , x ) = ( O _ a u ) ( t , 2 b - x ) = - u ( t , 2 a - ( 2 b - x ) ) = - u ( t , x - 2 ( b - a ) )\] so \[\tag{2.6.28} ( E _ b O _ a E _ b O _ a u ) ( t , x ) = u ( t , x - 4 ( b - a ) ) = ( T _ { 0 , 4 ( b - a ) } u ) ( t , x ) .\] Similar arguments apply to the other three combinations of boundary conditions. With a Neumann condition at the left endpoint and a Dirichlet condition at the right endpoint we again get a solution which is periodic with period $ 4 ( b - a ) $. If we have Dirichlet conditions at both endpoints or Neumann conditions at both endpoints then we get solutions which are periodic with period $ 2 ( b - a ) $. Of course the original problem was to solve the initial value problem in $ \mathbf R \times [ a , b ] $, so when we say that the solution is spatially periodic what we really mean is that the natural extension of the solution to $ \mathbf R ^ 2 $ is periodic.

The second interesting property of our solution is that it is periodic in time, with period $ 4 ( b - a ) / c $. To see this, note that in terms of our extended initial data the solution is given by the D'Alembert formula (2.1.25). Substituting $ t + 4 ( b - a ) / c $ for $ t $ in this formula gives \[\tag{2.6.29} \begin{split} u ( t + 4 ( b - a ) / c , x ) & = \frac 1 2 f ( x + c s - c t - 4 ( b - a ) ) \\ & \quad {} + \frac 1 2 f ( x - c s + c t + 4 ( b - a ) ) \\ & \quad {} + \frac 1 { 2 c } \int _ { x + c s - c t - 4 ( b - a ) } ^ { x - c s + c t + 4 ( b - a ) } g ( y ) \, d y . \end{split}\] Adding or subtracting $ 4 ( b - a ) $ from the argument of $ f $ has no effect, as we just saw when we discussed spatial periodicity, so the first two terms are equal to the corresponding terms in the formula for $ u ( t , x ) $. For the last term we split the integral into five pieces, corresponding to the subintervals \[\tag{2.6.30} [ x + c s - c t - 4 ( b - a ) , x + c s - c t - 2 ( b - a ) ] \], \[\tag{2.6.31} [ x + c s - c t - 2 ( b - a ) , x + c s - c t ] \], \[\tag{2.6.32} [ x + c s - c t , x - c s + c t ] \], \[\tag{2.6.33} [ x - c s + c t , x - c s + c t + 2 ( b - a ) ] \], and \[\tag{2.6.34} [ x - c s + c t + 2 ( b - a ) , x - c s + c t + 4 ( b - a ) ] \]. The middle one corresponds to the corresponding terms in the formula for $ u ( t , x ) $. The first two cancel each other out because translation by $ 2 ( b - a ) $ changes the sign of $ g $, and the last two cancel for the same reason. So we find that $ u ( t + 4 ( b - a ) / c , x ) $ and $ u ( t , x ) $ are equal, as claimed.

The argument above works without changes in the case of a Neumann left endpoint and Dirichlet right endpoint. It also works when both endpoints satisfy Dirichlet boundary conditions. In fact a somewhat more careful argument gives periodicity with period $ 2 ( b - a ) / c $ in that case. No variant of the argument above can prove periodicity in time in the case of Neumann conditions at both endpoints though, because this is not true in general. Indeed the wave equation on any interval with Neumann conditions at both endpoints and initial data $ f ( x ) = 0 $, $ g ( x ) = 1 $ has as its solution $ u ( t , x ) = t $, which is not periodic in $ t $.

Section 2.7 Green's Theorem

A number of results in earlier sections of this chapter are proved by evaluating double integrals by repeated integration. A more systematic approach is to use Green's theorem, which is the following generalisation of Lemma 2.5.A.

Theorem 2.7.A Suppose the boundary of the closed bounded region $ R $ in $ \mathbf R ^ 2 $ consists of a sequence of finitely many continuously differentiable curves $ C _ 1 $, $ C _ 2 $, ..., $ C _ k $. Suppose that $ p $ and $ q $ are continuously differentiable functions on $ R $. Then \[\tag{2.7.1} \sum _ { j = 1 } ^ k \int _ { C _ j } \left ( p ( t , x ) \, d x + q ( t , x ) \, d t \right ) = \int _ R \left ( \frac { \partial q } { \partial x } - \frac { \partial p } { \partial t } \right ) \, d A ,\] where the curves $ C _ j $ are traversed in such a direction that the region $ R $ is on our left.
For simple regions, those without holes, the condition that the region is on our left means the anticlockwise direction around $ R $. The anticlockwise orientation of the curves is linked to our choice to make the $ t $ variable the vertical and the $ x $ variable the horizontal. With the reverse convention the orientation of the curves would need to be clockwise. If there are holes then boundary curves around the holes have the opposite orientation, but it will be quite a while before we need to consider such a region.

In most applications of the theorem the functions $ p $ and $ q $ are chosen so as to make the integrand $ \partial q / \partial x - \partial p / \partial t $ on the right hand side equal to zero. As a particular example, consider \[\tag{2.7.2} p = \frac { \partial u } { \partial t } , \quad q = c ^ 2 \frac { \partial u } { \partial x }\] where $ u $ is a classical solution of the wave equation. Then \[\tag{2.7.3} \frac { \partial q } { \partial x } - \frac { \partial p } { \partial t } = 0\] is just the wave equation with the signs reversed. In addition to the functions $ p $ and $ q $ we also need to choose a region $ R $. In this case we'll choose the triangle with vertices $ ( t _ 2 , x _ 2 ) $, $ ( t _ 1 , x _ 2 - c ( t _ 2 - t _ 1 ) ) $, and $ ( t _ 1 , x _ 2 + c ( t _ 2 - t _ 1 ) ) $. Let $ C _ 1 $, $ C _ 2 $ and $ C _ 3 $ be the edges from $ ( t _ 1 , x _ 2 + c ( t _ 2 - t _ 1 ) ) $ to $ ( t _ 2 , x _ 2 ) $, from $ ( t _ 2 , x _ 2 ) $ to $ ( t _ 1 , x _ 2 - c ( t _ 2 - t _ 1 ) ) $, and from $ ( t _ 1 , x _ 2 - c ( t _ 2 - t _ 1 ) ) $ to $ ( t _ 1 , x _ 2 + c ( t _ 2 - t _ 1 ) ) $.

Figure 2.7.1 Triangle for proof of D'Alembert
We have, by Green's theorem, \[\tag{2.7.4} \sum _ { j = 1 } ^ 3 \int _ { C _ j } \left ( \frac { \partial u } { \partial t } \, d x + c ^ 2 \frac { \partial u } { \partial x } \, d t \right ) = 0 .\] We can parameterise $ C _ 1 $ by \[\tag{2.7.5} t = t _ 1 + r ( t _ 2 - t _ 1 ) , \quad x = x _ 2 + c ( 1 - r ) ( t _ 2 - t _ 1 ) .\] With this parameterisation the integral over $ C _ 1 $ becomes \[\tag{2.7.6} \int _ 0 ^ 1 \left ( \frac { \partial u } { \partial t } \frac { d x } { d r } + c ^ 2 \frac { \partial u } { \partial x } \frac { d t } { d r } \right ) \, d r .\] Since \[\tag{2.7.7} \frac { d x } { d r } = - c ( t _ 2 - t _ 1 ) = - c \frac { d t } { d r }\] we can rewrite the integral as \[\tag{2.7.8} - c \int _ 0 ^ 1 \left ( \frac { \partial u } { \partial t } \frac { d t } { d r } + \frac { \partial u } { \partial x } \frac { d x } { d r } \right ) \, d r .\] By the fundamental theorem of calculus the integral above is just the value of $ u $ at the upper endpoint minus the value at the lower endpoint, so \[\tag{2.7.9} \int _ { C _ 1 } \left ( \frac { \partial u } { \partial t } \, d x + c ^ 2 \frac { \partial u } { \partial x } \, d t \right ) = - c \left [ u ( t _ 2 , x _ 2 ) - u ( t _ 1 , x _ 2 + c ( t _ 2 - t _ 1 ) ) \right ] .\] A similar calculation shows that \[\tag{2.7.10} \int _ { C _ 2 } \left ( \frac { \partial u } { \partial t } \, d x + c ^ 2 \frac { \partial u } { \partial x } \, d t \right ) = - c \left [ u ( t _ 2 , x _ 2 ) - u ( t _ 1 , x _ 2 - c ( t _ 2 - t _ 1 ) ) \right ] .\] For $ C _ 3 $, which is horizontal, we just use $ x $ as the parameter, finding \[\tag{2.7.11} \int _ { C _ 3 } \left ( \frac { \partial u } { \partial t } \, d x + c ^ 2 \frac { \partial u } { \partial x } \, d t \right ) = \int _ { x _ 2 - c ( t _ 2 - t _ 1 ) } ^ { x _ 2 + c ( t _ 2 - t _ 1 ) } \frac { \partial u } { \partial t } ( t _ 1 , x ) \, d x .\] Adding the three together we find that \[\tag{2.7.12} \begin{split} & {} - 2 c u ( t _ 2 , x _ 2 ) + c u ( t _ 1 , x _ 2 - c ( t _ 2 - t _ 1 ) ) + c u ( t _ 1 , x _ 2 + c ( t _ 2 - t _ 1 ) ) \\ & \quad {} + \int _ { x _ 2 - c ( t _ 2 - t _ 1 ) } ^ { x _ 2 + c ( t _ 2 - t _ 1 ) } \frac { \partial u } { \partial t } ( t _ 1 , x ) \, d x = 0 \end{split}\] or \[\tag{2.7.13} \begin{split} u ( t _ 2 , x _ 2 ) & = \frac 1 2 u ( t _ 1 , x _ 2 - c ( t _ 2 - t _ 1 ) ) + \frac 1 2 u ( t _ 1 , x _ 2 + c ( t _ 2 - t _ 1 ) ) \\ & \quad {} + \frac 1 { 2 c } \int _ { x _ 2 - c ( t _ 2 - t _ 1 ) } ^ { x _ 2 + c ( t _ 2 - t _ 1 ) } \frac { \partial u } { \partial t } ( t _ 1 , x ) \, d x . \end{split}\] Taking $ t _ 1 = s $, $ t _ 2 = t $, $ x _ 2 = x $ we get \[\tag{2.7.14} \begin{split} u ( t , x ) & = \frac 1 2 u ( s , x - c ( t - s ) ) + \frac 1 2 u ( s , x + c ( t - s ) ) \\ & \quad {} + \frac 1 { 2 c } \int _ { x - c ( t - s ) } ^ { x + c ( t - s ) } \frac { \partial u } { \partial t } ( s , y ) \, d y . \end{split}\] Here we needed to rename the variable of integration to avoid a clash with the variable $ x $ we substituted for $ x _ 2 $. For a solution of the initial value problem the equation above is just D'Alembert's formula (2.1.25).
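Formula (2.7.14) is easy to test numerically. The following Python sketch is an aside, not part of the argument: it uses an arbitrarily chosen solution of the form $ F ( x - c t ) + G ( x + c t ) $ and arbitrary numerical values, and simply compares the two sides of (2.7.14).

```python
import numpy as np
from scipy.integrate import quad

# Numerical spot check (an aside): take a wave equation solution of the form
# F(x - c t) + G(x + c t) and compare u(t, x) with the right hand side of the
# formula just derived.  All choices below are arbitrary sample values.
c = 2.0
F = lambda z: np.exp(-z**2)
G = lambda z: np.sin(z)
u  = lambda t, x: F(x - c*t) + G(x + c*t)
ut = lambda t, x: 2*c*(x - c*t)*np.exp(-(x - c*t)**2) + c*np.cos(x + c*t)

s, t, x = 0.3, 1.1, 0.4
integral, _ = quad(lambda y: ut(s, y), x - c*(t - s), x + c*(t - s))
rhs = 0.5*u(s, x - c*(t - s)) + 0.5*u(s, x + c*(t - s)) + integral/(2*c)
print(u(t, x), rhs)   # the two numbers should agree to quadrature accuracy
```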

We can also get energy conservation from Green's theorem using the pair of functions \[\tag{2.7.15} p = \frac 1 2 \left ( \frac { \partial u } { \partial t } \right ) ^ 2 + \frac { c ^ 2 } 2 \left ( \frac { \partial u } { \partial x } \right ) ^ 2 , \quad q = c ^ 2 \frac { \partial u } { \partial t } \frac { \partial u } { \partial x } .\] In fact this is essentially how we proved energy conservation in the case of a bounded interval, with $ R $ chosen to be a rectangle. We could also reprove the original version by the same method, taking $ R $ to be a trapezoid. Here, instead, we'll prove energy conservation for a semi-infinite interval, which we haven't treated yet.
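It is straightforward, if slightly tedious, to check by hand that this pair satisfies $ \partial q / \partial x - \partial p / \partial t = 0 $ for solutions of the wave equation. The following short sympy sketch is an aside which performs the same check symbolically, using a general d'Alembert solution as the test function.

```python
import sympy as sp

# Symbolic check (an aside): for the pair p, q above, dq/dx - dp/dt vanishes
# whenever u solves the wave equation.  We substitute a general d'Alembert
# solution u = F(x - c t) + G(x + c t).
t, x, c = sp.symbols('t x c', positive=True)
F, G = sp.Function('F'), sp.Function('G')
u = F(x - c*t) + G(x + c*t)

p = sp.Rational(1, 2)*sp.diff(u, t)**2 + c**2/2*sp.diff(u, x)**2
q = c**2*sp.diff(u, t)*sp.diff(u, x)

print(sp.simplify(sp.diff(q, x) - sp.diff(p, t)))   # prints 0
```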

Suppose then that $ u $ is a classical solution of the wave equation on $ \mathbf R \times [ a , + \infty ) $ which satisfies either the Dirichlet or Neumann boundary condition at $ a $. The region $ R $ we will choose is a quadrilateral with four sides, $ C _ 1 $ from $ ( t _ 1 , a ) $ to $ ( t _ 1 , x _ 1 ) $, $ C _ 2 $ from $ ( t _ 1 , x _ 1 ) $ to $ ( t _ 2 , x _ 2 ) $, $ C _ 3 $ from $ ( t _ 2 , x _ 2 ) $ to $ ( t _ 2 , a ) $, and $ C _ 4 $ from $ ( t _ 2 , a ) $ back to $ ( t _ 1 , a ) $, where \[\tag{2.7.16} x _ 2 = x _ 1 \pm c ( t _ 2 - t _ 1 ) .\]

Figure 2.7.2 Quadrilateral for Energy Conservation
Then \[\tag{2.7.17} \frac { \partial q } { \partial x } - \frac { \partial p } { \partial t } = 0\] since $ u $ satisfies the wave equation and so Green's theorem gives \[\tag{2.7.18} \sum _ { j = 1 } ^ 4 \int _ { C _ j } \left ( p \, d x + q \, d t \right ) = 0 .\] The integrals over $ C _ 1 $, $ C _ 3 $ and $ C _ 4 $ are straightforward. \[\tag{2.7.19} \begin{split} \int _ { C _ 1 } \left ( p \, d x + q \, d t \right ) & = \int _ { a } ^ { x _ 1 } p ( t _ 1 , x ) \, d x , \\ \int _ { C _ 3 } \left ( p \, d x + q \, d t \right ) & = - \int _ { a } ^ { x _ 2 } p ( t _ 2 , x ) \, d x , \\ \int _ { C _ 4 } \left ( p \, d x + q \, d t \right ) & = - \int _ { t _ 1 } ^ { t _ 2 } q ( t , a ) \, d t = 0 . \end{split}\] Note that on $ C _ 4 $ the integrand $ q $ vanishes identically because of the boundary conditions and note the negative sign in the integral for $ C _ 3 $, coming from the fact that $ C _ 3 $ is traversed from right to left. The interesting integral is the one over $ C _ 2 $. There $ d x = \pm c \, d t $ so \[\tag{2.7.20} p \, d x + q \, d t = \pm \frac c 2 \left ( \frac { \partial u } { \partial t } \pm c \frac { \partial u } { \partial x } \right ) ^ 2 \, d t .\] So the sign of the integrand, and hence the sign of the integral, is the same as the sign in $ x _ 2 = x _ 1 \pm c ( t _ 2 - t _ 1 ) $. Since the sum of the four integrals is zero the sum of the remaining three has the opposite sign to that in $ x _ 2 = x _ 1 \pm c ( t _ 2 - t _ 1 ) $. In other words, \[\tag{2.7.21} \int _ { a } ^ { x _ 2 + c ( t _ 2 - t _ 1 ) } p ( t _ 1 , x ) \, d x - \int _ { a } ^ { x _ 2 } p ( t _ 2 , x ) \, d x \ge 0\] while \[\tag{2.7.22} \int _ { a } ^ { x _ 2 - c ( t _ 2 - t _ 1 ) } p ( t _ 1 , x ) \, d x - \int _ { a } ^ { x _ 2 } p ( t _ 2 , x ) \, d x \le 0 .\] We can combine these into \[\tag{2.7.23} \int _ { a } ^ { x _ 2 - c ( t _ 2 - t _ 1 ) } p ( t _ 1 , x ) \, d x \le \int _ { a } ^ { x _ 2 } p ( t _ 2 , x ) \, d x \le \int _ { a } ^ { x _ 2 + c ( t _ 2 - t _ 1 ) } p ( t _ 1 , x ) \, d x .\] Now we can use the squeeze principle in the same way we did previously to conclude that if \[\tag{2.7.24} \int _ a ^ { + \infty } E ( t , x ) \, d x = \int _ a ^ { + \infty } p ( t , x ) \, d x\] converges for $ t = t _ 1 $ then it also converges for $ t = t _ 2 $ and has the same value there. Unlike the earlier argument $ t _ 1 $ and $ t _ 2 $ are not arbitrary here but have been assumed to satisfy $ t _ 1 < t _ 2 $. To treat the case $ t _ 1 > t _ 2 $ we could use a similar argument but it's simpler to note that the wave equation is symmetric under temporal reflections and apply the result we already have to \[\tag{2.7.25} \tilde u ( t , x ) = u ( t _ 1 + t _ 2 - t , x ) , \] which has the same energy at $ t = t _ 1 $ as $ u $ does at $ t = t _ 2 $ and vice versa. Thus we've proved the following energy conservation theorem for a semi-infinite interval.
Theorem 2.7.B Suppose $u$ is a classical solution to the wave equation on $ \mathbf R \times [ a , + \infty ) $ satisfying either the Dirichlet or Neumann condition at the boundary. If the total energy \[\tag{2.7.26} \int _ { a } ^ { + \infty } E ( t , x ) \, d x \] is finite for some value of $t$ then it is finite for all values of $t$ and is independent of $t$.

Section 2.8 Klein-Gordon and Sine-Gordon Equations

One reason for developing multiple techniques for proving results about the wave equation in one spatial dimension is that some generalise better to higher dimensions or related equations than others. We won't consider higher dimensions here but we will briefly consider two related equations, the Klein-Gordon and Sine-Gordon equations.

The Klein-Gordon equation \[\tag{2.8.1} \frac { \partial ^ 2 u } { \partial t ^ 2 } - c ^ 2 \frac { \partial ^ 2 u } { \partial x ^ 2 } + m ^ 2 u = 0\] plays a fundamental role in relativistic quantum mechanics. It shares some, but not all, of the symmetries of the wave equation. Of the ones we considered earlier it has the spatial reflection symmetry, spacetime translation symmetry, Lorentz symmetry and scaling symmetry in the dependent variable, but not the scaling symmetry in the independent variables or the spacetime inversion symmetry. It also has temporal reflection symmetry. In the case of the wave equation we got this from spatial reflection symmetry and scaling symmetry in the independent variables but for Klein-Gordon this needs to be checked separately because we don't have scaling symmetry in the independent variables.

We can still prove energy conservation, but with the energy density \[\tag{2.8.2} E = \frac 1 2 \left ( \frac { \partial u } { \partial t } \right ) ^ 2 + \frac { c ^ 2 } 2 \left ( \frac { \partial u } { \partial x } \right ) ^ 2 + \frac { m ^ 2 } 2 u ^ 2 .\] The first argument we used for the wave equation, using the auxiliary functions $ v $ and $ w $, does not generalise but the argument using Green's theorem does. A solution with zero initial data has zero initial energy and so has zero energy for all time. Looking at the form of the energy density this means the solution can only be the zero solution. The equation is linear, so the difference of two solutions is also a solution. If the two solutions have the same initial data then the difference has zero initial data and hence is zero, so the two solutions are the same. Thus we see that analogues of the uniqueness theorems for the wave equation also hold for the Klein-Gordon equation. The technique above is the one we used for boundary value problems for the wave equation, but not the one we originally used in $ \mathbf R ^ 2 $, which relied on having an explicit solution. Here we haven't used an explicit solution. There is one, but it involves special functions and isn't nearly as convenient to work with as the D'Alembert formula.
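The underlying local conservation law can be checked symbolically. The sympy sketch below is an aside: it takes $ \partial E / \partial t - \partial ( c ^ 2 \, \partial u / \partial t \, \partial u / \partial x ) / \partial x $, where the flux $ c ^ 2 \, \partial u / \partial t \, \partial u / \partial x $ is the natural analogue of the wave equation case rather than something quoted from the text, and confirms that it equals $ \partial u / \partial t $ times the Klein-Gordon operator applied to $ u $, so it vanishes on solutions.

```python
import sympy as sp

# Symbolic sanity check (an aside): for the Klein-Gordon energy density E
# above, dE/dt - d(c^2 u_t u_x)/dx equals u_t times the Klein-Gordon operator
# applied to u, so it vanishes on solutions.
t, x, c, m = sp.symbols('t x c m')
u = sp.Function('u')(t, x)

E = sp.Rational(1, 2)*sp.diff(u, t)**2 + c**2/2*sp.diff(u, x)**2 + m**2/2*u**2
flux = c**2*sp.diff(u, t)*sp.diff(u, x)
kg = sp.diff(u, t, 2) - c**2*sp.diff(u, x, 2) + m**2*u

print(sp.simplify(sp.diff(E, t) - sp.diff(flux, x) - sp.diff(u, t)*kg))  # 0
```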

The sine-Gordon equation \[\tag{2.8.3} \frac { \partial ^ 2 u } { \partial t ^ 2 } - \frac { \partial ^ 2 u } { \partial x ^ 2 } + \sin u = 0\] arises in a variety of geometric and physical contexts. It is reasonable to guess that small solutions should behave like solutions of the Klein-Gordon equation with $ c = m = 1 $ because $ \sin u \approx u $ for small $ u $, and this is at least somewhat correct, but it has many peculiarities of its own.

The sine-Gordon equation shares most, but not all, of the symmetries of the Klein-Gordon equation. Of the ones we've considered it has spatial and temporal reflection symmetry, spacetime translation symmetry, and Lorentz symmetry, but not scaling symmetry in the dependent variable.

Energy conservation also holds for the sine-Gordon equation, with energy density \[\tag{2.8.4} E = \frac 1 2 \left ( \frac { \partial u } { \partial t } \right ) ^ 2 + \frac 1 2 \left ( \frac { \partial u } { \partial x } \right ) ^ 2 + 1 - \cos u .\] As for Klein-Gordon, any solution with zero initial data must be zero for all time, but sine-Gordon is non-linear, so the difference of two solutions is not, in general, a solution, and so we can't obtain a uniqueness theorem in the same way. Perhaps unsurprisingly there is also no explicit solution formula for sine-Gordon.

Chapter 3 Diffusion Equation

Section 3.1 Symmetries

We met the diffusion equation (1.0.4) \[\tag{3.1.1} \frac { \partial u } { \partial t } - k \frac { \partial ^ 2 u } { \partial x ^ 2 } = 0\] earlier. The natural differentiability assumption is that $ u $ is continuously differentiable in $ t $ and twice continuously differentiable in $ x $.

Like the wave equation, the diffusion equation has a spatial reflection symmetry, but unlike the wave equation it does not have temporal reflection symmetry, and indeed behaves very differently in the positive and negative time directions. It's also symmetric under spatial and temporal translation. It also has a scaling symmetry, in both the dependent and independent variables, but the scaling in the independent variables is different from the one we had for the wave equation. Let \[\tag{3.1.2} ( S _ { \alpha , \lambda } u ) ( t , x ) = \lambda u ( t / \alpha ^ 2 , x / \alpha ) ,\] where $ \alpha $ and $ \lambda $ are non-zero. A simple calculation shows that \[\tag{3.1.3} \begin{split} & \frac { \partial S _ { \alpha , \lambda } u } { \partial t } ( t , x ) - k \frac { \partial ^ 2 S _ { \alpha , \lambda } u } { \partial x ^ 2 } ( t , x ) \\ \quad & = \frac { \lambda } { \alpha ^ 2 } \left [ \frac { \partial u } { \partial t } ( t / \alpha ^ 2 , x / \alpha ) - k \frac { \partial ^ 2 u } { \partial x ^ 2 } ( t / \alpha ^ 2 , x / \alpha ) \right ] \end{split}\] so $ S _ { \alpha , \lambda } u $ satisfies the diffusion equation if and only if $ u $ does. The diffusion equation also has some less obvious symmetries. Let \[\tag{3.1.4} ( G _ v u ) ( t , x ) = \exp \left ( - \frac { v x } { 2 k } + \frac { v ^ 2 t } { 4 k } \right ) u ( t , x - v t ) .\] Then $ G _ v $ is a symmetry for any value of $ v $. To verify this we set $ \tilde u = G _ v u $ and compute partial derivatives. \[\tag{3.1.5} \frac { \partial \tilde u } { \partial t } ( t , x ) = \exp \left ( - \frac { v x } { 2 k } + \frac { v ^ 2 t } { 4 k } \right ) \left [ \frac { \partial u } { \partial t } - v \frac { \partial u } { \partial x } + \frac { v ^ 2 } { 4 k } u \right ] ,\] \[\tag{3.1.6} \frac { \partial \tilde u } { \partial x } ( t , x ) = \exp \left ( - \frac { v x } { 2 k } + \frac { v ^ 2 t } { 4 k } \right ) \left [ \frac { \partial u } { \partial x } - \frac { v } { 2 k } u \right ] ,\] and \[\tag{3.1.7} \frac { \partial ^ 2 \tilde u } { \partial x ^ 2 } = \exp \left ( - \frac { v x } { 2 k } + \frac { v ^ 2 t } { 4 k } \right ) \left [ \frac { \partial ^ 2 u } { \partial x ^ 2 } - \frac { v } { k } \frac { \partial u } { \partial x } + \frac { v ^ 2 } { 4 k ^ 2 } u \right ] .\] On the right hand side $ u $ and its various derivatives are always evaluated at $ ( t , x - v t ) $ but this has not been written explicitly to prevent the equations from becoming too long. The same will be done in the next equation, \[\tag{3.1.8} \frac { \partial \tilde u } { \partial t } ( t , x ) - k \frac { \partial ^ 2 \tilde u } { \partial x ^ 2 } ( t , x ) = \exp \left ( - \frac { v x } { 2 k } + \frac { v ^ 2 t } { 4 k } \right ) \left [ \frac { \partial u } { \partial t } - k \frac { \partial ^ 2 u } { \partial x ^ 2 } \right ] ,\] which follows immediately from the preceding equations. The left hand side is zero if and only if the right hand side is zero and the exponential factor is non-zero so $ \tilde u $ satisfies the diffusion equation if and only if $ u $ does.
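Identity (3.1.8) holds for an arbitrary $ u $, not just for solutions, so it can be checked symbolically on any concrete test function. The sympy sketch below is an aside; the test function is an arbitrary choice.

```python
import sympy as sp

# Symbolic check (an aside) of the identity (3.1.8).  Since it is an identity
# in u, it can be tested on any concrete smooth u; the u below is an arbitrary
# choice and is not a solution of anything.
t, x, v, k = sp.symbols('t x v k', positive=True)
u = sp.sin(x)*sp.exp(t) + x**3*t**2          # arbitrary test function

phi = sp.exp(-v*x/(2*k) + v**2*t/(4*k))
u_tilde = phi * u.subs(x, x - v*t)           # (G_v u)(t, x)

lhs = sp.diff(u_tilde, t) - k*sp.diff(u_tilde, x, 2)
rhs = phi * (sp.diff(u, t) - k*sp.diff(u, x, 2)).subs(x, x - v*t)
print(sp.simplify(lhs - rhs))                # prints 0
```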

Section 3.2 Special Solutions

One thing we can do with symmetries of a differential equation is to look for solutions invariant under the symmetry. We may or may not get something interesting, depending on the equation and the symmetry. Spatial and temporal translations, for example, don't give very interesting solutions of the diffusion equation. A function which is invariant under spatial translations is just a $ u ( t , x ) $ which is independent of $ x $, but then $ \partial ^ 2 u / \partial x ^ 2 = 0 $ and so if $ u $ is a solution of the diffusion equation then $ \partial u / \partial t = 0 $ as well, so $ u $ is independent of $ t $ too, and hence constant. So the only solutions of the diffusion equation invariant under spatial translations are the constant solutions. The solutions which are invariant under temporal translations are just the linear functions of $ x $.

The only solution to the diffusion equation which is invariant under scaling in the dependent variable is the zero solution, which is also not very interesting. Scaling in the independent variables is more interesting. A scale invariant solution is one which satisfies $ S _ { \alpha , 1 } u = u $ for all $ \alpha $. In other words \[\tag{3.2.1} u ( t , x ) = u ( t / \alpha ^ 2 , x / \alpha ) .\] Since this holds for all $ \alpha $ it must hold in particular for $ \alpha = \sqrt { k t } $, so \[\tag{3.2.2} u ( t , x ) = \varphi ( x / \sqrt { k t } ) ,\] where \[\tag{3.2.3} \varphi ( y ) = u ( 1 / k , y ) .\] Here we've implicitly assumed that $ t > 0 $, so we can only expect this procedure to give us an invariant solution defined there. Of course we also need $ u $ to satisfy the diffusion equation. Taking partial derivatives, \[\tag{3.2.4} \frac { \partial u } { \partial t } ( t , x ) = - \frac 1 2 \frac x { \sqrt { k t ^ 3 } } \varphi ' ( x / \sqrt { k t } ) ,\] \[\tag{3.2.5} \frac { \partial u } { \partial x } ( t , x ) = \frac 1 { \sqrt { k t } } \varphi ' ( x / \sqrt { k t } ) ,\] and \[\tag{3.2.6} \frac { \partial ^ 2 u } { \partial x ^ 2 } ( t , x ) = \frac 1 { k t } \varphi '' ( x / \sqrt { k t } ) ,\] so \[\tag{3.2.7} \frac { \partial u } { \partial t } ( t , x ) - k \frac { \partial ^ 2 u } { \partial x ^ 2 } ( t , x ) = - \frac 1 t \left [ \varphi '' ( x / \sqrt { k t } ) + \frac 1 2 \frac x { \sqrt { k t } } \varphi ' ( x / \sqrt { k t } ) \right ] .\] The right hand side is zero for all $ t > 0 $ and all $ x $ if and only if $ \varphi $ satisfies the ordinary differential equation \[\tag{3.2.8} \varphi '' ( y ) + \frac y 2 \varphi ' ( y ) = 0 .\] This can be solved as follows. Let \[\tag{3.2.9} \psi ( y ) = \exp ( y ^ 2 / 4 ) \varphi ' ( y ) .\] Then \[\tag{3.2.10} \psi ' ( y ) = \exp ( y ^ 2 / 4 ) \left [ \varphi '' ( y ) + \frac y 2 \varphi ' ( y ) \right ]\] so $ \varphi $ satisfies the differential equation above if and only if $ \psi $ is constant. Calling that constant $ c _ 1 $ we then have \[\tag{3.2.11} \varphi ' ( y ) = c _ 1 \exp ( - y ^ 2 / 4 )\] and so \[\tag{3.2.12} \varphi ( y ) = c _ 1 \int _ 0 ^ y \exp ( - z ^ 2 / 4 ) \, d z + c _ 2\] for some other constant $ c _ 2 $. The integral above can't be expressed in terms of elementary functions. It can be expressed in terms of what's called the error function, named because of its interpretation in probability theory, but the error function is defined in terms of this integral so that doesn't really provide any new information. In any case, we conclude that the scale invariant solutions to the diffusion equation are the two parameter family given by the equation above.
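Writing $ \int _ 0 ^ y \exp ( - z ^ 2 / 4 ) \, d z = \sqrt \pi \operatorname { erf } ( y / 2 ) $, a computer algebra system can confirm that the resulting $ u $ really does satisfy the diffusion equation for $ t > 0 $. The following sympy sketch is an aside which does exactly that.

```python
import sympy as sp

# Symbolic check (an aside): the scale invariant solution built from the error
# function satisfies the diffusion equation for t > 0.  Here
# int_0^y exp(-z**2/4) dz = sqrt(pi)*erf(y/2), and c1, c2 are the two constants.
t, x, k = sp.symbols('t x k', positive=True)
c1, c2 = sp.symbols('c1 c2')
y = x / sp.sqrt(k * t)
u = c1 * sp.sqrt(sp.pi) * sp.erf(y / 2) + c2

print(sp.simplify(sp.diff(u, t) - k * sp.diff(u, x, 2)))   # prints 0
```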

Next we look for solutions of the diffusion equation which are invariant under the transformations $ G _ v $ defined earlier, i.e. those that satisfy \[\tag{3.2.13} u ( t , x ) = \exp \left ( - \frac { v x } { 2 k } + \frac { v ^ 2 t } { 4 k } \right ) u ( t , x - v t )\] for all $ v $. Again we'll look for solutions valid for $ t > 0 $. Since the equation above holds for all $ v $ it holds in particular for $ v = x / t $, which gives \[\tag{3.2.14} u ( t , x ) = \exp \left ( - \frac { x ^ 2 } { 4 k t } \right ) \varphi ( t )\] where $ \varphi ( t ) = u ( t , 0 ) $. We still need to impose the condition that $ u $ satisfies the diffusion equation. Computing partial derivatives, \[\tag{3.2.15} \frac { \partial u } { \partial t } ( t , x ) = \exp \left ( - \frac { x ^ 2 } { 4 k t } \right ) \left [ \varphi ' ( t ) + \frac { x ^ 2 } { 4 k t ^ 2 } \varphi ( t ) \right ] ,\] \[\tag{3.2.16} \frac { \partial u } { \partial x } ( t , x ) = \exp \left ( - \frac { x ^ 2 } { 4 k t } \right ) \left [ - \frac { x } { 2 k t } \varphi ( t ) \right ] ,\] and \[\tag{3.2.17} \frac { \partial ^ 2 u } { \partial x ^ 2 } ( t , x ) = \exp \left ( - \frac { x ^ 2 } { 4 k t } \right ) \left [ \frac { x ^ 2 } { 4 k ^ 2 t ^ 2 } \varphi ( t ) - \frac 1 { 2 k t } \varphi ( t ) \right ] ,\] so \[\tag{3.2.18} \frac { \partial u } { \partial t } ( t , x ) - k \frac { \partial ^ 2 u } { \partial x ^ 2 } ( t , x ) = \exp \left ( - \frac { x ^ 2 } { 4 k t } \right ) \left [ \varphi ' ( t ) + \frac 1 { 2 t } \varphi ( t ) \right ] ,\] and $ u $ satisfies the diffusion equation if and only if $ \varphi $ satisfies the ordinary differential equation \[\tag{3.2.19} \varphi ' ( t ) + \frac 1 { 2 t } \varphi ( t ) = 0 .\] The solutions of this equation are precisely the constant multiples of $ t ^ { - 1 / 2 } $. In this way we see that every solution of the diffusion equation invariant under the transformations $ G _ v $ is a constant multiple of \[\tag{3.2.20} K ( t , x ) = \frac 1 { \sqrt { 4 \pi k t } } \exp \left ( - \frac { x ^ 2 } { 4 k t } \right ) .\] This solution turns out to be so important to the theory of the diffusion equation that it is known as the fundamental solution. The extra factor $ 1 / \sqrt { 4 \pi k } $ is chosen to simplify the form of various equations which will appear later in the chapter.
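Two properties of $ K $ worth checking, symbolically or otherwise, are that it satisfies the diffusion equation for $ t > 0 $ and that its integral in $ x $ is $ 1 $, which is exactly what the normalising factor achieves. The sympy sketch below is an aside which confirms both.

```python
import sympy as sp

# Two quick checks on the fundamental solution (an aside): it satisfies the
# diffusion equation for t > 0, and its total integral in x is 1, which is
# what the normalising factor 1/sqrt(4 pi k t) is for.
t, x, k = sp.symbols('t x k', positive=True)
K = sp.exp(-x**2/(4*k*t)) / sp.sqrt(4*sp.pi*k*t)

print(sp.simplify(sp.diff(K, t) - k*sp.diff(K, x, 2)))   # prints 0
print(sp.integrate(K, (x, -sp.oo, sp.oo)))               # prints 1
```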

Section 3.3 Positivity

For the diffusion equation we will consider the initial value problem \[\tag{3.3.1} u ( s , x ) = f ( x ) .\] Note that unlike the case of the wave equation, for the diffusion equation we only specify the initial value and not the initial time derivative as data. One other difference is that we will only look for a solution for $ t \ge s $.

The diffusion equation is called the diffusion equation because it describes various diffusion processes, for chemicals, heat, etc. The last of these also explains why it is often also called the heat equation. This physical origin suggests a conjecture about the initial value problem, namely that if $ f $ is positive, or non-negative, then $ u $ should be as well. This conjecture is, unfortunately, false but it is true for bounded solutions. In fact most things we want to prove about the diffusion equation turn out to be false in general but true when we restrict our attention to bounded solutions and initial data. In fact it's possible to replace boundedness with considerably weaker growth conditions but we won't bother doing this. Of course there is no hope of $ u $ being bounded unless $ f $ is, so for the remainder of the chapter every time $ u $ and $ f $ appear there will be an implicit assumption that both are bounded. That assumption will be made explicit though in the statements of theorems or in those parts of their proofs where we use it.

Suppose $ u $ is a bounded solution of the initial value problem for the diffusion equation with non-negative initial data. For $ \epsilon > 0 $ define $ w $ by \[\tag{3.3.2} w ( t , x ) = u ( t , x ) + 3 k \epsilon t + \epsilon x ^ 2 .\] Note that $ w $ doesn't satisfy the diffusion equation, but instead satisfies the related equation \[\tag{3.3.3} \frac { \partial w } { \partial t } - k \frac { \partial ^ 2 w } { \partial x ^ 2 } = k \epsilon .\] Since $ w $ is a continuous function it must have a minimum on the rectangle $ [ s , T ] \times [ - L , L ] $ for any $ T > s $ and $ L > 0 $. It does not have a minimum in the interior $ ( s , T ) \times ( - L , L ) $ of the rectangle. If it did then the first partial derivatives would be zero there, which would then imply that $ \partial ^ 2 w / \partial x ^ 2 = - \epsilon $ there, but the second partial derivative can't be negative at an interior minimum. A similar, but more careful, argument shows that there is also no minimum on $ \{ T \} \times ( - L , L ) $, the interior of the top of the rectangle. At such a minimum $ \partial w / \partial t $ would have to be non-positive, since otherwise we would have points just below it where the value is smaller. Similarly, $ \partial w / \partial x $ would have to be non-negative since otherwise we'd have points just to its right where the value is smaller. But $ \partial w / \partial x $ would also have to be non-positive or we'd have points to its left where the value is smaller, so in fact $ \partial w / \partial x $ must be zero. It then follows that $ \partial ^ 2 w / \partial x ^ 2 $ must be non-negative at the minimum, because otherwise we'd have points on either side where the value is smaller. So $ \partial w / \partial t - k \partial ^ 2 w / \partial x ^ 2 $ would be non-positive at the minimum. But we've already seen that this is equal to $ k \epsilon $, the product of two positive numbers, so the assumption that there is a minimum on the interior of the top of the rectangle leads to a contradiction.

On the sides $ [ s , T ] \times \{ - L \} $ and $ [ s , T ] \times \{ L \} $ we have \[\tag{3.3.4} w ( t , x ) = u ( t , x ) + 3 k \epsilon t + \epsilon L ^ 2 \ge \inf u + 3 k \epsilon s + \epsilon L ^ 2 > 3 k \epsilon s\] if \[\tag{3.3.5} L > \sqrt { \frac { \max ( 0 , - \inf u ) } \epsilon } .\] Here we've used our boundedness assumption on $ u $.

The only possibility not considered so far is that the minimum of $ w $ is located on $ \{ s \} \times ( - L , L ) $, the interior of the bottom of the rectangle. If so then the definition of $ w $ and our assumption that $ f $ is non-negative imply that this minimum value of $ w $ is at least $ 3 k \epsilon s $.

What we have shown is that the minimum of $ w $ in the rectangle is attained at a point on the sides or bottom and is at least $ 3 k \epsilon s $, provided $ L $ is sufficiently large. In particular, \[\tag{3.3.6} w ( t , x ) \ge 3 k \epsilon s\] for any $ ( t , x ) $ in the rectangle, and therefore \[\tag{3.3.7} u ( t , x ) \ge - 3 k \epsilon ( t - s ) - \epsilon x ^ 2 .\] For any $ t > s $ there is a $ T > t $ and an \[\tag{3.3.8} L > \max ( | x | , \sqrt { \frac { \max ( 0 , - \inf u ) } \epsilon } ) ,\] so we have \[\tag{3.3.9} u ( t , x ) \ge - 3 k \epsilon ( t - s ) - \epsilon x ^ 2 .\] No assumptions other than positivity were made on $ \epsilon $ so this inequality holds for all positive $ \epsilon $, and we can therefore let $ \epsilon $ tend to zero. This then gives \[\tag{3.3.10} u ( t , x ) \ge 0 .\] In other words we've proved the following theorem.

Theorem 3.3.A Suppose $ u $ is a bounded classical solution of the initial value problem for the diffusion equation with initial data $ f $ which is non-negative. Then $ u $ is non-negative.
In fact we proved something slightly stronger since we only used the fact that $ u $ has a lower bound, not that it has an upper bound. A more precise result is that if the initial data is non-negative and is positive at at least one point then the solution is positive for all later times. We will prove this in the next section.

Section 3.4 Uniqueness

If $ u $ is a solution of the diffusion equation then so is $ - u $ so the positivity theorem at the end of the last section also shows that if the initial data for the initial value problem are non-positive then the solution is non-positive. Combining this with the original version we see that if the initial data are zero then the solution is also zero. The equation is linear so the difference of two solutions is also a solution. Considering the difference of two solutions then we see that if the difference of their initial data is zero then the difference of the solutions is zero. Put more simply, if they have the same initial data then they are the same solution. In this way we obtain the following uniqueness theorem.

Theorem 3.4.A There is at most one bounded classical solution to the initial value problem for the diffusion equation in the region $ [ s , + \infty ) \times \mathbf R $.
The method of proof we've used unfortunately gives no information about what this solution might be.

We had several different proofs of uniqueness for the wave equation. There was a more or less direct proof based on a pair of auxiliary functions we defined there, there was a proof based on Green's theorem, and there was a proof using energy conservation and linearity. The first two both led to D'Alembert's formula while the last one didn't give any explicit form for the solution. The first of these didn't generalise even to closely related equations like Klein-Gordon and so we can't expect to find anything similar for the diffusion equation. The last does have a generalisation to the diffusion equation, which we'll see later, but it gives us a pure uniqueness theorem, not an explicit formula, and we already have that, so it seems natural to look for an alternate uniqueness proof for the diffusion equation using Green's theorem. This works, but the choice of functions to apply Green's theorem to is much less obvious than it was for the wave equation.

We apply Green's theorem with \[\tag{3.4.1} p = - u v , \quad q = k u \frac { \partial v } { \partial x } - k v \frac { \partial u } { \partial x } ,\] where $ u $ and $ v $ are to be chosen later. The integrand on the right hand side in Green's theorem is then \[\tag{3.4.2} \frac { \partial q } { \partial x } - \frac { \partial p } { \partial t } = v \left ( \frac { \partial u } { \partial t } - k \frac { \partial ^ 2 u } { \partial x ^ 2 } \right ) + u \left ( \frac { \partial v } { \partial t } + k \frac { \partial ^ 2 v } { \partial x ^ 2 } \right ) .\] The first term on the right hand side will vanish if $ u $ satisfies the diffusion equation. The second term will vanish if $ v $ satisfies the time reversed version of the diffusion equation. Roughly the idea will be to let $ u $ be an arbitrary solution of the diffusion equation and let $ v $ be a particular solution of the time reversed equation, chosen so as to provide useful information about $ u $. In the end this isn't quite what we want, but for the moment it's a useful guide. Which particular solution should we take? We have a variety of particular solutions to the diffusion equation which we found earlier when we looked for solutions symmetric under particular symmetries and we can get a solution to the time reversed diffusion equation simply by reversing time in one of those. The most interesting of the solutions there was the fundamental solution, so we'll choose that one. We can get a bit more information though by applying an arbitrary space-time translation to the fundamental solution and then reversing time, so we'll choose \[\tag{3.4.3} v ( t , x ) = K ( t _ 3 - t , x - x _ 3 )\] for some point $ ( t _ 3 , x _ 3 ) $. Eventually we will need to modify this choice but first let's see what happens when we choose this $ v $.

Now that we have our functions $ p $ and $ q $ we need to choose a region $ R $. We will choose a strip $ [ t _ 1 , t _ 2 ] \times \mathbf R $, where $ t _ 1 < t _ 2 < t _ 3 $. Unfortunately this region is not bounded, so the hypotheses of Green's theorem are not satisfied, but we will temporarily ignore this problem and see what happens. The boundary of the strip consists of the line $ \{ t _ 1 \} \times \mathbf R $, traversed from left to right and the line $ \{ t _ 2 \} \times \mathbf R $, traversed from right to left. On each of these $ d t = 0 $ so we are integrating $ p \, d x $, i.e. $ - u v \, d x $. So Green's theorem, if it applied, would give \[\tag{3.4.4} \begin{split} & \int _ { - \infty } ^ { + \infty } K ( t _ 3 - t _ 2 , x - x _ 3 ) u ( t _ 2 , x ) \, d x \\ & \quad {} - \int _ { - \infty } ^ { + \infty } K ( t _ 3 - t _ 1 , x - x _ 3 ) u ( t _ 1 , x ) \, d x = 0 . \end{split}\] The two integrals are both of the form \[\tag{3.4.5} \begin{split} & \int _ { - \infty } ^ { + \infty } K ( t _ 3 - t _ j , x - x _ 3 ) u ( t _ j , x ) \, d x \\ & \quad {} = \frac 1 { \sqrt { 4 \pi k ( t _ 3 - t _ j ) } } \int _ { - \infty } ^ { + \infty } \exp \left ( - \frac { ( x - x _ 3 ) ^ 2 } { 4 k ( t _ 3 - t _ j ) } \right ) u ( t _ j , x ) \, d x . \end{split}\] Making the change of variable \[\tag{3.4.6} y = \frac { x - x _ 3 } { \sqrt { 4 \pi k ( t _ 3 - t _ j ) } }\] converts this integral to \[\tag{3.4.7} \int _ { - \infty } ^ { + \infty } \exp ( - \pi y ^ 2 ) u \left ( t _ j , x _ 3 + y \sqrt { 4 \pi k ( t _ 3 - t _ j ) } \right ) \, d y .\] We want to make this change of variable in the integral with $ j = 2 $ but not the one with $ j = 1 $. This gives us the equation \[\tag{3.4.8} \begin{split} & \int _ { - \infty } ^ { + \infty } \exp ( - \pi y ^ 2 ) u \left ( t _ 2 , x _ 3 + y \sqrt { 4 \pi k ( t _ 3 - t _ 2 ) } \right ) \, d y \\ & \quad {} = \frac 1 { \sqrt { 4 \pi k ( t _ 3 - t _ 1 ) } } \int _ { - \infty } ^ { + \infty } \exp \left ( - \frac { ( x - x _ 3 ) ^ 2 } { 4 k ( t _ 3 - t _ 1 ) } \right ) u ( t _ 1 , x ) \, d x . \end{split}\] Next we take limits as $ t _ 3 $ tends to $ t _ 2 $ from above, simply taking the limit inside the integral without worrying for the moment whether this is justified. On the left hand side the argument of $ u $ tends to $ ( t _ 2 , x _ 3 ) $ and $ u $ is continuous so the integrand tends to $ \exp ( - \pi y ^ 2 ) u ( t _ 2 , x _ 3 ) $. We can pull the constant outside the integral and the remaining integral is a well known definite integral with value 1 so on the left hand side of the equation we just get $ u ( t _ 2 , x _ 3 ) $. On the right hand side, using the continuity of $ u $ again, we just get the result of substituting $ t _ 2 $ for $ t _ 3 $ in the integral we had previously. 
In other words, the limit of the equation above is \[\tag{3.4.9} u ( t _ 2 , x _ 3 ) = \frac 1 { \sqrt { 4 \pi k ( t _ 2 - t _ 1 ) } } \int _ { - \infty } ^ { + \infty } \exp \left ( - \frac { ( x - x _ 3 ) ^ 2 } { 4 k ( t _ 2 - t _ 1 ) } \right ) u ( t _ 1 , x ) \, d x .\] Changing the names of various variables we see that if $ s < t $ then \[\tag{3.4.10} u ( t , x ) = \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \int _ { - \infty } ^ { + \infty } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) u ( s , y ) \, d y .\] In other words, if $ u $ satisfies the initial value problem for the diffusion equation then \[\tag{3.4.11} u ( t , x ) = \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \int _ { - \infty } ^ { + \infty } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) f ( y ) \, d y\] for all $ t > s $ and all $ x $. This formula, if correct, gives us an alternate proof of the uniqueness of solutions to the initial value problem, one which gives us an idea for how to prove existence as well: we just check that the formula above does indeed give a solution to the initial value problem. Unfortunately there are two gaps in the proof above. We applied Green's theorem improperly and we exchanged limits and integrals without justification. We need to fix that, but in fact the equation above is correct.
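As a quick sanity check on the formula, an aside not needed for the argument, one can test it numerically against a known bounded solution of the diffusion equation such as $ u ( t , x ) = e ^ { - k t } \cos x $. The following Python sketch does this with arbitrarily chosen numerical values.

```python
import numpy as np
from scipy.integrate import quad

# Numerical spot check (an aside) of the representation formula above, using
# the bounded solution u(t, x) = exp(-k t) cos(x) of the diffusion equation.
# The numbers below are arbitrary sample values with s < t.
k, s, t, x = 0.7, 0.2, 1.5, 0.9
u = lambda tt, xx: np.exp(-k*tt)*np.cos(xx)

kernel = lambda y: np.exp(-(y - x)**2/(4*k*(t - s))) / np.sqrt(4*np.pi*k*(t - s))
integral, _ = quad(lambda y: kernel(y)*u(s, y), -np.inf, np.inf)
print(u(t, x), integral)   # the two values should agree closely
```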

Usually the way to prove anything involving integrals over an infinite interval is to take limits in a finite interval. This nearly works in our current situation, but as we'll see it doesn't quite do everything we want. Let's see what happens if we apply Green's theorem with $ p $ and $ q $ as before to the rectangle $ [ t _ 1 , t _ 2 ] \times [ x _ 3 - L , x _ 3 + L ] $. This region does, of course, satisfy the hypotheses of Green's theorem. Its boundary consists of four straight segments: $ C _ 1 $ from $ ( t _ 2 , x _ 3 + L ) $ to $ ( t _ 2 , x _ 3 - L ) $, $ C _ 2 $ from $ ( t _ 2 , x _ 3 - L ) $ to $ ( t _ 1 , x _ 3 - L ) $, $ C _ 3 $ from $ ( t _ 1 , x _ 3 - L ) $ to $ ( t _ 1 , x _ 3 + L ) $, and $ C _ 4 $ from $ ( t _ 1 , x _ 3 + L ) $ to $ ( t _ 2 , x _ 3 + L ) $. As before the integrand in the area integral is zero so we have \[\tag{3.4.12} \sum _ { j = 1 } ^ 4 \int _ { C _ j } ( p \, d x + q \, d t ) = 0 .\] On the first and third boundary curves we have \[\tag{3.4.13} \int _ { C _ 1 } ( p \, d x + q \, d t ) = \int _ { x _ 3 - L } ^ { x _ 3 + L } u ( t _ 2 , x ) v ( t _ 2 , x ) \, d x\] and \[\tag{3.4.14} \int _ { C _ 3 } ( p \, d x + q \, d t ) = - \int _ { x _ 3 - L } ^ { x _ 3 + L } u ( t _ 1 , x ) v ( t _ 1 , x ) \, d x .\] When we take limits as $ L $ goes to infinity these will just tend to the corresponding integrals over $ ( - \infty , + \infty ) $, which is what we want. The integral over the right side of the rectangle is \[\tag{3.4.15} \begin{split} \int _ { C _ 4 } ( p \, d x + q \, d t ) & = k \int _ { t _ 1 } ^ { t _ 2 } u ( t , x _ 3 + L ) \frac { \partial v } { \partial x } ( t , x _ 3 + L ) \, d t \\ & \quad {} - k \int _ { t _ 1 } ^ { t _ 2 } v ( t , x _ 3 + L ) \frac { \partial u } { \partial x } ( t , x _ 3 + L ) \, d t . \end{split}\] The first of these integrals will tend to zero as $ L $ tends to infinity, a fact which we will now prove.

Our $ v $ was defined in terms of the fundamental solution $ K $ so we need an $ x $ derivative of $ K $. In fact for later purposes we will need higher order derivatives in both $ x $ and $ t $ so we go ahead and compute them now. Let \[\tag{3.4.16} w _ j ( t , x ) = ( - 2 k t ) ^ j \frac { \frac { \partial ^ j K } { \partial x ^ j } ( t , x ) } { K ( t , x ) } .\] Then $ w _ 0 = 1 $ and \[\tag{3.4.17} \begin{split} \frac { \partial ^ { j + 1 } K } { \partial x ^ { j + 1 } } ( t , x ) & = \frac { \partial } { \partial x } \frac { \partial ^ j K } { \partial x ^ j } ( t , x ) \\ & = ( - 2 k t ) ^ { - j } \frac { \partial } { \partial x } \left [ w _ j ( t , x ) K ( t , x ) \right ] \\ & = ( - 2 k t ) ^ { - j } \left [ K ( t , x ) \frac { \partial w _ j } { \partial x } ( t , x ) + w _ j ( t , x ) \frac { \partial K } { \partial x } ( t , x ) \right ] \\ & = ( - 2 k t ) ^ { - j } \left [ K ( t , x ) \frac { \partial w _ j } { \partial x } ( t , x ) - w _ j ( t , x ) \frac { x } { 2 k t } K ( t , x ) \right ] \end{split}\] and so \[\tag{3.4.18} w _ { j + 1 } ( t , x ) = x w _ j ( t , x ) - 2 k t \frac { \partial w _ j } { \partial x } ( t , x ) .\] We can use this to compute successively \[\tag{3.4.19} w _ 0 ( t , x ) = 1 , \quad w _ 1 ( t , x ) = x , \quad w _ 2 ( t , x ) = x ^ 2 - 2 k t , \quad w _ 3 ( t , x ) = x ^ 3 - 6 k t x\] and so forth. These are the only ones we'll actually need though. You'll notice, and can easily prove by induction, that $ w _ j $ is always a polynomial of degree $ j $. Once we have the $ w $'s we can easily get the $ x $ derivatives of $ K $, \[\tag{3.4.20} \frac { \partial ^ j K } { \partial x ^ j } ( t , x ) = ( - 2 k t ) ^ { - j } w _ j ( t , x ) K ( t , x ) .\] We could get $ t $ derivatives by a similar method but it's simpler just to note that $ K $ satisfies the diffusion equation so one $ t $ derivative is the same as two $ x $ derivatives and a factor of $ k $. In this way we find that \[\tag{3.4.21} \frac { \partial ^ { i + j } K } { \partial t ^ i \partial x ^ j } ( t , x ) = ( - 1 ) ^ { j } 2 ^ { - 2 i - j } k ^ { - i - j } t ^ { - 2 i - j } w _ { 2 i + j } ( t , x ) K ( t , x ) .\]
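The recursion for the $ w _ j $'s and the mixed derivative formula are easy to check with a computer algebra system. The following short sympy sketch is an aside; the case $ i = j = 1 $ is an arbitrary choice.

```python
import sympy as sp

# Symbolic check (an aside) of the recursion for w_j and of the mixed
# derivative formula above, here for the arbitrarily chosen case i = j = 1.
t, x, k = sp.symbols('t x k', positive=True)
K = sp.exp(-x**2/(4*k*t)) / sp.sqrt(4*sp.pi*k*t)

w = [sp.Integer(1)]
for j in range(3):
    w.append(sp.expand(x*w[j] - 2*k*t*sp.diff(w[j], x)))
print(w)   # [1, x, x**2 - 2*k*t, x**3 - 6*k*t*x]

i, j = 1, 1
lhs = sp.diff(K, t, i, x, j)
rhs = (-1)**j * 2**(-2*i - j) * k**(-i - j) * t**(-2*i - j) * w[2*i + j] * K
print(sp.simplify(lhs - rhs))   # prints 0
```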

The preceding calculation gives us various useful properties of $ \partial K / \partial x ( t , x ) $. First of all, it always has the opposite sign to that of $ x $. Second, by looking at its $ t $ derivative we see that as a function of $ t $ for fixed negative $ x $ it increases until it reaches a maximum at $ t = x ^ 2 / 6 k $ and then decreases again, while for fixed positive $ x $ it decreases until it reaches a minimum at $ t = x ^ 2 / 6 k $ and then increases again. What this tells us about $ \partial v / \partial x ( t , x _ 3 + L ) = \partial K / \partial x ( t _ 3 - t , L ) $ is that it is negative for all $ t $ in the interval $ [ t _ 1 , t _ 2 ] $ and that its minimum in that interval is attained when $ t = t _ 1 $ provided that $ L $ is sufficiently large, specifically $ L \ge \sqrt { 6 k ( t _ 3 - t _ 1 ) } $. At that minimum $ \partial v / \partial x $ is equal to \[\tag{3.4.22} - \frac { L } { 4 \pi ^ { 1 / 2 } k ^ { 3 / 2 } ( t _ 3 - t _ 1 ) ^ { 3 / 2 } } \exp \left ( - \frac { L ^ 2 } { 4 k ( t _ 3 - t _ 1 ) } \right ) .\] The integral \[\tag{3.4.23} \begin{split} \int _ { t _ 1 } ^ { t _ 2 } u ( t , x _ 3 + L ) \frac { \partial v } { \partial x } ( t , x _ 3 + L ) \, d t \end{split}\] which we met earlier has an absolute value less than or equal to the integral of the absolute value of the integrand, which in turn is less than or equal to the length of the interval, $ t _ 2 - t _ 1 $, times the maximum value of the absolute value of the integrand. This in turn is bounded by the supremum of the absolute value of $ u $, which exists by our assumption that $ u $ is bounded, times the maximum value of the absolute value of $ \partial v / \partial x $, which we just computed. This is the only factor which depends on $ L $ and it clearly tends to zero as $ L $ tends to infinity, so the integral tends to zero, as promised.

We've now treated one of the two terms in the integral over $ C _ 4 $. The other term is the one involving the integral \[\tag{3.4.24} \begin{split} \int _ { t _ 1 } ^ { t _ 2 } v ( t , x _ 3 + L ) \frac { \partial u } { \partial x } ( t , x _ 3 + L ) \, d t . \end{split}\] If we try to apply a similar argument to this integral then we run into a problem. We assumed $ u $ was bounded so we had an upper bound for the absolute value of the factor $ u ( t , x _ 3 + L ) $ which was independent of $ L $. We haven't assumed that $ \partial u / \partial x $ is bounded though, so we don't have an upper bound for the absolute value of the factor $ \partial u / \partial x ( t , x _ 3 + L ) $ in the integral above, or at least not one which is independent of $ L $. At this point there are two options. One is just to add the boundedness of $ \partial u / \partial x $ as an additional hypothesis and the other is to look for a cleverer argument. Adding an additional hypothesis might seem reasonable. After all, we already added the hypothesis that $ u $ is bounded so why not just add another hypothesis? The situation here is different though. There are known counter-examples to the uniqueness theorem with the boundedness assumption removed so we had to add it, or possibly some weaker version of it. There are no counter-examples to the version of the uniqueness theorem which assumes boundedness of $ u $ but not of $ \partial u / \partial x $. We know that because we gave a proof of that theorem at the start of this section! So this is an assumption made purely for convenience, not from logical necessity. Mathematicians do sometimes make unnecessary hypotheses in order to simplify proofs but it's something we generally prefer to avoid so in this case we will not make any additional hypothesis and will instead look for a cleverer argument.

The problem came from the $ v \partial u / \partial x $ term so that is what we somehow have to eliminate. A simple way to kill this term is to choose a $ v $ which is zero on the left and right sides of the rectangle. The $ v $ we chose previously did not have this property. Indeed that $ v $ is positive everywhere on the boundary of the rectangle. It's easy to find $ v $'s which are zero on the left and right sides but the trick is to find one which doesn't spoil the rest of the argument. It's reasonable to try multiplying our previous $ v $ by a factor which vanishes on the left and right sides, for example \[\tag{3.4.25} v ( t , x ) = \rho \left ( \frac { x - x _ 3 } L \right ) K ( t _ 3 - t , x - x _ 3 ) ,\] where \[\tag{3.4.26} \rho ( r ) = \begin{cases} 0 & \mbox{ if ${ r < - 1 }$,} \\ 192 r ^ 5 + 720 r ^ 4 + 1040 r ^ 3 + 720 r ^ 2 + 240 r + 32 & \mbox{ if ${ - 1 \le r \le - 1 / 2 }$,} \\ 1 & \mbox{ if ${ - 1 / 2 < r < 1 / 2 }$,} \\ - 192 r ^ 5 + 720 r ^ 4 - 1040 r ^ 3 + 720 r ^ 2 - 240 r + 32 & \mbox{ if ${ 1 / 2 \le r \le 1 }$,} \\ 0 & \mbox{ if ${ r > 1 }$.} \\ \end{cases}\] This is not the only choice we could have made for $ \rho $ but it is relatively straightforward to check, using Lemma 2.6.A, that it is twice continuously differentiable. This implies in particular that it and its first two derivatives are zero at $ r = \pm 1 $. These properties ensure that our $ v $ is twice continuously differentiable and that our $ q $ is zero when $ x = x _ 3 \pm L $, which includes the boundary segments $ C _ 2 $ and $ C _ 4 $, so the corresponding integrals in Green's theorem are zero.
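If you want to convince yourself that this particular $ \rho $ really is twice continuously differentiable without grinding through Lemma 2.6.A by hand, the short Python sketch below, an aside, evaluates the two polynomial pieces and their first two derivatives at the break points; the function values match the adjacent constants and all of the printed derivative values are zero.

```python
import numpy as np

# Illustrative check (an aside) that rho above is C^2: at each break point the
# polynomial piece matches the adjacent constant (0 or 1) and has vanishing
# first and second derivatives there.
left  = np.poly1d([ 192, 720,  1040, 720,  240, 32])   # piece on [-1, -1/2]
right = np.poly1d([-192, 720, -1040, 720, -240, 32])   # piece on [ 1/2,  1]

for p, r0, r1 in [(left, -1.0, -0.5), (right, 0.5, 1.0)]:
    print("values  ", p(r0), p(r1))                     # 0 and 1 (or 1 and 0)
    print("1st der ", p.deriv(1)(r0), p.deriv(1)(r1))   # 0, 0
    print("2nd der ", p.deriv(2)(r0), p.deriv(2)(r1))   # 0, 0
```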

Modifying $ v $ as we did above fixes one problem but creates another. Our new $ v $ no longer satisfies the time reversed diffusion equation. Instead \[\tag{3.4.27} \begin{split} \frac { \partial v } { \partial t } ( t , x ) + k \frac { \partial ^ 2 v } { \partial x ^ 2 } ( t , x ) & = \frac k { L ^ 2 } \rho '' \left ( \frac { x - x _ 3 } L \right ) K ( t _ 3 - t , x - x _ 3 ) \\ & \quad {} + \frac { 2 k } { L } \rho ' \left ( \frac { x - x _ 3 } L \right ) \frac { \partial K } { \partial x } ( t _ 3 - t , x - x _ 3 ) . \end{split}\] This means the right hand side in Green's identity is no longer zero. It is important to note though that all of the terms on the right hand side of the equation above have at least one derivative of $ \rho $ and $ \rho $ is constant in the interval $ [ - 1 / 2 , 1 / 2 ] $ so the right hand side is zero for $ x $ in the interval $ [ x _ 3 - L / 2 , x _ 3 + L / 2 ] $. So Green's Theorem now gives us \[\tag{3.4.28} \begin{split} \int _ { x _ 3 - L } ^ { x _ 3 + L } u ( t _ 2 , x ) v ( t _ 2 , x ) \, d x & = \int _ { x _ 3 - L } ^ { x _ 3 + L } u ( t _ 1 , x ) v ( t _ 1 , x ) \, d x \\ & \quad {} + \int _ { R _ - } u \left ( \frac { \partial v } { \partial t } + k \frac { \partial ^ 2 v } { \partial x ^ 2 } \right ) \, d A \\ & \quad {} + \int _ { R _ + } u \left ( \frac { \partial v } { \partial t } + k \frac { \partial ^ 2 v } { \partial x ^ 2 } \right ) \, d A , \end{split}\] where $ R _ - $ is the rectangle $ [ t _ 1 , t _ 2 ] \times [ x _ 3 - L , x _ 3 - L / 2 ] $ and $ R _ + $ is the rectangle $ [ t _ 1 , t _ 2 ] \times [ x _ 3 + L / 2 , x _ 3 + L ] $. Compared to our previous calculation we've lost the line integrals over $ C _ 2 $ and $ C _ 4 $ and gained two area integrals over the rectangles $ R _ - $ and $ R _ + $. These are what remain of the original integral over the full rectangle when we remove the part over the rectangle $ [ t _ 1 , t _ 2 ] \times [ x _ 3 - L / 2 , x _ 3 + L / 2 ] $, where, as we've seen, $ v $ satisfies the time reversed diffusion equation and so the integrand vanishes there.

We need to show that the area integrals above are harmless, i.e. that they tend to zero as $ L $ tends to infinity. The two are similar so here we'll only consider the integral over $ R _ - $. The $ u $ factor has upper and lower bounds independent of $ L $ by assumption. We've computed $ \partial v / \partial t + k \partial ^ 2 v / \partial x ^ 2 $ in terms of derivatives of $ \rho $ and $ K $. The derivatives of $ \rho $ must be bounded in the interval $ [ - 1 , - 1 / 2 ] $ because polynomials are continuous functions. We don't really care what the precise bounds are but they're not hard to obtain. The minimum and maximum of $ \rho '' $ are $ - 40 / \sqrt 3 $ and $ 40 / \sqrt 3 $ while the minimum and maximum of $ \rho ' $ are $ 0 $ and $ 15 / 4 $. Bounding $ K $ and its $ x $ derivative is more interesting, but we already have experience with this problem from our earlier attempt. As long as $ L $ is sufficiently large, which in this case means $ L \ge 2 \sqrt { 6 k ( t _ 3 - t _ 1 ) } $, both terms will be positive in $ R _ - $, increasing in $ x $ and decreasing in $ t $ there, so the values in the rectangle lie between zero and the values at the corner $ ( t _ 1 , x _ 3 - L / 2 ) $: \[\tag{3.4.29} \begin{split} 0 \le K ( t _ 3 - t , x - x _ 3 ) & \le K ( t _ 3 - t _ 1 , - L / 2 ) \\ & {} = \frac 1 { \sqrt { 4 \pi k ( t _ 3 - t _ 1 ) } } \exp \left ( - \frac { ( L / 2 ) ^ 2 } { 4 k ( t _ 3 - t _ 1 ) } \right ) \end{split}\] and \[\tag{3.4.30} \begin{split} 0 \le \frac { \partial K } { \partial x } ( t _ 3 - t , x - x _ 3 ) & \le \frac { \partial K } { \partial x } ( t _ 3 - t _ 1 , - L / 2 ) \\ & {} = \frac { \pi L } { \left [ 4 \pi k ( t _ 3 - t _ 1 ) \right ] ^ { 3 / 2 } } \exp \left ( - \frac { ( L / 2 ) ^ 2 } { 4 k ( t _ 3 - t _ 1 ) } \right ) . \end{split}\] These bounds are quite messy but the important point is that our integrand is bounded in absolute value by a factor independent of $ L $ times \[\tag{3.4.31} \exp \left ( - \frac { ( L / 2 ) ^ 2 } { 4 k ( t _ 3 - t _ 1 ) } \right )\] and so our integral, which is over a rectangle of area $ L ( t _ 2 - t _ 1 ) / 2 $, is bounded in absolute value by a factor independent of $ L $ times \[\tag{3.4.32} L \exp \left ( - \frac { ( L / 2 ) ^ 2 } { 4 k ( t _ 3 - t _ 1 ) } \right ) ,\] and so tends to zero as $ L $ tends to infinity, as we wanted.

There are still the integrals over $ C _ 1 $ and $ C _ 3 $ to be dealt with, but these are the same integrals as before except for an extra factor of $ \rho ( ( x - x _ 3 ) / L ) $ in the integrands, which tends to 1 as $ L $ tends to infinity, and so is harmless. Of course we glossed over the interchange of limits and integrals in our previous, unsuccessful, argument and we are still doing so here, but other than that we have a new proof of our earlier uniqueness theorem, and this one gives an actual solution formula. From this formula we can extract a lot of useful information. For example, if $ f $ is non-negative everywhere and positive somewhere then the same will be true of the integrand in the integral formula for $ u ( t , x ) $ for $ t > s $, and so $ u ( t , x ) $ will be positive. This is the strengthened version of the non-negativity theorem mentioned earlier.

Section 3.5 Regularity

What we have shown above is that if there is a classical solution to the initial value problem for the diffusion equation then it must be \[\tag{3.5.1} u ( t , x ) = \int _ { - \infty } ^ { + \infty } \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) f ( y ) \, d y .\] We haven't yet shown that this is a solution though. The first thing we'll show is that it does indeed satisfy the diffusion equation, at least when $ t > s $. For this, and to fill a gap in our earlier proof of uniqueness, we need some multivariable calculus.
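Before doing so, here is a small numerical sanity check, which is no part of the proof: we can evaluate the integral in the formula by quadrature and compare $ \partial u / \partial t $ with $ k \, \partial ^ 2 u / \partial x ^ 2 $ by finite differences. The data $ f $, the constants and the test point below are arbitrary illustrative choices, as is the use of Python with scipy.

```python
# Illustrative check that the formula (3.5.1) satisfies u_t = k u_xx for t > s.
import numpy as np
from scipy.integrate import quad

k, s = 0.7, 0.0
f = lambda y: np.cos(y) / (1.0 + y * y)   # any bounded continuous data will do

def u(t, x):
    # the integrand of (3.5.1); quad handles the infinite interval
    def g(y):
        return np.exp(-(y - x)**2 / (4*k*(t - s))) * f(y) / np.sqrt(4*np.pi*k*(t - s))
    return quad(g, -np.inf, np.inf, epsabs=1e-12, epsrel=1e-12)[0]

t0, x0, h = 1.3, 0.4, 1e-3
u_t  = (u(t0 + h, x0) - u(t0 - h, x0)) / (2*h)
u_xx = (u(t0, x0 + h) - 2*u(t0, x0) + u(t0, x0 - h)) / h**2
print(u_t - k * u_xx)   # small; limited by quadrature and finite-difference error
```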

We have the following standard theorems from multivariable calculus.

Theorem 3.5.A Suppose that $ f $ is continuous on the product of closed intervals \[\tag{3.5.2} R = [ a _ 1 , b _ 1 ] \times \cdots \times [ a _ m , b _ m ]\] in $ \mathbf R ^ { m } $ and $ \sigma $ is a permutation of $ 1 , \ldots , m $. Then \[\tag{3.5.3} \begin{split} & \int _ { a _ { \sigma ( m ) } } ^ { b _ { \sigma ( m ) } } \ldots \int _ { a _ { \sigma ( 1 ) } } ^ { b _ { \sigma ( 1 ) } } f ( x _ 1 , \ldots , x _ m ) \, d x ^ { \sigma ( 1 ) } \cdots d x ^ { \sigma ( m ) } \\ & \qquad = \int _ { a _ m } ^ { b _ m } \ldots \int _ { a _ 1 } ^ { b _ 1 } f ( x _ 1 , \ldots , x _ m ) \, d x ^ 1 \cdots d x ^ m . \end{split}\] The cleanest way to prove this is to define integration over sufficiently general sets in $ \mathbf R ^ m $, for example over polyhedral regions, and then show that each of the repeated integrals above is equal to the integral over the whole region.
Theorem 3.5.B Suppose that $ f $ is continuous on the product of closed intervals \[\tag{3.5.4} R = [ a _ 1 , b _ 1 ] \times \cdots \times [ a _ m , b _ m ] \times [ c _ 1 , d _ 1 ] \times \cdots \times [ c _ n , d _ n ]\] in $ \mathbf R ^ { m + n } $. Then \[\tag{3.5.5} g ( x _ 1 , \ldots , x _ m ) = \int _ { c _ n } ^ { d _ n } \cdots \int _ { c _ 1 } ^ { d _ 1 } f ( x _ 1 , \ldots , x _ m , y _ 1 , \ldots , y _ n ) \, d y _ 1 \cdots d y _ n\] is continuous on $ [ a _ 1 , b _ 1 ] \times \cdots \times [ a _ m , b _ m ] $.
Theorem 3.5.B can be combined with the fundamental theorem of calculus to give a criterion for differentiation under the integral sign. Suppose that $ f $ is continuously differentiable in $ R $. Then the difference quotient \[\tag{3.5.6} \frac { g ( x _ 1 , \ldots , x _ j + h , \ldots , x _ m ) - g ( x _ 1 , \ldots , x _ j , \ldots , x _ m ) } h\] is equal to \[\tag{3.5.7} \int _ { c _ n } ^ { d _ n } \cdots \int _ { c _ 1 } ^ { d _ 1 } \frac { f ( x _ 1 , \ldots , x _ j + h , \ldots , x _ m , y _ 1 , \ldots , y _ n ) - f ( x _ 1 , \ldots , x _ j , \ldots , x _ m , y _ 1 , \ldots , y _ n ) } h \, d y _ 1 \cdots d y _ n\] and this, by the fundamental theorem of calculus, is equal to the integral \[\tag{3.5.8} \int _ { c _ n } ^ { d _ n } \cdots \int _ { c _ 1 } ^ { d _ 1 } \int _ { 0 } ^ 1 \frac { \partial f } { \partial x _ j } ( x _ 1 , \ldots , x _ j + r h , \ldots , x _ m , y _ 1 , \ldots , y _ n ) \, d r \, d y _ 1 \cdots d y _ n .\] It doesn't matter in which order we perform the integrals so we can also write this as \[\tag{3.5.9} \int _ { 0 } ^ 1 \int _ { c _ n } ^ { d _ n } \cdots \int _ { c _ 1 } ^ { d _ 1 } \frac { \partial f } { \partial x _ j } ( x _ 1 , \ldots , x _ j + r h , \ldots , x _ m , y _ 1 , \ldots , y _ n ) \, d y _ 1 \cdots d y _ n \, d r .\] The integrand is a continuous function on a product of closed intervals so the limit of the integral as $ h $ tends to zero exists and is equal to the integral of the limit, i.e. \[\tag{3.5.10} \int _ { 0 } ^ 1 \int _ { c _ n } ^ { d _ n } \cdots \int _ { c _ 1 } ^ { d _ 1 } \frac { \partial f } { \partial x _ j } ( x _ 1 , \ldots , x _ j , \ldots , x _ m , y _ 1 , \ldots , y _ n ) \, d y _ 1 \cdots d y _ n \, d r ,\] which is the same as \[\tag{3.5.11} \int _ { c _ n } ^ { d _ n } \cdots \int _ { c _ 1 } ^ { d _ 1 } \frac { \partial f } { \partial x _ j } ( x _ 1 , \ldots , x _ m , y _ 1 , \ldots , y _ n ) \, d y _ 1 \cdots d y _ n .\] The $ j $'th partial derivative of $ g $ is defined as the limit of this difference quotient so what we've just found is that $ \partial g / \partial x _ j $ exists and \[\tag{3.5.12} \frac { \partial g } { \partial x _ j } ( x _ 1 , \ldots , x _ m ) = \int _ { c _ n } ^ { d _ n } \cdots \int _ { c _ 1 } ^ { d _ 1 } \frac { \partial f } { \partial x _ j } ( x _ 1 , \ldots , x _ m , y _ 1 , \ldots , y _ n ) \, d y _ 1 \cdots d y _ n ,\] which is just what we would get by formally differentiating under the integral sign. From Theorem 3.5.B we see that this partial derivative is in fact continuous. So we have the following theorem.
Theorem 3.5.C Suppose that $ f $ is continuously differentiable on the product of intervals \[\tag{3.5.13} R = [ a _ 1 , b _ 1 ] \times \cdots \times [ a _ m , b _ m ] \times [ c _ 1 , d _ 1 ] \times \cdots \times [ c _ n , d _ n ]\] in $ \mathbf R ^ { m + n } $. Then \[\tag{3.5.14} g ( x _ 1 , \ldots , x _ m ) = \int _ { c _ n } ^ { d _ n } \cdots \int _ { c _ 1 } ^ { d _ 1 } f ( x _ 1 , \ldots , x _ m , y _ 1 , \ldots , y _ n ) \, d y _ 1 \cdots d y _ n\] is continuously differentiable on $ [ a _ 1 , b _ 1 ] \times \cdots \times [ a _ m , b _ m ] $ and its partial derivatives can be obtained by formally exchanging the partial derivative and integral.

Without additional hypotheses none of these theorems are valid if the closed intervals are replaced by open or half-open intervals. For example, \[\tag{3.5.15} f ( x , y ) = \frac { 8 x y ( x ^ 2 - y ^ 2 ) } { ( x ^ 2 + y ^ 2 ) ^ 3 }\] is continuous on the product of open intervals $ ( 0 , 1 ) \times ( 0 , 1 ) $ but \[\tag{3.5.16} \int _ 0 ^ 1 \int _ 0 ^ 1 f ( x , y ) \, d x \, d y = - 1\] while \[\tag{3.5.17} \int _ 0 ^ 1 \int _ 0 ^ 1 f ( x , y ) \, d y \, d x = 1 ,\] which would be a counter-example to 3.5.A if it applied to products of open intervals. This example has the property that \[\tag{3.5.18} \int _ 0 ^ 1 \int _ 0 ^ 1 \left | f ( x , y ) \right | \, d x \, d y = \infty .\] This is not an accident. One can, in fact, show the following.
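As a quick check of these values, one can compute the two iterated integrals numerically. The following Python sketch is purely illustrative; it uses nested scipy quadratures with a break point on the diagonal, where the inner integrand has its sharp feature.

```python
# Illustrative computation of the two iterated integrals of (3.5.15).
import numpy as np
from scipy.integrate import quad

f = lambda x, y: 8*x*y*(x*x - y*y) / (x*x + y*y)**3

dx_then_dy = quad(lambda y: quad(lambda x: f(x, y), 0, 1, points=[y])[0], 0, 1)[0]
dy_then_dx = quad(lambda x: quad(lambda y: f(x, y), 0, 1, points=[x])[0], 0, 1)[0]
print(dx_then_dy, dy_then_dx)   # approximately -1 and 1
```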

Theorem 3.5.D Suppose that $ f $ is continuous on the product of intervals \[\tag{3.5.19} R = I _ 1 \times \cdots \times I _ m\] in $ \mathbf R ^ { m } $, and let $ a _ j = \inf I _ j $ and $ b _ j = \sup I _ j $. If \[\tag{3.5.20} \int _ { a _ m } ^ { b _ m } \ldots \int _ { a _ 1 } ^ { b _ 1 } \left | f ( x _ 1 , \ldots , x _ m ) \right | \, d x ^ 1 \cdots d x ^ m < \infty\] then \[\tag{3.5.21} \begin{split} & \int _ { a _ { \sigma ( m ) } } ^ { b _ { \sigma ( m ) } } \ldots \int _ { a _ { \sigma ( 1 ) } } ^ { b _ { \sigma ( 1 ) } } f ( x _ 1 , \ldots , x _ m ) \, d x ^ { \sigma ( 1 ) } \cdots d x ^ { \sigma ( m ) } \\ & \qquad = \int _ { a _ m } ^ { b _ m } \ldots \int _ { a _ 1 } ^ { b _ 1 } f ( x _ 1 , \ldots , x _ m ) \, d x ^ 1 \cdots d x ^ m . \end{split}\] for every permutation $ \sigma $ of $ 1 , \ldots , m $.
The proof is fairly simple. Both integrals in the equation above are defined by a limit of integrals over products of closed intervals $ J _ 1 \times \cdots \times J _ m $ where $ J _ j = [ \alpha _ j , \beta _ j ] $ with $ a _ j < \alpha _ j < \beta _ j < b _ j $, the limit being taken as $ \alpha _ j \to a _ j ^ + $ and $ \beta _ j \to b _ j ^ - $, so what we need to do is to show that these limits for the two integrals exist and are equal. We do this by noting that the integral of the absolute value, defined by a similar limit of integrals, is assumed to be convergent and hence Cauchy. Using the fact that the absolute value of an integral is less than or equal to the integral of the absolute values we can then show that the other two limits are also Cauchy and hence convergent. Theorem 3.5.A applies to the integrals over $ J _ 1 \times \cdots \times J _ m $ so those are independent of the order of integration and therefore the same applies to their limit.

If you remember the proof that the limit of an absolutely convergent double sum is independent of the order of summation you may notice that it follows exactly the same lines, just with finite sums in place of integrals over finite intervals.

There was no assumption, either in the statement of the theorem above or in its proof, that $ a _ j $ or $ b _ j $ is finite. The theorem applies to finite intervals, semi-infinite intervals, infinite intervals, or any combination of them.

The same approach, using an integrable upper bound to write an integral over a product of intervals as an integral over a product of closed intervals plus a small error, and then applying the corresponding theorem for products of closed intervals, also gives analogues of the other two theorems.

Theorem 3.5.E Suppose that $ f $ is continuous on the product of intervals \[\tag{3.5.22} R = I _ 1 \times \cdots \times I _ m \times J _ 1 \times \cdots \times J _ n\] in $ \mathbf R ^ { m + n } $ and let $ c _ k = \inf J _ k $ and $ d _ k = \sup J _ k $. Suppose also that there is a non-negative continuous $ h $ on $ J _ 1 \times \cdots \times J _ n $ such that \[\tag{3.5.23} \left | f ( x _ 1 , \ldots , x _ m , y _ 1 , \ldots , y _ n ) \right | \le h ( y _ 1 , \ldots , y _ n )\] and \[\tag{3.5.24} \int _ { c _ n } ^ { d _ n } \cdots \int _ { c _ 1 } ^ { d _ 1 } h ( y _ 1 , \ldots , y _ n ) \, d y _ 1 \cdots d y _ n < \infty .\] Then \[\tag{3.5.25} g ( x _ 1 , \ldots , x _ m ) = \int _ { c _ n } ^ { d _ n } \cdots \int _ { c _ 1 } ^ { d _ 1 } f ( x _ 1 , \ldots , x _ m , y _ 1 , \ldots , y _ n ) \, d y _ 1 \cdots d y _ n\] is continuous on $ I _ 1 \times \cdots \times I _ m $.
Theorem 3.5.F Suppose that $ f $ is continuously differentiable on the product of intervals \[\tag{3.5.26} R = I _ 1 \times \cdots \times I _ m \times J _ 1 \times \cdots \times J _ n\] in $ \mathbf R ^ { m + n } $ and let $ c _ k = \inf J _ k $ and $ d _ k = \sup J _ k $. Suppose also that there are non-negative continuous functions $ h _ j $ on $ J _ 1 \times \cdots \times J _ n $ such that \[\tag{3.5.27} \left | \frac { \partial f } { \partial x _ j } ( x _ 1 , \ldots , x _ m , y _ 1 , \ldots , y _ n ) \right | \le h _ j ( y _ 1 , \ldots , y _ n )\] and \[\tag{3.5.28} \int _ { c _ n } ^ { d _ n } \cdots \int _ { c _ 1 } ^ { d _ 1 } h _ j ( y _ 1 , \ldots , y _ n ) \, d y _ 1 \cdots d y _ n < \infty .\] Then \[\tag{3.5.29} g ( x _ 1 , \ldots , x _ m ) = \int _ { c _ n } ^ { d _ n } \cdots \int _ { c _ 1 } ^ { d _ 1 } f ( x _ 1 , \ldots , x _ m , y _ 1 , \ldots , y _ n ) \, d y _ 1 \cdots d y _ n\] is continuously differentiable on $ I _ 1 \times \cdots \times I _ m $ and its partial derivatives can be obtained by formally exchanging the partial derivative and integral.

Once we have these theorems it's easy to see that Theorem 3.5.E is exactly what we need to justify the exchange of limits and integrals in the proof of the integral representation above. There we had an integral of the form \[\tag{3.5.30} \int _ { - \infty } ^ { + \infty } \exp ( - \pi y ^ 2 ) u \left ( x _ 3 + y \sqrt { 4 \pi k ( t _ 3 - t _ 2 ) } \right ) \, d y\] and we wanted to take the limit as $ t _ 3 $ tended to $ t _ 2 $ from above. The theorem says we can do that if we can find an integrable non-negative function $ h $ such that \[\tag{3.5.31} \exp ( - \pi y ^ 2 ) \left | u \left ( x _ 3 + y \sqrt { 4 \pi k ( t _ 3 - t _ 2 ) } \right ) \right | \le h ( y )\] for all $ y $ and it's clear that $ h ( y ) = M \exp ( - \pi y ^ 2 ) $ works, where $ M $ is a uniform bound on $ | u | $, which we assumed exists.

We can also use Theorem 3.5.F to show that the required derivatives of $ u $ exist and are continuous for $ t > s $. Formal differentiation of the solution formula gives \[\tag{3.5.32} \frac { \partial u } { \partial t } ( t , x ) = \int _ { - \infty } ^ { + \infty } \frac { \partial } { \partial t } \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) f ( y ) \, d y\] and \[\tag{3.5.33} \frac { \partial u } { \partial x } ( t , x ) = \int _ { - \infty } ^ { + \infty } \frac { \partial } { \partial x } \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) f ( y ) \, d y .\] The partial derivatives in question were computed earlier. For example, \[\tag{3.5.34} \frac { \partial } { \partial x } \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) = \frac { y - x } { \sqrt { 16 \pi k ^ 3 ( t - s ) ^ 3 } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) .\] Differentiating, we see that as a function of $ t $ for fixed values of the other variables the absolute value of the right hand side increases from zero to a maximum of \[\tag{3.5.35} \sqrt { \frac { 27 } { 2 \pi } } \frac { \exp ( - 3 / 2 ) } { ( y - x ) ^ 2 }\] at \[\tag{3.5.36} t = s + \frac { ( y - x ) ^ 2 } { 6 k }\] and then decreases to zero again. If we restrict our attention to $ t \in [ t _ 1 , t _ 2 ] $ for some $ t _ 2 > t _ 1 > s $ then there are three different cases, depending on the size of $ | x - y | $. For small values of $ | x - y | $, specifically when \[\tag{3.5.37} | x - y | \le \sqrt { 6 k ( t _ 1 - s ) }\] the maximum occurs when $ t = t _ 1 $, so we have \[\tag{3.5.38} \left | \frac { \partial } { \partial x } \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) \right | \le \frac { | x - y | } { \sqrt { 16 \pi k ^ 3 ( t _ 1 - s ) ^ 3 } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t _ 1 - s ) } \right ) .\] For large values of $ | x - y | $, specifically when \[\tag{3.5.39} | x - y | \ge \sqrt { 6 k ( t _ 2 - s ) }\] the maximum occurs when $ t = t _ 2 $, so we have \[\tag{3.5.40} \left | \frac { \partial } { \partial x } \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) \right | \le \frac { | x - y | } { \sqrt { 16 \pi k ^ 3 ( t _ 2 - s ) ^ 3 } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t _ 2 - s ) } \right ) .\] For values in between those ranges the maximum occurs inside the interval $ ( t _ 1 , t _ 2 ) $ and so we have \[\tag{3.5.41} \left | \frac { \partial } { \partial x } \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) \right | \le \sqrt { \frac { 27 } { 2 \pi } } \frac { \exp ( - 3 / 2 ) } { ( y - x ) ^ 2 } ,\] and on that middle range $ ( y - x ) ^ { - 2 } \le 1 / ( 6 k ( t _ 1 - s ) ) $, so this bound is finite. The bounds above on the absolute value of the partial derivative are ugly, but they combine to give an integrable function of $ y $ because of the exponential decay at infinity. It remains integrable after we multiply by $ M $, a uniform bound for $ | f | $, which we've assumed exists, so differentiation under the integral sign is justified by Theorem 3.5.F. Strictly speaking Theorem 3.5.F asks for a bound depending only on $ y $; restricting $ x $ to a bounded interval and taking the supremum of the bounds above over such $ x $ gives one, still integrable in $ y $, and, as with $ t $, this is enough because every point has a neighbourhood of this form. It may seem that we've only proved differentiability for $ t $ in the interval $ ( t _ 1 , t _ 2 ) $ but for any $ t > s $ we can choose $ t _ 1 $ and $ t _ 2 $ such that $ t _ 2 > t > t _ 1 > s $ so in fact we've proved it for all $ t > s $.
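The location and size of this maximum are easy to confirm numerically. The following short sketch is an illustration only, not part of the argument; the values of $ k $, $ s $, $ x $ and $ y $ are arbitrary.

```python
# Illustrative check of the maximum of the absolute value of (3.5.34) over t.
import numpy as np

k, s, x, y = 0.7, 0.0, 0.0, 2.0

def dKdx(t):
    # absolute value of (3.5.34)
    return np.abs(y - x) / np.sqrt(16*np.pi*k**3*(t - s)**3) \
        * np.exp(-(y - x)**2 / (4*k*(t - s)))

t = s + np.linspace(1e-3, 20.0, 2_000_000)
i = np.argmax(dKdx(t))
print(t[i], s + (y - x)**2 / (6*k))                              # nearly equal
print(dKdx(t[i]), np.sqrt(27/(2*np.pi)) * np.exp(-1.5) / (y - x)**2)
```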

An argument similar to the one we just applied to the $ x $ derivative applies to the $ t $ derivative as well. Furthermore we can apply our argument for the $ x $ derivative to $ \partial u / \partial x $ instead of $ u $ to get the second derivative $ \partial ^ 2 u / \partial x ^ 2 $, which is seen to be equal to the result of formally differentiating under the integral in the solution formula. Combining this with the result already obtained for $ \partial u / \partial t $ we see that we can apply the differential operator \[\tag{3.5.42} \frac { \partial } { \partial t } - k \frac { \partial ^ 2 } { \partial x ^ 2 }\] to the integral \[\tag{3.5.43} \int _ { - \infty } ^ { + \infty } \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) f ( y ) \, d y\] by formally bringing it inside the integral, where it will just hit the fundamental solution since the $ f $ factor is constant as far as the integration is concerned. Since the fundamental solution is a solution we find that our solution formula does indeed give a solution. We suspected this to be the case, and wouldn't have gone to all of this effort in analysing it if we didn't, but we didn't have any certainty until now. It's worth expressing this as a theorem.

Theorem 3.5.G Suppose $ f $ is a bounded continuous function on $ \mathbf R $ and \[\tag{3.5.44} u ( t , x ) = \int _ { - \infty } ^ { + \infty } \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) f ( y ) \, d y .\] Then $ u $ is continuously differentiable in $ t $ and twice continuously differentiable in $ x $ for $ t > s $ and is a solution of the diffusion equation there.

We don't need to stop after taking one $ t $ derivative or two space derivatives. We can take any number of derivatives of either type. The result will be the same as taking those derivatives inside the integral where they will hit the fundamental solution.

Theorem 3.5.H Suppose $ f $ is a bounded continuous function on $ \mathbf R $ and \[\tag{3.5.45} u ( t , x ) = \int _ { - \infty } ^ { + \infty } \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) f ( y ) \, d y .\] Then $ u $ is infinitely differentiable in $ t $ and $ x $ for $ t > s $.

We've only ever considered solving the diffusion equation forward in time. This was largely motivated by applications but the preceding theorem shows that there are deeper reasons why we shouldn't try to solve the diffusion equation backward in time. If for some initial data $ f $ we can solve the diffusion equation backwards from $ t = s $ to some $ t = r $ then solving the initial value problem forward from $ t = r $ with initial data $ u ( r , x ) $ would give us back $ f $ and by the preceding theorem this $ f $ would be infinitely differentiable. Most functions, even most twice continuously differentiable functions, are not infinitely differentiable. The function $ \rho $ we met earlier is an example of a function which is twice continuously differentiable but not three times differentiable. By what we've just shown the backwards initial value problem for this function, or any other function which is not infinitely differentiable, cannot have a solution. A more careful argument would show that even among infinitely differentiable functions the ones which can be evolved backwards in time by the diffusion equation are highly unusual.

Section 3.6 Existence

We've now filled the gap in our earlier uniqueness proof and partially proved existence of solutions to the initial value problem. More precisely, we've proved that the equation \[\tag{3.6.1} u ( t , x ) = \int _ { - \infty } ^ { + \infty } \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) f ( y ) \, d y\] gives a solution to the diffusion equation for $ t > s $. We haven't shown that it satisfies the initial conditions though.

Normally the easiest part of proving that an explicit solution to an initial value problem is valid is checking the initial conditions, but if we do this in the naive way here we are in for a shock. First of all, the factor $ \sqrt { 4 \pi k ( t - s ) } $ in the denominator means the integrand can't even be evaluated at $ t = s $. Taking limits doesn't help much. \[\tag{3.6.2} \lim _ { t \to s ^ + } \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) = 0\] so if we just exchange the limit and integral we appear to get the integral of zero, which is zero. If we're slightly more careful then we might note that the limit above only works when $ x \neq y $, but changing the integrand at a single point has no effect on the integral, so we appear to have a problem.

When you first see theorems about interchanging limits and integrals it's easy to get the impression that formal calculations generally give correct results and that proving the correctness of those results is simply a matter of selecting an appropriate theorem to justify the calculation. Here we see a practical example where the formal calculation definitely gives the wrong result. We can verify that it's wrong by considering, for example, the constant initial data $ f ( y ) = 1 $, for which the solution formula correctly gives us the solution $ u ( t , x ) = 1 $, while the argument above would incorrectly give $ \lim _ { t \to s ^ + } u ( t , x ) = 0 $. So what we need to do is not to find a convergence theorem to justify the formal calculation above, because there can be no such theorem, but rather to find a different formal calculation, giving a different result, and find a convergence theorem to justify that calculation.

We've actually seen a variant of the formal argument we require once before. The way we got the left hand side of our solution formula was to perform a change of variable before taking a limit. We do the same thing here, with a very similar change of variable, \[\tag{3.6.3} z = \frac { y - x } { \sqrt { 4 \pi k ( t - s ) } } .\] This gives \[\tag{3.6.4} u ( t , x ) = \int _ { - \infty } ^ { + \infty } \exp ( - \pi z ^ 2 ) f ( x + z \sqrt { 4 \pi k ( t - s ) } ) \, d z .\] This now does give the correct value when $ t = s $. Furthermore, Theorem 3.5.E shows that this function is continuous for all $ t \ge s $. This is more important than it might seem. Indeed if the goal were simply to find a function which matches the initial data at $ t = s $ and solves the diffusion equation for $ t > s $ then we could simply have chosen the function which is equal to $ f $ for $ t = s $ and $ 0 $ for $ t > s $. Continuity is the condition which prevents us from doing this and forces us to find a solution whose values for $ t > s $ are related to the values at $ t = s $.
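This behaviour is easy to observe numerically from the substituted form of the formula. The sketch below is an illustration only; the data $ f $, the constants and the evaluation point are arbitrary choices.

```python
# Illustrative check that (3.6.4) approaches the initial data as t -> s+.
import numpy as np
from scipy.integrate import quad

k, s, x = 0.7, 0.0, 0.3
f = lambda y: np.arctan(y)               # bounded continuous data

def u(t, x):
    def g(z):
        return np.exp(-np.pi*z*z) * f(x + z*np.sqrt(4*np.pi*k*(t - s)))
    return quad(g, -np.inf, np.inf)[0]

for t in [1.0, 0.1, 0.01, 0.001]:
    print(t, u(t, x), f(x))              # u(t, x) approaches f(x)
```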

We have an unfortunate mismatch between our uniqueness results and our existence results. In proving uniqueness we assumed that $ u $, $ \partial u / \partial t $, $ \partial u / \partial x $ and $ \partial ^ 2 u / \partial x ^ 2 $ are all continuous in the region $ t \ge s $. We now have existence of a solution for which $ u $ is continuous for $ t \ge s $ but its various partial derivatives are only known to be continuous, or indeed to exist, for $ t > s $. We'd like the differentiability conditions in our existence and uniqueness theorems to be the same. To accomplish this we need to weaken the hypotheses of our uniqueness theorem or strengthen the conclusion of our existence theorem. In fact both of these are possible.

The option of weakening the hypotheses in the uniqueness theorem turns out to be both easier and more useful in applications. We had two proofs of the uniqueness theorem. The first one was relatively simple but didn't give us an explicit solution formula. Supposing there were two solutions with the same initial data, we looked at their difference, which is still a solution, by linearity, and has zero initial data. Our non-negativity result, Theorem 3.3.A, then shows that the difference remains non-negative. By taking the difference in the opposite order we see that it also remains non-positive, so it's zero everywhere. In other words any two solutions with the same initial data are the same solution. This was for classical solutions, but if we can prove the following variant then we can use it to get a uniqueness theorem whose differentiability assumptions match those of our existence theorem.

Theorem 3.6.A Suppose $ u $ is a bounded continuous function on $ [ s , + \infty ) \times \mathbf R $ and on $ ( s , + \infty ) \times \mathbf R $ it is continuously differentiable in $ t $ and twice continuously differentiable in $ x $ and satisfies the diffusion equation there. Suppose also \[\tag{3.6.5} u ( s , x ) = f ( x )\] for all $ x $, where $ f $ is a non-negative continuous function. Then $ u $ is non-negative.
In fact a careful examination of the proof given for Theorem 3.3.A shows that the only places where we used the differentiability of $ u $ or the diffusion equation were in the region $ t > s $, so exactly the same proof gives Theorem 3.6.A.

As explained earlier, from Theorem 3.6.A we get an analogue of Theorem 3.4.A.

Theorem 3.6.B There is at most one bounded continuous function $ u $ on $ [ s , + \infty ) \times \mathbf R $ which on $ ( s , + \infty ) \times \mathbf R $ is continuously differentiable in $ t $ and twice continuously differentiable in $ x $ and satisfies the diffusion equation there, and which satisfies the initial condition \[\tag{3.6.6} u ( s , x ) = f ( x )\] for all $ x $.

As before, this only gives uniqueness, not a solution formula, but since we already have an existence theorem under the same conditions which does feature an explicit solution formula it follows from the theorem above that any solution must be given by that formula.

We won't pursue the other option here, namely showing that when $ f $ is twice continuously differentiable the solution formula gives a classical solution for $ t \ge s $, but I will give a quick sketch of the argument. One needs to start from the alternate form of the solution formula, \[\tag{3.6.7} u ( t , x ) = \int _ { - \infty } ^ { + \infty } \exp ( - \pi z ^ 2 ) f ( x + z \sqrt { 4 \pi k ( t - s ) } ) \, d z .\] If $ f ' $ and $ f '' $ are bounded then our theorem on differentiation under the integral sign shows that $ u $ is twice continuously differentiable in $ x $ and that the derivatives are obtained by formal differentiation under the integral sign. This doesn't quite work for the $ t $ derivative but there is a variant of our theorem on differentiation under the integral which does work. We still have the problem though that this requires $ f ' $ and $ f '' $ to be bounded, while we've only assumed that $ f $ itself is bounded. We met this problem once before, when deriving the solution formula from Green's theorem, and the solution here is similar. We need to multiply our initial data by $ \rho ( x / L ) $, where $ \rho $ is the function defined there. The new function will have derivatives which are non-zero only in the interval $ [ - L , L ] $ and continuous functions on a closed interval are always bounded so the argument described above applies to the modified initial data. Of course we want a solution with the original initial data. The modified initial data agree with the original initial data in the interval $ ( - L / 2 , L / 2 ) $ though. Using the usual form of the solution formula we can show that when the initial data is zero in an open interval the solution is a classical one for $ t \ge s $, not just $ t > s $, and $ x $ in that interval. Combining this with what we already have gives the improved existence theorem.

Section 3.7 Boundary Value Problems

The same method we used for the wave equation, the method of reflection, can be used to treat boundary value problems for the diffusion equation, provided the boundary conditions are of Dirichlet or Neumann type. Suppose, for example, that we are given data $ f $ which are continuous on the closed interval $ [ a , b ] $ and are looking for a solution to the initial value problem in the region $ [ s , \infty ) \times [ a , b ] $ satisfying a Dirichlet condition at the left endpoint and a Neumann condition at the right endpoint, \[\tag{3.7.1} u ( t , a ) = 0 , \quad \frac { \partial u } { \partial x } ( t , b ) = 0 ,\] just as we did for the wave equation. We'll need $ f ( a ) = 0 $ in order to have a chance of solving this equation. As long as we're looking for a solution which is merely continuous for $ t \ge s $ and not one which is continuously differentiable in $ t $ and twice continuously differentiable in $ x $ there we don't need to impose the conditions $ f ' ( b ) = 0 $ or $ f '' ( a ) = 0 $ which we imposed for the wave equation, and indeed it wouldn't make sense to impose them since we're merely assuming that $ f $ is continuous.

We can extend $ f $ to all of $ \mathbf R $ in the same way as we did for the wave equation, namely \[\tag{3.7.2} f ( x ) = \begin{cases} f ( a + ( b - a ) r ( x ) ) & \mbox{ if ${ l ( x ) = 0 }$,} \\ f ( b - ( b - a ) r ( x ) ) & \mbox{ if ${ l ( x ) = 1 }$,} \\ - f ( a + ( b - a ) r ( x ) ) & \mbox{ if ${ l ( x ) = 2 }$,} \\ - f ( b - ( b - a ) r ( x ) ) & \mbox{ if ${ l ( x ) = 3 }$,} \\ \end{cases}\] where \[\tag{3.7.3} \frac { x - a } { b - a } = 4 m ( x ) + l ( x ) + r ( x )\] with $ r ( x ) \in [ 0 , 1 ) $ and $ m ( x ) $ and $ l ( x ) $ integers such that $ 0 \le l ( x ) < 4 $. We then define \[\tag{3.7.4} u ( t , x ) = \int _ { - \infty } ^ { + \infty } \frac 1 { \sqrt { 4 \pi k ( t - s ) } } \exp \left ( - \frac { ( y - x ) ^ 2 } { 4 k ( t - s ) } \right ) f ( y ) \, d y ,\] where the $ f $ in the integrand is this extended $ f $. To prevent subsequent equations from getting very messy we will write this in terms of the fundamental solution: \[\tag{3.7.5} u ( t , x ) = \int _ { - \infty } ^ { + \infty } K ( t - s , x - y ) f ( y ) \, d y .\] The argument that this is a solution to the initial value problem with the given boundary conditions, and is the only solution, is essentially the same as for the wave equation.
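As an illustration of the extension (3.7.2)-(3.7.3) and the resulting formula, here is a short Python sketch. The interval, the diffusivity and the data are arbitrary choices, except that the data satisfies $ f ( a ) = 0 $ as required; the final two lines check the Dirichlet condition at $ a $ and the Neumann condition at $ b $ numerically.

```python
# Illustrative implementation of the reflected extension and of (3.7.4).
import numpy as np
from scipy.integrate import quad

a, b, k, s = 0.0, 1.0, 0.7, 0.0
f0 = lambda x: np.sin(0.5*np.pi*(x - a)/(b - a))     # original data, f0(a) = 0

def f_ext(x):
    # decompose (x - a)/(b - a) = 4 m + l + r as in (3.7.3)
    q, r = divmod((x - a)/(b - a), 1.0)
    l = int(q) % 4
    if l == 0:
        return f0(a + (b - a)*r)
    if l == 1:
        return f0(b - (b - a)*r)
    if l == 2:
        return -f0(a + (b - a)*r)
    return -f0(b - (b - a)*r)

def u(t, x):
    def g(y):
        K = np.exp(-(y - x)**2/(4*k*(t - s))) / np.sqrt(4*np.pi*k*(t - s))
        return K * f_ext(y)
    return quad(g, -np.inf, np.inf, limit=200)[0]

t, h = 0.05, 1e-4
print(u(t, a))                                # ~ 0, the Dirichlet condition at a
print((u(t, b + h) - u(t, b - h)) / (2*h))    # ~ 0, the Neumann condition at b
```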

It's also possible to write the solution in terms of the original, unextended, $ f $ as follows. We first split the integral into pieces: \[\tag{3.7.6} u ( t , x ) = \sum _ { l = 0 } ^ 3 \sum _ { m = - \infty } ^ { + \infty } \int _ { a + ( 4 m + l ) ( b - a ) } ^ { b + ( 4 m + l ) ( b - a ) } K ( t - s , x - y ) f ( y ) \, d y ,\] or, after a linear change of variable, \[\tag{3.7.7} u ( t , x ) = \sum _ { l = 0 } ^ 3 \sum _ { m = - \infty } ^ { + \infty } \int _ { a } ^ { b } K ( t - s , x - y - ( 4 m + l ) ( b - a ) ) f ( y + ( 4 m + l ) ( b - a ) ) \, d y .\] Now \[\tag{3.7.8} f ( y + ( 4 m + l ) ( b - a ) ) = \begin{cases} f ( y ) & \mbox { if ${ l = 0 }$,} \\ f ( a + b - y ) & \mbox { if ${ l = 1 }$,} \\ - f ( y ) & \mbox { if ${ l = 2 }$,} \\ - f ( a + b - y ) & \mbox { if ${ l = 3 }$.} \end{cases}\] We make a further change of variable in the odd cases, replacing $ y $ by $ a + b - y $, obtaining \[\tag{3.7.9} \begin{split} u ( t , x ) & = \sum _ { m = - \infty } ^ { + \infty } \int _ a ^ b K ( t - s , x - y - 4 m ( b - a ) ) f ( y ) \, d y \\ & \quad{} + \sum _ { m = - \infty } ^ { + \infty } \int _ a ^ b K ( t - s , x + y - 2 a - ( 4 m + 2 ) ( b - a ) ) f ( y ) \, d y \\ & \quad {} - \sum _ { m = - \infty } ^ { + \infty } \int _ a ^ b K ( t - s , x - y - ( 4 m + 2 ) ( b - a ) ) f ( y ) \, d y \\ & \quad {} - \sum _ { m = - \infty } ^ { + \infty } \int _ a ^ b K ( t - s , x + y - 2 a - ( 4 m + 4 ) ( b - a ) ) f ( y ) \, d y . \end{split}\] Here the $ f $'s in the integrands all refer to the original, unextended, $ f $.

Section 3.8 Conservation and Monotonicity

The diffusion equation has a conservation law which applies when either there is no boundary or the boundary conditions are all Neumann conditions. First we consider the case without boundary. Suppose that \[\tag{3.8.1} \int _ { - \infty } ^ { + \infty } | f ( y ) | \, d y < \infty .\] Then \[\tag{3.8.2} \begin{split} \int _ { - \infty } ^ { + \infty } u ( t , x ) \, d x & = \int _ { - \infty } ^ { + \infty } \int _ { - \infty } ^ { + \infty } K ( t - s , x - y ) f ( y ) \, d y \, d x \\ & = \int _ { - \infty } ^ { + \infty } \int _ { - \infty } ^ { + \infty } K ( t - s , x - y ) f ( y ) \, d x \, d y \\ & = \int _ { - \infty } ^ { + \infty } f ( y ) \int _ { - \infty } ^ { + \infty } K ( t - s , x - y ) \, d x \, d y \\ & = \int _ { - \infty } ^ { + \infty } f ( y ) \, d y . \end{split}\] The interchange of the two integrals is justified by Theorem 3.5.D. So the quantity \[\tag{3.8.3} \int _ { - \infty } ^ { + \infty } u ( t , x ) \, d x\] is independent of $ t $.
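A numerical illustration of this conservation law, with an arbitrary integrable and bounded choice of data and an arbitrary diffusivity, might look as follows; the three printed numbers should agree up to quadrature error.

```python
# Illustrative check that the integral of u over the line does not change with t.
import numpy as np
from scipy.integrate import quad

k, s = 0.7, 0.0
f = lambda y: np.exp(-np.abs(y)) * (1 + np.cos(3*y))

def u(t, x):
    def g(y):
        K = np.exp(-(y - x)**2/(4*k*(t - s))) / np.sqrt(4*np.pi*k*(t - s))
        return K * f(y)
    return quad(g, -np.inf, np.inf)[0]

mass = lambda t: quad(lambda x: u(t, x), -np.inf, np.inf, limit=200)[0]
print(quad(f, -np.inf, np.inf)[0], mass(0.5), mass(2.0))   # all roughly equal
```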

Next we consider the case of a finite interval $ [ a , b ] $ with Neumann conditions at both endpoints. We apply Green's theorem with \[\tag{3.8.4} p = - u , \quad q = - k \frac { \partial u } { \partial x } .\] For the region we take the rectangle $ R = [ t _ 1 , t _ 2 ] \times [ a , b ] $, where $ t _ 2 > t _ 1 > s $. Green's theorem gives \[\tag{3.8.5} \sum _ { j = 1 } ^ 4 \int _ { C _ j } ( p ( t , x ) \, d x + q ( t , x ) \, d t ) = \int _ R \frac { \partial q } { \partial x } - \frac { \partial p } { \partial t } \, d A .\] The curves $ C _ 1 $, $ C _ 2 $, $ C _ 3 $ and $ C _ 4 $ will be the line segments from $ ( t _ 1 , a ) $ to $ ( t _ 1 , b ) $, $ ( t _ 1 , b ) $ to $ ( t _ 2 , b ) $, $ ( t _ 2 , b ) $ to $ ( t _ 2 , a ) $, and $ ( t _ 2 , a ) $ to $ ( t _ 1 , a ) $, respectively. The integrals along $ C _ 2 $ and $ C _ 4 $ are zero because of the Neumann condition. The right hand side will be zero for solutions of the diffusion equation. What we are left with is \[\tag{3.8.6} \int _ a ^ b u ( t _ 2 , x ) \, d x - \int _ a ^ b u ( t _ 1 , x ) \, d x = 0 .\] So \[\tag{3.8.7} \int _ a ^ b u ( t , x ) \, d x\] is independent of $ t $. This argument doesn't work for $ t = s $ because Green's theorem requires $ p $ and $ q $ to be continuously differentiable in all of $ R $ but we can take the limit as $ t _ 1 $ tends to $ s $ from above since the hypotheses of Theorem 3.5.B are satisfied.

In the original application of the diffusion equation to heat conduction Dirichlet boundary conditions correspond to conducting boundaries and Neumann boundary conditions correspond to insulating boundary conditions. The integral of $ u $ corresponds to the total energy. The theorem we've just proved is then conservation of energy for a thermally isolated system. We can't expect the theorem to apply with Dirichlet boundary conditions because energy can enter or leave the system through the conducting boundary.

Another important physical quantity in the original physical application is entropy, given by the integral \[\tag{3.8.8} - \int u ( t , x ) \log u ( t , x ) \, d x .\] Of course this only makes sense if $ u $ is positive, which it always is in the study of heat conduction. We've already seen that if the initial data are positive then the solution will remain positive forever. Unlike energy, we don't expect entropy to be conserved. The second law of thermodynamics says that it should be increasing, or at least non-decreasing. Again, we expect this only for isolated systems, so it should hold either when there is no boundary or when all boundary conditions are of Neumann type.

This time we'll treat the case of a finite interval first. Taking $ R $ as before we set \[\tag{3.8.9} p = u \log u , \quad q = k \frac { \partial u } { \partial x } ( 1 + \log u ) .\] Then \[\tag{3.8.10} \frac { \partial q } { \partial x } - \frac { \partial p } { \partial t } = \frac { k \left ( \partial u / \partial x \right ) ^ 2 } u - ( 1 + \log u ) \left ( \frac { \partial u } { \partial t } - k \frac { \partial ^ 2 u } { \partial x ^ 2 } \right ) .\] The second term on the right hand side is zero for solutions of the diffusion equation. As with our proof of energy conservation, the integrals over $ C _ 2 $ and $ C _ 4 $ vanish because of the Neumann boundary condition, since each contains a factor of $ \partial u / \partial x $, and we are left with \[\tag{3.8.11} \int _ a ^ b u ( t _ 1 , x ) \log u ( t _ 1 , x ) \, d x - \int _ a ^ b u ( t _ 2 , x ) \log u ( t _ 2 , x ) \, d x = \int _ R \frac { k \left ( \partial u / \partial x \right ) ^ 2 } u \, d A .\] Since $ u $ is positive the integrand on the right hand side is non-negative everywhere and so the integral is non-negative. It follows that \[\tag{3.8.12} - \int _ a ^ b u ( t _ 2 , x ) \log u ( t _ 2 , x ) \, d x \ge - \int _ a ^ b u ( t _ 1 , x ) \log u ( t _ 1 , x ) \, d x ,\] i.e. the entropy at time $ t _ 2 $ is at least the entropy at time $ t _ 1 $. As before, the use of Green's theorem presupposes $ t _ 2 > t _ 1 > s $ but we can use continuity to get the same result for $ t _ 2 > t _ 1 \ge s $.

The argument for the case without boundaries is more subtle. Let \[\tag{3.8.13} \varphi ( z ) = z \log z\] and \[\tag{3.8.14} w ( t , x , y ) = \varphi ( f ( y ) ) - \varphi ( u ( t , x ) ) - \varphi ' ( u ( t , x ) ) ( f ( y ) - u ( t , x ) ) .\] Multiplying by $ K ( t - s , x - y ) $ and integrating with respect to both $ x $ and $ y $ we get \[\tag{3.8.15} \begin{split} & \int _ { - \infty } ^ { + \infty } \int _ { - \infty } ^ { + \infty } K ( t - s , x - y ) w ( t , x , y ) \, d x \, d y \\ & \quad{} = \int _ { - \infty } ^ { + \infty } \int _ { - \infty } ^ { + \infty } K ( t - s , x - y ) \varphi ( f ( y ) ) \, d x \, d y \\ & \qquad {} - \int _ { - \infty } ^ { + \infty } \int _ { - \infty } ^ { + \infty } K ( t - s , x - y ) \varphi ( u ( t , x ) ) \, d y \, d x \\ & \qquad {} - \int _ { - \infty } ^ { + \infty } \int _ { - \infty } ^ { + \infty } K ( t - s , x - y ) \varphi ' ( u ( t , x ) ) f ( y ) \, d y \, d x \\ & \qquad {} + \int _ { - \infty } ^ { + \infty } \int _ { - \infty } ^ { + \infty } K ( t - s , x - y ) \varphi ' ( u ( t , x ) ) u ( t , x ) \, d y \, d x . \end{split}\] Here we've used Theorem 3.5.D to change the order of integration in some, but not all, cases. Performing the inner integration in each of the integrals on the right we have \[\tag{3.8.16} \begin{split} \int _ { - \infty } ^ { + \infty } \int _ { - \infty } ^ { + \infty } K ( t - s , x - y ) w ( t , x , y ) \, d x \, d y & = \int _ { - \infty } ^ { + \infty } \varphi ( f ( y ) ) \, d y \\ & \quad {} - \int _ { - \infty } ^ { + \infty } \varphi ( u ( t , x ) ) \, d x \\ & \quad {} - \int _ { - \infty } ^ { + \infty } \varphi ' ( u ( t , x ) ) u ( t , x ) \, d x \\ & \quad {} + \int _ { - \infty } ^ { + \infty } \varphi ' ( u ( t , x ) ) u ( t , x ) \, d x . \end{split}\] The last two cancel so we are left with \[\tag{3.8.17} \begin{split} \int _ { - \infty } ^ { + \infty } \int _ { - \infty } ^ { + \infty } K ( t - s , x - y ) w ( t , x , y ) \, d x \, d y & = \int _ { - \infty } ^ { + \infty } \varphi ( f ( y ) ) \, d y \\ & \quad {} - \int _ { - \infty } ^ { + \infty } \varphi ( u ( t , x ) ) \, d x . \end{split}\] The fundamental solution is positive everywhere and $ w ( t , x , y ) $ is non-negative everywhere as a result of the convexity of $ \varphi $ so the integrand on the left hand side is non-negative and therefore so is its integral. It follows that \[\tag{3.8.18} \int _ { - \infty } ^ { + \infty } \varphi ( u ( t , x ) ) \, d x \le \int _ { - \infty } ^ { + \infty } \varphi ( f ( y ) ) \, d y\] or, equivalently, \[\tag{3.8.19} \int _ { - \infty } ^ { + \infty } \varphi ( u ( t , x ) ) \, d x \le \int _ { - \infty } ^ { + \infty } \varphi ( u ( s , y ) ) \, d y .\] In other words, the entropy at later times is always greater than or equal to the initial entropy. Of course if we have $ t _ 2 > t _ 1 \ge s $ then we can just view the solution restricted to $ [ t _ 1 , \infty ) $ as the solution to an initial value problem with data prescribed at time $ t _ 1 $, so we find that the entropy at time $ t _ 2 $ is greater than or equal to the entropy at time $ t _ 1 $, so entropy is non-decreasing.

We can sharpen the result above by noting that $ \varphi $ is strictly convex, so $ w ( t , x , y ) $ is positive except when $ u ( t , x ) = f ( y ) $. Both $ w $ and $ K $ are continuous, so if $ K w $ is positive anywhere then it's positive on an open set and so the integral is positive. It follows that the entropy is strictly increasing unless $ u ( t , x ) = f ( y ) $ for all $ t $, $ x $ and $ y $, which happens only if $ u $ is constant.

The argument we used for entropy applies to prove monotonicity of other interesting integrals. If $ \varphi $ is convex and differentiable then \[\tag{3.8.20} \int \varphi ( u ( t , x ) ) \, d x\] is a non-increasing function of $ t $, and strictly decreasing if $ \varphi $ is strictly convex and $ u $ is not constant. This applies, for example, to \[\tag{3.8.21} \int u ( t , x ) ^ 2 \, d x\] or, more generally, to \[\tag{3.8.22} \int | u ( t , x ) | ^ p \, d x\] for $ p > 1 $. A slight variant of the argument applies when $ \varphi $ is convex but not necessarily differentiable, and so includes the case $ p = 1 $ above.
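Here is a small illustrative check of this monotonicity for $ \varphi ( u ) = u ^ 2 $, with an arbitrary choice of data and diffusivity.

```python
# Illustrative check that the integral of u^2 does not increase with t.
import numpy as np
from scipy.integrate import quad

k, s = 0.7, 0.0
f = lambda y: np.exp(-y*y) * (2 + np.sin(4*y))

def u(t, x):
    def g(y):
        K = np.exp(-(y - x)**2/(4*k*(t - s))) / np.sqrt(4*np.pi*k*(t - s))
        return K * f(y)
    return quad(g, -np.inf, np.inf)[0]

phi_int = lambda t: quad(lambda x: u(t, x)**2, -np.inf, np.inf, limit=200)[0]
print(quad(lambda y: f(y)**2, -np.inf, np.inf)[0])            # the initial value
print([round(phi_int(t), 4) for t in (0.1, 0.5, 1.0, 2.0)])   # non-increasing
```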

Section 3.9 Black-Scholes Equation

We mentioned the Black-Scholes equation, (1.0.8), in the introduction. As a reminder, it was \[\tag{3.9.1} \frac { \partial v } { \partial \tau } + \frac 1 2 \sigma ^ 2 s ^ 2 \frac { \partial ^ 2 v } { \partial s ^ 2 } + r s \frac { \partial v } { \partial s } - r v = 0 .\] It describes the evolution of the value of a derivative, although neither value nor derivative means what it usually does in mathematics. Value means what you would expect in a financial context: the price at which an asset can be bought or sold. Derivative means an asset whose value depends on the value of some other asset. Usually this means an option on a stock, i.e. a contract giving one the right to buy or sell a stock at a given price on a given date. In the equation above $ v $ is the value of the derived asset, $ s $ is the price of the underlying asset, $ r $ is the rate of return on a risk free asset, $ \sigma $ is the volatility of the price of the underlying asset, and $ \tau $ is time.

For now let's assume the derivative is an option on a stock which allows us to buy or sell it at a given price $ K $, usually called the strike price, at time $ T $, usually called the expiry date of the option. If we make the changes of variable \[\tag{3.9.2} t = T - \tau , \quad u = v \exp ( r t ) , \quad x = \log ( s / K ) + \left ( r - \frac 1 2 \sigma ^ 2 \right ) t , \quad \sigma = \sqrt { 2 k }\] in the Black-Scholes equation we get the diffusion equation for $ u $ as a function of $ t $ and $ x $. Note that $ t $ is the time remaining until expiry, so $ t = 0 $ corresponds to expiry and positive values of $ t $ correspond to times before the option expires, which are the times at which we'd like to compute its value. The value at expiry is a known function of $ s $ and therefore of $ x $. The specific function depends on whether our option is an option to sell, usually called a put, or an option to buy, usually called a call. So the problem of computing the value of the option at earlier times is an initial value problem for the diffusion equation, although it's really a final value problem for the Black-Scholes equation due to the time reversal in our change of variables.
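It's straightforward to check this change of variables numerically: take any solution of the diffusion equation, map it back to a function of $ \tau $ and $ s $, and verify by finite differences that the Black-Scholes operator nearly vanishes. The sketch below does this with a shifted copy of the fundamental solution; the strike, expiry, rate and diffusivity are arbitrary illustrative values, and the strike is called Kp in the code to avoid clashing with the name of the fundamental solution.

```python
# Illustrative check that (3.9.2) turns the diffusion equation into Black-Scholes.
import numpy as np

k, r, Kp, T = 0.08, 0.05, 100.0, 1.0
sigma = np.sqrt(2*k)

# a solution of u_t = k u_xx (a shifted multiple of the fundamental solution)
u = lambda t, x: np.exp(-x*x / (4*k*(t + 1))) / np.sqrt(t + 1)

def v(tau, s_):
    t = T - tau
    x = np.log(s_/Kp) + (r - 0.5*sigma**2)*t
    return np.exp(-r*t) * u(t, x)

tau0, s0, h = 0.4, 90.0, 1e-3
v_tau = (v(tau0 + h, s0) - v(tau0 - h, s0)) / (2*h)
v_s   = (v(tau0, s0 + h) - v(tau0, s0 - h)) / (2*h)
v_ss  = (v(tau0, s0 + h) - 2*v(tau0, s0) + v(tau0, s0 - h)) / h**2
print(v_tau + 0.5*sigma**2*s0**2*v_ss + r*s0*v_s - r*v(tau0, s0))   # ~ 0
```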

While the Black-Scholes equation applies to the value of any option, the example described above, an option which can be exercised only at expiry, is only one particular type of option, usually called a European option. The more common type of option, even in Europe, is what's called an American option, where the option can be exercised at any time. The effect of this is to convert our pure initial value problem into a boundary value problem, but this boundary value problem is of a very different type from the ones we've considered previously. The boundary conditions, in terms of the original variables, are \[\tag{3.9.3} v ( \tau , g ( \tau ) ) = \max ( 0 , v ( T , g ( \tau ) ) )\] and \[\tag{3.9.4} \frac { \partial v } { \partial s } ( \tau , g ( \tau ) ) = 1 .\] What is $ g ( \tau ) $? It is the price at which one should choose to exercise the option early at time $ \tau $. This is not a given function. Rather, finding this function is part of solving the problem. So, unlike the boundary value problems we've considered previously, the location of the boundary is not known in advance but rather has to be solved for. In some sense the lack of information about the location of the boundary is compensated for by the fact that we have two boundary conditions to be satisfied on the boundary rather than one.

Boundaries whose location is not known in advance are called free boundaries. They don't just arise in financial mathematics and indeed didn't first arise there. The classical example of a free boundary problem is the flow of a fluid with a free surface, for example a bubble within the fluid region or the top surface of a water wave. The peculiarity of free boundary problems is that even when the equation is linear, as the Black-Scholes equation and the equations for irrotational incompressible fluid flow are, the methods needed to study them look much more like those of the theory of nonlinear differential equations.

Chapter 4 Burgers' Equation

We've already seen Burgers' equation \[\tag{4.0.1} \frac { \partial u } { \partial t } + u \frac { \partial u } { \partial x } = 0 .\] Burgers' equation is a vastly simplified model of the evolution of the free boundary for fluid flow without viscosity.

There is a general theory which applies to first order scalar differential equations, of which this is one, but here we'll just do everything by hand in this special case.

Section 4.1 Explicit Solution

Suppose $ u $ is a continuously differentiable solution to this equation in a neighbourhood of the point $ ( t _ 0 , x _ 0 ) $ and set \[\tag{4.1.1} p ( t ) = u ( t , x _ 0 + v t - v t _ 0 ) - v\] where $ v = u ( t _ 0 , x _ 0 ) $. Then $ p ( t _ 0 ) = 0 $ and the chain rule gives \[\tag{4.1.2} p ' ( t ) = \frac { \partial u } { \partial t } ( t , x _ 0 + v t - v t _ 0 ) + v \frac { \partial u } { \partial x } ( t , x _ 0 + v t - v t _ 0 )\] or, using the fact that $ u $ satisfies the differential equation, \[\tag{4.1.3} p ' ( t ) = - p ( t ) \frac { \partial u } { \partial x } ( t , x _ 0 + v t - v t _ 0 ) .\] Defining \[\tag{4.1.4} q ( t ) = p ( t ) \exp \left ( \int _ { t _ 0 } ^ t \frac { \partial u } { \partial x } ( s , x _ 0 + v s - v t _ 0 ) \, d s \right )\] we find $ q ( t _ 0 ) = 0 $ and using the product rule and the fundamental theorem of calculus we find that \[\tag{4.1.5} q ' ( t ) = 0\] so $ q $ is zero everywhere. It follows that $ p $ is also zero everywhere and \[\tag{4.1.6} u ( t , x _ 0 + v t - v t _ 0 ) = v .\] In other words, $ u $ is constant on the line $ x - v t = x _ 0 - v t _ 0 $. So to solve the initial value problem \[\tag{4.1.7} u ( t _ 0 , x ) = f ( x )\] it suffices to eliminate $ x _ 0 $ from the system of equations \[\tag{4.1.8} x - u t = x _ 0 - u t _ 0 , \quad u = f ( x _ 0 ) .\]

As a simple example consider linear initial conditions \[\tag{4.1.9} u ( t _ 0 , x ) = c x .\] Eliminating $ x _ 0 $ from \[\tag{4.1.10} x - u t = x _ 0 - u t _ 0 , \quad u = c x _ 0 .\] gives \[\tag{4.1.11} u ( t , x ) = \frac { c x } { 1 + c ( t - t _ 0 ) } .\] That this satisfies the differential equation and initial conditions is easy to check directly. The behaviour depends on the sign of $ c $. If $ c $ is nonnegative then the solution exists for all $ t \ge t _ 0 $ while if $ c $ is negative the solution exists up until $ t = t _ 0 - 1 / c $ but there is no continuously differentiable solution afterwards. Unlike the wave or diffusion equations, we therefore cannot expect global solutions for Burgers' equation, even for very nice initial data.
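For more general data the elimination can be carried out numerically, solving the implicit equation $ u = f ( x - u ( t - t _ 0 ) ) $ for $ u $ by root finding. The sketch below is illustrative only; the data and the test point are arbitrary, and the time is kept small enough that the characteristics haven't yet crossed. It also checks the initial condition and the equation itself by finite differences.

```python
# Illustrative characteristic-based evaluation of the solution of (4.1.7)-(4.1.8).
import numpy as np
from scipy.optimize import brentq

t0 = 0.0
f = lambda x: np.exp(-x*x)           # smooth bounded data with sup|f'| < 1

def u(t, x):
    # solve u = f(x - u*(t - t0)); unique root while (t - t0)*sup|f'| < 1
    F = lambda w: w - f(x - w*(t - t0))
    return brentq(F, -0.1, 1.1)      # a bracket containing the range of f

t1, x1, h = 0.5, 0.8, 1e-5
u_t = (u(t1 + h, x1) - u(t1 - h, x1)) / (2*h)
u_x = (u(t1, x1 + h) - u(t1, x1 - h)) / (2*h)
print(u(t0, x1) - f(x1))             # initial condition, ~ 0
print(u_t + u(t1, x1)*u_x)           # Burgers' equation residual, ~ 0
```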

Section 4.2 Shock Formation

If you know something about the existence and uniqueness theorems for ordinary differential equations this should not surprise you. The existence results for linear ordinary differential equations give global existence, while the ones for nonlinear ordinary differential equations only give existence in a finite time interval, whose length depends on the choice of initial data. Since the wave and diffusion equations are linear while Burgers' is nonlinear it isn't particularly unexpected that we get global existence for the first two and only existence in a finite interval for the last one. In fact the situation is worse than that though. Consider the initial conditions \[\tag{4.2.1} u ( 0 , x ) = \cos \left ( \pi x ^ 2 \right ) .\] The solution, for as long as it exists, should be equal to $ ( - 1 ) ^ k $ on the lines \[\tag{4.2.2} x = \sqrt { k } + ( - 1 ) ^ k t ,\] where $ k $ is a nonnegative integer. Considering the cases $ k = 2 j $ and $ k = 2 j + 1 $, where $ j $ is a positive integer, we see that at the point \[\tag{4.2.3} ( t , x ) = \left ( \frac { \sqrt { 2 j + 1 } - \sqrt { 2 j } } 2 , \frac { \sqrt { 2 j + 1 } + \sqrt { 2 j } } 2 \right )\] $ u ( t , x ) $ should be equal to both $ + 1 $ and $ - 1 $, so the solution cannot extend as far forward in time as \[\tag{4.2.4} t = \frac { \sqrt { 2 j + 1 } - \sqrt { 2 j } } 2 .\] Similarly, considering the cases $ k = 2 j - 1 $ and $ k = 2 j $ we see that at the point \[\tag{4.2.5} ( t , x ) = \left ( - \frac { \sqrt { 2 j } - \sqrt { 2 j - 1 } } 2 , \frac { \sqrt { 2 j - 1 } + \sqrt { 2 j } } 2 \right )\] $ u ( t , x ) $ should again be equal to both $ + 1 $ and $ - 1 $, so the solution cannot extend as far backward in time as \[\tag{4.2.6} t = - \frac { \sqrt { 2 j } - \sqrt { 2 j - 1 } } 2 .\] But these remarks apply to all integers $ j $, and both $ \sqrt { 2 j + 1 } - \sqrt { 2 j } $ and $ \sqrt { 2 j } - \sqrt { 2 j - 1 } $ tend to zero as $ j $ tends to infinity, so there is no time interval of positive length on which we have a continuously differentiable solution to this initial value problem, even though the initial data is bounded and infinitely differentiable!

There is much more to be said about Burgers' equation. It was originally introduced to model fluid flow and, in particular, shock formation. There is a natural way to extend solutions beyond the singularities we've seen above, although not as a continuously differentiable, or even continuous, function. This is true also for the more complicated, but also more physically relevant, Euler equations. That is a topic for a more advanced text though.

Chapter 5 Laplace Equation

Section 5.1 Symmetries

Section 5.2 Poisson Solution in Half-plane

Section 5.3 Regularity

Section 5.4 Harmonic Conjugate

Section 5.5 More Symmetries

Section 5.6 Poisson Solution in Disc

Section 5.7 Mean Value Property

Section 5.8 Bounded Regions