Course:MATH200/SecondDerivative

From UBC Wiki

Hessians and the Second Derivative Test

Statement of the Theorem

Let z=f(x,y) be a function of two real variables with continuous second derivatives in an open disk centered at a point P=(a,b). The Hessian determinant of f at P is the number

\[ D = f_{xx}(P)\,f_{yy}(P) - [f_{xy}(P)]^2. \]

Recall that P is a critical point of f if ∇f(P)=0.

Theorem (Second Derivative Test). Suppose that z=f(x,y) is a function of two real variables with continuous second derivatives in an open disk centered at a point P=(a,b) and that P is a critical point of f. Let D denote the Hessian determinant of f at P. Then

  1. if D>0 and fxx(P)>0, f(P) is a local minimum of f;
  2. if D>0 and fxx(P)<0, f(P) is a local maximum of f;
  3. if D<0, f(P) is neither a local maximum nor a local minimum of f.
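The three cases of the theorem can be sketched as a small decision procedure. This is only an illustration: the function name `classify_critical_point` is hypothetical, and the "inconclusive" branch is an addition covering D=0, about which the theorem says nothing.

```python
def classify_critical_point(fxx, fyy, fxy):
    """Classify a critical point from the second partials evaluated there.

    Returns "local minimum", "local maximum", "neither", or
    "inconclusive" (the theorem is silent when D == 0).
    """
    D = fxx * fyy - fxy ** 2  # Hessian determinant at the critical point
    if D > 0 and fxx > 0:
        return "local minimum"
    if D > 0 and fxx < 0:
        return "local maximum"
    if D < 0:
        return "neither"
    # D > 0 forces fxx != 0, so the only remaining possibility is D == 0.
    return "inconclusive"


# f(x, y) = x^2 + y^2 at (0, 0): fxx = 2, fyy = 2, fxy = 0
print(classify_critical_point(2, 2, 0))
# f(x, y) = x*y at (0, 0): fxx = 0, fyy = 0, fxy = 1
print(classify_critical_point(0, 0, 1))
```

The caller is assumed to have already checked that the point is a critical point; the function looks only at the second derivatives.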

Proof of the Theorem

A special case

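Lemma. Let A, B and C be real numbers and set

\[ f(x,y) = Ax^2 + 2Bxy + Cy^2. \]

Note that the Hessian determinant is constant and is given by D=(2A)(2C)-(2B)^2=4(AC-B^2). Suppose A and D are non-zero. Then we have

\[ f(x,y) = A\left(x + \frac{B}{A}\,y\right)^2 + \frac{D}{4A}\,y^2. \]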

Proof. Multiply it out. The identity is found by completing the square: the first term on the right-hand side is chosen so that its expansion produces the cross-term 2Bxy of the left-hand side.
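For readers who want the multiplication written out, here is one way to expand the completed-square form. It uses only f(x,y)=Ax^2+2Bxy+Cy^2 and D=4(AC-B^2), so that D/(4A)=(AC-B^2)/A:

```latex
\begin{align*}
A\left(x + \frac{B}{A}\,y\right)^2 + \frac{D}{4A}\,y^2
  &= Ax^2 + 2Bxy + \frac{B^2}{A}\,y^2 + \frac{AC - B^2}{A}\,y^2 \\
  &= Ax^2 + 2Bxy + Cy^2 .
\end{align*}
```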

Corollary. Suppose f(x,y)=Ax^2+2Bxy+Cy^2. Then the Theorem holds for f at the critical point P=(0,0). In fact, if A,D>0, then f(0,0) is an absolute minimum of f on R^2 and, if D>0 and A<0, f(0,0) is an absolute maximum of f.

Proof. To see that P=(0,0) is a critical point, take the gradient: ∇f(x,y)=(2Ax+2By, 2Bx+2Cy), which vanishes at the origin. Also, f(0,0)=0. Suppose A, D>0. Then the coefficients A and D/(4A) on the right-hand side of the expression for f in the Lemma are both positive. It follows that f(x,y)≥ 0 for all (x,y). Similarly, if D>0 but A<0, both of the coefficients on the right-hand side are negative, so f(x,y)≤ 0 for all (x,y).

Suppose that D<0 and A≠ 0. Pick a number r>0. Then f(r,0)=Ar^2 while f(-Br/A,r)=Dr^2/(4A). Since D<0, these have opposite signs, and since r can be taken arbitrarily small, f(0,0) is neither a local maximum nor a local minimum.

The last case to consider is when D<0 but A=0. If C≠ 0, then we can just switch the roles of x and y and apply the reasoning of the previous paragraph. So we can assume C=0. Then f(x,y)=2Bxy, and B≠ 0 because D=-4B^2<0. Note that f(r,r)=2Br^2 while f(r,-r)=-2Br^2, and these have opposite signs for every r≠ 0. Thus f(0,0)=0 is neither a local maximum nor a local minimum.
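The D<0, A≠0 step of this proof can be checked with exact rational arithmetic. The sketch below uses illustrative coefficients A=1, B=2, C=1 (an assumption, not taken from the text) and evaluates f at the two points (r,0) and (-Br/A,r) for shrinking r; the helper name `quadratic` is hypothetical.

```python
from fractions import Fraction


def quadratic(A, B, C):
    """Return f(x, y) = A x^2 + 2 B x y + C y^2 as a Python function."""
    return lambda x, y: A * x * x + 2 * B * x * y + C * y * y


# Illustrative coefficients (assumed): D = 4(AC - B^2) = -12 < 0.
A, B, C = Fraction(1), Fraction(2), Fraction(1)
D = 4 * (A * C - B * B)
f = quadratic(A, B, C)

for r in (Fraction(1), Fraction(1, 10), Fraction(1, 1000)):
    along_x = f(r, 0)               # should equal A r^2
    along_other = f(-B * r / A, r)  # should equal D r^2 / (4A)
    assert along_x == A * r ** 2
    assert along_other == D * r ** 2 / (4 * A)
    # Opposite signs at points arbitrarily close to the origin.
    print(r, along_x > 0, along_other < 0)
```

Using `Fraction` rather than floats keeps the two identities exact, so the assertions test the algebra rather than rounding behaviour.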

The General Case

The idea for the general case is essentially to approximate a general function f(x,y) by a quadratic function as in the special case. In fact, using Taylor's theorem with remainder we could prove the general case using this sort of approximation. However, I prefer a direct proof along the lines of the way the theorem is proved for the D>0 case in the text.

Proof of the Second Derivative Test. To simplify the notation, I assume that P=(0,0) and f(P)=0 and leave the reduction to this case to the reader as an exercise. For Q=(x,y), set H(x,y)=fxx(Q)fyy(Q)-[fxy(Q)]^2 and D=H(P).

Suppose D>0 and that fxx(P)>0. Since f has continuous second-order derivatives, we can find an open disk B of radius r>0 centered at P such that H(Q)>0 and fxx(Q)>0 for all Q in B. Now, for Q=(a,b) in B define a function g by

\[ g(t) = f(ta, tb), \qquad 0 \le t \le 1. \]

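Using the chain rule, we can see that g'(0)=afx(P)+bfy(P)=0 and

\[ g''(t) = a^2 f_{xx}(tQ) + 2ab\, f_{xy}(tQ) + b^2 f_{yy}(tQ), \qquad \text{where } tQ = (ta, tb). \]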

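Since H(Q), fxx(Q)>0 on B and the point (ta,tb) lies in B for every t in [0,1], the Corollary, applied to the quadratic whose coefficients are the second partial derivatives of f at (ta,tb), implies that g''(t)≥ 0 for t∈ [0,1]. Together with g'(0)=0, this implies that f(Q)=g(1)≥ g(0)=f(P)=0. So f(P) is a local minimum.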
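The passage to g(1)≥ g(0) can be spelled out with the fundamental theorem of calculus. The sketch below assumes only that g'(0)=0 and that the second derivative of g is non-negative on [0,1]:

```latex
\begin{align*}
g'(t) &= g'(0) + \int_0^t g''(s)\,ds = \int_0^t g''(s)\,ds \ge 0
        \qquad (0 \le t \le 1), \\
g(1) - g(0) &= \int_0^1 g'(t)\,dt \ge 0 .
\end{align*}
```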

To handle the case D>0 and fxx(P)<0, apply the result of the last paragraph to the function -f(x,y). This shows that -f(0,0) is a local minimum of -f(x,y). Thus f(0,0) is a local maximum of f(x,y).

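Finally, suppose D<0. Assume that fxx(P)>0. (The general case is similar.) Then set h(t)=f(t,0). We have h'(0)=fx(P)=0 and h''(0)=fxx(P)>0. So f(P)=h(0) is a strict local minimum of the function h(t). On the other hand, if we set

\[ g(t) = f\!\left(-\frac{f_{xy}(P)}{f_{xx}(P)}\,t,\; t\right), \]

it follows from a small calculation using the chain rule that

\[ g'(0) = 0 \quad\text{and}\quad g''(0) = \frac{D}{f_{xx}(P)} < 0. \]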

So f(P)=g(0) is a strict local maximum of the function g(t). Since f therefore takes values both above and below f(P) at points arbitrarily close to P, f(P) is neither a local maximum nor a local minimum of f.
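The D<0 argument can be tried numerically on a concrete function. Everything in the sketch below is an assumption made for illustration: the test function f, the helper `second_derivative_at_zero`, and the choice of the second line through P, taken here in the direction (-fxy(P)/fxx(P), 1), for which the chain rule gives second derivative D/fxx(P).

```python
def f(x, y):
    # Hypothetical test function (not from the text); (0, 0) is a critical point.
    return x**2 + 4*x*y + y**2 + x**3


def second_derivative_at_zero(phi, eps=1e-3):
    """Central-difference estimate of phi''(0)."""
    return (phi(eps) - 2 * phi(0.0) + phi(-eps)) / eps**2


# Second partials of f at (0, 0), computed by hand: fxx = 2, fyy = 2, fxy = 4.
fxx, fyy, fxy = 2.0, 2.0, 4.0
D = fxx * fyy - fxy**2  # -12, so the test predicts "neither"

h = lambda t: f(t, 0.0)              # restriction of f to the x-axis
g = lambda t: f(-fxy / fxx * t, t)   # restriction of f to the second line

print(second_derivative_at_zero(h))  # close to fxx = 2
print(second_derivative_at_zero(g))  # close to D / fxx = -6
```

A positive second derivative along one line and a negative one along the other is the behaviour the proof uses: f rises above f(P) in one direction and falls below it in the other.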