1 Critical points and variation of the gradient

Suppose that \(f(0,0)=0\) and \(\nabla f(0,0)=\begin{pmatrix}0\\0\end{pmatrix}\). We want to review and understand the second derivative test.

1.1

We argued that the sign of

\begin{equation*} \nabla f(\delta_x,\delta_y) \cdot \begin{pmatrix}\delta_x\\\delta_y\end{pmatrix} \end{equation*}

for all vectors \(\begin{pmatrix}\delta_x\\\delta_y\end{pmatrix}\) close to \(\begin{pmatrix}0\\ 0\end{pmatrix}\) was an efficient way of determining whether \(f\) has a local max or a local min at \(\begin{pmatrix}0\\0\end{pmatrix}\).

Review why this is the case.

1.2

Let us PROVE this fact. Fix a radius \(r\gt 0\). Suppose that

  • \(f\) is differentiable on \(D_r(0,0)\)
  • \( \nabla f(x,y) \cdot \begin{pmatrix}x\\y\end{pmatrix} \geq 0\) for all \((x,y)\in D_r(0,0)\).

We will PROVE that \(f(x,y)\geq f(0,0)\) for all \((x,y)\in D_r(0,0)\).

Fix \((x,y)\) as above and find a path \(\vec{p}(t)\) that goes from \((0,0)\) to \((x,y)\) with constant velocity as time passes from \(0\) to \(1\). Write the expression for

\begin{equation*} \vec{p}(t) = \begin{pmatrix}p_x(t)\\p_y(t)\end{pmatrix} \end{equation*}

1.3

Call \(F(t)=f(\vec{p}(t))\). Convince yourself that to prove our statement it is sufficient to show that \(F(1)\geq F(0)\).

1.4

Express \(F'(t)\) for any \(t\in[0,1]\) in terms of \(\nabla f(\vec{p}(t))\) and of \(\dot{\vec{p}}(t)\). Also compute \(\dot{\vec{p}}(t)\).

1.5

Using the fundamental theorem of calculus write

\begin{equation*} F(1)=F(0)+\int_0^1 F'(t)\,dt= F(0)+\int_0^1 \nabla f(\vec{p}(t)) \cdot \dot{\vec{p}}(t)\,dt \end{equation*}

Show that under our assumptions the integrand is non-negative and deduce the statement of our theorem.
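
As a sanity check on steps 1.2 to 1.5, here is a sketch for one illustrative function, \(f(x,y)=x^2+y^2\), with the straight-line path \(\vec{p}(t)=\begin{pmatrix}tx\\ty\end{pmatrix}\). Both choices are assumptions made only for illustration and do not replace the general argument.

\begin{equation*} \begin{aligned} \nabla f(x,y)\cdot\begin{pmatrix}x\\y\end{pmatrix} &= 2x^2+2y^2 \geq 0, \\ F(t) &= f(tx,ty) = t^2(x^2+y^2), \\ F'(t) &= \nabla f(tx,ty)\cdot\begin{pmatrix}x\\y\end{pmatrix} = 2t(x^2+y^2) \geq 0 \quad\text{for } t\in[0,1], \\ F(1)-F(0) &= \int_0^1 2t(x^2+y^2)\,dt = x^2+y^2 \geq 0. \end{aligned} \end{equation*}

In this example the integrand is non-negative for every \(t\in[0,1]\), so \(f(x,y)\geq f(0,0)\), as the theorem predicts.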

2 The Hessian

2.1

Review from calculus the second derivative test: if \(F'(t)=0\) and \(F''(t)\gt 0\) then \(F\) has a local min at \(t\).

Why is this true?
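
One possible line of reasoning, sketched here under the assumption that \(F\) is twice differentiable near the critical point: write \(t_0\) for that point (the name \(t_0\) is introduced only for this sketch). Since \(F'(t_0)=0\),

\begin{equation*} F''(t_0)=\lim_{t\to t_0}\frac{F'(t)-F'(t_0)}{t-t_0}=\lim_{t\to t_0}\frac{F'(t)}{t-t_0}\gt 0, \end{equation*}

so for \(t\) close enough to \(t_0\) the derivative \(F'(t)\) has the same sign as \(t-t_0\). Hence \(F\) is decreasing just before \(t_0\) and increasing just after it, which gives a local min at \(t_0\).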

2.2

We now want to show that the second derivative test in \(2d\) corresponds to checking the second derivative test in \(1d\) over all directions starting from the critical point.

The setup: \(f(x,y)\) is twice differentiable, \(f(0,0)=0\). Fix a unit vector \(\vec{u}=\begin{pmatrix}u_x\\u_y\end{pmatrix}\). Let

\begin{equation*} t\to t\vec{u} \end{equation*}

be the parameterization of the line in direction \(\vec{u}\) through \((0,0)\).

2.3

Express in terms of \(t\), \(\vec{u}\), and \(f\) the function \(F(t)\) that is the restriction of \(f(x,y)\) to the line mentioned above. Show that \((0,0)\) is critical for \(f\) if and only if \(F'(0)=0\) for every choice of the direction vector \(\vec{u}\).

2.4

Compute \(F'(t)\) in terms of \(\vec{u}\), \(t\), and \(\nabla f \).

Show that

\begin{equation*} F''(t)= \partial_x^2 f(tu_x,tu_y) u_x^2 + 2\partial_x\partial_yf(tu_x,tu_y) u_xu_y+\partial_y^2 f(tu_x,tu_y) u_y^2 \end{equation*}

and in particular show that

\begin{equation*} F''(0)= \partial_x^2 f(0,0) u_x^2 + 2\partial_x\partial_yf(0,0) u_xu_y+\partial_y^2 f(0,0) u_y^2. \end{equation*}

Recognize the coefficients in the expression above as the entries of the Hessian of \(f\) at \((0,0)\).
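
To see the pattern on a concrete case, take the illustrative function \(f(x,y)=x^2+3xy+y^2\) (an assumption chosen for this sketch, not part of the exercise). Its second partial derivatives are constant: \(\partial_x^2 f=2\), \(\partial_x\partial_y f=3\), \(\partial_y^2 f=2\). Restricting to the line \(t\to t\vec{u}\) gives

\begin{equation*} \begin{aligned} F(t) &= f(tu_x,tu_y) = t^2\big(u_x^2+3u_xu_y+u_y^2\big), \\ F''(0) &= 2u_x^2+2\cdot 3\,u_xu_y+2u_y^2 = \begin{pmatrix}u_x & u_y\end{pmatrix}\begin{pmatrix}2 & 3\\ 3 & 2\end{pmatrix}\begin{pmatrix}u_x\\u_y\end{pmatrix}. \end{aligned} \end{equation*}

The matrix in the last expression is the Hessian of this \(f\), so \(F''(0)\) is the quadratic form of the Hessian evaluated on \(\vec{u}\).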

2.5

We now wish to regard the above as an expression that depends on \(u_x\) and \(u_y\) as variables:

\begin{equation*} G(u_x,u_y)= \partial_x^2 f(0,0) u_x^2 + 2\partial_x\partial_yf(0,0) u_xu_y+\partial_y^2 f(0,0) u_y^2. \end{equation*}

Why does \(G\) attain a max and a min over the set of all direction vectors:

\begin{equation*} \vec{u}\in\R^2\qquad \|\vec{u}\|=1? \end{equation*}

2.6

Parameterize the set of unit vectors \(\{\vec{u}\in\R^2 : \|\vec{u}\|=1\}\) using polar coordinates. Show that, writing \(H(\theta)=G(\cos\theta,\sin\theta)\) as a function of the angle \(\theta\), we have, wherever \(\cos(\theta)\neq 0\),

\begin{equation*} H(\theta)=\cos(\theta)^2 \Big(\partial_x^2 f(0,0) + 2\partial_x\partial_yf(0,0) \tan(\theta)+\partial_y^2 f(0,0) \tan(\theta)^2\Big). \end{equation*}

2.7

Regard

\begin{equation*} \partial_x^2 f(0,0) + 2\partial_x\partial_yf(0,0) \tan(\theta)+\partial_y^2 f(0,0) \tan(\theta)^2 \end{equation*}

as a quadratic polynomial in \(\tan(\theta)\). Show that it does not change sign if \(\det(D^2f)\gt 0\), i.e. \(D^2f\) is definite, while it DOES change sign if \(\det(D^2f)\lt 0\).
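
One way to organize this step is through the discriminant. The shorthand \(s=\tan(\theta)\), \(a=\partial_x^2 f(0,0)\), \(b=\partial_x\partial_y f(0,0)\), \(c=\partial_y^2 f(0,0)\) is introduced here only for this sketch. The polynomial is \(q(s)=a+2bs+cs^2\), and

\begin{equation*} (2b)^2-4ac = -4\,(ac-b^2) = -4\det(D^2f). \end{equation*}

If \(\det(D^2f)\gt 0\) the discriminant is negative, so \(q\) has no real roots and keeps a constant sign. If \(\det(D^2f)\lt 0\) and \(c\neq 0\) the discriminant is positive, so \(q\) has two distinct real roots and changes sign. If \(\det(D^2f)\lt 0\) and \(c=0\), then \(b\neq 0\) and \(q\) is a non-constant linear function, which also changes sign. For instance, with the illustrative values \(a=2\), \(b=3\), \(c=2\) we get \(\det(D^2f)=-5\lt 0\), and indeed \(q(0)=2\gt 0\) while \(q(-1)=-2\lt 0\).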

2.8

Conclude that you have proved the second derivative test.