# Abstract Nonsense

## The Inverse Function Theorem (Proof)

Point of Post: This is a continuation of this post.

$\text{ }$

Inverse Function Theorem

Let $f:U\to\mathbb{R}^m$, where $U\subseteq\mathbb{R}^n$ is open. We say that $a\in U$ is a regular point of $f$ if $f$ is totally differentiable at $a$ and $\text{rk}\left(D_f(a)\right)=m$. If $f$ is differentiable at $a$ and $\text{rk}\left(D_f(a)\right)<m$ we say that $a$ is a critical point. By a basic theorem of linear algebra we know that if $m=n$ then a point $a\in U$ at which $f$ is differentiable is regular if $D_f(a)$ is invertible (i.e. $\det\text{Jac}_f(a)\ne 0$) and critical otherwise (i.e. if $\det\text{Jac}_f(a)=0$). Thus, with this terminology in mind we state the inverse function theorem as follows:

$\text{ }$

Theorem (Inverse Function Theorem): Let $f:U\to\mathbb{R}^n$, where $U\subseteq\mathbb{R}^n$ is open, be $C^1(U)$, and let $a$ be a regular point of $f$. Then there exists a neighborhood $V\subseteq U$ of $a$ such that $f(V)\subseteq f(U)$ is open, $f_{\mid V}:V\to f(V)$ is a bijection, and the inverse $\left(f_{\mid V}\right)^{-1}:f(V)\to V$ is $C^1(f(V))$ with $D_{f^{-1}}(f(x))=D_f(x)^{-1}$ for every $x\in V$.
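To make the regular/critical terminology concrete, here is a minimal sketch in Python. The map $f(x,y)=(x^2-y^2,2xy)$ is our own choice of example and does not come from the argument itself, and the helper names `jacobian`, `det2` and `is_regular` are hypothetical. Its Jacobian determinant is $4(x^2+y^2)$, so the origin should be the only critical point.

```python
def jacobian(x, y):
    """Jacobian matrix of the example map f(x, y) = (x^2 - y^2, 2xy)."""
    return [[2 * x, -2 * y],
            [2 * y, 2 * x]]


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def is_regular(x, y, tol=1e-12):
    """When m = n, a point is regular exactly when det Jac_f is nonzero.

    The tolerance is an assumption made for floating point input.
    """
    return abs(det2(jacobian(x, y))) > tol


print(is_regular(1.0, 2.0))  # det = 4 * (1 + 4) = 20, so the point is regular
print(is_regular(0.0, 0.0))  # det = 0, so the origin is critical
```

Note that this map is two-to-one on any punctured neighborhood of the origin, which is consistent with the theorem only promising local invertibility at regular points.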

Proof: Since $a$ is a regular point we have that $D_f(a)$ is invertible; call its inverse $M$. Since $U$ is open and $D_f$ is continuous at $a$ (as $f$ is $C^1$), we can choose $\delta>0$ and $r>0$ such that $\overline{B_\delta(a)}\subseteq U$ and

$\text{ }$

$\displaystyle \left\|D_f(a)-D_f(x)\right\|_\text{op}\leqslant\frac{1}{2n\|M\|_\text{op}}\ \text{ for all }x\in\overline{B_\delta(a)},\qquad r\leqslant\frac{\delta}{2\|M\|_\text{op}}$

$\text{ }$

both hold. Now, for each $y\in B_r(f(a))$ we define $A_y:\overline{B_\delta(a)}\to\mathbb{R}^n:x\mapsto x+M(y-f(x))$. Note that $A_y$ could equally well be described as $\mathbf{1}-M\circ f+\text{const}$ (where the constant is $M(y)$ and $\mathbf{1}$ is the identity function) and so, using the chain rule,

$\text{ }$

$D_{A_y}(x)=\mathbf{1}-D_M(f(x))\circ D_f(x)=\mathbf{1}-M\circ D_f(x)$

$\text{ }$

where we used the fact that $\mathbf{1}$ and $M$ are linear and the total derivative of a linear map is itself. Thus, since $M\circ D_f(a)=\mathbf{1}$, we see that for every $x\in\overline{B_\delta(a)}$

$\text{ }$

$\displaystyle \begin{aligned}\left\|D_{A_y}(x)\right\|_\text{op}&=\left\|\mathbf{1}-M\circ D_f(x)\right\|_\text{op}\\ &=\left\|M\left(D_f(a)-D_f(x)\right)\right\|_\text{op}\\ &\leqslant \|M\|_\text{op}\left\|D_f(a)-D_f(x)\right\|_\text{op}\\ &\leqslant\frac{1}{2n}\end{aligned}$

$\text{ }$

Since $\overline{B_\delta(a)}$ is convex, we can deduce from the multivariable mean value inequality that for all $x_1,x_2\in\overline{B_\delta(a)}$

$\text{ }$

$\displaystyle \left\|A_y(x_1)-A_y(x_2)\right\|\leqslant \sup_{\xi\in\overline{B_\delta(a)}}\|D_{A_y}(\xi)\|_{\text{op}}\|x_1-x_2\|\leqslant \frac{1}{2n}\|x_1-x_2\|\leqslant\frac{1}{2}\|x_1-x_2\|$

$\text{ }$

Next note that for each $x\in \overline{B_\delta(a)}$ one has that

$\text{ }$

$\displaystyle \begin{aligned}\left\|A_y(x)-a\right\| &\leqslant\|A_y(x)-A_y(a)\|+\|A_y(a)-a\|\\ &\leqslant \frac{1}{2}\|x-a\|+\|M(y-f(a))\|\\ &< \frac{\delta}{2}+\|M\|_\text{op}r\\ &\leqslant \delta\end{aligned}$

$\text{ }$

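Thus, $A_y$ is a contraction mapping of $\overline{B_\delta(a)}$ into itself, and so by the Banach fixed point theorem there is a unique solution to $A_y(x)=x$ in $\overline{B_\delta(a)}$. But, writing it out, one finds that $A_y(x)=x$ if and only if $M(y-f(x))=\mathbf{0}$, which, since $M$ is invertible, holds if and only if $f(x)=y$. Thus, there is a unique solution to $y=f(x)$ with $x\in\overline{B_\delta(a)}$. So, define the map $g:B_r(f(a))\to \overline{B_\delta(a)}$ by sending $y$ to the unique $x\in \overline{B_\delta(a)}$ such that $A_y(x)=x$. One can check that $g$ is an embedding (i.e. that it is a homeomorphism onto its image): it is injective with continuous inverse $f$, and its continuity follows from the Lipschitz estimate established in the last part of the proof.

$\text{ }$

What we claim though is that $g(B_r(f(a)))$ is open in $\mathbb{R}^n$. To see this let $g(y)\in g(B_r(f(a)))$ be arbitrary. Since $\|y-f(a)\|<r$, the estimate above is strict for this $y$, and so $g(y)=A_y(g(y))$ lies in the open ball $B_\delta(a)$. Since $f$ is continuous we know there exists a neighborhood $B_\varepsilon(g(y))\subseteq B_\delta(a)$ such that $f(B_\varepsilon(g(y)))\subseteq B_r(f(a))$. This implies $B_\varepsilon(g(y))\subseteq g(B_r(f(a)))$, since each $x$ in this neighborhood lies in $\overline{B_\delta(a)}$ and has $f(x)\in B_r(f(a))$, so that $x=g(f(x))$ by uniqueness. Thus, since evidently $f(a)\in B_r(f(a))$, if we set $V=g(B_r(f(a)))$, so that $a\in V$ and $f(V)=B_r(f(a))$, then $V,f(V)$ are both open and $f:V\to f(V)$ is a bijection with inverse equal to $g$.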
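The construction of $g$ is effectively an algorithm: to solve $f(x)=y$, start at $a$ and iterate $A_y$. Below is a minimal sketch of that iteration in Python. The particular map $f$, the base point $a=(0,0)$, the target $y$ and the iteration count are all assumptions chosen for illustration, and we do not verify here that $y$ lies in a ball $B_r(f(a))$ for which the estimates above are guaranteed to hold.

```python
def f(x):
    """Example map (an assumption): f(x1, x2) = (2*x1 + x2^2, x1 + 3*x2 + x1^2)."""
    x1, x2 = x
    return (2 * x1 + x2 ** 2, x1 + 3 * x2 + x1 ** 2)


# M is the inverse of D_f(a) at a = (0, 0), where D_f(a) = [[2, 0], [1, 3]].
M = [[0.5, 0.0],
     [-1.0 / 6.0, 1.0 / 3.0]]


def apply(m, v):
    """Apply a 2x2 matrix to a vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])


def A(y, x):
    """The map A_y(x) = x + M(y - f(x)) from the proof."""
    fx = f(x)
    step = apply(M, (y[0] - fx[0], y[1] - fx[1]))
    return (x[0] + step[0], x[1] + step[1])


def solve(y, a=(0.0, 0.0), iterations=50):
    """Approximate g(y) by iterating A_y starting from the base point a."""
    x = a
    for _ in range(iterations):
        x = A(y, x)
    return x


y = (0.1, -0.05)
x = solve(y)
residual = max(abs(f(x)[0] - y[0]), abs(f(x)[1] - y[1]))
print(x, residual)
```

Note that $M$ is computed once at $a$ and never updated, exactly as in the proof; this is what distinguishes the iteration from Newton's method, which would recompute $D_f(x)^{-1}$ at every step.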

$\text{ }$

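What we lastly claim is that $g$ is differentiable and satisfies $D_g(f(x))=D_f(x)^{-1}$ (note that $D_f(x)$ is invertible for each $x\in V$, since $\left\|\mathbf{1}-M\circ D_f(x)\right\|_\text{op}\leqslant\frac{1}{2}<1$ forces $M\circ D_f(x)$ to be invertible). So, let $y=f(x)$ be any point of $f(V)$. Let $w$ be some vector in $\mathbb{R}^n$ and let $\varepsilon\ne 0$ be of small enough magnitude that $y+\varepsilon w\in f(V)$. We note then that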

$\text{ }$

$\displaystyle \left\|A_{f(x)}(x+g(f(x)+\varepsilon w)-g(f(x)))-A_{f(x)}(x)\right\|\leqslant \frac{1}{2}\left\|g(f(x)+\varepsilon w)-g(f(x))\right\|$

$\text{ }$

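Since $x=g(f(x))$, the first argument on the left is just $g(f(x)+\varepsilon w)$, and a direct computation gives $A_{f(x)}(g(f(x)+\varepsilon w))=g(f(x)+\varepsilon w)-\varepsilon M(w)$, while $A_{f(x)}(x)=x=g(f(x))$. From this and the reverse triangle inequality it easily follows that

$\text{ }$

$\displaystyle \|g(f(x)+\varepsilon w)-g(f(x))\|\leqslant 2|\varepsilon| \|M\|_{\text{op}}\|w\|$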

$\text{ }$

That said, since $f$ is differentiable we know that $f(x+v)-f(x)=D_f(x)(v)+h(v)$ where $\displaystyle \lim_{v\to\mathbf{0}}\frac{h(v)}{\|v\|}=\mathbf{0}$. Thus, putting it all together we see that

$\text{ }$

$\displaystyle \begin{aligned}\lim_{\varepsilon\to0}\frac{g(f(x)+\varepsilon w)-g(f(x))}{\varepsilon} &=\lim_{\varepsilon\to0}D_f(x)^{-1}\left(\frac{\varepsilon w-h\left(g(f(x)+\varepsilon w)-g(f(x))\right)}{\varepsilon}\right)\\ &=D_f(x)^{-1}(w)\end{aligned}$

$\text{ }$

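Here the $h$ term vanishes in the limit: writing $v=g(f(x)+\varepsilon w)-g(f(x))$, the Lipschitz estimate gives $v\to\mathbf{0}$ and $\displaystyle \frac{\|h(v)\|}{|\varepsilon|}\leqslant\frac{\|h(v)\|}{\|v\|}\cdot 2\|M\|_\text{op}\|w\|\to 0$ (when $v\ne\mathbf{0}$; otherwise $h(v)=\mathbf{0}$). Since this bound depends only on the size of the increment $\varepsilon w$, the convergence is uniform over unit vectors $w$, and so $D_g(f(x))$ exists and $D_g(f(x))=D_f(x)^{-1}$. Finally, $y\mapsto D_g(y)=D_f(g(y))^{-1}$ is a composition of continuous maps, so $g$ is $C^1$. $\blacksquare$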
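As a sanity check on the formula $D_g(f(x))=D_f(x)^{-1}$, here is a small numerical sketch in Python. We assume the example map $f(x_1,x_2)=(e^{x_1}\cos x_2,e^{x_1}\sin x_2)$, which is not taken from the proof but has the advantage of an explicit local inverse, and we compare a central-difference approximation of $D_g$ with the hand-computed inverse of $D_f$. The step size `h` and the sample point are arbitrary choices.

```python
import math


def f(x1, x2):
    """Example map (an assumption): f(x1, x2) = (e^x1 cos x2, e^x1 sin x2)."""
    return (math.exp(x1) * math.cos(x2), math.exp(x1) * math.sin(x2))


def g(u, v):
    """Explicit local inverse of f, valid for |x2| < pi."""
    return (0.5 * math.log(u * u + v * v), math.atan2(v, u))


def inverse_of_Df(x1, x2):
    """Inverse of D_f(x) = e^x1 * (rotation by x2), computed by hand."""
    s = math.exp(-x1)
    return [[s * math.cos(x2), s * math.sin(x2)],
            [-s * math.sin(x2), s * math.cos(x2)]]


def numerical_Dg(u, v, h=1e-6):
    """Central-difference approximation of the Jacobian of g at (u, v)."""
    cols = []
    for du, dv in ((h, 0.0), (0.0, h)):
        plus = g(u + du, v + dv)
        minus = g(u - du, v - dv)
        cols.append(((plus[0] - minus[0]) / (2 * h),
                     (plus[1] - minus[1]) / (2 * h)))
    # cols[j][i] holds d g_i / d y_j, so transpose to get rows indexed by i.
    return [[cols[0][0], cols[1][0]],
            [cols[0][1], cols[1][1]]]


x1, x2 = 0.3, 0.7
u, v = f(x1, x2)
exact = inverse_of_Df(x1, x2)
approx = numerical_Dg(u, v)
max_error = max(abs(exact[i][j] - approx[i][j])
                for i in range(2) for j in range(2))
print(max_error)
```

The printed discrepancy should be on the order of the finite-difference error, which is what one expects if the two matrices agree.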

$\text{ }$

Corollary: Let $f$ be as in the statement of the inverse function theorem, but suppose that $f\in C^k(U)$ for some $k\in\mathbb{N}\cup\{\infty\}$. Then the guaranteed $f^{-1}$ is also $C^k$.

Proof: This follows from the fact that $D_{f^{-1}}(f(x))=D_f(x)^{-1}$, i.e. $D_{f^{-1}}(y)=D_f(f^{-1}(y))^{-1}$. Indeed, since $f\in C^k(U)$ the entries of $D_f(x)$ are $C^{k-1}$ functions of $x$, and the entries of $\displaystyle D_f(x)^{-1}=\frac{1}{\det(D_f(x))}\text{adj}(D_f(x))$ are quotients of polynomials in those entries (with nonvanishing denominator), hence also $C^{k-1}$. So, if $f^{-1}$ is $C^j$ for some $1\leqslant j<k$, then the entries of $D_{f^{-1}}$ are $C^j$ (being compositions of $C^{k-1}$ functions with $f^{-1}$), and so $f^{-1}$ is $C^{j+1}$. Starting from $j=1$ and inducting we conclude that $f^{-1}$ is $C^k$. $\blacksquare$
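The adjugate formula used in the proof is easy to check by hand in the $2\times 2$ case. The following Python sketch does so with exact rational arithmetic; the matrix is an arbitrary example of ours and the helper names are hypothetical. The point to notice is that every entry of the inverse is a polynomial in the original entries divided by the determinant.

```python
from fractions import Fraction


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def adjugate2(m):
    """Adjugate of a 2x2 matrix [[a, b], [c, d]], namely [[d, -b], [-c, a]]."""
    (a, b), (c, d) = m
    return [[d, -b], [-c, a]]


def inverse2(m):
    """Inverse via adj(m) / det(m); each entry is a polynomial over det."""
    det = det2(m)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[entry / det for entry in row] for row in adjugate2(m)]


def matmul2(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


D = [[Fraction(2), Fraction(0)],
     [Fraction(1), Fraction(3)]]
D_inv = inverse2(D)
print(D_inv)
print(matmul2(D, D_inv) == [[1, 0], [0, 1]])
```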

$\text{ }$

$\text{ }$

References:

1.  Spivak, Michael. Calculus on Manifolds; a Modern Approach to Classical Theorems of Advanced Calculus. New York: W.A. Benjamin, 1965. Print.

2. Apostol, Tom M. Mathematical Analysis. Reading, MA: Addison-Wesley Pub., 1974. Print.

September 8, 2011 -
