Abstract Nonsense

Crushing one theorem at a time

The Inverse Function Theorem (Proof)

Point of Post: This is a continuation of this post.

\text{ }

Inverse Function Theorem

Let f:U\to\mathbb{R}^m, where U\subseteq\mathbb{R}^n is open. We say that a\in U is a regular point of f if f is totally differentiable at a and \text{rk}\left(D_f(a)\right)=m. If f is differentiable at a and \text{rk}\left(D_f(a)\right)<m we say that a is a critical point. By a basic theorem of linear algebra, if m=n then a point a\in U at which f is differentiable is regular if and only if D_f(a) is invertible (i.e. \det\text{Jac}_f(a)\ne 0) and critical otherwise (i.e. if \det\text{Jac}_f(a)=0). With this terminology in mind we state the inverse function theorem as follows:

\text{ }

Theorem (Inverse Function Theorem): Let f:U\to\mathbb{R}^n, where U\subseteq\mathbb{R}^n is open, be C^1(U), and let a be a regular point of f. Then there exists a neighborhood V\subseteq U of a such that f(V)\subseteq f(U) is open, f_{\mid V}:V\to f(V) is a bijection, and the inverse \left(f_{\mid V}\right)^{-1}:f(V)\to V is C^1(f(V)) with D_{f^{-1}}(f(x))=D_f(x)^{-1} for every x\in V.

Proof: Since a is a regular point we have that D_f(a) is invertible; call its inverse M. Since U is open and D_f is continuous (f being C^1), we can choose \delta>0 and r>0 such that \overline{B_\delta(a)}\subseteq U and, for all x\in\overline{B_\delta(a)},

\text{ }

\displaystyle \left\|D_f(a)-D_f(x)\right\|_\text{op}\leqslant\frac{1}{2n\|M\|_\text{op}}\qquad r\leqslant\frac{\delta}{2\|M\|_\text{op}}

\text{ }

both hold. Now, for each y\in B_r(f(a)) we define A_y:\overline{B_\delta(a)}\to\mathbb{R}^n:x\mapsto x+M(y-f(x)). Note that A_y can equally well be described as \mathbf{1}-M\circ f+\text{const} (where the constant is M(y) and \mathbf{1} is the identity function) and so, using the chain rule,

\text{ }

D_{A_y}(x)=\mathbf{1}-D_M(f(x))\circ D_f(x)=\mathbf{1}-M\circ D_f(x)

\text{ }


where we used the fact that \mathbf{1} and M are linear and the total derivative of a linear map is itself. Thus, we see that

\text{ }

\begin{aligned}\left\|D_{A_y}(x)\right\|_\text{op}&=\left\|\mathbf{1}-M\circ D_f(x)\right\|_\text{op}\\ &=\left\|M\left(D_f(a)-D_f(x)\right)\right\|_\text{op}\\ &\leqslant \|M\|_\text{op}\left\|D_f(a)-D_f(x)\right\|_\text{op}\\ &\leqslant\frac{1}{2n}\end{aligned}

\text{ }

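Since \overline{B_\delta(a)} is convex, we can deduce from the multivariable mean value inequality that for all x_1,x_2\in\overline{B_\delta(a)}, with \xi ranging over the segment [x_1,x_2] joining them,

\text{ }

\displaystyle \left\|A_y(x_1)-A_y(x_2)\right\|\leqslant \sup_{\xi\in[x_1,x_2]}\|D_{A_y}(\xi)\|_{\text{op}}\|x_1-x_2\|\leqslant \frac{1}{2n}\|x_1-x_2\|\leqslant\frac{1}{2}\|x_1-x_2\|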
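
\text{ }

One consequence of the bound \left\|\mathbf{1}-M\circ D_f(x)\right\|_\text{op}\leqslant\frac{1}{2n} that is used implicitly later (whenever we write D_f(x)^{-1} for x near a) is that D_f(x) is invertible for every x\in\overline{B_\delta(a)}. A quick sketch of why, via the Neumann series, writing E=\mathbf{1}-M\circ D_f(x) (a symbol introduced only for this remark):

```latex
\|E\|_\text{op}\leqslant\frac{1}{2n}<1
\implies
M\circ D_f(x)=\mathbf{1}-E\ \text{ is invertible with }\ (\mathbf{1}-E)^{-1}=\sum_{k=0}^{\infty}E^k,
\qquad
D_f(x)^{-1}=(\mathbf{1}-E)^{-1}\circ M
```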

\text{ }

Next note that for each x\in \overline{B_\delta(a)} one has that

\text{ }

\displaystyle \begin{aligned}\left\|A_y(x)-a\right\| &\leqslant\|A_y(x)-A_y(a)\|+\|A_y(a)-a\|\\ &\leqslant \frac{1}{2}\|x-a\|+\|M(y-f(a))\|\\ &\leqslant \frac{\delta}{2}+\|M\|_\text{op}r\\ &\leqslant \delta\end{aligned}

\text{ }

Thus, A_y maps the complete metric space \overline{B_\delta(a)} into itself and is a contraction there, and so by the Banach fixed point theorem there is a unique solution to A_y(x)=x in \overline{B_\delta(a)}. But, writing it out, one finds that A_y(x)=x if and only if M(y-f(x))=\mathbf{0}, which, since M is invertible, holds if and only if f(x)=y. Thus, there is a unique solution to y=f(x) for x\in\overline{B_\delta(a)}. So, define the map g:B_r(f(a))\to \overline{B_\delta(a)} by sending y to the unique x\in \overline{B_\delta(a)} such that A_y(x)=x. Since f(g(y))=y, the map g is injective with inverse the restriction of f, and its continuity follows from the Lipschitz estimate proved below; hence g is an embedding (i.e. it is a homeomorphism onto its image). What we claim though is that g(B_r(f(a))) is open in \mathbb{R}^n. To see this let g(y)\in g(B_r(f(a))) be arbitrary. Since \|y-f(a)\|<r, the fixed point satisfies \|g(y)-a\|=\|A_y(g(y))-a\|<\delta, so g(y) lies in the open ball B_\delta(a). Since f is continuous there exists a neighborhood B_\varepsilon(g(y))\subseteq B_\delta(a) such that f(B_\varepsilon(g(y)))\subseteq B_r(f(a)), and by the uniqueness of fixed points each x\in B_\varepsilon(g(y)) satisfies x=g(f(x)), so that B_\varepsilon(g(y))\subseteq g(B_r(f(a))). Thus, setting V=g(B_r(f(a))), so that f(V)=B_r(f(a)), we have that a\in V (since evidently f(a)\in B_r(f(a))), that V and f(V) are both open, and that f:V\to f(V) is a bijection with inverse equal to g.
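
\text{ }

The fixed point argument is constructive: iterating x\mapsto A_y(x)=x+M(y-f(x)) from x=a converges to the preimage of y. Below is a minimal numerical sketch of that iteration in two dimensions. The map f, its Jacobian, and every function name here are hypothetical choices made for illustration (they are not from the proof), and the stopping tolerance is an arbitrary assumption.

```python
import math

def f(p):
    # Hypothetical example map f(x, y) = (x + sin(y)/4, y + sin(x)/4).
    x, y = p
    return (x + 0.25 * math.sin(y), y + 0.25 * math.sin(x))

def jacobian(p):
    # Jacobian matrix of the example map above, as a tuple of rows.
    x, y = p
    return ((1.0, 0.25 * math.cos(y)),
            (0.25 * math.cos(x), 1.0))

def inverse_2x2(J):
    # Invert a 2x2 matrix via adj(J) / det(J); fails at critical points.
    (a, b), (c, d) = J
    det = a * d - b * c
    if det == 0:
        raise ValueError("critical point: Jacobian is singular")
    return ((d / det, -b / det), (-c / det, a / det))

def mat_vec(M, v):
    # Apply a 2x2 matrix to a vector.
    return (M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1])

def local_inverse(y, a, tol=1e-12, max_iter=200):
    # Iterate x <- A_y(x) = x + M (y - f(x)) with M = D_f(a)^{-1} held fixed,
    # starting from x = a, until the update is smaller than tol.
    M = inverse_2x2(jacobian(a))
    x = a
    for _ in range(max_iter):
        fx = f(x)
        step = mat_vec(M, (y[0] - fx[0], y[1] - fx[1]))
        x = (x[0] + step[0], x[1] + step[1])
        if math.hypot(step[0], step[1]) < tol:
            return x
    raise RuntimeError("no convergence; y may be too far from f(a)")

# Usage: recover the point (0.1, -0.2) from its image, starting at a = (0, 0).
recovered = local_inverse(f((0.1, -0.2)), a=(0.0, 0.0))
```

Note that M is computed once at a and never updated, exactly as in the proof; this is what distinguishes the iteration from Newton's method, which would recompute the inverse Jacobian at every step.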

\text{ }

What we lastly claim is that g is differentiable and satisfies D_g(f(x))=D_f(x)^{-1}. So, let f(x) be any point of f(V), and write y=f(x). Let w be some vector in \mathbb{R}^n and let \varepsilon>0 be small enough that y+\varepsilon w\in f(V). We note then that

\text{ }

\displaystyle \left\|A_{f(x)}(x+g(f(x)+\varepsilon w)-g(f(x)))-A_{f(x)}(x)\right\|\leqslant \frac{1}{2}\left\|g(f(x)+\varepsilon w)-g(f(x))\right\|

\text{ }

from this it easily follows that

\text{ }

\displaystyle \|g(f(x)+\varepsilon w)-g(f(x))\|\leqslant 2\varepsilon \|M\|_{\text{op}}\|w\|
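
\text{ }

For the reader who wants the "easily follows" step spelled out, here is a sketch. We abbreviate y'=f(x)+\varepsilon w (a symbol introduced only for this remark), recall that x=g(f(x)) is the fixed point of A_{f(x)}, and use f(g(y'))=y':

```latex
\begin{aligned}
A_{f(x)}(g(y')) &= g(y')+M\bigl(f(x)-f(g(y'))\bigr)=g(y')-\varepsilon Mw,\\
\left\|g(y')-x-\varepsilon Mw\right\| &= \left\|A_{f(x)}(g(y'))-A_{f(x)}(x)\right\|\leqslant\tfrac{1}{2}\left\|g(y')-x\right\|,\\
\left\|g(y')-x\right\| &\leqslant \left\|g(y')-x-\varepsilon Mw\right\|+\varepsilon\|Mw\|\leqslant\tfrac{1}{2}\left\|g(y')-x\right\|+\varepsilon\|M\|_{\text{op}}\|w\|.
\end{aligned}
```

Subtracting \tfrac{1}{2}\left\|g(y')-x\right\| from both sides of the last line and doubling gives the stated bound.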

\text{ }

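That said, since f is differentiable at x we know that f(x+v)-f(x)=D_f(x)(v)+h(v) where \displaystyle \lim_{v\to\mathbf{0}}\frac{h(v)}{\|v\|}=\mathbf{0}. Applying this with v=g(f(x)+\varepsilon w)-g(f(x)), for which f(x+v)-f(x)=\varepsilon w, and using the previous bound \|v\|\leqslant 2\varepsilon\|M\|_{\text{op}}\|w\| to see that h(v)/\varepsilon\to\mathbf{0}, we find that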

\text{ }

\displaystyle \begin{aligned}\lim_{\varepsilon\to0}\frac{g(f(x)+\varepsilon w)-g(f(x))}{\varepsilon} &=\lim_{\varepsilon\to0}D_f(x)^{-1}\left(\frac{\varepsilon w-h(g(f(x)+\varepsilon w)-g(f(x)))}{\varepsilon}\right)\\ &=D_f(x)^{-1}(w)\end{aligned}

\text{ }

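But, since w was arbitrary and the convergence is uniform over unit vectors w (by the bound on \|g(f(x)+\varepsilon w)-g(f(x))\| above), this says precisely that D_g(f(x)) exists and D_g(f(x))=D_f(x)^{-1}. Finally, D_g(y)=D_f(g(y))^{-1} depends continuously on y, since g, D_f, and matrix inversion are all continuous, so g is C^1. \blacksquare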
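
\text{ }

As a concrete illustration of the theorem (an example chosen here, not one from the proof above), consider f:\mathbb{R}^2\to\mathbb{R}^2 given by f(x,y)=(e^x\cos y,e^x\sin y). Its Jacobian and the inverse of that Jacobian work out as follows:

```latex
\text{Jac}_f(x,y)=\begin{pmatrix} e^x\cos y & -e^x\sin y\\ e^x\sin y & e^x\cos y\end{pmatrix},
\qquad
\det\text{Jac}_f(x,y)=e^{2x}\ne 0,
\qquad
D_f(x,y)^{-1}=e^{-x}\begin{pmatrix}\cos y & \sin y\\ -\sin y & \cos y\end{pmatrix}
```

So every point is regular, and f has a C^1 local inverse near each point, whose derivative at f(x,y) is the matrix on the right. Yet f(x,y+2\pi)=f(x,y), so f is not injective on all of \mathbb{R}^2: the conclusion of the theorem is genuinely local.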

\text{ }

Corollary: Let f be as in the statement of the inverse function theorem, but suppose moreover that f\in C^k(U) for some k\in\mathbb{N}\cup\{\infty\}. Then the guaranteed f^{-1} is also C^k.

Proof: This follows from the fact that D_{f^{-1}}(f(x))=D_f(x)^{-1}, by induction on k. Indeed, since f\in C^k(U) the entries of D_f(x) are C^{k-1} functions of x, and so the entries of \displaystyle D_f(x)^{-1}=\frac{1}{\det(D_f(x))}\text{adj}(D_f(x)) are C^{k-1} as well (they are quotients of polynomials in C^{k-1} functions, with nonvanishing denominator). Since D_{f^{-1}}(y)=D_f(f^{-1}(y))^{-1}, if f^{-1} is C^j for some j\leqslant k-1 then D_{f^{-1}} is C^j, i.e. f^{-1} is C^{j+1}. Starting from the fact that f^{-1} is C^1, we conclude that f^{-1} is C^k. \blacksquare
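
\text{ }

To make the adjugate formula concrete, here is the case n=2, written for f=(f_1,f_2) with \partial_j f_i denoting the partial derivative of f_i in the j-th variable (notation assumed for this remark), all evaluated at x:

```latex
D_f(x)^{-1}=\frac{1}{\partial_1 f_1\,\partial_2 f_2-\partial_2 f_1\,\partial_1 f_2}
\begin{pmatrix}\partial_2 f_2 & -\partial_2 f_1\\ -\partial_1 f_2 & \partial_1 f_1\end{pmatrix}
```

Each entry is visibly a quotient of polynomials in the first partials of f, which is exactly the regularity the proof needs.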

\text{ }

\text{ }




September 8, 2011 | Analysis

