Abstract Nonsense

Crushing one theorem at a time

Curves and the Implicit Function Theorem

Point of Post: In this post we discuss the notion of smooth curves in \mathbb{R}^n and the implicit function theorem.

\text{ }


To begin our discussion of geometry it seems prudent to discuss perhaps the simplest of all smooth geometric objects: curves. Everyone has an intuitive notion of a curve (at least in two or three space). Namely, a curve can be thought of as a length of string that is twisted this way and that, in a smooth manner. But, of course, in mathematics one must always back up intuitive notions with concrete, solid definitions. In our case one quickly realizes that there is not one immediate definition of curve. Indeed, there are two canonical ways of defining a curve which, from our point of view, are ordered in terms of ‘importance’ (i.e. we prefer one notion over the other). To see the difference between these two notions consider probably the simplest (closed) curve one could imagine in \mathbb{R}^2: the unit circle \mathbb{S}^1. Ask any kid off the street how one defines the unit circle and you are most likely to get the immediate answer “Oh! It’s just the set of points (x,y)\in\mathbb{R}^2 such that x^2+y^2=1” (or, perhaps, the set of all z\in\mathbb{C} with |z|=1). The parabola is another perfectly good curve, which could be described as the set of (x,y) such that y=x^2. Thus, one should start to wonder if perhaps the correct notion of a curve is the ‘locus’ of one or several functions in Euclidean space. To be more concrete, for functions f_1,\cdots,f_k:\mathbb{R}^n\to\mathbb{R} define \mathbb{V}(f_1,\cdots,f_k) to be the common zero set f_1^{-1}(\{0\})\cap\cdots\cap f_k^{-1}(\{0\}) (so that \mathbb{S}^1=\mathbb{V}(x^2+y^2-1)). Perhaps then a good definition of a ‘curve’ is a set of the form \mathbb{V}(f_1,\cdots,f_k) for some sufficiently well-behaved functions f_1,\cdots,f_k. That said, there is another notion of curve which is just as natural as the locus of a set of ‘nice’ functions. Namely, a curve can be thought of as a ‘path’, or the trace of a moving particle, or more importantly the function defining the path.
To be precise, a curve could also be defined as a sufficiently nice mapping \gamma:I\to\mathbb{R}^n for some (possibly infinite) non-empty interval I\subseteq\mathbb{R}. There is a large connotational difference between curves thought of as the locus of a set of functions and as a ‘path’. In particular, a ‘path’ carries notions of how quickly one traverses it, whether one turns around, etc., whereas the locus of a set of functions is just a set. Still, there seems to be a pretty obvious ‘connection’ between ‘paths’ \gamma:I\to\mathbb{R}^n and loci. Namely, it seems intuitively obvious that at least the locus \mathbb{V}(f_1,\cdots,f_k) of a set of functions and the image \gamma(I) of some ‘path’ \gamma are the same kind of ‘object’ (i.e. just sets). A little thought shows, however, that they are definitely not in one-to-one correspondence. For example, consider the hyperbola \mathbb{V}(x^2-y^2-1). This is a perfectly nice ‘curve’, yet there evidently does not exist a sufficiently nice (e.g. continuous) ‘path’ \gamma:I\to\mathbb{R}^2 with \gamma(I)=\mathbb{V}(x^2-y^2-1), since the right hand side is not connected and the left hand side necessarily is. There is hope, though, of finding a ‘path’ whose image is equal to part of the hyperbola. In particular, if one restricts \mathbb{V}(x^2-y^2-1) to points with positive x-coordinates then the path \gamma:\mathbb{R}\to\mathbb{R}^2 having \gamma:t\mapsto (\cosh(t),\sinh(t)) is a perfectly nice (C^\infty) ‘path’ with \gamma(\mathbb{R}) equal to the aforementioned branch of the hyperbola. Thus, one wonders if perhaps there is some condition on a curve (or a point of a curve) that guarantees that the curve is locally equivalent to the image of a ‘path’. In fact, there is a theorem to this effect, but it is perhaps a more sophisticated answer than one would expect; in particular, it is a close cousin of the inverse function theorem (each can be deduced from the other).
Roughly, the theorem states that if one has a level curve, and if the ‘derivative’ of the defining functions is suitably non-degenerate at some point (in the simplest case, a non-zero partial derivative), then in some neighborhood of that point the level curve is the graph of a function! Not to point out the obvious, but math is full of simple questions with startlingly complicated answers. This is perhaps one of the most profound examples, providing an integral link between the algebra of functions and their geometry.
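As a quick sanity check of the hyperbola example, here is a minimal Python sketch (the helper names gamma and on_hyperbola are made up for illustration) that samples the path t\mapsto(\cosh(t),\sinh(t)) and confirms numerically that every sample lies on \mathbb{V}(x^2-y^2-1) and has x>0. The point (-1,0) also lies on the hyperbola, so the path cannot cover the whole locus.

```python
import math

def gamma(t):
    """The path t -> (cosh t, sinh t) from the discussion above."""
    return (math.cosh(t), math.sinh(t))

def on_hyperbola(point, tol=1e-9):
    """Check x^2 - y^2 = 1 up to a relative floating-point tolerance."""
    x, y = point
    return abs(x * x - y * y - 1.0) <= tol * max(1.0, x * x)

# Sample the path at t = -5.0, -4.9, ..., 5.0.
samples = [gamma(k / 10.0) for k in range(-50, 51)]

# Every sample satisfies the defining equation ...
all_on_curve = all(on_hyperbola(p) for p in samples)
# ... and every sample sits on the branch with x > 0.
all_right_branch = all(x > 0 for x, _ in samples)
```

A finite sample proves nothing, of course; the real argument is the identity \cosh^2(t)-\sinh^2(t)=1 together with \cosh(t)\geqslant 1.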

\text{ }


Let I\subseteq\mathbb{R} be an open interval. We call a map \gamma:I\to\mathbb{R}^n a parametrized curve, or just a curve for short. A curve \gamma:I\to\mathbb{R}^n is called C^k if its n coordinate functions \gamma_1,\cdots,\gamma_n are C^k in the usual sense. From this point on we assume that all curves are C^\infty, also known as smooth.

\text{ }

We define for a curve \gamma:I\to\mathbb{R}^n the tangent vector curve \gamma':I\to\mathbb{R}^n to be given by \gamma':t\mapsto (\gamma_1'(t),\cdots,\gamma_n'(t)); we call the image of a point under the tangent vector curve a tangent vector. How should one interpret the tangent vector? Intuitively, it’s a vector that matches the curve well near the point where it’s defined. In more big boy language, recall that the total derivative D_\gamma(p) gives rise to the map x\mapsto \gamma(p)+D_\gamma(p)(x-p) in \text{Aff}\left(\mathbb{R},\mathbb{R}^n\right) (i.e. an affine transformation \mathbb{R}\to\mathbb{R}^n), which is the affine transformation that best approximates \gamma in a neighborhood of p\in I. Since D_\gamma(p) is linear we have for any x\in\mathbb{R} that D_\gamma(p)(x-p)=(x-p)D_\gamma(p)(1). Thus, the affine transformation that best approximates \gamma near p is A(x)=\gamma(p)+(x-p)D_\gamma(p)(1). Take a quick guess what D_\gamma(p)(1) is. You’re right, it’s \gamma'(p)! In particular, we have that D_\gamma(p) is injective if and only if \gamma'(p)\ne 0.
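To make ‘best approximates’ a little more tangible, here is a small numerical sketch. The helix, the base point p=0.7 and all function names are assumptions picked purely for illustration. It compares \gamma(p+h) with the affine map A(x)=\gamma(p)+(x-p)\gamma'(p) and records the error divided by h, which should shrink as h does.

```python
import math

def gamma(t):
    """A sample smooth curve in R^3 (a helix), chosen only for illustration."""
    return (math.cos(t), math.sin(t), t)

def gamma_prime(t):
    """Coordinate-wise derivative of the helix, i.e. the tangent vector."""
    return (-math.sin(t), math.cos(t), 1.0)

def affine_approx(p, x):
    """A(x) = gamma(p) + (x - p) * gamma'(p)."""
    return tuple(g + (x - p) * d for g, d in zip(gamma(p), gamma_prime(p)))

def error(p, h):
    """Euclidean distance between gamma(p + h) and A(p + h)."""
    return math.dist(gamma(p + h), affine_approx(p, p + h))

p = 0.7
# error(p, h) / h tends to 0 as h -> 0, which is what differentiability says.
ratios = [error(p, h) / h for h in (1e-1, 1e-2, 1e-3)]
```

For this particular curve the error behaves like h^2/2 (by Taylor's theorem, since |\gamma''|=1 for the helix), so each ratio is roughly a tenth of the previous one.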

\text{ }

Remark: We extend the notion of tangent vectors to higher derivatives in the obvious way (i.e. \gamma^{(k)}(t)=(\gamma_1^{(k)}(t),\cdots,\gamma_n^{(k)}(t))).

\text{ }

Some of our obvious notions of how the tangent vector curve should behave are true. For example, something whose tangent vector curve is constant should morally be a line. Sure enough, one checks that this is true: \gamma'(t) being constant for all t\in I implies that each \gamma_j'(t) is constant for all t\in I, so that \gamma_j(t)=a_jt+b_j for some constants a_j,b_j, i.e. \gamma(t)=ta+b (a line, or a single point if a=0), from where the conclusion follows (for the sake of emphasis, recall that this required that I be connected).

\text{ }

Now, we come to the fact mentioned in the motivation for this post: there is another natural definition of a ‘curve’. Namely, we define a smooth level curve to be the locus \mathbb{V}(f_1,\cdots,f_k) of smooth functions f_j:\mathbb{R}^n\to\mathbb{R}. What we’d hope is that every level curve could be parametrized (in the sense that there is a parametrized curve whose image is the level curve) so that the two theories would literally be the same. Unfortunately, as was pointed out with \mathbb{V}(x^2-y^2-1), this isn’t possible. That said, it’s clear for example that the hyperbola can be ‘locally’ parametrized (e.g. the right branch, where x>0, can be parametrized by t\mapsto(\cosh(t),\sinh(t)), or as a graph over the y-axis by y\mapsto(\sqrt{1+y^2},y)). This is ‘typical’ in a sense. Namely, we shall see that under relatively mild conditions one can locally parametrize a level curve with a particularly nice form, namely (possibly after reordering the coordinates) something of the form x\mapsto (x,f(x)) where x and f(x) might both be vectors in some Euclidean space. Said differently, under relatively mild conditions we shall see that at most points on most level curves the curve locally looks like the graph of a function. Another way one could interpret this is that the equations defining the level curve are ‘locally solvable’.

\text{ }

Theorem (Implicit Function Theorem): Let U\subseteq\mathbb{R}^n\times\mathbb{R}^m be open and f\in C^1(U,\mathbb{R}^m) (recalling the definition of C^k), thought of as f(x;y) where x\in\mathbb{R}^n and y\in\mathbb{R}^m. Let (x_0,y_0)\in U. We can partition the Jacobian matrix \text{Jac}_f(x_0,y_0) into two blocks: let D_X denote the m\times n matrix whose (i,j)^{\text{th}} entry is \displaystyle \frac{\partial f_i}{\partial x_j}(x_0,y_0) for i=1,\cdots,m and j=1,\cdots,n, and let D_Y denote the m\times m matrix whose (i,j)^{\text{th}} entry is \displaystyle \frac{\partial f_i}{\partial y_j}(x_0,y_0) where i,j\in[m] (note we used x_j,y_j just to emphasize we are thinking of them as living in ‘different spaces’). If D_Y is invertible then there exists a neighborhood V of x_0 and a function g\in C^1(V,\mathbb{R}^m) such that g(x_0)=y_0 and f(x,g(x))=f(x_0,y_0) for all x\in V. Moreover, writing D_X(x,y) and D_Y(x,y) for the same two blocks of \text{Jac}_f(x,y) at a general point, after shrinking V if necessary D_Y(x,g(x)) is invertible for every x\in V and D_g(x)=-D_Y(x,g(x))^{-1}D_X(x,g(x)).
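For a concrete instance of the statement (an illustration added here, with the choice of f, of the base point and of the explicit g all specific to this example), take n=m=1, the circle function f(x,y)=x^2+y^2-1 and (x_0,y_0)=(0,1). Everything in the theorem can then be written down by hand:

```latex
\begin{align*}
D_X &= \frac{\partial f}{\partial x}(0,1) = 0, &
D_Y &= \frac{\partial f}{\partial y}(0,1) = 2 \neq 0,\\
g(x) &= \sqrt{1-x^2} \quad (x \in V = (-1,1)), &
f(x,g(x)) &= x^2 + (1-x^2) - 1 = 0 = f(0,1),\\
g'(x) &= -\frac{x}{\sqrt{1-x^2}} = -\bigl(2g(x)\bigr)^{-1}(2x).
\end{align*}
```

At the point (1,0), by contrast, the y-partial 2y vanishes and the circle is not the graph of a function of x in any neighborhood of that point, so the invertibility hypothesis cannot simply be dropped.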

Proof: We define the auxiliary function F\in C^1(U,\mathbb{R}^n\times\mathbb{R}^m) given by F(x,y)=(x,f(x,y)). One then finds that

\text{ }

\displaystyle \text{Jac}_F(x_0,y_0)=\left(\begin{array}{c|c} I_n & 0\\ \hline D_X & D_Y\end{array}\right)

\text{ }

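and so \det\text{Jac}_F(x_0,y_0)=\det D_Y\ne 0, i.e. \text{Jac}_F(x_0,y_0) is also invertible. Thus, we may apply the inverse function theorem to F. Writing c=f(x_0,y_0), so that F(x_0,y_0)=(x_0,c), we find that there exist neighborhoods A of x_0 and B of c and a map G\in C^1(A\times B,\mathbb{R}^{n}\times\mathbb{R}^m), a local inverse of F with G(x_0,c)=(x_0,y_0), such that F(G(x,z))=(x,z) for all (x,z)\in A\times B. Since F leaves the first n coordinates alone, so does G; letting G_2(x,z) denote the last m coordinate functions of G we therefore have f(x,G_2(x,z))=z. One can now easily verify that the g we are after is g(x)=G_2(x,c) on V=A. The derivative formula then follows by differentiating f(x,g(x))=c with the chain rule, which gives D_X+D_YD_g(x)=0 where D_X and D_Y are now evaluated at (x,g(x)), and solving for D_g(x) (this is possible near x_0 since invertibility of D_Y persists nearby by continuity). \blacksquare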
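The derivative formula is easy to test numerically when g has no convenient closed form. The sketch below is only an illustration under stated assumptions: the function f(x,y)=y^3+y-x, the base point (x_0,y_0)=(2,1) and the use of Newton's method to compute g are all choices made here, not part of the proof above (which gets g from the inverse function theorem). It compares a finite-difference derivative of g with -D_Y^{-1}D_X.

```python
def f(x, y):
    """Sample function with f(2, 1) = 0; chosen only for illustration."""
    return y**3 + y - x

def d_x(x, y):
    """Partial derivative of f in x (the 1x1 block D_X)."""
    return -1.0

def d_y(x, y):
    """Partial derivative of f in y (the 1x1 block D_Y); never zero here."""
    return 3.0 * y**2 + 1.0

def g(x, y_start=1.0, steps=50):
    """Solve f(x, y) = 0 for y by Newton's method, starting near y_0 = 1."""
    y = y_start
    for _ in range(steps):
        y -= f(x, y) / d_y(x, y)
    return y

x0, h = 2.0, 1e-5
# Central finite difference for g'(x0).
numeric = (g(x0 + h) - g(x0 - h)) / (2 * h)
# The formula from the theorem: D_g = -D_Y^{-1} D_X at (x0, g(x0)).
formula = -d_x(x0, g(x0)) / d_y(x0, g(x0))
```

Here D_Y=3y^2+1 is invertible everywhere, so the theorem applies at every point of the level set; at (2,1) the formula gives g'(2)=1/4.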

\text{ }

\text{ }




September 15, 2011 | Analysis, Differential Geometry


  1. Small typo, I believe: When you define curve, the domain should be $\mathbb{R}$ not $\mathbb{R}^n$

    Comment by Chris | September 19, 2011 | Reply

    • Dear Chris,

      Of course! Thank you very much for pointing that out!


      Comment by Alex Youcis | September 19, 2011 | Reply

    • Oh, and also, for future reference the tags for LaTeX in wordpress are (dollar sign)(space)(LaTeX)(dollar sign) i.e. $ latex $ with no initial extra space after $

      Comment by Alex Youcis | September 19, 2011 | Reply

