In Lambda calculus, the only objects we can work with are $~$\lambda$~$ expressions, which represent functions. So how can we talk about, and perform computations on, more familiar objects like numbers? We'll have to come up with a way to encode numbers, and operations on them, as $~$\lambda$~$ expressions.
Natural numbers
Let's figure out how to express the natural numbers $~$0,1,2,\dots$~$ in $~$\lambda$~$ calculus. First of all, they shouldn't have any free variables; we don't want numbers to depend on some undefined $~$x$~$. So, like any $~$\lambda$~$ expression with no free variables, they need to start with a $~$\lambda$~$. That is, they need to be functions.
Say they only take in one input, so they're $~$\lambda x.M$~$, where $~$M$~$ is a $~$\lambda$~$ expression (that varies with the number encoded) containing only the variable $~$x$~$. What could $~$M$~$ look like? The obvious things to try look like $~$x\ (x\ (x\ x))$~$ and $~$((x\ x)\ x)\ x$~$. To encode $~$0$~$, we would use the simplest version of this: just $~$x$~$. So the examples listed earlier would encode for $~$3$~$.
While it's possible to use one of those definitions for numbers, they don't work that well. What type of object is $~$x$~$ supposed to be? Should it be a function, or the input to a function? (Yes, everything in $~$\lambda$~$ calculus can be both, but it can still help to think of those two types differently.) Instead, we'll have each natural number be a function on two inputs, of the form $~$\lambda f.\lambda x.M$~$, where we think of $~$f$~$ as a "function" and $~$x$~$ as an "object."
So what should $~$0$~$ be? How about it does nothing to its inputs, and just returns $~$x$~$, so $~$0=\lambda f.\lambda x.x$~$. What about $~$1$~$? We have a function and an object, so the natural thing to do is apply the one to the other: $~$1=\lambda f.\lambda x.f\ x$~$. At this point it should be clear how we're going to continue:
- $~$2=\lambda f.\lambda x.f\ (f\ x)$~$
- $~$3=\lambda f.\lambda x.f\ (f\ (f\ x))$~$
- $~$4=\lambda f.\lambda x.f\ (f\ (f\ (f\ x)))$~$
And so on, so $~$n$~$ is the function that, on inputs $~$f$~$ and $~$x$~$, applies $~$f$~$ to $~$x$~$ $~$n$~$ times. Informally, we could write $~$n=\lambda f.\lambda x.f^n(x)$~$.
Operations on numbers
We've defined natural numbers, but so far they aren't doing anything. We expect numbers to be able to be added and multiplied, so let's figure out how to express those in $~$\lambda$~$ calculus. We'll start easy: with the successor function $~$S(n)=n+1$~$.
[successor]
On to defining $~$S$~$. It's a function that takes in a number, so let's start with $~$\lambda n$~$. What should it return? Another number, of course, which starts $~$\lambda f.\lambda x$~$. Now we should use $~$n$~$, which applies $~$f$~$ to $~$x$~$, $~$n$~$ times. But we want $~$n+1$~$, so we want to apply $~$f$~$ to $~$x$~$, $~$n+1$~$ times. If we apply $~$f$~$ $~$n$~$ times, and then once more, we can achieve this:
$$~$S=\lambda n.\lambda f.\lambda x.f\ (n\ f\ x).$~$$
Here $~$(n\ f\ x)$~$ gives us $~$f^n(x)$~$, so $~$f\ (n\ f\ x)$~$ is $~$f(f^n(x))=f^{n+1}(x)$~$. We could do these applications in the other order, first taking $~$f\ x$~$ and then applying $~$f$~$ to that $~$n$~$ times, so
$$~$S^\prime=\lambda n.\lambda y.\lambda x.n\ f\ (f\ x).$~$$
While $~$S$~$ and $~$S^\prime$~$ are equivalent when $~$n$~$ is a natural number, they aren't equal; if we plug in $~$\lambda a.\lambda b.a$~$ for $~$n$~$, we have \begin{align} S\ (\lambda a.\lambda b.a) &= \lambda f.\lambda x.f\ ((\lambda a.\lambda b.a)\ f\ x) \newline &= \lambda f.\lambda x.f\ f \end{align} but \begin{align} S^\prime\ (\lambda a.\lambda b.a) &= \lambda f.\lambda x.(\lambda a.\lambda b.a)\ f\ (f\ x)) \newline &= \lambda f.\lambda x.f \end{align} which is a different function.
To check that $~$S$~$ works as we want, let's verify that $~$S\ 3=4$~$. For these worked examples, it's probably much more informative if you work through them yourself; they're provided here so you can check that you did it right.
%%hidden(Worked example: successor): \begin{align} S\ 3 &= (\lambda n.\lambda f.\lambda x.f\ (n\ f\ x))\ (\lambda f.\lambda x.f\ (f\ (f\ x))) \newline &= (\lambda f.\lambda x.f\ ((\lambda f.\lambda x.f\ (f\ (f\ x)))\ f\ x)) \newline &= (\lambda f.\lambda x.f\ (f\ (f\ (f\ x))) \newline &= 4 \end{align} %%
[addition]
Now that we know how to add $~$1$~$, let's figure out how to add arbitrary numbers. Suppose we have two numbers $~$m$~$ and $~$n$~$; remember $~$m$~$ means "on inputs $~$f$~$ and $~$x$~$, apply $~$f$~$ to $~$x$~$ $~$m$~$ times," and similarly for $~$n$~$. We want $~$m+n$~$ to apply $~$f$~$ to $~$x$~$ $~$m+n$~$ times, so how can we accomplish that? If we apply $~$f$~$ to $~$x$~$ $~$n$~$ times, and then $~$m$~$ more times, it should have the desired effect. Writing this as a $~$\lambda$~$ term,
$$~$+=\lambda m.\lambda n.\lambda f.\lambda x.m\ f\ (n\ f\ x)$~$$
Again $~$n\ f\ x$~$ is $~$f$~$ of $~$x$~$ $~$n$~$ times, and the $~$m\ f$~$ in front applies $~$f$~$ to the result $~$m$~$ more times. We could of course swap $~$m$~$ and $~$n$~$, since addition is commutative.
Let's verify that $~$2+3=5$~$. We have to write $~$2+3$~$ as $~$+\ 2\ 3$~$ because we haven't defined infixes in $~$\lambda$~$ calculus (though we could define $~$m+n$~$ to mean $~$+\ m\ n$~$.
%%hidden(Worked example: addition): \begin{align} +\ 2\ 3 &= (\lambda m.\lambda n.\lambda f.\lambda x.m\ f\ (n\ f\ x))\ (\lambda f.\lambda x.f\ (f\ x))\ (\lambda f.\lambda x.f\ (f\ (f\ x))) \newline &= \lambda f.\lambda x.(\lambda f.\lambda x.f\ (f\ x))\ f\ ((\lambda f.\lambda x.f\ (f\ (f\ x)))\ f\ x) \newline &= \lambda f.\lambda x.(\lambda f.\lambda x.f\ (f\ x))\ f\ (f\ (f\ (f\ x))) \newline &= \lambda f.\lambda x.f\ (f\ (f\ (f\ (f\ x)))) \newline &= 5 \end{align} %%
[multiplication]
Multiplication might seem a bit harder. Now, given two numbers $~$m$~$ and $~$n$~$, we want to apply some function $~$f$~$ to $~$x$~$ $~$m\times n$~$ times. If we take a function that applies $~$f$~$ $~$n$~$ times, and then apply that function $~$m$~$ times, we should get something that applies $~$f$~$ $~$n$~$ times, $~$n$~$ times, which is the same as applying $~$f$~$ $~$m\times n$~$ times. That is, $~$(f^n)^m(x)=f^{m\times n}(x)$~$. The intermediate function that applies $~$f$~$ $~$n$~$ times is $~$\lambda x.n\ f\ x$~$, or using [eta_convesion $~$\eta$~$-conversion], simply $~$n\ f$~$. We just need to apply $~$n\ f$~$, $~$m$~$ times, so we can define
$$~$\times=\lambda m.\lambda n.\lambda f.\lambda x.m\ (n\ f) x$~$$
or, using $~$\eta$~$-conversion again,
$$~$\times=\lambda m.\lambda n.\lambda f.m\ (n\ f).$~$$
Again multiplication is commutative, so we could swap $~$m$~$ and $~$n$~$ to get an equally good definition.
We should have $~$\times\ 2\ 3=6$~$; let's check this.
%%hidden(Worked example: multiplication): \begin{align} \times\ 2\ 3 &= (\lambda m.\lambda n.\lambda f.m\ (n\ f))\ (\lambda f.\lambda x.f\ (f\ x))\ (\lambda f.\lambda x.f\ (f\ (f\ x))) \newline &= \lambda f.(\lambda f.\lambda x.f\ (f\ x))\ ((\lambda f.\lambda x.f\ (f\ (f\ x)))\ f) \newline &= \lambda f.(\lambda f.\lambda x.f\ (f\ x))\ (\lambda x.f\ (f\ (f\ x))) \newline &= \lambda f.\lambda x.(\lambda x.f\ (f\ (f\ x)))\ ((\lambda x.f\ (f\ (f\ x)))\ x) \newline &= \lambda f.\lambda x.(\lambda x.f\ (f\ (f\ x)))\ (f\ (f\ (f\ x))) \newline &= \lambda f.\lambda x.f\ (f\ (f\ (f\ (f\ (f\ x)))))) \newline &= 6 \end{align} %%