Infinite sets

With the axioms above we have no way to prove the existence of an infinite set. The various operations we have, union, intersection, relative complement, power set and Cartesian product, all have the property that when applied to finite sets they produce finite sets.

We have a number of things though which, if they are sets, must be infinite. One is the natural numbers. If the natural numbers are a set then the set pairs \({ ( x , y ) }\) with \({ x ≤ y }\) is a partially ordered set with no greatest element. We’ve already seen that any partial order on a finite set has a greatest element, so the natural numbers can’t be a finite set.

You might object that I’ve used the set of natural numbers in examples. Examples are meant to guide your intuition though so I sometimes presuppose things we don’t yet know to be true. I’ve been careful to confine the natural numbers to examples though and not to use the existence of such a set in proving theorems.

Another set which, if is exits, must be infinite is that set of lists of items in a given non-empty set \({ A }\). Any set which contains all such lists must be infinite, which we can see as follows.

Suppose \({ A }\) is a non-empty set and \({ B }\) is a set which contains all lists whose items are members of \({ A }\). We can write down a Boolean expression which identifies which elements of \({ B }\) are actually lists all of whose items are members of \({ A }\) so we can use Selection to conclude that there is a set \({ C }\) whose members are precisely such lists. \({ A }\) is non-empty, so there is an \({ x ∈ A }\). Let \({ F }\) be the set of pairs of lists \({ ( D , E ) }\) where \({ E }\) is the list obtained by appending \({ x }\) onto \({ D }\). Then \({ F }\) is an injective function from \({ C }\) to itself. It is not surjective because the empty list \({ ∅ }\) is not in its range. But we’ve already shown that every injective function from a finite set to itself is surjective, so \({ C }\) cannot be finite. Subsets of finite sets are finite so \({ B }\) can’t be finite either.

There are a number of ways to get infinite sets but all of them involve introducing some new axiom. We could just introduce an axiom saying that there is an infinite set. This turns out not to be sufficient to establish the existence of all the infinite sets that we want. The usual procedure is to introduce an axiom establishing the existence of one particular infinite set which is in some sense large enough.

Natural numbers

There are multiple ways to implement natural numbers within set theory.

The most common way is via the von Neumann ordinals. Consider the operation \({ ' }\) on sets defined by \[ A ' = A ⋃ \{ A \} . \] Then \[ ∅ ' = \{ ∅ \} , \quad ∅ '' = \{ ∅ , \{ ∅ \} \} , \quad ∅ ''' = \{ ∅ , \{ ∅ \} , \{ ∅ , \{ ∅ \} \} \} , ... \] In general the set represented by \({ ∅ }\) followed by \({ n }\) apostrophes has \({ n }\) members, each of which is one of the sets represented by \({ ∅ }\) followed by \({ m }\) apostrophes where \({ m }\) is a natural number less than \({ n }\). We will call the sets of this form finite ordinals.

It is possible to define operations \({ + }\) and \({ · }\) on sets in such a way that whenever \({ B }\) and \({ C }\) are finite ordinals we have the properties.

\({ \{ ∀ B . [ ¬ ( B ' = ∅ ) ] \} }\)
\({ \{ ∀ B . [ ( B + ∅ ) = B ] \} }\)
\({ ( ∀ B . \{ ∀ C . [ ( B + C ' ) = ( B + C ) ' ] \} ) }\)
\({ \{ ∀ B . [ ( B · ∅ ) = ∅ ] \} }\)
\({ [ ∀ B . ( ∀ C . \{ ( B · C ' ) = [ ( B · C ) + B ] \} ) ] }\)

The first of these doesn’t reference the operations \({ + }\) and \({ · }\) at all and is easily proved. \({ B }\) is a member of \({ B ' }\) and \({ ∅ }\) has no members so \({ B }\) and \({ ∅ }\) cannot be equal. To prove the other four properties one needs the definitions of \({ + }\) and \({ · }\), which are rather complicated. We won’t do this. Instead we note that the properties listed above correspond exactly to the five axioms of Peano arithmetic. Peano arithmetic had rules of inference as well as axioms but the rules of inference also carry over. The most complicated of those rules of inference, the rule of induction, follows from the induction property of finite sets discussed earlier. In this way it is possible to build a model of Peano arithmetic entirely within simple set theory.

An alternative method to construct natural numbers is via lists. Choose some \({ x }\). The choice doesn’t matter but a convenient one is \({ x = ∅ }\), since we have an axiom which says it exists. Then a natural number is just a list all of whose items are \({ x }\). The intuition is that the length of the list is a natural number and there is one such list for each natural number so we can use the lists as representatives for the number. From this point of view it’s clear how we should define 0, incrementation and addition. 0 is just the empty list \({ () }\). \({ A ' }\) is the result of appending an \({ x }\) to \({ A }\). \({ A + B }\) is the result of concatenating \({ A }\) and \({ B }\). Multiplication is somewhat trickier to define but this can be done in such a way that the axioms of arithmetic are all satisfied.

The standard point of view in mathematics and logic is not that the finite ordinals have the same behaviour as the natural numbers but rather that they are the natural numbers. At least this is the point of view most mathematicians and logicians claim to have. It’s debatable how many really believe this. One property of the definitions above is that \({ ∅ ' ∈ ∅ ''' }\) or, in the usual notation for natural numbers \({ 1 ∈ 3 }\). With the interpretation described above this is simply a theorem about the natural numbers but it would be hard to find a mathematician who would be comfortable describing \({ 1 ∈ 3 }\) as a true statement, or even a meaningful one.

The situation here is similar to the one we encountered earlier with ordered pairs. A computer science perspective is more useful here than a mathematical or logical one. The natural numbers have an interface. Some operations, like \({ + }\) and \({ · }\), are defined on them, as are some relations, like \({ = }\) and \({ ≤ }\). These operations and relations are guaranteed to have certain properties, described by the Peano axioms.

Operators outside the public interface of the natural numbers, like \({ ⋂ }\) and \({ ⋃ }\) may behave differently in different implementations, as may relations like \({ ∈ }\) and \({ ⊆ }\). In practice once the implementation has been set up and the required properties have been proved one only ever uses the public interface. One doesn’t, and shouldn’t, write down statements like \({ 1 ∈ 3 }\).

The fact that we can implement Peano arithmetic within set theory has an important consequence. Peano arithmetic is either inconsistent or incomplete. As a result set theory must also be either inconsistent or incomplete.

The set of natural numbers

It’s important to understand what we have and haven’t done above. We’ve found objects which behave like the natural numbers. We haven’t shown that they form a set. In fact we can’t show this with the axioms we have because these sets, if they exist, are infinite and our axioms are not sufficient to show the existence of any infinite set. We need a new axiom in order to have not just natural numbers but a set of natural numbers, which is what we need if we’re to extend arithmetic beyond Peano arithmetic.

The simplest approach is just to assume these sets exist as an axiom.

For the von Neumann implementation of the natural numbers this means taking the following as an axiom:

Infinity: There is a set \({ A }\) such that \({ ∅ ∈ A }\) and and \({ B ⋃ \{ B \} ∈ A }\) whenever \({ B ∈ A }\). Formally, \[ [ ∃ A . [ [ ∅ ∈ A ] ∧ [ ∀ B ∈ A : B ⋃ \{ B \} ∈ A ] ] ] . \]

The set \({ B }\) is not necessarily the set \({ N }\) that we’re looking for. As with a number of previous axioms we’ve assumed the existence of a set which is large enough to contain everything we want, but might contain other things. To remove those we use the axiom of separation, selecting only those \({ B ∈ A }\) which are finite and satisfy the condition \[ [ ∀ C ∈ B : [ ∀ D ∈ C : [ [ D ∈ B ] ∧ [ ∀ E ∈ D : E ∈ C ] ] ] ] . \]

If we use the list implementation of the natural numbers then it’s more natural to use the following axiom

Infinity (alternative version): For every set \({ C }\) there is a set \({ D }\) whose members are the lists all of whose items are members of \({ C }\).

The formal version of this axiom is rather long since it needs to incorporate the definition of a list.

Although these axioms were designed for the same purpose they are not quite equivalent, in the sense that we can’t prove either from the other and our other axioms and rules of inference. They become equivalent though if you assume the following additional axiom schema.

Replacement: For any Boolean expression \({ P }\) in which \({ x }\), \({ y }\), and \({ A }\) appears freely but \({ B }\) does not the statement that for each \({ A }\) if \({ x ∈ A }\) implies that there is a unique \({ y }\) such that \({ P }\) then there is a set \({ B }\) such that \({ y ∈ B }\) if and only if there is an \({ x ∈ A }\) such that \({ P }\). Formally this is \[ \begin{split} & [ ∀ A . [ [ ∀ x ∈ A : [ [ ∃ y . P ] ∧ [ [ P ∧ Q ] ⊃ y = z ] ] ] \\ & \quad {} ⊃ [ ∃ B . [ ∀ y . [ [ [ y ∈ B ] ⊃ [ ∃ x ∈ A : P ] ] ∧ [ [ ∃ x ∈ A : P ] ⊃ [ y ∈ B ] ] ] ] ] ] ] . \end{split} \] Here \({ z }\) is a variable which does not occur in \({ P }\) and \({ Q }\) is the result of replacing all free occurrences of \({ y }\) in \({ P }\) with \({ z }\).

The Axiom of Replacement can be understood as follows. If we have sets \({ A }\) and \({ B }\) and a Boolean expression involving variables \({ x }\) and \({ y }\) then we can form a relation from \({ A }\) to \({ B }\) which is the set of pairs \({ ( x , y ) }\) where \({ x ∈ A }\) and \({ y ∈ B }\). There are some conditions, i.e. Boolean expressions, which need to be satisfied for this relation to be a surjective function, in which case \({ A }\) is its domain and \({ B }\) is its range. The Axiom of Replacement says that if \({ A }\) is a set and we have such a Boolean expression satisfying those conditions then there is indeed a set \({ B }\) such that the relation defined as above is a surjective function from \({ A }\) to \({ B }\).

The Axiom of Replacement was not part of Zermelo’s set theory. It was introduced later by Fraenkel. It is often convenient, but rarely necessary, to assume it in order to prove standard mathematical theorems. It forms part of what’s called Zermelo-Fraenkel set theory, which is the version of set theory most mathematicians use.

Since it’s rarely used I don’t want to spend too much time on it but I will sketch how you can prove the alternative version of the Axiom of Infinity from the first version and the Axiom of Replacement.

One nice property of both implementations of the natural numbers is that the natural number \({ n }\) is a set with \({ n }\) members. With either implementation we can therefore prove that for every finite set there is a bijective function from that set to a natural number, considered as a set. This is in fact fairly straightforward to prove by set induction.

Since we’re assuming the first version of the Axiom of Infinity we’ll use the implementation of the natural numbers as von Neumann ordinals, so that the set of natural numbers is known to exist. We can write down a Boolean expression with free variables \({ x }\) and \({ w }\) expressing the following:

\({ x }\) is a natural number, i.e. a von Neumann ordinal, and
\({ w }\) is a list all of whose items are members of \({ C }\),
there is a bijective function from \({ x }\) to \({ w }\).

Each of these statements individually is straightforward, if rather tedious, to express in our language and we just need to combine them with \({ ∧ }\)’s. Using this combined Boolean expression we can construct a second expression, with free variables \({ x }\) and \({ y }\) expressing the fact that \({ y }\) is a set and \({ w ∈ y }\) if and only if the first expression is satisfied. This second expression then has the interpretation that \({ y }\) is the set of all lists with \({ x }\) items, all of which are members of \({ C }\). We then apply the Axiom of Replacement with \({ A = N }\) and \({ P }\) being the Boolean expression we just constructed. This gives us a set \({ B }\) whose members are the \({ y }\)’s make the expression true for some \({ x ∈ N }\). In other words they are the sets of lists of each length. Every list is a finite set so by the theorem from the preceding paragraph it’s a member of one of these sets. So the set of all lists of items in \({ C }\) is \({ D = ⋃ B }\).

There’s a similar, slightly easier argument which shows that the alternative form of Axiom of Infinity, together with the Axiom of Replacement, implies the first form.