We’ll start with a subset of set theory which is almost sufficient for almost all of mathematics and computer science. As we did with elementary arithmetic we’ll borrow first order logic. We won’t borrow elementary arithmetic itself. In particular we will not assume that numbers exist. You may have noticed that the language I’ve introduced has no notation for them.
Our axioms are
Extensionality: Suppose \({ A }\) and \({ B }\) are sets and every member of \({ A }\) is a member of \({ B }\) and vice versa then \({ A = B }\).
Elementary sets: There is a set \({ ∅ }\) such that for all \({ x }\) we have \({ [ ¬ [ x ∈ ∅ ] ] }\). For all \({ x }\) we have a set \({ \{ x \} }\), of which \({ x }\) is a member and there are no other members. Similarly, for all \({ x }\) and \({ y }\) we have a set \({ \{ x , y \} }\) such that \({ x }\) and \({ y }\) are members and there are no other members.
Separation: For every variable \({ x }\), set \({ A }\) and Boolean expression \({ θ }\) the set \({ \{ x ∈ A : θ \} }\), whose members are those members of \({ A }\) for which \({ θ }\) is true, exists.
Power set: For any set \({ A }\) the power set \({ [ P A ] }\) exists. \({ [ B ∈ [ P A ] ] }\) if and only if every member of \({ B }\) is a member of \({ A }\).
Union: For every set \({ A }\) the set \({ [ ⋃ A ] }\) exists. This is the set of all members of members of \({ A }\).
These are informal statements of the axioms. Their formal equivalents will be given shortly, but they are hard to read if you don’t know what they’re trying to express. Before doing that, there are a few things to notice about these axioms.
The Axiom of Extensionality tells us that sets are characterised purely by their members. This is something which often causes confusion. We have many ways of describing sets, but the set is not its description. In terms of our language, there could be multiple set expressions which describe the same set. There could also be no set expression which describes a particular set.
The Axiom of Elementary Sets has some redundancy. The only parts we really need are are the existence of some set and the fact that for all \({ x }\) and \({ y }\) there is a set of which \({ x }\) and \({ y }\) are members. If \({ A }\) is such a set then \({ \{ w ∈ A : [ [ w = x ] ∨ [ w = y ] ] \} }\) is a set of which they are the only members. There can only be one such set, by the Axiom of Extensionality. This is the set we called \({ \{ x , y \} }\). We don’t really need to state the existence of \({ \{ x \} }\) separately. It’s the same as \({ \{ x , x \} }\). Also, if \({ A }\) is a set \({ \{ x \in A : [ ¬ [ x = x ] ] \} }\) is a set, by the Axiom of Separation, and has no members. By the Axiom of Extensionality there can only be one such set. This is the set we called \({ ∅ }\). So if there are any sets at all then there is an empty set. What about sets with more than two members? We can show that those exist using this axiom together with the Axiom of Union. \({ \{ x , y , z \} }\), for example, is \({ [ ⋃ \{ \{ x , y \} , \{ y , z \} \} ] }\).
The Axiom of Separation is not an axiom. Instead it’s what’s called an axiom schema, i.e. a common pattern for a family, indeed an infinite family, of axioms, one for each choice of variable, set and expression. It would have been better to make this into a rule of inference rather than an axiom but for historical reasons it is called an axiom. Note that the axiom doesn’t allow us to use an expression to construct a set of everything which makes that expression true, only to construct a set of those members of a given set which make the expression true. In other words, it carves out a subset from a set which is already known to exist. It can’t create sets from nothing. If you’ve been reading carefully you’ll have realised that there’s something wrong with my informal description of the Axiom of Separation. I referred to members for which a statement is true. Truth has no place in a formal system. The formal version of the axiom, or rather axiom schema, does not refer to the concept of truth.
The Axiom of Separation is useful for constructing particular subsets but it doesn’t assure us that the subsets of a given set form a set. For that we need the Power Set Axiom. I haven’t actually said what a subset is but you can probably guess. \({ A }\) is a subset of \({ B }\), written \({ [ A ⊆ B ] }\) if every member of \({ A }\) is a member of \({ B }\). As with various other axioms, instead of assuming the existence of the power set itself we could just assume the existence of some set such that all the subsets of \({ A }\) are members of it and then use the Axiom of Separation to remove any members which aren’t subsets of \({ A }\).
The Axiom of Union is used to create unions. That’s straightforward enough. As with most other axioms we could just assume the existence of some set such that every member of every member of \({ A }\) is a member of it, and then use the Axiom of Separation to remove any other members.
What may seem odd is the absence of any Axiom of Intersection. We don’t need one. We can define \({ [ ⋂ A ] }\) as \[ \{ x \in [ ⋃ A ] : [ ∀ B ∈ A : [ x ∈ B ] ] \} . \] In other words, \({ x }\) belongs to \({ [ ⋂ A ] }\) if and only if it is a member of some member of \({ A }\) which is also a member of all members of \({ A }\). If \({ A }\) is a non-empty set then the condition that it’s a member of all members of \({ A }\) implies that it’s a member of some member of \({ A }\). We couldn’t just have defined it as \[ \{ x . [ ∀ B ∈ A : [ x ∈ B ] ] \} \] though. The Axiom of Separation requires us to restrict \({ x }\) to a set. If \({ A }\) is not a non-empty set then the definition above would give \({ [ [ ⋂ A ] = ∅ ] }\). This would have some unfortunate consequences. For example, it would not be true that \({ [ A ⊆ B ] }\) implies \({ [ [ ⋂ B ] ⊆ [ ⋂ A ] ] }\). For this reason it’s better just not to define \({ [ ⋂ A ] }\) unless \({ A }\) is a non-empty subset. This okay because our version of first order logic allows expressions which don’t refer to anything.
The axioms above mix assumptions about the existence of sets with notations for them. Strictly speaking the axioms are just the part which assert the existence of a set. One important point to understand is that having a notation for something doesn’t mean it exists. Mathematics is full of notations for things which don’t exist, like \({ 1 / 0 }\). The axioms of set theory are arranged in such a way that everything we have a notation for will in fact exist, but it exists as a consequence of the axioms, not just because we happen to have included it in our language. We have a number of notations for which there are no axioms. The intersection symbol, which we just considered, is such a notation. Others are the notations for set differences, lists and Cartesian products. We’ll need to give definitions for those, just as we did for the intersection.
Here are the formal versions of the axioms.
Extensionality: \[ \begin{split} & [ ∀ A . [ ∀ B . [ [ ∃ x . [ x ∈ A ] ] \cr & \quad ⊃ [ [ ∀ y . [ [ [ y ∈ A ] ⊃ [ y ∈ B ] ] ∧ [ [ y ∈ B ] ⊃ [ y ∈ A ] ] ] ] ⊃ [ A = B ] ] ] ] ] \end{split} \]
Elementary Sets: \[ [ ∀ x . [ ¬ [ x ∈ ∅ ] ] ] \] and \[ [ ∀ x . [ ∀ y . [ ∃ A . [ [ x ∈ A ] ∧ [ y ∈ A ] ] ] ] ] \]
Separation: \[ [ ∃ B . [ ∀ x . [ [ [ x ∈ B ] ⊃ [ [ x ∈ A ] ∧ θ ] ] ∧ [ [ [ x ∈ A ] ∧ θ ] ⊃ [ x ∈ B ] ] ] ] ] \] Here \({ x }\) can be replaced by any variable, \({ A }\) by any set expression and \({ θ }\) by any Boolean expression in which \({ B }\) has no free occurrences.
Power Set: \[ [ ∀ A . [ ∃ B . [ ∀ C . [ [ ∀ x . [ [ x ∈ C ] ⊃ [ x ∈ A ] ] ] ⊃ [ C ∈ B ] ] ] ] ] \]
Union: \[ [ ∀ A . [ ∃ B . [ ∀ C ∈ A : [ ∀ x ∈ C : [ x ∈ B ] ] ] ] ] \]
These aren’t quite the axioms as they appeared initially. Instead I have incorporated some of the observations from the discussion section to shorten the axioms. For example, the formal version of the Axiom of Elementary Sets just says that \({ ∅ }\) is the empty set and that for all \({ x }\) and \({ y }\) there is a set with both \({ x }\) and \({ y }\) as members, not that there is a set with only those members, and doesn’t say anything about sets with only one member. The axiom above is therefore also known as Axiom of Pairing.
There is no set of all sets. This follows directly from the axioms. Suppose there were a set \({ A }\) such that every set is a member of \({ A }\). By the Axiom of Separation then we can form the set \[ B = \{ C ∈ A : [ ¬ [ C ∈ C ] ] \} . \] In other words \({ C }\) is the set of all sets which are not members of themselves. Is \({ B }\) a member of \({ B }\)? If not then \({ B }\) is a set which is not a member of itself, but then by the definition of \({ B }\) it is a member of \({ B }\). Similarly, if \({ B }\) is a member of \({ B }\) then it doesn’t satisfy the definition of \({ B }\) and so isn’t a member of \({ B }\). So the assumption that there is a set of all sets leads to a contradiction.
\({ A }\) is a set if and only if it’s the empty set or has at least one member. In other words, sets are characterised by the Boolean expression \[ [ [ A = ∅ ] ∨ [ ∃ x . [ x ∈ A ] ] ] . \] In the Axiom of Separation we weren’t allowed to define sets with just a Boolean expression; we needed a Boolean expression and some other set. In other words, expressions were used to define subsets of a given set, not to define sets directly. Now we can see why. If we could just use expressions to define sets then we could define the set of all sets as \[ \{ A : [ [ A = ∅ ] ∨ [ ∃ x . [ x ∈ A ] ] ] \} . \] But we’ve just seen that this set can’t exist, so an axiom which allowed us to define it would necessarily be unsound.
More generally, there is no such thing as the complement of a set. The complement of the empty set would be the set of all sets, and we’ve already seen that that doesn’t exist. The same holds for any set though. Suppose the complement of \({ A }\) existed, i.e. that there was a set \({ C }\) such that every member of \({ A }\) is not a member of \({ C }\), and vice versa. By the Axiom of Pairing there is then a set \({ B }\) with both \({ A }\) and \({ C }\) as members. By the Axiom of Union there’s then a set \({ D }\) such that every member of every member of \({ B }\) is a member of \({ D }\), and in particular every member of \({ A }\) or \({ C }\) is a member of \({ D }\). Everything is either in \({ A }\) or \({ C }\) though and therefore in \({ D }\). I’ve been deliberately rather vague about whether our language is meant to include objects which are not sets, a point we’ll need to return to later, but in either case we can use the Axiom of Separation to define \[ E = \{ x ∈ D : [ [ x = ∅ ] ∧ [ ∃ y . [ y ∈ x ] ] ] \} . \] This \({ E }\) is the set of those members of \({ D }\) which are sets, and so is the set of all sets, which we’ve already seen doesn’t exist. So there can be no such set \({ B }\).
Relative complements are meaningful though. \({ [ A ∖ B ] }\) is easily defined as \[ [ A ∖ B ] = \{ x ∈ A : [ ¬ [ x ∈ B ] ] \} . \] In some contexts we’re only concerned with subsets of one given set. We might, for example, be discussing subsets of the natural numbers, and only subsets of the natural numbers. In such a case it’s common to drop the word “relative” and just say “complement”. This is just shorthand though and the set described in this way is still a relative complement.
In the discussion of first order logic I described a class of interpretations where the variables were to be understood as members of a set and mentioned that these were not the only interpretations. We can now see why. We’re applying first order logic to set theory, and the variables are allowed to range over all sets, but there is no set of all sets, so this cannot be an interpretation of the type described earlier.
The non-existence of the set of all sets has some other awkward consequences. We’ve already met one of them. It’s what prevented us from defining \({ [ ⋂ ∅ ] }\) to be the union of all sets, which might otherwise have seemed like a way to extend the unary intersection operator to all sets while maintaining the property that if \({ [ A ⊆ B ] }\) then \({ [ [ ⋂ B ] ⊆ [ ⋂ A ] ] }\).
We can derive a number of set theory identities from zeroeth order logic identities. The basis for this is the following facts.
\({ [ A ⊆ B ] }\) if and only if \({ [ [ x ∈ A ] ⊃ [ x ∈ B ] ] }\). Indeed this is just the definition of the \({ ⊆ }\) relation.
If \({ [ A ⊆ B ] }\) and \({ [ B ⊆ A ] }\) then \({ [ A = B ] }\). This is a consequence of Extensionality.
\({ [ [ x ∈ [ A ∩ B ] ] }\) if and only if \({ [ [ x ∈ A ] ∧ [ x ∈ B ] ] }\). This is more or less the definition of the \({ ∩ }\) operator.
\({ [ [ x ∈ [ A ∪ B ] ] }\) if and only if \({ [ [ x ∈ A ] ∨ [ x ∈ B ] ] }\). This is more or less the definition of the \({ ∪ }\) operator.
\({ [ [ x ∈ [ A ∖ B ] ] }\) if and only if \({ [ ¬ [ [ x ∈ A ] ⊃ [ x ∈ B ] ] ] }\). This is more or less the definition of the \({ ∖ }\) operator.
So the three set operators \({ ∩ }\), \({ ∪ }\) and \({ ∖ }\) are expressible in terms of the four Boolean operators \({ ∧ }\), \({ ∨ }\), \({ ¬ }\), and \({ ⊃ }\). \({ ∩ }\) corresponds to \({ ∧ }\) and \({ ∪ }\) corresponds to \({ ∨ }\), which is fairly easily to remember. \({ ∖ }\) corresponds to a particular combination of \({ ¬ }\) and \({ ⊃ }\), but no set operator corresponds to \({ ¬ }\) or \({ ⊃ }\) individually. In some sense the complement operator, if there were one, would correspond to \({ ¬ }\).
As an example, consider the associativity of the union operation, i.e. the identity \({ [ [ [ A ∪ B ] ∪ C ] = [ A ∪ [ B ∪ C ] ] ] }\). \({ [ [ p ∨ [ q ∨ r ] ] ⊃ [ [ p ∨ q ] ∨ r ] ] }\) is a tautology in zeroeth order logic. Substituting \({ [ x ∈ A ] }\) for \({ p }\), \({ [ x ∈ B ] }\) for \({ q }\), and \({ [ x ∈ C ] }\) for \({ r }\) gives \[ [ [ [ x ∈ A ] ∨ [ [ x ∈ B ] ∨ [ x ∈ C ] ] ] ⊃ [ [ [ x ∈ A ] ∨ [ x ∈ B ] ] ∨ [ x ∈ C ] ] ] . \] We can replace \({ [ [ x ∈ B ] ∨ [ x ∈ C ] ] }\) with \({ [ x ∈ [ B ∪ C ] ] }\) and \({ [ [ x ∈ A ] ∨ [ x ∈ B ] ] }\) with \({ [ x ∈ [ A ∪ B ] ] }\), so \[ [ [ [ x ∈ A ] ∨ [ x ∈ [ B ∪ C ] ] ] ⊃ [ [ [ x ∈ [ A ∪ B ] ] ] ∨ [ x ∈ C ] ] ] . \] Then we can replace \({ [ [ x ∈ A ] ∨ [ x ∈ [ B ∪ C ] ] ] }\) by \({ [ x ∈ [ A ∪ [ B ∪ C ] ] ] }\) and \({ [ [ [ x ∈ [ A ∪ B ] ] ] ∨ [ x ∈ C ] ] }\) by \({ [ x ∈ [ [ A ∪ B ] ∪ C ] ] }\), so \[ [ [ x ∈ [ A ∪ [ B ∪ C ] ] ] ⊃ [ x ∈ [ [ A ∪ B ] ∪ C ] ] ] \] and hence \[ [ [ A ∪ [ B ∪ C ] ] ⊆ [ [ A ∪ B ] ∪ C ] ] . \] Similarly, \({ [ [ [ p ∨ q ] ∨ r ] ⊃ [ p ∨ [ q ∨ r ] ] ] }\) is a tautology so \[ [ [ [ A ∪ B ] ∪ C ] ⊆ [ A ∪ [ B ∪ C ] ] ] . \] Combining that with the inclusion already obtained gives \[ [ [ [ A ∪ B ] ∪ C ] = [ A ∪ [ B ∪ C ] ] ] . \] The following facts about sets can similarly be proved using tautologies borrowed from zeroeth order logic.
\({ [ [ A ∩ B ] ⊆ A ] }\)
\({ [ [ A ∩ B ] ⊆ B ] }\)
\({ [ A ⊆ [ A ∪ B ] ] }\)
\({ [ B ⊆ [ A ∪ B ] ] }\)
\({ [ [ A ∖ B ] ⊆ A ] }\)
\({ [ [ A ∩ A ] = A ] }\)
\({ [ [ A ∪ A ] = A ] }\)
\({ [ [ A ∩ B ] = [ B ∩ A ] ] }\)
\({ [ [ A ∪ B ] = [ B ∪ A ] ] }\)
\({ [ [ [ A ∩ B ] ∩ C ] = [ A ∩ [ B ∩ C ] ] ] }\)
\({ [ [ [ A ∪ B ] ∪ C ] = [ A ∪ [ B ∪ C ] ] ] }\)
\({ [ [ [ A ∩ [ B ∪ C ] ] = [ [ A ∪ C ] ∩ [ B ∪ C ] ] ] }\)
\({ [ [ [ A ∪ [ B ∩ C ] ] = [ [ A ∩ C ] ∪ [ B ∩ C ] ] ] }\)
\({ [ [ A ∩ [ A ∪ B ] ] = A ] }\)
\({ [ [ A ∪ [ A ∩ B ] ] = A ] }\)
\({ [ [ A ∖ [ A ∖ B ] ] = [ A ∩ B ] ] }\)
\({ [ [ C ∖ [ A ∩ B ] ] = [ [ C ∖ B ] ∪ [ C ∖ A ] ] ] }\)
\({ [ [ C ∖ [ A ∪ B ] ] = [ [ C ∖ B ] ∩ [ C ∖ A ] ] ] }\)
\({ [ [ A ∖ [ B ∖ C ] ] = [ [ A ∩ C ] ∪ [ B ∖ C ] }\)
\({ [ [ [ A ∖ B ] ∖ C ] = [ A ∖ [ B ∪ C ] ] ] }\)
\({ [ [ [ A ∖ B ] ∩ C ] = [ A ∩ [ C ∖ B ] ] ] }\)