Lists

The main goals of this section is make precise what we mean by a list, and to define various other useful objects in terms of them.

Chains and pairs

A chain is a set \({ A }\) such that for all \({ B ∈ A }\) and \({ C ∈ A }\) we have \({ B ⊆ C }\) or \({ C ⊆ B }\). Any subset of a chain is also a chain. Every non-empty finite chain has a least member, i.e. a member which is a subset of every other member, and a greatest member, i.e. a member such that every other member is a subset of it. This can be proved by induction on sets. The least member is \({ ⋂ A }\) and the greatest member is \({ ⋃ A }\).

As an example, for any \({ x }\) and \({ y }\) the set \({ \{ \{ x \} , \{ x , y \} \} }\) is a chain, with \({ \{ x \} }\) as its least member and \({ \{ x , y \} }\) its greatest member. This is true even in the case \({ x = y }\), although then \({ \{ x \} }\) and \({ \{ x , y \} }\) are the same set. As another example, for each natural number \({ m }\) we can consider the set of all natural numbers \({ n }\) such that \({ m ≤ n }\). The set of such sets, one for each \({ m }\), form a chain. This chain has a greatest member, the set of all natural numbers, but no least element. That doesn’t contradict the result above because this chain is not finite. This example, of course, assumes that the set of natural numbers exists.

Every subset of a chain is a chain and every subset of a finite set is a finite set so every subset of a finite chain is a finite chain.

Suppose \({ A }\) is a finite chain of non-empty sets and \({ D ∈ A }\). Then \[ E = \{ B ∈ A : [ [ B ⊆ D ] ∧ [ ¬ [ B = D ] ] ] \} , \] i.e. the set of elements of \({ A }\) which are proper subsets of \({ D }\), is a subset of \({ A }\) and so is a finite chain. It follows from what we showed earlier that if \({ E }\) is non-empty then it has a greatest member, which we’ll call \({ C }\), and \({ C = ⋃ E }\). Now \({ C ∈ E }\) so \({ C }\) is a proper subset of \({ D }\). Therefore \({ D ∖ C }\), which is the same as \[ D ∖ ⋃ \{ B ∈ A : [ [ B ⊆ D ] ∧ [ ¬ [ B = D ] ] ] \} , \] is non-empty. If \({ E }\) is empty then the set above is just \({ D }\), which is a member of \({ A }\) and hence is also non-empty. So in either case the set above is non-empty. So for each \({ D ∈ A }\) the set above has at least one member. We say that \({ A }\) is a Kuratowski chain if each of these sets also has at most one member. In other words, a set \({ A }\) is a Kuratowski chain if and only if it satisfies the following conditions.

\({ A }\) is finite, i.e., \[ \begin{split} & [ ∀ B ∈ [ P P A ∖ ∅ ] : [ [ ∃ C ∈ B : [ ∀ D ∈ B : [ [ C ⊆ D ] ⊃ [ C = D ] ] ] ] \\ & \qquad ∧ [ ∃ D ∈ B : [ ∀ C ∈ B : [ [ C ⊆ D ] ⊃ [ C = D ] ] ] ] ] ] \end{split} \]
\({ A }\) is a chain, i.e. \[ [ ∀ B ∈ A : [ ∀ C ∈ A : [ [ B ⊆ C ] ∨ [ C ⊆ B ] ] ] ] \]
Every member of \({ A }\) is a non-empty set, i.e. \[ [ ∀ B ∈ A : [ ∃ x . [ x ∈ B ] ] ] \]
Each of the sets discussed previously has at most one element, i.e. \[ \begin{split} & [ ∀ C . [ C = D ∖ ⋃ \{ B ∈ A : [ [ B ⊆ D ] ∧ [ ¬ [ B = D ] ] ] \} ] \\ & \qquad ⊃ [ [ [ x ∈ C ] ∧ [ y ∈ C ] ] ⊃ [ x = y ] ] ] . \end{split} \]

The set \({ \{ \{ x \} , \{ x , y \} \} }\) considered earlier is an example of a Kuratowski chain. It is called the Kuratowski pair with initial element \({ x }\) and final element \({ y }\).

More generally, if \({ A }\) is a non-empty Kuratowski chain then we call the unique member of \({ ⋂ A }\) the initial element of \({ A }\) and the unique element of \[ [ ⋃ A ] ∖ ⋃ \{ B ∈ A : [ [ B ⊆ A ] ∧ [ ¬ [ B = A ] ] ] \} \] the final element of \({ A }\). The fact that these sets have exactly one element is built into the definition of Kuratowski chains. The initial and final element could be the same, as happens, for example, with the chain \({ \{ x \} }\). This is the only way that can happen though.

Suppose \({ A }\) is a non-empty Kuratowski chain with at most two elements. Let \({ x }\) be the initial element and let \({ y }\) be the final element. Let \({ C }\) be the least member of \({ A }\). By the definition of a chain the set \[ C ∖ ⋃ \{ B ∈ A : [ [ B ⊆ C ] ∧ [ ¬ [ B = C ] ] ] \} \] has only one member. \({ C }\) is a least member of \({ A }\) so the set \[ \{ B ∈ A : [ [ B ⊆ C ] ∧ [ ¬ [ B = C ] ] ] \} \] is empty and therefore \[ C ∖ ⋃ \{ B ∈ A : [ [ B ⊆ C ] ∧ [ ¬ [ B = C ] ] ] \} = C \] and \({ C }\) has only one member. Since \({ x }\) is a member of all members of \({ A }\) and \({ C }\) is a member of \({ A }\) it follows that \({ C = \{ x \} }\). So \[ \{ x \} ∈ A . \] \({ y }\) is the final element of \({ A }\) so \[ [ ⋃ A ] ∖ ⋃ \{ B ∈ A : [ [ B ⊆ A ] ∧ [ ¬ [ B = A ] ] ] \} = \{ y \} . \] Now \[ \begin{split} ⋃ A & = [ [ ⋃ \{ B ∈ A : [ [ B ⊆ A ] ∧ [ ¬ [ B = A ] ] ] \} ] \\ & \quad {} ⋃ [ [ ⋃ A ] ∖ ⋃ \{ B ∈ A : [ [ B ⊆ A ] ∧ [ ¬ [ B = A ] ] ] \} ] \end{split} \] so \[ ⋃ A = [ ⋃ \{ B ∈ A : [ [ B ⊆ A ] ∧ [ ¬ [ B = A ] ] ] \} ] ⋃ \{ y \} . \] \({ A }\) has at most two members so \({ [ ⋃ \{ B ∈ A : [ [ B ⊆ A ] ∧ [ ¬ [ B = A ] ] ] \} ] }\) is either empty or has \({ C }\) as its only member. In the first case \({ x = y }\) so \({ ⋃ A = \{ y \} }\) is the same as \[ ⋃ A = \{ x , y \} . \] In the second case \({ ⋃ A = C ⋃ \{ y \} }\) is the same as \[ ⋃ A = \{ x , y \} . \] So we get \({ ⋃ A = \{ x , y \} }\) in either case. For any Kuratowski chain \({ ⋃ A ∈ A }\) so \[ \{ x , y \} ∈ A . \] We already had \({ \{ x \} ∈ A }\) so \[ \{ \{ x \} , \{ x , y \} \} ⊆ A . \] \({ A }\) has at most two members so \[ \{ \{ x \} , \{ x , y \} \} = A . \] Therefore all non-empty Kuratowski chains with at most two members are Kuratowski pairs. We’ve already seen the converse, that all Kuratowski pairs are non-empty Kuratowski chains with at most two members.

Ordered pairs

If two Kuratowski chains are equal then they must have the same initial element and same final element. The applies in particular to pairs. What we’ve just shown implies that if two pairs have the same initial and final elements then they are equal.

Note that it’s not meaningful to talk about the initial or final member of the two element set \({ \{ x , y \} }\) since \({ \{ x , y \} = \{ y , x \} }\) but \({ \{ \{ x \} , \{ x , y \} \} }\) is not equal to \({ \{ \{ y \} , \{ y , x \} \} }\) unless \({ x = y }\). Because we can uniquely identify an initial and final element Kuratowski pairs are often called ordered pairs.

It’s tempting to try to define ordered triples analogously, so that \({ \{ \{ x \} , \{ x , y \} , \{ x , y , z \} \} }\) would be the ordered triple with \({ x }\) as its initial element, \({ y }\) as its middle element and \({ z }\) as its final element. Unfortunately this doesn’t work. Consider the ordered triple with \({ v }\) as both its initial and middle elements and \({ w }\) as its final element. According to the proposed definition above this would be \({ \{ \{ v \} , \{ v , v \} , \{ v , v , w \} \} }\), which is the same as \({ \{ \{ v \} , \{ v , w \} \} }\). Now consider the ordered triple whose initial element is \({ v }\) and whose middle and final elements are \({ w }\). According to the proposed definition above this would be \({ \{ \{ v \} , \{ v , w \} , \{ v , w , w \} \} }\), which is also the same as \({ \{ \{ v \} , \{ v , w \} \} }\). These triples have distinct middle elements though and should therefore be distinct. So the Kuratowski construction works for pairs but the analogous construction for triples does not work.

Soon we will have chains whose elements are Kuratowski pairs, each of which has initial and final elements. To avoid confusion, or at least reduce it, from now on I’ll refer to the initial element of a pair as the left component and the final element as the right component.

Lists

We can still build ordered triples, and lists more generally, using Kuratowski chains. We just have to go about it in a more complicated way.

We’ll say that \({ A }\) is a list if it satisfies the following conditions:

\({ A }\) is a finite Kuratowski chain
for each \({ B ∈ A }\) the unique member of the set \[ B ∖ ⋃ \{ C ∈ B : [ [ C ⊆ B ] ∧ [ ¬ [ B = C ] ] ] \} . \] is an ordered pair, the right component of which is \[ \{ C ∈ B : [ [ C ⊆ B ] ∧ [ ¬ [ B = C ] ] ] \} . \]

The items in the list are the left components of the pairs \[ B ∖ ⋃ \{ C ∈ B : [ [ C ⊆ B ] ∧ [ ¬ [ B = C ] ] ] \} . \] These form a set.

The empty set \({ ∅ }\) is a list, called the empty list, with no items. The first item of a non-empty list is the left component of the initial element of the list. The last item of a non-empty list is the left component of the final element of the list. The definition of lists implies that if \({ A }\) is a list and \({ B }\) is its final element then the right component of \({ B }\) is a list with one element fewer than \({ A }\). We’ll call this the truncation of \({ A }\). For any list \({ A }\) and any \({ x }\) there is a unique list whose last item is \({ x }\) and whose truncation is \({ A }\), namely \[ B = A ⋃ \{ \{ \{ x \} , \{ x , A \} \} \} . \] We say that \({ B }\) is the result of appending the item \({ x }\) to the list \({ A }\). We can build up lists by starting with the empty set and successively appending items. The result of appending the items \({ x }\), \({ y }\), and \({ z }\), in that order, to the empty set will be denoted \({ ( x , y , z ) }\), and similarly for any other list. The empty list is written as \({ ( ) }\).

The precise details aren’t terribly important but I’ll write out what the list \({ ( x , y , z ) }\) is as a set, using this technique of successively appending items. \[ ( ) = ∅ , \] \[ ( x ) = \{ \{ \{ x \} , \{ x , ∅ \} \} \} , \] \[ ( x , y ) = \{ \{ \{ x \} , \{ x , ∅ \} \} , \{ \{ y \} , \{ y , \{ \{ \{ x \} , \{ x , ∅ \} \} \} \} \} \} \] \[ \begin{split} ( x , y , z ) & = \{ \{ \{ x \} , \{ x , ∅ \} \} , \{ \{ y \} , \{ y , \{ \{ \{ x \} , \{ x , ∅ \} \} \} \} \} , \\ & \qquad \{ \{ z \} , \{ z , \{ \{ \{ x \} , \{ x , ∅ \} \} , \{ \{ y \} , \{ y , \{ \{ \{ x \} , \{ x , ∅ \} \} \} \} \} \} \} \} \} \end{split} \]

This looks rather complicated but all we really need to know is the following.

There is a Boolean expression in our language which tells us whether something is a list.
There is a Boolean expression in our language which tells us whether a list is empty.
There is a way within our language to get the last item in a non-empty list.
There is a way within our language to get a list with all but the last item.
There is a way to take a list and append one more item at the end.
There is a way to get the set whose members are the items of a list.

These are all the operations we need to perform on lists. More complicated operations, like reversing the order of the items on a list or concatenating two lists, can be defined from these basic operations. In fact LISP, whose main data structure is lists, does exactly this, with the inconsequential difference that it builds lists starting from the empty list by adding new items at the start of the list rather than the end.

Interfaces

A fundamental organising principle in programming is to define interfaces, the fundamental operations whose semantics the user of an object can rely upon, and to hide as much as possible the implementation of these interfaces, so that one implementation can be replaced by another without breaking anything, assuming users aren’t relying on anything about the objects beyond the fact that they implement one or more of these interfaces.

As an example, we now have two different ways we could implement ordered pairs, either as Kuratowski pairs or as lists with two elements. The first option might seem more efficient, since \[ \{ \{ x \} , \{ x , y \} \} \] is simpler than \[ \{ \{ \{ x \} , \{ x , ∅ \} \} , \{ \{ y \} , \{ y , \{ \{ \{ x \} , \{ x , ∅ \} \} \} \} \} \} . \] This is irrelevant though if we only operate on lists using the primitive operations considered earlier, as we should. This is one respect in which mathematics differs from programming. Efficiency of implementation matters to users in programming. There is a notion of efficiency in mathematics but it involves the ease with which we can prove that the objects constructed implement the given interface. Implementations are generally called definitions in mathematics.

In fact there are a number of other implementations of ordered pairs besides the two given above but we won’t consider any of those.

Often one interface can be implemented in terms of another. If you look at the list of operations on lists defined above you may notice that by combining the third and fourth operations and removing the last one standard interface for a stack:

We can check whether an object is a stack.
We can check whether a stack is empty.
We can pop the top item off a non-empty stack, leaving a stack with the remaining items.
We can push an item onto the top of the stack, leaving a stack with that item as its top.

Which implementation of ordered pairs should we choose? Perhaps the most useful answer is that it shouldn’t matter. If we ever do anything which works for one implementation but not the other then we are doing the mathematical equivalent of accessing objects outside of their public interfaces, which in programming is either forbidden or strongly discouraged, depending on the language. Still, we do need some implementation so we have to choose one. Efficiency, in the mathematical sense rather than the programming one, is not really an issue for us. We need lists for a number of other purposes, including the definition of a language, so we can’t avoid defining lists and establishing their main properties, even if we ultimately decide not to define ordered pairs in terms of them. We need Kuratowski pairs for the implementation of lists above, so we also needs those definitions and properties. So the same amount of work is required no matter which option we choose. I’ll use the list definition primarily because we need a notation for ordered pairs and a notation for lists and if ordered pairs are lists then we can use the same notation for both, so \({ ( x , y ) }\) is the two-element list whose first and last items are \({ x }\) and \({ y }\) respectively and also the ordered pair whose first and second coordinates are \({ x }\) and \({ y }\), since those are the same thing. Mathematicians traditionally choose the other interpretation, as Kuratowski pairs, because they need ordered pairs but don’t really need lists. In that situation Kuratowski pairs are a more efficient implementation. As mentioned before though, it doesn’t really matter.

Cartesian products

The set of ordered pairs \({ ( x , y ) }\) with \({ x ∈ A }\) and \({ y ∈ B }\) is called the Cartesian product of \({ A }\) and \({ B }\), written \({ A × B }\). In the common special case where \({ A = B }\) we often write \({ A ^ 2 }\) rather than \({ A × A }\). In a similar way we define \({ A ^ 3 }\) to be the list of ordered triples, i.e. lists of the form \(( x , y , z )\), where each of \({ x }\), \({ y }\) and \({ z }\) is an element of \({ A }\).

\({ A × B }\) is indeed a set, as are \({ A ^ 2 }\) and \({ A ^ 3 }\). This is less obvious than it might seem, but still true.

If \({ A }\) and \({ B }\) are finite sets then \({ A × B }\) is finite. This can be proved fairly easily by induction on sets. In particular, if \({ A }\) is finite then so are \({ A ^ 2 }\) and \({ A ^ 3 }\).