A language for zeroeth order logic

The propositional calculus, often called zeroeth order logic, governs the use of logical operators like “and”, “or” and “if … then”. It does not concern itself with quantifiers, like “for all” or “there exists”, which belong to first order logic. It does not concern itself with the meaning of the statements combined with those connectives. It should be noted though that it can only be expected to behave as expected when those statements are either definitely true or false. It does not cope well with statements like “this statement is false.”

Our language for zeroeth order logic will consist of variables and logical operators. The variables are Boolean variables and the logical operators are Boolean operators, but in this chapter we have no other type of variable or operator so I’ll drop the word Boolean until this next chapter. We’ll use lower case letters, starting with \({ p }\) for variables and single symbols for logical operators. In particular we’ll use \({ ∧ }\) for “and”, \({ ∨ }\) for “or”, and \({ ¬ }\) for “not”. The grammar will also allow the use of \({ ⊃ }\) for “implies”, \({ ⊼ }\) for “nand”, \({ ⊻ }\) for “nor”, \({ ≡ }\) for “if and only if”, \({ ≢ }\) for “xor”, and \({ ⊂ }\) for “if”, but we’ll only ever use the first two of those, will use the second only briefly. We’ll use an infix notation but we’ll use a fully parenthesised version rather than relying on precedence and associativity rules. To make it easier to spot matching parentheses we’ll use not just \({ ( }\) and \({ ) }\) but also \({ [ }\) and \({ ] }\) and \({ \{ }\) and \({ \} }\). These are to be regarded as fully equivalent though. Anywhere they appear in the following discussion any of the above pairs may be replaced with any other. We’ll use the lower case Latin letters \({ p }\), \({ q }\), \({ r }\), \({ s }\) and \({ u }\) for variables. We skip \({ t }\) here to avoid confusion with the constant symbol \({ t }\), meaning “true”, which isn’t part of our language, but which we well use, along with \({ f }\) for “false”, in talking about the language.

The weird symbols for logical operators are unfamiliar at first sight but forcing all symbols to be single characters shortens formulae, makes a lexical analyser unnecessary and allows us to dispense with whitespace as a way of separating symbols.

For theoretical purposes it’s convenient to allow an infinite number of variables so we’ll also allow adding arbitrarily many exclamation points to these letters to create new variables, like \({ p }\), \({ p ! }\), \({ p ! ! }\), etc. We’ll never actually encounter an example where we run out of Latin letters though, so this will remain just a theoretical possibility. Some treatments of zeroeth order logic also have constants symbols for the values “true” and “false” but these won’t be part of our language. The letters \({ t }\) and \({ f }\) will appear in various places though, like truth tables.

Our grammar is then

statement  : expression
expression : variable
           | "(" expression binop expression ")" | "(" "¬" expression ")"
           | "[" expression binop expression "]" | "[" "¬" expression "]"
           | "{" expression binop expression "}" | "{" "¬" expression "}"
variable   : letter | variable "!"
letter     : "p" | "q" | "r" | "s" | "u"
binop      : "∧" | "∨" | "⊃" | "⊼" | "⊻" | "≡" | "≢" | "⊂"

The spaces separate symbols in the specification. They aren’t part of the language. Quoted strings are terminal symbols. The quotes are not part of the symbol.

binop is short for binary operator and includes the operators ∧, ∨, ⊃, ⊼, ⊻, ≡, ≢, ⊂.

This language is unambiguous. Why? Any expression longer than a single variable is bracketed by parentheses of some type. The rules which expand to such expressions all have a single Boolean operator joining one or two expressions. There may be more operators in those expressions, but they are contained within their own set of parentheses. We can identify which rule (alternate) was used by looking at the type of parentheses and the one operator which is not further parenthesised. For example \({ [ ( ¬ p ) ∧ ( q ∨ r ) ] }\) must have been generated by

[ expression ∧ expression ]

This allows top down parsing: start with the whole statement, figure out which rule generated it, figure out which rule generated the expression(s) within it, repeat until we reach the level of variables.

We could have taken expression as the start symbol and dispensed with statement entirely. It will be convenient to have the distinction when we talk about rules of inference though. Every statement is an expression but not every expression is a statement. When we discuss the rule of substitution, for example, it’s important that when we substitute an expression for a variable in a statement that we replace all occurrences of that variable in the the statement, not just all in some particular expression occurring in the statement.

As an example of the language, consider the statement \[ \{ [ p ∨ ( q ∨ r ) ] ⊃ [ ( p ∨ q ) ∨ r ] \} . \] This has the abstract syntax tree given in the diagram.

Syntax tree for { { [ p ∨ ( q ∨ r ) ] ⊃ [ ( p ∨ q ) ∨ r ] } } — Syntax tree for \({ { [ p ∨ ( q ∨ r ) ] ⊃ [ ( p ∨ q ) ∨ r ] } }\)

There are certain symbols which are not part of our language but which we will use for talking about the language. \({ t }\) and \({ f }\) have been mentioned previously. We’ll use the upper case Latin letters \({ P }\), \({ Q }\), \({ R }\), \({ S }\) and \({ U }\) to stand for arbitrary expressions. This is particularly useful in stating rules of inference. One commonly used rule of inference, for example, says that from statements of the form \({ P }\) and \({ ( P ⊃ Q ) }\) we can deduce the statement \({ Q }\). Here any expression can be substituted for \({ P }\) and \({ Q }\). \({ P }\) and \({ Q }\) themselves, though, do not belong to the language. It’s understood, as discussed above, that different types of brackets are interchangeable so an instance of the rule above would be that from \({ ( r ∧ s ) }\) and \({ [ ( r ∧ s ) ⊃ ( r ∨ s ) ] }\) we can deduce \({ ( r ∨ s ) }\). This saves us from needing to repeat each rule three times, once for each set of brackets.