Lecture 1: A Brief Introduction to Logic

\( \newcommand{\set}[1]{\{#1\}} \newcommand{\comprehension}[2]{\{#1\,\vert\,#2\}} \newcommand{\size}[1]{\vert#1\vert} \newcommand{\true}{\top} \newcommand{\false}{\bot} \newcommand{\limplies}{\rightarrow} \newcommand{\divides}{\mathbin{\vert}} \newcommand{\mult}{\mathbin{\cdot}} \newcommand{\xor}{\oplus} \newcommand{\union}{\cup} \newcommand{\intersect}{\cap} \newcommand{\complement}[1]{\overline#1} \newcommand{\ixUnion}{\bigcup} \newcommand{\ixIntersect}{\bigcap} \newcommand{\defeq}{\stackrel{\text{def}}{=}} \)

Administrivia

Most of the administrative information relevant to this course can be found on either the home or policy pages, which are also listed in the navigation block on the right.

Preliminaries

There's a common pattern in gateway theorem-proof courses, of which this is one, in which the textbook begins with a chapter or two of background material on logic and set theory, which are then summarized or sometimes even skipped as the introductory lecture hastens on to the “real” material of the course, as if the magical incantations contained in those introductory chapters are so powerful as to imbue the rest with precision and rigor, without even the need to speak them. As someone trained in formal logic, I find this painful, but it's hard to escape the imperatives that drive the common pattern. There is much “real” material to cover, and dealing with foundational issues comprehensively takes time we don't have.

So let me focus on what you actually need, making no claims of completeness.

Mathematical Structure and the Giving of Names

In mathematics, we study mathematical structures. These are “real” abstractions that motivate mathematical research, and to which our mathematical work is intended to apply. For example, in this course, we'll be concerned with

  - the natural numbers, $\mathfrak{N} = (\mathbb{N}; S, +, \times, 0, \le)$, and
  - the integers, $\mathfrak{Z} = (\mathbb{Z}; +, \times, 0, \le)$,

among others. Note here the spiffy Fraktur font, which mathematicians reserve for pretentious circumstances. You have to love that Fraktur Z, though. When we need to speak of the universe of a structure $\mathfrak{A}$ abstractly, we'll denote it by $U^{\mathfrak{A}}$. Thus, $U^{\mathfrak{N}} = \mathbb{N}$. In colloquial mathematics, it is routine to name mathematical structures via their universes, e.g., to use $\mathbb{N}$ in contexts where full rigor calls for $\mathfrak{N}$. If this caused confusion, people would stop doing it, but it doesn't, so you won't see much (if any) Fraktur after this lecture.

An initial concern of logic is to provide a means to make precise statements about mathematical structures, in short, a formal language for Mathematics. We begin by introducing names for the various functions and relations of a mathematical structure, together with names for some (but not necessarily all) of its elements. These are called the function, relation, and constant symbols of the language of that structure. For example, $$\mathscr{L}_{\text{PA}} = \set{S,+,\times,0,\le}$$ is the language of Peano Arithmetic, i.e., the natural numbers, with symbols $S$ for the successor function, $+$ for addition, $\times$ for multiplication, $0$ for zero, and $\le$ for the less-than-or-equal relation. In a more formal setting, we would require that type information (expressed in terms of the structure's universe) be a constituent of the definition of a language, but such minutiae are best left to a course in mathematical logic or type theory.

Note that we are making a distinction here between a symbol, e.g., $+$, and the function it represents within a particular structure, e.g., ordinary addition on the natural numbers. Part of the power of language comes from the way such distinctions disappear in the mind of the fluent, and so these distinctions are rarely observed or recognized, and that's not an altogether bad thing. But for the moment, we'll need to maintain such distinctions. The association between a symbol and a meaning is called an interpretation, and we'll write (at least for a few minutes) $+^{\mathfrak{N}}$ to represent the addition function of the natural numbers, as distinct from the symbol $+$ we use to represent it in our language.

We can then use the function and constant symbols, together with a distinct class of symbols called variables to build expressions. In this class, we'll deal with expressions somewhat informally, using the usual algebraic notations for expressions, interpreted using the precedence rules that you've been using since fourth grade. Thus, for example, $(x + 2) \times y$ is an expression built out of the variables $x$ and $y$, in an expansion of $\mathscr{L}_{\text{PA}}$ in which constants have been introduced for all of the natural numbers (or at least, for $2$).

We'd like to think of expressions as a means to augment our ability to name things, but variables add an essential complication to the idea of “expressions as names.” After all, to give a meaning to $(x + 2) \times y$, we need to know the meanings of $x$ and $y$. To that end, we'll define a valuation to be a function from the set of variables to the universe of a given structure. Valuations in logic are analogous to an environment in a programming language. Given a valuation, we can give a meaning to expressions. For example, assume that $\nu$ is an $\mathfrak{N}$-valuation, i.e., a function from variables to natural numbers, such that $\nu(x) = 2^{\mathfrak{N}}$ and $\nu(y) = 3^{\mathfrak{N}}$. We can then determine the value of the expression $(x+2)\times y$ in the structure $\mathfrak{N}$ with valuation $\nu$, as follows:

\begin{align*} ((x+2)\times y)^{\mathfrak{N},\nu} &= (x^{\mathfrak{N},\nu} +^{\mathfrak{N}} 2^{\mathfrak{N}}) \times^{\mathfrak{N}} y^{\mathfrak{N},\nu} \\ &= (\nu(x) +^{\mathfrak{N}} 2^{\mathfrak{N}}) \times^{\mathfrak{N}} \nu(y) \\ &= (2^{\mathfrak{N}} +^{\mathfrak{N}} 2^{\mathfrak{N}}) \times^{\mathfrak{N}} 3^{\mathfrak{N}} \\ &= 4^{\mathfrak{N}} \times^{\mathfrak{N}} 3^{\mathfrak{N}} \\ &= 12^{\mathfrak{N}} \end{align*}

Whew! It's easy to see why people quickly dispense with the distinction between names and values in practice.
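Since valuations play the role of environments, the computation above can also be phrased as a small interpreter. Here is a minimal sketch in Haskell; the names (`Expr`, `eval`, `Valuation`) and the use of `Integer` for the universe are illustrative choices, not part of the formal development:

```haskell
-- Expressions over an expansion of L_PA with a constant symbol for
-- each natural number; S, + and × are the function symbols.
data Expr
  = Var String      -- a variable, e.g., x
  | Const Integer   -- a constant symbol, e.g., 2
  | Succ Expr       -- the successor symbol S
  | Add Expr Expr   -- the symbol +
  | Mul Expr Expr   -- the symbol ×

-- A valuation assigns an element of the universe to each variable.
type Valuation = String -> Integer

-- eval plays the role of e^{N,ν}: interpret an expression in the
-- structure N under the valuation nu.
eval :: Valuation -> Expr -> Integer
eval nu (Var x)     = nu x
eval nu (Const n)   = n
eval nu (Succ e)    = eval nu e + 1
eval nu (Add e1 e2) = eval nu e1 + eval nu e2
eval nu (Mul e1 e2) = eval nu e1 * eval nu e2

-- The worked example: (x + 2) × y under ν(x) = 2, ν(y) = 3.
main :: IO ()
main = print (eval nu (Mul (Add (Var "x") (Const 2)) (Var "y")))  -- 12
  where
    nu "x" = 2
    nu "y" = 3
    nu _   = 0  -- an arbitrary default for all other variables
```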

Propositions

Propositional formulae are mathematical assertions. We'd like to say that assertions are either true or false, but again, variables introduce an essential complication, because the truth value of a formula that involves variables will usually depend on the values assigned to those variables, i.e., we need to account for valuations in giving meaning to formulae in much the same way that we did with expressions. For the sake of simplicity, we'll define the meaning of the various constructs as we introduce the constructs themselves.

We will use ordinary lower-case Roman letters to denote variables, e.g., $x$, $y$, and $z$, sometimes with subscripts, e.g., $x_1$, $y_2$, $z_{17}$. We will use lower-case Greek letters, often $\varphi$, $\psi$, to denote propositions. A proposition can either be true ($\true$) or false ($\false$) for a given structure $\mathfrak{A}$ and valuation $\nu$. We will write $\mathfrak{A},\nu \models \varphi$ if $\varphi$ is true for the structure $\mathfrak{A}$ and valuation $\nu$, and $\mathfrak{A},\nu \not\models \varphi$ if $\varphi$ is false for that structure and valuation.

Let $\mathfrak{A}$ be a mathematical structure, with associated language $\mathscr{L} = \mathscr{L}_{\mathfrak{A}}$.

Atomic Propositions

There are two kinds of atomic formulae: equality and relational propositions.

If $e_1$ and $e_2$ are expressions in $\mathscr{L}$, then $e_1 = e_2$ is an equality proposition, and $\mathfrak{A},\nu \models e_1 = e_2$ if and only if $e_1^{\mathfrak{A},\nu} = e_2^{\mathfrak{A},\nu}$, i.e., the interpretations of the two expressions are the same. The $=$ in the preceding sentence is being used in two quite distinct ways: in the first two occurrences, we are referring to the equality symbol, a part of the syntax of mathematical formulae, whereas, in the third and final occurrence, we are referring to the usual equality relation of mathematics. Equality is such a central concept in mathematics that we don't want to have multiple notations for it! The distinction is a bit clearer in the next case, of relational statements. As an example, $\mathfrak{N},\nu \models 2 \times x = 4$ if $\nu(x) = 2$, and $\mathfrak{N},\nu \not\models 2 \times x = 4$ otherwise. From this example, we can see that much of the algebra we've learned is focused on recovering a valuation from a set of asserted propositions. Note that the equality symbol is not specifically enumerated as a relation symbol in $\mathscr{L}$. Equality is such an important notion that we assume we always have it.

If $R$ is an $n$-ary relation symbol of $\mathscr{L}$, and $e_1,\ldots,e_n$ are expressions of $\mathscr{L}$, then $R(e_1,\ldots,e_n)$ is a relational proposition, and $\mathfrak{A},\nu \models R(e_1,\ldots,e_n)$ if and only if $(e_1^{\mathfrak{A},\nu},\ldots,e_n^{\mathfrak{A},\nu}) \in R^{\mathfrak{A}}$. For example, $\mathfrak{N},\nu \models 2 \le x\times x + 3$ for all valuations $\nu$. Note here, too, how cumbersome the use of $\times$ feels in this context, which is why mathematicians typically use either $\cdot$, as in $x \cdot x$, or more simply still, juxtaposition, as in $3x$, to express products. We will too.

Boolean Operators

We use the standard operations of boolean algebra to combine propositions.

If $\varphi$ is an $\mathscr{L}$-formula, then $\lnot \varphi$ is also an $\mathscr{L}$-formula, the negation of $\varphi$, and we define $\mathfrak{A},\nu \models \lnot \varphi$ if and only if $\mathfrak{A},\nu \not\models \varphi$.

If $\varphi$ and $\psi$ are $\mathscr{L}$-formulae, then $\varphi \land \psi$ is also an $\mathscr{L}$-formula, the conjunction of $\varphi$ and $\psi$, and $\mathfrak{A},\nu \models \varphi \land \psi$ if and only if both $\mathfrak{A},\nu \models \varphi$ and $\mathfrak{A},\nu \models \psi$.

If $\varphi$ and $\psi$ are $\mathscr{L}$-formulae, then $\varphi \lor \psi$ is also an $\mathscr{L}$-formula, the disjunction of $\varphi$ and $\psi$, and $\mathfrak{A},\nu \models \varphi \lor \psi$ if and only if either $\mathfrak{A},\nu \models \varphi$ or $\mathfrak{A},\nu \models \psi$, or both. When mathematicians use the word “or,” they do so in the inclusive sense, i.e., to say “$a$ or $b$” means that either $a$ holds, or $b$ holds, or they both do. Thus, when I said “or both” a couple of sentences ago, I was being redundant. English often assigns “or” an exclusive meaning, i.e., to say “$a$ or $b$” means either $a$ holds, or $b$ holds, but not both; cf. “Either I'll pass this class or I'll have to major in Marketing,” which does not admit in everyday usage the possibility that I'll be doing both. In any event, let this be an incentive for passing this class.

If $\varphi$ and $\psi$ are $\mathscr{L}$-formulae, then $\varphi \limplies \psi$ is also an $\mathscr{L}$-formula, an implication, whose hypothesis is $\varphi$ and whose conclusion is $\psi$, and $\mathfrak{A},\nu \models \varphi \limplies \psi$ if and only if, whenever $\mathfrak{A},\nu \models \varphi$, then $\mathfrak{A},\nu \models \psi$ also. There are two distinct complications in the foregoing. With the earlier binary boolean operators, we didn't bother to introduce distinct names for their first and second arguments (they have them, but like augend and addend, the names for the first and second arguments to an addition, knowing them is not a path to enlightenment). But we do with $\limplies$, because $\limplies$ is not commutative, so $\varphi \limplies \psi$ means something quite distinct from $\psi \limplies \varphi$, whereas conjunction and disjunction are commutative, so that $\varphi \land \psi$ means the same thing as $\psi \land \varphi$, even though they're technically distinct as propositions. Unsurprisingly, $\limplies$ isn't associative either, whereas both $\land$ and $\lor$ are. This means that we either have to parenthesize expressions that involve multiple $\limplies$, or agree on an associativity convention. We will do the latter: $\limplies$ associates to the right, so $\alpha \limplies \beta \limplies \gamma$ is just an abbreviation for $\alpha \limplies (\beta \limplies \gamma)$. This convention is probably only familiar to programmers who work with languages that support a Hindley-Milner based type system, like Haskell, ML, or Swift. The rest of you will just have to learn it now. A second complication comes from the defined meaning of $\limplies$, which is subject to some interpretation. In classical logic, the interpretation we'll use, $\mathfrak{A},\nu \models \varphi \limplies \psi$ if and only if $\mathfrak{A},\nu \not\models \varphi$ or $\mathfrak{A},\nu \models \psi$, or (redundantly) both.
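These clauses translate directly into an extension of the interpreter sketch above (again, `Formula` and `holds` are invented names used only for illustration), with the classical reading of $\limplies$ appearing as the final equation:

```haskell
-- Formulae over the expressions above; Le is the relation symbol ≤.
data Formula
  = Eq Expr Expr            -- e1 = e2
  | Le Expr Expr            -- e1 ≤ e2
  | Not Formula             -- ¬φ
  | And Formula Formula     -- φ ∧ ψ
  | Or  Formula Formula     -- φ ∨ ψ
  | Implies Formula Formula -- φ → ψ

-- holds nu phi is True exactly when N, ν ⊨ φ.
holds :: Valuation -> Formula -> Bool
holds nu (Eq e1 e2)        = eval nu e1 == eval nu e2
holds nu (Le e1 e2)        = eval nu e1 <= eval nu e2
holds nu (Not phi)         = not (holds nu phi)
holds nu (And phi psi)     = holds nu phi && holds nu psi
holds nu (Or phi psi)      = holds nu phi || holds nu psi
holds nu (Implies phi psi) = not (holds nu phi) || holds nu psi  -- classical →
```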

Quantifiers

If $\varphi$ is an $\mathscr{L}$-formula, and $x$ is a variable, then $\forall x,\, \varphi$ is also a universally quantified $\mathscr{L}$-formula. Defining $\mathfrak{A},\nu \models \forall x,\, \varphi$ requires a digression. Given a valuation $\nu$, a variable $x$, and an $a \in U^{\mathfrak{A}}$, we define

\begin{equation*} \nu[x \mapsto a](z) = \begin{cases} \nu(z), &\text{if $z$ and $x$ are distinct variables}\\ a, &\text{if $z$ and $x$ are the same variable} \end{cases} \end{equation*}

If the foregoing seems a bit odd (“Isn't it obvious that $x$ and $z$ are different variables, so why do we need an ‘otherwise’ case?”), let me clarify by noting that $x$ and $z$ aren't actually variables of $\mathscr{L}$, they're variables of the meta-language (the language about language) that we use to describe $\mathscr{L}$. To be a logician, you have to have a sixth sense as to when these distinctions become relevant. Fortunately, we won't have much of a need to do so after these preliminaries. Anyway, with this in hand, we define $\mathfrak{A},\nu \models \forall x,\,\varphi$ if and only if for all $a \in U^{\mathfrak{A}}$, $\mathfrak{A}, \nu[x \mapsto a] \models \varphi$.

In the same manner, if $\varphi$ is an $\mathscr{L}$-formula, and $x$ is a variable, then $\exists x,\, \varphi$ is also an existentially quantified $\mathscr{L}$-formula, and we define $\mathfrak{A},\nu \models \exists x,\,\varphi$ if and only if there exists an $a \in U^{\mathfrak{A}}$ such that $\mathfrak{A}, \nu[x \mapsto a] \models \varphi$.
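In the interpreter sketch, $\nu[x \mapsto a]$ is an ordinary function override, and each quantifier becomes a sweep over the universe. Since $\mathbb{N}$ is infinite, the sketch below truncates the universe to an arbitrary finite slice, so it illustrates the semantics rather than implementing it faithfully:

```haskell
-- ν[x ↦ a]: agree with nu everywhere except x, which now maps to a.
update :: Valuation -> String -> Integer -> Valuation
update nu x a z = if z == x then a else nu z

-- A finite stand-in for the universe N (the bound 100 is arbitrary).
universe :: [Integer]
universe = [0 .. 100]

-- ∀ sweeps the universe with "and"; ∃ sweeps it with "or".
holdsForall, holdsExists :: Valuation -> String -> Formula -> Bool
holdsForall nu x phi = all (\a -> holds (update nu x a) phi) universe
holdsExists nu x phi = any (\a -> holds (update nu x a) phi) universe
```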

We'll often use a notational convention that allows us to abbreviate consecutive quantifiers of the same sort by eliding all but the first occurrence of a quantifier in a block, e.g., rather than $$\forall a \, \forall b \, \forall c \, \exists x \, \exists y, \, \varphi(a,b,c,x,y),$$ we'll write $$\forall a \, b \, c \, \exists x \, y, \, \varphi(a,b,c,x,y).$$

Why do we care?

We've spent more than a little time introducing a particular foundation for mathematics, the first-order predicate calculus. Why?

A Language for Expressing Mathematical Assertions

A first, and very significant, answer is that we've introduced a precise and succinct language for expressing mathematical concepts. For example, rather than writing, “For all $n$, there exists a $p$ greater than or equal to $n$, such that $p$ is prime,” we'll write “$\forall n \, \exists p,\, p \geq n \land \text{prime}(p)$,” or even, by a mild abuse of notation, “$\forall n \, \exists p \geq n,\,\text{prime}(p)$.”

*Exercise 1.1

  1. We define $a$ divides $b$ (in notation $a \divides b$) via

    \begin{equation*} a \divides b \defeq \exists d,\, a \mult d = b \end{equation*}

    In such a case, we introduce the relation symbol $\divides$ to our language $\mathscr{L}$, and add the logical assertion

    \begin{equation*} \forall a \, b, \, a \divides b \iff \exists d,\, a \mult d = b \end{equation*}

    to our set of axioms.

    Express this assertion in English, without the use of mathematical or logical symbols other than the names of the variables. Note that the scope of a quantifier extends as far to the right as possible.

  2. A number $p$ is prime if and only if $p$ is not equal to one, and for all $n$, if $n$ divides $p$, then $n$ equals $1$ or $p$. Write a definition of the predicate $\text{prime}$ in the language of the first-order predicate calculus. Your answer should be of the form,

    \begin{equation*} \text{prime}(n) \defeq \varphi(n) \end{equation*}

    as above, for some predicate $\varphi$. It is perfectly acceptable to write $x \not= y$ instead of $\lnot\, (x = y)$, and you may use the defined notation $a \divides b$ in $\varphi$.

Exercise 1.1 illustrates that these formal languages can be expanded via definitions.

A Language for Proof

The first-order predicate calculus breathes life into Gottfried Leibniz's vision of a “calculus ratiocinator,” a means of converting logical reasoning into calculation.

A key observation here is that we've defined the various meanings of expressions and formulae in terms of the meanings of their constituents. This justifies the notion of “referential transparency,” i.e., that expressions that have the same meaning can be freely substituted for one another, and moreover this same basic principle applies to logically equivalent formulae. We'll introduce (and often name) various equivalences as we need them, but let's start with a few. A couple of simple equivalences are given by de Morgan's Laws, which state that for all propositional formulae $\alpha$ and $\beta$,

\begin{align*} \lnot (\alpha \land \beta) &\iff \lnot \alpha \lor \lnot \beta\\ \lnot (\alpha \lor \beta) &\iff \lnot \alpha \land \lnot \beta\\ \end{align*}

These laws are easily checked via an exhaustive analysis (considering all of the possible ways that $\alpha$ and $\beta$ can be assigned $\true$ or $\false$). But we can take this further, based on the analogy that a universal quantification is just a giant “and” indexed over all of the elements of a universe, and an existential quantifier is just a giant “or” in much the same way. Thus, if $\varphi$ is a propositional formula,

\begin{align*} (\lnot \forall x,\, \varphi) &\iff (\exists x,\, \lnot \varphi)\\ (\lnot \exists x,\, \varphi) &\iff (\forall x,\, \lnot \varphi)\\ \end{align*}

We can justify these equivalences via ordinary, informal mathematical reasoning, as follows. Consider just the first of these two laws. For the forward direction, the statement $\lnot \forall x,\, \varphi$ means that the formula $\varphi(x)$ isn't true for all values of $x$, i.e., that there's a counterexample $c\in U$ such that $\lnot \varphi(c)$, and thus $\exists x,\, \lnot \varphi$. For the backwards direction, the existence of a counterexample denies the universal: if there is a $c \in U$ with $\lnot \varphi(c)$, then $\varphi$ cannot hold for every $x$, so $\lnot \forall x,\, \varphi$.
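With the truncated-universe quantifiers from the earlier sketch, we can at least spot-check the first of these dualities mechanically (the choice of $\varphi$ here is arbitrary):

```haskell
-- ¬∀x φ vs. ∃x ¬φ for φ = (2 ≤ x × x), on the finite stand-in universe.
dualityCheck :: Bool
dualityCheck =
  not (holdsForall nu "x" phi) == holdsExists nu "x" (Not phi)  -- True
  where
    phi = Le (Const 2) (Mul (Var "x") (Var "x"))
    nu  = const 0  -- φ's only variable is quantified, so ν is irrelevant
```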

Another truth-table calculation yields double negation elimination, which says that for any propositional formula $\alpha$,

\begin{align*} \alpha &\iff \lnot\lnot\alpha \end{align*}
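Both de Morgan's laws and double negation elimination are small enough that the exhaustive analysis fits in a few lines of Haskell; each list comprehension below ranges over every possible assignment of $\true$ and $\false$:

```haskell
-- Truth-table checks: each equivalence holds under all assignments.
bools :: [Bool]
bools = [False, True]

deMorgan1, deMorgan2, doubleNeg :: Bool
deMorgan1 = and [ not (a && b) == (not a || not b) | a <- bools, b <- bools ]
deMorgan2 = and [ not (a || b) == (not a && not b) | a <- bools, b <- bools ]
doubleNeg = and [ a == not (not a)                 | a <- bools ]
```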

We can use double-negation elimination, together with de Morgan's laws, to derive quasi-definitions of $\land$ in terms of $\lor$ and $\lnot$, and of $\lor$ in terms of $\land$ and $\lnot$, as follows:

Theorem 1.2

\begin{align} \alpha \land \beta &\iff \lnot (\lnot \alpha \lor \lnot \beta) \\ \alpha \lor \beta &\iff \lnot (\lnot \alpha \land \lnot \beta) \\ \end{align}

Proof. For (1)

\begin{align} \alpha \land \beta &\iff \lnot \lnot (\alpha \land \beta) \\ &\iff \lnot (\lnot \alpha \lor \lnot \beta) \\ \end{align}

where (3) follows from double negation elimination, and (4) from the first of de Morgan's laws, and for (2)

\begin{align*} \alpha \lor \beta &\iff \lnot \lnot (\alpha \lor \beta) \\ &\iff \lnot (\lnot \alpha \land \lnot \beta) \\ \end{align*}

in much the same way.

$\Box$ Theorem 1.2.

*Exercise 1.3 In much the same way as above, derive quasi-definitions for

  1. $\forall$ in terms of $\exists$ and $\lnot$, and
  2. $\exists$ in terms of $\forall$ and $\lnot$.

Exercise 1.4 Read Chapter 1 of Rosen's Discrete Mathematics and Its Applications. Rosen does a traditional introduction to logic, beginning with the propositional calculus (no quantifiers, just the boolean operations) and truth-tables, and continuing with the first-order predicate calculus, quantifier manipulations, and proofs. Work the following exercises from the text:

There are computer programs, known generally as “proof assistants,” which facilitate the development of rigorously checked, formal proofs. Learning such a system goes beyond the scope of this class, but there's something both comforting and surprising in knowing that the “calculus ratiocinator” isn't just a dream. Given a suitable formalism (like the first-order predicate calculus), proofs can be reduced to a kind of formal calculation, and subjected to algorithmic verification.
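For a concrete taste, here is the second of de Morgan's laws, stated and machine-checked in Lean 4 (a sketch; Lean is one proof assistant among several, and this example is not part of the course material):

```lean
-- ¬(p ∨ q) ↔ ¬p ∧ ¬q, as a machine-checked proof term.
theorem deMorgan_or (p q : Prop) : ¬(p ∨ q) ↔ ¬p ∧ ¬q :=
  ⟨fun h => ⟨fun hp => h (Or.inl hp), fun hq => h (Or.inr hq)⟩,
   fun hn hpq => hpq.elim hn.1 hn.2⟩
```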