Lecture 7
\( \newcommand{\set}[1]{\{#1\}} \newcommand{\comprehension}[2]{\{#1\,\vert\,#2\}} \newcommand{\size}[1]{\vert#1\vert} \newcommand{\true}{\top} \newcommand{\false}{\bot} \newcommand{\limplies}{\rightarrow} \newcommand{\divides}{\mathbin{\vert}} \newcommand{\mult}{\mathbin{\cdot}} \newcommand{\xor}{\oplus} \newcommand{\union}{\cup} \newcommand{\intersect}{\cap} \newcommand{\complement}[1]{\overline#1} \newcommand{\powerset}{\mathcal{P}} \newcommand{\ixUnion}{\bigcup} \newcommand{\ixIntersect}{\bigcap} \newcommand{\Div}{\mathrm{div}} \newcommand{\gcd}{\mathrm{gcd}} \newcommand{\divmod}{\mathop{\mathbf{divmod}}} \newcommand{\div}{\mathop{\mathbf{div}}} \newcommand{\mod}{\mathop{\mathbf{mod}}} \)

Induction, II
Let's begin by once again recalling the template for a proof by induction:
- Statement: Begin with a precise statement of the formula to be proven.
- The Basis Case: State the number $k$ where you're starting your induction and the formula $\varphi(k)$ to be proven; then prove it.
- The Induction Case: State the inductive hypothesis $\varphi(n)$, and the formula $\varphi(n+1)$ to be proven. Prove the result, clearly indicating when the inductive hypothesis (often abbreviated IH) is used.
A practical problem
Let's assume you want a formula for $$\sum_{i=1}^n i^2,$$ i.e., the sum of the first $n$ squares. What could you do? Induction allows us to verify an answer, but it's usually not much help in finding it.
Look it up...
Surely people have looked at this problem before, and surely, the answer is written down somewhere. This reduces solving the problem to looking up the answer. When I was a kid, they'd tell us to look in the “CRC Standard Mathematical Tables and Formulae.” This is still a thing, and Section 1.2.11 is “Sums of Powers of Integers.” Cool. These days, googling "what is the sum of the first n squares" yields $$n (n + 1) (2n + 1) \over 6$$ and you don't even have to trudge over to the library to find the book. Sweet.
But how would we know we got it right? Google's been known to cough up wrong answers, after all. From time to time we're given evidence that the “wisdom of the crowd” isn't all it's sometimes cracked up to be, but that's all Google can reasonably be expected to yield. Even the venerable “CRC Standard Mathematical Tables and Formulae” is currently in its 32nd edition, suggesting that the first 31 editions (at least!) might have fallen short of perfection.
Proof by induction is certainly an excellent answer to the question. But way back in the day when I was an undergraduate math major, my department taught a math class for students majoring in elementary education. They were taught to “prove” such formulas by trying them out at 7 distinct data points. Oh, how we laughed at the lack of sophistication! But think about it: If we somehow knew that the sum could be expressed as a $3$rd order polynomial (which seems quite reasonable, given the examples we've seen), it would be enough (since a nonzero polynomial of degree at most $3$ has at most $3$ roots) to verify that the polynomial we've found is correct on four distinct data points. We can easily do this, computing $p(1) = 1$, $p(2) = 5$, $p(3) = 14$, and $p(4) = 30$. The joke isn't that the elementary-school teachers in training were asked to do too little, it's that they were asked to do too much!
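This four-point check is easy to mechanize. Here's a minimal sketch (the helper name `p` is ours, not from any library) that compares the looked-up formula against direct summation at the four data points:

```python
# Check the looked-up formula p(n) = n(n+1)(2n+1)/6 against direct sums.
# Since both sides are polynomials of degree at most 3, agreement at four
# distinct points establishes equality everywhere.
from fractions import Fraction

def p(n):
    return Fraction(n * (n + 1) * (2 * n + 1), 6)

for n in range(1, 5):
    direct = sum(i * i for i in range(1, n + 1))
    assert p(n) == direct, (n, p(n), direct)

print([int(p(n)) for n in range(1, 5)])  # [1, 5, 14, 30]
```

Exact rational arithmetic (`Fraction`) avoids any worry about floating-point round-off in the division by 6.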
Variation of Parameters
The name “variation of parameters” is drawn from the study of ordinary differential equations, and refers to a technique where you guess the form of the answer (up to a few constants), plug the resulting form into the equation to be solved, and then derive and solve equations for those parameters. But this is a useful idea in general: if you know the general form of an answer, you can often use that knowledge to find a complete solution.
In this case, we might guess that the sum of the first $n$ values of an order $k$ polynomial is an order $k+1$ polynomial, as it is in the constant and linear cases, and as it is in the case of integration, a process clearly related to summation.
So we guess $$\sum_{i=1}^n i^2 = an^3 + bn^2 + cn + d.$$ We have four parameters, and so need four (independent) equations. We'll get them by considering $n\in\set{1,2,3,4}$.
\begin{align*} \sum_{i=1}^1 i^2 &= \phantom{0}1 = a \mult 1^3 + b \mult 1^2 + c \mult 1 + d\\ \sum_{i=1}^2 i^2 &= \phantom{0}5 = a \mult 2^3 + b \mult 2^2 + c \mult 2 + d\\ \sum_{i=1}^3 i^2 &= 14 = a \mult 3^3 + b \mult 3^2 + c \mult 3 + d\\ \sum_{i=1}^4 i^2 &= 30 = a \mult 4^3 + b \mult 4^2 + c \mult 4 + d\\ \end{align*}or
\begin{align*} 1 &= \phantom{00}a + \phantom{0}b + \phantom{0}c + d\\ 5 &= \phantom{0}8a + \phantom{0}b + 2c + d\\ 14 &= 27a + \phantom{0}9b + 3c+ d\\ 30 &= 64a + 16b + 4c+ d\\ \end{align*}Solving this system, e.g., by Gaussian elimination (Gauss did get around) results in
\begin{align*} a &= 1/3\\ b &= 1/2\\ c &= 1/6\\ d &= 0 \end{align*}Note that if you think carefully about the $n=0$ case, you'll see that $d$ is always zero (and therefore, $n$ is always a factor of the resulting polynomial). With a little bit of algebra, this can be massaged into the same form as the other two answers.
This technique is particularly useful if you happen to have a solver for systems of linear equations lying around, as is often the case if you're working with mathematical software.
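If no off-the-shelf solver is handy, a small exact-arithmetic Gaussian elimination suffices for systems of this size. The sketch below (the `solve` helper is ours, a minimal illustration rather than a production solver) recovers $a$, $b$, $c$, $d$ from the four data points:

```python
# Solve the 4x4 system from the four data points, in exact rational
# arithmetic.  A minimal Gaussian-elimination sketch, not a robust solver.
from fractions import Fraction

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [[Fraction(A[i][j]) for j in range(n)] + [Fraction(b[i])]
         for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [Fraction(0)] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# Rows are (n^3, n^2, n, 1) for n = 1..4; right-hand sides are the sums.
A = [[n**3, n**2, n, 1] for n in range(1, 5)]
b = [sum(i * i for i in range(1, n + 1)) for n in range(1, 5)]
a, bb, c, d = solve(A, b)
print(a, bb, c, d)  # 1/3 1/2 1/6 0
```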
A somewhat more sophisticated approach, and a closer parallel to the technique of variation of parameters from ordinary differential equations, is to notice that if the function $f(n) = \sum_{i=1}^n i^2$ has the form
$$f(n) = an^3 + bn^2 + cn,$$(recalling that $d$ must equal $0$) what we want (in the nomenclature of difference equations) is $$\nabla f(n) = f(n) - f(n-1) = n^2,$$ i.e., the difference between the sum of the first $n$ and first $n-1$ squares is $n^2$.
We can then drop our particular form for $f$ into this equation, along with the desired result, yielding
\begin{align*} 1 n^2 & = an^3 + bn^2 + cn - (a(n-1)^3 + b(n-1)^2 + c(n-1))\\ & = an^3 + bn^2 + cn - (an^3 - 3an^2 + 3an - a + bn^2 - 2bn + b + cn - c)\\ & = an^3 + bn^2 + cn - an^3 + 3an^2 - 3an + a - bn^2 + 2bn - b - cn + c\\ & = (a-a) n^3 + (b + 3a - b)n^2 + (c -3a + 2b -c)n + (a - b + c)\\ & = 3a n^2 + (2b-3a) n + (a - b + c)\\ \end{align*}We can now equate like coefficients, resulting in the system of equations
\begin{align*} 3a &= 1\\ 2b - 3a &= 0\\ a - b + c &= 0 \end{align*}Although it took a bit more work to get here, this system is easier to solve: working from top to bottom and substituting along the way, we obtain the same values for $a$, $b$, and $c$ as before, which combine with our note that $d$ is always $0$.
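The top-to-bottom substitution is short enough to spell out explicitly. A sketch in exact arithmetic:

```python
# Solve the triangular system 3a = 1, 2b - 3a = 0, a - b + c = 0
# from top to bottom by substitution.
from fractions import Fraction

a = Fraction(1, 3)      # from 3a = 1
b = Fraction(3, 2) * a  # from 2b = 3a
c = b - a               # from c = b - a
print(a, b, c)  # 1/3 1/2 1/6
```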
The Method of Differences
The introduction of $\nabla$ is suggestive. Intuitively, there's an obvious analogy between integration and summation, and the inverse operations of each are differentiation and differencing respectively. As a practical matter, the theory of integration and differentiation requires more mathematical sophistication, but involves simpler forms and so is somewhat easier to compute with. But this is a discrete math class, and so we'll have to deal with the additional combinatorial complexity. The analogy can be developed through the following simple theorem:
Theorem 7.1 For all $n \geq 1$, and all $f : \mathbb{N} \to \mathbb{N}$,
\begin{equation*} \sum_{i=1}^n \nabla f (i) = f(n) - f(0) \end{equation*}Intuitively, this is obvious:
\begin{align*} \sum_{i=1}^n \nabla f (i) &= (f(n) - f(n-1)) + (f(n-1) - f(n-2)) + \ldots + (f(1) - f(0))\\ &= f(n) + (-f(n-1) + f(n-1)) + (-f(n-2) + f(n-2)) + \ldots + (-f(1) + f(1)) - f(0)\\ &= f(n) + 0 + 0 + \ldots + 0 - f(0)\\ &= f(n) - f(0) \end{align*}This is a “telescoping series” in which all but the first and last terms cancel out, leaving a simple form.
*Exercise 7.2 Give an inductive proof of Theorem 7.1.
Note that this can be further simplified by considering the $n=0$ case. In this case, the summation is empty, and so is $0$. Thus, we can take $f(0) = 0$, giving rise to a simpler form. But we may want to use the method of differences to analyze things that aren't summations, so we'll retain the more general form of Theorem 7.1.
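Theorem 7.1 is also easy to spot-check mechanically. A small sketch (the helper name `nabla` is ours) that checks the telescoping identity for a few sample functions:

```python
# Spot-check Theorem 7.1: sum_{i=1}^n nabla f(i) = f(n) - f(0),
# for a few sample functions f and small n.
def nabla(f):
    """The backward difference operator: (nabla f)(n) = f(n) - f(n-1)."""
    return lambda n: f(n) - f(n - 1)

samples = [lambda n: n**3, lambda n: 2**n, lambda n: n * (n + 1) // 2]
for f in samples:
    for n in range(0, 8):
        lhs = sum(nabla(f)(i) for i in range(1, n + 1))
        assert lhs == f(n) - f(0)
print("telescoping identity holds on the samples")
```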
An interesting observation is that we can use this theorem to convert difference calculations into summation formulae, e.g., we can compute
\begin{align*} \nabla x^3 &= x^3 - (x-1)^3\\ &= x^3 - (x^3 - 3x^2 + 3x -1)\\ &= 3x^2 - 3x + 1 \end{align*}And immediately conclude
\begin{equation*} \sum_{i=1}^n (3i^2 - 3i + 1) = n^3 \end{equation*}We can build up a number of such formulae, and use them to solve for $\sum_{i=1}^n i^2$, much as we did last time in developing a formula for $\sum_{i=1}^n i$.
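Before using this identity, it's cheap to check it numerically for a range of $n$. A one-loop sketch:

```python
# Spot-check: sum_{i=1}^n (3i^2 - 3i + 1) = n^3, the summation form of the
# difference computation nabla x^3 = 3x^2 - 3x + 1.
for n in range(0, 50):
    assert sum(3 * i * i - 3 * i + 1 for i in range(1, n + 1)) == n**3
print("ok")
```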
Exercise 7.3 Derive a formula for $\sum_{i=1}^n i^2$ from the formula $\sum_{i=1}^n (3i^2 - 3i + 1) = n^3$ and known formulae for $\sum_{i=1}^n i$ and $\sum_{i=1}^n 1$. [Note: this is very similar to Exercise 6.8, but the starting point has a different form, reflecting a different analysis.]
Theorem 7.1 seems like it should be the foundation of an elegant theory, but we're frustrated by the fact that the forms that come out of our differencing calculations are messy, whereas derivatives of the same forms in calculus classes result in comparatively neat forms, and clean inverse calculations. But let's not give up quite so quickly. We know $\sum_{i=1}^n 1 = n$, and that $\sum_{i=1}^n i = n (n+1) /2$. There's a sense in which these are almost the simple forms of calculus; we've just seen an $n+1$ appear in a place we expected to see an $n$. And of course, the pattern seems to break with $\sum_{i=1}^n i^2 = n (n + 1) (2n + 1) / 6$, which has some hopeful structure (note the factors of $n$ and $n+1$, and the “high order” coefficient of $2/6 = 1/3$, which is what we'd expect from calculus). But it's hard to see a pattern that the $(2n+1)$ term fits into.
But maybe we're considering the wrong pattern. If the sum of first powers resulted in $n(n+1)/2$ where $n^2/2$ might be expected, perhaps there's something to be gained in considering $\sum_{i=1}^n i (i+1)$. We compute as follows:
\begin{align*} \sum_{i=1}^n i (i+1) &= \sum_{i=1}^n (i^2 + i)\\ &= \sum_{i=1}^n i^2 + \sum_{i=1}^n i \\ &= {n (n+1) (2n+1)\over 6} + {n(n+1)\over 2}\\ &= n (n+1) ({2n+1\over 6} + {1 \over 2})\\ &= n (n+1) ({2n+1+3 \over 6})\\ &= n (n+1) {(2n+4)\over 6}\\ &= {n (n+1) (n+2) \over 3} \end{align*}There's a pattern!
Definition 7.4 The $k$-th factorial power of $n$ (also known as the rising factorial) is $n^{\overline k} = n(n+1)\cdots(n+k-1)$.
We can now compute
\begin{align*} \nabla n^{\overline k} &= n^{\overline k} - (n-1)^{\overline k}\\ &= [n (n+1) \cdots (n+k-1)] - [(n-1)((n-1)+1) \cdots ((n-1) +k - 1)]\\ &= [n (n+1) \cdots (n+k-1)] - [(n-1)n(n+1) \cdots (n+k-2)]\\ &= n (n+1) \cdots (n+k-2) [(n+k-1) - (n-1)]\\ &= k n (n+1) \cdots (n+k-2)\\ &= k n^{\overline{k-1}} \end{align*}We can then sum both sides,
\begin{align*} \sum_{i=1}^n k\cdot i^{\overline {k-1}} &=\sum_{i=1}^n \nabla i^{\overline k}\\ &=n^{\overline k}\tag{By Theorem 7.1, since $0^{\overline k} = 0$.} \end{align*}i.e.,
\begin{equation*} \sum_{i=1}^n i^{\overline {k-1}} = {1\over k} \cdot n^{\overline k} \end{equation*}or, by replacing $k$ by $k+1$,
Theorem 7.5 \begin{equation*} \sum_{i=1}^n i^{\overline k} = {1\over k+1} \cdot n^{\overline {k+1}} \end{equation*}
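Theorem 7.5 can be spot-checked numerically before we set out to prove it. A sketch, with a hypothetical helper `rising(n, k)` computing $n^{\overline k}$:

```python
# Spot-check Theorem 7.5: sum_{i=1}^n i^(rising k) = n^(rising k+1) / (k+1),
# for small k and n, in exact arithmetic.
from fractions import Fraction

def rising(n, k):
    """The k-th factorial power n^(overline k) = n (n+1) ... (n+k-1)."""
    result = 1
    for j in range(k):
        result *= n + j
    return result

for k in range(0, 5):
    for n in range(0, 10):
        lhs = sum(rising(i, k) for i in range(1, n + 1))
        rhs = Fraction(rising(n, k + 1), k + 1)
        assert lhs == rhs, (n, k, lhs, rhs)
print("Theorem 7.5 holds on the samples")
```

Note that $k = 0$ and $k = 1$ recover the familiar formulas $\sum_{i=1}^n 1 = n$ and $\sum_{i=1}^n i = n(n+1)/2$.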
*Exercise 7.6 Formalize the proof of Theorem 7.5 by giving a proof by induction that $\nabla n^{\overline k} = k n^{\overline{k-1}}$ for all natural numbers $k \geq 1$. You may find it helpful to first prove the following analog of the product rule for differentiation: $$\nabla \left( f(n) \cdot g(n) \right) = f(n) \nabla g(n) + g(n-1) \nabla f(n).$$
This suggests a very elegant approach to solving the problem of summing a polynomial $p(x)$:
- Express $p(x)$ in terms of factorial powers,
- Sum the resulting expression using the integration-like Theorem 7.5, then
- Convert the result (a sum of factorial powers) back into ordinary polynomial form.
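For a small case the first step can be done by inspection: since $x(x+1) = x^2 + x$, we have $x^2 = x^{\overline 2} - x^{\overline 1}$. A sketch carrying the three steps through for $p(x) = x^2$ (the `rising` helper is ours):

```python
# The three steps for p(x) = x^2, checked in exact arithmetic.
# Step 1: x^2 = x^(rising 2) - x^(rising 1), since x(x+1) = x^2 + x.
# Step 2: sum by Theorem 7.5:  n^(rising 3)/3 - n^(rising 2)/2.
# Step 3: confirm this agrees with the ordinary-polynomial answer.
from fractions import Fraction

def rising(n, k):
    """The k-th factorial power n^(overline k) = n (n+1) ... (n+k-1)."""
    result = 1
    for j in range(k):
        result *= n + j
    return result

for n in range(0, 30):
    summed = Fraction(rising(n, 3), 3) - Fraction(rising(n, 2), 2)
    assert summed == sum(i * i for i in range(1, n + 1))
    assert summed == Fraction(n * (n + 1) * (2 * n + 1), 6)
print("factorial-power summation of x^2 checks out")
```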
The first time we actually try to carry out this approach, we have a sobering realization: we've eliminated the problem of solving a system of linear equations in the summation step, in favor of solving two systems of linear equations, one in each of the conversion steps. This is the kind of simplification that we need less of!
But this has been far from a fruitless digression. The general idea we explored here is often very fruitful: there are plenty of problems that are difficult to solve when expressed in one form, but easy to solve when expressed in a slightly different form. This can make transforming problems and solutions very attractive, cf., logarithms and multiplication. But there's more here than an “it almost worked.” We note here that the $\nabla$ of a $k$-th order polynomial must be a $(k-1)^{\text{st}}$ order polynomial, and moreover that every $(k-1)^{\text{st}}$ order polynomial arises in this way. We can conclude that the sum of a $(k-1)^{\text{st}}$ order polynomial must be a $k$-th order polynomial, justifying the form we guessed in using variation of parameters.
An Impractical Problem: Matches
Thus far, we've used proof by induction to deal with the problem of finding closed forms for certain summations. It's been useful, but induction is useful in a lot of other settings. Let's consider a simple game, Matches. Matches is a typical two-player game, whose state consists of two piles of matches. The game starts with an equal number of matches in each pile. Play proceeds by alternating moves. At each move, the player removes a non-zero number of matches from one of the piles. The first player who can't move (i.e., who finds no matches left to take) loses.
Let's call the instance of Matches that begins with two piles of size $n$ the “$n$-instance of Matches.” The question we want to consider is, “Which player has a winning strategy for the $n$-instance of Matches?” Oddly enough, the answer doesn't depend on $n$.
Theorem 7.7 For all $n$, the second player has a winning strategy for the $n$-instance of Matches.
Proof The proof is by strong induction on $n$. Recall that in a proof by strong induction of a formula $\forall n,\,\varphi(n)$, we have to prove $\varphi(n)$ from the assumption $\forall k \lt n,\,\varphi(k)$.
To that end, suppose $n$ is given, and we know that for all $k \lt n$, the second player has a winning strategy for the $k$-instance of Matches. Although this is a proof by strong induction, we'll consider two different cases for $n$:
Case 1: $n = 0$. The $0$-instance (the trivial instance) of Matches begins the game in a losing configuration for the first player: he has no moves. The second player just wins!
Case 2: $n > 0$. The first player picks one pile, and removes some matches from it, resulting in two piles: one with $n$ matches, and one with $k\lt n$ matches. The second player moves to re-establish the symmetry, taking $n-k \gt 0$ matches from the pile with $n$ matches, resulting in two piles of $k$ matches each, with the first player to move. But this is just the configuration at the beginning of the $k$-instance of Matches, and we know by the induction hypothesis that the second player has a winning strategy for that game.
$\Box$ Theorem 7.7
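The mirroring strategy from Case 2 is concrete enough to simulate. A sketch (the function name `play_matches` is ours) pitting the mirroring second player against a randomly-playing first player:

```python
# Simulate the second player's mirroring strategy from the proof of
# Theorem 7.7 against a first player who moves at random.
import random

def play_matches(n, seed=0):
    """Return the winner (1 or 2) of the n-instance of Matches when
    player 2 mirrors and player 1 moves at random."""
    rng = random.Random(seed)
    piles = [n, n]  # always symmetric when player 1 is to move
    while True:
        # Player 1 moves: take a nonzero number from some nonempty pile.
        nonempty = [i for i in (0, 1) if piles[i] > 0]
        if not nonempty:
            return 2  # player 1 can't move, and loses
        i = rng.choice(nonempty)
        piles[i] -= rng.randint(1, piles[i])
        # Player 2 mirrors, re-establishing symmetry.
        j = 1 - i
        piles[j] -= piles[j] - piles[i]  # legal: player 1 broke symmetry

for n in range(0, 10):
    for seed in range(5):
        assert play_matches(n, seed) == 2
print("second player wins every sampled game")
```

Of course, a simulation is no substitute for the proof; it merely illustrates that the strategy in Case 2 is fully constructive.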
Exercise 7.8 The game of Matches is a simplified version of the game of Nim, which can involve multiple (i.e., more than 2) distinct piles of matches. There's a nice Wikipedia: Nim article, which includes a discussion of the winning strategy in this more general case. Read the article, understand the strategy, and convince yourself that the proof above simply adapts the general strategy to the two-pile special case.