Lecture 4: Modular Arithmetic

\( \newcommand{\set}[1]{\{#1\}} \newcommand{\comprehension}[2]{\{#1\,\vert\,#2\}} \newcommand{\size}[1]{\vert#1\vert} \newcommand{\true}{\top} \newcommand{\false}{\bot} \newcommand{\limplies}{\rightarrow} \newcommand{\divides}{\mathbin{\vert}} \newcommand{\mult}{\mathbin{\cdot}} \newcommand{\xor}{\oplus} \newcommand{\union}{\cup} \newcommand{\intersect}{\cap} \newcommand{\complement}[1]{\overline#1} \newcommand{\powerset}{\mathcal{P}} \newcommand{\ixUnion}{\bigcup} \newcommand{\ixIntersect}{\bigcap} \newcommand{\Div}{\mathrm{Div}} \newcommand{\gcd}{\mathrm{gcd}} \newcommand{\divmod}{\mathop{\mathbf{divmod}}} \newcommand{\div}{\mathop{\mathbf{div}}} \newcommand{\mod}{\mathop{\mathbf{mod}}} \)

As last time, our default domain of discourse today is $\mathbb{Z}$. Remainders are often a fruitful path in addressing number-theoretic problems. To that end,

Definition 4.1 If $a,b \in \mathbb{Z}$ and $m>0$, then $a$ is congruent to $b$ mod $m$ (in notation $a \equiv_m b$), if and only if $m \divides a-b$, i.e., if and only if $a \mod m = b \mod m$.

Theorem 4.2 For $m>0$, the binary relation $\equiv_m$ is an equivalence relation.

Proof Recall that a binary relation is an equivalence relation if it is reflexive, symmetric, and transitive.

For reflexivity, $a \equiv_m a$ as $m \divides a-a$ for all $a$. For symmetry, if $a \equiv_m b$, then $m \divides a - b$, whence $m \divides b - a$, and so $b \equiv_m a$. Finally, for transitivity, if $a \equiv_m b$ and $b \equiv_m c$, then $m \divides a-b$ and $m \divides b-c$, so $m \divides (a-b) + (b-c)$, i.e., $m \divides a-c$.

$\Box$ Theorem 4.2

Exercise 4.3 Although Theorem 4.2 specifically does not consider this case $\equiv_0$ is also an equivalence relation. What equivalence relation is it?

Of course, once we've established that a particular binary relation is an equivalence relation, the next thing we'll do is consider its equivalence classes, in this case

Definition 4.4 For all $m>0$, and $a \in \mathbb{Z}$, define $[a]_m = \comprehension{b}{a \equiv_m b}$.

An extraordinarily nice fact is that the addition and multiplication structure of the integers naturally “lift” to these equivalence classes:

Theorem 4.5 Let $m>0$. We can define operations $+$ and $\mult$ on the equivalence classes of $m$ as follows:

Proof We have to show that the functions we've claimed to define are, in fact, well-defined. To that end, suppose $a_1 \equiv_m a_2$ and $b_1 \equiv_m b_2$. We have to show $a_1 + b_1 \equiv_m a_2 + b_2$, and $a_1 \mult b_1 \equiv_m a_2 \mult b_2$. We will do the $+$ case in these notes, and leave the $\mult$ case as an exercise.

By assumption $m \divides a_1-a_2$, and $m \divides b_1-b_2$. Therefore $m \divides (a_1-a_2) + (b_1-b_2)$, which after reorganizing the right-hand side gives $m \divides (a_1+b_1) - (a_2+b_2)$, i.e., $a_1+b_1 \equiv_m a_2+b_2$, as required.

*Exercise 4.6 Prove the $\mult$ case.

$\Box$ Theorem 4.5

Because we've defined these functions in terms of the corresponding functions on representatives of congruence classes, certain properties of the $+$ and $\mult$ operations on the natural numbers carry over directly to the corresponding operations on congruence classes, e.g.,

Theorem 4.7 For all $a$, $b$, $c$, and for all $m>0$, $[a]_m + ([b]_m + [c]_m) = ([a]_m + [b]_m) + [c]_m$, i.e., modular addition is associative.

Proof

\begin{align*} [a]_m + ([b]_m + [c]_m) &= [a]_n + [b+c]_m\\ &= [a + (b+c)]_m \\ &= [(a+b)+c]_m \\ &= [a+b]_m + [c]_m\\ & = ([a]_m + [b]_m) + [c]_m \\ \end{align*}

$\Box$ Theorem 4.7

Exercise 4.8 Let $m>0$, and consider $+$ on the equivalence classes mod $m$. Prove

Before pressing further on this, I'd like to step back for just a moment. There is what in math-speak is known as a “canonical construction” going on here, in particular the “quotient construction.” We start with a structure $\mathfrak{A}$, and then describe an equivalence relation $\equiv$ on $U^{\mathfrak{A}}$. We then observe how certain functions defined on $A$ are well-defined on equivalence classes, and then produce a new structure $\mathfrak{A}/\equiv$, whose universe is consists of the equivalence classes of $\equiv$, i.e., $U^{\mathfrak{A}/\equiv} = \comprehension{[a]}{a \in A}$, together with these equivalence-class respecting functions. Note that the “quotient” of the quotient construction doesn't refer to the operation of division on the integers or natural numbers, but rather the partitioning of $U^{\mathfrak{A}}$ into equivalence classes, and then abstractly viewing those classes as being the “points” of an induced structre.

Definition 4.9 The integers mod $m$ is the structure $\mathbb{Z}_m = \mathbb{Z}/\equiv_m$, together with the binary operations of $+$ and $\mult$ on that set as defined above. (Technically, we should be writing $\mathfrak{Z}_m = \mathfrak{Z}/\equiv_m$, but no one does.)

Sometimes functions or operations are lost (as well as sometimes gained) in passing from a structure to one of its quotients. For example, there's no sensible ordering $\lt$ that can be imposed on $\mathbb{Z}_m$, and so $\lt$ is not a part of the language of modular arithmetic. We'll learn about what we've gained soon.

A recurring issue with quotient structures revolves around the naming of equivalence classes, which are tedious things to have to drag around explicitly, however convenient they may be in proofs like that of Theorem 4.7. In one particularly common sort of quotient construction, the equivalence relation on $U^{\mathfrak{A}}$ comes from a function $f:U^{\mathfrak{A}} \to B$ for some set $B$, where $a_1 \equiv a_2 \iff f(a_1) = f(a_2)$. In this case, we can name the elements of the quotient structure by the element of $B$ they map to. Another approach is to have a function $f:U^{\mathfrak{A}} \to U^{\mathfrak{A}}$ such that $f$ both respects and is well-defined on equivalence classes, i.e., $\forall a,\, f(a) \in [a]$, and $\forall a b, a \equiv b \limplies f(a) = f(b)$. Such a function $f$ can be said to “select a canonical representative” of each equivalence class, and we can use $f(a)$ as a surrogate name for $[a]$. In the case of the quotient structure on $\mathbb{Z}$ given by equivalence classes based on congruence mod $m$, a natural choice of canonical representative is the least non-negative element $r$ of the equivalence class, i.e., $a \mod m$. This is easy and natural, but comes with the risk of ambiguity: does $r$ refer to $r \in \mathbb{Z}$, or does it refer to $[r]_m \in \mathbb{Z}_m = \mathbb{Z}/\equiv_m$? And if the latter, for which $m$? To avoid confusion, we'll remind ourselves by annotating equations in $\mathbb{Z}_m$ with their modulus, e.g., $2+2=1 \pmod 3$.

This is all a quite different approach than what Rosen takes, and I dare say from what you probably learned in elementary school, but I believe you guys will benefit from the extra layer of mathematical sophistication here, as you'll encounter more quotient structures in the future. One interesting difference in our approaches is that Rosen introduces new notation $+_m$ and $\mult_m$ for addition and multiplication in $\mathbb{Z}$, but we account for this somewhat differently, by defining the interpretation of the $+$ and $\mult$ symbols in the structure $\mathbb{Z}_m$ to be the functions Rosen refers to as $+_m$ and $\mult_m$ respectively.

Exercise 4.10 Read Rosen's presentation of this material in Chapter 4.1.

Definition 4.11 Two integers $a$ and $b$ are relatively prime (or co-prime) if $\gcd(a,b) = 1$. Note that if $p$ is a prime, and $p$ does not divide $a$, then $a$ and $p$ are relatively prime.

Theorem 4.12 If $m>0$, and $a$ are relatively prime, then $a$ has a multiplicative inverse mod $m$, i.e., $\exists b, b \mult a = 1 \pmod m$.

Proof Let $m$ and $a$ be as hypothesized. By Theorem 3.16, there exist integers $n$ and $b$ such that $n \mult m + b \mult a = 1$. Computing residue classes mod m,

\begin{align*} [1]_m &= [n \mult m + b \mult a]_m\\ &= [n]_m \mult [m]_m + [b]_m \mult [a]_m\\ &= [n]_m \mult [0]_m + [b]_m \mult [a]_m\\ &= [b]_m \mult [a]_m \end{align*}

as required.

$\Box$ Theorem 4.12.

Note that not only do multiplicative inverses exists, but that the extended version of Euclid's algorithm from last time allows us to compute them. Naturally, these multiplicative inverses are unique:

Theorem 4.13 If $p$ is prime, and $a \not= 0 \pmod{p}$, then the multiplicative inverse of $a$ is unique.

Proof There is a “usual” proof of such things, greatly simplified in the case by the fact that multiplication in $\mathbb{Z}_m$ is commutative and associative. Suppose that $b$ and $c$ are both multiplicative inverses of $a$. Then

\begin{align*} b &= b \mult 1 &\pmod{p} \\ &= b \mult (c \mult a) &\pmod{p} \\ &= (b \mult c) \mult a &\pmod{p} \\ &= (c \mult b) \mult a &\pmod{p} \\ &= c \mult (b \mult a) &\pmod{p} \\ &= c \mult 1 &\pmod{p} \\ &= c &\pmod{p} \end{align*}

$\Box$ Theorem 4.13

Corollary 4.14 If $p$ is prime, and $a \not= 0 \pmod{p}$, then $a$ has a multiplicative inverse mod $p$.

Definition 4.15 We will use $a^{-1}$ to indicate the multiplicative inverse of $a$, when it can be show to exist.

*Exercise 4.16 Solve the following system of linear equations:

\begin{align*} 3 \mult x + 5 \mult y &= 14&\pmod{17}\\ 7 \mult x + 3 \mult y &= 6 &\pmod{17} \end{align*}

Note that Exercise 4.16 rests on an interesting observation: for primes $p$, $\mathbb{Z}$ is a lot like $\mathbb{R}$, in that both are fields, and so similar computational approaches work. But for non-prime moduli, like $18$, it is quite different. Consider the equation $2 \mult x = 8 \pmod{18}$. This has two solutions, $x = 4 \pmod{18}$ and $x=13\pmod{18}$. The problem here is that $\gcd(2,18) = 2 \not= 1$, and so $2$ doesn't have a multiplicative inverse.

Corollary 4.17 If $p$ is prime, and $a$ and $b$ are integers such that $p \divides a \mult b$, then $p \divides a$ or $p \divides b$.

Proof. Let $p$, $a$, and $b$ be as hypothesized. Without loss of generality, $p$ does not divide $a$. Therefore $\gcd(a,p) = 1$, and so there exist integers $x$ and $y$ such that $x \mult a + y \mult p = 1$. Multiply both sides of this equation on the right by $b$, to get $b = x \mult a \mult b + y \mult p \mult b$. Note that $p$ divides the right hand side of this equality, and therefore must divide the left, as required.

$\Box$ Corollary 4.17

Exercise 4.18 Show the following:

Theorem 4.19 (The Chinese Remainder Theorem) Suppose $m_1, \ldots , m_k$ are $k$ pair-wise relatively prime moduli, and that $c_1,\ldots,c_k$ are integers. Then there exists a solution to the system of equations $x = c_i \pmod{m_i}$, unique up to multiples of $n = m_1 \mult m_2 \mult \ldots \mult m_k$.

Proof. First we'll do existence.

For each $i \in \set{1,\ldots,k}$, we define $n_i = n / m_i$, i.e., $n_i$ is the product of the $m_j$'s for $j \not= i$. Note that $m_i$ must be relatively prime to $n_i$, via the first part of Exercise 4.18 and a simple inductive argument. We first show that we can solve the special case of Theorem 4.19 where $c_i = 1$, and $c_j = 0$ for $j \not= i$. To that end, let $a = n_i^{-1} \pmod{m_i}$, which must exist because $n_i$ and $m_i$ are relatively prime. Consider $x_i = a \cdot n_i$. We can easily see that $x$ is a solution to the special case, because $x_i = a \cdot n_i = n_i^{-1} \cdot n_i = 1 \pmod{m_i}$, and $m_j | n_i$ for $i \not= j$, so $m_j | a \cdot n_i = x_i$, i.e., $x_i = 0 \mod{m_j}$ as required.

We can now solve the general case by taking $x = c_1 \cdot x_1 + \cdots c_k \cdot x_k$, and verify the solution by calculating mod $m_i$:

\begin{align*} x &= c_1 \cdot x_1 + \cdots c_k \cdot x_k &\pmod{m_i} \\ &= c_1 \cdot 0 + \cdots + c_{i-1} \cdot 0 + c_i \cdot 1 + c_{i+1} \cdot 0 + \cdots + c_k \cdot 0 &\pmod{m_i} \\ &= c_i &\pmod{m_i} \end{align*}

as required.

For uniqueness, suppose we have two solutions $x_1$ and $x_2$. We easily see that

\begin{align*} x_1 - x_2 &= 0 \pmod{m_i} \end{align*}

for every $i$.

It immediately follows that $m_i$ must divide $x_1 - x_2$, for all $i$, and because the $m_i$ are pairwise relatively prime, their product $n = m_1 \mult m_2 \mult \cdots \mult m_k$ must as well, hence $x_1 = x_2 \pmod n$.

Exercise 4.20 There's a gap at this point, which shows that you should be thinking critically while you're listening to me. The claim at the end of the proof above relies on the following: if $m_1$ and $m_2$ are relatively prime, and $m_1 \divides a$, and $m_2 \divides a$, then $m_1 \mult m_2 \divides a$, which seems plausible enough, but we haven't proven. Prove it. [Hint: Contemplate the proof of Corollary 4.17.]

$\Box$ Theorem 4.19

The Chinese Remainder Theorem is called this because it was first proven in the 3rd century(!) by the Chinese mathematician Sunzi.

Example 4.21 Let's wrap up by solving an example problem from Rosen.

We're given to solve

\begin{align*} x &= 2 \pmod{3}\\ x &= 3 \pmod{5}\\ x &= 2 \pmod{7}\\ \end{align*}

We first compute $n_1 = 5 \mult 7 = 35$, $n_2 = 3 \mult 7 = 21$, and $n_3 = 3 \mult 5 = 15$.

To compute $x_1$, we start by finding $35^{-1} \pmod 3$, reasoning as follows:

\begin{align*} 35^{-1} & = 2^{-1} &\pmod{3}\\ & = 2 &\pmod{3} \end{align*}

Where the last step follows from the observation that $2 \mult 2 = 4 = 1 \pmod{3}$. [In principle, we could use the extended Euclid's algorithm, but it's hardly necessary here.]

Thus, $x_1 = 2 \mult n_1 = 2 \mult 35 = 70$.

Reasoning similarly, to compute $x_2$, we have to find $n_2^{-1} \pmod{m_2}$, i.e., $21^{-1} = 1^{-1} = 1 \pmod 5$, whence $x_2 = 1 \mult 21 = 21$; and to compute $x_3$, we have to compute $15^{-1} = 1^{-1} = 1 \pmod 7$, so $x_3 = 1 \mult 15 = 15$.

We can now assemble our final answer: $x = c_1 \mult x_1 + c_2 \mult x_2 + c_3 \mult x_3$. Reasoning $\pmod{105}$, we have \begin{align*} x &= 2 \mult 70 + 3 \mult 21 + 2 \mult 15 &\pmod{105} \\ &= 140 + 63 + 30 &\pmod{105} \\ &= 233 &\pmod{105} \\ &= 23 &\pmod{105} \end{align*}

Which we can easily verify.

$\Box$ Example 4.21

*Exercise 4.22 Use the algorithmic approach above to solve

\begin{align*} x &= 1 \pmod{2}\\ x &= 2 \pmod{3}\\ x &= 3 \pmod{5}\\ \end{align*}

There's a nice algorithmic consequence to the Chinese Remainder Theorem. Let's suppose we want to perform an arithmetic calculation based on the usual operations of addition, subtraction, and multiplication over very large integers. A problem here is that the cost of so-called “infinite precision integer arithmetic“ scales (non-linearly!) with the amount of memory necessary to represent these integers. An alternative approach would be to have lots of little computers, each tasked with performing (in parallel, no less) these calculations modulo different primes. We could then use our one big computer to re-assemble the solution to the original calculation from the solutions to all of these modular calculations.