Monday Math: The Sum of the Reciprocals of the Primes

Time for the big finale! We now have all the pieces in place to establish the divergence of the sum of the reciprocals of the primes.

Recall that we have the Euler product expansion of the harmonic series:


\[
\sum_{n=1}^\infty \frac{1}{n}= \prod_p \left( \frac{1}{1-\frac{1}{p}} \right)
\]

 

We noted that this formula was really just a consequence of the fundamental theorem of arithmetic.
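One way to see this: each factor on the right is the sum of a geometric series, and when the factors are multiplied out, the reciprocal of every positive integer appears exactly once, since every integer factors into primes in exactly one way:


\[
\prod_p \left( 1+\frac{1}{p}+\frac{1}{p^2}+\frac{1}{p^3}+\dots \right)=1+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}+\frac{1}{5}+\frac{1}{6}+\dots
\]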

This is definite progress, since we now have a product indexed over the primes. To convert that to a sum over the primes we simply take the natural logarithm of both sides:


\[
\ln \sum_{n=1}^\infty \frac{1}{n}= \ln \prod_p \left(1-\frac{1}{p} \right)^{-1}= \sum_p -\ln \left( 1-\frac{1}{p} \right)
\]

 

We have made use of two basic properties of logarithms. The first is that they turn products into sums. The second is that exponents magically float down from their lofty perch, and become constants in front of our expression.
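In symbols, the two properties are:


\[
\ln \prod_p a_p= \sum_p \ln a_p \qquad \text{and} \qquad \ln \left( a^{-1} \right)=-\ln a
\]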

Here, as always, I must apologize for being so cavalier in my manipulations of infinite series. To do this right I should be talking about limits and convergence and error terms. However, I really don't think this brief discussion will suffer from the lack of rigor.

The next big idea involves the Taylor series for the natural logarithm, which we derived last week. The form of the series most useful for our present purposes is


\[
-\ln(1-x)= \sum_{m=1}^\infty \frac{x^m}{m}=x+\frac{x^2}{2}+\frac{x^3}{3}+\frac{x^4}{4}+\dots
\]

 

This formula is only valid if x is between minus one and one, but that is acceptable, since in our case we have


\[
x=\frac{1}{p}
\]
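

Since p is at least two, this is at most one-half, comfortably inside the interval of validity. Setting x equal to 1/p expands the logarithm of each factor like so:


\[
-\ln \left( 1-\frac{1}{p} \right)=\frac{1}{p}+\frac{1}{2p^2}+\frac{1}{3p^3}+\frac{1}{4p^4}+\dots
\]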

 

Summing this expansion over all of the primes gives us:


\[
\ln \sum_{n=1}^\infty \frac{1}{n}= \sum_p \sum_{m=1}^\infty \frac{1}{mp^{m}}
\]

 

Double sums always look a bit formidable, but they are not so bad if you think about them in an organized way. The important thing to realize is that when m=1, we are left with precisely the sum of the reciprocals of the primes. If we separate out that term we have


\[
\ln \sum_{n=1}^\infty \frac{1}{n}=\sum_p \frac{1}{p} + \sum_p \sum_{m=2}^\infty \frac{1}{mp^m}
\]

 

Time to take stock. On the left we have the harmonic series. Since we know it diverges, its logarithm diverges as well. On the right we have the sum of the reciprocals of the primes, which is precisely the series we are trying to show diverges.

Then we have the remaining double sum. If we can show that double sum converges then we are done: if the left-hand side diverges, and the right-hand side is our series plus something known to converge, then our series itself must diverge.

Well, I am happy to report that it is not terribly difficult to show that the double sum converges. The techniques involved are elementary, and the ever-useful Wikipedia provides more of the details, if you are interested.
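In fact, a crude bound does the job. Discard the factors of 1/m, sum the resulting geometric series, and compare with a telescoping sum over all of the integers:


\[
\sum_p \sum_{m=2}^\infty \frac{1}{mp^m} \le \sum_p \sum_{m=2}^\infty \frac{1}{p^m}= \sum_p \frac{1}{p(p-1)} \le \sum_{n=2}^\infty \frac{1}{n(n-1)}=1
\]


So the double sum is not merely convergent; it is at most one.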

So we've done it! I still find this a remarkable result, since it really seemed hopeless at the start. I mean, really, how can we make any statement about a sum indexed over the primes?

 

It gets better. You see, nowhere in this proof did we assume there were infinitely many prime numbers. Nonetheless, we were able to establish that the sum of the reciprocals of the primes diverges. If the number of primes were finite then, plainly, the sum of their reciprocals would converge. That means the argument we have given here can be viewed as a proof of the infinitude of the primes. (Of course, it is a far more complex proof than the one we provided a while back.)

In that previous post we mentioned Dirichlet's theorem on primes in arithmetic progressions. The theorem asserts that if ak+b represents the arithmetic progression with first term b and common difference a, and if we assume a and b are relatively prime, then the progression contains infinitely many primes. For example, the progression 4k+1 looks like this:


\[
1, \ 5, \ 9, \ 13, \ 17, \ 21, \ 25, \ 29, \ 33, \ 37, \dots
\]

 

You can see that some of the terms are prime and some are not. Dirichlet's theorem assures us that the sequence contains infinitely many primes. Incredibly, he proved this by generalizing the approach we used here. That is, he showed there are infinitely many primes in this (or effectively any other) progression by showing that the sum of the reciprocals of these primes diverges.
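
In the case of our example, what he proved amounts to the statement


\[
\sum_{p \equiv 1 \ (\mathrm{mod} \ 4)} \frac{1}{p}=\infty
\]


where the sum extends over the primes appearing in the progression.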

It was a bravura performance, not least because it required techniques from complex analysis. Dirichlet thereby inaugurated the study of analytic number theory. It is great stuff, but that is definitely a different post. Come to think of it, it is a different series of posts!


I remember learning early in my first analysis course that if an infinite sequence diverges then any infinite subsequence also diverges. Since the sequence of reciprocals of primes is just an infinite subsequence of the reciprocals of natural numbers (both converge to zero), would it be possible to derive the result for the corresponding infinite sums from that fact?

I haven't read all the posts in this series, so apologies if you've covered this obvious-seeming point already.

I think you have this backward. It is true that if a sequence converges then all of its infinite subsequences converge to the same limit. But it is possible for a divergent sequence to have a convergent subsequence. For example, the sequence 1, 2, 1, 2, 1, 2, ... does not converge to anything, but the subsequence consisting of all the ones does converge.

When we talk about the convergence of a series, we are really talking about the convergence of the sequence of partial sums. The sequence of partial sums of the harmonic series just gets bigger and bigger without bound, and so it diverges. But once we start throwing out terms, it is possible that the result will be a convergent series. For example, if the only fractions we keep are the ones with perfect squares on the bottom, then the series converges.

What is surprising about the result with the primes is that we have thrown out so many terms that you might think the series now converges, but in reality we still have enough left to make it diverge.
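
If you want to see just how reluctantly that divergence happens, here is a minimal Python sketch (the helper name and the cutoff are my own illustrative choices) comparing the partial sums of the three series mentioned above. The harmonic sum grows like ln N, the prime sum like ln ln N, and the squares sum settles down near pi^2/6:

```python
# Compare partial sums of 1/n, 1/p (p prime), and 1/n^2 for n, p <= N.

def primes_up_to(n):
    """Return all primes up to n via the sieve of Eratosthenes."""
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
    return [i for i in range(2, n + 1) if is_prime[i]]

N = 10 ** 6
harmonic = sum(1.0 / n for n in range(1, N + 1))
prime_sum = sum(1.0 / p for p in primes_up_to(N))
square_sum = sum(1.0 / n ** 2 for n in range(1, N + 1))

print(f"sum of 1/n   for n <= {N}: {harmonic:.4f}")   # grows like ln N
print(f"sum of 1/p   for p <= {N}: {prime_sum:.4f}")  # grows like ln ln N
print(f"sum of 1/n^2 for n <= {N}: {square_sum:.4f}") # approaches pi^2/6
```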

You can avoid having to deal with the Taylor series for log(1-x), checking the convergence of that doubly infinite series, etc., by putting a simple bound on how much log(1-x) differs from x.

For instance, if -log(1-1/p) <= 2/p then we're good. That's true if log(1-x) >= -2x when 0 <= x <= 1/2, which is true if 1-x >= exp(-2x) when 0 <= x <= 1/2, i.e. if exp(-y) <= 1-y/2 when 0 <= y <= 1. That's not quite true, unfortunately, but it is true (and almost instant to prove by using exp' = exp) when 0 <= y <= log 2, which is true for p >= 3.
