The Basel problem was posed by Pietro Mengoli in 1650 and is named after Basel, Euler’s hometown; Euler solved it in 1734. The problem asks for “a precise summation of the reciprocals of the squares of the natural numbers”. It is well known in the mathematical community that \[ \sum_{k=1}^\infty \frac{1}{k^2} = \frac{\pi^2}{6}. \] But how did we come to this conclusion? What does a rigorous proof look like? Why does \(\pi\) appear in this sum? This brief survey presents several ways to solve the Basel problem and attempts to answer these questions.
To begin this survey, let us look at how Euler first solved the Basel problem in 1734.
Euler’s approach was to write \(\sin(x)\) in two ways: as its Taylor series, and as an infinite product built from its roots. After some manipulation, the coefficient of \(x^2\) in the product turns out to contain the Basel sum, so comparing it with the corresponding Taylor coefficient gives the answer.
First recall that \[\sin(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots\] so that \[\begin{equation} \frac{\sin(x)}{x} = 1 - \frac{x^2}{3!} + \frac{x^4}{5!} - \cdots \tag{1.1} \end{equation}\]
The next step in the proof is to write \(\frac{\sin(x)}{x}\) as a product over its roots. Recall that a polynomial can be written in terms of its roots; for example, \(x^2 + 2x+1 = (x+1)(x+1)\). Euler treated \(\sin(x)\), whose roots are \(0, \pm\pi, \pm 2\pi, \ldots\), as though it were an infinite polynomial: \[\sin(x) = (x-0)(x-\pi)(x+\pi)(x-2\pi)(x+2\pi)\cdots\] This should be read as a statement about where \(\sin(x)\) vanishes rather than as a literal identity. Note that at the time it had not been established that a function could be written as an infinite product over its roots. In fact it is not always possible to do so. More on when such a product exists can be found in the Weierstrass factorization theorem, which was proved in the 1870s. Euler anticipated Weierstrass’s result by over 125 years!
Next we rescale each factor. The factor \((x - n\pi)\) vanishes exactly where \(\left(1-\frac{x}{n\pi}\right)\) does, so we may replace one by the other without changing the roots: \[\sin(x) = x\left(1-\frac{x}{\pi}\right)\left(1+\frac{x}{\pi}\right)\left(1-\frac{x}{2\pi}\right)\left(1+\frac{x}{2\pi}\right)\cdots\] The overall constant is the right one because \(\frac{\sin(x)}{x} \to 1\) as \(x \to 0\), which matches the value of the product at \(x = 0\). This shows that \[\frac{\sin(x)}{x} = \left(1-\frac{x}{\pi}\right) \left(1+\frac{x}{\pi}\right) \left(1-\frac{x}{2\pi}\right)\left(1+\frac{x}{2\pi}\right)\cdots\] Pairing the factors that belong to the roots \(\pm n\pi\), we obtain \[\frac{\sin(x)}{x} = \left(1-\frac{x^2}{\pi^2}\right) \left(1-\frac{x^2}{4\pi^2}\right) \left(1-\frac{x^2}{9\pi^2}\right)\cdots\] Multiplying out the first two factors gives \[\frac{\sin(x)}{x} = \left(1 - \frac{x^2}{\pi^2} - \frac{x^2}{4\pi^2} + \frac{x^4}{4\pi^4}\right) \left(1-\frac{x^2}{9\pi^2}\right)\cdots\] Continuing in this way, each factor contributes its own \(x^2\) term exactly once, so the coefficient of \(x^2\) in the full product is \[-\left(\frac{1}{\pi^2}+\frac{1}{4\pi^2}+ \frac{1}{9\pi^2}+\cdots\right)= -\frac{1}{\pi^2} \sum_{n=1}^{\infty} \frac{1}{n^2}. \]
Recall the \(x^2\) coefficient from (1.1). Since the coefficients of a power series are unique, the \(x^2\) coefficient found above must equal the \(x^2\) coefficient in (1.1). Thus \[-\frac{1}{3!} = -\frac{1}{\pi^2} \sum_{n=1}^{\infty} \frac{1}{n^2}\] and \[\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}\] as many of us know. Solving this problem brought Euler immediate fame in the field. However, he did have his doubters. In particular, there was debate about whether \(\sin(x)\) could be written as a product over its roots. Later in his career Euler would revisit this problem.
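As a sanity check on the product expansion used above (this check is a standard consequence of the product formula and is not part of Euler’s argument as presented here), set \(x = \frac{\pi}{2}\) in \(\frac{\sin(x)}{x} = \prod_{n \geq 1}\left(1 - \frac{x^2}{n^2\pi^2}\right)\). The left side is \(\frac{2}{\pi}\), and the right side becomes \[ \frac{2}{\pi} = \prod_{n=1}^\infty \left(1 - \frac{1}{4n^2}\right) = \prod_{n=1}^\infty \frac{(2n-1)(2n+1)}{(2n)^2} = \frac{1\cdot 3}{2\cdot 2}\cdot\frac{3\cdot 5}{4\cdot 4}\cdot\frac{5\cdot 7}{6\cdot 6}\cdots \] This is Wallis’s product for \(\pi\), which predates Euler, so the product expansion at least reproduces a result that was already known.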
As noted by Euler, the Bernoulli numbers, \(B_n\), can be defined as the coefficients in the series expansion of the function \(\frac{x}{e^x-1}\). \[\frac{x}{e^x-1} = \sum_{n=0}^\infty B_n \frac{x^n}{n!}, 0<|x|<2\pi\]
To give some intuition for these Bernoulli numbers, the first few are listed below.
\[\begin{equation} \{B_n\} = \left\{1,-\frac{1}{2}, \frac{1}{6}, 0, -\frac{1}{30}, 0, \frac{1}{42}, 0,-\frac{1}{30}, 0, \frac{5}{66},0,\ldots \right\} \tag{2.1} \end{equation}\]
Consider
\[\begin{equation} \begin{aligned} h(x) &= \frac{x}{e^x-1} + \frac{x}{2} \\ &= \sum_{n=0}^\infty B_{2n} \frac{x^{2n}}{(2n)!} \end{aligned} \tag{2.2} \end{equation}\]
Here we use the fact that \(B_n = 0\) for odd \(n \geq 3\), and that the added \(\frac{x}{2}\) cancels the term \(B_1 x = -\frac{x}{2}\). We can also write \(h(x)\) another way, namely
\[\begin{equation} \begin{aligned} h(x) &= \frac{x}{e^x-1} + \frac{x}{2} \\ &= \frac{x}{2}\cdot\frac{e^\frac{x}{2} + e^{-\frac{x}{2}}}{ e^\frac{x}{2} - e^{-\frac{x}{2}}} \end{aligned} \tag{2.3} \end{equation}\]
This follows from combining the two fractions and then multiplying by “\(1\)” in the form \(e^{-x/2}/e^{-x/2}\). From complex analysis we also know that
\[\begin{equation} \cot(x) = i \frac{e^{ix} + e^{-ix}}{e^{ix}-e^{-ix}} \tag{2.4} \end{equation}\]
Comparing (2.3) and (2.4), we obtain the equality \(x\cot(x) = h(2ix)\). Thus we have made a connection between \(\cot(x)\) and \(B_n\)!
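The equality \(x\cot(x) = h(2ix)\) can be checked directly by substituting \(2ix\) for \(x\) in (2.3) and then using (2.4): \[ \begin{aligned} h(2ix) &= \frac{2ix}{2}\cdot\frac{e^{ix} + e^{-ix}}{e^{ix} - e^{-ix}} \\ &= x \cdot i\,\frac{e^{ix} + e^{-ix}}{e^{ix} - e^{-ix}} \\ &= x\cot(x). \end{aligned} \]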
Using this equality and the series (2.2), and noting that \((2ix)^{2n} = (-1)^n 2^{2n} x^{2n}\), we find that
\[ x\cot(x) = \sum_{n=0}^\infty (-1)^n \frac{2^{2n}B_{2n}}{(2n)!}x^{2n} \]
Dividing by \(x\) and separating the \(n=0\) term, we see that
\[\begin{equation} \cot(x) = \frac{1}{x} + \sum_{n=1}^\infty (-1)^n \frac{2^{2n}B_{2n}}{(2n)!}x^{2n-1} \tag{2.5} \end{equation}\]
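As a check on the signs in the cotangent series, note that substituting \(2ix\) into (2.2) gives each term a factor \((2ix)^{2n} = (-1)^n 2^{2n}x^{2n}\), so the coefficient of \(x^{2n-1}\) in \(\cot(x)\) is \((-1)^n \frac{2^{2n}B_{2n}}{(2n)!}\). With the values \(B_2 = \frac{1}{6}\), \(B_4 = -\frac{1}{30}\), \(B_6 = \frac{1}{42}\) from (2.1), the first few coefficients work out as follows. \[ \begin{aligned} n=1: &\quad -\frac{2^2}{2!}\cdot\frac{1}{6} = -\frac{1}{3}, \\ n=2: &\quad +\frac{2^4}{4!}\cdot\left(-\frac{1}{30}\right) = -\frac{1}{45}, \\ n=3: &\quad -\frac{2^6}{6!}\cdot\frac{1}{42} = -\frac{2}{945}, \end{aligned} \] so that \[ \cot(x) = \frac{1}{x} - \frac{x}{3} - \frac{x^3}{45} - \frac{2x^5}{945} - \cdots \] which agrees with the standard expansion of the cotangent about \(x = 0\).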
There is another well-known expansion of the cotangent, its partial fraction expansion:
\[ \pi \cot(\pi x) = \frac{1}{x} + \sum_{n=1}^\infty \frac{2x}{x^2-n^2}, \quad 0<|x|<1 \]
We can factor \(-n^2\) out of each denominator and expand \(\frac{1}{1-\frac{x^2}{n^2}}\) as the geometric series \(\frac{1}{1-r} = \sum_{k=1}^\infty r^{k-1}\) with \(r = \frac{x^2}{n^2}\). For the \(x\) we are interested in, \(0<|x|<1\), we have \(|r|<1\) for every \(n \geq 1\). Doing this, we obtain
\[\begin{equation} \begin{aligned} \pi \cot(\pi x) &= \frac{1}{x} - \sum_{n=1}^\infty \frac{2x}{n^2 \left(1-\frac{x^2}{n^2}\right)} \\ &= \frac{1}{x} - \sum_{n=1}^\infty \frac{2x}{n^2} \left(\sum_{k=1}^\infty \left(\frac{x^2}{n^2}\right)^{k-1}\right) \\ &= \frac{1}{x} - \sum_{k=1}^\infty \left(\sum_{n=1}^\infty \frac{2}{n^{2k}}\right)x^{2k-1} \end{aligned} \tag{2.6} \end{equation}\]
Then from (2.5), replacing \(x\) by \(\pi x\) and multiplying by \(\pi\), we can see that
\[\begin{equation} \pi \cot(\pi x) = \frac{1}{x} + \sum_{n=1}^\infty (-1)^n \frac{2^{2n}\pi^{2n}B_{2n}}{(2n)!} x^{2n-1} \tag{2.7} \end{equation}\]
Finally, since a power series representation is unique, we can compare the coefficients of \(x^{2n-1}\) in (2.6) and (2.7) to obtain the equality
\[\begin{equation} \zeta(2n) = \sum_{k=1}^\infty \frac{1}{k^{2n}} = (-1)^{n+1} \frac{2^{2n-1}\pi^{2n}B_{2n}}{(2n)!} \tag{2.8} \end{equation}\]
Recall some of the Bernoulli numbers mentioned above at (2.1). From those and our new equality (2.8) we can see that
\[ \begin{align} \zeta(2) &= \sum_{k=1}^\infty \frac{1}{k^2} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + ... = \frac{\pi^2}{6} \\ \zeta(4) &= \sum_{k=1}^\infty \frac{1}{k^4} = 1 + \frac{1}{2^4} + \frac{1}{3^4} + ... = \frac{\pi^4}{90} \\ \zeta(6) &= \sum_{k=1}^\infty \frac{1}{k^6} = 1 + \frac{1}{2^6} + \frac{1}{3^6} + ... = \frac{\pi^6}{945} \\ \end{align} \]
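To see where these three values come from, here is the arithmetic written out, taking the closed form to be \(\zeta(2n) = (-1)^{n+1}\frac{2^{2n-1}\pi^{2n}B_{2n}}{(2n)!}\) (the factor \(\pi^{2n}\) is the one appearing in (2.7)) and using \(B_2 = \frac{1}{6}\), \(B_4 = -\frac{1}{30}\), \(B_6 = \frac{1}{42}\) from (2.1). \[ \begin{aligned} \zeta(2) &= (+1)\cdot\frac{2^{1}\pi^{2}}{2!}\cdot\frac{1}{6} = \frac{2\pi^2}{12} = \frac{\pi^2}{6}, \\ \zeta(4) &= (-1)\cdot\frac{2^{3}\pi^{4}}{4!}\cdot\left(-\frac{1}{30}\right) = \frac{8\pi^4}{720} = \frac{\pi^4}{90}, \\ \zeta(6) &= (+1)\cdot\frac{2^{5}\pi^{6}}{6!}\cdot\frac{1}{42} = \frac{32\pi^6}{30240} = \frac{\pi^6}{945}. \end{aligned} \]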
In this subsection we explore Lewin’s argument for the Basel problem. It is worth noting that while he is credited with this proof, there is still debate over whether Euler knew of it, because the proof relies heavily on identities that Euler discovered. For a more in-depth discussion of whether Euler knew about this proof, see [1].
First, let us consider the function \[g(z) = \sum_{k=1}^\infty \frac{z^k}{k^2}\] Then \[g'(z) = \sum_{k=1}^\infty \frac{z^{k-1}}{k} = 1+\frac{z}{2} + \frac{z^2}{3} + \cdots\]
and \[ \frac{d}{dz}\left(zg'(z)\right) = \frac{d}{dz}\left(\sum_{k=1}^\infty \frac{z^{k}}{k}\right) = 1+z+z^2+\cdots = \frac{1}{1-z}, \quad |z|<1\]
So \[zg'(z) = \int_0^z \frac{dw}{1-w} = -\ln(1-z)\]
Finally we can see that
\[g'(z) = \left\{ \begin{array}{rcl} 1 & \mbox{for} & z=0 \\ \frac{-\ln(1-z)}{z} & \mbox{for} & z \neq 0 \end{array}\right.\]
Note the relationship between \(g(z)\) and the polylogarithm function, denoted \(Li_n(z)\) and defined by \[Li_n(z) = \sum_{k=1}^\infty \frac{z^k}{k^n}\]
In particular we are interested in the case \(n=2\), that is \(Li_2(z)\), which is often called the dilogarithm function. Note that \(Li_2(z) = g(z)\). From now on we will write \(Li_2\) in place of \(g\). We will use a property of the dilogarithm found by Euler: \[\begin{equation} Li_2\left(-\frac{1}{z}\right) + Li_2(-z) + \frac{1}{2}(\ln(z))^2 = C \tag{3.1} \end{equation}\]
where \(C\) is a constant. This identity will only be useful if we know what \(C\) is, so let us set \(z=1\) and find \(C\).
Note that the series manipulations above were only justified for \(|z|<1\), which excludes \(1\) and \(-1\). We will continue with this argument anyway and show in the next section that this is legitimate.
\[ C = 2Li_2(-1) = 2(-1+\frac{1}{4}-\frac{1}{9}+...) \]
A common trick for evaluating such an alternating sum is to treat the even-indexed and odd-indexed terms separately. Let \(E\) denote the sum of the even-indexed terms in absolute value. We find that \[E = \sum_{k=1}^\infty \frac{1}{(2k)^2} = \frac{1}{4} Li_2(1)\] Then the sum of the odd-indexed terms in absolute value, denoted \(D\), is the complement of the evens, as shown below: \[D = Li_2(1) -E = \frac{3}{4} Li_2(1)\] Thus \[C = 2Li_2(-1) = 2(E-D) = -Li_2(1)\]
Then, setting \(z=-1\) in identity (3.1), we see that
\[\begin{equation} Li_2(1) +Li_2(1) +\frac{1}{2}(\ln(-1))^2 = -Li_2(1) \tag{3.2} \end{equation}\] \[\rightarrow Li_2(1) = -\frac{1}{6}(\ln(-1))^2\]
Next we simply have to recall that \[e^{i \pi} = -1\] \[\rightarrow i \pi = \ln(-1)\] and thus \[Li_2(1) = \frac{\pi^2}{6},\] as desired.
With the intuition above as an introduction, we now make the argument rigorous.
Let \(\Omega = \mathbb{C}\backslash[1,\infty)\) and let \(z \in \Omega\). Any nonzero \(z\) can be written in the form \(z=re^{i\theta}\) where \(r>0\) and \(-\pi<\theta\leq\pi\).
Note that \(\ln(z) = \ln(r)+i\theta\) is analytic (that is, its Taylor series about \(x_0\) converges to the function in some neighborhood of \(x_0\), for every \(x_0\) in its domain) in \(\mathbb{C}\backslash(-\infty,0]\). It follows that \(\ln(1-z)\) is analytic in \(\Omega\).
For this proof we are interested in evaluating \[Li_2(1) = \sum_{k=1}^\infty \frac{1^k}{k^2}\] and this can be obtained by integrating \(\frac{-\ln(1-z)}{z}\). Note that this function has a removable discontinuity at the origin, so we define \[ F(z) = \left\{ \begin{array}{rcl} 1 & \mbox{for} & z=0 \\ \frac{-\ln(1-z)}{z} & \mbox{for} & z \neq 0 \end{array}\right. \] Then \(F\) is analytic in \(\Omega\). Away from the origin this holds because \(F\) is a quotient of analytic functions with nonvanishing denominator. Near the origin, for \(|z|<1\), we have \[-\ln(1-z) = z + \frac{z^2}{2} + \frac{z^3}{3} +\cdots,\] which shows that
\[\begin{equation} -\frac{\ln(1-z)}{z} = 1 + \frac{z}{2} + \frac{z^2}{3} +\cdots \tag{3.3} \end{equation}\]
for \(z \neq 0\). The function \(F\) agrees in a neighborhood of zero with the series on the right, and so is analytic there.
Now let us properly define \(Li_2(z)\).
Definition. For \(z \in \Omega\), \[ Li_2(z) = \int_0^z F(w)\,dw = \int_0^z \frac{-\ln(1-w)}{w}\, dw. \] This is a complex path integral. Since \(F\) is analytic in the simply connected domain \(\Omega\), the integral is path independent and \(Li_2\) is analytic in \(\Omega\), with \(Li'_2(z) = F(z)\). Therefore, by term-by-term integration of (3.3) we see that \[\begin{equation} Li_2(z) = z + \frac{z^2}{4} + \frac{z^3}{9} +\cdots \tag{3.4} \end{equation}\] for \(|z|<1\). Note that our domain \(\Omega\) does not contain the value we are interested in, \(1\). To begin to overcome this obstacle, let us first use radial limits to show that \(Li_2(z)\) and \(\sum \frac{z^k}{k^2}\) agree not only in the interior of the unit circle, but also on the unit circle \(T\), except at \(z=1\).
To begin let us prove the following lemma.
Lemma 1 (Radial Limits). Suppose \(z \in T\). Then \[ \lim\limits_{t \rightarrow 1^-} \sum \frac{(tz)^k}{k^2} = \sum \frac{z^k}{k^2}. \] Moreover, the convergence is uniform in \(z\): for any \(\epsilon >0\) there exists a \(\delta \in (0,1)\), independent of \(z\), such that \(\sum \frac{(tz)^k}{k^2}\) is within \(\epsilon\) of \(\sum \frac{z^k}{k^2}\) whenever \(\delta<t<1\).
Proof. Let \(t \in (0,1)\). Both series in the lemma converge absolutely, and the absolute value of their difference satisfies \[ \Bigg| \sum \frac{(1-t^k)z^k}{k^2} \Bigg| \leq \sum \frac{1-t^k}{k^2}. \] Split the sum on the right into two parts, corresponding to \(k \leq N\) and \(k>N\). The first part is a polynomial in \(t\) that converges to \(0\) as \(t\) approaches \(1\), so it can be made small by choosing \(t> \delta\) for an appropriate \(\delta <1\). The second part is bounded above by a tail of \(\sum 1/k^2\), irrespective of the value of \(t\). Therefore, by choosing first \(N\) and then \(\delta\), we can make the right-hand side arbitrarily small, establishing the limit asserted in the lemma. And because \(\delta\) can be chosen without regard to the value of \(z\), the second part of the lemma is verified.
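One explicit way to make the choices of \(N\) and \(\delta\) in the proof above is sketched here; the particular constants are illustrative and are not taken from the source. For \(k \leq N\) we have \(1 - t^k \leq 1 - t^N\), and comparing \(\frac{1}{k^2}\) with the telescoping terms \(\frac{1}{k(k-1)} = \frac{1}{k-1} - \frac{1}{k}\) gives \(\sum_{k=1}^N \frac{1}{k^2} < 2\) and \(\sum_{k>N} \frac{1}{k^2} < \frac{1}{N}\). Hence \[ \sum_{k=1}^\infty \frac{1-t^k}{k^2} \leq (1-t^N)\sum_{k=1}^N \frac{1}{k^2} + \sum_{k>N} \frac{1}{k^2} < 2(1-t^N) + \frac{1}{N}. \] Given \(\epsilon > 0\), first choose \(N > \frac{2}{\epsilon}\), so that \(\frac{1}{N} < \frac{\epsilon}{2}\), and then choose \(\delta \in (0,1)\) with \(2(1-\delta^N) < \frac{\epsilon}{2}\). For \(\delta < t < 1\) the right-hand side is then less than \(\epsilon\), and neither choice depends on \(z\).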
The radial limit result implies the following lemma.
Lemma 2. For all \(z \neq 1\) in the closed unit disk, \[Li_2(z) = \sum \frac{z^k}{k^2}.\] Proof. We already know that the equation holds for \(|z|<1\). So consider a fixed \(z \in T\) with \(z \neq 1\). By continuity of \(Li_2\) in \(\Omega\), the series (3.4), and Lemma 1, we have \[ \begin{align} Li_2(z) &= \lim\limits_{t \rightarrow 1^-} Li_2(tz) \\ &= \lim\limits_{t \rightarrow 1^-} \sum \frac{(tz)^k}{k^2} \\ &= \sum \frac{z^k}{k^2} \end{align} \] which is what we wished to show.
Next we want to extend the domain of \(Li_2(z)\) to include \(z=1\).
Lemma 3. Define \(Li_2(1) = \sum_{k=1}^{\infty} 1/k^2\). Then the restriction of \(Li_2(z)\) to \(T\) is continuous.
Proof. Because we already know that \(Li_2(z)\) is analytic, and hence continuous, in \(\Omega\), we only need to show continuity at \(z = 1\). To this end, let \(\epsilon > 0\). We will show that on the unit circle \(Li_2(z)\) varies by no more than \(\epsilon\) for \(z\) near \(1\). As illustrated in the figure below, we will go radially from \(z\) to \(t^*z\), then along the inner arc from \(t^*z\) to \(t^*\), and lastly radially from \(t^*\) to \(1\).
Applying Lemma 1, choose \(\delta_1\) so that for any \(z \in T\), \(\sum(tz)^k/k^2\) is within \(\epsilon /3\) of \(\sum z^k/k^2\) when \(\delta_1 < t < 1\). In other words, \(Li_2(z)\) varies by less than \(\epsilon /3\) along any radial line between the unit circle and the concentric circle of radius \(\delta_1\).
Fix \(t^*\) in the interval \((\delta_1, 1)\). The dilogarithm function is continuous on the circle \(T^* = t^*T = \{t^* z \mid z \in T \}\) and, in particular, is continuous at \(t^* = t^* \cdot 1\). This implies that for some \(\delta_2 > 0\), at any \(z^* \in T^*\) within \(\delta_2\) of \(t^*\), \(|Li_2(z^*) - Li_2(t^*)| < \epsilon /3\).
Next we want to show that for all points \(z\) of the unit circle within \(\delta_2\) of \(1\), \(|Li_2(z) - Li_2(1)| < \epsilon\). For such \(z\) we have \(|t^*z - t^*| = t^*|z-1| < \delta_2\), and by the triangle inequality
\[ |Li_2(z) -Li_2(1)| \leq |Li_2(z) - Li_2(t^*z)| + |Li_2(t^*z)-Li_2(t^*)| + |Li_2(t^*) - Li_2(1)| \]
On the right, the first and third terms measure radial variation of \(Li_2\) between the unit circle and \(T^*\). These terms are each less than \(\epsilon /3\). The middle term measures variation along the circle \(T^*\) over a distance less than \(\delta_2\), so it is also less than \(\epsilon /3\). Therefore we have shown that \(|Li_2(z) - Li_2(1)| < \epsilon\), which proves that the restriction of \(Li_2(z)\) to the unit circle is continuous at \(z = 1\).
Thus we have shown that we can evaluate \(Li_2(1)\). Let us turn to Euler’s identity (3.1). We can justify the identity by differentiation only where \(Li_2(-1/z)\), \(Li_2(-z)\), and \(\ln(z)\) are all analytic. That requires \(-1/z\) and \(-z\) both to be in \(\Omega\), and \(z\) to be in \(\mathbb{C}\backslash(-\infty, 0] = \Lambda\). All three conditions hold for \(z \in \Lambda\), so the left-hand side of the identity is analytic there. Since \(\Lambda\) is connected, we can infer that the left-hand side is constant by verifying that its derivative vanishes. Moreover, since neither \(-z\) nor \(-1/z\) is zero, we can differentiate the dilogarithm terms using the fact that \(Li_2'(w) = -\ln(1-w)/w\). This shows that the identity is true in \(\Lambda\).
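The derivative computation referred to above can be written out as follows. This is a sketch for \(z \in \Lambda\); it uses the chain rule with \(Li_2'(w) = -\ln(1-w)/w\) and the relation \(\ln\left(1+\frac{1}{z}\right) = \ln(1+z) - \ln(z)\), which holds for the principal branch on \(\Lambda\). \[ \begin{aligned} \frac{d}{dz} Li_2\left(-\frac{1}{z}\right) &= \frac{-\ln\left(1+\frac{1}{z}\right)}{-\frac{1}{z}}\cdot\frac{1}{z^2} = \frac{\ln\left(1+\frac{1}{z}\right)}{z} = \frac{\ln(1+z) - \ln(z)}{z}, \\ \frac{d}{dz} Li_2(-z) &= \frac{-\ln(1+z)}{-z}\cdot(-1) = -\frac{\ln(1+z)}{z}, \\ \frac{d}{dz}\, \frac{1}{2}(\ln(z))^2 &= \frac{\ln(z)}{z}. \end{aligned} \] Adding the three lines, every term cancels, so the derivative of the left-hand side of (3.1) is \(0\) throughout \(\Lambda\).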
In particular, identity (3.1) holds at \(z = 1\), where it reads \(C = 2Li_2(-1)\). Although \(Li_2\) is now defined as an integral, Lemma 2 shows that \(Li_2(-1) = \sum (-1)^k/k^2\). Splitting this sum into even and odd terms as before gives \(C = -\sum 1/k^2\), which by the definition in Lemma 3 is \(-Li_2(1)\).
To complete the proof we would like to take \(z = -1\) in (3.1). But identity (3.1) has not been established for \(z = -1\), since \(-1 \notin \Lambda\). However, the identity does hold at every other point of \(T\). By Lemma 3, the two \(Li_2\) terms are continuous for \(z \in T\). Although \(\ln(z)\) jumps from \(-\pi i\) to \(\pi i\) where \(T\) passes through \(-1\), \((\ln z)^2\) is continuous there, with value \(-\pi^2\). Thus the left side of (3.1) is continuous on all of \(T\). But we already saw it to be \(-Li_2(1)\) everywhere except at \(z = -1\), so we can now conclude that the identity must hold at \(z = -1\) as well. So we may use (3.2), with the additional understanding that \([\ln(-1)]^2 = - \pi^2\). As argued earlier, this leads to \(Li_2(1) = \pi^2 /6\). This concludes Lewin’s argument.
All these approaches are fine ways to evaluate \(\sum 1/k^2\). However, there is still something that should bother the reader after coming this far: where exactly does \(\pi\) come into play? When we were first introduced to \(\pi\), we used it to describe circles. So where is the circle in the Basel problem? Thanks to the Swedish mathematician Johan Wästlund we have a proof that describes the relationship between \(\sum 1/k^2\) and circles. Though for lack of time I will only give the intuition for why it works, I encourage you to read his paper [3].
To get the intuition of this proof we need to understand two concepts.
The first is the inverse square law. In this example we will think in terms of light. The inverse square law tells us how bright a light source appears from a given distance: the apparent brightness is \(1/d^2\), where \(d\) is the distance from us to the source. Note that as \(d \rightarrow 0\) the apparent brightness approaches \(\infty\).
Second, we should be aware of the inverse Pythagorean theorem shown below: in a right triangle with legs \(a\) and \(b\) and altitude \(h\) onto the hypotenuse, \(\frac{1}{a^2} + \frac{1}{b^2} = \frac{1}{h^2}\). To start off this example, consider a circle with circumference 2. It has diameter \(2/ \pi\). If we stand on the circle, diametrically opposite a light source, we are \(2/ \pi\) units away from it, and the apparent brightness is \(1/d^2 = \pi^2/4\). Next we draw a new circle with twice the circumference (4), so that the two circles touch at our position. We then draw a horizontal line through the original light source and use the inverse Pythagorean theorem to split the light into two components, placed where this line meets the larger circle. A diagram is provided below to help visualize this process, where we are at the bottom of the circle. The dots are the light sources.
Because of how we move the light sources to the bigger circle, the total apparent brightness is still \(\pi^2/4\), by the inverse Pythagorean theorem. It is also worth noting that the two light sources are two units apart along the arc of the circle, by the inscribed angle theorem. Next, by a similar process, we add a circle with double the circumference of our biggest circle; that is, our new circle has circumference 8. We again move the lights in such a way that we maintain the apparent brightness. We are able to keep duplicating light sources without adding any apparent brightness thanks to the inverse Pythagorean theorem mentioned above. A diagram is provided below to visualize the process.
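As a check on the first doubling step, here is the computation under one reading of the construction (an assumption about the missing diagram): the larger circle has circumference 4, hence radius \(2/\pi\), so the original light sits at its center and the horizontal line through it is a diameter. The two new lights are the endpoints of that diameter. With us at the bottom of the circle, the triangle formed by us and the two new lights has a right angle at our position, legs \(a = b\), and altitude \(h = 2/\pi\) onto the diameter. Then \[ a^2 = b^2 = h^2 + h^2 = \frac{8}{\pi^2}, \qquad \frac{1}{a^2} + \frac{1}{b^2} = \frac{\pi^2}{8} + \frac{\pi^2}{8} = \frac{\pi^2}{4} = \frac{1}{h^2}, \] so the two new lights together appear exactly as bright as the single original light.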
Note that our apparent brightness has stayed the same as the circle grows, so it is still \(\pi^2 /4\). Now let us visualize the circle growing so that its circumference approaches infinity.
Notice from the diagram above that the arc seems to become more “flat”. In a way, it comes to represent the real number line. That is, we can think of the real number line as an unboundedly growing circle! To use this to solve the Basel problem, recall that as we enlarge the circle and move our light sources to the new circle, neighboring light sources remain 2 units apart along the arc, and the sources nearest to us sit 1 unit away along the arc on either side. Thus if we take the bottom of the circle to be the origin, in the limit the lights sit at the odd integers and the apparent brightness becomes \[ \cdots + \frac{1}{(-5)^2} + \frac{1}{(-3)^2} + \frac{1}{(-1)^2} + \frac{1}{1^2} + \frac{1}{3^2} + \frac{1}{5^2} + \cdots = \frac{\pi^2}{4} \]
Taking only the terms with positive odd integers, we find by symmetry that their sum is \(\pi^2/8\). Note that \[ \begin{align} O &= \sum_{k=1}^\infty \frac{1}{(2k)^2} \\ &= \frac{1}{4} \sum_{k=1}^\infty \frac{1}{k^2}\\ \rightarrow D &= \frac{3}{4} \sum_{k=1}^\infty \frac{1}{k^2} \end{align} \]
where \(O\) is the sum of the even inverse squares and \(D\) is the sum of the odd inverse squares. Since we know the value of \(D\), we can see that
\[\begin{align} \frac{\pi^2}{8} &= D \\ &= \frac{3}{4} \sum_{k=1}^\infty \frac{1}{k^2} \\ \rightarrow \sum_{k=1}^\infty \frac{1}{k^2} &= \frac{\pi^2}{6} \end{align}\]
As desired.
As we conclude our survey of some solutions to the Basel problem, we would like to highlight some of the great things that came out of studying this question.
First, we obtained the zeta function, which is used in the study of the distribution of primes. This function is very rich and there is always something new to learn about it. It is also the subject of the Riemann hypothesis, which carries a $1 million prize for anyone who can prove that the zeta function has zeros only at the negative even integers and at complex numbers with real part 1/2.
Next, by generalizing the zeta function we obtained the polylogarithm function, which has been used in defining distributions in quantum statistics. It has in turn been generalized to the Lerch zeta function, which has its uses in physics.
Finally, for those of us who enjoy geometry, we now have a new perspective on the real number line: namely, that we can view it as an unboundedly growing circle.