Note: This is an updated version of the SoME 3 entry. See the original here.
There are many functions on a scientific or graphing calculator that we are introduced to as high school students that, we are told, just work. You select the function, put in the value that you need to calculate, hit "=" or "ENTER", and SHABAM! You have the correct answer to some arbitrary number of digits, all of which you are assured are 100% accurate.
Pictured: The natural log of 2 to (an arbitrary) 32 digits on the built-in Microsoft calculator.
This is wonderful! Necessary, even, for if our calculators and computers computed logarithms inaccurately (along with exponentials, trig functions, and square roots, to name but a few), a lot of scientific and engineering work would break and end in catastrophe. But how do we know that the value on the calculator is, in fact, accurate? How did the calculator crunch the input number and give us the output? I remember this question being asked in high school, and the answer being no more than a handwave. To be fair to Algebra II teachers (which is where I was first introduced to logarithms, if memory serves), the answer is beyond the scope of algebra and lies in calculus. But it is an answer worth knowing, and it's an answer that is very knowable, even if you don't have a background in calculus. Calculus is the method for deriving the formula used in calculators and computers, but the formula itself is a pretty simple polynomial.
Most of us were introduced to polynomial equations in algebra. For a quick refresher, a polynomial is an expression involving at least one variable (usually $x$) raised to non-negative whole-number powers, with each power scaled by a constant coefficient. Some familiar examples:

Line: $y = mx + b$

Parabola: $y = ax^2 + bx + c$

Quartic: $y = ax^4 + bx^3 + cx^2 + dx + e$
Note: The links lead to Desmos graphs where you can change the parameters to see how it changes the graph.
While these equations of polynomials contain a finite number of terms, we can have polynomials with an infinite number of terms. These are called series, and one of the simplest types of series is called a geometric series. A geometric series contains an initial term, usually denoted by the variable $a$, and each term after it is the previous term multiplied by a fixed common ratio, usually denoted by $r$. Here are two examples:

$$1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \ldots$$

$$3 - 6 + 12 - 24 + \ldots$$

In the first series, the initial value is $a = 1$ and each term is half the one before it, so $r = \frac{1}{2}$. In the second series, $a = 3$ and $r = -2$.

The general formula for a geometric series is

$$\sum_{n=0}^{\infty} ar^n = a + ar + ar^2 + ar^3 + \ldots$$
If we set this formula equal to $S$, we have

$$S = a + ar + ar^2 + ar^3 + \ldots$$

By pulling an $r$ out of every term after the first, we get

$$S = a + r(a + ar + ar^2 + \ldots)$$

Despite the fact that the series is infinite, the expression inside the parentheses is exactly the series $S$ we started with, so we can write

$$S = a + rS$$

Move $rS$ to the left-hand side,

$$S - rS = a$$

combine like-terms:

$$S(1 - r) = a$$

and solve for $S$:

$$S = \frac{a}{1 - r}$$

We can now set both expressions for the series equal to each other:

$$\sum_{n=0}^{\infty} ar^n = \frac{a}{1 - r}$$
Note that this assumes the series converges, which for a geometric series happens exactly when $|r| < 1$. This formula does not hold if the series diverges.
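To make this concrete, the first example series above has $a = 1$ and $r = \frac{1}{2}$, so the formula tells us

$$1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \ldots = \frac{1}{1 - \frac{1}{2}} = 2$$

The second example has $r = -2$, so $|r| \ge 1$, and the formula does not apply: that series diverges.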
Regarding the natural logarithm, the geometric series we are interested in is the one where $a = 1$ and $r = -x$:

$$\frac{1}{1 + x} = \sum_{n=0}^{\infty} (-x)^n = 1 - x + x^2 - x^3 + \ldots$$

This series converges when $|x| < 1$.
Pictured: The 5-term approximation of the geometric series from above (black line) compared to the function $\frac{1}{1+x}$ (green line). Note how the series approximation is only accurate for $|x| < 1$.
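If you'd like to verify this behavior numerically rather than graphically, here is a quick Python sketch of mine (my own check, not part of the original graphs) comparing the 5-term partial sum to the true value of $\frac{1}{1+x}$ inside and outside the interval of convergence:

```python
def geometric_partial_sum(x, terms=5):
    """Partial sum 1 - x + x^2 - x^3 + ... with the given number of terms."""
    return sum((-x) ** n for n in range(terms))

for x in [0.1, 0.5, 0.9, 1.5]:
    approx = geometric_partial_sum(x)
    exact = 1 / (1 + x)
    print(f"x = {x}: series = {approx:.6f}, exact = {exact:.6f}")
```

For $x = 0.1$ the five-term sum is already accurate to several digits, at $x = 0.9$ it is still rough, and at $x = 1.5$ it is nowhere near the true value, just as the graph shows.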
"This is all very interesting, but what does this have to do with computing logarithms?"
It turns out that the function $\frac{1}{1+x}$ is intimately related to the natural logarithm: the area under the curve $\frac{1}{1+x}$ from $x = 0$ to $x = u - 1$ is exactly the natural logarithm of $u$. In the language of calculus, this area is an integral, and we can write

$$\ln(u) = \int_0^{u-1} \frac{1}{1+x}\, dx$$

While the understanding of the calculus involved is beyond the scope of this article, I made a Desmos graph where you can play with values of $u$ and see that the area under the curve always matches the natural logarithm.
Pictured: The integral of $\frac{1}{1+x}$ from 0 to $e - 1$ (shaded in blue) and a square of Area = 1 (shaded in green). Both shaded areas are equal in size.
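As a numerical sanity check (my own, separate from the Desmos graph), we can approximate the area with a simple midpoint Riemann sum in Python and compare it to math.log:

```python
import math

def area_under_curve(u, steps=100_000):
    """Midpoint Riemann sum of 1/(1+x) from x = 0 to x = u - 1."""
    width = (u - 1) / steps
    return sum(width / (1 + (i + 0.5) * width) for i in range(steps))

for u in [2.0, math.e, 10.0]:
    print(f"u = {u:.5f}: area = {area_under_curve(u):.6f}, ln(u) = {math.log(u):.6f}")
```

For $u = e$ the area comes out to 1, matching the unit square in the picture above.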
Taking integrals is an inverse problem, and for some functions it is very difficult, if not impossible, to get the exact solution. However, for polynomials, it's actually very easy. For a given polynomial term, increase the degree by one, divide the term by the new degree, and take the difference of the resulting expression evaluated at the two limits of integration (the values at the top and bottom of the integral sign):

$$\int_{a}^{b} t^n\, dt = \frac{t^{n+1}}{n+1} \Big|_{t=a}^{t=b} = \frac{b^{n+1}}{n+1} - \frac{a^{n+1}}{n+1}$$
We'll start by setting the natural logarithm equal to the integral of our geometric series:

$$\ln(1+x) = \int_0^x \frac{1}{1+t}\, dt = \int_0^x \left(1 - t + t^2 - t^3 + \ldots\right) dt$$
The integral of a sum is the sum of the integrals of each term, so we can take this one term at a time. Starting with the first term,
$$
\begin{align*}
\int_{0}^{x} 1\, dt & = \int_{0}^{x} 1 \cdot t^0\, dt \\
& = \frac{t^{0+1}}{0+1} \Big|_{t=0}^{t=x} \\
& = \frac{t^1}{1} \Big|_{t=0}^{t=x} \\
& = t \Big|_{t=0}^{t=x} \\
& = x - 0 = x
\end{align*}
$$
Let me explain the symbols above. We start by taking the integral of 1, a constant, in terms of the variable $t$, from $t = 0$ to $t = x$. Since $t^0 = 1$, the constant is really a degree-0 polynomial term, so we raise the degree to 1, divide by the new degree, and evaluate the result at the upper and lower limits, subtracting the latter from the former. The first term of our new series is therefore just $x$. Next, the integral of the second term, $-t$:
$$
\begin{align*}
\int_{0}^{x} -t\, dt & = \int_{0}^{x} -t^1\, dt \\
& = -\frac{t^{1+1}}{1+1} \Big|_{t=0}^{t=x} = -\frac{t^2}{2} \Big|_{t=0}^{t=x} \\
& = -\frac{x^2}{2} - \biggl(-\frac{0^2}{2} \biggr) \\
& = -\frac{x^2}{2} + 0 \\
& = -\frac{x^2}{2}
\end{align*}
$$
Our second term is $-\frac{x^2}{2}$. On to the third term of the geometric series, $t^2$:
$$
\begin{align*}
\int_{0}^{x} t^2\, dt & = \frac{t^{2+1}}{2+1} \Big|_{t=0}^{t=x} \\
& = \frac{t^3}{3} \Big|_{t=0}^{t=x} \\
& = \frac{x^3}{3} - \frac{0^3}{3} \\
& = \frac{x^3}{3}
\end{align*}
$$
With this, the pattern starts to come into focus. We have a series where each term is a polynomial divided by its degree, where all of the odd-degree terms are positive and all the even-degree terms are negative. Written as a series,

$$\ln(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \ldots = \sum_{n=0}^{\infty} (-1)^n \frac{x^{n+1}}{n+1}$$
In fact, if we were to integrate the general term of the geometric series, this is exactly what we get:
$$
\begin{align*}
\int_0^x \frac{1}{1+t}\, dt & = \int_0^x \sum_{n=0}^{\infty} (-t)^n\, dt \\
& = \sum_{n=0}^{\infty} \biggl( \int_0^x (-t)^n\, dt \biggr) \\
& = \sum_{n=0}^{\infty} \biggl( \int_0^x (-1)^n t^n\, dt \biggr) \\
& = \Biggl( \sum_{n=0}^{\infty} (-1)^n \frac{t^{n+1}}{n+1} \Biggr) \Bigg|_{t=0}^{t=x} \\
& = \sum_{n=0}^{\infty} (-1)^n \frac{x^{n+1}}{n+1}
\end{align*}
$$
So, we did it! We have a polynomial series that calculates the natural logarithm! Granted, we need to shift our input by 1, as this series is for $\ln(1+x)$ rather than $\ln(x)$; to find the natural logarithm of a value $u$, we substitute $x = u - 1$:

$$\ln(u) = \sum_{n=0}^{\infty} (-1)^n \frac{(u-1)^{n+1}}{n+1}$$
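To see the series in action (and to foreshadow its big weakness), here is a short Python sketch of mine that compares its partial sums against math.log:

```python
import math

def ln_series(u, terms=50):
    """Partial sum of ln(u) = sum over n of (-1)^n (u-1)^(n+1) / (n+1)."""
    x = u - 1
    return sum((-1) ** n * x ** (n + 1) / (n + 1) for n in range(terms))

for u in [0.5, 1.5, 1.9, 3.0]:
    print(f"u = {u}: series = {ln_series(u):+.6f}, math.log = {math.log(u):+.6f}")
```

The first three inputs come out close to the true values (though convergence gets slower as $u$ moves away from 1), while $u = 3.0$ produces complete garbage.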
Except we're limited. Very limited. Remember, the geometric series only applies to the function $\frac{1}{1+x}$ when $|x| < 1$, and our substitution $x = u - 1$ means the series can only compute the natural logarithm of inputs between 0 and 2. That is exactly why $u = 3$ fell apart above: the series diverges there.
Well, we tried. Pack up, go home, see you next time.
Actually, as it turns out, this would be the end if we were dealing with almost any other function. But we are dealing with logarithms, and logarithms are special. There are two properties that we can use to reduce our inputs so that we can calculate the logarithm of any value we want!
The first property is that multiplication in the input is equivalent to addition of the outputs.
This means that if the input is the product of two factors, the logarithm of that product is the same
as the logarithms of each factor taken individually and added together. For example, if we wanted to
find the natural logarithm of the value 6, we can compute either $\ln(6)$ directly or $\ln(2) + \ln(3)$, and we will get the same answer either way.

The general form of this property is

$$\ln(a \cdot b) = \ln(a) + \ln(b)$$

This also applies to division in the inputs, but instead, you subtract the outputs:

$$\ln\left(\frac{a}{b}\right) = \ln(a) - \ln(b)$$
The second property is that the logarithm of a value raised to some power is that power times the logarithm of the (unraised) value.
This means that exponents in the input of the logarithm can hop out and
become a coefficient in the output. For example, the number 8 can also be written as $2^3$, so $\ln(8) = \ln(2^3) = 3\ln(2)$.

The general form of this property is

$$\ln(a^b) = b\ln(a)$$
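Both properties are easy to check numerically; here is a quick sketch using Python's math module:

```python
import math

# Property 1: multiplication inside the log becomes addition outside.
print(math.log(6), math.log(2) + math.log(3))  # both print 1.791759...

# Property 2: an exponent inside the log becomes a coefficient outside.
print(math.log(8), 3 * math.log(2))            # both print 2.079442...
```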
Each of these properties allows us to reduce the argument of the logarithm in different ways. Using the
first property, if we have the natural logarithm of a known value (2 and 10 are common choices), we can
reduce the argument by powers of that constant until we get a small enough input. Let's use 15 as an
example. If we have the natural logarithm of 2 calculated, we can keep dividing 15 by 2 until it is close
to 1 while keeping track of the number of times we divided by 2. After dividing by 2 four times,
we reduce 15 to $\frac{15}{2^4} = 0.9375$, so

$$\ln(15) = \ln(0.9375 \cdot 2^4) = \ln(0.9375) + 4\ln(2)$$

Since 0.9375 sits comfortably between 0 and 2, our series can handle it.
We can see that as terms are added to the series, the approximation gets more and more precise as each term gets smaller and smaller in magnitude.¹ In this particular example, the next term is less than one tenth the absolute value of the previous term. This means that the approximation gains about one digit of accuracy for each term we add to the series. To see this, let's look at the first eight approximations for $\ln(0.9375)$.
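Rather than typing the list out, here is a small Python sketch of mine that prints those eight partial sums alongside math.log:

```python
import math

x = 0.9375 - 1  # the series input for ln(0.9375)
partial_sum = 0.0
for n in range(8):
    partial_sum += (-1) ** n * x ** (n + 1) / (n + 1)
    print(f"{n + 1} term(s): {partial_sum:.12f}")

print(f"math.log:  {math.log(0.9375):.12f}")
```

Each successive line agrees with math.log in roughly one more decimal place.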
So now we have a formula for finding the natural logarithm of any number to arbitrary precision based on the number of terms we use in the series. We did it! I even have a Python script that implements this formula to return the natural logarithm of any value within 64-bit floating-point precision using 48 terms.
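That script isn't reproduced here, but a minimal sketch of the same idea (reduce the input into the series' range with powers of 2, sum the series, then add back the multiples of $\ln(2)$) might look like this:

```python
import math

LN2 = 0.6931471805599453  # ln(2), precomputed to double precision

def natural_log(u, terms=48):
    """Approximate ln(u): reduce u into (0.5, 1] using powers of 2,
    then evaluate the shifted series and add back k * ln(2)."""
    if u <= 0:
        raise ValueError("the natural logarithm is only defined for positive inputs")
    k = 0
    while u > 1.0:   # divide large inputs down...
        u /= 2
        k += 1
    while u <= 0.5:  # ...and multiply small inputs up
        u *= 2
        k -= 1
    x = u - 1  # now |x| < 0.5, where the series converges quickly
    series = sum((-1) ** n * x ** (n + 1) / (n + 1) for n in range(terms))
    return series + k * LN2

print(natural_log(15.0))  # 2.70805020110221...
print(math.log(15.0))     # should agree to 15-16 digits
```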
Now, this formula is perfectly good to use. After all, it's the formula used in Python's decimal module to find the natural logarithm for base-10 number representation. But what if there were a better formula? One that converged much faster to find the natural logarithm with fewer terms?
How can an infinite series converge faster? Well, it turns out that for inputs less than 1, series that have terms that increase by two degrees will converge much faster than series that only increase by one. For example, if you look at the series for sine and cosine, they increase in degree by 2 for each term and converge very quickly (in fact, if you look at the C implementations of the sine kernel and cosine kernel, they only use six polynomial terms to get double floating-point precision, a much quicker convergence than our 48 terms!). So how do we get this series to skip a degree for each term? Surely, we can't just get rid of all the even-degree terms or all of the odd-degree terms and still get an accurate answer, right?
We can if we do it in a clever way. Let's look at the series again:

$$\ln(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \ldots$$

This is the series for $\ln(1+x)$. What happens if we substitute $-x$ for $x$? Every odd power flips sign, and we get the series for $\ln(1-x)$:

$$\ln(1-x) = -x - \frac{x^2}{2} - \frac{x^3}{3} - \frac{x^4}{4} - \ldots$$
The difference between this series and our original series is that all the odd terms have become negative while the even terms remain negative. This is perfect! If you don't understand why, let me put each series in expanded form with all the terms lined up:

$$
\begin{align*}
\ln(1+x) & = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \frac{x^5}{5} - \ldots \\
\ln(1-x) & = -x - \frac{x^2}{2} - \frac{x^3}{3} - \frac{x^4}{4} - \frac{x^5}{5} - \ldots
\end{align*}
$$
It should be apparent from this form that if we add the two series together, the odd terms will cancel out, and if we subtract the second series from the first series, the even terms will cancel out. So, should we add the series together, or subtract them? This must be determined by seeing how the logarithm function input changes when we add or subtract the two functions together. If we add the two functions together, we get

$$\ln(1+x) + \ln(1-x) = \ln\bigl((1+x)(1-x)\bigr) = \ln(1-x^2)$$
And if we subtract the two functions, we get

$$\ln(1+x) - \ln(1-x) = \ln\left(\frac{1+x}{1-x}\right)$$
using the first property of logarithms.³ Let's remind ourselves that our goal is to compute the natural logarithm of some input $u$; whichever combination we choose, we will need to solve for the $x$ that makes the logarithm's argument equal to $u$.
For the subtracted logarithm, set $u = \frac{1+x}{1-x}$ and solve for $x$, which gives $x = \frac{u-1}{u+1}$. For the added logarithm, set $u = 1 - x^2$, which gives $x = \sqrt{1-u}$.
The addition of the logarithms requires taking a square root—a complicated function in and of itself—to convert our main input into the series input, while the subtraction of the logarithms only requires the basic operations of addition, subtraction, and division.⁴ Subtracting the logarithms is the clear winner. Now we can take our two series and combine them in this manner:

$$\ln\left(\frac{1+x}{1-x}\right) = \left(x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \ldots\right) - \left(-x - \frac{x^2}{2} - \frac{x^3}{3} - \frac{x^4}{4} - \ldots\right)$$
Distributing the negative sign on the second series, we get

$$\ln\left(\frac{1+x}{1-x}\right) = \left(x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \ldots\right) + \left(x + \frac{x^2}{2} + \frac{x^3}{3} + \frac{x^4}{4} + \ldots\right)$$
and then combining them gives us

$$\ln\left(\frac{1+x}{1-x}\right) = 2x + \frac{2x^3}{3} + \frac{2x^5}{5} + \ldots = \sum_{n=0}^{\infty} \frac{2}{2n+1} x^{2n+1}$$
Substituting our initial value, $x = \frac{u-1}{u+1}$, into the series gives
$$
\begin{align*}
\ln(u) & = \sum_{n=0}^{\infty} \frac{2}{2n+1} {\biggl( \frac{u-1}{u+1} \biggr)}^{2n+1} \\
& = 2\biggl( \frac{u-1}{u+1} \biggr) + \frac{2}{3} {\biggl( \frac{u-1}{u+1} \biggr)}^3 + \frac{2}{5} {\biggl( \frac{u-1}{u+1} \biggr)}^5 + \frac{2}{7} {\biggl( \frac{u-1}{u+1} \biggr)}^7 + \ldots
\end{align*}
$$
We now have a series that increases by two degrees for each new term added! But how much quicker does
this series converge compared to our old series? I have made a
Desmos graph where you can compare the two series.
The first series is still noticeably off from $\ln(x)$ after only three terms, while the new series is already nearly indistinguishable from it over a wide range of inputs.
Pictured: A comparison of the 3-term approximations for the series $\sum_{n=0}^{\infty} (-1)^n \frac{(x-1)^{n+1}}{n+1}$ (red) and $\sum_{n=0}^{\infty} \frac{2}{2n+1} {\biggl( \frac{x-1}{x+1} \biggr)}^{2n+1}$ (green) to $\ln(x)$ (dotted black). Even after only three terms, the latter series is a much better approximation than the former.
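If you prefer numbers to pictures, here is a quick Python sketch of mine that evaluates the new series with only ten terms and compares it to math.log:

```python
import math

def ln_fast(u, terms=10):
    """ln(u) via the accelerated series 2 * sum of y^(2n+1) / (2n+1),
    where y = (u - 1) / (u + 1)."""
    y = (u - 1) / (u + 1)
    return 2 * sum(y ** (2 * n + 1) / (2 * n + 1) for n in range(terms))

for u in [0.5, 0.9375, 2.0]:
    print(f"u = {u}: fast series = {ln_fast(u):.12f}, math.log = {math.log(u):.12f}")
```

Ten terms already agree with math.log to eleven or more digits for these inputs, where the old series needed dozens of terms for comparable accuracy (inputs far from 1 would still be reduced first, as before).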
But regardless of which series is used under the hood, you now understand how your computer and your calculator compute and give you an accurate value for the natural logarithm function. And with the change of base formula, you can calculate the logarithm of any base.
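For reference, the change of base formula converts a natural logarithm into a logarithm of any base $b$ with a single division:

$$\log_b(u) = \frac{\ln(u)}{\ln(b)}, \qquad \text{for example} \quad \log_{10}(15) = \frac{\ln(15)}{\ln(10)} \approx \frac{2.70805}{2.30259} \approx 1.17609$$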
As I was learning how to script in Python, I began to get curious about how things worked under the hood of this programming language. Since Python is open source, I was able to look at its source code to see how different mathematical functions, particularly those in its math and cmath modules, worked and how their code was structured. I learned about different techniques of coding mathematical formulae that avoid weird edge cases that would either produce an inaccurate result or throw an error despite a calculable value existing. However, when it came to the more complicated basic math functions, like the logarithm, trigonometric, and exponential functions, I found that the Python source code didn't implement them at all, because Python uses the functions from the C library. This led me to a new adventure where I looked into the original C code, which, while more difficult to read, is surprisingly readable once you have learned Python.

The trig functions worked more or less how I expected, but I was surprised that the natural logarithm used a different series than the one I had learned about in my calculus class. (So does the exponential function, incidentally. After all, the Taylor series for exp(x) only increases by one degree each term, and as we've learned here, if you can bump that up to two degrees per term, you have better optimization. If you are curious, go and seek that one out yourself.) Once I worked out how this series was derived, I gained an appreciation for the cleverness of the mathematicians and programmers who sussed out this series to calculate values for this very useful function.

I have always been interested in how things like this work; when I was younger, I was the curious student who got frustrated when something logical and grounded in structured patterns was handed to us like a magic wand where one push of a button gave you the answer you sought. I saw logarithm tables that had painstaking calculations for every common logarithm between 1.000 and 10.000 in 0.001 increments. Where did these values come from, and how do we know they work for our calculations of radioactive decay, chemical reactions, and population growth? Thanks to this exercise, now I know, and if you made it this far, so do you! Thank you for joining me on this fun little math adventure. I hope you found this exercise as interesting and enlightening as I have.
Did you enjoy this article? Want to help me make more? Consider making a donation.
Footnotes
1. This is true for geometric series, but not all infinite series. The harmonic series is a famous example of a series where the absolute value decreases with each successive term, but the series diverges to positive infinity.

2. While exclamation points are used in mathematics to denote factorials, factorials do not appear in this blog post. Exclamation points are just exclamation points.

3. The function $\ln \left( \frac{1+x}{1-x} \right)$ happens to be the inverse hyperbolic tangent function $\big( \text{artanh}(x)$ or $\tanh^{-1}(x) \big)$ multiplied by 2.

4. This also conveniently maps all of the positive reals to the domain $x \in (-1, 1)$, giving us an infinite radius of convergence, as we see in the graph.