Properties of the logarithm

[summary:

$\log_b(x \cdot y) = \log_b(x) + \log_b(y)$ for any $b$; this is the defining characteristic of logarithms.
$\log_b(1) = 0,$ because $\log_b(1) = \log_b(1 \cdot 1) = \log_b(1) + \log_b(1).$
$\log_b\left(\frac{1}{x}\right) = -\log_b(x),$ because $\log_b(1) = \log_b\left(x \cdot \frac{1}{x}\right) = \log_b(x) + \log_b\left(\frac{1}{x}\right) = 0.$
$\log_b\left(\frac{x}{y}\right) = \log_b(x) - \log_b(y),$ which follows immediately from the above.
$\log_b\left(x^n\right) = n \cdot \log_b(x),$ because $x^n = \underbrace{x \cdot x \cdots x}_{n\text{ times}}.$
$\log_b\left(\sqrt[n]{x}\right) = \frac{\log_b(x)}{n},$ because $\log_b(x) = \log_b\left((\sqrt[n]{x})^n\right) = n \cdot \log_b(\sqrt[n]{x}).$
For every $f$ that satisfies $f(x \cdot y) = f(x) + f(y)$ for all $x, y \in \mathbb R^+,$ either $f$ sends every input to 0 or there exists some $b$ such that $f(b) = 1,$ in which case we call $f$ $\log_b.$ Thus, $\log_b(b) = 1.$
$\log_b(b^n) = n,$ because $\log_b(x^n) = n \log_b(x)$ and $\log_b(b) = 1.$ ]

With a solid interpretation of logarithms under our belt, we are now in a position to look at the basic properties of the logarithm and understand what they are saying. The defining characteristic of logarithm functions is that they are [-real_valued] functions $f$ such that

Property 1: Multiplying inputs adds outputs. $f(x \cdot y) = f(x) + f(y) \tag{1}$ for all $x, y \in$ [positive_reals $\mathbb R^+$ ]. This says that whenever the input grows (or shrinks) by a factor of $y$ , the output goes up (or down) by only a fixed amount, which depends on $y$ . In fact, equation (1) alone tells us quite a bit about the behavior of $f,$ and from it, we can _almost_ guarantee that $f$ is a logarithm function. First, let's see how far we can get using equation (1) all by itself:

Property 2: 1 is mapped to 0. $f(1) = 0. \tag{2}$

This says that the amount the output changes if the input grows by a factor of 1 is zero — i.e., the output does not change if the input changes by a factor of 1. This is obvious, as "the input changed by a factor of 1" means "the input did not change."

Exercise: Prove (2) from (1). %%hidden(Proof): $f(x) = f(x \cdot 1) = f(x) + f(1),\text{ so }f(1) = 0.$ %%
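Equations (1) and (2) can be spot-checked numerically for one concrete candidate. The sketch below assumes $f$ is the base-2 logarithm (Python's `math.log2`), which is just one illustrative choice; it is a floating-point sanity check, not a proof.

```python
import math


def f(x: float) -> float:
    """One concrete candidate for f (an assumption): the base-2 logarithm."""
    return math.log2(x)


# Equation (1): multiplying inputs adds outputs.
for x, y in [(2.0, 8.0), (3.0, 5.0), (0.25, 7.5)]:
    assert math.isclose(f(x * y), f(x) + f(y))

# Equation (2): the input 1 is mapped to 0.
assert f(1.0) == 0.0
```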

Property 3: Reciprocating the input negates the output. $f(x) = -f\left(\frac{1}{x}\right). \tag{3}$

This says that the way that growing the input by a factor of $x$ changes the output is exactly the opposite from the way that shrinking the input by a factor of $x$ changes the output. In terms of the "communication cost" interpretation, if doubling (or tripling, or $n$ -times-ing) the possibility space increases costs by $c$ , then halving (or thirding, or $n$ -parts-ing) the space decreases costs by $c.$

Exercise: Prove (3) from (2) and (1). %%hidden(Proof): $x \cdot \frac{1}{x} = 1,\text{ so }f(1) = f\left(x \cdot \frac{1}{x}\right) = f(x) + f\left(\frac{1}{x}\right).$

$f(1)=0,\text{ so }f(x)\text{ and }f\left(\frac{1}{x}\right)\text{ must be opposites.}$ %%

Property 4: Dividing inputs subtracts outputs.

$f\left(\frac{x}{y}\right) = f(x) - f(y). \tag{4}$

This follows immediately from (1) and (3).
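
Spelled out, the derivation applies (1) to the product $x \cdot \frac{1}{y}$ and then (3) to the term $f\left(\frac{1}{y}\right)$:

$f\left(\frac{x}{y}\right) = f\left(x \cdot \frac{1}{y}\right) = f(x) + f\left(\frac{1}{y}\right) = f(x) - f(y).$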

Exercise: Give an interpretation of (4). %%hidden(Interpretation): There are at least two good interpretations:

$f\left(x \cdot \frac{1}{y}\right) = f(x) - f(y),$ i.e., shrinking the input by a factor of $y$ is the opposite of growing the input by a factor of $y.$
$f\left(z \cdot \frac{x}{y}\right) = f(z) + f(x) - f(y),$ i.e., growing the input by a factor of $\frac{x}{y}$ affects the output just like growing the input by $x$ and then shrinking it by $y.$

Try translating these into the communication cost interpretation if it is not clear why they're true. %%

Property 5: Exponentiating the input multiplies the output.

$f\left(x^n\right) = n \cdot f(x). \tag{5}$

This says that multiplying the input by $x$ a total of $n$ times incurs $n$ identical changes to the output. In terms of the communication cost metaphor, this is saying that you can emulate an $x^n$-digit using $n$ different $x$-digits.

Exercise: Prove (5). %%hidden(Proof): This is easy to prove when $n \in \mathbb N:$ $f\left(x^n\right) = f(\underbrace{x \cdot x \cdot \ldots x}_{n\text{ times}}) = \underbrace{f(x) + f(x) + \ldots f(x)}_{n\text{ times}} = n \cdot f(x).$

For $n \in \mathbb Q,$ this is a bit more difficult; we leave it as an exercise to the reader. Hint: Use the proof of (6) below, for $n \in \mathbb N,$ to bootstrap up to the case where $n \in \mathbb Q.$

For $n \in \mathbb R,$ this is actually not provable from (1) alone; we need an additional assumption (such as [-continuity]) on $f$ . %%

Property 5 does not follow from (1) in full generality: it is possible to construct a function $f$ that obeys (1), and obeys (5) for $n \in \mathbb Q,$ but which exhibits pathological behavior on irrational numbers. For more on this, see [ pathological near-logarithms].

This is the first place that property (1) fails us: (5) is true for $n \in \mathbb Q,$ but to guarantee that it holds for all $n \in \mathbb R,$ we need $f$ to be [continuity continuous], i.e., we need to ensure that an $f$ which follows (5) on the rationals cannot misbehave on the irrationals.

Property 6: Rooting the input divides the output.

$f(\sqrt[n]{x}) = \frac{f(x)}{n}. \tag{6}$

This says that, to change the output one $n$ th as much as you would if you multiplied the input by $x$ , multiply the input by the $n$ th root of $x$ . See Fractional digits for a physical interpretation of this fact.

Exercise: Prove (6). %%hidden(Proof): $(\sqrt[n]{x})^n = x,\text{ so }f\left((\sqrt[n]{x})^n\right)\text{ has to equal }f(x).$

$f\left((\sqrt[n]{x})^n\right) = n \cdot f(\sqrt[n]{x}),\text{ so }f(\sqrt[n]{x}) = \frac{f(x)}{n}.$ %%

As with (5), (6) is always true if $n \in \mathbb Q,$ but not necessarily always true if $n \in \mathbb R.$ To prove (6) in full generality, we additionally require that $f$ be continuous.
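
As a rough numerical illustration of (5) and (6) with rational exponents, the sketch below assumes $f$ is the natural logarithm (`math.log`) and picks a few arbitrary sample values. Every floating-point number is itself rational, so a check like this cannot probe the irrational case discussed above.

```python
import math
from fractions import Fraction


def f(x: float) -> float:
    """Stand-in for f (an assumption): the natural logarithm."""
    return math.log(x)


x = 7.0  # arbitrary positive sample input

# Equation (5) with rational exponents: f(x^n) = n * f(x).
for n in [Fraction(3), Fraction(1, 2), Fraction(5, 3), Fraction(-2, 7)]:
    assert math.isclose(f(x ** float(n)), float(n) * f(x))

# Equation (6): f(n-th root of x) = f(x) / n.
for n in [2, 3, 10]:
    nth_root = x ** (1.0 / n)
    assert math.isclose(f(nth_root), f(x) / n)
```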

Property 7: The function is either trivial, or sends some input to 1.

$\text{Either $f$ sends all inputs to $0$, or there exists a $b \neq 1$ such that $f(b)=1.$}\tag{7}$

This says that either $f$ is very boring (and does nothing regardless of its inputs), or there is some particular factor $b$ such that when the input changes by a factor of $b$ , the output changes by exactly $1$ . In the communication cost interpretation, this says that if you're measuring communication costs, you've got to pick some unit (such as $b$ -digits) with which to measure.

Exercise: Prove (7). %%%hidden(Proof): Suppose $f$ does not send all inputs to $0$ , and let $x$ be an input that $f$ sends to some $y \neq 0.$ Then $f(\sqrt[y]{x}) = \frac{f(x)}{y} = 1.$ %%note: You may be wondering, "what if $y$ is negative, or a fraction?" If so, see [ Strange roots]. Short version: $\sqrt[-3/4]{x}$ is perfectly well-defined.%%

$b$ is $\sqrt[y]{x}.$ We know that $b \neq 1$ because $f(b) = 1$ whereas, by (2), $f(1) = 0$ . %%%
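
The construction in the proof of (7) can be tried on a concrete function. The sketch below uses a hypothetical non-trivial $f$, namely three times the natural logarithm (chosen only for illustration), and recovers a base $b$ with $f(b) = 1$ by taking the $y$-th root of a sample input.

```python
import math


def f(x: float) -> float:
    """Hypothetical non-trivial f satisfying (1): three times the natural log."""
    return 3.0 * math.log(x)


x = 5.0             # any input that f does not send to 0
y = f(x)            # nonzero, since x != 1
b = x ** (1.0 / y)  # the y-th root of x, as in the proof of (7)

assert math.isclose(f(b), 1.0)    # f sends b to 1
assert not math.isclose(b, 1.0)   # and b differs from 1
```

For this particular $f,$ the recovered $b$ is $e^{1/3},$ so $f$ is the logarithm with base $e^{1/3}.$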

Property 8: If the function is continuous, it is either trivial or a logarithm.

$\text{If $f(b)=1$ then } f(b^x) = x. \tag{8}$

This property follows immediately from (5). Thus, (8) is always true if $x$ is a rational, and if $f$ is continuous then it's also true when $x$ is irrational.

Property (8) states that if $f$ is non-trivial, then it inverts exponentials with base $b.$ In other words, $f$ counts the number of $b$-factors in $x$: how many times you need to multiply $1$ by $b$ to get $x$. That is, $f = \log_b$!
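
The "counting $b$-factors" reading can be made literal for exact integer powers. The helper below, `count_factors`, is a hypothetical name introduced for this sketch; it assumes integers $b \ge 2$ and $x \ge 1$ with $x$ an exact power of $b,$ and compares the count against `math.log(x, b)`.

```python
import math


def count_factors(x: int, b: int) -> int:
    """Count how many times 1 must be multiplied by b to reach x.

    Assumes b >= 2 and x >= 1; raises ValueError if x is not an exact power of b.
    """
    if b < 2 or x < 1:
        raise ValueError("expected b >= 2 and x >= 1")
    count = 0
    value = 1
    while value < x:
        value *= b
        count += 1
    if value != x:
        raise ValueError("x is not an exact power of b")
    return count


# 2 * 2 * ... * 2 (ten times) is 1024, so the count of 2-factors is 10.
assert count_factors(1024, 2) == 10
# Reaching 1 takes zero multiplications, matching equation (2).
assert count_factors(1, 10) == 0
# The count agrees with the base-b logarithm up to floating-point error.
assert math.isclose(math.log(1024, 2), count_factors(1024, 2))
```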

Many texts take (8) to be the defining characteristic of the logarithm. As we just demonstrated, one can also define logarithms by (1) as [continuity continuous] [trivial_mathematics non-trivial] functions whose outputs grow by a constant (that depends on $y$ ) whenever their inputs grow by a factor of $y$ . All other properties of the logarithm follow from that.

If you want to remove the "continuous" qualifier, you're still fine as long as you stick to rational inputs. If you want to remove the "non-trivial" qualifier, you can interpret the function $z$ that sends everything to zero as [4c8 $\log_\infty$ ]. Allowing $\log_\infty$ and restricting ourselves to rational inputs, every function $f$ that satisfies equation (1) is isomorphic to a logarithm function.

In other words, if you find a function whose output changes by a constant (that depends on $y$ ) whenever its input grows by a factor of $y$ , there is basically only one way it can behave. Furthermore, that function only has one degree of freedom — the choice of $b$ such that $f(b)=1.$ As we will see next, even that degree of freedom is rather paltry: All logarithm functions behave in essentially the same way. As such, if we find any $f$ such that $f(x \cdot y) = f(x) + f(y)$ (or any physical process well-modeled by such an $f$ ), then we immediately know quite a bit about how $f$ behaves.

Comments

Eric Rogstad

Exercise: Given an interpretation to (4).

I'm having trouble parsing interpretation #1 -- which part is supposed to map onto the right hand side of equation (4)?

This says that the way that growing the input by a factor of $x$ changes the input is exactly the opposite from the way that shrinking the input by a factor of $x$ changes the input. In terms of the "communication cost" interpretation, if doubling (or tripling, or $n$-times-ing) the possibility space increases costs by $c$ , then halving (or thirding, or $n$-parts-ing) the space decreases costs by $c.$

output?

This says that the way that growing the input by a factor of $x$ changes the output is exactly the opposite from the way that shrinking the input by a factor of $x$ changes the output. In terms of the "communication cost" interpretation, if doubling (or tripling, or $n$-times-ing) the possibility space increases costs by $c$ , then halving (or thirding, or $n$-parts-ing) the space decreases costs by $c.$

May need to build the intuition that knowing how f(x) behaves tells me how f(c*x) is different from f(c).

(You're using the language of "growing the input," but I just see a static input called x.)

Kaya Fallenstein

(8) doesn't follow from (5). The assumption in (5) was that $n$ ranged over naturals, not reals. In fact, (1) only implies (8) if you also require the function to be continuous.

(1) essentially says $f$ is a homomorphism from $(\mathbb{R}^{>0},\cdot)$ to $(\mathbb{R},+)$ . To generate a function satisfying (1) but not (8), we need only compose $\log$ (choose a base) with an automorphism of the additive group and show that the composition is not a multiple of a logarithm. We can get such an automorphism by considering $\mathbb{R}$ as an infinite dimensional vector space over the rationals and, for example, swapping two dimensions.

Nate Soares

(5) was intended to assume that $n \in \mathbb R^{\ge 1},$ or possibly $\in \mathbb R^{\ge 0}$ if you want an easy way to prove (6). In that case, how does (8) not follow from (5)? (If $f(x^y)=yf(x)$ in general, then $f(b^n)=nf(b)$ and $f(b)=1 \implies f(b^n)=n,$ unless I'm missing something.)

The proof of (5) only goes through for $n\in\mathbb{N}$ .

You can prove a version of (8) from (5), namely, $f(b)=1\Rightarrow f(b^q)=q$ for $q\in\mathbb{Q}$ , but this doesn't pin down $f$ completely, unless you include a continuity condition.

tl;dr: I did some reading on related topics, and it turns out that (1) may be sufficient to define logarithms if we take as an axiom that every set is Lebesgue measurable (which is incompatible with the axiom of choice). Otherwise, we need to add an additional condition to (1).

(1) states that $f(x\cdot y)=f(x)+f(y)$ . Given a function $g$ satisfying this condition, we can generate an additional function satisfying this condition by composing $g$ with a function $h$ , where $h(x+y)=h(x)+h(y)$ :

$h(g(x\cdot y))=h(g(x))+h(g(y))$

$h$ , as defined, is a solution to Cauchy's functional equation. The family of functions given by $h(x)=cx$ for some constant $c$ is always a solution, giving the usual logarithm family. The existence of other solutions is independent of ZF. When they do exist they are always pathological and generate non-Lebesgue measurable sets (for more, see this stackexchange link).

We can prove the existence of such solutions in ZFC by noting that the solutions of the Cauchy functional equation are exactly the homomorphisms from the additive group of $\mathbb{R}$ to itself. The real numbers form an infinite dimensional vector space over the field $\mathbb{Q}$ . Linear transformations from the vector space to itself translate into homomorphisms from the group to itself. Since the axiom of choice implies that any vector space has a basis, we can, for example, find a non-trivial linear transformation by swapping two basis vectors. This in turn induces a homomorphism from the group to itself. (The Wikipedia page gives the general form of a solution to this functional equation; the solutions turn out to be all the linear transformations on the vector space.)

(I'm not saying that this article should discuss axiomatizations of set theory, but it doesn't seem good to make statements that are only true if you assume, e.g., an unusual alternative to the axiom of choice.)

Wikipedia proves that the pathological solutions must all be dense in $\mathbb{R}$ , so to exclude them, we can adopt any number of conditions. Wikipedia points at " $f$ is continuous", " $f$ is monotonic on any interval", and " $f$ is bounded on any interval". Continuity seems to be most intuitive; once we have defined the value of the function on the rationals (which we can do with basically the arguments already on this page), the rest of its values are determined.

How are these changes? (starting at prop 5, through the end)