Gelfands’s question asks whether there is a positive integer *n* such that the first digits of *j**n* base 10 are all the same for *j* = 2, 3, 4, …, 9. (Thanks to @republicofmath for pointing out this problem.) This post will explore Gelfand’s question via probability.

The MathWorld article on Gelfand’s question says that the answer is no for values of *n* less than 100,000. That range seemed small to me. My intuition was that you’d need to try larger values of *n* to have a reasonable chance of finding a solution.

So how big a value of *n* should you explore? To answer that question, consider picking *n* at random. If the leading digits of *j**n* were randomly distributed, what would the chances be that *n* would satisfy Gelfand’s question? Leading digits are not random, but they do often have regular distributions.

Suppose the leading digits are uniformly distributed among 1 to 9, and that the leading digits for each base are independent. Then the probability of *j**n* beginning with 1 (or any other chosen digit) for all *j* from 2 to 9 would be (1/9)8, and the probability of all leading digits matching any digit would be 9 times larger. That would say the probability of all leading digits matching would be on the order of 2 × 107, so we should try on the order of 107 or 108 values of *n* to have a decent chance of finding one.

But there’s a problem with the above argument: leading digits are not uniformly distributed. Benford’s law says that leading digits are often distributed very unevenly. When a system follows Benford’s law, the probability of a leading digit being *d* is log10(1 + 1/*d*). An uneven distribution of leading digits raises the probability that all the digits will match. If the leading digits of *j**n* follow Benford’s law, and are independent, then the probability of all matching is 6.84 × 10-5, two orders of magnitude higher than assuming a uniform distribution.

Do the leading digits of *j**n* follow Benford’s law? Yes, at least approximately, based on testing values of *n* from 1 to 1,000.

Now suppose the probability of a randomly selected value of *n* satisfying Gelfand’s question is *p* = 6.84 × 10-5. If we tried 100,000 random values of *n*, the probability of having no successes would be about 0.00107. So my intuition was wrong. If the random heuristic given here were correct, we’d stand a good chance of seeing a solution if we tried values of *n* up to 100,000.

Where does the probabilistic heuristic go wrong? In the assumption that the distributions of the leading digits are independent.

Let *x**j* be the sequence *j**n* for *n* running from 1 to 1000. Some of the *x**j* are correlated and some are not.

Some of the correlations are easy to see. For example, the sequences for *x*2 and *x*4 have correlation 0.493. That’s because 4*n* = (2*n*)2. So if you know the first digit of 2*n*, you can often guess the first digit of 4*n*. In light of this, it’s not surprising that *x*2 is also correlated with *x*8, and *x*3 is correlated with *x*9.

You might speculate that *x**j* and *x**k* are uncorrelated if and only if *j* and *k* are relatively prime. Some sequences, such as *x*2 and *x*3 support that guess. But *x*4 and *x*5 are negatively correlated, *r* = -0.433, for reasons that are not obvious. I suspect that the negative correlations are the key to understanding the question.

***

**Update**: I have verified that the answer to Gelfand’s question is no for *n* up to 109.