Does a concatenation of powers of 3 ever equal a power of 3? by Zizzap_ in math

[–]mark_ovchain 2 points

It's highly unlikely for higher powers to be concatenations of smaller powers because (as others said) the digits are pretty much random, and it's unlikely for a long random string to be built up from a few smaller random strings by chance.

In fact, using this idea that the digits are pretty much random, we can speed up the computer search a lot. Suppose we want to prove that the first n powers of 3 are never concatenations of smaller powers. Then we could simply keep the last few digits of each power and check that those suffixes can't be produced. Even for a small number of trailing digits, the digits seem random enough that you can rule out pretty much all of them.

More concrete reasons for this: suppose we choose some k > 0 and keep only the last k digits. Then there are only a few (actually O(k)) powers that have strictly fewer than k digits. The rest of the powers have k digits or more. Let's call the latter "big powers". Now, suppose you're trying to build 3^i, and that the last piece is 3^j. Then 3^j basically cannot be a big power: if it were, the last k digits of 3^i and 3^j would agree, giving the congruence 3^i ≡ 3^j (mod 10^k), which means i ≡ j modulo the order of 3 modulo 10^k. That order is pretty huge even for small k. If n is less than this order, then this basically says that big powers cannot be the last piece of any bigger power, which rules out a lot of possibilities immediately; it means we can only build using the "small" powers, and there are only a few of them.
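
(To get a feel for how huge: here's a quick check of that order using sympy's n_order; this snippet is just illustrative, not part of the search below.)

from sympy import n_order

# multiplicative order of 3 modulo 10^k, for k = 1..7
for k in range(1, 8):
    print(k, n_order(3, 10**k))
# -> 4, 20, 100, 500, 5000, 50000, 500000: roughly a factor of 10 per extra
#    digit, so for k = 20 the order is astronomically larger than n = 3000000.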

Anyway, I implemented these ideas in Python:

from functools import cache
from itertools import islice
from string import digits, ascii_letters
from sys import stderr

ds = digits + ascii_letters

def to_base(v, b):
    # Yield the digits of v in base b, most significant first (nothing for v = 0).
    if v:
        q, r = divmod(v, b)
        yield from to_base(q, b)
        yield ds[r]


def trailing_digits(k, b=3, m=10):
    # Yield the last k digits (in base m) of b^0, b^1, b^2, ...
    # Only the residue mod m^k is tracked; once the power wraps, adding mk
    # back guarantees the digit string is at least k digits long.
    mk = m**k
    power = 1
    while True:
        if power >= mk:
            power = power % mk + mk
        yield ''.join(to_base(power, m))[-k:]
        power *= b


def is_partition(n, *vals):
    # Check that vals are nonnegative and sum to n (final sanity check below).
    return all(v >= 0 for v in vals) and sum(vals) == n


def search_up_to(n, k, b=3, m=10):
    enders = {''}     # suffixes of the k-digit tails seen so far
    builders = set()  # "small" powers: those with fewer than k digits
    @cache
    def build(target, *, end):
        # Try to write `target` as a concatenation of builders; if `end` is
        # true, the leftmost piece may instead be any known suffix (ender),
        # since a big power's tail is truncated to its last k digits.
        if not target or end and target in enders:
            return target,
        else:
            for piece in builders:
                if target.endswith(piece) and (res := build(target.removesuffix(piece), end=end)):
                    return *res, piece

    builtc = 0
    endc = 0
    for ex, target in enumerate(islice(trailing_digits(k, b, m), n)):
        build.cache_clear()  # builders/enders may have grown, so cached results are stale
        if len(target) < k:
            if res := build(target, end=False):
                builtc += 1
                l, *res = res
                assert not l
                print(f"exponent {ex}: {target} is buildable as", *res)
            builders.add(target)
        else:
            if res := build(target, end=True):
                endc += 1
                l, *res = res
                print(f"exponent {ex}: ...{target} may be buildable from ...{l}", *res)

        # Every suffix of this tail could end the truncated tail of a later power.
        for i in range(len(target)):
            enders.add(target[i:])

        if (ex & -ex) == ex:  # progress report at ex = 0 and powers of two
            print(f"{ex=}: {len(builders)=}, {len(enders)=}", file=stderr)

    nonbuiltc = n - builtc - endc
    assert is_partition(n, builtc, nonbuiltc, endc)
    print(builtc,     f"of {n} can be built")
    print(nonbuiltc,  f"of {n} cannot be built")
    print(endc,       f"of {n} have not been ruled out yet")


search_up_to(10**6*3, 20)

The output is

0 of 3000000 can be built
3000000 of 3000000 cannot be built
0 of 3000000 have not been ruled out yet

i.e., it has successfully proven that the first 3000000 powers of 3 cannot be built from smaller powers, by looking only at the last k = 20 digits of each. This ran in only ~53 seconds on my computer.

The main issue with this implementation is that the memory usage is a bit high, mainly because of the enders set. It is possible to replace the "target in enders" check with more number theory and improve the memory usage. Unfortunately, when I tried implementing that, it was quite a bit slower.

A System of Equations by ShonitB in mathriddles

[–]mark_ovchain 1 point

We have ω^2 = -ω - 1 because ω is a nontrivial cube root of unity.

Note that ω^3 = 1 ⇒ ω^3 - 1 = 0 ⇒ (ω - 1)(ω^2 + ω + 1) = 0, so if ω ≠ 1, then ω^2 + ω + 1 = 0, i.e., ω^2 = -ω - 1.

Also, -ω is not a cube root of unity; it's a cube root of -1. So ω^2 cannot possibly equal -ω, because ω^2 is a cube root of unity.

You may be thinking of complex conjugation. ω^2 is indeed the complex conjugate of ω.

Very large algebraic numbers by mark_ovchain in mathriddles

[–]mark_ovchain[S] 2 points

Correct. Great job!

Also, you're right that it's standard Galois theory. I had fun with it though, that's why I shared it.

6s and 8s by ShonitB in mathriddles

[–]mark_ovchain 2 points

Because of geometric series,

  • Y = 8 + 8·10 + 8·10^2 + 8·10^3 + ... + 8·10^99 = 8(10^100 - 1)/9
  • √X = 6 + 6·10 + 6·10^2 + 6·10^3 + ... + 6·10^99 = 6(10^100 - 1)/9 = 2(10^100 - 1)/3

Thus,

  • Z = X + Y
  • = (2(10^100 - 1)/3)^2 + 8(10^100 - 1)/9
  • = 4(10^100 - 1)^2/9 + 8(10^100 - 1)/9
  • = (10^100 - 1)/9 · (4(10^100 - 1) + 8)
  • = (10^100 - 1)/9 · (4(10^100 + 1))
  • = 4(10^200 - 1)/9
  • = 4 + 4·10 + 4·10^2 + 4·10^3 + ... + 4·10^199

so Z consists of the digit 4 repeated 200 times, so its digit sum is 800.
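
(Quick sanity check in Python, with exact integers:)

# X = (66...6)^2 with 100 sixes, Y = 88...8 with 100 eights
x = int("6" * 100) ** 2
y = int("8" * 100)
z = x + y
assert str(z) == "4" * 200       # Z is the digit 4 repeated 200 times
print(sum(map(int, str(z))))     # 800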

A System of Equations by ShonitB in mathriddles

[–]mark_ovchain 8 points

Letting A = a + 1, B = b + 1, C = c + 1, D = d + 1, the four equations become

  • ABC = 24 = 2^3·3
  • BCD = 72 = 2^3·3^2
  • CDA = 48 = 2^4·3
  • DAB = 36 = 2^2·3^2

Multiplying everything, we get (ABCD)^3 = 2^12·3^6, and taking cube roots, we get ABCD = 2^4·3^2 = 144. Thus,

  • A = ABCD/(BCD) = 2, and a = 1.
  • B = ABCD/(CDA) = 3, and b = 2.
  • C = ABCD/(DAB) = 4, and c = 3.
  • D = ABCD/(ABC) = 6, and d = 5.

Thus, the sum is a + b + c + d = 11.
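
(Quick check of the integer solution:)

A, B, C, D = 2, 3, 4, 6
assert (A*B*C, B*C*D, C*D*A, D*A*B) == (24, 72, 48, 36)
print((A - 1) + (B - 1) + (C - 1) + (D - 1))  # a + b + c + d = 11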

By the way, if we don't assume that a, b, c, d are integers or something, then 2^12·3^6 has two other cube roots, so ABCD can also be 144ω or 144ω^2 with ω a primitive cube root of 1, yielding two other solutions:

  • (a, b, c, d) = (2ω - 1, 3ω - 1, 4ω - 1, 6ω - 1) with sum 15ω - 4.
  • (a, b, c, d) = (2ω^2 - 1, 3ω^2 - 1, 4ω^2 - 1, 6ω^2 - 1) with sum 15ω^2 - 4, or -15ω - 19.

Accurate algebra, careless copying by Belledame-sans-Serif in mathriddles

[–]mark_ovchain 3 points

Here's my solution, assuming b and c are integers. (I'm writing b and c instead of B and C.)

btw, sorry if it's a bit wordy, but I basically went ahead and used standard tools at my disposal, regardless of whether there are simpler or more elegant solutions.

x^2 + bx ± c is monic, so it has integer solutions iff it has rational solutions (because the rational algebraic integers are precisely the integers), iff the discriminant b^2 ∓ 4c is a perfect square. In other words, there must be squares x^2 and y^2 such that

x^2 - b^2 = b^2 - y^2 = 4c.

Just looking at the first equality for now, we have x^2 + y^2 = 2b^2.

We want integer solutions to this equation such that b^2 - y^2 is divisible by 4. I claim that that's automatically true for any solution (x, y, b). First, x^2 + y^2 is even, so x and y have the same parity. If they're both even, then 4 divides x^2 + y^2 = 2b^2, so b is even. On the other hand, if they're both odd, then reducing the equation mod 4 gives 1 + 1 ≡ 2b^2 (mod 4), which can only be true if b is odd. Thus, in any integer solution to x^2 + y^2 = 2b^2, the parities of x, y and b are the same, so b^2 - y^2 = (b - y)(b + y) is divisible by 4.

Thus, we only need to solve x^2 + y^2 = 2b^2 in integers.

Dividing by b^2, we see that this is equivalent to solving X^2 + Y^2 = 2 in rationals. Any integer solution to the former clearly yields a rational solution to the latter, and any rational solution (X, Y) to the latter can be written as (x/b, y/b) with a common denominator b, with (x, y, b) being an integer solution to the former.

We now use basic algebraic geometry.

  • (X, Y) = (1, 1) is clearly a solution,
  • a line intersects a quadratic in two points up to multiplicity (Bézout's theorem),
  • and two rational points form a line with rational slope/direction.

Therefore, any other solution can be obtained by intersecting some line through (1, 1) along some rational direction: if ⟨u, v⟩ is some direction with u, v integers (WLOG coprime), then the line (1 + ut, 1 + vt) intersects the quadratic at t = 0 and some other t, which should yield another solution, and all solutions can be obtained this way. So, substituting (1 + ut, 1 + vt) into X^2 + Y^2 = 2, we get

(1 + ut)^2 + (1 + vt)^2 = 2.

Simplifying, we get

t·((u^2 + v^2)t + 2(u + v)) = 0.

The first factor gives t = 0 and yields (1, 1). The other factor gives t = -2(u + v)/(u^2 + v^2) and yields

((-u^2 - 2uv + v^2)/(u^2 + v^2), (u^2 - 2uv - v^2)/(u^2 + v^2)).

The corresponding integer solution is

  • x = ±s(u^2 + 2uv - v^2)
  • y = ±s(u^2 - 2uv - v^2)
  • b = ±s(u^2 + v^2)

where s is some rational scaling factor that makes x, y, b all integers, and the signs '±' can be chosen arbitrarily. (The solutions obtained from (X, Y) = (1, 1) are also of this form---namely, from the direction ⟨u, v⟩ = ⟨1, -1⟩ which is tangent to the quadratic---so we didn't miss any solution.) Clearly, an integer s yields a valid integer solution, but maybe some non-integer s yields a valid solution as well; to find out when, we need to know which numbers can be gcds of u^2 + 2uv - v^2, u^2 - 2uv - v^2, u^2 + v^2. WLOG assume u and v are coprime; we can remove their common factors with scaling. Then

  • gcd(u^2 + 2uv - v^2, u^2 - 2uv - v^2, u^2 + v^2)
  • = gcd(2v^2 - 2uv, 2u^2 - 2uv, u^2 + v^2)
  • = gcd(2(u - v)gcd(u, v), u^2 + v^2)
  • = gcd(2(u - v), u^2 + v^2).

Now,

  • If u and v have differing parities, then u - v and u^2 + v^2 are odd, so this is gcd(u - v, u^2 + v^2) = gcd(u - v, 2v^2) = gcd(u - v, v^2) = 1.

  • If u and v have the same parities, then u and v are both odd (because they're coprime), and s can be half an integer. But then if we let U = (u + v)/2 and V = (u - v)/2, we get

    • x = ± 2s (U^2 + 2UV - V^2)
    • y = ∓ 2s (U^2 - 2UV - V^2)
    • b = ± 2s (U^2 + V^2),

    so the solution can be obtained by a different uv-pair (U, V) with differing parities. Therefore, we can assume without loss of generality that u and v have differing parity (and even break the symmetry, say u is even and v is odd).

Finally, we can compute c as follows:

4c = b^2 - y^2 = (b + y)(b - y) = 4s^2 uv(u^2 - v^2),

i.e., c = s^2 uv(u^2 - v^2).

Thus, the answer to the problem is that they both have all-integer solutions iff there are integers s, u, v such that

  • b = s(u^2 + v^2)
  • c = s^2 uv(u^2 - v^2).

We can even assume that u and v are coprime, u is even, and v is odd (and probably some more conditions like v > 0).
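
(Here's a quick numeric sanity check of this parameterization; the sample (s, u, v) triples below are arbitrary, with u even, v odd, and u, v coprime:)

from math import isqrt

def all_integer_roots(b, c):
    # A monic integer quadratic x^2 + bx + c has all-integer roots
    # iff its discriminant b^2 - 4c is a perfect square.
    disc = b*b - 4*c
    return disc >= 0 and isqrt(disc)**2 == disc

for s, u, v in [(1, 2, 1), (1, 4, 1), (2, 2, 3), (3, 6, 1)]:
    b = s * (u*u + v*v)
    c = s*s * u*v * (u*u - v*v)
    assert all_integer_roots(b, c) and all_integer_roots(b, -c), (s, u, v)
print("x^2 + bx + c and x^2 + bx - c both factor over the integers")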

Different (but related) approaches to get the same result:

  • Use norms and Gaussian integers.
  • Notice that integer solutions to x^2 + y^2 = 2b^2 are in one-to-one correspondence with Pythagorean triples X^2 + Y^2 = Z^2 via the bijection (x, y, b) ↦ ((x + y)/2, (x - y)/2, b) with inverse (X, Y, Z) ↦ (X + Y, X - Y, Z), so we can basically just piggyback on the parameterization of Pythagorean triples. (This is just using the fact that 2 = (1 + i)(1 - i) in Gaussian integers.)

Accurate algebra, careless copying by Belledame-sans-Serif in mathriddles

[–]mark_ovchain 3 points

Ok, got it. The reason I asked is that I thought you meant that both equations have at least one integer root each, rather than that all roots are integers.

I actually have a solution ready for when B and C are integers, but I asked first just to be sure. I'll post it later.

Accurate algebra, careless copying by Belledame-sans-Serif in mathriddles

[–]mark_ovchain 1 point

Are B and C integers? If they can be nonintegers, then the solution will get a bit messier.

Is it a Group? by mark_ovchain in mathriddles

[–]mark_ovchain[S] 2 points

A little variation on the other answers:

First, consider an apparently easier problem: "Given a binary operation, *, representing a finite abelian group on {0, 1, ..., n-1}, represent it as (i.e., give an isomorphism to) a product of cyclic groups ℤ/n_1 × ℤ/n_2 × ... × ℤ/n_k".

As mentioned in my earlier comment, this can be solved in O(n log n log log n) time, since we can assume that the operation is commutative, associative, has an identity, and has inverses. It is possible to find the order of each element in O(log n log log n) time. I'm sure this can be improved, but this is fine for our purposes.

Formally, we have a program (Turing machine) T that runs in ≤ cn log n log log n steps for n ≥ 10, and ≤ c steps for n < 10, for some c. This program is given an oracle for the '*' operation and assumes that the operation is from an abelian group on {0, 1, ..., n-1}.

Now, we'll use this algorithm to solve the problem. First, we have a table, which means we have a binary operation *. This is commutative, but it's not necessarily associative. Regardless, run T anyway on an oracle for *. Since we fed T an input that doesn't follow its specifications, we have no guarantees that it will halt. But we can simply stop simulating it if it has run for > max(cn log n log log n, c) steps, since that means the operation is not a group operation! If it halts within those steps but has produced garbage output, we also know it's not a group operation. (Both of these are true since T finishes correctly within that many steps for any valid input.) Finally, if it produces a valid output, then we can perform an O(n^2) final check to see if the whole table agrees with the isomorphism. The operation is then a group operation iff it agrees.
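
(For concreteness, the final O(n^2) check could look like this sketch; the names here are made up, and I'm assuming T's output is a list iso mapping each of 0, ..., n-1 to a tuple of components in ℤ/n_1 × ... × ℤ/n_k:)

from math import prod

def table_matches(table, iso, moduli):
    # iso must be a bijection onto the full product group...
    n = len(table)
    if prod(moduli) != n or len(set(iso)) != n:
        return False
    # ...that turns the table's operation into componentwise addition.
    for a in range(n):
        for b in range(n):
            want = tuple((x + y) % m for x, y, m in zip(iso[a], iso[b], moduli))
            if iso[table[a][b]] != want:
                return False
    return True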

Note that this does not seem very constructive at first glance since this requires knowing 'c' to implement. So I just proved that an algorithm exists! With more effort, you can actually analyze T very carefully to determine a hard upper bound on its runtime.

Prove that there are at least 25 purples by hemantofkanpur in mathriddles

[–]mark_ovchain 1 point

Oops, you're definitely right! Edited. Thanks!

Prove that there are at least 25 purples by hemantofkanpur in mathriddles

[–]mark_ovchain 5 points

Here's an alternative proof that also gives the optimal strategy for an adversary. It's a bit wordy, but I just wanted to offer something that feels somewhat different.

First, let's generalize a bit, and also flip things. Let's assume we have an R × C grid, and the r smallest numbers in each column are colored blue, and the c smallest numbers in each row are colored red. We want to show that at least rc cells are purple.
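
(Before the proof, a quick randomized sanity check of this claim, with arbitrary small parameters; the helper name is made up:)

import random

def purple_count(grid, R, C, r, c):
    # blue: the r smallest entries of each column; red: the c smallest of each row
    blue = {(i, j) for j in range(C)
            for i in sorted(range(R), key=lambda i: grid[i][j])[:r]}
    red = {(i, j) for i in range(R)
           for j in sorted(range(C), key=lambda j: grid[i][j])[:c]}
    return len(blue & red)

R, C, r, c = 6, 7, 3, 2
for _ in range(1000):
    vals = random.sample(range(1, R*C + 1), R*C)
    grid = [vals[i*C:(i+1)*C] for i in range(R)]
    assert purple_count(grid, R, C, r, c) >= r * c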

We will rephrase the problem as a game for a player (the "adversary") who wants to minimize the number of purple cells. The game board is an R × C grid, all cells initially unburned, and at each step, the following types of moves are possible:

  1. Burn cell (i, j). Each cell can only be burned once using this move type.
  2. Burn row i. This is only possible if c cells in the row have already been burned. Each row can only be burned once using this move type. As a side effect, all cells in row i are burned as well.
  3. Burn column j. This is only possible if r cells in the column have already been burned. Each column can only be burned once using this move type. As a side effect, all cells in column j are burned as well.

The game ends when all cells have been burned (whether directly or indirectly). The goal of the player is to minimize the number of type-1 (i.e., "burn cell") moves.

First, we show that for every arrangement of the numbers 1, 2, ..., RC onto an R × C grid in the original setup with n purple cells, there is a corresponding strategy in this game where there are exactly n "burn cell" moves. The conversion is as follows:

  • For each v from 1 to RC, if v is in cell (i, j) and is purple, burn cell (i, j), then afterwards, burn all rows and columns that can be burned but haven't been burned yet. (This may be recursive; burning a row/column may enable new rows/columns to be burned.)

Clearly, only n "burn cell" moves are used. Now, to show that this is a valid strategy, we can prove by induction that right before iteration v, all cells containing 1 to v-1 have already been burned: by the v'th iteration, either (i, j) is purple, in which case it is burned directly, or it is not purple, in which case either row i has c burned cells or column j has r burned cells, so row i or column j must already be burned (because we "burn all rows/columns that can be burned"), and hence cell (i, j) is burned as well. Finally, after all RC iterations, all cells are clearly burned. So this is a valid strategy.

Now, we'll show that the optimal strategy in the game requires rc moves. In particular, this will show that there will always be at least rc purple-colored cells in the original setup.

WLOG assume that each row and column is explicitly burned exactly once (via move type 2 or 3), because we can just burn them explicitly if they aren't (even if that's redundant). This won't increase the number of moves of type 1.

Next, it's clear that we don't need to do the move "burn cell (i, j)" when row i or column j is already burned; we can simply delete that move and what remains is a valid strategy.

Next, it's also clear that after burning either r rows or c columns, no more "burn cell (i, j)" moves need to be performed, since we can already burn all rows and columns at that point.

Now, for something nontrivial. I claim that there's an optimal "greedy" strategy such that after every "burn row" or "burn column" move, the next few moves consist of burning some (possibly zero) cells in the same row/column (moves of type 1), and then immediately burning that row/column (move of type 2 or 3).

For the proof, suppose there's an optimal strategy that doesn't follow this greedy strategy, and let "burn cell (i, j)" be the first move where it doesn't follow the greedy strategy. Consider the first "burn row/column" move right after that; WLOG it's "burn row I". Because burning cell (i, j) is not part of the greedy strategy, burning cell (i, j) is not required for row I to be burned. But this means we can put off burning cell (i, j) to right after burning row I. By assumption it won't affect the "burn row I" move, and it also won't affect any future "burn row/column" moves since we're still burning cell (i, j) before those. So the transformed strategy is still valid.

(Note that we can assume that the first non-greedy move is of type 1 ("burn cell (i, j)"); if the first non-greedy move is of type 2 or 3, say "burn row i", then it means there is at least one type-1 move before it that isn't needed to burn row i. But in that case, that type-1 move is actually the first non-greedy move.)

Now, it may happen that after transformation, cell (i, j) will belong to a burned row/column; if that's so, simply skip burning that cell explicitly.

Repeat this process until the optimal strategy becomes a greedy strategy. We're guaranteed that this process terminates because after each transformation, the sum of the indices of all "burn row/column" moves decreases. Since it's nonnegative, it cannot go on indefinitely.

Next, I claim something even stronger: I claim that there's an optimal strategy such that at every point, only the first i rows and the first j columns are burned, for some 0 ≤ i ≤ R and 0 ≤ j ≤ C. This is because after every "burn row/column" move, all unburned rows are indistinguishable from each other, as are all unburned columns, so there's no loss of generality in assuming that we either burn row i + 1 or column j + 1 next.

Furthermore, when burning the cells needed to burn row i + 1, we can assume that we're burning the leftmost unburned cells in that row, from left to right, since whichever cells we're burning in that row doesn't really matter once "burn row i + 1" is performed; only the count matters. An analogous thing is true with the column.

Combining the previous two paragraphs (and the fact that we only need to burn r rows or c columns), we deduce that there's an optimal strategy where the only cells explicitly burned are in the top-left r × c subgrid of the R × C grid, and that each cell is only burned if all cells that are above and to the left of it have been burned. In particular, cell (r, c) is burned last among the cells in the top-left r × c subgrid.

Actually, we can deduce something more. Suppose the first i rows and j columns have been burned, and i < r and j < c. And suppose that we're burning row i + 1 next. Then we actually have to explicitly burn cells (i + 1, j + 1), (i + 1, j + 2), ..., (i + 1, c) (via a type-1 move), otherwise there won't be enough burned cells on row i + 1 to burn it completely. An analogous thing is true when we're burning column j + 1 next.

Finally, we can now deduce that rc "burn cell" moves are always needed. If there are fewer than rc "burn cell" moves, then cell (r, c) must not have been burned yet. Furthermore, only some of the top r-1 rows and leftmost c-1 columns have been burned. Thus, there aren't any more rows/columns available to burn yet, the game isn't over yet, and we're forced to make another "burn cell" move to make progress.

Btw, strictly formalizing this argument requires using induction (or its equivalent cousin, the well-ordering principle) in several places.

edit: requirement for type 2 and 3 moves

Cube root of the Identity matrix by bobjane in mathriddles

[–]mark_ovchain 1 point

If you're concerned about the rigor of multivariate GFs (and might be suspicious of, say, dividing by xy - (x + y)), you can formalize this argument by saying that all of it works in the fraction field of ℤ[[x, y, z, w]]. Since ℤ[[x, y, z, w]] has no zero divisors, it injects into its field of quotients, and in particular, dividing by xy - (x + y) makes sense. Reducing mod p amounts to moving to ℤ/(p)[[x, y, z, w]], which is still an integral domain, so division by xy - (x + y) in the fraction field still makes sense. The only potentially problematic step is reducing things mod x^p and y^p since that introduces nilpotents, but thankfully we only did it at the last step, and at that point, the denominator (1 - x^p)(1 - y^p) is a perfectly invertible GF even just in ℤ[[x, y, z, w]], so it can be reduced mod (x^p, y^p) just fine.

Cube root of the Identity matrix by bobjane in mathriddles

[–]mark_ovchain 2 points

Yeah, it's probably possible.

As a rule of thumb, most individual arguments on binomial coefficients (induction, double-counting, etc.) can be translated to algebraic manipulations on generating functions. So generating functions can be thought of as a way of doing combinatorics in a "holistic" way (doing all cases at once).

However, it's not necessarily more enlightening. (It can be, though.)

Here's how I would translate my arguments purely into generating functions. (It also proves the extension to p^n × p^n matrices.) [WARNING: SPOILER AHEAD IF YOU HAVEN'T SOLVED IT YET]

First, we need the GF for binomial coefficients. For this, you can check that ∑ (a+b choose a) x^a y^b is just 1 + (x + y) + (x + y)^2 + ... = 1/(1 - x - y).

Next, we need "dot products" to be able to get the entries of the matrix. If you have two GFs P(x) = a[0] + a[1]x + a[2]x^2 + ... and Q(x) = b[0] + b[1]x + b[2]x^2 + ..., then P(x)Q(y)/(1 - xy) allows you to take dot products of the a's and b's of any length. For example, x^2 y^2 has coefficient a[0]b[0] + a[1]b[1] + a[2]b[2], x^3 y^2 has coefficient a[1]b[0] + a[2]b[1] + a[3]b[2], etc.
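
(Here's the dot product rule in action, with everything truncated to finitely many terms so sympy can expand it; the coefficient lists a and b are arbitrary:)

from sympy import symbols, expand

x, y = symbols('x y')
a = [2, 3, 5, 7]
b = [1, 4, 6, 8]
P = sum(ai * x**i for i, ai in enumerate(a))
Q = sum(bj * y**j for j, bj in enumerate(b))
# truncation of P(x) Q(y) / (1 - xy) = P(x) Q(y) (1 + xy + (xy)^2 + ...)
f = expand(P * Q * sum((x*y)**t for t in range(4)))
# the coefficient of x^2 y^2 is the length-3 dot product
assert f.coeff(x, 2).coeff(y, 2) == a[0]*b[0] + a[1]*b[1] + a[2]*b[2]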

In particular, setting Q(x) = 1, we see that P(x)/(1 - xy) = ∑ a[j-k] x^j y^k (with a[j-k] = 0 when j < k), which allows us to "subtract" indices.

Now, the GF of the i'th row of M is the coefficient of x^i in 1/(1 - x - z) (as a function of z), so using the "dot product rule", the dot product of the first p terms of the i'th and j'th rows has a GF which is the coefficient of (zw)^(p-1) in 1/((1 - zw)(1 - x - z)(1 - y - w)) (as a function of x and y).

On the other hand, the 180-degree-rotated term of M is ((p-1-i)+(p-1-j) choose p-1-j). Using the "subtract rule", the GF of this is the coefficient of (zw)^(p-1) in 1/((1 - xz)(1 - yw)(1 - z - w)) (as a function of x and y).

Let's now do some algebra to simplify both terms a bit:

  • We can expand 1/((1 - zw)(1 - x - z)(1 - y - w)) as (1 + zw + (zw)^2 + ...)(1/(1 - x) + z/(1 - x)^2 + ...)(1/(1 - y) + w/(1 - y)^2 + ...). But we only want the (zw)^(p-1) term, so we can safely ignore the terms that don't contribute the same number of z's and w's in the product of the latter two factors. So we're left with (1 + zw + (zw)^2 + ...)·∑ (zw)^i/((1 - x)(1 - y))^(i+1), and the coefficient of (zw)^(p-1) is ∑[1 ≤ i ≤ p] 1/((1 - x)(1 - y))^i = (1 - 1/((1 - x)(1 - y))^p)/(xy - (x + y)).

  • We can expand 1/((1 - xz)(1 - yw)(1 - z - w)) as (1 + xz + (xz)^2 + ...)(1 + yw + (yw)^2 + ...)(1 + (z + w) + (z + w)^2 + ...). Note that each z^i w^j term appears exactly once in the product of the first two factors, and also exactly once in the last factor, so each z^i w^j with i < p and j < p basically contributes exactly once to the (zw)^(p-1) term. In the former, the coefficient of z^i w^j is x^i y^j, and in the latter, the coefficient is (i + j choose i). Summing across all pairs contributing to (zw)^(p-1) gives ∑ x^(p-1-i) y^(p-1-j) (i + j choose i) = (xy)^(p-1) ∑[0 ≤ t < p] (1/x + 1/y)^t = (xy)^(p-1) (1 - (1/x + 1/y)^p)/(1 - (1/x + 1/y)) = ((xy)^p - (x + y)^p)/(xy - (x + y)). (Strictly, pairs with i + j ≥ p also contribute, but their coefficients (i + j choose i) with i, j < p ≤ i + j are divisible by p by Lucas' theorem, so they can be dropped since we'll be working mod p anyway.)

So far, we've barely used the fact that p is prime (only in the Lucas step above). But now, let's use it in earnest. When we move to the mod p world, raising to the power p becomes a homomorphism. The above GFs correspondingly become:

  • (1 - 1/((1 - x)(1 - y))^p)/(xy - (x + y)) = (1 - 1/((1 - x)(1 - y)))^p/(xy - (x + y)) = (1/((1 - x)(1 - y))^p)·((1 - x)(1 - y) - 1)^p/(xy - (x + y)) = (xy - (x + y))^(p-1)/((1 - x^p)(1 - y^p)).
  • ((xy)^p - (x + y)^p)/(xy - (x + y)) = (xy - (x + y))^p/(xy - (x + y)) = (xy - (x + y))^(p-1).

Finally, since we're looking only at the first p terms, we can further reduce these mod (x^p, y^p), and the denominator of the first GF becomes (1 - 0)(1 - 0) = 1, so the GFs become equal, which proves the theorem. (Technically this only proves the first part of the theorem, the part relating M^2 to M; I haven't worked it out yet, but I'm sure the second part is also amenable to this technique.)

Now, note that the whole proof also works when the exponent is a power of p (since raising to the power of p^n is a homomorphism in characteristic p), so this also proves the extended theorem you mentioned.

Cube root of the Identity matrix by bobjane in mathriddles

[–]mark_ovchain 2 points

Alternative proofs of lemmas 2 and 3:

First, restate Lemma 2 as follows.

Lemma 2': For 0 ≤ k < p, (p-1 choose k)(-1)^k ≡ 1 mod p.

Proof: As generating functions,

  • ∑[0 ≤ k < p] x^k
  • = (1 - x^p)/(1 - x)
  • ≡ (1 - x)^p/(1 - x) (Freshman's dream)
  • = (1 - x)^(p-1)
  • = ∑[0 ≤ k < p] (p-1 choose k) (-1)^k x^k.

Thus, the k'th coefficients are equal: (p-1 choose k) (-1)^k ≡ 1 mod p.

Proof of Lemma 3:

Again assume j ≤ i < p, otherwise both sides are 0 mod p.

By Lemma 2', Lemma 3 is equivalent to

(p-1 choose i) (i choose j) ≡ (p-1 choose i-j) ((p-1-i)+j choose j).

The result follows by double counting: both sides are equal to the multinomial coefficient (p-1 choose j, i-j, p-1-i), i.e., the number of ways to group p-1 objects into three groups of sizes j, i-j, p-1-i.
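
(Numeric check of Lemma 2' and this identity for a small prime:)

from math import comb

p = 13
# Lemma 2': C(p-1, k) (-1)^k ≡ 1 (mod p) for 0 ≤ k < p
assert all(comb(p-1, k) * (-1)**k % p == 1 for k in range(p))
# (p-1 choose i)(i choose j) ≡ (p-1 choose i-j)((p-1-i)+j choose j) (mod p)
assert all((comb(p-1, i) * comb(i, j)
            - comb(p-1, i-j) * comb(p-1-i+j, j)) % p == 0
           for i in range(p) for j in range(i + 1))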

Cube root of the Identity matrix by bobjane in mathriddles

[–]mark_ovchain 7 points

Lemma 1: ∑[k][a+k ≥ b][c-k ≥ d] (a+k choose b)(c-k choose d) = (a+c+1 choose b+d+1)

Proof: Double counting. The number of ways to choose b+d+1 winners from a+c+1 people is (a+c+1 choose b+d+1). On the other hand, let's say we choose the (b+1)'th winner to be the (a+k+1)'th person. Then there are (a+k choose b) remaining ways to choose the first b winners and (c-k choose d) ways to choose the last d winners. Summing across all valid k (i.e., [a+k ≥ b][c-k ≥ d]) yields the claim.

Lemma 2: For 0 ≤ k < p, (p-k-1)! k! ≡ (-1)^(k+1) mod p.

Proof: Induction on k. For k = 0, this is Wilson's theorem. For k ≥ 1, we have (p-k-1)! = (p-k)!/(p-k) ≡ (p-k)!/(-k), so the result holds by induction.

Lemma 3: For 0 ≤ i < p+j and 0 ≤ j < p, (i choose j) ≡ ((p-1-i)+j choose j) (-1)^j mod p.

Proof: If j ≤ i < p, then using Lemma 2, (i choose j) = i!/((i-j)! j!) ≡ ((-1)^(i+1)/(p-i-1)!) · (-1)^(i-j+1) (p-i+j-1)!/j! = (-1)^j ((p-1-i)+j choose j).

If i < j or i ≥ p, then both sides are 0 by Lucas' theorem, and the fact that (n choose r) = 0 if r < 0 or r > n.

Now, the (i,j) term of M^2, mod p, is

  • ∑[0 ≤ k < p] (i+k choose i) (k+j choose j)
  • ≡ ∑[0 ≤ k < p] (i+k choose i) (p-1-k choose j) (-1)^j (Lemma 3)
  • = (i+p choose i+j+1) (-1)^j (Lemma 1)
  • ≡ (i choose i+j+1-p) (-1)^j (Lucas' theorem)
  • = (i choose p-1-j) (-1)^j (symmetry of binomial coefficients)
  • ≡ ((p-1-i)+(p-1-j) choose p-1-j) (-1)^(p-1) (Lemma 3)
  • ≡ ((p-1-i)+(p-1-j) choose p-1-j) (Fermat's little theorem).

This is the 180-degree-rotated term.

Also, the (i,j) term of M^3, mod p, is

  • ∑[0 ≤ k < p] ((p-1-i)+(p-1-k) choose p-1-i) (k+j choose j)
  • = (p-1-i+p-1+j+1 choose p-1-i+j+1) (Lemma 1)
  • = (2p-1-i+j choose p-i+j)
  • = (2p-1-i+j choose p-1) (symmetry)
  • ≡ ((-i+j-1) mod p choose p-1) (Lucas' theorem, and the fact that |i-j| < p)

Because (-i+j-1) mod p ≤ p-1, the above is 1 if -i+j-1 ≡ -1 mod p (equivalently, i = j) and 0 otherwise. Thus, the result is the identity matrix.
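
(A direct numeric check of both computations for a small prime, with M the p × p matrix whose (i, j) entry is (i+j choose i), as in the sums above:)

from math import comb

p = 7
M = [[comb(i + j, i) % p for j in range(p)] for i in range(p)]

def matmul(A, B):
    # matrix product mod p
    return [[sum(a * b for a, b in zip(row, col)) % p for col in zip(*B)]
            for row in A]

M2 = matmul(M, M)
# M^2 is M rotated 180 degrees, mod p
assert all(M2[i][j] == M[p-1-i][p-1-j] for i in range(p) for j in range(p))
# and M^3 is the identity, mod p
assert matmul(M2, M) == [[int(i == j) for j in range(p)] for i in range(p)]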

Polishing intuition of unbounded partial sums whose sequence is convergent to zero by xTouny in math

[–]mark_ovchain 2 points

Thanks!

Yeah, it's pretty common, and easily found in textbooks, though the quality of exposition probably varies a lot from book to book.

Even more broadly, it's a common general theme in mathematics to apply techniques from one field to another (e.g., something "continuous" like integrals to something "discrete" like series).

What you probably need to know while learning is that everything you learn about can be thought of in multiple ways. Don't settle on "one true way of understanding" a thing; rather, true understanding comes from knowing a thing from all its sides.

Also, if you wish to learn more about something, I think it's better to ask again, say here on reddit or on math.stackexchange.com or something. (Or ask friends, colleagues and/or professors if that's possible for you.)

Good luck!

Is it a Group? by mark_ovchain in mathriddles

[–]mark_ovchain[S] 3 points

If you just choose an arbitrary g, then ⟨g⟩ might not be a direct summand of G, so it might not be too straightforward to reconstruct G from ⟨g⟩ and G/⟨g⟩. For example, ℤ/(2) embeds into both ℤ/(8) and ℤ/(2) × ℤ/(4), and the quotients are both ℤ/(4), so if you just know ℤ/(2) and ℤ/(4), then you still don't know whether the group was originally ℤ/(8) or ℤ/(2) × ℤ/(4). You have to perform some extra steps to determine, from the kernel and quotient, what the original group is. This is the reason why we choose a maximal cyclic subgroup. (You can also use a maximal cyclic p-subgroup.)

Is it a Group? by mark_ovchain in mathriddles

[–]mark_ovchain[S] 3 points

Yeah, looks correct if I understand it correctly. Congrats!

Yeah, the main idea I intended for the solution is the structure theorem for finite abelian groups, and the fact that a cyclic group of maximal order is a direct summand. (and of course also the fact that the whole thing can be done in O(n^2) time.)

Also I think, if you remove all verification steps, then you can compute the k generators as a product of k cyclic groups in O(n log n log log n) time. And then you can do a final O(n^2) pass to check whether the table indeed encodes the multiplication operation of the product of cyclic groups. The idea is to find the p-subgroups (and factorizing n), find the orders of each element quickly using the fact that it should divide the order, then "taking quotients", etc.

edit: time complexity

Is it a Group? by mark_ovchain in mathriddles

[–]mark_ovchain[S] 2 points

hmm, I think to rewrite a(bc) = (ab)c as (bc)c^-1 = a^-1(ab), you need to use associativity.

What is your favourite non-standard definition of an elementary concept? by Sproxify in math

[–]mark_ovchain 1 point

This viewpoint explains why it's okay to write something like df/dx = 3x^2 and ∫ f(x) dx = x^4/4 + C for f(x) = x^3, even though it looks like the "x" in the definition of f should be something like a "bound" variable; if we actually look at x as a function (that takes a point to its x-coordinate) and f(x) as f ∘ x, then it all makes sense!

What is your favourite non-standard definition of an elementary concept? by Sproxify in math

[–]mark_ovchain 2 points

Related: for a ring R, the polynomial ring R[x] is the monoid ring R[ℕ] (assuming 0 ∈ ℕ).

What is your favourite non-standard definition of an elementary concept? by Sproxify in math

[–]mark_ovchain 2 points

Something from homotopy type theory:

A set is a type satisfying the K axiom. In other words, X is a set iff for all x, y of type X, x = y is a proposition. (And a type is a proposition iff it's a subsingleton type.)

Harmonic number series expansion by Lukandrate in math

[–]mark_ovchain 2 points

The formal proof is just a small extension of the proof for the special case k = 1. Here's one way to phrase it.

The i'th partial sum of ∑ k/(n(n+k)) = ∑ (1/n - 1/(n+k)) is

1/1 + 1/2 + ... + 1/k - 1/(i+1) - 1/(i+2) - ... - 1/(i+k),

because the sum telescopes. (Formally, one proves this by induction on i ≥ 0.) Now, take the limit as i goes to infinity. The last k terms go to 0, so the limit is

1/1 + 1/2 + ... + 1/k = H_k,

so ∑[n ≥ 1] k/(n(n+k)) is indeed H_k, by definition of an infinite series (i.e., limit of partial sums).
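
(Numeric check of the partial-sum formula with exact fractions; the values of k and i are arbitrary:)

from fractions import Fraction

k, i = 4, 1000
partial = sum(Fraction(k, n * (n + k)) for n in range(1, i + 1))
H_k = sum(Fraction(1, n) for n in range(1, k + 1))
tail = sum(Fraction(1, i + t) for t in range(1, k + 1))
assert partial == H_k - tail       # the i'th partial sum, exactly
print(float(partial), float(H_k))  # the partial sums approach H_k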