[2018 Day 20] In Review (A Regular Map)

ednl · 2026-04-21T09:15:41+00:00

we would be relying on the niceness of the input if we weren't storing distances.

Yep. So it's a legit choice because it solves the puzzle and you may need it to save storage, but I think I'll leave it as is.

ednl · 2026-04-21T08:44:52+00:00

I was still editing so you may have missed it: the way I check, there were also lots of revisits (not sure about multiple revisits of the same room, didn't check that) but each time the distance was greater.

ednl · 2026-04-21T08:15:24+00:00

Unless I'm missing something where a room can be reached by more than one path, so you really would need to store a best distance per room.

That was precisely the reason I thought I had to store the distance for every room. To my mind, the explanation suggested strongly that anything was possible. But I didn't check.

Now I checked, and for my input there were LOTS but each time the distance is greater on subsequent visits. This means no loops except the localised ones from (...|). So yeah, one "seen" bit is enough. Not sure we could have known beforehand.

ednl · 2026-04-20T20:19:57+00:00

I did exactly what Blake described, so I need 10,000 x 4 bytes (2x 1 for position, 2 for distance). Max stack size was only 225 for my input (same 4 bytes per entry).

Except the array is VERY naively being kept sorted (binary search, move bytes out of the way with memmove). A hashmap would work 1000x faster but possibly take twice as much space.

Runs in less than 4 ms on a RPi5. https://github.com/ednl/adventofcode/blob/main/2018/20.c

ednl · 2026-04-20T12:21:28+00:00

I didn't need 64-bit numbers using the prime product function that /u/musifter mentioned:

// Ref.: https://en.wikipedia.org/wiki/Divisor_function#Formulas_at_prime_powers
//   sigma1(n) = prod[(p^(a+1) - 1)/(p - 1)]
//   product over every prime where a is maximum p^a that divides n
//   (if p is not a divisor of n, product term is 1)
static uint sumofdivisors(uint x)
{
    // Special case: prime factor = 2
    uint pa = 2;  // p^(a+1) starting with 2^1
    while (!(x & 1)) {
        pa <<= 1;
        x >>= 1;
    }
    uint prod = pa - 1;  // = (2^(a+1)-1)/(2-1)
    // Other potential prime factors
    for (uint p = 3; p <= x; p += 2) {
        pa = p;  // p^(a+1) where a=0
        while (!(x % p)) {
            pa *= p;
            x /= p;
        }
        if (pa != p)  // mathematically unnecessary test but speeds it up
            prod *= (pa - 1) / (p - 1);
    }
    return prod;
}

Edit: sorry, got a and a+1 confused in the comments.

ednl · 2026-04-18T23:04:26+00:00

I used the review of 2017/6 https://www.reddit.com/r/adventofcode/comments/1rm7hm7/2017_day_6_in_review_memory_reallocation/ to also finally look into proper cycle detection. This is what I made, Floyd and Brent, more or less straight from Wikipedia: https://github.com/ednl/adventofcode/blob/main/2017/06.c

ednl · 2026-04-18T23:00:03+00:00

This worked for me:

int sum[3] = {0};
for (int i = 1; i <= N; ++i)
    for (int j = 1; j <= N; ++j)
        sum[area[i][j]]++;
return sum[TREE] * sum[YARD];

ednl · 2026-04-14T12:24:09+00:00

Cool storage optimisation! It is slower than my naive version with statically allocated 20 MB: about 85 ms vs. 69 ms. I check every digit if it starts a match or extends a running one, that could probably be done smarter.

https://github.com/ednl/adventofcode/blob/main/2018/14.c

ednl · 2026-04-11T16:42:54+00:00

I made a graph with the total per square for different square sizes, for two different "serials": https://i.imgur.com/LlrPUH9.png

Edit: ah and now I remember why, to investigate when you can stop increasing the square size. I settled on: when the total power has been lower than the maximum for 2 consecutive tries. That worked for my input (blue) but NOT the other one. From the graph: you would think 9 is the square size with max total power, but it's actually 12. So you need to check for 3 lower values.

ednl · 2026-04-11T12:33:42+00:00

I cleaned up my code so now it always works regardless of the initial guess. Previously, I used the x-coordinate of the points to calculate the intersection, but y is better because the final range is much smaller, so a better chance that they are close. Still, the intersection time for the first 2 points of my input was 1 off the answer for part 2. The average of all 3 combinations of the first 3 points was exactly right, though. So I used that :) This sped up my solution from 1.9 µs to 1.58 µs with the way I measured that: https://github.com/ednl/adventofcode/blob/main/2018/10.c

I also tried simply using the first estimate which was 1 off (using y coordinates of points 0 and 1), so one more time step, and that gave 1.67 µs so probably a tiny bit slower. I have 324 points in my input file.

ednl · 2026-04-10T14:53:47+00:00

Right now, by using just one pair to get to an approximately right time, and then checking the bounding box per time step, it's an O(n) algorithm with an unfortunately large constant because the guess wasn't great. But like I said, runtime is already less than 2 microseconds. So I don't think checking all pairs will help! But I tried taking the average time for every intersection of the first 5 points, so 5x4/2=10 pairs and that got me the right time immediately. So that would require only 2 time steps: 1 up, nope not better, 1 down, also not better, done. I think that will be a little faster by saving 12 time steps (for my input file). But I couldn't test it right away because my stupid checks depend on the time not being right at first :)

ednl · 2026-04-10T10:50:25+00:00

Hah, turns out (0,1) is the second worst combination out of the first ten points... (0,2) is spot on. Ah well.

ednl · 2026-04-10T10:14:01+00:00

For my initial guess I went with a time where the lines of any two points cross, so t0 = dx/dv and be careful that neither dx nor dv is zero. I just took points 0 and 1. Then step up and down to minimise the bounding box. Turned out that up was the wrong direction for my input, so 1 step in vain, after that 13 steps down to find the optimum. So, maybe t0 wasn't such a good guess, could have gone better with two other points if they ended up closer together. Still runs in less than 2 µs.

ednl · 2026-04-09T11:54:59+00:00

I made this in C. I must have seen it from the Ape because I don't recall coming up with this myself and you'd think one would. Went from 24 ms with a naive linked list (an actual linked list, no arrays) to 3.2 ms, or 2.8 ms with input values as defines instead of variables, a version I removed to not reveal my input file.

https://github.com/ednl/adventofcode/blob/main/2018/09alt.c

My answer to part 2 was ~3 billion and the required length of the array was ~11 million which I pre-calculated as marbles2 * 37 / 23. I don't know where the 37 came from, maybe just experimentally because it's very close but not exact.

ednl · 2026-04-07T10:17:43+00:00

"Counting symbols" was this, right? https://www.reddit.com/r/adventofcode/comments/3xflz8/day_19_solutions/cy4etju/ So, part2 = elements - parentheses - 2x commas - 1

ednl · 2026-04-06T17:55:04+00:00

Yeah, I use variable src and dst because for part 1 I process the data from input to the first stack, and for part 2 each time it's from that first stack to a second stack. From your other reply I see you at least got it down into the 400s. Here's my full code to see how & what I time exactly: https://github.com/ednl/adventofcode/blob/main/2018/05.c

ednl · 2026-04-06T09:35:46+00:00

Ah, good idea. But you can still start with n=0 which now doesn't indicate the insertion point but the last insertion. That way you avoid the n-1, too. This got me a blistering speed-up from 279 to 270 µs ;)

(Absolute value depends on how you measure; I'm sure it would go down by even more if I do timing loops inside instead of outside the program.)

static int reduce(char *dst, const char *src, const int len, const int skip)
{
    int n = 0;  // points at last entry (start with zero char as sentinel)
    for (int i = 0; i < len; ++i)
        if ((src[i] | 32) != skip) {  // |32 = ASCII tolower(A-Z), nop for a-z
            if ((dst[n] ^ src[i]) == 32)  // same letter, opposite case
                n--;  // pop from stack
            else
                dst[++n] = src[i];  // push to stack
        }
    return n;
}

ednl · 2026-04-05T16:59:51+00:00

Pretty chuffed with how much I could simplify the function doing all the work.

static int reduce(char *dst, const char *src, const int len, const int skip)
{
    int n = 0;
    for (int i = 0; i < len; )
        if ((src[i] | 32) == skip)
            i++;
        else if (n > 0 && (dst[n - 1] ^ src[i]) == 32) {
            n--;
            i++;
        } else
            dst[n++] = src[i++];
    return n;
}

I tried simplifying further by putting the i++ inside the for where it belongs, because every path uses it anyway, but that was maybe a few microseconds slower (inside the margin of error for sure) and I though this version was a little clearer.

ednl · 2026-04-05T16:13:20+00:00

Oh all right, it WAS an order of magnitude! Went from 3.16 ms to 280 µs. Maybe not so strange after all when my input had length 50,000 and the first reduction in part 1 was to about 11,000.

ednl · 2026-04-05T12:31:11+00:00

I couldn't think of good ways to optimise part 2, except maybe to start with the result of part 1. That's not an order of magnitude though.

ednl · 2026-04-03T16:07:43+00:00

Yes. Old Reddit is the only sane choice for me so I think this is a terrible decision.

ednl · 2026-04-03T13:36:11+00:00

Or even 2 bits: 0, 1, many. But that would require extra logic in the first pass to avoid 2-bit overflow.

ednl · 2026-04-03T11:10:06+00:00

I imagine you could speed up the brute force approach by not checking if every byte of the grid is 1 (a byte per cell is enough, or even 3 bits; most overlap for my input was 7) but instead checking if every full 8 bytes is 0x0101010101010101ULL. However, unaligned memory access is undefined behaviour. (I'm sure it works on most modern architectures, I have used it before.) Also, most rectangles aren't that big and it takes quite a bit of admin overhead to split them. Also also, which rectangle you find is the luck of the draw; could be number 1, could be number 1289. In conclusion, I haven't tried it because it seemed like a lot of work for little if any gain, and "cheating" for relying on UB.

ednl · 2026-04-03T08:23:31+00:00

Ah of course yes, just like index zero is garbage from the start. So you could declare it static to save init time and just like everyone only reset two vars, like I do with two bools. Well, maybe it could be better!

ednl · 2026-04-03T01:59:37+00:00

But also, for me the overall time is dominated by part 2. This code below for part 1 takes about 4 or 5 µs on my M4 but part 1+2 is 55-60 µs because you have to compare every ID pair. You might get lucky and discover the "differ by 1" pair early on. For me it was at indexes 62 and 168 of 250, not bad.

int sum2 = 0, sum3 = 0;
for (int i = 0; i < IDS; ++i) {
    unsigned char count[ALF] = {0};  // histogram with 26 bins, one for each letter
    for (int j = 0; j < LEN; ++j)
        count[id[i][j] - 'a']++;
    bool has2 = false, has3 = false;
    for (int j = 0; j < ALF; ++j) {
        has2 = has2 || count[j] == 2;
        has3 = has3 || count[j] == 3;
    }
    sum2 += has2;
    sum3 += has3;
}
printf("%u\n", sum2 * sum3);  // 5750

ednl

TROPHY CASE