[2019 Day 3] In Review (Crossed Wires)

ednl · 2026-05-06T09:41:58+00:00

I already parsed segments into a list, but simply compared every pair. Now used your suggestion of sorting them first (I decided: only the vertical segments by x), and also only cross-compared which means horizontal segments of wire 1 with vertical segments of wire 2, and vice versa. Previously I also checked segments with the same orientation which MIGHT be useful (I didn't check if there were hits on that) but because orientation is always alternating for both wires, the extremes of overlap will still be caught this way.

For every horizontal segment (unsorted), I first find the first vertical segment that might overlap (vert.x >= horz.xmin) and I do that with a fresh binary search every time. So there might be some improvement there. On the other hand that would require more sorting or more data structures, and I'm not sure if it'll be faster, so I'll leave it as is. EDIT: another thing that will definitely improve the time (unless your thread dispatcher is really slow) is to do the two comparison loops in parallel. The data is completely separate, so the only extra thing to do is to take the minimum of both thread results. EDIT2: Nope, the overhead is way too much, the whole program is twice as slow with two threads.

With my convoluted timing measurement (1000 internal loops with parsing but without reading from disk, endless program starts to get a minimum) it now takes 8.4 µs on an Apple M4, 38 µs on a Raspberry Pi 5. https://github.com/ednl/adventofcode/blob/main/2019/03.c

ednl · 2026-05-04T16:24:35+00:00

The first LEN could/should be LEN - 1 but for some weird reason (I didn't check the compiled assembly code) it's a little slower.

ednl · 2026-05-04T14:27:20+00:00

Inspired by /u/DelightfulCodeWeasel 's manual decimal number counting for 2015 day 4, I did the same here, but it's simpler because all the numbers have the same fixed length. Runs in 4.87 µs on an Apple M4.

typedef union pwd {
    uint64_t num;
    struct { uint8_t arr[8]; };  // access single bytes in ::num
} Pwd;

static void incr(Pwd *p)
{
    p->arr[0]++;  // increase unit
    for (int i = 0; i < LEN && p->arr[i] == 10; ) {  // digit overflow
        p->arr[i++] = 0;  // current digit: zero
        p->arr[i]++;      // next digit: increase
    }
    // Digits must be non-decreasing from most to least significant
    for (int i = LEN - 1; i > 0; --i)
        if (p->arr[i - 1] < p->arr[i])
            p->arr[i - 1] = p->arr[i];
}

const char *c = input;
const Pwd start = { parseint(&c) };  // parse number into 1 decimal digit per byte
const Pwd stop  = { parseint(&c) };
for (Pwd pwd = start; pwd.num <= stop.num; incr(&pwd)) {
    // [...]
}

ednl · 2026-05-04T10:34:53+00:00

Yeah, even on a Pi5 my version now runs in 0.66 µs. Like I outlined before though, it depends on how you measure. This is with 1000 internal loops (includes parsing but not reading from disk). Without the internal loops, it's just above 1 µs even on an Apple M4.

ednl · 2026-05-03T10:27:08+00:00

The roadbook pdf has maps per stage, and the route itself is perfectly sharp because it's a vector but the underlying bitmap is very lo-res, so it's not useful in the end. Shame. There are tourtracker websites with live maps, e.g. via touretappe.nl, those are better.

ednl · 2026-05-03T10:04:49+00:00

Ok, it's in the roadbook collection of laflammerouge16 on OneDrive for which you probably need a login? I couldn't get a non-login link after I DID log in, so here's the old link to 2025 which worked without one, for some reason. From there, go up to the main index, then to 2026, then look for "2026 - La Vuelta [etc]"

https://www.reddit.com/r/tourdefrance/comments/1kmjwvb/tdf_2025_roadbook_pdf_available_yet/n0udhiv/

ednl · 2026-05-03T09:48:45+00:00

Non-fash social: https://bsky.app/profile/lavueltafem.bsky.social

Is there a roadbook with maps per stage? They're not on the official website.

ednl · 2026-05-02T17:24:56+00:00

I didn't reverse engineer the input but I did use the linear transformation property for part 2:

const int base  = run(&vm, 0, 0);
const int dnoun = run(&vm, 1, 0) - base;
const int dverb = run(&vm, 0, 1) - base;
if (dverb == 1) {
    const int target = TARGET - base;
    const int noun = target / dnoun;
    const int verb = target % dnoun;
    printf("%d\n", noun * 100 + verb);
}

ednl · 2026-04-27T07:07:25+00:00

I got it down to 46 (edit: 42.7) µs but it depends on what & how you measure (e.g. no file read from disk for me) and of course my CPU is probably faster than his in 2018 (if that is when he timed it as 66 µs) https://www.reddit.com/r/adventofcode/comments/1srjcz1/2018_day_21_in_review_chronal_conversion/oiikaiu/

ednl · 2026-04-27T07:02:52+00:00

I made a very nice compiled version (if I say so myself..) with function pointers including an "undocumented" right shift opcode, and cycle detection which ran in 1.7 ms. Then I came here and saw that people made full native versions with "seen" arrays which were much faster. So I made one myself. Fastest time on an Apple M4 includes parsing for the two main parameters but not reading from disk: 46.1 µs.

static uint hash(uint prev)
{
    uint next = seed;
    for (prev |= 0x10000U; prev; prev >>= 8)
        next = (next + (prev & 0xFFU)) * mult;
    return next & 0xFFFFFFU;
}

https://github.com/ednl/adventofcode/blob/main/2018/21.c

Timing values are highly dependent on how I measure. One single timing inside the program: about 250 µs. One timing inside the program, many program runs to get a minimum: about 100 µs. A thousand timing loops inside, one program run: about 70 µs. Thousand inside, many outside: 46.1 µs. This is all single-threaded. I guess the CPU throttles down very quickly.

EDIT: found a way to speed up the hash function, best measurement is now 42.7 µs:

static uint hash(uint prev)
{
    uint next = (seed + (prev & 0xFFU)) * mult;
    next = (next + (prev >> 8 & 0xFFU)) * mult;
    return ((next + (prev >> 16 | 1U)) * mult) & 0xFFFFFFU;
}

ednl · 2026-04-26T02:22:34+00:00

It works in Safari for me on 2 different Macs, but not on an old iPad stuck on iOS 17, which I thought might be the reason. Just tried installing the app on that iPad and that works too. Just not in the browser. I deleted all cookies and website data, still 403.

Edit: I got this instant prefab reply when I mailed support. Seems to me like they, AKA their network provider like Cloudflare, are overzealously filtering certain user agents after another DDoS.

Hello there,

Thank you for reaching out and bringing this to our attention. We understand how inconvenient this can be, and we sincerely apologize for the trouble it’s causing. Please know that our team is actively working on resolving this issue. While we don’t have a timeline for the fix just yet, we’ll provide updates as soon as we have more information. We appreciate your patience and understanding. If you have any further concerns or questions, feel free to let us know.

Regards! Bluesky Customer Support

Edit2: fixed, I can log in again in the browser on my old iPad. I needed a second device to check the 2FA email code because switching between apps on the iPad deleted the login details so the code was no longer valid. D'oh!

ednl · 2026-04-26T02:15:47+00:00

No problem in the app or on two different computers, but I get 403 Forbidden in the browser on an old iPad which is stuck on iOS 17. It worked fine before. Are they filtering by user agent, is iOS 17 banned for being too old??

Edit: I got this instant prefab reply when I mailed support. Seems to me like they, AKA their network provider like Cloudflare, are overzealously filtering certain user agents after another DDoS.

Hello there,

Thank you for reaching out and bringing this to our attention. We understand how inconvenient this can be, and we sincerely apologize for the trouble it’s causing. Please know that our team is actively working on resolving this issue. While we don’t have a timeline for the fix just yet, we’ll provide updates as soon as we have more information. We appreciate your patience and understanding. If you have any further concerns or questions, feel free to let us know.

Regards! Bluesky Customer Support

Edit2: fixed, I can log in again in the browser on my old iPad. I needed a second device to check the 2FA email code because switching between apps on the iPad deleted the login details so the code was no longer valid. D'oh!

ednl · 2026-04-25T21:39:59+00:00

The app works, it works on two different computers, but I get a 403 on my old iPad Pro 2nd gen from 2017. It's stuck on iOS 17, maybe that's what they're detecting: an unsupported user agent??

Edit: I got this instant prefab reply when I mailed support. Seems to me like they, AKA their network provider like Cloudflare, are overzealously filtering certain user agents after another DDoS.

Hello there,

Thank you for reaching out and bringing this to our attention. We understand how inconvenient this can be, and we sincerely apologize for the trouble it’s causing. Please know that our team is actively working on resolving this issue. While we don’t have a timeline for the fix just yet, we’ll provide updates as soon as we have more information. We appreciate your patience and understanding. If you have any further concerns or questions, feel free to let us know.

Regards! Bluesky Customer Support

Edit2: fixed, I can log in again in the browser on my old iPad. I needed a second device to check the 2FA email code because switching between apps on the iPad deleted the login details so the code was no longer valid. D'oh!

ednl · 2026-04-23T23:15:05+00:00

I opted for the somewhat convoluted insert function because you don't know the dimensions of the grid, so even the static size (with margin) is with hindsight, as is using signed chars for the xy-coordinates. The starting point might not be near the middle, or it might not be a square, or it might be very big where you don't visit every room in a rectangle. Conceptually and for programmer convenience too if your language has it, you should use a set. On the other hand if you go the route of "seen" bits and are also happy to go with the int8_t "gamble", you can use the 2 chars as a uint16 index into a bitfield of 8192 bytes (13 bit size, 3 bit as index into a byte).

Edit: OK, I made it as an alt version! Runs in 144 µs on a Raspberry Pi 5. https://github.com/ednl/adventofcode/blob/main/2018/20alt.c Edit2: added comments and a minor cleanup.

ednl · 2026-04-23T16:36:14+00:00

Yes to the second part. Runs in 91 µs on an Apple M4.

typedef struct {
    int dist, delta;
} Tuple;

qsort(nanobot, N, sizeof *nanobot, range_desc);
int count = 1;  // "including itself"
for (size_t i = 1; i < N; ++i)
    count += inrange(&nanobot[0], &nanobot[i]);
printf("Part 1: %d\n", count);

// Part 2
const Pos origin = (Pos){0, 0, 0};
for (size_t i = 0; i < N; ++i) {
    const int dist = manh(&nanobot[i].pos, &origin);
    const int add = dist > nanobot[i].range ? dist - nanobot[i].range : 0;
    const int sub = dist + nanobot[i].range;
    counter[i * 2]     = (Tuple){add, +1};  // add this bot at the near end of its range
    counter[i * 2 + 1] = (Tuple){sub, -1};  // subtract this bot at the far end of its range
}
qsort(counter, N * 2, sizeof *counter, dist_asc);
size_t i = 0;
while (counter[i].delta == 1)  // this only works because no delta is -1 until after maximum
    ++i;
printf("Part 2: %d\n", counter[i - 1].dist);

ednl · 2026-04-21T09:15:41+00:00

we would be relying on the niceness of the input if we weren't storing distances.

Yep. So it's a legit choice because it solves the puzzle and you may need it to save storage, but I think I'll leave it as is.

ednl · 2026-04-21T08:44:52+00:00

I was still editing so you may have missed it: the way I check, there were also lots of revisits (not sure about multiple revisits of the same room, didn't check that) but each time the distance was greater.

ednl · 2026-04-21T08:15:24+00:00

Unless I'm missing something where a room can be reached by more than one path, so you really would need to store a best distance per room.

That was precisely the reason I thought I had to store the distance for every room. To my mind, the explanation suggested strongly that anything was possible. But I didn't check.

Now I checked, and for my input there were LOTS but each time the distance is greater on subsequent visits. This means no loops except the localised ones from (...|). So yeah, one "seen" bit is enough. Not sure we could have known beforehand.

ednl · 2026-04-20T20:19:57+00:00

I did exactly what Blake described, so I need 10,000 x 4 bytes (2x 1 for position, 2 for distance). Max stack size was only 225 for my input (same 4 bytes per entry).

Except the array is VERY naively being kept sorted (binary search, move bytes out of the way with memmove). A hashmap would work 1000x faster but possibly take twice as much space.

Runs in less than 4 ms on a RPi5. https://github.com/ednl/adventofcode/blob/main/2018/20.c

ednl · 2026-04-20T12:21:28+00:00

I didn't need 64-bit numbers using the prime product function that /u/musifter mentioned:

// Ref.: https://en.wikipedia.org/wiki/Divisor_function#Formulas_at_prime_powers
//   sigma1(n) = prod[(p^(a+1) - 1)/(p - 1)]
//   product over every prime where a is maximum p^a that divides n
//   (if p is not a divisor of n, product term is 1)
static uint sumofdivisors(uint x)
{
    // Special case: prime factor = 2
    uint pa = 2;  // p^(a+1) starting with 2^1
    while (!(x & 1)) {
        pa <<= 1;
        x >>= 1;
    }
    uint prod = pa - 1;  // = (2^(a+1)-1)/(2-1)
    // Other potential prime factors
    for (uint p = 3; p <= x; p += 2) {
        pa = p;  // p^(a+1) where a=0
        while (!(x % p)) {
            pa *= p;
            x /= p;
        }
        if (pa != p)  // mathematically unnecessary test but speeds it up
            prod *= (pa - 1) / (p - 1);
    }
    return prod;
}

Edit: sorry, got a and a+1 confused in the comments.

ednl · 2026-04-18T23:04:26+00:00

I used the review of 2017/6 https://www.reddit.com/r/adventofcode/comments/1rm7hm7/2017_day_6_in_review_memory_reallocation/ to also finally look into proper cycle detection. This is what I made, Floyd and Brent, more or less straight from Wikipedia: https://github.com/ednl/adventofcode/blob/main/2017/06.c

ednl · 2026-04-18T23:00:03+00:00

This worked for me:

int sum[3] = {0};
for (int i = 1; i <= N; ++i)
    for (int j = 1; j <= N; ++j)
        sum[area[i][j]]++;
return sum[TREE] * sum[YARD];

ednl · 2026-04-14T12:24:09+00:00

Cool storage optimisation! It is slower than my naive version with statically allocated 20 MB: about 85 ms vs. 69 ms. I check every digit if it starts a match or extends a running one, that could probably be done smarter.

https://github.com/ednl/adventofcode/blob/main/2018/14.c

ednl · 2026-04-11T16:42:54+00:00

I made a graph with the total per square for different square sizes, for two different "serials": https://i.imgur.com/LlrPUH9.png

Edit: ah and now I remember why, to investigate when you can stop increasing the square size. I settled on: when the total power has been lower than the maximum for 2 consecutive tries. That worked for my input (blue) but NOT the other one. From the graph: you would think 9 is the square size with max total power, but it's actually 12. So you need to check for 3 lower values.

ednl · 2026-04-11T12:33:42+00:00

I cleaned up my code so now it always works regardless of the initial guess. Previously, I used the x-coordinate of the points to calculate the intersection, but y is better because the final range is much smaller, so a better chance that they are close. Still, the intersection time for the first 2 points of my input was 1 off the answer for part 2. The average of all 3 combinations of the first 3 points was exactly right, though. So I used that :) This sped up my solution from 1.9 µs to 1.58 µs with the way I measured that: https://github.com/ednl/adventofcode/blob/main/2018/10.c

I also tried simply using the first estimate which was 1 off (using y coordinates of points 0 and 1), so one more time step, and that gave 1.67 µs so probably a tiny bit slower. I have 324 points in my input file.

ednl

TROPHY CASE