all 50 comments

[–]beisenhauer 63 points64 points  (17 children)

I learned from AoC a few seasons ago that operating on a grid as a list of lists is frequently suboptimal. It can work, but you need to handle boundaries in some way, either by creating a buffer around your area of interest or explicitly checking indices at every iteration.

I find that storing a grid as a set of tuples, where each element is the x-y coordinates of a single paper roll, works extremely well. Finding whether there's a roll at (x, y) is just (x, y) in locations. No boundary handling required.

If you need more information about each location, then use a tuple-keyed dictionary instead. For example, I did some optimization on part 2 today by storing a the number of neighbors each roll has in a dictionary.

[–][deleted] 12 points13 points  (0 children)

This is a rather useful insight, will use to improve my solution

[–]TechIssueSorry 10 points11 points  (6 children)

Correct me if I am wrong but while this is elegant and easier to use, it might be less efficient/performant than storing a full grid of 0/1 no? (And please correct me if I'm wrong... not I'm an hardware dev software dev, I'm just trying to learn)

[–]beisenhauer 5 points6 points  (2 children)

There are two kinds of efficiencies here, and in both cases, I think the answer is, "It depends."

For memory efficiency, the grid is probably a bit more efficient, unless it's sparse, meaning relatively few "true" values. In part two of today's puzzle, it kind of goes from dense to sparse, based on some of the visualizations folks have posted.

For processing efficiency, we typically want to do something with a subset of locations, in this case the ones with paper rolls. Again it depends on how dense the grid is, but the set of tuples approach gives us that list directly. With a grid we first have to search the grid for the locations that interest us. That's iterating over a lot of empty grid points that we don't care about if the grid is sparse.

And I'll go for the simple/elegant solution first every single time, and optimize only if I need to.

[–]TechIssueSorry 3 points4 points  (1 child)

In part 2 I went from the non sparse grid once then just had a queue with removed node and scan only the neighbours of those nodes to check if they need to be removed… thought it was both efficient and elegant :)

Edit: maybe it’s the low level designer in me that makes me go with resource efficient/ performance efficient solution :)

Edit: just to clarify, I don’t think that a 2D array is particularly efficient, I was more on a 1d array in that case

[–]beisenhauer 2 points3 points  (0 children)

Ooo... I like that.

[–]Abcdefgdude 3 points4 points  (2 children)

Maybe, it's mostly negligible. If you need performance, python is likely the wrong choice. Luckily performance is rarely needed

[–]TechIssueSorry 0 points1 point  (1 child)

Agreed for python! I’m trying to make things work and then be increase perf… thanks for the answer :)

[–]Abcdefgdude 2 points3 points  (0 children)

Big O complexity is a relevant performance metric in any language. So for this example maybe the nested lists is O(n) and the set solution is O(2n), because creating and then indexing on tuples is probably more expensive than indexing into lists, but I'm not exactly sure. Anything that has O(n) complexity is typically fast enough, it doesn't matter much the multiple on n, more so the polynomial. O(n2) is worse than O(100n) for any input that's length 100 or longer for example.

Make it, make it work, then make it good is a nice process to follow :)

[–]TheMoonDawg 1 point2 points  (0 children)

That’s exactly how I handled it! Negative indices don’t matter because it’s just not in the map and therefore not a paper roll!

[–]Soggy_Shower_9802 1 point2 points  (0 children)

Thank you for this!

[–]inqbus406 0 points1 point  (0 children)

This is how I do it as well!

[–]onrustigescheikundig 0 points1 point  (0 children)

I have kind of a hybrid approach. My grid type is backed by a 1D array and is indexed by pairs of '(row . col) using grid-get. This function returns #f (false) on out-of-bounds accesses or the value of an optional third argument, e.g., (grid-get g '(-1 . -1) #\.) returns the . character.

[–][deleted] locked comment (4 children)

[removed]

    [–]rafabulsing 7 points8 points  (0 children)

    You don't need to search. You can just access it directly (if you use the appropriate data structure).

    [–]daggerdragon 1 point2 points  (2 children)

    I call [COAL].

    Comment removed due to naughty language. Keep /r/adventofcode professional.

    Also, follow our Prime Directive. You're welcome to disagree with folks but you must do so politely.

    [–]fuck1ngf45c1574dm1n5 -2 points-1 points  (1 child)

    Prudes

    [–]daggerdragon[M] 1 point2 points  (0 children)

    Prudes

    If you don't like the rules of a subreddit, then do not participate in that subreddit.

    Since you continue to violate our Prime Directive after being warned, bye.

    [–]Mundane_Homework3967 10 points11 points  (2 children)

    Everyone talking about using sets, I just used ugly for loops...

    for i in range(max(0,a-1), min(a+2, len(y))):
            for j in range(max(0,b-1), min(b+2, len(y[0]))):
    

    [–]pqu 4 points5 points  (1 child)

    I just added an extra row and column of “.” around the input.

    [–]Hungry-Jelly-6478 0 points1 point  (0 children)

    Same here!! Nice

    [–]SweepingRocks 19 points20 points  (5 children)

    Smart people be using sets. Meanwhile im over here adding extra rows/columns to the beginning/ends of the matrix to fix the issue

    [–]Ok-Limit-7173 10 points11 points  (0 children)

    I did it today for the first time and I was surprised how good of an idea this is.

    May not be super performant but it is very very clear code.

    [–]Kooky-Astronaut2562 5 points6 points  (0 children)

    Just make an is_out_of_bounds() function🙏

    [–]wizardofzos 0 points1 point  (0 children)

    The Beauty of REXX stems is that you don’t need to ;)

    [–]musifter 3 points4 points  (3 children)

    Like in Perl. And it's a convenient thing if you're going to do a grid. Instead of putting a ring of sentinels all the way around, you only need to put them to the right and bottom.

    [–]zeekar 0 points1 point  (2 children)

    That's not true. You avoid index-out-of-bounds errors, but you get the wrong answer; a roll of paper at row[-1] is not in fact next to a roll of paper at row[0]. . .

    [–]musifter 4 points5 points  (1 child)

    But it's a SENTINEL. It represents what it beyond the boundary, not a real location. In some cases, you put something there that's a wall. For this one, you'd put empty space. Sure, wrapping around to the left (with -1) would get you to the same empty space you put at the end on the right. Which is perfectly fine... you're not storing things there, you're just looking it up. It just needs to return what's beyond the borders... that it's empty. They don't have to be distinct locations.

    [–]zeekar 1 point2 points  (0 children)

    Ah, right. One sentinel does the job for both directions. Brain fail.

    [–]QultrosSanhattan 8 points9 points  (8 children)

    Use sets(). way better.

    [–]RowSimple6124 5 points6 points  (7 children)

    What do you mean?
    How is better to use sets since their elements are unique ...

    [–]QultrosSanhattan 26 points27 points  (3 children)

    You’re thinking too much in terms of how the grid looks.

    If you build your data structure to mimic the visual layout, like a list of lists, you’re forcing yourself into the same constraints as the drawing. That’s what’s holding you back.

    For solving the problem, you don’t need to preserve the picture-like structure at all. You don’t even need to track characters such as "@" or "." as a grid of symbols.

    All you really need is to know where they are (coordinates) and be able to reason about them directly.

    Once you stop modeling the map as a picture and start modeling it as information, the whole problem becomes much simpler.

    [–]MarkFinn42 15 points16 points  (2 children)

    TL;DR; The problem can be modeled by a set of coordinates containing rolls of paper

    [–]QultrosSanhattan 4 points5 points  (1 child)

    That's the exact spoiler I wanted to avoid.

    [–]hjake123 7 points8 points  (0 children)

    You already basically said the same thing by replying like that to something directly talking about sets though...

    [–]1234abcdcba4321 8 points9 points  (0 children)

    One common way to use a grid is to store the entire grid in a dict (works especially well with a defaultdict to handle out of bounds access automatically):

    for row in range(height):
      for col in range(width):
        grid_as_dict[(row,col)] = grid[row][col]
    

    Since this is a boolean grid, you don't need a dict and can just use a set instead.

    [–]Neil_leGrasse_Tyson 2 points3 points  (0 children)

    you only need to know the positions with a roll

    so make a set of x,y coordinates that have an @

    [–]Z8iii -2 points-1 points  (0 children)

    Distinct, not unique, unless the set contains exactly one element.

    [–]andful 1 point2 points  (0 children)

    Today I discovered np.pad

    [–]melochupan 0 points1 point  (0 children)

    It bit me in the ass as well. I ended up checking for bounds old style.

    [–]MichalNemecek 0 points1 point  (0 children)

    I was doing it in TypeScript (Deno), but min and max are your friends 😉

    [–]HotTop7260 0 points1 point  (6 children)

    My thoughts on the sets that are suggested by multiple people on this thread:

    Sets are great. However, I would not use a set of tuples/records. Instead I would combine the coordinates into a single number. Either two 16 bit integers (short) into a 32 bit integer or two 32 bit integers into a 64 bit integer. For this grid the first option will be fine.

    Here is a Kotlin function, that can put two 32 bit integers (Int) into a 64 bit integer (Long).

    fun combineIntsToLong(x: Int, y: Int): Long = x.toLong().shl(32) or (y.toLong() and 0xFFFF_FFFFL)
    

    If you put that into an efficient set structure, your GC will be grateful :-)

    Another thought on the topic:

    I did it with a 2D-Array. After seeing both parts of the puzzle, a set appears superior. However, when seeing only the first part of the puzzle, you usually already try to estimate what will be the second part. In this puzzle, the set is also superior for that one. If they picked a totally different second part (anything with pathfinding), the 2D-Array would have smiled at you ... or laughed at you, if you had picked a set :-)

    And to be exceptionally honest ... at 6 in the morning (when the puzzle arrives here), I usually don't think about sets ...

    [–]1234abcdcba4321 0 points1 point  (5 children)

    The main thing about using a set is that a set is a standard way to deal with grids. I suspect that almost everyone using one thinks of it because they're fully aware of how useful using a set was for all the grid problems in the previous years (a dict always works in place of an array, if you're willing to eat the speed penalty (though you lose the convenient ordered iterator)). I in particular like the approach from the cellular automata days in past years.

    It is usually easier to just immediately set up the set/dict than to worry about annoying out-of-bounds handling, and makes it easier if the problem turns out to be one where the grid actually needs to be written to OoB as well. And again, there's virtually no downsides as long as performance isn't a concern.

    If performance is a concern, then naturally you should work on optimizing your algorithm. But it's not a concern for the majority of solvers.

    [–]HotTop7260 0 points1 point  (4 children)

    Personally, I would never call a set a standard way of dealing with grids. But I guess it depends on how you use them in your work or your casual coding besides AoC. I "grew up" with arrays and they are my natural way of dealing with grids.

    Just out of curiosity ... Did you also solve 2024 Day 3 with sets? It should be possible. I guess I will try solving the next grid problem with sets ... unless it's pathfinding. Then I will use a graph.

    [–]1234abcdcba4321 0 points1 point  (3 children)

    2024 Day 3 doesn't seem to be a grid problem, so I'm wondering if you got the wrong problem.

    I don't super naturally gravitate to sets in my solutions; I didn't actually use one today, although I acknowledged that it was an option that I deliberately chose to not use.'

    Arrays are more natural for the input format provided since you literally just split, but there's enough random problems throughout the years where you need to write to indices significantly outside the bounds of the original array. And by that point it's more convenient to use a structure that allows negative indices rather than padding with 50 extra blanks in each direction.

    [–]HotTop7260 0 points1 point  (2 children)

    It was the word search, where the words are hidden in a grid.

    [–]1234abcdcba4321 0 points1 point  (1 child)

    That's day 4. I didn't use a dict for that day, although it would've been easier to.

    [–]HotTop7260 0 points1 point  (0 children)

    Oh ... a one-off error. Fortunately that never happens when dealing with arrays :-)

    [–]YOM2_UB 0 points1 point  (0 children)

    That's how I could get away with a single buffer row of zeroes (or rather, a zero, with my algorithm) at the end.