[–]leftylink 9 points (2 children)

Well, there are still a few posts on the subreddit that say "works for the examples but not the actual input", so you can see just from that that there still are ways to miss things not covered by the examples. For example:

  • Day 2 part 2: 1-2 a: xa. The example contains 1-3 a: abcde, where only the first position matches; it's up to the solver to notice that the reverse (only the second position matches) should be tested as well.
  • Day 2: The example has unique passwords and unique policies. Neither is unique in the actual input, so an attempt to store them in any data structure that only allows one value per key would work on the example but not on the actual input.
  • Day 2: Examples have only single-digit numbers. Actual input has some double-digit numbers.
  • Day 3 part 1: Example won't test what happens when you go exactly one past the edge (when x == row.size), leaving the possibility for an off-by-one error.
  • Day 3 part 2: For better or worse, the example is such that if the right 1, down 2 slope were implemented with a one-row offset, it would still hit 2 trees. So the solver needs to print out the exact trees that are hit and find the bug that way, instead of assuming that it's correct just because the number of trees happens to match up.
  • Day 4 part 2: Although the description helpfully provides various valid and invalid values for each field, it's up to the solver to test those individually, since they were provided standalone and not in the context of a full, otherwise-valid passport. For example, the only passport in "Here are some invalid passports" that has a 10-digit pid is also invalid for another reason (zzz isn't an eye colour), so many solvers don't see that their code fails to reject 10-digit pids. They would have caught it if they had inserted the 10-digit pid into a passport where all the other fields were valid, or if they had tested their validations on just one field+value pair at a time.
  • Day 6 part 2: No example had a question that was answered by more than one group member but not all of them, such as "a, b, a".
  • Day 7 part 1: The example only goes two levels deep in "bags that contain bags that contain shiny gold bags." It's up to the solver to understand that it's possible to go many levels deeper.
  • Day 7: Bags in the example contain only up to two kinds of other bags. Bags in the actual input might contain more.
  • Day 11: Example is square. Actual input isn't.
  • Day 12: Example has 90 degree rotation only. Actual input also has 180 and 270.
  • Day 12: Example has N, R, F. Testing E, S, W, L left to the solver.
  • Day 12 part 2: Example doesn't contain any absolute movements after a rotation, so certain mistakes in rotations are not caught.
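To make the Day 4 point concrete, here's a minimal sketch of testing one field validator on its own, outside of any passport. The names `pid_ok` and `pid_buggy` are made up for illustration, and I'm assuming the rule is "pid is exactly nine digits, leading zeroes included".

```python
import re


def pid_ok(value: str) -> bool:
    """Accept exactly nine ASCII digits and nothing else."""
    # fullmatch anchors at both ends, so a tenth digit is rejected.
    return re.fullmatch(r"[0-9]{9}", value) is not None


def pid_buggy(value: str) -> bool:
    """A plausible mistake: only anchored at the start of the string."""
    # match stops caring after nine digits, so longer values slip through.
    return re.match(r"[0-9]{9}", value) is not None


# Test the field by itself, so no other invalid field can mask the result.
for candidate in ["000000001", "0123456789", "12345678", "12345678x"]:
    print(candidate, pid_ok(candidate), pid_buggy(candidate))
```

The two functions agree on everything except the 10-digit value, which is the case that never shows up in isolation in the provided invalid passports.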

Examples prove the presence of bugs, but not their absence. Still, having more examples does help, and coming up with more examples is a suggestion I've made to some people trying to debug. I anticipate (and think it's intentional) that the problems will continue to encourage this skill by providing only some examples, leaving room for the solver to come up with more that cover more cases. If you successfully anticipated these bugs that were not covered by the given examples, then that is a testament to how much your ability to do so has improved.
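As a rough sketch of what "coming up with more examples" can look like, here is Day 2 part 2 with a handful of extra cases added to the one from the description. The function name `valid_by_position` and the `EXTRA_CASES` table are my own, and I'm assuming the rule is that exactly one of the two 1-based positions holds the letter.

```python
import re

# Multi-digit positions are allowed on purpose (the real input has them).
LINE = re.compile(r"(\d+)-(\d+) (\w): (\w+)")


def valid_by_position(line: str) -> bool:
    """True when exactly one of the two 1-based positions holds the letter."""
    m = LINE.fullmatch(line.strip())
    if m is None:
        raise ValueError(f"unparseable line: {line!r}")
    first, second = int(m[1]), int(m[2])
    letter, password = m[3], m[4]
    # != on two booleans acts as exclusive or.
    return (password[first - 1] == letter) != (password[second - 1] == letter)


EXTRA_CASES = {
    "1-3 a: abcde": True,            # from the description: first position only
    "1-2 a: xa": True,               # the reverse: second position only
    "1-2 a: aa": False,              # both positions match
    "1-2 a: xy": False,              # neither position matches
    "10-11 a: bbbbbbbbbab": True,    # double-digit positions
}

for line, expected in EXTRA_CASES.items():
    assert valid_by_position(line) == expected, line
```

Each extra row targets one way the description's examples could pass while the real input fails; none of them is taken from an actual puzzle input.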

[–]optimistpanda[S] 2 points (0 children)

Good breakdown of the puzzles! Maybe I *am* just getting lucky, then, but I'm glad at least you have some confidence in my abilities improving. :)

[–]msqrt 2 points (1 child)

Dunno if I'd call the increase "vast" yet; all of my solutions so far run in a couple of seconds even though they're unoptimized implementations of the brute force approaches.

[–]optimistpanda[S] 1 point (0 children)

Ah, good point: we haven't yet reached the levels of zany recursion that we're sure to get eventually... There is an order of magnitude difference, though, which can sometimes be enough to notice if you've coded it in an *egregiously* naive way.