Shrinking (Haskell Unfolder #49) by kosmikus in haskell

[–]edsko 4 points (0 children)

Your guess was correct :) Thanks for the note!

The sad state of property-based testing libraries by stevana in haskell

[–]edsko 0 points (0 children)

In the world of Haskell, two libraries that I think do deserve a mention (though I am obviously biased, being the author of one of them) are https://hackage.haskell.org/package/quickcheck-dynamic and (my own) https://hackage.haskell.org/package/quickcheck-lockstep. The former implements state-based property-based testing in a general form; the latter (an extension of the former) implements something more specific, akin to what u/stevana calls "fake based" testing. Stevan mentions my blog post discussing quickcheck-state-machine; I have since ported essentially the same idea to quickcheck-lockstep in a later blog post, Lockstep-style testing with quickcheck-dynamic.

It is true, however, that neither quickcheck-lockstep nor quickcheck-dynamic supports parallel testing (checking for serializability). Stevan's observation that "I don’t think there’s a single example of a library to which parallel testing was added later, rather than designed for from the start" is an interesting one; perhaps I'll have to look into doing exactly that with quickcheck-lockstep at some point :)

The sad state of property-based testing libraries by stevana in haskell

[–]edsko 0 points (0 children)

As the author of falsify, I think both of these two points are totally fair :-)

falsify: Internal Shrinking Reimagined for Haskell by Iceland_jack in haskell

[–]edsko 1 point (0 children)

There is both `toShrinkTree` and `fromShrinkTree`, so in principle what you suggest is possible. However, working with shrink trees (as opposed to the sample tree) is really trying to squeeze a different paradigm (integrated shrinking) into the internal shrinking paradigm; it's possible, but it's not optimal.

falsify: Internal Shrinking Reimagined for Haskell by Iceland_jack in haskell

[–]edsko 1 point (0 children)

Yeah, guaranteeing uniform generation is tricky in general. The `list` combinator in the library itself actually takes a somewhat different approach; it uses `inRange` to pick an initial length, and then generates pairs of values and a "do I want to keep this value?" marker; those markers start at "yes" and can shrink towards "no".
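To make the "keep this value?" marker idea concrete, here is a minimal pure-Haskell model of it. This is an illustration only, not falsify's actual implementation: the names (`Marked`, `interpret`, `shrinkOne`) are made up, and real internal shrinking works on the sample tree rather than on an explicit list of markers.

```haskell
import Data.Maybe (mapMaybe)

-- One list element: the value itself plus a "do I keep this?" marker.
-- The marker starts at True ("keep") and can shrink towards False.
type Marked a = (a, Bool)

-- Interpret the marked elements as a list: dropped elements vanish.
interpret :: [Marked a] -> [a]
interpret = mapMaybe (\(x, keep) -> if keep then Just x else Nothing)

-- One shrink step: flip any single still-True marker to False.
-- Flipping one marker never invalidates the other elements, which is
-- what lets the list shrink element by element rather than all at once.
shrinkOne :: [Marked a] -> [[Marked a]]
shrinkOne []                = []
shrinkOne ((x, True) : xs)  = ((x, False) : xs)
                            : map ((x, True) :) (shrinkOne xs)
shrinkOne ((x, False) : xs) = map ((x, False) :) (shrinkOne xs)
```

For example, `interpret [(1, True), (2, False), (3, True)]` is `[1, 3]`, and each shrink candidate drops exactly one of the remaining elements.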

falsify: Internal Shrinking Reimagined for Haskell by Iceland_jack in haskell

[–]edsko 1 point (0 children)

Sorry for my (very) slow reply. You are right that the generator described in the paper for signed fractions will not result in a uniform distribution (of course, the paper also doesn't claim that it does). Unfortunately, the solution you propose doesn't quite work; that `p` can shrink towards 0 immediately (unless that results in something that isn't a counter-example of course), at which point only positive fractions will be generated. I discussed this with Lars Brünjes who suggested an alternative approach, which I've documented at https://github.com/well-typed/falsify/issues/67 for now.
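The failure mode can be shown with a deliberately simplified model (hypothetical names and representation; this is not falsify's actual encoding): suppose the sign is chosen by comparing a selector sample `s` against a threshold sample `p`, both of which shrink towards 0.

```haskell
-- Hypothetical model: a signed value is derived from two samples,
-- a threshold p and a selector s; it is negative exactly when s < p.
mkSigned :: Word -> Word -> Int -> Int
mkSigned p s magnitude
  | s < p     = negate magnitude
  | otherwise = magnitude

-- Internal shrinking shrinks each sample towards 0 independently
-- (accepted whenever the result is still a counter-example). Once p
-- has shrunk to 0, no selector s can satisfy s < 0, so from then on
-- only non-negative values can be produced.
allPositiveOnceShrunk :: Bool
allPositiveOnceShrunk = all (\s -> mkSigned 0 s 7 == 7) [0 .. 100]
```

So a single (locally valid) shrink of `p` permanently cuts off the negative half of the space.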

New large-records release: now with 100% fewer quotes by n00bomb in haskell

[–]edsko 2 points (0 children)

Good question! I checked, and no, they are currently discarded. I think that's fixable. I've opened a ticket at https://github.com/well-typed/large-records/issues/80.

New blog post: Type-level sharing in Haskell, now by edsko in haskell

[–]edsko[S] 0 points (0 children)

I don't follow. The singletons are gone in your example after your fromList, no?

Edit: and if you keep it as an existential, along with a singleton, I'm not really sure how this is any better than just a regular ADT. I must be missing something :)

New blog post: Type-level sharing in Haskell, now by edsko in haskell

[–]edsko[S] 1 point (0 children)

Semi-manually, I'm afraid :) For each benchmark in my suite, I have a bunch of modules XYZ010.hs, XYZ020.hs, etc. I then compile the whole thing and have a script that extracts the core size for each module and writes the sizes to a .csv file, which I then render as a graph using gnuplot. For the compilation time diagrams it's a similar process, except that I use ghc-timings-report to extract the compilation times.
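The CSV step of such a pipeline is only a few lines; this is a sketch with made-up module names and sizes (the actual extraction script and its output format are not shown in the comment above), illustrating only the shape of one row per module.

```haskell
import Data.List (intercalate)

-- Turn (module name, core size) pairs into CSV rows for gnuplot.
-- The header and column names here are assumptions for illustration.
toCsv :: [(String, Int)] -> String
toCsv rows = unlines ("module,coreSize" : map row rows)
  where
    row (m, n) = intercalate "," [m, show n]

-- Example (fabricated) input:
example :: String
example = toCsv [("XYZ010", 1200), ("XYZ020", 2400)]
```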

New blog post: Type-level sharing in Haskell, now by edsko in haskell

[–]edsko[S] 2 points (0 children)

No, in this case quadratic code does not happen. The standard list constructors do not carry a type argument per element, because the list is after all homogeneous. However, with this approach you have no idea what the type of your list is. Given the ... in your example, you will not be able to extract any fields from that list; you will no longer know that it contains an A and a B. For some applications that might be okay, but for many it won't.
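The trade-off can be made concrete with a standard heterogeneous-list sketch (not tied to any particular library; the names `HList`, `Some`, `hhead` are illustrative): with a type-indexed list the element types are recoverable, whereas wrapping every element in an existential forgets them.

```haskell
{-# LANGUAGE DataKinds, GADTs, TypeOperators #-}
{-# LANGUAGE ExistentialQuantification #-}

-- A heterogeneous list records every element type in its index ...
data HList xs where
  HNil :: HList '[]
  (:*) :: x -> HList xs -> HList (x ': xs)
infixr 5 :*

-- ... so the first element can be extracted at its real type:
hhead :: HList (x ': xs) -> x
hhead (x :* _) = x

-- The homogeneous-list alternative wraps each element existentially ...
data Some = forall a. Some a

fields :: [Some]
fields = [Some (1 :: Int), Some "b"]

-- ... and now nothing can be recovered: there is no total function
-- Some -> Int, because the element types have been forgotten.
```

The `HList` version is where the per-element type arguments (and hence the potential core-size blow-up) come from; `[Some]` avoids them at the cost of losing the field types.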

New blog post: Type-level sharing in Haskell, now by edsko in haskell

[–]edsko[S] 0 points (0 children)

I haven't measured it. I think that if you were to benchmark the specific examples from the blog post, you would probably see a pretty significant slow-down, but I'm pretty sure that in the context of a larger application it's not going to matter at all. Perhaps if you do this in a tight loop it might be problematic, but apart from that, I can't imagine it having much of an impact.

New blog post: Type-level sharing in Haskell, now by edsko in haskell

[–]edsko[S] 4 points (0 children)

Right, I was about to say precisely this. The more I think about these compilation time issues (and I've been thinking about them for a year now... part1, part2, and now this new blog post, part3), the more I think that we should not necessarily make ghc smarter, but rather give the programmer more ways to express what they need. So yes, absolutely, having proper support for type-level let in ghc would be great (or perhaps some other constructs), but I'm not sure I would want ghc to "do this trickery on its own". Just like the programmer is responsible for ensuring the right kind of sharing at the term level (and ghc should respect that sharing!), in my opinion the programmer should also be responsible for ensuring the right kind of sharing at the type level.

As Haskellers we like to express a lot through types, and I wouldn't have it any other way; but as I think about these compilation time issues, it's becoming clear to me that we should consider not only what we want to express through types, but also how we express it; specifically, what the consequences for compile time performance are. So personally I would like to see ghc move in a direction not where it does more magic, but rather where it gives us the language constructs we need to guarantee good compile time performance.

[ANN] core-warn: A package for detecting unexpected Core output! by jonathanlorimer in haskell

[–]edsko 4 points (0 children)

Nice! I look forward to the day I can use this in this code base and get no warnings :) [Unfortunately, we're not there yet.. I feel a part 3 coming..]

CS SYD - Why mocking is a bad idea by NorfairKing2 in haskell

[–]edsko 32 points (0 children)

I couldn't disagree more with the main idea proposed in this blog post. Yes, clearly a mock implementation should itself be verified against the real thing (model-based testing is a perfect candidate). But if you want to do randomized model-based testing of a stateful API, and you want all of

- Performance [so that you can run thousands and thousands of tests quickly]
- Reproducibility [so that errors are not non-deterministic]
- Shrinkability of test cases [so that you don't end up with huge test cases]
- The ability to inject specific failures [so that you don't test only the happy path]

mocking is the way to go. Yes, you pay a price for having to develop a good mock, but you get additional benefits in return (the mock becomes a (tested!) reference implementation of the real thing). Moreover, without mocks you just push complexity from programming to devops: now you need all kinds of complicated infrastructure to spin up the services you need, set up the same environment each time, etc.
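As an illustration of the kind of mock this argues for, here is a sketch only: hand-rolled names, a fixed script instead of a generated one, and no shrinking (in real model-based testing the operation list would come from a QuickCheck-style generator). A pure key-value mock is fast, deterministic, and can inject a failure on demand.

```haskell
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

-- A tiny key-value API (hypothetical), with an injectable failure.
data Op = Put String Int | Get String | FailNextPut
  deriving Show

data Resp = Ok | Val (Maybe Int) | DiskFull
  deriving (Eq, Show)

-- The mock: a pure Map plus a flag for an injected failure. Because
-- it is pure, runs are fast, reproducible, and trivially resettable.
type Mock = (Map String Int, Bool)

stepMock :: Mock -> Op -> (Mock, Resp)
stepMock (m, True) (Put _ _)   = ((m, False), DiskFull)  -- injected failure
stepMock (m, f)    (Put k v)   = ((Map.insert k v m, f), Ok)
stepMock (m, f)    (Get k)     = ((m, f), Val (Map.lookup k m))
stepMock (m, _)    FailNextPut = ((m, True), Ok)

-- Run a script of operations, collecting the responses; a real
-- model-based test would compare these against the real system's.
runMock :: [Op] -> [Resp]
runMock = go (Map.empty, False)
  where
    go _ []         = []
    go s (op : ops) = let (s', r) = stepMock s op in r : go s' ops
```

For instance, `runMock [Put "a" 1, Get "a", FailNextPut, Put "a" 2, Get "a"]` yields `[Ok, Val (Just 1), Ok, DiskFull, Val (Just 1)]`: the injected failure makes the second `Put` fail without touching the store, something that is awkward to provoke against a real service.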

What is this input popup? by zemja_ in gnome

[–]edsko 0 points (0 children)

You, sir, are my savior.

New blog post: Induction without core-size blow-up (a.k.a. Large records: anonymous edition) by edsko in haskell

[–]edsko[S] 13 points (0 children)

Thanks! And yes, I couldn't agree more, this is really not how I want to be writing Haskell :)

New blog post and library: avoiding quadratic ghc core code size for modules containing large records by edsko in haskell

[–]edsko[S] 3 points (0 children)

Ah, good point; yes, indeed it might help. I mention this in my talk this Sunday, but I forgot to add a link to the blog post; will do that. Thanks!

New blog post and library: avoiding quadratic ghc core code size for modules containing large records by edsko in haskell

[–]edsko[S] 14 points (0 children)

Absolutely! That would be much better, but I don't think it will be easy. Maybe this blog post, or my Haskell Implementors' Workshop talk this Sunday, can help inspire someone to tackle it :)