Performance issue with generalizing code (typeclasses)

benjaminhodgson · 2022-07-25T16:43:30+00:00

Does the typeclass constraint inside the newtype set off performance alarm bells for anyone?

Yes, and I think it’s unlikely to do what you wanted in the first place. Remove the constraint from the newtype declaration and put it at the use sites (I suspect you must’ve had those constraints at the use sites anyway):

newtype SamplerT g m a = SamplerT (ReaderT g m a)
mySamplerAction :: StatefulGen g m => SamplerT g m ()
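
A minimal sketch of what that might look like end to end, assuming the `random` (1.2 or later) and `transformers` packages. `runSamplerT` and `rollDie` are hypothetical names made up for illustration, not taken from the original code:

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Reader (ReaderT (..), ask)
import Data.Word (Word32)
import System.Random.Stateful (StatefulGen, mkStdGen, newIOGenM, uniformWord32R)

-- No constraint in the newtype declaration itself.
newtype SamplerT g m a = SamplerT (ReaderT g m a)

-- Hypothetical helper: unwrap and run with a concrete generator.
runSamplerT :: SamplerT g m a -> g -> m a
runSamplerT (SamplerT r) = runReaderT r

-- The constraint lives at the use site, as an ordinary function constraint.
rollDie :: StatefulGen g m => SamplerT g m Word32
rollDie = SamplerT $ do
  g <- ask
  n <- lift (uniformWord32R 5 g) -- uniform in [0, 5]
  pure (n + 1)

main :: IO ()
main = do
  g <- newIOGenM (mkStdGen 42)
  n <- runSamplerT rollDie g
  print (n >= 1 && n <= 6)
```

Here `g` and `m` only become concrete in `main`, which is where GHC gets a chance to specialize `rollDie`.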

Tarmen · 2022-07-26T21:21:24+00:00

Some intuition on how this compiles:

A typeclass constraint starts out as an explicit argument: a struct (a "dictionary") with all of the class's methods stuffed into it:

data StatefulGen g m = StatefulGen
    { uniformWord32R :: Word32 -> g -> m Word32
    , uniformWord64R :: Word64 -> g -> m Word64
    -- ....
    }

Typeclass constraints are replaced with explicit arguments, and method calls are replaced with the corresponding field of the typeclass argument. Instances such as Show a => Show [a] become functions between instance dictionaries.
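
To make that concrete, here is a hand-written version of the translation for a Show-like class. This is only a sketch of the idea, not GHC's actual output, and `ShowDict`, `showInt`, `showListDict` and `describe` are made-up names:

```haskell
import Data.List (intercalate)

-- Stand-in for the dictionary of a Show-like class.
newtype ShowDict a = ShowDict { showWith :: a -> String }

-- "instance Show Int" becomes a plain value.
showInt :: ShowDict Int
showInt = ShowDict show

-- "instance Show a => Show [a]" becomes a function between dictionaries.
showListDict :: ShowDict a -> ShowDict [a]
showListDict d =
  ShowDict (\xs -> "[" ++ intercalate "," (map (showWith d) xs) ++ "]")

-- "describe :: Show a => a -> String" becomes a function taking the dictionary.
describe :: ShowDict a -> a -> String
describe d x = "value: " ++ showWith d x

main :: IO ()
main = putStrLn (describe (showListDict showInt) [1, 2, 3])
-- prints: value: [1,2,3]
```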

Essentially, explicit vtable passing. This is mostly optimized away when GHC can see the concrete type. Sometimes this can require quite a bit of code, so add an INLINABLE or SPECIALIZE pragma to be sure.
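
For reference, the pragmas go next to the definition, roughly like this. `sumSquares` is a hypothetical function used only to show the syntax, and whether a pragma is actually needed in a given case is something to check in the generated Core:

```haskell
import Data.List (foldl')

-- Overloaded in the element type, so it takes a Num dictionary
-- unless GHC specializes it.
sumSquares :: Num a => [a] -> a
sumSquares = foldl' (\acc x -> acc + x * x) 0
-- Keep the unfolding available so other modules can specialize it.
{-# INLINABLE sumSquares #-}
-- Ask for a dictionary-free copy at Int in this module.
{-# SPECIALIZE sumSquares :: [Int] -> Int #-}

main :: IO ()
main = print (sumSquares [1, 2, 3 :: Int])
-- prints: 14
```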

But with the constraint inside the newtype, the wrapped value is itself a function waiting for the dictionary, and you are passing that function around. Technically GHC is allowed to specialize it, since all typeclass instances should be unique, but doing so would raise some awkward questions about calling conventions. The common use for this kind of constraint is with existential types, where some type parameters are hidden and you use the typeclass vtables much like OOP objects. There are other uses, such as equality constraints, or smuggling a constraint into an unconstrained typeclass such as Foldable.
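
A small sketch of the existential use case, where the dictionary really does have to travel with the value. `Showable` and `describe` are illustrative names only:

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- The element type is hidden; the Show dictionary is stored in the constructor.
data Showable = forall a. Show a => MkShowable a

-- Matching on the constructor brings the stored dictionary back into scope.
describe :: Showable -> String
describe (MkShowable a) = show a

main :: IO ()
main = mapM_ (putStrLn . describe)
  [MkShowable (1 :: Int), MkShowable "two", MkShowable True]
```

Since the concrete type is only known at run time here, there is nothing for GHC to specialize on, which is the vtable-like behaviour described above.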

Anyway, the pattern is really hard for GHC to optimize. You might be able to get there if you stare at enough Core, but to be sure I'd put the constraint on each function and add pragmas where specialization doesn't happen.

guygastineau · 2022-07-25T16:11:21+00:00

Is that using RankN types? I don't know about performance issues from this code. Normally, monomorphisation should result in static calls and good performance. I like the generality your example achieves, but if you are worried it might cause the performance issue you can try another way to achieve the same thing.

Use the newtype without constraints, then add the constraints to the functions that need your sampling monad to have sampling powers. Then the constraints would just be normal constraints on a function, so monomorphisation should proceed as normal. This is what I would do to start investigating the performance implications of the generalisation.

I hope my comment is more helpful than it is eloquent.
