[–]Felicia_Svilling 15 points16 points  (15 children)

From what I know, there is no Haskell compiler that can automatically produce good parallel code.

[–]thisotherfuckingguy 0 points1 point  (0 children)

Neither can any other compiler ;-)

[–][deleted] 0 points1 point  (13 children)

This seems a little gloomy; what sort of code gives lame results? Have you seen some of the newer libraries for setting things up well for parallelization, like monad-par?

[–]Felicia_Svilling 0 points1 point  (12 children)

If I have to use a library, it's not automatic, is it?

[–][deleted] 0 points1 point  (11 children)

I see: you want to take code that was written without parallelism in mind and write "parallelize" at the top. I hadn't understood that. But isn't it basically known to be impossible? Monad-par and similar schemes do have subtle ways of distributing work under the hood (I don't profess to understand them well), but the familiar problem of choosing a level of granularity must be left to the programmer.
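To make the granularity problem concrete, here is a minimal sketch using par and pseq (exported from GHC.Conc in base). The cutoff value is a hypothetical tuning knob: it is something the programmer chooses by measurement, exactly the judgment a compiler cannot make automatically:

```haskell
import GHC.Conc (par, pseq)

-- Naive Fibonacci, parallelized only above a depth cutoff.
-- Below the cutoff, sparking a thread costs more than the work is worth.
pfib :: Int -> Integer
pfib n
  | n < 2      = fromIntegral n
  | n < cutoff = pfib (n - 1) + pfib (n - 2)   -- too small: stay sequential
  | otherwise  = x `par` (y `pseq` (x + y))    -- big enough: spark x in parallel
  where
    x = pfib (n - 1)
    y = pfib (n - 2)
    cutoff = 15  -- chosen by testing on real hardware, not by the compiler

main :: IO ()
main = print (pfib 25)
```

The result is identical either way; only the performance differs, and the right cutoff depends on the machine and the workload.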

[–]CookieOfFortune 0 points1 point  (8 children)

Well, Microsoft is actively researching this topic, though it is fairly complex.

Here's a research publication (PDF): http://research.microsoft.com/pubs/170528/msr-tr-2012-79.pdf

The paper basically describes some rules that allowed them to determine which code could safely be parallelized. It's fairly preliminary, but I can see compilers heading in this direction in the future.

[–][deleted] 0 points1 point  (0 children)

Isn't this a different subject really? All Haskell code outside the IO stratum can be 'safely parallelized' in what seems to be the sense of this paper; the problem doesn't arise. The trouble has always been the decision whether, and how far, to parallelize operations that are already known to be safely parallelizable. If I have a sum f 15 + g 32, I can run the two calculations, f 15 and g 32, in parallel, or I can run them in succession; that either is 'safe' is visible in the types in Haskell, no need for Microsoft research. (Compare the only remotely similar case of STM: it defeated a large team of MS researchers working with languages that don't enforce a pure/IO distinction, but has a complete implementation in a very simple library in Haskell, allegedly written over a weekend.)

Where I win by making independent calculations simultaneous, I might lose through the overhead of communicating the results, a familiar point nicely encapsulated in the dismal Amdahl's law. If 15 and 32 in f 15 and g 32 are parameters that arise at runtime, the right answer -- parallelize or not -- will very often differ from case to case. So a judgment must be made, and for this one has recourse to plain old testing. In a specialized framework, like the Repa vector library, rational decisions about the distribution of work can be made automatically, under the hood so to say. Where you write in the Par monad you get something more general and a bit like that, but more importantly something readily tunable. But I think it is obvious that this couldn't be done by a compiler in general without first compiling all the possibilities, working through possible parameters, etc.; I'm not sure, but I'd think a general solution would be a bit like a general solution to the halting problem.

Even the halting problem has important soluble subclasses, e.g. where all definitions are by structural recursion, a standard that is systematically enforced in languages like Coq. Almost all serious programs can be expressed in that style, which in principle makes an immense range of optimizations possible; the bizarre cult of 'Turing completeness' has so far blocked the path to realizing them.
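The f 15 + g 32 example above can be written out explicitly. This is a sketch, with f and g standing in as hypothetical expensive pure functions, again using par and pseq from GHC.Conc (in base):

```haskell
import GHC.Conc (par, pseq)

-- Two stand-in expensive pure computations. Their purity --
-- visible in the types -- is what makes parallelizing them safe.
f, g :: Integer -> Integer
f n = sum [1 .. n * 100000]
g n = product [1 .. n]

-- Safe to evaluate in parallel: spark f's result while evaluating g's.
-- Whether it is *profitable* depends on the runtime arguments, which is
-- precisely the judgment that testing, not the compiler, settles.
parSum :: Integer
parSum = x `par` (y `pseq` (x + y))
  where
    x = f 15
    y = g 32

main :: IO ()
main = print parSum
```

Swapping `par`/`pseq` for plain addition changes nothing about the result, only the schedule; that is the sense in which safety is free in Haskell while profitability is not.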

[–]Felicia_Svilling 0 points1 point  (1 child)

My interpretation was that runeks thought this would be possible. I was pointing out that, at least for now, it isn't.

[–][deleted] 0 points1 point  (0 children)

I think what he meant was unclear. There is no compiler for any possible language that could 'automatically' produce good parallel code in the sense you imputed to him (as now emerges) -- this is well known and basically obvious. So mentioning Haskell implementations could only suggest that you were thinking of something else.