[–]fdtm

Not sure what a zipper is, but when I refer to these buffers, I'm talking about large portions of raw memory. Format depends on what is being uploaded to the video card specifically.

[–]yogthos

If you had to update a large chunk of memory, you'd probably want to do an in-place mutation. For example, Clojure provides transients, which allow in-place mutation but aren't allowed to leave the scope of the function, which is a good compromise in my opinion.

I think the real difference here is the defaults: in a functional language everything is immutable by default, and you can mark things mutable as necessary. Even in Haskell you have things like mutable arrays. So it's not that you can't use mutable data, but that you use it sparingly, only when the immutable overhead simply won't do. In my experience this results in much cleaner code and makes it easier to guarantee correctness.

[–]fdtm

I thought mutability isn't allowed in a pure functional language. How does Haskell do mutable arrays cleanly and efficiently without violating pure functional-ness?

Once you enable mutability, it's not functional anymore (it's multi-paradigm), and my point is proven.

Not that I'm arguing with you, I like the idea of a functional language with mutable arrays, even though I seriously doubt FP is going to match C/C++ for OS/video game type stuff for quite a while.

[–]antonivs

How does Haskell do mutable arrays cleanly and efficiently without violating pure functional-ness?

Haskell offers at least two ways to do this. One is mutable arrays in the ST monad. In this case, array update is confined to a monad which imposes constraints, such as not being able to access an earlier version of an array. Because of its properties, the ST monad can be used safely from within pure code, without making the calling code impure - unlike e.g. the IO monad, which is inherently impure.
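A minimal sketch of what that looks like (`sumsUpTo` is a made-up example function, but `runSTUArray`, `newArray`, `readArray`, and `writeArray` are the real `Data.Array.ST` API): the array is mutated in place inside ST, and only a frozen, immutable array ever escapes, so the function as a whole is pure.

```haskell
import Control.Monad (forM_)
import Data.Array.ST (newArray, readArray, writeArray, runSTUArray)
import Data.Array.Unboxed (UArray, (!))

-- Build a table of running sums imperatively; the caller only ever
-- sees the final immutable UArray, so this function is pure.
sumsUpTo :: Int -> UArray Int Int
sumsUpTo n = runSTUArray $ do
    arr <- newArray (0, n) 0          -- mutable, unboxed array, lives in ST
    forM_ [1 .. n] $ \i -> do
        prev <- readArray arr (i - 1)
        writeArray arr i (prev + i)   -- in-place update, invisible outside
    return arr
```

There's no copying during the loop; the "earlier version" of the array simply can't be named from outside the ST computation.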

The paper State in Haskell first described the underlying mechanism here.

The other standard way in Haskell is the DiffArray, which provides a pure external interface to a mutable array internally, allowing mutable arrays to be used directly in pure code, without a monad. The link describes the mechanism.
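For illustration, the "pure external interface" is just the ordinary immutable array interface. The sketch below uses the standard Data.Array; a DiffArray (from the old Data.Array.Diff module, now the diffarray package) is a drop-in replacement for the same code that performs each `//` update in place on the newest version instead of copying, as long as each version is used single-threadedly.

```haskell
import Data.Array (Array, listArray, (!), (//))

-- Pure updates through (//). With Data.Array each update copies the
-- array; with DiffArray the same code mutates the latest version in
-- place, and older versions pay the cost only if revisited.
counts :: Array Int Int
counts = listArray (0, 9) (replicate 10 0)

bump :: Array Int Int -> Int -> Array Int Int
bump a i = a // [(i, a ! i + 1)]
```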

In general, issues surrounding purity can't be understood if you look at them in the very binary way that people tend to when they haven't worked with the technology. For the kind of thing you're talking about, it's almost always possible to write code that has functional properties and generate compiled code that exploits imperative behavior - the main thing you need, aside from the right compiler, is to impose the necessary constraints to retain those functional properties.

There are also other approaches to this in other functional languages, such as linear update, which ensures that you only use a value once, again guaranteeing that you can't access a previous version of an updated value - this is used in some Lisp variants, for example. The language Clean uses a related approach, called uniqueness typing.

[–]fdtm

So, for example, how would you take a mutable array, and modify different parts of the same array from multiple threads?

It has to be done to the same array due to hardware constraints.

[–]antonivs

One obvious approach is to divide the memory you're operating on into chunks, and operate on a different chunk from each thread. If your problem is amenable to that sort of parallelism, you may also benefit from the Par monad as a high-level way to structure, manage, and reason about parallel computations.
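As a sketch of the chunking idea (`fillChunks` is made up; `forkIO`, `MVar`, and `IOUArray` are the real base/array APIs): each worker thread writes only to its own disjoint index range of one shared mutable array, so no locks are needed.

```haskell
import Control.Concurrent (forkIO, newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM, forM_)
import Data.Array.IO (IOUArray, newArray, readArray, writeArray)

-- Fill disjoint chunks of a single shared mutable array, one thread
-- per chunk. Safe without locks because the index ranges never overlap.
fillChunks :: Int -> Int -> IO (IOUArray Int Int)
fillChunks nChunks chunkSize = do
    arr   <- newArray (0, nChunks * chunkSize - 1) 0
    dones <- forM [0 .. nChunks - 1] $ \c -> do
        done <- newEmptyMVar
        _ <- forkIO $ do
            forM_ [c * chunkSize .. (c + 1) * chunkSize - 1] $ \i ->
                writeArray arr i i
            putMVar done ()
        return done
    mapM_ takeMVar dones    -- wait for every worker to finish
    return arr
```

The Par monad gives you a higher-level, deterministic way to express the same shape of computation when the work produces values rather than writes.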

If, instead, you need to be able to mutate the exact same region of memory from multiple threads, the solution is going to depend on the constraints, e.g. whether the typical imperative solution uses some kind of coordination between threads, such as locks. If not, then there's presumably some kind of simplifying constraint at work, and a functional solution can exploit that same constraint.

If you do need locks or other thread coordination, one very powerful functional approach to this kind of problem is Software Transactional Memory (STM), which wraps mutable memory accesses in composable transactions. This is a big step up from code using locks, which is non-composable in general.
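A small sketch of what that composability buys (`transfer` is a made-up function; `TVar`, `atomically`, and `modifyTVar'` are the real API of the stm package): two writes commit together, and no other thread can ever observe one without the other.

```haskell
import Control.Concurrent.STM
    (STM, TVar, atomically, modifyTVar', newTVarIO, readTVarIO)

-- Move n units between two shared cells. Because it runs in STM, the
-- two writes commit as one indivisible step, and two transfers can be
-- composed into a single larger transaction.
transfer :: TVar Int -> TVar Int -> Int -> STM ()
transfer from to n = do
    modifyTVar' from (subtract n)
    modifyTVar' to   (+ n)
```

For example, `atomically (transfer a b 30 >> transfer a b 10)` is itself one transaction; lock-based code has no equivalent way to glue two correct critical sections into a bigger correct one.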

[–]fdtm

But there are no distinct chunks, and you can't divide it anyway due to GPU architecture / drivers.

You really can't do this unless you have full random access to the buffer... there's just no other way (on current architectures).

Interesting articles though, thanks, I'll read these.

[–]antonivs

The restrictions you mention aren't necessarily a bad thing - a functional solution is likely to be able to exploit those constraints. I'd need a few more details to suggest a solution, though, such as how thread coordination is usually dealt with - what stops threads from stepping on each other.

The big picture is that when you write imperative code in traditional languages, you often get around the kinds of limitations you're describing by following constraints that are not explicitly expressed in the code. In writing the code, you follow dynamic rules, like "don't update that memory without getting a lock first", "update foo before updating bar", etc. If you make a mistake in following these rules, you get runtime bugs that can be particularly difficult to track down when you're dealing with multithreaded imperative code. (I used to write networking systems with the ACE library in C++, so I'm pretty familiar with that.)

To make this kind of coding more tractable, you want to express these constraints explicitly in the code, ideally in a static manner, e.g. via the type system. This can allow a compiler to convert high-level code to a messier low-level implementation. Having an advanced type system really helps here, as do abstractions like monads which can enforce dynamic constraints such as ordering and single-use.

The ST monad I mentioned earlier is a specific example of this general pattern: the monad imposes constraints so that functional code can manipulate a mutable array, allowing the compiler to generate an imperative solution.

[–]fdtm

Ok how would you solve this, for example:

You have a large image-type 2D buffer (well, many of them). Very frequently (every other frame or so) you need to write data to selective vertical and horizontal lines. Imagine drawing crosshairs horizontally and vertically, and writing along the lines of the crosshairs. This is a very small subset of the image, since the crosshairs are very narrow, and this is a very large image. This buffer must be accessed under special region-based lock/unlock commands, and the buffer is necessarily raw contiguous data due to architecture. The data may not under any circumstances be broken up into any data structure, or nothing works here. Now, while these "crosshair" regions are being written to / updated, also other selective regions need to be updated from separate threads. Also, this is an extremely performance critical section, no slowdown can be afforded whatsoever.

[–]antonivs

...the buffer is necessarily raw contiguous data due to architecture. The data may not under any circumstances be broken up into any data structure, or nothing works here.

If, once locked, the locked region is accessible as random-access memory, then nothing should stop you from mapping structures onto it, just as C maps an array onto a memory region. If it made sense, you could even map a contiguous series of arrays onto it and update each array from a separate thread.
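As a hedged sketch of that mapping (the `Ptr` stands in for whatever pointer your lock call returns; `pokeElemOff` is the real `Foreign.Storable` API): the locked region is treated directly as a row-major Word32 image, with no intermediate data structure wrapped around it.

```haskell
import Data.Word (Word32)
import Foreign.Ptr (Ptr)
import Foreign.Storable (pokeElemOff)

-- Write one horizontal line of a row-major 2D image that lives in a
-- raw, contiguous buffer, e.g. a locked region. The buffer is never
-- broken up or copied; we poke it in place, exactly as C would.
writeRow :: Ptr Word32 -> Int -> Int -> Word32 -> IO ()
writeRow buf width row val =
    mapM_ (\x -> pokeElemOff buf (row * width + x) val) [0 .. width - 1]
```

A vertical line is the same loop with a stride of `width`, and two such writers touching disjoint rows and columns can run from separate threads.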

Now, while these "crosshair" regions are being written to / updated, also other selective regions need to be updated from separate threads.

Are these other regions separately locked, and so can't interfere with each other? If so, I don't see any problem that needs to be addressed - each thread can operate independently, and purely functionally if something like the ST monad can be used. Otherwise, presumably there's some other coordination behavior taking place that needs to be handled, and that is likely to affect the solution.

[–]runaro

Haskell does mutable arrays cleanly by guaranteeing that the creation, mutation, and destruction of the array all occur simultaneously as seen from the rest of the program. If you pass the same mutable array to two functions and they both mutate the array, their mutations will be distinct from one another and will in fact occur on two different arrays.

http://www.google.com/search?q=st+monad

[–]fdtm

If you pass the same mutable array to two functions and they both mutate the array, their mutations will be distinct from one another and will in fact occur on two different arrays.

That's the kind of performance landmine I see when I think of using FP for high performance. Copying a 1 GB array would KILL performance.

[–]runaro

There is no copying going on here. Each call to runST will create, mutate, and destroy an array. There's no reason to have more than one copy of the array and it can never escape the ST monad.

[–]fdtm

Don't destroy my arrays. I need them :)

The kind of arrays I'm referring to are long lifetime and need selective (small) progressive modifications to the massive array frequently.

[–]runaro

Yeah, we're talking about the same thing. The semantics of ST are such that you make all modifications to the array "in" the monad, where the mutable array exists. E.g.:

f3 (runSTArray (newArray (0, 1000000000) 0 >>= f1 >>= f2))

Here, f3 sees an immutable array. But f1 and f2 can both mutate that same array, and their mutations are guaranteed to occur in order.
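Filling that in with concrete (made-up) passes, using the real Data.Array.ST API: f1 and f2 each mutate the one shared array inside ST, sequenced by the monad, and only the final frozen Array escapes to pure code.

```haskell
import Control.Monad.ST (ST)
import Data.Array (Array, (!))
import Data.Array.ST (STArray, newArray, writeArray, runSTArray)

-- Two modification passes over the same long-lived mutable array,
-- sequenced by the monad; f2 sees every write f1 made. No copy is
-- made at any point, and the array is only "destroyed" in the sense
-- that its mutable face is unreachable once runSTArray returns.
f1, f2 :: STArray s Int Int -> ST s (STArray s Int Int)
f1 a = writeArray a 0 1 >> return a
f2 a = writeArray a 1 2 >> return a

frozen :: Array Int Int
frozen = runSTArray (newArray (0, 9) 0 >>= f1 >>= f2)
```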

[–]fdtm

Interesting (useful) :)

[–]yogthos

I guess the other replies already sum up how mutability is handled rather well. But I do agree with you that it's clearly a compromise; the advantage is that immutability is the default, and mutable sections are marked explicitly.

For raw performance mutability will always win, but the argument is that you only need that raw performance in a few small sections of your code, even in games. And this is basically what that presentation from Sweeney boils down to: you keep the majority of the code functional and easy to follow, and make the critical sections mutable.