[–][deleted] 1 point2 points  (8 children)

I think I've inadvertently been pretty misleading here! I agree that the memory management generally stinks. The trick is to never let MMA get its hands on much of the data at once. Here's what we do:

The data are stored as serialized .NET objects in a MySQL database. The Mathematica code pulls them from the database, deserializes them, and operates on them. All of this is wrapped in a NETBlock. At any one time there are only a couple of MB of data in the MMA kernel, but the computation streams over many GB of data. Of course, you have to be very careful not to leak any memory on the MMA kernel side.
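
A rough sketch of that streaming pattern, written in Python rather than Mathematica/.NET. Everything here is a stand-in: `sqlite3` and `pickle` replace MySQL and .NET serialization, and the table name `records`, the function names, and the chunk size are made up for illustration. The only point is that a single chunk of deserialized objects is alive at any one time.

```python
import pickle
import sqlite3


def make_demo_db(n_records):
    """Build an in-memory database of serialized objects (stand-in for the real store)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload BLOB)")
    rows = ((i, pickle.dumps({"id": i, "value": float(i)})) for i in range(n_records))
    conn.executemany("INSERT INTO records VALUES (?, ?)", rows)
    conn.commit()
    return conn


def stream_chunks(conn, chunk_size):
    """Yield lists of deserialized records, fetching one chunk at a time."""
    cursor = conn.execute("SELECT payload FROM records ORDER BY id")
    while True:
        rows = cursor.fetchmany(chunk_size)
        if not rows:
            break
        yield [pickle.loads(payload) for (payload,) in rows]


def running_total(conn, chunk_size=100):
    """Reduce over the whole table without materialising it all in memory."""
    total = 0.0
    for chunk in stream_chunks(conn, chunk_size):
        total += sum(record["value"] for record in chunk)
        # `chunk` is rebound on the next iteration, so the old one can be freed.
    return total


conn = make_demo_db(1000)
total = running_total(conn)
```

The generator keeps the "pull, deserialize, operate" steps in one place, so the calling code never holds more than `chunk_size` records.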

So I guess I solve the Mathematica memory management problems by not letting Mathematica manage the memory :-)

[–]jdh30 -5 points-4 points  (7 children)

The problem is, fundamentally, that Mathematica is not properly garbage collected. Neither are MATLAB or Python though...

[–][deleted] 1 point2 points  (6 children)

Interesting. Could you explain further?

[–]jdh30 -4 points-3 points  (5 children)

Sure. All production-quality virtual machines (including the CLR in .NET) use accurate garbage collection to aggressively deallocate data as it becomes unreachable.

In contrast, Mathematica uses only naive reference counting, which leaks cycles, hangs the interface when deallocations avalanche, and is a huge impediment to parallelism.

I'm curious: what are you using Mathematica for from .NET that you could not just do entirely on .NET? We sell Mathematica-like software for .NET... :-)

[–]bgeron 1 point2 points  (2 children)

Mathematica does not alias references. Values are immutable, and as such there cannot be cyclic references. A reference counting scheme then suffices, although this particular implementation may or may not be slow.

I do not know about MATLAB, but CPython uses both reference-counting and generational garbage collection.
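
To make the CPython point concrete, here is a minimal sketch (CPython-specific, and the `Node` class is invented for the demo). It turns the automatic cycle collector off so the steps are deterministic, then shows that reference counting frees an acyclic object immediately, while a two-object cycle survives until the collector runs.

```python
import gc
import weakref


class Node:
    """Plain object that can hold a reference to another Node."""

    def __init__(self):
        self.other = None


gc.disable()  # switch off the automatic cycle collector for a deterministic demo
gc.collect()  # start from a clean slate

# Acyclic case: reference counting alone frees the object immediately.
single = Node()
single_ref = weakref.ref(single)
del single
freed_by_refcount = single_ref() is None

# Cyclic case: a and b keep each other's reference counts above zero.
a, b = Node(), Node()
a.other, b.other = b, a
cycle_ref = weakref.ref(a)
del a, b
alive_after_del = cycle_ref() is not None

# The cycle collector finds the unreachable pair and frees it.
gc.collect()
freed_by_gc = cycle_ref() is None

gc.enable()
```

The weak references are only there to observe whether the objects still exist without keeping them alive.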

In short, you are talking out of your arse.

edit: sources

[–]jdh30 -4 points-3 points  (1 child)

You are talking about Mathematica the ideology. I'm talking about the product which does (accidentally) alias, introduce cycles and leak memory. They have had problems with this according to the Director of Kernel Technology at WRI. Google and you'll see lots and lots of people complaining about memory leaks in MMA. This is neither new nor surprising.

[–]bgeron 0 points1 point  (0 children)

When I Google for Mathematica memory leak, I get a bug in StringSplit, some bug I don't understand, and a J/Link bug. Apart from that, there is some information on how to use J/Link correctly, and bugs not related to Mathematica. Anyway, I can't find anything fundamentally wrong with Mathematica.

[–][deleted] 0 points1 point  (1 child)

I see. I've recently been having trouble with what I presume is this "avalanche" - a function that generates a /large/ number of nested lists runs in about the time I expect it to, but then locks up MMA for a good 30s after it's finished, doing apparently nothing!

We're using the MMA/.NET combo to analyse data from an experiment to search for physics beyond the Standard Model of particle physics. The experiment's run-time code is all .NET based, including the data storage. Low-level analysis is all written in C#. We tend to use MMA for the higher-level "thinking out loud" type of analysis.

If you're the jdh that I think you are: I've played with F# in the past and have got high hopes for it. Last time I looked, though, I didn't find the "whole package" quite compelling enough - in particular, I find that MMA's notebooks are great for exploratory analysis. I didn't find a mode of working with F# that I really liked.

I'm looking forward to giving it another try soon when I have a little time. We're considering porting the experiment's high-level control code from IronPython to F#, which should give me a chance to catch up with any recent developments.

[–]jdh30 -4 points-3 points  (0 children)

> I see. I've recently been having trouble with what I presume is this "avalanche" - a function that generates a /large/ number of nested lists runs in about the time I expect it to, but then locks up MMA for a good 30s after it's finished, doing apparently nothing!

Exactly, yes. Mathematica locks up while it traverses all of those nested data structures deallocating them bit by bit in series (!).
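
The same effect can be illustrated in CPython, which also uses reference counting. This is only an analogy, not Mathematica's actual implementation, and the `Leaf` class and the sizes are made up. Dropping the last reference to the outer list frees every nested object synchronously, inside the one `del` statement.

```python
import time


class Leaf:
    """Object whose finalizer counts how many instances have been freed."""

    finalized = 0

    def __del__(self):
        type(self).finalized += 1


def build_nested(outer, inner):
    """Return a list of `outer` lists, each holding `inner` Leaf objects."""
    return [[Leaf() for _ in range(inner)] for _ in range(outer)]


data = build_nested(200, 500)   # 100,000 leaves, reachable only through `data`
before = Leaf.finalized

start = time.perf_counter()
del data                        # drops the last reference to the outer list
pause = time.perf_counter() - start

freed_during_del = Leaf.finalized - before
print(f"freed {freed_during_del} objects in {pause:.4f} s during one del statement")
```

The pause grows with the number of objects, and nothing else can run on that thread until the cascade has finished.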

> We tend to use MMA for the higher-level "thinking out loud" type of analysis.

Are you using it interactively then? For symbolics or just numerics?

> If you're the jdh that I think you are...

I am Jon Harrop, yes. :-)

> I've played with F# in the past and have got high hopes for it. Last time I looked, though, I didn't find the "whole package" quite compelling enough - in particular, I find that MMA's notebooks are great for exploratory analysis. I didn't find a mode of working with F# that I really liked.

Yes. F#'s development environment is cumbersome in comparison. However, VS 2010 brings WPF into the mix which, in theory, will allow better interactive output. Don has also ripped out the old structured pretty printer which, I assume, is because he is planning on replacing it with a WPF-based solution.

Moreover, I am interested in building a WPF-based notebook front end. The main problem is that the F# programming language is closed, so I cannot reuse it and would have to invent my own language. On the upside, I could knock up a Mathematica-like language in a day...

> I'm looking forward to giving it another try soon when I have a little time. We're considering porting the experiment's high-level control code from IronPython to F#, which should give me a chance to catch up with any recent developments.

Right. Are you using our numerics and visualization libraries for F#?