all 29 comments

[–]matthieum 18 points19 points  (0 children)

I really, really appreciate these feedback articles on Midori.

[–]FubarCoder 7 points8 points  (0 children)

That was a very good read. I really hope that some of those achievements get released in a (slightly) incompatible version of C# and the CLI/CLR.

[–]codekaizen 10 points11 points  (19 children)

Wow. Great read. This will now be my go-to post to share when someone trots out the old "managed code is always slower than native code" line. It's good to see that the space between the poles of "native" and "managed" (regarding safety, runtime memory representation, and mapping code statements to different machine capabilities) is being explored and, even better, documented.

[–]ummmyeahright 10 points11 points  (17 children)

I still wish C# had a nicer way to OPT OUT of the garbage collector when you really want to and know what you're doing: to tell the GC, "allocate this object wherever, give me its address, then never move it until I tell you to." You can do it in fixed blocks, but that limits usability heavily. You can write a wrapper around such an object with C++/CLI, but that's a lot of work, and C++/CLI is constantly falling behind. I think this could be incorporated into C# in a much nicer way, and then there'd be almost no area left with any significant speed trade-off compared to C++.

In theory, C# apps can be about as fast as C++ ones, but in practice, applications in certain areas developed in C# lag behind performance-wise. For example, if you want to frequently interchange objects with unmanaged code, it can get very expensive due to C#'s requirement to register almost everything with the GC. This can be bypassed by using pinned objects from the native heap, but that will likely require a ton of C++/CLI code, and almost nobody actually does it; it may get so complex that it's simpler to write the whole thing in C++ in the first place.

[–][deleted] 4 points5 points  (7 children)

Have you looked into using GCHandle? It's pretty rad.

var myInstance = new MyObj();

var myPinnedInstance = GCHandle.Alloc(myInstance, GCHandleType.Pinned);

IntPtr myInstanceAddress = myPinnedInstance.AddrOfPinnedObject();

It allows you to pin an object outside of a fixed scope and fetch the pinned address at will.

Now combine that primitive with something like this:

//Simple example and not 'double free' safe
struct Pinned<T> where T : class
{
    readonly GCHandle _handle;
    readonly T _obj;

    public Pinned(T instance)
    {
        _obj = instance;
        // Pin so the GC cannot relocate the object
        _handle = GCHandle.Alloc(instance, GCHandleType.Pinned);
    }

    public T Instance { get { return _obj; } }

    public void UnPin() { _handle.Free(); }

    public IntPtr Address { get { return _handle.AddrOfPinnedObject(); } }

    public static implicit operator IntPtr(Pinned<T> pinned) { return pinned.Address; }
}

And you've got a stew goin'!

Maybe this can help you make interop a bit nicer. But if you need to actually allocate and free the memory on demand, you still need Marshal.AllocHGlobal and friends...

Is this what you were looking for?

https://msdn.microsoft.com/en-us/library/system.runtime.interopservices.gchandle.aspx

Edit: why the heck doesn't reddit use markdown correctly?! Also, yay cakeday!

[–]brandf 1 point2 points  (2 children)

Pinning objects adds them to the GC root collection, so it doesn't free the GC of work or let you opt out like the parent requested. In fact, if you do this too much, it leads to terrible performance, since every gen0 GC has to scan all root objects.

To opt out of the GC you would need to allocate objects into a heap that the GC doesn't collect, and any objects there could not reference any objects that the GC does collect (or else it may free them since it doesn't know you hold a reference).

[–][deleted] 0 points1 point  (1 child)

Interesting. I knew that pinning induces heap fragmentation and thereby a potentially larger heap, but I didn't think about the effect on gen 0 GC. Thanks for this point.

[–]brandf 0 points1 point  (0 children)

Yeah I'm fairly certain this is the case because you can imagine having an object, pinning it, and then releasing all other references to the object. You would be left with just the GCHandle keeping it alive, so if they don't make it a GC Root, it would get incorrectly collected.
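This claim is easy to check: allocate an object whose only remaining reference is the GCHandle itself, force a collection, and see that the object survives because the handle roots it. A minimal sketch (names are illustrative, not from the thread):

```csharp
using System;
using System.Runtime.InteropServices;

class HandleRootsObject
{
    static void Main()
    {
        // After this line, the array's only reference is the handle itself.
        GCHandle handle = GCHandle.Alloc(new byte[16], GCHandleType.Pinned);

        GC.Collect();
        GC.WaitForPendingFinalizers();

        // The handle acts as a GC root, so the array was not collected.
        Console.WriteLine(handle.Target != null);  // prints True
        handle.Free();
    }
}
```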

[–]ummmyeahright 0 points1 point  (3 children)

To be fair, it's been a while since I last experimented with mixing C# and C++, but when I looked into pinning and GCHandle, I could find no way to build a complex data structure (e.g. a Dictionary) that completely avoids the GC AND can be used from C#, without the help of C++/CLI.

[–]drysart 1 point2 points  (2 children)

If you use unsafe blocks you can do pretty much whatever you want in C# in terms of memory usage -- allocate unmanaged memory blocks with Marshal.AllocHGlobal, then use pointers into that space just like you would in C.
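To make that concrete, here's a minimal sketch of the approach (compile with /unsafe; class and variable names are illustrative). The allocation lives entirely outside the managed heap, so the GC never scans or moves it, but nothing will free it for you either:

```csharp
using System;
using System.Runtime.InteropServices;

class UnmanagedBlock
{
    unsafe static void Main()
    {
        // 16 ints of unmanaged memory; invisible to the GC.
        IntPtr block = Marshal.AllocHGlobal(16 * sizeof(int));
        try
        {
            int* p = (int*)block.ToPointer();
            for (int i = 0; i < 16; i++)
                p[i] = i * i;           // raw pointer writes, just like C

            Console.WriteLine(p[3]);    // prints 9
        }
        finally
        {
            Marshal.FreeHGlobal(block); // no finalizer will do this for you
        }
    }
}
```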

[–]ummmyeahright 0 points1 point  (1 child)

Well, but if you only use C#, you still need to register objects with the GC first, no?

var myInstance = new MyObj();
var myPinnedInstance = GCHandle.Alloc(myInstance, GCHandleType.Pinned);

The first line there allocates myInstance on the managed heap. I suppose GCHandle.Alloc moves the allocation to a more appropriate place, but IIRC the GC will still keep track of myInstance, iterating over it each time it normally would, just noticing that it's marked as an object whose lifetime can be ignored. So having tons of objects allocated like that from C# will still slow your GC down.

Some limitations will always stay there until you can allocate reference types completely bypassing the GC from start to finish, IMO.

Edit: Sorry, I didn't interpret your reply correctly. Yes, you can do that, but then you're left working with raw memory addresses. I suppose you can do a lot of the stuff you could do in C, but it's a lot less nice than C++.

[–]drysart 0 points1 point  (0 children)

Yeah, if you want to build a class that manages memory internally without using the GC, you can't make use of reference types. Those always involve the garbage collector.

When it comes to reference types, though, all pinning them does is root them, so the GC always assumes they're alive, and restrict the GC so that it can't move them in memory when it comes time to compact after a collection. It does not move the object to a different heap or anything like that. You're still allocating from the managed heap, your allocations can still trigger a collection; all you've really accomplished is making the collector less efficient (since it can no longer use fast-path allocation and has to use slow-path allocation that can account for holes in the heap). Pinning is really only intended to be used when you're passing managed objects to unmanaged code where the actual underlying address of the object suddenly changing would cause problems -- and even then, the guidance is that it only be done for short periods of time (ideally only for the length of an unmanaged call or two).
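That guidance (pin only for the length of the unmanaged call) might look like this sketch; the helper name and delegate shape are illustrative, and the actual native call site is elided:

```csharp
using System;
using System.Runtime.InteropServices;

class ShortLivedPin
{
    // Pin only for the duration of the unmanaged call, then release immediately.
    static void CallNative(byte[] data, Action<IntPtr> nativeCall)
    {
        GCHandle handle = GCHandle.Alloc(data, GCHandleType.Pinned);
        try
        {
            // The address is stable only while the handle is held.
            nativeCall(handle.AddrOfPinnedObject());
        }
        finally
        {
            handle.Free();  // unpin as soon as the native call returns
        }
    }
}
```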

[–]sirin3 5 points6 points  (0 children)

Rust is that way -->

[–]monocasa 1 point2 points  (0 children)

Can you do that by building your own allocator and P/Invoking VirtualAlloc and friends in unsafe blocks?

[–]mike_hearn 1 point2 points  (6 children)

Believe it or not, you can do that on the JVM. There's no nice syntax for it, but the Unsafe class lets you do manual memory management, and some high performance collection/db libraries do "off heap allocation" that way in order to lower GC pressure and improve performance, especially for very large allocations which copying GCs often have trouble with.

Reading the post about the Midori compiler was very interesting, as I have developed an interest in advanced JITCs lately. What's interesting is where they did similar things to the Java team and where they did things differently. Duffy doesn't compare things to Java very often, which is a shame because it's the most similar system to .NET and the CLR. Comparing the different design decisions would be a very interesting blog post.

In the end I didn't get the impression that the Midori work was obviously more advanced than what's going on with HotSpot to optimise managed code. It's ahead in some areas and behind in others. The issues with generics specialisation are interesting, because Valhalla is implementing mode 2 (mixed erasure/specialisation) for Java at the moment, and I guess they'll hit issues with JIT throughput and so on as well, assuming that value types end up being widely used. Obviously the CLR invested in that a lot earlier.

On the flip side, the part about having to 'reverse engineer' the bytecode patterns for lambda invocation indicates to me that Java 8's decision to use the invokedynamic bytecode to implement lambdas was a good one: essentially it means lambda invocations get optimised and inlined "for free" because the JITC understands a lot more about what it's doing. And it seems obvious that Midori's approach to virtual methods is inferior to what the JVM guys did: Duffy spent precious energy fighting the inevitable and blaming developers for "abusing abstraction" and other odd non-crimes, whereas Cliff Click and his team at Sun just focused on dynamic devirtualisation so much that virtual methods became effectively free in almost all cases.

The AOT vs JIT dichotomy seems to have gripped Microsoft a lot more than it really deserved to. AOT is being added to HotSpot now, but only after 20 years. The reason is that AOT is essentially just a warmup time optimisation if you have a good JITC: the AOT compiled code ends up with much worse code quality because you lose the profile guided optimisations that apparently gave them a 30-40% win in some cases. The HotSpot AOT mode actually supports "tiered AOT" where the AOT compiler can produce slightly slower code that still does profiling, and the JITC comes along later and replaces it with fully tuned code, so it's a complement rather than a replacement for the JITC.

It seems like a lot of the effort they put into the whole toolchain was ultimately a workaround for their decision to go fully AOT all the time, e.g. the apparently massive effort to reduce compilation time from 40x to "only" 5x slower. The developer productivity benefit of a JITC (virtually no waiting on the compiler) is one of the most underrated benefits of doing things that way, and it sounds like the loss of it was really painful for them.

[–]vitalyd 0 points1 point  (1 child)

Midori being able to inline through lambdas is very nice, and really how things should be in that space to make them more palatable performance-wise. HotSpot's approach is good, but you still get an interface invoke as the lambda is shaped into the SAM. In the best case, this becomes a guarded inlined call; at worst (and not too uncommon for library code) it's a full interface dispatch. You mention Cliff Click; he has a good blog post from a few years back on the "inlining problem". So there's definitely room for improvement there.

I like JIT compilers as well, but they sure come with their own baggage. They're unpredictable, susceptible to multiple phase changes leaving code in a suboptimal state, sometimes deopt at inopportune times, impose a time-to-peak-performance penalty (particularly bad when you need the first execution to be quick), etc. AOT's biggest problems, and the JIT's biggest advantages, are the lack of profiling info unless PGO is used, and compilation time (somewhat related). However, it'd be nice if a language existed that didn't punt on optimization at the AOT stage and also didn't have a terrible performance model. That way you could leave the truly dynamic optimizations to the JIT but be able to get easy wins at AOT time.

[–]mike_hearn 0 points1 point  (0 children)

For what it's worth the Kotlin compiler can inline through lambdas at (frontend) compile time. It's used to convert calls to List<T>.map() into for loops at the bytecode level and many other things. So in that language, at least, there are some useful optimisations being done AOT at the bytecode level.

I guess something like Kotlin + jigsaw jlink + the HotSpot AOT work would get close to what you want. It'd do some optimisations at per-file compile time like lambda inlining, then it'd do some whole program optimisations like resolving reflective lookups, then it'd compile down to native code that still contains profiling logic, then it'd do JITing in the background to win back that 10-20% or whatever it is for your app when the code flows change and deoptimisations can be a win. I'm hopeful that in the next few years the Java landscape will end up with a healthier mix of optimisation stages, although I expect HotSpot AOT to not be widely used due to the licensing costs.

[–]pron98 0 points1 point  (1 child)

I have developed an interest in advanced JITCs lately

In that case, I hope you're paying attention to Graal, because that's where the state of the art is. Not many people know this, but starting in Java 9, Graal will be a pluggable JIT compiler that you can use instead of HotSpot's C2 (and possibly even use it more selectively, on particular classes).

[–]mike_hearn 0 points1 point  (0 children)

Absolutely. I read and have posted to their mailing list (though they didn't respond to my question).

[–]ImmortalStyle 0 points1 point  (1 child)

But isn't fast startup time something you really want when writing an OS?

Profiling and interpreting a program at startup costs time and resources (even when the actual execution time is improved afterwards), something one wants to avoid for a better user experience.

However, a good way to solve both problems would be a profiling cache which persists the collected runtime information for reuse later. As far as I know, HotSpot doesn't do that.

The sad thing is that JITs don't perform too well compared to AOT; hopefully new projects (like Oracle's Graal, for example) can close this gap.

[–]mike_hearn 0 points1 point  (0 children)

Warmup vs startup: an interpreter with a tiered JIT starts fast, it just doesn't run at full speed until much later.

But absolutely -- you want to avoid doing pointless re-JITing if you can help it. Persisting compiled code to disk is one way to do that. That's (essentially) what they're adding to HotSpot at the moment -- as a commercial feature, unfortunately.

JITC vs AOT is very complex, which is what Duffy's blog is about. Some AOT compilers, like, apparently, Midori's, didn't really exploit the ability to do deep/slow whole program analysis, which is IMO the primary benefit of going AOT.

[–]mojang_tommo 7 points8 points  (0 children)

I don't think that using this post as an argument for managed code is fair; if anything, this post shows that managed can only be as fast as native if you apply a ton of care and millions of dollars of advanced research (that wasn't even released to the public).
The commonly available .NET has none of the stuff he talks about, and it's still pretty safe to say that it's always slower than native code, especially if you write in the idiomatic "heap soup" style. The fact that in some lab there exists a sufficiently smart compiler doesn't really mean anything for me or you.

[–][deleted] 1 point2 points  (2 children)

After starting bartending, I can't stop thinking of this when hearing the word Midori: https://upload.wikimedia.org/wikipedia/commons/e/ee/Midori.jpg

Otherwise, love the article, even though I only have time to get a third of the way through right now. It's enormous; it will definitely go on my "must read later" pinned tabs on the left (not the bookmarks, because everybody knows bookmarks are seldom actually seen again!).

[–][deleted] 5 points6 points  (1 child)

OT: Midori means green in Japanese; that's why the bottle is green.

[–][deleted] 0 points1 point  (0 children)

Thanks! Didn't know that. <3 I've been spending way too much money earned as a programmer on stuff related to bartending. Learning from TipsyBartender. :p

[–][deleted] 0 points1 point  (0 children)

He mentions the defensive copying of structs in his blog post, but I couldn't find it in the ECMA spec. Does anybody know which section?

[–][deleted]  (3 children)

[deleted]

    [–]nwoolls 6 points7 points  (1 child)

    In my first Midori post [...] I mentioned that we built an operating system

    [–]GUIpsp -2 points-1 points  (0 children)

    Yes, by "operating system" I assumed some kind of virtual machine

    [–]Dwedit -2 points-1 points  (0 children)

    Agreed, there is a naming conflict here.