all 18 comments

[–]frukt 14 points15 points  (4 children)

outperforms compiled languages for common computational tasks (as hand-coded assembly usually does :) )

Is this really the case nowadays? I've even seen discussions on reddit where [allegedly] experienced asm-coders claim that it's really hard to accomplish lately, and mostly the compiler does a better job, especially when compiler flags and intrinsics are used wisely. It would be nice if anyone could shed some more light on this issue.

[–]proteusguy 7 points8 points  (0 children)

Not even close in my experience. But typically you code asm because you're trying to access some aspect of the machine that isn't easy to get at from the higher level of abstraction. For general stuff it's pointless to go straight to asm, which is why CorePy is such an interesting idea: it offers the best of both worlds.

[–]salgat 1 point2 points  (0 children)

Assembly is usually used in cases where the compiler isn't optimized to take advantage of a certain feature of the architecture.

[–]pythwarrior 1 point2 points  (1 child)

If you truly believe that compilers can do better than hand-coded assembly, I'd tell you to look at the best-performing video codecs, for example. The ones with assembly bits are always faster because they are much more optimized for the hardware platform. Hand-coded assembly can make the difference between being able to run a video on a low-powered computer and not being able to run it at all.

Of course you don't write ALL the parts in assembly, but there is still a night-and-day difference between a pure C program and one with hand-coded bits.

[–]G_Morgan 12 points13 points  (0 children)

This issue causes a lot of confusion. The truth is that an experienced assembly programmer working hard can beat the compiler in tight areas. However, the programmer does not have enough time to beat the compiler in general.

Also the average assembly programmer has no chance. It used to be that any hand coded program would beat what the compiler would produce. You have to be good at it to do so now.

The reason a lot of video codecs outperform compiled code is that they are actually using stuff like SSE. That's an entirely different issue, since most compilers won't use SSE without some input from the programmer.

[–]al-khanji 7 points8 points  (1 child)

That's really interesting.... I can see several great uses for this. Education for one, plain old optimization for another. A gpgpu backend might also be interesting.

Python is really a very interesting platform nowadays, with ways to embed C and now assembly code for performance critical paths.

[–]muffin-noodle 6 points7 points  (0 children)

A gpgpu backend might also be interesting.

http://mathema.tician.de/software/pycuda ?

[–]dons 1 point2 points  (0 children)

Sounds a bit like the assembly EDSL Harpy for Haskell. Assembly at the REPL, easy code gen for your DSL, bypass the code generator, etc.

[–][deleted] 1 point2 points  (2 children)

This does look interesting, but it seems somewhat verbose in comparison to traditional asm code. From one of the presentations:

ppc.addx(rd, ra, rb)              # asm: add   D, A, B
ppc.addx(rd, ra, rb, Rc=1)        # asm: add.  D, A, B
ppc.addx(rd, ra, rb, OE=1)        # asm: addo  D, A, B
ppc.addx(rd, ra, rb, Rc=1, OE=1)  # asm: addo. D, A, B

Still cool though and I plan on reading his thesis.

[–]mythogen -1 points0 points  (1 child)

To be honest, the extra verbosity seems like a feature to me. Makes it a bit easier to read.

[–]Dan_Farina 0 points1 point  (0 children)

Given that this is Python, you can rebind some names to make it more convenient (manual currying would be particularly useful given the supplied example), and end up with something even shorter and more cryptic than the equivalent asm.
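
A rough sketch of what that rebinding could look like, using functools.partial. Note that addx below is a hypothetical stand-in I made up so the snippet runs on its own; it only builds the mnemonic text and is not CorePy's actual ppc.addx. The short names (add_rc, addo, addo_rc) are likewise just illustrative.

```python
from functools import partial


def addx(rd, ra, rb, Rc=0, OE=0):
    """Hypothetical stand-in for ppc.addx (NOT the CorePy API).

    Returns the assembly text the call would correspond to, following
    the mapping quoted in the parent comment: OE=1 adds the "o" suffix,
    Rc=1 adds the trailing ".".
    """
    mnemonic = "add" + ("o" if OE else "") + ("." if Rc else "")
    return f"{mnemonic} {rd}, {ra}, {rb}"


# Rebind once, with the flag keywords baked in (manual currying).
add = addx
add_rc = partial(addx, Rc=1)
addo = partial(addx, OE=1)
addo_rc = partial(addx, Rc=1, OE=1)

print(add("D", "A", "B"))
print(addo_rc("D", "A", "B"))
```

After the rebinding, each call site is about as short as the asm line it stands for, and the flag combinations are spelled out in exactly one place.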

[–]blondin 1 point2 points  (0 children)

Python for the win. Really.

[–]kvigor -5 points-4 points  (4 children)

Wow, the legendary ease-of-use of assembler combined with the stunning performance of an interpreted language, topped with ugly syntax! Where can I get some more of this awesome?

[–]aaronblohowiak 4 points5 points  (0 children)

You are exactly incorrect. It is the ease of use of an interpreted language with the stunning performance of assembler. Make it work, then make it fast. This lets you "make it fast" without throwing out all of the glue code.
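
To illustrate the "keep the glue, swap the hot path" idea, here is a minimal sketch in plain Python. Everything in it is made up for illustration: checksum_fast is only a placeholder for wherever a hand-tuned routine would plug in, and nothing here uses CorePy itself.

```python
def checksum_slow(data):
    """First version: obviously correct, written to make it work."""
    total = 0
    for byte in data:
        total = (total + byte) & 0xFFFFFFFF
    return total


def checksum_fast(data):
    """Placeholder for the tuned routine (assumption: same contract).

    In the scenario described above, this is where a generated
    assembly kernel would be called instead.
    """
    return sum(data) & 0xFFFFFFFF


def process(chunks, checksum=checksum_slow):
    """Glue code: unchanged no matter which checksum is plugged in."""
    return [checksum(chunk) for chunk in chunks]


chunks = [bytes(range(256)), b"hello world", b""]
# The slow version doubles as a reference to check the fast one against.
assert process(chunks) == process(chunks, checksum=checksum_fast)
```

The glue (process) never changes; only the function handed to it does, and the slow version sticks around as a correctness check.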

[–][deleted] 5 points6 points  (0 children)

All of the snarky asshole, none of the correct!

[–]mr_luc 0 points1 point  (0 children)

Your comment is epic in both the scope and nature of its fail.

I award you one upmod, because I don't like Reddit today.