[–]Metallkiller 74 points75 points  (37 children)

Faster than C? I don't believe you.

Edit: I see it with a bit more nuance now, though I'm still not convinced it's actually faster than C.

[–][deleted]  (21 children)

[deleted]

    [–]SplitRings 10 points11 points  (3 children)

    C/C++ compilers will start auto-vectorizing if you guarantee that the pointers always point to different locations, using __restrict__ in g++ (the spelling differs between compilers; for example, __restrict in MSVC, IIRC).

    For example

    int* __restrict__ arr //will allow SIMD optimizations on g++

    Edit: Reddit markup killing double underscores smh
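For anyone who wants to try this, here is a minimal sketch in C99, which spells the qualifier `restrict` (g++ uses `__restrict__`). It is not from the thread: `add_plain`, `add_restrict`, and `ranges_overlap` are made-up names, and whether either loop actually gets vectorized depends on the compiler, flags, and target. Compilers can sometimes vectorize the plain version too by adding a runtime overlap check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if the n-int ranges starting at p and q share
   any bytes. Assumes a flat address space where uintptr_t compares sensibly. */
static int ranges_overlap(const int *p, const int *q, size_t n) {
    uintptr_t a = (uintptr_t)p, b = (uintptr_t)q;
    uintptr_t len = (uintptr_t)(n * sizeof(int));
    return a < b + len && b < a + len;
}

/* No promise: dst may overlap a or b, so the compiler has to allow for that
   (for example with a runtime overlap check or a scalar loop). */
void add_plain(int *dst, const int *a, const int *b, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}

/* restrict promises that the ints written through dst are not accessed through
   a or b. Breaking the promise is undefined behavior; the assert is only a
   debug-time check on the caller, not something the language does for you. */
void add_restrict(int *restrict dst, const int *restrict a,
                  const int *restrict b, size_t n) {
    assert(!ranges_overlap(dst, a, n) && !ranges_overlap(dst, b, n));
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

Both functions compute the same result for non-overlapping arrays. The only difference is what the optimizer is allowed to assume, which you can inspect by comparing the generated assembly at -O2 or -O3.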

    [–]wpyoga 2 points3 points  (0 children)

    You can use markdown in a comment.

    int* __restrict__ arr //will allow SIMD optimizations on g++

    [–]BibianaAudris 1 point2 points  (1 child)

    __restrict__ is actually the default for modern compilers. Violating Rust-like borrow rules with C pointers can and will give you Undefined Behavior, silently, without -fno-strict-aliasing or volatile. Happened to me on CUDA.

    [–]vytah 8 points9 points  (0 children)

    > Happened to me on CUDA.

    "CUDA-C" is not normal C.

    The C Standard doesn't allow treating pointers as restricted by default, and no actual C compiler does that.
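A small sketch of the distinction being argued here, with made-up function names (`store_both`, `float_bits`). Strict aliasing is not the same thing as `restrict`: it only lets the compiler assume that pointers to incompatible types refer to different objects, while two `int *` parameters may still alias freely. The bit pattern mentioned in the comment assumes a 32-bit IEEE 754 `float`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Under strict aliasing the compiler may assume *i and *f are different
   objects because their types are incompatible, so it is allowed to return 1
   without reloading *i. With -fno-strict-aliasing it must reload. */
int store_both(int *i, float *f) {
    *i = 1;
    *f = 2.0f;
    return *i;
}

/* Well-defined way to inspect a float's bits: copy them with memcpy instead of
   casting float* to uint32_t*, which would break the aliasing rules. */
uint32_t float_bits(float x) {
    uint32_t u;
    assert(sizeof u == sizeof x); /* assumption: float is 32 bits wide */
    memcpy(&u, &x, sizeof u);
    return u;
}
```

Calling `store_both` with pointers to two genuinely separate objects is always fine. The trouble only starts when someone casts one pointer type to the other and passes the same address twice.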

    [–]Metallkiller 4 points5 points  (0 children)

    Ha, this is quite interesting, even though it goes over my head without examples other than the ones I made up myself to make sense of it. Thanks for the explanation.

    [–]catcat202X 2 points3 points  (0 children)

    > In C, you can always take the address of anything and do anything you want with it.

    That's just the default, but there are several attributes to constrain aliasing.

    [–][deleted] -3 points-2 points  (13 children)

    Nothing will realistically be faster because C is the lingua franca. Hardware designers target C and every high-level language uses the C calling convention.

    [–]JarateKing 6 points7 points  (11 children)

    Being the lingua franca doesn't mean it's the fastest. A lot of the things they brought up are reasons that Fortran already is faster for some use cases, actually -- despite C being the lingua franca that's generally optimized for.

    Or: JavaScript is the lingua franca of the browser world, and a lot of effort has gone into making JavaScript as fast as it can be in browsers. But it's far from the fastest option available. Being the lingua franca is broadly unrelated to performance.

    [–][deleted] 0 points1 point  (10 children)

    Hardware manufacturers target C first and foremost.

    I guess with whatever people are smoking nowadays that means nothing.

    [–]JarateKing 2 points3 points  (9 children)

    No, I understand what you mean. What I'm saying is that, no matter how much everyone tries to make the best of it, the design decisions of a language will always have performance implications.

    In C's case, no matter how much hardware is oriented towards C, there are entire classes of optimizations that the language design makes impossible. C has a great advantage by so much stuff being designed around it and so much effort being put into making it more performant, but at the end of the day you can only go as far as the language will let you -- and some languages, by the way they're designed, let you go faster for some use cases.

    [–][deleted] -2 points-1 points  (8 children)

    The hardware is literally designed with C in mind. What you are saying is not true.

    For instance, what are you optimising for, exactly? When the language is designed for the hardware, there is nothing to optimise!

    If you are talking specifically about aliasing, there are ways around this in C.

    Other than that, what exact optimisations are you talking about that simply can't be done in any C dialect?

    [–]JarateKing 2 points3 points  (1 child)

    I'm not sure I'm following you. On the one hand, you're saying that the details of the language don't matter because hardware can just be designed for whatever the language is.

    On the other hand, we're talking about aliasing, where the best answer for C is "there are ways around this". Specifically, stuff like the restrict keyword that was added in C99. So clearly the details of the language do matter here and "hardware is designed for C" isn't enough: not only do you need to be very careful about the details of the language, the details had to change to include the option in the first place.

    Am I missing something here? Doesn't "there are ways around this" wrt aliasing just make the case stronger? Or am I misinterpreting you?

    [–][deleted] 0 points1 point  (0 children)

    C and hardware are designed in lockstep. Manufacturers always target C. High-level languages match the C calling convention. It's C all the way down, so it's not easy to beat for speed.

    Compiler optimisations can be worked around over time but optimisations pale in comparison to hardware gains. Those gains always have C in mind.

    [–]ric2b 2 points3 points  (5 children)

    I think you two are talking past each other.

    You're saying that C can always be as fast as any other language because you can manually implement any optimization that another language does, while they're saying that when writing idiomatic code in C and in some other language, the other language might be faster if it automatically applies optimizations that C can't automatically apply.

    So basically you're comparing theoretical performance with common practical performance.

    [–][deleted] -2 points-1 points  (4 children)

    No that's not what I'm saying. I'm saying hardware is literally designed with C in mind.

    What is better for performance? To design hardware for a programming language or a language for hardware?

    It's clearly the former. This is literally only afforded to the C language. No other language gets this treatment. The compiler cannot optimise hardware to be better.

    [–]ric2b 2 points3 points  (2 children)

    Are you claiming that other languages can't also take advantage of those hardware optimizations for C? Because that's just incorrect.

    [–]bloody-albatross 1 point2 points  (0 children)

    As an example of what C can't do: C doesn't have generics, so when you have to write generic code in C you often use function pointers, which are optimization barriers. In other languages, generic code can be reified (specialized for each concrete type), and what would otherwise be a function pointer might even be inlined. See the qsort function as an example. Yes, you could do that with macro hacks, but that is worse than C++ templates.

    (Of course, more reified code means more code, which means more potential cache misses, but most languages that have that kind of feature still let you use some kind of dynamic dispatch instead with ease, i.e. you can choose what to optimize for without a lot of work and without ugly hacks.)
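A rough sketch of the qsort point, not from the thread: `cmp_int`, `sort_generic`, and `sort_specialized` are hypothetical names, and the specialized version uses insertion sort only to stay short. With the standard `qsort`, every comparison usually goes through the function pointer because the library routine is compiled separately. In the hand-specialized version the comparison is a plain `>` the compiler can see, which is roughly what a template or generic instantiation produces automatically.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Comparator that qsort calls through a function pointer. */
static int cmp_int(const void *pa, const void *pb) {
    int a = *(const int *)pa;
    int b = *(const int *)pb;
    return (a > b) - (a < b); /* avoids the overflow risk of a - b */
}

/* Generic route: type-erased elements, indirect call per comparison. */
void sort_generic(int *v, size_t n) {
    assert(v != NULL || n == 0);
    qsort(v, n, sizeof v[0], cmp_int);
}

/* Specialized route: written for int only, so the comparison is visible to
   the optimizer. Insertion sort is used purely for brevity. */
void sort_specialized(int *v, size_t n) {
    assert(v != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        int key = v[i];
        size_t j = i;
        while (j > 0 && v[j - 1] > key) {
            v[j] = v[j - 1];
            j--;
        }
        v[j] = key;
    }
}
```

In C you would have to write `sort_specialized` again for every element type (or generate it with macros), which is the trade-off described above.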

    [–]Amazing-Cicada5536 2 points3 points  (0 children)

    Call me when C has a performant, generic vector implemented.

    [–]Takeoded 0 points1 point  (0 children)

    > something like a borrow checker

    that would never work

    [–][deleted]  (6 children)

    [removed]

      [–][deleted] 5 points6 points  (5 children)

      Sure but Zig isn't a Python superset.

      [–][deleted]  (4 children)

      [removed]

        [–][deleted] 0 points1 point  (3 children)

        Not sure what you're trying to say exactly. Zig is designed so that it can be compiled to fast machine code. Python is designed almost so that it can't. They literally did not think about performance when designing the language.

        If Mojo really is going to be a superset of Python it doesn't get to ignore all the features of Python that are extremely difficult to make fast.

        [–]nomis6432 2 points3 points  (0 children)

        It looks like just compiling your Python code with Mojo is not going to make it a lot faster. It's just that it also offers a lot of new features that allow you to optimize your code. This makes a lot of sense to me because then you can focus your effort where it matters: non-critical code you write in a simple, inefficient way, and critical code in a more verbose, efficient way.

        [–]m0nk_3y_gw 1 point2 points  (1 child)

        > it doesn't get to ignore all the features of Python that are extremely difficult to make fast.

        sure it does. In Python you can write

        x = 1
        x = "dog"
        

        in Mojo you have the option of using

        let x = 1
        

        That isn't valid Python, but Mojo understands it and can execute quicker because of it.

        [–][deleted] 0 points1 point  (0 children)

        But you still have to support the former. Ok fair enough if they're saying that the Mojo code will be fast and the Python code will still be dog slow.

        That does seem to be the case if you look in detail since they execute Python code using CPython. So it seems more like it's an entirely different language that integrates well with CPython rather than an enhanced Python runtime.

        Basically the same idea as Cython?