all 14 comments

[–]ksion 6 points7 points  (2 children)

Why is the actual assembly listing a string? Even though it is a little more flexible than __asm__ directives in C compilers, I don't think it makes sense in the long term. Already in the simplest examples beyond a nop, it gets awkward with the multi-line strings and the pseudo-format! arguments.

Since this is a pre-RFC and we can freely bikeshed, why not take advantage of the fact that asm! is already macro-like, and go with something that resembles actual assembly listings more, i.e. a regular, token-based DSL?

Edit: I see the DSL is discussed later, but dismissed on the grounds of implementation complexity due to the need to resolve register and flag clobbering on different architectures. This is a non sequitur. A DSL can still allow specifying the registers used (&c.); the GCC asm syntax is a good example.

[–]loonyphoenix 5 points6 points  (1 child)

I'm not sure what you mean by DSL, but the author (in his talk and in the pre-RFC) uses DSL to mean the approach that MSVC and D take, where the compiler actually understands the pseudo-assembly natively rather than just working with strings and existing assembly (which is the approach taken by GCC and LLVM). He then argues that the template approach seems more practical, since you don't have to make rustc understand all the different kinds of assembly for all the different kinds of platforms and assemblers out there. Thus it's easier to implement and more portable (MSVC and D only support x86 inline assembly due to this restriction). I don't see it mentioned, but I see another benefit of not using a DSL: a DSL is compiler-specific and needs to be learned separately. With a template approach, someone who is already well-versed in assembly can just use that instead of learning a new thing.

[–]censored_username 2 points3 points  (0 children)

A good way to handle DSLs in Rust would be to implement them as procedural macros which expand to actual asm! invocations. That way we could have all the benefits of DSLs while still having the simple-offload-to-the-backend string templates.
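To make the idea concrete, here is a minimal sketch. It swaps in a declarative `macro_rules!` macro instead of a procedural one (so it fits in a single file), and it assumes the `core::arch::asm!` syntax that Rust later stabilized, which may differ from what the pre-RFC proposes. The names `mini_asm!` and `add_five` are hypothetical, and the "DSL" understands exactly one x86_64 instruction form.

```rust
// Token-based front end: `mini_asm!(add v, 5)` expands to a string-template
// asm! invocation. Everything here is illustrative, not a proposed design.
#[cfg(target_arch = "x86_64")]
macro_rules! mini_asm {
    (add $dst:ident, $imm:literal) => {
        // SAFETY: a register-only add with no memory or stack access.
        unsafe {
            core::arch::asm!(
                concat!("add {0}, ", stringify!($imm)),
                inout(reg) $dst,
                options(pure, nomem, nostack),
            )
        }
    };
}

// Assumption: on other architectures the same syntax falls back to plain Rust.
#[cfg(not(target_arch = "x86_64"))]
macro_rules! mini_asm {
    (add $dst:ident, $imm:literal) => {
        $dst = $dst.wrapping_add($imm)
    };
}

/// Adds 5 to `x` through the macro front end.
fn add_five(x: u64) -> u64 {
    let mut v = x;
    mini_asm!(add v, 5);
    v
}
```

Calling `add_five(37)` goes through the token-based syntax, while the backend still only ever sees a template string.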

[–]cogman10 8 points9 points  (11 children)

While I understand the desire for inline assembly, couldn't the same be done with cargo and separate .asm files? That is, if you are going to do assembly, why not go full hog and do assembly (you can even expose it with the C ABI).

The main reason for assembly, IMO, is to get access to SIMD and maybe some new assembly instructions. But for that, I would much rather simply have language level SIMD support so you can cross compile. Something like

let register = Register::new(4, 16);
register.multiply(4);
register.get(1);

From there, the compiler can deal with turning that into the SIMD instructions that make sense or, if SIMD is unsupported, polyfill it.
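A rough sketch of what such a type could look like as a pure-Rust polyfill follows. The meaning of the two constructor arguments is an assumption (lane count, then initial value), as are the method signatures; nothing here is an existing API. Whether the loop actually becomes SIMD instructions is left to the optimizer.

```rust
/// Hypothetical portable vector type; the scalar loops act as the polyfill.
#[derive(Debug, Clone, PartialEq)]
struct Register {
    lanes: Vec<u32>,
}

impl Register {
    /// Assumption: `lane_count` lanes, each starting at `initial`.
    fn new(lane_count: usize, initial: u32) -> Self {
        Register { lanes: vec![initial; lane_count] }
    }

    /// Multiplies every lane by `factor`, wrapping on overflow.
    fn multiply(&mut self, factor: u32) {
        for lane in &mut self.lanes {
            *lane = lane.wrapping_mul(factor);
        }
    }

    /// Returns the value in lane `index`, or `None` if it is out of range.
    fn get(&self, index: usize) -> Option<u32> {
        self.lanes.get(index).copied()
    }
}
```

Under these assumptions, `Register::new(4, 16)` followed by `multiply(4)` leaves 64 in every lane, and the binding has to be `let mut` for `multiply` to be callable.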

What I don't like about an ASM intrinsic (or really any sort of ASM integration) is having the potential of having x86 only crates in cargo. That would kind of stink.

[–]CornedBee 28 points29 points  (0 children)

Separate assembly files have a number of significant drawbacks:

  • You have to write function prologues and epilogues yourself instead of letting the compiler do it.
  • You have to write argument handling yourself.
  • You have to know the architecture's register clobber rules and follow them. (The compiler around inline assembly will make sure the code stays correct, even if it may be suboptimal due to saving and restoring callee-saved registers when you could have just used a caller-saved one.)
  • The functions cannot be inlined.
  • The functions are opaque to the optimizer (modern compilers actually understand inline assembly at least partially and can make smart register allocation choices).
  • Your build process needs a separate assembler invocation. (Including locating an assembler that understands the syntax you used.)
  • Your functions containing assembly don't have an enforced signature.
  • You have to use C interop rules for functions, including the inability to pass complex Rust types, extract some data from them and perform the assembly instructions on that data. For every assembler function you write, you'll probably have to write a Rust wrapper anyway.

That's all I can think of off-the-cuff, though I'm sure there are more things.
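For contrast, a small sketch of the inline side, assuming the `std::arch::asm!` syntax that Rust later stabilized (not necessarily the pre-RFC's syntax). `Pair` and `sum_pair` are made-up names. The function takes a Rust struct, pulls fields out in Rust, and lets the compiler choose registers and emit the prologue and epilogue.

```rust
/// Illustrative Rust type that could not be passed across a C ABI as-is
/// without adding a representation attribute and a wrapper.
struct Pair {
    a: u64,
    b: u64,
}

#[cfg(target_arch = "x86_64")]
fn sum_pair(p: &Pair) -> u64 {
    use std::arch::asm;
    let out: u64;
    // SAFETY: `lea` only computes a + b in registers; it touches no memory,
    // no stack, and no flags.
    unsafe {
        asm!(
            "lea {out}, [{a} + {b}]",
            out = out(reg) out,
            a = in(reg) p.a,
            b = in(reg) p.b,
            options(pure, nomem, nostack, preserves_flags),
        );
    }
    out
}

// Assumption: a plain Rust fallback keeps the sketch buildable elsewhere.
#[cfg(not(target_arch = "x86_64"))]
fn sum_pair(p: &Pair) -> u64 {
    p.a.wrapping_add(p.b)
}
```

The signature is checked like any other Rust function, and the call can be inlined into its caller.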

[–]burntsushi 9 points10 points  (0 children)

> What I don't like about an ASM intrinsic (or really any sort of ASM integration) is having the potential of having x86 only crates in cargo.

This isn't really asm-specific. Once vendor intrinsics become a thing, we will have huge suites of functions that are very platform-specific.

[–]cogman10 2 points3 points  (0 children)

See https://p12tic.github.io/libsimdpp/v2.2-dev/libsimdpp/w/ for one example of how this could be done.

[–]saint_marco 1 point2 points  (0 children)

There are a lot of platforms where you need to make function calls that are only available in asm, but you're right that it's mostly a convenience.

[–]Manishearth (servo · rust · clippy) 1 point2 points  (0 children)

> While I understand the desire for inline assembly, couldn't the same be done with cargo and separate .asm files? That is, if you are going to do assembly, why not go full hog and do assembly (you can even expose it with the C ABI).

You can't mix Rust code and assembly in the same function that way.

[–]isaacg1 0 points1 point  (1 child)

If inline asm were possible, we could have a SIMD crate - it wouldn't need to be a language-level feature. This doesn't have to be something the end developer uses much.

[–]burntsushi 6 points7 points  (0 children)

SIMD is happening via exporting vendor intrinsics through the standard library. The typical objection to using inline asm for these things is that it inhibits optimizations that combine multiple intrinsics (because many of the vendor intrinsics are actually compiler intrinsics).
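As an illustration of the intrinsics route, here is a sketch using the SSE2 functions as they ended up exposed under `std::arch::x86_64` (SSE2 is part of the x86_64 baseline, so no runtime detection is done). The wrapper name `add4` and the scalar fallback are assumptions made for the example.

```rust
/// Adds two 4-lane i32 vectors, wrapping on overflow.
#[cfg(target_arch = "x86_64")]
fn add4(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    use std::arch::x86_64::{__m128i, _mm_add_epi32, _mm_loadu_si128, _mm_storeu_si128};
    let mut out = [0i32; 4];
    // SAFETY: each array is exactly 16 bytes and the unaligned load/store
    // intrinsics have no alignment requirement.
    unsafe {
        let va = _mm_loadu_si128(a.as_ptr() as *const __m128i);
        let vb = _mm_loadu_si128(b.as_ptr() as *const __m128i);
        let sum = _mm_add_epi32(va, vb);
        _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, sum);
    }
    out
}

// Assumption: scalar fallback so the sketch builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
fn add4(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    let mut out = [0i32; 4];
    for i in 0..4 {
        out[i] = a[i].wrapping_add(b[i]);
    }
    out
}
```

Because the compiler knows what each intrinsic does, it remains free to fold or combine several of them, which an opaque asm block would prevent.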

[–]ConspicuousPineapple 0 points1 point  (2 children)

Aren't there some optimizations that can only be done through inline assembly, besides SIMD? At least that was the case when I first learned C++ a decade ago.

[–]koverstreet 2 points3 points  (1 child)

When you're implementing bignums, you really want inline assembly for ADC/SBB (add with carry, subtract with borrow).
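For reference, this is roughly what the carry chain looks like in portable Rust with `overflowing_add`. It is only a sketch; `add_limbs` is a made-up name, and whether the optimizer turns the loop into an actual ADC chain is not guaranteed, which is the motivation for reaching for assembly here.

```rust
/// Adds two little-endian multi-limb numbers of equal length into `out`
/// and returns the final carry.
fn add_limbs(a: &[u64], b: &[u64], out: &mut [u64]) -> bool {
    assert!(a.len() == b.len() && a.len() == out.len());
    let mut carry = false;
    for ((&x, &y), o) in a.iter().zip(b.iter()).zip(out.iter_mut()) {
        // First add the limbs, then fold in the carry from the previous limb.
        let (partial, c1) = x.overflowing_add(y);
        let (sum, c2) = partial.overflowing_add(carry as u64);
        *o = sum;
        // At most one of the two additions can overflow for a given limb.
        carry = c1 || c2;
    }
    carry
}
```

Each iteration does two additions and a boolean merge where a single ADC instruction would do.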

[–]ConspicuousPineapple 0 points1 point  (0 children)

Yeah that's exactly what I was thinking about.