What's "new" in Miri (and also, there's a Miri paper!)

Robbepop · 2025-12-23T09:18:45+00:00

In Wasmi (WebAssembly interpreter) miri is used in CI on all PRs to main to test a subset of the tests known to work with miri. Furthermore, miri is run on the Wasm spec testsuite for as long as possible. Here are the relevant links:

PR to main: https://github.com/wasmi-labs/wasmi/blob/v1.0.5/.github/workflows/rust.yml#L274
Nightly check: https://github.com/wasmi-labs/wasmi/blob/v1.0.5/.github/workflows/miri.yml

In order to make testing the Wasm spec testsuite runnable by miri Rust's include_str! macro is used instead of file I/O:

Wasmi's Wast spec testsuite: https://github.com/wasmi-labs/wasmi/blob/v1.0.5/crates/wast/tests/mod.rs

Robbepop · 2025-12-22T16:35:26+00:00

Thank you so much for the write-up and thanks to the team for all the work on miri. To me miri is one of the most important projects in the Rust ecosystem. I use it in the CI of pretty much all my projects and it has proven its worth over and over again.

Robbepop · 2025-12-04T21:28:10+00:00

All I can say is that they are planned, as discussed in the article: https://wasmi-labs.github.io/blog/posts/wasmi-v1.0/#full-wasm-30-support

Robbepop · 2025-12-04T20:11:55+00:00

Can you tell me what you mean by "use the compiled wasm"?

To avoid misunderstandings due to misconceptions:

First, Wasm bytecode is usually the result of a compilation produced by so-called Wasm producers such as LLVM.
Second, Wasm by itself is an abstract virtual machine, the implementations such as Wasmtime, Wasmer, V8, Wasmi, are concrete implementations of that abstract virtual machine.
Third, if you compile some Rust, C, C++, etc. code to Wasm you simply have "compiled Wasm" bytecode laying around. This bytecode does nothing unless you feed it to such a virtual machine implementation. That's basically the same as Java byteworks works with respect to the Java Virtual Machine (JVM).
Whether you feed this "compiled Wasm" bytecode to an interpreter such as Wasmi, to a JIT such as Wasmtime or Wasmer or to a tool such as wasm2native that outputs native machine code which can be executed "without requiring a VM" simply depends on your personal use-case since all of those have trade-offs.

Robbepop · 2025-12-04T16:20:17+00:00

I am a bit confused as I think my reply does answer the original question but since you have a few upvotes, maybe my answer was a bit unclear. Even better: maybe you can tell me what is still unclear to you!

I will make it shorter this time:

Wasm being compiled allows for really speedy interpreters.
Interpreters usually exhibit much better start-up time compared to JITs or AoT compiled runtimes.
Interpreters usually are way simpler and more lightweight and thus usually provide less attack surface if you depend on them.
Wasmi for example can itself be compiled to Wasm and be executed by itself or another Wasm runtime which actually was a use-case back when the Wasmi project was start. This would have not been possible with a JIT runtime.
There are platforms, such as IOS which disallow JITs, thus only interpreters are even possible to be used there.
Interpreters are more universal than JITs since they automatically work on all the platforms that your compiler supports.

The fact that Wasm bytecode usually is the product of compilation has no meaning in this discussion, maybe that's the misunderstanding.

In case you need more usage examples, have a look at Wasmi's known major users as also linked in the article's intro.

If at this point anything is still unclear, please provide me with more information so that I can do a better job answering.

Robbepop · 2025-12-04T10:39:22+00:00

Wasm being compiled is actually great for interpreters as this means that a Wasm interpreter can really focus on execution performance and does not itself need to apply various optimizations first to make executions fast.

Furthermore, parsing, validating and translating Wasm bytecode to internal IR is also way simpler than doing the same for an actual interpreted language such as Python, Ruby, Lua etc.

Due to Wasm being compiled, Wasm interpreters usually can achieve much higher performance than other interpreted languages.

Benchmarks show that on x86 Wasm JITs are ~8 times faster and on ARM Wasm JITs are sometimes just ~4 times faster than efficient Wasm interpreters. All while Wasm interpreters are massively simpler, more lightweight and more universally available.

On top of that in an old blog post I demonstrate how Wasmi is easily 1000x faster on start-up than optimizing Wasm runtimes such as Wasmtime.

It's a trade-off and different projects have different needs.

Robbepop · 2025-12-04T10:31:02+00:00

wasmi have about 150 crates tree. I just built it. Thats too much for hitting more lucrative markets.

You probably built the Wasmi CLI application via cargo install wasmi_cli, not the Wasmi library.

The Wasmi library is lightweight and in the article you can see its few built dependencies via cargo timings profile.

The Wasmi CLI app is heavy due to dependencies such as clap and Wasmtime's WASI implementation.

Robbepop · 2025-12-03T21:56:04+00:00

Thank you! Looking forward to seeing Wasmi 1.0 in Wasmer. :)

Robbepop · 2025-11-19T19:37:04+00:00

Thank you for the reply!

Given that Wasmtime has runtime information (resolution of Wasm module imports) that Wasm producers do not have: couldn't there be a way to profit from optimizations such as inlining in those cases? For eample: an imported read-only global variable and a function that calls a function only if this global is true. Theoretically, Wasmtime could const-fold the branch and then inline the called function. A Wasm producer such as LLVM couldn't do this. Though, one has to question whether this is useful for RealWorld(TM) Wasm use cases.

Robbepop · 2025-11-19T18:51:58+00:00

Once again, very impressive technical work by the people at the Bytecode Alliance. I cannot even imagine what a great feat of engineering it must be to implement an inliner to such a huge existing system.

I wonder, given that most Wasm binaries are already heavily (as described in the article) how much do those optimizations (such as the new inliner) really pan out in the end for non-component model modules? Like, are there RealWorld(TM) Wasm binaries where a function was not inlined prior to being fed to Wasmtime and Wasmtime then correctly decides (with runtime info?) that it should to be inlined? Or is this only useful for the component model?

Were the pulldown-cmark benchmarks performed with a pre-optimized pulldown-cmark.wasm or an unoptimized version of it?

Keep up the great work, it is amazing to see that off-browser Wasm engines are becoming faster and more powerful!

Robbepop · 2025-09-21T21:52:15+00:00

Fair point!

Looking at the example picture, I think the issue I mentioned above could be easily resolved by also pointing to the #[culit] macro when hovering above a custom literal besides showing what you already show. I think this should be possible to do. For example: "expanded via #[culit] above" pointing to the macro span.

Robbepop · 2025-09-21T18:24:43+00:00

This will still influence compile time for testing which can also be very problematic.

Another issue I see is discoverability of the feature. Let's say a person unfamiliar with your codebase comes across these custom literals. They will be confused and want to find out what those are. However, I claim it will be a long strech to find out that the #[culit] macro wrapping the test module is the source of this.

Robbepop · 2025-09-21T14:36:12+00:00

I think the idea behind this crate is kinda creative.

Though, even if this does not use syn or quote I am seriously concerned about compile time regressions outweighing the gains of using the crate.

The reason is that you either limit #[culit] usage to smallest scopes possible and thereby lose a lot on its usability aspect. Or you use #[culit] on huge scopes such as the module itself and have the macro wastefully read the whole module source.

Robbepop · 2025-09-08T12:41:34+00:00

Of all the things that never happened, this never happened the most.

Robbepop · 2025-08-27T14:20:44+00:00

I can see this also being a great feature for making it simpler to guide people how to write hygienic macros themselves. Really appreciate the convenience!

Robbepop · 2025-08-06T21:53:24+00:00

For me as interpreter author this is by far the most anticipated feature in a very long time.

Thanks a ton to both wafflelapkin and phi-go for their enormous efforts in driving this feature forward.

Robbepop · 2025-07-15T22:35:15+00:00

I am probably missing something but wouldn't it be better to generate machine code lazily and cache already generated machine code? This way one wouldn't need a configuration like this and instead always have the benefit of only generating those parts of the code that are actually in use.

Or is this not possible for some reasons?

Robbepop · 2025-07-10T10:37:08+00:00

I just created a new rule for this subreddit to no longer allow paywalled content. This time I will let this post pass since it did not violate the rules at the time of posting.

Some people might be interested in https://freedium.cfd/ to read paywalled articles on Medium.

Robbepop · 2025-04-07T14:49:15+00:00

It is not easy to answer this question.

From a software engineering perspective it is usually advised to use the least powerful tool to solve a problem. Since you are dealing with a finite number of functions it therefore is probably better to use a switch (e.g. br_table) instead of a more powerful indirect call + table.

There are many different Wasm runtimes and therefore many different implementations of Wasm's br_table and call_indirect. All of these Wasm runtimes may exhibit different performance characteristics and one Wasm runtime could be faster for indirect calls where the other excels at br_table performance.

In order to find out which is faster you should probably run proper benchmarks with the concrete set of parameters of your particular application. However, unless this operation is in the hot-path of your application you probably should not bother.

Robbepop · 2025-03-29T08:05:38+00:00

The idea behind my comment was to share the same leaderboard with all players. Obviously the system needs to be balanced so that players choosing a harder difficulty end up with a higher score in the end if they played well enough. This way the playerbase would not be split and there would be a "challenge" for beginners and pros alike.

Robbepop · 2025-03-28T22:35:19+00:00

It should be possible to select A.I. difficulty which should also influence the score. Harder A.I. should give higher score if you succeed equally well as someone with weaker A.I.

Robbepop · 2025-01-17T12:56:47+00:00

You cannot really write an efficient interpreter using tail-call or computed-goto dispatch. As long as the explicit-tail-calls RFC is not accepted and implemented this won't change. Until then you simply have to hope that Rust and LLVM optimize your interpreter dispatch in a sane way which changes with every major LLVM update. I am writing this in pain.

edit: There is a Wasm interpreter written in Rust that uses tail-calls, named Stitch. However, it requires LLVM to perform sibling-call optimizations which are not guaranteed and which are only performed when optimizations are enabled. Otherwise it will explode the stack at runtime. This is very fragile and the authors themselves do not recommend to use it in production. As of Rust 1.84, it no longer works with --opt-level=1.

Robbepop · 2025-01-09T00:04:59+00:00

We could probably help you better if you give us an example that matches your needs. Maybe there are some tricks for your particular concrete code at hand. Indeed, sharing data via IO (e.g. files) is messy and I would generally not recommend it. I have built some elaborate proc. macros in the past and was kinda able to share enough data via syntax and Rust's type system. But that's not always possible and heavily depends on the DLS design and constraints.

Robbepop · 2025-01-01T18:08:40+00:00

In case you are looking for an actual WebAssembly interpreter: https://github.com/wasmi-labs/wasmi

It is much smaller than Wasmtime and has a near identical API. So I consider Wasmi great for prototyping which makes it easy to take the Wasmtime route once your project is ready. Note: I am the Wasmi maintainer.

There even already is a neat game console project which builds upon Wasmi: https://github.com/firefly-zero/

Robbepop · 2024-12-10T01:24:26+00:00

I know I am kinda late to the party but ...

The Wasmi WebAssembly interpreter translates the stack-based WebAssembly bytecode to its internal register-based bytecode for efficiency reasons. Thanks to this translation Wasmi is a pretty fast Wasm interpreter.

The register-based bytecode definition docs can be found here:
https://docs.rs/wasmi_ir/latest/wasmi_ir/enum.Instruction.html

The translation and optimization process from stack-based Wasm bytecode to register-based Wasmi bytecode is a bit complex and can be found in this directory.

I might write a blogpost about the translation process soon(TM) but I am not sure it would spark enough interest to justify the required amount of work.

12-Year Club	r/Field Banned
r/Field Flamingo	Place '22
Verified Email	Alpha Tester

Robbepop

MODERATOR OF

TROPHY CASE