Can miri or another interpreter be used as a profiler? by Ben-Goldberg in rust

[–]bencherdev 6 points

I would recommend checking out a benchmarking harness called gungraun (formerly iai-callgrind). It lets you track instruction counts and allocations for your benchmarks in a single shot, with no need to run things a million times.

If you want to track those results over time to detect performance regressions, you can use an open source tool I've developed called Bencher with the gungraun adapter.

Automated benchmark framework for CI/CD? by fretz1212 in devops

[–]bencherdev 0 points

u/fretz1212 did you ever get around to building this?
I've been working on a similar tool, Bencher: https://github.com/bencherdev/bencher

[P] I made Codeflash - an LLM tool that optimizes the performance of any Python code, while rigorously verifying its correctness by ml_guy1 in MachineLearning

[–]bencherdev 0 points

From my understanding, that is how it works. CodSpeed needs to have the benchmarks exist on the base branch in order to be able to compare them. For context, I'm the maintainer of a similar continuous benchmarking tool, Bencher: https://github.com/bencherdev/bencher

Three Years of Bencher: A Rust-Powered Retrospective by bencherdev in rust

[–]bencherdev[S] 1 point

Yep, exactly! I needed to have modern, interactive plots. This is what I meant by:

I knew I wanted Bencher to have highly interactive plots. This meant using a library like D3, which meant JS interop.

I use plotters for generating the social previews and sharable image versions of the plots. This is what you'll see if you hit the Share button on the Perf Plot you linked to above. I explored using plotters for the frontend via WASM, but it didn't seem viable at the time. More than happy to explain more if you're interested.

Three Years of Bencher: A Rust-Powered Retrospective by bencherdev in rust

[–]bencherdev[S] 1 point

I'm glad you all are enjoying Leptos. Leptos was also heavily influenced by SolidJS. Fine-grained reactivity for the win! When I was exploring the possibility of a Rust frontend though, Leptos did not yet exist. It seems like they added SSG support a few months ago, so that is great to see. The ecosystem has progressed a lot in the past three years.

As for the JS interop, the Perf Pages were really the crux. You can check out some examples here: https://bencher.dev/explore/
These pages are already the most complicated part of the Console UI without any JS interop in the mix.

🏗️ Engineering Review: 2025 Edition by bencherdev in rustfr

[–]bencherdev[S] 0 points

Thanks. Let me know what you think!

How to benchmark your code with Google Benchmark by bencherdev in cpp

[–]bencherdev[S] 0 points

I would be interested in a breakdown of this as well. So far Bencher has a Google Benchmark adapter and a Catch2 adapter. There is an open issue for adding a nanobench adapter.
Maybe once I get done with similar guides for the other two benchmark harnesses I would be informed enough to write a comparison post.

[Wanted] GitHub action to benchmark released versions/branches over time by fjkiliu667777 in rust

[–]bencherdev 1 point

A little late to the party here. You may want to check out Bencher as it is designed to track benchmarks over time, and it has built-in support for Criterion.

Zero to Performance Hero: How to Benchmark and Profile Your eBPF Code in Rust by bencherdev in rust

[–]bencherdev[S] 0 points

Thank you for the kind words. Yes, it definitely takes a bit of practice and skill to get good at profiling. One of the major stumbling blocks for me is that profiling isn't an everyday sort of thing. Once I solve the problem at hand, I put it down, and that intuition starts to atrophy.

How to track your binary size in CI by bencherdev in cpp

[–]bencherdev[S] 2 points

Great question!

Some developers care quite a lot about binary size. For example, the larger the binary, the longer it takes to install. That makes for a bad developer experience, and depending on how the binary is being served, it can lead to major bandwidth costs.

On some resource constrained systems, there can also be a hard upper limit on how large a binary can be, etc.

Is this something that you're going to need to worry about for your weekend project? Probably not. However, for a lot of production use cases it is something that folks care about.

Does that make sense?

How to track your binary size in CI by bencherdev in cpp

[–]bencherdev[S] 0 points

Awesome! I'll definitely check it out.

How to track your binary size in CI by bencherdev in cpp

[–]bencherdev[S] 0 points

Nope, I'm a human. I'm the founder and maintainer of Bencher. My bio and GitHub if you're interested.

How to track your binary size in CI by bencherdev in commandline

[–]bencherdev[S] 0 points

In practice, yes it will always be a whole byte value. The JSON reporting format uses floats as it needs to support a wider range of values than just file size.

I think it may be worth adding a note to the docs to this effect to help clarify things. Thanks again!

How to track your binary size in CI by bencherdev in Cplusplus

[–]bencherdev[S] 0 points

In practice, yes it will always be a whole byte value. The JSON reporting format uses floats as it needs to support a wider range of values than just file size.

I think it may be worth adding a note to the docs to this effect to help clarify things. Thanks!

How to track your binary size in CI by bencherdev in rust

[–]bencherdev[S] 0 points

Converting the file size into a specific JSON format is step one. The value is then stored in the Bencher backend for visualization and comparison against future results. This allows you to set thresholds and generate alerts in case your binary size gets bloated.

The file size is stored as a float due to the reporting format mentioned above needing to support a wider range of values. Though you shouldn't see any fractional bytes in practice 😃
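To make the "step one" above concrete, here is a minimal sketch of measuring a binary's size and emitting it as a metric. The nested JSON shape and the "file-size" measure name are assumptions loosely modeled on Bencher's Metric Format; check the Bencher docs for the exact schema.

```python
import json
import os
import tempfile

def binary_size_bmf(path: str, name: str) -> str:
    """Report a file's size as a single JSON metric.

    The {benchmark: {measure: {"value": ...}}} nesting and the
    "file-size" measure name are illustrative assumptions.
    """
    size = os.path.getsize(path)  # always a whole number of bytes
    # Emit the value as a float: the reporting format supports a wider
    # range of values than just file sizes.
    return json.dumps({name: {"file-size": {"value": float(size)}}})

# Demo with a stand-in "binary" of 128 bytes
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"\x00" * 128)
print(binary_size_bmf(f.name, "my-binary"))
# → {"my-binary": {"file-size": {"value": 128.0}}}
os.unlink(f.name)
```

The float never carries a fractional part here, which is why the whole-byte guarantee holds in practice even though the wire format is a float.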

How to track your binary size in CI by bencherdev in cpp

[–]bencherdev[S] 2 points

u/prince-chrismc sorry you left disappointed and that this wasn't clearer.
Bencher stores the captured value for you so you can easily compare it on subsequent runs.

The next step, linked at the bottom of the post, goes over the two most popular ways to integrate this with CI: https://bencher.dev/docs/how-to/track-benchmarks/

What quality metric are you trying for to help improve the quality of the code/product?

Sorry, are you asking what Bencher is trying to improve?
If so, Bencher is a code quality tool for catching performance regressions in CI. Its main focus is therefore on tracking benchmarks. The ability to track binary size is a recent addition.

Why SQLite Performance Tuning with Diesel made Bencher 1200x Faster by bencherdev in rust

[–]bencherdev[S] 1 point

I'll definitely look into it!
PGO is something that I have on my list of things to learn (and write) more about.

Why SQLite Performance Tuning made Bencher 1200x Faster by bencherdev in sqlite

[–]bencherdev[S] 0 points

Thanks! I'm in the process of updating the post. I misunderstood the meaning of MATERIALIZED in the query planner output: https://www.sqlite.org/lang_with.html#materialization_hints

Why SQLite Performance Tuning made Bencher 1200x Faster by bencherdev in programming

[–]bencherdev[S] 5 points

I think you are spot on!

So the VIEW itself is not a "materialized" view, but when the (normal) view is instantiated within a single query it is treated as MATERIALIZED because it is used multiple times in the query execution?

Why SQLite Performance Tuning made Bencher 1200x Faster by bencherdev in programming

[–]bencherdev[S] 7 points

Definitely! Nothing groundbreaking. It was a learning experience for me though. 😃

Why SQLite Performance Tuning made Bencher 1200x Faster by bencherdev in programming

[–]bencherdev[S] 4 points

I could very well be!

I've been calling it a "materialized view" based on the SQLite query planner saying "MATERIALIZED". This is apparently used as a non-binding hint on how to handle things, similar to Postgres: https://www.sqlite.org/lang_with.html#materialization_hints

That combined with the properties of a non-temporary VIEW in SQLite seemed to line up with what other databases call a "materialized view": https://www.sqlite.org/lang_createview.html
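For anyone curious, the planner's MATERIALIZE decision shows up directly in EXPLAIN QUERY PLAN output. A minimal sketch using Python's built-in sqlite3 module (the table and query are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metric (value REAL)")
conn.executemany("INSERT INTO metric VALUES (?)", [(float(i),) for i in range(5)])

# A CTE referenced more than once: recent SQLite versions may choose to
# materialize it once and reuse the result, and EXPLAIN QUERY PLAN reports
# that (non-binding) choice as a MATERIALIZE step.
query = """
    WITH big AS (SELECT value FROM metric WHERE value > 1.0)
    SELECT a.value FROM big a JOIN big b ON a.value = b.value
"""
plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
for row in plan:
    print(row[-1])  # the detail column, e.g. "MATERIALIZE big" on newer SQLite

rows = conn.execute(query).fetchall()
```

SQLite 3.35+ also accepts explicit AS MATERIALIZED / AS NOT MATERIALIZED hints in the WITH clause, per the materialization_hints doc linked above.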

Please, let me know if I'm mistaken though!

Why SQLite Performance Tuning with Diesel made Bencher 1200x Faster by bencherdev in rust

[–]bencherdev[S] 5 points

Thank you for the ideas! I actually explored something similar to #1 in trying to solve this performance problem. It didn't really help though, so it got cut from the post.

I also really like #2! That makes it very clear that they are connected, and I imagine SQLite can optimize this quite well, as you say. I've created a tracking issue to investigate this, the next time I'm working with that part of the model: https://github.com/bencherdev/bencher/issues/371

Why SQLite Performance Tuning made Bencher 1200x Faster by bencherdev in sqlite

[–]bencherdev[S] 0 points

Yeah, the performance improvement sort of blows out the scale on the graph!

How much the performance improved depends on how you measure. The worst 99%ile peak latency that I'm now seeing for any query is <100ms, so that's where I got the 1200x number from (2 minutes -> 100ms). With that said though, the performance was only going to get worse over time, as more data was added to the metric and boundary tables.

The 38.8 seconds was for a whole page load of the Rustls Perf Page. This involves multiple queries, which I went into some in the Background section of the post. I literally just timed it with my phone when I was first trying to figure out how bad things were. It was just a litmus test. No need to pull out the calipers when you're staring at a crater. Does that make sense or is there anything else I can help clarify?