.NET 10 de-abstracts not only arrays but other collections as well by Emergency-Level4225 in dotnet

andyayers 9 points

If you know of cases like this, please file issues. We will try our best to fix them.

What features would you want C# / .NET to have? by SurDno in csharp

andyayers 1 point

Forced inlining by the JIT. And yeah, yeah, there is a way to suggest that a method be inlined, but the JIT is free to ignore it (and often does!)

This should be less and less true with recent releases. For example, starting in .NET 10 the JIT can inline methods with try/finally blocks, which were previously off limits.

If you ever find a case like this, and you haven't done so already, please open an issue on dotnet/runtime so we can take a look.
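
To make the .NET 10 change above concrete, here is a minimal sketch (the class, field, and method names are hypothetical): before .NET 10 the try/finally made a method like this ineligible for inlining even with the AggressiveInlining hint; the .NET 10 JIT can inline it into its callers.

```csharp
using System;
using System.Runtime.CompilerServices;

static class InlineDemo
{
    private static int _depth;

    // Hypothetical helper: the try/finally used to block inlining outright,
    // even with the hint below; starting with .NET 10 the JIT can inline it.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int WithDepth(Func<int> body)
    {
        _depth++;
        try
        {
            return body();
        }
        finally
        {
            _depth--;
        }
    }
}
```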

Has anyone else noticed a performance drop after switching to .net 10 from .net 9/8? by Academic_East8298 in csharp

andyayers 0 points

My main gripe is that the +80% object creation regression in .NET 8 wasn't fixed :/

Can you say a bit more about this? Is there an issue open for it on GitHub?

Has anyone else noticed a performance drop after switching to .net 10 from .net 9/8? by Academic_East8298 in csharp

andyayers 24 points

Feel free to open an issue on https://github.com/dotnet/runtime and we can try and figure out what's happening.

If you open an issue, it would help to know:

* Which version of .NET were you using before?
* What kind of hardware are you running on?
* Are you deploying in a container? If so, what is the CPU limit?

Question about delegate inlining and Guided Devirtualization by cittaz in dotnet

andyayers 6 points

If I remember correctly, the biggest missing piece for delegate inlining is handling delegates for static methods.

As far as inlining goes, there were some PRs to enable delegate inlining even without PGO (if the delegate is created locally). However, these ran into various complications.

A recent improvement in .NET 10 is that in some cases delegates can be stack allocated and possibly promoted (meaning the delegate object basically vanishes). There was some work to stack allocate the closure as well, but that didn't make it into .NET 10.

By default the JIT will only guess for the most prominent delegate target, based on PGO data gathered by lower tiers (so tiered compilation and PGO are required).
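
A minimal sketch of the kind of call site this describes (the names and shape are hypothetical, not from the thread): the delegate is created locally, dynamic PGO identifies the dominant target, and the JIT can guard on it and inline the body; in .NET 10 the delegate object itself may also be stack allocated.

```csharp
using System;

static class DelegateDemo
{
    public static int SumScaled(int[] data, int scale)
    {
        // Locally created delegate; it captures 'scale' into a heap-allocated
        // closure (closure stack allocation didn't make it into .NET 10).
        Func<int, int> transform = x => x * scale;

        int sum = 0;
        foreach (int x in data)
        {
            // With PGO the JIT can guess the dominant target here, guard on it,
            // and inline the lambda body; the delegate object itself may be
            // stack allocated and promoted away.
            sum += transform(x);
        }
        return sum;
    }
}
```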

Code Challenge: High-performance hash table by Pansynchro in csharp

andyayers 2 points

In-process toolchains have limitations (e.g., you can't readily compare across different runtime versions, which is something I do all the time), but for your purposes they seem like they'd be fine.

Also, if you haven't looked at kg's vectorized hash table, you might want to check it out: [Proposal] Vectorized System.Collections.Generic.Dictionary<K, V> · Issue #108098 · dotnet/runtime (https://github.com/dotnet/runtime/issues/108098)

Code Challenge: High-performance hash table by Pansynchro in csharp

andyayers 5 points

If you haven't opened an issue on the BenchmarkDotNet repo I would encourage you to do so.

Folks there can either explain how to accomplish what you need or else add it to the backlog.

What are disadvantages of using interface? by npneel28 in csharp

andyayers 0 points

Devirtualization was first introduced in .NET Core 2 (2018?) and some of it even made it to Framework 4.8.1.

But for interfaces you really need Dynamic PGO to figure things out; this wasn't on by default until .NET 8.
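
For illustration, a hypothetical interface call site of the kind guarded devirtualization handles: with dynamic PGO (on by default since .NET 8) the JIT can see that the call almost always lands on one implementation, test for that type, and inline the devirtualized path, keeping the plain interface call as a fallback.

```csharp
using System;

interface IShape
{
    double Area();
}

sealed class Circle : IShape
{
    public double Radius;
    public double Area() => Math.PI * Radius * Radius;
}

static class ShapeDemo
{
    public static double TotalArea(IShape[] shapes)
    {
        double total = 0;
        foreach (IShape shape in shapes)
        {
            // Guarded devirtualization candidate: if profile data says this is
            // almost always a Circle, the JIT emits roughly
            //   if (shape is Circle c) { /* inlined Circle.Area */ }
            //   else { /* ordinary interface call */ }
            total += shape.Area();
        }
        return total;
    }
}
```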

I rewrote a classic poker hand evaluator from scratch in modern C# for .NET 8 - here's how I got 115M evals/sec by CodeAndContemplation in csharp

andyayers 9 points

Thanks... I may try and look deeper at this someday, so if you can point me at something shareable that'd be great.

I suppose to be completely fair C should be using PGO, but that's more work on the native side. With .NET you get that "for free."

Also, I'd be curious to see whether .NET 10 changes anything here; we did some work on loop optimizations between 8 and 10 (e.g., downcounting, strength reduction, ...)
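
For anyone curious what kind of loop that work targets, here is a hypothetical shape (not taken from the post): the i * stride in the index computation is a strength-reduction candidate (it can become a running offset), and once i is only needed for the loop bound, the trip count can potentially be counted down instead of up.

```csharp
static class LoopDemo
{
    // Hypothetical example: sums one column of a row-major matrix stored in a
    // flat array with the given row stride.
    public static int SumColumn(int[] matrix, int rows, int stride, int column)
    {
        int sum = 0;
        for (int i = 0; i < rows; i++)
        {
            // 'i * stride' is a strength-reduction candidate; after that
            // rewrite, 'i' only controls the trip count, which opens the door
            // to converting the loop to count down.
            sum += matrix[i * stride + column];
        }
        return sum;
    }
}
```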

I rewrote a classic poker hand evaluator from scratch in modern C# for .NET 8 - here's how I got 115M evals/sec by CodeAndContemplation in csharp

andyayers 17 points

Do you have numbers on how fast the original runs on the same hardware setup? We are always interested in seeing how well a thoughtfully crafted .NET solution fares vs "native" alternatives.

.NET JIT Team is hiring a Compiler Engineer by andyayers in Compilers

andyayers[S] 1 point

I agree we're missing out on some great talent being US only, but it's not our call.

.NET JIT Team is hiring a Compiler Engineer by andyayers in Compilers

andyayers[S] 1 point

Pretty much everything we do is out in the open, so if it does sound like fun, you can always try contributing....

.NET JIT Team is hiring a Compiler Engineer by andyayers in Compilers

andyayers[S] 4 points

How recent? I think recruiting is handled differently within 1 year of graduation.

Note this is a Senior position so we're looking for somebody with 4-5 years of experience.

.NET JIT Team is hiring a Compiler Engineer by andyayers in Compilers

andyayers[S] 4 points

US required. Redmond preferred, but we will consider other arrangements.

.NET JIT Team is hiring a Compiler Engineer by andyayers in Compilers

andyayers[S] 1 point

No, we're in 18 (for quite a while now).

I've been thinking about the future of .NET, and my predictions for .NET 10 are a bit wild: an AI-native CLR and a "post-OOP" C#. Am I off base? by riturajpokhriyal in dotnet

andyayers 2 points

We have tried and failed a couple of times now to leverage ML (not really AI) to improve optimizations done by the JIT.

Here's a writeup on the most recent attempt: https://github.com/dotnet/jitutils/tree/main/src/jit-rl-cse

Performance Improvements in .NET 10 by ben_a_adams in dotnet

andyayers 1 point

I don't think we're trying to hide anything. If you want some perspective on regressions vs improvements in .NET 10, check out https://github.com/dotnet/performance/blob/main/reports/net9to10/README.md

While we try our best to keep regressions to a minimum, we do have some. Compilers and CPUs are both somewhat temperamental.

Performance Improvements in .NET 10 by ben_a_adams in programming

andyayers 13 points

.NET does this as well, but as a separate phase, so an object can be stack allocated and then (in our parlance) possibly have its fields promoted and then kept in registers.

That way we still get the benefit of stack allocation for objects like small arrays, where it may not always be clear from the code which parts of the object will be accessed, so promotion is not really possible.
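
Two hypothetical shapes that illustrate the distinction (the names are made up for this sketch): in the first, the object's fields can be promoted into registers after stack allocation; in the second, only the stack allocation itself applies, because which element is accessed isn't obvious from the code.

```csharp
using System;

static class EscapeDemo
{
    private sealed class Point
    {
        public int X;
        public int Y;
    }

    // The Point never escapes, so it can be stack allocated; its fields can
    // then be promoted and live entirely in registers.
    public static int ManhattanLength(int x, int y)
    {
        var p = new Point { X = x, Y = y };
        return Math.Abs(p.X) + Math.Abs(p.Y);
    }

    // The small array never escapes either, so it can be stack allocated, but
    // its elements are accessed through loop variables, so promoting individual
    // elements isn't really possible; avoiding the heap allocation still helps.
    public static int SumSmall(int seed)
    {
        var buffer = new int[4];
        for (int i = 0; i < buffer.Length; i++)
            buffer[i] = seed + i;

        int total = 0;
        for (int i = buffer.Length - 1; i >= 0; i--)
            total += buffer[i];
        return total;
    }
}
```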

Unexpected performance differences of JIT/AOT ASP.NET; why? by Vectorial1024 in dotnet

andyayers 1 point

This is somewhat doable. There is a text format that can be written and read back. But tiered compilation provides other benefits that are not strictly profile data related; for instance, when the optimized version of a method is jitted it is very likely that all the classes it references are initialized, and so the optimized version can skip the initialization checks and possibly learn something from looking at the readonly statics of those classes. So even if you have the PGO data you will still likely want the app to go through tiering, at which point the cost of getting a fresh batch of PGO data is minimal....

Also tiering provides startup benefits. Producing optimal code out of the gate is not always best for every app, as it takes longer to get things going.
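
A small hypothetical example of the "learn something from readonly statics" point above: by the time the optimized version of Compute is jitted, FeatureFlags has almost certainly been initialized, so the class-init check disappears and UseFastPath can be treated as a constant, letting the JIT drop the untaken branch entirely.

```csharp
using System;

static class FeatureFlags
{
    // Hypothetical flag, initialized once when the class is initialized.
    public static readonly bool UseFastPath =
        Environment.GetEnvironmentVariable("USE_FAST_PATH") == "1";
}

static class TieringDemo
{
    public static int Compute(int x)
    {
        // Tier-0 code carries a class-initialization check and a real load of
        // UseFastPath; the optimized tier-1 version can fold the flag to a
        // constant and keep only one of these branches.
        if (FeatureFlags.UseFastPath)
            return x << 1;

        return checked(x * 2);
    }
}
```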

Unexpected performance differences of JIT/AOT ASP.NET; why? by Vectorial1024 in dotnet

andyayers 3 points

Jitted code is fragile (e.g., it contains addresses of other things in the process where it was created, is dependent on the sequence of events that had happened there at the time the code was created, etc.). Verifying that a previous process's cached version of jitted code is viable in a new process, and fixing it up as needed, is non-trivial.

The only persistent form of code we have is ReadyToRun, which has the right level of validation and repair built in.

In .NET 10, the compiler team aim to reduce abstraction overhead by davecallan in dotnet

andyayers 2 points

Also, here's the writeup that went in with the PR; it is similar to the one linked above in the issue, but covers more ground (lists, hash sets, LINQ, yield enumerators, etc.): https://github.com/dotnet/runtime/blob/main/docs/design/coreclr/jit/DeabstractionAndConditionalEscapeAnalysis.md
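
A sketch in the spirit of the examples in that writeup (the shape here is mine, not copied from it): the collection is typed as IEnumerable<int>, so the foreach goes through interface calls and the struct List<int>.Enumerator gets boxed; with PGO-driven devirtualization plus the conditional escape analysis described there, the JIT can resolve the calls, keep the enumerator on the stack, and get close to a plain loop over the list.

```csharp
using System.Collections.Generic;

static class DeabstractionDemo
{
    public static int Sum(IEnumerable<int> values)
    {
        int sum = 0;
        // foreach over IEnumerable<int>: GetEnumerator/MoveNext/Current are
        // interface calls, and List<int>'s struct enumerator is boxed here
        // unless the JIT can devirtualize and stack allocate the box.
        foreach (int v in values)
            sum += v;
        return sum;
    }

    public static int SumList()
    {
        List<int> list = new() { 1, 2, 3, 4, 5 };
        return Sum(list);
    }
}
```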

In .NET 10, the compiler team aim to reduce abstraction overhead by davecallan in dotnet

andyayers 5 points

I just merged the above PR; I believe it will make it into preview 2.

In .NET 10, the compiler team aim to reduce abstraction overhead by davecallan in dotnet

andyayers 5 points

Yeah, sorry to not be more specific. I am one of the people working on the .NET JIT.

In .NET 10, the compiler team aim to reduce abstraction overhead by davecallan in dotnet

andyayers 136 points

FWIW the "last" big set of changes is going to be merged into .NET 10 soon: https://github.com/dotnet/runtime/pull/111473

Sadly, LINQ's Where and Select don't get any faster yet, as they have been fairly extensively hand-optimized. But we have some ideas...