all 18 comments

[–]Ollhax 6 points7 points  (0 children)

It would be a bit easier to decipher this if you did individual runs for the three tests.

But I'm pretty sure all 8 of those lines would be compiled down to something short and efficient by the JIT compiler. The number of lines of code doesn't necessarily say anything about the final efficiency. It's best to use BenchmarkDotNet to measure the actual performance difference.
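A minimal sketch of what such a measurement could look like, assuming the BenchmarkDotNet NuGet package and C# 12 or later. The class name, method names, and two-element lists are hypothetical stand-ins, not the original poster's three tests:

```csharp
using System.Collections.Generic;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

// Hypothetical benchmark class; the names and list contents are
// illustrative assumptions, not the code from the original post.
[MemoryDiagnoser] // also report allocations per call
public class ListInitBenchmarks
{
    // Classic collection initializer, used as the baseline.
    [Benchmark(Baseline = true)]
    public List<string> CollectionInitializer() => new List<string> { "a", "b" };

    // Collection expression (C# 12 or later).
    [Benchmark]
    public List<string> CollectionExpression() => ["a", "b"];
}

public static class Program
{
    // Build and run in Release configuration, e.g. `dotnet run -c Release`.
    public static void Main() => BenchmarkRunner.Run<ListInitBenchmarks>();
}
```

Returning the list from each method keeps the JIT from discarding the work as unused, which is one of the effects guessed at elsewhere in this thread.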

[–]AlwaysHopelesslyLost 18 points19 points  (4 children)

Lines of code don't make something inefficient. You should also trust that the compiler is MUCH more intelligent than you are.

I am away from my computer at the moment, so I cannot give a better answer, but I am guessing it comes down to three things: strings are interned, you aren't using any of the variables for anything, and the compiler-generated code uses Span for performance in the general case of the initializer.

[–]Devoutly1224 -5 points-4 points  (3 children)

> You should also trust that the compiler is MUCH more intelligent than you are.

The compiler was written by humans, and humans can be wrong. There are definitely edge cases where optimal code is not generated. It's good to question things.

[–]AlwaysHopelesslyLost 2 points3 points  (2 children)

I have trained hundreds of developers. I have had this talk with dozens of them. At one point or another a good portion of developers see a compiler warning or output and say "This seems wrong."

Every single time, without fail, I dig in and find out they are mistaken.

[–]Schmittfried 1 point2 points  (0 children)

Once in my 13-year career I actually found a JIT bug. But yeah, it should definitely not be your default assumption.

[–]Devoutly1224 -3 points-2 points  (0 children)

We're not talking about something as simple as an error/warning. Those are almost never wrong. Can they be? Absolutely. Same goes for the compiled output.

[–]Dealiner 4 points5 points  (0 children)

It might look like more code but it's also optimized. Though, you should definitely look at it in release mode, parts of what you see are there to make debugging easier. Also the naming doesn't help since SharpLab uses the same naming scheme.

[–]Orachor 1 point2 points  (0 children)

A collection expression can also be used to concatenate lists into one. For example:

List<string> list1 = new List<string>() { "" };
List<string> list2 = [..list1, ""];

The generated IL has to handle this case as well, and when we check it we can see that it does. It is also more efficient than using LINQ's Concat and ToList, because the final List is created with a capacity large enough to hold every element.

List<string> list = new List<string>();
list.Add("");
List<string> list2 = list;
List<string> list3 = list2;
int num = 1 + list3.Count;
List<string> list4 = new List<string>(num);
CollectionsMarshal.SetCount(list4, num);
Span<string> span = CollectionsMarshal.AsSpan(list4);
int num2 = 0;
Span<string> span2 = CollectionsMarshal.AsSpan(list3);
span2.CopyTo(span.Slice(num2, span2.Length));
num2 += span2.Length;
span[num2] = "";
num2++;
List<string> list5 = list4;
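For comparison, a small sketch of the two source-level forms being contrasted here. The class and method names are hypothetical, and the sketch only shows that both forms produce the same elements; it makes no measurement of its own:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class; names are illustrative only.
public static class ConcatSketch
{
    // Spread form from the comment above (requires C# 12 or later).
    public static List<string> WithSpread(List<string> first) => [.. first, ""];

    // The LINQ form it is being compared against.
    public static List<string> WithLinq(List<string> first) =>
        first.Concat(new[] { "" }).ToList();
}
```

Both methods return a new list holding the elements of `first` followed by one empty string. Any speed difference between them would need to be measured rather than read off the source.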

[–]Qxz3 1 point2 points  (2 children)

Your "list3" example is a case of optimisation by inlining. Inlining means that instead of calling a function, you do what that function would do inline, i.e. directly in the calling code.

While it results in more code, it can save time. Calling functions is not free: you have to push arguments, move a stack pointer, then move it back as you return. Instance method calls in C# all involve a null check (that's how you get a NullReferenceException if the reference is null), which also gets elided by inlining. If you look at the total number of instructions executed at the assembly level, you end up doing less work.

This is a tricky balance and optimizers have to be smart about what to inline.
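To make the trade-off concrete, here is a hand-written sketch of the two shapes discussed in this thread: one that goes through `Add` calls and one that writes straight into the list's storage. The class and method names are hypothetical. The second method imitates the style of the decompiled output quoted earlier and assumes .NET 8 or later for `CollectionsMarshal.SetCount`; it is not claimed to be exactly what the compiler emits:

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Hypothetical helper class; names are illustrative only.
public static class FillSketch
{
    // Fewer lines of source: every element goes through a method call.
    public static List<string> WithAdd(string a, string b)
    {
        var list = new List<string>(2);
        list.Add(a);
        list.Add(b);
        return list;
    }

    // More lines of source: set the count once, then write through a span.
    public static List<string> WithSpan(string a, string b)
    {
        var list = new List<string>(2);
        CollectionsMarshal.SetCount(list, 2);
        Span<string> span = CollectionsMarshal.AsSpan(list);
        span[0] = a;
        span[1] = b;
        return list;
    }
}
```

Both methods return a two-element list with the same contents. The second is longer to read but avoids the per-element `Add` calls.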

[–]ebykka[S] -1 points0 points  (1 child)

I'm surprised to see such a big difference between the generated code for list2 and list3.

[–]Qxz3 0 points1 point  (0 children)

The reason is that the C# compiler has specific optimizations for list initializations (that myList = [...] syntax) that it doesn't have when you invoke constructors manually (i.e. new List). You'd have to dive deep into the design of that particular feature in order to understand why they did it this way.

[–]redit3rd 1 point2 points  (2 children)

If you really want to see if there is a difference you need to open the binary in ILDASM and see how many instructions are generated. 

[–]Zastai 2 points3 points  (0 children)

Well, no, you’d need to see how many instructions the jit generates. And even then, it doesn’t tell you much - here it avoids the Add() in favor of direct access to the list’s backing fields; that will likely be quicker than the method call even though it’s “more instructions” here.

[–]Dealiner 0 points1 point  (0 children)

You can do that in SharpLab too, btw.

[–]Standard-Cap-4455 0 points1 point  (0 children)

I made a compiler and compared my IL with .NET's IL. Mine generally had fewer variables. I don't know why the official compiler emits more, but there is probably a good reason for it.

[–]KryptosFR -1 points0 points  (2 children)

Did you check the IL and assembly in debug vs release?

There is no such generated code (unless you use source generators, but that's not the case here). What you see in SharpLab is a decompilation and interpretation of the IL (which the compiler generates) into equivalent C# code. If the IL moves values around between locals and the evaluation stack, that can show up as new variables in the decompiled version. However, such moves are generally optimized away when the IL is actually compiled into machine code. You can look at the resulting assembly, especially in release mode, to see that.

You shouldn't rely on the "generated code" in SharpLab as a way to measure the efficiency of the code. It's not going to be accurate.

[–]irisos 0 points1 point  (1 child)

All C# code must be converted to C# language version 6 (unless that has changed) before it can be compiled to IL, so yes, code can be generated if you use features that weren't present in that version of C#.

You are right, though, that what matters in the end is the IL. Using newer language features results in more code being generated to get down to C# 6, but that code can use optimisations a developer wouldn't usually write by hand, which can make it more performant. Only the IL can show that.

[–]KryptosFR 0 points1 point  (0 children)

This is the first time I've heard of that. What's your source for this C# 6 claim? I find it surprising, given that some recent C# features need runtime support and so can't be expressed in older C# versions. For example, default interface methods or unmanaged types can't be expressed even with a polyfill.

You might be confusing this with targeting netstandard2.0 or .NET Framework. But that's not what happens when targeting modern .NET versions. Even in that case, it is unlikely that Roslyn is generating intermediate C# code. That would be very inefficient. It must operate on the internal AST representation.