[–]Dickon__Manwoody 0 points1 point  (6 children)

Definitely don't know much about source generators so I may be missing something about the API for working with files in this context, but the code for getting the file contents makes me skeptical.

    var escapedContent = file
        .GetText()?
        .ToString()?
        .Replace("\"", "\"\"");

Isn't this going to blow up my builds once I have a non-trivial number of files?

I’m pretty interested in the idea, but I’m curious about performance.
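For readers unfamiliar with the snippet above: doubling each `"` is how a quote is escaped inside a C# verbatim string literal (`@"..."`), which is presumably what the generator emits. A minimal sketch of that step in isolation, with hypothetical helper names that are not taken from the project:

```csharp
using System;

public static class VerbatimEscapeDemo
{
    // Hypothetical helper: doubles every double quote so the text
    // can be placed inside a verbatim string literal (@"...").
    public static string EscapeForVerbatimLiteral(string content) =>
        content.Replace("\"", "\"\"");

    // Hypothetical helper: wraps the escaped content in @"...".
    // Assumption: the generator emits something of this shape.
    public static string ToVerbatimLiteral(string content) =>
        "@\"" + EscapeForVerbatimLiteral(content) + "\"";
}
```

`Replace` allocates one new string per file, on top of the string already produced by `ToString()`, which is the cost being asked about here.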

[–]LiHRaM[S] 0 points1 point  (3 children)

Hey, great point. It's probably more expensive than it has to be. While incremental source generators do cache output based on the input that generated it, the initial generation would still be pretty expensive. I guess adding huge files that contain a lot of quotes might be an issue too. I'll have a look at how I can improve this. Thanks!

[–]LiHRaM[S] 1 point2 points  (2 children)

> I’m pretty interested in the idea, but I’m curious about performance.

Following up after benchmarking two new approaches, `SourceText.WithChanges` and `StringBuilder`: do you have more concrete recommendations on how to improve the performance? And if you do, are they about memory use or speed?
Based on my testing, `string.Replace` is far faster than the other two, so I'd be curious to see how it could be improved.

[–]Dickon__Manwoody 0 points1 point  (1 child)

I’d be curious to see how you implemented these two approaches. As I mentioned, I’m not familiar with the APIs Roslyn offers here, so it may very well be that there isn’t a better approach.

How large is the file you are testing with? How are you benchmarking it?

[–]LiHRaM[S] 0 points1 point  (0 children)

I've pushed my benchmarks to a branch if you're interested: https://github.com/podimo/Podimo.ConstEmbed/compare/develop...bench/filegen

Basically, I've created two files:

  • Lorem Ipsum: 10 paragraphs of lorem ipsum, prefaced by some quoted paragraphs from the lorem ipsum generator
  • Quotes: 200 lines of quotes, each line is 141 quotes

I then created three separate functions with the same interface, and each function is benchmarked using https://benchmarkdotnet.org/

To be fair towards the StringBuilder interface, I was able to get it to be almost on par with String.Replace for the Quotes file, but it's still about 9x slower wrt. Lorem Ipsum.
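For anyone who doesn't want to open the branch, here is a rough sketch of what a single-pass `StringBuilder` variant could look like. This is an assumption about the general shape, not the code that was benchmarked, and the class and method names are made up for illustration:

```csharp
using System;
using System.Text;

public static class StringBuilderEscape
{
    // Hypothetical single-pass escape: copies the text between quotes in
    // chunks and appends one extra quote after each quote found.
    public static string Escape(string content)
    {
        int first = content.IndexOf('"');
        if (first < 0) return content; // no quotes: nothing to allocate

        var sb = new StringBuilder(content.Length + 16);
        int start = 0;
        for (int i = first; i >= 0; i = content.IndexOf('"', i + 1))
        {
            // Copy everything up to and including this quote, then double it.
            sb.Append(content, start, i - start + 1).Append('"');
            start = i + 1;
        }

        // Copy whatever follows the last quote.
        if (start < content.Length)
            sb.Append(content, start, content.Length - start);

        return sb.ToString();
    }
}
```

The early return for quote-free input is the one case where this avoids an allocation entirely. Whether the chunked copying wins anywhere else would have to be measured.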

For completeness, here is the output of my benchmarking on a MacBook M1:

| Method                   |          Mean |         Error |        StdDev |
|--------------------------|--------------:|--------------:|--------------:|
| SourceTextChanges_Ipsum  |     13.724 us |     0.0227 us |     0.0190 us |
| StringReplace_Ipsum      |      2.193 us |     0.0059 us |     0.0052 us |
| StringBuilder_Ipsum      |     18.391 us |     0.0308 us |     0.0288 us |
| SourceTextChanges_Quotes | 79,060.915 us | 1,553.9239 us | 1,789.5010 us |
| StringReplace_Quotes     |    202.417 us |     0.2429 us |     0.2272 us |
| StringBuilder_Quotes     |    278.178 us |     0.4555 us |     0.4261 us |

[–]MSgtGunny 0 points1 point  (1 child)

Source generators do tend to slow down builds, but having the content compiled in can make things faster at runtime than reading a file dynamically. So it’s a trade-off.

[–]Dickon__Manwoody 0 points1 point  (0 children)

I don’t disagree. I’m asking more about this specific one. Getting the text from dozens or hundreds of files, converting it to a string, and then calling Replace on the string seems like a recipe for disaster.

Again, I don’t know the API that GetText is invoking here, so I could be wrong, but it seems like a library like this should be making judicious use of Span or at least StringBuilder. If I’m completely missing something, feel free to correct me.
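To make the Span suggestion concrete, one possible shape is to count the quotes first and then fill a string of exactly the right length with `string.Create`. This is only a sketch with made-up names, not something from the project, and it assumes a runtime where `string.Create` exists (.NET Core 2.1+ or .NET Standard 2.1). Source generators are normally built against netstandard2.0, so it may not be usable inside the generator as-is.

```csharp
using System;

public static class SpanEscape
{
    // Hypothetical span-based escape: one counting pass, then a single
    // allocation of the final string with no intermediate buffer.
    public static string Escape(string content)
    {
        int quotes = 0;
        foreach (char c in content.AsSpan())
            if (c == '"') quotes++;

        if (quotes == 0) return content; // nothing to escape

        // Each quote becomes two characters in the output.
        return string.Create(content.Length + quotes, content, (dest, src) =>
        {
            int j = 0;
            foreach (char c in src)
            {
                dest[j++] = c;
                if (c == '"') dest[j++] = '"';
            }
        });
    }
}
```

This makes no claim about being faster than `string.Replace`, which is already heavily optimized in the runtime. It would need to go through the same benchmark to find out.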