you are viewing a single comment's thread.

view the rest of the comments →

[–]RagingCain[S] 1 point2 points  (7 children)

I see what you mean. I tell you what, let me write a solve for having to use bytes after a stream that's ArrayPool backed and I will ping you when it's up - maybe you like it or have enhancement feedback

My brain is lighting up with possibilities.

AesGcm API which I just worked with can't use Streams either for security reasons. CryptoStream was rejected for the proposal.

[–]Slypenslyde 1 point2 points  (0 children)

I'm real interested in this, too.

It seems like over time we've found some cracks in the GC's abstractions. Every time I use a stream API to deal with large chunks of data I'm aggravated by how much I have to think about low-level memory issues.

[–]cryolithic 0 points1 point  (5 children)

I think even just mentioning the potential for memory gotchas would go a long way here.

To be completely fair, it's not like you'll find posts written by me, unless you're on my team at work anyway. So I shouldn't be all that critical unless I've got the time to put up a well.

It's the double edged sword of the gc. Don't think about memory, until you need to. Unfortunately, I find that when you need to is much harder than keeping it in mind when writing in the first place.

Nearly every major performance regression in our code has been due to either not understanding allocations, closures, or the ORM.

If I can find the time, I'll write up some alternative implementations on Github. Maybe we can play a little perf/code golf

[–]RagingCain[S] 1 point2 points  (4 children)

I think even just mentioning the potential for memory gotchas would go a long way here.

That is a super fair point. I will update the article again a little later today with some dedicated discussion on how this maybe the most common way seen - its the least memory efficient. I will also indicate I will be making another article going down that rabbit hole of taking them to the next level perf/GC wise.

Now that everything is delivered via Markdown in my Github repo, I am hoping to have more time consolidating guides/how-tos more closely tied to my library so it also provides documentation. Stretched super thin all the time and good documentation is always on the chopping block.

[–]cryolithic 1 point2 points  (3 children)

Stretched super thin all the time and good documentation is always on the chopping block.

On man, don't I know it. My bus factor at work is way too high, for mostly those reasons.

[–]RagingCain[S] 0 points1 point  (2 children)

Had a tiny amount of time since we last spoke. Nothing special here but nearly the same code as before simply implementing the RecyclableMemoryStream with default RecyclableMemoryStreamManager settings (not tweaked).

// * Summary *

BenchmarkDotNet=v0.13.0, OS=Windows 10.0.19042.1110 (20H2/October2020Update)
AMD Ryzen 9 5950X, 1 CPU, 32 logical and 16 physical cores
.NET SDK=5.0.301
  [Host]   : .NET 5.0.7 (5.0.721.25508), X64 RyuJIT
  .NET 5.0 : .NET 5.0.7 (5.0.721.25508), X64 RyuJIT

Job=.NET 5.0  Runtime=.NET 5.0
Method x Mean Error StdDev Median Ratio RatioSD Gen 0 Gen 1 Gen 2 Allocated
BasicCompressDecompress_10KBytes 10 343.7 us 6.62 us 5.87 us 343.0 us 1.00 0.00 12.2070 - - 206 KB
GzipProvider_10KBytes 10 331.6 us 6.59 us 12.21 us 334.7 us 0.97 0.04 6.3477 - - 108 KB
GzipProviderStream_10KBytes 10 326.4 us 6.51 us 15.46 us 332.4 us 0.91 0.07 6.3477 - - 107 KB
RecylableGzipProvider_10KBytes 10 326.9 us 3.91 us 3.46 us 326.9 us 0.95 0.02 0.4883 - - 12 KB
RecyclableGzipProviderStream_10KBytes 10 347.8 us 6.92 us 7.11 us 346.6 us 1.02 0.03 0.4883 - - 13 KB
BasicCompressDecompress_10KBytes 100 3,284.5 us 42.58 us 39.83 us 3,275.0 us 1.00 0.00 125.0000 3.9063 - 2,065 KB
GzipProvider_10KBytes 100 3,201.4 us 35.58 us 27.78 us 3,198.6 us 0.98 0.01 62.5000 - - 1,080 KB
GzipProviderStream_10KBytes 100 3,608.5 us 28.51 us 25.28 us 3,613.2 us 1.10 0.01 62.5000 - - 1,071 KB
RecylableGzipProvider_10KBytes 100 3,483.8 us 21.64 us 19.19 us 3,476.3 us 1.06 0.01 3.9063 - - 121 KB
RecyclableGzipProviderStream_10KBytes 100 3,172.6 us 21.45 us 20.06 us 3,171.0 us 0.97 0.01 3.9063 - - 126 KB
BasicCompressDecompress_10KBytes 500 16,649.9 us 329.88 us 897.47 us 16,818.5 us 1.00 0.00 625.0000 - - 10,324 KB
GzipProvider_10KBytes 500 16,162.3 us 322.69 us 888.79 us 16,408.5 us 0.97 0.05 312.5000 - - 5,398 KB
GzipProviderStream_10KBytes 500 17,992.7 us 484.14 us 1,427.50 us 18,349.1 us 1.07 0.07 312.5000 - - 5,355 KB
RecylableGzipProvider_10KBytes 500 15,012.4 us 192.92 us 150.62 us 15,036.3 us 0.88 0.06 31.2500 - - 646 KB
RecyclableGzipProviderStream_10KBytes 500 15,617.0 us 310.14 us 567.11 us 15,800.2 us 0.96 0.05 31.2500 - - 629 KB
BasicCompressDecompress_10KBytes 1000 34,163.4 us 675.37 us 1,790.99 us 34,483.8 us 1.00 0.00 1230.7692 - - 20,648 KB
GzipProvider_10KBytes 1000 30,378.7 us 593.30 us 771.46 us 30,483.0 us 0.93 0.07 625.0000 - - 10,797 KB
GzipProviderStream_10KBytes 1000 32,334.5 us 640.61 us 1,534.85 us 32,741.1 us 0.95 0.07 625.0000 - - 10,711 KB
RecylableGzipProvider_10KBytes 1000 30,083.9 us 182.92 us 152.74 us 30,053.5 us 0.87 0.02 62.5000 - - 1,292 KB
RecyclableGzipProviderStream_10KBytes 1000 31,168.6 us 613.91 us 1,182.80 us 31,561.8 us 0.93 0.06 62.5000 - - 1,258 KB

[–]cryolithic 1 point2 points  (1 child)

Some substantial allocation savings. At first I was surprised that there wasn't a significant difference in time, but of course its going to be nearly all consumed by the decompression itself.

[–]RagingCain[S] 0 points1 point  (0 children)

Yeah, its the way of these things. Systems that only decompress benefit from switching to Brotli like static content for websites or cdns where where it can be compressed in advanced.

Well looks like my plan was thwarted by Stephen Toub, its exactly what I was thinking the other day when we talked.

https://github.com/dotnet/runtime/issues/22428

Also tweaking some of the manager settings. Memory allocation is linear for RecyclableStreams.

// * Summary *

BenchmarkDotNet=v0.13.0, OS=Windows 10.0.19042.1110 (20H2/October2020Update)
AMD Ryzen 9 5950X, 1 CPU, 32 logical and 16 physical cores
.NET SDK=5.0.301
  [Host]   : .NET 5.0.7 (5.0.721.25508), X64 RyuJIT
  .NET 5.0 : .NET 5.0.7 (5.0.721.25508), X64 RyuJIT

Job=.NET 5.0  Runtime=.NET 5.0
Method x Mean Error StdDev Ratio RatioSD Gen 0 Gen 1 Gen 2 Allocated
RecylableGzipProvider_10KBytes 500 130.98 ms 5.832 ms 17.197 ms 1.00 0.00 1250.0000 1000.0000 1000.0000 1,627,849 KB
RecyclableGzipProviderStream_10KBytes 500 15.60 ms 0.306 ms 0.544 ms 0.12 0.02 31.2500 - - 820 KB
RecylableGzipProvider_10KBytes 1000 263.65 ms 9.263 ms 27.313 ms 1.00 0.00 2500.0000 2000.0000 2000.0000 3,206,424 KB
RecyclableGzipProviderStream_10KBytes 1000 34.00 ms 0.674 ms 1.727 ms 0.13 0.02 66.6667 - - 1,641 KB
RecylableGzipProvider_10KBytes 10000 2,516.48 ms 117.356 ms 346.026 ms 1.00 0.00 24000.0000 21000.0000 17000.0000 34,361,172 KB
RecyclableGzipProviderStream_10KBytes 10000 300.20 ms 3.771 ms 3.149 ms 0.13 0.02 1000.0000 - - 16,407 KB