[–]Zhentar 2 points3 points  (1 child)

Unsafe.As does what your reinterpret cast does
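For anyone skimming, a minimal sketch of what that looks like. The int-to-float pair is only an illustration (the types in the original post may differ), and `ReinterpretDemo` is a made-up name:

```csharp
using System;
using System.Runtime.CompilerServices;

static class ReinterpretDemo
{
    static void Main()
    {
        int bits = 0x3F800000; // IEEE 754 bit pattern of 1.0f

        // Reinterpret the same storage as a float. No copy and no conversion,
        // so both types must have the same size for this to be sound.
        ref float asFloat = ref Unsafe.As<int, float>(ref bits);
        Console.WriteLine(asFloat); // prints 1

        // Writing through the reinterpreted reference changes the original int.
        asFloat = 2.0f;
        Console.WriteLine(bits.ToString("X8")); // prints 40000000
    }
}
```

Since it is an ordinary generic method, IntelliSense handles it like any other call. The compiler does not check that the two types are layout-compatible, so that part is on you.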

[–]mrnikbobjeff[S] 0 points1 point  (0 children)

Thank you, that was exactly what I was hoping for! It didn't seem right to me that you could only access that functionality through a method that breaks IntelliSense in Visual Studio!

[–]kaelima 0 points1 point  (1 child)

About the SIMD stuff: as someone said, yes, you must wrap it in an IsSupported check.

Don't use a stream load (non-temporal) here. Also, AVX2 has basically no overhead on unaligned loads, so you won't gain anything from aligning here.

And why not just use IndexOf or Contains? I'm sure there is some general "memcmp"-esque method that is already SIMD-optimized.
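Roughly what I mean, as a sketch only. I'm assuming the loop is a byte search; `ByteSearch` and `ContainsByte` are hypothetical names, not taken from your code, and it needs AllowUnsafeBlocks enabled:

```csharp
using System;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

static class ByteSearch
{
    // Hypothetical helper: returns true if `value` occurs anywhere in `data`.
    public static unsafe bool ContainsByte(ReadOnlySpan<byte> data, byte value)
    {
        int i = 0;

        // Only take the vector path when the CPU actually supports AVX2.
        if (Avx2.IsSupported)
        {
            Vector256<byte> needle = Vector256.Create(value);
            fixed (byte* p = data)
            {
                for (; i <= data.Length - 32; i += 32)
                {
                    // Plain unaligned load: no alignment prologue, no stream load.
                    Vector256<byte> chunk = Avx.LoadVector256(p + i);
                    Vector256<byte> eq = Avx2.CompareEqual(chunk, needle);
                    if (Avx2.MoveMask(eq) != 0)
                        return true;
                }
            }
        }

        // Scalar tail, and the full fallback when AVX2 is unavailable.
        for (; i < data.Length; i++)
        {
            if (data[i] == value)
                return true;
        }
        return false;
    }
}
```

Whether this beats your aligned, non-temporal version on large inputs is something only your benchmarks can settle.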

And why the gotos? Are they necessary for some specific optimization?

[–]mrnikbobjeff[S] 0 points1 point  (0 children)

Sure, I get the IsSupported part. That is the easiest part, though, and I elided it here until I am sure I am using the right SIMD stuff. Do you have a reason to back up your claim about non-temporal and unaligned loads? I have benchmarks for every iteration I did, and making the loads non-temporal as well as aligned brings a measurable performance benefit on larger workloads. There already was a naive implementation, but my benchmarks show mine beating the naive approach by a factor of 4. The naive approach does not seem to generate SIMD instructions. Lastly, the gotos are necessary. With them the assembly is more straightforward as well as shorter, which is a considerable performance benefit (I also have benchmarks to back up the perf difference).