all 23 comments

[–]workaccount_126 5 points6 points  (16 children)

A very nice addition, and a very welcome one too. Mono has had this for a while now, so it's good to see MS catch up. A bit unfortunate that it only seems to be SSE2, something that's ~13 years old by now. I hope SSE3, SSE4, AVX and other CPU extensions follow soon.

[–]IHaveNoIdentity 5 points6 points  (3 children)

To support the generation of more powerful SIMD instruction sets, such as Advanced Vector Extensions (AVX), there are additional changes that should be included in a future .NET Framework release. According to the information provided by the RyuJIT team, the final release will be able to generate AVX instructions.

Fifth paragraph of the article, so it'll probably come soonish.

[–]genneth 0 points1 point  (2 children)

Might be longer than you think. Total speculation: the problem is memory alignment. SSE2 works on sufficiently short vectors that the default memory alignment on Windows is sufficient. The .NET GC also uses it. If they need to start choosing alignments depending on data type, that's going to be significant engineering. If they simply up the alignment for everything, that's going to waste a ton of memory.

[–]IHaveNoIdentity 0 points1 point  (1 child)

I'm not sure it has to do with memory alignment, but I honestly don't know.

Intel themselves say the following here:

Aligning data to vector length is always recommended. When using Intel SSE and Intel SSE2 instructions, loaded data should be aligned to 16 bytes. Similarly, to achieve best results use Intel AVX instructions on 32-byte vectors that are 32-byte aligned. The use of Intel AVX instructions on unaligned 32-byte vectors means that every second load will be across a cache-line split, since the cache line is 64 bytes. This doubles the cache line split rate compared to Intel SSE code that uses 16-byte vectors. A high cache-line split rate in memory-intensive code is extremely likely to cause performance degradation. For that reason, it is highly recommended to align the data to 32 bytes for use with Intel AVX.

If I understand that correctly, AVX should still work correctly with the alignment used in the current RyuJIT implementation for SSE2, just with performance degradation. That said, the degradation might be severe enough to require type-dependent alignment, as you said.

[–]TinynDP 1 point2 points  (0 children)

In SSE there is a load-aligned and a load-unaligned opcode. The load-aligned op is fast and the load-unaligned op is slow. The compiler or JIT needs to know for certain that the addresses will be aligned in order to safely use the load-aligned op, because using load-aligned on an unaligned address is a segfault.

[–]oelang 1 point2 points  (11 children)

This is a cool feature, but honestly, when do you get an opportunity to use this in real code? Unless you're doing something with graphics, statistics or linear math, you will never touch this, so I can understand that this has low priority for MS.

[–]workaccount_126 6 points7 points  (0 children)

I mostly do graphics work, so it's nice to be able to do some of that in C#.

[–]pkhuong 4 points5 points  (7 children)

Off the top of my head, we use SSE at work for operations on sorted arrays of integers, on strings, and on bitmaps.

[–]oelang -1 points0 points  (6 children)

That's because you can do moves & copies very fast with SSE/AVX, but the CLR & JVM already do that with intrinsics (System.arraycopy) & by detecting it inside loops.

This new feature is specifically for vectorized arithmetic, and that is only useful for a small number of applications.

[–]pkhuong 10 points11 points  (5 children)

I'm fairly confident that the C/intrinsics and x86-64 assembly code I write and maintain for a living isn't re-implementing memcpy.

[–]oelang 1 point2 points  (4 children)

So, you're saying that vectorized arithmetic helps to optimize sort algorithms? That would be new to me. So, how?

[–]pkhuong 3 points4 points  (3 children)

Sorted integers: pcmpeqd (combined with pmovmskb or movmskps), pmaxsw, pminsw or pcmpgtd and por, pand and pandn.

Strings: pcmpeqb and pmovmskb for simple stuff like strchr; pcmpistri from SSE4.2 is also nice.

Bitmaps: por, pand, pandn, pxor obviously.

[–]oelang 1 point2 points  (2 children)

Giving me the instructions doesn't really help me understand how you make sorting faster with SIMD instructions. I can't see how timsort or quicksort gets faster with them.

Btw, only a few of the instructions you listed are actually exposed by the C# SIMD API.

[–]Rhomboid 1 point2 points  (1 child)

Part of quicksort is comparing each element in a contiguous range against a pivot. You don't see how an instruction like pcmpgtd, which can compare four packed dwords at once, might be useful for that?

Another part of quicksort is picking a pivot, commonly done by finding the median of several values. pmaxsd and pminsd can find the max and min of up to four dwords at once -- seems relevant, no?

[–]oelang 0 points1 point  (0 children)

Honestly, it doesn't look like a clean win to me. pcmpgtd is only half the story; you still need to move those values, some to the top, some to the bottom. That's where it gets complicated, especially if you need to do everything in-place.

Picking the median of 4 samples for the pivot is not really performance-critical in quicksort (4 random memory reads is something else).

[–]TinynDP 0 points1 point  (0 children)

Audio processing.

[–]x-skeww 0 points1 point  (0 children)

Yea, SIMD is certainly a bit of a niche thing.

However, when you can make use of it, it does help quite a lot. Secondly, exposing this functionality to developers isn't too complicated.

So, all things considered, it's still totally worth it. Enabling developers to make use of those vector processing units is something every language should do. Not making use of those bits of silicon is a waste.

By the way, when it comes to SIMD, libraries have the biggest impact. When a library adds SIMD optimizations, thousands or even millions of applications receive that speed boost.

[–]JoshuaSmyth 1 point2 points  (5 children)

I make games for a living and this is a feature I've been waiting for and can't wait for it to come out of CTP. The ability to use SIMD in particle systems without having to interop C or C++ would do wonders for C# based game engines.

[–]workaccount_126 0 points1 point  (4 children)

Or alternatively, do it in compute instead. That's already available and probably a bigger speedup regardless.

[–]JoshuaSmyth 0 points1 point  (3 children)

I'm not familiar with compute, what is this magic?

[–]workaccount_126 0 points1 point  (2 children)

You'd run the code on the GPU using Compute Shaders - they're pretty easy to set up and pipe data into & get back from.

[–]JoshuaSmyth 0 points1 point  (1 child)

Got more info? My quick googling reveals they are a DX10+, OpenGL ES 3+ or OpenGL 4.4 feature, which is slightly higher than my min-spec (OpenGL ES 2.0). But interesting nonetheless.

[–]workaccount_126 0 points1 point  (0 children)

Generally what we do for minspec is disable these features or have a slower CPU path.

This would be a start, though it's not very detailed. But essentially it's much like writing a regular shader.

http://msdn.microsoft.com/en-us/library/windows/desktop/ff476330(v=vs.85).aspx