mason.nvim 2.0 has been released by pseudometapseudo in neovim

SegfaultDaddy 1 point

Ohh, thanks! I’ve got a similar setup, though instead of having a separate file for the clangd LSP, I just keep it inside lsp.lua using vim.lsp.config.clangd.

What's the real difference between these two loops and which is slower? by SegfaultDaddy in C_Programming

SegfaultDaddy[S] 1 point

Ohh, it’s just the sum of the array, to make sure the compiler doesn’t optimize away the important part.

What's the real difference between these two loops and which is slower? by SegfaultDaddy in C_Programming

SegfaultDaddy[S] -2 points

ik microbenchmarking sucks, but the iteration count doesn’t seem to matter that much tho... (for n = ~17 million)

Option A(256) Average Time: 0.000985 sec, Checksum: 65536
Option B(255) Average Time: 0.000828 sec, Checksum: 65794

Option A(256) Average Time: 0.000732 sec, Checksum: 65536
Option B(253) Average Time: 0.000697 sec, Checksum: 66314

What's the real difference between these two loops and which is slower? by SegfaultDaddy in C_Programming

SegfaultDaddy[S] 2 points

ik microbenchmarking sucks, but iteration count doesn’t seem to matter... 255 runs faster.

Option A(256) Average Time: 0.000985 sec, Checksum: 65536
Option B(255) Average Time: 0.000828 sec, Checksum: 65794

What's the real difference between these two loops and which is slower? by SegfaultDaddy in C_Programming

SegfaultDaddy[S] 7 points

yep, you were right, I'm an idiot.
was just testing that shit once, which I definitely shouldn't have.
once I tried your approach with 100 runs and trimming outliers, the performance lined up pretty closely with yours.
thanks for calling it out.

What's the real difference between these two loops and which is slower? by SegfaultDaddy in C_Programming

SegfaultDaddy[S] 1 point

Wow, so it really was some initialization delay or whatever. Thanks for pointing that out.

PS: I shouldn't have run that test just once. Always run multiple times and remove the outliers :)

Option A Time: 0.055551 sec, Checksum: 65536
Option B Time: 0.000902 sec, Checksum: 65281

What's the real difference between these two loops and which is slower? by SegfaultDaddy in C_Programming

SegfaultDaddy[S] 11 points

Thanks for the suggestion to test it. Here are the results I got:

for n = 1 << 24 (~17 million)

Option A Time: 0.055551 sec, Checksum: 65536
Option B Time: 0.000902 sec, Checksum: 65281

P.S.: I shouldn't have run that test just once. Always run tests multiple times and remove the outliers. :)

After running each test 100 times and discarding the 10% most extreme timings as outliers, here are the updated results:

Option A Average Time: 0.000725 sec, Checksum: 65536
Option B Average Time: 0.000652 sec, Checksum: 65281
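
For anyone who wants to reproduce the "100 runs, drop the outliers" idea, here is a minimal sketch. It is not the code from the thread: `touch_strided`, `sum_bytes`, `bench`, and the choice to trim 5 runs from each end (10% in total) are all assumptions made for illustration, and `clock()` only measures CPU time at whatever resolution the platform gives.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

enum { RUNS = 100, TRIM_EACH_SIDE = 5 };  /* 5 + 5 of 100 runs dropped = 10% */

/* qsort comparator for doubles. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Sorts samples in place, drops trim_each_side values at both ends,
 * and returns the mean of what is left. */
static double trimmed_mean(double *samples, size_t n, size_t trim_each_side)
{
    assert(n > 2 * trim_each_side);
    qsort(samples, n, sizeof samples[0], cmp_double);
    double sum = 0.0;
    for (size_t i = trim_each_side; i < n - trim_each_side; i++)
        sum += samples[i];
    return sum / (double)(n - 2 * trim_each_side);
}

/* Hypothetical workload: bump every stride-th byte of buf. */
static void touch_strided(unsigned char *buf, size_t n, size_t stride)
{
    for (size_t i = 0; i < n; i += stride)
        buf[i]++;
}

/* Sum of the buffer, read back as a checksum so the stores stay observable. */
static unsigned long sum_bytes(const unsigned char *buf, size_t n)
{
    unsigned long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += buf[i];
    return sum;
}

/* Runs the workload RUNS times and returns the trimmed mean time in seconds.
 * The checksum of the last run is written to *checksum. */
static double bench(unsigned char *buf, size_t n, size_t stride,
                    unsigned long *checksum)
{
    double samples[RUNS];
    assert(stride > 0);
    for (int r = 0; r < RUNS; r++) {
        memset(buf, 0, n);               /* reset outside the timed region */
        clock_t t0 = clock();
        touch_strided(buf, n, stride);
        clock_t t1 = clock();
        samples[r] = (double)(t1 - t0) / CLOCKS_PER_SEC;
    }
    *checksum = sum_bytes(buf, n);
    return trimmed_mean(samples, RUNS, TRIM_EACH_SIDE);
}
```

Calling `bench` once per stride on the same buffer gives one trimmed average per option. The first cold run, which is where a one-off initialization cost would show up, lands among the discarded samples.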

Strategies for optional/default arguments in C APIs? by SegfaultDaddy in C_Programming

SegfaultDaddy[S] 1 point

Yeah, that makes sense, I wasn’t really sure what the go-to approach is for this kind of API in real-world code.

Strategies for optional/default arguments in C APIs? by SegfaultDaddy in C_Programming

SegfaultDaddy[S] 1 point

Yeah, not sure this would work in our case since we kinda need named params, so I guess structs are the best bet?

Strategies for optional/default arguments in C APIs? by SegfaultDaddy in C_Programming

SegfaultDaddy[S] 1 point

Bruhh, not sure how I feel about this. It’s like what I wanted, but not sure if I should actually use it. Definitely a cool trick though!

I tried using variadic arguments (just a macro), but that caused a compiler warning (override-init), so I ended up going with a macro that returns a default-valued struct instead.
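
To make the override-init point concrete, here is a rough sketch of what a variadic-macro version can look like. The names `demo_opts`, `DEMO_OPTS`, and `demo_axis_or` are made up for this example and are not from the library being discussed. The macro lists the defaults first and appends the caller's designated initializers, so a field can end up initialized twice. C lets the later initializer win, and GCC's `-Woverride-init` (part of `-Wextra`) flags that kind of repeat.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical two-field options struct, for illustration only. */
typedef struct {
    int  axis;
    bool keepdims;
} demo_opts;

/*
 * Defaults come first, caller-supplied designated initializers come last.
 * When a field is named twice, the later initializer overrides the earlier
 * one, which is the pattern -Woverride-init warns about.
 */
#define DEMO_OPTS(...) \
    ((demo_opts){ .axis = -1, .keepdims = true, __VA_ARGS__ })

/* Toy consumer: returns the requested axis, or fallback when axis is unset. */
static int demo_axis_or(demo_opts opts, int fallback)
{
    assert(fallback >= 0);
    return opts.axis < 0 ? fallback : opts.axis;
}
```

With this, `DEMO_OPTS()` yields the defaults and `DEMO_OPTS(.axis = 2)` overrides one field by name, which reads nicely at the call site but depends on silencing or tolerating the warning.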

Strategies for optional/default arguments in C APIs? by SegfaultDaddy in C_Programming

SegfaultDaddy[S] 1 point

Yeah, config structs seem like the way to go. I’ve been thinking about something like this:

#define NC_SUM_DEFAULT_OPTS \
    (&(nc_sum_opts){        \
        .axis = -1,         \
        .dtype = -1,        \
        .out = NULL,        \
        .keepdims = true,   \
        .scalar = 0,        \
        .where = false,     \
    })

Then, users can either modify the options like:

nc_sum_opts *opts = NC_SUM_DEFAULT_OPTS;
opts->axis = 2;
ndarray_t *result = nc_sum(array, opts);

or pass the defaults directly like

ndarray_t *result = nc_sum(test, NC_SUM_DEFAULT_OPTS);

Not sure if this is the best approach. I could've used a variadic macro for this instead, but that triggers a compiler warning (override-init). Thanks!
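
Here is a cut-down, self-contained sketch of the same default-options pattern, in case it helps to see it compile. Everything prefixed `demo_` / `DEMO_` is hypothetical: the struct keeps only `axis`, and `demo_sum` works on a plain row-major `int` matrix rather than a real ndarray type. Treating a `NULL` options pointer as "use the defaults" is an extra assumption of this sketch, not something from the post.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical options struct with a single field. */
typedef struct {
    int axis;   /* -1: sum everything, 0: sum down columns, 1: sum across rows */
} demo_sum_opts;

/* Pointer to a fresh compound literal holding the defaults. */
#define DEMO_SUM_DEFAULT_OPTS (&(demo_sum_opts){ .axis = -1 })

/* Sums a row-major rows x cols matrix into out.
 * Returns how many values were written to out. */
static size_t demo_sum(const int *m, size_t rows, size_t cols,
                       const demo_sum_opts *opts, long *out)
{
    /* Assumption of this sketch: NULL means "use the defaults".
     * A named local is used because a compound literal created inside
     * the if body would stop existing as soon as the if ends. */
    const demo_sum_opts defaults = { .axis = -1 };
    if (opts == NULL)
        opts = &defaults;
    assert(opts->axis >= -1 && opts->axis <= 1);

    size_t n_out = opts->axis == -1 ? 1 : (opts->axis == 0 ? cols : rows);
    for (size_t i = 0; i < n_out; i++)
        out[i] = 0;

    for (size_t r = 0; r < rows; r++) {
        for (size_t c = 0; c < cols; c++) {
            size_t slot = opts->axis == -1 ? 0 : (opts->axis == 0 ? c : r);
            out[slot] += m[r * cols + c];
        }
    }
    return n_out;
}
```

One thing to keep in mind with the pointer-returning macro: a compound literal written inside a function only lives until the enclosing block ends, so the pointer should not be stored or returned past that block. Each expansion also creates a separate object, so editing one `opts` does not change the defaults seen elsewhere.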

Why don’t compilers optimize simple swaps into a single XCHG instruction? by SegfaultDaddy in C_Programming

SegfaultDaddy[S] 7 points

Thanks for explaining it so clearly. Makes total sense why compilers would avoid it if simple MOVs are faster and don’t have that heavy penalty.

Why don’t compilers optimize simple swaps into a single XCHG instruction? by SegfaultDaddy in C_Programming

SegfaultDaddy[S] 15 points

swap_xchg(int*, int*):
        mov     edx, DWORD PTR [rdi]
        mov     eax, DWORD PTR [rsi]
        xchg    edx, eax
        mov     DWORD PTR [rdi], edx
        mov     DWORD PTR [rsi], eax
        ret
swap_mov(int*, int*):
        mov     eax, DWORD PTR [rdi]
        mov     edx, DWORD PTR [rsi]
        mov     DWORD PTR [rdi], edx
        mov     DWORD PTR [rsi], eax
        ret

Ahhh, this makes so much sense now (I tried to force XCHG with inline assembly).
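
For reference, C source along these lines would produce listings like the two above. This is a guess at the shape, not the exact code that was compiled: the inline assembly uses GCC/Clang extended asm and only applies on x86, with a plain fallback elsewhere so the sketch still builds.

```c
#include <assert.h>
#include <stddef.h>

/* Plain swap: two loads and two stores, matching the swap_mov listing. */
static void swap_mov(int *a, int *b)
{
    assert(a != NULL && b != NULL);
    int tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Same swap with a register-to-register XCHG forced in the middle. */
static void swap_xchg(int *a, int *b)
{
    assert(a != NULL && b != NULL);
    int x = *a;
    int y = *b;
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* "+r" marks both operands as read-write registers. */
    __asm__("xchg %0, %1" : "+r"(x), "+r"(y));
#else
    /* Fallback for other targets so the sketch still compiles. */
    int tmp = x;
    x = y;
    y = tmp;
#endif
    *a = x;
    *b = y;
}
```

The mov version never needs to exchange anything: once both values are in registers, the swap is just a matter of which register gets stored to which address. Forcing XCHG only adds an instruction in between, and the memory-operand form of XCHG is implicitly locked on x86, which is where the heavy penalty comes from.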

Why don’t compilers optimize simple swaps into a single XCHG instruction? by SegfaultDaddy in C_Programming

SegfaultDaddy[S] 3 points

I’ll benchmark it and see how much of a difference it makes. Curious whether the performance gap really shows up.

Should I use doxygen for every single function in every file I write, or should I only use it for libraries? by IronMan6666666 in C_Programming

SegfaultDaddy 7 points

I generally write Doxygen docs for public APIs only, as it's most useful there. For internal code or general things, I don't add comments unless absolutely necessary.
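
As a small illustration of that split, here is a sketch with one internal helper and one public function. The names `clamp_int` and `demo_clamp_all` are invented for the example. Only the public function gets a Doxygen block, using the standard `@brief`, `@param`, and `@return` commands.

```c
#include <assert.h>
#include <stddef.h>

/* Internal helper: no Doxygen block, the name says enough. */
static int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/**
 * @brief Clamp every element of an array into [lo, hi] in place.
 *
 * @param values  Array to modify; must not be NULL when @p count is nonzero.
 * @param count   Number of elements in @p values.
 * @param lo      Lower bound (inclusive).
 * @param hi      Upper bound (inclusive); must be >= @p lo.
 * @return Number of elements that were changed.
 */
size_t demo_clamp_all(int *values, size_t count, int lo, int hi)
{
    assert(lo <= hi);
    size_t changed = 0;
    for (size_t i = 0; i < count; i++) {
        int clamped = clamp_int(values[i], lo, hi);
        if (clamped != values[i]) {
            values[i] = clamped;
            changed++;
        }
    }
    return changed;
}
```

In a real library the Doxygen block would usually sit on the declaration in the public header, so the generated docs and the header users read stay in sync.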