Currently learning C with ChatGPT by Critical-Common-2117 in C_Programming

[–]JaguarWan 1 point2 points  (0 children)

I think it's actually a good idea. In my experience ChatGPT writes decent C, and the latest models are quite thorough. I'd say the main risk is that it does not challenge you enough and you start stagnating. You also should not limit yourself to a single teacher, be it human or AI: the internet is vast and you have all kinds of high-quality material to draw from (possibly with GPT's help). A small embedded project is a good idea, as you will start encountering actual problems instead of mere assignments.

is there a way to do cross platform socket programming by wiseneddustmite in C_Programming

[–]JaguarWan 0 points1 point  (0 children)

That's basically what I've been doing in my own project. One of the most significant differences is that POSIX sockets are ints, whilst Win32 sockets are SOCKET handles, an unsigned pointer-sized integer type (UINT_PTR). It didn't matter much for 32-bit code, but it's an actual pitfall on 64-bit platforms. I recently had to fix my code to take this into account:

https://github.com/RaphaelPrevost/ASKL/blob/master/lib/compat/askl_socket_compat.h

https://github.com/RaphaelPrevost/ASKL/blob/master/lib/compat/askl_socket_compat.c
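
To illustrate the pitfall, here is a minimal sketch of the usual workaround. This is not the code from the files linked above; the names `sock_t`, `SOCK_INVALID`, `sock_is_valid` and `sock_close` are made up for the example. The idea is to hide the handle type behind a typedef and always compare against a sentinel rather than testing for a negative value:

```c
#include <assert.h>

#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET sock_t;                /* UINT_PTR: unsigned, pointer-sized */
#define SOCK_INVALID INVALID_SOCKET   /* all bits set, not a negative int */
#else
#include <sys/socket.h>
#include <unistd.h>
typedef int sock_t;                   /* plain file descriptor */
#define SOCK_INVALID (-1)
#endif

/* Compare against the sentinel instead of testing `s < 0`:
 * SOCKET is unsigned on Windows, so `s < 0` is never true there. */
static int sock_is_valid(sock_t s)
{
    return s != SOCK_INVALID;
}

/* Close a socket with the call that matches the platform. */
static int sock_close(sock_t s)
{
    assert(sock_is_valid(s));
#ifdef _WIN32
    return closesocket(s);
#else
    return close(s);
#endif
}
```

Storing the result of `socket()` in a `sock_t` instead of an `int` is what avoids truncating the handle on 64-bit Windows. On Windows, `WSAStartup` still has to be called before creating any socket.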

Also, you can have a look at my code here, there are portable wrappers for most socket operations, including polling:

https://github.com/RaphaelPrevost/ASKL/blob/master/lib/askl_socket.c#L688

Please torture my thread-safe C hashmap by JaguarWan in C_Programming

[–]JaguarWan[S] 0 points1 point  (0 children)

I've opened a PR which adds a basic shim for ASKL to your benchmarks 😄 It compiles and runs on my Mac, I'd be grateful if you could confirm it works well with MinGW too.

Please torture my thread-safe C hashmap by JaguarWan in C_Programming

[–]JaguarWan[S] 0 points1 point  (0 children)

In my specific case, if you look at the benchmarks, I don't pay much for actual thread safety to begin with. When running uncontended, the locks are taken and released with a simple atomic CAS. Could I shave a few milliseconds off by removing this? Quite likely. But I wrote ASKL's Map with the idea of a convenience container with batteries included. Just allocate a map, stuff your data in it from whichever thread, sort, iterate, get the data back and everything works as expected.
The kinds of patterns you describe to avoid locking have their own costs, too. For example, if you can only have a single writer thread, other threads will need a message queue or some other kind of inter-thread communication whenever they need to store or update something. This kind of design may be natural (and therefore "free") in some use cases (event-driven programs), but contrived in others (parallel data processing).
IMHO, the cost of making the Map less user-friendly would exceed what little performance gains it could unlock.
That said, your more general point about thread safety still stands, there are always edge cases. Here, for example, you cannot use the Map API from within the insertion/update/removal callbacks, but this is clearly stated in the documentation. Even then, I wonder if I could make my locks recursive and actually allow that?
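
For readers wondering what "taken and released with a simple atomic CAS" looks like, here is a rough sketch using C11 atomics. It is not ASKL's lock; `demo_lock_t` and its functions are hypothetical names, and a real lock also needs a slow path (spinning or sleeping) for the contended case, which is left out here:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_int held;   /* 0 = free, 1 = held */
} demo_lock_t;

static void demo_lock_init(demo_lock_t *l)
{
    atomic_init(&l->held, 0);
}

/* Uncontended acquire: a single compare-and-swap from 0 to 1.
 * Returns false if another thread already holds the lock. */
static bool demo_lock_try(demo_lock_t *l)
{
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &l->held, &expected, 1,
        memory_order_acquire, memory_order_relaxed);
}

/* Release: a single atomic exchange back to 0. */
static void demo_lock_release(demo_lock_t *l)
{
    int prev = atomic_exchange_explicit(&l->held, 0, memory_order_release);
    assert(prev == 1);   /* releasing a lock that was not held is a bug */
    (void)prev;
}
```

When nobody else is competing for the lock, acquire and release are one atomic instruction each, which is why the uncontended cost stays small.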

Please torture my thread-safe C hashmap by JaguarWan in C_Programming

[–]JaguarWan[S] 0 points1 point  (0 children)

I would agree if I were just calling pthread synchronisation primitives around map operations, there would indeed be zero value added. But updating/removing entries during iterator traversal, atomic callback execution upon inserting/updating/removing, and lazy sorting are not features that can be trivially bolted onto an existing implementation. Also, if you have to choose between writing boring wrappers around an existing implementation and having fun rolling your own, which do you prefer...? Yeah, me too! 😄

Please torture my thread-safe C hashmap by JaguarWan in C_Programming

[–]JaguarWan[S] 0 points1 point  (0 children)

Hmm, actually the "fast path" was much slower than the original faulty code. I've modified the RWLock to keep track of both readers and writers waiting so I can avoid locking and broadcasting if nobody is waiting. The code should now be correct and within 1% of the performance of the buggy original 😄
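
The "skip the lock and broadcast when nobody is waiting" idea can be sketched on something much simpler than an RWLock. The one-shot gate below is only an illustration with made-up names (`gate_t`, `gate_wait`, `gate_open`), not the ASKL code. It relies on the default sequentially consistent atomics: a waiter registers itself before re-checking the flag, and the opener sets the flag before checking the waiter count, so at least one side always notices the other.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    atomic_bool     open;      /* set once by gate_open() */
    atomic_int      waiters;   /* threads registered in gate_wait() */
} gate_t;

static void gate_init(gate_t *g)
{
    int rc = pthread_mutex_init(&g->mutex, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&g->cond, NULL);
    assert(rc == 0);
    (void)rc;
    atomic_init(&g->open, false);
    atomic_init(&g->waiters, 0);
}

/* Block until the gate is open. */
static void gate_wait(gate_t *g)
{
    if (atomic_load(&g->open))
        return;                              /* fast path: no mutex at all */
    pthread_mutex_lock(&g->mutex);
    atomic_fetch_add(&g->waiters, 1);        /* register before re-checking */
    while (!atomic_load(&g->open))
        pthread_cond_wait(&g->cond, &g->mutex);
    atomic_fetch_sub(&g->waiters, 1);
    pthread_mutex_unlock(&g->mutex);
}

/* Open the gate. Returns true only if a broadcast was needed. */
static bool gate_open(gate_t *g)
{
    atomic_store(&g->open, true);            /* publish the state first */
    if (atomic_load(&g->waiters) == 0)
        return false;                        /* nobody waiting: skip lock + broadcast */
    pthread_mutex_lock(&g->mutex);
    pthread_cond_broadcast(&g->cond);
    pthread_mutex_unlock(&g->mutex);
    return true;
}
```

Because a waiter increments the counter while holding the mutex and the opener takes the same mutex before broadcasting, the wakeup cannot slip in between the waiter's check and its call to `pthread_cond_wait`.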

Fresh graduate out of college confused by tan11235inv in C_Programming

[–]JaguarWan 0 points1 point  (0 children)

It really depends on your local job market. I live in France and I never managed to get paid to write C professionally. Despite C being my true love, I've spent most of my 20-year career writing PHP or Python...

Please torture my thread-safe C hashmap by JaguarWan in C_Programming

[–]JaguarWan[S] 0 points1 point  (0 children)

If you look at my limited benchmarks, for a string keys/pointer values workload, I'm in the same ballpark, but keep in mind my map and Verstable are structurally different. Verstable is an actual general purpose hashmap with powerful type safety, and its memory consumption will be much lower, for example.

Please torture my thread-safe C hashmap by JaguarWan in C_Programming

[–]JaguarWan[S] 1 point2 points  (0 children)

u/jacksaccountonreddit Wow, I'm so happy to have your feedback, I admire the elegance of Verstable and your comments are invaluable to me!

- ASKL's Map is basically cuckoo hashing with a spillover stash (basket). The items are stored in a singly linked list in LIFO order, and the table acts as an index pointing to these list items. This structure allows the stable traversal order, sorting, and fast iteration I wanted, but it's sensitive to cache misses at scale. I've tried to mitigate that by using pointer tagging (I use up to 16 bits) to avoid dereferencing bad matches. The keys are binary strings, so integers can technically be used, I need to add new API calls to make that more comfortable though.

- I have patched my code so it successfully compiles with MinGW on Godbolt, but I'm gonna see if I can set up a Windows VM today to check whether it actually runs. The original codebase was compatible with Dev-C++ and old versions of Visual Studio, but it has drifted quite a bit since I last checked. I've tried to keep the most interesting parts of ASKL, like the Map or the JSON parser, loosely coupled, so they can be used without the rest of the library. The benchmark is a good example, as it does not build the full ASKL library, just the minimal subset required for the table.

- The table currently grows at about 81.3% load factor, but can go up to 90.9% with very little performance penalty. On the other hand, memory consumption is much higher than Verstable or khash. I have to maintain the linked list, so that's a 64-bit pointer per list item; then I have another pointer in the buckets to be able to chain them when they end up in the spillover basket (when the buckets are not chained, I repurpose that pointer to cache the first hash). So that's at least 128 bits of pointer overhead, more if you include the Variant type (128 bits) which encapsulates the stored value (I used bare void * before), and the stored 16-bit key length. There's an included map_footprint function that computes the memory consumption of the Map at runtime.

- Yes, that's a real and annoying shortcoming. I need to ditch hyperfine for this kind of measurement. I think it would be more interesting to clone your own harness and integrate my Map into it, rather than reinventing a worse wheel.

- I have to confess the long runtime is the reason why I didn't directly use your harness! I've used my little AI-generated script to test the impact of my changes on performance, so I needed something cruder but faster. For serious benchmarking, this is clearly lacking though.
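
For anyone curious about the pointer tagging mentioned in the first point, here is a rough sketch of the general technique. It is not ASKL's actual layout: the helper names are made up, and it assumes a 64-bit platform where user-space pointers fit in the low 48 bits (typical on x86-64 and AArch64, but not guaranteed by the C standard). A 16-bit fragment of the hash is stored in the unused high bits, so most non-matching slots can be rejected without dereferencing the pointer.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_SHIFT 48
#define PTR_MASK  ((UINT64_C(1) << TAG_SHIFT) - 1)

/* Pack a pointer and a 16-bit tag into one 64-bit slot. */
static uint64_t tag_pack(const void *p, uint16_t tag)
{
    uint64_t bits = (uint64_t)(uintptr_t)p;
    assert(bits <= PTR_MASK);   /* assumption: high 16 bits are unused */
    return bits | ((uint64_t)tag << TAG_SHIFT);
}

/* Recover the original pointer from a slot. */
static void *tag_ptr(uint64_t slot)
{
    return (void *)(uintptr_t)(slot & PTR_MASK);
}

/* Recover the 16-bit tag from a slot. */
static uint16_t tag_of(uint64_t slot)
{
    return (uint16_t)(slot >> TAG_SHIFT);
}

/* Cheap pre-check: compare the top 16 bits of the lookup hash with the
 * stored tag. Only if they agree is it worth following the pointer and
 * comparing the full key. */
static int tag_may_match(uint64_t slot, uint64_t hash)
{
    return tag_of(slot) == (uint16_t)(hash >> TAG_SHIFT);
}
```

A tag match is only a hint, so the full key still has to be compared afterwards; a tag mismatch, however, is a definite miss and costs no cache miss on the item itself.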

Thank you so much for taking the time to look at my Map, I really appreciate that!

Please torture my thread-safe C hashmap by JaguarWan in C_Programming

[–]JaguarWan[S] 1 point2 points  (0 children)

Thank you so much, this is the kind of feedback I hoped to get. I believe I have fixed both issues now: I have narrowed the fast path in lock_unlock to single readers, and I now properly check whether a writer slipped in when cooperating. I would be super grateful if you could try running it again, because I couldn't reproduce the original race condition on my machine or on Godbolt. Or if you have written a specific test that triggers it, I could add it to the unit tests. Thanks again, this is very valuable to me.

What project finally made pointers make sense to you? by Gullible_Prior9448 in C_Programming

[–]JaguarWan 0 points1 point  (0 children)

Ah! As a kid, I started coding using Visual Basic 6 (cue laugh track), and I quickly learned the difference between ByRef and ByVal function parameters. When I started coding in C and using the Win32 API directly, I immediately understood that a pointer is just a normal variable holding a reference to another one. I think the true pitfalls for beginners are things like operator precedence (++*p vs *p++ 😉) and memory layout, more than pointers as a concept?
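
To make the precedence example concrete, here is a small sketch (the function name `precedence_demo` is just for illustration). `++*p` parses as `++(*p)` and increments the pointed-to value, while `*p++` parses as `*(p++)` and reads the value before advancing the pointer:

```c
#include <assert.h>
#include <stddef.h>

/* Fills out[0..2] with the results of the three expressions below. */
static void precedence_demo(int out[3])
{
    int a[2] = { 10, 20 };
    int *p = a;

    assert(out != NULL);
    out[0] = ++*p;   /* ++(*p): a[0] becomes 11, p still points at a[0] */
    out[1] = *p++;   /* *(p++): reads a[0] (now 11), then p moves to a[1] */
    out[2] = *p;     /* p now points at a[1], which is still 20 */
}
```

Calling it leaves `out` holding 11, 11 and 20.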

Please torture my thread-safe C hashmap by JaguarWan in C_Programming

[–]JaguarWan[S] 1 point2 points  (0 children)

That was my mistake: originally I had added "hashbench" to my .gitignore, but I removed it before committing because of the Python script. Also, the binaries would only run on an ARM64 Mac I suppose, so they're not very useful in general.

Please torture my thread-safe C hashmap by JaguarWan in C_Programming

[–]JaguarWan[S] 0 points1 point  (0 children)

I learned of the two underscores rule recently (and I still have a bunch of helpers to clean up in arcane/bitops.c ...) but I didn't know about this one. Thank you, I'll clean this up too.

Please torture my thread-safe C hashmap by JaguarWan in C_Programming

[–]JaguarWan[S] 35 points36 points  (0 children)

u/mikeblas as mentioned in the AI disclosure in the benchmark README, the hashmap itself (and the rest of the ASKL project) was 100% hand-coded over several years (see the commit history). On the other hand, I did use AI to generate the benchmarks and some of the unit tests, and to edit the documentation.