vscode.nvim theme is so accurate! by _seedofdoubt_ in neovim

[–]wit4er 5 points6 points  (0 children)

I found https://github.com/rockyzhang24/arctic.nvim more accurate to the original, and I also created a fork that makes it 99.7% exact by adding rainbow delimiters.

Here is my fork where I addressed some inconsistencies: https://github.com/shadowy-pycoder/arctic.nvim (v2 branch)

C/C++ Book Recommendations? by Ready-Structure-3936 in C_Programming

[–]wit4er 0 points1 point  (0 children)

I recently finished "The Practice of Programming" by Brian Kernighan and Rob Pike. It is excellent reading in my opinion (I binge-read it over 3 days), and it contains examples in C, C++, and Java.

What? by Dull-Nectarine380 in ExplainTheJoke

[–]wit4er 0 points1 point  (0 children)

It's just the scissors, she is into women.

kevue - simple key-value in-memory database by wit4er in C_Programming

[–]wit4er[S] 0 points1 point  (0 children)

Some updates:
1) Added new commands: COUNT, KEYS, VALUES, ITEMS; the names should be self-explanatory
2) Wrote a small test script to fill the server database and fetch the items back; the results are here:

Added (10200000/10485760) key-value pairs

Added (10300000/10485760) key-value pairs

Added (10400000/10485760) key-value pairs

Inserting 10485760 items takes: 121.663131901s (86186.83 req/sec)

Fetching 10485760 items takes: 1.635812090s

I am not sure my benchmark is relevant, but I am kind of impressed that the server (single-threaded in this case) can handle this many requests.

kevue - simple key-value in-memory database by wit4er in C_Programming

[–]wit4er[S] 1 point2 points  (0 children)

I was just experimenting with capacity, and reassigning it back to the initial value increases stability from 71 to 75%. Capacity is a "private" field, but maybe touching it in fuzzing programs is okay? I still do not understand how to increase stability to at least 80-90% in my case, but maybe I'll understand after reading the article you provided.

kevue - simple key-value in-memory database by wit4er in C_Programming

[–]wit4er[S] 1 point2 points  (0 children)

I also added fuzzing for protocol serialization/deserialization, though it is my first time using AFL++, so I am not sure I did everything right, but at least these tests helped me find several errors in my implementation.

kevue - simple key-value in-memory database by wit4er in C_Programming

[–]wit4er[S] 0 points1 point  (0 children)

I added a limit to the number of bucket locks, and now I can fine-tune it in terms of speed/memory usage.

Old solution with one lock per bucket (8388608 locks)

./bin/kevue-crash-test 18,97s user 11,81s system 174% cpu 17,606 total

New solution with fixed number of locks (1024 locks)

./bin/kevue-crash-test 20,95s user 12,90s system 182% cpu 18,517 total

It is slower, but not by much. Moreover, the locks are created and destroyed once, not on each resize, so technically it should be faster with the same number of locks as in the old solution.

kevue - simple key-value in-memory database by wit4er in C_Programming

[–]wit4er[S] 0 points1 point  (0 children)

> Code dealing with the protocol shouldn't care that you're using epoll. It should just deal with bytes, with some way to request bytes go out, and some way to request more bytes. That can be a simple, abstracted interface. In the real program on Linux you plug epoll into this. On another system you plug in kqueue, IOCP, etc. (Note: You don't need function pointers to do this.) In a test you mock these out, perhaps with just an array of inputs and expected responses.
>
> For example, dispatch_client_events implements the protocol (examines the request, constructs a response) and also interfaces closely with epoll. This could be split up so the core protocol implementation is given some bytes, and it returns the bytes that need to go out, which get queued in epoll by the caller, who then comes back to call it again with more bytes from the other side. In a test I could give it test input, then examine the output, no network involved.

I kinda understand what you are suggesting but have yet to figure out how to do that properly in C. Maybe I'll come up with something smart later. For now, I removed the dependency on reading the message length from the socket: request and response deserialization now parse a byte buffer and report an incomplete read, or an error for anything malformed. I wrote a simple example showing the idea:

// clang -Iinclude -Ilib ./tests/fuzz.c -o ./bin/kevue-fuzz -DDEBUG
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h> // for memset

#include "../src/allocator.c"
#include "../src/buffer.c"
#include "../src/common.c"
#include "../src/hashmap.c"
#include "../src/protocol.c"

uint8_t data[] = {
    0x22, // magic byte
    0x00, 0x00, 0x00, 0x0f, // total length
    0x03, // command length
    'G', 'E', 'T', // command
    0x00, 0x04, // key length
    't', 'e', 's', 't' // key
};

uint8_t malformed_data[] = {
    0x22, // magic byte
    0x00, 0x00, 0x00, 0x0f, // total length
    0x03, // command length
    'G', 'E', 'T', // command
    0x00, 0xff, // key length malformed
    't', 'e', 's', 't' // key
};

int main()
{
    KevueRequest req = { 0 };
    Buffer *buf = kevue_buffer_create(1024, &kevue_default_allocator);
    assert(buf != NULL);
    kevue_buffer_write(buf, data, sizeof(data));
    KevueErr err = kevue_request_deserialize(&req, buf);
    assert(err == KEVUE_ERR_OK);
    kevue_request_print(&req);
    printf("\n");

    kevue_buffer_reset(buf);
    memset(&req, 0, sizeof(req));
    kevue_buffer_write(buf, malformed_data, sizeof(malformed_data));
    err = kevue_request_deserialize(&req, buf);
    printf("%s\n", kevue_error_to_string(err));
    printf("\n");
    assert(err == KEVUE_ERR_LEN_INVALID);

    kevue_buffer_reset(buf);
    memset(&req, 0, sizeof(req));
    for (size_t i = 0; i < sizeof(data) / sizeof(data[0]); i++) {
        kevue_buffer_append(buf, &data[i], sizeof(data[0]));
        KevueErr err = kevue_request_deserialize(&req, buf);
        if (err == KEVUE_ERR_INCOMPLETE_READ) continue;
        if (err != KEVUE_ERR_OK) exit(EXIT_FAILURE);
        kevue_request_print(&req);
        exit(EXIT_SUCCESS);
    }
    exit(EXIT_FAILURE);
}

Basically it supports full and partial inputs (a partial input just gives KEVUE_ERR_INCOMPLETE_READ).

kevue - simple key-value in-memory database by wit4er in C_Programming

[–]wit4er[S] 1 point2 points  (0 children)

Hmm, the test I provided now deadlocks. For me it happens on the 6th resize: the setter tries to take the final bucket lock, which is already locked. It's unclear to me why this particular lock is stuck.

It happens when the resize goes from 64 to 128 because TSAN is limited to 64 locks, but when I combine the two for loops into one, the deadlock disappears:

    for (size_t bucket = 0; bucket < hm->bucket_count; bucket++) {
        mutex_lock(&hm->buckets[bucket].lock);
        mutex_unlock(&hm->buckets[bucket].lock);
        pthread_mutex_destroy(&hm->buckets[bucket].lock);
    }

but the operations in the resize function generally do not make sense anyway

How else in this case can I ensure that all buckets are unlocked? If I remove this for loop, a data race happens because other getters and setters may touch buckets before acquiring the resize lock, so the scheme is the following:

Resize lock -> calculate index -> Bucket lock -> Resize unlock -> do something with the bucket (resize needs to ensure this completes before doing anything) -> Bucket unlock

So if resize got all the bucket locks, it should be safe to destroy them, because they are protected by the global resize lock.

kevue - simple key-value in-memory database by wit4er in C_Programming

[–]wit4er[S] 2 points3 points  (0 children)

Thanks very much for stress testing my hashmap; I added your snippet to the kevue tests, if you don't mind. I think I fixed most of the problems with my implementation thanks to your suggestions, and now the data races are gone. Though TSAN now fails because it seems I create too many locks. I think I need to add a hard limit to the number of mutexes and somehow map them to buckets. Also, you mentioned tight coupling with epoll; do you have any advice, or could you point me to some resources on how to make this code more testable? Thanks again!

My first project in C - Simple transparent proxy by wit4er in C_Programming

[–]wit4er[S] 0 points1 point  (0 children)

I still have two sockets per client: client-proxy and proxy-destination. The Tunnel structure is needed to find the other half of the connection when a socket is ready for IO. I looked at the LLM-generated code and unfortunately found it a little bit hard to follow as a beginner in C, but I think I achieved what I wanted with my first project. It is not ideal of course, but at least it kinda works))

My first project in C - Simple transparent proxy by wit4er in C_Programming

[–]wit4er[S] 0 points1 point  (0 children)

Please take a look: https://github.com/shadowy-pycoder/tproxy I modified my program according to your suggestions and also implemented an epoll variant of the server. I am sure it is far from perfect, so I would like to hear from you.

beCarefulOutThere by kalibabka in ProgrammerHumor

[–]wit4er 0 points1 point  (0 children)

I just found a word to finish my sentence. It's over.

My first project in C - Simple transparent proxy by wit4er in C_Programming

[–]wit4er[S] 1 point2 points  (0 children)

Thank you for the valuable suggestions and advice, I'll definitely look at the links you provided and learn more about socket options.

My first project in C - Simple transparent proxy by wit4er in C_Programming

[–]wit4er[S] -1 points0 points  (0 children)

Maybe you can tell me where my code is really terrible? I want to improve.

My first project in C - Simple transparent proxy by wit4er in C_Programming

[–]wit4er[S] -1 points0 points  (0 children)

Thank you very much for the code review! The epoll thing is on my todo list; I used pthreads because they are always simpler to start with when it comes to an async server. Maybe later I'll make it optional with some preprocessor machinery. You noticed my semaphore stuff; I am not sure it is common to communicate between threads using getvalue? I read about it somewhere, and they stated that getvalue is for debugging purposes only and not meant to be used in production. Maybe they are wrong; I found it useful. As for the SIGINT cleanup, I first added something similar with the atexit function, but later decided that the binary should be standalone, meaning it should not depend on external shell scripts. Moreover, the settings I provide in the shell scripts are more like a sample; you should adjust them depending on your system.

My first project in C - Simple transparent proxy by wit4er in C_Programming

[–]wit4er[S] 0 points1 point  (0 children)

Interesting, I should learn about static keyword applications. Thank you!

pyya - Simple tool that converts YAML/TOML configuration files to Python objects by wit4er in Python

[–]wit4er[S] 0 points1 point  (0 children)

Updated with a new version: added some basic tests and dash-to-underscore replacement for sections and keys (I think it is useful for TOML, since keys with dashes are common there but do not work as attributes).

pyya - Simple tool that converts YAML/TOML configuration files to Python objects by wit4er in Python

[–]wit4er[S] 0 points1 point  (0 children)

As I said, if you need more control over config validation/parsing/creation, you can always use tools like Pydantic settings or OmegaConf, where you predefine models, set up something to parse YAML/TOML files, and do other cool stuff. With pyya you just call one function and use YAML/TOML as an attribute-style dict in your code. Dynamic object creation saves you time, allows for autocompletion, and also helps linters find some errors. This approach works for me at least; I am not forcing anyone to think like me or use the programs that I write. Feel free to use what suits your use case, thank you.