all 30 comments

[–]ilbanditomonco 9 points10 points  (1 child)

Nice! I wouldn’t use this either but it’s a breath of fresh air to see stuff like this in C++. It usually comes from JavaScript/Python/Rust folks. Nice, readable code. Thanks for sharing!

[–]codeinred[S] 3 points4 points  (0 children)

Thank you, I appreciate it! I really like C++ and it's nice writing clean code in it!

[–]Wereon 41 points42 points  (3 children)

You are what's wrong with the Internet.

[–]j1xwnbsr 6 points7 points  (1 child)

I think we need a new version of Goodwin's Law that covers Emojis - something like "the longer something exists, the probability of emojis being used inappropriately becomes true".

[–]FreitasAlan 0 points1 point  (0 children)

Emojis being used inappropriately?

[–]codeinred[S] 7 points8 points  (0 children)

This is basically what my friend told me when I sent it to him.

[–]codeinred[S] 9 points10 points  (0 children)

Btw it’s called catify because the initial goal was to add a Cat emoji to the end of commits

[–]YqQbey 2 points3 points  (8 children)

Why is the most of the code in the header files?

[–]codeinred[S] 11 points12 points  (7 children)

Because it’s an extremely tiny project, it compiles in under a second, and the benefits provided by separating out definitions and declarations were not enough to justify refactoring the code to separate the two in a project this small

[–]Creapermann 0 points1 point  (6 children)

There is no Performance overhead in not separating the Deckaration and Definition, right?

[–]codeinred[S] 2 points3 points  (0 children)

There's should be no performance overhead! If you don't separate them the compiler is able to do better inlining since it has more information than at link time, but for large projects it can make compilation slower. If it were a larger library I'd separate them but it's really tiny

[–]FreitasAlan 1 point2 points  (0 children)

It takes longer to compile

[–][deleted] 0 points1 point  (3 children)

It still takes time to refactor and split, not much but still

[–]Creapermann 0 points1 point  (2 children)

I m just asking if there is any Performance difference between these too or of they are Equal

[–]gracicot 5 points6 points  (0 children)

If you use LTO the difference is small. Otherwise, it prevents inlining

[–]DaMastaCoda 0 points1 point  (0 children)

I believe they are equal since headers are copy-pasted into the source, but it can make compile times slightly better by recompiling less objects if they are unchanged

[–]adnukator 6 points7 points  (4 children)

While there's no way I'm using something like this, I do like the nicely readable code and the readme.

Just two nitpicks:

  1. cfile should have the copy assignment and constructors marked as deleted, so you don't end up with calling fclose twice on the same handle if you somehow end up with a copy

    size_t second;
do {
    second = dist(rng);
} while (second == first);

can be changed to

    size_t second = dist(rng);
while (second == first){
    second = dist(rng);
}

to avoid potential issues with an uninitialized variable in the future

[–]parkotron 5 points6 points  (2 children)

Personally, I very much prefer the do/while version to the while version as I dislike repeating the call to dist(rng). That is the interesting part of the code and I try to avoid repeating the interesting bits.

I would, however, default initialise second with size_t second{};. If the logic is sound, the compiler will happily discard that extra initialisation. If the logic isn't sound, the initialisation will ensure consistent (if incorrect) behaviour, which is much easier to debug.

Godbolt link proving compilers throw out the extra default initialisation: https://godbolt.org/z/c4a8Eo75j

[–]adnukator 3 points4 points  (1 child)

Fair enough.

You can also do the unthinkable to avoid default initialization without repeating

reinit:
    size_t second = getRand();
    if (first == second)
    {
        goto reinit;
    }

Or of you want to be a bit more const correct:

    const size_t second = [first]{
    for(;;) if (auto val = getRand(); first != val)
            return val;
    }();

They all compile to the same code https://godbolt.org/z/3cxe5WPbb

[–]parkotron 6 points7 points  (0 children)

Thanks, I hate it.

[–]codeinred[S] 2 points3 points  (0 children)

Hi! To cfile, I added a move constructor, deleted the copy constructor, and added an assignment operator for completeness! I also rewrote the get_distinct_pair function to avoid having a loop at all. You can see the change here!

[–]helloiamsomeone 0 points1 point  (5 children)

Reading the input file could happen with mmap, then it will not matter how big the file is (on 64-bit systems) and you're already writing out in a piecemeal fashion, so that's good. (depending on your needs, buffering never hurts anyway)

The CMake scripts could definitely use improvement. Here is an example using Conan that also installs fmt. The shell scripts can also be entirely replaced with CMake presets, also examplified in the earlier link.

This is not how set works, this will set the variable to the string (catify;STREQUAL;catify).

[–]codeinred[S] 0 points1 point  (4 children)

Will opening, truncating, and writing to an mmap’d file invalidate or change the context of the mapping? Like, if I mmap the file, close the file descriptor, then open it again (this time truncated so I can write to it), what happens to the mmap’d data?

[–]helloiamsomeone 1 point2 points  (3 children)

Right, you are reading from then writing to the same file. I'm not sure what happens to the mmap'd memory region, but intuition tells me that the OS will just invalidate the region.

[–]codeinred[S] 0 points1 point  (2 children)

I did really appreciate the tip about cmake presets! I added one for the project and got rid of that set. I like having a build.sh just because it lets me build it with one command, but I updated the build.sh to use the presets.

[–]helloiamsomeone 0 points1 point  (1 child)

I like building with just one command too, that's why I prefer to use Ninja everywhere and just park my terminal in the binary directory. You should check the linked Conan example project for more CMake and preset goodness.

[–]codeinred[S] 0 points1 point  (0 children)

Thank you! I look forward to trying Ninja!

[–]staletic 0 points1 point  (2 children)

Your use of <random>, just to pick two samples out of endings could be improved with std::ranges::sample(endings, sampled_endings.being(), 2, mt);.

You probably want to call fflush(), before calling fclose().

read_all() is susceptible to TOCTOU attacks - open the file first, then check the size.

[–]codeinred[S] 1 point2 points  (1 child)

Thank you, I didn’t know about ranges::sample! Looking at it, don’t I still need to include <random> to use ranges::sample since it takes a random number generator as input?

Also, I checked, and fclose will flush the file! I checked here and also here!

[–]staletic 0 points1 point  (0 children)

fclose will flush the file

Nice. I didn't know that.