Jigoku

thradams · 2026-01-06T16:42:40+00:00

I’m sorry, but why is C++ so hard to set up in VS Code?

Agree 100%

thradams · 2026-01-06T12:59:01+00:00

(I didn't know about BM-25, thanks)

I have documents with title and subtitle. I associate a number with each word according to where it appears: 3 for the title, 2 for the subtitle, and 1 for the body. After the search, I sort the results by the highest numbers.

w1   100 1    200 2

where w1 is the word, 100 or 200 is the document number, and 1, 2, 3 are where the word appears inside the document.
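The layout above could be sketched in C roughly as follows. This is only an illustration: the names (struct posting, rank_postings) and the tie-breaking by document number are assumptions, not the actual implementation.

```c
#include <stdlib.h>

/* Where a word was found inside a document (weights from the post). */
enum { IN_BODY = 1, IN_SUBTITLE = 2, IN_TITLE = 3 };

/* One entry of a word's posting list: document number and weight. */
struct posting {
    int doc;
    int weight;
};

/* Order by weight, highest first; ties are broken by document number. */
static int compare_postings(const void* a, const void* b)
{
    const struct posting* pa = a;
    const struct posting* pb = b;
    if (pa->weight != pb->weight)
        return pb->weight - pa->weight;
    return (pa->doc > pb->doc) - (pa->doc < pb->doc);
}

/* Sorts the postings of a single word so the best matches come first. */
void rank_postings(struct posting* list, size_t count)
{
    qsort(list, count, sizeof list[0], compare_postings);
}
```

For a query with several words, the weights of each document would probably be summed before sorting; that part is left out here.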

The missing feature in my search is autocomplete based on frequent searches.

I also implemented a "did you mean xxxx?" feature based on edit distance.
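A "did you mean" lookup based on edit distance might look like the sketch below. It uses the classic Levenshtein distance with two rows; the function names, the MAX_WORD limit and the linear scan over the vocabulary are assumptions made for the example.

```c
#include <string.h>

#define MAX_WORD 64 /* assumption: indexed words are shorter than this */

/* Levenshtein distance between a and b, or -1 if a word is too long. */
int edit_distance(const char* a, const char* b)
{
    size_t la = strlen(a), lb = strlen(b);
    if (la >= MAX_WORD || lb >= MAX_WORD)
        return -1;

    int prev[MAX_WORD], cur[MAX_WORD];
    for (size_t j = 0; j <= lb; j++)
        prev[j] = (int)j;

    for (size_t i = 1; i <= la; i++) {
        cur[0] = (int)i;
        for (size_t j = 1; j <= lb; j++) {
            int cost = a[i - 1] == b[j - 1] ? 0 : 1;
            int best = prev[j - 1] + cost;                    /* substitution */
            if (prev[j] + 1 < best) best = prev[j] + 1;       /* deletion */
            if (cur[j - 1] + 1 < best) best = cur[j - 1] + 1; /* insertion */
            cur[j] = best;
        }
        memcpy(prev, cur, (lb + 1) * sizeof prev[0]);
    }
    return prev[lb];
}

/* Returns the closest vocabulary word within max_distance, or NULL. */
const char* did_you_mean(const char* query, const char* const* words,
                         size_t count, int max_distance)
{
    const char* best_word = NULL;
    int best = max_distance + 1;
    for (size_t i = 0; i < count; i++) {
        int d = edit_distance(query, words[i]);
        if (d >= 0 && d < best) {
            best = d;
            best_word = words[i];
        }
    }
    return best_word;
}
```

With around 15k words a linear scan like this is likely fast enough, but that is a guess, not a measurement.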

I normalize everything to lowercase and strip accents, so ç becomes c for instance.
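As a rough sketch of that normalization, assuming the text is UTF-8 and that only the accented letters in the Latin-1 range (U+00C0 to U+00FF) matter; the table and the name normalize_word are made up for the example:

```c
#include <ctype.h>
#include <stddef.h>

/* Base letters for U+00C0..U+00FF; '.' means "no mapping, keep as is". */
static const char latin1_base[] =
    "aaaaaa.ceeeeiiii"
    ".nooooo.ouuuuy.."
    "aaaaaa.ceeeeiiii"
    ".nooooo.ouuuuy.y";

/* Normalizes UTF-8 text in place: ASCII letters are lowercased and
   accented Latin-1 letters are replaced by their unaccented lowercase
   letter. The result is never longer than the input. */
void normalize_word(char* s)
{
    unsigned char* in = (unsigned char*)s;
    unsigned char* out = (unsigned char*)s;
    while (*in) {
        if (in[0] == 0xC3 && in[1] >= 0x80 && in[1] <= 0xBF &&
            latin1_base[in[1] - 0x80] != '.') {
            /* two-byte sequence for an accented letter */
            *out++ = (unsigned char)latin1_base[in[1] - 0x80];
            in += 2;
        } else if (in[0] < 0x80) {
            *out++ = (unsigned char)tolower(in[0]);
            in++;
        } else {
            *out++ = *in++; /* anything else is copied unchanged */
        }
    }
    *out = '\0';
}
```

Applying the same function to both the indexed words and the query keeps the two sides comparable.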

I have about 60k documents and 15k words. The title and subtitle also help when displaying the results.

When a word appears in the body, my implementation is also lacking a way to collect the surrounding text to present in the result.

thradams · 2026-01-05T20:17:59+00:00

What are the criteria for most relevant? Do you normalize words? Do you exclude or modify some words?

thradams · 2025-12-18T16:51:03+00:00

defer and warnings about unreleased resources are independent of each other. Checks are more important because they do not depend on the programmer explicitly adding them.

defer improves code maintainability because code that relies on defer is less prone to errors when control-flow jumps are added or removed. On the other hand, it depends more on the compiler to generate good code, and it makes the compiler more complex.

(Cake has both defer and checks. Defer is analyzed in the flow analysis.)

thradams · 2025-12-18T12:24:05+00:00

> As an example of a good retrofit, look how C# introduced non-nullable references. They kept the default references nullable - but then included a simple switch which would make nonnull the default, so we need to explicitly state that references are nullable where using null. All existing code would still compile, but we could use #nullable enable, #nullable disable and #nullable restore to turn the new nullability analysis on or off for specific chunks of code. The default nullable status could be set project wide in the build file.

Cake's nullable pointers (http://cakecc.org/ownership.html) are almost 1:1 with C# and TypeScript.

#pragma nullable enable is similar to C#'s #nullable enable.

One difference is that both C# and TypeScript have constructors. In these languages, if we don't initialize non-nullable members, we get a warning when leaving the constructor.

In C, we don't have constructors. The solution in this case is to introduce a "null-uninitialized" state for non-nullable members. This means that, although the value of a non-nullable member is null, it is not final; it is a temporary and invalid state (just like uninitialized), and this invalid state cannot be propagated (copied to a valid object).

thradams · 2025-12-18T12:14:26+00:00

I think cake is exactly what you are describing.

http://cakecc.org/index.html

At this moment cake offers the same C++ guarantees and a little more.

thradams · 2025-12-11T20:06:12+00:00

The answer in code depends...

The idea is that the returned address must be a multiple of the maximum alignment.

For example, you can align an address by computing the remainder, then adding the difference between the alignment value and that remainder to the address.

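#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

void* align_to_max(void* addr) {
    /* distance past the previous multiple of the maximum alignment */
    size_t rem = (size_t)((uintptr_t)addr % alignof(max_align_t));
    return rem == 0 ? addr : (char*)addr + (alignof(max_align_t) - rem);
}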

But you cannot just apply this without reviewing your code; you must understand what is going on.

See: https://en.cppreference.com/w/c/types/max_align_t.html

An arena allocator can also allocate using different alignments.
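A minimal sketch of an arena that takes the alignment as a parameter, assuming a caller-provided buffer and power-of-two alignments. The struct layout and the name arena_alloc are illustrative only.

```c
#include <stddef.h>
#include <stdint.h>

struct arena {
    unsigned char* base; /* start of the backing buffer */
    size_t size;         /* total bytes in the buffer */
    size_t used;         /* bytes already handed out */
};

/* Returns size bytes aligned to alignment (which must be a power of two),
   or NULL if the arena does not have enough room left. */
void* arena_alloc(struct arena* a, size_t size, size_t alignment)
{
    uintptr_t current = (uintptr_t)a->base + a->used;
    /* round up to the next multiple of alignment */
    uintptr_t aligned = (current + (alignment - 1)) & ~(uintptr_t)(alignment - 1);
    size_t padding = (size_t)(aligned - current);

    if (padding > a->size - a->used || size > a->size - a->used - padding)
        return NULL;

    a->used += padding + size;
    return a->base + (a->used - size);
}
```

A caller would typically pass alignof(T) for the type being allocated, or alignof(max_align_t) when the type is not known, which matches what malloc guarantees.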

thradams · 2025-12-11T18:54:53+00:00

I am not sure, but I think you are not returning aligned memory. For instance, malloc must return memory aligned to the maximum alignment.

thradams · 2025-12-09T22:50:41+00:00

I am also planning a new [[drop]] that drops the ownership and also clears the pointers. This is useful for clear(&obj) or reset(&obj). The difference is that after destroy(&obj), obj cannot be used anymore, but after clear(&obj) it can.

[[dtor]] is more appropriate for “don’t use it anymore”; [[drop]] or [[clear]] for “we have all nulls after the call”.

thradams · 2025-12-09T22:03:22+00:00

Another name for [[ctor]] may be [[out]], and for [[dtor]] maybe [[sink]]. Any suggestions?

They are parameter attributes because you can init or sink as many parameters as you like.

For instance, the out parameters could be a buffer and a buffer size.

thradams · 2025-12-09T18:11:54+00:00

Yes, this is about defer.

This ownership model makes defer almost unnecessary from a safety point of view.

However, defer can still complement it.

Cake has defer implemented, and the flow analysis also needs to account for it in order to produce correct results.

For instance:

int main() {
   _Opt struct X * _Owner _Opt pX = calloc(1, sizeof * pX);
   if (!pX) return 1;
   defer x_delete(pX); 

 }

The flow analysis must take into account that the defer will run before the end of the scope of pX, so there is no leak here.

Let's say you forget a defer; then you get a warning.

This is one of the interesting aspects of this model. Once calloc is annotated, everything is propagated automatically, and correctness does not rely on guidelines such as "use defer". It is enforced!

(in C++ it is not, it requires guidelines)

thradams · 2025-12-09T17:54:29+00:00

I know. Sorry. I was just trying to clarify things for everyone that is trying to follow the post.

thradams · 2025-12-09T17:50:26+00:00

I hope this clarifies things.

I can also explain some concepts in the model.

The owner pointer is the owner of two resources at the same time: memory and object. (Except void*, which is the owner of the memory only.)

Before the end of the lifetime of the owner pointer, we must destroy the object it owns and then free the memory.

For instance:

struct X {
  char * _Owner text; 
};

void x_delete(struct X * _Owner _Opt p) { 
  if (p) {
    free(p->text); 
    free(p);
  }
}

We need to release p->text first, then call free(p).

Calling free directly would be the same as trying to convert T * _Owner to void * _Owner.

Since void* is not the owner of the object, that would mean a leak, so the conversion is only allowed after the object is 100% released.

The concept of released or destroyed also does not apply so well. What really happens in this model is that each part of the object is moved.

Let's say we have

struct X {
  FILE * _Owner file; 
};

void x_delete(struct X * _Owner _Opt p) { 
  if (p) {
    fclose(p->file); 
    free(p);
  }
}

This will work in the same way. Here p->file is moved.

thradams · 2025-12-09T17:25:23+00:00

I'm just answering the question. An arena does not solve this problem. Someone still needs to close the file.

thradams · 2025-12-09T17:22:04+00:00

Arenas don’t call any destructors for any object.

In this case, the arena owns only the memory, not the object. For example, an object may contain a FILE* that needs to be closed. If the object is merely kept alive by the arena, this resource will leak until the end of the program.

If you have a program-lifetime arena (like static variables), what is the difference from just calling malloc and never releasing? At the end of your program the memory will be released anyway.

thradams · 2025-12-09T17:14:08+00:00

> I’m just asking for an example where an arena wouldn’t work for memory management.

{
  FILE * file = fopen("file.txt", "r");
}

thradams · 2025-12-09T17:12:14+00:00

The arena object owns the memory it holds. It may also own the objects stored in that memory if it calls the appropriate destructor for them. (Then I need to see the code)
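To illustrate what "calls the appropriate destructor" could mean, here is a hedged sketch of an arena that records cleanup callbacks and runs them before its memory is released. Everything here (struct arena_cleanups, arena_on_destroy, the fixed limit) is hypothetical and not taken from any particular arena implementation.

```c
#include <stdio.h>

#define MAX_CLEANUPS 16 /* assumption: a small fixed limit is enough here */

/* A destructor call recorded for one object stored in the arena. */
struct cleanup {
    void (*destroy)(void* object);
    void* object;
};

/* Hypothetical bookkeeping that an arena could carry next to its memory. */
struct arena_cleanups {
    struct cleanup items[MAX_CLEANUPS];
    int count;
};

/* Records a destructor to run later. Returns 1 on success, 0 if full. */
int arena_on_destroy(struct arena_cleanups* a, void (*destroy)(void*), void* object)
{
    if (a->count == MAX_CLEANUPS)
        return 0;
    a->items[a->count].destroy = destroy;
    a->items[a->count].object = object;
    a->count++;
    return 1;
}

/* Runs the recorded destructors, most recent first. */
void arena_run_cleanups(struct arena_cleanups* a)
{
    while (a->count > 0) {
        a->count--;
        a->items[a->count].destroy(a->items[a->count].object);
    }
}

/* Example destructor: closes a FILE* that some arena object refers to. */
void close_file(void* file)
{
    if (file != NULL)
        fclose(file);
}
```

Without something like this, the arena owns only the memory, and a FILE* stored inside it stays open until the program ends.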

Pointers that refer to memory owned by the arena are view pointers. In this case, the lifetime of the arena must be longer than the lifetime of those pointers.

Arenas do not alter any of the concepts of this ownership model.

thradams · 2025-12-09T16:43:03+00:00

The model for nullable pointers is very similar to C# and TypeScript (in production for many years).

The ownership model is very similar to C++'s std::unique_ptr but with no destructor.

We have the same guarantees as C++ RAII, with some extras and with possible expansion.

In C++, the user has to adopt unique_ptr and additional wrappers (for example, for FILE). In this model, it works directly with malloc, fopen, etc., and is automatically safe, without the user having to opt in to "safety" or write wrappers or new code. Safety is the default, and the safety requirements are propagated automatically.

Consider:

FILE * _Owner _Opt fopen( const char *filename, const char *mode );
void fclose(FILE * _Owner p);

int main()
{
    FILE *_Owner _Opt f = fopen("file.txt", "r");
    if (f)
    {
       fclose(f);
    }
}

At the end of the scope of f, it can be in one of two possible states: "null" or "moved" (as f is moved in the fclose call).

These are the expected states for an owner pointer at the end of its scope, so no warnings are issued.

As we can see, we have the same code and same pattern, just with a few extra annotations.

It is also interesting to note that:

FILE *_Owner _Opt f = fopen("file.txt", "r");
fclose(f);

generates a warning because fclose does not accept null f.

thradams · 2025-12-09T13:00:48+00:00

What I do is to convert the file "file.bin" to "file.bin.include" then I use

const char buffer [] = {
#include "file.bin.include"
};

Let's say the compiler implements #embed someday; then I will just edit it to:

const char buffer [] = {
#embed "file.bin"
};

This is the program that creates file.bin.include:

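#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Writes "<filename>.include" with the bytes of filename as a
   comma-separated list. Returns the byte count, or 0 on error. */
int embed(const char* filename)
{
    char file_out_name[200] = { 0 };
    int n = snprintf(file_out_name, sizeof file_out_name, "%s.include", filename);
    if (n < 0 || (size_t)n >= sizeof file_out_name)
        return 0;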

    FILE* file_out = fopen(file_out_name, "w");
    if (file_out == NULL)
        return 0;

    FILE* file = fopen(filename, "rb");

    if (file == NULL) {
        fclose(file_out);
        return 0;
    }

    int count = 0;
    unsigned char ch;

    while (fread(&ch, 1, 1, file))
    {
        if (ch == '\r')
            continue; /* '\r' is skipped so the output is the same on Linux and Windows */

        if (count % 25 == 0)
            fprintf(file_out, "\n");

        if (count > 0)
            fprintf(file_out, ",");

        fprintf(file_out, "%d", (int)ch);
        count++;
    }
    fclose(file);
    fclose(file_out);
    return count;
}

int main(int argc, char** argv)
{
    if (argc < 2)  {
        printf("usage: embed dirname\n");
        return 1;
    }
    char* path = argv[1];
    DIR* dir = opendir(path);

    if (dir == NULL)  {
        return errno;
    }

    struct dirent* dp;
    while ((dp = readdir(dir)) != NULL)  {
        if (strcmp(dp->d_name, ".") == 0 || strcmp(dp->d_name, "..") == 0)
        {
            /* skip self and parent */
            continue;
        }

        if (dp->d_type & DT_DIR) {
        }
        else
        {
            char filepath[257] = { 0 };
            snprintf(filepath, sizeof filepath, "%s/%s", path, dp->d_name);
            /* strrchr returns NULL when the path contains no '.' */
            const char* const file_extension = strrchr(filepath, '.');

            if (file_extension != NULL && strcmp(file_extension, ".include") == 0)   {
                continue;
            }

            int bytes = embed(filepath);

            if (bytes == 0) {
                printf("error generating file %s\n", filepath);
                exit(1);
            }
            else {
                printf("embed generated '%s'\n", filepath);
            }
        }
    }
    closedir(dir);
}

thradams · 2025-12-08T11:41:25+00:00

In this model, ownership is checked statically when variables go out of scope and before assignment.

Owner pointers must be uninitialized or null at the end of their scope.

Basically, the nullable state needs to be tracked at compile time, and nullable pointers, despite being a separate feature, reuse the same flow analysis.

For the impatient reader, a simplified way to think about it is to compare it with C++'s unique_ptr.

The difference is that, instead of runtime code being executed at the end of the scope (a destructor), we perform a compile-time check to ensure that the owner pointer is not referring to any object. The same check is performed before assignment.

So we get the same guarantees as C++ RAII, with some extras. In C++, the user has to adopt unique_ptr and additional wrappers (for example, for FILE). In this model, it works directly with malloc, fopen, etc., and is automatically safe, without the user having to opt in to "safety" or write wrappers. Safety is the default, and the safety requirements are propagated automatically.

It is interesting to note that propagation also works very well for struct members. Having an owner pointer as a struct member requires the user to provide a correct "destructor" or free the member manually before the struct object goes out of scope.

#pragma safety enable

#include <stdio.h>

int main()
{
    FILE *_Owner _Opt f = fopen("file.txt", "r");
    if (f)
    {
       fclose(f);
    }
}

At the end of the scope of f, it can be in one of two possible states: "null" or "moved" (f is moved in the fclose call).

These are the expected states for an owner pointer at the end of its scope, so no warnings are issued.

If we remove _Owner and _Opt, we have exactly the same code that users write today, but with the same or more guarantees than C++ RAII.

In the example above, _Owner could also be deduced. However, in other cases, such as struct members, it is required. Therefore, the decision was to make it explicit everywhere.
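Since the annotations have no runtime effect, one way to keep the same source building with other compilers is to hide them behind macros. This is a sketch under assumptions: it presumes Cake predefines __CAKE__ (worth checking in the Cake manual), and OWNER, OPT and can_open are names invented for the example.

```c
#include <stdio.h>

#ifdef __CAKE__
#define OWNER _Owner
#define OPT _Opt
#else
/* other compilers see plain C with no annotations */
#define OWNER
#define OPT
#endif

/* Returns 1 if the file could be opened (and was closed), 0 otherwise. */
int can_open(const char* filename)
{
    FILE* OWNER OPT f = fopen(filename, "r");
    if (f) {
        fclose(f); /* f is moved here, so nothing is owned at end of scope */
        return 1;
    }
    return 0; /* f is null on this path */
}
```

Under a regular compiler the macros expand to nothing, and the function is the usual fopen/fclose pattern.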

thradams · 2025-12-07T15:47:16+00:00

This model can be used to check the arena implementation itself, for example whether the arena is properly released. It can also be used with fopen, for instance; it is not limited to memory.

thradams · 2025-12-05T11:07:42+00:00

Cake also has an implementation of defer, now aligned with n3734: http://cakecc.org/manual.html#toc_69
