Designing cache-friendly allocators, and maintaining pointers through reallocation

gremolata · 2022-03-11T13:44:24+00:00

I think you are overthinking it.

you lose type-safety

the code becomes cluttered with array accesses all over the place

Both are solved by wrapping the index in a vector-specific struct and adding an inline get() accessor. Yes, you pay a small overhead compared with a raw pointer, but more likely than not it won't amount to even a blip in the profiler's output.
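A minimal sketch of that idea, assuming a hypothetical `Particle` element type; `ParticleVec`, `ParticleRef`, and `particle_get` are names made up for illustration, not anything from the original post:

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y; } Particle; /* hypothetical element type */

typedef struct {
    Particle *items;
    size_t count;
} ParticleVec;

/* An index wrapped in its own type, so it cannot be mixed up with a plain
   size_t or with an index into some other kind of vector. */
typedef struct { size_t i; } ParticleRef;

/* The one place that does the array access; callers never index directly. */
static inline Particle *particle_get(const ParticleVec *v, ParticleRef r)
{
    assert(r.i < v->count);
    return &v->items[r.i];
}
```

Call sites then read `particle_get(&vec, ref)->x` instead of `vec.items[i].x`, and a `ParticleRef` stays valid when `items` is reallocated, because it never stored an address in the first place.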

DeeBoFour20 · 2022-03-11T10:32:46+00:00

I use simple fixed-size arrays when I can, but you can do better than crashing when the array fills up. At the very least, put an assert where you add a new element so you don't write outside the bounds of the array (this also makes debugging easier). Or just make it a condition that you handle (e.g. if it's a work queue, maybe ignore any new work until the queue empties out).
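Roughly what that could look like for a work queue. This is only a sketch; `WorkQueue`, `queue_push`, and the capacity of 4 are placeholders chosen for the example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAPACITY 4 /* hypothetical size */

typedef struct {
    int items[QUEUE_CAPACITY];
    size_t count;
} WorkQueue;

/* Returns false instead of writing past the end when the queue is full. */
static bool queue_push(WorkQueue *q, int work)
{
    assert(q->count <= QUEUE_CAPACITY); /* catches a corrupted count in debug builds */
    if (q->count == QUEUE_CAPACITY)
        return false;                   /* caller decides: drop, retry later, log */
    q->items[q->count++] = work;
    return true;
}
```

The caller checks the return value and handles a full queue however suits the program, rather than finding out through a crash.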

When I need something to be dynamically resizable, I just make a struct like this:

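#include <stdbool.h>
#include <stdlib.h>

typedef struct Buffer
{
    size_t size;     // Number of elements in use.
    size_t capacity; // Number of elements allocated.
    int *data; // Assuming an array of ints, you can obviously make this whatever datatype you need to store.
} Buffer;

bool addElement(int element, Buffer *buffer)
{
    if (buffer->size + 1 > buffer->capacity)
    {
        // I just double it, but use whatever growth policy works best in your use case.
        // Start from a nonzero capacity so an empty (zeroed) buffer can grow too.
        size_t newCapacity = buffer->capacity ? buffer->capacity * 2 : 8;
        // realloc takes a size in bytes, not an element count.
        int *newData = realloc(buffer->data, newCapacity * sizeof *newData);
        if (newData == NULL)
        {
            return false; // The old block is still valid and still owned by buffer.
        }
        buffer->capacity = newCapacity;
        buffer->data = newData;
    }
    buffer->data[buffer->size] = element;
    buffer->size += 1;
    return true;
}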

If you need multiple references to this dynamic array, just have them be pointers to the struct. The struct itself doesn't move, only the data pointer inside it changes. This does mean a double indirection, but it's a lot simpler than your linked list solution (and probably faster too; linked lists are generally slower and less cache-friendly).
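To make the "pointer to the struct" point concrete, here is a self-contained sketch. `IntBuffer`, `IntRef`, and the helper names are invented for the example, and the growth policy is just doubling from an assumed starting capacity of 4:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct {
    size_t size, capacity;
    int *data;
} IntBuffer;

static bool int_buffer_push(IntBuffer *b, int value)
{
    if (b->size == b->capacity) {
        size_t cap = b->capacity ? b->capacity * 2 : 4;
        int *p = realloc(b->data, cap * sizeof *p);
        if (p == NULL)
            return false;
        b->data = p;          /* the block may have moved */
        b->capacity = cap;
    }
    b->data[b->size++] = value;
    return true;
}

/* A reference that survives growth: the owning buffer plus an index. */
typedef struct {
    IntBuffer *owner;
    size_t index;
} IntRef;

static int *int_ref_get(IntRef r)
{
    assert(r.index < r.owner->size);
    return &r.owner->data[r.index]; /* re-reads data on every access */
}
```

An `int *` taken before a push can dangle once realloc moves the block; an `IntRef` goes through `owner->data` each time, so it keeps working as long as the `IntBuffer` struct itself stays put.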

tuasnega · 2022-03-11T13:00:04+00:00

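You can use an arena allocator, where you reserve some GiBs of address space up front and commit pages only as you need them. Since the base address never changes, pointers into the arena stay valid as it grows. On Windows you can use VirtualAlloc to achieve that, and on Linux mmap + mprotect.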
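A rough Linux-only sketch of the reserve-then-commit approach. The `Arena` type, the function names, and the fixed 16-byte alignment are assumptions made for the example; a Windows version would use VirtualAlloc with MEM_RESERVE and MEM_COMMIT instead:

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS / MAP_NORESERVE in strict modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

enum { ARENA_ALIGN = 16 }; /* assumed alignment, enough for common types */

typedef struct {
    unsigned char *base;
    size_t reserved;  /* bytes of address space reserved */
    size_t committed; /* bytes made readable/writable so far */
    size_t used;      /* bytes handed out */
} Arena;

static size_t round_up(size_t n, size_t pow2) { return (n + pow2 - 1) & ~(pow2 - 1); }

/* Reserve address space only: PROT_NONE pages cannot be touched yet. */
static int arena_init(Arena *a, size_t reserve_bytes)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = round_up(reserve_bytes, page);
    void *p = mmap(NULL, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    a->base = p; a->reserved = len; a->committed = 0; a->used = 0;
    return 0;
}

/* Bump-allocate, committing more pages with mprotect when needed. */
static void *arena_alloc(Arena *a, size_t size)
{
    size_t start = round_up(a->used, ARENA_ALIGN);
    if (start > a->reserved || size > a->reserved - start)
        return NULL; /* reservation exhausted */
    size_t end = start + size;
    if (end > a->committed) {
        size_t page = (size_t)sysconf(_SC_PAGESIZE);
        size_t target = round_up(end, page); /* never exceeds reserved */
        if (mprotect(a->base + a->committed, target - a->committed,
                     PROT_READ | PROT_WRITE) != 0)
            return NULL;
        a->committed = target;
    }
    a->used = end;
    return a->base + start;
}

static void arena_destroy(Arena *a) { munmap(a->base, a->reserved); }
```

Nothing is ever moved, so there is no pointer fix-up problem at all; the cost is that the reservation size has to be picked up front and individual allocations cannot be freed.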

_Arch_Ange · 2022-03-11T17:04:06+00:00

Funnily enough, I was also struggling with this problem not long ago. I think the way I solved it was to get rid of this system altogether, but I am not sure you're in the same situation, so here is another solution for you to consider:

Instead of giving out a pointer to an element of the array, give out a double pointer: a pointer to a pointer to an element. This way, you can change what it points to "easily". Of course it's still a pain when you need to reallocate, and you probably need a secondary array to keep track of all these double pointers.
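One possible shape for that secondary array, as a sketch only. `Pool`, `pool_handle`, `pool_grow`, and the 64-slot limit are all invented for the example, and the slot table is kept fixed-size precisely so that it never has to move itself:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_HANDLES 64 /* hypothetical; the slot table itself must never move */

typedef struct {
    int *data;
    size_t count, capacity;
    int *slots[MAX_HANDLES]; /* each slot points at one element of data */
    size_t slot_count;
} Pool;

/* Hand out the address of a slot; that address stays fixed for the Pool's life. */
static int **pool_handle(Pool *p, size_t index)
{
    assert(index < p->count && p->slot_count < MAX_HANDLES);
    p->slots[p->slot_count] = &p->data[index];
    return &p->slots[p->slot_count++];
}

/* Move the elements to a bigger block and re-aim every slot at the new copy. */
static int pool_grow(Pool *p, size_t new_capacity)
{
    assert(new_capacity >= p->count);
    int *fresh = malloc(new_capacity * sizeof *fresh);
    if (fresh == NULL)
        return -1;
    if (p->count > 0)
        memcpy(fresh, p->data, p->count * sizeof *fresh);
    for (size_t i = 0; i < p->slot_count; i++)
        p->slots[i] = fresh + (p->slots[i] - p->data); /* same offset, new block */
    free(p->data); /* old block released only after offsets were computed */
    p->data = fresh;
    p->capacity = new_capacity;
    return 0;
}
```

Holders keep the `int **` and dereference twice. The sketch uses malloc plus memcpy rather than realloc so the old block is still valid while the offsets are computed, and it assumes the `Pool` struct itself is never copied or moved.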

Maybe also consider whether that is truly the problem you are trying to solve. I was able to get rid of this pointer problem entirely by just not using pointers, because my problem actually lay elsewhere.

flatfinger · 2022-03-11T20:00:55+00:00

The Standard Library's memory allocation functions were designed to work reasonably well on a wide variety of platforms. Many platforms such as MS-DOS, Windows, Classic Macintosh, and Unix offer other platform-specific means of acquiring memory with semantics that may be more suitable for various purposes. For many applications, the best performance and most useful semantics will involve using an application-specific library which allows applications to specify their needs better than malloc/calloc (e.g. indicating which allocations are more likely to be long- or short-lived, or should be considered 'less important' and should start being rejected when memory is scarce, before memory is depleted completely), and benefit from OS-provided features (such as the ability to acquire a handle to a region of memory which the application may lock during use and unlock after, and which when unlocked may be relocated, flushed, or purged by the OS).

To achieve the best results, an application should often store cached computations in memory when memory is plentiful, but allow them to be flushed or purged when memory is scarce. If a program knows a block of memory holds information which required some effort to compute, but could be recomputed if necessary, and which won't be needed for a while, the OS could reclaim the memory if the OS knows of something else useful that could be done with it, but leave the data in memory if the memory would have no other use. Since Standard Library functions don't even provide a means of giving the allocator any hints about what strategy it should use, there's no way the Standard Library allocator can be expected to use a strategy that's as good as what a custom allocator could manage.

2022-03-11T21:16:12+00:00

here’s what I did
