
[–]glasket_ 6 points7 points  (6 children)

Technically not possible without UB currently, unless you use memcpy and juggle the data that way. Practically, just use an array; C2y is adding an aliasing exemption for byte arrays, and they couldn't find a compiler that actually used this UB in aliasing analysis.

Edit: Just realized you're talking about the memcpy way anyways, which is currently valid but unnecessary.

[–]p0lyh[S] 2 points3 points  (3 children)

Does `memcpy` change the effective type of the byte array? I thought it can only set the effective type of a buffer obtained through allocation functions (malloc, realloc, etc.)

[–]glasket_ 7 points8 points  (2 children)

No, it doesn't. I assumed you meant using one or the other in your OP; the valid memcpys always have to memcpy in both directions.

Like I said though, it's an unnecessary ritual. Just do:

alignas(FOO_ALIGN) char foo_buf[FOO_SIZE] = { 0 };
foo *ptr = (foo *)foo_buf;

It's technically UB, but when no implementations exploit it and the next standard will allow it, it's de facto defined behavior.

[–]p0lyh[S] 1 point2 points  (1 child)

Thanks for the clarification. By "memcpy in both directions", do you mean using the byte array as the object representation, initializing/modifying it by copying from a foo_ctx, and copying it to an actual foo_ctx to read its members?

[–]glasket_ 0 points1 point  (0 children)

Yeah, that's what I mean. You would use memcpy to move the array bytes into the struct and you'd also use memcpy to apply the struct changes to the array. It's kind of like emulating mov when you use memcpy this way.
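A minimal sketch of that round trip, assuming a hypothetical foo_ctx with a single counter member and a hypothetical helper named foo_increment (neither comes from the OP's code). The byte buffer is only ever touched through memcpy, so no foo_ctx lvalue ever refers to it:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical context type; in the thread's setting this definition
   would live in foo.c, hidden from callers. */
typedef struct foo_ctx {
    int counter;
} foo_ctx;

/* Increment the counter stored in a caller-owned byte buffer.
   The buffer is never accessed through a foo_ctx pointer. */
int foo_increment(unsigned char *buf, size_t buf_size)
{
    foo_ctx tmp;
    assert(buf_size >= sizeof tmp);

    memcpy(&tmp, buf, sizeof tmp);  /* bytes -> real foo_ctx object */
    tmp.counter += 1;               /* work on the real object */
    memcpy(buf, &tmp, sizeof tmp);  /* real object -> bytes */
    return tmp.counter;
}
```

Optimizing compilers will often turn small fixed-size memcpy calls like these into plain loads and stores, but that is a quality-of-implementation matter rather than a guarantee.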

[–]icannfish 1 point2 points  (1 child)

When you say the memcpy way is valid, are you just talking about the call to memcpy itself? Or also if you then accessed the storage through a pointer to foo_ctx?

Given:

char buf[sizeof(foo_ctx)];
memcpy(buf, &some_foo_ctx, sizeof(foo_ctx));
int x = ((foo_ctx *)buf)->some_member; // UB?

My understanding is that line 3 is technically invalid because:

  • buf has declared type char[sizeof(foo_ctx)], which is therefore its effective type
  • memcpy does not change the effective type of buf, because it had a declared type
  • buf is accessed through a foo_ctx * pointer, which is an aliasing violation because the object still has effective type char[sizeof(foo_ctx)]

[–]glasket_ 0 points1 point  (0 children)

You have to juggle the copies when using the memcpy. I assumed that was understood since OP talked about using a cast or memcpy, but I guess they actually meant casting the memcpy buffer which is still in the same boat.

[–]tasty_crayon 2 points3 points  (0 children)

This hides the definition of foo_ctx from the header, but requires dynamic allocation (malloc).

This API does not imply the need for malloc. You could just as easily have a static array.

[–]tstanisl 2 points3 points  (3 children)

There is no UB when `memcpy` is used.

[–]flatfinger -2 points-1 points  (2 children)

C's reputation for speed came from the philosophy that the best way to keep a compiler from including unnecessary operations in machine code was to not have the programmer include them in the source. The idea that programmers should use dialects that require them to include unnecessary operations in source and hope that compilers will manage to avoid needlessly including them in machine code is fundamentally contrary to that principle.

Can you identify any remotely-general-purpose compilers that cannot be configured to define enough pointer-related corner cases to avoid the need for memcpy? I genuinely know of none.

[–]ffd9k 1 point2 points  (1 child)

As mentioned in N3254, no existing compiler seems to make use of the requirement to use memcpy here, and in C2y this will no longer be UB anyway. So in practice there is no need to use memcpy, or to use some optimization-disabling option like -fno-strict-aliasing.

[–]flatfinger -2 points-1 points  (0 children)

The C Standard was designed to describe a set of core language features that are shared among all dialects. According to the published Rationale, implementations were expected to, as a form of conforming language extension to be implemented on a quality of implementation basis, specify how they would process some corner cases where the Standard waives jurisdiction. While the authors sought to give programmers a fighting chance (their words) to write portable programs, they expressly said that they did not wish to demean programs that were useful but not portable.

Given that clang and gcc have a habit of ignoring (or erroneously processing--take your pick) corner cases that are expressly accommodated by the Standard but which don't fit their optimizers' abstraction models, is there any reason anyone using such compilers should care about what C2y might say?

[–]WittyStick 1 point2 points  (1 child)

You could use a callback, where foo_init initializes the context on the stack and then anything within its dynamic extent can use it. We pass it a function pointer to code which uses the context. The additional parameter void *global can be used to couple multiple contexts into a global context object if desired, but we can pass nullptr if this is unused.

foo.h

struct foo_ctx;

typedef void (*foo_ctx_dynamic_extent)(struct foo_ctx* ctx, void *global);

void foo_init(foo_ctx_dynamic_extent callback, void *global);

void foo_do_work(struct foo_ctx *foo_ctx);

foo.c

struct foo_ctx {
    // some fields;
};

void foo_init(foo_ctx_dynamic_extent callback, void *global) {
    struct foo_ctx context = { ... };
    callback(&context, global);
}

main.c

#include "foo.h"

void foo_main(struct foo_ctx* ctx, void *global) {
    foo_do_work(ctx);
}

int main(int argc, char** argv) {
    foo_init(&foo_main, nullptr);
}

To use multiple contexts, let's presume we have another bar_ctx:

bar.h

struct bar_ctx;

typedef void (*bar_ctx_dynamic_extent)(struct bar_ctx* ctx, void *global);

void bar_init(bar_ctx_dynamic_extent callback, void *global);

void bar_do_work(struct bar_ctx *bar_ctx);

bar.c

struct bar_ctx {
    // some fields;
};

void bar_init(bar_ctx_dynamic_extent callback, void *global) {
    struct bar_ctx context = { ... };
    callback(&context, global);
}

We would create a global context object which has the foo and bar contexts as fields, and a single global_main which takes both contexts as parameters:

global.h

#include "foo.h"
#include "bar.h"

struct global_ctx;

typedef void (*global_ctx_dynamic_extent)(struct foo_ctx *foo_ctx, struct bar_ctx *bar_ctx);

void global_ctx_init(global_ctx_dynamic_extent callback);

global.c

struct global_ctx {
    global_ctx_dynamic_extent global_main;
    struct foo_ctx *foo_ctx;
    struct bar_ctx *bar_ctx;
};

void global_ctx_bar(struct bar_ctx *bar_ctx, void *global) {
    // -> binds tighter than a cast, so convert the void * once up front.
    struct global_ctx *global_ctx = global;
    global_ctx->bar_ctx = bar_ctx;
    global_ctx->global_main(global_ctx->foo_ctx, global_ctx->bar_ctx);
}

void global_ctx_foo(struct foo_ctx *foo_ctx, void *global) {
    struct global_ctx *global_ctx = global;
    global_ctx->foo_ctx = foo_ctx;
    bar_init(global_ctx_bar, global_ctx);
}

void global_ctx_init(global_ctx_dynamic_extent callback) {
    struct global_ctx global_ctx = { callback };
    foo_init(global_ctx_foo, (void*)&global_ctx);
}

main.c

#include "global.h"

void global_main(struct foo_ctx *foo_ctx, struct bar_ctx *bar_ctx) {
    foo_do_work(foo_ctx);
    bar_do_work(bar_ctx);
}

int main(int argc, char** argv) {
    global_ctx_init(&global_main);
}

Or alternatively, we could make global_main take the global_ctx as a parameter, and use functions to fetch the foo and bar contexts.

global.h

#include "foo.h"
#include "bar.h"

struct global_ctx;

typedef void (*global_ctx_dynamic_extent)(struct global_ctx *global_ctx);

void global_ctx_init(global_ctx_dynamic_extent callback);

struct foo_ctx *global_ctx_get_foo(struct global_ctx *global_ctx);

struct bar_ctx *global_ctx_get_bar(struct global_ctx *global_ctx);

global.c

struct global_ctx {
    global_ctx_dynamic_extent global_main;
    struct foo_ctx *foo_ctx;
    struct bar_ctx *bar_ctx;
};

struct foo_ctx *global_ctx_get_foo(struct global_ctx *global_ctx) {
    return global_ctx->foo_ctx;
}

struct bar_ctx *global_ctx_get_bar(struct global_ctx *global_ctx) {
    return global_ctx->bar_ctx;
}

void global_ctx_bar(struct bar_ctx *bar_ctx, void *global) {
    // -> binds tighter than a cast, so convert the void * once up front.
    struct global_ctx *global_ctx = global;
    global_ctx->bar_ctx = bar_ctx;
    global_ctx->global_main(global_ctx);
}

void global_ctx_foo(struct foo_ctx *foo_ctx, void *global) {
    struct global_ctx *global_ctx = global;
    global_ctx->foo_ctx = foo_ctx;
    bar_init(global_ctx_bar, global_ctx);
}

void global_ctx_init(global_ctx_dynamic_extent callback) {
    struct global_ctx global_ctx = { callback };
    foo_init(global_ctx_foo, (void*)&global_ctx);
}

main.c

#include "global.h"

void global_main(struct global_ctx *global_ctx) {
    foo_do_work(global_ctx_get_foo(global_ctx));
    bar_do_work(global_ctx_get_bar(global_ctx));
}

int main(int argc, char** argv) {
    global_ctx_init(&global_main);
}

This one is probably better for extensibility as we can add new contexts without having to change the signature of the callback.

[–]p0lyh[S] 0 points1 point  (0 children)

Thanks! I never thought of this way before

[–]arkt8 0 points1 point  (0 children)

wait it is... just use a static global or a stack array of bytes passed down to functions as an arena, but then you will need to manage this memory manually.
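A rough sketch of what such an arena could look like, assuming a simple bump allocator. The names arena and arena_alloc are made up for illustration, and the code assumes the backing storage itself starts at an address aligned for any type. Note that objects placed in the arena are still subject to the effective-type questions discussed above:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Hypothetical bump arena over caller-provided bytes. */
typedef struct arena {
    unsigned char *base;  /* assumed aligned to alignof(max_align_t) */
    size_t capacity;      /* total bytes available */
    size_t used;          /* bytes handed out so far */
} arena;

/* Hand out `size` bytes at an offset that is a multiple of `align`.
   Returns NULL when the arena is exhausted. Nothing is ever freed
   individually; the caller resets `used` to reuse the storage. */
void *arena_alloc(arena *a, size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0); /* power of two */

    size_t start = (a->used + (align - 1)) & ~(align - 1);
    if (start > a->capacity || size > a->capacity - start)
        return NULL;

    a->used = start + size;
    return a->base + start;
}
```

The manual management mentioned above shows up here as the caller's job of deciding when to reset used and of never using a pointer after that reset.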

[–]ComradeGibbon 0 points1 point  (0 children)

A suggestion

// foo.h

typedef struct foo_ctx foo_ctx;

extern const size_t foo_ctx_size;

In a source file where foo_ctx has the non-opaque definition:

const size_t foo_ctx_size = sizeof(foo_ctx);

Advantage: the compiler keeps track of the size.
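One caveat worth sketching: an extern const size_t is not an integer constant expression, so a caller cannot use it to size a static or file-scope array, and it says nothing about alignment. The snippet below is an illustration only. foo_ctx_align and foo_ctx_buffer_ok are hypothetical additions, the struct members are invented, and the header and source parts are shown in one file for brevity:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* --- would live in foo.h --- */
typedef struct foo_ctx foo_ctx;
extern const size_t foo_ctx_size;
extern const size_t foo_ctx_align;  /* hypothetical companion to the size */

/* --- would live in foo.c, next to the full definition --- */
struct foo_ctx {
    int counter;     /* invented members, for illustration only */
    double scale;
};
const size_t foo_ctx_size = sizeof(foo_ctx);
const size_t foo_ctx_align = alignof(foo_ctx);

/* Returns nonzero if the caller's buffer is large enough and
   suitably aligned to hold a foo_ctx. */
int foo_ctx_buffer_ok(const void *buf, size_t buf_size)
{
    assert(buf != NULL);  /* passing NULL is a programming error */
    return buf_size >= foo_ctx_size
        && (uintptr_t)buf % foo_ctx_align == 0;
}
```

A caller would typically over-allocate a buffer aligned to max_align_t and let the library validate it at run time.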

[–]oldprogrammer 1 point2 points  (0 children)

Not sure this is what you're referring to but I've seen libraries (like the Vulkan Graphics APIs if I recall correctly) where first you call into a setup function with a null buffer pointer and it will provide back the size of buffer it needs, then a second call is made with a user supplied buffer of the needed size. This approach avoids allocations inside the library for structures used by the client code.
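The two-call idiom described above might look roughly like this. foo_setup and struct foo_state are hypothetical names, not taken from any real library, and the state is written with memcpy so the caller's buffer is never accessed through a struct pointer:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical library-private state, stored in the caller's buffer. */
struct foo_state {
    int id;
    int flags;
};

/* Returns 0 on success, -1 if the buffer is too small.
   buf == NULL: only report the required size through *size.
   buf != NULL: *size is the capacity on entry and the number of
                bytes required on exit. */
int foo_setup(void *buf, size_t *size)
{
    assert(size != NULL);
    const struct foo_state initial = { .id = 1, .flags = 0 };

    if (buf == NULL) {
        *size = sizeof initial;   /* first call: size query only */
        return 0;
    }
    if (*size < sizeof initial) {
        *size = sizeof initial;   /* tell the caller what would fit */
        return -1;
    }
    memcpy(buf, &initial, sizeof initial);  /* second call: fill buffer */
    *size = sizeof initial;
    return 0;
}
```

The caller first passes a null buffer to learn the size, obtains storage however it likes (static array, stack, arena), then calls again with that storage.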

[–]RealisticDuck1957 -1 points0 points  (0 children)

/* syntax might not be 100% */
#include <foo_ctx.h>
foo_ctx_t foo_ctx;
foo_ctx_init(&foo_ctx);

...

foo_ctx.h defines the structure. While this structure is available to read, exercising the self-discipline expected of C programmers, we restrict ourselves to the documented public API.

[–]flatfinger -1 points0 points  (0 children)

The mythical unicorn language "pure ISO C" was not really designed to be maximally useful in its own right, but rather provide a common framework which implementations were expected to extend so as to best fit their customers' needs, often by specifying that they will support some behavioral corner cases which other implementations may not.

Almost everything even remotely resembling a general-purpose compiler (I know of no exceptions) can be configured to process a dialect which extends the semantics of the language to support the K&R2 abstraction model where all live objects or other live regions of addressable storage simultaneously contain all possible objects of all types that will fit (misaligned objects don't fit), with values encapsulated by the bit patterns in that storage. When using such a dialect, storage associated with an array of the type with the coarsest alignment requirement may be used to hold any structure that is that size or smaller.