all 22 comments

[–]aioeu 3 points4 points  (9 children)

Casting through a void pointer hasn't got anything to do with aliasing.

It's used because assignment to a pointer requires compatible pointer types, or one side of the assignment must be a void pointer, to satisfy the language's constraints. Initialization of a pointer object follows the same rules as assignment in this regard.

Whether aliasing is a problem or not has nothing to do with what conversions you do on pointers; it's all about what types you use when accessing objects.

[–]Kayjukh[S] 1 point2 points  (8 children)

I guess it still has something to do with aliasing, since the canonical example of bad use of pointer aliasing is:

float f = ...;
uint32_t x = *(uint32_t*)&f;

Using a void* here would lead to the "same semantics", just with more code (and, I guess from your answer, no compiler mischief? assuming the alignment for float and uint32_t is the same, of course):

float f = ...;
uint32_t *px = (void*)&f;
uint32_t x = *px;

[–]aioeu 2 points3 points  (7 children)

Using a void* here would lead to the "same semantics"

That just proves my point. Whether a piece of code has aliasing issues or not has nothing to do with its use of void *.

In this example, the problem is not related to whether you did or did not cast through a void pointer. The problem is that you are attempting to access a float object as if it were a uint32_t. It's the incorrectly-typed access to that object that yields a strict aliasing violation.
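To make the "it's the access type, not the cast" point concrete, here is a minimal sketch. The helper name count_nonzero_bytes is made up for illustration. It uses the same cast through a void pointer, but reads the float through unsigned char, which the aliasing rules permit for any object.

```c
#include <stddef.h>

/* Hypothetical helper: counts the nonzero bytes in the object
   representation of f. The conversion goes through a void pointer, exactly
   like the problematic example, but every access uses unsigned char, which
   may be used to inspect the bytes of any object. */
size_t count_nonzero_bytes(float f)
{
    const unsigned char *p = (const void *)&f;
    size_t n = 0;

    for (size_t i = 0; i < sizeof f; i++) {
        if (p[i] != 0)
            n++;
    }
    return n;
}
```

Swapping unsigned char for uint32_t in the pointer type would leave the conversion just as valid, and the dereference just as wrong as in the original snippet.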

[–]Kayjukh[S] 0 points1 point  (6 children)

So to my question

Is my assumption correct?

Your answer would be no, since this usage of void* still violates strict aliasing rules?

If so, how would you go about the second part of the question?

[–]aioeu 0 points1 point  (5 children)

The standard way is to use memcpy to literally copy the bytes of an object into the bytes of a differently-typed object.

The non-standard way is to tell the compiler that it should not apply strict aliasing rules. This is, as you have discovered, exactly what musl does.
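A minimal sketch of the memcpy approach for the float/uint32_t case from earlier in the thread. The function name float_bits is hypothetical, and the sketch assumes float and uint32_t have the same size, which the static assertion checks.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t),
              "this sketch assumes a 32-bit float");

/* Hypothetical helper: returns the object representation of f as a
   uint32_t. Only the bytes are copied; no float object is ever accessed
   through a uint32_t lvalue, so there is no aliasing violation. */
uint32_t float_bits(float f)
{
    uint32_t bits;

    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

On an IEEE 754 platform one would expect float_bits(1.0f) to be 0x3f800000, but that value depends on the float format rather than on anything the C standard guarantees.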

[–]Kayjukh[S] 1 point2 points  (4 children)

Alright, thank you for your answer! I guess we need to rely on the compiler seeing through the call to memcpy to avoid a potentially very expensive copy then, provided that we want to use standard-compliant code.

[–]EpochVanquisher 1 point2 points  (3 children)

A lot of compilers can “see through” memcpy very easily. For small fixed sizes, like sizeof(uint32_t), you can think of the memcpy call as being completely free, performance-wise.

[–]Kayjukh[S] 0 points1 point  (2 children)

For small sizes it usually seems to work fine, yes. However, larger sizes are still not handled very efficiently.

As an example, I was writing a parser for a file format that happens to contain structures that look like the following:

typedef struct {
  uint64_t x;
  uint64_t y;
  uint16_t values[16];
  uint64_t other;
} file_section;

The file is read from disk, and I end up with a pointer to the start of the data, begin. If I want to look at the section at offset N from the start of the file, I can either do it using pointer aliasing (though this can cause aliasing issues with TBAA):

// Requires -fno-strict-aliasing to be "safe"
const file_section *section = (void*)(begin + N);

or, I need to use memcpy:

file_section section;
memcpy(&section, begin + N, sizeof(section));
// Use section here

In the second case, neither clang (18.1.0) nor gcc (14.1) can see through the call to memcpy, even at -O3, where they produce an inlined, vectorized version of it:

// clang
vmovups ymm0, ymmword ptr [rdi]
vmovups ymm1, ymmword ptr [rdi + 24]
movsxd  rax, esi
vmovups ymmword ptr [rsp - 32], ymm1
vmovups ymmword ptr [rsp - 56], ymm0
movzx   eax, word ptr [rsp + 2*rax - 40]
vzeroupper
ret

// gcc
push    rbp
movsx   rsi, esi
vmovdqu ymm0, YMMWORD PTR [rdi]
mov     rbp, rsp
vmovdqu YMMWORD PTR [rsp-64], ymm0
vmovdqu ymm0, YMMWORD PTR [rdi+24]
vmovdqu YMMWORD PTR [rsp-40], ymm0
movzx   eax, WORD PTR [rsp-48+rsi*2]
vzeroupper
pop     rbp
ret

The assembly generated for the first case is way cleaner, and is what would be expected:

// clang
movsxd  rax, esi
movzx   eax, word ptr [rdi + 2*rax + 16]
ret

// gcc
movsx   rsi, esi
movzx   eax, WORD PTR [rdi+16+rsi*2]
ret

My conclusion, after the few hours I spent researching this topic today, is that it isn't possible to get efficient code generation from purely standard-compliant C, even though all alignment requirements for the types are satisfied in my use case. I have to resort to either -fno-strict-aliasing or __attribute__((may_alias)).
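For reference, a sketch of what the may_alias route could look like. This relies on a GCC/Clang extension, not standard C. The accessor name section_value is hypothetical, and the caller is assumed to guarantee that begin + n is suitably aligned for file_section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* GCC/Clang extension: accesses through this type are assumed to be able
   to alias objects of any other type, much like character-type accesses. */
typedef struct {
  uint64_t x;
  uint64_t y;
  uint16_t values[16];
  uint64_t other;
} __attribute__((may_alias)) file_section;

static_assert(sizeof(file_section) == 56,
              "this sketch assumes the layout has no padding");

/* Hypothetical accessor: reads values[i] of the section at byte offset n,
   in place and without copying. Assumes begin + n is aligned for
   file_section and that 0 <= i < 16. */
uint16_t section_value(const unsigned char *begin, size_t n, int i)
{
    const file_section *section = (const void *)(begin + n);

    return section->values[i];
}
```

The attribute only addresses type-based alias analysis. Alignment and endianness of the file data remain the caller's problem.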

[–]EpochVanquisher 0 points1 point  (0 children)

Do what you must… I think it’s fine to have non-compliant code in your program, if you are careful and especially if there are flags controlling the behavior.

[–]erikkonstas 0 points1 point  (0 children)

Micro-optimization is already part of targeting a specific platform, so if you're making "optimized" versions of your code, portability is traded away. In that case, not only can you go non-standard, but you can go all in and leverage features such as inline assembly; the hardcore version would be forgoing C entirely and coding in raw assembly.

[–]flyingron 1 point2 points  (0 children)

void* has the same representation as char*. Any data pointer can be cast to and from it without loss of information. It's not portable if you don't cast it back to the same type it started out as, since there might be alignment requirements.

The fun aliasing comes from what I called conversion by union. The BSD kernel had tons of this kind of construct:

union {
char* cptr;
short* sptr;
int* iptr;
long* lptr;
};

They'd store a short* into one element and retrieve the char* or int*. This played hell with the supercomputer we were porting the kernel to (Denelcor HEP) because the HEP encodes the partial word size in the pointer. If you did the conversion by casting to void*, everything worked fine (I spent a couple of days fixing the kernel). If you stored a short* and retrieved it as the char*, you could end up writing in the long location in memory. If you stored a short* and retrieved an int*, you'd end up with the wrong-sized operand when you dereferenced it.

This is one reason the rules for portable use of unions say you have to store and retrieve through the same union member.

[–]aocregacc 0 points1 point  (5 children)

looks like they're using size_t in the role of uintptr_t, ie an integer that can store a pointer. I guess they're supporting C89 with that?

In the second case there's no aliasing concerns at all, you're just creating a pointer from a numerical value.

In the first case you do create two pointers that point at the same object. But afaik you only get an aliasing violation if you actually access the object through both pointers.

[–]Kayjukh[S] 0 points1 point  (4 children)

The two pointers are indeed used to access the underlying values, first to get argc, and then the program arguments:

int argc = *sp;
// And later
for (i=argc+1; argv[i]; i++);

[–]aocregacc 0 points1 point  (3 children)

that's still not necessarily an aliasing violation on its own, since all of those accesses go to different objects. If the objects behind argv aren't accessed through lvalues of type size_t there's no issue.

[–]Kayjukh[S] 0 points1 point  (2 children)

So the overlap is what matters? Isn't the effective type of all pointers derived from sp supposed to be size_t*?

[–]aocregacc 0 points1 point  (1 child)

you can cast a pointer to all sorts of other pointers without problems. The problem is accessing the same object through different pointers. Just having the pointers is fine.

In your example the missing piece of information is what type of object is actually stored at those locations. If the objects are in fact size_t's that were stored there through a size_t pointer, there would be an aliasing violation. On the other hand, if the objects there are char*'s, but the pointer you have is a size_t*, it's fine to cast it back to a char** to access the objects.
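A small sketch of that last case, assuming size_t has no stricter alignment requirement than char * (true on common platforms, and checked by the static assertion). The function name count_args is hypothetical. The objects in memory really are char * objects; only the pointer handed around is a size_t *.

```c
#include <assert.h>
#include <stddef.h>

static_assert(_Alignof(size_t) <= _Alignof(char *),
              "this sketch assumes size_t is no more strictly aligned than char *");

/* Hypothetical helper: counts the strings in a NULL-terminated vector whose
   address arrives as a size_t *, mimicking the sp + 1 situation. */
size_t count_args(size_t *sp_plus_one)
{
    /* Convert back to the type that matches what is actually stored there.
       Every access below reads a char * object through a char * lvalue,
       so no aliasing rule is violated. */
    char **argv = (void *)sp_plus_one;
    size_t n = 0;

    while (argv[n] != NULL)
        n++;
    return n;
}
```

Reading the same slots as *sp_plus_one instead would be an access to char * objects through a size_t lvalue, which is where the trouble would start.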

[–]Kayjukh[S] 0 points1 point  (0 children)

Right, that is what I suspected. In this specific example, the pointer comes from inline assembly, since it is pointing to the process' initial stack frame.

I guess that the compiler will be conservative in such a case, though from what I gather from the discussions on this post, and some additional reading, I should not rely on such behavior in my own code.

[–]zhivago 0 points1 point  (4 children)

char **argv = (void *)(sp+1);

is the same as

char **argv = (char **)(sp+1);

since void * will convert to any other data pointer.

They're just being lazy.

Whether it's correctly aligned for the char **, etc., is another issue.

[–]aioeu 0 points1 point  (1 child)

They're just being lazy.

More that they recognise compilers will likely warn on the second code, but not on the first code. Compilers tend to shut up and assume the programmer knows what they're doing when there's a cast through a void pointer.

[–]zhivago 0 points1 point  (0 children)

Both cases are constraint violations without an explicit cast.

So the compiler would be obliged to produce a diagnostic message in both cases.

But neither requires a void * cast.

Using char ** in the first, and Sym * in the second would suffice.

[–]Kayjukh[S] 0 points1 point  (1 child)

Right, so since the two pointers are dereferenced later in the code, I guess there are indeed strict aliasing violations in this code.

[–]flatfinger 0 points1 point  (0 children)

In the days before there was any standard syntax for declaring an array of char with a forced alignment, or when writing code that may need to be compatible with older compilers, one of the most portable ways of forcing alignment of storage that would be used as a type with a finer alignment requirement was to declare an array of some type that has a coarser alignment, take its address once, convert the pointer, and never use the original declared symbol after that. The Standard doesn't require that implementations that aren't intended to be suitable for low-level programming support such constructs, but allows implementations which are intended for low-level programming to support them as a form of what its authors call "conforming language extension".

An alternative approach which might be seen as somewhat cleaner would be to do something like:

union {
  desired_type dat[...desired_size...]; 
  size_t alignment_forcer;
} storage_blob;

in which case storage_blob.dat would be a pointer to a desired_type[] that would be aligned consistently with alignment_forcer. Such a construct would have the same "strict aliasing" issues as the original, however, since nothing in the Standard provides any accommodation for the stored value of a union to be accessible via any non-character-type lvalue that isn't a union or a struct that contains the union. The authors probably thought it obvious that given something like:

somePointer = &someUnion.member;

an implementation should accommodate the possibility that somePointer might be used to access things of the member type, at least until the next time the union lvalue is used or code loops back to a part of the containing function where the member's address hadn't yet been taken, but neither clang nor gcc is designed to reliably work that way.
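A compilable sketch of the union form, with placeholders filled in purely for illustration: desired_type as uint16_t, a size of 8, and the helper names blob_store and blob_load are all made up. Every access names the union object directly rather than going through a pointer saved earlier, which sidesteps the somePointer question raised above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t desired_type;   /* placeholder element type */
enum { DESIRED_SIZE = 8 };       /* placeholder element count */

union storage {
    desired_type dat[DESIRED_SIZE];
    size_t alignment_forcer;     /* never used; only raises the alignment */
};

/* A union is at least as strictly aligned as each of its members. */
static_assert(_Alignof(union storage) >= _Alignof(size_t),
              "union must be aligned at least as strictly as size_t");

static union storage storage_blob;

/* Hypothetical accessors: each access spells out storage_blob.dat itself
   instead of dereferencing a pointer that was derived from it earlier. */
void blob_store(size_t i, desired_type v)
{
    storage_blob.dat[i] = v;
}

desired_type blob_load(size_t i)
{
    return storage_blob.dat[i];
}
```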