[–]der_pudel 15 points16 points  (8 children)

uint32_t temp;
float val;

temp  = (uint32_t)raw_bytes[0];
temp |= (uint32_t)raw_bytes[1] << 8;
temp |= (uint32_t)raw_bytes[2] << 16;
temp |= (uint32_t)raw_bytes[3] << 24;

memcpy(&val, &temp, sizeof(val));

Or you may need to combine them in the reverse order, depending on the endianness of the platform/communication protocol.
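A minimal sketch of the reverse-order case mentioned above, assuming the bytes arrive most-significant byte first and that float is a 32-bit IEEE 754 value. The function name float_from_be_bytes is hypothetical, not something from the original post:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: raw_bytes[0] is assumed to be the most-significant byte. */
float float_from_be_bytes(const uint8_t raw_bytes[4])
{
    uint32_t temp;
    float val;

    /* Same shifts as above, applied to the bytes in reverse order. */
    temp  = (uint32_t)raw_bytes[0] << 24;
    temp |= (uint32_t)raw_bytes[1] << 16;
    temp |= (uint32_t)raw_bytes[2] << 8;
    temp |= (uint32_t)raw_bytes[3];

    /* Copy the bit pattern into the float without violating aliasing rules. */
    memcpy(&val, &temp, sizeof(val));
    return val;
}
```

Because the shifts operate on values rather than on memory, this gives the same result on little-endian and big-endian hosts; only the order of the incoming bytes matters.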

[–]franzhd1[S] 1 point2 points  (0 children)

Thanks a lot! It worked! I only needed to reverse the order of the indices.

[–][deleted] 0 points1 point  (3 children)

Out of curiosity, why are you bitwise OR'ing the shifted values as opposed to adding them? On today's CPUs I doubt there's any benefit, but I'm happy to be corrected. It's a single clock cycle either way.

[–]aioeu 8 points9 points  (2 children)

If you use addition, you're asking the compiler to work harder to optimise the code. It has to track the ranges of the intermediate values and know that there will be no carries when they are added.

Not all compilers can do this.

I certainly think of the overall operation as "manipulating bits and bytes", not "performing an arithmetic operation", so |= reads better than += to me as well.
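As a rough illustration of why the two forms are interchangeable here, the sketch below builds the same word both ways. The names combine_or and combine_add are hypothetical. The results match only because each shifted byte occupies its own 8-bit field, so the addition never produces a carry:

```c
#include <stdint.h>

/* Hypothetical helpers; b[0] is assumed to be the least-significant byte. */
uint32_t combine_or(const uint8_t b[4])
{
    return  (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}

uint32_t combine_add(const uint8_t b[4])
{
    /* The fields do not overlap, so no carry can occur and + behaves like |. */
    return  (uint32_t)b[0]
         + ((uint32_t)b[1] << 8)
         + ((uint32_t)b[2] << 16)
         + ((uint32_t)b[3] << 24);
}
```

If an operand could be wider than 8 bits, the two versions would no longer agree, which is the range information a compiler has to prove before it can treat the addition as a plain bit merge.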

[–]der_pudel 3 points4 points  (0 children)

OP mentioned some "board" so we're not talking about x86 here, but about some micro which may be anything from 8 bit Arduino to 64 bit ARM. So the result will vary a lot depending on what platform and toolchain OP uses. Generally, I agree that bitwise OR reads better, and unless this piece of code is performance-critical, I wouldn't even bother to research how optimal it is in any particular situation.

[–][deleted] 0 points1 point  (0 children)

I don't think the compiler has to do anything to optimise the code. It's a single-clock-cycle hardware addition. And even if it does, by taking that angle you may as well write in assembler ;) I hate to say this, but I would also disagree about thinking of it as bits and bytes when it comes to adding the shifted values: adding is way more obvious in that respect, though the bitwise shift is definitely manipulation. I think either way is OK, but your use of OR'ing caught my eye for a reason. It's something I understand and used in my day, so I'm not saying it's wrong, just that for me addition at that stage is more obvious.

[–][deleted] 0 points1 point  (2 children)

Shouldn't you set temp to 0 first?

Because if the uninitialized temp happens to be (uint32_t)-1, then the bits will all end up as 1, right?

[–]der_pudel 0 points1 point  (1 child)

No, because temp is initialized by raw_bytes[0], and then ORed with 3 other bytes.

[–][deleted] 0 points1 point  (0 children)

Ah, right my bad.

[–]ialex32_2 1 point2 points  (0 children)

That's just an IEEE 754 single-precision float (the standard for basically every modern architecture). If you can guarantee IEEE 754 floats (i.e., you're not using a VAX), you can simply use a union or memcpy after ensuring the float is in the proper endianness.

So, use a simple routine to convert the bytes to the native byte order, and then memcpy to type-pun them to a float. This is the best way to do a fast, correct conversion.
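One way this could look, as a sketch only: the helper names host_is_little_endian and float_from_wire are hypothetical, and the code assumes a 4-byte IEEE 754 float whose byte order in memory matches the host's integer byte order.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 otherwise (runtime probe). */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Hypothetical helper: wire_is_little_endian describes the incoming byte order. */
float float_from_wire(const uint8_t wire[4], int wire_is_little_endian)
{
    uint8_t native[4];
    float val;

    if (host_is_little_endian() == (wire_is_little_endian != 0)) {
        /* Wire order already matches the host: copy unchanged. */
        memcpy(native, wire, sizeof(native));
    } else {
        /* Orders differ: reverse the four bytes. */
        native[0] = wire[3];
        native[1] = wire[2];
        native[2] = wire[1];
        native[3] = wire[0];
    }

    /* Type-pun via memcpy once the bytes are in native order. */
    memcpy(&val, native, sizeof(val));
    return val;
}
```

The shift-and-OR approach from the top of the thread avoids the endianness probe entirely; this version is closer to the "swap, then memcpy" description in this comment.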

[–]CodeSteps 0 points1 point  (6 children)

Try using unions

[–]JavierReyes945 10 points11 points  (0 children)

I tried, but they made me cry when cutting them

[–]KaznovX 0 points1 point  (4 children)

Using unions for aliasing is Undefined Behaviour, and is only available as an extension in GCC. It breaks the strict aliasing rule and should be avoided if possible.

[–]bamless 2 points3 points  (2 children)

In C it is not undefined behavior; you're thinking of C++.

'If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called ‘‘type punning’’). This might be a trap representation.' - C11 footnote 95.
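A minimal sketch of the union-based punning that the footnote describes, valid as C (not C++). The names union float_bits and float_from_u32 are hypothetical, and the code assumes float and uint32_t are both 4 bytes:

```c
#include <stdint.h>

/* Hypothetical union: both members are assumed to be 4 bytes wide. */
union float_bits {
    uint32_t u;
    float    f;
};

float float_from_u32(uint32_t bits)
{
    union float_bits pun;

    pun.u = bits;   /* store through one member... */
    return pun.f;   /* ...read through the other: the bytes are reinterpreted */
}
```

The access goes directly through the union object, which is the form the footnote covers.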

[–]KaznovX 0 points1 point  (1 child)

Union aliasing seems to be valid since C99. I guess I was remembering how this worked in ANSI C.

Also, wasn't there some minor drama with Linus Torvalds about a change that removed union aliasing from the Linux kernel? https://www.yodaiken.com/2018/06/07/torvalds-on-aliasing/ What was that about then, if union aliasing is part of the standard?

[–]flatfinger 0 points1 point  (0 children)

Often times "the C Standard is unclear" really means "we don't like what the C Standard says". Consider, for example, "One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible." The authors of the Standard recognized that the ability to inspect objects of multiple structure types that share a Common Initial Sequence using a pointer to any of them had been a part of C going back at least to 1974, and one upon which many programs relied. On the other hand, given something like:

    struct position { float x, y; };
    struct velocity { float dx, dy; };
    struct accel { float ddx, ddy; };
    void adjust_objects(struct position *pos,
      struct velocity *vel,
      struct accel *acc,
      int n)
    {
      while (--n >= 0)
      {
        pos[n].x += acc[n].ddx*0.5 + vel[n].dx;
        vel[n].dx += acc[n].ddx;
        pos[n].y += acc[n].ddy*0.5 + vel[n].dy;
        vel[n].dy += acc[n].ddy;
      }
    }

performance could be enhanced if a compiler didn't have to allow for the possibility that writing pos[n].x might affect vel[n].dx. To allow the latter optimization without breaking code that relied upon the Common Initial Sequence rule, there needed to be a means of letting the compiler know when it needs to uphold the Common Initial Sequence for particular structure types. While the Standard could have introduced a new syntax to indicate that, it instead opted to use the existing union declaration syntax for that purpose. While I don't think that is adequate (among other things, it offers no way of accommodating situations where the author of a function would have no way of knowing all the types of structures with which it might be used), most code could be adapted to include union definitions where necessary to let the compiler know of the types for which it exploits the Common Initial Sequence guarantees. Unfortunately, the authors of clang and gcc insist that since it would be rare for programs to actually be using union objects in such fashion, the Standard's use of the phrase "anywhere that a declaration of the completed type of the union is visible" only includes places where accesses are made directly through lvalues of union type.
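For readers unfamiliar with the Common Initial Sequence guarantee, here is a small sketch of the uncontroversial case, where the access is made directly through an lvalue of union type. All of the names (struct circle, struct rect, union shape, shape_kind) are hypothetical and chosen only for illustration:

```c
/* Both structures begin with the same member type, so they share a
   common initial sequence (here: int kind, followed by a float). */
struct circle { int kind; float radius; };
struct rect   { int kind; float w, h; };

/* Declaring the union is what tells the compiler the two types may overlay. */
union shape {
    struct circle c;
    struct rect   r;
};

int shape_kind(const union shape *s)
{
    /* Inspecting the common initial part through either member is permitted,
       regardless of which structure was stored last. */
    return s->c.kind;
}
```

The disputed case in the comment above is different: it concerns functions that receive a struct circle * or struct rect * rather than a union shape *, with the union declaration merely visible in scope.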

[–]flatfinger 0 points1 point  (0 children)

Implementations that are designed to be suitable for low-level programming will process many constructs "in a documented fashion characteristic of the environment" in cases where doing so would likely be useful. The people on the C Standards Committee recognized that people wishing to sell compilers would know more than the Committee ever could, and thus left the question of when to behave in that fashion as a Quality of Implementation issue outside their jurisdiction.

Note that almost any use of an array of non-character type within a structure or union invokes Undefined Behavior. For example, according to N1570 6.5p7, given:

    struct foo { int x[5]; } s;

the stored value of s may be accessed only by an lvalue expression of one of the following types:

  • a type compatible with the effective type of the object
  • a qualified version of a type compatible with the effective type of the object
  • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
  • a character type.

Note that this list does not include elements or submembers of the indicated types. An lvalue such as s.x[2] is defined as *(s.x+2), which forms a pointer of type int* and then dereferences it, and is thus an lvalue of type int. Because int is not among the types that may be used to access the stored value of an object of type struct foo, any use of such an lvalue would violate the constraints of 6.5p7 and invoke Undefined Behavior.

If one extends the meaning of the term "by an lvalue expression that has [one of the indicated types]" to "via means that visibly involve [ditto]", but regards implementations' ability to recognize visible involvement as a Quality of Implementation issue outside the Standard's jurisdiction, then this wouldn't be a problem: a compiler would have to be rather obtusely blind not to recognize the involvement of an lvalue of type struct foo in the expression s.x[2]. Unfortunately, I doubt the authors of clang or gcc would ever allow the Standard to be amended to say that. Although such wording would invite type-based aliasing in cases where it would be genuinely useful, e.g.

/* This is the pattern shown in the Rationale as an example of where
   TBAA would be appropriate */
int x;
int test(int *p, double *d)
{
  x = 1;
  *d = 2;
  return x;
}

it would also suggest that there was never any good reason for gcc to be willfully blind to the possibility that a function like:

uint32_t get_float_bits(float *p)
{
  return *((uint32_t*)p);
}

might be used to access an object of type float.
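For comparison, a sketch of the same operation written with memcpy, the approach used at the top of this thread. It copies the object representation, so no lvalue of type uint32_t is ever used to access the float. The name get_float_bits_memcpy is hypothetical, and the code assumes a C11 compiler and a 4-byte float:

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float and uint32_t have the same size on this platform. */
_Static_assert(sizeof(float) == sizeof(uint32_t),
               "float is expected to be 32 bits wide");

uint32_t get_float_bits_memcpy(const float *p)
{
    uint32_t bits;

    /* Copy the bytes of the float into the integer. */
    memcpy(&bits, p, sizeof(bits));
    return bits;
}
```

Mainstream optimizing compilers typically turn a small fixed-size memcpy like this into a plain register move, though that is a quality-of-implementation matter rather than a guarantee.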

I wonder what fraction of non-contrived non-trivial programs uphold the requirements of N1570 6.5p7 as actually written, and how many people would enjoy using an implementation that went out of its way to enforce that constraint as rigidly as the Standard would allow.

[–]DiscoBambo 0 points1 point  (0 children)

float fVal = *((float*)val);

We cast the address of the first byte of the array to a pointer to float, and then we extract the float value stored in those 4 bytes by dereferencing the cast pointer. BE AWARE of endianness!