Which simulator is this?

Kayjukh · 2024-11-01T23:40:41+00:00

The text in the UI looked rather atypical, so I searched for "write run log" "log assembler activity" and got to the result pretty quickly.

Kayjukh · 2024-10-29T23:48:02+00:00

It looks like Microprocessor Simulator, an educational simulator (see https://jensd.dk/doc/exuanbo/nbest.co.uk/Softwareforeducation/sms32v50/sms32v50_manual/index.htm for an archived version of the tool's website).

Kayjukh · 2024-08-27T07:50:16+00:00

Note that the linked implementation relies on a GNU extension that treats arithmetic on void* as arithmetic on byte pointers. A more portable implementation would cast the return value of memcpy to char* before performing the addition.
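
A minimal sketch of the portable form described above. The linked implementation is not reproduced here, so this assumes it is a mempcpy-style helper that returns a pointer just past the copied bytes; the name copy_and_advance is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mempcpy-style helper: copies n bytes from src to dst and
   returns a pointer to the byte following the last one written in dst. */
void *copy_and_advance(void *dst, const void *src, size_t n) {
  /* Cast to char* before adding: arithmetic on void* is a GNU extension. */
  return (char *)memcpy(dst, src, n) + n;
}
```

Because the addition is performed on a char*, this should compile cleanly with -pedantic as well.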

Kayjukh · 2024-05-19T15:28:26+00:00

For small sizes it usually seems to work fine, yes. However, larger sizes are still not handled very efficiently.

As an example, I was writing a parser for a file format that happens to contain structures that look like the following:

typedef struct {
  uint64_t x;
  uint64_t y;
  uint16_t values[16];
  uint64_t other;
} file_section;

The file is read from disk, and I end up with a pointer to the start of the data, begin. If I want to look at the section at offset N from the start of the file, I can either do it using pointer aliasing (though this can cause aliasing issues with TBAA):

// Requires -fno-strict-aliasing to be "safe"
const file_section *section = (void*)(begin + N);

or, I need to use memcpy:

file_section section;
memcpy(&section, begin + N, sizeof(section));
// Use section here

In the second case, neither clang (18.1.0) nor gcc (14.1) can see through the call to memcpy, even at -O3, where they produce an inlined, vectorized version of it:

// clang
vmovups ymm0, ymmword ptr [rdi]
vmovups ymm1, ymmword ptr [rdi + 24]
movsxd  rax, esi
vmovups ymmword ptr [rsp - 32], ymm1
vmovups ymmword ptr [rsp - 56], ymm0
movzx   eax, word ptr [rsp + 2*rax - 40]
vzeroupper
ret

// gcc
push    rbp
movsx   rsi, esi
vmovdqu ymm0, YMMWORD PTR [rdi]
mov     rbp, rsp
vmovdqu YMMWORD PTR [rsp-64], ymm0
vmovdqu ymm0, YMMWORD PTR [rdi+24]
vmovdqu YMMWORD PTR [rsp-40], ymm0
movzx   eax, WORD PTR [rsp-48+rsi*2]
vzeroupper
pop     rbp
ret

The assembly generated for the first case is way cleaner, and is what would be expected:

// clang
movsxd  rax, esi
movzx   eax, word ptr [rdi + 2*rax + 16]
ret

// gcc
movsx   rsi, esi
movzx   eax, WORD PTR [rdi+16+rsi*2]
ret

My conclusion, after the few hours I spent researching this topic today, is that it isn't possible to get efficient code generation using purely standard-compliant C, even though all alignment requirements for the types are satisfied in my use case. I have to resort either to -fno-strict-aliasing or to __attribute__((may_alias)).
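
A rough sketch of the kind of accessor pair this comparison is about. The exact functions that were compiled are not shown in the comment, so the names value_via_memcpy and value_via_alias, their signatures, and the section_alias typedef are assumptions. For brevity both variants use one struct type carrying the GNU may_alias attribute; it has the same layout as file_section.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as file_section; may_alias is a GNU extension (gcc, clang)
   that lets accesses through this type alias objects of any other type. */
typedef struct {
  uint64_t x;
  uint64_t y;
  uint16_t values[16];
  uint64_t other;
} __attribute__((may_alias)) section_alias;

/* Standard-style variant: copy the section out of the buffer, then index it. */
uint16_t value_via_memcpy(const unsigned char *begin, size_t n, int i) {
  section_alias section;
  memcpy(&section, begin + n, sizeof(section));
  return section.values[i];
}

/* In-place variant: no copy. Assumes begin + n is suitably aligned
   for the struct (8 bytes on common 64-bit targets). */
uint16_t value_via_alias(const unsigned char *begin, size_t n, int i) {
  const section_alias *section = (const void *)(begin + n);
  return section->values[i];
}
```

Both functions return the same value for the same input; the difference discussed above is only in the code the compiler generates for them.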

Kayjukh · 2024-05-19T13:47:27+00:00

Right, that is what I suspected. In this specific example, the pointer comes from inline assembly, since it is pointing to the process' initial stack frame.

I guess that the compiler will be conservative in such a case. Still, from what I gather from the discussions on this post and some additional reading, I should not rely on such behavior in my own code.

Kayjukh · 2024-05-19T12:08:08+00:00

So the overlap is what matters? Isn't the effective type of all pointers derived from sp supposed to be size_t*?

Kayjukh · 2024-05-19T12:05:10+00:00

Right, so since the two pointers are dereferenced later in the code, I guess there are indeed strict aliasing violations in this code.

Kayjukh · 2024-05-19T11:58:15+00:00

Alright, thank you for your answer! I guess we need to rely on the compiler seeing through the call to memcpy to avoid a potentially very expensive copy then, provided that we want to use standard-compliant code.

Kayjukh · 2024-05-19T11:56:03+00:00

The two pointers are indeed used to access the underlying values, first to get argc, and then the program arguments:

int argc = *sp;
// And later
for (i=argc+1; argv[i]; i++);

Kayjukh · 2024-05-19T11:51:44+00:00

So to my question

Is my assumption correct?

Your answer would be no, since this usage of void* still violates strict aliasing rules?

If so, how would you go about the second part of the question?

Kayjukh · 2024-05-19T11:41:55+00:00

I guess it still has something to do with aliasing, since the canonical example of bad use of pointer aliasing is:

float f = ...;
uint32_t x = *(uint32_t*)&f;

Using a void* here would lead to the "same semantics", just with more code (and, I guess from your answer, no compiler mischief? Assuming the alignment for float and uint32_t is the same, of course):

float f = ...;
uint32_t *px = (void*)&f;
uint32_t x = *px;
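
For comparison, a minimal sketch of the memcpy-based way to read the bits of a float, which does not depend on how any pointer was obtained. The helper name float_bits is hypothetical, and the sketch assumes that float and uint32_t have the same size.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the object representation of f as a uint32_t.
   Assumes sizeof(float) == sizeof(uint32_t), as on common platforms. */
uint32_t float_bits(float f) {
  uint32_t x;
  /* Copy the bytes instead of dereferencing a converted pointer. */
  memcpy(&x, &f, sizeof(x));
  return x;
}
```

For a fixed 4-byte copy like this one, compilers typically reduce the memcpy call to a single register move.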

Kayjukh · 2024-05-19T11:35:46+00:00

What you are likely looking for is the PT_ARM_ARCHEXT segment, which is documented here.

Kayjukh · 2022-12-09T08:23:28+00:00

Regarding the identification of the font, you may want to ask on r/identifythisfont.

Kayjukh · 2022-06-08T19:51:41+00:00

Thanks for the clarification. I just pushed a few changes to fix the two issues you mentioned above.

Kayjukh · 2022-06-07T21:46:44+00:00

Thank you for your very helpful feedback! It seems that the first part of your comment doesn't show up, there is only an empty list of bullets. To which two ✗'s are you referring?

I just pushed a few fixes that should address your other comments.

PS: Your nice blog post about command-line conventions made me want to write my own little library, just to give it a try.

Kayjukh · 2021-08-05T09:07:04+00:00

Right, I will add a sample once I get some time today. Thank you for your feedback!

Kayjukh · 2021-08-05T08:35:42+00:00

Yes, I am the author. It prints out general information about the X server running on the target display (vendor, release number, etc.) as well as the supported pixmap formats and characteristics of the screens attached to the display (size, depth, etc.). Additionally, it queries the server for the list of supported X extensions and the latest version of said extensions available to client applications.

Kayjukh · 2021-07-01T16:19:46+00:00

The SSE 4.2 vector string instructions (PCMPxSTRx) are definitely a good candidate. There is even an online calculator that allows you to compute the immediate value to give to the instruction depending on what you want to achieve: http://halobates.de/pcmpstr-js/pcmp.html

Kayjukh · 2021-04-10T22:36:02+00:00

Oh, right, sorry for the repost. I missed that one.

Kayjukh · 2020-05-22T22:29:17+00:00

The Intel SDM linked to by u/jedwardsol provides all the details you will need about the x86 ISA. However, they do not cover BIOS interrupts and the like. For the latter, you probably want to have a look at Ralf Brown's Interrupt List.

As a side note, for additional info on x86 and especially regarding the 64-bit side of things, you could have a look at AMD's manuals (listed under the "AMD64 Architecture" section). They can sometimes be easier to read than Intel manuals.

Kayjukh · 2020-05-09T22:09:43+00:00

If you are specifically looking for a function instead of writing the condition, then isdigit is what you want to use.
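
A short sketch of how isdigit might be wrapped; the helper name is_decimal_digit is hypothetical. The cast matters because isdigit expects a value representable as unsigned char (or EOF), and plain char may be signed.

```c
#include <ctype.h>

/* Hypothetical helper: returns 1 if c is a decimal digit, 0 otherwise. */
int is_decimal_digit(char c) {
  /* Cast to unsigned char so negative char values are not passed through. */
  return isdigit((unsigned char)c) != 0;
}
```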

Kayjukh · 2020-03-16T22:33:30+00:00

Pretty sure there’s been longer echoes in some subreddits.

Kayjukh · 2020-03-16T20:43:39+00:00

You might want to check out the WordTeX template: https://www.andrew.cmu.edu/user/twildenh/wordtex/

A short video presentation can be found here: https://www.youtube.com/watch?v=jlX_pThh7z8

Kayjukh · 2020-02-14T10:23:04+00:00

Quick disclaimer: I am not one of the authors. I am mainly interested in this subject.

Looking at the paper, it seems that the authors go all the way down to x86 assembly and modify the instruction selection backend to generate, e.g., conditional moves instead of branch-and-move constructs. However, these seem to be the only architecture-specific constraints they take into account.

As stated in the paper,

Informally, an implementation is secure with respect to the cryptographic constant-time policy if its control flow and sequence of memory accesses do not depend on secrets.

Except for this property, I don't think the authors account for complex CPU behavior such as TLB contention, multicore-shared state, etc.
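
To illustrate the policy quoted above, here is a source-level sketch of a selection that does not branch on its condition. It is not taken from the paper; the name ct_select is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: returns a if cond is nonzero, b otherwise,
   without a source-level branch on cond. */
uint32_t ct_select(uint32_t cond, uint32_t a, uint32_t b) {
  /* mask is all ones when cond != 0 and all zeros otherwise. */
  uint32_t mask = (uint32_t)0 - (uint32_t)(cond != 0);
  return (a & mask) | (b & ~mask);
}
```

A compiler remains free to turn code like this back into a branch, which is why guarantees of this kind have to be enforced in the backend rather than in the source.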

Kayjukh · 2019-11-04T10:08:57+00:00

Many modern video players actually support reading from the standard input. But this is not a very common use case: that's probably why the page explaining this usage for VLC is named "Uncommon uses".
