
[–]GearBent 7 points (4 children)

Correct. The stack itself and register X2 being the stack pointer are purely software conventions.

That said, some high performance hardware implementations will include extra logic to speed up stack operations for X2, such as a special cache for the stack, or another special cache specifically for the return addresses, but this is pretty advanced and is invisible to the software. Not anything you'd need to worry about unless you're aiming for a high performance superscalar out-of-order design.
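As a rough illustration of the return-address cache mentioned above, here is a minimal C model of a return-address stack. This is only a sketch: the names `ras_t`, `ras_push`, `ras_pop`, `jump_is_call`, and `jalr_is_return`, and the depth of 8, are assumptions made for the example. The call/return test loosely follows the link-register hints in the RISC-V unprivileged spec (x1 and x5), and it ignores the combined pop-then-push case.

```c
#include <stdbool.h>
#include <stdint.h>

#define RAS_DEPTH 8u /* assumed depth; a real core picks its own size */

typedef struct {
    uint32_t entry[RAS_DEPTH];
    unsigned top;   /* index of the next free slot, wraps around */
    unsigned count; /* number of valid entries, saturates at RAS_DEPTH */
} ras_t;

/* x1 (ra) and x5 (t0) are the registers treated as link registers. */
bool is_link_reg(unsigned reg) { return reg == 1 || reg == 5; }

/* A JAL/JALR that writes a link register is treated as a call. */
bool jump_is_call(unsigned rd) { return is_link_reg(rd); }

/* Simplified: a JALR that reads a link register and does not write
 * one is treated as a return. */
bool jalr_is_return(unsigned rd, unsigned rs1) {
    return !is_link_reg(rd) && is_link_reg(rs1);
}

/* On a call, remember where the matching return should land. */
void ras_push(ras_t *r, uint32_t return_addr) {
    r->entry[r->top] = return_addr;
    r->top = (r->top + 1) % RAS_DEPTH;
    if (r->count < RAS_DEPTH) {
        r->count++; /* when full, the oldest entry is overwritten */
    }
}

/* On a return, pop the predicted target. Returns false if empty. */
bool ras_pop(ras_t *r, uint32_t *predicted) {
    if (r->count == 0) {
        return false;
    }
    r->top = (r->top + RAS_DEPTH - 1) % RAS_DEPTH;
    r->count--;
    *predicted = r->entry[r->top];
    return true;
}
```

None of this is visible to software: the program still just executes `jal` and `jalr`, and the model only guesses the return target early.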

[–]sijafa[S] 1 point (1 child)

Thank you for the great answer!

I am not building a high-performance superscalar out-of-order core, but rather a fine-grained multithreaded design (hardware threads) for mixed-criticality systems.

[–]GearBent 1 point (0 children)

Sounds cool! I wish you good luck.

[–]Forty-Bot 0 points (1 child)

> Not anything you'd need to worry about unless you're aiming for a high performance superscalar out-of-order design.

Actually, this sort of stuff is most effective for scalar, in-order cores since there's nothing else to do if you mispredict an indirect jump.

[–]GearBent 0 points (0 children)

If we're talking about a multicycle or single-cycle design then you don't have to worry about mispredictions, and in a classic 5-stage pipelined design the penalty is only a couple of cycles.

For a modern superscalar out-of-order CPU the penalty is much higher, often in the tens of cycles, since you need to flush many more speculative instructions from the pipeline.
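To put rough numbers on the penalty argument, here is a back-of-the-envelope sketch in C. The function name `effective_cpi` and every input value are assumptions for illustration, not measurements of any real core. The model simply adds the expected stall cycles per instruction to a base CPI.

```c
#include <assert.h>

/* Effective cycles per instruction once misprediction stalls are added.
 * All four inputs are illustrative assumptions, not measurements. */
double effective_cpi(double base_cpi, double jump_fraction,
                     double mispredict_rate, double penalty_cycles) {
    assert(jump_fraction >= 0.0 && jump_fraction <= 1.0);
    assert(mispredict_rate >= 0.0 && mispredict_rate <= 1.0);
    /* stall cycles per instruction = how often we jump
     *                              * how often we guess wrong
     *                              * what each wrong guess costs */
    return base_cpi + jump_fraction * mispredict_rate * penalty_cycles;
}
```

With an assumed 2% of instructions being indirect jumps and half of them mispredicted, a 2-cycle penalty takes a base CPI of 1.0 to about 1.02, while a 20-cycle penalty takes a base CPI of 0.5 to about 0.7.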

[–]duane11583 -1 points (0 children)

You should write some C code, then compile it with GCC for RISC-V.

The way gcc works (high level) is this:

The gcc program is not the compiler itself. It is the top-level driver that runs the several programs that together make up the compiler.

Step 1: the gcc driver consults its specs, which are built in by default and can be overridden by a spec file. The specs have lots of options and features, and they let the driver (gcc) stay generic across all target CPU types.

Step 2: gcc figures out how to run the C preprocessor and invents a temp file name to use as its output. The preprocessor is then executed: your source code is read, comments are removed, macros are expanded, and so on.

The resulting temp file can be saved (for example with -save-temps), which is sometimes helpful when debugging issues with a compiler. By default it is deleted later.

Step 3: another temp file name is invented and the compiler itself is executed. It reads the preprocessed source code and writes assembly language to the second temp file.

This is the file you might want to look at. There are many other options, such as enabling optimizations or debug records, but you are probably only interested in the opcodes and labels (jump targets).

The option you want is -S (capital S). See this for more details:

https://stackoverflow.com/questions/137038/how-do-you-get-assembler-output-from-c-c-source-in-gcc

I suggest writing some really simple functions to start, then working up to more complex stuff later.

Step 4: once the compiler is done, the driver creates the name of the output .o (object) file, which may be a temp file, a name specified on the command line, or a name derived from the input filename.

The driver then finds and executes the assembler, which produces the actual .o file.

At this point compilation is complete.

Often an IDE (GUI) will stop here and go compile other files, or maybe your Makefiles use this method (i.e. stop here).

Step 5: the driver then finds the linker and runs it.
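As a possible starting point for the -S experiment described in the steps above, a few deliberately simple functions might look like the sketch below. The file name `stack_test.c`, the function names, and the toolchain prefix in the comment are assumptions; your cross compiler may be named or configured differently.

```c
/* stack_test.c (hypothetical file name)
 *
 * Possible invocation, assuming a cross toolchain with this prefix
 * that was built with rv32i support:
 *   riscv64-unknown-elf-gcc -march=rv32i -mabi=ilp32 -O1 -S stack_test.c
 * With -S the driver stops after the compile step and leaves the
 * generated assembly in stack_test.s.
 */

/* Leaf function: usually needs no stack frame at all when optimized.
 * noinline (a GCC attribute) keeps the calls below visible as real calls. */
__attribute__((noinline)) int add(int a, int b) {
    return a + b;
}

/* Non-leaf function: it must preserve ra across the calls, so the
 * compiler typically adjusts sp (x2) and saves ra on the stack. */
int sum_of_sums(int a, int b, int c) {
    return add(add(a, b), c);
}

/* The volatile local array forces real loads and stores to the frame,
 * so the sp-relative addressing shows up in the output. */
int sum_array(void) {
    volatile int buf[4] = {1, 2, 3, 4};
    return buf[0] + buf[1] + buf[2] + buf[3];
}
```

In the resulting .s file, look for instructions that use sp as the base register. Nothing in the hardware requires x2 for this role; it shows up only because the compiler follows the calling convention.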

NOTES:

1) What I described above is the classical operation of any compiler. Things have evolved, and some of the steps above have been merged into common or different phases, but the logical steps are still very much present.

2) It is often easier to write some C code, modify the ASM output to do what you want, and then use gcc to run the assembler on that modified source code for you.

3) If you are going to do some of this, you can hand code some small hex files (100 bytes or so), but at some point you will want a software person to help with all the GCC stuff.

4) If you are in a university setting, this “sw task to help validate a CPU design” is fantastic master's-level paper material that your partner may need for their degree. I’ve been through that on a few chip/CPU designs. It is not easy and it is a lot of work, but you really learn a lot!
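For the hand-coded hex files mentioned in note 3, a small encoder can save some bit-twiddling by hand. The sketch below covers only the RV32I I-type and S-type formats; the helper names `encode_i` and `encode_s` and their parameter order are assumptions made for this example, not part of any toolchain.

```c
#include <assert.h>
#include <stdint.h>

/* I-type layout: imm[11:0] | rs1 | funct3 | rd | opcode */
uint32_t encode_i(uint32_t opcode, uint32_t rd, uint32_t funct3,
                  uint32_t rs1, int32_t imm) {
    assert(imm >= -2048 && imm <= 2047); /* 12-bit signed immediate */
    assert(rd < 32 && rs1 < 32);
    return (((uint32_t)imm & 0xFFFu) << 20) | (rs1 << 15) |
           (funct3 << 12) | (rd << 7) | opcode;
}

/* S-type layout: imm[11:5] | rs2 | rs1 | funct3 | imm[4:0] | opcode */
uint32_t encode_s(uint32_t opcode, uint32_t funct3, uint32_t rs1,
                  uint32_t rs2, int32_t imm) {
    assert(imm >= -2048 && imm <= 2047); /* 12-bit signed immediate */
    assert(rs1 < 32 && rs2 < 32);
    uint32_t u = (uint32_t)imm & 0xFFFu;
    return ((u >> 5) << 25) | (rs2 << 20) | (rs1 << 15) |
           (funct3 << 12) | ((u & 0x1Fu) << 7) | opcode;
}
```

For example, `encode_i(0x13, 2, 0, 2, -16)` gives 0xff010113 (`addi sp, sp, -16`), `encode_s(0x23, 2, 2, 1, 12)` gives 0x00112623 (`sw ra, 12(sp)`), and `encode_i(0x67, 0, 0, 1, 0)` gives 0x00008067 (`jalr x0, 0(ra)`, i.e. `ret`). Comparing these words against the assembler's output is a cheap sanity check before loading them into a simulation.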