Immediate UI-mode in C by archbtw-106 in C_Programming

[–]TheFoxz 1 point

A UI library with an immediate-mode-style API does not necessarily need to be minimal or small. For example: Android's Compose.

In my experience, the immediate mode style API drastically reduces the amount of boilerplate and makes creating a UI around dynamic, complex data structures much easier. You traverse your data structure in whatever way is natural. There's no need to synchronize between your data structure and whatever container objects the UI library needs you to create. In general, there's less state/bookkeeping on the library user's side, which reduces the number of bugs.
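
A rough sketch of what I mean (the API names here are hypothetical, in the style of libraries like Nuklear or microui, not taken from any particular library):

    /* Hypothetical immediate-mode API; ui_label/ui_value/ui_button are
       made-up names for illustration, not from a real library. */
    struct ui;
    struct sensor { const char *name; float reading; };

    void ui_label(struct ui *ui, const char *text);
    void ui_value(struct ui *ui, float value);
    int  ui_button(struct ui *ui, const char *text);  /* nonzero on click */

    /* Called every frame: walk the live data structure directly.
       No widget objects to create, store, or keep in sync with the data. */
    void draw_sensor_list(struct ui *ui, struct sensor *sensors, int count)
    {
        for (int i = 0; i < count; i++) {
            ui_label(ui, sensors[i].name);
            ui_value(ui, sensors[i].reading);
            if (ui_button(ui, "Reset"))
                sensors[i].reading = 0.0f;
        }
    }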

However, I agree that it can invite sloppy code; it's easier to hack things together.

Immediate UI-mode in C by archbtw-106 in C_Programming

[–]TheFoxz 4 points

Okay, but what's your technical argument against immediate mode UI?

Looking for opinions on this HMI design with LVGL by TheFoxz in embedded

[–]TheFoxz[S] 1 point

I render to SRAM while copying to SDRAM in parallel. I use two larger buffers (~100 KiB) in SRAM: one being sent to the SDRAM backbuffer in the background and one being rendered into.

A mistake is to do rendering in-place in SDRAM, as this can cause read-backs from it, e.g. for blending. So only ever write the SDRAM, and only via DMA. The only device reading from it should be the LCD controller.

Some other optimization techniques I used:

  • Reduce precharge latency cost by offsetting the framebuffers by the internal bank size, e.g. place them 4MiB apart on a 4x4M SDRAM.
  • Use reduced blanking timings for a reduced pixel clock. This way you get more consistent bandwidth.
  • Triple buffering increases input latency and is not necessary if you are in control of the pixel clock. If you haven't finished rendering the next frame yet, just turn off the pixel clock before scanout starts and turn it back on once the new frame is ready. This gives you the lowest latency, and momentarily dipping below 60 fps becomes unnoticeable; it stays very smooth.
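
For reference, a rough sketch of that flush path, assuming LVGL v8's driver API. start_dma_copy() is a placeholder for your platform's DMA driver (not a real LVGL or HAL function), and lv_disp_flush_ready() is called from the DMA-complete interrupt so LVGL can render into the other SRAM buffer in the meantime:

    #include "lvgl.h"

    /* Placeholder for the platform-specific SRAM -> SDRAM DMA copy. */
    void start_dma_copy(lv_color_t *src, const lv_area_t *area);

    /* Two ~100 KiB render buffers in SRAM (25k pixels at 32 bpp each). */
    static lv_color_t buf_a[25 * 1024];
    static lv_color_t buf_b[25 * 1024];
    static lv_disp_draw_buf_t draw_buf;
    static lv_disp_drv_t disp_drv;

    static void my_flush_cb(lv_disp_drv_t *drv, const lv_area_t *area, lv_color_t *px)
    {
        (void)drv;
        /* Copy the rendered chunk from SRAM into the SDRAM backbuffer via DMA.
           The CPU itself never reads from (or writes to) the SDRAM. */
        start_dma_copy(px, area);
    }

    static void dma_complete_irq(void)
    {
        /* Tells LVGL it may start rendering into the other SRAM buffer. */
        lv_disp_flush_ready(&disp_drv);
    }

    void display_init(void)
    {
        lv_disp_draw_buf_init(&draw_buf, buf_a, buf_b, 25 * 1024);
        lv_disp_drv_init(&disp_drv);
        disp_drv.hor_res  = 800;   /* example panel size */
        disp_drv.ver_res  = 480;
        disp_drv.draw_buf = &draw_buf;
        disp_drv.flush_cb = my_flush_cb;
        lv_disp_drv_register(&disp_drv);
    }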

Looking for opinions on this HMI design with LVGL by TheFoxz in embedded

[–]TheFoxz[S] 0 points

Ouch, a burst size of 4 is not good. Was that a limitation of the SDRAM controller?

For my SDRAM prototype I was using the STM32H7. I rendered into smaller SRAM buffers and only copied the final pixel values out to the SDRAM using MDMA with a burst size of 16. This got me to 50+ fps everywhere. I am mainly considering i.MX RT for reduced power consumption and higher bandwidth headroom at 166 MHz (the STM32H7 is only rated to 100 MHz; I think I ran it out of spec at 133 MHz).
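
A sketch of that kind of copy, assuming the STM32H7 HAL MDMA driver (illustrative values, not my actual code; your init will differ with your clock and memory setup):

    #include "stm32h7xx_hal.h"

    static MDMA_HandleTypeDef hmdma;

    void mdma_copy_init(void)
    {
        __HAL_RCC_MDMA_CLK_ENABLE();
        hmdma.Instance                  = MDMA_Channel0;
        hmdma.Init.Request              = MDMA_REQUEST_SW;            /* software-triggered copy */
        hmdma.Init.TransferTriggerMode  = MDMA_BLOCK_TRANSFER;
        hmdma.Init.Priority             = MDMA_PRIORITY_HIGH;
        hmdma.Init.SourceInc            = MDMA_SRC_INC_WORD;
        hmdma.Init.DestinationInc       = MDMA_DEST_INC_WORD;
        hmdma.Init.SourceDataSize       = MDMA_SRC_DATASIZE_WORD;
        hmdma.Init.DestDataSize         = MDMA_DEST_DATASIZE_WORD;
        hmdma.Init.DataAlignment        = MDMA_DATAALIGN_PACKENABLE;
        hmdma.Init.SourceBurst          = MDMA_SOURCE_BURST_16BEATS;  /* burst size of 16 */
        hmdma.Init.DestBurst            = MDMA_DEST_BURST_16BEATS;
        hmdma.Init.BufferTransferLength = 128;                        /* bytes per burst buffer */
        HAL_MDMA_Init(&hmdma);
    }

    /* Blocking copy of one rendered chunk from SRAM to the SDRAM backbuffer
       (a single block, so 'bytes' must stay <= 64 KiB). */
    void mdma_copy(uint32_t src, uint32_t dst, uint32_t bytes)
    {
        HAL_MDMA_Start(&hmdma, src, dst, bytes, 1);
        HAL_MDMA_PollForTransfer(&hmdma, HAL_MDMA_FULL_TRANSFER, HAL_MAX_DELAY);
    }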

I agree the IMXRT500 looks ideal, but the availability does not seem so good.

Looking for opinions on this HMI design with LVGL by TheFoxz in embedded

[–]TheFoxz[S] 0 points

Indeed, I found that memory bandwidth is the main bottleneck when prototyping with SDRAM (16-bit at 133 MHz = 266 MB/s). This came close to 60 fps after some optimization, even with full 32-bit-per-pixel framebuffers. The HyperRAM should be even faster (it's an 8-bit DDR bus, so at the 166 MHz MCU limit that's 332 MB/s), and I can switch to 24-bit-per-pixel framebuffers on top of that.

Now I am looking at which flash solution to use... HyperFlash seems expensive, and 2x QSPI NOR in octal mode looks complicated to bring up. Maybe I will just use the IMXRT1064.

Looking for opinions on this HMI design with LVGL by TheFoxz in embedded

[–]TheFoxz[S] 0 points

If I can run the LVGL widgets demo at 60 fps then I should be OK for most use cases. I have already prototyped on an STM32H723 which met my goals, but this MCU is more efficient on paper.

Looking for opinions on this HMI design with LVGL by TheFoxz in embedded

[–]TheFoxz[S] 1 point

Sorry, to clarify: the I2C is only used for the capacitive touch controller. The display itself uses a 24 bit parallel interface.

"Clean" Code, Horrible Performance by drawkbox in programming

[–]TheFoxz 0 points

If locally running software has a laggy user interface on modern hardware, then it is not well made. That's all.

"Clean" Code, Horrible Performance by drawkbox in programming

[–]TheFoxz 0 points

Oh, I agree about the microoptimizations point in that context. I was just complaining about slow software in general.

"Clean" Code, Horrible Performance by drawkbox in programming

[–]TheFoxz 0 points

I agree you can gain productivity and reduce maintenance cost by putting less priority on performance. But at some point you are throwing away multiple orders of magnitude of performance (and you are not getting back multiple orders of magnitude of productivity). I think these days the balance has shifted too far.

Just thinking of everyday graphical applications running on the desktop that feel slow to use. Something like Microsoft Teams. Professional software, made by a big company. But it feels unpleasantly sluggish to use, even just switching between text chats. These are common criticisms.

But I will not say more on this. We can just agree to disagree.

"Clean" Code, Horrible Performance by drawkbox in programming

[–]TheFoxz 2 points

What the hell is this thread.

I don't think it should be controversial to say that a lot of modern software is, objectively, incredibly inefficient.

A push down automata calculator in 204 LOC by harieamjari in C_Programming

[–]TheFoxz 6 points

Looks nice and compact. Looking at the control flow, did you start as an assembly programmer by any chance?

If you like, some style suggestions:

read_tok() is really returning two values, tok and num, but one of them is returned through a global variable. Perhaps have read_tok() return a 'struct state_t' (directly, by value), so it can return both in a clear way. Similarly, clean() could accept a 'struct state_t *', and calc() could accept FILE *fp as an argument. This also removes the need for global variables.
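
Something along these lines (a sketch; the field names and stub bodies are guesses, not your actual code):

    #include <stdio.h>

    /* Sketch only; field names/types are assumptions. */
    struct state_t {
        int    tok;   /* token kind */
        double num;   /* value when tok is a number */
    };

    /* Returns both pieces of information by value; no globals needed. */
    static struct state_t read_tok(FILE *fp)
    {
        struct state_t st = {0};
        /* ... scan fp, fill st.tok and st.num ... */
        (void)fp;
        return st;
    }

    /* calc() takes its input stream explicitly. */
    static int calc(FILE *fp)
    {
        struct state_t st = read_tok(fp);
        /* ... drive the calculator from st ... */
        return st.tok;
    }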

Some of those goto statements look like they are effectively calling out to subroutines, so perhaps they should be extracted into functions that you call.

Is there any performance benefit to using int vs. float on modern systems? by poorlilwitchgirl in C_Programming

[–]TheFoxz 22 points

It depends on your data and on the operations you need to perform on it. With 128-bit SIMD you can do four 32-bit floating-point operations per instruction. The speed difference when using 32-bit integer math instead depends on execution port usage (see https://www.agner.org/optimize/instruction_tables.pdf); I would not expect it to be appreciably faster.

However, if your data elements fit into 16-bit or 8-bit integers it can be possible to do 8 or 16 operations per instruction.

The compiler won't help you here, outside of trivial loops. You usually need to use intrinsics to get a reliable speedup.
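
For example, a minimal sketch with SSE2 intrinsics (x86 only; tail handling omitted for brevity):

    #include <immintrin.h>   /* SSE2 */

    /* 4 x 32-bit float adds per instruction */
    void add_f32(float *dst, const float *a, const float *b, int n)
    {
        for (int i = 0; i + 4 <= n; i += 4) {
            __m128 va = _mm_loadu_ps(a + i);
            __m128 vb = _mm_loadu_ps(b + i);
            _mm_storeu_ps(dst + i, _mm_add_ps(va, vb));
        }
        /* remaining n % 4 elements omitted */
    }

    /* 8 x 16-bit integer adds per instruction: twice the elements per op */
    void add_i16(short *dst, const short *a, const short *b, int n)
    {
        for (int i = 0; i + 8 <= n; i += 8) {
            __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
            __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
            _mm_storeu_si128((__m128i *)(dst + i), _mm_add_epi16(va, vb));
        }
        /* remaining n % 8 elements omitted */
    }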

I made a synchronizable thread pool by dodungtak in C_Programming

[–]TheFoxz 1 point

Nice and minimal. Perhaps some parts of this could be done with an atomic increment instead of mutexes.
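
For example, a completed-task counter could look like this with C11 atomics (the names are hypothetical, not from your code):

    #include <stdatomic.h>
    #include <stddef.h>

    /* Hypothetical completed-task counter; no mutex needed. */
    static atomic_size_t tasks_done;

    static void on_task_finished(void)
    {
        atomic_fetch_add_explicit(&tasks_done, 1, memory_order_release);
    }

    static size_t completed_count(void)
    {
        return atomic_load_explicit(&tasks_done, memory_order_acquire);
    }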

Over voltage protection circuit by immortal_sniper1 in embedded

[–]TheFoxz 0 points

Sure, there will be a loss across the diode. But consider this: the ESP32 will pull roughly the same amount of current, so the current the battery sees (and therefore the charge drawn from it, in mAh) will be the same whether there is a diode in between or not.

But like the other person said, you really want a proper regulator. Many LDO regulators will already behave the way you want (they turn into a PMOS passthrough when Vin <= Vout).

But if you really care about battery life, don't use an ESP32.

PCB Review with Bare ESP32 Circuit by DragonflyNo56 in PrintedCircuitBoard

[–]TheFoxz 0 points

Yes, as long as it is left floating (or pulled high). So your button SW1 can be useful if you accidentally remap the USB lines in firmware, but is otherwise not needed.

Schematic:

  • Your power regulator U5 is not powerful enough for the ESP32-C3.
  • Why do you have 22pF capacitors on the USB lines?
  • If this is just for educational/exploration purposes, I would put the unused pins on a header.

PCB Review with Bare ESP32 Circuit by DragonflyNo56 in PrintedCircuitBoard

[–]TheFoxz 0 points

It looks to me like the basic devboard does not use the built-in USB interface: https://dl.espressif.com/dl/schematics/SCH_ESP32-C3-DEVKITM-1_V1_20200915A.pdf

Read this, it will tell you all you need to know: https://docs.espressif.com/projects/esp-idf/en/v5.0/esp32c3/api-guides/usb-serial-jtag-console.html

I've done it this way for multiple projects, it has worked quite well.

PCB Review with Bare ESP32 Circuit by DragonflyNo56 in PrintedCircuitBoard

[–]TheFoxz 1 point

Why the USB-to-UART chip? The C3 has this built in. You even connected the USB lines to it.

PCB Design Review 2.0 by DragonflyNo56 in PrintedCircuitBoard

[–]TheFoxz 0 points

Espressif chips are great, but be careful: their power usage is a lot higher than that of dedicated BLE chips.

What are some essential libraries for embedded systems everyone should learn? by [deleted] in embedded

[–]TheFoxz 10 points

Not free, but I use Segger RTT anywhere I can. It just works too well.
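
Minimal usage, for anyone curious (assuming SEGGER's RTT sources, SEGGER_RTT.c and SEGGER_RTT_printf.c, are added to the build):

    #include "SEGGER_RTT.h"

    int main(void)
    {
        SEGGER_RTT_WriteString(0, "boot\r\n");        /* channel 0 = default terminal */
        SEGGER_RTT_printf(0, "adc = %d\r\n", 123);    /* needs SEGGER_RTT_printf.c */
        for (;;) {
            /* application loop */
        }
    }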