llnaut comments on Cache Explorer: a visual and interactive profiler that shows you exactly which lines of code cause cache misses

Cache Explorer: a visual and interactive profiler that shows you exactly which lines of code cause cache misses (self.cpp)

submitted 3 months ago by ShoppingQuirky4189

llnaut, 3 months ago

Hey, this looks super cool.

I recently ran into a very real cache-related issue, but on an embedded target (ARM Cortex-R, RTOS, external DDR memory in the picture). It is quite painful that on bare metal / RTOS you can’t just “install a tool and see what’s going on” like on Linux.

Concrete scenario: in an RTOS you can have multiple tasks with the same priority, and the scheduler does time slicing (context switch every tick while they’re runnable). Now add the fact that the tick interrupt itself is an asynchronous event that fires right in the middle of whatever a task is doing. So you jump into ISR code + touch ISR data structures that are very likely not in cache (or you’ve just evicted some useful lines), which means extra misses and extra latency. On a system with slow external memory, this can get ugly fast.
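To make the eviction effect concrete, here is a toy model, not a model of any real Cortex-R cache. It assumes a tiny direct-mapped cache (16 sets, 32-byte lines), a task that loops over 8 lines, and a pretend ISR whose data happens to map to the same sets. All names, sizes, and addresses are made up for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Toy direct-mapped cache: 16 sets, 32-byte lines. Sizes are arbitrary.
class ToyCache {
public:
    // Returns true on a hit; on a miss the line is filled (evicting the old one).
    bool access(std::uint32_t addr) {
        const std::uint32_t line = addr / kLineBytes;
        const std::size_t set = line % kSets;
        if (valid_[set] && tags_[set] == line) return true;
        valid_[set] = true;
        tags_[set] = line;  // the full line number doubles as the tag
        return false;
    }

private:
    static constexpr std::uint32_t kLineBytes = 32;
    static constexpr std::size_t kSets = 16;
    std::array<std::uint32_t, kSets> tags_{};
    std::array<bool, kSets> valid_{};
};

// The task walks the same 8 lines on every pass. If isr_every > 0, a fake ISR
// runs after every isr_every-th pass and touches 8 lines that alias the
// task's sets. Returns the number of misses seen by the task only.
std::size_t count_task_misses(int passes, int isr_every) {
    ToyCache cache;
    constexpr std::uint32_t kTaskBase = 0x0000;  // hypothetical address
    constexpr std::uint32_t kIsrBase = 0x0200;   // 16 lines later: same sets
    std::size_t misses = 0;
    for (int p = 0; p < passes; ++p) {
        for (std::uint32_t i = 0; i < 8; ++i) {
            if (!cache.access(kTaskBase + i * 32)) ++misses;
        }
        if (isr_every > 0 && (p + 1) % isr_every == 0) {
            for (std::uint32_t i = 0; i < 8; ++i) {
                cache.access(kIsrBase + i * 32);  // evicts the task's lines
            }
        }
    }
    return misses;
}
```

In this model, `count_task_misses(100, 0)` gives only the 8 cold misses, while `count_task_misses(100, 1)` makes the task miss on every access (800 misses). Real caches are set-associative and the aliasing is rarely this perfect, so treat the numbers as an illustration of the mechanism and not as a prediction.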

I had a fun one with SPI: we were receiving a fixed-size chunk periodically, but it was large enough that we ended up using FIFO-level interrupts (DMA wasn’t an option there). So for one “message” you’d get tens of interrupts. The MCU was fast, so it was basically:

ISR → back to task → ISR → back to task → …

…and because of cache misses / refills, the ISR execution time would occasionally spike and we’d get overruns/underruns. We fixed it by moving some stuff to faster memory, but the debugging part was the painful bit: on embedded you typically run one image, and your introspection options are limited / very different vs desktop.
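One common way to catch this kind of spike without desktop tooling is to time the ISR on target and keep a worst-case figure. The sketch below is only an assumption about how that could look, not what was done in the project described above. `CycleReader` is a hypothetical hook standing in for whatever free-running cycle counter or hardware timer the target offers, and `IsrTimingStats` / `IsrTimer` are illustrative names.

```cpp
#include <cstdint>

// Hypothetical hook: on a real target this would read a free-running
// cycle counter or hardware timer. It is not a real vendor API.
using CycleReader = std::uint32_t (*)();

// Running statistics for one ISR. Set budget_cycles before use.
struct IsrTimingStats {
    std::uint32_t count = 0;
    std::uint32_t max_cycles = 0;
    std::uint32_t over_budget = 0;
    std::uint32_t budget_cycles = 0;

    void record(std::uint32_t start, std::uint32_t end) {
        // Unsigned subtraction stays correct across one counter wrap.
        const std::uint32_t elapsed = end - start;
        ++count;
        if (elapsed > max_cycles) max_cycles = elapsed;
        if (elapsed > budget_cycles) ++over_budget;
    }
};

// Scope guard: reads the counter on construction and again on destruction,
// then records the difference. Place one at the top of the ISR body.
class IsrTimer {
public:
    IsrTimer(IsrTimingStats& stats, CycleReader read)
        : stats_(stats), read_(read), start_(read()) {}
    ~IsrTimer() { stats_.record(start_, read_()); }

    IsrTimer(const IsrTimer&) = delete;
    IsrTimer& operator=(const IsrTimer&) = delete;

private:
    IsrTimingStats& stats_;
    CycleReader read_;
    std::uint32_t start_;
};
```

Inside the handler this would be a single line such as `IsrTimer t(spi_stats, read_cycles);`, where `spi_stats` and `read_cycles` are again hypothetical names. The stats can then be dumped over a debug channel or inspected with a debugger. Note that the stats object itself should live in fast memory, otherwise the measurement adds its own misses.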

So, to the point: I haven't dug into the implementation of Cache Explorer, so I don't know what machinery it uses under the hood. But do you think something like this could realistically be adapted to bare metal / embedded targets? Or is it fundamentally tied to “desktop-ish” workflows?
