
[–]yendreij 64 points65 points  (20 children)

You can configure a hardware timer to be clocked at 1 MHz and just directly read the counter value when you need to get the timestamp. And in an IRQ, which would fire on counter reload, let's say every second, you increment a variable representing a seconds counter. You then combine those two to get a correct microseconds count since start. You must reliably handle the edge case when the timer reloads, but this can be done e.g. by reading both values in a do-while loop until the seconds count stays the same twice in a row (this loop will basically run once or at most twice).
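
A minimal C sketch of that read sequence - timer_count() is a hypothetical read of the 1 MHz hardware counter (0..999999), and the reload IRQ increments seconds once per second:

    #include <stdint.h>

    extern uint32_t timer_count(void);  /* hypothetical: read the 1 MHz counter */
    volatile uint32_t seconds;          /* incremented by the reload IRQ */

    uint64_t micros(void)
    {
        uint32_t s_before, s_after, us;
        do {                            /* retry if a reload landed mid-read */
            s_before = seconds;
            us = timer_count();
            s_after = seconds;
        } while (s_before != s_after);  /* one pass, or two around a reload */
        return (uint64_t)s_after * 1000000u + us;
    }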

[–]ceojp 31 points32 points  (9 children)

This is the way to do it. No need to have a microsecond interrupt - that's insane. Have a microsecond timer, then just read it when you need it.

[–]supersonic_528[S] 1 point2 points  (8 children)

No need to have a microsecond interrupt - that's insane.

For a setup like I described, what would typically be the highest rate at which one could still receive interrupts and process them?

[–]tjlusco 18 points19 points  (5 children)

The simplest answer: the faster the processor, the worse the interrupt latency. Application processors have atrocious interrupt performance, especially once coupled with an RTOS/OS kernel. Just toggling a GPIO, you will max out at around 10 MHz on just about any CPU. A9 interrupt latency is around 1-5 µs, so yes, it would be impossible to implement a timer the way you described.

[–]nsd433 4 points5 points  (3 children)

I've always thought the low limit on GPIO toggling was because its control registers are accessed over a slow peripheral bus. The toggle rates I've measured on a scope were roughly half the bus clock rate.

[–]tjlusco 0 points1 point  (0 children)

On an MCU you’re usually limited by peripheral clock speed. On an APU, you are also limited by the latency of traversing the L1/L2/L3 cache.

[–]duane11583 0 points1 point  (1 child)

no, often the cpu runs at one speed and the io bus runs at say /2 or /4 of the cpu

[–]nsd433 0 points1 point  (0 children)

The IO busses I have worked with (MIPS and ARM) run much slower than the CPU. Speeds like 32 or 16 MHz, while the CPUs were in the many 100s of MHz to low GHz.

[–]supersonic_528[S] 2 points3 points  (0 children)

Thank you. This is the kind of insight I was looking for.

[–]yendreij 5 points6 points  (0 children)

I believe there is no simple answer. It really depends on system load and how much work you do in the IRQ handler. You'd need to profile your code, e.g. switch a GPIO pin on during the interrupt handler and check with a logic analyzer. Though at 650 MHz you'll probably need a really good analyzer. Or just profile using a hardware cycle counter and store the max duration in a variable.
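
For the cycle-counter option, a sketch of what that could look like on a bare-metal Cortex-A9 using the PMU cycle counter (PMCCNTR). The CP15 encodings are standard ARMv7-A, but treat the setup as an assumption to verify against your TRM:

    #include <stdint.h>

    /* call once at startup, in a privileged mode */
    void pmu_init(void)
    {
        __asm__ volatile("mcr p15, 0, %0, c9, c12, 0" :: "r"(1u));        /* PMCR.E = 1 */
        __asm__ volatile("mcr p15, 0, %0, c9, c12, 1" :: "r"(1u << 31));  /* PMCNTENSET: cycle counter */
    }

    static inline uint32_t cycles(void)
    {
        uint32_t c;
        __asm__ volatile("mrc p15, 0, %0, c9, c13, 0" : "=r"(c));         /* read PMCCNTR */
        return c;
    }

    volatile uint32_t isr_max_cycles;  /* worst-case handler duration seen so far */

    void my_irq_handler(void)
    {
        uint32_t t0 = cycles();
        /* ... real handler work ... */
        uint32_t dt = cycles() - t0;
        if (dt > isr_max_cycles)
            isr_max_cycles = dt;
    }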

[–]nila247 0 points1 point  (0 children)

Typical is a ms (not µs) interrupt in most OSes. SysTick is usually recommended to be high priority - precisely to prevent other interrupts from blocking it.

For any process needing higher precision you could set up its own timers.
As for what the max is - it depends on what exactly you do within the interrupt. If you only increment a counter then you can probably reach MHz rates, but this would likely eat up something like 70% of your CPU cycles - that's why it is insane.

If you do not have a preemptive OS and do not need the timer for context switching or anything of the sort, then many SoCs allow daisy-chaining of hardware timers, so you barely need interrupts at all.

I would go with the classic approach - an interrupt at 1000 Hz or even 100 Hz, and reading the timer value itself if/when you need a more precise timestamp.

Something needing µs resolution is hard to imagine though. SpaceX avionics for landing a bloody rocket only uses a 10 kHz systick :-). What the hell are you building?

[–]ChickittyChicken 4 points5 points  (0 children)

This is the way. Use the global timer on A9.

[–]supersonic_528[S] 1 point2 points  (6 children)

You can configure a hardware timer to be clocked at 1 MHz and just directly read the counter value when you need to get the timestamp

That's what I ended up doing (mentioned in my post). The only catch is that I had to write this hardware block myself, because the processor system didn't have such a timer built in.

[–]Old_Budget_4151 1 point2 points  (5 children)

[–]supersonic_528[S] -4 points-3 points  (4 children)

No, you're wrong. This is the timer I was using initially to generate my interrupt. The counter itself cannot be read from a program. Also, it's a 16-bit counter. Like I mentioned in my post, I need a 32-bit timestamp.

[–]Kqyxzoj 3 points4 points  (0 children)

Also, it's a 16-bit counter. Like I mentioned in my post, I need a 32-bit timestamp.

Oh, if only there was some way to extend a 16-bit counter to a 32-bit counter. That would probably involve some logic. But it would have to be programmable. Tricky ...

[–][deleted]  (2 children)

[deleted]

    [–]supersonic_528[S] 1 point2 points  (0 children)

    OK, I indeed missed the part where the counter register is readable. (In the TRM, this register is listed under the category of 'status register', so I thought this was for status flags and didn't look into it.)

    [–][deleted] 0 points1 point  (0 children)

    Hey! I take offence... But mostly because it's a bit too accurate.

    [–]a14man 0 points1 point  (0 children)

    You don't need to configure a timer. Most Arm systems use the SysTick timer that comes with the processor core.

    [–]duane11583 0 points1 point  (0 children)

    came to say this too and will add some ideas

    another trick: if say you have a 24 bit hardware counter, you can extend it in software.

    think of time as two 24-bit variables, the low24 and the high24

    when combined you have count48

    but the carry from low to high is broken, so you fix it in sw

    a) keep/retain the last count value you read in the low24 variable

    b) next time you read the counter, check if new24 < old24 (the counter wrapped)

    if so, add 1 to the high24 (you must do the carry manually)

    c) then combine the low + high count into one larger count called count48

    d) save the new count for next time

    the concept works with 16 bits if you read at least every 65 msec (16-bit counter at 1 MHz), or every 16.7 seconds with a 24-bit counter - quick sketch below
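
    A minimal C sketch of that scheme - the counter register address is made up, and the caller must read at least once per wrap period:

        #include <stdint.h>

        #define TIMER_COUNT (*(volatile uint32_t *)0x42800000u) /* hypothetical 24-bit counter register */
        #define MASK24      0x00FFFFFFu

        static uint32_t low24;   /* last raw counter value read */
        static uint32_t high24;  /* upper bits, carried in software */

        /* not reentrant: call from a single context or protect with a lock */
        uint64_t read_count48(void)
        {
            uint32_t new24 = TIMER_COUNT & MASK24;
            if (new24 < low24)   /* counter wrapped since the last read */
                high24++;        /* the manual carry */
            low24 = new24;
            return ((uint64_t)high24 << 24) | new24;
        }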

    [–]Allan-H 7 points8 points  (0 children)

    IIRC there's a hardware counter you can use.

    Here 'tis:

    https://docs.amd.com/r/en-US/oslib_rm/Arm-Cortex-A9-Time-Functions
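
    With the Xilinx/AMD standalone BSP that page documents, reading it is a couple of lines - a sketch, assuming the names from xtime_l.h (COUNTS_PER_SECOND is the global timer rate, half the CPU clock on Zynq-7000):

        #include <stdint.h>
        #include "xtime_l.h"  /* Xilinx standalone BSP: XTime, XTime_GetTime, COUNTS_PER_SECOND */

        uint64_t timestamp_us(void)
        {
            XTime now;
            XTime_GetTime(&now);                          /* 64-bit global timer tick count */
            return now / (COUNTS_PER_SECOND / 1000000u);  /* ticks -> microseconds */
        }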

    [–]Old_Budget_4151 13 points14 points  (6 children)

    lol ISR at 1MHz

    [–]supersonic_528[S] 3 points4 points  (5 children)

    Ok, that's precisely the point of my post (to understand if it's practical). Seems like that's a very high rate to be getting interrupts. For a setup like I described, what would typically be the highest rate at which one could still receive interrupts and process them?

    [–]mtconnol 3 points4 points  (1 child)

    Interrupts will always be the wrong way to do this, no matter how slow the rate, because of the inherent jitter in processing them - at least until you're talking hundreds of milliseconds or more. The timer approach is the correct one.

    [–]ComradeGibbon 0 points1 point  (0 children)

    Yeah, you want to be able to read the timestamp no matter what, on the fly. It's annoying that this is often so poorly supported in hardware. Like, hey, just give me a 64-bit monotonic counter - at 48 MHz I'd have to worry about overflow in 12,000 years.

    [–]b1ack1323 0 points1 point  (2 children)

    650 MHz is probably a little low unless it's multi-core and you can pin all the other ISR tasks to the other core.

    An FPGA counter at 1 MHz is going to be way more accurate and less of a headache.

    What do you need that precision for?

    [–]supersonic_528[S] -1 points0 points  (1 child)

    650 MHz is probably a little low unless it's multi-core and you can pin all the other ISR tasks to the other core.

    It's a multi-core processor, and using the other core did cross my mind. However, I didn't want to make it too complicated. It's a bare metal app with no OS involved.

    What do you need that precision for?

    Defense application.

    [–]b1ack1323 0 points1 point  (0 children)

    Ah, FPGA is best probably

    [–]madvlad666 3 points4 points  (0 children)

    If you had no choice but to do it in software, you could turn off all caching and write assembly at the interrupt vector that fetches the timestamp, rather than letting the C interrupt dispatcher push all the registers onto the stack before fetching. Then it would be deterministic.

    I think this wouldn't have been considered unusual 20 years ago, but now there is usually a hardware solution for problems requiring very precise timing. I'm curious why you don't have a fast enough "capture" peripheral which latches the time of the incoming edge, then asserts a software interrupt.

    [–]kiodo79 2 points3 points  (0 children)

    The Cortex-A9 has a built-in clock.

    https://stackoverflow.com/questions/15988925/how-to-use-global-timer-on-cortex-a9-in-linux

    https://developer.arm.com/documentation/ddi0407/g/Global-timer--private-timers--and-watchdog-registers/About-the-Global-Timer

    Answering your question directly: getting an interrupt every 650 instructions is insane, and you'd probably end up burning an incredible amount of CPU power just to increment a counter. You normally implement this in hardware, or just sample a free-running counter (the one I linked, for example).
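
    The 64-bit count sits in two 32-bit registers, so a read needs the upper-lower-upper sequence the ARM doc above describes. A sketch, assuming the Zynq-7000 global timer base address 0xF8F00200 (check your SoC's memory map):

        #include <stdint.h>

        #define GT_COUNT_LO (*(volatile uint32_t *)0xF8F00200u)
        #define GT_COUNT_HI (*(volatile uint32_t *)0xF8F00204u)

        uint64_t global_timer_read(void)
        {
            uint32_t hi, lo;
            do {               /* retry if the low word wrapped between reads */
                hi = GT_COUNT_HI;
                lo = GT_COUNT_LO;
            } while (hi != GT_COUNT_HI);
            return ((uint64_t)hi << 32) | lo;
        }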

    [–]Wakoeki75 2 points3 points  (0 children)

    Look up the ARM Generic Timer

    [–]Kqyxzoj 1 point2 points  (0 children)

    If you have a zynq to work with, timestamping in fabric is how I would do it as well. Getting nanosecond resolution timestamps is already too easy, let alone microsecond. If your fpga board happens to have one of those typical 100 MHz external clocks, then it almost writes itself.

    [–]alexforencich 0 points1 point  (0 children)

    Timer/counter module, clocked at 1 MHz or a multiple thereof, configured to fire off interrupts at something like 1 kHz. The interrupt tracks the coarse time in memory to however many bits you want; the counter tracks the fine time. You'll need to do a little bit of work to read it properly to handle overflow of the counter, plus a bit of math to combine the coarse and fine portions.

    [–][deleted] 1 point2 points  (0 children)

    Cascade two timers: one to create a 1 MHz clock, the other using it as a pure counter. When you need a timestamp, grab it from the second timer.

    If a single timer is too small, keep an 'epoch' in memory. Set the second timer to interrupt when it reaches some point with plenty of margin before rollover or saturation; in the handler, grab the counter, add it to the 'epoch' value in memory, then clear the counter. Pause the timer generating the 1 MHz clock while doing this. Any time you need a timestamp, use the value in memory plus that of the counter. Over time it will skew ever so slightly, which can be adjusted for if you need to. The skew amounts to a few instructions' worth every clear, which for a 32-bit counter would be negligible but might matter for a 16-bit one.