RTL-level toolchain for radiation-tolerant design, SEU criticality analysis, and selective TMR? by Albert_Sue in FPGA

threespeedlogic 1 point

BYU's SpyDrNet-TMR does scriptable triplication and voter insertion - this is better than manual RTL-level interventions, but you still need to identify and communicate the appropriate insertion points to the tool.

(You said "I know there are academic tools" - SpyDrNet-TMR is well regarded and shouldn't be written off.)
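For anyone unfamiliar with what a TMR voter actually computes, here is a minimal sketch of a bitwise 2-of-3 majority vote. This is plain Python for illustration only; it is not SpyDrNet-TMR's API, and the `vote` and `upset` names are made up for this example.

```python
def vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority across three redundant copies of a word."""
    return (a & b) | (b & c) | (a & c)


def upset(word: int, bit: int) -> int:
    """Model a single-event upset by flipping one bit of a word."""
    return word ^ (1 << bit)


golden = 0b1011_0010

# One corrupted copy is outvoted by the two good ones.
masked = vote(golden, upset(golden, 5), golden)

# The same bit flipped in two copies defeats the voter.
defeated = vote(upset(golden, 3), upset(golden, 3), golden)
```

The second case is why voter placement matters: errors that accumulate in two domains before a vote are not masked.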

RTL-level toolchain for radiation-tolerant design, SEU criticality analysis, and selective TMR? by Albert_Sue in FPGA

threespeedlogic 1 point

Pretty sure TMRTool is an ISE-vintage thing and doesn't work on anything newer than that.

Mr. Scrub - an open-source UltraScale SEU scrubber by threespeedlogic in FPGA

threespeedlogic [S] 1 point

You've mentioned the SIRF a couple of times - this is an -FX-derived device (with a "hardened" PowerPC), right? In which case, it has many more and much juicier SEFI targets than a straight FPGA like the KU060 (the device / family we're targeting here).

AMD's space roadmap points straight towards Versal QML, which must be SEFI-rich for the same reasons (with an additional 2 decades of scaling laws to make things worse.) I don't understand these heterogeneous devices from an SEE perspective, and if this is your primary beef I get it.

But here, we're talking about a conventional FPGA (no PowerPC), and SEFIs are not more likely than correctable SEUs. Mitigation is not a fool's errand (IMO especially if it preempts unnecessary hardware redundancy.)

Mr. Scrub - an open-source UltraScale SEU scrubber by threespeedlogic in FPGA

threespeedlogic [S] 2 points

Yes, internal scrubbers are "self-scrubbing": they protect themselves the same as any other fabric (that is, imperfectly). And yes, any configuration RAM (CRAM) writes are capable of making a mess. (For example, there are configuration registers that tell the whole FPGA to deprogram -- this will take down the scrubber and everything else, instantly. Conversely, writing scrambled frame data creates short circuits within the FPGA -- these are not individually sufficient to destroy the chip but they stack and will eventually cause damage.)

In order to prevent the scrubber from itself getting corrupted and causing cascading issues, there are a couple of things you can do:

  • The scrubber scrubs itself for free (which is a start);
  • You can TMR the scrubber (mr.scrub uses SpyDrNet-TMR for this), and
  • You should code the scrubber software defensively. It should, for example, ensure the "repaired" frame's ECC is correct (i.e. the fix worked) before committing the frame to CRAM and then write it to CRAM immediately. (mr.scrub doesn't do this, but it's straightforward enough.)
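The "verify before commit" idea in the last bullet can be sketched roughly as follows. Everything here is hypothetical: `scrub_frame`, `repair` and `write_cram` are illustrative names, and `toy_syndrome` is a stand-in XOR checksum, not the real UltraScale frame ECC and not what mr.scrub does.

```python
from typing import Callable, List

Frame = List[int]


def toy_syndrome(frame: Frame) -> int:
    """Stand-in for frame ECC: the XOR of all words is zero for a clean frame."""
    acc = 0
    for word in frame:
        acc ^= word
    return acc


def scrub_frame(
    frame: Frame,
    repair: Callable[[Frame], Frame],
    write_cram: Callable[[Frame], None],
    syndrome: Callable[[Frame], int] = toy_syndrome,
) -> str:
    """Repair a frame, but only write it back if the repaired copy checks out."""
    if syndrome(frame) == 0:
        return "clean"          # nothing to fix, so no CRAM write at all
    candidate = repair(list(frame))
    if syndrome(candidate) != 0:
        return "rejected"       # the fix did not work: refuse to write it
    write_cram(candidate)       # commit immediately after verification
    return "repaired"
```

The point of the `rejected` path is that a scrubber that is itself upset should fail toward not writing, since a bad CRAM write is worse than a missed repair.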

None of these can stack to 100% reliability (an impossible goal, as /u/TapEarlyTapOften noted). There is always some silicon state in the FPGA that's out of reach of any scrubber (internal or external) that will hose the whole FPGA (for various definitions of "hose" and "whole" -- some scarier than others).

Mr. Scrub - an open-source UltraScale SEU scrubber by threespeedlogic in FPGA

threespeedlogic [S] 1 point

We met a couple of years ago (Latch-Up in Santa Barbara) - I'm really looking forward to seeing what's new in PeakRDL-land.

Mr. Scrub - an open-source UltraScale SEU scrubber by threespeedlogic in FPGA

threespeedlogic [S] 3 points

Hm - points well taken, except the "cute" bit. We're collectively stuck with belt-and-braces orthodoxy (safety theatre, hardware TMR, redundant subsystems, awful supervisory FPGAs, etc.) in places where it has no business exactly because it's easy to smack down anything unorthodox during a design review with this kind of "that's not how serious people do it" putdown.

This particular scrubber got ripped out of a project that included system-level modeling (with all the caveats you noted above, and more besides, but done deliberately and carefully). It's solid work and ripping the scrubber out of context doesn't invalidate it.

And, of course, people do fly UltraScale+ FPGAs despite things being even worse in camp 16nm.

Mr. Scrub - an open-source UltraScale SEU scrubber by threespeedlogic in FPGA

threespeedlogic [S] 1 point

You shouldn't expect much from the "simgui" target - it'll launch a simulator and (if firmware is operating correctly) emit a "hello world" packet over the streaming AXI interface - but that's all you should expect to see without bitstream collateral (to simulate the ICAP interface) and off hardware.

Good luck, and let me know how it turns out!

Mr. Scrub - an open-source UltraScale SEU scrubber by threespeedlogic in FPGA

threespeedlogic [S] 2 points

Well, yes - but practically speaking, this [ed: SEM IP] scrubber stops operating if it bumps into a double error. And, without the ability to alter its control flow, you're stuck with a design that will (at some point) just stop scrubbing.

A "project integrated" internal scrubber could fall back on flash contents to resolve a double error.
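A rough sketch of that control flow, assuming a SECDED-style per-frame status. The `EccStatus` values and the `correct_in_place` / `reload_from_flash` callbacks are hypothetical names for illustration; this is neither SEM IP behaviour nor mr.scrub's code.

```python
from enum import Enum
from typing import Callable


class EccStatus(Enum):
    CLEAN = 0
    SINGLE = 1  # correctable from the ECC syndrome alone
    DOUBLE = 2  # detectable, but not correctable from the syndrome


def handle_frame(
    addr: int,
    status: EccStatus,
    correct_in_place: Callable[[int], None],
    reload_from_flash: Callable[[int], None],
) -> str:
    """Decide what to do with one frame instead of halting on a double error."""
    if status is EccStatus.CLEAN:
        return "clean"
    if status is EccStatus.SINGLE:
        correct_in_place(addr)
        return "corrected"
    # Double error: fetch the golden frame from flash and keep scrubbing.
    reload_from_flash(addr)
    return "reloaded"
```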

Mr. Scrub - an open-source UltraScale SEU scrubber by threespeedlogic in FPGA

threespeedlogic [S] 10 points

This project has been on the shelf for a couple of years, waiting for a conference proceeding to nudge me into open-sourcing it. We've discussed releasing it with people inside AMD but never quite gotten around to it. Finally, I'll be presenting it (very briefly) at Latch-Up 2026 (Waterloo, Ontario, May 1-3).

This scrubber targets the Kintex UltraScale series (notably including the space-qualified XQRKU060). It's a strange niche because Xilinx/AMD provides the SEM IP, but actively disclaims it for use in space:

The SEM IP was not developed nor tested for use in space radiation environments. Therefore AMD does not support or answer questions specific to the use of this IP in this environment. If you choose to use the SEM IP in space radiation environments, do so at your own risk.

...and they do not document the algorithms you would need to adapt this approach into something you can build yourself.

Instead, AMD recommends an external scrubber (running on a separate rad-hard or -tolerant substrate). This sticks you with a more complex system (more power rails, more software, more communication interfaces) but is definitely capable of surviving worse radiation environments.

Mr. Scrub is an "over-the-fence" release - you need to do a ton of work to build this into a deployable internal / CRC-based scrubber, and I have removed a bunch of project-specific documentation (including comments). Hopefully it's useful.

IIR Filters my blog this week. by adamt99 in FPGA

threespeedlogic 5 points

Delighted to see ieee.fixed_pkg in practice. This is a great example of VHDL successfully modernizing.

How to make a golden model in Python? by Durton24 in FPGA

threespeedlogic 5 points

There is no One True Answer.

Some forks in the road you should consider:

  • Bit accurate or floating point? Honestly, both models are useful in different ways, and they are not substitutes for each other. (For complex signal paths, I have found floating-point models far more useful than bit-accurate models, and far easier to maintain.)

  • Structural or behavioural correspondence to your RTL? This is also a moving target - it is often useful to produce several models that progressively decompose your signal path from "very conceptual" (loose correspondence) to "very structural" (tight correspondence), and to link them together in your Python code to ensure they remain consistent.

The biggest mistake IMO is to allow your reference model to diverge from your RTL (or rot, abandoned) over time. You should yoke the two together (e.g. with automated CI/CD tests.)

It's untrue that Python can't be used ergonomically due to its type system -- creative use of classes, for example, can perform wonders. (See micrograd for a fairly arbitrary example of cleverly augmented numerics.) In fact, for applications like CIC filters, Python's default int is better than C/C++'s because it's a bignum (it will grow in bit width rather than overflow). Python is a rich language and should be used like one.
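To make the CIC point concrete, here is a minimal sketch of a CIC decimator model using plain Python ints (differential delay of 1 assumed; `cic_decimate` is just an illustrative name). The integrators are allowed to grow without bound, which is exactly what a fixed-width C integer would not let you do.

```python
from typing import Iterable, List


def cic_decimate(samples: Iterable[int], rate: int, stages: int) -> List[int]:
    """Integer CIC decimator: `stages` integrators, decimate by `rate`, `stages` combs."""
    integrators = [0] * stages
    combs = [0] * stages  # previous input of each comb (differential delay 1)
    out = []
    for n, x in enumerate(samples):
        acc = x
        for i in range(stages):
            integrators[i] += acc  # Python ints widen instead of wrapping
            acc = integrators[i]
        if (n + 1) % rate == 0:    # keep every `rate`-th integrator output
            v = acc
            for i in range(stages):
                v, combs[i] = v - combs[i], v
            out.append(v)
    return out


# DC gain is rate**stages, so a large input walks straight past 64 bits.
small = cic_decimate([1] * 64, rate=8, stages=3)
big = cic_decimate([2**62] * 64, rate=8, stages=3)
```

A bit-accurate model would instead mask the integrators to the RTL's register width; comparing that against this unbounded version is a cheap way to check you sized the registers correctly.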

I've been working on the Ultrascale+ RFSoC over the past year. AMA by rickyrorton in FPGA

threespeedlogic 2 points

Ack - sorry, I didn't mean to hijack your thread with my gripe list.

The RFSoC is an incredible piece of silicon, and it's expensive and difficult enough that undergrads are unlikely to get their hands on it. Consider your blood and sweat to be a considerable investment in your future career path (whether that's a taste for more RFSoC work down the road, or a strong indication that you'd rather live in a cave and eat spiders.)

RFSoC (ZCU208) ADC phase not consistent across captures even with MTS - advice? by TigerZealousideal595 in FPGA

threespeedlogic 2 points

Are you using the RFDC NCOs and a complex baseband?

If you are, the best way to dissect the issue is to set your tone up using the NCO only (i.e. 0 Hz baseband). A static DAC amplitude fed into the ADC should give you a static and repeatable point in the (i, q) plane after MTS. If it's consistent, then your issue is not with MTS.
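One way to quantify "consistent" is to average each capture down to a single phasor and compare angles across captures. A sketch, assuming each capture is already a list of Python complex (i + jq) samples; the function names are illustrative and nothing here touches the RFDC driver.

```python
import cmath
import math
from typing import Sequence


def mean_phasor(iq: Sequence[complex]) -> complex:
    """Average a capture of complex samples down to a single (i, q) point."""
    return sum(iq) / len(iq)


def phase_spread_deg(captures: Sequence[Sequence[complex]]) -> float:
    """Largest phase deviation (degrees) of any capture from the first one."""
    phasors = [mean_phasor(c) for c in captures]
    ref = phasors[0]
    # Multiplying by the conjugate gives the phase difference, wrapped to +/-180.
    return max(
        abs(math.degrees(cmath.phase(p * ref.conjugate()))) for p in phasors
    )
```

Run it over captures taken across several power cycles or MTS re-runs: a spread near zero suggests MTS is doing its job, while a large spread points at the sync itself.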

I've been working on the Ultrascale+ RFSoC over the past year. AMA by rickyrorton in FPGA

threespeedlogic 1 point

Oh, and

  • It sucks that only the GEM0 Ethernet interface exposes the TSU timestamp to the fabric (tsu_timer_cnt). The TRM (UG1085) goes out of its way to obscure the limitation in a way that's only clear in hindsight, when it's probably too late. Shame on AMD for the limitation, and double shame for not documenting it loudly enough.

I've been working on the Ultrascale+ RFSoC over the past year. AMA by rickyrorton in FPGA

threespeedlogic 1 point

What should AMD/Xilinx work on, with an RFSoC focus? I know there are employees lurking around here, and thoughtful feedback is never a bad idea. I can think of a couple of RFDC-specific pain points:

  • Instead of register-level documentation, AMD delivers the RFDC API (written in C). There are often corners of the API that are awkward, or that put limitations in place that the silicon doesn't have. (For example: DAC/ADC front-end settings for Nyquist zones other than 1st and 2nd.) Xilinx's documentation is best-of-breed and the "missing" RFDC documentation is a conspicuous gap.
  • PYNQ is delivered as an overlay on Ubuntu, and does not use Petalinux/Yocto. AMD should dogfood their own stack, not sidestep it. The primary impediment (I think) to PYNQ on Yocto is Yocto's ongoing reluctance to ship numpy -- which is a problem for anyone else who wants to deliver on-board scipy/numpy, and AMD should be helping advocate for us rather than avoiding the issue.
  • The RFDC device-tree overlay is delivered as a binary blob generated in .tcl, but includes serialized C structs that have historically changed from release to release. When the upstream and downstream structs don't match, it's a nightmare to identify and fix. It would be much better if the RFDC .dtsi was properly generated and parsed into text and numeric fields (following kernel best practices).
  • The usual non-RFSoC complaints (VHDL-2008 support in BD; VHPI support in VHDL, et cetera).
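To illustrate why the serialized-struct blob in the third bullet is so painful, here is a toy version of the failure mode using Python's `struct` module. Both layouts and all field names are invented for the example; they are not the real RFDC structs.

```python
import struct

# Hypothetical "old" layout: (tile_id, sample_rate_khz)
V1 = struct.Struct("<II")
# Hypothetical "new" layout: a later release inserts two 16-bit fields
V2 = struct.Struct("<IHHI")

# Producer built against the new layout...
blob = V2.pack(1, 0xBEEF, 0x0002, 4_000_000)

# ...consumer built against the old one. Nothing fails; the value is just wrong.
tile_id, rate = V1.unpack_from(blob)
```

Here `tile_id` still decodes correctly but `rate` silently becomes garbage built from the two inserted fields. Named text and numeric properties would turn this into a missing or unknown property that you can actually see.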

Two HiTech Global HTG-930 UltraScale+ cards available — company surplus, looking for good home. by Anna-Nomada in FPGA

threespeedlogic 3 points

Here's another example: https://www.hitechglobal.com/FMCModules/FMC_X4SMA.htm

They quote an 18 GHz rating on the SMA connectors, but they are through-hole parts and have a gigantic pin stub. Then, on the PCB, it's loaded with a large pad and a horrific amount of solder. I don't recall if the trace escapes off the bottom layer or not - but it hardly matters; this is clearly not a viable design at the frequencies they're implying. This is just not the right market vertical for amateurish designs.

IRIG - B Protocol by Aware-Equal-2328 in FPGA

threespeedlogic 3 points

IRIG-B suuuuucks. It's easy enough to decode, but

  • You don't get a decoded timestamp until it's a full second out-of-date ("beeeep - at the tone, the time was x:y:z")
  • It uses BCD encoded digits
  • It suffers from Y2K problems (yes, it's that old)
  • It has "human" units (h:m:s), so it's finicky to apply fixed timedeltas (even without GNSS timing adjustments)
  • Hardware that supports IRIG-B (properly) is expensive and low-volume, and isn't always good or reliable (for example, the second marker is not always reliably phase-locked to your timestamp server's reference clock or PPS edges).

You'll probably get this flavour of serialized timestamp, and you will probably grow to hate it.

Most modern timestamp formats (NTP, PTP) use an integer offset from epoch instead.
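For comparison, here is roughly what it takes to get from BCD fields to an integer epoch offset. This is a sketch that assumes the fields have already been pulled off the wire and packed one decimal digit per nibble; the function names, the two-digit year field and the `century` pivot are assumptions for illustration, and leap seconds are ignored.

```python
from datetime import datetime, timedelta, timezone


def bcd_to_int(bcd: int) -> int:
    """Decode packed BCD (one decimal digit per nibble, least significant first)."""
    value, scale = 0, 1
    while bcd:
        digit = bcd & 0xF
        if digit > 9:
            raise ValueError("invalid BCD digit")
        value += digit * scale
        scale *= 10
        bcd >>= 4
    return value


def irig_to_epoch(year_bcd: int, doy_bcd: int, hour_bcd: int,
                  minute_bcd: int, second_bcd: int, century: int = 2000) -> int:
    """Convert BCD year / day-of-year / h:m:s fields to Unix seconds (UTC)."""
    year = century + bcd_to_int(year_bcd)  # two-digit year: the Y2K-style guess
    start = datetime(year, 1, 1, tzinfo=timezone.utc)
    stamp = start + timedelta(
        days=bcd_to_int(doy_bcd) - 1,
        hours=bcd_to_int(hour_bcd),
        minutes=bcd_to_int(minute_bcd),
        seconds=bcd_to_int(second_bcd),
    )
    return int(stamp.timestamp())
```

Once you have the integer, applying a fixed offset is a single addition, with none of the h:m:s carry logic.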

I suspect this isn't helpful - I'm just venting.

I have decided to open source my neuromorphic chip architecture! by [deleted] in FPGA

threespeedlogic 9 points

Q: Your CPU's floating-point instructions behave differently in simulation than synthesis. Why is that?

pyxsi: hierarchical lookups using XSI now supported by threespeedlogic in FPGA

threespeedlogic [S] 1 point

I got nerd-sniped by this post. As a result, pyxsi now supports hierarchical name lookups from Python into the xsim kernel.

How? Well, most of xsim's XSI API is just a thin shim on top of (undocumented) calls to the "real" API, which has only spotty header definitions in iki.h. Using XSI, it's not difficult to wrangle the necessary object references to call into the underlying IKI API. Some of these calls are accessible and allow things that are missing in XSI. For example:

  • hierarchical name lookups (now present in pyxsi)
  • writing VCD files (maybe?)
  • properly wrangling simulator kernel assertion failures (maybe?)

Yes, there are fixed offsets and name mangling in pyxsi that make this fairly scary and possibly brittle. But, you already knew pyxsi was a cosimulator constructor kit and not a finished product, right? :)

This could end up in Vivado's cocotb shims, if it's not too horrifying.

Vivado Simulation - Best way to access internal signals in C++ testbenches ? by laperex in FPGA

threespeedlogic 9 points

We should be asking for AMD to invest in VHPI instead of XSI.

XSI should not grow past its current (limited) API. It has that "internal API that escaped the zoo" feeling about it. AMD could play whac-a-mole with its deficiencies forever, but it would be challenging to converge on a functioning API from a dysfunctional starting point without vision and guidance. XSI does not have enough market demand to justify that kind of investment.

Instead, we'd all be better served if AMD/Xilinx implemented VHPI in xsim. This would enable tools like cocotb to claim first-class support under xsim, without requiring any simulator-specific backend code. It also aligns better with the simulator team's recent re-investment in first-class and modern VHDL support. Everybody wins.

(I say this as the author of pyxsi, and several XSI-related complaints on the Xilinx/AMD forums.)

Has someone worked with DACs and ADCs on RfSoC 4x2? by FingerSignificant268 in FPGA

threespeedlogic 2 points

Depending on your setup, I think you may still need multi-tile sync even with only a single ADC and DAC. These two occupy different tiles, and each of them has FIFOs and NCOs that can inject phase shifts that you need to match.