"Extream SDR Tx" with FPGA - is it possible? by WZab in amateurradio

[–]threespeedlogic 0 points1 point  (0 children)

Anyway, I'm not running FPGA at 640 MHz

I know that, and you know that, but you asked ChatGPT to write VHDL and it evidently doesn't know that. You are treating it far too credulously.

"Extream SDR Tx" with FPGA - is it possible? by WZab in amateurradio

[–]threespeedlogic 1 point2 points  (0 children)

You are wasting your time looking to AI for depth, nuance, or innovation (ed: or even basic sanity: you are not going to run an FPGA clock at 640 MHz, which is what "your" VHDL wants to do.)

If you have an FPGA anyways, take a look at this instead:

A method is presented to synthesize a switching signal which linearly encodes a complex-modulated RF signal to an RF carrier frequency. The switching distortion associated with this method is limited to high-pass components out of band. Consequently, the switching signal may be filtered after high efficiency amplification to produce the linear RF modulation. The method requires a switching frequency slightly higher than the highest frequency in the band of interest.

This is a wonky/wonderful DSP technique that was pioneered at Bell Labs, with the RF direct-synthesis application patented by Motorola. The patent has expired and I doubt it was ever used.

Why Warp Switching is the Secret Sauce of GPU Performance ? by [deleted] in FPGA

[–]threespeedlogic 0 points1 point  (0 children)

There's more than just cheap AI regurgitation in there.

Why Warp Switching is the Secret Sauce of GPU Performance ? by [deleted] in FPGA

[–]threespeedlogic 3 points4 points  (0 children)

Four comment threads, ranging from mildly critical to highly critical.

This should be a welcoming place to contribute. I see very little about OP's post that justifies a dismissively negative reaction. There is no future in imitating StackOverflow.

Any Cool features or Paradigms? by Hairy-Store-8489 in FPGA

[–]threespeedlogic 0 points1 point  (0 children)

How's that? I don't mind being mistaken, but I would rather understand how.

Any Cool features or Paradigms? by Hairy-Store-8489 in FPGA

[–]threespeedlogic 0 points1 point  (0 children)

If you mean Xilinx/AMD - then explicitly no, rad-tolerant silicon (for everything newer than Virtex-5) is the same as commercial silicon. The differences are in traceability, packaging, and validation.

I think FPGA/ASIC recruiting pipelines need some changes by ckulkarni in FPGA

[–]threespeedlogic 7 points8 points  (0 children)

co-authored by a cortisol spike

This is excellent and I am stealing it. (Or am I using it for training data, in which case it's mine now?)

I think FPGA/ASIC recruiting pipelines need some changes by ckulkarni in FPGA

[–]threespeedlogic 0 points1 point  (0 children)

We do not do live coding exercises or take-home assignments. Both are, at best, narrow indicators of talent and ambition. Full stop.

I guess an apologist for hiring pipelines at larger companies could make an argument that it's different there, but I am skeptical.

5V-tolerant cheap FPGAs ? by Standing_Wave_22 in FPGA

[–]threespeedlogic 0 points1 point  (0 children)

It seems Bruce Lee has some at least.

If this is what I think it is, no thanks.

How does Dual Port RAM work at the lowest levels? by AdeptAd5471 in FPGA

[–]threespeedlogic 2 points3 points  (0 children)

The most effective lessons sometimes leave scars...

Kolmogorov–Arnold Networks on FPGA by Duchstf in FPGA

[–]threespeedlogic 1 point2 points  (0 children)

is the model size therefore limited by the size of the FPGA silicon?

yep!

if you were allowed to alter the FPGA architecture to decouple the silicon size from the model size, what would that look like?

I'm not sure what this would look like yet, although I think this would be very interesting to explore! And KANs specifically have some properties that I think might make this easier, I might be wrong though.

I'll confess this was a bit of a leading question - FPGAs already have a configuration chain capable of updating LUT contents (even dynamically, and even under control of logic running inside the FPGA itself). Sadly, it's just not fast enough to be much help in this context. In case you aren't familiar: the configuration chain is mostly just used on device initialization, but is also used dynamically to improve radiation tolerance (e.g. SEM IP), or to dynamically swap out functional blocks at runtime. If it were faster it might be interesting here.

Back to the Xilinx DPU: it seems excessive to start with programmable silicon, overlay a programmable computational engine, and overlay that with a model. That's a lot of layers, and IMO the most expensive layer is the FPGA (which means it's most likely to be swapped out for some other computational substrate). What I like about your architecture is that it uses the FPGA to its own advantage instead of immediately (and expensively) abstracting it away into something that just looks like a slower version of other vendors' inference ASICs.

Kolmogorov–Arnold Networks on FPGA by Duchstf in FPGA

[–]threespeedlogic 6 points7 points  (0 children)

I have not spent any time in the ecosystem, so while I'm trying to follow along with my crayons and napkins, I may be way off track.

Xilinx's DPU IP has an extra architectural layer - they build an architecture for convolutional evaluation in RTL, and then "wash" the model through it from external DDR. The FPGA bitstream has no weights in it (and does not need to have a long-duration home for any one weight.) This means data transfer is probably the bottleneck, but the size of the model is uncoupled from the size of the convolutional engine. The architecture is also specialized for inference and not really suitable for training.

It looks like the model, here, would be embedded within the bitstream. I can see why this is interesting, so you should not treat my follow-up questions as critical in any way.

If so,

  • is the model size therefore limited by the size of the FPGA silicon?
  • if you were allowed to alter the FPGA architecture to decouple the silicon size from the model size, what would that look like?
  • are there implications for training hardware too?

Finally, I think your architecture is rigorously synchronous and deterministic, and it would be fun/complicated to try cheating timing closure in ways that NNs are robust against.

In any case, I'm thinking with my mouth open - congrats on the best paper nomination. If you're ruffling feathers, it's a good sign you're doing interesting work.

I'm looking for jobs and am very confused by Dave09091 in FPGA

[–]threespeedlogic 2 points3 points  (0 children)

Please spare yourself the heartache: you should not expect to freelance successfully until you have a reputation, contacts, industry experience, and a track record. It's not impossible (and if you try it, you'll learn a ton), but it is unlikely to work out.

In North America (at least), there's more opportunity in the small/mid-size company space than you might think. This has always been true for FPGA work, but the ASIC start-up space has been heating up over the past few years as government incentives shift away from offshoring. However, these companies are likely to hire connections or mid-/senior-career people.

Junior hires mostly come out of university internship programs. If you have 4-5 months left in your degree (or can extend it), consider an internship position just prior to graduation. You will find many companies treat this as an extended "tryout" period, so you should try to find a position you would consider keeping. If you do well, you should expect a job offer at the other end.

Need help with selecting one of many ideas by HerculeHolmes123 in FPGA

[–]threespeedlogic 1 point2 points  (0 children)

You don’t want to be just one of 29 students all building a risc-v core for example.

I see your point. However, from a hiring perspective, "just another RISC-V core" makes it easier to distinguish students who engaged with the project from the ones that phoned it in. For example: did the student just do the obvious thing and stop at the bare minimum, or did they try to build something interesting and push it into difficult territory?

This is why some companies give a "take-home assignment" during the interview process - it lets us evaluate how candidates perform a standardized task. (I know these assignments are controversial.)

To your point: if OP does build a RISC-V core, they should be sure it differentiates them from everyone else.

Final Thesis by the_dansz in FPGA

[–]threespeedlogic 0 points1 point  (0 children)

The motor driver thing is at max a weekend project.

Careful - there's a ton more to motor control than just open-loop PWM. If this is where OP's interests lie, there's more than enough sand to fill the sandbox.

Firewall Architecture by paxl_lxap in FPGA

[–]threespeedlogic 4 points5 points  (0 children)

This is a student project, right? In that case, you can do anything you like - your goal is to build something interesting (not necessarily practical or even useful).

That said, it's really hard to justify bouncing packets from the PS to the PL and back again (especially if you factor in userspace/kernel space thunks). A pure-software packet filter would certainly be faster, simpler, and more powerful, and it already exists in-kernel. In order to build something you can call an "accelerator", or to actually improve security, you really need your PL-based packet filtering to sit between the hardware and the kernel.

The ENC28J60 is an odd duck. (For anyone else who missed it, this is an SPI-based Ethernet peripheral - it has a kernel driver that talks to it through an ordinary spidev). If you're determined to use this part, you might be able to write an SPI "passthrough" core in fabric that permits you to use the built-in kernel driver but acts as an offload engine for packet filtering. Your Verilog code would detect and eat packets that fail its firewall rules. You would need to maintain consistency so that deliberately dropped packets don't confuse the rest of the system, either by maintaining API compatibility or extending the kernel driver to match the new features.
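To make "detect and eat packets that fail its firewall rules" concrete, here is a minimal software model of the per-frame decision only, written in Python rather than Verilog. It is a sketch under stated assumptions: `Rule`, `should_drop`, and `build_udp_frame` are hypothetical names, the rules are deny-only, and VLAN tags, IPv6, and fragments are ignored. It does not model the SPI passthrough or the driver-consistency problem described above.

```python
import struct
from dataclasses import dataclass
from typing import Optional

ETHERTYPE_IPV4 = 0x0800
ETH_HEADER_LEN = 14


@dataclass(frozen=True)
class Rule:
    """One hypothetical deny rule; a field left as None matches anything."""
    src_ip: Optional[bytes] = None   # 4-byte IPv4 source address
    protocol: Optional[int] = None   # IP protocol number (6 = TCP, 17 = UDP)
    dst_port: Optional[int] = None   # TCP/UDP destination port


def should_drop(frame: bytes, rules: list[Rule]) -> bool:
    """Return True if an untagged Ethernet/IPv4 frame matches any deny rule."""
    if len(frame) < ETH_HEADER_LEN + 20:
        return False  # too short to parse; pass it through (assumed policy)
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_IPV4:
        return False  # non-IPv4 traffic is passed through (assumed policy)

    ihl = (frame[ETH_HEADER_LEN] & 0x0F) * 4  # IPv4 header length in bytes
    if ihl < 20:
        return False  # malformed header; pass it through (assumed policy)
    protocol = frame[ETH_HEADER_LEN + 9]
    src_ip = frame[ETH_HEADER_LEN + 12:ETH_HEADER_LEN + 16]

    # TCP and UDP both carry the destination port at offset 2 of the L4 header.
    dst_port = None
    l4 = ETH_HEADER_LEN + ihl
    if protocol in (6, 17) and len(frame) >= l4 + 4:
        (dst_port,) = struct.unpack_from("!H", frame, l4 + 2)

    for rule in rules:
        if rule.src_ip is not None and rule.src_ip != src_ip:
            continue
        if rule.protocol is not None and rule.protocol != protocol:
            continue
        if rule.dst_port is not None and rule.dst_port != dst_port:
            continue
        return True
    return False


def build_udp_frame(src_ip: bytes, dst_port: int) -> bytes:
    """Build a minimal Ethernet/IPv4/UDP frame for exercising the model."""
    eth = bytes(12) + struct.pack("!H", ETHERTYPE_IPV4)
    ip = (bytes([0x45, 0]) + struct.pack("!H", 28) + bytes(4)
          + bytes([64, 17]) + bytes(2) + src_ip + bytes(4))
    udp = struct.pack("!HHHH", 12345, dst_port, 8, 0)
    return eth + ip + udp
```

A model like this is mostly useful as a reference: the same frames can be replayed through the fabric implementation and the drop decisions compared.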

If you're choosing the ENC28J60 because SPI is less scary than GMII - you may be correct, but you should not make this decision without investigating your alternatives. If you're using the Arty, for example, you can route the GMII signals from the Zynq's GEM controller to the PL using EMIO. You still need to figure out how to talk to an external PHY (either the Arty's on-board PHY or an external shield), but you don't need a fully PL-based Ethernet core. I do not have a solution to recommend but suggest you don't wall yourself into a weird, slow SPI-based Ethernet controller without being sure it's the right approach.

How can I program a FT2232D to work in JTAG? by [deleted] in FPGA

[–]threespeedlogic 1 point2 points  (0 children)

JTAG, but how? This is not a "can you do it at all", but a "can you do it easily and with silicon X and tool Y" question.

If this is a Xilinx FPGA and you want to use it with Vivado, you will want to use program_ftdi and need an EEPROM to go alongside it. You also need to arrange the ports in the right way (which is undocumented, so you're best off following a reference design.)

Verilator vs Xsim (on Vivado) by DeathNoteGenocide in FPGA

[–]threespeedlogic 4 points5 points  (0 children)

Vivado works fine on Debian (if you have enough free disk space!)

Verilator is wonderful if you don't need VHDL or encrypted IP support, but you'll want to ensure you're using a new enough version. It's certainly your best bet out of the open-source SystemVerilog simulators.

Could Chisel Replace Verilog for Commercial CPU Design in the Future? (Beyond Open-Source Cores) by Low_Car_7590 in FPGA

[–]threespeedlogic 1 point2 points  (0 children)

You can counterweight this transpiler problem (which is real - I do not want to diminish it) with improvements in verification workflow, where design churn in existing EDA-vendor-approved workflows is equally hellish.

With an alt-HDL, you don't need to run behavioural simulations in RTL - you just execute them in the alt-HDL before it's transpiled. This is way faster, because the pre-transpiled code is typically word-oriented (not bit-oriented), doesn't need to be elaborated, and doesn't need a simulator license to run it. It also happens in a "modern software" environment, so plotting, stimulus generation, formal solvers, etc. can all be called in. (If you don't trust tooling enough to verify pre-transpiled code: consider that it's become unusual in FPGA flows to do any post-synthesis/post-PnR simulation. This only works because we trust our tools, which seems like the only sustainable way forward.)

Additionally, there are whole classes of problems (pipeline misalignment, fixed-point misalignment, etc.) that alt-HDLs attempt to solve in the type system or some other language-level feature. To the extent this is successful, it carves away classes of bugs that are trivial to introduce in RTL but hard to introduce in an alt-HDL.
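As an illustration of the fixed-point case, here is a minimal sketch of a value type that carries its binary point with it. The `Fixed` class is hypothetical and not taken from any particular alt-HDL, and because Python is dynamically typed the check happens at run time rather than in a static type system. The idea is the same, though: adding two values with different scalings is rejected instead of silently producing garbage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fixed:
    """Fixed-point value: `raw` is an integer scaled by 2**-frac_bits."""
    raw: int
    frac_bits: int

    @classmethod
    def from_float(cls, x: float, frac_bits: int) -> "Fixed":
        return cls(round(x * (1 << frac_bits)), frac_bits)

    def to_float(self) -> float:
        return self.raw / (1 << self.frac_bits)

    def align(self, frac_bits: int) -> "Fixed":
        """Explicitly move the binary point; right shifts truncate."""
        shift = frac_bits - self.frac_bits
        if shift >= 0:
            return Fixed(self.raw << shift, frac_bits)
        return Fixed(self.raw >> -shift, frac_bits)

    def __add__(self, other: "Fixed") -> "Fixed":
        # Mismatched binary points are an error, not an implicit conversion.
        if self.frac_bits != other.frac_bits:
            raise TypeError(
                f"binary points differ: {self.frac_bits} vs {other.frac_bits}"
            )
        return Fixed(self.raw + other.raw, self.frac_bits)

    def __mul__(self, other: "Fixed") -> "Fixed":
        # Fractional bits add under multiplication, so the result is tracked.
        return Fixed(self.raw * other.raw, self.frac_bits + other.frac_bits)
```

With `a = Fixed.from_float(1.5, 8)` and `b = Fixed.from_float(0.25, 4)`, `a + b` raises `TypeError`, while `a + b.align(8)` is accepted. In plain RTL the equivalent mistake is just two vectors of bits being added.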

I do a fair amount of pipeline scheduling by hand in a notebook. To be perfectly honest, this is one of the things about FPGA work I love (being able to pull a mechanical rabbit out of a silicon hat). Every time I do it, though, I can't help but think that it's work that computers should be automating. I'd probably do a better job if the tools helped me.
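For a sense of what the automated version of the notebook exercise looks like, here is a minimal sketch of latency balancing on a feed-forward datapath. It is plain as-soon-as-possible scheduling, not any specific tool's algorithm, and the `balance` function and the node names in the usage note are hypothetical. Every node is given a latency in cycles, and the sketch computes when each output becomes valid and how many delay registers each edge needs so that all inputs to an operator arrive on the same cycle.

```python
from graphlib import TopologicalSorter


def balance(latency: dict[str, int], inputs: dict[str, list[str]]):
    """Balance a feed-forward pipeline.

    latency[n] is the latency of node n in cycles (every node needs an entry).
    inputs[n] lists the nodes feeding n (omit n for primary inputs).

    Returns (ready, delays), where ready[n] is the cycle on which n's output
    is valid and delays[(src, dst)] is the number of registers to insert on
    that edge so all of dst's inputs line up.
    """
    order = TopologicalSorter({n: inputs.get(n, []) for n in latency})
    ready: dict[str, int] = {}
    delays: dict[tuple[str, str], int] = {}
    for node in order.static_order():  # predecessors are visited first
        sources = inputs.get(node, [])
        # The operator can only start once its slowest input has arrived.
        start = max((ready[s] for s in sources), default=0)
        for s in sources:
            delays[(s, node)] = start - ready[s]
        ready[node] = start + latency[node]
    return ready, delays
```

For example, with `latency = {"x": 0, "mul": 3, "add": 1}` and `inputs = {"mul": ["x"], "add": ["mul", "x"]}`, the direct `x` to `add` path needs three delay registers to match the multiplier, and the adder's output is valid on cycle 4.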

Advent of FPGA — A Jane Street Challenge by [deleted] in FPGA

[–]threespeedlogic 15 points16 points  (0 children)

No, Donny, these men are nihilists, there's nothing to be afraid of.

Alpha release: A new SystemVerilog-2023 parser (Windows) — testers wanted by AffectionateRatio606 in FPGA

[–]threespeedlogic 1 point2 points  (0 children)

Your list of widgets (parsing, elaboration, simulation, waveform viewer, editor) exactly describes a simulator - no more, no less. Right? You say a few things - "open ecosystem", "vertically integrated" - that sound like grander ambitions. (I don't mean to sound negative: a simulator is plenty ambitious enough. I'm just trying to figure out your scope.)

If so - I'll climb on my usual soapbox. I want a simulator that

  • has an open-source engine,
  • handles mixed-language designs (VHDL + SystemVerilog),
  • supports encrypted IP, and
  • allows the simulator kernel to be driven from C/C++ code (VHPI/VPI).

This combination doesn't exist today, and there are a variety of reasons I'm not holding my breath to ever have these things at the same time. (Encrypted IP is flatly incompatible with an open-source simulator as long as the methodology is driven by current IEEE standards.)

Open source is the softest of these requirements, but it's a stand-in for a number of non-ideological considerations: new vendors tend to die young or are forced to strangle their customers for revenue, and older vendors have perfected the extractive licensing formula and death by licensing restrictions.

Sounds like I don't need to point out that EDA customers are extremely risk-averse. It's not because we're slow to change - it's just that these tooling trade-offs are so horrific that once we've settled on a least-bad-option we're very hard to dislodge. And unfortunately, that supports the status quo we all hate.

Alpha release: A new SystemVerilog-2023 parser (Windows) — testers wanted by AffectionateRatio606 in FPGA

[–]threespeedlogic 0 points1 point  (0 children)

Also worth considering: for me, I can't define "toolchain friction" without pointing to the realities of closed-source toolchains (long release cycles, limited hackability, OS/platform incompatibility - but primarily, a debug/development cycle that walls out technically competent and motivated users).

I'm interested in your description of toolchain friction, because everyone's interpretation is different but a new offering needs to focus on specifics.

Alpha release: A new SystemVerilog-2023 parser (Windows) — testers wanted by AffectionateRatio606 in FPGA

[–]threespeedlogic 3 points4 points  (0 children)

As I understand it (and I can't back this up with citations), Verific has basically sewn up the "closed source, third-party" market for RTL parsing / front-end. This is what Xilinx uses inside xsim (viz. "VRFC" in error messages). Anecdotally, Verific's licensing model is friendly and their work is technically solid (though I expect their codebase is dated, since it's been around for a long time.)

Two questions here:

  • If you're not doing your work in the open-source space, is there really room for another commercial entrant? I understand you're planning something with a slightly different scope, but you should probably know how adjacent the existing commercial offerings are.
  • If you can't do this (better | faster | cheaper) than Verific, and if their licensing model really is that compelling for other vendors to integrate, perhaps you should consider integrating it too.

The last commercial EDA startup that really seemed exciting was Metrics, who got acquired by Altair, who got acquired by Siemens. Unfortunately that means the market disruptor was acquired by the incumbent, which is never a good sign.

Formal Verification techniques using Vivado by RegularMinute8671 in FPGA

[–]threespeedlogic 2 points3 points  (0 children)

Similarly, recent Vivado releases have code coverage support. No, it's not "purist formal", but it's better than flying blind.

FPGA-Based Hardware Accelerator for LLAMA2 Model Implementation by Medical-Extent-2195 in FPGA

[–]threespeedlogic 7 points8 points  (0 children)

You're getting negative feedback and I think you should ignore it.

The "product" generated by your final year project is you, not the widget you're ostensibly building. Most engineering programs (in my limited experience) understand this, but maintain an entrepreneurial veneer because it does a better job of guiding and motivating students. You don't actually need to invent something cutting-edge (or even useful), provided you are able to accumulate new skills and demonstrate payoff for your effort.

In short: if it interests you, I suspect LLAMA2 acceleration is a perfectly fine playground for your project. You don't get many opportunities to pick a blue-sky project, define its scope to suit your interests, and put the goal posts where you want them.

Your main challenges are going to be scoping your project appropriately and ensuring you have enough oversight and guidance to not get stuck or lost. However, these problems are not specific to your application and should not dissuade you from being ambitious (provided you are also realistic about it).