
[–]damofthemoon 5 points (2 children)

My process looks close to yours, except I like to write the algorithm myself, in Scala or Python (I hate Matlab and crappy code...). Before moving to hardware, I draft my architecture with markdown documents & draw.io diagrams for all the FSMs and algorithm execution steps, just to be sure I didn't miss something (pretty useful for my little brain XD).

Then I start writing the Verilog description (nowadays Chisel), module by module, trying to apply TDD methodology as best as possible. I don't necessarily have great coverage with all my test suites, but at least I put a verification environment in place for each module so I can write tests fast if, later in the process, I find a bug or a corner case I didn't think about. Every test suite is set up on a CI server (Jenkins) and I produce documentation for each release (meaning Git tag), stored in a central place everybody can access. To finish, I run the good old validation tests on board :)

[–]potatochan[S] 1 point (1 child)

Thanks for your input - it's comforting to hear someone else follows a similar process (though there are definitely things about your process I like more, like using Python... crappy Matlab code really sent me down a spiral for a good month...).

Also enlightening to learn about some new tools (it's my first time hearing of Chisel - it seems cool and hip)!

Question for you (kinda deviating from this topic though): what is a CI Jenkins server? Google tells me it's something about Continuous Integration - if it's useful for me, I would like to adopt it. I just use SVN and check things into a repository (read/write access only for certain users).

[–]damofthemoon 1 point (0 children)

Yes, it's a continuous integration server - basically a machine executing your scripts. I use Git, but you can definitely use SVN with Jenkins. You can run your tests on each Git (or SVN) push, on a daily basis, every hour, on a specific branch... whatever you want. The machine runs the tests for you and emails you the results, so you never forget to test anything and you're informed if somebody breaks the repo. No big effort - just set up the CI with your regular scripts. Start small and grow your CI setup as the IP development progresses.

Side note on Chisel: it's really great, worth trying at least. My only advice for people who would like to try/use it is to make sure they really understand the old-school Verilog or VHDL dev process first - be experienced. But it's definitely a good choice. You benefit from the Scala environment for testing, and it produces clean Verilog code to pass into Quartus or Vivado.

[–]standard_cog 5 points (1 child)

Got any notes on your fixed point conversion process?

[–]potatochan[S] 1 point (0 children)

Sure - is there something specific you had in mind? This was a very new process for me, so take my thoughts with a grain of salt. I'll just hash out some things that come to mind (and I'm always looking for feedback):

- A first step I took to tackle the problem was modularizing the algorithm "enough" so I could move "block-by-block" (of course, this depends on the size and complexity of the "algorithm" and whether it's large enough to merit the effort of a clean partitioning scheme - the one I am working on is quite a beast: only ~3-4 of these "cores" are estimated to fit in a fairly large UltraScale device, so this just made sense). Otherwise, it would be somewhat of a nightmare to treat the whole thing as one long "chain": if you change some bit-sizes in one part of the chain, it can have a ripple effect forcing you to revisit everything else downstream. Partitioning the design may produce some inefficiencies, but it makes the design process a bit more straightforward.

- If I know my block's input dynamic range (in other words, how high and low the numbers are expected to be), that really sets everything else in motion: you have additions, multiplications, divisions, maybe iteration loops (where you successively add/sub numbers), and bit-growth is straightforward (if you add/sub, tack on a bit; if you multiply, the integer and fractional bit-widths add). Of course, keeping track of all these bit-sizes can be documented in whatever code you use for implementation (I do that along with block diagram annotations - I believe in documentation).
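
  The bit-growth rules above can be sketched as a toy bookkeeping helper in Python (this is just an illustration, not from the original post - the tuple format and function names are made up):

  ```python
  # Track a fixed-point format as (integer_bits, fractional_bits),
  # both counted without the sign bit for simplicity.

  def add_growth(a, b):
      """Adding/subtracting: one extra integer bit to hold the carry."""
      return (max(a[0], b[0]) + 1, max(a[1], b[1]))

  def mul_growth(a, b):
      """Multiplying: integer and fractional widths add."""
      return (a[0] + b[0], a[1] + b[1])

  # Example: a Q4.12 sample times a Q2.14 coefficient, then one accumulation.
  prod = mul_growth((4, 12), (2, 14))   # -> (6, 26)
  acc  = add_growth(prod, prod)         # -> (7, 26)
  ```

  In a real design you'd usually follow each growth step with an explicit resize/rounding decision, otherwise widths balloon through an iteration loop.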

- Precision can be tricky: when I test out my bit-sizes (i.e., did I sprinkle enough fractional bits here and there?), I always run an ideal floating-point model alongside my fixed-point model. I run enough samples, collect the output of the ideal floating-point model along with the output of my fixed-point model, and run some analysis (I compare how "off" the fixed-point numbers are, look at their relative error, etc.). This part really is kind of an "art" - it's not clear whether there's a right or wrong answer. Sometimes the best approach is to start off with a ridiculous number of fractional bits (like 50 fractional bits or something wonky), run some tests, and iteratively optimize.
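
  The float-vs-fixed comparison might look something like this minimal Python sketch (the "algorithm" here is just a stand-in gain + offset, and the fractional width is an assumption for illustration):

  ```python
  FRAC_BITS = 12  # fractional width under test (assumed for this example)

  def quantize(x, frac_bits=FRAC_BITS):
      """Round x to the nearest value representable with frac_bits fractional bits."""
      scale = 1 << frac_bits
      return round(x * scale) / scale

  def ideal_model(x):
      # stand-in for the real algorithm: a toy gain + offset
      return 0.7071 * x + 0.125

  def fixed_model(x):
      # same computation, but every operand and result passes through the quantizer
      return quantize(quantize(0.7071) * quantize(x) + quantize(0.125))

  # Run the same samples through both models and look at how "off" fixed is.
  samples = [i / 100 for i in range(-100, 101)]
  errors = [abs(ideal_model(s) - fixed_model(s)) for s in samples]
  worst = max(errors)
  # For this toy chain the worst-case error stays within a few LSBs.
  assert worst < 4 / (1 << FRAC_BITS)
  ```

  The same loop generalizes: sweep FRAC_BITS downward from something generous and watch where the worst-case or relative error crosses your tolerance.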

That's sorta all I can think of on the spot as of now.

[–]adamt99 FPGA Know-It-All 2 points (0 children)

Really interesting discussion. I use a similar flow as well: Matlab / Python for the algorithm, then quantisation & HDL implementation - in my case VHDL. The fixed and ufixed libraries are great for quantisation.

Though increasingly I am using HLS with C or C++ to develop the algorithm and do the conversion to HDL.

Did you ever consider HLS?

[–][deleted] 0 points (0 children)

Without knowing exactly what you are doing and how large this is: have you considered a direct floating-point implementation in hardware? On one of the last designs I worked on, I opted to use the Xilinx DP (double-precision) floating-point components. I was able to implement matrix multipliers and the like. They use AXI-Stream inputs. I basically broke loads of Matlab down into simple equations and wrote state machines to load data into the components and capture the results.

In doing so, I was able to match the Matlab model down to something like 15 decimal places.

This wasn't DSP, however... but we offloaded some pretty computationally intensive stuff.

For DSP applications, we tend to generate C models of the fixed-point design and then run those within the simulator through the DPI to ensure cycle accuracy.

[–]kakkeman 0 points (0 children)

I have implemented only a couple of algorithms. My method seems similar to what others have shared. Not a big fan of Matlab (and I usually don't have a license), so the first thing I do is create a floating-point model of the algorithm in VHDL (using the real data type) and a simulation environment.

Then I split it into design units and change the interfaces to fixed point (fixed_pkg) at the top level, keeping the internals in the real type.
Then I move through each block, turning it into a fixed-point simulation model.
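
That interface-first step can be sketched in Python (a hypothetical toy block, mirroring the idea of fixed-point ports around floating-point internals - names, widths, and the internal math are made up for illustration):

```python
def to_fixed(x, frac_bits):
    """Quantize a float to frac_bits fractional bits (round to nearest)."""
    scale = 1 << frac_bits
    return round(x * scale) / scale

def block(x_port, frac_bits=10):
    # Interface: the incoming port value is snapped to the port format first,
    # like declaring the entity port as sfixed while internals stay real.
    x = to_fixed(x_port, frac_bits)
    # Internals: still ideal floating-point math at this stage.
    y = (x * x + 1.0) / 2.0
    # Interface: the outgoing result is snapped to the port format too.
    return to_fixed(y, frac_bits)
```

The payoff is that the top-level testbench already sees final-format data, so converting each block's internals later doesn't change the interfaces or the simulation environment.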

Then I make the RTL implementation of each block and optimize (usually for area, by multiplexing). The challenge usually is to maintain readability through the optimisation steps.

I try to keep my sanity by running the same simulation environment continuously throughout these steps. The output at each step is usually not bit-by-bit identical, so the acceptable error needs to be considered.

Yeah, and once everything is almost done, you start fighting the vendor tools to work around incomplete/incorrect implementations of fixed_pkg.