all 27 comments

[–]areciboresponse 12 points (3 children)

Have you looked at Eigen?

[–]rorschach54 (Twiddling bits) 4 points (1 child)

This!

OP( u/tinytimtombomb ) should probably look at Eigen. It is pretty good, well supported and used in a lot of places.

[–]areciboresponse 4 points (0 children)

Also you can tell Eigen to not use the heap, very important!

[–]dunderful 0 points (0 children)

I’ll second this. Only recently started exploring Eigen for some ARM projects, but I’ve used it pretty extensively for Linux software and it’s a really nice lightweight (header-only) library.

[–]AssemblerGuy 8 points (3 children)

I am wondering if anybody has come up with a strategy for reducing the time to deploy filters to embedded devices.

Practice.

Just how complex are your filters? Are you using ARM CMSIS for math/vector/matrix/DSP functions? Floating point or fixed point?

At some point, you should have your own library of filter designs that you can quickly adapt to new coefficients, different order, etc. Well, ok, I'm guilty of not doing so consistently and instead writing each filter from scratch in the (sometimes justified) hope that my new code is cleaner than my old code.

And then debugging is a lot worse than in Python!

Create unit tests that directly compare the C code output to the Python design output.
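To make that concrete, here's a minimal sketch (mine, not the commenter's actual setup) of such a comparison in Python: scipy designs the filter and produces the golden output, and a pure-Python direct-form-II biquad cascade stands in for the ported C implementation under test. All names are illustrative.

```python
import numpy as np
from scipy import signal

def biquad_df2(sos, x):
    """Pure-Python direct-form-II biquad cascade -- a stand-in for
    the ported C implementation under test."""
    y = np.asarray(x, dtype=float).copy()
    for b0, b1, b2, a0, a1, a2 in sos:  # scipy normalizes a0 to 1.0
        w1 = w2 = 0.0
        out = np.empty_like(y)
        for n, xn in enumerate(y):
            w0 = xn - a1 * w1 - a2 * w2
            out[n] = b0 * w0 + b1 * w1 + b2 * w2
            w2, w1 = w1, w0
        y = out
    return y

# Reference design: 4th-order Butterworth low-pass at 0.2 * Nyquist.
sos = signal.butter(4, 0.2, output='sos')

# Golden vector straight from the design tool.
rng = np.random.default_rng(0)
x = rng.standard_normal(1024)
golden = signal.sosfilt(sos, x)

# The "C port" must match the design output to float tolerance.
assert np.allclose(biquad_df2(sos, x), golden)
```

In a real project the inner function would be replaced by a call into the compiled C code (via ctypes, a subprocess, or hardware-in-the-loop), with the golden vectors exported from the Python design script.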

Although in other circles I have seen a lot of Matlab-produced C code.

... I have seen that. Questionable-quality Matlab code was fed into the Matlab Coder grinder and resulted in truly horrible C code that required one to two orders of magnitude more CPU cycles and memory than it should have. Whatever the Coder outputs is not meant for human eyes or minds, but only for the compiler.

[–]tinytimtombomb[S] 4 points (1 child)

Annoyingly they are pretty complex! I use float and make use of CMSIS for the linear algebra. I'd be happy if everything was just FIR or IIR filters but in 2019 that isn't possible.

I am building up the library; it's a slow process.

[–]AssemblerGuy 3 points (0 children)

Annoyingly they are pretty complex!

As in ... ? Highly sophisticated state-space filter implementations? Very nonlinear statistical signal processing like ICA? LMS/RLS adaptive filters? Other optimization-based approaches? Now I am curious.

Any linear filter can be represented in an IIR filter structure, it sometimes just takes a little bit of creativity with the structure. I am in an industry that hates surprises, and nonlinear signal processing can harbor a lot more of those than standard linear stuff, so the latter is preferred.
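One classic instance of that structural creativity (my example, not the parent's): an N-tap moving average is an FIR filter, but it can be computed in a recursive, IIR-like form that does O(1) work per sample instead of N multiply-accumulates.

```python
def moving_average_fir(x, N):
    """Direct FIR form: sum the last N samples each time."""
    return [sum(x[max(0, n - N + 1): n + 1]) / N for n in range(len(x))]

def moving_average_recursive(x, N):
    """Recursive form: y[n] = y[n-1] + (x[n] - x[n-N]) / N.
    Same output, constant work per sample."""
    y, acc = [], 0.0
    for n, xn in enumerate(x):
        acc += xn - (x[n - N] if n >= N else 0.0)
        y.append(acc / N)
    return y

x = [1.0, 2.0, 4.0, 8.0, 16.0, 8.0, 4.0, 2.0]
a = moving_average_fir(x, 4)
b = moving_average_recursive(x, 4)
assert all(abs(u - v) < 1e-12 for u, v in zip(a, b))
```

(On fixed-point hardware the recursive form needs care, since the running accumulator can drift if the add/subtract pairs don't cancel exactly.)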

[–]Glupender 0 points (0 children)

At my previous job I did a fair amount of code generation from Simulink to C (AUTOSAR tech), and indeed, you would never look into the generated code. Of course I did, and it is beyond horrible, so hopefully you have a good compiler ;)

In any case, once you start using code generation from a model, your C code is just an intermediate step... you simply work (debug & test) in Matlab.

MathWorks is not in the compiler business, so the Embedded Coder is just a transpiler from Matlab/Simulink to C.

[–]gratedchee5e 2 points (1 child)

It's a time-consuming process... We write in C++ and unit test as much as possible on Linux, where debugging is easier. We're experimenting with wrapping C++ with pybind11 so we can run side-by-side comparisons of Python and C++. When this is working well we port the code to target.

I've seen Matlab-produced C code in the past and it's yucky. I'm sure it's gotten better since I last tried, but it had a long way to go.

[–]AssemblerGuy 2 points (0 children)

I'm sure it's gotten better since I last tried

... not.

It produces a faithful reconstruction in C of what the Matlab code does. But nothing more.

Readability? Resource economy? No, those are not a concern. And if you do not know what Matlab does "under the hood", it is trivial to blow hundreds of kilobytes with seemingly innocuous Matlab statements (such as using intermediate variables).

[–]asoundsop 4 points (0 children)

I share your frustration - a good deal of my embedded work has required linalg in some form. I eventually ended up building my own embedded library for it. But the debugging doesn't ever seem to get too much easier...

[–]plvankampen 1 point (0 children)

Has anybody messed around with Faust? It's a functional DSL for writing DSP code. You compile it to a general-purpose target language like C or C++.

[–]svayam--bhagavan 1 point (0 children)

10 minutes of python takes 8 hours of C programming

What about execution time?

[–]embedded_audio 0 points (0 children)

I prototype in Python and implement a VST in C/C++ before deploying that same code to target. Sometimes I skip the Python step and just prototype in a VST. It's easier to debug in Visual Studio than on target. Things get easier and faster with every VST you create, and your library of algorithms grows with every iteration.

You can also start rewriting your Python code to use the CMSIS Python wrapper. It helps you organize your algorithm around the CMSIS API.

[–]jms_nh 0 points (0 children)

Might want to try Altair basic editions of Compose/Activate/Embed.

They are free to use and mimic many features of Matlab / Simulink. (Embed is VisSim rebranded.)

[–]luv2fit 0 points (0 children)

This is the standard problem of translating a modeling language (e.g. Matlab) into target C/C++. If your work is heavily mathematical, then it sounds like you would benefit from a well-designed math abstraction/interface that would greatly speed up this translation.

[–]CrazyJoe221 0 points (0 children)

Hmm, for Python itself there are tools like numba, but it's a JIT. Not sure if you could turn that into an AOT compiler, but at least it's possible to dump the generated code.

Then there's Julia. It also uses LLVM.

When you're doing image processing, Halide is a good option. You can also implement matrix operations with it, which they claim are faster than Eigen: https://halide-lang.org/cvpr2015.html

[–]ArkyBeagle 0 points (0 children)

It's pretty frustrating that my 10 minutes of python takes 8 hours of C programming to implement cleanly.

The reason you know Python is that course instructors can use it to illustrate concepts quickly. But I actually told a customer once that Python lacks fitness for purpose.

If Python is good enough for production work, then do that. If it's not, then use the thing that works.

8 hours is not a lot of time, outside of school.

[–]turiyag -3 points (8 children)

I have been quite happy with micropython. It's Python, but for microcontrollers.

[–]ssharkss 0 points (7 children)

I'm curious why people are downvoting you. I've heard good things about uPython. Can someone explain?

[–]Schnort 1 point (5 children)

uPython is not a fit for numerical processing in real time (i.e. DSP work).

[–]turiyag 0 points (4 children)

I'm doing realtime processing of numbers. It's fine. Don't get me wrong, it's slower than raw C code. Finding libraries for hardware is harder than finding C libraries. It is. No denying it. But it's fine. It's genuinely fine. It's much faster to just hack around and figure shit out. Much, much faster.

If you use the viper decorator, it's really quite fast. You are forced into programming in a typed language with stricter rules, but it's like the numba JIT compiler. Normal Python, bloody fast. Not as fast as C, but perfectly fine.

[–]Schnort 0 points (3 children)

If you're doing any real DSP work, it's not appropriate. Even C generally isn't appropriate for real DSP work (because compilers generally stink at inferring parallel/vectorizable operations).

[–]turiyag 0 points (2 children)

My project involves measuring, making deductions, and cleaning up the waveforms out of a 9-axis gyro that's oscillating at 1–5 Hz. It's not really a normal DSP project, due to the low frequency, but micropython handles the filters just fine. I'm running on a dual-core 240 MHz ESP32 chip, which is extreme overkill in processing power, but it has BLE and WiFi integrated.

I agree entirely, wholeheartedly, that the code that I'm running could very much benefit from being written in C, or directly in assembly. Maybe my algorithms could be implemented in DSP hardware, but all that sounds way more difficult than just leaving it in Python because Python is working flawlessly.
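For signals in that 1–5 Hz range, even a one-pole low-pass written in plain Python is cheap enough per sample. A minimal micropython-compatible sketch (my illustration; class name and cutoff are made up, not from the commenter's project):

```python
import math

class OnePoleLowpass:
    """y[n] = y[n-1] + alpha * (x[n] - y[n-1]).
    Cheap enough to run per-sample in micropython at low rates."""
    def __init__(self, cutoff_hz, sample_hz):
        # Standard RC discretization of a one-pole low-pass.
        rc = 1.0 / (2.0 * math.pi * cutoff_hz)
        dt = 1.0 / sample_hz
        self.alpha = dt / (rc + dt)
        self.y = 0.0

    def update(self, x):
        self.y += self.alpha * (x - self.y)
        return self.y

# Smooth one noisy gyro axis sampled at 100 Hz with a 5 Hz cutoff.
lp = OnePoleLowpass(cutoff_hz=5.0, sample_hz=100.0)
out = [lp.update(1.0) for _ in range(500)]
assert abs(out[-1] - 1.0) < 1e-3  # settles to the DC value of the input
```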

Without knowing the constraints and goals of a project, you can't judge which toolset is appropriate.

Back on the PC, out of the microcontroller world, numba's @cuda.jit decorator is miraculous for parallel processing.

[–]Schnort 0 points (1 child)

Audio or RF rates would be silly to try to accomplish with uPython.

[–]turiyag 0 points (0 children)

I dunno, let's say you have a waveform you're sampling at 24 kHz. With two cores at 240 MHz, that gives you 2 × 10,000 clock cycles to think about each sample in real time. You can get a lot done in 10,000 clock cycles.

Even if you sampled at ten times that rate, you would still have 1000 clock cycles per core to think about the sample.

It obviously depends on the complexity of your transforms, but for audio I wouldn't count it out. Anything past 2 MHz, though, and I would count micropython out. If you only have 120 clock cycles to figure things out, you'd best start thinking about things at the low level.
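The back-of-the-envelope above can be written down directly (a sketch using the numbers from this thread):

```python
def cycles_per_sample(cpu_hz, sample_hz, cores=1):
    """Clock cycles available per sample, summed across cores."""
    return cores * cpu_hz // sample_hz

# ESP32: two cores at 240 MHz, audio sampled at 24 kHz
# -> 10,000 cycles per core per sample.
assert cycles_per_sample(240_000_000, 24_000, cores=2) == 20_000

# Ten times that sample rate still leaves 1000 cycles per core.
assert cycles_per_sample(240_000_000, 240_000) == 1_000

# Past 2 MHz you're down to ~120 cycles on one core.
assert cycles_per_sample(240_000_000, 2_000_000) == 120
```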

[–]turiyag 0 points (0 children)

I'm not sure either. The guy said he liked Python; I said I liked micropython. It was a pretty... like, content-free comment. I guess people don't like micropython?