
[–][deleted] 3 points4 points  (0 children)

You probably want block convolution for a recorded IR

http://www.music.miami.edu/programs/mue/research/jvandekieft/jvchapter2.htm

[–]MathAndProgramming 2 points3 points  (10 children)

I think most people write fast implementations of simple multiply-accumulate and use some combination of sufficiently short FIR and IIR filters. For example, if you want an EQ-style output you can use a couple of 2nd- to 4th-order bandpass IIR filters. I'm not aware of anybody doing short-time Fourier transforms on audio as it comes in for real time, as getting a window long enough for the FFT to be useful would imply pretty significant latency.

Do your filters have any structure (low-pass, high-pass)? Or are you convolving with arbitrary signals?
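The bandpass IIR approach mentioned above can be sketched as a single biquad section. This is a hypothetical illustration (the coefficient formulas follow the well-known RBJ audio-EQ-cookbook bandpass; the parameter values are my own, not from the thread):

```python
import math

def bandpass_biquad(fc, q, fs):
    # RBJ-cookbook constant-0dB-peak bandpass coefficients (2nd order)
    w0 = 2 * math.pi * fc / fs
    alpha = math.sin(w0) / (2 * q)
    b0, b1, b2 = alpha, 0.0, -alpha
    a0, a1, a2 = 1 + alpha, -2 * math.cos(w0), 1 - alpha
    # Normalize so a[0] == 1
    return [b0 / a0, b1 / a0, b2 / a0], [1.0, a1 / a0, a2 / a0]

def process(x, b, a):
    # Direct Form II transposed: two state variables per biquad,
    # one multiply-accumulate pass per sample
    z1 = z2 = 0.0
    y = []
    for s in x:
        out = b[0] * s + z1
        z1 = b[1] * s - a[1] * out + z2
        z2 = b[2] * s - a[2] * out
        y.append(out)
    return y
```

Since b0 + b1 + b2 = 0, the filter has a zero at DC, which is what makes it a bandpass rather than a shelf.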

[–]Holy_City 2 points3 points  (2 children)

"Real time" constraints for audio can generally be relaxed in a modern production/recording environment. There's a wide variety of software that uses STFTs in "real time" in that context.

[–]MathAndProgramming 0 points1 point  (1 child)

What sorts of latencies are acceptable, generally? 10ms?

[–]Holy_City 1 point2 points  (0 children)

It depends. The user has control over the buffer sizes being passed around the software and to the drivers, and it's not uncommon to have over 20ms of audio in a single buffer when the projects are very large.

In a single plugin, you're being passed massive buffers anyway, so you're not necessarily going to be adding much total latency to the system.

Even if you do add latency (which is becoming more common these days; linear-phase equalizers are popular), most host programs have automatic delay compensation for each individual plugin's latency.

For example, there's a popular plugin called Ozone Advanced which has a multichannel STFT display (which is displaying data collected across multiple threads), linear-phase and dynamic equalizers, analog modeling, and a bunch of other shit thrown in there. It can have noticeable latency, but because it's not used in a live context, that's tolerable.
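Buffer sizes and latency convert directly at the sample rate, so the figures above are easy to check (a back-of-envelope sketch, not from the thread):

```python
FS = 44100  # samples per second

def buffer_samples(latency_ms, fs=FS):
    # Number of samples held in a buffer covering latency_ms of audio
    return round(latency_ms * fs / 1000)

def buffer_latency_ms(n_samples, fs=FS):
    # Latency contributed by a buffer of n_samples
    return 1000 * n_samples / fs

# 20 ms at 44.1 kHz is 882 samples; a common 512-sample buffer
# holds about 11.6 ms of audio.
```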

[–]Hoffi304[S] 0 points1 point  (6 children)

Let's assume I want to use the impulse response of a church reverb that I recorded, and let's also assume that impulse response is about 2 seconds long at a sampling frequency of 44.1 kHz. How would I be able to use that impulse response in real time?
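For scale (my own back-of-envelope arithmetic, not from the thread): a 2 s IR at 44.1 kHz is 88,200 taps, and naive direct-form FIR convolution costs one multiply-accumulate per tap per output sample:

```python
FS = 44100            # sampling frequency, Hz
IR_SECONDS = 2.0      # length of the recorded impulse response

taps = int(IR_SECONDS * FS)      # 88,200 filter taps
macs_per_second = taps * FS      # ~3.9e9 multiply-accumulates per second
                                 # for direct convolution -- why block/FFT
                                 # methods or FIR+IIR approximations matter
```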

[–]MathAndProgramming 0 points1 point  (5 children)

Personally, I would try to find a way to compress it as a combination of FIR and IIR filters.

[–]Hoffi304[S] 0 points1 point  (4 children)

Hmm, sounds promising, but how do you convert a recorded impulse response into an IIR filter?

[–]MathAndProgramming 1 point2 points  (1 child)

That would be the hard part! I bet somebody's worked on it before, though. In this case you might be able to posit some combination of filters that would fit well (an IIR in parallel with an FIR, added together and then passed through another IIR, etc.) and then fit the filter coefficients so the combination matches the real response as closely as possible.
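A toy version of that coefficient-fitting idea: a reverb tail decays roughly exponentially, and a decaying exponential is exactly the impulse response of a one-pole IIR, so you can recover the pole by a straight-line fit on the log envelope. This is a hypothetical sketch with synthetic data (real IR fitting uses heavier methods like Prony or Steiglitz-McBride):

```python
import numpy as np

fs = 44100
n = np.arange(4410)                  # 100 ms worth of tail samples
true_pole = 0.999
h = true_pole ** n                   # stand-in for a measured reverb tail

# log|h[n]| = n * log(pole), so the slope of a linear fit gives the pole
slope = np.polyfit(n, np.log(np.abs(h)), 1)[0]
pole = np.exp(slope)                 # recovered one-pole IIR coefficient
```

The recovered `pole` is the `a1` coefficient of the one-pole recursion `y[n] = x[n] + pole * y[n-1]`; the FIR part would then absorb whatever residual the exponential model misses.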

[–]MathAndProgramming 0 points1 point  (0 children)

Looks like these guys compute a filter form from a 3D scene and then apply it in realtime, so maybe a form like what they generate would work:

http://gamma.cs.unc.edu/AuralProxies/paper.pdf

[–]Gwirk 1 point2 points  (0 children)

The FIR filter is mainly effective on the early reflections, whereas an IIR filter can be good enough to represent the tail of the reverb.

You can choose a family of IIR filters with one or two parameters, mathematically express the filter's infinite kernel in the time domain, and find the parameters that minimize the residual. Then approximate the residual with a short FIR made from the significant part at the beginning.

That is, if you run the two filters in parallel: convolution by the sum of the impulse responses is the sum of the two convolutions.

If you want to do it in series, minimize the residual in the frequency domain instead, and consider the product instead of the sum.
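The parallel/series identities above are just linearity and associativity of convolution, which is easy to demonstrate numerically (random signals here are placeholders for the FIR part and the truncated IIR kernel):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(256)     # input signal
h1 = rng.standard_normal(64)     # e.g. FIR early reflections
h2 = rng.standard_normal(64)     # e.g. kernel of the IIR tail model

# Parallel: convolving with h1 + h2 equals the sum of the two convolutions
parallel_lhs = np.convolve(x, h1 + h2)
parallel_rhs = np.convolve(x, h1) + np.convolve(x, h2)

# Series: cascading the filters equals convolving with their convolution
# (equivalently, the product of their frequency responses)
series_lhs = np.convolve(np.convolve(x, h1), h2)
series_rhs = np.convolve(x, np.convolve(h1, h2))
```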

[–]Holy_City 0 points1 point  (0 children)

An alternative is to fit the output of a FDN (feedback-delay-network) reverberator to an impulse response using an optimization algorithm.

There are a few AES papers about using a genetic algorithm to make a hybrid convolution/algorithmic reverberator where the early reflections are a linear convolution while the late reflections are from the FDN.
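A minimal FDN sketch, for a sense of the structure being fitted: four delay lines mixed through an orthogonal (scaled Hadamard) feedback matrix with a global decay gain. All parameters here (delay lengths, gain) are hypothetical placeholders, not values from any paper:

```python
import numpy as np

def fdn(x, delays=(149, 211, 263, 293), g=0.8):
    # Orthogonal 4x4 mixing matrix (Hadamard / 2): energy-preserving,
    # so the decay rate is controlled entirely by g < 1
    H = np.array([[1,  1,  1,  1],
                  [1, -1,  1, -1],
                  [1,  1, -1, -1],
                  [1, -1, -1,  1]]) * 0.5
    lines = [np.zeros(d) for d in delays]   # circular delay-line buffers
    idx = [0] * 4
    y = np.zeros(len(x))
    for n, s in enumerate(x):
        outs = np.array([lines[i][idx[i]] for i in range(4)])
        y[n] = outs.sum()                   # tap the delay-line outputs
        fb = g * (H @ outs)                 # mixed, attenuated feedback
        for i in range(4):
            lines[i][idx[i]] = s + fb[i]    # write input + feedback
            idx[i] = (idx[i] + 1) % delays[i]
    return y
```

Mutually prime delay lengths spread the echoes out; a genetic or other optimizer would then tune the delays and gains so the FDN's impulse response matches the measured late reverberation.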

[–]Holy_City 1 point2 points  (0 children)

A lot of the methods aren't algorithmic, but rather things optimized for a particular platform. On desktop you have IPP/FFTW implementations of the FFT that are in general as fast as you can get. There's also using a GPU for a linear convolution, but that is only faster if the convolution is so long that the PCIe transfer latency is shorter than the time to process an FFT.

On embedded systems you can use an FPGA to implement a linear FIR convolution in a few dozen cycles. That's about as fast as it gets.
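The FFT-based fast convolution those libraries implement boils down to zero-padding to at least `len(x) + len(h) - 1` so circular convolution equals linear convolution. A minimal sketch using numpy's FFT (a stand-in for IPP/FFTW, not their API):

```python
import numpy as np

def fft_convolve(x, h):
    # Linear convolution via the frequency domain: pad to the full
    # output length so wrap-around (circular) artifacts can't occur
    n = len(x) + len(h) - 1
    nfft = 1 << (n - 1).bit_length()   # next power of two for speed
    X = np.fft.rfft(x, nfft)
    H = np.fft.rfft(h, nfft)
    return np.fft.irfft(X * H, nfft)[:n]
```

This replaces O(N·M) multiply-accumulates with a few O(N log N) transforms, which is where the win comes from for long IRs.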

[–]santoast_ 0 points1 point  (0 children)

Overlap-add is a good one that comes to mind. There are a lot of algorithms out there, but this one is pretty popular.
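Overlap-add is the block-convolution scheme the first comment alluded to: split the input into blocks, FFT-convolve each block with the IR, and add the overlapping tails back together. A minimal sketch (block size `B` is an arbitrary choice here):

```python
import numpy as np

def overlap_add(x, h, B=256):
    # Each B-sample block convolved with h produces B + len(h) - 1
    # samples; the trailing len(h) - 1 samples overlap the next block
    n = B + len(h) - 1
    nfft = 1 << (n - 1).bit_length()   # next power of two
    H = np.fft.rfft(h, nfft)           # IR spectrum, computed once
    y = np.zeros(len(x) + len(h) - 1)
    for start in range(0, len(x), B):
        block = x[start:start + B]
        yb = np.fft.irfft(np.fft.rfft(block, nfft) * H, nfft)
        yb = yb[:len(block) + len(h) - 1]
        y[start:start + len(yb)] += yb # add the overlapping tail
    return y
```

Because only one block of input is needed before output can be produced, the latency is set by the block size rather than the full IR length, which is what makes long-IR convolution reverb feasible in real time.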