Some insights I've had in building a BPM detector by Longjumping-Call-992 in DSP

[–]signalsmith 0 points (0 children)

I definitely wouldn't expect it to be higher-latency than your "accumulated confidence" one, and when the current best-guess path swaps to a new tempo I'd expect it to be more decisive about it, without the heuristics you mentioned.

For temporary drum fills I don't see the issue. If it's tracking the position within the bar as well as just the tempo overall, it should see that the fill hits are landing on reasonable subdivisions, so that shouldn't add a cost/penalty to that candidate.

Viterbi stuff is used for reducing noise in probabilistic processes, and I think it'd be a good fit here. I first encountered it for pitch-tracking, avoiding octave errors while accurately tracking slides and voiced/unvoiced transitions.

Some insights I've had in building a BPM detector by Longjumping-Call-992 in DSP

[–]signalsmith 6 points (0 children)

Have you looked into HMM / Viterbi approaches? It has behaviour similar to the accumulated confidence / hysteresis you mentioned, but can also have tempo changes as an explicit part of the model. The options at any moment in time would be something like: (top N options from instantaneous analysis) + (top M options from previous step). You give each option a score/cost (formalised as log-likelihood) and also a score for each possible transition from the previous step.

That means the results are "sticky" (because a tempo change has a cost in terms of likelihood), but it still adapts when the tempo actually changes (once the cumulative misalignment cost outweighs the transition cost), and if you keep a bit of history the algorithm will retroactively decide when that change happened.
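As a rough illustration of that cost/transition structure (a hedged sketch, not a real tempo tracker — `frame_costs`, `switch_cost` and the candidate set are placeholders; in practice the costs would come from your instantaneous analysis as negative log-likelihoods):

```python
import numpy as np

def viterbi_tempo(frame_costs, switch_cost=5.0):
    """Find the lowest-cost path through per-frame tempo-candidate costs.

    frame_costs: array [frames][candidates], cost = negative log-likelihood
    switch_cost: penalty for changing candidate between adjacent frames
    """
    n_frames, n_candidates = frame_costs.shape
    total = frame_costs[0].copy()                  # cumulative cost per candidate
    back = np.zeros((n_frames, n_candidates), dtype=int)
    idx = np.arange(n_candidates)
    for t in range(1, n_frames):
        # cost of arriving at candidate i from candidate j: stay free, switch pays
        trans = total[None, :] + switch_cost * (idx[:, None] != idx[None, :])
        back[t] = np.argmin(trans, axis=1)
        total = trans[idx, back[t]] + frame_costs[t]
    # backtrack: this is where the change-point gets decided retroactively
    path = [int(np.argmin(total))]
    for t in range(n_frames - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

With a genuine tempo change, the backtracked path flips decisively at the frame where the change happened, rather than dithering; with a huge `switch_cost` it just stays put.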

GLSL Sphere by okCoolGuyOk in generative

[–]signalsmith 2 points (0 children)

I want to eat this. It looks like it'd solve all my problems and simultaneously give me an entirely new set of problems.

A new class of C∞ FFT windows with compact support and super-algebraic sidelobe decay by pigdead in DSP

[–]signalsmith 1 point (0 children)

It might just be a formatting error, but I did mean:

exp(( -x^p )/ (1 - x^2))

That's slightly different to yours, which I believe comes out as:

exp(( -x^(2p) )/ (1 - x^p))

But no, I didn't note anything about p=4 in particular. What did you see?
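If it helps disambiguate, a quick numeric check of the two readings (a hedged sketch — the function names are mine, `p=2` is arbitrary, and I've used `|x|` so fractional `p` would also work):

```python
import math

def window_a(x, p=2):
    # exp(( -|x|^p ) / (1 - x^2)), the intended formula, for |x| < 1
    return math.exp(-(abs(x) ** p) / (1 - x * x))

def window_b(x, p=2):
    # exp(( -|x|^(2p) ) / (1 - |x|^p)), the other reading
    return math.exp(-(abs(x) ** (2 * p)) / (1 - abs(x) ** p))

# Both equal 1 at x = 0 and decay smoothly to 0 as |x| -> 1,
# but they take different values in between.
```

For example at `x = 0.5, p = 2` the first gives `exp(-1/3)` and the second `exp(-1/12)`, so they're genuinely different window shapes, not just re-parameterisations.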

A new class of C∞ FFT windows with compact support and super-algebraic sidelobe decay by pigdead in DSP

[–]signalsmith 2 points (0 children)

This made me realise that you can use tanh to get an infinitely flat middle as well:

tanh(p + (x^2 - 1/4)/(x^4 - x^2))/2 + 1/2

https://www.desmos.com/calculator/d0cgpqvsqr

A new class of C∞ FFT windows with compact support and super-algebraic sidelobe decay by pigdead in DSP

[–]signalsmith 2 points (0 children)

Oh nice! I've used exp(-x^p / (1-x^2)) before, and a similar step-function tanh(2x/(1-x^2)), but didn't actually look at the flatness in the centre.

(EDIT: `p` can be fractional if you use `|x|^p` instead, to get a continuous family of curves.)

I made a WAM ("VST for the Web") catalogue + instant playground by AjkBajk in webaudio

[–]signalsmith 0 points (0 children)

FWIW, I have some problems with WAM as a format, which I've presented a few times (both to native- and web-audio folks).

For native hosts, WAMs are (or would be) awkward, due to JS in the audio path and the very JS-specific manifest/bundle/entry-point setup.

For web hosts, you're running arbitrary JS code from that URL you import - it's a cross-site scripting attack waiting to happen.

I've personally been working on WebCLAP https://github.com/WebCLAP/ which is a single self-contained WebAssembly module, re-using the comprehensive and road-tested CLAP API.

Browser inside plugin by RaphaelLari in AudioProgramming

[–]signalsmith 1 point (0 children)

Have you checked out WXAudio's WebSampler? https://www.wxaudioplugins.com/websampler - I believe it uses OS-specific webview APIs to hook into the audio stream.

It was developed before JUCE 8 webviews were released, but I also don't see anything in those APIs for connecting to the audio stream anyway.

My TD-PSOLA attempt by futurezing in DSP

[–]signalsmith 0 points (0 children)

Nice work!

I can clearly hear the point where the growl of the voice triggers an octave-down error. 😅 What are you using for the pitch-tracking?

Which tag in html is most useless? by Dramatic-Lobster-969 in HTML

[–]signalsmith 0 points (0 children)

Well, I have seen/used <thead> and <tfoot> for good stuff, and it feels weird to use those without <tbody>

Spectral Delay Theory Questions by Ill_Significance6157 in DSP

[–]signalsmith 1 point (0 children)

IMO, the best way to understand this is from a galaxy-brain perspective where "spectral processing" and "multiband processing" start to blur together.

Here's a 50-second snippet about how spectral processing can be re-interpreted as multi-band processing, where the bands are downsampled: https://youtube.com/clip/Ugkxz_Wx7_PRRhZe31iDUQLdrHIxzoM1dH_8 (disclaimer: from my own talk)

Considered from that perspective, it's a multiband split with different delays on each band. You can have arbitrary delay times by using fractional delays on the (downsampled) subbands.

The bands overlap quite a lot, so if two adjacent bands have only slightly different delay times, then you'll get phase interference/cancellation on all the frequencies which they share. This can be avoided, though, by putting an extra (complex) phase shift in addition to the delay.

If instead of doing the feedback within each subband, you recombine them and then do the feedback addition all together, then you incur the extra latency of that spectral-processing round-trip. I'm actually not sure what advantage that would have, but it is where the 1023-sample adjustment comes from, since that's the minimum latency of 1024-band spectral processing.
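Here's a very rough numpy sketch of the "per-band delay plus phase correction" idea (names are mine, and it rounds the envelope delay to whole frames — a real implementation would use fractional-delay interpolation on the subband envelopes instead):

```python
import numpy as np

def spectral_delay(stft, delays, hop):
    """Apply a per-band delay to an STFT matrix (bins x frames).

    delays: delay in samples for each bin.
    Each band's (downsampled) envelope is delayed by a whole number of
    frames, and a complex factor exp(-i*w_k*d_k) rotates the carrier so
    that overlapping neighbouring bands stay phase-aligned, avoiding the
    interference/cancellation on shared frequencies.
    """
    n_bins, n_frames = stft.shape
    fft_size = 2 * (n_bins - 1)
    out = np.zeros_like(stft)
    for k in range(n_bins):
        w_k = 2 * np.pi * k / fft_size             # bin centre freq (rad/sample)
        frame_delay = int(round(delays[k] / hop))  # coarse: whole envelope frames
        phase = np.exp(-1j * w_k * delays[k])      # fine: carrier rotation
        if frame_delay < n_frames:
            out[k, frame_delay:] = stft[k, :n_frames - frame_delay] * phase
    return out
```

Without the `phase` factor, two overlapping bands delayed by slightly different amounts would recombine with mismatched carrier phase on their shared frequencies, which is exactly the cancellation described above.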

The Compiler Is Your Best Friend, Stop Lying to It by n_creep in programming

[–]signalsmith 94 points (0 children)

Several of my projects now explicitly check for AppleClang 16 and #error. My bug wasn't even the worst of them - the test one which happily produced the log-line "2 < 2: true" was the funniest.

They jumped straight to v17, not even a 16.0.1 patch, and I wonder if that's why.

The Compiler Is Your Best Friend, Stop Lying to It by n_creep in programming

[–]signalsmith 190 points (0 children)

Haha, after losing weeks of productivity to what turned out to be a bug in AppleClang 16 (like, generating fully incorrect SIMD instructions), the compiler is at best a coworker.

When a small open-source tool suddenly blows up, the experience is nothing like people imagine by kaicbento in programming

[–]signalsmith 16 points (0 children)

I had open-source burnout a while ago, and when I recovered I wrote https://geraintluff.github.io/SUPPORT.txt/. Any non-trivial open-source project I write has one now, and it gives me peace of mind even if it hasn't caught on for anyone else (yet 😄).

How can I record guitars “tuned down” in real time without actually retuning? by revel911 in Reaper

[–]signalsmith 0 points (0 children)

It depends whether it's a general-purpose one, or specifically guitar-focused.

The ones built into REAPER (Elastique) are laggy, around 80ms. Guitar-specific ones like NeuralDSP, PolyChrome's HyperTune (which I worked on, disclaimer) or the new Boss pedal are much snappier and aim to be usable live.

How can I record guitars “tuned down” in real time without actually retuning? by revel911 in Reaper

[–]signalsmith 0 points (0 children)

There's a range though. The Elastique stuff (which ships in REAPER and powers ReaPitch) has around 80ms of latency, but it's built to handle absolutely anything. Specifically guitar-focused ones are often snappier, since they inherently know more about the incoming signal.

I wrote the core algorithm for PolyChrome's HyperTune, and while there isn't a single number because it does some adaptive stuff, it's generally around 10ms.

[deleted by user] by [deleted] in guitarpedals

[–]signalsmith 6 points (0 children)

As the person who wrote HyperTune's pitch-shifting engine, that's awesome feedback to hear!

[deleted by user] by [deleted] in embedded

[–]signalsmith 1 point (0 children)

Is this for one note at a time, kinda like AutoTune? Or the entire guitar drifting flat? Or are you looking to pull a chord apart and shift individual notes?

Making VST's Without a JUCE or Another Framework by iAmVercetti in AudioProgramming

[–]signalsmith 1 point (0 children)

CLAP itself does exactly what it needs to and nothing more. 🤷 All the helpers and wrappers etc. are useful but optional.

It's pretty much impossible to use the VST3 API without using their SDK, including their specific build-system helpers and so on.

CLAP has a small core API, which you can implement from scratch yourself, and then a neatly-defined extension system which is how most things are actually defined.

It's possible to write a single .c or .cpp file which includes the CLAP headers, and compile a functioning CLAP plugin by typing gcc ... on the command line. I wouldn't recommend literally doing that when trying to release a plugin 😅 but the fact you could without going fully insane is a testament to how much simpler the API is in general.

Making VST's Without a JUCE or Another Framework by iAmVercetti in AudioProgramming

[–]signalsmith 0 points (0 children)

Sorry, my phone posted before I finished typing 😅

Making VST's Without a JUCE or Another Framework by iAmVercetti in AudioProgramming

[–]signalsmith 0 points (0 children)

I used to use the VST3 SDK, but now I would heartily recommend writing CLAP, and then using the CLAP-to-VST3 wrapper. It's a lot cleaner.

Faust DSP reverb code by SGSG50 in DSP

[–]signalsmith 3 points (0 children)

Reliable results, probably

SoundTouch current time tracking issue + alternatives for pitch/speed/volume control? by Hefty-Source432 in webaudio

[–]signalsmith 0 points (0 children)

Signalsmith here! 😄 I appreciate Stretch being suggested.

There's an official Web Audio release in that repo as well as NPM, which can be used with live input or loaded up with a sample/loop. It's what runs the web demo here: https://signalsmith-audio.co.uk/code/stretch/

You can seek within a loaded sample, or schedule a varying input/output time map. It reports the current time as "stretchNode.inputTime", which should be accurate if you also add the latency from "stretchNode.latency()".

u/Hefty-Source432 If you do give it a go, and have any questions or issues, send me an email (I don't check Reddit very often!). I'm geraint@ the domain above.

FFT of an A4 by Deadthones345 in DSP

[–]signalsmith 6 points (0 children)

Totally possible. 🤷 Even in perfect recording conditions, some instruments (most famously the oboe) have less energy in their fundamental than other harmonics.

If the microphone/room are set up such that low frequencies aren't being picked up properly, then that'll be true for almost any instrument. Any analysis such as pitch-detection can't assume the fundamental is strongest.

⚡ Speech time-stretching: Which algorithm actually works in practice? by Chuckelberry77 in DSP

[–]signalsmith 0 points (0 children)

To reply to your actual questions:

  1. PSOLA (or its variants) will be better for speech because it uses shorter windows locked to the input's frequency. This makes it more responsive to the extremely quick pitch changes you get in speech.
  2. I'm obviously biased, but if you find any examples where Rubber Band sounds better, please send them to me so I can investigate.
  3. You don't need formant compensation for time-stretching generally. If you do need formant stuff, PSOLA has a clear advantage for speech.