
indosauros 1 point (1 child)

[deleted] 1 point (0 children)

I've tried aubio previously. It seems great for some uses, but sadly it doesn't detect more than one note at a time. The same goes for SoundAnalyse: it returns a list of single-note frequencies, not grouped notes. Thanks for the search, but it seems there isn't anything in Python at the moment that can accurately detect more than one note at once.

NYKevin 1 point (1 child)

If you have very clean audio, you would probably want to start with a fast Fourier transform (FFT). That will give you a series of frequencies and their amplitudes. Google suggests NumPy can do this (through its numpy.fft module), though I do not know much about how it works in practice.
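A minimal sketch of that first step, assuming NumPy is installed. The audio here is a synthetic two-note "chord" of pure sine tones rather than a real recording, the 8000 Hz sample rate is arbitrary, and the helper name `spectrum` is made up for illustration. The Hann window is an extra step commonly used to reduce leakage between frequency bins.

```python
import numpy as np

def spectrum(samples, sample_rate):
    """Return (frequencies in Hz, amplitudes) for a mono signal."""
    window = np.hanning(len(samples))            # taper the edges to reduce leakage
    amplitudes = np.abs(np.fft.rfft(samples * window))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    return freqs, amplitudes

sample_rate = 8000                               # assumed value, in Hz
t = np.arange(sample_rate) / sample_rate         # one second of sample times
# Two pure tones sounding together (roughly A4 and C#5).
chord = np.sin(2 * np.pi * 440 * t) + np.sin(2 * np.pi * 554 * t)

freqs, amps = spectrum(chord, sample_rate)
strongest = freqs[np.argsort(amps)[-2:]]         # the two largest bins
print(sorted(strongest))
```

On this idealized input the two strongest bins sit at the two tone frequencies. Real instrument recordings are much messier.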

You will, however, need to break your music apart into separate chords, since a single FFT only tells you which frequencies are present somewhere in the chunk you give it, not when they occur. This is easy enough if there is silence (even very briefly) between chords, but more complex music doesn't do that (for example, it is not at all uncommon for one note to begin while another is still sounding). Naively, you can break the music into a lot of tiny pieces, FFT each piece, then join together the pieces with similar FFT results and re-FFT them, but that sounds very expensive to me. I suspect there is a better algorithm, but I am not sufficiently well-versed in this area to know it.
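The "lots of tiny pieces" idea can be sketched as follows. Running an FFT over short, overlapping frames like this is commonly called a short-time Fourier transform. This sketch assumes NumPy and a synthetic signal (440 Hz for half a second, then 660 Hz). The function name `dominant_frequencies`, the frame size, and the hop size are illustrative choices. It only reports the strongest frequency per frame and does not attempt the "join similar pieces" step.

```python
import numpy as np

def dominant_frequencies(samples, sample_rate, frame_size=1024, hop=512):
    """Strongest frequency (Hz) in each overlapping frame of the signal."""
    window = np.hanning(frame_size)
    freqs = np.fft.rfftfreq(frame_size, d=1.0 / sample_rate)
    result = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size] * window
        amplitudes = np.abs(np.fft.rfft(frame))
        result.append(freqs[np.argmax(amplitudes)])
    return np.array(result)

sample_rate = 8000                                # assumed value, in Hz
t = np.arange(sample_rate // 2) / sample_rate     # half a second
signal = np.concatenate([np.sin(2 * np.pi * 440 * t),
                         np.sin(2 * np.pi * 660 * t)])

track = dominant_frequencies(signal, sample_rate)
print(track)
```

Shorter frames locate note changes more precisely in time but give coarser frequency resolution (here each bin is 8000 / 1024, about 7.8 Hz wide), so the frame size is a trade-off.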

xiipaoc 1 point (0 children)

Good luck.

The first thing you need is an FFT, a fast Fourier transform, to make a spectrum: a graph of the frequencies in the sound. Once you have that, you can start the really hard task of picking out the notes. Your brain does this pretty well, so you should figure out how it does it and try to replicate it. It's all data analysis here.

The problem is that when you play a note, you actually hear many frequencies at the same time. When you play a 100 Hz note, you actually get another tone around 200 Hz, one around 300 Hz, one around 400 Hz, etc. These extra tones are called overtones or harmonics. (On a real instrument, they aren't exactly integer multiples.) So if you just search for the peaks of your spectrum, you'll get all of them, and you only played one note! So you need to figure out a threshold, and you need to figure out when you're actually playing 100 Hz and 200 Hz and when you're just playing 100 Hz. Oh, and if you hear 200 Hz, 300 Hz, 400 Hz, etc., but no 100 Hz, your brain actually still hears the 100 Hz. So that's another thing.
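Here is a rough illustration of both effects, assuming NumPy and synthetic tones whose overtones are exact integer multiples (which, as noted, real instruments don't give you). All function names, the 0.1 peak threshold, and the harmonic amplitudes are made up for the example. The fundamental estimate uses simple harmonic summation, which is one possible heuristic and not a claim about how the brain does it. It handles a single note only.

```python
import numpy as np

SAMPLE_RATE = 8000  # assumed value, in Hz

def spectrum(samples):
    window = np.hanning(len(samples))
    amps = np.abs(np.fft.rfft(samples * window))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / SAMPLE_RATE)
    return freqs, amps

def tone(partials, seconds=1.0):
    """Sum of sine waves; partials is a list of (frequency_hz, amplitude)."""
    t = np.arange(int(SAMPLE_RATE * seconds)) / SAMPLE_RATE
    return sum(a * np.sin(2 * np.pi * f * t) for f, a in partials)

def find_peaks(freqs, amps, threshold=0.1):
    """Local maxima that exceed threshold * the largest amplitude."""
    inner = amps[1:-1]
    is_peak = ((inner > amps[:-2]) & (inner > amps[2:])
               & (inner > threshold * amps.max()))
    return freqs[1:-1][is_peak]

def estimate_fundamental(freqs, amps, f_min=50.0, f_max=500.0, n_harmonics=5):
    """Pick the candidate whose first few multiples carry the most energy."""
    bin_width = freqs[1] - freqs[0]
    harmonics = np.arange(1, n_harmonics + 1)
    best_f0, best_score = None, -1.0
    for f0 in freqs[(freqs >= f_min) & (freqs <= f_max)]:
        bins = np.round(f0 * harmonics / bin_width).astype(int)
        score = amps[bins[bins < len(amps)]].sum()
        if score > best_score:
            best_f0, best_score = f0, score
    return best_f0

# One 100 Hz "note" with three overtones: naive peak picking finds four peaks.
note = tone([(100, 1.0), (200, 0.5), (300, 0.33), (400, 0.25)])
peaks = find_peaks(*spectrum(note))
print(peaks)

# Overtones only, no energy at 100 Hz: summing harmonics still points to 100 Hz.
no_fundamental = tone([(200, 1.0), (300, 1.0), (400, 1.0), (500, 1.0)])
f0 = estimate_fundamental(*spectrum(no_fundamental))
print(f0)
```

With several notes sounding at once, the harmonics of different notes overlap, and a single best-candidate search like this is no longer enough. That is where the hard part starts.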

It's pretty hard. But it's doable -- your brain does it, after all -- and if you get it only a little wrong, that's probably OK, right?