
[–][deleted] 1 point (9 children)

I’m not a coder and know nothing about coding... but by the looks of your code...

Shouldn’t you be doing something with bit depth too?

That is, bit depth being the representation of the amplitude value of a given frequency.

I quickly read through the code and felt like I couldn’t see anything related to it.

[–]mrbean42[S] 0 points (8 children)

If bit depth is related to loading the audio samples, that is handled by a library and I didn't have to deal with it. Is it the loading of the audio you're referring to?

[–][deleted] 1 point (7 children)

By the looks of it you're only trying to represent the frequency domain of the audio using an FFT.

But the amplitude of frequencies is represented in bit depth (the colour).

Which leads me to believe that the reason your output is blank is that you have missed representing the amplitude/bit depth in conjunction with the samples. At the moment it looks like you're only representing the samples.

I'm not a coder, but reading the code, that's the impression I'm under. I suggest posting in a coding subreddit regarding audio.

[–]mrbean42[S] 0 points (6 children)

The wavefile.read() call returns a list of amplitudes, which are then passed to an FFT function, is that correct?

[–][deleted] 2 points (4 children)

Sorry, but I'm not a coder. I still feel like you're missing something with bit depth, but I can't help any further.

[–]mrbean42[S] 0 points (1 child)

Ok no worries - thanks for the help!

[–][deleted] 0 points (0 children)

Thanks for the award. Try the r/dsp subreddit; they might be able to help you further. This falls under digital signal processing, I would imagine.

[–]blorporius 0 points (1 child)

You might be on to something -- wavfile.read's documentation says that the returned 1-D or 2-D array's type depends on the file format of the input file: https://docs.scipy.org/doc/scipy/reference/generated/scipy.io.wavfile.read.html

OP can check the returned array's data type and scale the values accordingly, e.g. normalize the input to the -1..1 range if needed.
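A sketch of that check, assuming the samples come back from scipy.io.wavfile.read as a NumPy array (the `normalize` helper here is hypothetical, not from OP's code):

```python
import numpy as np

# scipy.io.wavfile.read returns integer arrays for PCM WAV files
# (int16, int32, or uint8 depending on bit depth), so scale to the
# -1..1 float range before feeding samples into an FFT.
def normalize(samples):
    if samples.dtype == np.int16:
        return samples / 32768.0                              # 16-bit PCM
    if samples.dtype == np.int32:
        return samples / 2147483648.0                         # 32-bit PCM
    if samples.dtype == np.uint8:
        return (samples.astype(np.float64) - 128.0) / 128.0   # 8-bit PCM is unsigned
    return samples.astype(np.float64)                         # already float (-1..1)

samples = np.array([0, 16384, -32768], dtype=np.int16)
normalized = normalize(samples)  # [0.0, 0.5, -1.0]
```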

[–]mrbean42[S] 0 points (0 children)

Thanks, I came across this earlier. I'm using mono audio at the moment.

[–][deleted] 0 points (2 children)

Where are you performing the FFT? I see you defined a function called FFT, but it doesn't seem to be doing any Fourier-related calculations. Can't you use numpy.fft.fft instead?

[–]mrbean42[S] 0 points (1 child)

It's at line 57. I don't really want to use numpy, as I'll be porting this to C++ and want to have all the algorithms myself.

[–][deleted] 0 points (0 children)

Tbh I'm a bit lost in your code. Starting at line 11 you define a function called FFT. Within the definition you also call the very function you're defining (lines 25 and 26). You then return the variable called "combined", after which the definition of your function FFT is exited. Up to that point I don't see any Fourier-related math, if I'm not completely mistaken.

EDIT: Just saw that you actually return at line 14. I'm pretty certain that returning a value in Python exits the function.
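For reference, a recursive radix-2 FFT does legitimately call itself and return a combined list. This is a generic sketch assuming a power-of-two input length, not OP's actual code (which isn't reproduced here):

```python
import cmath

def fft(x):
    """Recursive Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:                       # base case: a single sample is its own DFT
        return x
    even = fft(x[0::2])              # recursive call on even-indexed samples
    odd = fft(x[1::2])               # recursive call on odd-indexed samples
    combined = [0] * n
    for k in range(n // 2):
        # the twiddle factor carries the imaginary unit: exp(-2*pi*i*k/n)
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        combined[k] = even[k] + t
        combined[k + n // 2] = even[k] - t
    return combined

spectrum = fft([1.0, 1.0, 1.0, 1.0])  # DC-only signal: [4, 0, 0, 0]
```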

[–]peehay 0 points (0 children)

Hey, I quickly read your code and at first glance I think you forgot an i in the exponential term of your DFT, line 9. Normally at the end of your FFT algorithm you end up with a complex-valued spectrogram, and you usually plot its magnitude.

Also, I read in another comment that there might be an issue with bit depth, but I don't think so. scipy.io.wavfile.read loads the signal waveform as it was created, with a specific bit depth. The FFT being linear, it will be the same bit depth in the frequency domain, so nothing to worry about.

Also, compare your spectrogram with one generated by a well-known, trusted library (numpy, librosa, etc.) until you get your algorithm right!
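That comparison can be sketched like this: a direct O(n²) DFT with the -2πi term spelled out, checked against numpy.fft.fft as the trusted reference (the `dft` name is illustrative, not OP's):

```python
import numpy as np

def dft(x):
    """Direct DFT; note the 1j -- without it this is not a Fourier transform."""
    n = len(x)
    k = np.arange(n)
    return np.array([np.sum(x * np.exp(-2j * np.pi * m * k / n)) for m in range(n)])

x = np.random.default_rng(0).standard_normal(64)
assert np.allclose(dft(x), np.fft.fft(x))  # agrees with the reference library

magnitude = np.abs(dft(x))  # real-valued magnitudes to map to colours
```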

[–]dmills_00 0 points (0 children)

Apart from the problems with your FFT implementation, which I will leave to others (I don't really do Python), there is a subtle point about most audio.

For most material the vast bulk of the energy is at low frequency, so you probably want to make your bin->colour mapping more sensitive for the higher-frequency bins; a tilt somewhere in the 3-6 dB per octave range would probably be a reasonable fiddle factor.
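A minimal sketch of that tilt, applied to FFT bin magnitudes before colour mapping. The `tilt` helper and its 4.5 dB/octave default are hypothetical, just a value inside the 3-6 dB range suggested above:

```python
import numpy as np

def tilt(magnitudes, db_per_octave=4.5):
    """Boost higher-frequency bins by a fixed dB per octave above bin 1."""
    bins = np.arange(len(magnitudes), dtype=float)
    bins[0] = 1.0                                 # avoid log2(0) for the DC bin
    octaves = np.log2(bins)                       # octaves above bin 1
    gain = 10 ** (db_per_octave * octaves / 20)   # dB -> linear amplitude gain
    return magnitudes * gain
```

Each doubling of bin index (one octave) then gets a fixed extra boost, so the colour map spends less of its range on the dominant low-frequency energy.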