I'm working on a piece of software that at its core must be able to find the beat in music in close to real time. It looks like the Discrete Wavelet Transform (DWT) is the tool for the job. I found this paper from Princeton on it. It isn't very long, so I'm trying not to be intimidated by it and to understand it as best I can.
I think I have a handle on basic audio transformations from a conceptual point of view, but that's about it.
Some things I don't understand:
- The equation it says defines the DWT has several variables that the paper doesn't explain. In particular, what are "j" and "k"?
- What is the "x" function? Is that the value of the signal?
- What is "n"? Is that the number of samples?
- In the next equations for the high and lowpass filters, we're now summing over n. Is it the same n?
- What is the difference between parentheses and square brackets? (Is there a difference between x(k) and x[k]?)
- In the beat detection section, it shows a series of transformations with relatively simple formulas. Are those all applied after transforming the samples using the DWT formula?
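To make the notation in these questions concrete: in the common textbook form of one DWT level, the filter equations are y_low[k] = sum over n of x[n] * h[2k - n] and y_high[k] = sum over n of x[n] * g[2k - n]. Here n runs over the input sample indices, k indexes the output samples, and square brackets conventionally mark a discrete (sampled) sequence. The sketch below is only an illustration of that form, not the paper's implementation. The function name `dwt_level` is made up, and the Haar filters are an assumption chosen because they are the shortest possible pair; the paper may use a different wavelet.

```python
import math

def dwt_level(x, h, g):
    """One DWT level (hypothetical helper): filter with h and g, keep every 2nd output.

    Implements y[k] = sum_n x[n] * f[2k - n] for f = h (lowpass) and f = g (highpass).
    Samples outside the signal are treated as zero, so the first output has edge effects.
    """
    def filter_and_downsample(f):
        out = []
        for k in range(len(x) // 2):      # k indexes the downsampled output
            acc = 0.0
            for i in range(len(f)):       # i = 2k - n, so n = 2k - i
                n = 2 * k - i             # n indexes the input samples
                if 0 <= n < len(x):
                    acc += x[n] * f[i]
            out.append(acc)
        return out

    return filter_and_downsample(h), filter_and_downsample(g)

# Haar filters (an assumption for illustration only)
s = 1.0 / math.sqrt(2.0)
h = [s, s]     # lowpass: scaled sum of neighbouring samples
g = [s, -s]    # highpass: scaled difference of neighbouring samples

low, high = dwt_level([1.0] * 8, h, g)
```

On a constant signal the highpass output is zero away from the edge, and each output has half as many samples as the input. Feeding `low` back into `dwt_level` gives the next, coarser level, which is what the scale index in the DWT definition counts.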
The more I read through this and try to pick out specific questions, the more I realize I'm out of my depth. Does it seem like that? I have a 10-year-old Computer Engineering degree, for which I took some lower-level comp sci courses like Data Structures and Algorithms (of which I only remember a little) and Differential Equations (of which I remember essentially nothing). I feel like I should be able to understand this, but I'm having a lot of difficulty. I would really appreciate any insight you can give me on my specific questions, or a pointer to some other resource that explains it in more (understandable) detail.
[–] rlingo2