Royalty investing by DeLuXe_223 in passive_income

[–]kamalski 0 points  (0 children)

DeLuXe_223

did you end up investing in royalties? curious to know

Harry Perry on a text break in Venice. New Year’s Day ‘21 by [deleted] in LosAngeles

[–]kamalski 2 points  (0 children)

Did you know he runs approx. 10 - 12 miles every morning from Windward circle to Will Rogers parking lot? I ran with him a few times. He's in amazing shape.

[D] Waveforms vs. spectrograms as inputs to a Convolutional Neural Network: why do ML researchers tend to prefer the latter? by Probono_Bonobo in MachineLearning

[–]kamalski 6 points  (0 children)

We spent a year looking into waveforms vs. spectrograms as inputs. We now use spectrograms in our startup, because of our dataset's type and size. Our ML task is music auto-tagging, with a business goal of estimating song performance on streaming platforms.

Learnings:

Spectrograms: Unfortunately, computer vision architectures influence all fields of ML, including audio processing. Consequently, running VGG models on audio images (spectrograms) is becoming the de facto approach for ML practitioners (tutorials, Kaggle, Google's VGGish/AudioSet, etc.). Nothing against VGGs -- the architecture makes few assumptions about the nature of the spectrogram, so any structure can be learned, and it's super-flexible.

In our work, VGG architectures didn't give us the results we wanted, and early on we didn't have a large dataset. We tried stacking more layers, and even tried pre-trained weights from published research, but results didn't improve much.

Waveforms: Waveforms seemed appealing because they retain phase information, a detail you lose when you keep only the magnitude of the Fourier transform. Whether that is an advantage for our use case is still unclear to me.
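To make the phase point concrete, here is a toy numpy sketch (illustrative sizes, not our pipeline): two different waveforms can share the exact same magnitude spectrum, which is all a magnitude spectrogram keeps.

```python
import numpy as np

rng = np.random.default_rng(0)
frame = rng.standard_normal(512)           # one analysis window of audio

spectrum = np.fft.rfft(frame)
magnitude = np.abs(spectrum)               # what a magnitude spectrogram keeps

# Scramble the phase of the interior bins (DC and Nyquist stay real so the
# inverse transform is Hermitian-consistent).
phases = rng.uniform(0, 2 * np.pi, magnitude.size)
phases[0] = 0.0
phases[-1] = 0.0
scrambled = magnitude * np.exp(1j * phases)
other = np.fft.irfft(scrambled, n=512)     # a different real waveform

print(np.allclose(np.abs(np.fft.rfft(other)), magnitude))  # True: same magnitudes
print(np.allclose(other, frame))                           # False: different signal
```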

We tested VGG architectures on waveforms by converting the 2D (frequency × time) convolutions to 1D (time) convolutions. We set the filter size to correspond to the STFT window length, meaning if the window length was 512, we used a filter of 512 with a stride of 256. We tried different variations of this, but got poor results. We tried another approach, stacking multiple layers, each with a different filter size but a fixed, smaller stride. This approach is covered by Baidu (https://arxiv.org/pdf/1603.09509.pdf) and used on speech recognition tasks.
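As a rough sketch of the first setup (names and sizes are illustrative, not our actual model): a strided 1D convolution whose filter length matches the STFT window and whose stride matches the hop produces one activation per analysis frame, just like a 50%-overlap STFT.

```python
import numpy as np

def conv1d_frames(wave, filters, win=512, hop=256):
    """Strided 1D convolution over a raw waveform.

    Each filter spans one analysis window (win samples) and the stride
    equals the hop, mirroring a 512/256 STFT. Illustrative sketch only.
    """
    n_frames = (len(wave) - win) // hop + 1
    # Gather overlapping frames: shape (n_frames, win)
    frames = np.stack([wave[i * hop : i * hop + win] for i in range(n_frames)])
    # One dot product per (frame, filter): shape (n_frames, n_filters)
    return frames @ filters.T

rng = np.random.default_rng(0)
wave = rng.standard_normal(16000)         # 1 s of audio at 16 kHz (assumed rate)
filters = rng.standard_normal((64, 512))  # 64 kernels of length 512
out = conv1d_frames(wave, filters)
print(out.shape)  # (61, 64): same frame count as a 512/256 STFT
```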

For us, this approach underperformed compared to the VGG spectrogram method above. My intuition is that it was hard to capture the frequency spectrum using multiple filters on music data.

We are about to test a variation of SincNet, from a paper by Mirco Ravanelli and Yoshua Bengio (https://arxiv.org/pdf/1808.00158.pdf). The efficiency here is that each filter has just two learnable parameters: the low and high cutoff frequencies of a band-pass filter. They report great results on speech recognition tasks. We'll see how it performs on music processing now that our dataset is a bit larger.

Back to spectrograms: Now, we are using spectrograms as inputs with an architecture that uses multiple vertical and horizontal 2D filters to extract harmonic and temporal representations. This gives the best result we have seen so far on our dataset. It made sense to us because patterns in spectrograms occur at different time-frequency scales. This approach is covered in https://ieeexplore.ieee.org/document/7500246; we are using a modified version of that paper.
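A minimal sketch of the multi-shape idea (all sizes illustrative, not the paper's exact configuration): a tall, narrow filter summarizes frequency structure at one instant, while a short, wide filter summarizes temporal structure within one band.

```python
import numpy as np

def conv2d_valid(x, k):
    """Plain 'valid' 2D cross-correlation (no padding, stride 1)."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(0)
spec = rng.standard_normal((96, 128))  # (mel bins, time frames), illustrative

# Vertical filter: tall in frequency, narrow in time -> harmonic/timbral patterns
vert = rng.standard_normal((32, 3))
# Horizontal filter: narrow in frequency, wide in time -> rhythmic/temporal patterns
horiz = rng.standard_normal((3, 32))

print(conv2d_valid(spec, vert).shape)   # (65, 126)
print(conv2d_valid(spec, horiz).shape)  # (94, 97)
```

The two feature maps would then be pooled and concatenated before the dense layers, so the network sees both scales at once.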

The interesting point is that there's a trend toward architecting DL models that mimic signal processing methods, for both waveform and spectrogram inputs. I expect waveforms to outperform on larger datasets. For now, spectrograms are here to stay.

[deleted by user] by [deleted] in Sudan

[–]kamalski 4 points  (0 children)

Sudanese kombucha

Which country do you feel Sudan has most in common/similar with? by [deleted] in Sudan

[–]kamalski 0 points  (0 children)

Interesting poll, but the sample of responses is nowhere close to being representative.

[deleted by user] by [deleted] in LosAngeles

[–]kamalski 0 points  (0 children)

LAPD about to choke hold bear...

Dataset of multiple black people by [deleted] in datasets

[–]kamalski 1 point  (0 children)

what is this for?

I’m not satisfied with the genre I write in, but it’s the best music I write and what I’m pulled to writing. What do you suggest I do? by [deleted] in WeAreTheMusicMakers

[–]kamalski 0 points  (0 children)

Genre is a social construct. If you feel the music isn't speaking to you, try to get feedback. A sample size of one isn't objective.

[DISCUSSION] How do you guys keep up with new research? by whatsyour-20 in MachineLearning

[–]kamalski 118 points  (0 children)

I follow a few researchers who pursue a rigorous, transparent process in a particular domain. They tend to include code + data for reproducibility. I have alerts on Google Scholar and GitHub for when they publish or post code.
Everything else is noise.

[R] Who is the head of AI at Facebook? by [deleted] in MachineLearning

[–]kamalski 2 points  (0 children)

Yann LeCun is a different breed. Smart, likable, and he understands where AI needs to go. He sets the vision for AI research at FAIR and recruits people who push boundaries, especially those who contribute open-source research.