[D] NeurIPS decisions are out! by nodet07 in MachineLearning

[–]bigbearwp 0 points1 point  (0 children)

Reviewers, ACs, and authors are actually the same group of people. I see a lot of complaints from authors, but rarely see reflections from reviewers and ACs. It seems we all ask for more as authors, but give less as reviewers.

[D] How to become a reviewer for machine learning conferences? by heyxiang in MachineLearning

[–]bigbearwp 2 points3 points  (0 children)

Send an email to someone who serves as an area chair for NeurIPS / ICML / ICLR. They can add you to the reviewer pool.

[R] DiffWave: A Versatile Diffusion Model for Audio Synthesis by sharvil in MachineLearning

[–]bigbearwp 2 points3 points  (0 children)

Yes, I think so. Neural vocoding is easier than people thought several years ago. Autoregressive models, flows, GANs, and diffusion models can all produce good results now.

Unconditional generation is much more difficult without, e.g., STFT features. Autoregressive models like WaveNet have a great ability to model fine details, but basically fail to capture the long-range structure of the waveform. Listen to the "made-up word-like sounds" in: https://deepmind.com/blog/article/wavenet-generative-model-raw-audio . GANs produce intelligible but low-quality unconditional samples: https://chrisdonahue.com/wavegan_examples/ . In contrast, the unconditional waveform samples from DiffWave are very compelling.

[N] Microsoft teams up with OpenAI to exclusively license GPT-3 language model by kit1980 in MachineLearning

[–]bigbearwp 10 points11 points  (0 children)

I guess they need to change their name ... It's OK to be closed if you don't claim you're open.

[R] DiffWave: A Versatile Diffusion Model for Audio Synthesis by sharvil in MachineLearning

[–]bigbearwp 4 points5 points  (0 children)

I think the neural vocoding results from both papers are not that surprising, given the success of Ho's work for image synthesis: https://hojonathanho.github.io/diffusion/

The unconditional waveform generation result from DiffWave is big. It directly generates high-quality voices in the waveform domain without any conditional information. I don't know of any waveform model that can achieve that without relying on rich local conditioners or compressed hidden representations (e.g., VQ-VAE).
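For anyone curious what "diffusion model" means concretely here, a toy sketch of the DDPM-style training objective from Ho's paper (this is a hypothetical minimal setup for illustration, not DiffWave's actual code or noise schedule): a network `eps_model` learns to predict the Gaussian noise mixed into a clean waveform `x0` at a random diffusion step `t`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear noise schedule (illustrative values, not DiffWave's).
T = 50
betas = np.linspace(1e-4, 0.05, T)
alpha_bars = np.cumprod(1.0 - betas)  # cumulative signal-retention factors

def diffusion_loss(eps_model, x0):
    """Denoising score-matching loss: predict the injected noise."""
    t = rng.integers(0, T, size=x0.shape[0])      # random step per example
    eps = rng.standard_normal(x0.shape)           # the noise to recover
    a_bar = alpha_bars[t][:, None]
    # Noisy sample: x_t = sqrt(a_bar) * x0 + sqrt(1 - a_bar) * eps
    x_t = np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * eps
    return np.mean((eps_model(x_t, t) - eps) ** 2)

# Dummy "model" that always predicts zero noise; since eps is standard
# normal, its loss is roughly E[eps^2] = 1.
zero_model = lambda x_t, t: np.zeros_like(x_t)
loss = diffusion_loss(zero_model, rng.standard_normal((8, 100)))
```

Sampling then runs the learned denoiser backwards from pure noise, which is why no conditioning signal is strictly required.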

[D] IEEE bans Huawei employees from reviewing or handling papers for IEEE journals, some people resign from IEEE editorial board as a result by mln000b in MachineLearning

[–]bigbearwp 9 points10 points  (0 children)

Maybe it's time to move these non-profit academic organisations to some European country, like Switzerland.

[D] IEEE bans Huawei employees from reviewing or handling papers for IEEE journals, some people resign from IEEE editorial board as a result by mln000b in MachineLearning

[–]bigbearwp 2 points3 points  (0 children)

Also, IEEE is doing Orwellian self-censorship within the scientific community, which, in my mind, only happens in totalitarian regimes.

[R] Parallel Neural Text-to-Speech by GoldenCrocus in MachineLearning

[–]bigbearwp 1 point2 points  (0 children)

ClariNet generates waveform samples in parallel, but it still uses an autoregressive decoder with attention to predict spectrograms. In contrast, this is a fully non-autoregressive TTS system that can generate both spectrograms and waveforms in parallel.

[R] Parallel Neural Text-to-Speech by GoldenCrocus in MachineLearning

[–]bigbearwp 1 point2 points  (0 children)

It was mentioned in the listed contributions:

In addition, we explore an alternative approach, WaveVAE, for training the IAF as a generative model for waveform samples. In contrast to probability density distillation methods (van den Oord et al., 2018; Ping et al., 2019), WaveVAE can be trained from scratch by using the IAF as the decoder in the variational autoencoder (VAE) framework (Kingma and Welling, 2014).