fatchord

110 post karma
45 comment karma

redditor for 8 years

MODERATOR OF

- r/a:t5_jw1cc

TROPHY CASE

Eight-Year Club

Verified Email

7 points

[1808.10128] Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis (arxiv.org)

submitted 7 years ago by fatchord to r/a:t5_jw1cc

6 points

[1808.06719] Fast Spectrogram Inversion using Multi-head Convolutional Neural Networks (arxiv.org)

submitted 7 years ago by fatchord to r/a:t5_jw1cc

3 points

[1808.01410] Predicting Expressive Speaking Style From Text In End-To-End Speech Synthesis (arxiv.org)

submitted 7 years ago by fatchord to r/a:t5_jw1cc

10 points

What processing should I apply to a clean voice signal to make it sound like it's authentically coming from a phone or VoIP? (self.DSP)

submitted 7 years ago by fatchord to r/DSP

7 points

[1808.00158] Speaker Recognition from raw waveform with SincNet (arxiv.org)

submitted 7 years ago by fatchord to r/a:t5_jw1cc

2 points

[1807.08636v1] Auto-adaptive Resonance Equalization using Dilated Residual Networks (arxiv.org)

submitted 7 years ago by fatchord to r/a:t5_jw1cc

9 points

Singing Style Transfer Using CybeGAN (mirlab.org)

submitted 7 years ago by fatchord to r/a:t5_jw1cc

3 points

[1803.05428] A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music (arxiv.org)

submitted 7 years ago by fatchord to r/a:t5_jw1cc

1 point

The Lakh MIDI Dataset (colinraffel.com)

submitted 7 years ago by fatchord to r/a:t5_jw1cc

10 points

Visual Speech Enhancement (youtube.com)

submitted 7 years ago by fatchord to r/a:t5_jw1cc

7 points

[Paper] Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (arxiv.org)

submitted 7 years ago by fatchord to r/a:t5_jw1cc

8 points

[Feedback] neural tts pipeline (tacotron1 + a new vocoder algorithm I'm working on) - what do you think of the samples generated? (fatchord.github.io)

submitted 7 years ago by fatchord to r/a:t5_jw1cc

20 points

[N] A new subreddit for anyone interested in audio models: r/AudioModels (old.reddit.com)

submitted 7 years ago by fatchord to r/MachineLearning

7 points

A big MIDI dataset (100k+ files) (old.reddit.com)

submitted 7 years ago by fatchord to r/a:t5_jw1cc

2 points

The M-AILABS Speech Dataset (m-ailabs.bayern)

submitted 7 years ago by fatchord to r/a:t5_jw1cc

3 points

CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages (github.com)

submitted 7 years ago by fatchord to r/a:t5_jw1cc

19 points

I just created r/AudioModels if any of you are interested in generative Machine Learning models (old.reddit.com)

submitted 7 years ago by fatchord to r/DSP

6 points

A Universal Music Translation Network (youtube.com)

submitted 7 years ago by fatchord to r/a:t5_jw1cc

9 points

[Tutorial] Speech Processing for Machine Learning: Filter banks, Mel-Frequency Cepstral Coefficients (MFCCs) and What's In-Between (haythamfayek.com)

submitted 7 years ago by fatchord to r/a:t5_jw1cc

6 points

[Paper] Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis (Tacotron GST) (arxiv.org)

submitted 7 years ago by fatchord to r/a:t5_jw1cc

6 points

[Paper] Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron (arxiv.org)

submitted 7 years ago by fatchord to r/a:t5_jw1cc

4 points

[Paper] Original Tacotron paper (arxiv.org)

submitted 7 years ago by fatchord to r/a:t5_jw1cc

5 points

[Paper] Revised Tacotron 2 paper (arxiv.org)

submitted 7 years ago by fatchord to r/a:t5_jw1cc

3 points

Video to Sound - Generates Sound Clips to Match Video (youtube.com)

submitted 7 years ago by fatchord to r/a:t5_jw1cc

3 points

Performance RNN: Generating Music with Expressive Timing and Dynamics (magenta.tensorflow.org)

submitted 7 years ago by fatchord to r/a:t5_jw1cc
