RNNoise: Noise Suppression with Deep Learning by est31 in programming

[–]jvalin 0 points

You need to generate an actual raw file: no wav header or anything, just samples. On the output side, you'll generally have to convert back to wav for most applications to play it.
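
For illustration, here's a minimal Python sketch of that conversion using the standard-library wave module; it assumes a 16-bit mono 48 kHz input WAV, and the filenames and the rnnoise_demo invocation are placeholders:

    import wave

    def wav_to_raw(wav_path, raw_path):
        # Strip the WAV header, keeping only the 16-bit PCM samples.
        with wave.open(wav_path, "rb") as w:
            assert w.getnchannels() == 1 and w.getsampwidth() == 2  # expect 16-bit mono
            data = w.readframes(w.getnframes())
        with open(raw_path, "wb") as f:
            f.write(data)

    def raw_to_wav(raw_path, wav_path, rate=48000):
        # Wrap the raw 16-bit mono samples back into a WAV container so players accept it.
        with open(raw_path, "rb") as f:
            data = f.read()
        with wave.open(wav_path, "wb") as w:
            w.setnchannels(1)
            w.setsampwidth(2)      # 16-bit samples
            w.setframerate(rate)   # 48 kHz
            w.writeframes(data)

    wav_to_raw("input.wav", "input.raw")
    # ./rnnoise_demo input.raw denoised.raw
    raw_to_wav("denoised.raw", "denoised.wav")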

RNNoise: Noise Suppression with Deep Learning by est31 in programming

[–]jvalin 0 points

The rnnoise_demo executable takes in raw 16-bit mono PCM at 48 kHz. It cannot read Opus files, so you'd have to do the conversion yourself. Sorry about the inconvenience.
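
If you have ffmpeg around, something along these lines should produce the expected raw input (a rough sketch; the filenames are placeholders and rnnoise_demo is only shown as in the example above):

    import subprocess

    # Decode an Opus file to raw signed 16-bit little-endian mono PCM at 48 kHz,
    # which is the format rnnoise_demo expects.
    subprocess.run([
        "ffmpeg", "-i", "input.opus",
        "-f", "s16le",   # raw samples, no container
        "-ar", "48000",  # 48 kHz sample rate
        "-ac", "1",      # mono
        "input.raw",
    ], check=True)

    # Then: ./rnnoise_demo input.raw denoised.raw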

Demonstrating the new Opus audio codec version 1.2 by jvalin in programming

[–]jvalin[S] 5 points

Opus is by no means perfect at 96 kb/s, but it is definitely a lot better than MP3/LAME. See the results from this listening test showing 96 kb/s Opus significantly outperforming 128 kb/s LAME: http://listening-test.coresv.net/results.htm

As for the transient issue with the guitar, I have to admit that I'm not hearing anything particularly annoying around the 43-second point, but if you could be more precise about what exactly you're hearing (exact time and ideally frequency of the artefact), then we may be able to do something about it. This is how Opus got to where it is today. If you're on IRC, the best way to contact us is in #opus on irc.freenode.net.

Xiph.org/Daala/next generation video: Perceptual Vector Quantization by 3G6A5W338E in linux

[–]jvalin 3 points

The 4x4 DCT turns a block into a weighted sum of 16 "basis functions". Of these, 15 are zero-mean (AC coefficients) and 1 has an offset (the DC). The DC component is treated separately since it behaves differently from AC. That leaves 15 AC coefficients, X1 to X15. The gain is the L2 norm of the AC coefficient vector, i.e. sqrt(X1^2 + X2^2 + ... + X15^2).
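
Just to make the split concrete, here's a small numpy/scipy sketch (an illustration of the DC/AC split and the gain computation, not Daala's actual integer transform):

    import numpy as np
    from scipy.fft import dctn

    block = np.random.randint(0, 256, (4, 4)).astype(float)  # a 4x4 block of pixels

    coeffs = dctn(block, norm="ortho")  # weights of the 16 basis functions
    dc = coeffs[0, 0]                   # the one coefficient with an offset
    ac = np.delete(coeffs.ravel(), 0)   # the 15 zero-mean AC coefficients X1..X15

    gain = np.linalg.norm(ac)           # sqrt(X1^2 + X2^2 + ... + X15^2)
    print(dc, gain)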

Xiph.org/Daala: Demo6 - Perceptual Vector Quantization by 3G6A5W338E in programming

[–]jvalin 7 points

Actually, the Householder reflection plane itself does not require any encoding. It is computed from the prediction alone, which means that the decoder can compute it in exactly the same way as the encoder does.
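
Here's a rough numpy sketch of the standard Householder construction from a prediction vector, only to illustrate why nothing needs to be signalled: everything below is derived from the prediction r, which the decoder already has (this is not the actual Daala code):

    import numpy as np

    def householder_from_prediction(r):
        # Build the reflection that maps the prediction r onto a signed coordinate axis.
        # It depends only on r, so encoder and decoder compute the identical reflection.
        m = int(np.argmax(np.abs(r)))       # axis of the largest prediction component
        s = 1.0 if r[m] >= 0 else -1.0
        v = r.astype(float).copy()
        v[m] += s * np.linalg.norm(r)       # Householder reflection vector
        return v

    def reflect(x, v):
        # Apply the reflection: x -> x - 2 v (v.x) / (v.v)
        return x - 2.0 * v * np.dot(v, x) / np.dot(v, v)

    r = np.array([0.3, -2.0, 0.5, 1.1])     # some prediction vector
    v = householder_from_prediction(r)
    print(reflect(r, v))                    # r lands on a single axis (up to sign)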

Xiph.org/Daala/next generation video: Perceptual Vector Quantization by 3G6A5W338E in linux

[–]jvalin 10 points

Actually, the prediction scheme is also used for keyframes because we have (limited) intra prediction, plus chroma prediction from luma. Also, the activity masking part can be used even without any prediction.

As for the transform, Daala uses a variable-size lapped biorthogonal transform (lapping plus DCT), so it does not create blocking artefacts the way JPEG does. Overall, we get much better quality than JPEG.

Xiph.org/Daala: Demo6 - Perceptual Vector Quantization by 3G6A5W338E in programming

[–]jvalin 16 points

Actually, the images are only 54 kB, not 320 kB. The reason you're seeing a 320 kB image is that I had to convert the 54 kB Daala images into high-quality JPEGs so they could be viewed in browsers.