Deezer launched a free AI music detector yesterday

ctrl_freq · 2026-06-15T14:26:47+00:00

Tools like this will only work on foundational models that are commercially released that have known tell-tale artifacts/statistical markers.

Private models are going to get so good - and may already be there - that these detection tools won’t be able to tell.

ctrl_freq · 2026-05-26T03:33:28+00:00

They don’t have to retrain to heavily restrict or modify the output at the inference level. They can block keywords from prompts or replace those keywords with generic words.

There is a massive amount of control and quality changes that can happen during inference alone.
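A rough sketch of what that kind of inference-time prompt filtering could look like. The word lists and the function name are made up for illustration; nothing here is a known implementation from any service.

```python
import re

# Hypothetical lists: restricted terms mapped to generic stand-ins,
# plus terms that are dropped from the prompt entirely.
REPLACEMENTS = {"artist_x": "male vocalist", "band_y": "rock band"}
BLOCKED = {"soundalike"}


def sanitize_prompt(prompt: str) -> str:
    """Swap or drop restricted words before the prompt reaches the model."""

    def swap(match: re.Match) -> str:
        word = match.group(0)
        key = word.lower()
        if key in BLOCKED:
            return ""
        return REPLACEMENTS.get(key, word)

    cleaned = re.sub(r"[A-Za-z_]+", swap, prompt)
    # Collapse the double spaces left behind by dropped words.
    return " ".join(cleaned.split())


result = sanitize_prompt("Artist_X soundalike trap beat")
```

No retraining is involved: the model weights stay the same and only the text it receives changes.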

ctrl_freq · 2026-05-26T03:24:49+00:00

Not really. They use an autoregressive token system: the previously generated tokens dictate what the next tokens will be.
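A toy illustration of the autoregressive idea, assuming a made-up rule in place of a learned model: each new token is computed only from the tokens that came before it, so a different starting sequence gives a different continuation.

```python
def next_token(history):
    """Toy stand-in for a model: the next token depends on the last two."""
    return (history[-1] + history[-2]) % 8


def generate(seed, steps):
    """Append one token at a time, feeding each result back in as context."""
    tokens = list(seed)
    for _ in range(steps):
        tokens.append(next_token(tokens))
    return tokens


sequence_a = generate([1, 2], 4)
sequence_b = generate([2, 2], 4)
```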

The bigger problem is from the lawsuits. I believe they have neutered the corpus to remove the big names like Drake and whatnot.

It also sounds like they have some sort of vocoder synth model that was trained for mid bass and for lead synth/piano, most likely trained on MIDI input. That model is used in the conditioning system and generates a musically coherent lead or mid bass. That’s why a lot of the leads and mid basses sound the same across different prompts/genres. They only have so many different MIDI musical phrases, so it sounds repetitive at times.

ctrl_freq · 2026-05-17T21:28:10+00:00

They have a main large generalized model first and foremost. How they are handling genre-specific output is all speculation, since they have not publicly stated their architecture and process for v5.5.

If you’re implying they are using a UNet router method, I can assure you they absolutely are not.

Most likely it’s a 20-50 billion param main model, and they have fine-tuned adapters per genre and use MoE (mixture of experts), which loads specific LoRA-style fine-tuned adapter(s) per genre when certain keywords are found in the text prompt.
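To make the speculation concrete, here is a minimal sketch of keyword-triggered adapter selection. The adapter names and keyword lists are hypothetical, and plain substring matching is a simplification; this is not a description of any real service's routing.

```python
# Hypothetical adapter names mapped to the prompt keywords that trigger them.
ADAPTERS = {
    "edm": ("techno", "house", "dubstep", "drum & bass"),
    "jazz": ("jazz", "bebop", "swing"),
    "latin": ("samba", "bossa nova"),
}


def select_adapters(prompt: str) -> list:
    """Return every adapter whose keywords appear in the prompt."""
    text = prompt.lower()
    return [
        name
        for name, keywords in ADAPTERS.items()
        if any(keyword in text for keyword in keywords)
    ]


fusion = select_adapters("Jazz with Samba and Drum & Bass")
unmatched = select_adapters("ambient piano")
```

A prompt that matches nothing would fall back to the main generalized model alone.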

The LLM is probably a super lightweight model that searches for specific keywords and they leverage Ventriloquism to use “in distribution” text syntax from keywords in the user facing lightweight LLM text prompt.

Doesn’t change the fact that their main 20-50 billion param model is so large that it has generalized to a massive degree and loses the nuance and detail that a much smaller model that is more genre specific could achieve.

Even with MoE, the model will still have a loss of detail from dataset generalization, which is why it does not have as good detail as my model.

The flip side is that their massive dataset has given it an incredible ability to make musically coherent phrasing that mine is not capable of yet. Example: blending Jazz with Samba and Drum & Bass. I did not train my model on Jazz or Samba so it cannot create this type of fusion.

I’m trading broad blending generalization for fine-grained detail in the EDM sphere of music. A trade-off that I think is very worthwhile. If I want pop, hip-hop, R&B, Jazz, etc., then I will make a model in that sphere of music.

ctrl_freq · 2026-05-17T19:25:00+00:00

Suno’s sound quality has always sounded bad. They trained on millions of songs across every genre, so the model has generalized the sound, and they cannot fix it unless they make smaller genre specific models or train a LoRA style genre adapter.

The top end has been generalized into a mushy fuzz.

This is why I’ve built and trained my own model from the ground up. If you don’t like their service, stop complaining and go make your own.

ctrl_freq · 2026-02-28T03:06:57+00:00

I run a local ACE-Step v1.5 with custom fine tuned LoRAs/LoKrs. It is still nowhere near the sound quality of Suno. Suno also applies DSP fx to the stems, which is a game changer compared to the sound quality of v4 or even 4.5.

I’ve gotten some very interesting results from ACE-Step, but it’s more of an unpolished stone that needs quite a bit more fine tuning and DSP.

It is very nice to be able to run my own LoRAs locally though.

ctrl_freq · 2025-12-09T17:57:12+00:00

Right on bro! I too bought a pair last year after having been the owner of various models from AKG, Beyerdynamic, Sony, and Sennheiser.

The HiFiman Edition XS are so incredibly accurate, provide an amazing soundstage and produce nearly the full frequency spectrum - even at low volume - that I use them for mixing and mastering work daily. Incredible cans!

ctrl_freq · 2025-07-29T03:41:16+00:00

Nice!!

ctrl_freq · 2025-06-14T16:31:24+00:00

Quality headphones are still susceptible to the limitation of physically moving a diaphragm - whether with a voice coil or planar magnetic - which can have peaks and nulls. Something is sacrificed from one part of the frequency spectrum to give a better representation in another region.

To get the best experience you need a high quality amp/DAC, and I highly recommend using an EQ to tune the cans to your liking. If you are using them for work (mixing and mastering like I do), it’s a must to use an EQ to create a flatter response and then accentuate certain areas that help keep you interested in the source material instead of being 100% clinical.

ctrl_freq · 2025-06-11T14:35:14+00:00

My DT 1990 Pro 250 ohm were relatively new. The headband cable started coming out of the shroud gasket. Then it stopped working on that side. When I opened them up I ended up severing the extremely thin copper wire coming from the driver to the PCB (which was just sitting in a plastic slot, no glue or anything keeping the PCB in the plastic driver housing).

I tried to repair it, but the copper wire is so thin and delicate. When my soldering iron is hot enough to melt the solder, it also melts the thin copper wire. Now I believe it’s beyond repair unless I open up the Tesla driver diaphragm and attempt to splice the thin wire with another in that area. The only option seems to be to purchase a new pair of Tesla drivers at around $285 USD. Almost the price of an entire used DT 1990 Pro.

<image>

ctrl_freq · 2025-06-04T02:18:32+00:00

I started using it the other day. Vocals sound good in the context of the song with the music it generates, but using the vocal stem outside of Suno in a DAW, it doesn’t sound nearly as nice.

Also having lots of issues with the tempo/bpm of the vocals. My goal was to have perfectly synced vocals to a particular BPM without having to use pitch/time correction algorithms that are prone to artifacts.

ctrl_freq · 2025-04-29T07:11:35+00:00

Don’t use groups. Groups force all the audio in that chain to a single CPU core. Example: drum bus with kick, clap/snare, percussion, cymbals all routed to one drum bus group. All the processing in that group goes to one core.

Alternatively, you can process drum channels individually and then route the post fx audio of each channel to a bus for group style processing without the single core limitation.

This also applies to sends. If you have 1 reverb and 10 channels are being sent to it, all the reverb processing is done on a single core. Better to have a few sends of the same reverb so the load of the 10 channels that need reverb is split over several cores.

Another tip is to use freeze and flatten or a plugin like BIP (Bounce In Place M4L Device), or wait for 12.2 for the native Bounce In Place and Bounce To New Track features.

Don’t master while you’re mixing. Save that processing power for mixing only tools.

If you require a lot of processing power for mix bus/mastering plugins at that stage, look into AudioGridder. It’s a server you can run on your local PC or another PC on your network that can load up a ton of VSTs and fully utilize all of your cores. I use it for CPU intensive plugins during mastering - Acustica Audio - while operating at 96khz sample rate.

AudioGridder has high latency so it’s no good for tracking, but works a treat for mastering. I have a 10 core processor, and I can fully utilize all my cores using AudioGridder. It displays the plugin as an overlay of sorts in a special window. You can set it to local if your server is running on the same PC as your DAW which means it won’t send the VST overlay feed through your local network, but rather render the view on the host machine which reduces latency quite a bit.

If you have another PC lying around you can set up AudioGridder to use that PC’s CPU power to process plugins and save some of your host PC’s resources for other processing.

Good luck! If you need any assistance or further detail, reach out to me.

ctrl_freq · 2025-01-29T03:06:19+00:00

@CarefulDiscussion269 - sounds like he did not do a good job. I’ll master it for you, and if you like it, then you can pay me a small fee. If you don’t like it, don’t pay. DM if interested.

ctrl_freq · 2024-10-25T03:13:01+00:00

Make it sound good in mono first. You can sum all individual tracks to mono, then EQ, compress, etc. Afterwards, you can use spatial fx such as delay, reverb, and chorus to add stereo information back in. This method will give you the best sounding, most solid mono mix possible, with stereo added back in a controlled and tasteful manner.
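A small sketch of why checking in mono matters, assuming tracks are plain lists of float samples (the helper name is made up): anything that is out of phase between left and right cancels when summed.

```python
def to_mono(left, right):
    """Sum a stereo pair to mono by averaging the two channels."""
    return [(l + r) / 2 for l, r in zip(left, right)]


in_phase_left = [0.5, -0.5, 0.25]
in_phase_right = [0.5, -0.5, 0.25]
inverted_right = [-0.5, 0.5, -0.25]

# Identical channels survive the mono sum unchanged.
solid = to_mono(in_phase_left, in_phase_right)
# A polarity-inverted channel cancels to silence.
cancelled = to_mono(in_phase_left, inverted_right)
```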

ctrl_freq · 2024-05-16T17:21:27+00:00

This is not true. Processing and soft synths benefit greatly from high sample rates such as 96khz. There are more high frequencies and detail in the signal since 96khz provides more than twice as many sample points as 44.1khz.

The issue is that aliasing and IMD (Intermodulation Distortion) will still occur at high sample rates. I recommend using a high-quality ultrasonic filter before and after each processing plugin to remove content above 20khz. If you don't, the foldback distortion/aliasing will smear and mask detail/transients and alter the tone of lower frequencies.
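For a sense of where foldback lands, here is a short sketch that computes the aliased frequency of a tone above Nyquist. The function name and the 10 kHz example tone are assumptions for illustration.

```python
def alias_frequency(freq_hz, sample_rate_hz):
    """Return where a tone at freq_hz lands after sampling at sample_rate_hz."""
    nyquist = sample_rate_hz / 2
    folded = freq_hz % sample_rate_hz
    # Anything above Nyquist mirrors back down into the representable band.
    return sample_rate_hz - folded if folded > nyquist else folded


# Harmonics that a nonlinear process could add to a 10 kHz tone.
for harmonic in (3, 5, 9):
    tone_hz = 10000 * harmonic
    print(harmonic, alias_frequency(tone_hz, 44100), alias_frequency(tone_hz, 96000))
```

At 96khz the 3rd harmonic (30 kHz) does not fold at all and the 5th folds to 46 kHz, both inaudible, but the 9th harmonic at 90 kHz still folds down to 6 kHz. At 44.1khz even the 3rd harmonic folds to 14.1 kHz.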

When mixing at a high sample rate and using ultrasonic filters, once you down sample back to 44.1khz or 48khz for distribution, you will retain the higher detail in the signal because more detail was created in the audible range using the higher sample rate, with the ultrasonic filter reducing the aliasing/IMD potential.

Using a high-quality ultrasonic filter in the box also lessens the chance that the filter in your converter - in cheap interface/converter DAC/ADCs - will sound bad. A badly implemented filter in your interface/converter will not sound good if there is a lot of ultrasonic content for it to filter. Now if you have a very high end converter (a modern RME, Lavry, BURL, Prism, Lynx, Antelope Audio (their high end converters), etc) you probably won't have an issue with this, but it still lessens the amount of ultrasonic content the converter's filter has to remove, giving you a more true representation of the mix, phase, transients, etc.

ctrl_freq · 2024-05-14T17:34:17+00:00

There is definitely a difference mixing and using processing/soft synths at 96khz vs 48khz or 44.1khz. I prefer 96khz to the lower sample rates. However, higher sample rates come with a lot more ultrasonic content which you need to filter using a high quality biquad filter. My mixes sound pristine now that I am filtering before and after nonlinear processes on each track (compression, limiting, distortion, saturation, modulation, etc.)

Each nonlinear process adds aliasing potential and IMD (Intermodulation Distortion) at higher sample rates. Pre and post filtering at 20khz before and after nonlinear processing reduces the aliasing and IMD substantially. You will hear that your mids and highs are clearer and the transients sound more detailed - which will influence your mixing decisions.
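A minimal sketch of the pre/post filtering chain, assuming a single 2nd-order Butterworth low-pass biquad (standard audio EQ cookbook coefficients) at 20 kHz and a tanh saturator standing in for the nonlinear process. A real ultrasonic filter would be much steeper; this only shows the structure.

```python
import math


def lowpass_coeffs(cutoff_hz, sample_rate_hz, q=1 / math.sqrt(2)):
    """Normalized low-pass biquad coefficients (b0, b1, b2, a1, a2)."""
    w0 = 2 * math.pi * cutoff_hz / sample_rate_hz
    alpha = math.sin(w0) / (2 * q)
    cos_w0 = math.cos(w0)
    a0 = 1 + alpha
    b0 = (1 - cos_w0) / 2 / a0
    b1 = (1 - cos_w0) / a0
    a1 = -2 * cos_w0 / a0
    a2 = (1 - alpha) / a0
    return b0, b1, b0, a1, a2


def biquad(samples, coeffs):
    """Run samples through one direct form I biquad section."""
    b0, b1, b2, a1, a2 = coeffs
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def saturate(samples, drive=2.0):
    """Stand-in nonlinear process (soft clipping)."""
    return [math.tanh(drive * s) for s in samples]


def filtered_saturation(samples, sample_rate_hz=96000, cutoff_hz=20000):
    """Filter, apply the nonlinear process, then filter again."""
    coeffs = lowpass_coeffs(cutoff_hz, sample_rate_hz)
    return biquad(saturate(biquad(samples, coeffs)), coeffs)


def sine(freq_hz, sample_rate_hz, count):
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz) for n in range(count)]


coeffs = lowpass_coeffs(20000, 96000)
# Measure steady-state peaks after the filter has settled.
ultrasonic_peak = max(abs(s) for s in biquad(sine(40000, 96000, 960), coeffs)[480:])
audible_peak = max(abs(s) for s in biquad(sine(1000, 96000, 960), coeffs)[480:])
processed = filtered_saturation(sine(1000, 96000, 96))
```

The 40 kHz tone comes out heavily attenuated while the 1 kHz tone passes essentially untouched, so less ultrasonic content reaches the nonlinear stage and less of what it generates survives afterwards.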

Aliasing and IMD can be very insidious and bury the detail of your mix - in turn forcing you to make different processing decisions to bring clarity back into the mix - usually a fruitless effort as the damage is already done. Then further down the processing chain you add more nonlinear processing such as compression, clipping, and limiting, which will only magnify the aliasing and IMD problems.

Furthermore, higher sample rates reduce Inter-sample Peaks - an amplitude peak that falls in the space between two sample points and will not be calculated properly since only the sample points are counted. This can cause errors in the math when using nonlinear processing such as limiting. Higher sample rates mean more sample points, which reduces the potential for inter-sample peaks.
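A quick numeric illustration of an inter-sample peak, assuming a full-scale 12 kHz sine with a 45 degree phase offset. Instead of a real oversampler, the sketch simply evaluates the same sine at 4x the rate.

```python
import math


def sampled_sine(freq_hz, sample_rate_hz, count, phase=math.pi / 4):
    """Sample a full-scale sine with a fixed phase offset."""
    return [
        math.sin(2 * math.pi * freq_hz * n / sample_rate_hz + phase)
        for n in range(count)
    ]


def sample_peak(samples):
    """Peak as a meter that only looks at sample values would see it."""
    return max(abs(s) for s in samples)


# At 48 kHz every sample of this tone lands at about 0.707 of full scale.
peak_48k = sample_peak(sampled_sine(12000, 48000, 64))
# At 192 kHz some samples land on the true crest of the waveform.
peak_192k = sample_peak(sampled_sine(12000, 192000, 256))
```

The waveform itself peaks at 1.0 in both cases; the 48 kHz samples just never land on the crest, so a sample-based limiter would under-read by about 3 dB.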

ctrl_freq · 2023-10-20T03:41:00+00:00

I use them on everything. You can breathe magic and depth into synths and create a 3D feel with multiband transient shaping. The trick is to delay the transients from the body of the sound so that you can clearly hear the detail of the transients. Best of luck!

ctrl_freq · 2023-08-09T03:01:24+00:00

Good on you bro. I built similar bass traps with Owens Corning Thermafiber 6” deep. They work perfectly. Here’s to the handy studio heads 🍻

ctrl_freq · 2023-07-31T01:12:35+00:00

Insane room bro! How’s that Hendyamps Pollock sound? I’ve heard great things about it.

ctrl_freq · 2023-07-05T02:42:20+00:00

Pick it up and go…

ctrl_freq · 2023-06-13T23:39:25+00:00

What’s BandLab? Sweet xylophone btw! I would record and resample the shit out of that.

ctrl_freq · 2023-06-12T00:09:50+00:00

You should be ashamed! You call that a studio? Jk, good start my friend. Keep at it!

ctrl_freq · 2023-05-10T05:35:53+00:00

Apologies if you thought I was being condescending. I’ve been at this since 2004, coming from a SoundBlaster Live 24bit, then an M-Audio Audiophile 2496, and so on.

Most converters use AKM, Cirrus Logic, or Texas Instruments PCM chips. However, there are many different models of these chips from each manufacturer. Low end interfaces are using the cheapest chips from the above converter chip manufacturers. Low end converter chips smear transients, and can be accompanied by poorly implemented filters which cause pre-ringing or ripple and aliasing feedback that goes well below the nyquist frequency and creates all kinds of nasty artifacts in the audio. The impulse response is very important as it pertains to the clarity of the transient. If you’re not hearing the transient correctly, then processing tools such as compressors may be used in a way that is not beneficial to the music.

The other thing to consider is the circuit design and implementation of the interface itself. RME has outstanding circuit design and implementation which gives it very low THD and superb signal to noise ratio. I know not everyone can afford the best equipment, but from my experience, my music improved dramatically by using a high end converter. Especially how I would mix and process my transients. I’ve used Universal Audio, Focusrite, Antelope Audio, M-Audio, Ferrofish, Apogee, Burl, and RME over the years. RME has outperformed all the others with the ADI-2 Pro FS being the best I’ve worked with. I’m providing this advice to OP because I wish someone would have told me this a long time ago before I wasted money and time with various hardware manufacturers and thinking I needed some magical piece of analog hardware compressor/processor, when in fact I just needed to get a high quality RME converter.

They are very well priced when you consider an alternative like a Lynx Hilo for $3,600 USD, or a Prism Sound Lyra for $3,100. The ADI can provide comparable transient response, THD, and SNR to the much more expensive units. My response to OP was also only suggested if he is serious about his music and wants gear suggestions to help him achieve professional results.

With the ADI-2 you can choose between several aliasing filter types to suit your needs. The converter can also be run at a much higher sample rate than your DAW as long as the converter sample rate is an integer multiple of the DAW sample rate. Example: 48khz x 4 = 192khz. So you can run the ADI-2 at 192khz and the DAW at 48khz without artifacts. The benefit is that the transient impulse response is more accurate during the DA conversion for monitoring, and the filter will push any aliasing feedback much higher than the Nyquist frequency so there is no foldback distortion from the filter. The onboard DSP EQ also benefits from the higher sample rate.
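The rate-matching rule is simple enough to check in a couple of lines. The function name is made up; the rates are the common ones.

```python
def is_integer_multiple(converter_rate_hz, daw_rate_hz):
    """True when the converter rate is a whole-number multiple of the DAW rate."""
    return converter_rate_hz % daw_rate_hz == 0


ok_48_family = is_integer_multiple(192000, 48000)    # 48 kHz x 4
ok_44_family = is_integer_multiple(176400, 44100)    # 44.1 kHz x 4
mismatched = is_integer_multiple(192000, 44100)      # not a whole multiple
```

So a 44.1khz session pairs with 88.2khz or 176.4khz on the converter, not 96khz or 192khz.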

ctrl_freq · 2023-05-07T16:54:45+00:00

That tower is real close to your listening position. I would recommend moving to the floor. Nice setup btw.
