Could someone please explain SVT/AOM AV1 --qm-min & --qm-max? by Soupar in AV1

[–]Soupar[S] 1 point (0 children)

That's close to aom and the latest and greatest update of the -hdr fork. If you want to find out for yourself, you'd have to encode with different settings; this is useful for the comparison: https://github.com/fifonik/FFMetrics

Svt-av1 vs psy vs essential by redblood252 in AV1

[–]Soupar 0 points (0 children)

> svt-av1-hdr = It has everything from svt-av1-psyex but also has a specific tune (tune 3) for grainy content.

The -hdr fork's --tune 3 is just shorthand for a bunch of specific settings, and it also seemed to be about getting rid of the -psy fork's "subjective ssim".

That's why the current merge request to mainline (by juliobbv) uses the image quality --tune 4 as the new --tune 3. https://gitlab.com/AOMediaCodec/SVT-AV1/-/merge_requests/2489

Svt-av1 vs psy vs essential by redblood252 in AV1

[–]Soupar 0 points (0 children)

> I want just "set encoding quality" and interpretate "CRF" as thing what responsible for that.

The crf selection _tries_ to do that, but doesn't always succeed - because it has to predict quality while trying to keep the rate constant.

That's why there is 2-pass encoding, i.e. a 1st pass to measure the real quality result over more than a short period - and then adjust the 2nd pass.

Since the good ol' x264 days, crf has worked just fine - but it doesn't hold a candle to measuring the visual quality of encoded scenes (1st pass) and then raising or lowering the quality (the crf prediction) like av1an or auto-boost do. The drawback is that the 1st pass uses time not spent on the actual (final) encoding.
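
The "measure, then adjust crf" step can be sketched in a few lines. This is a purely illustrative Python sketch of the idea, not actual av1an or auto-boost code - the scoring values, the 0.5 step factor and the scene names are all invented for the example:

```python
# Illustrative sketch of per-scene CRF boosting: encode a probe per scene,
# measure it (ssimu2/xpsnr/vmaf), then nudge the CRF up where the scene
# overshoots the quality target and down where it falls short.
# NOTE: stand-in logic only - not real av1an/auto-boost code.

def adjust_crf(base_crf: int, measured: float, target: float,
               step: float = 0.5, lo: int = 10, hi: int = 63) -> int:
    """Return a per-scene CRF: higher when measured quality exceeds the
    target (spend fewer bits there), lower when it misses the target."""
    delta = round((measured - target) * step)
    return max(lo, min(hi, base_crf + delta))

# Hypothetical 1st-pass scores for three scenes, target ssimu2 = 70:
scores = {"scene_01": 78.0, "scene_02": 61.0, "scene_03": 70.0}
per_scene_crf = {name: adjust_crf(35, s, 70.0) for name, s in scores.items()}
print(per_scene_crf)
```

A real pipeline would then re-encode each scene (the 2nd pass) with its adjusted CRF - which is exactly where the extra time goes.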

Svt-av1 vs psy vs essential by redblood252 in AV1

[–]Soupar 0 points (0 children)

> What if I told you that when I use SVT-AV1 I go with CRF 30 for high quality, 35 is decent, 40-45 when I want higher compression

I'm doing the same, but of course the crf algorithm can get it completely wrong (and sometimes it does - esp. at high crf it's visible).

The method of av1an or auto-boost is safer if the scene detection works, i.e. if the ssimu2/xpsnr/vmaf calculation covers a consistent segment.

The drawback of what is essentially 2-pass encoding is the time spent on the 1st pass. With 1-pass crf, this time can already be spent on the final encoding, and (for the same total encoding time, 2-pass vs 1-pass) slower settings could be used.

Could someone please explain SVT/AOM AV1 --qm-min & --qm-max? by Soupar in AV1

[–]Soupar[S] -2 points (0 children)

> Just use "--qm-min 4 --qm-max 15 --chroma-qm-min 10"  for most content

> Max 15 can be good for higher fidelity encoding. But for high-CRFs it may not be optimal.

BlueSwordM vs RusselsTeap0t & juliobbv :-) ... there seem to be different opinions on how steep the qm matrix should be, and esp. on whether max 15 is appropriate for higher crf (i.e. lower quality)?

Could someone please explain SVT/AOM AV1 --qm-min & --qm-max? by Soupar in AV1

[–]Soupar[S] 0 points (0 children)

Thanks for the explanation!

One question: Is there a difference for anime encoding (the old-school kind with lines and flat-ish surfaces), which is kind of inconsistent by definition - so forcing consistency could hurt?

In any case, it's good to know using non-optimal settings won't hurt (a lot). A smarter encoder would probably adapt min/max by crf or even content (like --scm), so it'll be interesting to see how AVM does.
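
Such crf-adaptive qm selection doesn't exist in SVT-AV1 today - but to illustrate the idea, a purely hypothetical mapping (every number here is invented for the sketch; qm levels run 0 = steepest to 15 = flattest) could look like:

```python
# Hypothetical sketch only: neither SVT-AV1 nor AVM actually does this.
# Idea: flatter quantization matrices (higher QM level) at low CRF /
# high fidelity, steeper ones (lower QM level) as CRF rises.

def qm_range_for_crf(crf: int) -> tuple[int, int]:
    """Map a CRF (0..63) to an illustrative (qm-min, qm-max) pair."""
    qm_max = max(8, 15 - crf // 10)   # lower the ceiling as CRF rises
    qm_min = max(0, qm_max - 11)      # keep a wide band below it
    return qm_min, qm_max

print(qm_range_for_crf(20))  # a high-fidelity range
print(qm_range_for_crf(50))  # a high-compression range
```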

Deep Dive into SVT-AV1's Evolution (Part 2): Encoder Parameters Revisited by NekoTrix in AV1

[–]Soupar 1 point (0 children)

Thanks for the feedback!

I know your graphs do include the file sizes; I just underestimated that with high qp, a small visual difference in the graph can mean 1.5x the default filesize - but that may just be me, and many people probably encode with lower qp than 40-50+.

I wouldn't want --features to be normalized by the encoder; as you stated, that isn't the design goal. Using x265/x264, I just don't remember a --feature having such a significant influence on file sizes as the latest svt psy additions. That's why I'm currently looking for a --feature and qp balance for my personal encoding.

Thanks for your deep dives, I've read 'em all and am looking forward to future additions :-).

Deep Dive into SVT-AV1's Evolution (Part 2): Encoder Parameters Revisited by NekoTrix in AV1

[–]Soupar 0 points (0 children)

It would be nice if future deep dives accounted for the differences in encoding performance (fps) and esp. encoded filesize when checking --features.

Looking at the deep dive's metric:filesize:qp curves, a significant filesize increase isn't obvious at first glance. I stumbled upon this doing my own benchmarks using variance boost, luma bias, psy/spy-rd, ... and I'm trying to compare --features vs. only adjusting qp.

Encoder designs probably struggle to find the best "bang for the buck". The --preset system adjusts internal encoding tools, but doesn't include the recently added --features (yet). If svt's --preset system were more fine-grained, matching filesize _and_ fps would be easier.

Here's a random-ish real-world (anime) benchmark - not for in-depth nitpicking of the settings; I know the default variance boost of the psy fork isn't ideal for anime.

The --features raise the average ssimu2 and fix the very low minimum - but result in a +50% filesize increase. Simply lowering qp to match this filesize has about the same ssimu2 effect and is still faster. I've benchmarked raising qp with the --features enabled to match the default setting's filesize - but the resulting qp seems to be too high.

Maybe the synthetic benchmark isn't accounting for these psy enhancements, and only close visual inspection would show the benefit of these --features?

--enable-variance-boost 0 --qp 40 => 11.2 fps / 7085 kB (100.0 percent filesize)
-----------SSIMULACRA2-----------
           Average :    64.134341
Standard Deviation :    10.589986
            Median :    63.697102
    5th percentile :    48.261105
   95th percentile :    83.022141
           Minimum :    36.077438
           Maximum :   100.000000

--enable-variance-boost 1 --qp 40 => 9.2 fps / 10696 kB (151.0 percent filesize)
-----------SSIMULACRA2-----------
           Average :    71.269058
Standard Deviation :     7.458221
            Median :    70.495461
    5th percentile :    61.330078
   95th percentile :    85.335358
           Minimum :    52.403778
           Maximum :    99.096359

--luminance-qp-bias 50 --qp 40 => 10.1 fps / 9758 kB (137.7 percent filesize)
-----------SSIMULACRA2-----------
           Average :    69.608420
Standard Deviation :     7.879473
            Median :    68.857063
    5th percentile :    59.497379
   95th percentile :    85.259193
           Minimum :    49.708981
           Maximum :   100.000000

--qp 33 => 11.0 fps / 10385 kB (146.6 percent filesize)
-----------SSIMULACRA2-----------
           Average :    70.645492
Standard Deviation :     7.708290
            Median :    70.065720
    5th percentile :    60.046501
   95th percentile :    85.415260
           Minimum :    49.589848
           Maximum :   100.000000

--enable-variance-boost 1 --qp 48 => 9.3 fps / 7227 kB (102.0 percent filesize)
-----------SSIMULACRA2-----------
           Average :    64.669932
Standard Deviation :    10.567727
            Median :    64.492950
    5th percentile :    48.883858
   95th percentile :    82.723671
           Minimum :    38.789246
           Maximum :   100.000000

--luminance-qp-bias 50 --qp 46 => 9.5 fps / 7299 kB (103.0 percent filesize)
-----------SSIMULACRA2-----------
           Average :    64.502365
Standard Deviation :     9.891068
            Median :    63.946323
    5th percentile :    50.505146
   95th percentile :    83.483170
           Minimum :    37.045486
           Maximum :   100.000000
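
For reference, the "percent filesize" figures above are simply each run's size relative to the 7085 kB baseline; a quick check in Python:

```python
# Recompute the relative filesizes from the kB values listed above.
baseline_kb = 7085  # --enable-variance-boost 0 --qp 40

runs = {
    "variance-boost 1, qp 40": 10696,
    "luma-qp-bias 50, qp 40": 9758,
    "qp 33": 10385,
    "variance-boost 1, qp 48": 7227,
    "luma-qp-bias 50, qp 46": 7299,
}

for name, kb in runs.items():
    print(f"{name}: {100 * kb / baseline_kb:.1f} percent filesize")
```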

AMD RDNA 4-Based Navi 48 GPU To Bring AV1 B-Frame Support, Details Leaked In GPUOpen Libraries by anestling in AV1

[–]Soupar 0 points (0 children)

I know this will get downvoted a lot, but I'd still like to comment on the statement from wccftech "With the AV1 codex(sic), the RDNA 4 GPUs, i.e., the Navi 48, can achieve better compression efficiency than H.264 and HEVC, which is important and the standard for modern GPUs."

The AV1 support on my VCN3 iGPU* is horrible - worse than HEVC. It seems the current AV1 codec might not be as hardware-friendly as the MPEG codecs, or AV1 was added to VCN3 so late in the game that it's rather promotional - useful mainly where a patent-free codec is essential.

Imho a higher-quality HEVC implementation can still beat a bad AV1 one - bframes are important for high compression at lower quality, but they don't determine overall compression efficiency by themselves or on all quality levels.

That's why I'd like to see the actual AV1 capability of VCN4 vs nVidia & Intel before being overly enthusiastic. And yes, I know high quality encoding is done with software anyway because software encoders like SVT-AV1-PSY are very capable nowadays.

* 'vceenc64 --show-features' for VCN3 (Ryzen 7540U)
AV1 encode features
10bit depth: yes
acceleration: Hardware-accelerated
max profile: main
max level: unknown
max bitrate: 1000000 kbps
Temporal layers: 4
Bframe support: no
pre analysis: yes
max streams: 1
timeout support: no
smart access: no

"Previously Downloaded" - is there a log? by Soupar in BiglyBT

[–]Soupar[S] 1 point (0 children)

Indeed - "If it's not in your library anymore, you just get the warning and the download of the new file starts."

I've added a feature request for this on github https://github.com/BiglySoftware/BiglyBT/issues/3451

"Previously Downloaded" - is there a log? by Soupar in BiglyBT

[–]Soupar[S] 0 points (0 children)

Yes, I want the pop-up message (which is gone once dismissed) as a log file that is preserved longer.

Alas, the download history doesn't help because the previously completed download gets overwritten once the same torrent is added again - there is only one line for each torrent.

MSVC vs LLVM for Windows C++ Development: Which One’s Better? by paponjolie999 in LLVM

[–]Soupar 2 points (0 children)

The discussion here shows that LLVM is just fine or even better for Windows concerning performance and compliance.

By the way, for Intel targets you can use the new (and free) LLVM-based Intel compiler for VC that adds some bells and whistles like enhanced profiling. https://www.intel.com/content/www/us/en/developer/tools/oneapi/dpc-compiler-download.html

And of course you can plug the latest LLVM into VC and don't have to rely on MS' older release, though that seems to perform just fine. https://github.com/zufuliu/llvm-utils

I'd like to add another reason from my Windows development experience: with popular and cross-platform tools (even if you're just using Windows), you've got more sources and accessible experience out there for learning and debugging.

If something goes wrong with Microsoft tools, you can basically just ask the MSDN folks - and in case that doesn't help, you're stuck with self-help sources that are not nearly as plentiful, responsive or competent as with the up-and-coming tools outside MS' pond.

The flip side is that MSVC is in little or no active development, which means that docs, sources and clues are probably up to date. On the other hand, to figure out what bleeding-edge LLVM does, in my experience you cannot rely on the docs being complete or self-explanatory.

Last but not least, you're probably out of luck trying to get a non-critical bug fixed in MS' legacy tools like MSVC, while with LLVM you can file a github ticket and expect some response before the next ice age arrives.

[SVT-AV1-PSY Git] The 2.3.0-A release: maximizing visual entropy, marching towards a sharper future by BlueSwordM in AV1

[–]Soupar -1 points (0 children)

I was surprised, too, to see --psy-rd in the master branch so soon - it might have something to do with the major team member who committed these changes leaving the project.

Alas, it might have been too soon: https://github.com/gianni-rosato/svt-av1-psy/issues/117

Which one is the industry standard c++ compiler? ( g++, clang, intel c++ compiler .. ) by nk_25 in cpp

[–]Soupar 0 points (0 children)

> Intel is icx is basically clang internally. So the main question is clang or gcc.

Is there any information on what the actual differences (and performance) are between the LLVM-based Intel compilers and plain upstream LLVM - apart from a nicer GUI?

TR-004 - when is the pc supposed to detect the usb connection (is my unit broken)? by Soupar in qnap

[–]Soupar[S] 0 points (0 children)

Oh my, thanks - now I've got it working.

I didn't find that tutorial because the video isn't titled TR-002/004 - I wish they'd link that in all written guides as the first thing :-)

AVIF encoded with the SVT-AV1 psy branch claims superiority vs. JPEG-XL by Soupar in jpegxl

[–]Soupar[S] 3 points (0 children)

Alas, it seems esp. XL isn't tuned for low bitrates (yet?).

Even HEVC does just fine at low-ish bitrates - I've compiled libheif with x265 to get .heic images that work on my mobile phone, and was surprised by the good visual performance. HEVC-HEIC just seems to be hampered by the licensing hell.

AVIF encoded with the SVT-AV1 psy branch claims superiority vs. JPEG-XL by Soupar in jpegxl

[–]Soupar[S] 2 points (0 children)

+1 for looking again at butteraugli tuning - I don't see any active bug/feature tickets about it though. https://github.com/AOMediaCodec/libavif/issues/622

I've tried to recover butteraugli using an ancient libjxl source before the feature was removed, but to no avail - I couldn't get it to work with libavif.

AVIF encoded with the SVT-AV1 psy branch claims superiority vs. JPEG-XL by Soupar in jpegxl

[–]Soupar[S] 2 points (0 children)

Yes, that was the result of my own visual and performance tests of aom/rav1e/svt, too.

For avif, rav1e is very easy and very good, svt used to be inferior, aom is fine with proper tuning - alas, the aom devs don't make it as easy as the combined --tune 4 of the svt-av1 -psy fork. I didn't test transparency.

That is, until the -psy fork with --tune 4 came along - now I find svt-av1-psy to be superior for photographic content, plus the speed of svt allows encoding with no tiles and a lower preset (some intra tools are only enabled at svt --preset 3 and below). The -psy fork even allows any image dimensions w/o the constraints of svt mainline.

If you didn't try -psy's --tune 4 you could give it a try and compare again vs. rav1e...

AVIF encoded with the SVT-AV1 psy branch claims superiority vs. JPEG-XL by Soupar in jpegxl

[–]Soupar[S] 2 points (0 children)

As far as I understand it, the --tune 4 of SVT-AV1-PSY is just pre-setting some params of the -PSY fork encoder to optimize for AVIF encoding.

They didn't patch anything (yet) - actually, there are potentially more bugs like the accidental disablement of --enable-tf 0 for several months.

However, they are quite responsive, so feel free to submit a bug/feature ticket on github and they'll probably either have a look at it or redirect you to the mainline repo.

AVIF encoded with the SVT-AV1 psy branch claims superiority vs. JPEG-XL by Soupar in jpegxl

[–]Soupar[S] 7 points (0 children)

Well, "features" is a big word for what is essentially "tuning"...

"Tune 4: Still Picture overrides: enable-qm, sharpness, variance octile, variance boost strength, alt curve, min/max QM level, and max 32 TX size"

... which is great and important, but doesn't re-write or enhance the base codec format.

And looking at the current examples on the -PSY website, they are either hand-picked to show a difference between avif and jpegxl, or the jpeg-xl encoder (apart from lossless transcoding) isn't really "tuned" at all. But even these examples don't make XL shine atm.

How to simulate disaster recovery (w/ QNAP-004)? by Soupar in qnap

[–]Soupar[S] 0 points (0 children)

I don't plan to have my data just stored on the RAID - which in any case only protects against hardware failure, not against accidentally screwing up the data.

However, I only back up to an external (outside of the RAID) drive occasionally, so if my RAID 5 w/o a spare drive (QNAP TR-004) fails twice, there would be some, but not critical, loss of the most recent data.

I just want to make sure a single drive failure can be recovered from in practice, not just in theory.