AV1, to love or not to love?

More often than not, a hobby begins with an interest in or excitement about a certain subject, and this was how I got into hosting a media server. As someone who had never dabbled in this before, my initial thought process was: get some videos, put them in, watch them. Easy. Unbeknownst to me, this project would send me spiraling down the rabbit hole of video codecs on more than one occasion. After spending multiple weekend afternoons on it, I have finally reached this exact moment, where I rave and vent about this video codec.

What is AV1?

AV1 is a royalty-free video coding format built to succeed VP9. The royalty-free licensing model has made it significantly easier for open-source projects to adopt it, but it has also made AV1 more difficult to understand, as not every implementation follows the same operating philosophy. The result is several different encoders; the two I came across most often were SVT-AV1 and aomenc (or AOM-AV1). The former is touted as quicker and the latter as more quality oriented.

The Story

When I first began my Jellyfin server with the help of my friend, it was often suggested that I should adopt AV1, as it is not only more storage efficient than H.264 but also the future. My own investigation proved this to be true as well, and so we decided to adopt HandBrake into our workflow. This would have been the end of the story, but the more I used HandBrake, the more I disliked it, as it was difficult to find documentation on using AV1 with HandBrake. Still, I shrugged it off as something I had to get accustomed to and left it at that.

Before long, though, I found out that some videos simply did not like going through HandBrake's SVT-AV1 encoder. Most videos were fine and could seek without issue in MPC-HC, but some would not play for 3-4 seconds and would then play at very high speed to catch up. Digging around forums told me that I had to pass some SVT-AV1 arguments to improve performance. No big deal, there was an "Advanced Options" box for that very purpose after all. At first, performance felt better after I added keyint=120, but there was still a delay of about 1 second after seeking. Eventually, I gave up and decided to jump to ffmpeg, since there was official documentation on using ffmpeg with SVT-AV1. Within half an hour, I got the seeking performance I wanted, but I am well aware of the world of pain I will eventually have to face, since I now have to account for variables that HandBrake probably already handled for me.
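As a rough sanity check on why the key-frame interval matters for seeking: a player generally has to start decoding from the nearest preceding key-frame, so the interval puts an upper bound on how much video it may need to decode before it can show the frame you asked for. Here is a minimal shell sketch of that arithmetic. The 24 fps frame rate is an assumption (the frame rate of the source is not stated above), and keyframe_gap_seconds is just an illustrative helper name.

```shell
#!/bin/sh
# Worst-case distance between key-frames, in seconds, for a given
# key-frame interval (in frames) and frame rate (frames per second).
# awk is used so that fractional frame rates such as 23.976 also work.
keyframe_gap_seconds() {
    keyint=$1
    fps=$2
    awk -v k="$keyint" -v f="$fps" 'BEGIN { printf "%.2f\n", k / f }'
}

keyframe_gap_seconds 120 24   # keyint=120 at an assumed 24 fps, prints 5.00
keyframe_gap_seconds 24 24    # a 24-frame interval at 24 fps, prints 1.00
```

Under that assumption, keyint=120 leaves key-frames up to 5 seconds apart, while an interval of 24 frames brings that down to 1 second.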

More love but sometimes, ugh.

Ultimately, after tinkering for another afternoon, I found the perfect setting for my current use-case.

ffmpeg -i <input file name> -pix_fmt yuv420p10le -c:v libsvtav1 -crf 25 -preset 2 -svtav1-params tune=0:film-grain=8 -g 24 -c:a libopus -b:a 128k -ac 2 -c:s copy <output file name>

Here's the breakdown:

-pix_fmt yuv420p10le forces the video to be encoded in 10-bit, even if the source was 8-bit. Doing so can yield slightly better dark scenes.

-c:v libsvtav1 sets the video encoder to SVT-AV1, which was originally developed by Intel.

-crf 25 sets the constant rate factor, for which SVT-AV1 accepts 1 to 63. Lower values give higher quality and larger files.

-preset 2 sets the trade-off between encoding speed and compression efficiency. SVT-AV1 accepts 0 to 13; higher is faster but compresses less efficiently, so the file takes more storage space at the same quality.

-svtav1-params passes parameters straight to the SVT-AV1 encoder, written as key=value pairs separated by colons.

tune=0 is an SVT-AV1 argument that sets whether the encoder optimises for subjective visual quality (0) or for objective quality (1).

film-grain=8 enables SVT-AV1's film grain synthesis, which removes the real film grain before encoding and has the player add synthetic grain back during playback, making the video easier to compress. It takes a value from 1 to 50 (0 turns it off), and the largest value gives the most noise.

-g 24 tells SVT-AV1 how often to force a key-frame; here it does so every 24 frames.

-c:a libopus sets the audio codec to Opus.

-b:a 128k sets the audio bitrate to 128 kbit/s.

-ac 2 sets the number of audio channels, here 2 (stereo), downmixing if the source has more.

-c:s copy copies the subtitles from the input video to the output video as-is, without re-encoding them, though I do not know much about this at the moment.
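To avoid retyping all of that, the flags above can be wrapped in a small shell function. This is only a sketch: encode_av1, DRY_RUN and the CRF, PRESET, GRAIN and GOP variables are names made up for illustration, not part of ffmpeg or SVT-AV1, and the defaults simply mirror the command above.

```shell
#!/bin/sh
# Hypothetical wrapper around the ffmpeg command from this post.
# Usage: encode_av1 <input file name> <output file name>
# Optional overrides (illustrative names): CRF, PRESET, GRAIN, GOP.
# If DRY_RUN is non-empty, the command is printed instead of executed.
encode_av1() {
    in=$1
    out=$2
    # Build the argument list in "$@" so file names with spaces stay intact.
    set -- ffmpeg -i "$in" \
        -pix_fmt yuv420p10le \
        -c:v libsvtav1 -crf "${CRF:-25}" -preset "${PRESET:-2}" \
        -svtav1-params "tune=0:film-grain=${GRAIN:-8}" \
        -g "${GOP:-24}" \
        -c:a libopus -b:a 128k -ac 2 \
        -c:s copy \
        "$out"
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"   # show what would be run
    else
        "$@"                 # run ffmpeg for real
    fi
}

# Print the command for a pair of placeholder file names without encoding.
DRY_RUN=1
encode_av1 "input.mkv" "output.mkv"
```

Unsetting DRY_RUN runs the encode for real, and setting something like CRF=30 before the call changes a single value without touching the rest of the command.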

These settings are great and follow what SVT-AV1 had in their documentation. On devices that support AV1, this did great on my Jellyfin server, and performance was miraculous. That being said, pray and hope that your users all have devices that support it.

Other than that, AV1 is great: it saves storage space and it is quick too. Getting here was irritating, though, and I really do hope that more people adopt it, so that more people ask edge-case questions and information becomes easier to find.


Most of the information can be found here: