In video codecs, CNN-based models have shown considerable promise in two related tasks: in-loop restoration and frame super-resolution. In our previous work, we presented a framework that uses a common CNN architecture with downloadable model parameters for both tasks, together with a preliminary performance study; encoder-side selection of the scale factor was left as future work. The advantage of a common architecture with switchable parameters is that a single hardware inference engine can serve both same-resolution and super-resolution restoration, thereby limiting implementation costs. In this paper, we fully integrate this framework into the under-development AV2 video codec from the Alliance for Open Media (AOM). We also implement an algorithm for encoder-side selection of the super-resolution scale factor. With this implementation, we achieve combined compression improvements of up to −3.5% in the all-intra (AI) configuration and −3.9% in the random-access (RA) configuration in BD-rate for PSNR-Y, and up to −7.8% (AI) and −7.9% (RA) in BD-rate for VMAF, with inference cost as low as 1500 MACs/pixel.
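To make the MACs/pixel figure concrete, the following is a minimal sketch of how multiply-accumulate operations per output pixel can be counted for a stack of stride-1 convolution layers running at the output resolution. The `ConvLayer` type, the `macs_per_pixel` helper, and the layer sizes in `HYPOTHETICAL_LAYERS` are illustrative assumptions only; they are not the architecture used in this work.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ConvLayer:
    """A stride-1 2D convolution layer (hypothetical description type)."""
    kernel_h: int
    kernel_w: int
    in_channels: int
    out_channels: int
    groups: int = 1  # groups > 1 models grouped/depthwise convolution


def macs_per_pixel(layers: Iterable[ConvLayer]) -> int:
    """Sum the MACs needed per output pixel over all layers.

    For a stride-1 convolution, each output pixel of each output channel
    needs kernel_h * kernel_w * (in_channels / groups) multiply-accumulates.
    """
    total = 0
    for layer in layers:
        per_output_channel = (
            layer.kernel_h * layer.kernel_w * (layer.in_channels // layer.groups)
        )
        total += per_output_channel * layer.out_channels
    return total


# Illustrative layer stack only (an assumption, not the network in the paper):
# a single-channel input and output with two 8-channel hidden layers.
HYPOTHETICAL_LAYERS = [
    ConvLayer(3, 3, 1, 8),  # 3*3*1*8 = 72 MACs/pixel
    ConvLayer(3, 3, 8, 8),  # 3*3*8*8 = 576 MACs/pixel
    ConvLayer(3, 3, 8, 8),  # 3*3*8*8 = 576 MACs/pixel
    ConvLayer(3, 3, 8, 1),  # 3*3*8*1 = 72 MACs/pixel
]

if __name__ == "__main__":
    print(macs_per_pixel(HYPOTHETICAL_LAYERS))
```

Under these assumptions the count is independent of frame size, which is why the cost is quoted per pixel. Layers that run at a reduced resolution, as can happen in a super-resolution path, would need their contribution scaled by the ratio of their resolution to the output resolution; this sketch does not model that.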