Ultra-low complexity neural networks for next generation video decoding
2025
Video compression enables video content to be delivered to customers at low bitrates and high visual quality. In this paper, we consider the problem of embedding a neural network directly into a video decoder. This requires a design that operates at latencies low enough to decode tens to hundreds of high-resolution images per second, and a network whose complexity is suitable for implementation on mobile and power-constrained devices. We explore the use of multi-scale convolutional neural networks to achieve these goals. By combining canonical polyadic decompositions, reduced channel counts, and a super-resolution system design, we create a network with a complexity of 584 multiply-and-accumulate operations per input sample, which we argue is tractable for practical implementation. We then introduce a method to control the network using side information carried in the video bitstream. Integrating the approach into a state-of-the-art codec demonstrates its efficacy: the solution reduces the number of bits required to transmit a video sequence by 30.4% at the same visual quality.
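To see why a canonical polyadic (CP) decomposition helps meet a per-sample MAC budget, the sketch below compares the per-pixel cost of a dense k × k convolution with a rank-R CP factorisation of the same layer (a 1×1 input projection, separable k×1 and 1×k filters per rank component, and a 1×1 output projection). The channel counts, kernel size, and rank are hypothetical illustrations, not the paper's actual layer configuration.

```python
def dense_conv_macs(c_in: int, c_out: int, k: int) -> int:
    """MACs per output pixel for a dense k x k convolution."""
    return k * k * c_in * c_out


def cp_conv_macs(c_in: int, c_out: int, k: int, rank: int) -> int:
    """MACs per output pixel when the kernel is CP-factorised into:
    a 1x1 input projection (c_in * rank), a k x 1 and a 1 x k
    separable filter per rank component (2 * k * rank), and a
    1x1 output projection (rank * c_out)."""
    return c_in * rank + k * rank + k * rank + rank * c_out


if __name__ == "__main__":
    c_in = c_out = 16  # hypothetical channel count
    k, rank = 3, 8     # hypothetical kernel size and CP rank
    print(dense_conv_macs(c_in, c_out, k))     # 2304 MACs per pixel
    print(cp_conv_macs(c_in, c_out, k, rank))  # 304 MACs per pixel
```

Under these illustrative settings the factorised layer needs roughly 7x fewer MACs per pixel than the dense convolution, which is the kind of saving that, combined with reduced channel counts and low-resolution (super-resolution) processing, makes a budget of a few hundred MACs per input sample plausible.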