Blanch, Marc, Blasi, Saverio, Smeaton, Alan F. ORCID: 0000-0003-1028-8389, O'Connor, Noel E. ORCID: 0000-0002-4033-9135 and Mrak, Marta (2020) Attention-based neural networks for chroma intra prediction in video coding. IEEE Journal of Selected Topics in Signal Processing. ISSN 1932-4553
Abstract
Neural networks can be successfully used to improve several modules of advanced video coding schemes. In
particular, compression of colour components was shown to
greatly benefit from usage of machine learning models, thanks
to the design of appropriate attention-based architectures that
allow the prediction to exploit specific samples in the reference
region. However, such architectures tend to be complex and
computationally intensive, and may be difficult to deploy in a
practical video coding pipeline. This work focuses on reducing
the complexity of such methodologies, to design a set of simplified and cost-effective attention-based architectures for chroma
intra-prediction. A novel size-agnostic multi-model approach is
proposed to reduce the complexity of the inference process. The
resulting simplified architecture is still capable of outperforming
state-of-the-art methods. Moreover, a collection of simplifications
is presented in this paper, to further reduce the complexity
overhead of the proposed prediction architecture. Thanks to
these simplifications, a reduction in the number of parameters
of around 90% is achieved with respect to the original attention-based methodologies. Simplifications include a framework for reducing the overhead of the convolutional operations, a simplified
cross-component processing model integrated into the original
architecture, and a methodology to perform integer-precision
approximations with the aim of obtaining fast and hardware-aware
implementations. The proposed schemes are integrated into the
Versatile Video Coding (VVC) prediction pipeline, retaining
compression efficiency of state-of-the-art chroma intra-prediction
methods based on neural networks, while offering different
directions for significantly reducing coding complexity.
Metadata
Item Type: | Article (Published) |
---|---|
Refereed: | Yes |
Uncontrolled Keywords: | Chroma intra prediction; convolutional neural networks; attention algorithms; multi-model architectures; complexity reduction; video coding standards |
Subjects: | Computer Science > Machine learning; Computer Science > Digital video; Computer Science > Video compression |
DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; DCU Faculties and Schools > Faculty of Engineering and Computing > School of Electronic Engineering; Research Initiatives and Centres > INSIGHT Centre for Data Analytics |
Publisher: | Institute of Electrical and Electronics Engineers |
Official URL: | http://dx.doi.org/10.1109/JSTSP.2020.3044482 |
Copyright Information: | © 2020 IEEE |
Use License: | This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License. |
Funders: | Science Foundation Ireland, European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska Curie grant agreement No 765140. |
ID Code: | 25285 |
Deposited On: | 07 Jan 2021 15:22 by Noel Edward O'Connor. Last Modified 07 Jan 2021 15:22 |