Optimizing half precision Winograd convolution on ARM many-core processors

By Dedong Xie, Zhen Jia, Zili Zhang, Xin Jin
2022
Download Copy BibTeX
Copy BibTeX
Convolutional Neural Networks (CNNs) are widely used in real world applications, e.g, computer vision. Winograd based convolution are usually applied due to its low computation complexity. For the underling hardware, ARM many-core CPUs, by their price performance, are favored by cloud providers like Amazon Web Services (AWS). However, existing Winograd convolution implementations for ARM architecture are mostly optimized for mobile devices, and usually can not fully utilize hardware resources of many-core processors.

In this paper, we propose HAWC, an optimized half precision floating-point (FP16) Winograd convolution implementation for ARM many-core processors. HAWC employs a series of optimization methods, which are suitable for ARM NEON architecture, and assembles them as a whole solution to improve performance. Our evaluations show that the HAWC achieves on average 10.74× and up to 27.56× speedup on representative convolution layers over state-of-the-art solutions.
Research areas

Latest news

IN, TS, Hyderabad
Welcome to the Worldwide Returns & ReCommerce team (WWR&R) at Amazon.com. WWR&R is an agile, innovative organization dedicated to ‘making zero happen’ to benefit our customers, our company, and the environment. Our goal is to achieve the three zeroes: zero cost of returns, zero waste, and zero defects. We do this by developing products and driving truly innovative operational excellence to help customers keep what they buy, recover returned and damaged product value, keep thousands of tons of waste from landfills, and create the best customer returns experience in the world. We have an eye to the future – we create long-term value at Amazon by focusing not just on the bottom line, but on the planet. We are building the most sustainableRead more