Huawei recently unveiled its latest flagship smartphone, the Honor 7, and like many of this year’s flagships, an increased amount of attention has been placed on development and marketing of the handset’s camera, which features a 20MP sensor and f/2.0 aperture. As well as the physical hardware, post-processing is a major factor that determines final image quality. Huawei, supported by the ARM ecosystem team, was able to substantially optimize its most advanced image processing algorithms by running them on the on-chip ARM Mali GPU.
To this end, Huawei used OpenCL, an industry-standard API and framework that lets programs run code across heterogeneous platforms, so that each task can be allocated to the most suitable processing unit.
What they did differently
Traditionally, camera modules come with their own image signal processing (ISP) logic that is charged with processing the data gathered from the sensor, handling tasks such as de-noising, sharpening, and color correction. Modern application processors also typically embed one or more ISPs, and in some cases a DSP is used.
This tried and tested hardware setup serves its purpose and usually offers a good balance of cost, performance, area, and power requirements, which are all important points to consider when designing a mobile device.
HiSilicon’s Kirin mobile SoCs are based on ARM CPU and GPU technologies.
However, hardware has a critical limitation: it cannot be changed once it is committed to silicon and this happens quite some time before the final device makes its way into the customer’s hands.
One of the major advantages of using OpenCL on the GPU is that developers can easily update and improve their image processing algorithms. Traditionally, manufacturers can’t easily reprogram the ISP once it has been embedded into a product, meaning that development has to be done in advance and meaningful software improvements are tough to implement. Moving over to OpenCL means that additional software tweaks and updates can be patched in at a later date, while other implementations are locked to the hardware.
Huawei and ARM benefited from using OpenCL-based development, as it meant that they could continue to tweak their algorithms late into the development process.
GPU acceleration and performance
As you have probably noticed, ARM is quite big on the heterogeneous concept, whereby specific computational tasks are assigned to the most efficient type of processor. Huawei’s HiSilicon Kirin 935 SoC, found inside the Honor 7, is an excellent example of this type of processor design, with two clusters of quad-core Cortex-A53 cores at different speeds, combined with a Mali-T628 MP4 GPU.
When it comes to image processing, there are a lot of complex computational tasks that take place over multiple pipeline stages and often in parallel. Even processes that appear to be simple, such as de-noise, contain many steps, from detection to blurring and filtering. These types of filters play a core role in mobile devices, in order to make up for the small image sensor sizes and compensate for noise in low light environments. Most photos are taken in challenging lighting conditions and it is essential that a mobile device is able to cope with this to ensure a good end-user experience.
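To make the detection, blurring, and filtering steps concrete, here is a minimal sketch in plain Python. It is not Huawei's or ARM's algorithm: the 3x3 neighbourhood, the fixed threshold, and the function names `denoise_pixel` and `denoise` are all assumptions chosen for illustration. The point it shows is that each output pixel depends only on a small patch of the input, which is the kind of independent per-pixel work a GPU can spread across many threads.

```python
# Toy de-noise pass on a grayscale image stored as a list of rows.
# Assumptions (not from the article): 3x3 neighbourhood, fixed threshold of 40.

def denoise_pixel(img, x, y, threshold=40):
    """Compute one output pixel; on a GPU this would be one unit of parallel work."""
    h, w = len(img), len(img[0])
    # Gather the 8 surrounding pixels, clamping coordinates at the image border.
    neighbours = [
        img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    ]
    mean = sum(neighbours) / len(neighbours)
    # Detection: a pixel far from the mean of its neighbours is treated as noise.
    if abs(img[y][x] - mean) > threshold:
        # Blurring/filtering: replace the outlier with the local mean.
        return round(mean)
    return img[y][x]

def denoise(img, threshold=40):
    # Every output pixel reads only the input image, so iterations are independent.
    return [
        [denoise_pixel(img, x, y, threshold) for x in range(len(img[0]))]
        for y in range(len(img))
    ]

noisy = [
    [10, 10, 10],
    [10, 200, 10],  # a single "hot" pixel in the middle
    [10, 10, 10],
]
clean = denoise(noisy)
print(clean)
```

Here the hot pixel in the centre is pulled back to the value of its neighbours, while the surrounding pixels are left untouched. A production filter would be far more elaborate, but the per-pixel independence is what makes this class of work a natural fit for GPU compute.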
We can spot a reference to OpenCL in the Honor 7’s launch presentation. Now we know what it’s all about.
High-resolution images contain a huge amount of data, which has to be processed very quickly if we want real-time output. This sounds like a pretty suitable task for a graphics processing unit, which offers high memory bandwidth and is used to dealing with lots of pixel data for functions such as UI drawing and gaming.
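A rough back-of-the-envelope calculation gives a feel for the numbers involved. Only the 20MP figure comes from the article; the 2 bytes per pixel, the 30 frames per second, and the assumption that each processing pass reads and writes the full frame once are illustrative guesses, and a real camera preview typically runs at a much lower resolution than the full sensor output.

```python
# Rough estimate of the data volume behind a 20MP image.
# Only the 20MP sensor size is from the article; everything else is assumed.

MEGAPIXELS = 20            # from the article
BYTES_PER_PIXEL = 2        # assumption: raw sample stored in 16 bits
FRAMES_PER_SECOND = 30     # assumption: a "real-time" target
PASSES = 1                 # assumption: one full-frame processing pass

def frame_bytes(megapixels, bytes_per_pixel):
    """Size of one frame in bytes."""
    return megapixels * 1_000_000 * bytes_per_pixel

def memory_traffic(per_frame, fps, passes):
    """Bytes moved per second if each pass reads and writes the frame once."""
    return per_frame * 2 * fps * passes

per_frame = frame_bytes(MEGAPIXELS, BYTES_PER_PIXEL)
traffic = memory_traffic(per_frame, FRAMES_PER_SECOND, PASSES)

print(f"{per_frame / 1e6:.0f} MB per frame")   # prints "40 MB per frame"
print(f"{traffic / 1e9:.1f} GB/s")             # prints "2.4 GB/s"
```

Under these assumptions a single pass over full-resolution frames already moves a couple of gigabytes per second, and a multi-stage pipeline multiplies that, which is why memory bandwidth matters as much as raw compute here.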
ARM and Huawei worked together to optimize the GPU acceleration processing pipeline, fine-tuned interoperation between the CPU and GPU, and tied it all together with the existing camera hardware. Don’t ask me exactly what they did, but the end result apparently produced a twofold increase in performance by using the GPU.
Lots of other uses
Heterogeneous processing and GPU compute have plenty of other potential use cases and benefits, and development into this type of processing is well under way.
Heterogeneous processing isn’t always about more hardware; instead, it’s about picking the most suitable piece out of what you have.
ARM envisions a range of target applications, from computational photography to computer vision, deep learning, and new multimedia codecs and algorithms. Outside of ARM, other companies have also been opening up their technologies to work with OpenCL and Mali GPUs. Examples include gesture and face tracking applications from eyeSight Technologies, the implementation of an OpenCL imaging library for Mali by Omnivision, camera middleware by ArcSoft and ThunderSoft, as well as HEVC and VP9 decoders by Ittiam Systems, and many more.
We are only beginning to scratch the surface of the potential of heterogeneous computing. I expect that other OEMs will implement similar or entirely new functions that mix and match hardware and this will contribute to further improving devices and end user experiences. What Huawei accomplished with the Honor 7 is an exciting milestone in the adoption of this technology and will no doubt impact the broader adoption of GPU compute for key visual computing use cases and applications.