Imagination Technologies Unveils 500 TOPS Neural Driving Assistance Platform


British chip designer Imagination Technologies has unveiled its latest neural network accelerator platform, aimed at autonomous driving and advanced driver-assistance systems (ADAS). The platform, dubbed the IMG Series4 NNA, is designed to cover a wide variety of applications across all categories of performance needs. The company states that the products will be available on the market this December, and that they have already been licensed out. Today's announcement comes as the automotive sector embraces technology ever more deeply, with the advent of electric vehicles and advanced connectivity platforms such as Cellular Vehicle-to-Everything (C-V2X) picking up steam.

Imgtec's Series4 NNA Scales Across Multiple Cores To Deliver High Performance

Imagination's platform is built on three pillars which the company claims allow it not only to scale performance massively but also to manage workloads in a way that reduces latency, bandwidth, and power consumption. The company says a Series4 neural network accelerator configuration can compute anywhere from 12.5 trillion to a staggering 500 trillion operations per second (TOPS).


It does this by linking a single Series4 core to other cores through a high-bandwidth interconnect, for a maximum cluster size of eight. Cores within a cluster communicate over this interconnect, while multiple clusters, which can also consist of four cores each, are linked via a system bus, with the entire package connected to external double data rate (DDR) memory. Each core also features on-core memory (OCM), which forms a crucial component of the platform's computational prowess.

Imagination promises that the Series4 platform is capable of delivering up to 30 TOPS per watt of power consumed and 12 TOPS/mm² of performance per unit of area. More importantly, the chips will be manufactured on the 5nm process node, which is currently the most advanced fabrication process available on the market, offered by Taiwan Semiconductor Manufacturing Company (TSMC) and Samsung Foundry.
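Taking the company's figures at face value, a quick back-of-envelope calculation suggests what a peak 500 TOPS configuration would imply for power and silicon area, assuming the headline efficiency numbers hold at the top end (which the company's materials do not explicitly state):

```python
# Back-of-envelope from Imagination's published figures. Assumes the
# headline 30 TOPS/W and 12 TOPS/mm² efficiency apply at peak scale,
# which is an assumption, not a company claim.
peak_tops = 500.0
tops_per_watt = 30.0
tops_per_mm2 = 12.0

power_w = peak_tops / tops_per_watt   # implied power draw, in watts
area_mm2 = peak_tops / tops_per_mm2   # implied silicon area, in mm²

print(round(power_w, 1))   # 16.7
print(round(area_mm2, 1))  # 41.7
```

Even as a rough estimate, figures in the tens of watts and tens of square millimetres would be remarkable for 500 TOPS of inference compute in an automotive package.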

Slide 19 from Imagination's slide deck highlighting key power and performance parameters of the Series4 NNA lineup.

Tensor Tiling Allows Series4 NNA To Reduce Latency

The multi-core, multi-cluster design also lets the system either execute multiple workloads at the same time or focus on a single workload, depending on the use case. Should all cores focus on a single task, the system's latency, which is the time lag between a processor receiving input and generating output, falls roughly in proportion to the number of cores assigned to the task. In other words, an octa-core cluster executing a task should see roughly one-eighth the latency of a single core executing the same task.
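The proportional claim above can be sketched in a couple of lines; the 80 ms figure is purely illustrative and not an Imagination benchmark:

```python
# Illustrative model of ideal multi-core latency scaling
# (hypothetical numbers, not Imagination's published figures).

def task_latency_ms(single_core_latency_ms: float, num_cores: int) -> float:
    """Ideal linear scaling: latency falls in proportion to cores assigned."""
    return single_core_latency_ms / num_cores

# A task that takes 80 ms on one core would, under ideal scaling,
# take roughly 10 ms on an eight-core cluster.
print(task_latency_ms(80.0, 1))  # 80.0
print(task_latency_ms(80.0, 8))  # 10.0
```

Real hardware rarely achieves this ideal across all workloads, as the company's own scaling data later in the article shows.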

An integral component of the Series4 NNA lineup is a memory management feature that Imagination dubs Tensor Tiling. This appears to be an in-house technology, with the patent for Tensor Tiling still pending. Through Tensor Tiling, Imagination claims to deliver up to 90% bandwidth reduction on the Series4. The feature allows the platform to break neural network data down into progressively smaller pieces. The starting point is the layer, the fundamental building block of a neural network, with more layers generally indicating a more complex network.

Tensor Tiling splits the tensors that flow between these layers into tiles, which are then processed by the NNA core. This reduces latency, according to the company, since it reduces the workload's dependency on external memory.
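The general idea of tiling can be illustrated with a minimal sketch, assuming a hypothetical on-chip memory budget; this is not Imagination's implementation, merely the textbook technique its description points to:

```python
import numpy as np

# Illustrative sketch of the tiling idea (not Imagination's implementation):
# split a large activation tensor into tiles small enough to stay in
# on-core memory (OCM), and process each tile locally instead of spilling
# whole intermediate tensors to external DDR between layers.

OCM_BUDGET = 64 * 1024  # hypothetical on-chip memory budget, in elements

def tile_tensor(tensor: np.ndarray, tile_h: int, tile_w: int):
    """Yield (row, col, tile) views covering a 2D feature map."""
    h, w = tensor.shape
    for r in range(0, h, tile_h):
        for c in range(0, w, tile_w):
            yield r, c, tensor[r:r + tile_h, c:c + tile_w]

def process_tiled(tensor: np.ndarray, tile_h: int, tile_w: int) -> np.ndarray:
    out = np.empty_like(tensor)
    for r, c, tile in tile_tensor(tensor, tile_h, tile_w):
        assert tile.size <= OCM_BUDGET, "tile must fit in on-chip memory"
        # Stand-in for running fused network layers on the tile while it
        # sits in OCM; here we just apply a ReLU-like operation.
        out[r:r + tile.shape[0], c:c + tile.shape[1]] = np.maximum(tile, 0)
    return out

fmap = np.random.randn(1200, 2400).astype(np.float32)
result = process_tiled(fmap, 200, 200)
```

The bandwidth saving comes from the stand-in step: each tile makes one round trip to external memory rather than one per layer.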


Slide 26 from Imagination Technologies' slide deck highlighting Tensor Tiling.

Performance Scaling Is Near-Linear As More Cores Are Added, Reveals Company's Data

Finally, Imagination also reveals that the Series4's performance scales roughly linearly as more cores are added. The company's data shows that for some workloads (an SSD detector with a 34-layer residual network backbone at 2400x1200 input), performance scales perfectly linearly; when the backbone is deepened to 50 layers and the input resolution reduced to 224x224, scaling remains linear up to what we have approximated as six cores, after which the performance-per-core gradient starts to curve downward.
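One plausible explanation for the tail-off on small inputs is classic Amdahl's-law behaviour: a fixed, non-parallelisable overhead per inference matters more when the parallel work per frame is small. The serial fractions below are hypothetical, chosen only to show the shape of the effect:

```python
# Amdahl's-law sketch of why scaling curves downward for small inputs.
# The serial fractions are hypothetical, not measured Series4 values.

def speedup(cores: int, serial_fraction: float) -> float:
    """Amdahl's law: speedup over a single core."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

# Large input (e.g. 2400x1200): tiny serial fraction, near-linear scaling.
# Small input (e.g. 224x224): overhead is a bigger slice, curve flattens.
for cores in (1, 2, 4, 6, 8):
    print(cores, round(speedup(cores, 0.01), 2), round(speedup(cores, 0.05), 2))
```

Under this model, perfect scaling corresponds to a serial fraction near zero, which is consistent with the large-input result in the company's data.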

The Series4 is also ISO 26262 compliant, and through it the company is delivering impressive performance that should enable automakers to make serious strides towards higher levels of autonomous driving. Its ability to scale performance should help adoption, as developers take advantage of the benefits of Tensor Tiling across the wide variety of applications out there.