ARM’s Cortex A77 Delivers A 35% Floating Point Improvement; Valhall Architecture Optimizes Texture Heavy Performance

Ramish Zafar • May 27, 2019 at 12:56pm EDT

We've got a big announcement from British chip designer ARM today. The firm has introduced its Cortex A77 cores and a brand new GPU architecture. These designs will power Android smartphones that hit the shelves next year, as 2019's flagships are powered by Qualcomm's Snapdragon 855. The improvements from ARM promise gains in performance and power efficiency, with the company having focused on every part of its CPU core design. Take a look below for more.

ARM's Cortex A77 Core Makes Important Changes Over Its Predecessor As The Company Launches A Brand New GPU Architecture

As we settle into 2019, changes in the smartphone market are solidifying. The need for SoCs to support computational and machine learning workloads has increased, and vendors are tailoring their solutions accordingly. ARM has followed this trend by launching the Cortex A77 and a new GPU architecture dubbed 'Valhall'.

Starting with the Cortex A77, ARM has focused on maintaining the Cortex A76's performance while reducing power consumption. The company has doubled branch prediction bandwidth, increased fetch bandwidth, added a new ALU pipeline and increased decoder width. The branch predictor's bandwidth has doubled to 64B/cycle, and ARM has also increased the capacity of the predictor's BTB (Branch Target Buffer) to 8K entries.
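To put the 64B/cycle figure in perspective, here is a rough calculation of how many instructions such a window covers. It assumes fixed-width 4-byte AArch64 (A64) instructions, and the 32B/cycle figure for the previous design is only inferred from the word "doubled"; the function and variable names are illustrative.

```python
# Rough arithmetic: how many instructions a branch predictor window covers per cycle.
# Assumption: fixed-width 4-byte A64 instructions; other encodings would differ.
A64_INSTRUCTION_BYTES = 4


def instructions_per_window(window_bytes: int,
                            instr_bytes: int = A64_INSTRUCTION_BYTES) -> int:
    """Number of whole instructions that fit in a window of window_bytes."""
    return window_bytes // instr_bytes


a77_window = 64                 # bytes per cycle, from the article
a76_window = a77_window // 2    # inferred from "doubled", not stated directly

print(instructions_per_window(a76_window))  # 8
print(instructions_per_window(a77_window))  # 16
```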

A nice upgrade that follows in line with Intel and AMD's x86 designs is a brand new Macro-OP (Mop) cache on the A77's front-end. The Mop cache allows the A77 to reduce branch mispredict latency to 10 cycles. ARM has also designed the A77 so that the core can bypass its decode stage when instructions are already present in the Mop cache.
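The decode-bypass idea can be sketched with a toy model. This is not ARM's implementation: the class name, the four-entry capacity and the least-recently-used eviction policy are all hypothetical, chosen only to show why a hot loop stops paying the decode cost after its first pass.

```python
# Toy model of a front-end with a macro-op (Mop) cache ahead of the decoder.
# All names, sizes and the eviction policy are hypothetical, not ARM's design.
from collections import OrderedDict


class ToyFrontEnd:
    def __init__(self, mop_capacity: int = 4):
        self.mop_cache = OrderedDict()  # address -> decoded macro-op
        self.mop_capacity = mop_capacity
        self.decodes = 0                # times the decode stage was used
        self.bypasses = 0               # times the decode stage was skipped

    def _decode(self, raw: str) -> str:
        self.decodes += 1
        return f"mop<{raw}>"

    def fetch(self, address: int, raw: str) -> str:
        if address in self.mop_cache:
            # Hit: deliver the cached macro-op and skip the decode stage.
            self.bypasses += 1
            self.mop_cache.move_to_end(address)  # mark as most recently used
            return self.mop_cache[address]
        # Miss: decode, then install the result for later reuse.
        mop = self._decode(raw)
        self.mop_cache[address] = mop
        if len(self.mop_cache) > self.mop_capacity:
            self.mop_cache.popitem(last=False)   # evict least recently used
        return mop


# A three-instruction loop body executed four times.
loop = [(0x100, "add"), (0x104, "cmp"), (0x108, "b.ne")]
fe = ToyFrontEnd()
for _ in range(4):
    for addr, raw in loop:
        fe.fetch(addr, raw)

print(fe.decodes, fe.bypasses)  # 3 9
```

In this sketch only the first pass through the loop touches the decoder; the remaining nine fetches are served from the cache.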

ARM's Cortex A77 CPU Promises A 20% Improvement In Single-Core Scores And A 35% Gain In Floating Point Calculations

ARM's decision to add a new ALU in the A77's back-end improves the core's performance by easing back-end bottlenecks. The A77's L1/L2 data caches have dedicated issue ports for the store-data pipelines, along with improved engines that contribute towards the aforementioned power efficiency. The strongest improvement is in data prefetching, where the company has made changes that allow the core to manage more instructions and adapt its behavior to memory subsystem latency.

Cache sizes for the A77, however, stay the same this year. The core has 64KB L1 and 256KB/512KB private L2 ECC caches. Performance-wise, ARM promises that the A77 will deliver a 23% increase in integer and a 35% increase in floating point performance in SPEC2006. The chip will also improve memory latency by 15%, and the firm believes that the A77 will reach 3.0GHz, similar to its predecessor.

ARM's Valhall GPU Architecture Offers A 60% Improvement In Machine Learning, A 30% Increase In Performance Density And A 30% Gain In Power Efficiency

ARM's latest Valhall GPU architecture succeeds the company's Bifrost architecture, which is present in the current Mali G76 GPUs. Valhall delivers impressive improvements in performance density (30%), machine learning (60%) and power efficiency (30%). Valhall's execution core is similar to the ones found in products from AMD and Nvidia, meaning that the architecture allows the Mali G77 to feature 16-wide warps, two shader cores with one execution engine for each and 16 FMA clusters per execution engine.

In publishing the performance figures for Valhall and the Mali G77, ARM claims that the GPU will provide between 1.4X and 1.6X the performance per mm² of the G76. Shader cores on the G77 are the same size as those on the G76. For machine learning, the G77 has 1.6X the inference performance of its predecessor, courtesy of 33% more processing units on the core.
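As a back-of-the-envelope check on the inference figures above, the snippet below splits the claimed 1.6X gain into the part explained by unit count and whatever remains. It assumes the two factors simply multiply, which the article does not state; the variable names are illustrative.

```python
# Back-of-the-envelope split of the claimed 1.6X inference gain.
# Assumption: total gain = unit-count gain * per-unit gain (not stated by ARM).
total_gain = 1.6         # G77 vs G76 inference performance, from the article
unit_count_gain = 1.33   # 33% more processing units, from the article

per_unit_gain = total_gain / unit_count_gain
print(f"{per_unit_gain:.2f}")  # 1.20
```

Under that assumption, the extra processing units alone would leave roughly a 1.2X factor to come from other changes in the architecture.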

The texture mapping unit on the Mali G77 doubles its throughput, delivering 4 bilinear texels/clock, 2 trilinear texels/clock and two times the anisotropic filtering throughput of the G76, with a focus on texture computing. It's important to note that the G77's core count is limited to 16 for now. The G77 also has a large IP block that consolidates the resources of earlier generations' execution engines.

Valhall and the Mali G77 are optimized for performance in texture-heavy games, and fixed issue scheduling on the graphics processor is handled by the hardware. ARM's focus with the new graphics architecture is the execution core, which is optimized to reduce latency and improve texture mapping. Alongside its CPU and GPU designs, the company has also introduced its custom NPU (Neural Processing Unit), dubbed the ML Processor.

This processor is capable of delivering 4 TOPS (Trillion Operations per Second) at a power efficiency of 5 TOPS/W. The processor can scale up to eight NPUs and 32 TOPS in a single cluster, and it supports both convolutional and recurrent neural networks. These updates from ARM are in line with the software that will become common on the flagships of the future. Machine learning is, after all, at the heart of many different applications.
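The ML Processor scaling figures above can be checked with a few lines of arithmetic. This sketch assumes throughput scales linearly with NPU count and that the 5 TOPS/W efficiency holds at peak throughput; neither is confirmed by the article, and the function names are illustrative.

```python
# Scaling arithmetic for the ML Processor figures quoted in the article.
# Assumptions: linear scaling with NPU count, constant 5 TOPS/W at peak.
TOPS_PER_NPU = 4.0          # from the article
TOPS_PER_WATT = 5.0         # from the article
MAX_NPUS_PER_CLUSTER = 8    # from the article


def cluster_tops(npus: int) -> float:
    """Peak throughput of a cluster with the given number of NPUs."""
    if not 1 <= npus <= MAX_NPUS_PER_CLUSTER:
        raise ValueError("a cluster holds between 1 and 8 NPUs")
    return npus * TOPS_PER_NPU


def implied_power_watts(tops: float) -> float:
    """Power draw implied by the quoted efficiency figure."""
    return tops / TOPS_PER_WATT


print(cluster_tops(8))                       # 32.0
print(implied_power_watts(cluster_tops(1)))  # 0.8
print(implied_power_watts(cluster_tops(8)))  # 6.4
```

The eight-NPU result matches the 32 TOPS cluster figure; the wattage values are implied by the quoted efficiency rather than published numbers.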

Thoughts? Let us know what you think in the comments section below and stay tuned. We'll keep you updated on the latest.

About the author: Ramish is a seasoned technology writer and editor with more than a decade of experience. He specializes in semiconductor fabrication and market analysis. With a background in finance and supply chain management - via his bachelor's in Finance and a MicroMasters in supply chain management from MIT - Ramish combines financial rigor with deep industry insight to deliver accurate and authoritative coverage.
