SoftBank's latest initiative focuses on making AMD's Instinct AI chips much more efficient with AI workloads, through a "GPU partitioning" mechanism that sounds really interesting.
SoftBank Has Deployed a Self-Built Orchestrator For AMD's Instinct GPUs, Dividing Hardware & Memory Pools
AMD's AI infrastructure hasn't been the go-to option for hyperscalers in recent times, given all the attention on NVIDIA, especially after the debut of the Blackwell series. When we talk about AMD's core customers, SoftBank is a name that pops up on several occasions, and this time, its technology wing has pushed out something pretty interesting. According to SoftBank's recent blog post, the company has paired an Orchestrator with AMD's Instinct AI chips, where the idea is to distribute compute resources depending on workload intensity and availability.
In collaboration with AMD, SoftBank has developed an enhanced Orchestrator feature that leverages the GPU partitioning capabilities of AMD Instinct™ GPUs, which allow a single GPU to be used as multiple logical devices. This feature allows for the flexible and optimal allocation of GPU resources based on the requirements of the AI application, such as model size and concurrency.
- SoftBank
Diving a bit into the technicals, SoftBank's Orchestrator focuses on compute distribution within AMD's Instinct GPUs, segregating workloads across multiple GPU instances, each running on its own Accelerator Complex Dies (XCDs). You can run a single instance (SPX mode) or up to eight (CPX mode), with the level of granularity increasing with each configuration. Apart from XCD division, the Orchestrator also leverages the high-capacity memory on AMD's GPUs, dividing it into individual HBM regions for each GPU instance.
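To make the idea a bit more concrete, here is a minimal sketch of how an orchestrator might pick a partition layout from model size and concurrency. This is not SoftBank's implementation. The 192 GB memory figure, the even split of XCDs and HBM across instances, the intermediate instance counts of two and four, and the names `Partition`, `candidate_partitions` and `choose_partition` are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical figures for illustration only; not taken from SoftBank's post.
TOTAL_XCDS = 8          # the article's CPX mode exposes up to eight instances
TOTAL_HBM_GB = 192.0    # assumed HBM capacity of a single GPU


@dataclass(frozen=True)
class Partition:
    instances: int               # number of logical GPUs exposed
    xcds_per_instance: int       # compute dies backing each logical GPU
    hbm_gb_per_instance: float   # HBM region size, assuming an even split


def candidate_partitions():
    # Assumed layouts: 1 instance (SPX) up to 8 instances (CPX),
    # restricted to counts that divide the XCDs evenly.
    return [Partition(n, TOTAL_XCDS // n, TOTAL_HBM_GB / n) for n in (1, 2, 4, 8)]


def choose_partition(model_gb, concurrency):
    """Pick a layout whose instances each fit the model.

    Prefers the coarsest layout that still offers `concurrency` instances;
    if none does, falls back to the finest layout that fits the model.
    """
    fitting = [p for p in candidate_partitions() if p.hbm_gb_per_instance >= model_gb]
    if not fitting:
        raise ValueError("model does not fit in a single GPU's memory")
    enough = [p for p in fitting if p.instances >= concurrency]
    if enough:
        return min(enough, key=lambda p: p.instances)
    return max(fitting, key=lambda p: p.instances)
```

Under these assumptions, a small model that needs many concurrent copies lands on the eight-instance layout, while a model too large for half the memory stays on a single instance regardless of the requested concurrency.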
SoftBank intends to achieve lower-level control of compute resources with its Orchestrator and, at the same time, ensure strict hardware-level isolation to prevent unpredictable latency spikes. The company hasn't shared any performance figures yet, but it does mention "optimal resource allocation", which it says is more effective in SLM and MLM workloads. SoftBank also plans to explore such orchestrators for other AI accelerators, but for now, the implementation is confined to AMD.
