Amazon Could Turn to Qualcomm’s 768GB AI200 Chips as AWS Races to Slash Inference Costs Choking Margins

Ramish Zafar

Semiconductor design giant Qualcomm is tipped to deepen its AI chip partnership with Amazon Web Services (AWS), Amazon's cloud business, suggests a report from banking giant Wells Fargo. According to the bank, the pairing would play into AWS's strategy of improving operating margins through the AI chips it uses and lowering overall inference costs, which are driven by AI accelerator costs. Wells Fargo's report comes as some rumors have suggested that Qualcomm might launch AI CPUs to target the growing demand for agentic computing, which has shifted the focus back to CPUs in the AI infrastructure buildout.

Amazon AWS Might Be Qualcomm's "Lead" ASIC Partner, Says Wells Fargo

Qualcomm launched its AI200 chips, designed for AI inference applications, last year. The chips stand out for large language model workloads because they support up to 768GB of memory per chip. With the AI200's rollout slated for 2026, Wells Fargo believes that Amazon might become a key Qualcomm partner for the new chips.

In a fresh note, the investment bank lays out the economics of the AI200 chips. It claims that they can be deployed at a cost of $3.5 billion per gigawatt and drive the firm's earnings per share up by as much as $2.50. This is contingent on Qualcomm being able to increase the number of accelerators per rack, says Wells Fargo.

Amazon Could Be Interested In Qualcomm's AI Chips Due To Shifting Cost Metrics, Says Bank

The bank adds that AWS could be the lead customer for Qualcomm. It remarks that "based on company comments / our analysis, we see AWS as the potential lead hyperscale ASIC partner." It cites Qualcomm CEO Cristiano Amon's comments hinting at a large cloud company and the fact that AWS currently offers the AI100 Ultra chips. The AI100 Ultra's dollar-per-GPU-hour-per-FLOPS performance is "relatively strong" compared to its competitors, says Wells Fargo.

Amazon is interested in efficient chips, according to the bank, which sees a "move down the token pricing spectrum as a strategy aligned with its philosophy of utilizing internal silicon to drive OM% and save on Capex." High inference costs are also preventing AI inference revenue from reaching all classes of customers.

Token-based pricing has become increasingly relevant as the AI industry shifts towards inference. Earlier this year, in an interview, an insider from computing infrastructure provider Nebius shared that firms are charging their customers per million tokens. This has led to alternatives, such as NVIDIA-backed Groq's AI chips, becoming popular.

About the author: Ramish is a seasoned technology writer and editor with more than a decade of experience. He specializes in semiconductor fabrication and market analysis. With a background in finance and supply chain management - via his bachelor's in Finance and a MicroMasters in supply chain management from MIT - Ramish combines financial rigor with deep industry insight to deliver accurate and authoritative coverage.
