Apple’s AI Server Chip “Baltra” Likely To Be Used Primarily For AI Inference


Dec 15, 2025 at 02:07pm EST


It is common knowledge that Apple is a vertical integration aficionado, preferring to keep key technologies in-house wherever feasible, and its sprawling custom silicon design efforts offer perhaps the most apt illustration of this approach.

Enter "Baltra," the internal codename given to Apple's bespoke AI server chip, which is expected to debut in 2027.

Apple's "Baltra" AI server chip is likely to be used primarily to satisfy its gargantuan inferencing needs

JUST IN: Apple, $AAPL, is developing its first server chip for AI, code-named "Baltra," and is working with Broadcom, $AVGO, on the crucial networking technology to avoid buying from Nvidia, $NVDA
— unusual_whales (@unusual_whales) December 15, 2025

Back in spring 2024, multiple reports emerged that Apple was working with Broadcom on its first AI server chip, bearing the internal codename "Baltra." Some reports at the time also suggested that the chip would leverage TSMC's 3nm 'N3E' process, and that the design process would conclude within the following 12 months.

The actual deployment of these custom AI chips is now expected in 2027, even though Apple already began shipping its US-made servers back in October 2025.

On that Apple AI chip

Basically I doubt they'll do a massive cluster, but maybe something closer to a GB300 style with like 64 chips all to all with larger high bandwidth LPDDR memory

Should be significantly cheaper than most current chips and match the needs pic.twitter.com/I6OJSFBCyb
— Max Weinbach (@mweinbach) December 15, 2025

However, the real question is: how will Apple use its custom AI chips? After all, that specific raison d'être will dictate Baltra's overall chip design and architecture.

Apple is not expected to train large AI models, at least for the time being, especially as it has already struck a deal with Google to deploy a customized 3-trillion-parameter Gemini AI model to power Apple Intelligence in the cloud. Under that deal, Apple would pay Google $1 billion per year for the right to use this model.

It is only reasonable to assume that Apple will primarily use the "Baltra" AI server chips to cater to its gargantuan AI inferencing needs. As a refresher, inference occurs whenever already-trained models leverage their knowledge base and prior training to perform a specific task, which might include something as mundane as writing an email based on the prompts provided.

Of course, the architecture of inference chips is fundamentally different from that of chips used to train AI models, with the former placing much more emphasis on latency and throughput. AI inference chips also lean on lower-precision arithmetic, such as INT8.
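To make the lower-precision point concrete, here is a minimal Python sketch of symmetric INT8 quantization applied to a small dot product, the basic operation behind most inference workloads. This is purely illustrative and says nothing about how Baltra itself will work: the function names (quantize_int8, int8_dot) and the sample numbers are hypothetical, and real inference hardware uses far more elaborate schemes.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One shared scale factor converts between the float and integer ranges.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def int8_dot(a_q, a_scale, b_q, b_scale):
    """Dot product accumulated in integers, rescaled to a float once at the end."""
    acc = sum(x * y for x, y in zip(a_q, b_q))
    return acc * a_scale * b_scale


# Hypothetical sample data standing in for model weights and activations.
weights = [0.42, -1.30, 0.07, 0.88]
activations = [1.5, 0.25, -2.0, 0.6]

w_q, w_scale = quantize_int8(weights)
x_q, x_scale = quantize_int8(activations)

exact = sum(w * x for w, x in zip(weights, activations))
approx = int8_dot(w_q, w_scale, x_q, x_scale)

print(f"floating-point result: {exact:.4f}")
print(f"INT8 result:           {approx:.4f}")
```

The two printed values should land close to each other, which is the trade-off in a nutshell: the integer version gives up a little accuracy in exchange for math that is cheaper to perform in silicon and lighter on memory bandwidth.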

As such, given this context, we can reasonably infer that Apple and Broadcom are likely to focus on these aspects as they proceed with Baltra's overall design process.

Meanwhile, Apple's sprawling custom silicon offerings continue to expand. In addition to its well-known A-series and M-series chips, Apple now also uses its bespoke C1 modem chips. Moreover, the Cupertino giant might also use a derivative of its Apple Watch-focused S-series chip in its AI smart glasses, which are expected to launch next year.

About the author: Writing is my one incontrovertible passion. Over the past six years, I have authored over 2,200 distinct articles on financial and tech-related topics, spanning nearly 1 million words, and I have been a member of the Wccftech mobile team since 2025. As an alumnus of the University of Toronto, Rotman Commerce Program, I bring nuance, in-depth knowledge, and a unique perspective to every topic that I cover. When I'm not writing, I'm traveling the world, exploring hidden confectioneries and restaurants as an aspiring food connoisseur.
