Apple's new Siri, powered by a custom Google Gemini model in the cloud, was supposed to run on Apple silicon, or so the iPhone maker had assured not too long ago.
Yet Apple has struggled to accommodate Google's behemoth of a model on its own servers, forcing the Cupertino-based tech giant to host the Siri-enabling Gemini model on Google's servers and to resort to an NVIDIA GPU-based band-aid of sorts to safeguard at least a shred of its privacy credentials.
Apple has to host the new Siri-enabling behemoth of a model on Google's servers for optimal inference, and seems to have landed on the built-in encryption feature within NVIDIA's B200 GPUs as a privacy-related band-aid
The upcoming chatbot-style Siri will reportedly leverage a much more advanced version of Google's Gemini model, known internally as Apple Foundation Models version 11. According to Bloomberg's Mark Gurman, "the model is expected to be competitive with Gemini 3 and significantly more capable" than the one supporting the revamped Siri.
Meanwhile, Apple is also training a host of smaller on-device models via a technique called distillation, which imbues these student models with some of the same capabilities as those possessed by their teacher model, which in this case is the licensed Google Gemini model.
However, because Google's custom Gemini model has trillions of parameters, Apple has been struggling to accommodate it within its bespoke server network, called Private Cloud Compute. Accordingly, some user requests for the new Siri will be processed directly by the licensed Gemini model in Google Cloud to ensure optimal inference.
Now, The Information reports that Apple is leaning towards deploying NVIDIA's B200 GPUs within Google's servers, especially as these GPUs come with a built-in encryption feature that encrypts data as it is being processed.
NVIDIA proclaims that the feature "preserves the confidentiality and integrity of AI models deployed on Rubin, Blackwell, and Hopper GPUs," while enabling "sensitive AI workloads to run securely at scale with near-native performance, even in shared or cloud environments."
This step should help Apple reassure its users that their data can't be siphoned off by Google, arguably the best available compromise under the circumstances.
