Google’s chips challenge Nvidia.

In just a few months, Google’s artificial intelligence chips have become one of the hottest products in the tech industry. Leading AI developers, including some of Google’s biggest rivals, are snapping up these chips in large quantities.

Now Google, a subsidiary of Alphabet Inc., plans to capitalize on that momentum and launch a new type of chip designed specifically for inference, that is, running an AI model's computations after its training is complete. The move is expected to further challenge market leader Nvidia's dominance in a semiconductor field growing rapidly on the back of AI software's widespread adoption.

With demand for rapid processing of artificial intelligence queries continuing to grow, Jeff Dean, Google's chief scientist, said in an interview that "it is particularly important now to dedicate chips to training or inference workloads." He added, "We are looking into many different aspects," including how quickly customers need their AI results.

The company plans to unveil its next-generation custom chip, the Tensor Processing Unit (TPU), at the Google Cloud Next conference in Las Vegas this week. Amin Vahdat, who is in charge of Google’s AI infrastructure and chips, declined to comment on plans for an inference chip that can accelerate AI output, but said more information might be disclosed “in the near future”.

Nvidia's graphics processing units (GPUs) remain the gold standard in artificial intelligence, particularly for training more advanced models. But a growing number of up-and-comers are vying to challenge the chipmaker's dominance in inference applications, including firms offering chips designed to shorten the response times of chatbots and AI agents. Last month, Nvidia began selling a chip aimed at speeding up inference, based on technology it acquired from Groq, reportedly as part of a $20 billion licensing deal.

Google has unique advantages in this competitive landscape, including a decade of chip design experience, vast resources from its online search profits, and first-hand insights into AI models. Among the top AI developers, only Google mass-produces its own chips, enabling it to share critical feedback across teams and better tailor the hardware. (OpenAI is only just beginning to design its own chips.)

In a recent podcast interview, Nvidia's Jensen Huang emphasized the advantages of the company's chips, saying they can handle "a large number of applications that TPU cannot." Google itself relies on a combination of TPUs and GPUs for development. "Many people hope to use both devices at the same time," Demis Hassabis, CEO of Google DeepMind, told Bloomberg, adding that leading AI laboratories are particularly interested in TPUs.

Google has previously touted the inference capabilities of its chips. Partha Ranganathan, vice president and engineering fellow at Google, said the company had considered building separate chips for training and inference before, but has so far not taken that approach. As investment in AI shifts from training to inference, that may soon change.

Analyst Chirag Dekate pointed out: "The focus of competition is shifting to inference." In his experience, he said, Google's Gemini model is the fastest at handling complex reasoning tasks. "In this area, Google has an infrastructure advantage."

According to Natalie Serrino, co-founder of Gimlet Labs, a startup that builds software to route AI tasks to the chips best suited for them, today's TPUs have become a powerful option for serving emerging AI agents, which handle more complex tasks on behalf of users. "They are excellent tools for dealing with the explosive growth of workloads," she said.

Google's decade-long chip development effort reached a new milestone in October, when the closely watched AI developer Anthropic PBC announced an expanded agreement to use up to one million TPUs. The following month, Google released its more advanced Gemini 3 model, which was trained and run on TPUs and drew widespread praise.

Since then, demand for Google's chips from large enterprises has continued to grow. Meta Platforms Inc. signed a multibillion-dollar agreement to use TPUs through Google Cloud in the coming years. Santosh Janardhan, Meta's infrastructure chief, said the company has just received its first significant batch of TPUs and is testing them to determine which tasks they are best suited for. "It looks like TPUs may have an advantage in inference," he said, while noting that "any new platform comes with challenges and a learning curve."

Anthropic has also signed an agreement with Broadcom, a TPU partner of Google, to obtain TPU chips, which will give it access to approximately 3.5 gigawatts of computing power starting in 2027. Citadel Securities plans to show at the Google conference how TPUs help it train models faster than it could with GPUs. According to Talal Al Kaissi, interim CEO of Core42, the cloud division of Abu Dhabi's G42 technology group, G42 has had "multiple discussions" with Google about using its TPUs. "I am very optimistic about the prospects of the negotiations," Al Kaissi said of those talks.

Google is also taking new steps to meet its customers' practical needs. According to a person familiar with the matter, the company is testing letting companies such as Anthropic run some of their TPU workloads in their own data centers rather than in Google's facilities. Vahdat said Google also lets TPU customers use external tools such as PyTorch and other scheduling software, rather than relying entirely on Google's own products.
