
Google has launched Ironwood, its new chip designed to run Artificial
Intelligence (AI) models. Developed by Google Cloud's AI Infrastructure
team, Ironwood is the latest, seventh-generation TPU, geared towards
speeding up AI applications.
Google began using TPUs internally in 2015, and in 2018 made them
available for third-party use, both as part of its cloud infrastructure and by
offering a smaller version of the chip for sale. However, Ironwood is the first
TPU that Google has designed specifically for running trained AI models,
that is, for inference rather than training.
What is the main purpose of a Tensor Processing Unit (TPU), and how does it differ from traditional CPUs and GPUs?
A Tensor Processing Unit (TPU) is a type of Application-Specific Integrated
Circuit (ASIC), a chip whose purpose is to handle a narrow set of specific
tasks. TPUs were developed specifically to accelerate machine learning
workloads and to handle AI-specific computational tasks, making them far
more specialized than both CPUs and GPUs. They run Google's main AI
services, such as Search, YouTube, and DeepMind's language models. They
are highly efficient at handling large datasets and running complex neural
networks, allowing for faster training of AI models compared to traditional
processors.
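
To make this concrete, here is a minimal sketch of the kind of work TPUs accelerate, written in JAX, Google's numerical library that targets CPUs, GPUs, and TPUs with the same code. The layer sizes are arbitrary illustrations, and the snippet runs on whatever backend is available, so no TPU is required.

```python
import jax
import jax.numpy as jnp

# Report which backend JAX found (cpu, gpu, or tpu).
print("Running on:", jax.default_backend())

# One dense layer: y = xW + b. Large matrix multiplications like this
# dominate neural-network workloads, and TPUs are built around hardware
# matrix units that execute them efficiently.
@jax.jit  # compile for whatever accelerator is available
def dense_layer(x, w, b):
    return jnp.dot(x, w) + b

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (1024, 2048))  # a batch of activations
w = jax.random.normal(key, (2048, 4096))  # a weight matrix
b = jnp.zeros(4096)

print(dense_layer(x, w, b).shape)  # (1024, 4096)
```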
Why are TPUs more suitable for AI tasks compared to CPUs and GPUs?
TPUs provide a limited set of features and functions that are directly useful
for machine learning (ML) and artificial intelligence (AI) tasks but not
necessarily for everyday general-purpose computing. ML models and the AI
platforms that use them, such as deep learning and neural networks,
require extensive mathematical processing. While it is possible to execute
these tasks on ordinary central processing units (CPUs) or on more advanced
graphics processing units (GPUs), neither is optimized for them.
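
A rough back-of-the-envelope estimate shows the scale of that mathematical processing. Multiplying an (m, k) matrix by a (k, n) matrix costs about 2·m·k·n floating-point operations; the dimensions and the 1 TFLOP/s CPU figure below are illustrative assumptions, not measurements.

```python
# FLOPs of one matrix multiplication: roughly 2 * m * k * n.
m, k, n = 1024, 2048, 4096                     # illustrative dimensions
flops = 2 * m * k * n
print(f"{flops / 1e9:.1f} GFLOPs per matmul")  # ~17.2 GFLOPs

# Assumed throughputs: ~1 TFLOP/s for a general-purpose CPU (order of
# magnitude only) vs. the 4,614 TFLOPs peak quoted for one Ironwood chip.
cpu_tflops, tpu_tflops = 1.0, 4614.0
print(f"CPU:      {flops / (cpu_tflops * 1e12) * 1e3:.1f} ms")  # ~17.2 ms
print(f"Ironwood: {flops / (tpu_tflops * 1e12) * 1e6:.2f} us")  # ~3.72 us
```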
According to published reports, each Ironwood chip delivers a peak
compute of 4,614 teraflops (TFLOPs), a considerably higher throughput
than its predecessor Trillium, which was unveiled in May 2024.
Notably, Ironwood is not yet available to Google Cloud developers.
As with previous chip generations, the tech giant will likely first transition
its internal systems, including the company's Gemini models, to the new
TPUs before expanding access to developers. Google Cloud will offer
Ironwood in two configurations to suit varying workload requirements: a
256-chip configuration and a larger 9,216-chip configuration.
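
Taking the per-chip figure quoted earlier at face value, the aggregate peak compute of each configuration is simple arithmetic; this sketch uses only the numbers given in this article.

```python
# Aggregate peak compute of the two Ironwood configurations,
# using the 4,614 TFLOPs per-chip figure quoted above.
peak_tflops_per_chip = 4614

for chips in (256, 9216):
    exaflops = chips * peak_tflops_per_chip / 1e6
    print(f"{chips:>5} chips: ~{exaflops:.1f} exaflops peak")
# 256 chips: ~1.2 exaflops; 9,216 chips: ~42.5 exaflops.
```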
How much High Bandwidth Memory (HBM) does Ironwood offer per chip, and how does it compare to Trillium?
Ironwood brings a substantial increase in High Bandwidth Memory (HBM)
capacity: it offers 192 GB per chip, six times the 32 GB of Trillium. This
enables the processing of larger models and datasets, reducing the need
for frequent data transfers and improving performance.
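
To get a feel for what 192 GB per chip buys, consider a rough estimate of model weight footprints. The parameter counts and 2-byte (bfloat16) weights below are illustrative assumptions, and real deployments also need memory for activations and key-value caches.

```python
# Rough weight footprint: parameters * bytes per parameter.
HBM_GB = 192    # Ironwood HBM per chip
BYTES_BF16 = 2  # bfloat16 weights

for params_billions in (7, 70, 90):           # illustrative model sizes
    weight_gb = params_billions * BYTES_BF16  # e.g. 70B * 2 B = 140 GB
    verdict = "fits on one chip" if weight_gb <= HBM_GB else "needs sharding"
    print(f"{params_billions}B params: ~{weight_gb} GB of weights ({verdict})")
```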
What is SparseCore in Ironwood, and what types of workloads does it support?
Ironwood delivers significant performance gains while also focusing on
power efficiency, allowing AI workloads to run more cost-effectively.
Ironwood also features an enhanced SparseCore, a specialized accelerator
for processing the ultra-large embeddings common in advanced ranking and
recommendation workloads. Expanded SparseCore support in Ironwood
allows a wider range of workloads to be accelerated, moving beyond the
traditional AI domain into financial and scientific domains.
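
The embedding lookups that SparseCore targets look roughly like the following JAX sketch: a very large table indexed by a few sparse feature IDs per example, so the arithmetic is light but the memory accesses are scattered. The table size and IDs are made up for illustration, and this is ordinary JAX, not SparseCore's actual programming interface.

```python
import jax
import jax.numpy as jnp

# A large embedding table, as used in ranking/recommendation models.
# Real tables can hold hundreds of millions of rows; this one is tiny.
vocab_size, dim = 100_000, 64
table = jax.random.normal(jax.random.PRNGKey(0), (vocab_size, dim))

@jax.jit
def embed(ids):
    # Gather a few rows per example and pool them. The work is dominated
    # by scattered memory accesses rather than arithmetic, which is what
    # a dedicated unit like SparseCore is built to speed up.
    return jnp.take(table, ids, axis=0).mean(axis=1)

ids = jnp.array([[3, 17, 42_123], [7, 7, 99_999]])  # sparse feature IDs
print(embed(ids).shape)  # (2, 64): one pooled vector per example
```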
How is Ironwood optimized for running models like LLMs and MoEs?
Ironwood is engineered to efficiently manage the complex computation
and communication required by “thinking models,” including Large
Language Models (LLMs), Mixture of Experts (MoEs), and advanced
reasoning tasks. These models demand massive parallel processing and
high-speed memory access.
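
To make the MoE part concrete, the following sketch implements top-k expert routing, the pattern that gives MoE models huge capacity at modest per-token compute. All sizes and the choice of two experts per token are illustrative assumptions; this is generic JAX, not Ironwood-specific code.

```python
import jax
import jax.numpy as jnp

num_experts, d_model, top_k = 8, 512, 2
key = jax.random.PRNGKey(0)
router_w = jax.random.normal(key, (d_model, num_experts))
expert_w = jax.random.normal(key, (num_experts, d_model, d_model))

def moe_layer(x):
    # x: (tokens, d_model). The router scores every expert per token.
    scores = x @ router_w                           # (tokens, num_experts)
    topk_scores, topk_idx = jax.lax.top_k(scores, top_k)
    weights = jax.nn.softmax(topk_scores, axis=-1)  # mixing weights

    # Only the k selected experts process each token, so most of the
    # model's parameters sit idle per token: large capacity, modest
    # per-token compute. This is the pattern such chips must serve fast.
    selected = expert_w[topk_idx]                   # (tokens, k, d, d)
    expert_out = jnp.einsum("td,tkdo->tko", x, selected)
    return jnp.einsum("tk,tko->to", weights, expert_out)

x = jax.random.normal(key, (4, d_model))  # four tokens
print(moe_layer(x).shape)  # (4, 512)
```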
What recent innovations have been announced by Google’s competitors such as Nvidia, Amazon, and Microsoft?
In the past, customers complained of Google's Gemini being slow,
especially with larger contexts and token counts.
Nvidia announced its latest Blackwell Ultra chip and its next-generation
Rubin and Feynman GPUs. These chips power OpenAI's latest AI models.
Amazon will ship its Trainium3 AI chip this year. Anthropic’s Claude model
is optimized for the Trainium2 chip. Microsoft currently uses Nvidia GPUs
but has deployed its own AI accelerator, called Maia, for inference
workloads.
In 2024, scientists in China claimed to have developed a tensor processing
unit (TPU) that uses carbon-based transistors instead of silicon. This new
chip is the first TPU to use carbon nanotubes, tiny cylindrical structures
made of carbon atoms arranged in a hexagonal pattern, in place of
traditional semiconductor materials like silicon. This structure allows
electrons to flow through the nanotubes with minimal resistance, making
them excellent conductors of electricity. The scientists published their
research in the journal Nature Electronics. From ChatGPT to Sora, artificial
intelligence is ushering in a new revolution, but traditional silicon-based
semiconductor technology is increasingly unable to meet the processing
needs of massive amounts of data.
With the introduction of Ironwood, Google's bid to lead on the AI front is in
contention with advanced technological innovations from China and from
other powerhouses like Nvidia, which currently leads the AI chip market
and collaborates with MediaTek.