Google DeepMind has introduced DiffusionGemma, a new AI model in the Gemma 4 open model family. Unlike traditional autoregressive models, which generate text one token at a time, DiffusionGemma produces text in parallel, which improves its speed and efficiency on local hardware such as Nvidia GPUs. The model has 26 billion parameters in total, of which 3.8 billion are active during inference. It can generate approximately 700 tokens per second on an RTX 5090 and over 1,000 tokens per second on a single Nvidia H100 AI accelerator. This is roughly a fourfold speedup over similar autoregressive models.
Google DeepMind Releases DiffusionGemma AI Model with Increased Speed
Google DeepMind has launched the DiffusionGemma AI model, which generates text in parallel rather than sequentially. The model has 26 billion parameters and delivers significant speed improvements, producing over 1,000 tokens per second on a single Nvidia H100 accelerator.
Bias Analysis
Bias Indicators Removed
- ✕ loaded language: 'Another day, another AI model from Google'
- ✕ framing: headline asserting a conclusion
- ✕ editorializing: 'Another day, another AI model from Google'
- ✕ vague attribution: 'Google says'
Original vs. Neutral
Original: Google's latest DiffusionGemma open AI model comes with a 4x speed boost
Neutral: Google DeepMind Releases DiffusionGemma AI Model with Increased Speed