Building a World Model: Gemini Omni and the New TPU 8 Silicon
Source: Google / Sundar Pichai (TPU 8t and TPU 8i official reveal, Google Cloud Next 2026)
The 2026 ecosystem pushes far beyond standard text generation and steps into physical simulation. Gemini Omni Flash processes audio, video, image, and text natively at the token level. It does this without relying on cascaded translation layers or bolted-on diffusion models.
This native processing allows Omni to function as a rudimentary world model. It infers depth, momentum, and fluid dispersion directly from 2D reference images and simple text instructions.
To make these outputs traceable, Google integrated the SynthID protocol directly into the model's logits processor. SynthID does not alter pixels after generation. Instead, it uses a deterministic pseudorandom $g$-function to subtly bias the model's natural probability scores during sampling. This embeds a cryptographic watermark designed to survive extreme compression and aggressive cropping.
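To make the idea of a keyed $g$-function nudging probability scores concrete, here is a minimal sketch. It is a simplified illustration and not SynthID's actual algorithm: the names `g_value`, `watermark_logits`, and `mean_g`, the SHA-256 construction, the 4-token context window, and the `delta` bias are all assumptions chosen for readability.

```python
import hashlib
import math


def g_value(key: bytes, context: tuple, token_id: int) -> int:
    """Hypothetical g-function: a deterministic pseudorandom bit
    derived from a secret key, the recent context, and a candidate token."""
    payload = key + b"|" + ",".join(map(str, context)).encode() + b"|" + str(token_id).encode()
    return hashlib.sha256(payload).digest()[0] & 1


def watermark_logits(logits, key, context, delta=2.0):
    """Add a small bias (delta, an assumed value) to every token whose g-value is 1."""
    return [logit + delta * g_value(key, context, tok) for tok, logit in enumerate(logits)]


def softmax(logits):
    """Convert logits to probabilities in a numerically stable way."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mean_g(tokens, key, window=4):
    """Detection score: average g-value over a token sequence.
    Unwatermarked text should score near 0.5; biased text should score higher."""
    scores = [
        g_value(key, tuple(tokens[i - window:i]), tokens[i])
        for i in range(window, len(tokens))
    ]
    return sum(scores) / len(scores) if scores else 0.5


if __name__ == "__main__":
    key = b"demo-key"                 # assumed secret key
    context = (17, 4, 99, 3)          # assumed last four token ids
    base = [0.0] * 8                  # uniform toy logits over an 8-token vocabulary
    biased = watermark_logits(base, key, context)
    print(softmax(base))
    print(softmax(biased))
```

The point of the sketch is that the watermark lives in the sampling distribution itself: tokens favored by the keyed function become slightly more likely, and a detector holding the same key can measure that skew statistically.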
Google's first production server rack, 1999, hand-built by Larry Page and Sergey Brin, now on display at the Computer History Museum, Mountain View. The foundation Google's silicon empire was built on.
Source: Google Cloud (cloud.google.com)
The Silicon Split: TPU 8t vs. TPU 8i
Training a world model and serving a fast agentic loop place very different demands on hardware, and a single chip design can no longer satisfy both. Google therefore split its silicon roadmap into two distinct paths: the TPU 8t for training and the TPU 8i for inference.
To speed up Mixture-of-Experts routing, the 8i ditches the 3D torus in favor of the Boardfly topology.
Understanding the Hop Reduction
In a standard 1,024-chip 3D torus, the worst-case packet traversal is 16 hops.
The flattened Boardfly topology reduces this maximum path to just 7 hops. Google combined this with a dedicated Collectives Acceleration Engine to cut on-chip communication latency by up to a factor of five. The result is an 80% performance-per-dollar improvement, which makes the continuous serving of massive agent swarms economically viable.
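The 16-hop figure for the torus can be checked with a short calculation. In a torus with wraparound links, the farthest node along each axis is half the ring away, so the worst-case hop count is the sum of `d // 2` over the dimensions. The sketch below assumes the 1,024 chips are arranged as 8 x 8 x 16, which is one layout consistent with the stated number; the article does not give the actual dimensions, and the function names are illustrative.

```python
from collections import deque


def torus_diameter(dims):
    """Worst-case hop count in a torus: half of each ring, summed over axes."""
    return sum(d // 2 for d in dims)


def bfs_diameter(dims):
    """Cross-check by breadth-first search from one node.
    A torus looks the same from every node, so one start point is enough."""
    start = tuple(0 for _ in dims)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for axis, size in enumerate(dims):
            for step in (-1, 1):
                nxt = list(node)
                nxt[axis] = (nxt[axis] + step) % size  # wraparound link
                nxt = tuple(nxt)
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
    return max(dist.values())


if __name__ == "__main__":
    dims = (8, 8, 16)                 # assumed layout: 8 * 8 * 16 = 1,024 chips
    print(torus_diameter(dims))       # 16
    print(bfs_diameter(dims))         # 16
```

Against that baseline, a 7-hop maximum cuts the worst-case path by more than half, which matters most for the all-to-all traffic pattern that Mixture-of-Experts routing produces.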