25 February 2026
Mercury 2: The Diffusion-Based LLM That's 5x Faster — And Why Model Agnosticism Matters More Than Ever
Inception Labs has released Mercury 2, an LLM that generates responses through parallel diffusion rather than sequential, token-by-token decoding. At 1,009 tokens/second, it's changing the economics of production AI. Here's why your architecture needs to be ready.