Google Changes AI Generation Forever

Standard AI language models generate text one token at a time. Each new token requires a full pass through the network, and that pass cannot start until the previous token exists. This makes generation slow and creates a bottleneck.
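To make the bottleneck concrete, here is a minimal sketch of a one-at-a-time generation loop. The function `toy_next_token` is a hypothetical stand-in for a real network, not any actual model API. The point is only the count: N new tokens cost N sequential passes.

```python
from typing import Callable, List, Tuple


def toy_next_token(context: List[int]) -> int:
    """Hypothetical stand-in for one full forward pass; returns one token id."""
    return (sum(context) + len(context)) % 100


def generate_autoregressive(
    prompt: List[int],
    num_new_tokens: int,
    forward: Callable[[List[int]], int] = toy_next_token,
) -> Tuple[List[int], int]:
    """Append tokens one by one and count how many forward passes were needed."""
    tokens = list(prompt)
    passes = 0
    for _ in range(num_new_tokens):
        # Each step depends on every token produced so far,
        # so the passes cannot run in parallel.
        tokens.append(forward(tokens))
        passes += 1
    return tokens, passes


tokens, passes = generate_autoregressive([1, 2, 3], num_new_tokens=5)
print(len(tokens), passes)
```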

Google DeepMind created DiffusionGemma to solve this. It uses discrete text diffusion. Instead of emitting tokens one by one, it refines a large block of text over a series of steps.
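The idea of refining a block in steps can be sketched with a toy loop. This is a generic masked-refinement illustration, not DiffusionGemma's actual algorithm. The names `MASK`, `toy_denoiser`, and `generate_by_refinement` are assumptions made up for this example. Each pass predicts every position at once, then keeps only the most confident predictions.

```python
from typing import Callable, List, Tuple

MASK = -1  # hypothetical placeholder id for a position not yet decided


def toy_denoiser(block: List[int]) -> List[Tuple[int, float]]:
    """Hypothetical stand-in for one forward pass over the whole block.

    Returns a (token, confidence) guess for every position at once.
    """
    n = len(block)
    return [((7 * i + 3) % 50, 1.0 - i / n) for i in range(n)]


def generate_by_refinement(
    block_len: int,
    num_steps: int,
    denoise: Callable[[List[int]], List[Tuple[int, float]]] = toy_denoiser,
) -> Tuple[List[int], int]:
    """Fill a fully masked block in a fixed number of parallel passes."""
    block = [MASK] * block_len
    per_step = -(-block_len // num_steps)  # ceiling division
    passes = 0
    for _ in range(num_steps):
        masked = [i for i, tok in enumerate(block) if tok == MASK]
        if not masked:
            break
        preds = denoise(block)  # one pass covers all positions
        passes += 1
        # Commit the most confident guesses; the rest stay masked for later steps.
        masked.sort(key=lambda i: preds[i][1], reverse=True)
        for i in masked[:per_step]:
            block[i] = preds[i][0]
    return block, passes


block, passes = generate_by_refinement(block_len=16, num_steps=4)
print(len(block), passes)
```

In this toy setup, 16 tokens take 4 passes instead of 16. How many steps a real model needs, and how much wall-clock time that saves, depends on the model and hardware.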

Key features of DiffusionGemma:

  • Parallel Generation: The model refines entire blocks of text simultaneously. It does not work left to right.
  • Up to 4x Faster: Google reports generation up to 4x faster on GPUs.
  • Mixture of Experts: Only 3.8B of the backbone's 26B parameters are active in each step.
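A quick calculation shows what the Mixture of Experts numbers mean. The helper `active_fraction` is a made-up name for this example. It uses only the two figures quoted above.

```python
def active_fraction(active_params_b: float, total_params_b: float) -> float:
    """Share of the total parameters that are used in a single step."""
    return active_params_b / total_params_b


frac = active_fraction(3.8, 26.0)
print(f"{frac:.1%}")  # 14.6%
```

So each step touches roughly one seventh of the full parameter count.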

The model uses an encoder-decoder architecture. It treats the block of text as a canvas and can correct tokens anywhere on it while generation is in progress.

You can use it now. It is released under the Apache 2.0 license and works with Hugging Face Transformers and vLLM.

Will diffusion models replace traditional token-by-token generation? Or will they only serve use cases that need fast output? Share your thoughts.

Source: https://dev.to/incredibleheck/google-just-killed-autoregressive-ai-generation-diffusiongemma-36io

Optional learning community: https://t.me/GyaanSetuAi