DiffusionGemma: Google's Open AI Twist

AI has lived in two separate worlds for years.

One side handles words through Large Language Models. The other side handles images through diffusion models. You use one to write and the other to draw. They rarely talk to each other.

Google is changing this with DiffusionGemma.

Many multimodal systems are clumsy. An encoder looks at the picture and condenses it into a summary, sometimes literally a text description, and the language model only ever sees that summary. The condensing step loses nuance.

DiffusionGemma skips the middleman.

It treats pixels and words as the same language. It does not translate an image into a summary. It integrates image data directly into its processing. It sees and thinks at the same time.
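
A toy sketch can make the contrast concrete. The source does not describe DiffusionGemma's internals, so everything here is an assumption for illustration: the function names, the embedding size, and the fixed caption are all hypothetical. It only shows the general idea of placing image patches and words in one sequence instead of reducing the image to a caption first.

```python
EMBED_DIM = 4  # hypothetical embedding width, chosen only for the demo


def embed_text(token):
    """Toy text embedding: a deterministic vector derived from the characters."""
    return [(ord(c) % 7) / 7.0 for c in token.ljust(EMBED_DIM)[:EMBED_DIM]]


def embed_patch(patch):
    """Toy patch embedding: pad or trim raw pixel values to EMBED_DIM."""
    return (list(patch) + [0.0] * EMBED_DIM)[:EMBED_DIM]


def caption_pipeline(patches, question):
    """Two-stage design: the image is reduced to a short caption first.

    The caption is a hypothetical captioner output. Note that the pixel
    values in `patches` never reach the returned sequence.
    """
    caption = ["a", "bar", "chart"]
    return [embed_text(t) for t in caption + question]


def unified_pipeline(patches, question):
    """Single-sequence design: patch vectors sit next to word vectors."""
    return [embed_patch(p) for p in patches] + [embed_text(t) for t in question]


patches = [(0.1, 0.9), (0.4, 0.6), (0.8, 0.2)]  # made-up pixel data
question = ["which", "bar", "is", "tallest"]

print(caption_pipeline(patches, question))
print(unified_pipeline(patches, question))
```

In the first pipeline the output is identical no matter what the image contains, because only the caption survives. In the second, each patch keeps its own values in the sequence the model reads.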

This shift matters for three reasons:

Native Reasoning: You can show it a complex chart and ask for the business impact. It understands the data, not just the labels.
Spatial Awareness: Show it a diagram of a machine and ask for assembly steps. It understands how parts fit together.
Holistic Creation: Instead of predicting one word at a time like a mason laying bricks, it works like a sculptor. It starts with digital noise and refines the whole output together over several passes.

This approach moves us away from simple word prediction. It moves us toward true creation.
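
The mason and sculptor contrast can be sketched in a few lines. This is a toy under stated assumptions, not DiffusionGemma's actual sampler: `toy_denoiser` is a hypothetical stand-in that always proposes the right token, and masked positions play the role of the noise. Real models make mistakes and may revise earlier choices.

```python
import random

TARGET = ["diffusion", "models", "refine", "all", "positions", "together"]
MASK = "<mask>"


def toy_denoiser(seq, rng):
    """Hypothetical stand-in for a trained model.

    For every masked position, propose a token and a confidence in [0, 1).
    """
    return {
        i: (TARGET[i], rng.random())
        for i, tok in enumerate(seq)
        if tok == MASK
    }


def autoregressive_decode():
    """Left-to-right decoding: exactly one new token per step."""
    seq, steps = [], 0
    while len(seq) < len(TARGET):
        seq.append(TARGET[len(seq)])  # toy "next-token prediction"
        steps += 1
    return seq, steps


def parallel_refine(tokens_per_step=3, seed=0):
    """Masked refinement: start fully masked, then on each pass commit the
    most confident proposals from anywhere in the sequence."""
    rng = random.Random(seed)
    seq, steps = [MASK] * len(TARGET), 0
    while MASK in seq:
        proposals = toy_denoiser(seq, rng)
        # Rank masked positions by confidence, highest first.
        best = sorted(proposals, key=lambda i: proposals[i][1], reverse=True)
        for i in best[:tokens_per_step]:
            seq[i] = proposals[i][0]
        steps += 1
    return seq, steps


print(autoregressive_decode())
print(parallel_refine())
```

With six tokens and three commits per pass, the refinement loop finishes in two passes, while the left-to-right loop needs six steps. The order in which positions get filled depends on the confidence scores, not on reading order.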

Google is making this open source. They released a 2-billion-parameter model and a 7-billion-parameter variant. These use the same architecture as their top-tier Imagen 3 model.

This gives developers the tools to build apps that do more than talk. You can build tools that see, create, and reason across different types of data.

The race is no longer just about who has the biggest model. It is about who has the smartest architecture.

Source: https://dev.to/gp-ia-blog/diffusiongemma-googles-open-ai-twist-597m
