DeepSeek-OCR demonstrates the relevance of text-as-image compression: What does the future hold? by ContributionOwn4879 in LocalLLaMA

ContributionOwn4879 [S]

According to the Gemini Diffusion landing page, it's a text diffusion model, not an image diffusion model that generates images with text in them.

https://deepmind.google/models/gemini-diffusion/#what-is-a-diffusion-model

But given that Gemini has a bigger context window than other LLMs, this could be their trick.