✨ A new era for LLMs


Hey Reader,

For years, autoregressive models (ARMs) have dominated the landscape of large language models (LLMs).

They generate text one token at a time, a process that has proven effective but is strictly sequential: each new token can only condition on the tokens before it, and decoding can't be parallelized across positions.
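
To see the contrast concretely, here's a minimal sketch of greedy left-to-right decoding in PyTorch; the `model` callable, the tensor shapes, and the greedy token choice are illustrative assumptions, not any specific library's API:

```python
import torch

def autoregressive_decode(model, prompt_ids, max_new_tokens=50):
    """Greedy left-to-right decoding: the sequence grows one token per
    model call, and each prediction only sees the tokens to its left."""
    ids = prompt_ids  # shape (1, prompt_len)
    for _ in range(max_new_tokens):
        logits = model(ids)                          # (1, seq_len, vocab)
        next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        ids = torch.cat([ids, next_id], dim=-1)      # append exactly one token
    return ids
```

Every new token costs a full forward pass, and earlier tokens can never be revised. That serial bottleneck is exactly what diffusion-style decoding tries to sidestep.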

What if there was another way—one that could model entire sequences holistically rather than predicting words step by step?

Enter LLaDA, a large language diffusion model inspired by the success of diffusion models in image generation.

Instead of relying on autoregressive decoding, LLaDA starts from a fully masked response and recovers the tokens through a reverse denoising process, refining its predictions over the whole sequence at once. That design leads to strong performance in in-context learning, instruction following, and even complex reasoning tasks.
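
For intuition, here's a toy sketch of that reverse process in the spirit of mask-based diffusion sampling. It's my own simplification, not the paper's released sampler; `mask_predictor`, `MASK_ID`, and the confidence-based unmasking schedule are assumptions for illustration:

```python
import torch

MASK_ID = 999_999  # placeholder; the real [MASK] id comes from the tokenizer

def diffusion_generate(mask_predictor, prompt_ids, gen_len=64, steps=8):
    """Toy reverse masked-diffusion loop: start from an all-masked
    response and refine it over several steps, revealing the most
    confident predictions first. The response is modeled jointly
    instead of strictly left to right."""
    masked_part = torch.full((1, gen_len), MASK_ID, dtype=torch.long)
    x = torch.cat([prompt_ids, masked_part], dim=-1)
    for step in range(steps):
        probs = mask_predictor(x).softmax(dim=-1)   # (1, seq_len, vocab)
        conf, pred = probs.max(dim=-1)              # best guess per position
        still_masked = x == MASK_ID
        remaining = int(still_masked.sum())
        if remaining == 0:
            break
        # Unmask a growing share of positions each step, so the final
        # step (steps - step == 1) reveals everything that's left.
        n_reveal = max(1, remaining // (steps - step))
        conf = torch.where(still_masked, conf, torch.full_like(conf, -1.0))
        reveal = conf.topk(n_reveal, dim=-1).indices
        x[0, reveal[0]] = pred[0, reveal[0]]
    return x
```

The key contrast with the autoregressive loop above: every step scores every position at once, so a blank can draw on context from both sides of it.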

The research behind LLaDA challenges the status quo.

The model not only scales effectively but also performs competitively with state-of-the-art ARMs.

Figure 1 illustrates that LLaDA 8B achieves performance on par with strong LLMs like LLaMA3 8B across various benchmarks, including MMLU, TruthfulQA, ARC-C, GSM8K, MATH, HumanEval, MBPP, CMMLU, and C-Eval.

Table 3 demonstrates LLaDA's ability to address the "reversal curse": it surpasses GPT-4o on a reversal poem-completion task, highlighting its proficiency in tasks that require bidirectional context.

These results suggest that diffusion models are no longer limited to images; they may redefine how we build and scale LLMs.

If you’re interested in discussing this further, I’d love to connect.

Let me know what you think!

Best,
Diogo

