Anthropic Releases New Mythos-Like Model
Digest more
Most AI models are designed to be autoregressive—they generate text left to right one token at a time. DiffusionGemma has more in common with image generation models, which start with static and then denoise it to create the desired content.
Sapient researchers trained a 1B reasoning model on just 40B tokens — scoring competitively with 2B-7B models at a fraction of typical pretraining cost.
