Models & Releases · Products & Apps · The Decoder · Jun 4

xAI updates Grok Imagine to 1.5 with image-to-video generation at 720p resolution

xAI's Grok Imagine 1.5 generates 720p video from a static image plus text direction and supports multi-clip composition for longer narratives. This positions xAI as a serious contender in generative video alongside Runway and OpenAI's Sora, and signals that video synthesis is moving from research artifact to deployable capability. The preview release suggests xAI is moving faster on multimodal generation than many competitors, though 720p remains below the 1080p and 4K resolutions expected of professional output and hints at remaining computational constraints.

Modelwire context

Analyst take

The more telling detail is the 'preview' label: xAI is shipping at 720p while framing it as a milestone, which suggests the release is as much about staking a competitive claim as delivering a finished product. The multi-clip composition feature is the functional differentiator worth watching, not the resolution number.

The Latent Space interview with Ethan He (published just three days before this release) provides the clearest frame here. He described Grok Imagine as built from scratch in roughly three months, with iteration speed on data pipelines and VAE design driving gains faster than architectural novelty. That timeline maps directly onto this 1.5 preview: xAI is shipping incrementally and publicly, using each release to signal velocity rather than waiting for a polished product. That strategy mirrors how OpenAI has handled Sora, where early access and staged rollouts built developer interest ahead of full availability. The difference is that xAI is doing this with a much smaller team and a compressed build history, which makes the pace credible but also means the gap between preview and production-grade output could be wider than the version number implies.

Watch whether xAI reaches 1080p or introduces audio-video synchronization within the next two quarters. If it does, that supports He's claim that iteration velocity is the actual competitive variable. If the preview label persists past Q3 2026, the three-month build story starts to look more like a one-off sprint than a sustainable pace.

Coverage we drew on

Inside xAI: Building Grok Imagine in 3 Months, Videogen vs World Models, and Video Agents, Ethan He · Latent Space

This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.

Mentions: xAI · Grok Imagine · grok-imagine-video-1.5-preview · Runway · OpenAI · Sora

Read full story at The Decoder (the-decoder.com)
