AI Can Now Do 3D Games with a Single Prompt

July 4, 2025

Tencent just pulled off a neat trick for playable worlds. The Hunyuan research team (of Hunyuan video models fame) quietly posted Hunyuan-GameCraft, a model that turns a single still image and a bit of text into a live video stream you can steer with the usual WASD keys or a mouse.

Okay, right now the engine is mostly a 3D world generator with movement via WASD or arrow keys, but it's there. You can move through a world generated from a single prompt, look around, and it keeps generating new footage on the fly, indefinitely.

How does it work?

It’s yet another black-box system, just like Stable Diffusion. Under the hood, GameCraft adds an action encoder to a diffusion video backbone. Instead of treating “W” or an arrow key as a discrete button, it translates every key tap and mouse wiggle into a smooth camera-trajectory vector, then feeds that vector straight into the denoiser so the view glides naturally. To keep scenes from melting after a few seconds, the researchers train with a hybrid history condition: every new two-second chunk is generated while the model peeks at masked versions of the footage it already made, so buildings and mountains stay where they should.
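To make the action-encoder idea concrete, here is a toy sketch of turning discrete inputs into a smooth per-frame camera trajectory. Everything here (function names, axis conventions, speed constants, the smoothing kernel) is my own illustration, not code from the paper:

```python
import numpy as np

# Hypothetical mapping from keys to camera translation directions
# (x = strafe, y = up, z = forward). Values are illustrative only.
KEY_TO_TRANSLATION = {
    "W": np.array([0.0, 0.0, 1.0]),   # forward
    "S": np.array([0.0, 0.0, -1.0]),  # backward
    "A": np.array([-1.0, 0.0, 0.0]),  # strafe left
    "D": np.array([1.0, 0.0, 0.0]),   # strafe right
}

def encode_actions(events, n_frames, speed=0.1, mouse_sens=0.01):
    """Turn (frame, input) events into a per-frame camera trajectory:
    an (n_frames, 6) array of [dx, dy, dz, yaw, pitch, roll] deltas.
    An input is either a key string or a (dx_pixels, dy_pixels) mouse move."""
    traj = np.zeros((n_frames, 6))
    for frame, event in events:
        if isinstance(event, str):
            traj[frame, :3] += speed * KEY_TO_TRANSLATION[event]
        else:
            dx, dy = event
            traj[frame, 3] += mouse_sens * dx  # yaw
            traj[frame, 4] += mouse_sens * dy  # pitch
    # Smooth each channel with a small moving average so the camera
    # glides instead of snapping on the frame the key was pressed.
    kernel = np.ones(5) / 5
    for c in range(6):
        traj[:, c] = np.convolve(traj[:, c], kernel, mode="same")
    return traj

# Two taps of "W", then a mouse flick to the right at frame 5
traj = encode_actions([(0, "W"), (1, "W"), (5, (30, 0))], n_frames=16)
```

In the real model this continuous trajectory is what conditions the denoiser; how exactly it is injected (cross-attention or otherwise) is beyond this sketch.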

Overall architecture of Hunyuan-GameCraft
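The hybrid history condition is easiest to see as a loop. Below is a toy sketch of the autoregressive rollout: every new chunk is generated while conditioning on a randomly masked version of the frames produced so far. All details are invented for illustration; in the actual model the conditioning happens inside a diffusion transformer, not a running mean:

```python
import numpy as np

rng = np.random.default_rng(0)

def generate_chunk(history, chunk_len=8, feat_dim=4, mask_ratio=0.5):
    """One autoregressive step with (toy) hybrid history conditioning:
    mask part of the already-generated frames, summarize what survives,
    and emit a new chunk that stays close to that context so the scene
    doesn't drift. Stand-in for the real denoiser."""
    if len(history) == 0:
        context_mean = np.zeros(feat_dim)            # first chunk: no history
    else:
        keep = rng.random(len(history)) > mask_ratio
        kept = history[keep] if keep.any() else history
        context_mean = kept.mean(axis=0)             # masked-history summary
    return context_mean + 0.01 * rng.standard_normal((chunk_len, feat_dim))

# Roll out three chunks; each one conditions on masked prior frames
frames = np.zeros((0, 4))
for _ in range(3):
    frames = np.vstack([frames, generate_chunk(frames)])
```

The point of the masking is that the model learns to stay consistent with its own history without being able to copy it verbatim.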

Some examples of generated games. You can see the videos on the GitHub page.

Training and results

The training diet is huge: more than a million clips ripped from over a hundred AAA titles, think Cyberpunk 2077, Assassin’s Creed, and Red Dead Redemption 2, plus three thousand synthetic sequences rendered with perfect camera-pose labels to teach the network how to bank, yaw and strafe without losing its bearings. That breadth lets the model generalise far beyond the Minecraft-style footage most earlier research relied on. The Chinese don’t care about copyright, but neither does OpenAI.

Numbers from the paper back up the flashy demo. A 30-person user study put it first for image quality, motion smoothness and action accuracy, with scores north of 4.4/5 across the board. Even the distilled “PCM” version (it’s the same PCM scheduler used in Stable Diffusion), which trades some fidelity for speed, still beats most baselines while jumping to real-time-ish frame rates.

There are caveats. The current action vocabulary is mainly about exploring: walking, orbiting, panning. No shooting, sword-swinging or physics yet, and the output is a video, not an executable level; you can’t bump into walls or trigger quests. Real-time runs top out at 1280 × 720, and anything higher eats GPUs for breakfast. The authors openly list these gaps and say combat verbs and physical interactions are next on the roadmap.

Do you need a fast GPU for that? Sure you do! Six frames per second at 720p on a single A100. That’s slow for gaming, but fast enough that a web demo feels interactive.

So why is Tencent pouring resources into this? The company has been making a public push to open-source parts of its Hunyuan stack. All the major video models they released are open source.

Where does it go next? Into the creation of full games. I’d guess we are two to three years away from generating a full working 3D shooter from just a prompt. Even in its current state, though, Hunyuan-GameCraft shows that in the future you will simply type something like:

A Dark Souls dungeon crawler with gargoyles, a Skyrim-style open world, and recruitable NPCs ripped from Baldur’s Gate 3 you can have lewd romance with, uncensored, high difficulty, runnable on my 3070, six alternate endings, big swords, complex crafting system.

There, I made that prompt up. And you will get what you wish for. (Although Chinese models are always strictly censored, even the open-source ones, so.)

Maciej Wlodarczak

My book "Stable Diffusion Handbook" is out now in print and digital on Amazon and Gumroad!
Lots of stuff inside and a pretty attractive package for people wanting to start working with graphical generative artificial intelligence. Learn how to use Stable Diffusion and start making fantastic AI artwork with this book now!

