A world model is an AI model that learns a representation of an environment and predicts how that environment will change over time, including how actions affect future states. It gives an agent a mechanism for anticipating consequences rather than learning only from direct trial and error.
World models may operate over compact latent states, images, video, physical sensor data, game states, or combinations of modalities. Given a current state and a candidate action, the model estimates a future state, observation, reward, or probability distribution over possible outcomes.
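The prediction interface described above can be made concrete with a minimal sketch. It assumes a tiny discrete environment and a count-based model; the class name `TabularWorldModel`, its methods, and the door transitions are hypothetical names chosen for illustration, not part of any particular library. Practical world models typically replace the counts with a learned neural network over latent states, but the contract is the same: a state and an action go in, a distribution over outcomes and an expected reward come out.

```python
from collections import Counter, defaultdict


class TabularWorldModel:
    """Count-based estimate of P(next_state | state, action) and mean reward.

    Illustrative only: real world models usually learn these quantities
    with function approximation rather than lookup tables.
    """

    def __init__(self):
        self.next_counts = defaultdict(Counter)  # (state, action) -> next-state counts
        self.reward_sum = defaultdict(float)     # (state, action) -> summed reward
        self.visits = Counter()                  # (state, action) -> number of samples

    def update(self, state, action, next_state, reward):
        """Record one observed transition."""
        key = (state, action)
        self.next_counts[key][next_state] += 1
        self.reward_sum[key] += reward
        self.visits[key] += 1

    def predict(self, state, action):
        """Return (distribution over next states, expected reward)."""
        key = (state, action)
        n = self.visits[key]
        if n == 0:
            return {}, 0.0  # no data for this state-action pair
        dist = {s: c / n for s, c in self.next_counts[key].items()}
        return dist, self.reward_sum[key] / n


# Hypothetical logged experience: (state, action, next_state, reward)
transitions = [
    ("door_closed", "push", "door_open", 1.0),
    ("door_closed", "push", "door_open", 1.0),
    ("door_closed", "push", "door_closed", 0.0),
    ("door_closed", "wait", "door_closed", 0.0),
]

model = TabularWorldModel()
for transition in transitions:
    model.update(*transition)

dist, expected_reward = model.predict("door_closed", "push")
```

Returning a distribution rather than a single next state lets a caller reason about uncertain outcomes, which matters when the same action does not always produce the same result.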
An agent can use a world model to:
- simulate alternative action sequences before acting;
- plan toward a target state;
- generate training environments or synthetic experience;
- learn policies with fewer interactions in the real world; and
- detect when observed behavior differs from expected dynamics.
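Two of the uses above, simulating alternative action sequences to plan toward a target and flagging observations that differ from expected dynamics, can be sketched in a few lines. The example assumes a deliberately trivial one-dimensional environment; `model_step`, `plan`, `is_surprising`, and the tolerance value are hypothetical stand-ins for a learned model and its tooling. The planner enumerates every action sequence exhaustively, which is only feasible for tiny action sets and short horizons; larger problems generally need sampling or optimization-based search.

```python
from itertools import product

ACTIONS = (-1, 0, 1)  # hypothetical action set: move left, stay, move right


def model_step(state, action):
    """Stand-in for a learned dynamics model: position shifts by `action`."""
    return state + action


def rollout(state, actions):
    """Simulate an action sequence inside the model and return visited states."""
    states = []
    for action in actions:
        state = model_step(state, action)
        states.append(state)
    return states


def plan(state, target, horizon=3):
    """Score every action sequence in imagination and return the cheapest."""
    def cost(actions):
        # Total distance from the target along the imagined trajectory.
        return sum(abs(s - target) for s in rollout(state, actions))

    return min(product(ACTIONS, repeat=horizon), key=cost)


def is_surprising(state, action, observed_next, tolerance=0.5):
    """Flag a real observation that disagrees with the model's prediction."""
    return abs(observed_next - model_step(state, action)) > tolerance


best = plan(state=0, target=2, horizon=3)
```

In a control loop, an agent would typically execute only the first action of `best`, observe the real outcome, and replan, so that model errors are corrected by fresh observations instead of compounding.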
World models are related to generative video models but are not synonymous with them. A video model may generate visually plausible sequences without supporting controllable actions or stable environment dynamics. A useful world model must represent action-conditioned change well enough for planning or interaction.
Applications include robotics, autonomous systems, games, scientific simulation, and training AI agents in generated environments. Such systems may combine a Vision-Language Model (VLM) for semantic understanding with a specialized dynamics predictor.
The central failure mode is model error that accumulates over long simulated horizons. An agent can exploit inaccuracies in the learned environment or build plans that pass through physically implausible states. Evaluation should therefore measure controllability, temporal consistency, the accuracy of predicted action consequences, and transfer to the real environment.
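The accumulation of error over long horizons can be illustrated with a toy comparison. The sketch below assumes a scalar environment and a learned model whose only flaw is a slightly wrong coefficient; `true_step`, `learned_step`, `compare_errors`, and the constants 0.9 and 0.95 are invented for illustration. It contrasts one-step prediction error (the model always starts from the true state) with open-loop error (the model feeds on its own predictions).

```python
def true_step(x, a):
    """Stand-in for the real environment (assumed for illustration)."""
    return 0.9 * x + a


def learned_step(x, a):
    """Stand-in learned model with a small coefficient error."""
    return 0.95 * x + a


def compare_errors(x0, actions):
    """Return per-step (open_loop, one_step) absolute prediction errors."""
    real = imagined = x0
    open_loop, one_step = [], []
    for a in actions:
        # One-step: predict from the true current state.
        one_step.append(abs(learned_step(real, a) - true_step(real, a)))
        # Open-loop: predict from the model's own previous prediction.
        imagined = learned_step(imagined, a)
        real = true_step(real, a)
        open_loop.append(abs(imagined - real))
    return open_loop, one_step


open_loop, one_step = compare_errors(x0=1.0, actions=[1.0] * 20)

for t in (0, 9, 19):
    print(f"step {t + 1}: one-step={one_step[t]:.3f}  open-loop={open_loop[t]:.3f}")
```

In this toy setup the two errors coincide at the first step, after which the open-loop error grows at every step while the one-step error stays comparatively small. This is one reason a model that looks accurate on single-step metrics can still mislead a planner, and why evaluation over multi-step rollouts is worth reporting separately.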
In its Genie 3 overview, Google DeepMind defines world models as systems that simulate aspects of the world and enable agents to predict how an environment will evolve and how their actions will affect it.