Context Engineering

Context engineering is the practice of designing and managing all information available to a Large Language Model (LLM) when it performs a task. This includes the system message, user instructions, conversation history, retrieved documents, tool definitions, tool results, examples, memory, and intermediate state.

The term extends prompt engineering. Prompt engineering focuses primarily on the wording and structure of instructions. Context engineering treats the complete input assembled at runtime as a system design problem.

A model's context window is finite, but filling the available window is not necessarily beneficial. Irrelevant or duplicated information can dilute important instructions, increase latency and cost, and aggravate the Lost-in-the-Middle Effect, in which a model uses information in the middle of a long input less reliably than information near its start or end. Effective context engineering therefore aims to provide the smallest set of high-signal tokens required for the current decision.
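One way to picture "the smallest set of high-signal tokens" is a greedy selection under a fixed budget. The sketch below is illustrative only: `approx_tokens` is a crude word count standing in for a real tokenizer, and the relevance scores are assumed to come from some upstream ranking step that is not shown here.

```python
def approx_tokens(text):
    # Assumption: whitespace-separated words as a rough proxy for tokens.
    return len(text.split())


def select_within_budget(candidates, budget):
    """Pick the highest-scoring snippets that fit in `budget` approximate tokens.

    candidates: list of (relevance_score, text) pairs (scores are hypothetical).
    Returns the chosen texts and the approximate tokens they consume.
    """
    chosen = []
    used = 0
    # Visit candidates from most to least relevant.
    for score, text in sorted(candidates, key=lambda c: c[0], reverse=True):
        cost = approx_tokens(text)
        if used + cost <= budget:  # skip anything that would overflow the budget
            chosen.append(text)
            used += cost
    return chosen, used
```

A greedy pass like this is not optimal in general, but it makes the trade-off explicit: a low-relevance snippet is left out even when the window could technically hold more.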

Common context engineering techniques include:

selecting only relevant documents through Retrieval-Augmented Generation (RAG);
placing stable instructions and examples before variable task data;
returning concise, structured results from tool calls;
using prompt caching for long, repeated prefixes;
loading agent skills and tool definitions only when relevant;
separating durable agent memory from temporary working context;
pruning obsolete messages and duplicate observations;
applying context compaction during long-running interactions; and
measuring which context elements improve task success through agent evaluation.
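Two of the techniques above, placing stable content before variable data and pruning duplicates, can be sketched as a single assembly step. The function names and the `[doc N]` and `Task:` labels are hypothetical choices for illustration, not a prescribed format. Keeping the stable part byte-identical at the front is also what prefix-based prompt caching generally relies on.

```python
def dedupe(items):
    """Drop repeated items, ignoring case and whitespace differences.

    The first occurrence of each item is kept, in its original position.
    """
    seen = set()
    unique = []
    for item in items:
        key = " ".join(item.split()).lower()
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique


def assemble_context(stable_parts, retrieved_docs, task):
    """Build a prompt: stable prefix, then deduplicated documents, then the task."""
    # Stable instructions and examples come first and do not change per request.
    sections = list(stable_parts)
    # Variable data follows: retrieved documents with duplicates removed.
    for i, doc in enumerate(dedupe(retrieved_docs), start=1):
        sections.append(f"[doc {i}] {doc}")
    # The current task goes last.
    sections.append(f"Task: {task}")
    return "\n\n".join(sections)
```

For example, passing two near-identical retrieved passages and one distinct passage yields a prompt with two `[doc N]` sections rather than three.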

For an AI agent, context changes after every action. Tool outputs, plans, errors, user approvals, and environmental observations must be incorporated without allowing the prompt to grow indefinitely. Context engineering is consequently an ongoing runtime process rather than a one-time writing exercise.
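A minimal sketch of this runtime behaviour is a compaction step run after each action. Everything here is an assumption for illustration: messages are plain strings, `approx_tokens` is a word count rather than a real tokenizer, and `summarize` is a placeholder for what would normally be a model call.

```python
def approx_tokens(text):
    # Assumption: whitespace-separated words as a rough proxy for tokens.
    return len(text.split())


def summarize(messages):
    # Placeholder summarizer: keeps the first five words of each message.
    # A real agent would typically ask a model to write this summary.
    parts = [" ".join(m.split()[:5]) for m in messages]
    return "Summary of earlier steps: " + " | ".join(parts)


def compact(history, budget, keep_recent=2):
    """Replace older messages with one summary when history exceeds `budget`.

    The most recent `keep_recent` messages are kept verbatim. History that is
    within budget, or too short to split, is returned unchanged.
    """
    total = sum(approx_tokens(m) for m in history)
    if total <= budget or len(history) <= keep_recent:
        return history
    older, recent = history[:-keep_recent], history[-keep_recent:]
    return [summarize(older)] + recent
```

A single pass like this does not guarantee the result fits the budget. A real loop might summarize more aggressively or repeat the step, and would usually keep system instructions outside the compacted span.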

A useful implementation principle is to treat context as an attention budget. Every inserted token should have a clear purpose, provenance, and expected effect on the next model decision. Anthropic describes this approach in its guide to effective context engineering for AI agents.
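The budget idea can be made concrete by attaching a purpose and a provenance to every piece of context and then reporting how the budget is spent. The `ContextItem` type and its field names below are hypothetical, and token counts are approximated by word counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextItem:
    text: str
    purpose: str     # why the item is included, e.g. "instructions"
    provenance: str  # where it came from, e.g. "tool:search"


def approx_tokens(text):
    # Assumption: whitespace-separated words as a rough proxy for tokens.
    return len(text.split())


def spend_by_purpose(items):
    """Total approximate tokens per purpose; reject items with no labels."""
    spend = {}
    for item in items:
        if not item.purpose.strip() or not item.provenance.strip():
            raise ValueError(f"unlabeled context item: {item.text[:30]!r}")
        spend[item.purpose] = spend.get(item.purpose, 0) + approx_tokens(item.text)
    return spend
```

A per-purpose breakdown like this shows where the window is going, for instance when tool output is crowding out instructions, and gives evaluation runs something concrete to vary.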

