# Context Engineering: Building Stateful AI Agents with Sessions & Memory

## The Problem: LLMs Have Amnesia
Large Language Models are inherently stateless. Every API call is a fresh start—they don't remember your previous conversations, preferences, or the context from five minutes ago. This is a fundamental challenge for building intelligent agents.
Context Engineering solves this by dynamically assembling and managing information within the LLM's context window for every turn of a conversation.
Think of it as mise en place for AI—gathering all the right ingredients before cooking.
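Concretely, assembling one turn's context might look like the sketch below. All names are illustrative; in a real agent, `memories` would come from the memory store and `history` from the session's events:

```python
def build_context(memories, history, user_message):
    """Assemble the context window for a single turn."""
    system = "You are a helpful assistant."
    if memories:  # stable, cross-session facts about the user
        system += "\nKnown about the user:\n" + "\n".join(f"- {m}" for m in memories)
    return (
        [{"role": "system", "content": system}]        # instructions + profile
        + history                                      # events from this session
        + [{"role": "user", "content": user_message}]  # the new turn
    )

ctx = build_context(
    memories=["Prefers window seats"],
    history=[{"role": "user", "content": "Book me a flight."},
             {"role": "assistant", "content": "Where to?"}],
    user_message="Paris, next Friday.",
)
```

The same assembly runs on every turn, which is the "dynamic" part: different memories and a longer history produce a different context each time.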
## The Two Pillars: Sessions & Memory

### Sessions = Your Workbench

A session is a container for a single conversation, holding:
- **Events**: the chronological history (user messages, agent responses, tool calls)
- **State**: working memory, a scratchpad for the current task
Like a desk covered with notes and tools for your current project.
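A minimal session container could be sketched as follows; this is illustrative, not any specific framework's API:

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class Session:
    """Container for one conversation: events plus working state."""
    session_id: str
    events: list[dict] = field(default_factory=list)     # chronological history
    state: dict[str, Any] = field(default_factory=dict)  # task scratchpad

    def append_event(self, role: str, content: str) -> None:
        """Record one turn of the conversation."""
        self.events.append({"role": role, "content": content})

s = Session(session_id="abc-123")
s.append_event("user", "Find hotels in Lisbon")
s.state["destination"] = "Lisbon"  # working memory for the current task
```

When the conversation ends, the events are the raw material the memory pipeline extracts from; the state is usually discarded with the session.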
### Memory = Your Filing Cabinet

Memory provides long-term persistence across sessions:
- Captures key information from conversations
- Consolidates and organizes knowledge
- Retrieved when relevant to new interactions
Like organized folders you can pull from for any future project.
## How Context Flows Through an Agent

On each turn, the agent assembles relevant memories and the session's event history into the prompt, calls the LLM, appends the exchange to the session, and feeds notable new information back into memory.

## Memory Deep Dive

### What Makes Memory Different from RAG?
| Aspect | RAG | Memory |
|---|---|---|
| Purpose | Expert on facts | Expert on the user |
| Data Source | Static documents | Dynamic conversations |
| Scope | Shared/global | User-isolated |
| Write Pattern | Batch/admin | Event-driven |
### The Memory Lifecycle

#### Extraction: What to Remember?

Not everything is worth remembering. The system filters for:
- ✅ User preferences ("I prefer window seats")
- ✅ Key facts (names, dates, goals)
- ✅ Important decisions
- ❌ Pleasantries and filler
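In production this filter is typically an LLM call that classifies each utterance; the toy sketch below uses keyword heuristics as a stand-in, with both cue lists invented for illustration:

```python
# Toy extraction filter: keyword heuristics standing in for an LLM classifier.
PREFERENCE_CUES = ("i prefer", "i like", "i always", "my goal", "my name is")
FILLER = {"thanks", "thanks!", "ok", "hello", "hi", "great, thanks"}

def worth_remembering(utterance: str) -> bool:
    text = utterance.strip().lower()
    if text in FILLER:  # pleasantries and filler: drop
        return False
    # preferences and key facts: keep
    return any(cue in text for cue in PREFERENCE_CUES)

print(worth_remembering("I prefer window seats"))  # True
print(worth_remembering("Great, thanks"))          # False
```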
#### Consolidation: Keeping Memory Clean
| Problem | Solution |
|---|---|
| Duplicate info | Merge related memories |
| Contradictions | Use newer info, track confidence |
| Stale data | Time-decay or TTL policies |
## Memory Retrieval Strategies

### When to Retrieve?

Retrieval can run proactively on every turn or on demand, when the agent decides it needs user context; either way, only memories relevant to the current query should enter the prompt.
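A common trigger is similarity between the new user message and stored memories. The sketch below uses word overlap as a cheap stand-in for embedding similarity; the threshold value is arbitrary:

```python
def overlap(query: str, memory: str) -> float:
    """Jaccard word overlap, a cheap stand-in for embedding similarity."""
    q, m = set(query.lower().split()), set(memory.lower().split())
    return len(q & m) / len(q | m) if q | m else 0.0

def retrieve(query: str, memories: list[str], threshold: float = 0.2) -> list[str]:
    """Pull in a memory only when its relevance clears the threshold."""
    scored = sorted(((overlap(query, m), m) for m in memories), reverse=True)
    return [m for score, m in scored if score >= threshold]

memories = ["user prefers window seats", "user is allergic to peanuts"]
print(retrieve("window seats", memories))  # ['user prefers window seats']
```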
### Where to Place Memories?
- **System instructions**: stable, global context (user profile)
- **Conversation history**: dynamic, turn-specific context
- **Tool output**: retrieved on demand via memory-as-a-tool
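The third placement can be sketched as a tool the model may call mid-turn. The declaration below uses the generic JSON-schema style many LLM APIs accept; `MEMORY_STORE` and both names are illustrative:

```python
# Memory-as-a-tool: instead of pushing memories into every prompt, the model
# calls a lookup tool when it decides it needs user context.
MEMORY_STORE = {
    "seat_preference": "window",
    "home_airport": "SFO",
}

SEARCH_MEMORY_TOOL = {
    "name": "search_memory",
    "description": "Look up stored facts about the current user.",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}

def search_memory(query: str) -> str:
    """Runs when the model calls the tool; the return value enters the context as tool output."""
    hits = {k: v for k, v in MEMORY_STORE.items() if query.lower() in k}
    return str(hits) if hits else "no matching memories"
```

The trade-off: tool-based retrieval keeps prompts lean, but costs an extra model round-trip whenever the tool fires.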
## Managing Long Conversations

As conversations grow, you hit limits: context-window size, cost, latency, and quality degradation ("context rot").
### Compaction Strategies
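One widely used strategy is to summarize older turns once the history exceeds a token budget, keeping recent turns verbatim. A minimal sketch, where `summarize` is a hypothetical stand-in for an LLM call and tokens are approximated by word counts:

```python
def summarize(events: list[dict]) -> str:
    """Hypothetical stand-in for an LLM summarization call."""
    return f"[summary of {len(events)} earlier turns]"

def compact(events: list[dict], budget_tokens: int, keep_recent: int = 4) -> list[dict]:
    """Fold older turns into a summary once the history outgrows the budget."""
    size = sum(len(e["content"].split()) for e in events)  # crude token estimate
    if size <= budget_tokens or len(events) <= keep_recent:
        return events  # still fits: keep everything verbatim
    older, recent = events[:-keep_recent], events[-keep_recent:]
    return [{"role": "system", "content": summarize(older)}] + recent
```

Alternatives include sliding windows (drop the oldest turns outright) and recursive summarization (re-summarize the summary as it grows).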
## Key Takeaways

- **Context Engineering** = dynamically assembling the right information for each LLM call
- **Sessions** handle the now (single-conversation state)
- **Memory** handles the forever (cross-session personalization)
- **Memory is an active ETL pipeline**, not just a database:
  - **Extract** meaningful info
  - **Consolidate** with existing knowledge
  - **Retrieve** when relevant
- Run memory generation in the background to keep the UX snappy
- Security is critical: isolate user data, redact PII, prevent memory poisoning
## Quick Reference: Memory vs Session

| | Session | Memory |
|---|---|---|
| Scope | One conversation | Across sessions |
| Holds | Events + working state | Consolidated user knowledge |
| Lifetime | The "now" | Long-term |

Based on "Context Engineering: Sessions, Memory" by Google (November 2025)