# Context Engineering: Building Stateful AI Agents with Sessions & Memory

## The Problem: LLMs Have Amnesia
Large Language Models are inherently stateless. Every API call is a fresh start—they don't remember your previous conversations, preferences, or the context from five minutes ago. This is a fundamental challenge for building intelligent agents.
Context Engineering solves this by dynamically assembling and managing information within the LLM's context window for every turn of a conversation.
Think of it as mise en place for AI—gathering all the right ingredients before cooking.
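Concretely, assembling one turn's context might look like the sketch below. All names are illustrative; in a real agent, `memories` would come from the memory store and `history` from the session's events:

```python
def build_context(memories, history, user_message):
    """Assemble the context window for a single turn."""
    system = "You are a helpful assistant."
    if memories:  # stable, cross-session facts about the user
        system += "\nKnown about the user:\n" + "\n".join(f"- {m}" for m in memories)
    return (
        [{"role": "system", "content": system}]        # instructions + profile
        + history                                      # events from this session
        + [{"role": "user", "content": user_message}]  # the new turn
    )

ctx = build_context(
    memories=["Prefers window seats"],
    history=[{"role": "user", "content": "Book me a flight."},
             {"role": "assistant", "content": "Where to?"}],
    user_message="Paris, next Friday.",
)
```

The same assembly runs on every turn, which is the "dynamic" part: different memories and a longer history produce a different context each time.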
## The Two Pillars: Sessions & Memory

### Sessions = Your Workbench

A session is a container for a single conversation, holding:
- **Events**: the chronological history (user messages, agent responses, tool calls)
- **State**: working memory, a scratchpad for the current task
Like a desk covered with notes and tools for your current project.
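A minimal session container could be sketched as follows; this is illustrative, not any specific framework's API:

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class Session:
    """Container for one conversation: events plus working state."""
    session_id: str
    events: list[dict] = field(default_factory=list)     # chronological history
    state: dict[str, Any] = field(default_factory=dict)  # task scratchpad

    def append_event(self, role: str, content: str) -> None:
        """Record one turn of the conversation."""
        self.events.append({"role": role, "content": content})

s = Session(session_id="abc-123")
s.append_event("user", "Find hotels in Lisbon")
s.state["destination"] = "Lisbon"  # working memory for the current task
```

When the conversation ends, the events are the raw material the memory pipeline extracts from; the state is usually discarded with the session.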
### Memory = Your Filing Cabinet

Memory provides long-term persistence across sessions:
- Captures key information from conversations
- Consolidates and organizes knowledge
- Retrieved when relevant to new interactions
Like organized folders you can pull from for any future project.
## How Context Flows Through an Agent

On each turn, the agent assembles relevant memories and the session's event history into the prompt, calls the LLM, appends the exchange to the session, and feeds notable new information back into memory.

## Memory Deep Dive

### What Makes Memory Different from RAG?
| Aspect | RAG | Memory |
|---|---|---|
| Purpose | Expert on facts | Expert on the user |
| Data Source | Static documents | Dynamic conversations |
| Scope | Shared/global | User-isolated |
| Write Pattern | Batch/admin | Event-driven |
### The Memory Lifecycle

#### Extraction: What to Remember?

Not everything is worth remembering. The system filters for:
- ✅ User preferences ("I prefer window seats")
- ✅ Key facts (names, dates, goals)
- ✅ Important decisions
- ❌ Pleasantries and filler
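In production this filter is typically an LLM call that classifies each utterance; the toy sketch below uses keyword heuristics as a stand-in, with both cue lists invented for illustration:

```python
# Toy extraction filter: keyword heuristics standing in for an LLM classifier.
PREFERENCE_CUES = ("i prefer", "i like", "i always", "my goal", "my name is")
FILLER = {"thanks", "thanks!", "ok", "hello", "hi", "great, thanks"}

def worth_remembering(utterance: str) -> bool:
    text = utterance.strip().lower()
    if text in FILLER:  # pleasantries and filler: drop
        return False
    # preferences and key facts: keep
    return any(cue in text for cue in PREFERENCE_CUES)

print(worth_remembering("I prefer window seats"))  # True
print(worth_remembering("Great, thanks"))          # False
```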
#### Consolidation: Keeping Memory Clean
| Problem | Solution |
|---|---|
| Duplicate info | Merge related memories |
| Contradictions | Use newer info, track confidence |
| Stale data | Time-decay or TTL policies |
## Memory Retrieval Strategies

### When to Retrieve?

Retrieval can run proactively on every turn or on demand, when the agent decides it needs user context; either way, only memories relevant to the current query should enter the prompt.
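A common trigger is similarity between the new user message and stored memories. The sketch below uses word overlap as a cheap stand-in for embedding similarity; the threshold value is arbitrary:

```python
def overlap(query: str, memory: str) -> float:
    """Jaccard word overlap, a cheap stand-in for embedding similarity."""
    q, m = set(query.lower().split()), set(memory.lower().split())
    return len(q & m) / len(q | m) if q | m else 0.0

def retrieve(query: str, memories: list[str], threshold: float = 0.2) -> list[str]:
    """Pull in a memory only when its relevance clears the threshold."""
    scored = sorted(((overlap(query, m), m) for m in memories), reverse=True)
    return [m for score, m in scored if score >= threshold]

memories = ["user prefers window seats", "user is allergic to peanuts"]
print(retrieve("window seats", memories))  # ['user prefers window seats']
```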
### Where to Place Memories?
- **System instructions**: stable, global context (user profile)
- **Conversation history**: dynamic, turn-specific context
- **Tool output**: retrieved on demand via memory-as-a-tool
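The third placement can be sketched as a tool the model may call mid-turn. The declaration below uses the generic JSON-schema style many LLM APIs accept; `MEMORY_STORE` and both names are illustrative:

```python
# Memory-as-a-tool: instead of pushing memories into every prompt, the model
# calls a lookup tool when it decides it needs user context.
MEMORY_STORE = {
    "seat_preference": "window",
    "home_airport": "SFO",
}

SEARCH_MEMORY_TOOL = {
    "name": "search_memory",
    "description": "Look up stored facts about the current user.",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}

def search_memory(query: str) -> str:
    """Runs when the model calls the tool; the return value enters the context as tool output."""
    hits = {k: v for k, v in MEMORY_STORE.items() if query.lower() in k}
    return str(hits) if hits else "no matching memories"
```

The trade-off: tool-based retrieval keeps prompts lean, but costs an extra model round-trip whenever the tool fires.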
## Managing Long Conversations

As conversations grow, you hit limits: context-window size, cost, latency, and quality degradation ("context rot").
### Compaction Strategies
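One widely used strategy is to summarize older turns once the history exceeds a token budget, keeping recent turns verbatim. A minimal sketch, where `summarize` is a hypothetical stand-in for an LLM call and tokens are approximated by word counts:

```python
def summarize(events: list[dict]) -> str:
    """Hypothetical stand-in for an LLM summarization call."""
    return f"[summary of {len(events)} earlier turns]"

def compact(events: list[dict], budget_tokens: int, keep_recent: int = 4) -> list[dict]:
    """Fold older turns into a summary once the history outgrows the budget."""
    size = sum(len(e["content"].split()) for e in events)  # crude token estimate
    if size <= budget_tokens or len(events) <= keep_recent:
        return events  # still fits: keep everything verbatim
    older, recent = events[:-keep_recent], events[-keep_recent:]
    return [{"role": "system", "content": summarize(older)}] + recent
```

Alternatives include sliding windows (drop the oldest turns outright) and recursive summarization (re-summarize the summary as it grows).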
## Key Takeaways

- **Context Engineering** = dynamically assembling the right information for each LLM call
- **Sessions** handle the now (single-conversation state)
- **Memory** handles the forever (cross-session personalization)
- **Memory is an active ETL pipeline**, not just a database:
  - **Extract** meaningful info
  - **Consolidate** with existing knowledge
  - **Retrieve** when relevant
- Run memory generation in the background to keep the UX snappy
- Security is critical: isolate user data, redact PII, prevent memory poisoning
## Quick Reference: Memory vs Session

| | Session | Memory |
|---|---|---|
| Scope | One conversation | Across sessions |
| Holds | Events + working state | Consolidated user knowledge |
| Lifetime | The "now" | Long-term |

Based on "Context Engineering: Sessions, Memory" by Google (November 2025)