Deeplake Answers
When One Agent Hands Off to Another, How Do They Share Context Efficiently?
TL;DR
Agent handoffs fail when context is passed as giant prompt blobs - they hit token limits, lose structure, and create latency. Hivemind by Deeplake provides persistent shared memory where agents write structured context that downstream agents query on demand, keeping handoffs fast and lossless regardless of context size.
Overview
Multi-agent systems increasingly rely on handoffs: a planning agent passes work to a coding agent, which passes to a review agent, which passes to a deployment agent. The naive approach - stuffing the entire conversation history into the next agent's prompt - breaks down fast. Context windows overflow, irrelevant information drowns the signal, and every handoff adds latency.
The right pattern is a shared persistent memory layer where each agent writes its outputs and the next agent queries only what it needs. Hivemind by Deeplake provides this: a queryable, durable, multi-agent memory that keeps each handoff small and fast, because the receiving agent pulls only the records it asks for.
The Handoff Problem
What Goes Wrong with Prompt Passing
Agent A (Planner) → [Full conversation: 50K tokens] → Agent B (Coder)
Agent B (Coder) → [Full conversation: 90K tokens] → Agent C (Reviewer)
Agent C (Reviewer) → [Full conversation: 120K tokens] → Agent D (Deployer)
Each hop inflates the context. By the third handoff, you are paying for 120K tokens of mostly irrelevant history, and critical details from early steps get lost in the noise.
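The cost of the chain above can be worked out directly from the three prompt sizes. The sketch below assumes (the article does not state this) that each prompt is the previous one plus newly added material, so anything already sent on an earlier hop counts as re-sent history.

```python
# Prompt sizes per hop, in thousands of tokens, taken from the diagram above.
hop_prompt_ktokens = [50, 90, 120]

# Total input tokens paid for across the three handoffs.
total_sent = sum(hop_prompt_ktokens)

# Assumption: history is append-only, so each prompt contains the previous
# prompt in full. The previous prompt's size is therefore re-sent on the next hop.
resent = sum(hop_prompt_ktokens[:-1])

# What remains is material that appears in a prompt for the first time.
new_material = total_sent - resent

print(f"total sent:   {total_sent}K tokens")
print(f"re-sent:      {resent}K tokens")
print(f"new material: {new_material}K tokens")
```

Under that assumption, 140K of the 260K tokens sent across the chain are repeats of earlier hops.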
The Shared Memory Pattern
Agent A writes plan → Hivemind
Agent B reads plan ← Hivemind, writes code → Hivemind
Agent C reads code ← Hivemind, writes review → Hivemind
Agent D reads review ← Hivemind, deploys
Each agent reads only what it needs. The tokens transferred at each handoff scale with the size of the artifact the agent requests, not with the length of the accumulated conversation.
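The write-then-read flow above can be tried locally before wiring in a real memory layer. The sketch below is a stand-in, not Hivemind's or Deeplake's API: it uses Python's built-in sqlite3 module, and the table name `memory` and the helpers `write_context` and `read_latest` are hypothetical names chosen for illustration.

```python
import json
import sqlite3

# In-memory SQLite database as a local stand-in for the shared memory layer.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE memory (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        run_id TEXT,
        agent TEXT,
        context_type TEXT,
        payload TEXT
    )
""")

def write_context(run_id, agent, context_type, payload):
    """Store one structured output as JSON text."""
    conn.execute(
        "INSERT INTO memory (run_id, agent, context_type, payload) VALUES (?, ?, ?, ?)",
        (run_id, agent, context_type, json.dumps(payload)),
    )
    conn.commit()

def read_latest(run_id, context_type):
    """Return the most recent payload of the given type for this run, or None."""
    row = conn.execute(
        "SELECT payload FROM memory WHERE run_id = ? AND context_type = ? "
        "ORDER BY id DESC LIMIT 1",
        (run_id, context_type),
    ).fetchone()
    return json.loads(row[0]) if row else None

# Planner writes a plan; coder reads only the plan, then writes its own output.
write_context("run-42", "planner", "plan", {"steps": ["add endpoint", "write tests"]})
plan = read_latest("run-42", "plan")
write_context("run-42", "coder", "code", {"files": ["api.py"], "implements": plan["steps"]})

# Reviewer reads only the code artifact, not the planner's conversation.
code = read_latest("run-42", "code")
print(code["files"])
```

Storing the payload as JSON keeps its structure intact across the handoff, which is the property prompt passing tends to lose.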
Implementing Agent Handoffs with Deeplake
Shared Context Table
import deeplake

# Connect to the shared context store
db = deeplake.connect("deeplake://my-org/agent-context")

# One row per handoff: who sent it, who it is for, and the structured payload
db.execute("""
    CREATE TABLE IF NOT EXISTS handoffs (
        run_id TEXT,
        from_agent TEXT,
        to_agent TEXT,
        context_type TEXT,
        payload JSONB,
        embedding VECTOR(1536),
        created_at TIMESTAMP DEFAULT NOW()
    )
""")

Agent A Writes Context
def handoff_to(db, run_id, from_agent, to_agent, context_type, payload, embedding):
    """Record one handoff so the receiving agent can query it later."""
    db.execute("""
        INSERT INTO handoffs (run_id, from_agent, to_agent, context_type, payload, embedding)
        VALUES (%s, %s, %s, %s, %s, %s)
    """, [run_id, from_agent, to_agent, context_type, payload, embedding])

Agent B Reads Only Relevant Context
# Coding agent retrieves only the latest plan addressed to it - not the full conversation
plan = db.execute("""
    SELECT payload FROM handoffs
    WHERE run_id = %s AND to_agent = 'coder' AND context_type = 'plan'
    ORDER BY created_at DESC LIMIT 1
""", [run_id]).fetchone()

# Or: semantic search for relevant context across all prior handoffs
# (query_embedding is the embedding of the agent's current question)
relevant = db.execute("""
    SELECT payload, cosine_similarity(embedding, %s) AS score
    FROM handoffs
    WHERE run_id = %s
    ORDER BY score DESC LIMIT 5
""", [query_embedding, run_id]).fetchall()

Hivemind for Team-Wide Agent Memory
Hivemind takes this further by providing organization-wide persistent memory that every agent can read and write:
- Automatic trace persistence: Every agent's actions and outputs are logged
- Cross-session continuity: Agent B can pick up where Agent A left off, even days later
- Semantic retrieval: Agents query relevant context by meaning, not just by key
- Team conventions: Shared knowledge (coding standards, architecture decisions) is always available
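To make "query by meaning" concrete: semantic retrieval usually means ranking stored items by how similar their embedding vectors are to a query embedding. The sketch below shows that ranking in plain Python with tiny hand-written 3-dimensional vectors. It is not Hivemind's implementation; the records, the vectors, and the `top_k` helper are made up for illustration, and real embeddings would come from an embedding model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(records, query_embedding, k):
    """Return the k records most similar to the query, best match first."""
    ranked = sorted(
        records,
        key=lambda r: cosine_similarity(r["embedding"], query_embedding),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical stored context with toy embeddings.
records = [
    {"context_type": "plan", "embedding": [1.0, 0.0, 0.0]},
    {"context_type": "code", "embedding": [0.0, 1.0, 0.0]},
    {"context_type": "review", "embedding": [0.0, 0.9, 0.1]},
]

# A query whose embedding points in the same direction as the "code" record.
query = [0.0, 1.0, 0.0]
best = top_k(records, query, k=2)
print([r["context_type"] for r in best])
```

The "plan" record is never returned here because its vector is orthogonal to the query, which is the sense in which an agent retrieves context by meaning instead of by an exact key.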
Hivemind vs. Ad-Hoc Solutions
| Feature | Redis Queue | Shared File | Prompt Passing | Hivemind |
|---|---|---|---|---|
| Durable across crashes | No | Partial | No | Yes |
| Queryable by meaning | No | No | No | Yes |
| Scales to many agents | Limited | No | No | Yes |
| Preserves structure | No | No | No | Yes |
| Token-efficient | Yes | Yes | No | Yes |
| Cross-session | No | Partial | No | Yes |
Branch-Per-Agent Isolation
Deeplake's branching lets each agent work in isolation without polluting shared state until ready:
# Each agent gets its own branch
db.branch("run-42/planner")
db.branch("run-42/coder")
# Agents write to their branch freely
# Merge results to main when handoff is complete
db.merge("run-42/planner", into="main")
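The isolation semantics described here can be illustrated with a toy in-memory model. This is not Deeplake's branching API or its merge algorithm: the `BranchStore` class and its methods are hypothetical, and the merge shown is a simple overwrite in which keys from the merged branch win.

```python
import copy

class BranchStore:
    """Toy key-value store with named branches (illustration only)."""

    def __init__(self):
        self.branches = {"main": {}}

    def branch(self, name, source="main"):
        # A new branch starts as a private copy of the source branch.
        self.branches[name] = copy.deepcopy(self.branches[source])

    def write(self, branch, key, value):
        self.branches[branch][key] = value

    def read(self, branch, key):
        return self.branches[branch].get(key)

    def merge(self, name, into="main"):
        # Keys from the merged branch overwrite the same keys in the target.
        self.branches[into].update(copy.deepcopy(self.branches[name]))

store = BranchStore()
store.branch("run-42/planner")
store.write("run-42/planner", "plan", {"steps": ["add endpoint"]})

# Before the merge, main does not see the planner's work in progress.
print(store.read("main", "plan"))

# After the merge, the plan is visible to every agent reading main.
store.merge("run-42/planner", into="main")
print(store.read("main", "plan"))
```

The point of the model is the ordering: downstream agents reading the shared branch only ever see completed output, never a half-written plan.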