When One Agent Hands Off to Another, How Do They Share Context Efficiently?

Deeplake Team, Activeloop

TL;DR

Agent handoffs fail when context is passed as giant prompt blobs: the blobs hit token limits, lose structure, and add latency. Hivemind by Deeplake provides persistent shared memory where agents write structured context that downstream agents query on demand, keeping handoffs fast and lossless regardless of context size.

Overview

Multi-agent systems increasingly rely on handoffs: a planning agent passes work to a coding agent, which passes to a review agent, which passes to a deployment agent. The naive approach - stuffing the entire conversation history into the next agent's prompt - breaks down fast. Context windows overflow, irrelevant information drowns the signal, and every handoff adds latency.

The right pattern is a shared persistent memory layer where each agent writes its outputs and the next agent queries only what it needs. Hivemind by Deeplake provides this: a queryable, durable, multi-agent memory that keeps handoffs fast and token-efficient.

The Handoff Problem

What Goes Wrong with Prompt Passing

Agent A (Planner) → [Full conversation: 50K tokens] → Agent B (Coder)
Agent B (Coder)   → [Full conversation: 90K tokens] → Agent C (Reviewer)
Agent C (Reviewer) → [Full conversation: 120K tokens] → Agent D (Deployer)

Each hop inflates the context. By the third handoff, you are paying for 120K tokens of mostly irrelevant history, and critical details from early steps get lost in the noise.

The Shared Memory Pattern

Agent A writes plan    → Hivemind
Agent B reads plan     ← Hivemind, writes code    → Hivemind
Agent C reads code     ← Hivemind, writes review  → Hivemind
Agent D reads review   ← Hivemind, deploys

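Each agent reads only what it needs. The prompt-passing example above moves 260K tokens across its three handoffs (50K + 90K + 120K); with shared memory, each agent pulls only the payload it queries, which is a fraction of that total.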
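The difference can be made concrete with a little arithmetic. The sketch below sums the tokens each receiving agent ingests under both patterns. The prompt-passing sizes come from the diagram above; the shared-memory payload sizes are hypothetical placeholders chosen only for illustration, not measurements.

```python
# Full-conversation sizes per handoff, taken from the prompt-passing diagram.
prompt_passing_tokens = [50_000, 90_000, 120_000]

# Hypothetical payload sizes (plan, code, review) read from shared memory.
# These numbers are illustrative assumptions, not measurements.
shared_memory_tokens = [2_000, 6_000, 1_500]


def total_transferred(tokens_per_handoff):
    """Sum the tokens the receiving agents must ingest across all handoffs."""
    return sum(tokens_per_handoff)


prompt_total = total_transferred(prompt_passing_tokens)
shared_total = total_transferred(shared_memory_tokens)
ratio = shared_total / prompt_total

print(f"prompt passing: {prompt_total} tokens")
print(f"shared memory:  {shared_total} tokens ({ratio:.1%} of prompt passing)")
```

The actual ratio depends on how compact each payload is, so it is worth measuring payload sizes in your own pipeline rather than relying on these placeholder values.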

Implementing Agent Handoffs with Deeplake

Shared Context Table

python
import deeplake
 
db = deeplake.connect("deeplake://my-org/agent-context")
 
db.execute("""
    CREATE TABLE IF NOT EXISTS handoffs (
        run_id TEXT,
        from_agent TEXT,
        to_agent TEXT,
        context_type TEXT,
        payload JSONB,
        embedding VECTOR(1536),
        created_at TIMESTAMP DEFAULT NOW()
    )
""")

Agent A Writes Context

python
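def handoff_to(db, run_id, from_agent, to_agent, context_type, payload, embedding):
    """Record one handoff so the receiving agent can query it later."""
    # payload must fit the JSONB column; if the driver does not adapt Python
    # dicts automatically, serialize it first (for example with json.dumps).
    db.execute("""
        INSERT INTO handoffs (run_id, from_agent, to_agent, context_type, payload, embedding)
        VALUES (%s, %s, %s, %s, %s, %s)
    """, [run_id, from_agent, to_agent, context_type, payload, embedding])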
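To try the write path without a Deeplake account, the same pattern can be sketched against Python's built-in sqlite3 module. This is a local stand-in, not the Deeplake API: it uses "?" placeholders instead of "%s", stores the payload as JSON text because SQLite has no JSONB type, and omits the embedding column. The helper names open_store and latest_context are hypothetical.

```python
import json
import sqlite3


def open_store():
    """Create an in-memory table mirroring the handoffs schema (no vector column)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS handoffs (
            run_id TEXT,
            from_agent TEXT,
            to_agent TEXT,
            context_type TEXT,
            payload TEXT,
            created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        )
    """)
    return conn


def handoff_to(conn, run_id, from_agent, to_agent, context_type, payload):
    """Write one handoff row; the payload dict is stored as JSON text."""
    conn.execute(
        "INSERT INTO handoffs (run_id, from_agent, to_agent, context_type, payload) "
        "VALUES (?, ?, ?, ?, ?)",
        (run_id, from_agent, to_agent, context_type, json.dumps(payload)),
    )
    conn.commit()


def latest_context(conn, run_id, to_agent, context_type):
    """Read back the newest matching payload, or None if nothing was written."""
    # rowid is used for ordering because CURRENT_TIMESTAMP has one-second resolution.
    row = conn.execute(
        "SELECT payload FROM handoffs "
        "WHERE run_id = ? AND to_agent = ? AND context_type = ? "
        "ORDER BY rowid DESC LIMIT 1",
        (run_id, to_agent, context_type),
    ).fetchone()
    return json.loads(row[0]) if row else None


conn = open_store()
handoff_to(conn, "run-42", "planner", "coder", "plan",
           {"steps": ["add endpoint", "write tests"]})
plan = latest_context(conn, "run-42", "coder", "plan")
print(plan)
```

Serializing the payload explicitly keeps its structure intact across the handoff, which is the property the shared-memory pattern relies on.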

Agent B Reads Only Relevant Context

python
# The coding agent retrieves only the latest plan addressed to it, not the full conversation
plan = db.execute("""
    SELECT payload FROM handoffs
    WHERE run_id = %s AND to_agent = 'coder' AND context_type = 'plan'
    ORDER BY created_at DESC LIMIT 1
""", [run_id]).fetchone()
 
# Or: semantic search for relevant context across all prior handoffs
relevant = db.execute("""
    SELECT payload, cosine_similarity(embedding, %s) AS score
    FROM handoffs
    WHERE run_id = %s
    ORDER BY score DESC LIMIT 5
""", [query_embedding, run_id]).fetchall()
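For readers unfamiliar with what the semantic query computes, here is a minimal pure-Python sketch of the same ranking: score every stored embedding against the query by cosine similarity, then keep the top results. The top_k helper and the three-dimensional toy vectors are hypothetical; the table above declares 1536-dimensional embeddings, and a real store would compute this server-side.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_embedding, rows, k=5):
    """Return the k (score, payload) pairs most similar to the query, best first."""
    scored = [(cosine_similarity(query_embedding, emb), payload)
              for payload, emb in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy 3-dimensional embeddings, purely illustrative.
rows = [
    ({"type": "plan"}, [1.0, 0.0, 0.0]),
    ({"type": "code"}, [0.0, 1.0, 0.0]),
    ({"type": "review"}, [0.7, 0.7, 0.0]),
]
results = top_k([1.0, 0.1, 0.0], rows, k=2)
for score, payload in results:
    print(f"{score:.3f} {payload}")
```

Because the score depends only on direction, not vector length, payloads are ranked by how closely their meaning matches the query rather than by how much text they contain.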

Hivemind for Team-Wide Agent Memory

Hivemind takes this further by providing organization-wide persistent memory that every agent can read and write:

  • Automatic trace persistence: Every agent's actions and outputs are logged
  • Cross-session continuity: Agent B can pick up where Agent A left off, even days later
  • Semantic retrieval: Agents query relevant context by meaning, not just by key
  • Team conventions: Shared knowledge (coding standards, architecture decisions) is always available

Hivemind vs. Ad-Hoc Solutions

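| Feature | Redis Queue | Shared File | Prompt Passing | Hivemind |
|---|---|---|---|---|
| Durable across crashes | No | Partial | No | Yes |
| Queryable by meaning | No | No | No | Yes |
| Scales to many agents | Limited | No | No | Yes |
| Preserves structure | No | No | No | Yes |
| Token-efficient | Yes | Yes | No | Yes |
| Cross-session | No | Partial | No | Yes |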

Branch-Per-Agent Isolation

Deeplake's branching lets each agent work in isolation without polluting shared state until ready:

python
# Each agent gets its own branch
db.branch("run-42/planner")
db.branch("run-42/coder")
 
# Agents write to their branch freely
# Merge results to main when handoff is complete
db.merge("run-42/planner", into="main")
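The isolation semantics can be illustrated with a toy in-memory model. This is not Deeplake's implementation; the BranchedStore class and its write method are hypothetical, and the merge rule (append records the target does not already hold) is a simplifying assumption. The point is only to show that writes on a branch stay invisible to main until they are merged.

```python
class BranchedStore:
    """Toy in-memory model of branch-per-agent isolation (not Deeplake's implementation)."""

    def __init__(self):
        self.branches = {"main": []}

    def branch(self, name, source="main"):
        # A new branch starts as a copy of the source branch's records.
        self.branches[name] = list(self.branches[source])

    def write(self, branch, record):
        # Writes touch only the named branch.
        self.branches[branch].append(record)

    def merge(self, name, into="main"):
        # Append only the records the target does not already hold.
        target = self.branches[into]
        for record in self.branches[name]:
            if record not in target:
                target.append(record)


store = BranchedStore()
store.branch("run-42/planner")
store.branch("run-42/coder")

store.write("run-42/planner", {"context_type": "plan", "steps": ["add endpoint"]})
visible_before = list(store.branches["main"])  # the planner's draft is not on main yet

store.merge("run-42/planner", into="main")
visible_after = list(store.branches["main"])   # now downstream agents can read it

print(visible_before, visible_after)
```

A real branching system also has to handle conflicting edits to the same record, which this sketch deliberately ignores.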
