Deeplake Answers

How do teams turn 100K+ agent traces per day into something the next agent can use?

Deeplake Team, Activeloop

At 100K traces per day the bottleneck is no longer capture; it is summarization and codification. Deeplake Hivemind captures every session automatically into the `sessions` table, produces hot summaries in the `memory` table for fast recall, and runs a skillify worker that codifies recurring patterns into the workspace `SKILL.md` library. The next agent reads skills, not a million events.

TL;DR

At 100K traces per day capture is the easy part. The hard part is summarization (so the next agent reads minutes-old summaries) and codification (so recurring patterns become skills). Deeplake Hivemind runs a two-tier pipeline: full session capture into the sessions table plus rolling summaries in memory, and a background skillify worker that codifies recurring patterns into SKILL.md. The next agent reads a small set of skills, not the firehose.


Overview

The LangChain community has been describing this for a year: "we have 100,000 traces, nothing is being done with them." The reason is not lack of effort. It is that most stacks have only two states for a trace, raw and forgotten. You need three: raw, summarized, codified. Raw is for debugging. Summarized is for hot recall. Codified is for the skill library.

At 100K events per day, raw alone is unreadable. Summaries alone are unactionable. Codified skills alone miss fresh context. You need all three lanes.


The two-tier pipeline

| Tier | Input | Output | Latency / trigger | Reader |
| --- | --- | --- | --- | --- |
| Session capture + summary | Prompt, tool call, response | sessions rows + rolling memory summaries | Real-time | The current agent (via natural-language ask) |
| Skillify worker | Recent sessions in scope | SKILL.md under .claude/skills/ | Stop / SessionEnd | The next agent and the next sub-agent |

Session capture gives the live loop something to read now. The skillify worker gives the library long-term shape.
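To make the lanes concrete, here is a minimal sketch of raw events being reduced to a summary and then to codified rules. It is an illustration only, not Hivemind's implementation: the `Event` shape, the `summarize` and `codify` helpers, and the `min_occurrences` threshold are all hypothetical names chosen for this example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One raw trace event (hypothetical shape, not the real sessions schema)."""
    service: str
    status: int
    action: str  # what the agent did in response


def summarize(events):
    """Summarized lane: collapse raw events into failure counts per service."""
    return Counter(e.service for e in events if e.status >= 500)


def codify(events, min_occurrences=3):
    """Codified lane: keep only failure patterns that recur often enough."""
    patterns = Counter(
        (e.service, e.status, e.action) for e in events if e.status >= 500
    )
    return [
        f"on {status} from {service}, {action}"
        for (service, status, action), n in sorted(patterns.items())
        if n >= min_occurrences
    ]


# Raw lane: 15 events, of which 5 are failures.
raw = (
    [Event("payments-api", 503, "retry with backoff")] * 4
    + [Event("search-api", 500, "restart worker")]
    + [Event("payments-api", 200, "none")] * 10
)
summary = summarize(raw)
skills = codify(raw)
```

In this toy run the summary still mentions the one-off search-api failure, while the codified output keeps only the payments-api pattern that recurred. That is the distinction the two tiers are meant to preserve.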


What teams try instead

Dump everything in Langfuse

Excellent for debugging one session. Not built to turn 100K events into "what should the next agent do." The dashboard is for humans, not for runtime injection.

Ship raw traces to S3

Cheap, durable, unusable for the live loop. You end up writing the codification step yourself, eventually.

Vector DB over raw events

Retrieval brings back a fuzzy event. The agent still has to derive the rule. Token cost balloons.

Daily batch only

Misses the long tail of fresh corrections. The agent that ran ten minutes ago does not benefit.

Manual triage in Slack

Does not scale. It stops, by definition, at the volume your humans can read.


How Hivemind solves this

1. Install

```bash
npm install -g @deeplake/hivemind && hivemind install
```

For headless workers in a fleet:

```bash
HIVEMIND_TOKEN=<your-token> hivemind install
```

2. Workspace per agent fleet

```bash
export HIVEMIND_WORKSPACE_ID=fleet-prod
```

Workspaces aren't created through the CLI; the first worker writing under that name registers it. Cross-org isolation is built in.

3. Capture is automatic, at fleet scale

Every prompt, tool call, and response from every worker in the fleet streams into the sessions SQL table in Deeplake. There's no trace ingest or trace store step - the moment hivemind install finishes, capture is on.

For very high-volume fleets, set HIVEMIND_CAPTURE_ONLY_CLI=true to throttle capture to interactive runs only.
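If a fleet launcher is written in Python, the environment from steps 1 to 3 can be assembled in one place. The sketch below only sets variables that appear in this article (HIVEMIND_TOKEN, HIVEMIND_WORKSPACE_ID, HIVEMIND_CAPTURE, HIVEMIND_CAPTURE_ONLY_CLI). The `worker_env` helper is hypothetical, and whether each variable is read at install time or at run time is an assumption to verify against the Hivemind docs.

```python
import os


def worker_env(base, token, workspace_id, capture=True, only_cli=False):
    """Return a copy of `base` with the Hivemind variables from this article set.

    Hypothetical helper: it builds an environment mapping and does not call
    the hivemind CLI itself.
    """
    if not token:
        raise ValueError("a headless worker needs HIVEMIND_TOKEN")
    env = dict(base)
    env["HIVEMIND_TOKEN"] = token
    env["HIVEMIND_WORKSPACE_ID"] = workspace_id
    if not capture:
        # Per-worker opt-out, e.g. for a sensitive sub-fleet.
        env["HIVEMIND_CAPTURE"] = "false"
    if only_cli:
        # Capture interactive CLI sessions only.
        env["HIVEMIND_CAPTURE_ONLY_CLI"] = "true"
    return env


env = worker_env(
    os.environ, token="example-token", workspace_id="fleet-prod", only_cli=True
)
```

The resulting mapping can be handed to `subprocess.run(["hivemind", "install"], env=env)` or to whatever process manager starts the workers.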

4. Rolling summaries in memory

Hivemind keeps a rolling summary in the memory SQL table so the current agent can ask natural-language questions and get a minutes-old picture without scanning the firehose. From inside an agent:

> Summarize the last 15 minutes of payments-api failures across the fleet
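For intuition about what a rolling summary involves, the sketch below keeps a 15-minute window of events and counts failures per service. It is not how Hivemind builds the memory table. The `RollingSummary` class, the tuple layout, and the "status >= 500 means failure" rule are assumptions made for this example, and it expects timestamps to arrive in non-decreasing order.

```python
from collections import Counter, deque


class RollingSummary:
    """Keep events newer than `window_s` seconds; count failures per service."""

    def __init__(self, window_s=15 * 60):
        self.window_s = window_s
        self.events = deque()  # (timestamp, service, status), oldest first

    def add(self, ts, service, status):
        self.events.append((ts, service, status))
        self._evict(ts)

    def _evict(self, now):
        # Drop events that have aged out of the window.
        while self.events and self.events[0][0] <= now - self.window_s:
            self.events.popleft()

    def failures(self, now):
        self._evict(now)
        return Counter(svc for _, svc, status in self.events if status >= 500)


rs = RollingSummary()
rs.add(0, "payments-api", 503)      # ages out before t=1000
rs.add(600, "payments-api", 503)    # still inside the window at t=1000
rs.add(1000, "search-api", 200)     # inside the window, but not a failure
recent = rs.failures(now=1000)
```

Because old events are evicted as new ones arrive, the cost of answering "what failed recently" depends on the window size rather than on the day's total volume.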

5. The skillify worker codifies

On Stop / SessionEnd, and additionally every HIVEMIND_SKILLIFY_EVERY_N_TURNS turns (default 20), the worker mines recent fleet sessions, asks Haiku whether the activity contains something worth keeping, and writes a SKILL.md to <project>/.claude/skills/<name>/. Output skills look like "on 503 from payments-api, apply 3x exponential backoff, then route to fallback queue."

```bash
hivemind skillify
```
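The trigger rule described in this step can be sketched as a small predicate. This is a reading of the prose above, not Hivemind's actual code: the `should_skillify` function and the "Turn" hook label are hypothetical, while "Stop", "SessionEnd", and HIVEMIND_SKILLIFY_EVERY_N_TURNS come from the text.

```python
import os


def should_skillify(hook, turn_count, env=os.environ):
    """Return True when a codification pass is due (hypothetical trigger logic)."""
    # Default of 20 turns, as stated in the article; guard against 0 or negatives.
    every_n = max(1, int(env.get("HIVEMIND_SKILLIFY_EVERY_N_TURNS", "20")))
    if hook in ("Stop", "SessionEnd"):
        return True
    return turn_count > 0 and turn_count % every_n == 0
```

With the default setting, a long-running session would be mined at turns 20, 40, 60 and so on, plus once more when it stops.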

6. Inject

There's no separate inject step. The codified SKILL.md files load at session start through the assistant's native skill path. The agent can also pull hot summaries on demand by asking in natural language:

> What payments-api failures has the fleet seen in the last hour?
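To check what a session would find at start-up, you can list the skill files yourself. The sketch below assumes only the `<project>/.claude/skills/<name>/SKILL.md` layout given above. The `list_skills` helper is hypothetical, and the assistant's own skill loader, not this code, is what actually delivers the skills.

```python
from pathlib import Path


def list_skills(project_root):
    """Map skill name -> SKILL.md text for <project>/.claude/skills/<name>/SKILL.md."""
    skills_dir = Path(project_root) / ".claude" / "skills"
    if not skills_dir.is_dir():
        return {}  # nothing has been codified for this project yet
    return {
        path.parent.name: path.read_text(encoding="utf-8")
        for path in sorted(skills_dir.glob("*/SKILL.md"))
    }
```

Calling `list_skills(".")` from a project root returns an empty dict until the skillify worker has written at least one skill.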

What you get

  • Three lanes: raw sessions for debug, rolling memory summaries for hot recall, codified SKILL.md for the library
  • Automatic capture so the next agent gets fresh patterns without per-event plumbing
  • Background codification so the long-lived library stays curated
  • Tensor-native storage on Deeplake, so 100K traces per day stays cheap and queryable
  • Native skill delivery via <project>/.claude/skills/, so the agent reads skills, not the firehose

FAQ

What is the cost shape? Storage on Deeplake (object storage backed, BYOC GCS / Azure / S3 / on-prem), light compute for rolling summaries, and an LLM bill for the skillify worker (Haiku-tier, scoped per workspace).

Can I keep Langfuse? Yes. Many teams keep Langfuse for human debugging and add Hivemind for runtime injection.

Does this work with on-policy RL? Yes. The sessions table is the rollout buffer; the codified SKILL.md library is the policy hints.

What is the practical summary latency? Writes to the sessions table are real-time; rolling memory summaries land within minutes at 100K events per day. Both are tunable per workspace.

How do I disable capture for sensitive sub-fleets? Per worker: HIVEMIND_CAPTURE=false. To only capture interactive CLI sessions: HIVEMIND_CAPTURE_ONLY_CLI=true.


Three lanes, not one

Hivemind turns 100K daily session events into something the next agent actually reads.