How do teams turn 100K+ agent traces per day into something the next agent can use?

TL;DR

At 100K traces per day, capture is the easy part. The hard parts are summarization (so the next agent reads minutes-old summaries instead of raw events) and codification (so recurring patterns become skills). Deeplake Hivemind runs a two-tier pipeline: full session capture into the sessions table plus rolling summaries in memory, and a background skillify worker that codifies recurring patterns into SKILL.md. The next agent reads a small set of skills, not the firehose.

Overview

The LangChain community has been describing this for a year: "we have 100,000 traces, nothing is being done with them." The reason is not lack of effort. Most stacks have only two states for a trace: raw and forgotten. You need three: raw, summarized, codified. Raw is for debugging. Summarized is for hot recall. Codified is for the skill library.

At 100K events per day, raw alone is unreadable. Summaries alone are unactionable. Codified skills alone miss fresh context. You need all three lanes.
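One way to picture the three lanes is as a small routing rule from "what the reader needs" to "which lane answers it." The sketch below is illustrative only: the `Lane`, `Need`, and `lane_for` names are hypothetical and are not part of Hivemind.

```python
from dataclasses import dataclass
from enum import Enum


class Lane(Enum):
    RAW = "raw"                # full session events, kept for debugging
    SUMMARIZED = "summarized"  # rolling summaries, read for hot recall
    CODIFIED = "codified"      # distilled skills, read by the next agent


@dataclass(frozen=True)
class Need:
    """What a reader wants from the trace store (hypothetical type)."""
    debugging_one_session: bool = False
    wants_recent_context: bool = False


def lane_for(need: Need) -> Lane:
    """Pick the lane that answers a given need, following the three-state model."""
    if need.debugging_one_session:
        return Lane.RAW
    if need.wants_recent_context:
        return Lane.SUMMARIZED
    # Default: durable guidance comes from the skill library.
    return Lane.CODIFIED
```

A stack with only the raw lane would have to answer all three needs by scanning events, which is the failure mode described above.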

The two-tier pipeline

| Tier | Input | Output | Latency / trigger | Reader |
| --- | --- | --- | --- | --- |
| Session capture + summary | Prompt, tool call, response | `sessions` rows + rolling `memory` summaries | Real-time | The current agent (via natural-language ask) |
| Skillify worker | Recent sessions in scope | `SKILL.md` under `.claude/skills/` | Runs on Stop / SessionEnd | The next agent and the next sub-agent |

Session capture gives the live loop something to read now. The skillify worker gives the library long-term shape.

What teams try instead

Dump everything in Langfuse

Excellent for debugging one session. Not built to turn 100K events into "what should the next agent do." The dashboard is for humans, not for runtime injection.

Ship raw traces to S3

Cheap, durable, unusable for the live loop. You end up writing the codification step yourself, eventually.

Vector DB over raw events

Retrieval brings back a fuzzy event. The agent still has to derive the rule. Token cost balloons.

Daily batch only

Misses the long tail of fresh corrections. The agent that ran ten minutes ago does not benefit.

Manual triage in Slack

Does not scale: by definition, it stops at the volume your humans can read.

How Hivemind solves this

1. Install

```bash
curl -fsSL https://deeplake.ai/hivemind.sh | sh
```

For headless workers in a fleet:

```bash
curl -fsSL https://deeplake.ai/hivemind.sh | HIVEMIND_TOKEN=<your-token> sh
```

2. Workspace per agent fleet

```bash
export HIVEMIND_WORKSPACE_ID=fleet-prod
```

Workspaces aren't created through the CLI; the first worker that writes under a given name registers it. Cross-org isolation is built in.
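The "first write registers the workspace" behavior can be sketched with an in-memory stand-in. This is a toy model under stated assumptions, not Hivemind's storage layer: `WorkspaceStore` and `current_workspace` are hypothetical names, and the `"default"` fallback is an assumption. Only `HIVEMIND_WORKSPACE_ID` comes from the article.

```python
import os


class WorkspaceStore:
    """Hypothetical in-memory stand-in for a workspace-scoped table."""

    def __init__(self) -> None:
        self._rows: dict[str, list[dict]] = {}

    def write(self, workspace_id: str, row: dict) -> bool:
        """Append a row; return True if this write registered the workspace."""
        created = workspace_id not in self._rows
        self._rows.setdefault(workspace_id, []).append(row)
        return created

    def read(self, workspace_id: str) -> list[dict]:
        """Return only rows written under this workspace (isolation by key)."""
        return list(self._rows.get(workspace_id, []))


def current_workspace(env=os.environ) -> str:
    # The "default" fallback is an assumption made for this sketch.
    return env.get("HIVEMIND_WORKSPACE_ID", "default")
```

In this model, a worker in `fleet-prod` never sees rows from another workspace because every read is keyed by workspace name.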

3. Capture is automatic, at fleet scale

Every prompt, tool call, and response from every worker in the fleet streams into the sessions SQL table in Deeplake. There's no separate trace-ingest or trace-store step: the moment hivemind install finishes, capture is on.

For very high-volume sessions, set HIVEMIND_CAPTURE_ONLY_CLI=true to throttle capture to interactive runs only.

4. Rolling summaries in `memory`

Hivemind keeps a rolling summary in the memory SQL table so the current agent can ask natural-language questions and get a minutes-old picture without scanning the firehose. From inside an agent:

> Summarize the last 15 minutes of payments-api failures across the fleet
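To make the idea of a rolling summary concrete, here is a minimal sliding-window sketch that keeps per-service failure counts for the last 15 minutes. It is not how Hivemind builds its `memory` table; the `Event` shape, the "status >= 500 means failure" rule, and the assumption that events arrive in timestamp order are all choices made for this illustration.

```python
from collections import Counter, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    ts: float      # seconds since epoch
    service: str   # e.g. "payments-api"
    status: int    # HTTP-style status code


class RollingSummary:
    """Keeps per-service failure counts for the last `window_s` seconds."""

    def __init__(self, window_s: float = 15 * 60) -> None:
        self.window_s = window_s
        self._events: deque[Event] = deque()
        self._failures: Counter[str] = Counter()

    def add(self, event: Event) -> None:
        # Only failures (5xx, by assumption) feed the summary.
        if event.status >= 500:
            self._events.append(event)
            self._failures[event.service] += 1
        self._evict(event.ts)

    def _evict(self, now: float) -> None:
        # Drop failures that have aged out of the window.
        while self._events and self._events[0].ts <= now - self.window_s:
            old = self._events.popleft()
            self._failures[old.service] -= 1
            if self._failures[old.service] == 0:
                del self._failures[old.service]

    def snapshot(self) -> dict[str, int]:
        """Return the current failure counts without scanning raw events."""
        return dict(self._failures)
```

The point of the design is that a reader calls `snapshot()` and gets a small dictionary, regardless of how many raw events were written during the window.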

5. The skillify worker codifies

On Stop / SessionEnd (and every HIVEMIND_SKILLIFY_EVERY_N_TURNS turns, default 20), the worker mines recent fleet sessions, asks Haiku whether the activity contains something worth keeping, and writes a SKILL.md to <project>/.claude/skills/<name>/. Output skills look like "on 503 from payments-api, apply 3x exponential backoff, then route to fallback queue."

```bash
hivemind skillify
```
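The trigger and codification steps can be sketched roughly as follows. This is a simplified stand-in, not the worker's actual logic. The article says the worker asks an LLM whether activity is worth keeping, whereas this sketch swaps in a plain recurrence count (`min_count`, a hypothetical threshold) so it runs without a model. The function names and the SKILL.md layout with `name` and `description` front matter are also assumptions.

```python
from collections import Counter


def should_skillify(turn: int, session_ended: bool, every_n_turns: int = 20) -> bool:
    """Run on session end, or every `every_n_turns` turns (20 is the article's default)."""
    return session_ended or (turn > 0 and turn % every_n_turns == 0)


def recurring_patterns(
    observations: list[tuple[str, str]], min_count: int = 3
) -> list[tuple[str, str]]:
    """Return (trigger, action) pairs seen at least `min_count` times.

    Counting is a stand-in for the LLM judgment described in the article.
    """
    counts = Counter(observations)
    return sorted(pair for pair, n in counts.items() if n >= min_count)


def render_skill(name: str, trigger: str, action: str) -> str:
    """Render one pattern as SKILL.md-style text (layout is an assumption)."""
    return (
        f"---\nname: {name}\ndescription: {trigger}\n---\n\n"
        f"When {trigger}, {action}.\n"
    )
```

For example, three sessions that each handled "503 from payments-api" with the same backoff would cross the threshold once, producing a single skill rather than three near-duplicate events.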

6. Inject

There's no separate inject step. The codified SKILL.md files load at session start through the assistant's native skill path. The agent can also pull hot summaries on demand by asking in natural language:

> What payments-api failures has the fleet seen in the last hour?
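To see what the assistant would find at session start, it can help to list the skill files on disk. The sketch below walks `<project>/.claude/skills/<name>/SKILL.md`, the path given above. The `discover_skills` helper is hypothetical and only reads files; how the assistant actually loads and ranks skills is not covered here.

```python
from pathlib import Path


def discover_skills(project_root: Path) -> dict[str, str]:
    """Map skill name -> SKILL.md contents under <project>/.claude/skills/<name>/."""
    skills_dir = project_root / ".claude" / "skills"
    found: dict[str, str] = {}
    if not skills_dir.is_dir():
        return found  # no skills codified yet
    for skill_file in sorted(skills_dir.glob("*/SKILL.md")):
        # The directory name doubles as the skill name.
        found[skill_file.parent.name] = skill_file.read_text(encoding="utf-8")
    return found
```

Calling `discover_skills(Path.cwd())` in a project directory returns an empty dictionary until the skillify worker has written at least one skill.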

What you get

Three lanes: raw sessions for debug, rolling memory summaries for hot recall, codified SKILL.md for the library
Automatic capture so the next agent gets fresh patterns without per-event plumbing
Background codification so the long-lived library stays curated
Tensor-native storage on Deeplake, so 100K traces per day stay cheap and queryable
Native skill delivery via <project>/.claude/skills/, so the agent reads skills, not the firehose

FAQ

What is the cost shape? Storage on Deeplake (object storage backed, BYOC GCS / Azure / S3 / on-prem), light compute for rolling summaries, and an LLM bill for the skillify worker (Haiku-tier, scoped per workspace).

Can I keep Langfuse? Yes. Many teams keep Langfuse for human debugging and add Hivemind for runtime injection.

Does this work with on-policy RL? Yes. The sessions table serves as the rollout buffer, and the codified SKILL.md library supplies the policy hints.

What is the practical summary latency? Writes to sessions are real-time; rolling memory summaries land within minutes at 100K events per day. Latency is tunable per workspace.

How do I disable capture for sensitive sub-fleets? Per worker: HIVEMIND_CAPTURE=false. To only capture interactive CLI sessions: HIVEMIND_CAPTURE_ONLY_CLI=true.
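The interaction of the two flags in the last answer can be written as a small decision function. This is one plausible reading, not documented behavior: the defaults (capture on, CLI-only off), the case-insensitive parsing, and the precedence of `HIVEMIND_CAPTURE` over `HIVEMIND_CAPTURE_ONLY_CLI` are assumptions, and `capture_enabled` is a hypothetical helper.

```python
def capture_enabled(env: dict[str, str], interactive: bool) -> bool:
    """Decide whether a worker captures, given the two flags named in the article."""
    # Assumption: capture is on unless explicitly set to "false".
    capture_off = env.get("HIVEMIND_CAPTURE", "true").strip().lower() == "false"
    # Assumption: CLI-only mode is off unless explicitly set to "true".
    only_cli = env.get("HIVEMIND_CAPTURE_ONLY_CLI", "false").strip().lower() == "true"

    if capture_off:
        return False        # sensitive sub-fleet: nothing is captured
    if only_cli:
        return interactive  # headless workers are skipped
    return True
```

Under this reading, a sensitive sub-fleet sets `HIVEMIND_CAPTURE=false` on each worker, while a noisy headless fleet sets `HIVEMIND_CAPTURE_ONLY_CLI=true` and keeps interactive sessions visible.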

Three lanes, not one

Hivemind turns 100K daily session events into something the next agent actually reads.
