Brain (entities & episodes)

Memory and the knowledge base are the two surfaces you see in the chat. Underneath sits the brain: a structured graph of entities, edges, and immutable episodes that grows every time you (or a connector) feed it a new signal. This page explains the layer you don't see directly but that powers every assistant turn.

The primitives

The brain has five operational primitives and three cognitive ones. The operational primitives are what you query directly in chat: CRM (people + companies + deals), Tasks, KB, Memory, and Files. The cognitive substrate (entities, edges, episodes) threads them together.

Entities

Canonical nouns: people, companies, projects, deals, products. Every brain primitive resolves to or links from an entity.

Edges

Typed graph relationships between entities (works_at, engagement_of, mentioned_in). Bi-temporal: a closed relationship keeps its history.

Episodes

The append-only observation log. Every memory, entity, and edge points back to the episode that observed it. Episodes are immutable: consolidation reads them but never writes to them.
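As a rough mental model, the three cognitive primitives can be pictured as plain records. The sketch below is illustrative only: the field names (episode_id, valid_from, valid_to) and the close_edge helper are assumptions, not the actual schema. It models only the validity interval of an edge; a full bi-temporal design would also track when each row was recorded.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)  # frozen: an episode is never modified after capture
class Episode:
    id: str
    source: str   # e.g. "web_chat" (other values are illustrative)
    content: str


@dataclass(frozen=True)
class Entity:
    id: str
    kind: str        # person, company, project, deal, product
    name: str
    episode_id: str  # the episode that observed this entity


@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    type: str        # e.g. "works_at"
    episode_id: str
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None means the relationship is open


def close_edge(edge: Edge, when: datetime) -> Edge:
    """Return a closed copy of the edge; the row is kept, not deleted."""
    return replace(edge, valid_to=when)


ep = Episode(id="ep_1", source="web_chat", content="Ada joined Acme.")
ada = Entity(id="ent_ada", kind="person", name="Ada", episode_id=ep.id)
acme = Entity(id="ent_acme", kind="company", name="Acme", episode_id=ep.id)
edge = Edge(ada.id, acme.id, "works_at", ep.id,
            valid_from=datetime(2024, 1, 1, tzinfo=timezone.utc))
closed = close_edge(edge, datetime(2025, 1, 1, tzinfo=timezone.utc))
```

Closing the edge yields a new record with valid_to set, so the history of the relationship stays queryable while the original observation is left untouched.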

How signals become structure

Pipeline B is the unified extraction surface. Given any episode (a chat checkpoint, a Gmail message that matched a rule, a Fathom transcript, a Slack mention), it runs a single LLM call that emits entities, edges, tasks, memories, and ephemeral items, with every observation evaluated against a precedence ladder (Task → Entity/Edge → CRM → Memory → Ephemeral). Memory is the last resort; emissions carry why_not_entity / why_not_task justifications so the model confronts the alternative.
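The precedence ladder can be sketched as an ordered list in which the first tier that fits wins. This is a minimal illustration, not the extractor itself: the tier identifiers, the Emission shape, and the rule that only memory emissions must carry both justifications are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Ladder order as described above; earlier tiers take precedence.
LADDER = ["task", "entity_edge", "crm", "memory", "ephemeral"]


@dataclass
class Emission:
    tier: str
    text: str
    why_not_entity: Optional[str] = None
    why_not_task: Optional[str] = None


def pick_tier(candidate_tiers: set) -> str:
    """Return the highest-precedence tier among those that fit an observation."""
    for tier in LADDER:
        if tier in candidate_tiers:
            return tier
    return "ephemeral"  # nothing fits: treat the observation as ephemeral


def is_valid(emission: Emission) -> bool:
    """Assumption: a memory emission must justify not being an entity or a task."""
    if emission.tier == "memory":
        return bool(emission.why_not_entity) and bool(emission.why_not_task)
    return True
```

For instance, an observation that could be stored either as a task or as a memory resolves to a task, and a memory emission without both justifications is rejected.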

Three pipelines, one extractor

  1. Pipeline A: chat

    Live chat turns hit a compaction checkpoint; the compacted window becomes a web_chat episode and runs Pipeline B fire-and-forget.

  2. Pipeline B: episode to derived rows

    The shared extractor. Called by A, C, and active capture (file upload, voice memo, manual save).

  3. Pipeline C: external event ingest

    Connector events (Gmail, GitHub, Calendar, Fathom, Slack) flow through the rules engine; matched events become episodes and run Pipeline B.

Rules engine

Per-connector-instance rules decide how each event is routed: realtime (straight to Pipeline B), scheduled (queued in pending_ingest_batches, drained on a cron), or drop (truly discarded). Filters are first-match-wins, with default templates per source: ":crm_contacts → realtime" for Gmail, "is_dm → realtime" for Slack, "pull_request.merged → realtime+alert" for GitHub. Edit them under Studio → Ingestion.
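First-match-wins routing can be sketched in a few lines. The Rule shape, the event dictionary, and the fallback action used when no rule matches are assumptions for illustration; only the three action names and the is_dm filter come from the description above.

```python
from dataclasses import dataclass
from typing import Callable

Event = dict  # illustrative: a connector event as a plain dict


@dataclass
class Rule:
    name: str
    matches: Callable[[Event], bool]
    action: str  # "realtime", "scheduled", or "drop"


def route(event: Event, rules: list, default: str = "drop") -> str:
    """First-match-wins: return the action of the first rule that matches."""
    for rule in rules:
        if rule.matches(event):
            return rule.action
    return default  # assumption: unmatched events fall back to a default action


# A hypothetical rule list for one Slack connector instance.
slack_rules = [
    Rule("is_dm", lambda e: e.get("is_dm", False), "realtime"),
    Rule("everything_else", lambda e: True, "scheduled"),
]
```

Because evaluation stops at the first match, rule order matters: a catch-all placed first would shadow every rule after it.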

Self-healing classifier

The v2 classifier doesn't just decide once. Pipeline B's emissions are scored against the original episode; low-confidence rows go back through a reclassifier pass that can re-tier, merge near-duplicates, or downgrade a Memory to Ephemeral. Existing memories that drift from their evidence are surfaced for retraction in the corrections queue.

How chat reads the brain

Chat sees the brain through a 7-tool surface: getEntity (entity rollup with summary + edges + recent episodes), search (hybrid FTS + vector + graph + recency, RRF-fused), recentEpisodes, provenance (one-level walk for any row), markUseful (the signal that boosts a row's retrieval rank), aggregate (typed measures over typed columns or JSONB attributes), and getRowHistory (the supersession-audit chain). The store is the same for every client: chat sees the rich tools, while programmatic clients will see atomic per-primitive operations once external-agent OAuth ships.
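"RRF-fused" refers to reciprocal rank fusion, a standard way to merge several ranked lists: each row scores the sum of 1 / (k + rank) over the lists it appears in. The sketch below shows the general technique, not the search tool's actual implementation; the constant k = 60 is the conventional default, and the row ids and the four input lists are made up.

```python
from collections import defaultdict


def rrf_fuse(rankings: list, k: int = 60) -> list:
    """Merge ranked lists of row ids with reciprocal rank fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, row_id in enumerate(ranking, start=1):
            scores[row_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by row id for determinism.
    return sorted(scores, key=lambda row_id: (-scores[row_id], row_id))


# Hypothetical per-retriever results, best match first.
fts = ["row_a", "row_b", "row_c"]
vector = ["row_b", "row_a", "row_d"]
graph = ["row_b", "row_c"]
recency = ["row_d", "row_b"]

fused = rrf_fuse([fts, vector, graph, recency])
```

Here row_b comes out on top because it ranks well in all four lists, even though it is first in only two of them. RRF uses ranks rather than raw scores, so retrievers with incomparable score scales can be combined without normalization.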

Brain vs memory

Memory is one primitive in the brain: the layer that holds inferred behavioral facts about people. Entities, edges, tasks, deals, and files are the rest. The acid test for the boundary: "if the source disappeared, does this row stay?" Yes for memory, typically no for KB chunks (which are re-importable from the source repo or docs). See Memory & knowledge for the user-facing controls.

Other surfaces worth knowing

Explicit links

Every brain-write tool (saveMemory, saveContact, saveDeal, createTask, fileWrite, …) accepts a universal links parameter so the model can connect new rows back to existing entities in the same call. Bi-temporal: closed relationships keep their history. Capped at 20 links per call.
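A write call with links might look roughly like the following. The payload shape and the link fields (entity_id, type) are hypothetical; only the tool name, the links parameter, and the cap of 20 come from the description above.

```python
MAX_LINKS_PER_CALL = 20  # the cap stated above


def check_links(links: list) -> list:
    """Reject a call that carries more links than the cap allows."""
    if len(links) > MAX_LINKS_PER_CALL:
        raise ValueError(
            f"{len(links)} links supplied; at most {MAX_LINKS_PER_CALL} per call"
        )
    return links


# Hypothetical payload for a saveMemory call; the field names are assumptions.
call = {
    "tool": "saveMemory",
    "content": "Prefers async updates over meetings.",
    "links": check_links([
        {"entity_id": "ent_ada", "type": "mentioned_in"},
    ]),
}
```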

Graph view

An Obsidian-style force-directed visualization, available via the Graph toggle on the Brain page. It shows the workspace-wide entity↔entity graph with node degree computed server-side, kind-coloured nodes, and click-through to the brain detail drawer. Capped at 500 nodes.

Trust signals

Every row carries a source (user, model, connector, episode-graduation) and an optional verified_by_user_id. Retraction marks a belief as wrong without deleting the audit trail. Retrieval ranks rows by SOURCE_WEIGHTS × VERIFIED_BOOST, so ranking reflects provenance rather than a numeric confidence score.
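The provenance-based ranking can be sketched as a lookup and a multiplication. The numeric weights and the boost value below are invented for illustration; only the names SOURCE_WEIGHTS and VERIFIED_BOOST, the four source values, and the verified_by_user_id field come from the text above.

```python
# Hypothetical values: the real weights are not documented here.
SOURCE_WEIGHTS = {
    "user": 1.0,
    "connector": 0.8,
    "episode-graduation": 0.7,
    "model": 0.6,
}
VERIFIED_BOOST = 1.5  # hypothetical multiplier for user-verified rows


def trust_weight(row: dict) -> float:
    """Weight a row by where it came from and whether a user verified it."""
    boost = VERIFIED_BOOST if row.get("verified_by_user_id") else 1.0
    return SOURCE_WEIGHTS[row["source"]] * boost


def rank_by_trust(rows: list) -> list:
    """Order rows from most to least trusted."""
    return sorted(rows, key=trust_weight, reverse=True)


rows = [
    {"id": "r1", "source": "connector"},
    {"id": "r2", "source": "model", "verified_by_user_id": "u_1"},
    {"id": "r3", "source": "user"},
]
ranked_ids = [row["id"] for row in rank_by_trust(rows)]
```

With these made-up numbers, a model-written row that a user has verified outranks an unverified connector row, which is the point of the scheme: ranking follows where a row came from and who vouched for it, not a model-reported confidence.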