Why every AI agent suffers from amnesia — and how to fix it
Every agent you build starts from zero. It doesn't have to.
The problem: every session starts from zero
Today's AI agents are brilliant — and completely forgetful. Every new chat, every new tab, every new ticket starts the same way: from a blank slate.
The agent doesn't remember your stack, the last bug you shipped, or the onboarding flow you've been iterating on all week. It doesn't know that you prefer bullet points over paragraphs, or that you're on a tight deadline.
Why this matters
When every session starts from zero, users end up repeating themselves. Again. And again.
- Your coding assistant asks what language you're using — for the 10th time.
- Your support agent doesn't remember the bug you reported yesterday.
- Your onboarding flow forgets that the user already told you their team size.
The result: agents feel dumb, experiences feel generic, and users lose trust. The magic of "this thing knows me" never shows up.
The naive fix: stuff everything into context
The first instinct is obvious: just stuff more into the model's context window. Dump every previous message, every summary, every user fact into the system prompt.
That works — briefly. But it's:
- Expensive: tokens add up quickly across millions of calls.
- Slow: giant prompts hurt latency.
- Brittle: you're one prompt-edit away from breaking everything.
- Limited: every model has a context cap you will eventually hit.
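To see why the naive approach degrades, here's a minimal sketch of context stuffing (names like `buildMessages` are illustrative, not from any SDK): the full history is replayed on every call, so token cost and latency grow linearly with conversation length.

```typescript
// Naive "memory": replay the entire history on every request.
type Msg = { role: "system" | "user" | "assistant"; content: string }

const history: Msg[] = []

function buildMessages(userMessage: string): Msg[] {
  history.push({ role: "user", content: userMessage })
  // Every prior turn is re-sent each time — cost grows with every call,
  // until the model's context cap is hit and the oldest turns get dropped.
  return [
    { role: "system", content: "You are a helpful assistant." },
    ...history,
  ]
}
```

After ten turns you're paying for ten turns' worth of tokens on call eleven — and the prompt keeps growing from there.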
The right fix: a persistent, structured context layer
Instead of re-sending the entire history every time, you want a living profile per user:
- Communication style, preferences, and tone.
- Ongoing tasks and projects.
- Key relationships and stakeholders.
- Domain expertise and tools they use.
Stored once, updated over time, and re-used across every agent and every model you run.
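As a rough sketch of the idea (this is an illustrative shape, not Threadline's actual schema), the profile is a small structured object that gets rendered into a compact prompt prefix — a few hundred tokens instead of the whole transcript:

```typescript
// Illustrative profile shape — not Threadline's actual schema.
interface UserProfile {
  style: { tone: string; format: string } // communication preferences
  tasks: string[]                         // ongoing projects
  stakeholders: string[]                  // key relationships
  tools: string[]                         // stack and domain tools
}

// Render the profile into a compact prompt prefix instead of raw history.
function renderProfile(p: UserProfile): string {
  return [
    `Tone: ${p.style.tone}; format: ${p.style.format}.`,
    `Active tasks: ${p.tasks.join(", ")}.`,
    `Tools: ${p.tools.join(", ")}.`,
  ].join("\n")
}

const profile: UserProfile = {
  style: { tone: "concise", format: "bullet points" },
  tasks: ["onboarding flow redesign"],
  stakeholders: ["design lead"],
  tools: ["Next.js", "Postgres"],
}

console.log(renderProfile(profile))
```

The key property: the rendered prefix stays roughly constant in size no matter how long the relationship gets, because it's a distilled profile rather than an append-only log.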
How Threadline solves it
Threadline gives you a persistent, structured context layer that sits next to your existing AI stack. It works with any LLM, any framework, any product.
There are two primitives:
- `inject(userId, basePrompt)` — returns `injectedPrompt` (and an optional OpenAI `cacheHint`) in under 50ms.
- `update({ userId, userMessage, agentResponse })` — updates the user's context after each turn.
Context is user-owned: users can see, edit, and delete their profile in the trust dashboard. Agents only access what they've been explicitly granted.
Before vs after
Here's what a typical stateless agent looks like:
```typescript
// Before: stateless agent
const system = "You are a helpful assistant."

const completion = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [
    { role: "system", content: system },
    { role: "user", content: userMessage },
  ],
})
```

And here's the same flow with Threadline:
```typescript
// After: with Threadline
import { Threadline } from "threadline-sdk"

const tl = new Threadline({ apiKey: process.env.THREADLINE_KEY! })

const basePrompt = "You are a helpful assistant."
const { injectedPrompt, cacheHint } = await tl.inject(userId, basePrompt)

const completion = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [
    { role: "system", content: injectedPrompt },
    { role: "user", content: userMessage },
  ],
  ...(cacheHint?.recommended ? { extra_body: cacheHint.openaiParam } : {}),
})

const agentResponse = completion.choices[0]?.message?.content ?? ""
await tl.update({ userId, userMessage, agentResponse })
```

Same model, same base prompt — but now every interaction compounds. The agent remembers who it's talking to.
Get started
If you want your agents to stop forgetting everything, the quickest path is the Threadline quickstart. Drop in the inject + update pattern and ship a memory layer in minutes, not weeks.
Built by Threadline · threadline.to