Every agent you build starts from zero. Threadline changes that.
Your users repeat themselves in every new conversation. You keep rebuilding the same memory layer for every project. There's a better way.
// npm install threadline-sdk
import OpenAI from "openai"
import { Threadline } from "threadline-sdk"

const openai = new OpenAI()
const tl = new Threadline({ apiKey: process.env.THREADLINE_KEY! })

// Before your AI call: inject user context into the prompt
const { injectedPrompt, cacheHint } = await tl.inject(userId, basePrompt)

const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "system", content: injectedPrompt }],
  ...(cacheHint?.openaiParam ? { extra_body: cacheHint.openaiParam } : {}),
})

// After your AI call: update the user's context with this exchange
await tl.update({ userId, userMessage, agentResponse })
Two calls: inject() before your model runs, update() after. Your agent now remembers every user, forever.
User-owned context with a trust dashboard. Users can view, edit, and delete their data at any time.
Build 10 agents. One context object per user. Zero repetition. Grant each agent only the scopes it needs.
Context is served from a Redis cache before your LLM call, so memory lookups add no meaningful latency to your response time.
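The per-agent scoping above might work like this in practice. This is an illustrative sketch, not the SDK's API: the scope names and the shape of the context object are assumptions.

```typescript
// Sketch of per-agent scoping, assuming context is grouped into named scopes.
// Scope names and the context shape are illustrative assumptions.
type Scope = "preferences" | "tasks" | "relationships" | "expertise"

type UserContext = Partial<Record<Scope, string[]>>

// Return only the slices of context an agent has been granted.
function applyScopes(context: UserContext, granted: Scope[]): UserContext {
  const visible: UserContext = {}
  for (const scope of granted) {
    if (context[scope]) visible[scope] = context[scope]
  }
  return visible
}

const context: UserContext = {
  preferences: ["prefers concise answers"],
  tasks: ["migrating billing to Stripe"],
  expertise: ["senior TypeScript developer"],
}

// A support agent is granted preferences and tasks; expertise never leaks.
const supportView = applyScopes(context, ["preferences", "tasks"])
```

The point of the design: one context object per user, with each agent seeing only its granted slice.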
Works with everything you already use
up to 250,000 calls/month
~125,000 conversations/month
Growing products
1 call = one inject() or one update(), so a typical exchange (one of each) counts as two calls.
Common questions
Why not just use OpenAI or Anthropic's built-in memory?
Model-native memory is built for their own products — ChatGPT, Claude.ai. It doesn't follow your users across the agents you build. It's also locked to one provider, owned by them, and gives you no governance controls. Threadline is model-agnostic, developer-owned, and works with any LLM.
How is this different from a vector database or RAG?
Vector DBs store documents. Threadline stores structured, scoped user context — communication style, ongoing tasks, relationships, expertise — and injects exactly the right facts before each LLM call. No embeddings to manage, no retrieval pipelines to build. Two lines of code.
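"Injects exactly the right facts" can be pictured as plain string assembly rather than a retrieval pipeline. A minimal sketch, assuming a structured context object with fields like those named above (the field names and layout are illustrative, not the actual wire format):

```typescript
// Sketch: structured context serialized into a system-prompt preamble.
// No embeddings, no retrieval pipeline; just the relevant facts, rendered.
interface UserContext {
  communicationStyle: string
  ongoingTasks: string[]
  expertise: string[]
}

function renderContext(basePrompt: string, ctx: UserContext): string {
  const lines = [
    `Communication style: ${ctx.communicationStyle}`,
    `Ongoing tasks: ${ctx.ongoingTasks.join("; ")}`,
    `Expertise: ${ctx.expertise.join(", ")}`,
  ]
  return `${basePrompt}\n\nUser context:\n${lines.join("\n")}`
}

const prompt = renderContext("You are a helpful assistant.", {
  communicationStyle: "concise, technical",
  ongoingTasks: ["migrating billing to Stripe"],
  expertise: ["TypeScript", "PostgreSQL"],
})
```

Because the output is plain text, it works with any model that accepts a system prompt.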
What happens when a user wants their data deleted?
Users can view, edit, and delete their context at any time from the trust dashboard. Developers can also call the DELETE endpoint programmatically. Context is hard-deleted from Postgres and cache within seconds.
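A programmatic deletion call might be shaped like the following. This is a hypothetical sketch: the host, path, and auth header are assumptions, since the page only states that a DELETE endpoint exists.

```typescript
// Hypothetical sketch of the programmatic DELETE described above.
// The URL, path, and header are assumptions, not the documented API.
interface DeleteRequest {
  url: string
  method: "DELETE"
  headers: Record<string, string>
}

function buildContextDelete(userId: string, apiKey: string): DeleteRequest {
  return {
    url: `https://api.threadline.example/v1/context/${encodeURIComponent(userId)}`,
    method: "DELETE",
    headers: { Authorization: `Bearer ${apiKey}` },
  }
}

const req = buildContextDelete("user_123", process.env.THREADLINE_KEY ?? "sk-test")
```

Per the answer above, the server side hard-deletes from Postgres and cache within seconds.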
Can I switch LLM providers without losing user memory?
Yes. Threadline is completely model-agnostic. Your context layer lives independently of whichever model you call. Switch from GPT-4o to Claude to Gemini — your users' memory comes with them.
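Because the injected prompt is plain text, it drops into any provider's request shape unchanged. A sketch of what model-agnostic means in practice; the shapes below mirror the OpenAI chat and Anthropic messages request formats, and the model names are placeholders:

```typescript
// Sketch: one injected prompt, two provider request shapes.
// Model names are placeholders; request shapes mirror OpenAI and Anthropic.
type ChatMessage = { role: "system" | "user"; content: string }

function toOpenAI(injectedPrompt: string, userMessage: string) {
  return {
    model: "gpt-4o",
    messages: [
      { role: "system", content: injectedPrompt },
      { role: "user", content: userMessage },
    ] as ChatMessage[],
  }
}

function toAnthropic(injectedPrompt: string, userMessage: string) {
  return {
    model: "claude-sonnet", // placeholder model name
    system: injectedPrompt, // Anthropic takes the system prompt as a top-level field
    messages: [{ role: "user" as const, content: userMessage }],
  }
}
```

The context layer produces the same string either way; only the envelope changes when you switch providers.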