threadline

Quick start

~3 minutes. Get a memory-aware agent running with Threadline.

  1. Sign up at threadline.to/dashboard
  2. Create an agent and copy your API key
  3. Install the SDK: npm install threadline-sdk
  4. Add two lines to your agent
```typescript
import { Threadline } from "threadline-sdk"

const tl = new Threadline({ apiKey: process.env.THREADLINE_KEY! })

// Before your AI call
const { injectedPrompt, cacheHint } = await tl.inject(userId, basePrompt)

// After your AI call
await tl.update({ userId, userMessage, agentResponse })
```
That's it. Your agent now remembers every user.
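Memory enrichment should be an enhancement, not a hard dependency. One way to degrade gracefully when the inject call fails is to fall back to your base prompt; this is a sketch using an illustrative `injectWithFallback` wrapper (our name, not part of the SDK), typed against an assumed shape of the inject result:

```typescript
// Hypothetical shape of tl.inject's result; check the SDK types for the real one.
type InjectResult = { injectedPrompt: string; cacheHint?: { recommended: boolean } }
type InjectFn = (userId: string, basePrompt: string) => Promise<InjectResult>

// Wrap any inject-style function so a failure falls back to the plain base prompt,
// keeping your agent serving even if the memory service is unreachable.
async function injectWithFallback(
  inject: InjectFn,
  userId: string,
  basePrompt: string
): Promise<InjectResult> {
  try {
    return await inject(userId, basePrompt)
  } catch {
    return { injectedPrompt: basePrompt }
  }
}
```

You would then call `injectWithFallback((u, p) => tl.inject(u, p), userId, basePrompt)` in place of the bare `tl.inject` call.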

User transparency (recommended)

We recommend adding a small disclosure in your app's settings or profile page:

"Your AI assistant uses Threadline to remember your preferences across sessions. You can view or delete your context at any time."

Include a link to:

https://www.threadline.to/account

Complete example — OpenAI (Node.js)

```typescript
import OpenAI from "openai"
import { Threadline } from "threadline-sdk"

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY! })
const tl = new Threadline({ apiKey: process.env.THREADLINE_KEY! })

export async function reply({ userId, userMessage }: { userId: string; userMessage: string }) {
  const basePrompt = "You are a helpful assistant."

  // Before your AI call
  const { injectedPrompt, cacheHint } = await tl.inject(userId, basePrompt)

  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      { role: "system", content: injectedPrompt },
      { role: "user", content: userMessage },
    ],
    // In the Node SDK, extra body params are spread directly into the request
    // (extra_body is a Python-SDK concept).
    ...(cacheHint?.recommended ? cacheHint.openaiParam : {}),
  })

  const agentResponse = completion.choices[0]?.message?.content ?? ""

  // After your AI call
  await tl.update({ userId, userMessage, agentResponse })

  return agentResponse
}
```

OpenAI prompt caching (cost savings)

When Threadline enriches your system prompt with user context, that prefix is stable across turns. The inject API adds cacheHint and the response header X-Threadline-Cache-Hint: 24h so you can enable 24h prompt cache retention on supported OpenAI models — often cutting cached input token cost by roughly half for the repeated context block.

```typescript
const { injectedPrompt, cacheHint } = await tl.inject(userId, basePrompt)

const completion = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [
    { role: "system", content: injectedPrompt },
    { role: "user", content: userMessage },
  ],
  // Spread the extra body param directly (the Node SDK has no extra_body wrapper).
  ...(cacheHint?.recommended ? cacheHint.openaiParam : {}),
})
```

Use cacheHint.openaiParam (e.g. prompt_cache_retention: "24h") only when cacheHint.recommended is true and your model supports prompt caching.
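That guard can be factored into a small pure helper so every call site stays readable. The helper name and the `CacheHint` type below are ours, written against the shape described above, not part of the SDK:

```typescript
// Assumed cacheHint shape, e.g. { recommended: true, openaiParam: { prompt_cache_retention: "24h" } }.
type CacheHint = { recommended: boolean; openaiParam: Record<string, unknown> }

// Return extra request params only when caching is both recommended by Threadline
// and supported by the model you are calling; otherwise contribute nothing.
function cacheParams(
  hint: CacheHint | undefined,
  modelSupportsCaching: boolean
): Record<string, unknown> {
  return hint?.recommended && modelSupportsCaching ? hint.openaiParam : {}
}
```

At the call site this becomes `...cacheParams(cacheHint, true)` inside `openai.chat.completions.create({ ... })`.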

Expected behaviour

As users talk to your chatbot:

  • Threadline learns their communication style, ongoing tasks, key relationships, domain expertise, preferences, and emotional state signals.

  • tl.inject() returns injectedPrompt (plus an optional cacheHint for OpenAI prompt caching), so the same user feels known no matter which agent they're using.

  • The user can always inspect and edit this context from the /account trust dashboard.

Next:

  • Wire the same Threadline client into your other agents (email, support, coding).

  • Explore the API Reference and SDK Reference for details and advanced usage.
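One way to reuse a single Threadline client across several agents is a small factory that pairs the shared client with each agent's base prompt and LLM call. This is a sketch: the `MemoryClient` interface below is our minimal typing of the `inject`/`update` calls from the Quick start, and `makeAgent` is an illustrative helper, not an SDK export:

```typescript
// Minimal interface covering what the agents need from the shared client;
// the real Threadline client exposes at least inject() and update().
interface MemoryClient {
  inject(userId: string, basePrompt: string): Promise<{ injectedPrompt: string }>
  update(args: { userId: string; userMessage: string; agentResponse: string }): Promise<void>
}

type Agent = (userId: string, userMessage: string) => Promise<string>

// Build a memory-aware agent from a base prompt and any LLM call, reusing one
// shared client so email, support, and coding agents all see the same user context.
function makeAgent(
  tl: MemoryClient,
  basePrompt: string,
  llm: (systemPrompt: string, userMessage: string) => Promise<string>
): Agent {
  return async (userId, userMessage) => {
    const { injectedPrompt } = await tl.inject(userId, basePrompt)
    const agentResponse = await llm(injectedPrompt, userMessage)
    await tl.update({ userId, userMessage, agentResponse })
    return agentResponse
  }
}
```

With this shape, `makeAgent(tl, "You are a support agent.", callSupportModel)` and `makeAgent(tl, "You are a coding assistant.", callCodingModel)` share one client and therefore one per-user memory.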