Introduction
Memra gives AI assistants persistent memory. Each conversation turn is embedded and stored as a vector, then retrieved with semantic search when your agent needs context — across sessions, users, and agents.
Save
After each AI response, store the exchange with memory.save().
Retrieve
Before the next response, call memory.getContext() to get relevant memories.
Inject
Add the context to your system prompt. Your AI now remembers.
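To make "semantic search" concrete, the toy sketch below ranks stored vectors by cosine similarity against a query vector. It is an illustration only: the three-dimensional vectors and the `rank` helper are invented for this example, and nothing here describes how Memra computes or indexes embeddings internally.

```typescript
// Toy illustration of similarity ranking; not Memra's implementation.
interface StoredVector {
  content: string
  vector: number[]
}

// Cosine similarity: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vectors must have the same length')
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  // A zero vector has no direction, so treat it as unrelated to everything.
  if (normA === 0 || normB === 0) return 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Return the `limit` stored items most similar to the query, highest first.
function rank(query: number[], items: StoredVector[], limit: number): StoredVector[] {
  return items
    .map(item => ({ item, score: cosineSimilarity(query, item.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
    .map(entry => entry.item)
}
```

Real embeddings have hundreds or thousands of dimensions, but the ranking idea is the same: the closer the direction of two vectors, the higher the score.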
Installation
Install the official Memra client from npm. Works with Node.js and any bundler.
npm install @memra-client/client
Package: @memra-client/client · TypeScript included
Quick start
Add persistent memory to any AI handler in three steps. Get your API key from the dashboard.
import { MemoryClient } from '@memra-client/client'

const memory = new MemoryClient({ apiKey: 'mk_live_...' })

export async function chat(userId: string, userMessage: string) {
  // 1. Pull relevant memories before the model responds
  const { context } = await memory.getContext(userId, userMessage)

  // Memory goes in the system prompt, NOT in the user message.
  // This tells the AI what the context is and how to use it.
  const systemPrompt = context.length > 0
    ? `You are a helpful assistant with memory of past conversations.
Here is what you remember:
${context.map((m, i) => `${i + 1}. [${m.role}]: ${m.content}`).join('\n')}
Use this memory to give personalized responses.`
    : `You are a helpful assistant.`

  // 2. Pass context as system prompt (works with any AI provider)
  const completion = await yourAI.chat.completions.create({
    messages: [
      { role: 'system', content: systemPrompt },
      { role: 'user', content: userMessage },
    ],
  })

  // memory.save() expects the reply text, not the whole response object.
  // The path below assumes an OpenAI-style response; adjust it for your provider.
  const aiReply = completion.choices[0].message.content ?? ''

  // 3. Save this exchange to memory
  await memory.save(userId, userMessage, aiReply)
  return aiReply
}
MemoryClient
The main entry point. Create one instance per application and reuse it.
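One way to follow the single-instance advice is a lazily initialised accessor exported from a shared module. The sketch below uses a stand-in `FakeClient` class so that it runs on its own; in an application the factory would return `new MemoryClient({ apiKey })` instead. The `lazySingleton` helper is a hypothetical name, not part of the SDK.

```typescript
// Returns a getter that calls `factory` on first use and caches the result.
function lazySingleton<T>(factory: () => T): () => T {
  let instance: T | undefined
  let created = false
  return () => {
    if (!created) {
      instance = factory()
      created = true
    }
    return instance as T
  }
}

// Stand-in for MemoryClient so the sketch is self-contained.
class FakeClient {
  static constructed = 0
  readonly apiKey: string
  constructor(apiKey: string) {
    this.apiKey = apiKey
    FakeClient.constructed++
  }
}

// In an application: lazySingleton(() => new MemoryClient({ apiKey: ... }))
const getMemory = lazySingleton(() => new FakeClient('mk_live_...'))
```

Every caller that imports `getMemory` receives the same instance, and nothing is constructed until the first call.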
Constructor
import { MemoryClient } from '@memra-client/client'
const memory = new MemoryClient({
  apiKey: 'mk_live_...', // required, from your dashboard
  baseUrl: 'https://memra.dev/api' // optional, this is the default
})
Options
| Parameter | Type | Description |
|---|---|---|
| apiKey (required) | string | Your API key from the Memra dashboard. Prefix: mk_live_ |
| baseUrl | string | API base URL. Default: https://memra.dev/api |
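Hard-coding the key is fine for a quick test, but a deployed service would normally read it from configuration. The sketch below validates a key before it is handed to the constructor. The `MEMRA_API_KEY` variable name and the `resolveApiKey` helper are assumptions for illustration; only the `mk_live_` prefix comes from the table above, so relax that check if your account issues keys with other prefixes.

```typescript
// Hypothetical helper: reads and sanity-checks the key from an env-like object.
function resolveApiKey(env: Record<string, string | undefined>): string {
  const key = env['MEMRA_API_KEY'] // assumed variable name
  if (!key) {
    throw new Error('MEMRA_API_KEY is not set')
  }
  if (!key.startsWith('mk_live_')) {
    throw new Error('MEMRA_API_KEY does not start with mk_live_')
  }
  return key
}
```

In Node.js this could be used as `new MemoryClient({ apiKey: resolveApiKey(process.env) })`, so a missing or malformed key fails at startup rather than on the first request.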
memory.save()
POST /memory/save
Saves a conversation exchange to persistent memory. Stores two records — one for the user message and one for the AI reply — both embedded as semantic vectors for future retrieval.
Signature
memory.save(
  userId: string, // your end-user's identifier
  userMessage: string, // the user's message text
  aiReply: string, // the assistant's response text
  options?: {
    agentId?: string // isolate memory per bot or context (default: 'default')
  }
): Promise<{ success: boolean; saved: number }>
Parameters
| Parameter | Type | Description |
|---|---|---|
| userId (required) | string | Your application's identifier for this user. Scopes memories to a single user. |
| userMessage (required) | string | The user's message text. |
| aiReply (required) | string | The AI assistant's response text. |
| options.agentId | string | Namespace to isolate memories by bot or context. Default: 'default' |
Returns
{ success: boolean; saved: number }
saved is always 2: the user message and the AI reply are stored as separate records.
Example
const result = await memory.save(
  'user_abc123',
  'What is my current plan?',
  'You are on the Free plan (500 memory slots available).',
  { agentId: 'support-bot' }
)
// => { success: true, saved: 2 }
Errors
| Status | Meaning | Description |
|---|---|---|
| 400 | Bad request | Missing userMessage or aiReply in request body. |
| 401 | Unauthorized | Invalid or missing x-api-key. |
| 429 | Limit reached | Memory limit for your plan reached. See error body for upgrade link. |
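A failed save (for example a 429 once the plan limit is reached) should rarely block the reply to the user, so one option is to wrap the call and report failures instead of throwing. This is a sketch: `safeSave` and the `SaveFn` type are hypothetical, and because the shape of the error thrown by the SDK is not documented here, the wrapper simply forwards whatever was thrown to a callback.

```typescript
// Matches the documented memory.save() signature.
type SaveFn = (
  userId: string,
  userMessage: string,
  aiReply: string,
  options?: { agentId?: string }
) => Promise<{ success: boolean; saved: number }>

// Hypothetical wrapper: never rejects; reports failures through onError.
async function safeSave(
  save: SaveFn,
  args: { userId: string; userMessage: string; aiReply: string; agentId?: string },
  onError: (error: unknown) => void
): Promise<boolean> {
  try {
    const options = args.agentId ? { agentId: args.agentId } : undefined
    const result = await save(args.userId, args.userMessage, args.aiReply, options)
    return result.success
  } catch (error) {
    onError(error)
    return false
  }
}
```

With the SDK, `save` could be passed as `(u, m, r, o) => memory.save(u, m, r, o)`, and `onError` could log the failure or raise an alert.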
memory.getContext()
GET /memory/context
Performs a semantic search over this user's stored memories and returns the most relevant ones for the given query. Call this before generating an AI response to inject relevant history.
Signature
memory.getContext(
  userId: string,
  query: string, // the current message, used to find relevant memories
  options?: {
    agentId?: string // filter to a specific agent namespace (default: 'default')
    limit?: number // max memories returned (default: 5)
  }
): Promise<ContextResponse>
Parameters
| Parameter | Type | Description |
|---|---|---|
| userId (required) | string | User identifier. Scopes the search to this user's memories. |
| query (required) | string | The current user message. Used to find semantically relevant memories. |
| options.agentId | string | Filter memories to this agent namespace. Default: 'default' |
| options.limit | number | Max number of memories to return. Default: 5 |
Response types
interface ContextResponse {
  context: Memory[] // semantically relevant memories, ranked by similarity
  count: number
  latencyMs: number // end-to-end search latency, added client-side
}
interface Memory {
  id: string
  content: string
  role: 'user' | 'assistant'
  createdAt: string
  similarity: number // 0–1 cosine similarity score
}
The response field is named context, not memories. Each memory in context includes a similarity score (0–1). The latencyMs field is added client-side by the SDK.
Example
const { context, latencyMs } = await memory.getContext(
  'user_abc123',
  'What plan am I on?',
  { agentId: 'support-bot', limit: 5 }
)
// context is sorted by similarity, highest first
context.forEach(m => {
  console.log(m.role + ':', m.content, '|', m.similarity.toFixed(2))
})
⚠ Common mistake
Always inject memory into the system prompt, not the user message. Context pasted into the user message reads as part of what the user typed, so the model is less likely to treat it as background knowledge.
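The advice above can be packaged as a small pure function that turns `getContext()` results into a system prompt. The sketch follows the prompt wording from the Quick start; the `buildSystemPrompt` name, the `minSimilarity` cutoff, and its default of 0.5 are assumptions for illustration, not SDK behaviour or a recommended value.

```typescript
// Mirrors the documented Memory interface.
interface Memory {
  id: string
  content: string
  role: 'user' | 'assistant'
  createdAt: string
  similarity: number
}

// Hypothetical helper: builds a system prompt from retrieved memories.
function buildSystemPrompt(context: Memory[], minSimilarity = 0.5): string {
  // Drop weak matches so loosely related memories do not crowd the prompt.
  const relevant = context.filter(m => m.similarity >= minSimilarity)
  if (relevant.length === 0) {
    return 'You are a helpful assistant.'
  }
  const lines = relevant.map((m, i) => `${i + 1}. [${m.role}]: ${m.content}`)
  return [
    'You are a helpful assistant with memory of past conversations.',
    'Here is what you remember:',
    ...lines,
    'Use this memory to give personalized responses.',
  ].join('\n')
}
```

The returned string goes in the `system` message; the user's message is passed through unchanged.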
memory.getHistory()
GET /memory/history
Returns messages in chronological order (oldest first). Unlike getContext(), this is not semantic — it returns a time-ordered list, useful for replaying conversation history or building a chat timeline.
Signature
memory.getHistory(
  userId: string,
  options?: {
    agentId?: string // filter by agent (omit = return all agents)
    limit?: number // number of messages to return (default: 20)
  }
): Promise<{
  history: Memory[] // chronological order, oldest first
  count: number
}>
Parameters
| Parameter | Type | Description |
|---|---|---|
| userId (required) | string | User identifier. |
| options.agentId | string | Filter by agent. Omit to return history across all agents. |
| options.limit | number | Number of messages to return. Default: 20 |
History memories do not have a similarity field — that is only present on getContext() results.
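Because history records carry no `similarity`, it can help to model them with a narrower type. The sketch below does that and converts a history list into the `{ role, content }` message array that many chat-completion APIs accept. The `HistoryMemory` alias and the `toChatMessages` helper are hypothetical names, not SDK exports.

```typescript
// Mirrors the documented Memory interface.
interface Memory {
  id: string
  content: string
  role: 'user' | 'assistant'
  createdAt: string
  similarity: number
}

// History records: the same shape without the similarity score.
type HistoryMemory = Omit<Memory, 'similarity'>

interface ChatMessage {
  role: 'user' | 'assistant'
  content: string
}

// Keeps the oldest-first order returned by getHistory().
function toChatMessages(history: HistoryMemory[]): ChatMessage[] {
  return history.map(m => ({ role: m.role, content: m.content }))
}
```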
Example
const { history, count } = await memory.getHistory('user_abc123', {
  agentId: 'support-bot',
  limit: 50
})
console.log('Total messages:', count)
history.forEach(m => {
  console.log('[' + m.role + '] ' + m.content)
})
memory.forget()
DELETE /memory/forget
Permanently deletes memories. Pass an agentId to scope deletion to a single agent, or omit it to delete all memories for this user. This action cannot be undone.
Signature
memory.forget(
  userId: string,
  options?: {
    agentId?: string // omit to delete ALL memories for this user
  }
): Promise<{ success: boolean; deleted: number }>
Parameters
| Parameter | Type | Description |
|---|---|---|
| userId (required) | string | User identifier. |
| options.agentId | string | Scope deletion to this agent. Omit to delete ALL memories for this user. |
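Since omitting `agentId` wipes every memory for the user, some teams may prefer to make that case explicit in their own code. The sketch below wraps a `forget`-shaped function so that a full wipe has to be requested with `{ all: true }`. `forgetScoped`, `ForgetFn`, and `ForgetScope` are hypothetical names, not part of the SDK.

```typescript
// Matches the documented memory.forget() signature.
type ForgetFn = (
  userId: string,
  options?: { agentId?: string }
) => Promise<{ success: boolean; deleted: number }>

// Either one agent's memories, or an explicit request for everything.
type ForgetScope = { agentId: string } | { all: true }

async function forgetScoped(forget: ForgetFn, userId: string, scope: ForgetScope) {
  if ('agentId' in scope) {
    // An empty agentId is most likely a bug, so refuse it rather than guess.
    if (scope.agentId.trim() === '') {
      throw new Error('agentId must not be empty; use { all: true } to delete everything')
    }
    return forget(userId, { agentId: scope.agentId })
  }
  return forget(userId) // explicit full wipe
}
```

A call such as `forgetScoped(fn, 'user_abc123', { all: true })` then reads unambiguously in code review, whereas a missing argument cannot delete everything by accident.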
Example
// Clear only this agent's memory
await memory.forget('user_abc123', { agentId: 'support-bot' })
// Clear ALL memories for a user across every agent
const result = await memory.forget('user_abc123')
// => { success: true, deleted: 124 }
REST API
Use the REST API directly from any language or tool. All endpoints require an API key passed in the x-api-key header. The user's identity is derived from the API key — you do not need to pass a userId in the request body or query string.
Base URL
https://memra.dev/api
| Method | Endpoint | Description |
|---|---|---|
| POST | /memory/save | Save a conversation turn |
| GET | /memory/context | Semantic search — returns most relevant memories |
| GET | /memory/history | Chronological message history |
| DELETE | /memory/forget | Delete memories (by agent or all) |
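For callers that are not using the SDK, any endpoint in the table can be reached with a plain HTTP client. The sketch below builds the URL and headers for `GET /memory/context` in TypeScript using the standard `URLSearchParams` class. `buildContextRequest` and `ContextQuery` are hypothetical names introduced for this example.

```typescript
const BASE_URL = 'https://memra.dev/api'

interface ContextQuery {
  query: string
  agentId?: string
  limit?: number
}

// Hypothetical helper: returns the URL and headers for GET /memory/context.
function buildContextRequest(apiKey: string, params: ContextQuery) {
  const search = new URLSearchParams({ query: params.query })
  if (params.agentId !== undefined) search.set('agentId', params.agentId)
  if (params.limit !== undefined) search.set('limit', String(params.limit))
  return {
    url: `${BASE_URL}/memory/context?${search.toString()}`,
    headers: { 'x-api-key': apiKey },
  }
}
```

The returned `url` and `headers` can be passed to `fetch(url, { headers })`. `URLSearchParams` handles the encoding, so a query such as `account plan` becomes `account+plan`.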
POST /memory/save
| Parameter | Type | Description |
|---|---|---|
| userMessage (required) | string | The user's message text. |
| aiReply (required) | string | The AI assistant's response text. |
| agentId | string | Memory namespace. Default: 'default' |
curl -X POST https://memra.dev/api/memory/save \
  -H 'Content-Type: application/json' \
  -H 'x-api-key: mk_live_...' \
  -d '{
    "userMessage": "What is my plan?",
    "aiReply": "You are on the Free plan.",
    "agentId": "support-bot"
  }'
{ "success": true, "saved": 2 }
GET /memory/context
| Parameter | Type | Description |
|---|---|---|
| query (required) | string | The search query. Finds semantically similar memories. |
| agentId | string | Filter to this agent namespace. Default: 'default' |
| limit | number | Max results. Default: 5 |
curl 'https://memra.dev/api/memory/context?query=account+plan&agentId=support-bot&limit=5' \
  -H 'x-api-key: mk_live_...'
{
  "context": [
    {
      "id": "clx123abc",
      "content": "What is my plan?",
      "role": "user",
      "createdAt": "2026-06-15T10:00:00.000Z",
      "similarity": 0.94
    }
  ],
  "count": 1
}
GET /memory/history
| Parameter | Type | Description |
|---|---|---|
| agentId | string | Filter by agent. Omit for all agents. |
| limit | number | Messages to return. Default: 20 |
curl 'https://memra.dev/api/memory/history?agentId=support-bot&limit=20' \
  -H 'x-api-key: mk_live_...'
DELETE /memory/forget
| Parameter | Type | Description |
|---|---|---|
| agentId | string | Agent to clear. Omit to delete ALL memories for this user. |
curl -X DELETE https://memra.dev/api/memory/forget \
  -H 'Content-Type: application/json' \
  -H 'x-api-key: mk_live_...' \
  -d '{ "agentId": "support-bot" }'
{ "success": true, "deleted": 14 }
Error codes
All errors return JSON with an error field. The SDK throws on non-2xx status codes.
| Status | Meaning | Endpoints |
|---|---|---|
| 400 | Missing required fields | save (userMessage/aiReply), context (query) |
| 401 | Invalid or missing API key | All endpoints |
| 429 | Memory limit reached | save only |
429 — Limit reached
Returned only by POST /memory/save when the user has reached their plan limit. Existing memories are safe — only new saves are blocked.
{
  "error": "Memory limit reached",
  "limit": 500,
  "plan": "free",
  "upgrade": "https://memra.dev/pricing"
}
The Free plan includes 500 memory records (250 conversation turns). When the limit is reached, the 429 response includes the limit, plan, and upgrade URL.
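When calling the REST API directly, the 429 body can be recognised with a small type guard before showing an upgrade prompt. The sketch below is based only on the example body above; `LimitError`, `isLimitError`, and `describeLimit` are hypothetical names.

```typescript
// Shape of the documented 429 response body.
interface LimitError {
  error: string
  limit: number
  plan: string
  upgrade: string
}

// Type guard: true only when every documented field is present with the expected type.
function isLimitError(body: unknown): body is LimitError {
  if (typeof body !== 'object' || body === null) return false
  const b = body as Record<string, unknown>
  return (
    typeof b['error'] === 'string' &&
    typeof b['limit'] === 'number' &&
    typeof b['plan'] === 'string' &&
    typeof b['upgrade'] === 'string'
  )
}

// Returns a user-facing message, or null when the body is some other error.
function describeLimit(body: unknown): string | null {
  if (!isLimitError(body)) return null
  return `Plan "${body.plan}" is capped at ${body.limit} records. Upgrade: ${body.upgrade}`
}
```

Other error bodies, which carry only an `error` field, fall through to `null` and can be handled by generic error reporting.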