The Jean Memory Node.js SDK is a headless library for integrating our Context API into your backend services. It’s perfect for developers building API routes, serverless functions, or stateful agents in a Node.js environment.

Installation

npm install @jeanmemory/node

Usage: Creating a Context-Aware API Route

A common use case is to create an API endpoint that your frontend can call. This endpoint will securely fetch context from Jean Memory and then stream a response from your chosen LLM. The example below shows how to create a Next.js API route that is compatible with edge runtimes and the Vercel AI SDK.
import { JeanClient } from '@jeanmemory/node';
import { OpenAIStream, StreamingTextResponse } from 'ai';
import OpenAI from 'openai';

// Create the clients
const jean = new JeanClient({ apiKey: process.env.JEAN_API_KEY });
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Set the runtime to edge for best performance
export const runtime = 'edge';

export async function POST(req: Request) {
  // 1. Get the user's message and token from the request body
  const { messages, userToken } = await req.json();
  const currentMessage = messages[messages.length - 1].content;

  // Ensure the user token is present
  if (!userToken) {
    return new Response('Unauthorized', { status: 401 });
  }

  // 2. Get context from Jean Memory with speed control
  const contextResponse = await jean.getContext({
    user_token: userToken,
    message: currentMessage,
    speed: "balanced", // Options: "fast", "balanced", "autonomous", "comprehensive"
    // tool: "jean_memory", format: "enhanced" (defaults)
  });

  // 3. Engineer your final prompt
  const finalPrompt = `
    Using the following context, please answer the user's question.
    The context is a summary of the user's memories related to their question.

    Context:
    ---
    ${contextResponse.text}
    ---

    User Question: ${currentMessage}
  `;
  
  // 4. Call your LLM and stream the response
  const response = await openai.chat.completions.create({
    model: 'gpt-4-turbo',
    stream: true,
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: finalPrompt },
    ],
  });

  const stream = OpenAIStream(response);
  return new StreamingTextResponse(stream);
}
This code sets up a Next.js API route that acts as a secure bridge between your frontend and your language model.
  1. Extract Data: It pulls the latest user message and, most importantly, the userToken from the incoming request. This token, acquired by your frontend via OAuth, authorizes access to the user’s memory.
  2. Fetch Context: It calls jean.getContext(), passing the userToken and the user’s message to the Jean Memory engine. The engine returns a block of relevant, engineered context.
  3. Construct Prompt: It assembles a final prompt, injecting the context from Jean Memory before the user’s actual question. This enriches the LLM’s understanding.
  4. Stream Response: It calls the LLM (in this case, OpenAI) with the context-rich prompt and streams the response back to the frontend using the Vercel AI SDK’s StreamingTextResponse. This provides a responsive, real-time chat experience.

Authentication Flow

As with the Python SDK, the userToken is obtained by your frontend application through a secure OAuth 2.1 flow using our @jeanmemory/react SDK. Your frontend then makes an authenticated request to this API route, including the userToken in the request body (see the sketch after the example below). See the Authentication guide for more details.

Test User Support: The Node.js SDK v2.0.7+ automatically creates test users for development:
// With user token (production)
const context = await jean.getContext({
  user_token: userToken,  // From OAuth flow
  message: "What's my schedule?"
});

// Backward compatibility string overload (automatic test user)
const context = await jean.getContext("What's my schedule?");
This allows you to test core functionality immediately without implementing full authentication during development.
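
For reference, a frontend call to the route above might look like the following. This is a minimal sketch: the /api/chat path, the askWithContext helper, and the message shape are illustrative assumptions, and the userToken comes from the @jeanmemory/react OAuth flow.
// A hypothetical frontend helper; adjust the path to wherever you mount the route above.
async function askWithContext(userToken: string, messages: { role: string; content: string }[]) {
  // The API route reads `messages` and `userToken` from the JSON body.
  const res = await fetch('/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ messages, userToken }),
  });
  if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
  return res.body; // a ReadableStream of the model's streamed response
}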

Speed Modes

Jean Memory provides four distinct speed modes to balance response time with context depth:
// Fast: Direct memory search (0.5-1s)
const fastContext = await jean.getContext({
  user_token: userToken,
  message: "What's my schedule?",
  speed: "fast"
});

// Balanced: AI synthesis with Gemini 2.5 Flash (3-5s) - Recommended
const balancedContext = await jean.getContext({
  user_token: userToken,
  message: "How should I handle this situation?",
  speed: "balanced"
});

// Autonomous: Intelligent orchestration (variable latency)
const smartContext = await jean.getContext({
  user_token: userToken,
  message: "Help me plan my project timeline",
  speed: "autonomous"
});

// Comprehensive: Deep document analysis (20-30s)
const deepContext = await jean.getContext({
  user_token: userToken,
  message: "Analyze patterns in my communication style",
  speed: "comprehensive"
});
Speed Mode Selection:
  • fast: Ideal for real-time applications requiring sub-second responses
  • balanced: Recommended for most conversational AI use cases with natural language synthesis
  • autonomous: Best for complex tasks requiring intelligent decision-making
  • comprehensive: Use for research and detailed analysis tasks
Learn more about speed modes →
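As an illustration of these guidelines, you can choose a speed mode per request. The heuristic below is an assumption made for this sketch only; the keywords and length threshold are not part of the SDK.
// Illustrative only: map the incoming message to a speed mode.
type Speed = "fast" | "balanced" | "autonomous" | "comprehensive";

function pickSpeed(message: string): Speed {
  if (message.length < 40) return "fast";                          // short lookups
  if (/analy[sz]e|pattern|report/i.test(message)) return "comprehensive"; // deep analysis
  if (/plan|organize|timeline/i.test(message)) return "autonomous";        // multi-step tasks
  return "balanced";                                               // sensible default
}

const context = await jean.getContext({
  user_token: userToken,
  message: currentMessage,
  speed: pickSpeed(currentMessage),
});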

Additional Configuration

For advanced use cases, you can configure additional options:
// Different tools for specific needs
const context = await jean.getContext({
  user_token: userToken,
  message: currentMessage,
  tool: "search_memory"  // vs "jean_memory" (default)
});

// Simple text response instead of full metadata
const simpleContext = await jean.getContext({
  user_token: userToken,
  message: currentMessage,
  format: "simple"  // vs "enhanced" (default)
});

Advanced: Direct Tool Access

For advanced use cases, the JeanClient also provides a tools namespace for direct, deterministic access to the core memory functions.
// The intelligent, orchestrated way (recommended):
const context = await jean.getContext({ user_token: ..., message: "..." });

// The deterministic, tool-based way:
await jean.tools.add_memory({ user_token: ..., content: "My project's deadline is next Friday." });

const search_results = await jean.tools.search_memory({ user_token: ..., query: "project deadlines" });
// Parse search results - they come as JSON in content[0].text
const searchData = JSON.parse(search_results.content[0].text);
console.log('Found memories:', searchData.memories);

// Advanced tools for complex operations:
try {
  const deep_results = await jean.tools.deep_memory_query({ user_token: ..., query: "complex relationship query" });
  // Deep queries return analysis directly in content[0].text
  const analysis = deep_results.content[0].text;
  console.log('Deep analysis:', analysis);
} catch (error) {
  console.error('Deep query failed:', error);
  // Fallback to regular search
}

const doc_result = await jean.tools.store_document({ 
  user_token: ..., 
  title: "Meeting Notes", 
  content: "...", 
  document_type: "markdown" 
});

Performance Expectations

Different operations have different timing characteristics:
  • tools.search_memory(): 1-2 seconds - Fast semantic search, returns JSON
  • getContext() (orchestration): 3-10 seconds - Full AI conversation with context
  • tools.deep_memory_query(): 5-15 seconds - Comprehensive cross-memory analysis
  • tools.store_document(): Immediate response + background processing (30-60 seconds total)
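
Because the slower operations can run for tens of seconds, you may want to bound them on the caller side. The sketch below wraps a call with Promise.race; the 20-second budget and the fallback to search_memory are choices made for this example, not SDK behavior.
// Reject if a promise does not settle within `ms` milliseconds.
async function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return Promise.race([
    promise,
    new Promise<T>((_, reject) =>
      setTimeout(() => reject(new Error(`Timed out after ${ms}ms`)), ms)
    ),
  ]);
}

try {
  const deep = await withTimeout(
    jean.tools.deep_memory_query({ user_token: userToken, query: "communication patterns" }),
    20_000 // 20s budget chosen for this sketch
  );
  console.log(deep.content[0].text);
} catch (error) {
  // Fall back to the faster semantic search
  const quick = await jean.tools.search_memory({ user_token: userToken, query: "communication patterns" });
}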

Response Format

Most tool responses follow this structure:
{
  content: [
    {
      type: "text", 
      text: "JSON string or direct text response"
    }
  ]
}
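If you call several tools, a small helper can normalize this shape. The parseToolResponse name below is illustrative and not part of the SDK; it returns parsed JSON when content[0].text contains JSON (as with search_memory) and the raw string otherwise (as with deep_memory_query).
// Hypothetical helper for the response structure shown above.
function parseToolResponse(result: { content: { type: string; text: string }[] }) {
  const text = result.content[0]?.text ?? "";
  try {
    return JSON.parse(text); // e.g. search_memory returns a JSON string
  } catch {
    return text;             // e.g. deep_memory_query returns plain text
  }
}

const results = await jean.tools.search_memory({ user_token: userToken, query: "project deadlines" });
const data = parseToolResponse(results);
console.log('Found memories:', data.memories);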