Overview

Jean Memory provides two approaches to controlling context depth and response time, depending on your integration method:
  • SDK Integration: Uses speed modes (fast, balanced, autonomous, comprehensive)
  • MCP Integration: Uses depth levels (0, 1, 2, 3) for simplified tool calling
This page covers both approaches to help you choose the right configuration for your application.

Fast Mode

Direct memory search with sub-second response times

Balanced Mode

AI synthesis with Gemini 2.5 Flash for conversational responses

Autonomous Mode

Intelligent orchestration with adaptive decision-making

Comprehensive Mode

Deep document analysis with extensive memory search

MCP Integration (Depth Levels)

For MCP clients (such as Claude, Cursor, or custom MCP implementations), use the simplified depth parameter:

Depth Levels

depth=0 (No Context)
  • Purpose: For generic knowledge questions that don’t require personal context
  • Performance: Immediate response (just saves information in background)
  • Examples: “What is the capital of France?”, “Explain quantum physics”
depth=1 (Fast Search)
  • Purpose: Quick personal facts or simple lookups from user memories
  • Performance: Sub-second response time
  • Examples: “What’s my phone number?”, “Where do I work?”
depth=2 (Balanced Synthesis) - Recommended Default
  • Purpose: Conversational responses that benefit from personal context
  • Performance: 3-5 seconds with AI synthesis
  • Examples: “How should I handle this work situation?”, “What have I been working on?”
depth=3 (Comprehensive Analysis)
  • Purpose: Complex analysis requiring deep document search
  • Performance: 20-30 seconds for thorough analysis
  • Examples: “Analyze all my learning patterns”, “Compare my productivity strategies”
For example, a jean_memory tool call at the recommended default depth:
{
  "tool_name": "jean_memory",
  "tool_params": {
    "user_message": "What have I been working on recently?",
    "is_new_conversation": false,
    "depth": 2
  }
}
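As a rough illustration, a client could map query intent to a depth level before building the tool call. The keyword and length heuristics below are assumptions for demonstration only, not part of the Jean Memory API:

```python
def choose_depth(user_message: str, needs_personal_context: bool) -> int:
    """Pick a jean_memory depth level with a simple heuristic.

    The keywords and word-count cutoff are illustrative assumptions;
    a real client would tune these to its own traffic.
    """
    msg = user_message.lower()
    if not needs_personal_context:
        return 0  # generic knowledge: skip context retrieval entirely
    if any(word in msg for word in ("analyze", "compare", "all my")):
        return 3  # deep document analysis (20-30s)
    if len(msg.split()) <= 6:
        return 1  # short factual lookup: direct search
    return 2  # recommended default: balanced synthesis

print(choose_depth("What is the capital of France?", False))  # 0
print(choose_depth("Analyze all my learning patterns", True))  # 3
```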

SDK Integration (Speed Modes)

For SDK users (React, Node.js, Python), use the traditional speed mode parameter:

Fast Mode

Purpose: Optimized for applications requiring sub-second response times with direct memory retrieval.
Performance: 0.5-1 second response time
Use Cases:
  • Real-time chat applications
  • Mobile applications with strict latency requirements
  • Quick factual lookups
  • Autocomplete and suggestion systems
Returns: Raw memory search results with a maximum of 10 relevant memories
await jeanMemory({
  user_message: "What are my meeting preferences?",
  is_new_conversation: false,
  needs_context: true,
  speed: "fast"
});
Balanced Mode

Purpose: Provides natural language synthesis of memories using Gemini 2.5 Flash for conversational AI interactions.
Performance: 3-5 seconds with Gemini 2.5 Flash processing
Technology: Powered by Google's Gemini 2.5 Flash model, optimized for adaptive thinking and cost efficiency
Use Cases:
  • Conversational AI chatbots
  • Personal assistant applications
  • Customer support systems
  • Educational tutoring platforms
Returns: AI-synthesized conversational response based on relevant memories
await jeanMemory({
  user_message: "How should I approach this work conflict?",
  is_new_conversation: false,
  needs_context: true,
  speed: "balanced"
});
Balanced mode is the recommended default for most conversational AI applications, providing the optimal balance between response quality and performance.

Autonomous Mode

Purpose: Intelligent orchestration that adaptively determines the appropriate level of context analysis based on conversation state and content complexity.
Performance: Variable latency depending on analysis requirements; can range from seconds to potentially longer than comprehensive mode for complex multi-step reasoning.
Intelligence: Autonomous mode analyzes the context to decide:
  • Whether new information should be saved
  • How much context to retrieve
  • What depth of analysis is needed
  • Whether to trigger background processes
Use Cases:
  • Complex planning and decision-making tasks
  • Multi-step reasoning requirements
  • Context-aware adaptive responses
  • Applications requiring intelligent workflow orchestration
Returns: Intelligently orchestrated response with adaptive context analysis
await jeanMemory({
  user_message: "Help me plan my week based on my goals and schedule",
  is_new_conversation: false,
  needs_context: true,
  speed: "autonomous"
});
Autonomous mode may take longer than other modes when performing complex analysis, as it prioritizes intelligent decision-making over consistent response times.
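The decision steps above can be pictured as a small per-turn planning function. Everything in this sketch (field names, keyword heuristics, the background task label) is hypothetical and only shows the shape of adaptive orchestration, not Jean Memory internals:

```python
from dataclasses import dataclass, field

@dataclass
class Plan:
    save_memory: bool
    context_depth: str            # "none" | "shallow" | "deep" (illustrative labels)
    background_tasks: list = field(default_factory=list)

def plan_turn(message: str, is_new_conversation: bool) -> Plan:
    """Decide what an autonomous orchestrator might do for one turn.
    All heuristics here are illustrative assumptions."""
    msg = message.lower()
    # First-person statements are candidates for saving as new memories.
    save = any(t in msg for t in ("i ", "my ", "i'm "))
    # Multi-step requests warrant deeper retrieval.
    deep = any(t in msg for t in ("plan", "compare", "analyze"))
    depth = "deep" if deep else ("shallow" if save or not is_new_conversation else "none")
    tasks = ["index_new_memory"] if save else []
    return Plan(save_memory=save, context_depth=depth, background_tasks=tasks)

plan = plan_turn("Help me plan my week based on my goals", False)
```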

Comprehensive Mode

Purpose: Extensive memory analysis with deep document search capabilities for research and detailed information retrieval.
Performance: 20-30 seconds for thorough analysis
Capabilities:
  • Deep document chunk search
  • Extensive memory correlation
  • Comprehensive context synthesis
  • Cross-document relationship analysis
Use Cases:
  • Research and analysis tasks
  • Detailed document exploration
  • Comprehensive information synthesis
  • Academic and professional research
Returns: Extensive memory analysis with document search results
await jeanMemory({
  user_message: "Analyze all mentions of productivity strategies in my documents",
  is_new_conversation: false,
  needs_context: true,
  speed: "comprehensive"
});
Comprehensive mode can also be accessed using speed: "deep" for backward compatibility.

Performance Comparison

MCP Interface (Depth Levels)

Depth | Response Time | Technology | Best For
----- | ------------- | ---------- | --------
0 | Immediate | No context retrieval | Generic knowledge
1 | 0.5-1s | Direct search | Quick personal facts
2 | 3-5s | Gemini 2.5 Flash synthesis | Conversational AI
3 | 20-30s | Deep document analysis | Research tasks

SDK Interface (Speed Modes)

Mode | Response Time | Technology | Best For
---- | ------------- | ---------- | --------
Fast | 0.5-1s | Direct search | Real-time interactions
Balanced | 3-5s | Gemini 2.5 Flash synthesis | Conversational AI
Autonomous | Variable | Intelligent orchestration | Complex reasoning
Comprehensive | 20-30s | Deep document analysis | Research tasks

Implementation Examples

React SDK

import { useJeanMemory } from '@jeanmemory/react';

function ChatComponent() {
  const { getContext } = useJeanMemory();
  
  // Fast mode for quick responses
  const handleQuickQuery = async (query: string) => {
    const response = await getContext(query, { mode: 'fast' });
    return response;
  };
  
  // Balanced mode for conversational responses
  const handleConversation = async (message: string) => {
    const response = await getContext(message, { mode: 'balanced' });
    return response;
  };
  
  // Autonomous mode for complex tasks
  const handleComplexTask = async (task: string) => {
    const response = await getContext(task, { mode: 'autonomous' });
    return response;
  };
}

Node.js SDK

import { JeanMemoryClient } from '@jeanmemory/node';

const client = new JeanMemoryClient({ apiKey: 'your_api_key' });

// Fast mode for APIs requiring quick responses
const quickResponse = await client.getContext({
  query: "User's last order status",
  speed: "fast"
});

// Balanced mode for natural conversation
const conversationalResponse = await client.getContext({
  query: "How can I improve my productivity?",
  speed: "balanced"
});

// Comprehensive mode for detailed analysis
const detailedAnalysis = await client.getContext({
  query: "Analyze my learning patterns from all documents",
  speed: "comprehensive"
});

Python SDK

from jeanmemory import JeanMemoryClient

client = JeanMemoryClient(api_key="your_api_key")

# Fast mode for quick lookups
quick_result = client.get_context(
    query="Meeting preferences",
    speed="fast"
)

# Balanced mode for conversational AI
conversation = client.get_context(
    query="What's the best way to handle team conflicts?",
    speed="balanced"
)

# Autonomous mode for intelligent decision-making
smart_response = client.get_context(
    query="Create a project timeline based on my goals",
    speed="autonomous"
)

Best Practices

Mode Selection Guidelines

  1. Default Choice: Use balanced mode for most conversational interactions
  2. Performance Critical: Choose fast mode when sub-second response is required
  3. Complex Analysis: Select autonomous mode for multi-step reasoning and adaptive responses
  4. Research Tasks: Use comprehensive mode for thorough document analysis
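The four guidelines above can be sketched as a selection helper. The latency thresholds come from the performance tables on this page; the function itself is an illustrative assumption, not part of any SDK:

```python
def select_speed_mode(latency_budget_s: float, needs_reasoning: bool,
                      needs_documents: bool) -> str:
    """Map application constraints to an SDK speed mode.
    Thresholds follow the performance tables above (illustrative)."""
    if needs_documents and latency_budget_s >= 20:
        return "comprehensive"   # 20-30s deep document analysis
    if needs_reasoning:
        return "autonomous"      # variable latency, adaptive orchestration
    if latency_budget_s < 3:
        return "fast"            # 0.5-1s direct search
    return "balanced"            # 3-5s synthesis, recommended default
```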

Optimization Tips

  • Fast Mode: Ideal for autocomplete, quick facts, and simple queries where raw data is sufficient
  • Balanced Mode: Perfect for chatbots, personal assistants, and natural conversation flows
  • Autonomous Mode: Best for planning, analysis, and context-dependent responses requiring intelligence
  • Comprehensive Mode: Use sparingly due to latency; excellent for detailed research and analysis

Error Handling

Always implement proper error handling for all modes, especially autonomous and comprehensive, whose response times can vary widely:
try {
  const response = await getContext(query, { mode: 'autonomous' });
  // Handle successful response
} catch (error) {
  // Handle timeout or processing errors
  console.error('Context retrieval failed:', error);
}
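For the slower modes, a retry with backoff is often worth adding on top of basic error handling. The helper below is a generic sketch: `fn` stands in for whichever SDK call you wrap, and the retry counts and catch-all exception handling are assumptions to adapt to the SDK's actual error types:

```python
import time

def with_retry(fn, retries: int = 1, backoff_s: float = 0.0):
    """Call fn(), retrying up to `retries` extra times on failure.
    Useful for autonomous/comprehensive modes with variable latency."""
    last_err = None
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception as err:  # in practice, catch the SDK's error types
            last_err = err
            if attempt < retries:
                time.sleep(backoff_s)
    raise last_err

# Usage with a hypothetical SDK call:
# result = with_retry(lambda: client.get_context(query=q, speed="autonomous"),
#                     retries=2, backoff_s=1.0)
```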

Response Formats

Fast Mode Response

{
  "status": "success",
  "memories": [
    {
      "id": "mem_123",
      "content": "User prefers morning meetings at 9 AM",
      "score": 0.92,
      "created_at": "2024-01-15T09:00:00Z"
    }
  ],
  "total_found": 5,
  "response_time": 0.8
}
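Given a fast-mode payload shaped like the one above, a client can rank and filter the raw memories locally before using them. The score threshold and the second sample memory below are illustrative:

```python
def top_memories(response: dict, min_score: float = 0.8) -> list:
    """Return memory contents above a relevance threshold, best first."""
    hits = [m for m in response.get("memories", []) if m["score"] >= min_score]
    hits.sort(key=lambda m: m["score"], reverse=True)
    return [m["content"] for m in hits]

# Sample payload mirroring the fast-mode response format (second entry invented):
resp = {
    "status": "success",
    "memories": [
        {"id": "mem_123", "content": "User prefers morning meetings at 9 AM",
         "score": 0.92, "created_at": "2024-01-15T09:00:00Z"},
        {"id": "mem_456", "content": "User dislikes Friday meetings",
         "score": 0.61, "created_at": "2024-01-10T14:00:00Z"},
    ],
}
print(top_memories(resp))  # ['User prefers morning meetings at 9 AM']
```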

Balanced Mode Response

{
  "status": "success",
  "question": "How do I handle work stress?",
  "answer": "Based on your memories, you typically handle work stress by taking short walks, practicing deep breathing, and scheduling regular breaks. You've mentioned that listening to calm music during breaks is particularly effective for you.",
  "memories_found": 8,
  "total_duration": 3.2
}

Technical Implementation

The speed modes are implemented at the orchestration layer, where the jean_memory tool routes requests to different processing pathways based on the specified speed parameter:
  • Fast: Direct search_memory with explicit result limits
  • Balanced: ask_memory with Gemini 2.5 Flash synthesis
  • Autonomous: Full orchestration with intelligent context analysis
  • Comprehensive: deep_memory_query with document chunk search
This architecture ensures optimal performance for each use case while maintaining consistent API interfaces across all modes.
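That routing can be pictured as a simple dispatch table. The handler names echo the tools listed above (search_memory, ask_memory, deep_memory_query), but the stand-in implementations and the wiring are a sketch, not the actual orchestration code:

```python
# Stand-in handlers named after the tools described above.
def search_memory(q): return f"search:{q}"
def ask_memory(q): return f"ask:{q}"
def orchestrate(q): return f"auto:{q}"
def deep_memory_query(q): return f"deep:{q}"

SPEED_ROUTES = {
    "fast": search_memory,            # direct search, explicit result limits
    "balanced": ask_memory,           # Gemini 2.5 Flash synthesis
    "autonomous": orchestrate,        # full orchestration
    "comprehensive": deep_memory_query,
    "deep": deep_memory_query,        # backward-compatible alias
}

def route(query: str, speed: str = "balanced"):
    """Dispatch a request to the pathway for the given speed mode."""
    return SPEED_ROUTES[speed](query)
```

A dict dispatch keeps the API surface consistent across modes while letting each pathway evolve independently.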