Overview
Jean Memory provides different approaches to control context depth and response time based on your integration method:
- SDK Integration: Uses speed modes (fast, balanced, autonomous, comprehensive)
- MCP Integration: Uses depth levels (0, 1, 2, 3) for simplified tool calling
Fast Mode
Direct memory search with sub-second response times
Balanced Mode
AI synthesis with Gemini 2.5 Flash for conversational responses
Autonomous Mode
Intelligent orchestration with adaptive decision-making
Comprehensive Mode
Deep document analysis with extensive memory search
MCP Integration (Depth Levels)
For MCP clients (like Claude, Cursor, or custom MCP implementations), use the simplified depth parameter.
Depth Levels
depth=0 (No Context)
- Purpose: For generic knowledge questions that don’t require personal context
- Performance: Immediate response (information is still saved in the background)
- Examples: “What is the capital of France?”, “Explain quantum physics”
depth=1 (Fast Context)
- Purpose: Quick personal facts or simple lookups from user memories
- Performance: Sub-second response time
- Examples: “What’s my phone number?”, “Where do I work?”
depth=2 (Balanced Context)
- Purpose: Conversational responses that benefit from personal context
- Performance: 3-5 seconds with AI synthesis
- Examples: “How should I handle this work situation?”, “What have I been working on?”
depth=3 (Comprehensive Context)
- Purpose: Complex analysis requiring deep document search
- Performance: 20-30 seconds for thorough analysis
- Examples: “Analyze all my learning patterns”, “Compare my productivity strategies”
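The depth levels above can be sketched as a standard MCP tools/call request. The JSON-RPC envelope below follows the usual MCP shape, but the argument names ("user_message", "depth") are assumptions based on this page, not a confirmed tool schema:

```python
import json

# Hypothetical MCP tools/call payload for the jean_memory tool.
# "user_message" and "depth" are illustrative argument names.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "jean_memory",
        "arguments": {
            "user_message": "What have I been working on?",
            "depth": 2,  # balanced synthesis, ~3-5 s
        },
    },
}
print(json.dumps(request, indent=2))
```

Setting `"depth": 0` would skip context retrieval entirely while still saving the message in the background.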
SDK Integration (Speed Modes)
For SDK users (React, Node.js, Python), use the traditional speed mode parameter.
Fast Mode
Purpose: Optimized for applications requiring sub-second response times with direct memory retrieval.
Performance: 0.5-1 second response time
Use Cases:
- Real-time chat applications
- Mobile applications with strict latency requirements
- Quick factual lookups
- Autocomplete and suggestion systems
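As a minimal sketch, two request bodies contrasting fast mode with the balanced default might look like this; the field names mirror the parameters on this page, but the exact SDK request shape is an assumption:

```python
# Illustrative SDK request bodies; "query" and "speed" are assumed field names.
quick_lookup = {"query": "Where do I work?", "speed": "fast"}  # ~0.5-1 s, raw matches
conversation = {
    "query": "How should I handle this work situation?",
    "speed": "balanced",  # ~3-5 s, synthesized response
}
for req in (quick_lookup, conversation):
    print(req["speed"], "->", req["query"])
```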
Balanced Mode (Recommended)
Purpose: Provides natural language synthesis of memories using Gemini 2.5 Flash for conversational AI interactions.
Performance: 3-5 seconds with Gemini 2.5 Flash processing
Technology: Powered by Google’s Gemini 2.5 Flash model, optimized for adaptive thinking and cost efficiency
Use Cases:
- Conversational AI chatbots
- Personal assistant applications
- Customer support systems
- Educational tutoring platforms
Balanced mode is the recommended default for most conversational AI applications, providing the optimal balance between response quality and performance.
Autonomous Mode
Purpose: Intelligent orchestration that adaptively determines the appropriate level of context analysis based on conversation state and content complexity.
Performance: Variable latency depending on analysis requirements; it can range from seconds to potentially longer than comprehensive mode for complex multi-step reasoning.
Intelligence: Autonomous mode analyzes the context to decide:
- Whether new information should be saved
- How much context to retrieve
- What depth of analysis is needed
- Whether to trigger background processes
Use Cases:
- Complex planning and decision-making tasks
- Multi-step reasoning requirements
- Context-aware adaptive responses
- Applications requiring intelligent workflow orchestration
Autonomous mode may take longer than other modes when performing complex analysis, as it prioritizes intelligent decision-making over consistent response times.
Comprehensive Mode
Purpose: Extensive memory analysis with deep document search capabilities for research and detailed information retrieval.
Performance: 20-30 seconds for thorough analysis
Capabilities:
- Deep document chunk search
- Extensive memory correlation
- Comprehensive context synthesis
- Cross-document relationship analysis
Use Cases:
- Research and analysis tasks
- Detailed document exploration
- Comprehensive information synthesis
- Academic and professional research
Comprehensive mode can also be accessed using speed: "deep" for backward compatibility.
Performance Comparison
MCP Interface (Depth Levels)
Depth | Response Time | Technology | Best For |
---|---|---|---|
0 | Immediate | No context retrieval | Generic knowledge |
1 | 0.5-1s | Direct search | Quick personal facts |
2 | 3-5s | Gemini 2.5 Flash synthesis | Conversational AI |
3 | 20-30s | Deep document analysis | Research tasks |
SDK Interface (Speed Modes)
Mode | Response Time | Technology | Best For |
---|---|---|---|
Fast | 0.5-1s | Direct search | Real-time interactions |
Balanced | 3-5s | Gemini 2.5 Flash synthesis | Conversational AI |
Autonomous | Variable | Intelligent orchestration | Complex reasoning |
Comprehensive | 20-30s | Deep document analysis | Research tasks |
Implementation Examples
React SDK
Node.js SDK
Python SDK
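The SDK example bodies did not survive extraction; as a placeholder, a hedged Python sketch follows. The `JeanMemoryClient` class, the `get_context` method, and its parameters are illustrative assumptions, not the confirmed SDK API:

```python
class JeanMemoryClient:
    """Stand-in stub so this sketch runs; the real SDK client will differ."""

    def __init__(self, api_key: str):
        self.api_key = api_key

    def get_context(self, query: str, speed: str = "balanced") -> dict:
        # A real client would call the Jean Memory API here.
        return {"query": query, "speed": speed, "text": "<synthesized context>"}


client = JeanMemoryClient(api_key="YOUR_API_KEY")
ctx = client.get_context("What have I been working on?", speed="balanced")
print(ctx["speed"])  # balanced
```

The React and Node.js SDKs expose the same speed parameter through their respective idioms.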
Best Practices
Mode Selection Guidelines
- Default Choice: Use balanced mode for most conversational interactions
- Performance Critical: Choose fast mode when sub-second response is required
- Complex Analysis: Select autonomous mode for multi-step reasoning and adaptive responses
- Research Tasks: Use comprehensive mode for thorough document analysis
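The selection guidelines above can be encoded as a small helper. The function and its flag names are illustrative simplifications, not product logic:

```python
# Illustrative mode picker mirroring the guidelines above.
def pick_speed(needs_subsecond: bool = False,
               deep_research: bool = False,
               adaptive: bool = False) -> str:
    if needs_subsecond:
        return "fast"           # performance critical
    if deep_research:
        return "comprehensive"  # thorough document analysis
    if adaptive:
        return "autonomous"     # multi-step, context-dependent reasoning
    return "balanced"           # recommended default

print(pick_speed())  # balanced
```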
Optimization Tips
- Fast Mode: Ideal for autocomplete, quick facts, and simple queries where raw data is sufficient
- Balanced Mode: Perfect for chatbots, personal assistants, and natural conversation flows
- Autonomous Mode: Best for planning, analysis, and context-dependent responses requiring intelligence
- Comprehensive Mode: Use sparingly due to latency; excellent for detailed research and analysis
Error Handling
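An illustrative timeout-and-fallback pattern for the variable-latency modes is sketched below. The `fetch_context` function is a stand-in stub, not the real SDK call; the point is the structure: a deadline on the slow path, then graceful degradation to the faster synthesis path:

```python
def fetch_context(query: str, speed: str, timeout_s: float) -> str:
    # Stand-in for a real SDK/HTTP call that can exceed its deadline.
    if speed == "comprehensive" and timeout_s < 30:
        raise TimeoutError("deep analysis exceeded timeout")
    return f"context via {speed}"


def get_context_with_fallback(query: str) -> str:
    try:
        # Comprehensive analysis can take 20-30 s; this deadline is too tight
        # on purpose, to exercise the fallback branch.
        return fetch_context(query, speed="comprehensive", timeout_s=10)
    except TimeoutError:
        # Degrade gracefully to the faster synthesis path.
        return fetch_context(query, speed="balanced", timeout_s=10)


print(get_context_with_fallback("Analyze my learning patterns"))  # context via balanced
```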
Always implement proper error handling for all modes, especially autonomous and comprehensive modes, which may have variable response times.
Response Formats
Fast Mode Response
Balanced Mode Response
Technical Implementation
The speed modes are implemented at the orchestration layer, where the jean_memory tool routes requests to different processing pathways based on the specified speed parameter:
- Fast: Direct search_memory with explicit result limits
- Balanced: ask_memory with Gemini 2.5 Flash synthesis
- Autonomous: Full orchestration with intelligent context analysis
- Comprehensive: deep_memory_query with document chunk search