High-Performance AI Agent Memory Management System - Rust Implementation
Languages: English | 简体中文 | 日本語 | Français | العربية | Deutsch | Español | 한국어
📌 Version Note: This is the Personal/Enterprise Single-Tenant Edition. For SaaS multi-tenant features, see the `feature/saas-multi-tenant` branch.
MemoryOS-Rust is a high-performance AI Agent memory management system built with Rust + Tokio, featuring a 3-Tier memory architecture (STM/MTM/LTM), OpenAI API compatibility, and a design target of 100,000+ concurrent users (end-to-end benchmarks pending).
This edition is optimized for:
- 👤 Individual developers and researchers
- 🏢 Single enterprise/organization deployments
- 🔒 On-premise installations with full data control
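The 3-Tier memory architecture (STM/MTM/LTM) mentioned above can be sketched roughly as follows. This is an illustrative model only: the type and field names, the access-count promotion rule, and the threshold are all assumptions, not the crate's actual API.

```rust
// Hypothetical sketch of the STM/MTM/LTM tier model.
// Names and the promotion rule are illustrative, not MemoryOS-Rust's real types.
use std::time::SystemTime;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MemoryTier {
    ShortTerm, // STM: recent dialogue turns
    MidTerm,   // MTM: summarized sessions
    LongTerm,  // LTM: persistent user knowledge
}

struct MemoryItem {
    content: String,
    tier: MemoryTier,
    access_count: u32,
    last_access: SystemTime,
}

impl MemoryItem {
    /// Promote an item one tier up once it crosses an access threshold.
    /// The threshold is a placeholder; a real system would tune it.
    fn maybe_promote(&mut self, threshold: u32) {
        if self.access_count >= threshold {
            self.tier = match self.tier {
                MemoryTier::ShortTerm => MemoryTier::MidTerm,
                MemoryTier::MidTerm | MemoryTier::LongTerm => MemoryTier::LongTerm,
            };
            self.access_count = 0; // reset counter after promotion
        }
    }
}

fn main() {
    let mut item = MemoryItem {
        content: "user prefers Rust".to_string(),
        tier: MemoryTier::ShortTerm,
        access_count: 5,
        last_access: SystemTime::now(),
    };
    item.maybe_promote(5);
    println!("{:?}", item.tier); // promoted out of ShortTerm
}
```

In the real system, promoted items are persisted to the vector store (Qdrant by default) rather than kept in a plain struct.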
- 🚀 High Performance: Rust + Tokio async runtime, designed for high concurrency (end-to-end QPS/latency TBD; Criterion microbenchmarks available).
- 🧠 Unified Vector Storage: All memory tiers (STM/MTM/LTM) use vector databases for persistent storage.
- 💾 3 Vector Database Options: Qdrant (default), Chroma (lightweight), Pinecone (cloud-hosted).
- ⚡ FAQ Heat Tracking: High-frequency Q&A detection with heat score calculation and auto-promotion logic.
- 🔌 Universal Gateway: OpenAI protocol compatible, 10 LLM adapters (OpenAI, Gemini, Claude, Ollama, DeepSeek, OpenRouter, Azure, Groq, Cohere, Mistral).
- 🕸️ Graph Memory: Entity extraction + relation extraction + graph query API (/v1/graph) + DFS path query (v0.4.0).
- 📚 Knowledge Export: FAQ export to Local Markdown + S3 (OpenDAL) + Confluence (REST API) (v0.3.0).
- 🛡️ Security Shield: PII sanitization (email/phone/credit card/SSN/API key), prompt injection defense (17 patterns), IP defense system.
- 🤖 3-Tier LLM Router: Routes requests to different model tiers based on input complexity (heuristic-based) + Tier 0 FAQ direct hit (v0.3.0).
- 🔄 Coordination Layer: Redis/NATS for distributed coordination (Session, Lock, Cache, Message Queue).
- 🎯 6 Performance Optimization Modules: Bloom Filter, LRU Cache, Batch Processing, Heat Buffer, Similarity Filter, Incremental Summary.
- 🎨 Multimodal Memory: QdrantMultiModalStorage + HTTP API (/v1/multimodal/*) (v0.5.0, experimental).
- 🏷️ Memory Versioning & Tags: Version history + tag management + export/import (v0.6.0).
- 🔐 Security Hardening: AES-256-GCM encryption + persistent audit log (JSONL) + GDPR records (JSON) (v0.8.0~v0.9.0).
- 📊 Prometheus Observability: /metrics endpoint + HTTP/Router/FAQ/LLM full-chain metrics (v0.10.0).
- 🧠 LLM FAQ Classification: Automatic FAQ categorization via LLM + /v1/admin/faq/classify API (v0.10.0).
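The FAQ heat-tracking feature above (heat score + auto-promotion) can be sketched with an exponential-decay score. The formula and constants here are assumptions for illustration, not the actual MemoryOS-Rust implementation:

```rust
// Illustrative FAQ heat score: exponential time decay plus a per-hit bonus.
// The real scoring formula and thresholds may differ.

struct FaqEntry {
    question: String,
    hits: u64,
    heat: f64,
}

impl FaqEntry {
    /// Decay the existing heat with half-life `half_life_secs`, given
    /// `elapsed_secs` since the last update, then add 1.0 per new hit.
    fn record_hit(&mut self, elapsed_secs: f64, half_life_secs: f64) {
        let decay = (-std::f64::consts::LN_2 * elapsed_secs / half_life_secs).exp();
        self.heat = self.heat * decay + 1.0;
        self.hits += 1;
    }

    /// An entry whose heat crosses the threshold becomes a Tier 0 FAQ
    /// candidate (auto-promotion).
    fn is_promotable(&self, threshold: f64) -> bool {
        self.heat >= threshold
    }
}

fn main() {
    let mut entry = FaqEntry {
        question: "How do I reset my password?".into(),
        hits: 0,
        heat: 0.0,
    };
    // Three hits in quick succession (no meaningful decay between them).
    for _ in 0..3 {
        entry.record_hit(0.0, 3600.0);
    }
    println!("heat = {:.2}, promotable = {}", entry.heat, entry.is_promotable(3.0));
    // heat = 3.00, promotable = true
}
```

The decay term keeps stale questions from staying "hot" forever, while bursts of repeated questions quickly cross the promotion threshold.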
| Feature | MemoryOS-Rust | Mem0 | Advantage |
|---|---|---|---|
| Language | Rust 🦀 | Python 🐍 | Compiled, no GIL (speedup not yet benchmarked) |
| Performance | TBD (not benchmarked) | ~1K QPS | Needs testing |
| FAQ Response | TBD (not benchmarked) | ~100ms | Needs testing |
| Memory Overhead | TBD (not benchmarked) | ~500MB | Needs testing |
| LLM Adapters | 10 | 10+ | Similar |
| Vector DBs | 3 (Qdrant, Chroma, Pinecone) | 5+ | Good coverage |
| Graph Memory | ✅ entity/relation extraction + graph query | ✅ Neo4j | Similar capabilities |
| Hot Config Reload | ✅ 5s auto-refresh | ❌ | Unique feature |
| Smart Routing | ✅ Tier 0 FAQ + heuristic tiers | — | MemoryOS adds Tier 0 FAQ direct hit |
| Cost Savings | TBD (not measured) | ~50% | Needs testing |
| Production Ready | Release candidate (pre v1.0) | ✅ Mature | Mem0 is more mature |
When to choose MemoryOS-Rust:
- Want a Rust-based memory layer for AI Agents
- Need tight resource control and low overhead
- Prefer compiled language performance characteristics
- Building in the Rust ecosystem
When to choose Mem0:
- Python ecosystem preference
- Need more vector DB options
- Mature community and examples
| Spec | Minimum (Dev) | Recommended (Prod) |
|---|---|---|
| CPU | 2 vCPU | 4+ vCPU |
| RAM | 4GB | 16GB+ |
| Disk | 10GB SSD | 100GB NVMe |
| OS | Linux / macOS | Linux (K8s) |
Start the dependencies:

```bash
docker-compose up -d
```

Create a `.env` file (optional) or set environment variables:

```bash
export GEMINI_API_KEY="your_key_here"
export QDRANT_API_KEY="your_qdrant_key"
```

Copy the config file:

```bash
cp config.example.toml config.toml
# Edit config.toml to enable desired modules (Router, Wiki, etc.)
```

Run the gateway:

```bash
# Default full-featured mode
cargo run --release --bin memoryos-gateway

# (Advanced) Enable specific features only (if Cargo.toml supports it)
# cargo run --release --no-default-features --features "redis,qdrant"
```

Check that the service is up:

```bash
curl http://localhost:8080/health/status
```

Detailed Guide: docs/QUICKSTART.md
```mermaid
graph TD
    Client[User Client] -->|OpenAI Protocol| Gateway
    subgraph MemoryOS-Rust
        Gateway -->|Auth & Shield| Router{LLM Router}
        Router -->|Tier 1: Simple| SmallLLM[Small Model]
        Router -->|Tier 2: Medium| MediumLLM[Medium Model]
        Router -->|Tier 3: Complex| LargeLLM[Large Model]
        Gateway -->|Async Event| Queue[NATS/Redis]
        Queue --> Worker
        Worker -->|Summarize| VectorDB[(Qdrant)]
        Worker -->|Export| Wiki[Local/S3/Confluence]
    end
```
Detailed Architecture: docs/ARCHITECTURE.md
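The heuristic tier routing shown in the diagram (Tier 0 FAQ direct hit, then Tier 1/2/3 by input complexity) can be sketched as below. The signals used here (word count, keyword check) and the thresholds are assumptions; the actual Router module may weigh different features:

```rust
// Hypothetical sketch of the heuristic LLM tier router.
// Signals and thresholds are illustrative only.

#[derive(Debug, PartialEq)]
enum Tier {
    Tier0Faq,    // direct FAQ cache hit, no LLM call
    Tier1Small,  // short, simple prompts -> small model
    Tier2Medium, // moderate prompts -> medium model
    Tier3Large,  // long or reasoning-heavy prompts -> large model
}

fn route(prompt: &str, faq_hit: bool) -> Tier {
    // Tier 0 short-circuits everything: a cached FAQ answer is returned
    // without calling any model.
    if faq_hit {
        return Tier::Tier0Faq;
    }
    let words = prompt.split_whitespace().count();
    let lower = prompt.to_lowercase();
    let complex = ["analyze", "prove", "design", "compare"]
        .iter()
        .any(|kw| lower.contains(kw));
    match (words, complex) {
        (_, true) => Tier::Tier3Large,
        (0..=20, false) => Tier::Tier1Small,
        (21..=200, false) => Tier::Tier2Medium,
        _ => Tier::Tier3Large,
    }
}

fn main() {
    println!("{:?}", route("What is Rust?", false)); // Tier1Small
    println!("{:?}", route("What is Rust?", true));  // Tier0Faq
    println!("{:?}", route("Please analyze this codebase for bottlenecks", false)); // Tier3Large
}
```

Routing cheap prompts to small models is where the claimed cost savings would come from, though (as noted above) those savings have not yet been measured for this project.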
- Quick Start - Get started in 5 minutes
- User Manual - Complete usage guide 📖
- Architecture - System design (Graph/Router)
- API Reference - API documentation
- Development Guide - Development setup
- Deployment Guide - K8s/Docker deployment
- K3s Auto-Deploy - One-click K8s cluster 🚀
- Authentication - API Key management
- FAQ System - Auto-promote high-frequency Q&A ⚡
- Optimization Analysis - Algorithm optimization strategies 🚀
- Usage Guide - How to use optimization modules ⚡
- Design Principles - Design philosophy & implementation ⭐
- Comparison - vs Mem0 analysis ⭐
- Roadmap - v0.2.0 → v1.0.0 planning
- API Key Auth - Enterprise auth system (Qdrant persistence) 🔒
- Work Log - Who's doing what, for collaboration ⭐⭐⭐
- Project State - AI context recovery (machine-readable)
- Changelog - Version history
- Contributing - Contribution guidelines
- Documentation Index - Complete docs navigation
⭐ Recommended: Design Principles and Comparison for system design insights
Version: 0.10.0
Status: Release Candidate (pre v1.0)
| Phase | Module | Status | Notes |
|---|---|---|---|
| Phase 1 | Foundation (Config/Log) | Done | Functional |
| Phase 2 | Gateway & Adapters | Done | Basic implementation |
| Phase 3 | Storage (Redis/Qdrant) | Done | Needs production testing |
| Phase 4 | Intelligence (Router/Shield) | Done | Tier routing + FAQ Tier 0 |
| Phase 5 | Worker & Async | Done | Functional |
| Phase 6 | Wiki Export | Done | Local + S3 + Confluence |
| Phase 7 | Graph Memory | Done | Entity/relation extraction + graph query |
| Phase 8 | Multimodal | Done | Qdrant storage + HTTP endpoints (experimental) |
| Phase 9 | Security | Done | AES-256-GCM + audit + GDPR persistence |
| Phase 10 | Benchmarks | Done | Criterion microbenchmarks (see docs/PERFORMANCE_REPORT.md) |
| Phase 11 | Observability | Done | Prometheus /metrics + full-chain instrumentation |
| Phase 12 | LLM FAQ | Done | LLM-based FAQ classification + /v1/admin/faq/classify |
Note: End-to-end performance claims (QPS, latency) have not been independently validated yet. Criterion microbenchmark results are available in
docs/PERFORMANCE_REPORT.md.
- Language: Rust 1.93+
- Async Runtime: Tokio
- Web Framework: Axum
- Short-term Storage: Redis
- Vector Storage: Qdrant
- LLM: OpenAI, Gemini, Claude, Ollama, DeepSeek, OpenRouter, Azure, Groq, Cohere, Mistral (10 adapters)
Contributions are welcome! Please follow this workflow:
- 📖 Read Development Guide
- 📝 Log your task in WORK_LOG.md
- 🔄 Pull latest code: `git pull`
- 📊 Update progress in WORK_LOG.md daily
- 🐛 Log issues immediately
- 🔴 Update status if blocked
- ✅ Mark task as complete in WORK_LOG.md
- 📝 Update CHANGELOG.md
- 🚀 Submit code: `git commit && git push`
Collaboration: We use WORK_LOG.md (human) + docs/state.json (AI) dual-track recording for transparent collaboration.
Detailed Guide: CONTRIBUTING.md
Current Status: Active Development
This project is in early development. We are actively working on:
- 🐛 Bug fixes and security updates
- 📚 Documentation improvements
- 💡 Community-driven enhancements
See: MAINTENANCE.md for detailed maintenance plan
Looking for multi-tenant SaaS features? Check out the feature/saas-multi-tenant branch, which includes:
- 🏢 Multi-Tenant Architecture: Complete tenant isolation
- 💳 Billing Integration: Usage tracking and quota management
- 🔑 Flexible LLM Configuration: Per-tenant API key management
- 📊 Usage Analytics: Detailed per-tenant metrics
- GitHub Issues: Report Issues
- GitHub Discussions: Join Discussions
- Email: 246803628+TelivANT@users.noreply.github.com
- Security Issues: Please email with the subject `[SECURITY]`
Apache 2.0 License - See LICENSE
- Original Project: MemoryOS - Python implementation
- Paper: Memory OS of AI Agent
Version: 0.10.0 (Personal Edition) | Updated: 2026-02-20