MemoryOS-Rust

High-Performance AI Agent Memory Management System - Rust Implementation


Languages: English | 简体中文 | 日本語 | Français | العربية | Deutsch | Español | 한국어

📌 Version Note: This is the Personal/Enterprise Single-Tenant Edition. For SaaS multi-tenant features, see the feature/saas-multi-tenant branch.


🎯 Overview

MemoryOS-Rust is a high-performance AI Agent memory management system built with Rust + Tokio, featuring a 3-Tier memory architecture (STM/MTM/LTM), OpenAI API compatibility, and support for 100,000+ concurrent users.

This edition is optimized for:

  • 👤 Individual developers and researchers
  • 🏢 Single enterprise/organization deployments
  • 🔒 On-premise installations with full data control
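
The three tiers named above form a promotion ladder: recent dialogue lands in short-term memory and moves into longer-lived tiers as it proves useful. The Rust sketch below only illustrates that idea; the type and function names are assumptions, not the repository's actual API.

/// Illustrative memory tiers (not the repository's real types).
#[derive(Debug, Clone, Copy, PartialEq)]
enum MemoryTier {
    ShortTerm, // STM: recent dialogue turns
    MidTerm,   // MTM: summarized, still-warm context
    LongTerm,  // LTM: durable knowledge and user profile
}

/// Promote a memory item one tier up once it has proven useful enough.
fn promote(tier: MemoryTier) -> MemoryTier {
    match tier {
        MemoryTier::ShortTerm => MemoryTier::MidTerm,
        MemoryTier::MidTerm | MemoryTier::LongTerm => MemoryTier::LongTerm,
    }
}

fn main() {
    let mut tier = MemoryTier::ShortTerm;
    tier = promote(tier); // STM -> MTM after a summarization pass
    tier = promote(tier); // MTM -> LTM once it keeps getting referenced
    println!("{:?}", tier);
}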

✨ Key Features

  • 🚀 High Performance: Rust + Tokio async runtime, designed for high concurrency (end-to-end QPS/latency TBD; Criterion microbenchmarks available).
  • 🧠 Unified Vector Storage: All memory tiers (STM/MTM/LTM) use vector databases for persistent storage.
  • 💾 3 Vector Database Options: Qdrant (default), Chroma (lightweight), Pinecone (cloud-hosted).
  • FAQ Heat Tracking: High-frequency Q&A detection with heat score calculation and auto-promotion logic (see the sketch after this list).
  • 🔌 Universal Gateway: OpenAI protocol compatible, 10 LLM adapters (OpenAI, Gemini, Claude, Ollama, DeepSeek, OpenRouter, Azure, Groq, Cohere, Mistral).
  • 🕸️ Graph Memory: Entity extraction + relation extraction + graph query API (/v1/graph) + DFS path query (v0.4.0).
  • 📚 Knowledge Export: FAQ export to Local Markdown + S3 (OpenDAL) + Confluence (REST API) (v0.3.0).
  • 🛡️ Security Shield: PII sanitization (email/phone/credit card/SSN/API key), prompt injection defense (17 patterns), IP defense system.
  • 🤖 3-Tier LLM Router: Routes requests to different model tiers based on input complexity (heuristic-based) + Tier 0 FAQ direct hit (v0.3.0).
  • 🔄 Coordination Layer: Redis/NATS for distributed coordination (Session, Lock, Cache, Message Queue).
  • 🎯 6 Performance Optimization Modules: Bloom Filter, LRU Cache, Batch Processing, Heat Buffer, Similarity Filter, Incremental Summary.
  • 🎨 Multimodal Memory: QdrantMultiModalStorage + HTTP API (/v1/multimodal/*) (v0.5.0, experimental).
  • 🏷️ Memory Versioning & Tags: Version history + tag management + export/import (v0.6.0).
  • 🔐 Security Hardening: AES-256-GCM encryption + persistent audit log (JSONL) + GDPR records (JSON) (v0.8.0~v0.9.0).
  • 📊 Prometheus Observability: /metrics endpoint + HTTP/Router/FAQ/LLM full-chain metrics (v0.10.0).
  • 🧠 LLM FAQ Classification: Automatic FAQ categorization via LLM + /v1/admin/faq/classify API (v0.10.0).
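
To make the FAQ heat-tracking idea concrete, here is a minimal Rust sketch of a hit counter with time decay and a promotion threshold. It is an illustration only; the struct, the one-hour half-life, and the threshold are assumptions rather than the project's real implementation.

use std::time::{Duration, Instant};

// Hypothetical heat tracker for a single FAQ entry (illustration only;
// field names and the 1-hour half-life are assumptions, not the real code).
struct FaqHeat {
    hits: u32,
    last_hit: Instant,
}

impl FaqHeat {
    fn record_hit(&mut self) {
        self.hits += 1;
        self.last_hit = Instant::now();
    }

    // Heat score: raw hit count damped by how long ago the last hit was.
    fn heat_score(&self) -> f64 {
        let half_life = Duration::from_secs(3600).as_secs_f64();
        let age = self.last_hit.elapsed().as_secs_f64();
        self.hits as f64 * 0.5f64.powf(age / half_life)
    }

    // Promote into the Tier 0 FAQ cache once the score crosses a threshold.
    fn should_promote(&self, threshold: f64) -> bool {
        self.heat_score() >= threshold
    }
}

fn main() {
    let mut entry = FaqHeat { hits: 0, last_hit: Instant::now() };
    for _ in 0..10 {
        entry.record_hit();
    }
    println!("score = {:.2}, promote = {}", entry.heat_score(), entry.should_promote(5.0));
}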

vs Mem0 Comparison

| Feature | MemoryOS-Rust | Mem0 | Advantage |
|---|---|---|---|
| Language | Rust 🦀 | Python 🐍 | 5-10x faster |
| Performance | TBD (not benchmarked) | ~1K QPS | Needs testing |
| FAQ Response | TBD (not benchmarked) | ~100ms | Needs testing |
| Memory Overhead | TBD (not benchmarked) | ~500MB | Needs testing |
| LLM Adapters | 10 | 10+ | Similar |
| Vector DBs | 3 (Qdrant, Chroma, Pinecone) | 5+ | Good coverage |
| Graph Memory | ✅ Entity/relation extraction + graph query | ✅ Neo4j | Similar capabilities |
| Hot Config Reload | ✅ 5s auto-refresh | ❌ | Unique feature |
| Smart Routing | ✅ Tier 0 FAQ + heuristic tiers | ⚠️ Basic | MemoryOS has Tier 0 |
| Cost Savings | TBD (not measured) | ~50% | Needs testing |
| Production Ready | Release candidate (pre v1.0) | ✅ Mature | Mem0 is more mature |

When to choose MemoryOS-Rust:

  • Want a Rust-based memory layer for AI Agents
  • Need tight resource control and low overhead
  • Prefer compiled language performance characteristics
  • Building in the Rust ecosystem

When to choose Mem0:

  • Python ecosystem preference
  • Need more vector DB options
  • Mature community and examples

💻 System Requirements

| Spec | Minimum (Dev) | Recommended (Prod) |
|---|---|---|
| CPU | 2 vCPU | 4+ vCPU |
| RAM | 4GB | 16GB+ |
| Disk | 10GB SSD | 100GB NVMe |
| OS | Linux / macOS | Linux (K8s) |

🚀 Quick Start

1. Start Dependencies

docker-compose up -d

2. Configuration

Create .env file (optional) or set environment variables:

export GEMINI_API_KEY="your_key_here"
export QDRANT_API_KEY="your_qdrant_key"

Copy config file:

cp config.example.toml config.toml
# Edit config.toml to enable desired modules (Router, Wiki, etc.)

3. Run

# Default full-featured mode
cargo run --release --bin memoryos-gateway

# (Advanced) Build with only the features you need (if they are defined in Cargo.toml)
# cargo run --release --no-default-features --features "redis,qdrant"

4. Test

curl http://localhost:8080/health/status
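
Since the gateway speaks the OpenAI protocol, an OpenAI-style chat request should also work. The sketch below assumes the standard /v1/chat/completions path and uses the reqwest and serde_json crates (with reqwest's blocking and json features enabled); the model name is a placeholder, so adjust it to whatever your configured adapter expects.

use serde_json::json;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Hypothetical request against the gateway's OpenAI-compatible endpoint.
    let client = reqwest::blocking::Client::new();
    let resp = client
        .post("http://localhost:8080/v1/chat/completions")
        .json(&json!({
            "model": "gemini-1.5-flash", // placeholder model name
            "messages": [
                { "role": "user", "content": "What is MemoryOS-Rust?" }
            ]
        }))
        .send()?;
    println!("{}", resp.text()?);
    Ok(())
}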

Detailed Guide: docs/QUICKSTART.md


🏗️ Architecture

graph TD
    Client[User Client] -->|OpenAI Protocol| Gateway
    subgraph MemoryOS-Rust
        Gateway -->|Auth & Shield| Router{LLM Router}
        Router -->|Tier 1: Simple| SmallLLM[Small Model]
        Router -->|Tier 2: Medium| MediumLLM[Medium Model]
        Router -->|Tier 3: Complex| LargeLLM[Large Model]
        Gateway -->|Async Event| Queue[NATS/Redis]
        Queue --> Worker
        Worker -->|Summarize| VectorDB[(Qdrant)]
        Worker -->|Export| Wiki[Local/S3/Confluence]
    end
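
The routing decision in the diagram can be read as a small heuristic: try the Tier 0 FAQ cache first, then pick a model tier from rough prompt complexity. The sketch below is an assumed illustration of that flow, not the Router module's actual rules; the word-count and keyword thresholds are made up.

#[derive(Debug)]
enum Tier {
    FaqHit, // Tier 0: answered straight from the FAQ cache
    Small,  // Tier 1: simple prompts
    Medium, // Tier 2: moderate prompts
    Large,  // Tier 3: complex prompts
}

// Hypothetical heuristic: check the FAQ cache first, then route on rough prompt complexity.
fn route(prompt: &str, faq_hit: bool) -> Tier {
    if faq_hit {
        return Tier::FaqHit;
    }
    let p = prompt.to_lowercase();
    let words = p.split_whitespace().count();
    let needs_reasoning = p.contains("explain") || p.contains("compare");
    match (words, needs_reasoning) {
        (0..=20, false) => Tier::Small,
        (_, false) if words <= 200 => Tier::Medium,
        _ => Tier::Large,
    }
}

fn main() {
    println!("{:?}", route("What is Rust?", false));
    println!("{:?}", route("Compare STM, MTM and LTM promotion strategies in depth", false));
}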

Detailed Architecture: docs/ARCHITECTURE.md


📚 Documentation

User Documentation

Performance Optimization

Deep Dive

Developer Documentation

⭐ Recommended: see Design Principles and Comparison for system design insights


📊 Project Status

Version: 0.10.0
Status: Release Candidate (pre v1.0)

| Phase | Module | Status | Notes |
|---|---|---|---|
| Phase 1 | Foundation (Config/Log) | Done | Functional |
| Phase 2 | Gateway & Adapters | Done | Basic implementation |
| Phase 3 | Storage (Redis/Qdrant) | Done | Needs production testing |
| Phase 4 | Intelligence (Router/Shield) | Done | Tier routing + FAQ Tier 0 |
| Phase 5 | Worker & Async | Done | Functional |
| Phase 6 | Wiki Export | Done | Local + S3 + Confluence |
| Phase 7 | Graph Memory | Done | Entity/relation extraction + graph query |
| Phase 8 | Multimodal | Done | Qdrant storage + HTTP endpoints (experimental) |
| Phase 9 | Security | Done | AES-256-GCM + audit + GDPR persistence |
| Phase 10 | Benchmarks | Done | Criterion microbenchmarks (see docs/PERFORMANCE_REPORT.md) |
| Phase 11 | Observability | Done | Prometheus /metrics + full-chain instrumentation |
| Phase 12 | LLM FAQ | Done | LLM-based FAQ classification + /v1/admin/faq/classify |

Note: End-to-end performance claims (QPS, latency) have not been independently validated yet. Criterion microbenchmark results are available in docs/PERFORMANCE_REPORT.md.


🛠️ Tech Stack

  • Language: Rust 1.93+
  • Async Runtime: Tokio
  • Web Framework: Axum
  • Short-term Storage: Redis
  • Vector Storage: Qdrant
  • LLM: OpenAI, Gemini, Claude, Ollama, DeepSeek, OpenRouter, Azure, Groq, Cohere, Mistral (10 adapters)

🤝 Contributing

Contributions are welcome! Please follow this workflow:

Before Starting

  1. 📖 Read Development Guide
  2. 📝 Log your task in WORK_LOG.md
  3. 🔄 Pull latest code: git pull

During Work

  1. 📊 Update progress in WORK_LOG.md daily
  2. 🐛 Log issues immediately
  3. 🔴 Update status if blocked

After Completion

  1. ✅ Mark task as complete in WORK_LOG.md
  2. 📝 Update CHANGELOG.md
  3. 🚀 Submit code: git commit && git push

Collaboration: We use WORK_LOG.md (human) + docs/state.json (AI) dual-track recording for transparent collaboration.

Detailed Guide: CONTRIBUTING.md


🔧 Maintenance Status

Current Status: Active Development

This project is in early development. We are actively working on:

  • 🐛 Bug fixes and security updates
  • 📚 Documentation improvements
  • 💡 Community-driven enhancements

See: MAINTENANCE.md for detailed maintenance plan


🏢 Enterprise & SaaS Edition

Looking for multi-tenant SaaS features? Check out the feature/saas-multi-tenant branch, which includes:

  • 🏢 Multi-Tenant Architecture: Complete tenant isolation
  • 💳 Billing Integration: Usage tracking and quota management
  • 🔑 Flexible LLM Configuration: Per-tenant API key management
  • 📊 Usage Analytics: Detailed per-tenant metrics

📞 Contact


📄 License

Apache 2.0 License - See LICENSE


🌟 Related Projects


Version: 0.10.0 (Personal Edition) | Updated: 2026-02-20
