
ML Lab MCP

A comprehensive MCP (Model Context Protocol) server for ML model training, fine-tuning, and experimentation. Transform your AI assistant into a full ML engineering environment.

Features

Unified Credential Management

  • Encrypted vault for API keys (Lambda Labs, RunPod, Mistral, OpenAI, Together AI, etc.)
  • PBKDF2 key derivation with AES encryption
  • Never stores credentials in plaintext

Dataset Management

  • Register datasets from local files OR client-provided content (JSONL, CSV, Parquet)
  • Upload datasets directly without server filesystem access
  • Automatic schema inference and statistics
  • Train/val/test splitting
  • Template-based transformations
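
As a rough illustration of the splitting and template steps above (the function names here are hypothetical, not the server's actual API), a deterministic split plus a template transformation can be sketched as:

```python
import random

def split_dataset(rows, val_frac=0.1, test_frac=0.1, seed=42):
    """Shuffle deterministically, then carve the val/test sets off the tail."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n = len(rows)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    return (rows[: n - n_val - n_test],             # train
            rows[n - n_val - n_test : n - n_test],  # val
            rows[n - n_test :])                     # test

def apply_template(row, template="### Question:\n{q}\n### Answer:\n{a}"):
    """Template-based transformation: map raw fields into a prompt string."""
    return template.format(**row)

rows = [{"q": f"q{i}", "a": f"a{i}"} for i in range(100)]
train, val, test = split_dataset(rows)
print(len(train), len(val), len(test))  # 80 10 10
```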

Experiment Tracking

  • SQLite-backed experiment storage
  • Version control and comparison
  • Fork experiments with config modifications
  • Full metrics history

Multi-Backend Training

  • Local: transformers + peft + trl for local GPU training
  • Mistral API: Native fine-tuning for Mistral models
  • Together AI: Hosted fine-tuning service
  • OpenAI: GPT model fine-tuning

Cloud GPU Provisioning

  • Lambda Labs: H100, A100 instances
  • RunPod: Spot and on-demand GPUs
  • Automatic price comparison across providers
  • Smart routing based on cost and availability

Remote VPS Support

  • Use any SSH-accessible machine (Hetzner, Hostinger, OVH, home server, university cluster)
  • Automatic environment setup
  • Dataset sync via rsync
  • Training runs in tmux (persistent across disconnects)
  • Amortized hourly cost calculation from monthly fees
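
The amortized hourly cost is just the flat monthly fee spread over the hours in a month. A minimal sketch (assuming a 720-hour month, which matches the $0.28/hr figure quoted for the $200/mo box in the example workflow below):

```python
def amortized_hourly(monthly_fee: float, hours_per_month: int = 720) -> float:
    """Spread a flat monthly VPS fee over the ~720 hours in a 30-day month."""
    return round(monthly_fee / hours_per_month, 2)

print(amortized_hourly(200))  # 0.28
```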

Cost Estimation

  • Pre-training cost estimates across all providers
  • Real-time pricing queries
  • Time estimates based on model and dataset size

Ollama Integration

  • Deploy fine-tuned GGUF models to Ollama
  • Pull models from Ollama registry
  • Chat/inference testing directly from MCP
  • Model management (list, delete, copy)
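
Chat/inference against a local Ollama instance ultimately goes through Ollama's REST API; a minimal stdlib-only sketch (assuming the daemon is on its default port 11434 — this is an illustration, not the server's internal client):

```python
import json
from urllib import request

def build_chat_payload(model: str, prompt: str) -> dict:
    """Request body for Ollama's /api/chat endpoint (non-streaming)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def ollama_chat(model: str, prompt: str, host: str = "http://localhost:11434") -> str:
    """One chat turn; returns the assistant message content."""
    data = json.dumps(build_chat_payload(model, prompt)).encode()
    req = request.Request(f"{host}/api/chat", data=data,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]

# Requires a running Ollama daemon:
# ollama_chat("llama3.1", "Say hello in one word.")
```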

Open WebUI Integration

  • Create model presets with system prompts
  • Knowledge base management (RAG)
  • Chat through Open WebUI (applies configs + knowledge)
  • Seamless Ollama ↔ Open WebUI workflow

Installation

```bash
pip install ml-lab-mcp

# With training dependencies
pip install ml-lab-mcp[training]

# With cloud provider support
pip install ml-lab-mcp[cloud]

# Everything
pip install ml-lab-mcp[training,cloud,dev]
```

Quick Start

1. Initialize and Create Vault

```bash
ml-lab init
ml-lab vault create
```

2. Add Provider Credentials

```bash
ml-lab vault unlock
ml-lab vault add --provider lambda_labs --api-key YOUR_KEY
ml-lab vault add --provider mistral --api-key YOUR_KEY
```

3. Configure with Claude Code / Claude Desktop

Add to your MCP configuration:

```json
{
  "mcpServers": {
    "ml-lab": {
      "command": "ml-lab",
      "args": ["serve"]
    }
  }
}
```

MCP Tools

Credentials

| Tool | Description |
|------|-------------|
| `creds_create_vault` | Create encrypted credential vault |
| `creds_unlock` | Unlock vault with password |
| `creds_add` | Add provider credentials |
| `creds_list` | List configured providers |
| `creds_test` | Verify credentials work (Lambda Labs, GCP, OpenAI supported) |

Datasets

| Tool | Description |
|------|-------------|
| `dataset_register` | Register a dataset from a local file |
| `dataset_register_content` | Register a dataset from client-provided content (CSV, JSON, JSONL, Parquet) |
| `dataset_list` | List all datasets |
| `dataset_inspect` | View schema and statistics |
| `dataset_preview` | Preview samples |
| `dataset_split` | Create train/val/test splits |
| `dataset_transform` | Apply template transformations |

Experiments

| Tool | Description |
|------|-------------|
| `experiment_create` | Create new experiment |
| `experiment_list` | List experiments |
| `experiment_get` | Get experiment details |
| `experiment_compare` | Compare multiple experiments |
| `experiment_fork` | Fork with modifications |

Training

| Tool | Description |
|------|-------------|
| `train_estimate` | Estimate cost/time across providers |
| `train_launch` | Start training run |
| `train_status` | Check run status |
| `train_stop` | Stop training |

Infrastructure

| Tool | Description |
|------|-------------|
| `infra_list_gpus` | List available GPUs with pricing |
| `infra_provision` | Provision cloud instance |
| `infra_terminate` | Terminate instance |

Remote VPS

| Tool | Description |
|------|-------------|
| `vps_register` | Register a VPS (host, user, key, GPU info, monthly cost) |
| `vps_list` | List all registered VPS machines |
| `vps_status` | Check VPS status (online, GPU, running jobs) |
| `vps_unregister` | Remove a VPS from registry |
| `vps_setup` | Install training dependencies on VPS |
| `vps_sync` | Sync dataset to VPS |
| `vps_run` | Run command on VPS |
| `vps_logs` | Get training logs from VPS |

Ollama

| Tool | Description |
|------|-------------|
| `ollama_status` | Check Ollama status (running, version, GPU) |
| `ollama_list` | List models in Ollama |
| `ollama_pull` | Pull model from registry |
| `ollama_deploy` | Deploy GGUF to Ollama |
| `ollama_chat` | Chat with a model |
| `ollama_delete` | Delete a model |

Open WebUI

| Tool | Description |
|------|-------------|
| `owui_status` | Check Open WebUI connection |
| `owui_list_models` | List model configurations |
| `owui_create_model` | Create model preset (system prompt, params) |
| `owui_delete_model` | Delete model configuration |
| `owui_list_knowledge` | List knowledge bases |
| `owui_create_knowledge` | Create knowledge base |
| `owui_add_knowledge_file` | Add file to knowledge base |
| `owui_chat` | Chat through Open WebUI |

Security

| Tool | Description |
|------|-------------|
| `security_audit_log` | View recent audit log entries |
| `security_audit_summary` | Get audit activity summary |
| `security_tailscale_status` | Check Tailscale VPN connection |
| `security_ssh_key_rotate` | Rotate SSH key for a VPS |
| `creds_expiry_check` | Check credential expiry status |
| `creds_rotate` | Rotate credentials for a provider |

Codex Integration (Executor LLM)

| Tool | Description |
|------|-------------|
| `codex_status` | Check if Codex CLI is available |
| `codex_analyze_error` | Have Codex analyze errors and suggest fixes |
| `codex_generate_training_script` | Generate training script from experiment config |
| `codex_fix_code` | Have Codex fix issues in training code |
| `codex_optimize_config` | Optimize training config for quality/speed/memory |
| `codex_debug_training` | Debug training issues from logs |
| `codex_run` | Run arbitrary task with Codex |

Architecture: the design separates intelligence from execution:

  • Planner LLM (Claude): Reasoning, architecture, tradeoffs
  • Executor LLM (Codex): Precise code edits + CLI work
  • World tools (ML Lab): Data, training, infra, deployment

Deep Thinking Analysis (Ollama Reasoning Models)

| Tool | Description |
|------|-------------|
| `thinking_analyze` | Run deep analysis with reasoning models (DeepSeek R1, QwQ) |
| `thinking_schedule` | Schedule automated analysis (after training, daily, weekly) |
| `thinking_reports` | List and retrieve analysis reports |

Analysis Types:

  • training - Analyze loss curves, convergence, hyperparameters
  • experiment - Compare experiments, suggest improvements
  • activity - Review audit logs for patterns/anomalies
  • cost - Cost efficiency analysis and recommendations
  • dataset - Data quality and training suitability

Schedule Frequencies:

  • after_training - Auto-analyze when training completes
  • hourly / daily / weekly - Periodic reviews

Reports are stored in `~/.cache/ml-lab/reports/` for historical reference.

Example Workflow

User: Fine-tune Llama 3.1 8B on my customer support data

Claude: I'll help you fine-tune Llama 3.1 8B. Let me first check your data.

[Uses dataset_register to load ./support_data.jsonl]

Found 15,000 samples. Let me estimate the costs:

[Uses train_estimate]

| Provider      | Time   | Cost   | Availability |
|---------------|--------|--------|--------------|
| vps:hetzner-01| ~4h    | $0.28* | Idle         |
| Local (4090)  | ~12h   | $0     | Ready        |
| Lambda H100   | ~2h    | $5.50  | 3 available  |
| Together API  | ~4h    | $12    | Queued       |

*amortized from $200/mo

Which would you prefer?

User: Use Lambda Labs

Claude: [Uses experiment_create, then train_launch with backend=lambda]

Training started! Run ID: abc123
I'll monitor progress and let you know when it completes.

Architecture

```text
src/ml_lab/
├── server.py           # MCP server entry point (61 tools)
├── credentials.py      # Encrypted credential vault
├── cli.py              # Command-line interface
├── backends/
│   ├── base.py         # Training backend interface
│   ├── local.py        # Local GPU training
│   ├── mistral_api.py  # Mistral fine-tuning API
│   ├── together_api.py # Together AI API
│   ├── openai_api.py   # OpenAI fine-tuning API
│   └── vertex_api.py   # Google Vertex AI (Gemini)
├── cloud/
│   ├── base.py         # Cloud provider interface
│   ├── lambda_labs.py  # Lambda Labs integration
│   ├── runpod.py       # RunPod integration
│   ├── modal_provider.py # Modal integration
│   └── remote_vps.py   # Generic SSH VPS support (+ Tailscale)
├── storage/
│   ├── datasets.py     # Dataset management
│   └── experiments.py  # Experiment tracking
├── inference/
│   ├── ollama.py       # Ollama integration
│   ├── openwebui.py    # Open WebUI integration
│   └── thinking.py     # Deep thinking analysis (DeepSeek R1, QwQ)
├── integrations/
│   └── codex.py        # Codex CLI integration (executor LLM)
├── security/
│   └── audit.py        # Audit logging
└── evals/
    └── benchmarks.py   # Evaluation suite
```

Security

  • Credentials encrypted with Fernet (AES-128-CBC with HMAC-SHA256 authentication)
  • PBKDF2-SHA256 key derivation (480,000 iterations)
  • Vault file permissions set to 600 (owner read/write only)
  • API keys never logged or transmitted unencrypted
  • Audit logging: All sensitive operations logged to ~/.cache/ml-lab/audit.log
  • Credential expiry: Automatic tracking with rotation reminders
  • Tailscale support: Optional VPN requirement for VPS connections
  • SSH key rotation: Automated rotation with rollback on failure
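
The key-derivation step above can be sketched with the standard library alone (Fernet itself comes from the `cryptography` package; this shows only the PBKDF2-SHA256 derivation at 480,000 iterations, with a hypothetical function name):

```python
import base64
import hashlib
import os

def derive_vault_key(password: str, salt: bytes, iterations: int = 480_000) -> bytes:
    """PBKDF2-SHA256 -> 32-byte key, urlsafe-base64 encoded as Fernet expects."""
    raw = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return base64.urlsafe_b64encode(raw)

salt = os.urandom(16)  # stored alongside the vault; random, but not secret
key = derive_vault_key("correct horse battery staple", salt)
print(len(key))  # 44 (base64 of 32 bytes)
```

The salt must be persisted with the vault so the same password reproduces the same key on unlock.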

Supported Providers

Compute Providers

  • Lambda Labs (H100, A100, A10)
  • RunPod (H100, A100, RTX 4090)
  • Modal (serverless GPU functions)

Fine-Tuning APIs

  • Mistral AI (Mistral, Mixtral, Codestral)
  • Together AI (Llama, Mistral, Qwen)
  • OpenAI (GPT-4o, GPT-3.5)
  • Google Vertex AI (Gemini 1.5 Pro, Gemini 1.5 Flash)

Model Hubs

  • Hugging Face Hub
  • Replicate
  • Ollama (local GGUF models)

Contributing

Contributions welcome! Please read CONTRIBUTING.md for guidelines.

License

PolyForm Noncommercial 1.0.0 - free for personal use, contact for commercial licensing.

See LICENSE for details.
