@De-funkd De-funkd commented Dec 3, 2025

This PR adds a complete implementation of Pi0.5 for the Ark Robotics Framework. The implementation follows a HuggingFace-style wrapper pattern, integrating with the LeRobot Pi0.5 policy while maintaining compatibility with the existing ArkML architecture.

Key Features

1. Pi0.5 HuggingFace Wrapper

  • Complete Pi05Policy wrapper that leverages the actual LeRobot Pi0.5 policy
  • Follows the same design pattern as existing PiZeroNet for consistency
  • Supports multi-stage training (pretrain + post-training) with flow matching
  • Implements Pi0.5-specific architectural features:
    • Flow matching for precise action prediction
    • Multiple prediction heads (subtask, FAST, flow)
    • Enhanced vision-language backbone (SigLIP-Gemma)

2. Complete Algorithm Pipeline

  • Pi05Algorithm: Multi-stage training algorithm following LeRobot guidelines
  • Pi05Trainer: Handles both pretrain (CE(text) + CE(FAST tokens)) and post-train (CE(subtask) + α × flow_matching_loss) stages
  • Pi05Evaluator: Comprehensive evaluation with action metrics
  • Pi05Dataset: Multi-modality dataset support for different training stages

3. Structurally Identical Node Implementation

  • Pi05Node: Mirror of PiZeroPolicyNode structure but using Pi05Policy internally
  • Only accesses model methods without manual tokenization or LeRobot internals
  • Maintains identical interface: predict(), reset(), forward(), etc.

4. Comprehensive Testing & Benchmarking

  • Test suite with 17 verification tests
  • Integration tests verifying compatibility with PiZero
  • Performance benchmarks for flow matching and backbone operations
  • Repository integrity tests ensuring no regressions

Architecture Highlights

Flow Matching Implementation

  • Vector field networks for action prediction
  • Euler integration for precise action trajectories
  • Multi-stage training with configurable loss weights
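
As a rough illustration of the Euler integration step listed above, here is a minimal sketch in plain Python. It assumes a straight-line flow from noise at t=0 to the action at t=1; the actual LeRobot Pi0.5 policy may use a different time convention and a learned network. The names `euler_integrate` and `make_straight_line_field` are hypothetical and not part of this PR.

```python
from typing import Callable, List

Vector = List[float]


def euler_integrate(
    vector_field: Callable[[Vector, float], Vector],
    x0: Vector,
    num_steps: int = 10,
) -> Vector:
    """Integrate dx/dt = vector_field(x, t) from t=0 to t=1 with fixed-step Euler."""
    if num_steps < 1:
        raise ValueError("num_steps must be >= 1")
    dt = 1.0 / num_steps
    x = list(x0)
    for step in range(num_steps):
        t = step * dt
        v = vector_field(x, t)
        # One explicit Euler update: x <- x + dt * v(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def make_straight_line_field(noise: Vector, action: Vector) -> Callable[[Vector, float], Vector]:
    """Toy stand-in for a trained vector field network (assumption, not the PR's model).

    For the linear path x_t = (1 - t) * noise + t * action, the velocity is
    the constant (action - noise), so Euler integration recovers the action.
    """
    velocity = [a - n for a, n in zip(action, noise)]
    return lambda x, t: velocity


noise = [0.5, -1.0, 2.0]
target_action = [0.1, 0.2, -0.3]
field = make_straight_line_field(noise, target_action)
predicted_action = euler_integrate(field, noise, num_steps=10)
```

In a trained model the vector field would depend on the observation and on `x` and `t`, so the number of integration steps trades accuracy against inference latency.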

Multi-Stage Training Support

  • Pretraining: CE(text) + CE(FAST tokens) for foundational representation learning
  • Post-training: CE(subtask) + α × flow_matching_loss for precise action prediction
  • Configurable hyperparameters including flow_alpha, integration steps
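
The two stage objectives above can be sketched as a small helper. This is only an illustration of how the terms combine; the function name `combine_stage_losses` and the dictionary keys are hypothetical, and the individual loss values are assumed to have been computed elsewhere.

```python
from typing import Dict


def combine_stage_losses(stage: str, losses: Dict[str, float], flow_alpha: float = 1.0) -> float:
    """Combine per-term losses according to the training stage.

    pretrain:  CE(text) + CE(FAST tokens)
    posttrain: CE(subtask) + flow_alpha * flow_matching_loss
    The key names below are assumptions made for this sketch.
    """
    if stage == "pretrain":
        return losses["ce_text"] + losses["ce_fast"]
    if stage == "posttrain":
        return losses["ce_subtask"] + flow_alpha * losses["flow_matching"]
    raise ValueError(f"unknown stage: {stage!r}")


pretrain_total = combine_stage_losses("pretrain", {"ce_text": 0.75, "ce_fast": 0.25})
posttrain_total = combine_stage_losses(
    "posttrain", {"ce_subtask": 0.5, "flow_matching": 0.25}, flow_alpha=10.0
)
```

A larger `flow_alpha` shifts the post-training objective toward the flow matching term relative to the subtask cross-entropy.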

Enhanced Backbone Support

  • Vision-language models like SigLIP-Gemma
  • Proper normalization and preprocessing
  • Multi-modal input handling

Testing Coverage

  • Core functionality verification
  • Integration with existing PiZero workflows
  • Device compatibility (CPU/CUDA)
  • Serialization/deserialization
  • Batch size handling
  • Parameter consistency checks
  • Performance benchmarks

Framework Compatibility

  • All existing algorithms continue to work without changes
  • Pi0.5 can be used identically to PiZero (same service commands)
  • No breaking changes to public APIs
  • Maintains existing deployment workflows
  • Dependency issues resolved: Framework now loads cleanly with both algorithms

Complete

  • Complete with README, usage examples, and benchmarking
  • Can be loaded via: arkml-policy algo=pi05 algo.model.model_path=...

… tokenizer

- Create complete pi05 directory structure with algorithm, models, dataset, trainer, evaluator
- Implement FAST tokenizer for action discretization
- Add flow matching architecture with ActionFlowExpert
- Implement stage-based training (pretrain and posttrain)
- Add multi-modal dataset support (web_caption, qa, bounding_boxes, etc.)
- Create Pi05Node for inference pipeline
- Update README with Pi0.5 usage instructions
- Fix import issue in pizero algorithm
- Register pi05 in policy registry
Author

De-funkd commented Dec 3, 2025

@cmower @Refinath this is the new clean PR

@cmower cmower self-requested a review December 5, 2025 20:32
Contributor

@cmower cmower left a comment

Thanks @De-funkd, please can you address my comments? @Refinath will also review.

weight_decay=self.weight_decay,
num_epochs=self.max_epochs,
grad_accum=1.0, # Gradient accumulation
output_dir='./output', # TODO: Get from config
Contributor

Please address this TODO: output_dir should come from the config rather than being hard-coded.
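
One possible way to resolve the TODO, sketched under assumptions: the config is treated as a plain mapping, and the key `trainer.output_dir` is hypothetical, since the actual ArkML config layout is not shown in this thread.

```python
from pathlib import Path
from typing import Any, Mapping

DEFAULT_OUTPUT_DIR = "./output"  # same fallback as the hard-coded value under review


def resolve_output_dir(cfg: Mapping[str, Any], default: str = DEFAULT_OUTPUT_DIR) -> Path:
    """Return the configured output directory, or the default when none is set.

    The 'trainer' / 'output_dir' keys are assumptions made for this sketch.
    """
    trainer_cfg = cfg.get("trainer") or {}
    return Path(trainer_cfg.get("output_dir") or default)


configured = resolve_output_dir({"trainer": {"output_dir": "/tmp/pi05_runs"}})
fallback = resolve_output_dir({})
```

The trainer call could then pass `output_dir=str(resolve_output_dir(cfg))`, which keeps the current behavior when the key is absent.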

from arkml.core.policy import BasePolicy


class Pi05Node(BasePolicy):
Contributor

This needs to implement publishers, subscribers, and services, similar to the Pi0 node.

Contributor

Please try to derive this class from PolicyNode:
from arkml.core.policy_node import PolicyNode
class Pi05Node(PolicyNode): ...
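
The shape of the suggested change might look roughly like the sketch below. The `PolicyNode` class here is a local stand-in so the example runs on its own; the real `arkml.core.policy_node.PolicyNode` may have a different constructor and abstract methods, and the publisher, subscriber, and service wiring is omitted. The `action_dim` config key is also an assumption.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class PolicyNode(ABC):
    """Stand-in for arkml.core.policy_node.PolicyNode (assumed interface only)."""

    def __init__(self, cfg: Dict[str, Any], device: str) -> None:
        self.cfg = cfg
        self.device = device

    @abstractmethod
    def predict(self, observation: Dict[str, Any]) -> List[float]:
        """Return an action for the given observation."""

    @abstractmethod
    def reset(self) -> None:
        """Clear any per-episode state."""


class Pi05Node(PolicyNode):
    """Derived node that takes cfg and device, as the base class does."""

    def __init__(self, cfg: Dict[str, Any], device: str = "cpu") -> None:
        super().__init__(cfg, device)
        self.steps = 0

    def predict(self, observation: Dict[str, Any]) -> List[float]:
        self.steps += 1
        # Placeholder: a real node would query the wrapped Pi05Policy here.
        return [0.0] * int(self.cfg.get("action_dim", 7))

    def reset(self) -> None:
        self.steps = 0


node = Pi05Node({"action_dim": 3})
action = node.predict({})
```

Deriving from the shared base class lets the framework treat the Pi0.5 node the same way as the existing Pi0 node.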

@cmower cmower requested a review from Refinath December 5, 2025 20:44
import numpy as np


class Pi05Evaluator:
Contributor

Try to make this a subclass of Evaluator: from arkml.core.algorithm import Evaluator

from arkml.core.app_context import ArkMLContext


def flow_matching_loss(pred, target):
Contributor

The same function is already available in utils.py; please reuse it instead of redefining it.

return F.mse_loss(pred, target)


class DummyBackbone(torch.nn.Module):
Contributor

Do we still need DummyBackbone?

Contributor

@Refinath Refinath left a comment

Please update the PR

@De-funkd
Author

Hey @Refinath @cmower, I've just pushed some changes; hopefully they resolve all the comments.
Cheers

@@ -0,0 +1,148 @@
"""
Contributor

What is this file used for?

Author

Hey! This was something I used to quickly load the model and test that it works; it was not supposed to be part of this PR. I pushed it along with the other files by mistake.
Apologies

Contributor

cmower commented Jan 2, 2026

Hey @De-funkd, thanks for your contribution! @Refinath will do some final checks today, and hopefully we can merge 😄

Refinath and others added 11 commits January 2, 2026 22:34
…pipeline

- Update Pi05Algorithm.train() signature to not accept dataset parameters
- Load datasets internally using self.cfg following PiZero pattern
- Make Pi05Node constructor structurally identical to PiZeroPolicyNode
- Update Pi05Node to accept cfg and device parameters instead of model
- Fix rollout lifecycle issues to match PiZero behavior
- Add ConfigPath class to utils for YAML config loading
- Update registry to properly import pi05 algorithm and models
- Fix import paths in train.py, policy_service.py, and example files
- Update pi05 config to match expected structure

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
…Policy entries

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

3 participants