diff --git a/recipes/Cluster_create_RayCluster.md b/recipes/Cluster_create_RayCluster.md
index 6ce311fb6..ea9be8e25 100644
--- a/recipes/Cluster_create_RayCluster.md
+++ b/recipes/Cluster_create_RayCluster.md
@@ -10,8 +10,8 @@ $ xpk cluster create-ray --project=golden-project --zone=us-central1-a --cluster
 [XPK] Starting xpk v0.0.0
 [XPK] Starting cluster create for cluster golden-cluster:
 [XPK] Working on golden-project and us-central1-a
-[XPK] Task: `Describe reservation` is implemented by the following command not running since it is a dry run.
-gcloud beta compute reservations describe golden-reservation --project=golden-project --zone=us-central1-a
+[XPK] Task: `Get reservation golden-reservation` is implemented by the following command not running since it is a dry run.
+gcloud beta compute reservations describe golden-reservation --project=golden-project --zone=us-central1-a --format="json(specificReservation,aggregateReservation,status,deploymentType,resourcePolicies)"
 [XPK] Task: `Determine server supported GKE versions for default gke version` is implemented by the following command not running since it is a dry run.
 gcloud container get-server-config --project=golden-project --region=us-central1 --flatten="channels" --filter="channels.channel=RAPID" --format="value(channels.defaultVersion)"
 [XPK] Task: `Determine server supported GKE versions for valid versions` is implemented by the following command not running since it is a dry run.
@@ -50,8 +50,6 @@ gcloud beta container clusters describe golden-cluster --location us-central1 --
 We assume that the underlying system is: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=TPU, device_type='tpu7x-8', supports_sub_slicing=False, supports_super_slicing=False, supports_accelerator_network_profile=False, docker_platform=, requires_workload_policy=False, gpu_config=None, parallel_containers=2)
 [XPK] Task: `Get All Node Pools` is implemented by the following command not running since it is a dry run.
 gcloud beta container node-pools list --cluster golden-cluster --project=golden-project --location=us-central1 --format="csv[no-heading](name)"
-[XPK] Task: `Describe reservation` is implemented by the following command not running since it is a dry run.
-gcloud beta compute reservations describe golden-reservation --project=golden-project --zone=us-central1-a
 [XPK] Creating 1 node pool or pools of tpu7x-8
 Underlyingly, we assume that means: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu7x', gce_machine_type='tpu7x-standard-4t', chips_per_vm=4, accelerator_type=TPU, device_type='tpu7x-8', supports_sub_slicing=False, supports_super_slicing=False, supports_accelerator_network_profile=False, docker_platform=, requires_workload_policy=False, gpu_config=None, parallel_containers=2)
 [XPK] Task: `Get Node Pool Zone` is implemented by the following command not running since it is a dry run.
@@ -64,8 +62,6 @@ kubectl get configmap golden-cluster-resources-configmap -o=custom-columns="Conf
 [XPK] Pretending all the jobs succeeded
 [XPK] Create or delete node pool request complete.
 [XPK] Creating ConfigMap for cluster
-[XPK] Task: `Describe reservation` is implemented by the following command not running since it is a dry run.
-gcloud beta compute reservations describe golden-reservation --project=golden-project --zone=us-central1-a
 [XPK] Temp file (0604d72ef175c94fc796d8f02cff009b4241e85d444d22d414a56a47764d7bbb) content:
 kind: ConfigMap
 apiVersion: v1
diff --git a/recipes/Cluster_create_private.md b/recipes/Cluster_create_private.md
index 3c67d9539..f7b2341b8 100644
--- a/recipes/Cluster_create_private.md
+++ b/recipes/Cluster_create_private.md
@@ -12,8 +12,8 @@ $ xpk cluster create-pathways --project=golden-project --zone=us-central1-a --cl
 [XPK] Working on golden-project and us-central1-a
 [XPK] Task: `Retrieve available pathways machine types` is implemented by the following command not running since it is a dry run.
 gcloud compute machine-types list --filter "guestCpus >= 49 AND memoryMb >= 238592 AND zone = 'us-central1-a'" --format="value(name)" --project=golden-project
-[XPK] Task: `Describe reservation` is implemented by the following command not running since it is a dry run.
-gcloud beta compute reservations describe golden-reservation --project=golden-project --zone=us-central1-a
+[XPK] Task: `Get reservation golden-reservation` is implemented by the following command not running since it is a dry run.
+gcloud beta compute reservations describe golden-reservation --project=golden-project --zone=us-central1-a --format="json(specificReservation,aggregateReservation,status,deploymentType,resourcePolicies)"
 [XPK] Task: `Determine server supported GKE versions for default gke version` is implemented by the following command not running since it is a dry run.
 gcloud container get-server-config --project=golden-project --region=us-central1 --flatten="channels" --filter="channels.channel=RAPID" --format="value(channels.defaultVersion)"
 [XPK] Task: `Determine server supported GKE versions for valid versions` is implemented by the following command not running since it is a dry run.
@@ -54,8 +54,6 @@ gcloud beta container clusters describe golden-cluster-private --location us-cen
 We assume that the underlying system is: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu-v5p-slice', gce_machine_type='ct5p-hightpu-4t', chips_per_vm=4, accelerator_type=TPU, device_type='v5p-8', supports_sub_slicing=False, supports_super_slicing=False, supports_accelerator_network_profile=False, docker_platform=, requires_workload_policy=False, gpu_config=None, parallel_containers=1)
 [XPK] Task: `Get All Node Pools` is implemented by the following command not running since it is a dry run.
 gcloud beta container node-pools list --cluster golden-cluster-private --project=golden-project --location=us-central1 --format="csv[no-heading](name)"
-[XPK] Task: `Describe reservation` is implemented by the following command not running since it is a dry run.
-gcloud beta compute reservations describe golden-reservation --project=golden-project --zone=us-central1-a
 [XPK] Creating 1 node pool or pools of v5p-8
 Underlyingly, we assume that means: SystemCharacteristics(topology='2x2x1', vms_per_slice=1, gke_accelerator='tpu-v5p-slice', gce_machine_type='ct5p-hightpu-4t', chips_per_vm=4, accelerator_type=TPU, device_type='v5p-8', supports_sub_slicing=False, supports_super_slicing=False, supports_accelerator_network_profile=False, docker_platform=, requires_workload_policy=False, gpu_config=None, parallel_containers=1)
 [XPK] Task: `Get Node Pool Zone` is implemented by the following command not running since it is a dry run.
@@ -69,8 +67,6 @@ kubectl get configmap golden-cluster-private-resources-configmap -o=custom-colum
 [XPK] Pretending all the jobs succeeded
 [XPK] Create or delete node pool request complete.
 [XPK] Creating ConfigMap for cluster
-[XPK] Task: `Describe reservation` is implemented by the following command not running since it is a dry run.
-gcloud beta compute reservations describe golden-reservation --project=golden-project --zone=us-central1-a
 [XPK] Temp file (8669497cfbe494756d36922054f924d7dca463141f0e5d0329e517c880cf2f06) content:
 kind: ConfigMap
 apiVersion: v1
diff --git a/recipes/Cluster_create_sub-slicing.md b/recipes/Cluster_create_sub-slicing.md
index b03f037e1..ea1e6c2a1 100644
--- a/recipes/Cluster_create_sub-slicing.md
+++ b/recipes/Cluster_create_sub-slicing.md
@@ -10,10 +10,8 @@ $ SUB_SLICING_ENABLED=true xpk cluster create --project=golden-project --zone=us
 [XPK] Starting xpk v0.0.0
 [XPK] Starting cluster create for cluster golden-cluster:
 [XPK] Working on golden-project and us-central1-a
-[XPK] Task: `Get reservation deployment type` is implemented by the following command not running since it is a dry run.
-gcloud beta compute reservations describe golden-reservation --project=golden-project --zone=us-central1-a --format="value(deploymentType)"
-[XPK] Task: `Describe reservation` is implemented by the following command not running since it is a dry run.
-gcloud beta compute reservations describe golden-reservation --project=golden-project --zone=us-central1-a
+[XPK] Task: `Get reservation golden-reservation` is implemented by the following command not running since it is a dry run.
+gcloud beta compute reservations describe golden-reservation --project=golden-project --zone=us-central1-a --format="json(specificReservation,aggregateReservation,status,deploymentType,resourcePolicies)"
 [XPK] Task: `Determine server supported GKE versions for default gke version` is implemented by the following command not running since it is a dry run.
 gcloud container get-server-config --project=golden-project --region=us-central1 --flatten="channels" --filter="channels.channel=RAPID" --format="value(channels.defaultVersion)"
 [XPK] Task: `Determine server supported GKE versions for valid versions` is implemented by the following command not running since it is a dry run.
@@ -52,8 +50,6 @@ gcloud beta container clusters describe golden-cluster --location us-central1 --
 We assume that the underlying system is: SystemCharacteristics(topology='4x4', vms_per_slice=4, gke_accelerator='tpu-v6e-slice', gce_machine_type='ct6e-standard-4t', chips_per_vm=4, accelerator_type=TPU, device_type='v6e-16', supports_sub_slicing=True, supports_super_slicing=False, supports_accelerator_network_profile=True, docker_platform=, requires_workload_policy=False, gpu_config=None, parallel_containers=1)
 [XPK] Task: `Get All Node Pools` is implemented by the following command not running since it is a dry run.
 gcloud beta container node-pools list --cluster golden-cluster --project=golden-project --location=us-central1 --format="csv[no-heading](name)"
-[XPK] Task: `Describe reservation` is implemented by the following command not running since it is a dry run.
-gcloud beta compute reservations describe golden-reservation --project=golden-project --zone=us-central1-a
 [XPK] Creating 1 node pool or pools of v6e-16
 Underlyingly, we assume that means: SystemCharacteristics(topology='4x4', vms_per_slice=4, gke_accelerator='tpu-v6e-slice', gce_machine_type='ct6e-standard-4t', chips_per_vm=4, accelerator_type=TPU, device_type='v6e-16', supports_sub_slicing=True, supports_super_slicing=False, supports_accelerator_network_profile=True, docker_platform=, requires_workload_policy=False, gpu_config=None, parallel_containers=1)
 [XPK] Task: `Get Node Pool Zone` is implemented by the following command not running since it is a dry run.
@@ -66,8 +62,6 @@ kubectl get configmap golden-cluster-resources-configmap -o=custom-columns="Conf
 [XPK] Pretending all the jobs succeeded
 [XPK] Create or delete node pool request complete.
 [XPK] Creating ConfigMap for cluster
-[XPK] Task: `Describe reservation` is implemented by the following command not running since it is a dry run.
-gcloud beta compute reservations describe golden-reservation --project=golden-project --zone=us-central1-a
 [XPK] Temp file (8d0f4b1e96d79a5d572cbb1a403ac3285b6a9390b6092b86a76bf66705e35d44) content:
 kind: ConfigMap
 apiVersion: v1
diff --git a/recipes/Cluster_create_super-slicing.md b/recipes/Cluster_create_super-slicing.md
index 28ac9f6cf..df0f21252 100644
--- a/recipes/Cluster_create_super-slicing.md
+++ b/recipes/Cluster_create_super-slicing.md
@@ -1,19 +1,19 @@
 # Cluster create super-slicing
+
 Creates a GKE cluster with TPU super-slicing enabled for multi-slice training.
 # Running the command
+
 ```shell #golden
-xpk cluster create --project=golden-project --zone=us-central1-a --cluster=golden-cluster --tpu-type=tpu7x-4x4x4 --reservation=golden-reservation/reservationBlocks/block/reservationSubBlocks/subblock --super-slicing --num-cubes=5
+DRY_RUN_RESERVATION_SUB_BLOCKS='[{"name": "sub0", "count": 16, "inUseCount": 0}, {"name": "sub1", "count": 16, "inUseCount": 0}, {"name": "sub2", "count": 16, "inUseCount": 15}, {"name": "sub3", "count": 16, "inUseCount": 0}]' xpk cluster create --project=golden-project --zone=us-central1-a --cluster=golden-cluster --tpu-type=tpu7x-4x4x4 --reservation=golden-reservation/reservationBlocks/block --super-slicing --num-cubes=3
 ```