@anson627
Contributor

No description provided.

Contributor

Copilot AI left a comment

Pull request overview

Adds a new AKS blog post that describes how to run/scale Ray workloads on AKS with Anyscale, covering multi-region capacity, unified storage (BlobFuse2), and service-principal-based authentication. It also includes diagram assets (SVGs) and their Mermaid sources to illustrate the storage and authentication flows.

Changes:

  • Add new blog post content for “Scaling Ray on AKS” with examples and architecture guidance.
  • Add Mermaid source diagrams for storage and authentication flows, plus exported SVG versions.
  • Add supporting screenshots used by the post.

Reviewed changes

Copilot reviewed 3 out of 7 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| website/blog/2026-02-13-scaling-ray-aks/index.md | New blog post describing scaling Ray on AKS with multi-region, storage, and auth guidance |
| website/blog/2026-02-13-scaling-ray-aks/cluster-storage.mmd | Mermaid source for the cluster storage architecture diagram |
| website/blog/2026-02-13-scaling-ray-aks/cluster-storage.svg | Exported SVG for the cluster storage architecture diagram |
| website/blog/2026-02-13-scaling-ray-aks/auth-flow.mmd | Mermaid source for the service principal authentication flow diagram |
| website/blog/2026-02-13-scaling-ray-aks/auth-flow.svg | Exported SVG for the authentication flow diagram |

Comment on lines +3 to +6
description: "Learn how to run production-grade Ray workloads on Azure Kubernetes Service with multi-region support, unified storage, and secure authentication."
date: 2026-02-13
authors:
- anson-qian

Copilot AI Feb 10, 2026

This post is future-dated (2026-02-13). Docusaurus publishes future-dated posts immediately, so merging this PR will publish it right away. If this isn’t intended to go live yet, add draft: true (or unlisted: true) to the front matter before merging.
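
One way to catch this before merging is a small pre-merge check. The sketch below is an illustration only and not part of the PR: `needs_draft_flag` is a hypothetical helper, it assumes an unquoted `date: YYYY-MM-DD` line in the front matter, and it takes "today" as an argument so the result is reproducible.

```bash
# Hypothetical helper (not part of the PR): exits 0 when the post is dated
# after "today" and has no `draft: true` / `unlisted: true` line.
# Assumes an unquoted `date: YYYY-MM-DD` line in the front matter.
needs_draft_flag() {
  post_file=$1
  today=$2   # YYYY-MM-DD, passed in so the check is reproducible
  # Keep only the value of the first `date:` line, trimmed to YYYY-MM-DD.
  post_date=$(sed -n 's/^date:[[:space:]]*//p' "$post_file" | head -n 1 | cut -c1-10)
  [ -n "$post_date" ] || return 1
  # With the dashes removed, ISO dates compare correctly as integers.
  post_num=$(printf '%s' "$post_date" | tr -d '-')
  today_num=$(printf '%s' "$today" | tr -d '-')
  [ "$post_num" -gt "$today_num" ] || return 1
  # Future-dated: flag it unless it is already marked draft or unlisted.
  ! grep -Eq '^(draft|unlisted):[[:space:]]*true' "$post_file"
}

# Example with a throwaway file standing in for index.md.
demo=$(mktemp)
printf '%s\n' '---' 'title: "Scaling Ray on AKS"' 'date: 2026-02-13' '---' > "$demo"
if needs_draft_flag "$demo" 2026-02-10; then
  echo "future-dated and not marked draft or unlisted"
fi
rm -f "$demo"
```

Passing the reference date explicitly keeps the check deterministic in CI; a real hook would likely pass `$(date +%F)` instead.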

Copilot uses AI. Check for mistakes.
- anson-qian
- bob-mital
- kenneth-kilty
categories:

Copilot AI Feb 10, 2026

categories: is present but empty in the front matter. Other blog posts in this repo don’t use categories in front matter, so this is likely accidental and should be removed to avoid confusing/unused metadata.

Suggested change
- categories:

- **Operational simplicity** through automated credential management with service principal

Whether you're [fine-tuning models with DeepSpeed or LLaMA-Factory](https://github.com/Azure-Samples/aks-anyscale/tree/main/examples/finetuning) or [deploying inference endpoints for LLMs ranging from small to large-scale reasoning models](https://github.com/Azure-Samples/aks-anyscale/tree/main/examples/inferencing), this architecture delivers a production-grade ML platform that scales with your needs.


Copilot AI Feb 10, 2026

This post is missing the <!-- truncate --> marker that most blog posts use to control the listing-page excerpt. Add it after the intro so the blog index doesn’t render the entire article preview.

Suggested change
+ <!-- truncate -->
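
As a rough illustration of how this could be checked across posts, the sketch below looks for the marker in a file. `has_truncate_marker` is a hypothetical helper name, and the check only covers the HTML-comment form of the marker quoted in this comment.

```bash
# Hypothetical helper (not part of the PR): exits 0 when the file contains
# the truncate marker on a line of its own, allowing surrounding whitespace.
has_truncate_marker() {
  grep -Eq '^[[:space:]]*<!--[[:space:]]*truncate[[:space:]]*-->[[:space:]]*$' "$1"
}

# Example with a throwaway file standing in for index.md.
demo=$(mktemp)
printf '%s\n' 'Intro paragraph.' '' '<!-- truncate -->' '' 'Rest of the post.' > "$demo"
if has_truncate_marker "$demo"; then
  echo "truncate marker present"
else
  echo "truncate marker missing"
fi
rm -f "$demo"
```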

- **Improve fault tolerance**: If one region experiences an outage or capacity shortage, workloads can be automatically rerouted to healthy clusters
- **Scale beyond single-cluster limits**: Azure imposes quota limits on GPU instances per region, but multi-region deployments let you aggregate capacity

To add a cluster or another region to your existing Anyscale cloud, define a cloud resource ([cloud_resource.yaml](./aks-anyscale/cloud_resource.yaml)):

Copilot AI Feb 10, 2026

The link ./aks-anyscale/cloud_resource.yaml points to a file that isn’t present in this blog post directory, so it will be broken on the site. Update the link to the correct location (for example, the Azure-Samples repo path) or add the referenced file to the post assets.

Suggested change
- To add a cluster or another region to your existing Anyscale cloud, define a cloud resource ([cloud_resource.yaml](./aks-anyscale/cloud_resource.yaml)):
+ To add a cluster or another region to your existing Anyscale cloud, define a cloud resource ([cloud_resource.yaml](https://github.com/Azure-Samples/aks-anyscale/blob/main/config/cloud_resource.yaml)):
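
Broken relative links like this one can be found mechanically. The following is a minimal sketch under stated assumptions: `broken_relative_links` is a hypothetical helper, it only looks at Markdown links whose target starts with `./`, and it relies on `grep -o`, which is available in GNU and BSD grep but is not strictly POSIX.

```bash
# Hypothetical helper (not part of the PR): prints each `./`-relative
# Markdown link target in a post that does not exist next to the post.
broken_relative_links() {
  post_file=$1
  post_dir=$(dirname "$post_file")
  # Extract every "](./...)" link, then strip the "](" and ")" wrappers.
  grep -o '](\./[^)]*)' "$post_file" | sed 's/^](//; s/)$//' |
    while IFS= read -r target; do
      [ -e "$post_dir/$target" ] || printf '%s\n' "$target"
    done
}

# Example with a throwaway directory standing in for the blog post folder.
demo_dir=$(mktemp -d)
printf '%s\n' 'define a cloud resource ([cloud_resource.yaml](./aks-anyscale/cloud_resource.yaml)):' > "$demo_dir/index.md"
broken_relative_links "$demo_dir/index.md"   # prints ./aks-anyscale/cloud_resource.yaml
rm -rf "$demo_dir"
```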

-f "$CLOUD_RESOURCE_YAML"
```

With infrastructure deployed across multiple regions, you can manage and monitor Ray workloads from the Anyscale console. The single-pane-of-glass view shows all registered clusters and their available resources:

Copilot AI Feb 10, 2026

“single-pane-of-glass” is typically written without hyphens (“single pane of glass”). Consider updating to match common Microsoft style guidance.

Suggested change
- With infrastructure deployed across multiple regions, you can manage and monitor Ray workloads from the Anyscale console. The single-pane-of-glass view shows all registered clusters and their available resources:
+ With infrastructure deployed across multiple regions, you can manage and monitor Ray workloads from the Anyscale console. The single pane of glass view shows all registered clusters and their available resources:

Comment on lines +82 to +83
--enable-blob-driver
...

Copilot AI Feb 10, 2026

This az aks create example is missing a line-continuation (\) after --enable-blob-driver, so the command won’t paste/run as written. Add the continuation (and consider replacing ... with a commented placeholder) to keep the snippet copy/pasteable.

Suggested change
- --enable-blob-driver
- ...
+ --enable-blob-driver \
+ # ...additional flags as needed...
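
To make the continuation behavior concrete, here is a small sketch that uses a stand-in function instead of the real `az aks create` call, so it can run anywhere. `count_args` and the `demo` value are hypothetical; only `--enable-blob-driver` comes from the post. It also shows a caveat worth keeping in mind: a comment placed right after a continuation ends the command, so any flags added later need to go above that comment line.

```bash
# Stand-in for `az aks create` (hypothetical): prints how many arguments
# the command actually received.
count_args() { echo "$#"; }

# With a trailing backslash the next line is part of the same command.
count_args --name demo \
  --enable-blob-driver          # prints 3

# A comment after the continuation ends the command at that point, so only
# the flag before the backslash is passed.
count_args --enable-blob-driver \
  # ...additional flags as needed...
# the call above prints 1
```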

storage: 100Gi
```

5. Configure Ray workloads read from and write to mounted blob path (`/mnt/cluster_storage`).

Copilot AI Feb 10, 2026

Step 5 reads awkwardly (“Configure Ray workloads read from and write…”). Consider rephrasing to “Configure Ray workloads to read from and write to …” for correct grammar.

Suggested change
- 5. Configure Ray workloads read from and write to mounted blob path (`/mnt/cluster_storage`).
+ 5. Configure Ray workloads to read from and write to the mounted blob path (`/mnt/cluster_storage`).

@@ -0,0 +1,166 @@
---
title: "Scaling Ray on AKS"
description: "Learn how to run production-grade Ray workloads on Azure Kubernetes Service with multi-region support, unified storage, and secure authentication."

Copilot AI Feb 10, 2026

The front-matter description is 146 characters, which is shorter than the repo’s 150–160 character SEO guidance. Please expand it slightly so it lands in the 150–160 range.

Suggested change
- description: "Learn how to run production-grade Ray workloads on Azure Kubernetes Service with multi-region support, unified storage, and secure authentication."
+ description: "Learn how to run production-grade Ray workloads on Azure Kubernetes Service with multi-region support, unified storage, and secure authentication for AI."
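
The lengths are easy to verify in a shell. This is a minimal sketch, assuming the 150 to 160 character range quoted in this comment; `desc_in_range` is a hypothetical helper, and the two strings are the current and suggested descriptions without their surrounding quotes.

```bash
# Hypothetical helper (not part of the PR): exits 0 when the text length
# falls inside the 150-160 character range mentioned in the review comment.
desc_in_range() {
  len=${#1}
  [ "$len" -ge 150 ] && [ "$len" -le 160 ]
}

current='Learn how to run production-grade Ray workloads on Azure Kubernetes Service with multi-region support, unified storage, and secure authentication.'
suggested='Learn how to run production-grade Ray workloads on Azure Kubernetes Service with multi-region support, unified storage, and secure authentication for AI.'

echo "${#current}"     # prints 146
echo "${#suggested}"   # prints 153
desc_in_range "$suggested" && echo "suggested description is in range"
```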

```bash
anyscale cloud resource create \
--cloud "$ANYSCALE_CLOUD_NAME" \
-f "$CLOUD_RESOURCE_YAML"

Copilot AI Feb 10, 2026

The CLI example uses -f "$CLOUD_RESOURCE_YAML", but the post never defines CLOUD_RESOURCE_YAML (and earlier names the file cloud_resource.yaml). Consider using the explicit path shown in the text or add a short snippet showing how to set CLOUD_RESOURCE_YAML.

Suggested change
- -f "$CLOUD_RESOURCE_YAML"
+ -f ./aks-anyscale/cloud_resource.yaml
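
If the post keeps the variable form instead, a short setup snippet would close the gap. The sketch below is one possible shape, not the author's method: the default path is the one named in the post text, `my-anyscale-cloud` is a placeholder value, and `create_cloud_resource` is a hypothetical wrapper around the `anyscale cloud resource create` command quoted above.

```bash
# Defaults are used only when the variables are not already set.
CLOUD_RESOURCE_YAML="${CLOUD_RESOURCE_YAML:-./aks-anyscale/cloud_resource.yaml}"
ANYSCALE_CLOUD_NAME="${ANYSCALE_CLOUD_NAME:-my-anyscale-cloud}"   # placeholder name

# Hypothetical wrapper: refuse to call the CLI when the YAML file is missing,
# so a wrong path fails with a clear message.
create_cloud_resource() {
  if [ ! -f "$CLOUD_RESOURCE_YAML" ]; then
    echo "missing cloud resource file: $CLOUD_RESOURCE_YAML" >&2
    return 1
  fi
  anyscale cloud resource create \
    --cloud "$ANYSCALE_CLOUD_NAME" \
    -f "$CLOUD_RESOURCE_YAML"
}

# Usage, once the Anyscale CLI is installed and authenticated:
#   create_cloud_resource
```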
