Commit 60b2759

Merge pull request #3159 from madeline-underwood/ray

Ray

2 parents 092e63b + 1ad6e8c

5 files changed

Lines changed: 35 additions & 40 deletions

content/learning-paths/servers-and-cloud-computing/ray-on-axion/_index.md

Lines changed: 0 additions & 4 deletions
````diff
@@ -1,10 +1,6 @@
 ---
 title: Scale AI workloads with Ray on Google Cloud C4A Axion VM
 description: Deploy and run distributed AI workloads using Ray on Google Cloud Axion C4A Arm-based VMs, covering parallel tasks, hyperparameter tuning, and model serving with Ray Core, Train, Tune, and Serve.
-
-draft: true
-cascade:
-  draft: true
 
 minutes_to_complete: 30
 
````

content/learning-paths/servers-and-cloud-computing/ray-on-axion/distributed_workloads.md

Lines changed: 9 additions & 9 deletions
````diff
@@ -6,11 +6,11 @@ weight: 6
 layout: learningpathall
 ---
 
-## Run Distributed Workloads with Ray
+## Run distributed workloads with Ray
 
 This section demonstrates how to execute parallel tasks and distributed training workloads using Ray on Arm.
 
-You will run simple distributed functions and then scale to multi-worker training using Ray.
+You'll run distributed functions and then scale to multi-worker training using Ray.
 
 ## Run distributed tasks
 
@@ -32,7 +32,7 @@ results = ray.get([square.remote(i) for i in range(10)])
 print("Results:", results)
 ```
 
-### Explanation
+### Code explanation
 
 * `ray.init()` → connects to the running Ray cluster
 * `@ray.remote` → converts a function into a distributed task
````
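For context beyond the diff: the `results = ray.get(...)` context line and the bullets above imply a `ray_test.py` of roughly this shape. A minimal sketch, not part of this commit:

```python
import ray

# Connect to the Ray cluster started earlier with `ray start --head`
ray.init()

# @ray.remote converts an ordinary function into a distributed task
@ray.remote
def square(x):
    return x * x

# Launch 10 tasks in parallel; ray.get() blocks until all complete
results = ray.get([square.remote(i) for i in range(10)])
print("Results:", results)
```

Running it prints the squares of 0 through 9, computed as parallel Ray tasks.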
````diff
@@ -92,7 +92,7 @@ trainer = TorchTrainer(
 trainer.fit()
 ```
 
-### Execute training
+### Run the training script
 
 ```bash
 python3 ray_train.py
@@ -115,14 +115,14 @@ The output is similar to:
 
 This confirms distributed training across multiple workers.
 
-## Explanation
+## Training code explanation
 
 * `TorchTrainer` → handles distributed training execution
 * `ScalingConfig(num_workers=2)` → runs training on 2 workers
 * Each worker executes training in parallel
-* Logs may appear from multiple processes
+* Logs can appear from multiple processes
 
-## Ray Jobs View (Tasks & Training)
+## Ray Jobs view (tasks and training)
 
 ![Ray Dashboard Jobs tab showing successful execution of ray_test.py and ray_train.py#center](images/ray-jobs.png "Ray Jobs tab showing distributed tasks and training execution status")
 
````
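Likewise, the `TorchTrainer` context line and the bullets above suggest the shape of `ray_train.py`. A hedged sketch, with a placeholder standing in for the Learning Path's real training loop:

```python
from ray.train import ScalingConfig
from ray.train.torch import TorchTrainer

def train_loop_per_worker():
    # Placeholder: the Learning Path's ray_train.py runs a real
    # PyTorch training loop here; each worker executes it in parallel.
    print("training step on one worker")

# ScalingConfig(num_workers=2) distributes the loop across 2 workers
trainer = TorchTrainer(
    train_loop_per_worker,
    scaling_config=ScalingConfig(num_workers=2),
)
trainer.fit()
```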

````diff
@@ -137,6 +137,6 @@ You have successfully:
 * Executed parallel tasks using Ray Core
 * Converted functions into distributed workloads
 * Performed distributed training using multiple workers
-* Observed execution in the Ray dashboard
+* Observed execution in the Ray Dashboard
 
-Next, you will perform hyperparameter tuning, deploy models, and benchmark performance.
+Next, you'll perform hyperparameter tuning, deploy models, and benchmark performance.
````

content/learning-paths/servers-and-cloud-computing/ray-on-axion/firewall-setup.md

Lines changed: 2 additions & 2 deletions
````diff
@@ -6,7 +6,7 @@ weight: 3
 layout: learningpathall
 ---
 
-Create a firewall rule in Google Cloud Console to expose required ports for the Ray dashboard and Ray Serve API.
+Create a firewall rule in Google Cloud Console to expose required ports for the Ray Dashboard and Ray Serve API.
 
 {{% notice Note %}}
 For help with GCP setup, see the Learning Path [Getting started with Google Cloud Platform](/learning-paths/servers-and-cloud-computing/csp/google/).
@@ -38,7 +38,7 @@ Finally, select **Specified protocols and ports** under the **Protocols and port
 
 Then select **Create**.
 
-![Google Cloud Console Protocols and ports section with TCP ports configured alt-txt#center](images/network-port.png "Setting Ray ports in the firewall rule")
+![Google Cloud Console Protocols and ports section showing TCP checkbox selected with ports 8265, 8000, and 6379 configured for Ray Dashboard, Serve API, and Head Node#center](images/network-port.png "Setting Ray ports in the firewall rule")
 
 ## What you've accomplished and what's next
 
````
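Aside: the ports named in the new alt text (8265 for the Ray Dashboard, 8000 for the Serve API, 6379 for the head node) could also be opened from the CLI. A sketch using the standard gcloud command, with the rule name and network defaults assumed:

```bash
# Assumed rule name; add --network/--source-ranges as appropriate
gcloud compute firewall-rules create allow-ray-ports \
  --direction=INGRESS \
  --allow=tcp:8265,tcp:8000,tcp:6379
```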

content/learning-paths/servers-and-cloud-computing/ray-on-axion/setup_and_cluster.md

Lines changed: 11 additions & 11 deletions
````diff
@@ -10,7 +10,7 @@ layout: learningpathall
 
 This section guides you through installing Ray on a GCP Arm64 (Axion) virtual machine and setting up a single-node distributed computing cluster.
 
-You will configure the environment, install dependencies, and initialize a Ray cluster optimized for Arm-based infrastructure.
+You'll configure the environment, install dependencies, and initialize a Ray cluster optimized for Arm-based infrastructure.
 
 ## Update your system
 
@@ -96,7 +96,7 @@ ray start --head --dashboard-host=0.0.0.0 --num-cpus=4
 ```
 
 * `--head` → starts the main node (scheduler)
-* `--dashboard-host=0.0.0.0` → allows external dashboard access
+* `--dashboard-host=0.0.0.0` → allows external Ray Dashboard access
 * `--num-cpus=4` → allocates 4 CPU cores
 
 The output is similar to:
@@ -165,30 +165,30 @@ Pending Demands:
 (no resource demands)
 ```
 
-## Access the dashboard
+## Access the Ray Dashboard
 
-Open in browser:
+Open the following URL in your browser:
 
 ```
 http://<VM-IP>:8265
 ```
 
-This dashboard provides visibility into jobs, tasks, and resource utilization.
+The Ray Dashboard provides visibility into jobs, tasks, and resource utilization.
 
-## Ray Dashboard Overview
+## Ray Dashboard overview
 
-![Ray Dashboard showing cluster overview, utilization, and navigation tabs#center](images/ray-dashboard.png "Ray Dashboard Overview showing cluster status and metrics")
+![Ray Dashboard showing cluster overview, utilization, and navigation tabs#center](images/ray-dashboard.png "Ray Dashboard overview showing cluster status and metrics")
 
-This dashboard helps monitor distributed execution and debug workloads in real time.
+The Ray Dashboard helps monitor distributed execution and debug workloads in real time.
 
 ## What you've learned and what's next
 
 You have successfully:
 
-* Installed Ray on Arm-based SUSE VM
+* Installed Ray on an Arm-based SUSE VM
 * Created an isolated Python environment
 * Installed required dependencies
 * Initialized a Ray cluster
-* Verified cluster status and dashboard
+* Verified cluster status and Ray Dashboard
 
-Next, you will run distributed workloads using Ray.
+Next, you'll run distributed workloads using Ray.
````
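The `Pending Demands: (no resource demands)` context at the top of the last hunk is the tail of Ray's cluster status report; assuming the single-node cluster is running, it can be reproduced with:

```bash
# Prints node, resource-usage, and demand sections for the running cluster
ray status
```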

content/learning-paths/servers-and-cloud-computing/ray-on-axion/tuning_serving_benchmark.md

Lines changed: 13 additions & 14 deletions
````diff
@@ -6,7 +6,7 @@ weight: 7
 layout: learningpathall
 ---
 
-## Ray Tune, Serve, and Benchmarking
+## Hyperparameter tuning, serving, and benchmarking
 
 This section demonstrates hyperparameter tuning, model serving, and performance benchmarking using Ray.
 
@@ -42,20 +42,19 @@ results = tuner.fit()
 print("Best result:", results.get_best_result(metric="score", mode="max"))
 ```
 
-### Explanation
+### Code explanation
 
 * `tune.grid_search()` → tries multiple hyperparameter values
-* Each value runs as a **separate parallel trial**
+* Each value runs as a separate parallel trial
 * `session.report()` → sends metrics back to Ray
 * `Tuner.fit()` → executes all trials
 
-### Execute tuning
+### Run hyperparameter tuning
 
 ```bash
 python3 ray_tune.py
 ```
 
-### Output
 The output is similar to:
 
 ```output
````
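For context: the `tuner.fit()` and `get_best_result(metric="score", mode="max")` context lines plus the bullets imply a `ray_tune.py` along these lines. In this sketch the objective body and the two losing grid values are assumptions; only the winning learning rate 0.1 is visible in the diff:

```python
from ray import tune
from ray.air import session

def objective(config):
    # Toy metric so the sketch runs; the real script scores a model
    score = config["lr"] * 10
    session.report({"score": score})  # send the metric back to Ray Tune

tuner = tune.Tuner(
    objective,
    # grid_search runs one parallel trial per value; 0.001 and 0.01 are
    # assumed, 0.1 is the "Best configuration" named in the diff output
    param_space={"lr": tune.grid_search([0.001, 0.01, 0.1])},
)
results = tuner.fit()
print("Best result:", results.get_best_result(metric="score", mode="max"))
```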
````diff
@@ -84,7 +83,7 @@ Best result: Result(
 
 ### Understanding the output
 
-* Ray created **3 parallel trials** using different learning rates
+* Ray created 3 parallel trials using different learning rates
 * Each trial executed independently on available CPU cores
 * Scores represent the performance of each configuration
 
@@ -96,7 +95,7 @@ Best result: Result(
 
 **Best configuration = learning rate 0.1**
 
-* Total runtime ≈ **1 second** (parallel execution)
+* Total runtime ≈ 1 second (parallel execution)
 * Results stored in:
 
 ```bash
@@ -127,7 +126,7 @@ app = Model.bind()
 serve.run(app)
 ```
 
-### Explanation
+### Code explanation
 
 * `serve.start()` → initializes serving system
 * `@serve.deployment` → defines deployable service
````
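The `app = Model.bind()` and `serve.run(app)` context lines, together with these bullets, suggest a Serve script like the sketch below; the handler body is inferred from the curl response in the next hunk:

```python
from ray import serve

serve.start()  # initialize the serving system on the running cluster

@serve.deployment
class Model:
    async def __call__(self, request):
        # Dict return values are serialized to JSON; this mirrors the
        # curl output shown in the next hunk
        return {"message": "Hello from Ray Serve on Arm VM!"}

app = Model.bind()
serve.run(app)  # serves at http://127.0.0.1:8000/ by default
```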
````diff
@@ -148,14 +147,14 @@ curl http://127.0.0.1:8000/
 The output is similar to:
 
 ```output
-{"message":"Hello from Ray Serve on ARM VM!"}
+{"message":"Hello from Ray Serve on Arm VM!"}
 ```
 
-## Ray Tune Execution in Dashboard
+## Ray Tune execution in Ray Dashboard
 
 ![Ray Dashboard Jobs tab showing ray_tune.py trials with SUCCEEDED status#center](images/ray-jobs-status.png "Ray Tune trials executed successfully with different configurations")
 
-The dashboard shows all jobs executed successfully, confirming correct Ray cluster operation.
+The Ray Dashboard shows all jobs executed successfully, confirming correct Ray cluster operation.
 
 ## Benchmark distributed execution
 
@@ -184,7 +183,7 @@ print("Execution Time:", end - start)
 ```
 
 
-### Execute benchmark
+### Run the benchmark
 
 ```bash
 ray stop
@@ -202,10 +201,10 @@ Execution Time: 5.171869277954102
 ## Understanding the benchmark
 
 * 20 tasks executed in parallel
-* Each task takes ~1 second
+* Each task takes approximately 1 second
 * With 4 CPUs → total time ≈ 5 seconds
 
-**Sequential execution would take ~20 seconds**
+**Sequential execution would take approximately 20 seconds**
 
 * Confirms Ray parallel execution
````
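The benchmark arithmetic above (20 one-second tasks across 4 CPUs run in about 20 / 4 = 5 waves of roughly 1 second each, matching the measured 5.17 s) pairs with the `print("Execution Time", ...)` context line. A sketch with the task name and sleep body assumed:

```python
import time
import ray

ray.init()

@ray.remote
def slow_task(i):
    time.sleep(1)  # each task takes approximately 1 second
    return i

start = time.time()
# 20 tasks over 4 CPUs finish in about 5 waves of ~1 second each
results = ray.get([slow_task.remote(i) for i in range(20)])
end = time.time()
print("Execution Time:", end - start)
```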
