Commit c4ae2b4

Merge pull request #2241 from pareenaverma/content_review

Tech review of Arcee on GCP

2 parents 0f8ca2d + aa7f7d6

3 files changed: +7 -4 lines changed

content/learning-paths/servers-and-cloud-computing/arcee-foundation-model-on-gcp/01_launching_an_axion_instance.md

Lines changed: 3 additions & 1 deletion

@@ -11,7 +11,7 @@ layout: learningpathall
 Before you begin, make sure you have the following:
 
 - A Google Cloud account
-- Permission to launch a Compute Engine Axion instance of type `c4a-standard-16` (or larger)
+- Permission to launch a Google Axion instance of type `c4a-standard-16` (or larger)
 - At least 128 GB of available storage
 
 If you're new to Google Cloud, check out the Learning Path [Getting Started with Google Cloud](/learning-paths/servers-and-cloud-computing/csp/google/).

@@ -33,6 +33,8 @@ In the left sidebar, select **OS and storage**.
 
 Under **Operating system and storage**, click on **Change**
 
+Select Ubuntu as the Operating system. For version select Ubuntu 24.04 LTS Minimal.
+
 Set the size of the disk to 128 GB, then click on **Select**.
 
 ## Review and launch the instance
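
Note: the console steps this hunk documents (machine type `c4a-standard-16`, Ubuntu 24.04 LTS Minimal, 128 GB disk) have a command-line equivalent. A minimal sketch, assuming the `ubuntu-minimal-2404-lts-arm64` image family in the `ubuntu-os-cloud` project; the instance name and zone are placeholders, not part of this commit:

# Sketch only, not part of this commit. Pick a zone that offers C4A
# (Axion) capacity; verify the image family with:
#   gcloud compute images list --filter="family~ubuntu-minimal-2404"
gcloud compute instances create afm-axion-demo \
  --zone=us-central1-a \
  --machine-type=c4a-standard-16 \
  --image-family=ubuntu-minimal-2404-lts-arm64 \
  --image-project=ubuntu-os-cloud \
  --boot-disk-size=128GB

Once the instance is running, `gcloud compute ssh afm-axion-demo` connects to it.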

content/learning-paths/servers-and-cloud-computing/arcee-foundation-model-on-gcp/05_downloading_and_optimizing_afm45b.md

Lines changed: 2 additions & 2 deletions

@@ -86,7 +86,7 @@ This command creates a 4-bit quantized version of the model:
 - `llama-quantize` is the quantization tool from Llama.cpp.
 - `afm-4-5B-F16.gguf` is the input GGUF model file in 16-bit precision.
 - `Q4_0` applies zero-point 4-bit quantization.
-- This reduces the model size by approximately 45% (from ~15GB to ~8GB).
+- This reduces the model size by approximately 70% (from ~15GB to ~4.4GB).
 - The quantized model will use less memory and run faster, though with a small reduction in accuracy.
 - The output file will be `afm-4-5B-Q4_0.gguf`.

@@ -104,7 +104,7 @@ bin/llama-quantize models/afm-4-5b/afm-4-5B-F16.gguf models/afm-4-5b/afm-4-5B-Q8
 
 This command creates an 8-bit quantized version of the model:
 - `Q8_0` specifies 8-bit quantization with zero-point compression.
-- This reduces the model size by approximately 70% (from ~15GB to ~4.4GB).
+- This reduces the model size by approximately 45% (from ~15GB to ~8GB).
 - The 8-bit version provides a better balance between memory usage and accuracy than 4-bit quantization.
 - The output file is named `afm-4-5B-Q8_0.gguf`.
 - Commonly used in production scenarios where memory resources are available.
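
Note: these two hunks swap size estimates that were attached to the wrong commands, making the arithmetic consistent: 4-bit weights are roughly a quarter the size of 16-bit (~15 GB to ~4.4 GB, about 70% smaller), while 8-bit weights are roughly half (~8 GB, about 45% smaller). For reference, a sketch of both invocations from the llama.cpp build directory, following the paths in the hunk header above; the Q4_0 command is inferred from the explanation in the first hunk rather than shown in this commit, so treat it as an assumption:

# llama-quantize usage: llama-quantize <input.gguf> <output.gguf> <type>
# Paths follow the Learning Path layout from the hunk header; adjust to your tree.
bin/llama-quantize models/afm-4-5b/afm-4-5B-F16.gguf models/afm-4-5b/afm-4-5B-Q4_0.gguf Q4_0
bin/llama-quantize models/afm-4-5b/afm-4-5B-F16.gguf models/afm-4-5b/afm-4-5B-Q8_0.gguf Q8_0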

content/learning-paths/servers-and-cloud-computing/arcee-foundation-model-on-gcp/_index.md

Lines changed: 2 additions & 1 deletion

@@ -17,7 +17,7 @@ learning_objectives:
 - Evaluate model quality by measuring perplexity
 
 prerequisites:
-- A [Google Cloud account](https://console.cloud.google.com/) with permission to launch Axion (`c4a.4x-standard-16` or larger) instances
+- A [Google Cloud account](https://console.cloud.google.com/) with permission to launch Axion (`c4a-standard-16` or larger) instances
 - Basic familiarity with Linux and SSH
 
 author: Julien Simon

@@ -28,6 +28,7 @@ skilllevels: Introductory
 subjects: ML
 arm_ips:
 - Neoverse
+cloud_service_providers: Google Cloud
 tools_software_languages:
 - Google Cloud
 - Hugging Face
