---
title: Deploy Kafka on Microsoft Azure Cobalt 100 processors

minutes_to_complete: 30

who_is_this_for: This Learning Path is designed for software developers looking to migrate their Kafka workloads from x86_64 to Arm-based platforms, specifically Microsoft Azure Cobalt 100 processors.

learning_objectives:
- Provision an Azure Arm64 virtual machine using the Azure console, with Ubuntu Pro 24.04 LTS as the base image.
- Deploy Kafka on the Ubuntu virtual machine.
- Perform Kafka baseline testing and benchmarking on both x86_64 and Arm64 virtual machines.

prerequisites:
- A [Microsoft Azure](https://azure.microsoft.com/) account with access to Cobalt 100 based instances (Dpsv6).
- Basic understanding of Linux command line.
- Familiarity with the [Apache Kafka architecture](https://kafka.apache.org/) and deployment practices on Arm64 platforms.

author: Jason Andrews

### Tags
skilllevels: Advanced
subjects: Storage
cloud_service_providers: Microsoft Azure

armips:
- Neoverse

tools_software_languages:
- Kafka
- kafka-producer-perf-test.sh
- kafka-consumer-perf-test.sh

operatingsystems:
- Linux

further_reading:
- resource:
title: Kafka Manual
link: https://kafka.apache.org/documentation/
type: documentation
- resource:
title: Kafka Performance Tool
link: https://codemia.io/knowledge-hub/path/use_kafka-producer-perf-testsh_how_to_set_producer_config_at_kafka_210-0820
type: documentation
- resource:
title: Kafka on Azure
link: https://learn.microsoft.com/en-us/samples/azure/azure-quickstart-templates/kafka-ubuntu-multidisks/
type: documentation


### FIXED, DO NOT MODIFY
# ================================================================================
weight: 1 # _index.md always has weight of 1 to order correctly
layout: "learningpathall" # All files under learning paths have this same wrapper
learning_path_main_page: "yes" # This should be surfaced when looking for related content. Only set for _index.md of learning path content.
---
---
# ================================================================================
# FIXED, DO NOT MODIFY THIS FILE
# ================================================================================
weight: 21 # Set to always be larger than the content in this path to be at the end of the navigation.
title: "Next Steps" # Always the same, html page title.
layout: "learningpathall" # All files under learning paths have this same wrapper for Hugo processing.
---
---
title: "Overview"

weight: 2

layout: "learningpathall"
---

## Cobalt 100 Arm-based processor

Azure's Cobalt 100 is Microsoft's first-generation, in-house Arm-based processor. Designed entirely by Microsoft on the Arm Neoverse N2 architecture, this 64-bit CPU delivers improved performance and energy efficiency across a broad spectrum of cloud-native, scale-out Linux workloads, including web and application servers, data analytics, open-source databases, and caching systems. Running at 3.4 GHz, the Cobalt 100 processor allocates a dedicated physical core to each vCPU, ensuring consistent and predictable performance.

To learn more about Cobalt 100, refer to the blog [Announcing the preview of new Azure virtual machine based on the Azure Cobalt 100 processor](https://techcommunity.microsoft.com/blog/azurecompute/announcing-the-preview-of-new-azure-vms-based-on-the-azure-cobalt-100-processor/4146353).

## Apache Kafka
Apache Kafka is a high-performance, open-source distributed event streaming platform designed for building real-time data pipelines and streaming applications.

It allows you to publish, subscribe to, store, and process streams of records in a fault-tolerant and scalable manner. Kafka stores data in topics, which are partitioned and replicated across a cluster to ensure durability and high availability.

Kafka is widely used for messaging, log aggregation, event sourcing, real-time analytics, and integrating large-scale data systems. Learn more from the [Apache Kafka official website](https://kafka.apache.org/) and its [official documentation](https://kafka.apache.org/documentation).
---
title: Baseline Testing
weight: 5

### FIXED, DO NOT MODIFY
layout: learningpathall
---

## Run a Baseline test with Kafka

After installing Kafka on your Arm64 virtual machine, you can perform a simple baseline test to validate that Kafka runs correctly and produces the expected output.

Kafka 4.1.0 uses **KRaft**, which consolidates cluster metadata management into Kafka itself, eliminating the need for a separate ZooKeeper instance.

You need four terminals to complete this test: the first starts the Kafka server, the second creates a topic, and the final two send and receive messages, respectively.

### Initial Setup: Configure & Format KRaft
**KRaft** is Kafka's new metadata protocol that integrates the responsibilities of ZooKeeper directly into Kafka, simplifying deployment and improving scalability by making the brokers self-managing.

First, you must configure your `server.properties` file for KRaft and format the storage directory. These steps are done only once.

**1. Edit the Configuration File**: Open your `server.properties` file.

```console
nano /opt/kafka/config/server.properties
```

**2. Add/Modify KRaft Properties:** Ensure the following lines are present and correctly configured for a single-node setup.

This configuration file sets up a single Kafka server to act as both a **controller** (managing cluster metadata) and a broker (handling data), running in **KRaft** mode. It defines the node's unique ID and specifies the local host as the sole participant in the **controller** quorum.

```properties
process.roles=controller,broker
node.id=1
controller.quorum.voters=1@localhost:9093
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
advertised.listeners=PLAINTEXT://localhost:9092
log.dirs=/tmp/kraft-combined-logs
```
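If you prefer to script this edit rather than modify the file in nano, the same six properties can be written with a here-doc. A minimal sketch, staging the file at a scratch path so it runs anywhere; on the VM you would target `/opt/kafka/config/server.properties`:

```shell
#!/usr/bin/env bash
# Sketch: write the single-node KRaft settings to a scratch copy of
# server.properties. CONF is a stand-in path; replace it with
# /opt/kafka/config/server.properties on the VM.
CONF=/tmp/server.properties

cat > "$CONF" <<'EOF'
process.roles=controller,broker
node.id=1
controller.quorum.voters=1@localhost:9093
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
advertised.listeners=PLAINTEXT://localhost:9092
log.dirs=/tmp/kraft-combined-logs
EOF

# Confirm the roles line landed as expected.
grep '^process.roles=' "$CONF"
```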
**3. Format the Storage Directory:** Use the `kafka-storage.sh` tool to format the metadata directory.

```console
cd /opt/kafka
bin/kafka-storage.sh format -t $(bin/kafka-storage.sh random-uuid) -c config/server.properties
```
You should see an output similar to:

```output
Formatting metadata directory /tmp/kraft-combined-logs with metadata.version 4.1-IV1.
```
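The command above generates the cluster ID inline with command substitution. If you want a record of the ID for later reference, you can capture it in a variable first. A sketch, using the kernel's UUID source as a stand-in generator so it runs anywhere; on the VM, use `bin/kafka-storage.sh random-uuid` instead:

```shell
# Sketch: capture the cluster ID once so it can be logged and reused.
# /proc/sys/kernel/random/uuid is only a stand-in for the real command:
#   CLUSTER_ID=$(bin/kafka-storage.sh random-uuid)
CLUSTER_ID=$(cat /proc/sys/kernel/random/uuid)
echo "Cluster ID: $CLUSTER_ID"

# Then format the metadata directory with the saved ID:
# bin/kafka-storage.sh format -t "$CLUSTER_ID" -c config/server.properties
```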

Now, perform the baseline test.

### Terminal 1 – Start Kafka Broker
This command starts the Kafka broker (the main server that sends and receives messages) in KRaft mode. Keep this terminal open.

```console
cd /opt/kafka
bin/kafka-server-start.sh config/server.properties
```
### Terminal 2 – Create a Topic
This command creates a new Kafka topic named `test-topic-kafka` (like a channel where messages will be stored and shared) with 1 partition and 1 copy (replica).

```console
cd /opt/kafka
bin/kafka-topics.sh --create --topic test-topic-kafka --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
```
You should see output similar to:

```output
Created topic test-topic-kafka.
```

Verify that the topic was created:

```console
bin/kafka-topics.sh --list --bootstrap-server localhost:9092
```
You should see output similar to:

```output
__consumer_offsets
test-topic-kafka
```

### Terminal 3 – Console Producer (Write Message)
This command starts the **Kafka Producer**, which lets you type and send messages into the `test-topic-kafka` topic. For example, when you type `hello from azure arm vm`, the message is delivered to any Kafka consumer subscribed to that topic.

```console
cd /opt/kafka
bin/kafka-console-producer.sh --topic test-topic-kafka --bootstrap-server localhost:9092
```
You should see an empty prompt where you can start typing. Type `hello from azure arm vm` and press **Enter**.

### Terminal 4 – Console Consumer (Read Message)
This command starts the **Kafka Consumer**, which listens to the `test-topic-kafka` topic and displays all messages from the beginning.

```console
cd /opt/kafka
bin/kafka-console-consumer.sh --topic test-topic-kafka --from-beginning --bootstrap-server localhost:9092
```

You should see your message `hello from azure arm vm` displayed in this terminal, confirming that the producer's message was successfully received.

Now you can proceed to benchmarking Kafka’s performance on the Azure Cobalt 100 Arm virtual machine.
---
title: Benchmarking with Official Kafka Tools
weight: 6

### FIXED, DO NOT MODIFY
layout: learningpathall
---

## Benchmark Kafka on Azure Cobalt 100 Arm-based instances and x86_64 instances

Kafka’s official performance tools (**kafka-producer-perf-test.sh** and **kafka-consumer-perf-test.sh**) let you generate test workloads, measure message throughput, and record end-to-end latency.

## Steps for Kafka Benchmarking

Before starting the benchmark, ensure that the **Kafka broker** is already running in a separate terminal.

Now, open two new terminals: one for the **producer benchmark** and another for the **consumer benchmark**.

### Terminal A - Producer Benchmark

The producer benchmark measures how fast Kafka can send messages, reporting throughput and latency percentiles.

```console
cd /opt/kafka
bin/kafka-producer-perf-test.sh \
--topic test-topic-kafka \
--num-records 1000000 \
--record-size 100 \
--throughput -1 \
--producer-props bootstrap.servers=localhost:9092
```
You should see output similar to:

```output
1000000 records sent, 252589.0 records/sec (24.09 MB/sec), 850.85 ms avg latency, 1219.00 ms max latency, 851 ms 50th, 1184 ms 95th, 1210 ms 99th, 1218 ms 99.9th.
```
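The summary line is convenient for eyeballing, but when you repeat runs it helps to extract just the fields you track. A small sketch that parses the sample line above with awk; the field positions assume the exact format shown:

```shell
# Sketch: pull records/sec and average latency out of the producer summary.
# The sample line is the output shown above; in practice you would pipe the
# last line of kafka-producer-perf-test.sh into this awk program instead.
summary='1000000 records sent, 252589.0 records/sec (24.09 MB/sec), 850.85 ms avg latency, 1219.00 ms max latency, 851 ms 50th, 1184 ms 95th, 1210 ms 99th, 1218 ms 99.9th.'

echo "$summary" | awk -F', ' '{
  split($2, r, " ");  # "252589.0 records/sec (24.09 MB/sec)"
  split($3, l, " ");  # "850.85 ms avg latency"
  printf "records_per_sec=%s avg_latency_ms=%s\n", r[1], l[1];
}'
```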
### Terminal B - Consumer benchmark

The consumer benchmark measures how fast Kafka can read messages from the topic, reporting throughput and total messages consumed.

```console
cd /opt/kafka
bin/kafka-consumer-perf-test.sh \
--topic test-topic-kafka \
--bootstrap-server localhost:9092 \
--messages 1000000 \
--timeout 30000
```
You should see output similar to:

```output
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec, rebalance.time.ms, fetch.time.ms, fetch.MB.sec, fetch.nMsg.sec
2025-09-03 06:07:13:616, 2025-09-03 06:07:17:545, 95.3674, 24.2727, 1000001, 254517.9435, 3354, 575, 165.8564, 1739132.1739
```
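The consumer tool's CSV layout is compact but hard to read. A sketch that pairs each header with its value, using the sample output above; the first two columns are timestamps and are skipped:

```shell
# Sketch: pair the consumer perf test's CSV header with its value row.
# The two lines below are copied from the sample output shown above.
header='start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec, rebalance.time.ms, fetch.time.ms, fetch.MB.sec, fetch.nMsg.sec'
values='2025-09-03 06:07:13:616, 2025-09-03 06:07:17:545, 95.3674, 24.2727, 1000001, 254517.9435, 3354, 575, 165.8564, 1739132.1739'

# Print metric=value pairs, skipping the two timestamp columns.
pairs=$(awk -v h="$header" -v v="$values" 'BEGIN {
  n = split(h, H, ", "); split(v, V, ", ");
  for (i = 3; i <= n; i++) printf "%s=%s\n", H[i], V[i];
}')
echo "$pairs"
```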

## Benchmark Results Table Explained

- **Messages Processed** – Total number of messages handled during the test.
- **Records/sec** – Rate of messages sent or consumed per second.
- **MB/sec** – Data throughput in megabytes per second.
- **Avg Latency (ms)** – Average delay in sending messages (producer only).
- **Max Latency (ms)** – Longest observed delay in sending messages (producer only).
- **50th (ms)** – Median latency (half the messages were faster, half slower).
- **95th (ms)** – Latency below which 95% of messages were delivered.
- **99th (ms)** – Latency below which 99% of messages were delivered.
- **99.9th (ms)** – Latency below which 99.9% of messages were delivered.
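To make the percentile columns concrete, here is a small sketch that computes nearest-rank percentiles over ten made-up latency samples; note how the single 400 ms outlier dominates the 95th percentile while barely moving the median:

```shell
# Sketch: how a latency percentile is read, on ten made-up samples.
# With the nearest-rank method, the p-th percentile is element ceil(p*n/100)
# of the sorted list, i.e. the value below which roughly p% of samples fall.
samples="120 80 95 400 110 130 90 105 100 115"

pct() {
  echo $samples | tr ' ' '\n' | sort -n |
    awk -v p="$1" '{a[NR]=$1} END {print a[int((p*NR + 99) / 100)]}'
}

p50=$(pct 50)   # median
p95=$(pct 95)   # tail latency
echo "50th=${p50} ms  95th=${p95} ms"
```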

## Benchmark summary on Arm64
Here is a summary of the benchmark results collected on an Arm64 **D4ps_v6** virtual machine running **Ubuntu Pro 24.04 LTS**.
### Consumer Performance Test
| Metric | Value | Unit |
|-----------------------------|-------------|---------------|
| Total Time Taken | 3.875 | Seconds |
| Data Consumed | 95.3674 | MB |
| Throughput (Data) | 24.6110 | MB/sec |
| Messages Consumed | 1,000,001 | Messages |
| Throughput (Messages) | 258,064.77 | Messages/sec |
| Rebalance Time | 3348 | Milliseconds |
| Fetch Time | 527 | Milliseconds |
| Fetch Throughput (Data) | 180.9629 | MB/sec |
| Fetch Throughput (Messages)| 1,897,535.10| Messages/sec |

### Producer Performance Test
| Metric | Records Sent | Records/sec | Throughput | Average Latency | Maximum Latency | 50th Percentile Latency | 95th Percentile Latency | 99th Percentile Latency | 99.9th Percentile Latency |
|--------|--------------|-------------|------------|-----------------|-----------------|-------------------------|-------------------------|-------------------------|---------------------------|
| Value | 1,000,000 | 257,532.8 | 24.56 | 816.19 | 1237.00 | 799 | 1168 | 1220 | 1231 |
| Unit | Records | Records/sec | MB/sec | ms | ms | ms | ms | ms | ms |

## Benchmark summary on x86_64
Here is a summary of the benchmark results collected on an x86_64 **D4s_v6** virtual machine running **Ubuntu Pro 24.04 LTS**.
### Consumer Performance Test
| Metric | Value | Unit |
|--------------------|-------------|---------------|
| Total Time Taken | 3.811 | Seconds |
| Data Consumed | 95.3674 | MB |
| Throughput (Data) | 25.0243 | MB/sec |
| Messages Consumed | 1,000,001 | Messages |
| Throughput (Messages) | 262,398.58 | Messages/sec |
| Rebalance Time | 3271 | Milliseconds |
| Fetch Time | 540 | Milliseconds |
| Fetch Throughput (Data) | 176.6064 | MB/sec |
| Fetch Throughput (Messages) | 1,851,853.70| Messages/sec |

### Producer Performance Test
| Metric | Records Sent | Records/sec | Throughput | Average Latency | Maximum Latency | 50th Percentile Latency | 95th Percentile Latency | 99th Percentile Latency | 99.9th Percentile Latency |
|--------|--------------|-------------|------------|-----------------|-----------------|-------------------------|-------------------------|-------------------------|---------------------------|
| Value | 1,000,000 | 242,013.6 | 23.08 | 840.69 | 1351.00 | 832 | 1283 | 1330 | 1350 |
| Unit | Records | Records/sec | MB/sec | ms | ms | ms | ms | ms | ms |

## Benchmark comparison insights
Comparing the results on the Arm64 (D4ps_v6) and x86_64 (D4s_v6) virtual machines:

- **Producer:** the Arm64 VM sustained **24.56 MB/sec** (~257K records/sec) with ~**816 ms** average latency, against **23.08 MB/sec** (~242K records/sec) and ~**841 ms** on x86_64, so the Arm64 VM delivered higher producer throughput at lower latency.
- **Consumer:** throughput was comparable on both platforms, with **24.61 MB/sec** (~258K messages/sec) on Arm64 and **25.02 MB/sec** (~262K messages/sec) on x86_64.
- These results confirm stable Kafka performance on the **Azure Ubuntu Pro Arm64 virtual machine**, validating its suitability for **baseline testing and benchmarking**.
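The relative difference can be computed directly from the records/sec figures reported in the two producer tables above; a quick sketch:

```shell
# Sketch: percentage difference between the Arm64 and x86_64 producer
# throughput figures (records/sec) from the tables above.
arm64=257532.8
x86=242013.6

awk -v a="$arm64" -v x="$x86" 'BEGIN {
  printf "Arm64 producer throughput is %.1f%% higher than x86_64\n", (a - x) / x * 100;
}'
```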

You have now benchmarked Kafka on an Azure Cobalt 100 Arm64 virtual machine and compared results with x86_64.
---
title: Create an Arm-based cloud virtual machine using the Microsoft Cobalt 100 CPU
weight: 3

### FIXED, DO NOT MODIFY
layout: learningpathall
---

## Introduction

There are several ways to create an Arm-based Cobalt 100 virtual machine: the Microsoft Azure console, the Azure CLI, or your choice of IaC (Infrastructure as Code) tooling. This guide uses the Azure console to create a virtual machine with the Arm-based Cobalt 100 processor.

This learning path focuses on the general-purpose D-series virtual machines. For details, read the [Dpsv6 size series](https://learn.microsoft.com/en-us/azure/virtual-machines/sizes/general-purpose/dpsv6-series) guide from Microsoft Azure.

If you have never used Microsoft Azure before, review the Microsoft [guide to create a Linux virtual machine in the Azure portal](https://learn.microsoft.com/en-us/azure/virtual-machines/linux/quick-create-portal?tabs=ubuntu).

#### Create an Arm-based Azure Virtual Machine

Creating a virtual machine based on Azure Cobalt 100 is no different from creating any other virtual machine in Azure. To create an Azure virtual machine, launch the Azure portal and navigate to "Virtual Machines".
1. Select "Create", and click on "Virtual Machine" from the drop-down list.
2. Inside the "Basics" tab, fill in the Instance details, such as "Virtual machine name" and "Region".
3. Choose the image for your virtual machine (for example, Ubuntu Pro 24.04 LTS) and select “Arm64” as the VM architecture.
4. In the “Size” field, click on “See all sizes” and select the D-Series v6 family of virtual machines. Select “D4ps_v6” from the list.

![Azure portal VM creation — Azure Cobalt 100 Arm64 virtual machine (D4ps_v6) alt-text#center](images/instance.png "Figure 1: Select the D-Series v6 family of virtual machines")

5. Select "SSH public key" as an Authentication type. Azure will automatically generate an SSH key pair for you and allow you to store it for future use. It is a fast, simple, and secure way to connect to your virtual machine.
6. Fill in the Administrator username for your VM.
7. Select "Generate new key pair", and select "RSA SSH Format" as the SSH key type. RSA offers strong security with key lengths of 3072 bits or longer. Give your key pair a name.
8. In the "Inbound port rules", select HTTP (80) and SSH (22) as the inbound ports.

![Azure portal VM creation — Azure Cobalt 100 Arm64 virtual machine (D4ps_v6) alt-text#center](images/instance1.png "Figure 2: Allow inbound port rules")

9. Click on the "Review + Create" tab and review the configuration for your virtual machine. It should look like the following:

![Azure portal VM creation — Azure Cobalt 100 Arm64 virtual machine (D4ps_v6) alt-text#center](images/ubuntu-pro.png "Figure 3: Review and Create an Azure Cobalt 100 Arm64 VM")

10. Finally, when you are confident about your selection, click on the "Create" button, and click on the "Download Private key and Create Resources" button.

![Azure portal VM creation — Azure Cobalt 100 Arm64 virtual machine (D4ps_v6) alt-text#center](images/instance4.png "Figure 4: Download Private key and Create Resources")

11. Your virtual machine should be ready and running within a few minutes. You can SSH into the virtual machine using the downloaded private key and the VM's public IP address.

![Azure portal VM creation — Azure Cobalt 100 Arm64 virtual machine (D4ps_v6) alt-text#center](images/final-vm.png "Figure 5: VM deployment confirmation in Azure portal")

{{% notice Note %}}

To learn more about Arm-based virtual machines in Azure, refer to "Getting Started with Microsoft Azure" in [Get started with Arm-based cloud instances](/learning-paths/servers-and-cloud-computing/csp/azure).

{{% /notice %}}