Blog on "What's new in Network Observability 1.9" #26
@@ -0,0 +1,228 @@ | ||
--- | ||
layout: :theme/post | ||
title: "What's new in Network Observability 1.9" | ||
description: "New features: IPsec, flowlog filter query; enhancements in Network Observability CLI" | |
tags: network,observability,IPsec,filter,query,udn,cli | ||
authors: [stleerh] | ||
--- | ||
|
||
[Network Observability 1.9](https://docs.redhat.com/en/documentation/openshift_container_platform/4.19/html/network_observability/index) is an optional operator that provides insights into your network traffic, including features like packet drops, latencies, DNS tracking, and more. You can view this in the form of graphs, a table, or topology. | ||
|
||
This version aligns with [Red Hat OpenShift Container Platform (OCP) 4.19](https://docs.redhat.com/en/documentation/openshift_container_platform/4.19) but is backwards-compatible with older OCP and Kubernetes releases. For installation instructions, check out the documentation on [OpenShift Container Platform](https://docs.redhat.com/en/documentation/openshift_container_platform/4.19#Install) and [Network Observability](https://docs.redhat.com/en/documentation/openshift_container_platform/4.19/html/network_observability/installing-network-observability-operators). | ||
|
||
This article covers the new features in this release, namely IPsec tracking, flowlogs-pipeline filter query, UDN Mapping, and Network Observability CLI enhancements. If you want to learn about the past features, read my older [What's new in Network Observability](https://developers.redhat.com/author/steven-lee) articles. | ||
|
||
## IPsec tracking | ||
|
||
Network Observability can identify IPsec traffic flows and indicate if they were successfully encrypted or not. To try this out, you need to have OVN-Kubernetes as your Container Network Interface (CNI), which is the default for OCP. | ||
|
||
To enable IPsec, follow the instructions on [Configuring IPsec encryption](https://docs.redhat.com/en/documentation/openshift_container_platform/4.19/html/networking/network-security#configuring-ipsec-ovn). If you didn't enable IPsec during cluster installation, it's a bit tricky to set up, so I've provided a quick guide below on setting this up in a test environment. | ||
|
||
Network Observability can identify encrypted IPsec traffic between pods. In the OpenShift web console, when you create the FlowCollector instance, scroll down to the **Agent configuration** and open up the section named **Features**. In the dropdown for **Value**, select **IPSec** as shown in Figure 1. | ||
Review comment: Last IPsec mention needs to have the 's' in lowercase.
Reply: that's unfortunate but @stleerh was right to spell it that way, because it's how it was written in the API: https://github.com/netobserv/network-observability-operator/blob/main/api/flowcollector/v1beta2/flowcollector_types.go#L194
Reply: (that's actually the same problem with the eBPF manager, if we wanted to adopt the correct acronym in spelling, that should be "eBPFManager")
Reply: Yes, in this case, I didn't realize this was a value used by the API. We can dismiss this suggestion.
||
|
||
<br> | ||
Figure 1: Enable IPsec eBPF feature | ||
|
||
The equivalent in YAML is to add this in the **spec** section: | ||
|
||
```yaml | ||
spec: | ||
agent: | ||
ebpf: | ||
features: | ||
- IPSec | ||
``` | ||
|
||
### Network Observability - IPsec feature | ||
|
||
In the **Traffic flows** tab of **Observe > Network Traffic**, a new **IPSec Status** column shows one of the values "success", "error", or "n/a" (Figure 2). | |
|
||
<br> | ||
Figure 2: Flows table with IPSec Status column | ||
Review comment: IPsec with s lowercase
||
|
||
IPsec flows always appear as node-to-node traffic, but they are actually encapsulated pod-to-pod or host-to-pod traffic. There are two types of encapsulation used for IPsec-encrypted flows. The first is ESP encapsulation, which is the traditional IPsec mode. ESP packets don't have ports, hence the ports are `n/a`. The second is UDP encapsulation. In the table, the destination port is 6081, so they are OVN Geneve tunnel traffic. If you only see UDP encapsulated traffic (no ESP), then you must have configured `encapsulation: Always` when configuring IPsec. | ||
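
The two encapsulation types described above can be told apart from a flow's protocol and destination port. Here is a minimal sketch, assuming you pass in the numeric IP protocol (50 for ESP, 17 for UDP) and the destination port of a flow whose IPSec Status is "success". The function name `ipsec_encap_type` is hypothetical and not part of Network Observability.

```shell
#!/usr/bin/env bash
# Sketch: classify an IPsec-encrypted flow by its encapsulation type.
# ipsec_encap_type is a hypothetical helper name used only for illustration.
# Arguments: numeric IP protocol, then (optionally) the destination port.
ipsec_encap_type() {
  local proto="$1" dport="${2:-}"
  if [ "$proto" = "50" ]; then
    # IP protocol 50 is ESP; ESP packets carry no ports.
    echo "ESP"
  elif [ "$proto" = "17" ] && [ "$dport" = "6081" ]; then
    # UDP to port 6081 is OVN Geneve tunnel traffic.
    echo "UDP (Geneve)"
  else
    echo "other"
  fi
}

ipsec_encap_type 50        # prints ESP
ipsec_encap_type 17 6081   # prints UDP (Geneve)
```

This only mirrors the rule of thumb in the paragraph above. It does not by itself tell you whether a flow was encrypted, which is what the IPSec Status column is for.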
|
||
Review comment: should we also refer to ipsec related filters ?
Reply: Are there more filters than the "IPSec Status" column?
Reply: I don't think so @jpinsonneau pls confirm
Reply: no there's only that one
||
In **Observe > Dashboards**, select **Network / Main** from the dashboard dropdown to see the percentage of encrypted traffic and the IPsec traffic rate (Figure 3). | |
|
||
<br> | ||
Figure 3: Network/Main dashboard with IPsec data | ||
|
||
### Enable IPsec on OVN-Kubernetes | ||
|
||
Here are the steps to enable IPsec on OVN-Kubernetes. You can skip this section if you already have IPsec enabled. | ||
Review comment: this should be replaced with pointer to OCP IPsec doc IMO
Reply: I do have a link to the OCP IPsec doc earlier (line 19). Because there are some non-trivial issues that the doc doesn't point out, I feel leaving this section in has value.
Reply: I just don't think its good idea for netobserv doc to teach OCP IPsec users about IPsec it seems out of scope to me, probably we should open OCP doc bug if ipsec doc isn't complete or missing critical info WDYT ?
Reply: IMO, as long as it's the upstream / community blog, that's ok, we have a broader "freedom of expression" to mention this kind of things. But I'm not sure it will fly unchanged if you submit that to the RH blog ...
Reply: @msherif1234 Actually, the OCP doc on configuring IPsec is complete in terms of explaining and covering every possible combination (ten sections from 3.6.1 to 3.6.10), whereas this blog focuses on the scenario supported by Network Observability. These ten sections do not cover prerequisites, such as reducing the MTU, which is in another documentation. It doesn't highlight what to expect when you issue some of these commands (e.g. need to wait a few minutes), which is typically not their documentation style. They are documenting it holistically from an IPsec point-of-view, but this blog just wants you to be able to set up IPsec properly and enough so you can view the traffic using Network Observability. If you're unable to do that without having to read pages of pages of IPsec and MTU documentation, then the blog is not going to be very useful. In the end, if you already have IPsec enabled (unlikely), then it's harmless and you can skip this section, which is what it tells you to do.
Reply: Technically, I did this in previous blogs, such as how to set up UDN, but it was a lot shorter to explain that.
||
|
||
In addition to enabling IPsec, you should reduce the MTU value as this is necessary to avoid packet fragmentation and dropped connections. Enter `oc edit networks.operator.openshift.io` to update the configuration, but read below to get the right MTU values. The migration could take half an hour or more so be prepared for some downtime. | ||
|
||
```yaml | ||
spec: | ||
defaultNetwork: | ||
ovnKubernetesConfig: | ||
gatewayConfig: | ||
routingViaHost: true # change | ||
ipsecConfig: | ||
mode: Full # change | ||
mtu: 8855 # update with your MTU value and below | ||
migration: # add this section | ||
mtu: | ||
network: | ||
from: 8901 | ||
to: 8855 | ||
machine: | ||
to: 9001 | ||
``` | ||
|
||
Make the two changes with the comment "change" to enable IPsec. Reduce the network MTU by 46 bytes, which is needed by the ESP header to do IPsec encryption. My current network MTU value was 8901 (using jumbo frames) and was reduced to 8855. Therefore, the overlay packets generated by the pods won't exceed 8855 bytes, and this provides the 46-byte overhead for the ESP header. You also need to provide the machine or physical MTU on the interface even though it won't be changed. | ||
|
||
To get the network MTU, enter `oc get networks.operator.openshift.io -o yaml | grep mtu`. To get the machine MTU, look at the example below and issue the commands, but replace with your pod name. | ||
|
||
``` | ||
$ oc project openshift-ovn-kubernetes | ||
$ oc get pods | ||
NAME READY STATUS RESTARTS AGE | ||
ovnkube-control-plane-595bf6d946-9gjjb 2/2 Running 2 (4h48m ago) 5h3m | ||
ovnkube-node-5h682 8/8 Running 1 (4h58m ago) 4h58m | ||
ovnkube-node-94z84 8/8 Running 0 5h3m | ||
|
||
$ oc rsh ovnkube-node-5h682 | ||
|
||
sh-5.1# ip link show | ||
... | ||
2: ens5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000 | ||
link/ether 06:a5:5a:cd:ff:6b brd ff:ff:ff:ff:ff:ff | ||
altname enp0s5 | ||
``` | ||
|
||
Look for the interface such as *ensX* and the *mtu* value. Here, it's 9001. The original network MTU was 100 bytes less at 8901 to provide overhead for the Geneve header. Now we're reducing it another 46 bytes to 8855 for the ESP header. | ||
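
The MTU arithmetic above can be sketched as a small script. It assumes the 100-byte Geneve overhead and 46-byte ESP overhead given in this section. The helper names `extract_mtu` and `ipsec_network_mtu` are hypothetical and are not part of `oc`.

```shell
#!/usr/bin/env bash
# Sketch: read the machine MTU from `ip link show` output and derive the
# network MTU to use with IPsec. Overheads are the values quoted in the text.
GENEVE_OVERHEAD=100   # bytes reserved for the Geneve header
ESP_OVERHEAD=46       # bytes reserved for the ESP header

# extract_mtu (hypothetical helper): reads `ip link show` output on stdin and
# prints "<interface> <mtu>" for every interface line it finds.
extract_mtu() {
  awk '/ mtu / {
    for (i = 1; i <= NF; i++)
      if ($i == "mtu") { sub(":", "", $2); print $2, $(i + 1) }
  }'
}

# ipsec_network_mtu (hypothetical helper): machine MTU minus both overheads.
ipsec_network_mtu() {
  local machine_mtu="$1"
  echo $(( machine_mtu - GENEVE_OVERHEAD - ESP_OVERHEAD ))
}

# Sample line copied from the output above.
sample='2: ens5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000'
printf '%s\n' "$sample" | extract_mtu   # prints: ens5 9001
ipsec_network_mtu 9001                  # prints: 8855
```

On a node, you could pipe the real `ip link show` output into `extract_mtu` instead of the sample line, and then use the result for the `mtu` and `migration` values shown earlier.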
|
||
Now you can start the migration and IPsec enablement. During this process, you will lose connection to the cluster. It may be five minutes or longer before it automatically connects back. When it does, IPsec pods should be running. | ||
|
||
``` | ||
$ oc get pods | ||
NAME READY STATUS RESTARTS AGE | ||
ovn-ipsec-containerized-jpnxr 1/1 Running 1 4m14s | ||
ovn-ipsec-containerized-w6r9c 1/1 Running 1 4m14s | ||
ovn-ipsec-host-27j49 2/2 Running 2 4m14s | ||
ovn-ipsec-host-jjb54 2/2 Running 2 4m14s | ||
ovnkube-control-plane-595bf6d946-9gjjb 2/2 Running 6 5h24m | ||
ovnkube-node-dfhck 7/8 Running 7 (16s ago) 3m32s | ||
ovnkube-node-dtq28 7/8 Running 8 4m11s | ||
``` | ||
|
||
While the IPsec pods are up and running, the other parts of the cluster might not be ready yet, including Network Observability. To ensure the cluster is stable, enter `oc adm wait-for-stable-cluster`. | ||
|
||
|
||
## Flowlogs-pipeline filter | ||
|
||
Flowlogs-pipeline filter lets you filter data after it has been enriched with Kubernetes information, and before ingestion. It can filter logs (Loki data), metrics (Prometheus data), or logs to be exported, or all of them, which corresponds to the **outputTarget** field (Figure 4) when configuring a FlowCollector instance. In the FlowCollector form view, scroll down to the **Processor configuration** section and click to open it. Then click **Filters** and then **Add Filters**. You can also set a different sampling value than the one used by the eBPF Agent. | ||
|
||
The query uses a simple query language that supports eight comparison operators: the six standard ones (`=`, `!=`, `>`, `>=`, `<`, `<=`), plus two more to match or not match a regular expression (`=~`, `!~`). It can also check whether a field exists or not (`with(field)`, `without(field)`). Finally, the query language allows `and` and `or` with parentheses to specify precedence for more complex expressions. | |
|
||
<br> | ||
Figure 4: Flowlogs-pipeline filters configuration | ||
|
||
Don't confuse this with the eBPF flow filter, which happens at a much earlier stage at the packet level. Flowlogs-pipeline filter doesn't benefit as much from resource savings as eBPF flow filter because part of the processing of flows has already happened. To get a list of field names for the query, see the [documentation](https://docs.redhat.com/en/documentation/openshift_container_platform/4.19/html/network_observability/json-flows-format-reference) or click a row in the Traffic flows table, and then click the **Raw** tab. | ||
|
||
Here's a query to include only "netobserv" traffic. | ||
|
||
``` | ||
SrcK8S_Namespace="netobserv" or DstK8S_Namespace="netobserv" | ||
``` | ||
|
||
For more information, see [FLP filtering language](https://github.com/netobserv/flowlogs-pipeline/blob/main/docs/filtering.md). | ||
|
||
|
||
## UDN Mapping (GA) | ||
|
||
The eBPF feature **UDNMapping** reached General Availability (GA). Network Observability added support for the ClusterUserDefinedNetwork object. In 1.8, it only supported UserDefinedNetwork. ClusterUserDefinedNetwork allows pods in different namespaces to communicate with each other using the cluster UDN. To learn more about how to set up a UDN, see the article [User defined networks in Red Hat OpenShift Virtualization](https://www.redhat.com/en/blog/user-defined-networks-red-hat-openshift-virtualization). | ||
|
||
|
||
## Network Observability CLI enhancements | ||
|
||
The Network Observability CLI is a command line tool based on the Network Observability code base. It is an `oc` plugin that captures, displays, and saves flows, metrics, and/or packet information. | ||
|
||
Installation is simple. Download the [oc_netobserv file](https://mirror.openshift.com/pub/cgw/netobserv/latest/) and put it in a location that is accessible from the command line (e.g. */usr/local/bin* on Linux). Make the file executable. You don't even need to install Network Observability Operator, nor will it conflict with it if it's installed. | ||
|
||
You must be able to access your cluster from the command line using `oc` or `kubectl`. It's best to widen your terminal to 100 characters or more for a better display. To run the program, enter `oc netobserv` or just call the script directly, `oc-netobserv`, followed by various options. | ||
|
||
Decide whether you want to see flows, metrics, or capture packets. If you want more information on this, enter one or more of these commands. | ||
|
||
``` | ||
oc netobserv flows help | ||
oc netobserv metrics help | ||
oc netobserv packets help | ||
oc netobserv help # for general help | ||
``` | ||
|
||
With flows, it displays a text-based traffic flows table. With metrics, you are given a link that creates a dynamic dashboard in the OpenShift web console. And with packets, you also get flows, and it saves a pcapng file that can be loaded into a tool such as Wireshark. All of this data can be saved locally upon exit. | |
|
||
Network Observability CLI deploys in its own namespace and is automatically removed once the CLI exits. To manually exit, press ctrl-c. It asks if you want to save the data in the directory **./output** and then exits. | ||
|
||
The rest of this section covers the new features in Network Observability CLI 1.9. Here are the new options: | ||
|
||
| Option | Description | | ||
| ------------------------- | ------------------------------------ | | ||
| `--enable_ipsec` | Enable eBPF IPsec tracking feature | | ||
| `--enable_network_events` | Enable eBPF Network Events feature | | ||
| `--enable_udn_mapping` | Enable eBPF UDN Mapping feature | | ||
| `--sampling` | Set sampling interval, defaults to 1 | | ||
| `--query` | Define a query filter | | ||
| `--exclude_interfaces` | List of interfaces to exclude | | ||
| `--include_list` | List of metric names (metrics only) | | ||
| `--yaml` | Generate a FlowCollector YAML | | ||
Review comment (on the `--sampling` row): I see you specifically changed sampling ratio to interval, can you elaborate why?
Reply: A ratio, such as 1:10 or 3:2, is the wrong term as it requires two values, the antecedent (1st value) and the consequent (2nd value). It typically reads like "1 in 10" or "1 for 10". In what we call "sampling", the antecedent is always 1 and the sampling value is the consequent. I don't think anyone will understand this if we call it the "sampling consequent". It turns out the common term for this is "sampling interval". In fact, you can Google search what a "sampling interval" is. Here's one definition. "Sampling rate" is, in fact, wrong. Rate is the inverse of interval. That is: rate = 1 / interval Therefore, if you say the sampling rate is 10, that means you are doing something 10 times per second (assuming the unit is in seconds), rather than 1 out of 10.
Reply: Awesome, I didn't know this meaning of "interval", I've been struggling to find an accurate term all this time
|
||
This version catches up with all of the eBPF features that were introduced in Network Observability, including the latest IPsec tracking. These are the options that begin with `--enable`. The new `--sampling` option sets the sampling interval and defaults to 1, unlike the Network Observability Operator, which defaults to 50. The `--regexes` option has been removed in favor of the `--query` option, which allows you to enter an expression, similar to the filter UI in **Observe > Network Traffic**. | |
|
||
Here's an example command that enables IPsec tracking and displays flows where the IPsec encryption was successful. It uses a sampling interval of 10. | ||
|
||
``` | ||
oc netobserv flows --query='IPSecStatus="success"' --enable_ipsec --sampling=10 | ||
``` | ||
|
||
Pay attention to the syntax of `--query`. It is followed by an equals character and wrapped in a pair of single quotes. Double quotes are used if the value is a string. The field name does not have quotes. The query and the [field names](https://docs.redhat.com/en/documentation/openshift_container_platform/4.19/html/network_observability/json-flows-format-reference) are the same as flowlogs-pipeline filter query. Figure 5 shows the screenshot of this command. | ||
|
||
<br> | ||
Figure 5: Network Observability CLI - Flows table with IPsec | ||
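
The quoting rules for `--query` described above can be sketched in a short script. It builds a query with double quotes around the string value, then wraps the whole expression in single quotes. The helper name `namespace_query` is hypothetical, and the script only prints the resulting command so that it works without a cluster.

```shell
#!/usr/bin/env bash
# Sketch: build a --query argument with the quoting described above.

# namespace_query (hypothetical helper): matches flows whose source or
# destination namespace equals the given name. Field names have no quotes;
# the string value is wrapped in double quotes.
namespace_query() {
  local ns="$1"
  printf 'SrcK8S_Namespace="%s" or DstK8S_Namespace="%s"' "$ns" "$ns"
}

query="$(namespace_query netobserv)"

# Print the command instead of running it. The whole query is wrapped in
# single quotes so the shell passes the inner double quotes through unchanged.
printf "oc netobserv flows --query='%s'\n" "$query"
```

Copying the printed line into your terminal runs the capture with that filter. The expression is the same one shown in the flowlogs-pipeline filter section.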
|
||
The `--exclude_interfaces` is a list of comma-separated interfaces to exclude. The `--include_list` is a metrics-only option to specify a list of comma-separated metric names. See the metrics help for the default list of names. | ||
|
||
The last new option is the unique `--yaml`. Add it to an `oc netobserv` command along with your feature and filter options. It creates a FlowCollector YAML that you can apply and reuse, and prints out the commands to run. Save these commands. When you run them, Network Observability CLI starts up with all the options you gave it. Here's an example of running with this option. | |
|
||
``` | ||
$ oc netobserv flows --enable_ipsec --yaml | ||
... | ||
You can create flows agents by executing: | ||
oc apply -f ./output/flows_capture_2025_07_14_07_14.yml | ||
|
||
Then create the collector using: | ||
oc run -n netobserv-cli collector \ | ||
--image=registry.redhat.io/network-observability/network-observability-cli-rhel9:1.9.0 --image-pull-policy='Always' --overrides='{"spec": {"serviceAccount": "netobserv-cli"}}' \ | ||
--command -- bash -c "/network-observability-cli get-flows --options enable_ipsec --loglevel info --maxtime 5m --maxbytes 50000000 && sleep infinity" | ||
|
||
And follow its progression with: | ||
oc logs collector -n netobserv-cli -f | ||
``` | ||
|
||
Finally, the pcapng file was enhanced to include enrichment data as comments. The command below starts the packet capture. Upon exit, save the capture output. | ||
|
||
``` | ||
oc netobserv packets --port=443 | ||
``` | ||
|
||
Now run Wireshark on this file, which is in the **./output/pcap** directory. Update with your output filename. | ||
|
||
``` | ||
# Update with your output filename | ||
wireshark output/pcap/2025-07-22T212639Z.pcapng | ||
``` | ||
|
||
When Wireshark comes up, select a row and click the **Packet comments** section to open up this section (Figure 6). | ||
|
||
<br> | ||
Figure 6: Wireshark - Packet capture file with comments | ||
|
||
|
||
## Summary | ||
|
||
This is another solid release from the Network Observability team. If you use IPsec, you can get insight into this type of traffic. A filter query was added in both flowlogs-pipeline and the Network Observability CLI. If you want to easily capture flows, metrics, and packets, Network Observability CLI is the tool for you! Write to us on the [discussion board](https://github.com/netobserv/network-observability-operator/discussions) if you have any feedback or suggestions for improvements. | ||
|
||
_Special thanks to Julien Pinsonneau, Joel Takvorian, Mehul Modi, and Mohamed S. Mahmoud for reviewing._ |
Review comment: you mean between nodes not pods right ?
Reply: This is referring to the actual IPsec traffic, which is either pod-to-pod or host-to-pod. It mentions (in line 43): "IPsec flows always appear as node-to-node traffic, but they are actually encapsulated pod-to-pod or host-to-pod traffic."