Sdn 1185 new #54
Open
cathy-zhou wants to merge 89 commits into master from sdn-1185-new
Conversation
Improve kubectl_wait_pods:
* use kubectl rollout to wait for specific DaemonSets
* use kubectl wait pods to wait for specific pods
* respect total timeout
Signed-off-by: Andreas Karis <[email protected]>
Apply retry to namespace controller in ovnk node that, upon a namespace update, syncs UDP conntrack entries. Signed-off-by: Riccardo Ravaioli <[email protected]>
Signed-off-by: Riccardo Ravaioli <[email protected]>
Signed-off-by: Riccardo Ravaioli <[email protected]>
…ange Command used: $ mockery --name NodeWatchFactory --dir ~/go/src/github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory/ --output ~/go/src/github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory/mocks/ 04 Oct 22 19:09 CEST INF Starting mockery dry-run=false version=v2.14.0 04 Oct 22 19:09 CEST INF Walking dry-run=false version=v2.14.0 04 Oct 22 19:09 CEST INF Generating mock dry-run=false interface=NodeWatchFactory qualified-name=github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory version=v2.14.0 Signed-off-by: Riccardo Ravaioli <[email protected]>
Signed-off-by: Riccardo Ravaioli <[email protected]>
Add the option to deploy IPsec encryption on kind. Signed-off-by: Andreas Karis <[email protected]>
When changing the hybrid overlay to dynamically allocated IP addresses, there were instances where the .3 address was blindly returned. Correct that mistake. Signed-off-by: Jacob Tanenbaum <[email protected]>
correct the hybrid overlay port MAC address
apply retry logic to ovnk node controllers
This PR reverts the commits 82094e2 and df3c905. This will pave the way for using the 'options:hairpin_snat_ip' feature of OVN load balancers. Signed-off-by: Surya Seetharaman <[email protected]> Co-authored-by: Andrew Stoycos <[email protected]>
This commit adds a new masqueradeIP which can be used to tell OVN to explicitly SNAT towards that masqueradeIP for hairpin service traffic. Signed-off-by: Surya Seetharaman <[email protected]> Co-authored-by: Andrew Stoycos <[email protected]>
This commit adds the hairpin SNAT masqueradeIP to all the load balancers via the 'options:hairpin_snat_ip' field in OVN load balancers. This will help us in handling the NP service hairpin traffic correctly. Example:
_uuid : c64e4c7b-66f7-4739-a633-06791efe835a
external_ids : {"k8s.ovn.org/kind"=Service, "k8s.ovn.org/owner"="surya5/hello-world-net-pol"}
health_check : []
ip_port_mappings : {}
name : "Service_surya5/hello-world-net-pol_TCP_cluster"
options : {event="false", hairpin_snat_ip="169.254.169.5 fd69::5", reject="true", skip_snat="false"}
protocol : tcp
selection_fields : []
vips : {"10.96.1.35:80"="10.244.1.3:8080,10.244.2.3:8080", "[fd00:10:96::a6a6]:80"="[fd00:10:244:2::3]:8080,[fd00:10:244:3::3]:8080"}
NOTE: This might not be needed in the GR LBs, but no harm in adding it to all, so that we continue to leverage LBGs. An e2e test is added that demonstrates that the hairpin traffic will have the fixed masqueradeIP as the srcIP in its response packet. This commit also consolidates pokeEndpointHostName and pokeEndpointClientIP into a single function since they were both doing more or less the same logic. Signed-off-by: Surya Seetharaman <[email protected]> Co-authored-by: Andrew Stoycos <[email protected]>
This PR adds the logic to allow or deny traffic correctly in the case of podA->svc->podA traffic. If allow-from-podA is specified then the hairpin traffic should be allowed, and if podA is not in the whitelist then we shouldn't be allowing hairpin traffic to come back in. The logic is to add an extra match to all existing ingress ACLs that match on namespace or pod selectors. This will look like this:
_uuid : 6f2566ce-9236-4bee-896c-be56a91aba16
action : allow-related
direction : to-lport
external_ids : {Ingress_num="0", ipblock_cidr="false", l4Match=None, namespace=surya5, policy=allow-ingress-to-foo4-from-foo2, policy_type=Ingress}
label : 0
log : false
match : "((ip4.src == {$a6612447164629805900} || (ip4.src == 169.254.169.5 && ip4.dst == {$a6612447164629805900})) || (ip6.src == {$a6612449363653062322} || (ip6.src == fd69::5 && ip6.dst == {$a6612449363653062322}))) && outport == @a14627396333488653719"
meter : acl-logging
name : surya5_allow-ingress-to-foo4-from-foo2_0
options : {}
priority : 1001
severity : []
The match here checks whether: 1) traffic is from the allowed-list address set, OR 2) traffic is hairpin traffic destined towards the allowed-list address set, for both v4 and v6 cases. The second point is what enables service hairpinning to behave correctly, since we know the destination will be the same as the source for hairpin. Signed-off-by: Surya Seetharaman <[email protected]> Co-authored-by: Andrew Stoycos <[email protected]> Co-authored-by: Nadia Pinaeva <[email protected]>
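For illustration, here is a minimal Go sketch of how the per-family portion of that match could be assembled; the helper below is hypothetical rather than the code from this PR, and the values simply reuse the address set hashes and masquerade IPs from the example ACL above:

```go
package main

import "fmt"

// hairpinAwareMatch builds the per-family ingress expression described above:
// allow traffic whose source is in the peer address set, or hairpin traffic
// whose source is the masquerade IP and whose destination is in that set.
func hairpinAwareMatch(ipPrefix, addrSetHash, masqueradeIP string) string {
	return fmt.Sprintf("(%s.src == {$%s} || (%s.src == %s && %s.dst == {$%s}))",
		ipPrefix, addrSetHash, ipPrefix, masqueradeIP, ipPrefix, addrSetHash)
}

func main() {
	v4 := hairpinAwareMatch("ip4", "a6612447164629805900", "169.254.169.5")
	v6 := hairpinAwareMatch("ip6", "a6612449363653062322", "fd69::5")
	// The full ACL match ORs both families and appends the outport port group.
	fmt.Printf("(%s || %s) && outport == @a14627396333488653719\n", v4, v6)
}
```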
Signed-off-by: Riccardo Ravaioli <[email protected]>
Signed-off-by: Riccardo Ravaioli <[email protected]>
OCPBUGS-2868: Ignore addresses in masquerade subnet when retrieving gateway IPs
Don't log in iterateRetryResources when there are no retry entries
When nextQueryTime for a dns entry has already expired, the update goroutine tries to use a negative duration to reset the ticker. This causes a process crash on the ovnk master. This change avoids the process crash and ensures the ticker is triggered in the shortest possible time (1 ms) so that the update happens immediately for an expired dns entry. Signed-off-by: Periyasamy Palanisamy <[email protected]>
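A minimal Go sketch of the clamping described here, with illustrative names rather than the actual ovnk DNS code; time.NewTicker and Ticker.Reset panic on non-positive durations, so an already expired entry has to be mapped to a small positive wait:

```go
package main

import (
	"fmt"
	"time"
)

// durationUntilNextQuery returns how long to wait before re-querying a DNS
// entry. If nextQueryTime has already passed, the raw duration is negative
// and Ticker.Reset would panic, so clamp it to 1 ms and let the update fire
// immediately.
func durationUntilNextQuery(nextQueryTime time.Time) time.Duration {
	d := time.Until(nextQueryTime)
	if d <= 0 {
		d = time.Millisecond
	}
	return d
}

func main() {
	ticker := time.NewTicker(time.Second)
	defer ticker.Stop()

	expired := time.Now().Add(-30 * time.Second) // entry whose nextQueryTime already passed
	ticker.Reset(durationUntilNextQuery(expired))
	fmt.Println("waiting:", durationUntilNextQuery(expired))
	<-ticker.C // fires after ~1 ms instead of panicking
}
```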
…l-dns-update-panic Handle expired entry while handling dns update
When there are multiple addresses on the primary interface for a node, conntrack uses the first one by default. During upgrade that ends up being the masquerade address because it was added last. This causes connection problems because the masquerade address ends up in a dadfailed state because we use the same address on all nodes. Since we don't care about DAD on that address, let's disable it so the address doesn't get into a bad state. Signed-off-by: Ben Nemec <[email protected]>
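One way to get that effect, sketched with the vishvananda/netlink library (the interface name and address below are placeholders, and the actual change may implement this differently): add the masquerade address with the IFA_F_NODAD flag so the kernel never runs duplicate address detection on it.

```go
package main

import (
	"log"

	"github.com/vishvananda/netlink"
	"golang.org/x/sys/unix"
)

func main() {
	// Placeholder interface and masquerade address, for illustration only.
	link, err := netlink.LinkByName("breth0")
	if err != nil {
		log.Fatal(err)
	}
	addr, err := netlink.ParseAddr("fd69::2/125")
	if err != nil {
		log.Fatal(err)
	}
	// IFA_F_NODAD tells the kernel to skip duplicate address detection for
	// this address, so it can never enter the dadfailed state even though the
	// same masquerade address exists on every node.
	addr.Flags = unix.IFA_F_NODAD
	if err := netlink.AddrAdd(link, addr); err != nil {
		log.Fatal(err)
	}
}
```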
…airpin Fix Network Policy Service Hairpin Traffic
Signed-off-by: Riccardo Ravaioli <[email protected]>
…pods Hold lock when deleting completed pod during update event
…ases If real conntrack operations are performed against the CI host we may get operation errors because we're not running a real node in the testcases and we fake out all the iptables stuff too. We shouldn't mix real operations in the CI host's netns with fake operations in the testcases because then we cannot control the test parameters. Second, don't start the node IP handler's netlink watch or sync host IPs in testcases because it scrapes the CI host's IP addresses, which service deletion uses when clearing conntrack and which the testcases know nothing about. Just stick with the fake node IP that the testcases already add. This all fixes errors like:
• Failure [0.114 seconds]
Node Operations /go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/node/gateway_localnet_linux_test.go:161
on delete /go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/node/gateway_localnet_linux_test.go:1211
deletes iptables rules with ExternalIP [It] /go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/node/gateway_localnet_linux_test.go:1212
Unexpected error:
<*errors.errorString | 0xc000de8cb0>: {
s: "DeleteService failed for nodePortWatcher: failed to delete conntrack entry for service namespace1/service1: failed to delete conntrack entry for service namespace1/service1 with svcVIP 1.1.1.1, svcPort 8032, protocol TCP: operation not permitted",
}
DeleteService failed for nodePortWatcher: failed to delete conntrack entry for service namespace1/service1: failed to delete conntrack entry for service namespace1/service1 with svcVIP 1.1.1.1, svcPort 8032, protocol TCP: operation not permitted
occurred
/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/node/gateway_localnet_linux_test.go:1243
------------------------------
Node Operations on delete deletes iptables rules for NodePort /go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/node/gateway_localnet_linux_test.go:1295
Signed-off-by: Dan Williams <[email protected]>
We need to wait until our client-side shared informer (and thus watch factory) has seen the annotation update before we can delete the pod, otherwise the object retry code may not get the update. Signed-off-by: Dan Williams <[email protected]>
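A minimal sketch of that ordering, assuming a hypothetical getAnnotation accessor backed by the same client-side shared informer the retry code watches: poll until the annotation update is visible locally before issuing the pod delete.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// waitForAnnotation polls a hypothetical informer-backed accessor until the
// expected annotation value is visible, so a subsequent pod delete is only
// issued once the object retry code can observe the update.
func waitForAnnotation(ctx context.Context, getAnnotation func() (string, bool), want string) error {
	ticker := time.NewTicker(50 * time.Millisecond)
	defer ticker.Stop()
	for {
		if got, ok := getAnnotation(); ok && got == want {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}

func main() {
	calls := 0
	getAnnotation := func() (string, bool) {
		calls++
		if calls < 3 {
			return "", false // informer has not seen the update yet
		}
		return "routable", true
	}
	if err := waitForAnnotation(context.Background(), getAnnotation, "routable"); err != nil {
		panic(err)
	}
	fmt.Println("annotation visible after", calls, "checks; safe to delete the pod")
}
```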
node: mock conntrack delete operations for service and EgressIP testcases
Fixes a scenario where a pod exists but the node does not and upon ovnk master reboot our sync would fail. During sync we should attempt to preallocate all existing pod IPs, which include terminating pods with finalizers. Pods in terminating state should still retain networking access. Signed-off-by: Tim Rozet <[email protected]>
kind.sh: Add IPsec option to kind deployments
Handle cases where a pod may exist but the node does not
Reported-at: https://issues.redhat.com/browse/OCPBUGS-3416 Signed-off-by: Martin Kennelly <[email protected]>
Dockerfile.fedora: bump OVN to 22.09.0-22
…swithnolsatstartup Skip cleaning LSPs when LS doesn't exist
* Add help text for ipsec option to kind.sh
* Fix OpenSSL version check
* Run certificate signer in foreground
* In ovn-ipsec.yaml.j2 make the counter increment the same as in downstream cluster-network-operator
Signed-off-by: Andreas Karis <[email protected]>
…22-12-02 kind: Fix ipsec in kind.sh and fix counter expression
There can be multiple management ports. Currently, in the case of Full mode with a hardware-backed mgmt port, the virtual function representor and the virtual function netdev exist separately. However, the management port config belongs to the virtual function representor in this case. This caused the Egress IP health check to fail to initialize, since there won't be an IP address on the virtual function representor interface. Instead, the code now iterates through all management port interfaces, skips the virtual function representor, and finds the virtual function netdev management port with a valid IP address configuration. In the case of DPU, there will be a DPU-Host mode node and a DPU mode node, and it will always use hardware-backed mgmt ports. The DPU mode node will only have a virtual function representor, thus the Egress IP Health Check won't run on this node. The DPU-Host mode node will only have a virtual function netdev; thus the Egress IP Health Check will run on this node. Reported-at: https://issues.redhat.com/browse/OCPBUGS-4098 Signed-off-by: William Zhao <[email protected]>
Mainly to pickup the fix for the session affinity timeout bug. Signed-off-by: Numan Siddique <[email protected]>
Existing OVN code can be shared by the default network controller and the upcoming secondary layer 3 network controller. Update it to get ready for the secondary layer 3 network support. Signed-off-by: Yun Zhou <[email protected]>
Dockerfile.fedora: bump OVN to 22.09.0-25
This PR adds support for session affinity timeout, which is required for the OVNK plugin to become CNCF compliant and pass the conformance tests around this feature. NOTE: Most of the work has been done in OVN; the only bit OVNK does is to enable this in the right manner as an option on the load balancer if a service with session affinity is created. The OVN version bump is being done in the fedora image used for the build. NOTE2: We used to use the selectionFields option before; this is no longer needed because there can never be session affinity without a timeout set (defaults to 3hrs), so if affinity-timeout is set then OVN does the rest of the logic to determine stickiness. NOTE3: Is the stickiness timeout calculated per packet or per session? (can't remember what IPVS does and what OVN does) :( so do we time out after the last packet? Signed-off-by: Surya Seetharaman <[email protected]>
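For context, a minimal sketch of the service side using the standard Kubernetes API types (names and values are illustrative): a Service requesting ClientIP session affinity with an explicit timeout, which ovn-kubernetes then enables as an affinity option on the corresponding OVN load balancer.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	timeout := int32(10800) // the Kubernetes default of 3h, made explicit

	svc := corev1.Service{
		ObjectMeta: metav1.ObjectMeta{Name: "hello-world", Namespace: "surya5"},
		Spec: corev1.ServiceSpec{
			Ports:           []corev1.ServicePort{{Port: 80}},
			SessionAffinity: corev1.ServiceAffinityClientIP,
			SessionAffinityConfig: &corev1.SessionAffinityConfig{
				ClientIP: &corev1.ClientIPConfig{TimeoutSeconds: &timeout},
			},
		},
	}
	fmt.Println(svc.Name, "affinity timeout:", *svc.Spec.SessionAffinityConfig.ClientIP.TimeoutSeconds)
}
```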
…ty-timeout Support LB Session Affinity TimeOut
make code shareable for secondary network controller
Fix FakeAddressSetFactory.ProcessEachAddressSet locking to allow calling other FakeAddressSetFactory in iteratorFn. Signed-off-by: Nadia Pinaeva <[email protected]>
Currently there is a race condition where, during startup, a pod can be assigned an address that was previously assigned to a node's hybrid overlay distributed router IP. Between syncPods() completing and the node's AddNode() event retrying, an incoming addPod() can be allocated the address that was assigned to the node for hybrid overlay. Fix that by assigning the existing hybrid overlay DRIP addresses during syncPods(). Since we do not support a user manually editing the hybrid overlay DRIP address or pod IPs, any IP address once set as the hybrid overlay DRIP address will always be valid. Signed-off-by: Jacob Tanenbaum <[email protected]>
When pod updates happen, handlers may be invoked that add pod IPs to address sets. A prime example of this is network policy handlePeerPods. Each time the pod is updated, the address will be added to the address set with a mutate-insert operation. The cost of this txn when there are many network policies that apply to this pod is high. For example, with a pod that has 730 network policies, 7 pod updates would result in 5110 transactions instead of only the 730 needed to add this pod in the first place. Signed-off-by: Tim Rozet <[email protected]>
Checks the cache in order to decide whether or not we actually need to send a transaction to add the port. This results in a perf gain of not wasting OVSDB transactions. Signed-off-by: Tim Rozet <[email protected]>
Eliminates duplicate transactions for adding to address sets and port groups
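A minimal sketch of the caching idea behind these two commits (types and names are illustrative, not the actual ovn-kubernetes structures): track the current membership locally and only issue an OVSDB transaction when the add would actually change it.

```go
package main

import (
	"fmt"
	"sync"
)

// addressSetCache mirrors the membership of one address set so repeated pod
// updates do not each turn into a mutate-insert transaction.
type addressSetCache struct {
	mu  sync.Mutex
	ips map[string]struct{}
}

func newAddressSetCache() *addressSetCache {
	return &addressSetCache{ips: map[string]struct{}{}}
}

// needsAdd records the IP and reports whether an OVSDB transaction is
// actually required, i.e. the IP was not already a member.
func (c *addressSetCache) needsAdd(ip string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, ok := c.ips[ip]; ok {
		return false // already present: skip the redundant transaction
	}
	c.ips[ip] = struct{}{}
	return true
}

func main() {
	cache := newAddressSetCache()
	fmt.Println(cache.needsAdd("10.244.1.3")) // true: first add, send the transaction
	fmt.Println(cache.needsAdd("10.244.1.3")) // false: later pod updates become no-ops
}
```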
to avoid ovn-controller syntax error. Signed-off-by: Nadia Pinaeva <[email protected]>
LinkRoutesAddOrUpdateSourceOrMTU/LinkRoutesApply: update route GW if it was changed
selector to reduce the number of per-namespace pod handlers. We don't need per-namespace handlers in the case of an empty namespace selector; just use the pod selector with an empty namespace. Signed-off-by: Nadia Pinaeva <[email protected]>
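A minimal sketch of the check this relies on (the helper name is hypothetical): an empty namespace selector matches every namespace, so a single cluster-wide pod handler using the pod selector is enough and per-namespace handlers can be skipped.

```go
package main

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// selectsAllNamespaces reports whether a peer's namespace selector is the
// empty selector, which matches every namespace.
func selectsAllNamespaces(nsSelector *metav1.LabelSelector) bool {
	return nsSelector != nil &&
		len(nsSelector.MatchLabels) == 0 &&
		len(nsSelector.MatchExpressions) == 0
}

func main() {
	empty := &metav1.LabelSelector{}
	scoped := &metav1.LabelSelector{MatchLabels: map[string]string{"team": "net"}}
	fmt.Println(selectsAllNamespaces(empty))  // true: one cluster-wide pod handler
	fmt.Println(selectsAllNamespaces(scoped)) // false: per-namespace handling still needed
}
```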
…found-fast-fix Fix address set cleanup: only delete address sets owned by given object.
…selector Use PeerPodHandler if namespaceAndPod selector has empty namespace
…ric_desc_pod_creation Metrics: update pod creation latency metric description
…ems. Commit 09af653 moved the LRP creation from ensureNodeLogicalNetwork to syncNodeClusterRouterPort, so that when creating the LRP the gateway-chassis can be set, to pin the LS to a chassis. However, commit c58554a accidentally reverted the change when converting nbctl to libovsdb (it is likely a rebase problem). It creates the LRP in ensureNodeLogicalNetwork without setting gateway-chassis, which ends up with all existing ovn-controllers changing their monitor condition to monitor the new LS datapath. And later, when ovnkube-node annotates the chassis-id, ovnkube-master calls syncNodeClusterRouterPort, which updates the LRP and sets the gateway-chassis, so all the ovn-controllers (except the new node) change their monitor condition again to remove the LS. Such monitor condition changes come from every existing node, so the SB DB becomes extremely busy refiltering all the existing lflows/PBs for all the monitor conditions - an O(M x N) problem. Even if conditional monitoring is disabled, this would cause unnecessary churn for all the ovn-controllers. This patch fixes the problem by: 1. Removing the LRP creation from ensureNodeLogicalNetwork. 2. Combining the LRP creation and gateway-chassis setting into the same transaction. For this purpose the related APIs in router.go are refactored/simplified. Signed-off-by: Han Zhou <[email protected]>
…-master.v2 fix hybrid overlay allocation on the master
…le-creating-lrp Set gateway-chassis when LRP is being created, to avoid scaling problems
This reverts commit dabaf9f. Original PR never passed control plane tests. Signed-off-by: Tim Rozet <[email protected]>
Revert "fix hybridOverlay DRIP address allocation"
Today, a port group is created per network policy, even if multiple policies share the same local pod selector. This means that for the same selected pod, its logical port will be added/deleted for multiple port groups. This commit adds support for sharing port_groups across multiple network policies if they have the same local pod selector. Signed-off-by: Yun Zhou <[email protected]>
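A minimal sketch of the sharing scheme described above (names and hashing are illustrative, not the actual implementation): key port groups by the canonical form of the local pod selector and reference-count the policies using each one, so policies with identical selectors reuse one port group.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sync"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

type portGroup struct {
	name     string
	refCount int
}

type sharedPortGroups struct {
	mu     sync.Mutex
	groups map[string]*portGroup // keyed by the canonical local pod selector
}

// acquire returns the port group for the given local pod selector, creating
// it on first use and bumping a reference count on every reuse.
func (s *sharedPortGroups) acquire(podSelector *metav1.LabelSelector) *portGroup {
	key := metav1.FormatLabelSelector(podSelector)
	s.mu.Lock()
	defer s.mu.Unlock()
	if pg, ok := s.groups[key]; ok {
		pg.refCount++
		return pg
	}
	sum := sha256.Sum256([]byte(key))
	pg := &portGroup{name: fmt.Sprintf("pg_%x", sum[:8]), refCount: 1}
	s.groups[key] = pg
	return pg
}

func main() {
	s := &sharedPortGroups{groups: map[string]*portGroup{}}
	sel := &metav1.LabelSelector{MatchLabels: map[string]string{"app": "web"}}
	a, b := s.acquire(sel), s.acquire(sel)
	fmt.Println(a.name == b.name, b.refCount) // true 2: same selector shares one port group
}
```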
- What this PR does and why is it needed
- Special notes for reviewers
- How to verify it
- Description for the changelog