warm/fast-reboot fails on latest master due to error in PortsOrch::bake(): Invalid port table #21688

Open
stepanblyschak opened this issue Feb 10, 2025 · 4 comments · May be fixed by sonic-net/sonic-swss#3505
Labels: Triaged

Comments

@stepanblyschak
Collaborator

Description

Warm or fast reboot fails with the following error log:

2025 Feb 10 09:33:07.111055 sonic NOTICE swss#orchagent: :- bake: foundPortConfigDone = 1
2025 Feb 10 09:33:07.111080 sonic NOTICE swss#orchagent: :- bake: foundPortInitDone = 1
2025 Feb 10 09:33:07.111395 sonic NOTICE swss#orchagent: :- bake: m_portTable->getKeys 263
2025 Feb 10 09:33:07.111403 sonic NOTICE swss#orchagent: :- bake: portCount = 257, m_portCount = 0
2025 Feb 10 09:33:07.111403 sonic ERR swss#orchagent: :- bake: Invalid port table: portCount, expecting 257, got 261

PORT_TABLE contains PortChannel oper_status entries which are not expected by portsorch:

admin@sonic:~$ redis-cli -n 0 keys 'PORT_TABLE:Port*'
1) "PORT_TABLE:PortChannel103"
2) "PORT_TABLE:PortChannel102"
3) "PORT_TABLE:PortConfigDone"
4) "PORT_TABLE:PortInitDone"
5) "PORT_TABLE:PortChannel101"
6) "PORT_TABLE:PortChannel104"
admin@r-moose-01:~$ redis-cli -n 0 hgetall 'PORT_TABLE:PortChannel103'
1) "oper_status"
2) "up"
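
The numbers in the log line up with this: 263 keys minus the two marker entries (PortConfigDone and PortInitDone) gives 261, and 261 minus the four PortChannel entries gives the expected 257. Below is a minimal sketch of that arithmetic, assuming the check simply skips the two marker keys and compares the remainder with the expected count. The helper names `countPortEntries` and `portTableLooksValid` are hypothetical and are not taken from portsorch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: count PORT_TABLE keys, skipping the two marker
// entries that signal completion and do not describe a port.
std::size_t countPortEntries(const std::vector<std::string>& keys)
{
    std::size_t count = 0;
    for (const auto& key : keys)
    {
        if (key == "PortConfigDone" || key == "PortInitDone")
        {
            continue;
        }
        ++count;
    }
    return count;
}

// Hypothetical check: the table is accepted only when the number of
// remaining entries equals the expected port count.
bool portTableLooksValid(const std::vector<std::string>& keys,
                         std::size_t expected)
{
    return countPortEntries(keys) == expected;
}
```

With 257 port keys, the two markers, and four extra PortChannel keys, `countPortEntries` returns 261, so a comparison against 257 fails in the same way as the logged error.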

Reproduction

  1. Build latest master sonic-buildimage and update sonic-swss to latest master
  2. Load T0 config with PortChannels
  3. Perform a warm-reboot or fast-reboot
  4. Observe error in the log

RCA

Likely caused by PR sonic-net/sonic-swss#3383, which moves the updateDbPortOperStatus() call out of the PHY-only block so that it runs for all port types:

+    updateDbPortOperStatus(port, status);
+
     if (port.m_type == Port::PHY)
     {
-        updateDbPortOperStatus(port, status);

https://github.com/sonic-net/sonic-swss/pull/3383/files?diff=unified&w=0#diff-bb736c6a11da68afacf95d761fb52a3c11726cdd72c967d4803b8c9b1a6dcb69R8271

@stepanblyschak
Collaborator Author

@bradh352
Contributor

Ah, @VladimirKuk suggested moving the call outside the if block, and I didn't fully evaluate which port types that call was valid for. I'll submit a corrective PR that puts it under an if block for PHY and TUNNEL types.
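
The guard described in this comment could look roughly like the sketch below. It is an illustration only, not the code from the follow-up PR: the `PortType` enum and the `shouldUpdateDbOperStatus` helper are hypothetical stand-ins, and the only port types known from this thread are PHY (from the diff) and TUNNEL (from this comment). LAG is included as an assumed name for the PortChannel case.

```cpp
// Hypothetical stand-in for the port type enum used by the orchagent.
// PHY appears in the diff quoted in the issue; TUNNEL is named in the
// comment; LAG is an assumed label for PortChannel ports.
enum class PortType
{
    PHY,
    TUNNEL,
    LAG
};

// Hypothetical predicate: only PHY and TUNNEL ports should have their
// oper_status written to PORT_TABLE, so PortChannel entries never appear
// there and the warm/fast-reboot port count stays as expected.
bool shouldUpdateDbOperStatus(PortType type)
{
    return type == PortType::PHY || type == PortType::TUNNEL;
}
```

A caller would wrap the updateDbPortOperStatus() call in a check of this kind, so that PortChannel status changes no longer add entries to PORT_TABLE.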

bradh352 added a commit to bradh352/sonic-swss that referenced this issue Feb 10, 2025
PORT_TABLE contains PortChannel oper_status entries that are not
expected by portsorch, which leads to warm/fast-reboot failures
like:
```
2025 Feb 10 09:33:07.111055 sonic NOTICE swss#orchagent: :- bake: foundPortConfigDone = 1
2025 Feb 10 09:33:07.111080 sonic NOTICE swss#orchagent: :- bake: foundPortInitDone = 1
2025 Feb 10 09:33:07.111395 sonic NOTICE swss#orchagent: :- bake: m_portTable->getKeys 263
2025 Feb 10 09:33:07.111403 sonic NOTICE swss#orchagent: :- bake: portCount = 257, m_portCount = 0
2025 Feb 10 09:33:07.111403 sonic ERR swss#orchagent: :- bake: Invalid port table: portCount, expecting 257, got 261
```

Fixes sonic-net/sonic-buildimage#21688

Signed-off-by: Brad House (@bradh352)
@bradh352
Contributor

Can you try sonic-net/sonic-swss#3505?

@bingwang-ms
Contributor

Discussed in the issue triage meeting: the issue is fixed by sonic-net/sonic-swss#3505. The PR is pending merge.

@bingwang-ms added the Triaged label Feb 12, 2025

3 participants