Describe the bug
When the OTBR Docker image is started with a simulated RCP using socat 1.8.0.2, the host program crashes after some time.
To Reproduce
1. Have socat 1.8.0.2 installed. (On our PCs, apt-get install socat installs this version.)
2. Modify the ncp_mode script to make it easier to run the OTBR Docker image. (The script is attached below.)
3. Use an expect script to start the OTBR Docker image: sudo top_builddir=./build/temp bash ./tests/scripts/ncp_mode expect docker_debug.exp (the script is attached below).
4. Join the node to the network using this dataset: 0e080000000000010000000300001435060004001fffe002087d61eb42cdc48d6a0708fd0d07fca1b9f0500510ba088fc2bd6c3b3897f7a10f58263ff3030f4f70656e5468726561642d353234660102524f04109dc023ccd447b12b50997ef68020f19e0c0402a0f7f8
5. Start a simulated CLI FTD node and attach it to the network using the same dataset.
6. Use the CLI FTD node to ping the BR node. After about 20 pings, the OTBR in Docker crashes.
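The join and ping steps above can be sketched as a list of OpenThread CLI commands. The helper names `ot_join_commands` and `ot_ping_commands` are hypothetical and only print text; how the commands reach the node (typed by hand, or sent from an expect script) is left open, and the ping target is whichever address of the BR node is reachable in your setup.

```shell
# Print the OpenThread CLI commands that configure the active dataset
# from the hex TLVs in $1 and then start Thread on the node.
ot_join_commands()
{
    printf 'dataset set active %s\n' "$1"
    printf 'ifconfig up\n'
    printf 'thread start\n'
}

# Print COUNT `ping` commands for the address in $1.
# COUNT defaults to 20, the rough number of pings reported before the crash.
ot_ping_commands()
{
    i=0
    while [ "$i" -lt "${2:-20}" ]; do
        printf 'ping %s\n' "$1"
        i=$((i + 1))
    done
}
```

For example, `ot_join_commands "<dataset hex from the steps above>"` prints the three commands to enter on both the BR node and the CLI FTD node, and `ot_ping_commands "<BR address>"` prints the pings to issue from the CLI FTD node.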
Expected behavior
The pings keep succeeding and the OTBR does not crash.
Console/log output
OTBR log before the crash:
Feb 17 02:33:09 11d0d0ac6a01 otbr-agent[109]: 00:03:43.525 [D] RadioSelector-: RadioSelector: UpdateOnTxSucc 15.4 - neighbor:[6e43b8173a1e8adc rloc16:0xe400 radio-pref:{15.4:255} state:Valid]
Feb 17 02:33:09 11d0d0ac6a01 otbr-agent[109]: 00:03:43.525 [I] MeshForwarder-: Sent IPv6 ICMP6 msg, len:196, chksum:8f65, ecn:no, to:0xe400, sec:yes, prio:low, radio:15.4
Feb 17 02:33:09 11d0d0ac6a01 otbr-agent[109]: 00:03:43.525 [I] MeshForwarder-: src:[fd7d:61eb:42cd:8d6a:42:58ff:fe0c:64c4]
Feb 17 02:33:09 11d0d0ac6a01 otbr-agent[109]: 00:03:43.525 [I] MeshForwarder-: dst:[fd1b:8173:6f2a:1:2c50:7562:b3a6:a5f6]
Feb 17 02:33:10 11d0d0ac6a01 otbr-agent[109]: 00:03:44.520 [D] P-SpinelDrive-: Received spinel frame, flg:0x2, iid:0, tid:0, cmd:PROP_VALUE_IS, key:STREAM_RAW, len:125, rssi:-20 ...
Feb 17 02:33:10 11d0d0ac6a01 otbr-agent[109]: 00:03:44.520 [D] P-SpinelDrive-: ... noise:-128, flags:0x0010, channel:20, lqi:0, timestamp:875784317018, rxerr:0
Feb 17 02:33:10 11d0d0ac6a01 otbr-agent[109]: 00:03:44.520 [D] RadioSelector-: RadioSelector: UpdateOnRx 15.4 - neighbor:[6e43b8173a1e8adc rloc16:0xe400 radio-pref:{15.4:255} state:Valid]
Feb 17 02:33:13 11d0d0ac6a01 otbr-agent[109]: 00:03:47.503 [N] MeshForwarder-: Dropping (reassembly queue) IPv6 ICMP6 msg, len:148, chksum:9eb8, ecn:no, sec:yes, error:ReassemblyTimeout, prio:normal, rss:-20.0, radio:15.4
Feb 17 02:33:13 11d0d0ac6a01 otbr-agent[109]: 00:03:47.503 [N] MeshForwarder-: src:[fd1b:8173:6f2a:1:2c50:7562:b3a6:a5f6]
Feb 17 02:33:13 11d0d0ac6a01 otbr-agent[109]: 00:03:47.503 [N] MeshForwarder-: dst:[fdde:ad00:beef:0:c9cf:f625:504c:e0c6]
Feb 17 02:33:19 11d0d0ac6a01 otbr-agent[109]: 00:03:53.371 [I] Mle-----------: Send Advertisement (ff02:0:0:0:0:0:0:1)
Feb 17 02:33:19 11d0d0ac6a01 otbr-agent[109]: 00:03:53.371 [D] P-SpinelDrive-: Sent spinel frame, flg:0x2, iid:0, tid:13, cmd:PROP_VALUE_SET, key:STREAM_RAW, len:70, channel:20, maxbackoffs:4, maxretries:15 ...
Feb 17 02:33:19 11d0d0ac6a01 otbr-agent[109]: 00:03:53.371 [D] P-SpinelDrive-: ... csmaCaEnabled:1, isHeaderUpdated:0, isARetx:0, skipAes:0, txDelay:0, txDelayBase:0
Feb 17 02:33:24 11d0d0ac6a01 otbr-agent[109]: 00:03:58.371 [W] P-RadioSpinel-: radio tx timeout
Feb 17 02:33:24 11d0d0ac6a01 otbr-agent[109]: 00:03:58.375 [C] P-RadioSpinel-: Failed to communicate with RCP - no response from RCP during initialization
Feb 17 02:33:24 11d0d0ac6a01 otbr-agent[109]: 00:03:58.375 [C] P-RadioSpinel-: This is not a bug and typically due a config error (wrong URL parameters) or bad RCP image:
Feb 17 02:33:24 11d0d0ac6a01 otbr-agent[109]: 00:03:58.375 [C] P-RadioSpinel-: - Make sure RCP is running the correct firmware
Feb 17 02:33:24 11d0d0ac6a01 otbr-agent[109]: 00:03:58.375 [C] P-RadioSpinel-: - Double check the config parameters passed as `RadioURL` input
Feb 17 02:33:24 11d0d0ac6a01 otbr-agent[109]: 00:03:58.375 [C] Platform------: HandleRcpTimeout() at radio_spinel.cpp:2035: RadioSpinelNoResponse
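To find the failure quickly in a longer capture, the warning and critical lines can be filtered out of the log. This is a minimal sketch that assumes the syslog layout shown above (`otbr-agent[PID]: uptime [LEVEL] module: message`); the function names are hypothetical, and both functions read the log from stdin however it was captured.

```shell
# Print only the warning [W] and critical [C] otbr-agent lines from stdin.
otbr_log_failures()
{
    grep -E 'otbr-agent\[[0-9]+\]: [0-9:.]+ \[(W|C)\] '
}

# Succeed (exit status 0) if the log on stdin contains the RCP timeout
# seen in this report.
otbr_log_has_rcp_timeout()
{
    grep -q 'RadioSpinelNoResponse'
}
```

Applied to the log above, `otbr_log_failures` keeps the "radio tx timeout" warning and the critical lines that follow it, which makes the 5 second gap between the last sent spinel frame (00:03:53.371) and the timeout (00:03:58.371) easy to spot.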
Additional context
Updated ncp_mode script:
#!/bin/bash
#
# Copyright (c) 2024, The OpenThread Authors.
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
# 3. Neither the name of the copyright holder nor the
# names of its contributors may be used to endorse or promote products
# derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
# LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
# CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
# SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
# INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
# CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.
#
# Test basic functionality of otbr-agent under NCP mode.
#
# Usage:
# ./ncp_mode
set -euxo pipefail
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
readonly SCRIPT_DIR
EXPECT_SCRIPT_DIR="${SCRIPT_DIR}/expect"
readonly EXPECT_SCRIPT_DIR
#---------------------------------------
# Configurations
#---------------------------------------
OT_CLI="${OT_CLI:-ot-cli-ftd}"
readonly OT_CLI
OT_NCP="${OT_NCP:-ot-ncp-ftd}"
readonly OT_NCP
OT_RCP="${OT_RCP:-ot-rcp}"
readonly OT_RCP
OTBR_DOCKER_IMAGE="${OTBR_DOCKER_IMAGE:-otbr-ncp}"
readonly OTBR_DOCKER_IMAGE
ABS_TOP_BUILDDIR="$(cd "${top_builddir:-"${SCRIPT_DIR}"/../../}" && pwd)"
readonly ABS_TOP_BUILDDIR
ABS_TOP_SRCDIR="$(cd "${top_srcdir:-"${SCRIPT_DIR}"/../../}" && pwd)"
readonly ABS_TOP_SRCDIR
ABS_TOP_OT_SRCDIR="${ABS_TOP_SRCDIR}/third_party/openthread/repo"
readonly ABS_TOP_OT_SRCDIR
ABS_TOP_OT_BUILDDIR="${ABS_TOP_BUILDDIR}/../simulation"
readonly ABS_TOP_OT_BUILDDIR
OTBR_COLOR_PASS='\033[0;32m'
readonly OTBR_COLOR_PASS
OTBR_COLOR_FAIL='\033[0;31m'
readonly OTBR_COLOR_FAIL
OTBR_COLOR_NONE='\033[0m'
readonly OTBR_COLOR_NONE
readonly OTBR_VERBOSE="${OTBR_VERBOSE:-0}"
#----------------------------------------
# Helper functions
#----------------------------------------
die()
{
    exit_message="$*"
    echo " *** ERROR: $*"
    exit 1
}
exists_or_die()
{
    [[ -f $1 ]] || die "Missing file: $1"
}
executable_or_die()
{
    [[ -x $1 ]] || die "Missing executable: $1"
}
write_syslog()
{
    logger -s -p syslog.alert "OTBR_TEST: $*"
}
#----------------------------------------
# Test constants
#----------------------------------------
TEST_BASE=/tmp/test-otbr
readonly TEST_BASE
OTBR_AGENT=otbr-agent
readonly OTBR_AGENT
STAGE_DIR="${TEST_BASE}/stage"
readonly STAGE_DIR
BUILD_DIR="${TEST_BASE}/build"
readonly BUILD_DIR
OTBR_DBUS_CONF="${ABS_TOP_BUILDDIR}/src/agent/otbr-agent.conf"
readonly OTBR_DBUS_CONF
OTBR_AGENT_PATH="${ABS_TOP_BUILDDIR}/src/agent/${OTBR_AGENT}"
readonly OTBR_AGENT_PATH
# The node ids
LEADER_NODE_ID=1
readonly LEADER_NODE_ID
# The TUN device for OpenThread border router.
TUN_NAME=wpan0
readonly TUN_NAME
#----------------------------------------
# Test steps
#----------------------------------------
do_build_ot_simulation()
{
    sudo rm -rf "${ABS_TOP_OT_BUILDDIR}/ncp"
    sudo rm -rf "${ABS_TOP_OT_BUILDDIR}/cli"
    sudo rm -rf "${ABS_TOP_OT_BUILDDIR}/rcp"
    OT_CMAKE_BUILD_DIR=${ABS_TOP_OT_BUILDDIR}/ncp "${ABS_TOP_OT_SRCDIR}"/script/cmake-build simulation \
        -DOT_MTD=OFF -DOT_RCP=OFF -DOT_APP_CLI=OFF -DOT_APP_RCP=OFF \
        -DOT_BORDER_ROUTING=ON -DOT_NCP_INFRA_IF=ON -DOT_SIMULATION_INFRA_IF=OFF \
        -DOT_SRP_SERVER=ON -DOT_SRP_ADV_PROXY=ON -DOT_PLATFORM_DNSSD=ON -DOT_SIMULATION_DNSSD=OFF -DOT_NCP_DNSSD=ON \
        -DBUILD_TESTING=OFF
    OT_CMAKE_BUILD_DIR=${ABS_TOP_OT_BUILDDIR}/cli "${ABS_TOP_OT_SRCDIR}"/script/cmake-build simulation \
        -DOT_MTD=OFF -DOT_RCP=OFF -DOT_APP_NCP=OFF -DOT_APP_RCP=OFF \
        -DOT_BORDER_ROUTING=OFF \
        -DBUILD_TESTING=OFF
    OT_CMAKE_BUILD_DIR=${ABS_TOP_OT_BUILDDIR}/rcp "${ABS_TOP_OT_SRCDIR}"/script/cmake-build simulation \
        -DOT_FTD=OFF -DOT_MTD=OFF -DOT_APP_CLI=OFF -DOT_APP_NCP=OFF \
        -DBUILD_TESTING=OFF
}
do_build_otbr_docker()
{
    otbr_docker_options=(
        "-DOT_THREAD_VERSION=1.4"
        "-DOTBR_DBUS=ON"
        "-DOTBR_FEATURE_FLAGS=ON"
        "-DOTBR_TELEMETRY_DATA_API=ON"
        "-DOTBR_TREL=ON"
        "-DOTBR_LINK_METRICS_TELEMETRY=ON"
        "-DOTBR_SRP_ADVERTISING_PROXY=ON"
        "-DOTBR_BACKBONE_ROUTER=ON"
    )
    sudo docker build -t "${OTBR_DOCKER_IMAGE}" \
        -f ./etc/docker/Dockerfile . \
        --build-arg NAT64=0 \
        --build-arg NAT64_SERVICE=0 \
        --build-arg DNS64=0 \
        --build-arg WEB_GUI=0 \
        --build-arg REST_API=0 \
        --build-arg FIREWALL=0 \
        --build-arg OTBR_OPTIONS="${otbr_docker_options[*]}"
}
setup_infraif()
{
    if ! ip link show backbone1 >/dev/null 2>&1; then
        echo "Creating backbone1 with Docker..."
        docker network create --driver bridge --ipv6 --subnet 9101::/64 -o "com.docker.network.bridge.name"="backbone1" backbone1
    else
        echo "backbone1 already exists."
    fi
    sudo sysctl -w net.ipv6.conf.backbone1.accept_ra=2
    sudo sysctl -w net.ipv6.conf.backbone1.accept_ra_rt_info_max_plen=64
}
test_setup()
{
    executable_or_die "${OTBR_AGENT_PATH}"
    # Remove flashes
    sudo rm -vrf "${TEST_BASE}/tmp"
    # OPENTHREAD_POSIX_DAEMON_SOCKET_LOCK
    sudo rm -vf "/tmp/openthread.lock"
    ot_cli=$(find "${ABS_TOP_OT_BUILDDIR}" -name "${OT_CLI}")
    ot_ncp=$(find "${ABS_TOP_OT_BUILDDIR}" -name "${OT_NCP}")
    ot_rcp=$(find "${ABS_TOP_OT_BUILDDIR}" -name "${OT_RCP}")
    executable_or_die "${ot_cli}"
    executable_or_die "${ot_ncp}"
    executable_or_die "${ot_rcp}"
    export EXP_OTBR_AGENT_PATH="${OTBR_AGENT_PATH}"
    export EXP_OT_CLI_PATH="${ot_cli}"
    export EXP_OT_NCP_PATH="${ot_ncp}"
    export EXP_OT_RCP_PATH="${ot_rcp}"
    # We will be creating a lot of log information
    # Rotate logs so we have a clean and empty set of logs uncluttered with other stuff
    if [[ -f /etc/logrotate.conf ]]; then
        sudo logrotate -f /etc/logrotate.conf || true
    fi
    # Preparation for otbr-agent
    exists_or_die "${OTBR_DBUS_CONF}"
    sudo cp "${OTBR_DBUS_CONF}" /etc/dbus-1/system.d
    write_syslog "AGENT: kill old"
    sudo killall "${OTBR_AGENT}" || true
    setup_infraif
    # From now on, all exits are TRAPPED.
    # When they occur, we call the function test_teardown.
    trap test_teardown EXIT
}
test_teardown()
{
    # Capture the exit code so we can return it below
    EXIT_CODE=$?
    readonly EXIT_CODE
    write_syslog "EXIT ${EXIT_CODE} - output logs"
    sudo pkill -f "${OTBR_AGENT}" || true
    sudo pkill -f "${OT_CLI}" || true
    sudo pkill -f "${OT_NCP}" || true
    wait
    echo 'clearing all'
    sudo rm /etc/dbus-1/system.d/otbr-agent.conf || true
    sudo rm -rf "${STAGE_DIR}" || true
    sudo rm -rf "${BUILD_DIR}" || true
    exit_message="Test teardown"
    echo "EXIT ${EXIT_CODE}: MESSAGE: ${exit_message}"
    exit ${EXIT_CODE}
}
otbr_exec_expect_script()
{
    local log_file="tmp/log_expect"
    for script in "$@"; do
        echo -e "\n${OTBR_COLOR_PASS}EXEC${OTBR_COLOR_NONE} ${script}"
        sudo killall ot-rcp || true
        sudo killall ot-cli || true
        sudo killall ot-cli-ftd || true
        sudo killall ot-cli-mtd || true
        sudo killall ot-ncp-ftd || true
        sudo killall ot-ncp-mtd || true
        sudo rm -rf tmp
        mkdir tmp
        {
            sudo -E expect -df "${script}" 2>"${log_file}"
        } || {
            local EXIT_CODE=$?
            echo -e "\n${OTBR_COLOR_FAIL}FAIL${OTBR_COLOR_NONE} ${script}"
            cat "${log_file}" >&2
            return "${EXIT_CODE}"
        }
        echo -e "\n${OTBR_COLOR_PASS}PASS${OTBR_COLOR_NONE} ${script}"
        if [[ ${OTBR_VERBOSE} == 1 ]]; then
            cat "${log_file}" >&2
        fi
    done
}
do_expect()
{
    if [[ $# != 0 ]]; then
        otbr_exec_expect_script "$@"
    else
        mapfile -t test_files < <(find "${EXPECT_SCRIPT_DIR}" -type f -name "ncp_*.exp")
        otbr_exec_expect_script "${test_files[@]}" || die "ncp expect script failed!"
    fi
    exit 0
}
print_usage()
{
    cat <<EOF
USAGE: $0 COMMAND
COMMAND:
    build_ot_sim       Build simulated ot-cli-ftd and ot-ncp-ftd for testing.
    build_otbr_docker  Build otbr docker image for testing.
    expect             Run expect tests for otbr NCP mode.
    help               Print this help.
EXAMPLES:
    $0 build_ot_sim build_otbr_docker expect
EOF
    exit 0
}
main()
{
    if [[ $# == 0 ]]; then
        print_usage
    fi
    export EXP_TUN_NAME="${TUN_NAME}"
    export EXP_LEADER_NODE_ID="${LEADER_NODE_ID}"
    export EXP_OTBR_DOCKER_IMAGE="${OTBR_DOCKER_IMAGE}"
    while [[ $# != 0 ]]; do
        case "$1" in
            build_ot_sim)
                do_build_ot_simulation
                ;;
            build_otbr_docker)
                do_build_otbr_docker
                ;;
            expect)
                shift
                test_setup
                do_expect "$@"
                ;;
            help)
                print_usage
                ;;
            *)
                echo
                echo -e "${OTBR_COLOR_FAIL}Warning:${OTBR_COLOR_NONE} Ignoring: '$1'"
                ;;
        esac
        shift
    done
}
main "$@"
The simulation binaries and the OTBR Docker image were built with the script before running the expect step:
mkdir -p build/temp && sudo top_builddir=./build/temp bash ./tests/scripts/ncp_mode build_ot_sim
sudo top_builddir=./build/temp bash ./tests/scripts/ncp_mode build_otbr_docker
docker_debug.exp:
We have confirmed that this is related to the socat version. With an older version of socat (1.7.4.4), the issue does not happen.
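Before trying to reproduce, it may help to check which socat is installed. The sketch below assumes the `socat -V` banner contains a line of the form `socat version X.Y.Z.W on ...`. The function names are hypothetical, and the classification uses only the two data points from this report (1.8.0.2 fails, 1.7.4.4 works); nothing is claimed about versions in between.

```shell
# Extract the version number from the text of a `socat -V` banner ($1).
socat_version_from_banner()
{
    printf '%s\n' "$1" | sed -n 's/^socat version \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Classify a socat version using only the versions named in this report.
classify_socat_version()
{
    case "$1" in
        1.8.0.2) echo "affected" ;;
        1.7.4.4) echo "works" ;;
        *) echo "unknown" ;;
    esac
}
```

A possible use is `classify_socat_version "$(socat_version_from_banner "$(socat -V)")"`, which prints `affected` on a machine where apt-get installed 1.8.0.2.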