Description
I have a question about how long opus_decode takes after compiling for the Mac and iOS platforms.
ARM NEON inference is already present in the existing code (vec_neon.h). I measured the time taken by the opus_decode function when running opus_demo (compiled with --enable-deep-plc) on a Mac M3 Pro laptop, counting only the calls that happen when a packet is lost. The average time is about 0.3 ms. On an iOS device (iPhone 12 Pro Max) the average time is 2.5 ms and the peak time is 7 ms (only 1.5 ms without deep PLC enabled). Both are Release builds.
I am already using FARGAN, and I added the -march=armv8.2-a+dotprod option.
Has anybody measured the run time of the DNN when it is used online?
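For anyone who wants to reproduce the average and peak figures above, here is a minimal sketch of how such a measurement could be taken. It is not the code I used in opus_demo. The names timing_stats, now_ms, timing_add, timing_avg, measure and fake_concealment are hypothetical, and fake_concealment is only a stand-in workload. In a real test the timed call would be the concealment call, for example opus_decode(dec, NULL, 0, pcm, frame_size, 0).

```c
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime on strict C builds */
#include <stdio.h>
#include <time.h>

/* Running statistics for one timed call site (hypothetical helper type). */
typedef struct {
    double total_ms;  /* sum of all measured durations */
    double peak_ms;   /* largest single duration seen */
    long   count;     /* number of measurements */
} timing_stats;

/* Monotonic clock in milliseconds, unaffected by wall-clock adjustments. */
static double now_ms(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1.0e6;
}

/* Record one measurement and update the peak. */
static void timing_add(timing_stats *s, double elapsed_ms) {
    s->total_ms += elapsed_ms;
    if (elapsed_ms > s->peak_ms) s->peak_ms = elapsed_ms;
    s->count++;
}

/* Average duration, or 0 if nothing has been recorded yet. */
static double timing_avg(const timing_stats *s) {
    return s->count ? s->total_ms / s->count : 0.0;
}

/* Stand-in workload (assumption): replace with the lost-packet decode call. */
static volatile double sink;
static void fake_concealment(void) {
    double acc = 0.0;
    for (int i = 0; i < 100000; i++) acc += i * 0.5;
    sink = acc;
}

/* Time fn() individually on every iteration so the peak is per call. */
static timing_stats measure(void (*fn)(void), int iterations) {
    timing_stats s = {0.0, 0.0, 0};
    for (int i = 0; i < iterations; i++) {
        double t0 = now_ms();
        fn();
        timing_add(&s, now_ms() - t0);
    }
    return s;
}
```

Calling measure(fake_concealment, 1000) and printing timing_avg(&s) and s.peak_ms gives the two numbers. Timing each call separately matters here, because a peak of 7 ms would be hidden if only the total over many frames were measured.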
Here are the shell scripts I use to configure and build with CMake. I didn't modify CMakeLists.txt.
For the Mac (M3 Pro):
SRC_DIR="../opus-1.5.2_cmake"
BUILD_DIR="./build"
LIB_DIR="./output"
printf "=== start config arm64 ===\n"
printf "cur dir: %s\n" "${PWD}"
# Start from a clean build directory
rm -rf "${BUILD_DIR}"
# Configure a universal (x86_64 + arm64) Release build with deep PLC enabled
cmake "${SRC_DIR}" -B "${BUILD_DIR}" \
  -DOPUS_DEEP_PLC=ON \
  -DOPUS_BUILD_PROGRAMS=ON \
  -DCMAKE_OSX_ARCHITECTURES="x86_64;arm64" \
  -DCMAKE_OSX_DEPLOYMENT_TARGET="10.15" \
  -DCMAKE_XCODE_ATTRIBUTE_ONLY_ACTIVE_ARCH=NO \
  -DCMAKE_BUILD_TYPE=Release
# Build the opus library target
cmake --build "${BUILD_DIR}" --target opus
For iOS:
SRC_DIR="./opus-1.5.2"
BUILD_DIR="./build"
LIB_DIR="./libs_ios_load"
printf "cur dir: %s\n" "${PWD}"
# Start from a clean build directory
rm -rf "${BUILD_DIR}"
# Configure an arm64 iOS Release build with deep PLC and the dotprod extension
cmake "${SRC_DIR}" -B "${BUILD_DIR}" \
  -DOPUS_DEEP_PLC=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_SYSTEM_NAME=iOS \
  -DCMAKE_OSX_ARCHITECTURES="arm64" \
  -DCMAKE_XCODE_ATTRIBUTE_ONLY_ACTIVE_ARCH=NO \
  -DCMAKE_C_FLAGS="-march=armv8.2-a+dotprod" \
  -DCMAKE_CXX_FLAGS="-march=armv8.2-a+dotprod"
# Build the opus library target
cmake --build "${BUILD_DIR}" --target opus
Are my compile options wrong? Why does opus_decode take so long on iOS? Is the "-march=armv8.2-a+dotprod" option already the default in CMakeLists.txt?
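One way to check whether a given set of compiler flags really turns on the dot-product extension is to test the predefined ACLE macros in a small file compiled with the same flags. The sketch below is not part of Opus. The function names built_with_neon, built_with_dotprod and print_simd_build_flags are hypothetical, and it only reports what the compiler enabled at build time, not any run-time CPU detection the library may perform.

```c
#include <stdio.h>

/* 1 if the compiler enabled Advanced SIMD (NEON) for this translation unit. */
static int built_with_neon(void) {
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return 1;
#else
    return 0;
#endif
}

/* 1 if the dot-product extension was enabled, e.g. by
   -march=armv8.2-a+dotprod or a target that implies it. */
static int built_with_dotprod(void) {
#if defined(__ARM_FEATURE_DOTPROD)
    return 1;
#else
    return 0;
#endif
}

/* Print both build-time results on one line. */
static void print_simd_build_flags(void) {
    printf("NEON: %d, DOTPROD: %d\n", built_with_neon(), built_with_dotprod());
}
```

Compiling this once with and once without the -march option, for the same target, would show whether the option changes anything or whether the toolchain default already covers it.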