Graph def optimize #185
Open
Aloyshaaaa wants to merge 59 commits into MooreThreads:main from
new fix pr (MooreThreads#67)
…l+BiasAdd into single kernel launch
Merge upstream/main into graph_def_optimize while preserving the PR-side fusion implementations under the new musa_ext/kernels/fusion layout. The only content conflict in tensordot_bias_fusion.cc kept the upstream include path and equivalent output-name removal logic, while PR-only fusion files were moved with the directory reorg and their includes were updated.

Constraint: Upstream moved fusion implementations from musa_ext/mu/graph_fusion to musa_ext/kernels/fusion in MooreThreads#195
Rejected: Keep PR-only fusion files under musa_ext/mu/graph_fusion | would reintroduce the old layout and conflict with the current build organization
Confidence: medium
Scope-risk: moderate
Directive: Future fusion implementation files should live under musa_ext/kernels/fusion, not musa_ext/mu/graph_fusion
Tested: git diff --cached --check; verified no unmerged paths; verified no #include "mu/graph_fusion/ references remain
Not-tested: CMake/build unavailable because cmake is not installed in this environment
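The "no `#include "mu/graph_fusion/` references remain" check above can be reproduced with a small script. This is a minimal sketch, not the command the author ran: the function name `find_stale_includes` and the default suffix list are assumptions made for illustration.

```python
import re
from pathlib import Path

# Matches a quoted include that still points at the old directory layout.
STALE_INCLUDE = re.compile(r'#include\s+"mu/graph_fusion/')


def find_stale_includes(root, suffixes=(".cc", ".h")):
    """Return (path, line_number, line) for each stale include under root.

    The suffix list is an assumption; extend it to whatever source
    extensions the tree actually uses.
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if STALE_INCLUDE.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

An empty result for the source root corresponds to the "Tested" claim; any hit names the file and line that still uses the old path.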
…nsorflow_musa_extension into graph_def_optimize
The PR branch needed to absorb upstream/main after the linear fusion path was renamed and generalized from MusaLinearRelu to MusaLinearActivation. The conflict resolution now takes the upstream contents for both conflicted files, including the upstream MusaLinearActivation kernel implementation and upstream fusion test expectations.

Constraint: User requested all conflict resolutions follow upstream exactly
Rejected: Preserve branch-specific relaxed float32 tolerances | conflicts must use upstream side
Confidence: medium
Scope-risk: moderate
Tested: compared conflicted files against upstream/main; git diff --check; rg conflict marker scan; python -m py_compile test/fusion/linear_relu_fusion_test.py
Not-tested: Full MUSA build/tests; local TensorFlow is 2.21.0 while build.sh requires 2.6.1 and libmusa_plugin.so is not built
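The exact `rg` pattern behind the conflict marker scan is not given in the commit message. As a rough sketch of what such a scan checks, assuming the default (non-diff3) Git marker style and a hypothetical helper name `find_conflict_markers`:

```python
import re

# Default Git conflict markers: seven '<' or '>' followed by a space and a
# label, or exactly seven '=' alone on the line.
CONFLICT_MARKER = re.compile(r"^(<{7} |={7}$|>{7} )")


def find_conflict_markers(text):
    """Return (line_number, line) for each line that looks like a marker."""
    return [
        (lineno, line)
        for lineno, line in enumerate(text.splitlines(), start=1)
        if CONFLICT_MARKER.match(line)
    ]
```

A line of exactly seven `=` characters can also occur in ordinary text files, so a hit is a prompt to look at the file rather than proof of an unresolved conflict.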
The graph_def_optimize branch moved PLN cascade headers under the fusion kernel directory and keeps the transpose helper under array kernels, but a few fusion ops still referenced the old include locations. Build failed before linking even though TensorFlow 2.6.1 and MUSA were configured correctly.

Constraint: Build runs inside aloysha_musa435_tf261 with the tf261 conda environment.
Rejected: Add broader include directories | hides stale include paths and makes future source moves harder to catch.
Confidence: high
Scope-risk: narrow
Tested: ./build.sh release in aloysha_musa435_tf261; produced build/libmusa_plugin.so and _runtime_config_bindings module
Not-tested: Wheel packaging and runtime op execution
s80 run time: 15.7 ms