Make computation of mesh laplacian more efficient #10002

Open: minnerbe wants to merge 3 commits into base: master
@minnerbe minnerbe commented Feb 5, 2025

This PR makes the computation of the mesh Laplacian more efficient without changing its results. In essence, the changes are:

  • Remove the redundant computation of the triangle area. The norm of the cross product $|e_{ij} \times e_{jk}|$ equals twice the area of a triangle for any pair of its edges, yet it was computed 6 times per triangle (3x for the stiffness, 3x for the mass).
  • Aggregate triangle areas over faces instead of edges.
  • Slightly optimize the number of operations in some expressions.
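
To illustrate the first two points, here is a minimal pure-Python sketch, not the code from this PR. It assumes cotangent weights of the form $\cot\theta = (a \cdot b) / |a \times b|$ and a barycentric lumped mass of one third of the triangle area per vertex. All function names (`face_terms_redundant`, `face_terms_shared`, `vertex_masses`) are hypothetical, and the actual implementation operates on tensors rather than on single triangles.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(dot(a, a))


def face_terms_redundant(p0, p1, p2):
    """Evaluate the cross-product norm 6 times per triangle (3x cot, 3x mass)."""
    pts = (p0, p1, p2)
    cots, masses = [], []
    for i in range(3):
        a = sub(pts[(i + 1) % 3], pts[i])
        b = sub(pts[(i + 2) % 3], pts[i])
        cots.append(dot(a, b) / norm(cross(a, b)))  # cotangent at corner i
    for i in range(3):
        a = sub(pts[(i + 1) % 3], pts[i])
        b = sub(pts[(i + 2) % 3], pts[i])
        masses.append(norm(cross(a, b)) / 6.0)  # (area) / 3 = (|a x b| / 2) / 3
    return cots, masses


def face_terms_shared(p0, p1, p2):
    """Evaluate the cross-product norm once and reuse it for all six terms."""
    pts = (p0, p1, p2)
    double_area = norm(cross(sub(p1, p0), sub(p2, p0)))
    cots = [
        dot(sub(pts[(i + 1) % 3], pts[i]), sub(pts[(i + 2) % 3], pts[i])) / double_area
        for i in range(3)
    ]
    masses = [double_area / 6.0] * 3
    return cots, masses


def vertex_masses(pos, faces):
    """Aggregate per-face areas onto vertices (one area evaluation per face)."""
    mass = [0.0] * len(pos)
    for i, j, k in faces:
        third_area = norm(cross(sub(pos[j], pos[i]), sub(pos[k], pos[i]))) / 6.0
        for v in (i, j, k):
            mass[v] += third_area
    return mass
```

For the right triangle with vertices (0,0,0), (1,0,0), (0,1,0), both variants return cotangents 0, 1, 1 and three mass shares that sum to the triangle area 0.5; the shared variant simply gets there with one cross product instead of six.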

All tests pass. In addition, while benchmarking, I checked that the new and old implementations agree on random triangulations with 1e2 to 1e5 nodes. The corresponding benchmarking script is not included in this PR, but can be found here. Since this doesn't change the observable behavior of the code, I did not add a line to the changelog.

The performance improvements vary across architectures and with the number of nodes in the triangulation. The tables below show the speedup per architecture (rows) and normalization method (columns), with one table per mesh size N (number of nodes). These numbers are of course anecdotal, but the fact that there is a speedup seems pretty consistent.

N=1e2             None    sym    rw
x86_64            1.29x   2.00x  2.02x
aarch64           1.37x   2.16x  2.21x
cuda              1.32x   2.07x  2.07x

N=1e4             None    sym    rw
x86_64            1.19x   2.24x  2.13x
aarch64           1.08x   1.99x  1.96x
cuda              1.28x   2.06x  2.06x

N=1e5             None    sym    rw
x86_64            1.08x   1.75x  1.83x
aarch64           1.03x   1.91x  1.95x
cuda              2.24x   2.95x  2.98x

@minnerbe minnerbe requested a review from wsad1 as a code owner February 5, 2025 01:50