Faster subset gridpoint #452
Open
Pull Request Checklist:
What kind of change does this PR introduce?:
When subsetting a curvilinear dataset (2D lat/lon) to gridpoints, use a `scipy.spatial.KDTree` to find nearest neighbours with Euclidean distance in lat/lon space, instead of computing the great circle distance (in meters) for all points.
We are already using a lat/lon Euclidean distance for the rectilinear case (1D lat/lon). The loss in precision is compensated by a significant performance boost. For example, my use case needed to extract 94 points from an 800x1000 grid. Instead of computing 94x1000x800 great circle distances, we now only need to compute 94 (one per requested point) when tolerance is passed, and none otherwise.
Before this change, my use case took ~120 s and now it takes 350 ms.
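For reviewers who want to see the idea in isolation, here is a minimal sketch of the nearest-neighbour lookup, not the code in this PR. The function name `nearest_gridpoints`, the synthetic grid, and the sample points are all made up for illustration; only `scipy.spatial.KDTree` and its `query` method come from the description above.

```python
import numpy as np
from scipy.spatial import KDTree


def nearest_gridpoints(lat2d, lon2d, pt_lats, pt_lons):
    """Hypothetical helper: nearest grid cell for each point, in degree space."""
    # Flatten the 2D coordinates into an (n_cells, 2) array of (lon, lat) pairs.
    grid = np.column_stack([lon2d.ravel(), lat2d.ravel()])
    tree = KDTree(grid)
    # One query per requested point; distances are Euclidean, in degrees.
    dist, flat_idx = tree.query(np.column_stack([pt_lons, pt_lats]))
    # Convert flat indices back to (row, col) positions on the 2D grid.
    rows, cols = np.unravel_index(flat_idx, lat2d.shape)
    return rows, cols, dist


# Synthetic, slightly skewed curvilinear grid (illustrative values only).
jj, ii = np.meshgrid(np.arange(80), np.arange(100), indexing="ij")
lon2d = -80.0 + 0.1 * ii + 0.02 * jj
lat2d = 40.0 + 0.1 * jj - 0.02 * ii

rng = np.random.default_rng(0)
pt_lons = rng.uniform(-79.0, -72.0, 5)
pt_lats = rng.uniform(40.0, 46.0, 5)

rows, cols, dist = nearest_gridpoints(lat2d, lon2d, pt_lats, pt_lons)
```

The tree is built once over all grid cells and each requested point then costs a single query, which is where the speed-up over an all-pairs distance computation comes from.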
I also modified how we get the lat and lon coordinates to use the utils instead of relying on the `lat` and `lon` names. And I modified these utils so a variable named "lon" is detected as a longitude (and similarly for "lat").
Does this PR introduce a breaking change?:
Somewhat, as the distance metric has changed. In most cases, I don't expect different results, but there could be some extreme cases, such as points near the poles, where a different neighbour is now chosen.
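To make the polar case concrete, here is a small constructed example, not taken from any real dataset. It assumes a spherical Earth with a 6371 km radius and uses a haversine helper, `great_circle_km`, written only for this illustration. Near the pole, a 10° difference in longitude is a short physical distance, so the two metrics disagree on which candidate is nearest.

```python
import numpy as np


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine distance on a sphere (illustrative helper, assumed radius)."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (
        np.sin((lat2 - lat1) / 2) ** 2
        + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * radius_km * np.arcsin(np.sqrt(a))


# Target point close to the North Pole, and two candidate grid cells.
target_lat, target_lon = 89.0, 0.0
cand_lat = np.array([89.0, 85.0])  # candidate 0: same latitude, 10 deg east
cand_lon = np.array([10.0, 0.0])   # candidate 1: same longitude, 4 deg south

# Euclidean distance in degree space (the metric used by the KDTree lookup).
deg_dist = np.hypot(cand_lat - target_lat, cand_lon - target_lon)
# Great circle distance in km (the metric used before this change).
gc_dist = great_circle_km(target_lat, target_lon, cand_lat, cand_lon)

nearest_deg = int(np.argmin(deg_dist))  # candidate 1 is closer in degrees
nearest_gc = int(np.argmin(gc_dist))    # candidate 0 is closer on the sphere
```

Here the degree-space metric picks the cell 4° to the south, while the great circle metric picks the cell 10° to the east, which is only about 20 km away at that latitude.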
Other information: