geoplot.kdeplot --> overlapping isolines #266

Open
patsaylor opened this issue Feb 25, 2022 · 8 comments

Comments

@patsaylor

When using kdeplot with only one dataset, the isolines appear to overlap instead of merging, producing two local maxima rather than a consistent set of isolines around all features.
Screen Shot 2022-02-25 at 9 01 22 AM

@ResidentMario
Owner

Can you provide a copy of the dataset, a copy of the code, and the output of pip list please?

@patsaylor
Author

Hello! Thanks for looking into the issue. Here are the sample data and the pip list file:

geoplot_sample_data.csv
gis_pip_list.txt

@patsaylor
Author

Sorry, the comment above got cut off. Please find the files attached, and thanks for looking into the issue!

@ResidentMario
Owner

The sample data you provided is in some unknown non-CSV format:

image

Can you reupload in CSV format please?

Also, can you provide the code please? In order to look into this, I need a minimal reproducible example.

@patsaylor
Author

geoplot_sample.zip
Hello, please let me know if this will work - included:
input file
run script
sample output figure

thanks again for looking into this!

@ResidentMario
Owner

Alright, I was able to repro.

The underlying issue is an interesting one. Longitude lies on a periodic [-180, 180] axis (latitude runs from -90 to 90 and does not wrap). So, for example, the next degree after you reach 180 degrees longitude is -179 degrees longitude. The kernel density estimation algorithm is naive to this; it expects a smooth continuous axis from (-inf, +inf) in both directions.
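To make the wrap-around concrete, here is a minimal sketch in plain Python (not geoplot or seaborn code; the function names are hypothetical). It compares the separation a number-line algorithm sees between two longitudes with their actual angular separation:

```python
def naive_gap(lon_a, lon_b):
    """Separation on a plain number line, which is what a standard KDE assumes."""
    return abs(lon_a - lon_b)


def wrapped_gap(lon_a, lon_b):
    """Shortest angular separation in degrees, accounting for the wrap at +/-180."""
    diff = (lon_a - lon_b + 180.0) % 360.0 - 180.0
    return abs(diff)


# Two points sitting on either side of the 180-degree meridian.
print(naive_gap(179.0, -179.0))    # 358.0
print(wrapped_gap(179.0, -179.0))  # 2.0
```

The two points are only 2 degrees apart on the globe, but an estimator working on the raw values treats them as 358 degrees apart and never blends their kernels.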

Because your data straddles the longitudinal boundary (the antimeridian at ±180 degrees, which the international date line roughly follows), the KDE algorithm generates density contours that (1) don't connect at the boundary and (2) extend past the maximum extent of the coordinate system, e.g. past 180 degrees and -180 degrees. If you remove the projection, this becomes obvious:

image

You are using a projection, and so cartopy appears to "wrap" the coordinate values that are past the maximum coordinate value back onto the coordinate grid. I think it just translates e.g. 182 degrees to -178 degrees, 185 degrees to -175 degrees, and so on. This causes the boundary lines to overlap themselves, as you are seeing here.
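As a rough illustration of that wrapping, the sketch below assumes the usual modulo convention for mapping longitudes into [-180, 180). Whether cartopy does exactly this internally is an assumption; `wrap_longitude` is a hypothetical helper, not a cartopy function.

```python
def wrap_longitude(lon):
    """Map a longitude in degrees into the half-open range [-180, 180)."""
    return (lon + 180.0) % 360.0 - 180.0


# Contour vertices that a naive KDE placed beyond the +180 edge.
for lon in [176.0, 179.0, 182.0, 185.0]:
    print(lon, "->", wrap_longitude(lon))
# 182.0 lands at -178.0 and 185.0 at -175.0, i.e. on top of whatever
# contours were already drawn just east of the -180 edge.
```

Under this convention, the part of a contour that overshoots one edge of the map reappears at the opposite edge, where it crosses the contours generated from the data on that side instead of joining them.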

So that explains what's going wrong. Now, how to solve it? I'm not sure, actually. Fixing this boundary issue would require writing a custom kernel for the density estimator, which seaborn actually (surprisingly, IMO) removed support for recently (see here). I might just have to put a disclaimer in the documentation telling people the plot won't work for data straddling coordinate boundaries.
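For a sense of what a boundary-aware kernel could look like, here is a one-dimensional sketch of a "wrapped" Gaussian KDE over longitude only. It is an illustration under stated assumptions, not something geoplot or seaborn provides: the function names, the bandwidth, and the sample values are all made up, and a real fix would need a 2D estimator wired into the plotting code.

```python
import math


def gaussian(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def naive_kde(x, samples, bandwidth):
    """Ordinary 1D Gaussian KDE on an unbounded number line."""
    total = sum(gaussian((x - s) / bandwidth) for s in samples)
    return total / (len(samples) * bandwidth)


def periodic_kde(x, samples, bandwidth, period=360.0, images=1):
    """Gaussian KDE where each sample also contributes via copies shifted by +/- period."""
    total = 0.0
    for s in samples:
        for k in range(-images, images + 1):
            total += gaussian((x - s + k * period) / bandwidth)
    return total / (len(samples) * bandwidth)


# Hypothetical longitudes clustered around the antimeridian.
lons = [177.0, 178.5, 179.5, -179.5, -178.0, -177.0]
h = 2.0  # bandwidth in degrees, chosen arbitrarily for the example

# The naive estimate at 180 only "sees" the samples on the positive side.
print(naive_kde(180.0, lons, h), periodic_kde(180.0, lons, h))
# The periodic estimate agrees on both sides of the seam.
print(periodic_kde(180.0, lons, h), periodic_kde(-180.0, lons, h))
```

With a bandwidth that is small relative to 360 degrees, one image on each side is enough; larger bandwidths would need more images.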

Now that I think about it, I think all of the other analytical plot types (specifically voronoi and quadtree) share this problem. 😓

@patsaylor
Author

Thanks for the thorough investigation and clear explanation of the issue.
Thinking about next steps:
Did you happen to look into separating the analysis step from the projection? Perhaps converting the longitude coordinates to 0-360 prior to the KDE analysis, and then reprojecting for mapping?
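A minimal sketch of that idea in plain Python, with hypothetical helper names and made-up sample longitudes. It only shows the coordinate shift; the KDE and the plotting step are left out, and it is untested against geoplot itself.

```python
def to_0_360(lon):
    """Shift a longitude from [-180, 180] into [0, 360)."""
    return lon % 360.0


def to_pm180(lon):
    """Shift a longitude back into [-180, 180) for mapping."""
    return (lon + 180.0) % 360.0 - 180.0


# Hypothetical longitudes clustered around the antimeridian.
lons = [177.0, 178.5, 179.5, -179.5, -178.0, -177.0]

shifted = [to_0_360(x) for x in lons]
print(shifted)                      # [177.0, 178.5, 179.5, 180.5, 182.0, 183.0]
print(max(lons) - min(lons))        # 359.0: the cluster looks split in two
print(max(shifted) - min(shifted))  # 6.0: the cluster is contiguous again

# After the analysis, coordinates can be shifted back for mapping.
restored = [to_pm180(x) for x in shifted]
print(restored == lons)             # True
```

Note that this moves the seam from ±180 to 0/360 rather than removing it, so it would only help for data that does not also straddle the prime meridian. Contours computed in the shifted space may still need to be split where they cross the antimeridian before they are drawn.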

@ResidentMario
Owner

Not sure, but maybe.
