
How does the package process raster metadata (scale, offset, nodata) when passed to ebv_add? #42

Open
bettasimousss opened this issue Sep 12, 2024 · 1 comment


bettasimousss commented Sep 12, 2024

Hello,

I have a set of raster datasets in GeoTIFF format. They are stored as UINT16 (scale = 10000, offset = 0) with NoData set to 65535.
After setting up the metadata on the portal and downloading the JSON, I create and populate the netCDF file with these rasters as follows:

```r
ebv_create_taxonomy(
  jsonpath   = metadata_json,
  outputpath = newNc,
  taxonomy   = taxo_path,
  sep        = ',',
  lsid       = FALSE,
  epsg       = 3035,
  resolution = c(1000, 1000),
  prec       = 'integer',
  fillvalue  = 65535,
  extent     = c(723000, 7700000, 160000, 6615000),
  overwrite  = TRUE,
  verbose    = FALSE
)
```

At this point, visualizing the data in Panoply, the arrays are filled with 65535 as expected.
Now, I wonder whether I should add the data layers as paths to the TIFFs. In that case ebvcube seems to read them as float, overriding the NoData value with the default one for float and setting all valid values to 0.

Or is it preferable to add the data as arrays, providing the values as integers (not applying the scale)? In that case, how do I set up the scale so that the visualization on the portal is still in the [0, 1] range?
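For context, the encoding I have in mind follows the usual CF-style convention, where the true value is recovered as `stored_value * scale_factor + add_offset`. A minimal sketch in R, assuming my "scale = 10000" means the stored UINT16 counts are the [0, 1] scores multiplied by 10000:

```r
# CF-style decoding: true_value = stored * scale_factor + add_offset
scale_factor <- 1 / 10000   # stored counts of 0..10000 map to scores 0..1
add_offset   <- 0
nodata       <- 65535       # UINT16 NoData sentinel

stored  <- c(0L, 5000L, 10000L, 65535L)   # example raw UINT16 values
decoded <- ifelse(stored == nodata, NA,
                  stored * scale_factor + add_offset)
decoded   # 0.0, 0.5, 1.0, NA
```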

Thanks in advance for your support !

@LuiseQuoss (Collaborator)

Hello Sara,

Thanks for your feedback and questions.
We have not yet implemented scale and offset. There are two possible approaches for you:

1. Pass your data as float values, and do the same with the NoData value. For this you need to read the data from the TIFFs, convert the values to floats, and pass them as arrays to ebv_add_data().
2. Pass the integer values directly from the TIFF files to ebv_add_data() without converting to floats. Clearly define the units in the netCDF metadata as something like “Continuous 0-1 Score (x10000)” to inform users that values need to be divided by 10000 to obtain scores in the 0-1 range. This dataset shows a similar implementation.

Option 2 seems the most suitable given that you are dealing with very large datasets.
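The two options above might be sketched roughly as follows. This is only an illustration: the file name and the datacube path are hypothetical, and the ebv_add_data() argument names are assumptions based on the ebvcube documentation, so please check ?ebv_add_data for the exact signature of your package version.

```r
library(terra)     # for reading the GeoTIFFs
library(ebvcube)

r    <- rast('layer_2020.tif')   # hypothetical input TIFF
vals <- as.matrix(r, wide = TRUE)

# Option 2: pass the raw UINT16 counts straight through,
# documenting the x10000 scaling in the units metadata.
ebv_add_data(
  filepath_nc  = newNc,
  datacubepath = 'scenario_1/metric_1/ebv_cube',  # hypothetical cube path
  entity       = 1,
  timestep     = 1,
  data         = vals
)

# Option 1 would instead convert to float first, replacing the
# integer NoData sentinel before dividing by the scale:
# vals_float <- ifelse(vals == 65535, NA, vals / 10000)
```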

We plan to implement scale and offset for the EBVCube netCDFs; however, this will only happen in the mid-term.

I have started investigating the overwriting of the NoData value. Thanks for pointing it out!
