Develop additional metadata elements for OAM Catalog #55

Open
smit1678 opened this issue Nov 16, 2016 · 19 comments

@smit1678
Collaborator

smit1678 commented Nov 16, 2016

From other OAM improvements, we're going to look to implement some additional metadata elements for PacDID.

Next action: itemize the elements we need to add.

Recap of current:

Current metadata tracked

| element | type | name | description |
| --- | --- | --- | --- |
| uuid | string | File | unique URI to file |
| projection | string | Projection | CRS of the datasource in full WKT format |
| bbox | array | Bounding Box | Pair of min and max coordinates in CRS units, (min_x, min_y, max_x, max_y) |
| footprint | string | Datasource footprint | WKT format, describing the actual footprint of the imagery |
| gsd | number | Ground Spatial Distance | Average ground spatial distance (resolution) of the datasource imagery, expressed in meters |
| file_size | number | File Size | File size on disk in bytes |
| acquisition_start | string | Acquisition Date Start | First date of acquisition in UTC (Combined date and time representation) |
| acquisition_end | string | Acquisition Date End | Last date of acquisition in UTC (Combined date and time representation) (optional) |
| title | string | Title | Human friendly title of the image |
| platform | string | Type of imagery | List of possible platform sources limited to satellite, aircraft, UAV, balloon, kite |
| provider | string | Imagery Provider | Provider/owner of the OIN bucket |
| contact | string | Contact | Name and email address of the data provider |
| properties | object | Properties | Additional properties about the image (optional) |
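
To make the current element list concrete, here is a minimal Python sketch of a record check. It is not the catalog's actual validation: the function name `check_record`, the rule that every element not marked "(optional)" is required, and the case-insensitive platform comparison are all assumptions made for illustration.

```python
from numbers import Number

# Element name -> (expected Python type, required?), transcribed from the list above.
# Assumption: anything not marked "(optional)" is treated as required.
CURRENT_ELEMENTS = {
    "uuid": (str, True),
    "projection": (str, True),
    "bbox": (list, True),
    "footprint": (str, True),
    "gsd": (Number, True),
    "file_size": (Number, True),
    "acquisition_start": (str, True),
    "acquisition_end": (str, False),
    "title": (str, True),
    "platform": (str, True),
    "provider": (str, True),
    "contact": (str, True),
    "properties": (dict, False),
}

# Platform sources named in the list above, lowercased for comparison (assumption).
PLATFORMS = {"satellite", "aircraft", "uav", "balloon", "kite"}


def check_record(record):
    """Return a list of problems found in the record; an empty list means it passes."""
    problems = []
    for name, (expected, required) in CURRENT_ELEMENTS.items():
        if name not in record:
            if required:
                problems.append(f"missing required element: {name}")
            continue
        value = record[name]
        # bool is a subclass of int, so reject it explicitly for numeric elements.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"{name}: expected {expected.__name__}")
    bbox = record.get("bbox")
    if isinstance(bbox, list) and len(bbox) != 4:
        problems.append("bbox: expected [min_x, min_y, max_x, max_y]")
    platform = record.get("platform")
    if isinstance(platform, str) and platform.lower() not in PLATFORMS:
        problems.append(f"platform: {platform!r} is not one of the listed sources")
    return problems
```

Returning a list of problems rather than raising on the first one would let a form show every issue at once, which seems to fit the form-validation use discussed later in the thread.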
@smit1678 smit1678 added this to the OAM Version 2 milestone Nov 16, 2016
@smit1678
Collaborator Author

Below is a recap of the future additions tracked in previous threads (https://github.com/hotosm/OpenAerialMap/tree/master/metadata, #31, #17) plus input received from the SPC and the PacDID project.

Future additions from our last review

| element | type | name | description |
| --- | --- | --- | --- |
| license | string | License | Type of license for imagery |
| tags | string | Tags | User provided tags |

Input from SPC/PacDID

| element | type | name | description |
| --- | --- | --- | --- |
| bands | string | Bands available | Bands available or imaging bands |
| uav_type | string | Type of UAV | Type of UAV used for collection |
| operator | string | Operator of the UAV | Person or company that operated the UAV (could differ from provider) |
| gps | string | GPS Method Used | GPS method used -- autonomous or differential |
| abstract | string | Abstract | Short descriptor of the project or purpose of imagery |

@cgiovando Are there additional metadata items to consider before we select and finalize? Any input on how we want to finalize? License and tags are candidates that we'll definitely want to include.

@nbumbarger
Collaborator

Just wanted to update that we're planning to start work on this enhancement. Some aspects can't be developed until the list is finalized (form validation, for example), but generically scaffolding support for more metadata properties is something we can begin now.

@smit1678
Collaborator Author

From our thinking, we're leaning towards not having UAV-specific metadata items in the spec. bands could be an interesting addition. license and tags should be good to include. The biggest question we're thinking about is how, or whether, we can improve the core OIN spec while also having an extended OAM version.

@mojodna Based on the mosaic work, is there anything we want to be thinking about for helping group items or connect between the Uploader and Catalog better?

@nbumbarger
Collaborator

How would contributors input bands in the form? Would it be a set of checkboxes with some common ones, such as RE, R, G, B, NIR, SWIR, PAN? Or maybe Color, Multispectral, Panchromatic?

@nbumbarger
Collaborator

@smit1678 - I'm going to need staging credentials and URLs for the catalog API in order to work on this feature next week. If anyone has access to these, could you get ahold of me on the Slack channel? As per the docs, the needed variables are:

OAM_DEBUG - Debug mode true or false (default)
AWS_SECRET_KEY_ID - AWS secret key id for reading OIN buckets
AWS_SECRET_ACCESS_KEY - AWS secret access key for reading OIN buckets
DBURI - MongoDB connection url
SECRET_TOKEN - The token used for post requests to /tms endpoint
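
For reference, a minimal Python sketch of reading these five variables. The variable names come from the list above; the `CatalogConfig` class and `load_config` function are hypothetical, and treating everything except OAM_DEBUG as required is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogConfig:
    """Hypothetical container for the settings listed above."""
    debug: bool
    aws_secret_key_id: str
    aws_secret_access_key: str
    dburi: str
    secret_token: str


def load_config(env=None):
    """Build a CatalogConfig from a mapping of environment variables."""
    env = os.environ if env is None else env
    # Assumption: only OAM_DEBUG has a default; the rest must be set.
    required = ["AWS_SECRET_KEY_ID", "AWS_SECRET_ACCESS_KEY", "DBURI", "SECRET_TOKEN"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return CatalogConfig(
        # OAM_DEBUG defaults to false, as stated in the list above.
        debug=env.get("OAM_DEBUG", "false").strip().lower() == "true",
        aws_secret_key_id=env["AWS_SECRET_KEY_ID"],
        aws_secret_access_key=env["AWS_SECRET_ACCESS_KEY"],
        dburi=env["DBURI"],
        secret_token=env["SECRET_TOKEN"],
    )
```

Failing fast with the full list of missing names makes it easier to see which staging credentials still need to be requested.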

cc @mojodna @danielfdsilva

@smit1678
Collaborator Author

smit1678 commented Dec 19, 2016 via email

@nbumbarger
Collaborator

Thanks, @smit1678. I have it running.

@mojodna
Collaborator

mojodna commented Dec 19, 2016

For the small-scale mosaicking (multiple images that are part of the same scene), we may want to add an optional "scene id" UUID (to group them together). That would also help people who've subdivided imagery and want to be able to associate it back together.

For the large-scale mosaicking (everything), gsd and acquisition dates are probably the main inputs. Having "quality" or "priority" might be helpful, but that's incredibly subjective.

@nbumbarger
Collaborator

@mojodna I think the scene concept is implicit in the upload form, because it allows the user to add a list of images within a dataset but also allows for adding multiple datasets. It is true, however, that images submitted in a group within a dataset are not distinguished in the database. Are you proposing that the images in each submitted dataset be automatically assigned a unique group ID?

@nbumbarger
Collaborator

@smit1678 @mojodna I added support for tags (optional) and license (required) to the form and catalog system (not yet merged); those seemed to be the highest priorities. Beyond the UUID question above, is there anything else we want to include at this time (quality, bands)? GSD is already generated, and acquisition dates are required with submission.

@nbumbarger
Collaborator

In addition to adding a description of the new metadata properties to the API documentation, we also need to discuss where they should be exposed in the browser. For example, the license should probably be included in the image preview panel, but maybe we don't want to allow tags, considering that the user is currently able to attach an arbitrary number of them.

@mojodna
Collaborator

mojodna commented Dec 28, 2016

> Are you proposing that the images in each submitted dataset be automatically assigned a unique group ID?

Yes. The uploader is aware of relationships between images, but the rest of the toolchain isn't.

I'm tempted to advocate for the inclusion of bands, but we can table that until/if we decide that RGB(A) imagery is overly limiting.

We can also follow a philosophy of allowing common properties to emerge using well-known tags. This would require that metadata is mutable (in order to change them in response to consensus), which I'm not sure is the case right now.

@cgiovando recommended looking through the OGC Earth Observation Metadata profile of Observations & Measurements to see if there's anything we're missing and to keep an eye out for future harmonization.

@mojodna
Collaborator

mojodna commented Dec 29, 2016

> Are you proposing that the images in each submitted dataset be automatically assigned a unique group ID?

Yes. The uploader is aware of relationships between images, but the rest of the toolchain isn't.

Further elaborating on this, the uploader should assign unique ids to each scene/dataset. Right now, the entire upload gets an id, as do the individual images, but there's no id that ties images to scenes within an upload.

E.g.: https://upload-api.openaerialmap.org/uploads/58655b07f91c99bd00e9c7ab
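
As a rough illustration of the idea, the sketch below assigns one id per scene and copies it onto each image in that scene. The upload shape used here (a `scenes` list whose entries each hold an `images` list), the `scene_id` key, and the function name are assumptions for illustration, not the uploader's actual schema.

```python
import uuid


def assign_scene_ids(upload):
    """Return a copy of the upload in which each scene and its images share a scene id."""
    scenes = []
    for scene in upload.get("scenes", []):
        # One fresh UUID per scene/dataset within the upload.
        scene_id = str(uuid.uuid4())
        # Copy each image and tag it with the id of the scene it belongs to.
        images = [dict(image, scene_id=scene_id) for image in scene.get("images", [])]
        scenes.append(dict(scene, scene_id=scene_id, images=images))
    return dict(upload, scenes=scenes)
```

Because the id travels with each image record, downstream tools could regroup images by scene without needing to know about the original upload.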

@smit1678
Collaborator Author

I started looking at the OGC EO profile, http://docs.opengeospatial.org/is/10-157r4/10-157r4.html, to get a sense of what, if anything, we're missing. The spec covers much more than just optical and can be very mission/equipment specific.

I've tried to capture the Optical earth observation product information needed here: https://docs.google.com/spreadsheets/d/1yhX1cTfpa75wSKCDtRJTJ3rViw6exenTMXWsb1gpnnc/edit?usp=sharing.

It's not yet mapped to OIN/OAM's current metadata spec, but I think there will be ways to harmonize in the future. We seem to capture a subset of the information.

A couple of items we are missing that are specific to Optical products:

  • Details about cloud and snow cover
  • Details about sensor angle and other equipment specifics
  • Details about the resulting product - browse, mask, and product descriptions

@smit1678
Collaborator Author

Related to hotosm/oam-uploader-api#52 and looking at this further, it looks like one useful upgrade to the core OIN spec is creation_date and potentially modification_date. According to OGC, creation_date is:

> creation date for the metadata item. When retrieved from a metadata catalogue, the creationDate is the date when the metadata item was ingested for the first time (i.e. inserted) in the catalogue.

This differs from the acquisition time (acquisition_start / acquisition_end), which OGC calls phenomenonTime.

For upload IDs, perhaps we can look at how OGC does it, using something like:

  • composedOf: Link to an EO product that is part of this EO product (e.g. a phr:DataStrip is composed of one or more phr:Scene)
  • subsetOf: Link to the “father” EO product (e.g. a phr:Scene is a subset of a phr:DataStrip)
  • linkedWith: Specify a link to another EO product (e.g. ERS1 and ERS2 interferometric pair)
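
A small Python sketch of how the first two link types might look on metadata records. The `composedOf` and `subsetOf` names come from the OGC list above; storing them as keys on OIN records that reference each other by `uuid`, and the `link_scene` helper itself, are assumptions for illustration.

```python
def link_scene(parent, children):
    """Return copies of parent and children linked via composedOf / subsetOf."""
    # The parent lists the uuid of every product it is composed of.
    linked_parent = dict(parent, composedOf=[child["uuid"] for child in children])
    # Each child points back at the product it is a subset of.
    linked_children = [dict(child, subsetOf=parent["uuid"]) for child in children]
    return linked_parent, linked_children
```

Keeping both directions of the link means a consumer can navigate from a scene to its images or from an image back to its scene without a second lookup.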

@mojodna
Collaborator

mojodna commented Jan 14, 2017

While working on re-indexing the HOT OIN bucket, I realized that the OIN metadata JSON should include something that identifies it as such (along with a version number), similar to how TileJSON does it:

{
  "...": "...",
  "tilejson": "2.1.0"
}

(The indexer currently attempts to treat all JSON files present in a bucket as OIN metadata, which is no longer true for the HOT bucket (there's footprint GeoJSON + TileJSON for the tiler); I have a temporary workaround that'll show up in a PR shortly, but it checks for uuid, which isn't a fully reliable check.)
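
To illustrate the difference between the two checks, here is a hedged Python sketch (the real indexer is not written in Python, and `is_oin_metadata` is a hypothetical name). It prefers an explicit `oin` marker, the key name proposed later in this thread, and falls back to the `uuid` heuristic described above.

```python
import json


def is_oin_metadata(text):
    """Return True if the JSON text looks like an OIN metadata document."""
    try:
        doc = json.loads(text)
    except ValueError:
        # Not parseable as JSON at all.
        return False
    if not isinstance(doc, dict):
        return False
    # Preferred: an explicit marker key, analogous to "tilejson" in TileJSON.
    if "oin" in doc:
        return True
    # Fallback heuristic: not fully reliable, since other JSON could carry a uuid.
    return "uuid" in doc
```

With this check, TileJSON and footprint GeoJSON files in the same bucket would be skipped as long as they carry neither key.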

@mojodna
Collaborator

mojodna commented Jan 14, 2017

I'm using uploaded_at (for now) to signify when the imagery was updated (which is distinct from when it was ingested into the catalog).

mojodna added a commit to mojodna/oam-api that referenced this issue Jan 14, 2017
mojodna added a commit to mojodna/oam-api that referenced this issue Jan 14, 2017
This skips JSON files that do not contain a `uuid` property. It would be
better if there were an OIN-specific key/value to check for.

Refs hotosm/OpenAerialMap#55
@smit1678
Collaborator Author

OK, let's bring this to a close. I started this in OIN: openimagerynetwork/oin-metadata-spec#14.

@mojodna @nbumbarger @cgiovando Do these changes look right, and are they agreed upon?

Current OAM through Uploader:

{
    "uuid": "",
    "title": "",
    "projection": "",
    "bbox": [],
    "footprint": "",
    "gsd": ,
    "file_size": ,
    "acquisition_start": "",
    "acquisition_end": "",
    "platform": "",
    "provider": "",
    "contact": "",
    "properties": {
        "sensor": "",
        "thumbnail": "",
        "tms": ""
    }
}

New additions proposed to OAM through Uploader:

{
    "oin": "",              # OIN Version number. An update to OIN spec
    "uploaded_at": "",      # date metadata uploaded into OIN. An update to OIN spec
    "uuid": "",
    "title": "",
    "projection": "",
    "bbox": [],
    "footprint": "",
    "gsd": ,
    "file_size": ,
    "acquisition_start": "",
    "acquisition_end": "",
    "platform": "",
    "provider": "",
    "contact": "",
    "creation_date": "",
    "properties": {
        "sensor": "",
        "thumbnail": "",
        "license": "",          # new addition, doesn't affect OIN spec
        "tags": "",             # new addition, doesn't affect OIN spec
        "tms": "",              
        "wmts": ""              # new addition, doesn't affect OIN spec
    }
}
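
As a sketch of what moving from the current shape to the proposed one could involve, the Python function below adds the `oin`, `uploaded_at`, `license` and `tags` keys to an existing record. The function name, its parameters, and the timestamp format are assumptions. It leaves `creation_date` and `wmts` untouched because the thread does not say how they would be populated.

```python
from datetime import datetime, timezone


def upgrade_record(record, oin_version, license_name, tags="", now=None):
    """Return a copy of a current-style record with the proposed additions filled in."""
    # Assumption: timestamps are written as UTC in ISO 8601 with a trailing "Z".
    # A caller-supplied `now` is expected to be timezone-aware.
    now = datetime.now(timezone.utc) if now is None else now.astimezone(timezone.utc)
    stamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")

    # license and tags live under properties, so they do not affect the OIN spec.
    properties = dict(record.get("properties", {}))
    properties["license"] = license_name
    properties["tags"] = tags

    upgraded = dict(record, properties=properties)
    upgraded["oin"] = oin_version
    # Keep an existing uploaded_at if the record already has one.
    upgraded.setdefault("uploaded_at", stamp)
    return upgraded
```

For example, `upgrade_record(old, "1.1", "CC-BY 4.0")` would stamp a record with the version number floated in the reply below; both values are placeholders here, not decisions recorded in the thread.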

@smit1678 smit1678 mentioned this issue Jan 18, 2017
@mojodna
Collaborator

mojodna commented Jan 18, 2017

👍

1.1 for the OIN version?

@cgiovando cgiovando added the v2 Features and ideas to be considered for v2 implementation label Apr 26, 2021

4 participants