Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added depth ave currents #144

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

added depth ave currents #144

wants to merge 2 commits into from

Conversation

justinbuck
Copy link
Collaborator

Proponents:
Moderator: @OceanGlidersCommunity/format-mantainers

Type of PR

  • Typo without possible change of interpretation of the related text.
  • Fix of some error, inconsistency, unforeseen limitation.
  • Style that only affects visually the compiled document
  • Addition that does not require change in the current structure.
  • Enhancement that require changes to improve the format.

Related Issues

Dates when it got review approvals

Release checklist

  • Approved by at least two members of the committee?
  • There were modifications after the review approvals? If so, please
    ask reviewers to update their review.
  • Proponents and moderador should explicitly agree that it is ready to
    to merge.
  • The moderador is the one in charge to actually merge or close this PR
    according to the final decision.

For maintainers

  • Update the moderator with a volunteer from the committee. It would be
    best to have one single moderator to guide and help this PR to move
    forward. It is OK to update the moderador pass it to another one.
  • Confirm that the associated branch was deleted after the merging.
  • Wrap-up and close the related issues.

Comments

@castelao
Copy link
Member

I think this is related to issue #19 .

Using N_MEASUREMENTS would mean repeating many times (on the order of 100-1000) the same value?

OG_Format.adoc Outdated
////
== Depth average currents

Depth average currents should be included as a geophysical value in the N_MEASUREMENTS dimension along with an associated QC channel at the time when the calculation was completed by the glider. The methodology for derivation of the depth average current can be included as a methodology attribute to the variable. Results of the OceanGliders working group on depth average currents are pending at the time of writing (25th April 2023). Once, the working group provides result the approach for inclusion of depth average currents should be reviewed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using N_MEASUREMENTS would impose a lot of redundancy, right?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@castelao Not sure what you mean by this. Do you mean as opposed to defining them as scalars? Can you explain? We discussed the pros and cons at length on the call today and came to agreement that defining these parameters using the unlimited record dimension provides the most flexibility for the data providers. Yes, there will be lots of missing/_FillValues but compression will make the file size increase negligible. There are some cons (ERDDAP aggregation, distribution, etc.) depending on whether the data providers are submitting one file/underwater segment as opposed to one file that contains all of the underwater segments.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kerfoot , let's first check if we are understanding the same from the proposed text. My understanding is that as temperature is defined as TEMP(N_MEASUREMENTS), so would be depth_averaged_velocity(N_MEASUREMENS). Thus, even though such an average is only possible between two surfaces, the same value would be repeated as many times as temperature measurements during a dive, which could be on the order of thousands. Did I understand it correctly?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. The calculated depth-average current value would be reported at the appropriate time(N_MEASUREMENTS) row only. The time of the reported depth-averaged current would be left up to the data provider, according to their own determination. While the underlying calculation of the depth-averaged current is likely similar for all glider types, the timing of the reported current will be left up to the data provider.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still confused, but I think we are getting there. So the depth_averaged_velocity is not a scalar, but has the same dimension of time, which is N_MEASUREMENTS. Only one value of depth_averaged_velocity would be reported, which would be aligned with the respective time that the average is representative for. Is it the proposed idea?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@castelao I'll offer my take as well. I think your last comment matches what @kerfoot proposed on the call. As I understood it, assuming the file contains multiple profiles, the suggestion was to include a sparse depth averaged current variable where there are mostly fill values but one depth averaged current value per profile. @kerfoot proposed file size would not be an issue if nc compression is used. An alternative is to follow the NGDAC approach where each .nc file is one indiv. underwater segment. In this case the depth averaged velocity can be stored as a scalar. I don't personally have experience aggregating scalars like this but @kerfoot reports this works well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HI all, just to remind everyone, the main intended use of OG1.0 is for the fundamental data object of "mission" a.k.a. "deployment." So I believe the proposal we have to consider is whether to include DAC as a variable with same dimensions as the fundamental geophysical dimension (N_MEASUREMENTS), with FillValue on most points. I am fine leaving this to the next version, and then we could add surface velocity as well at the same time. My reason for pushing this is because all the glider types have been computing this for years and the data are there for the taking already (with many caveats).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry for the delay. I think that there are different points here. My opinion on each one of those:

  • I think it is worth re-phrasing this part of the text. I don't think the text is clear for any of the two discussed paths. As it is, it can create confusion.
  • I agree with @glidermann , and independent if it is possible to be used with arbitrary chunks of a mission, it should be expected to work well with a full mission/deployment. No problem in using this format with just a 'segment', but it should be a priority to accommodate well a full mission.
  • There is no problem for the average velocity to be a scalar in a segment sub-dataset, and there is no problem in aggregating several segments, but that would require a dimension to align such an array of depth averages. Better we name it, or it would be free for each one decides. Alternatively, if the idea in the case of aggregation of multiple segments is to jump straight to a long array with FillValues, it would be required a new variable with the representative time for that average.
  • I understand that a long array with N_MEASUREMENTS with mostly FillValues is possible. It is true that compression minimizes the problem (although there is a small overhead for the compression since nothing is free). If storage is not a problem, I'm not so sure that on the time to process/analyze it that overhead would be larger. The most important issue with that option in my opinion is that the depth average is a quantity representative for a period of time, not a specific time. Different than the temperature where each measurement could be assumed instantaneous (in fact an average of a short time window), the depth average is for the period underwater. If that is not implicit within the data structure, it would be a good idea to designate what is the 'segment', otherwise we will let every user figure that out by themselves. I strongly think that the priority should be on the final user and the format should be the easiest way that science can be done with it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm the only one seeing an issue here, feel free to 'resolve conversation'.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi all, I think we should leave this issue until the next version. Looking at the manual as a whole, a dedicated, but cryptic, section devoted to DAC seems out of place. After all, we have the generic and reference to vocabularies in the Geophysical Variables section and do not prescribe that any particular quantity should be included, only how it should be done. Since current is in the vocab, users can do it now as they wish, or wait until the next version for more guidance. this discussion above will be a good starting point, and is here for posterity.

Copy link
Member

@castelao castelao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the team decides to move forward with the long array with FillValues, please move ahead. I won't stop it.

OG_Format.adoc Outdated
////
== Depth average currents

Depth average currents should be included as a geophysical value in the N_MEASUREMENTS dimension along with an associated QC channel at the time when the calculation was completed by the glider. The methodology for derivation of the depth average current can be included as a methodology attribute to the variable. Results of the OceanGliders working group on depth average currents are pending at the time of writing (25th April 2023). Once, the working group provides result the approach for inclusion of depth average currents should be reviewed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry for the delay. I think that there are different points here. My opinion on each one of those:

  • I think it is worth re-phrasing this part of the text. I don't think the text is clear for any of the two discussed paths. As it is, it can create confusion.
  • I agree with @glidermann , and independent if it is possible to be used with arbitrary chunks of a mission, it should be expected to work well with a full mission/deployment. No problem in using this format with just a 'segment', but it should be a priority to accommodate well a full mission.
  • There is no problem for the average velocity to be a scalar in a segment sub-dataset, and there is no problem in aggregating several segments, but that would require a dimension to align such an array of depth averages. Better we name it, or it would be free for each one decides. Alternatively, if the idea in the case of aggregation of multiple segments is to jump straight to a long array with FillValues, it would be required a new variable with the representative time for that average.
  • I understand that a long array with N_MEASUREMENTS with mostly FillValues is possible. It is true that compression minimizes the problem (although there is a small overhead for the compression since nothing is free). If storage is not a problem, I'm not so sure that on the time to process/analyze it that overhead would be larger. The most important issue with that option in my opinion is that the depth average is a quantity representative for a period of time, not a specific time. Different than the temperature where each measurement could be assumed instantaneous (in fact an average of a short time window), the depth average is for the period underwater. If that is not implicit within the data structure, it would be a good idea to designate what is the 'segment', otherwise we will let every user figure that out by themselves. I strongly think that the priority should be on the final user and the format should be the easiest way that science can be done with it.

@castelao
Copy link
Member

castelao commented May 2, 2023

@rtodd-WHOI , maybe this discussion is of your interest.

@callumrollo
Copy link
Member

Is this PR ready to merge? I believe it's one of the few remaining decisions to finalize for the first release

@castelao
Copy link
Member

castelao commented Jul 6, 2023

@callumrollo , thanks for raising this point again. I still think that this approach will create confusion, and even if moving forward with this, the text is not clear. But if everyone else is happy as it is, feel free to 'resolve conversation' and merge it.

I do not think we should get into details of a potential geophysical variable we think should be included in this document. We can discuss details of currents, or other derived/processed quantities in a best practice document as they become available and mature. Then reference those.
@glidermann glidermann self-requested a review July 9, 2023 08:28
Copy link
Member

@vturpin vturpin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have strong opinion here.
I would tag this issue in order to discover it back easily.

@vturpin vturpin added the 1.X For stuff to postpone and resolve only after the 1.0 release label Feb 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1.X For stuff to postpone and resolve only after the 1.0 release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants