Feature/new json #92

staleyLANL · 2023-06-29T21:41:25Z

Might as well put this in.

An earlier branch introduced the ability to write HDF5 files in different ways, according to two boolean flags: *reduced* and *typed*. This branch enhances our JSON capabilities in the same manner, reflecting, with JSON, an analog of the options we have for HDF5. The above was done in respect to GNDStk's general philosophy of making its support of different file types as consistent with one another as reasonably possible. Similarly to HDF5, the JSON class now has two static booleans: "reduced" and "typed". The "reduced" flag tells the JSON writer whether or not to reduce -- basically, to fold into a shorter and simpler representation -- certain special constructs that GNDStk's Node uses internally in order to handle what are called cdata, pcdata, and comment nodes in XML. Basically, if reduced == true, then we make this simplification, which shortens the JSON output. If reduced == false, then the JSON will closely reflect what Node uses internally. The "typed" flag tells the JSON writer whether or not it should use our type-guessing code to guess whether "strings" in certain contexts (in particular, metadata values and pcdata) actually contain what look like numbers. Examples: "12" looks like an int, "321 476" looks like a vector of ints, "3.14" looks like a double, and "3.14 2.72 1.41" looks like a vector of doubles. If typed == false, we use the original strings, with no type guessing. If typed == true, we apply the type guesser, and, where appropriate, get JSON numbers in place of strings in the JSON output. This is a work in progress, with a couple of things that still need doing... Still to do: modify the JSON *reading* code so that it recognizes the various different ways that the JSON *writing* code writes things, and can reliably reverse the writing process and recover a GNDStk Node. Also still to do: at present, the nlohmann JSON library, which we use under the hood, doesn't provide a way to write JSON numbers (unquoted, as opposed to quoted JSON strings) beginning from an existing string representation that we might provide for the number. Consider this discussion that I started: nlohmann/json#3460 This is relevant to us because GNDStk provides very fine control over exactly how numbers, in particular floating-point numbers, are formatted, in terms of the number of significant digits, fixed vs. scientific form, etc. At the time of this writing, if we use typed == true, so that certain strings that look like numbers are written as numbers, the numbers will, unfortunately, be formatted by the JSON library itself. We'd like to have the capability of writing an original string (one that we've already determined looks like a number!) - but writing it as a JSON number (so, without quotes) rather than a JSON string. In this commit, we also Modified an old Tree/ test that depended on the previous default (but still available, via appropriate flags, even though no longer the default) JSON writing behavior. And, we changed a variable name in json2class.cpp. We refactored detail-node2hdf5.hpp (for HDF5) considerably, and then reflected this in the modified detail-node2json.hpp (for JSON) that we're primarily working on. And, we've enhanced the JSON test code so that it tries out the new flags. This is a work in progress as well. In order to finish it, we need to finish the ability to read JSON files that were written not just in our original manner, but with any variations of the above-described flags.

Also some small tweaks and comment changes here and there.

Small improvements.

…into feature/better-json

Some tweaks to HDF5-related code. Extended and improved some comments, to clarify things.

Additional tests for those, too.

There are still some instances of the strings in question, typically appearing amongst other text inside of string literals. At some point I'll look into fixing up those cases too, but that's not a priority right now.

Updated the HDF5 test suite to ensure that we test the modifications.

We'll update the code generator later, and ensure that it does the right thing. More feature-packed classes Field and FieldPart.

I'm not sure why I made them be converting constructors to begin with. Whatever the reason, it seems (according to the test suite!) that they don't need to be converting constructors. And, I don't think that they should be. Indeed, some new code modifications uncovered this issue, as converting constructors for Node triggered certain ambiguities.

Code generator itself still needs updating.

…in name isn't known. This was important for the new generated-class Field system to work well, without carrying along a std::string (expensive!) in each class field. On the other hand, such a thing wouldn't necessarily be a memory hog in a well-designed library that mostly contained vectors and such - not a huge number of individual scalar fields.

Improved some SFINAE. Added more comments, and did a bit of rearrangement.

It now generates classes that use the new system.

into feature/component-field

… feature/component-field

Note that it doesn't yet add shortcuts to generated classes; that'll come soon. Changed the way a map was done. Faster, and leads to better printing of shortcut information.

Some refactoring related to the above. Miscellaneous cosmetic and clarity-related changes.

Updated various test codes.

Some small printing-related changes and fixes as well.

… offers. There's lots more to be done in the above respect; this is just a start. Also, provided some preliminary support for another approach to customization. The original (and still in place) "customization" system involves inserting code from custom.hpp files directly into generated classes. Inserting additional bits and pieces of code into already-existing classes is actually a somewhat goofy concept. (Imagine someone customizing, say, std::complex by adding a new piece of information. They really should derive from it, or contain it in another class. Not try to mess with the existing class.) I'm fleshing out *one* new way, at least (with possibly others to come) for doing customizations. There's more to be done, but I'll work with a particular user right now, to see how this works. JSON specs now allow metadata to have converters. It turns out that this will help with the new customization business. Briefly: we want an ability to insert our own fields into classes (say, to contain data that we'll compute as functions of other fields), but without those new fields playing a role in I/O from/to files. A new (and simple) "Noop" (no-op[eration]) class fills the necessary role of representing the "no conversion is taking place" for I/O on the new fields. Some new if-constexprs, elsewhere in the code, allow convert() functions to return a bool (but they can still return void, as before). If a convert() returns bool, and is used in the context of Node add() metadatum or child node, then a return value of false means, "don't actually add the new metadatum or children." This is all in support of our new customization approach, because we don't necessarily want auxiliary computed fields to be written to files. (But still want them in the multi-query, so they work with print() and such.) Tweaked some SFINAE. This detail:: stuff really ought to be made more internally consistent, even though it's in detail::. Note: the _v and _t business is modeled after what's in std::. (Examples: enable_if_t, is_convertible_v.) Adjusted some comments to reflect recent changes.

…terface. Some code cleanup and minor rearrangements. Lots more to come in the Python interface, as time allows. The next thing will probably be access to our extensive capabilities for controlling formatting, in particular as it relates to floating-point precision.

The code generator builds C++ code, a C-language interface, and Python bindings. Prior to this change, the code that generated the C interface handled only the old "BlockData" manner of handling data nodes. That was a construct that handled the "valueType"-based "run-time defined data type" business that exists in some GNDS nodes. But our format for format specs allows for a fixed data type as well, and it's handled in a different way. For a user who has been using GNDStk's code generator to generate an entirely different format, I needed to update the generation of the C interface in order to handle this properly.

These are largely in preparation for some upcoming, more-substantive changes; ones that should somewhat improve and clarify the interface.

Some work on Component::has() and Component::operator(). Small change: class to struct in a few places.

Additional work regarding Lookup objects, Field, Component, etc.

Cleaned up some SFINAE, and removed some that was no longer needed. A bit of formatting work in some CMakeLists.txt files, just to make them arguably more readable.

As with: https://stackoverflow.com/questions/57770972

- Clarified "auto" in some places. - Ditto for constness.

See: https://stackoverflow.com/questions/42797279

Minor changes in some SFINAE to more-closely resemble certain std:: constructs. Sfome SFINAE sfimplifications. Removed unnecessary "inline"s. The functions in question were in fact function *templates*, so that "inline" was unnecessary in order to still achieve header-only-ness. Used "struct" instead of "class" in a few places. Used "size_t" instead of "int" in a few places. The difference wasn't really important in these places, but I decided that size_t was correct in principle. C Interface: fixed some previously unnoticed discrepancies that arose after an earlier change. Some function names were changed a while back: color to colors, note to notes, and warning to warnings. However, the *declarations* (for C) hadn't been updated accordingly. Function name change: detail::getter() to detail::compGetter(). (comp, for class Component.) Too many things were called "getter." This change should make code auditing/analysis a bit easier, going forward. Miscellaneous other changes. There are still some relevant files to be uploaded, but I need to review them more carefully first.

Split up some longer files.

The new overload and SFINAE were added after I switched from clang 9 to clang 17, and the latter reported overload ambiguities in some old code that hadn't had problems with earlier compilers. It's not clear to me that ambiguities really exist, but in any event, the new overload and the SFINAE disappears the new errors. Casted to (void) in certain places in some test files, after noticing that a new g++ emitted warnings about the results of some function calls being unused. (We did those calls to test that they're valid and to check out-parameters, not because we intended to use the return values of the functions.)

Under 1600 lines, in comparison with over 24000. Gives faster test-suite compilation times. Also, our JSON has more-flexible capabilities regarding the printing of floating-point numbers, in terms of format and number of significant digits. We're not taking advantage of those capabilities quite yet, but we will, if we choose to make JSON a first-class format - as important as XML.

staleyLANL added 30 commits April 29, 2022 14:43

Uploading some unfinished JSON-related work-in-progress.

50c42c6

Also some small tweaks and comment changes here and there.

Added some vetter JSON files, to be used in the test suite.

c36ceff

Small improvements.

Merge branch 'feature/better-json' of https://github.com/njoy/GNDStk …

b6f942d

…into feature/better-json

More work on some JSON-related code.

faf2e67

Some tweaks to HDF5-related code. Extended and improved some comments, to clarify things.

Some extra material in resources for the JSON tests.

86d5527

Saving some work.

0f346b7

More updates to the work on JSON improvements.

3ee65d7

Additional tests for those, too.

More JSON code updates.

91da1bf

Finishing up some JSON work.

7669c83

Put most instances of certain reappearing strings into constants.

5caca0f

There are still some instances of the strings in question, typically appearing amongst other text inside of string literals. At some point I'll look into fixing up those cases too, but that's not a priority right now.

Small but worthwhile improvements in the HDF5 files GNDStk creates.

7f84bbd

Updated the HDF5 test suite to ensure that we test the modifications.

Detail detail.

5e932dc

Additional work on simplifying autogenerated classes.

93acc2d

Direct modifications of certain generated files.

c11ea62

We'll update the code generator later, and ensure that it does the right thing. More feature-packed classes Field and FieldPart.

More generated-code updates.

621995a

Code generator itself still needs updating.

Some new test material, for better generated-code getter coverage.

ed83af4

Corrected some verbiage. The word "data" is plural, not singular. :-D

673c516

Shorten constructors in generated classes.

9111a5e

Additional autogenerated C++ and Python code updates.

fa7234a

Shortened some things.

6cd2ba7

Improved some SFINAE. Added more comments, and did a bit of rearrangement.

Updating generated code.

9c783f0

Major update to code generator.

f96a289

It now generates classes that use the new system.

Merge branch 'feature/component-field' of https://github.com/njoy/GNDStk

a09ecfa

into feature/component-field

Temporary fix for broken HighFive library master branch. :-/

057485b

Shortened some things.

9c9bcfb

Some work mostly in Component's field.hpp. Work-in-progress.

d93a560

Merge branch 'feature/component-field' of github.com:njoy/GNDStk into…

3243a9f

… feature/component-field

staleyLANL added 27 commits February 22, 2023 00:37

Made the shortcut generator more flexible.

da70eb2

Note that it doesn't yet add shortcuts to generated classes; that'll come soon. Changed the way a map was done. Faster, and leads to better printing of shortcut information.

Updating generated code.

ddab317

Shortcuts now written to C++ and to Python-binding code.

4bee8e0

Some refactoring related to the above. Miscellaneous cosmetic and clarity-related changes.

Updates to Component printing, used for Python __repr__.

ce81699

Updated various test codes.

print() details.

f5fc4d4

Python mode for printing.

82a928a

Some small printing-related changes and fixes as well.

Improve python-bindings build speed by a factor > 3.

ef2e240

Reviewing and improving class Field.

5557527

These are largely in preparation for some upcoming, more-substantive changes; ones that should somewhat improve and clarify the interface.

Work on the Field and Lookup classes.

8f925ec

Some work on Component::has() and Component::operator(). Small change: class to struct in a few places.

Regenerate some generated files.

c23d29a

Some cleanup, in particular removing old files that won't be used.

95b6546

Additional work regarding Lookup objects, Field, Component, etc.

Work regarding better handling re: index vs. label vs. class Lookup.

bab98b9

Cleaned up some SFINAE, and removed some that was no longer needed. A bit of formatting work in some CMakeLists.txt files, just to make them arguably more readable.

Small cleanup: ensure newline at EOF of each text file.

dda53ef

As with: https://stackoverflow.com/questions/57770972

A couple of small but worthwhile code improvements:

3343ae2

- Clarified "auto" in some places. - Ditto for constness.

Cleanup regarding size_t.

db4afb5

See: https://stackoverflow.com/questions/42797279

Updated autogenerated file.

f94f790

Some refactoring and rearrangement of Component-related material.

8ccb129

Updated files, reflecting some recent cleanup/simplification.

3ebd840

Workaround for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85282

d09c337

Some improved exposition.

491f75d

Split up some longer files.

Some cleanup associated with the new vs. old JSON.

2520a18

staleyLANL requested review from whaeck and BobbiRiedel June 29, 2023 21:41

staleyLANL self-assigned this Jun 29, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature/new json #92

Feature/new json #92

staleyLANL commented Jun 29, 2023

Feature/new json #92

Are you sure you want to change the base?

Feature/new json #92

Conversation

staleyLANL commented Jun 29, 2023