Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a Deserialisation Feature to read null and missing fields as empty collections. #1258

Closed
wants to merge 3 commits into from

Conversation

SiliconMeeple
Copy link

@SiliconMeeple SiliconMeeple commented Jun 3, 2016

Here's a basic approach - feels a bit blunt-force, but it seems to get the job done.

I'm afraid I don't understand contextual deserializers at all. There seems to be a problem where removing the check for the feature in BeanDeserializer#290 breaks a number of the contextual tests. The check gets them passing again, but it would be good to validate it with someone who better understood what was going on with that.

Should I add some user documentation? If so, where?

As discussed here: FasterXML/jackson-module-scala#257

@@ -354,7 +354,11 @@ protected JsonToken _initForReading(JsonParser p) throws IOException
t = p.nextToken();
if (t == null) {
// Throw mapping exception, since it's failure to map, not an actual parsing problem
throw JsonMappingException.from(p, "No content to map due to end-of-input");
if (this._config.hasDeserializationFeatures(DeserializationFeature.READ_NULL_AS_EMPTY_COLLECTION.getMask())) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about this one: note that this does not mean JSON null token, but actual physical end-of-input.
So I don't think it should fall under this feature; another feature could of course allow consider end-of-input as null equivalent, and thereby return either null or empty collection or whatever other coercions might be applicable.

@cowtowncoder
Copy link
Member

Ok first of all, thank you for the contribution. I know many users have requested this feature.

Due to size of changes I will need to spend more time looking through, but here are some initial thoughts (in addition to couple of notes I already added):

  1. Defining exact scope and naming is important, regarding what should be affected. Maps are not Collections, nor are (technically) Java arrays. However I can see why arrays would make sense to include; Maps, however, have quite distinct handling so a separate DeserializationFeature would make sense
  2. Part in BeanDeserializer that iterates over bean properties seems like performance killer so I would not want to add that at all
  3. Ideally all DeserializationFeatures should work on per-call basis, that is, dynamically and not as part of static definition of deserializer. This is sometimes tricky to do efficiently... in worst case it's just necessary to indicate that feature differs in its behavior so that changing setting for ObjectReader will have no effect after initial value is used to construct deserializers

One thing that I wonder, now, is that perhaps what would make most sense would be to instead add a new hook that would allow external definition of "empty" and "null" values. If so, handling could be externalized; and even if a simple/default implementation would be provided for convenience, users and/or library (scala module) authors could develop more advanced and use-case specific solutions.
Trying to cover all bases within databind leads to complexity without necessarily covering all or most use cases.

Would such an approach make sense? I am not 100% sure how handler should be defined, but should probably be called from getEmptyValue() / getNullValue(), passing type deserializer handles (possibly just raw type if full type not available; should usually suffice).

@SiliconMeeple
Copy link
Author

re scope and name:
There can only be 32 DeserializationFeatures, and this is the 27th. Is adding a 28th to distinguish maps/collections really worthwhile? Why might someone need one but specifically not want the other? We definitely need both.
Possibly we could call the feature READ_NULL_OR_MISSING_AS_EMPTY and change the documentation?
The alternative could really get ugly, because what we have here could be four features:
READ_NULL_COLLECTIONS_AS_EMPTY
READ_MISSING_COLLECTIONS_AS_EMPTY
READ_NULL_MAPS_AS_EMPTY
READ_MISSING_MAPS_AS_EMPTY

re BeanDeserializer:
I welcome other suggestions? In order to fill in the properties that can be empty, they're going to have to be iterated over at some point. We could try and iterate over the bean definition once, and cache whether or not that property might be empty, but it's not clear that the extra memory bloat / lookups would make this faster in the aggregate.
Are there perf tests that could help make this decision? If not, would some JMH in a sub-module be useful?

re Dynamic DeserializationFeatures
I'm afraid I don't follow. :( Should the member variable in the deserializers be dropped?

re A customizable hook

Intriguing! That would certainly allow the most flexibility. Happy to jump on IRC sometime if you want someone to rubber duck as you design. :)

@cowtowncoder
Copy link
Member

@HolyHaddock valid point about possibility of too many features. I would be fine with renaming of proposed feature to something where "Collection" would be replaced by something like "Container"; I think that is the term used within Jackson codebase to refer to structured types (although whether it covers referential types -- that is, Option(al) -- is bit ambiguous).

As to iteration: what I would do differently could be quite simple (although I haven't checked feasibility yet); instead of letting caller (BeanDeserializer) worry about null to empty conversion, let container deserializer handle that: you are already passing the boolean flag so it knows whether such coercion is warranted. If so, it should be able to handle it transparently. This would also work better with delegation by other types.

On dynamic features: it is perfectly fine to change state of any or all DeserializationFeatures when using ObjectReader:

Object ob = mapper.readerFor(MyType.class)
    .with(DeserializationFeature.READ_ENUMS_USING_TO_STRING)
    .readValue(jsonSource);

and this means that state of a particular feature at point of deserializer construction may not be state at actual use. As such, general contract is that state should be determined at point of use.
Conversely, all MapperFeatures are considered unchangeable after ObjectMapper (and any ObjectReader constructed from it) is used for the first time: they need to be fully enabled/disabled before calling readValue().

But patch as is defines things so that state of new feature is checked when constructing MapDeserializer (and others).
I am fine with "static" behavior here and do not see strict need or even benefit of being able to change it: I would think most users would define baseline setting and stick to it. But from this perspective feature behaves more like MapperFeature, even if it only applies to deserialization.
At the end of the way this is a minor thing so I don't think it's worth worrying too much about. First things first.

@SiliconMeeple
Copy link
Author

SiliconMeeple commented Jun 10, 2016

The feature name and docs have been improved as discussed.

re: iteration
Afraid I still don't follow. When there is a bean with three fields that could be set (foo, bar and quux, say) and a json object that only has a binding for foo, somewhere, you're going to have to iterate over some of the beans properties in order to know which deserializers are needed for bar and quux to call getEmptyValue on.
How else could it be done?

It would be nice not to build up the Set of property names if the feature was disabled - I'll have a think about how to do that later.

@cowtowncoder
Copy link
Member

Ok yes, if missing fields also need to be substituted, yes (although efficiently using bitfield/bitmask like with required property handling) but I do not anticipate that feature to be added at this point. Only mapping from explicit null. Use of default values could be added later on as part of work to support regular required properties, not just creator properties.
I'd be open to providing missing->empty in case of creator properties however.

I guess I wasn't clear on my distaste of value defaulting: I am not fully sure it even belongs to databinding layer at all. POJOs may set default values, and should do so where necessary. Translation of nulls to something else is coercion that is already supported for other things so that is fine.

But. If and when required check for general properties is supported, coercion of missing values may be added as part of that since it would/should use much of same functionality (keeping track of presence of property.

@@ -355,7 +355,11 @@ protected JsonToken _initForReading(DeserializationContext ctxt, JsonParser p)
t = p.nextToken();
if (t == null) {
// Throw mapping exception, since it's failure to map, not an actual parsing problem
ctxt.reportMissingContent(null); // default msg is fine
if (this._config.hasDeserializationFeatures(DeserializationFeature.READ_NULL_OR_MISSING_CONTAINER_AS_EMPTY.getMask())) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is semantically different, as it is actually missing the whole JSON content (which makes it invalid; empty content is not a valid JSON value), so I don't think it is right to map it to anything. If such handling of invalid content is desired it should be fully separate feature, especially as it would apply to all content. I am not necessarily against such a feature but it belongs to a different issue/patch.

@cowtowncoder
Copy link
Member

At this point I think that implementation of #1402 will implement "null to empty Collection/array/Map" part; and remaining problem of missing values will need to be handled separately. Since there are conflicts and some of code is likely not needed or is incompatible, I will close this PR. I am open to possibility of further work wrt missing values of course, so this is not due to patch being refused or such. It actually helped with implementation of #1402.

@SiliconMeeple
Copy link
Author

Super. We ended up resolving this problem through the use of a different library, so I have no more itch to scratch here.

Thank you for the prompt and fulsome feedback!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants