This is a prototype of how a library might look like for (de)serialising XML into Python dataclasses. XML dataclasses build on normal dataclasses from the standard library and [`lxml`](https://pypi.org/project/lxml/) elements. Loading and saving these elements is left to the consumer for flexibility of the desired output.
+This library enables (de)serialising XML into Python dataclasses. XML dataclasses build on normal dataclasses from the standard library and [`lxml`](https://pypi.org/project/lxml/) elements. Loading and saving these elements is left to the consumer for flexibility of the desired output.

-It isn't ready for production if you aren't willing to do your own evaluation/quality assurance. I don't recommend using this library with untrusted content. It inherits all of `lxml`'s flaws with regards to XML attacks, and recursively resolves data structures. Because deserialisation is driven from the dataclass definitions, it shouldn't be possible to execute arbitrary Python code. But denial of service attacks would very likely be feasible.
+It's currently in alpha. It isn't ready for production if you aren't willing to do your own evaluation/quality assurance. I don't recommend using this library with untrusted content. It inherits all of `lxml`'s flaws with regards to XML attacks, and recursively resolves data structures. Because deserialisation is driven from the dataclass definitions, it shouldn't be possible to execute arbitrary Python code (not a guarantee, see license). Denial of service attacks would very likely be feasible. One workaround may be to [use `lxml` to validate](https://lxml.de/validation.html) untrusted content with a strict schema.

Requires Python 3.7 or higher.

-## Example
-
-(This is a simplified real world example - the container can also include optional `links` child elements.)
I would like to add support for validation in future, which might also make it easier to support other types. For now, you can work around this limitation with properties that do the conversion.
+For now, you can work around this limitation with properties that do the conversion, and perform post-load validation.
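
As a rough illustration of the property workaround mentioned above, the sketch below uses only the standard library. The class `Measurement`, the field `raw_width`, and the property `width` are hypothetical names, and the `@xml_dataclass` decorator is deliberately left out, so this shows the general pattern rather than this library's API.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    # Stored exactly as it would arrive from XML: a plain string.
    raw_width: str

    @property
    def width(self) -> int:
        # Convert on access; int() raises ValueError for non-integer text.
        return int(self.raw_width)

    @width.setter
    def width(self, value: int) -> None:
        # Convert back to text so the string field stays the source of truth.
        self.raw_width = str(value)
```

Because `width` is a property and not an annotated attribute, `dataclasses` does not treat it as a field, so only `raw_width` would take part in loading and dumping.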

### Defining text

@@ -122,10 +67,10 @@ Children must ultimately be other XML dataclasses. However, they can also be `Op
* Next, `List` should be defined (if multiple child elements are allowed). Valid: `List[Union[XmlDataclass1, XmlDataclass2]]`. Invalid: `Union[List[XmlDataclass1], XmlDataclass2]`
* Finally, if `Optional` or `List` were used, a union type should be the inner-most (again, if needed)
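
The nesting order in the list above (`Optional` outermost, then `List`, then a union innermost) can be explored with a small standard-library sketch. The helper `unwrap` and the classes `Foo` and `Bar` are hypothetical and are not how this library resolves types internally. Note that `typing.get_origin` and `typing.get_args` need Python 3.8 or later, which is newer than the minimum version stated for the library.

```python
from dataclasses import dataclass
from typing import List, Optional, Union, get_args, get_origin


@dataclass
class Foo:
    pass


@dataclass
class Bar:
    pass


# Optional outermost, then List, then Union innermost.
valid_hint = Optional[List[Union[Foo, Bar]]]


def unwrap(hint):
    """Peel Optional, then List, and return the innermost candidate types."""
    # Optional[X] is Union[X, None]; drop the None member if present.
    if get_origin(hint) is Union and type(None) in get_args(hint):
        members = tuple(a for a in get_args(hint) if a is not type(None))
        hint = Union[members]  # a Union of one type collapses to that type
    # List[X] has origin `list`; step inside it.
    if get_origin(hint) is list:
        (hint,) = get_args(hint)
    # Whatever remains is either a single class or a Union of classes.
    if get_origin(hint) is Union:
        return get_args(hint)
    return (hint,)
```

For a hint in the invalid order, such as `Union[List[Foo], Bar]`, this helper returns `List[Foo]` as one of the "classes", which is one way to see why that shape is rejected.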

-Children can be renamed via the `rename` function, however attempting to set a namespace is invalid, since the namespace is provided by the child type's XML dataclass. Also, unions of XML dataclasses must have the same namespace (you can use different fields if they have different namespaces).
-
If a class has children, it cannot have text content.

+Children can be renamed via the `rename` function. However, attempting to set a namespace is invalid, since the namespace is provided by the child type's XML dataclass. Also, unions of XML dataclasses must have the same namespace (you can use different fields with renaming if they have different namespaces, since the XML names will be resolved as a combination of namespace and name).
+
### Defining post-load validation

Simply implement an instance method called `xml_validate` with no parameters, and no return value (if you're using type hints):
If defined, the `load` function will call it after all values have been loaded and assigned to the XML dataclass. You can validate the fields you want inside this method. Return values are ignored; instead raise and catch exceptions.
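
A minimal sketch of this hook, using only the standard library, might look as follows. `Dimensions` and `load_like` are hypothetical: `load_like` merely imitates the "construct, then call `xml_validate`" order described above and is not the library's `load` function.

```python
from dataclasses import dataclass


@dataclass
class Dimensions:
    width: int
    height: int

    def xml_validate(self) -> None:
        # Signal problems by raising; a return value would be ignored.
        if self.width <= 0 or self.height <= 0:
            raise ValueError("width and height must be positive")


def load_like(cls, **values):
    """Hypothetical stand-in for `load`: assign all values, then validate."""
    instance = cls(**values)
    validate = getattr(instance, "xml_validate", None)
    if callable(validate):
        validate()
    return instance
```

A caller would wrap the load in `try`/`except ValueError` (or whichever exception type the method raises) to handle invalid documents.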

+## Example (fully type hinted)
+
+(This is a simplified real world example - the container can also include optional `links` child elements.)
This can be a real pain to get right. Unfortunately, if you need this, you may have to resort to:
+
+```python
+@xml_dataclass
+@dataclass
+class Child:
+    __ns__ = None
+    pass
+
+@xml_dataclass
+@dataclass
+class Parent(XmlDataclass):
+    __ns__ = None
+    children: Child
+```
+
+It's important that `@dataclass` be the *last* decorator, i.e. the closest to the class definition (and so the first to be applied). Luckily, only the root class you intend to pass to `load`/`dump` has to inherit from `XmlDataclass`, but all classes should have the `@dataclass` decorator applied.
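
To see why the decorator order matters, here is a standard-library sketch. `needs_dataclass` is a hypothetical stand-in for any decorator that expects to receive a finished dataclass. It is stricter than the real `@xml_dataclass` may be, and no claim is made here about how that decorator reacts to the wrong order.

```python
from dataclasses import dataclass, is_dataclass


def needs_dataclass(cls):
    """Hypothetical stand-in for a decorator that expects a dataclass."""
    if not is_dataclass(cls):
        raise TypeError(f"{cls.__name__} must be decorated with @dataclass first")
    return cls


@needs_dataclass  # applied second: receives the finished dataclass
@dataclass        # applied first: closest to the class definition
class Good:
    name: str


def build_wrong_order():
    # Here @dataclass is outermost, so the stand-in runs on a plain class.
    @dataclass
    @needs_dataclass
    class Bad:
        name: str

    return Bad
```

Decorators are applied bottom-up, so in `Good` the class is already a dataclass by the time `needs_dataclass` sees it, while `build_wrong_order()` raises `TypeError`.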
+
### Whitespace

If you are able to, it is strongly recommended you strip whitespace from the input via `lxml`:
@@ -151,7 +177,7 @@ By default, `lxml` preserves whitespace. This can cause a problem when checking

### Optional vs required

-On dataclasses, optional fields also usually have a default value to be useful. But this isn't required; `Optional` is just a type hint to say `None` is allowed.
+On dataclasses, optional fields also usually have a default value to be useful. But this isn't required; `Optional` is just a type hint to say `None` is allowed. This would occur e.g. if an element has no children.

For XML dataclasses, on loading/deserialisation, whether or not a field is required is determined by if it has a `default`/`default_factory` defined. If so, and it's missing, that default is used. Otherwise, an error is raised.
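
The rule above can be illustrated with plain `dataclasses` introspection. The class `Item` and the helper `required_field_names` are hypothetical and not part of this library. The sketch only shows that "required" follows from the absence of `default`/`default_factory`, not from the `Optional` hint.

```python
from dataclasses import MISSING, dataclass, field, fields
from typing import List, Optional


@dataclass
class Item:
    name: str                                      # no default: required
    note: Optional[str]                            # Optional, yet still required
    count: int = 0                                 # default: may be missing
    tags: List[str] = field(default_factory=list)  # factory: may be missing


def required_field_names(cls):
    """Names of fields with neither `default` nor `default_factory`."""
    return [
        f.name
        for f in fields(cls)
        if f.default is MISSING and f.default_factory is MISSING
    ]
```

Here `note` is reported as required even though `None` is an allowed value, because the hint alone does not supply a default.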
@@ -163,8 +189,8 @@ This makes sense in many cases, but possibly not every case.

Most of these limitations/assumptions are enforced. They may make this project unsuitable for your use-case.

-* It isn't possible to pass any parameters to the wrapped `@dataclass` decorator
-* Setting the `init` parameter of a dataclass' `field` will lead to bad things happening, this isn't supported
+* If you need to pass any parameters to the wrapped `@dataclass` decorator, apply it before the `@xml_dataclass` decorator
+* Setting the `init` parameter of a dataclass' `field` will lead to bad things happening, this isn't supported.
* Deserialisation is strict; missing required attributes and child elements will cause an error. I want this to be the default behaviour, but it should be straightforward to add a parameter to `load` for lenient operation
* Dataclasses must be written by hand, no tools are provided to generate these from, DTDs, XML schema definitions, or RELAX NG schemas