|
| 1 | +# Patterns |
| 2 | +Recognition and resolution of numbers, units, and date/time expressed in multiple languages (e.g. English, French, Spanish, Chinese) is built around several data structures like string constants, regular expressions, lists and dictionaries. |
| 3 | +The Microsoft.Recognizers.Text reusable component currently has implementations in C# and TypeScript/JavaScript. |
| 4 | + |
| 5 | +As contributors incorporate fixes, support for new languages (e.g. Japanese, Korean, German, and Dutch), and new implementations (e.g. Python), a centralized store for these data structures becomes crucial. |
| 6 | + |
| 7 | +### YAML |
| 8 | +YAML is a human-readable data serialization language. Custom data types are allowed, but YAML natively encodes scalars (such as strings, integers, and floats), lists, and associative arrays (also known as hashes or dictionaries). |
| 9 | + |
| 10 | +> For more information on the YAML format, see [The Official YAML Website](http://yaml.org/). |
| 11 | + |
| 12 | +Each implementation uses its own method to generate the required data structures from the YAML files. For example, the C# implementation leverages T4 templates to read the YAML files and generate a .cs file containing a single class that includes all the data structures. |
| 13 | + |
| 14 | +### Strings and char constants |
| 15 | +Use simple scalar entities to generate a string constant. The '!char' tag can be used to force a char constant definition instead of a string one. |
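As a rough illustration, the sketch below shows how a plain scalar and a `!char` entry might surface in TypeScript. The entry names and the YAML fragment in the comment are assumptions made for this example, and the generated code is modelled as a plain object rather than the generator's real output. TypeScript has no char type, so the char constant is simply a one-character string here.

```typescript
// Assumed YAML input (entry names are illustrative):
//   LangMarker: Eng
//   DecimalSeparatorChar: !char '.'

// Hypothetical generated constants, modelled as a plain object for brevity.
const ExampleDefinitions: { LangMarker: string; DecimalSeparatorChar: string } = {
  // A plain scalar becomes a string constant.
  LangMarker: "Eng",
  // A !char entry is still a string in TypeScript, but holds one character.
  DecimalSeparatorChar: ".",
};

console.log(ExampleDefinitions.LangMarker);
console.log(ExampleDefinitions.DecimalSeparatorChar.length);
```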
| 16 | + |
| 17 | + |
| 18 | + |
| 19 | + |
| 20 | +### Complex structures |
| 21 | +For data structures more complex than string or char constants, use one of the following YAML tags, each with its own purpose: |
| 22 | + |
| 23 | +#### simpleRegex |
| 24 | +Used for regex patterns that contain no other regexes or params; they are simply escaped string constants. |
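A minimal sketch of what this could look like on the TypeScript side. The YAML fragment in the comment (including the `def` key) and the constant names are assumptions for illustration only. The point is that the pattern is kept as an escaped string and compiled into a `RegExp` by whoever consumes it.

```typescript
// Assumed YAML input (illustrative):
//   RoundNumberIntegerRegex: !simpleRegex
//     def: (hundred|thousand|million|billion|trillion)
//   DigitsRegex: !simpleRegex
//     def: \d+

// Hypothetical generated constants: patterns stored as escaped strings.
const RoundNumberIntegerRegex: string = "(hundred|thousand|million|billion|trillion)";
const DigitsRegex: string = "\\d+"; // the backslash is escaped in source code

// Consumers compile the string into a RegExp when they need it.
const roundNumber = new RegExp(RoundNumberIntegerRegex, "i");
const firstDigits = new RegExp(DigitsRegex).exec("room 42");

console.log(roundNumber.test("two Thousand")); // true
console.log(firstDigits ? firstDigits[0] : null); // "42"
```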
| 25 | + |
| 26 | + |
| 27 | + |
| 28 | + |
| 29 | +#### nestedRegex |
| 30 | +Used for regex patterns that are composed of other regex definitions; building a regex from other regex patterns is common practice. Note the 'references' property and the notation, which resembles C# interpolated strings. |
| 31 | +> As a side effect, depending on the language implementation the order of the YAML entities has a direct impact on the final value of the data structures. |
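The ordering note above can be illustrated with a small resolver. This is a sketch under stated assumptions, not the generator's actual code: the `RegexEntry` shape, the `resolveInOrder` function, and the sample entries are all hypothetical. It walks the entries in file order and replaces each `{Name}` placeholder with the pattern that was already resolved for that name, so an entry that references something defined later fails.

```typescript
// Hypothetical in-memory form of the YAML entries, kept in file order.
interface RegexEntry {
  name: string;
  def: string;
  references?: string[];
}

// Replaces every {Name} placeholder with the previously resolved pattern.
function resolveInOrder(entries: RegexEntry[]): Map<string, string> {
  const resolved = new Map<string, string>();
  for (const entry of entries) {
    let pattern = entry.def;
    for (const ref of entry.references ?? []) {
      const value = resolved.get(ref);
      if (value === undefined) {
        // The referenced entry has not been seen yet: order matters.
        throw new Error(`${entry.name} references ${ref} before it is defined`);
      }
      pattern = pattern.split(`{${ref}}`).join(value);
    }
    resolved.set(entry.name, pattern);
  }
  return resolved;
}

// Illustrative entries; the names and patterns are made up for this example.
const sampleEntries: RegexEntry[] = [
  { name: "ZeroToNineIntegerRegex", def: "(one|two|three)" },
  { name: "TensNumberIntegerRegex", def: "(twenty|thirty)" },
  {
    name: "TwoDigitRegex",
    def: "({TensNumberIntegerRegex}[\\s-]{ZeroToNineIntegerRegex})",
    references: ["TensNumberIntegerRegex", "ZeroToNineIntegerRegex"],
  },
];

const resolvedPatterns = resolveInOrder(sampleEntries);
console.log(resolvedPatterns.get("TwoDigitRegex"));
```

Passing the same entries in reverse order makes `resolveInOrder` throw, which is one way an implementation could surface the ordering dependency instead of silently producing a wrong pattern.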
| 32 | + |
| 33 | + |
| 34 | + |
| 35 | + |
| 36 | +#### paramsRegex |
| 37 | +Used for regex patterns that take parameters. The notation is similar to nestedRegex, but the placeholders are parameters, and the pattern is generated as a function-like construct that receives their values. |
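One way to picture the function-like form in TypeScript is sketched below. The YAML fragment in the comment (with its `def` and `params` keys), the entry name, and the simplified pattern are assumptions for illustration; the real definitions may differ.

```typescript
// Assumed YAML input (illustrative, simplified pattern):
//   NumbersWithPlaceHolder: !paramsRegex
//     def: \d+(?={placeholder})
//     params: [ placeholder ]

// Hypothetical generated output: a function that takes the declared
// parameter and returns the final pattern string.
const NumbersWithPlaceHolder = (placeholder: string): string =>
  `\\d+(?=${placeholder})`;

// The caller supplies the parameter and compiles the result.
const numberBeforePercent = new RegExp(NumbersWithPlaceHolder("%"));
const percentMatch = numberBeforePercent.exec("about 25% off");

console.log(percentMatch ? percentMatch[0] : null); // "25"
```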
| 38 | + |
| 39 | + |
| 40 | + |
| 41 | + |
| 42 | +#### dictionary |
| 43 | +Used to define a dictionary using basic key and value data types. |
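As an illustration only, a dictionary entry could map to a read-only `Map` in TypeScript. The YAML fragment in the comment (with its `types` and `entries` keys) and the entry name are assumptions for this sketch.

```typescript
// Assumed YAML input (illustrative):
//   CardinalNumberMap: !dictionary
//     types: [ string, long ]
//     entries:
//       zero: 0
//       one: 1
//       dozen: 12

// Hypothetical generated output: a read-only map of basic key/value types.
const CardinalNumberMap: ReadonlyMap<string, number> = new Map<string, number>([
  ["zero", 0],
  ["one", 1],
  ["dozen", 12],
]);

console.log(CardinalNumberMap.get("dozen")); // 12
console.log(CardinalNumberMap.has("hundred")); // false
```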
| 44 | + |
| 45 | + |
| 46 | + |
| 47 | + |
| 48 | +#### list |
| 49 | +Used to define lists of values of any basic data type. |
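A list entry could similarly map to a read-only array. In this sketch the YAML fragment in the comment, the entry name, and its values are assumptions; the alternation pattern at the end is just one plausible way a recognizer might consume such a list.

```typescript
// Assumed YAML input (illustrative):
//   WrittenDecimalSeparatorTexts: !list
//     types: [ string ]
//     entries:
//       - point
//       - dot

// Hypothetical generated output: a read-only list of basic values.
const WrittenDecimalSeparatorTexts: ReadonlyArray<string> = ["point", "dot"];

// Example use: join the values into a regex alternation.
const separatorRegex = new RegExp(`\\b(${WrittenDecimalSeparatorTexts.join("|")})\\b`);

console.log(separatorRegex.test("three point five")); // true
console.log(separatorRegex.test("three and five")); // false
```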
| 50 | + |
| 51 | + |
| 52 | + |
| 53 | + |
| 54 | + |
| 55 | +### Update definitions files |
| 56 | +Once the data structures are stored in the corresponding YAML files, a series of constants and read-only values must be generated for each implementation. |
| 57 | + |
| 58 | +#### .NET |
| 59 | +For each YAML file, the C# implementation uses a T4 template to read it and generate a .cs file containing a single class with all of its data structures. |
| 60 | + |
| 61 | +The Microsoft.Recognizers.Definitions project contains all the required T4 templates and the generated .cs files. To re-process the T4 templates and update the definition classes, do one of the following: save each T4 file individually, use the Visual Studio *Build > Transform All T4 Templates* menu option, or use a Visual Studio extension that automatically triggers all T4 templates in your solution on build. |
| 62 | + |
| 63 | + |
| 64 | + |
| 65 | + |
| 66 | +#### TypeScript |
| 67 | +For each YAML file, the TypeScript implementation uses a Node.js program to read it and generate a .ts file containing a single namespace with all of its data structures. |
| 68 | + |
| 69 | +The *tools\resource-generator* folder contains the Node.js program, which generates the data structure namespaces in the *src\resources* folder. To update the definition namespaces, run the configured `build-resources` script: `npm run build-resources`. |
| 70 | + |
| 71 | + |