
Commit 5669f6f

JuanArtellarin authored and committed
Patterns README doc (#99)
1 parent 7922ace commit 5669f6f

9 files changed: +71, -0 lines changed
Patterns/README.md

+71
@@ -0,0 +1,71 @@
# Patterns
Recognition and resolution of numbers, units, and date/time expressed in multiple languages (e.g. English, French, Spanish, Chinese) is built around several data structures, such as string constants, regular expressions, lists, and dictionaries.
The Microsoft.Recognizers.Text reusable component currently has implementations in C# and TypeScript/JavaScript.

As contributors incorporate fixes and support for new languages (e.g. Japanese, Korean, German, and Dutch) and new implementations (e.g. Python), a centralized store for these data structures becomes crucial.

### YAML
YAML is a human-readable data serialization language. Custom data types are allowed, but YAML natively encodes scalars (such as strings, integers, and floats), lists, and associative arrays (also known as hashes or dictionaries).

> For more information on the YAML format, see [The Official YAML Website](http://yaml.org/).

Different implementations use different methods to generate the required data structures. For example, the C# implementation leverages T4 templates to read the YAML files and generate a .cs file containing a single class with all the data structures.

### Strings and char constants
Use simple scalar entities to generate a string constant. The `!char` tag can be used to force a char constant definition instead of a string one.

![strings and char constants](images/strings-char-constants.png)

### Complex structures
For data structures more complex than string or char constants, specific YAML tags are used, each with the purpose described below:

#### simpleRegex
Used for regex patterns that contain no references to other regexes and no parameters; they are simply escaped string constants.

![simpleRegex](images/simpleregex.png)

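The escaping mentioned for simpleRegex can be sketched as follows. This is a minimal illustration, not the generator's actual code: the pattern text and the constant name `ExampleDecimalRegex` are hypothetical, and `JSON.stringify` stands in for whatever escaping a given implementation applies when it writes the pattern out as a string constant.

```typescript
// Illustrative pattern text, as it might appear in a YAML entry (hypothetical).
const def = String.raw`\b\d+(\.\d+)?\b`;

// Writing the pattern into source code requires escaping backslashes and quotes.
const emitted = `export const ExampleDecimalRegex = ${JSON.stringify(def)};`;
console.log(emitted);

// The unescaped pattern text is what the regex engine ultimately receives.
console.log(new RegExp(def).test("3.14")); // true
```
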
#### nestedRegex
Used for regex patterns that are composed of other regex definitions. Building a regex out of other regex patterns is a common practice. Note the `references` property and the notation, which resembles C# interpolated strings.
> As a side effect, depending on the language implementation, the order of the YAML entities has a direct impact on the final value of the data structures.

![nestedRegex](images/nestedregex.png)

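The order dependency noted for nestedRegex can be illustrated with a small sketch. It assumes entries are resolved one by one in declaration order and that each `{Name}` placeholder is replaced by the already-resolved pattern of the referenced entry. The `RegexEntry` shape, the `resolveInOrder` function, and the entry names are hypothetical and are not taken from either implementation.

```typescript
// Hypothetical in-memory form of a regex entry; field names are illustrative.
interface RegexEntry {
  def: string;
  references?: string[];
}

// Resolve entries in declaration order, substituting each {Name} placeholder
// with the already-resolved pattern of the referenced entry.
function resolveInOrder(entries: Array<[string, RegexEntry]>): Map<string, string> {
  const resolved = new Map<string, string>();
  for (const [name, entry] of entries) {
    let pattern = entry.def;
    for (const ref of entry.references ?? []) {
      const value = resolved.get(ref);
      if (value === undefined) {
        // A reference declared later in the list cannot be substituted yet.
        throw new Error(`${name} references ${ref} before it is defined`);
      }
      pattern = pattern.split(`{${ref}}`).join(value);
    }
    resolved.set(name, pattern);
  }
  return resolved;
}

const patterns = resolveInOrder([
  ["DigitRegex", { def: "\\d" }],
  ["IntegerRegex", { def: "{DigitRegex}+", references: ["DigitRegex"] }],
]);
console.log(patterns.get("IntegerRegex")); // \d+
```

Swapping the two entries in this sketch makes `resolveInOrder` throw, which is one way an order-sensitive implementation could surface the problem.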
#### paramsRegex
Used for parameterized regex patterns. The notation is similar to nestedRegex, and the pattern is generated as a function-like implementation that takes the parameters.

![paramsRegex](images/paramsregex.png)

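A function-like form of a parameterized pattern might look like the sketch below. The definition text, the `{placeholder}` parameter name, and the `numberWithSuffix` function are assumptions made for illustration; the generated code in each implementation may differ.

```typescript
// Illustrative definition with one parameter; not copied from the YAML files.
const numberWithSuffixDef = "\\d+(?={placeholder})";

// Function-like form: each call substitutes the parameter and returns pattern text.
function numberWithSuffix(placeholder: string): string {
  return numberWithSuffixDef.split("{placeholder}").join(placeholder);
}

const beforePercent = new RegExp(numberWithSuffix("%"));
console.log(beforePercent.test("50%")); // true
console.log(beforePercent.test("50 apples")); // false
```
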
#### dictionary
Used to define a dictionary with keys and values of basic data types.

![dictionary](images/dictionary.png)

#### list
Used to define lists of values of any basic data type.

![list](images/list.png)

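As a rough idea of what dictionary and list entries could become once generated, the sketch below uses a read-only map and a read-only array. The constant names and values are hypothetical, and the actual generated code may use different types.

```typescript
// Hypothetical result of a dictionary entry with string keys and numeric values.
const ExampleCardinalNumberMap: ReadonlyMap<string, number> = new Map<string, number>([
  ["zero", 0],
  ["one", 1],
  ["two", 2],
]);

// Hypothetical result of a list entry holding strings.
const ExampleDecimalSeparatorTexts: ReadonlyArray<string> = ["point"];

console.log(ExampleCardinalNumberMap.get("two")); // 2
console.log(ExampleDecimalSeparatorTexts[0]); // point
```
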
### Update definitions files
Once the data structures are stored in the corresponding YAML files, a series of constants and read-only values must be generated for each implementation.

#### .NET
The C# implementation leverages T4 templates to read the YAML files and, for each YAML file, generate a .cs file containing a single class with all its data structures.

The Microsoft.Recognizers.Definitions project contains all the required T4 templates and the generated .cs files. The T4 templates can be re-processed to update the definition classes in any of three ways: by saving each T4 file individually, by using the Visual Studio Build - Transform All T4 Templates menu option, or by using a Visual Studio extension that triggers all T4 templates in the solution on build.

![.NET definitions](images/net-definitions.png)

#### TypeScript
The TypeScript implementation uses a Node.js program to read the YAML files and, for each YAML file, generate a .ts file containing a single namespace with all its data structures.

The *tools\resource-generator* folder contains the Node.js program that generates the data structure namespaces in the *src\resources* folder. To update the definition namespaces, run the configured `build-resources` script: `npm run build-resources`.

![TypeScript resources](images/typescript-resources.png)
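
The generation step can be pictured with the sketch below. It is not the code in *tools\resource-generator*: it skips YAML parsing entirely and assumes the entries are already available as plain values. The `Definitions` type, the `emitNamespace` function, and the sample entries are hypothetical.

```typescript
// Hypothetical input: entries already parsed from a YAML file into plain values.
type Definitions = Record<string, string | string[]>;

// Emit TypeScript source text for one namespace with one constant per entry.
function emitNamespace(name: string, defs: Definitions): string {
  const lines = Object.keys(defs).map((key) => {
    // JSON.stringify produces a valid, escaped literal for strings and string arrays.
    return `  export const ${key} = ${JSON.stringify(defs[key])};`;
  });
  return [`export namespace ${name} {`, ...lines, "}"].join("\n");
}

const source = emitNamespace("ExampleNumeric", {
  LangMarker: "Eng",
  WrittenDecimalSeparatorTexts: ["point"],
});
console.log(source);
```

The function returns source text rather than writing a file, so the output can be inspected or compared before anything under *src\resources* is overwritten.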

Patterns/images/dictionary.png (54.2 KB)
Patterns/images/list.png (38.5 KB)
Patterns/images/nestedregex.png (81.8 KB)
Patterns/images/net-definitions.png (33.6 KB)
Patterns/images/paramsregex.png (51.6 KB)
Patterns/images/simpleregex.png (105 KB)
21.9 KB
42.2 KB
