1. We will need regular dictionaries in order to store our custom words and phrases
a. Make a POST to /regulardictionarystore/regulardictionaries
with a JSON body like this:
{ "language" : "en", "words" : [ { "word" : "hello", "exp" : "greeting(hello)", "frequency" : 0 } ], "phrases" : [ { "phrase" : "good afternoon", "exp" : "greeting(good_afternoon),language(english)" } ] }
b. The API should respond with status 201 and a URI referencing the newly created dictionary
i. e.g.
eddi://ai.labs.regulardictionary/regulardictionarystore/regulardictionaries/<UNIQUE_ID>?version=<VERSION>
ii. This URI will be used in the parser configuration
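Step 1 can be sketched in Python as follows. This is a minimal illustration using only the standard library, not an official client: `BASE_URL` is a placeholder for wherever your instance is reachable, and the helper names `build_dictionary_request` and `split_resource_uri` are made up for this example. The request is only prepared here, not sent.

```python
import json
import urllib.request
from urllib.parse import urlsplit, parse_qs

# Assumption: placeholder address of your running instance; adjust as needed.
BASE_URL = "http://localhost:7070"

# The dictionary body from step 1a.
DICTIONARY = {
    "language": "en",
    "words": [{"word": "hello", "exp": "greeting(hello)", "frequency": 0}],
    "phrases": [
        {"phrase": "good afternoon",
         "exp": "greeting(good_afternoon),language(english)"}
    ],
}


def build_dictionary_request(base_url, dictionary):
    """Prepare (but do not send) the POST that creates a regular dictionary."""
    return urllib.request.Request(
        url=base_url + "/regulardictionarystore/regulardictionaries",
        data=json.dumps(dictionary).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def split_resource_uri(uri):
    """Return (unique_id, version) from a returned eddi:// resource URI."""
    parts = urlsplit(uri)
    unique_id = parts.path.rstrip("/").rsplit("/", 1)[-1]
    version = int(parse_qs(parts.query)["version"][0])
    return unique_id, version
```

Passing the prepared request to `urllib.request.urlopen` would send it. Where exactly the response carries the new URI is not stated above; for a 201 it is commonly the `Location` header, but treat that as an assumption and check your response.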
2. Next, create a parser configuration that includes the reference to the dictionary
a. Make a POST to /parserstore/parsers
b. Submit JSON like this (don't forget to replace UNIQUE_ID and VERSION!):
{ "extensions" : { "dictionaries" : [ { "type" : "eddi://ai.labs.parser.dictionaries.integer" }, { "type" : "eddi://ai.labs.parser.dictionaries.decimal" }, { "type" : "eddi://ai.labs.parser.dictionaries.punctuation" }, { "type" : "eddi://ai.labs.parser.dictionaries.email" }, { "type" : "eddi://ai.labs.parser.dictionaries.time" }, { "type" : "eddi://ai.labs.parser.dictionaries.ordinalNumber" }, { "type" : "eddi://ai.labs.parser.dictionaries.regular", "config" : { "uri" : "eddi://ai.labs.regulardictionary/regulardictionarystore/regulardictionaries/<UNIQUE_ID>?version=<VERSION>" } } ], "corrections" : [ { "type" : "eddi://ai.labs.parser.corrections.stemming", "config" : { "language" : "english", "lookupIfKnown" : "false" } }, { "type" : "eddi://ai.labs.parser.corrections.levenshtein", "config" : { "distance" : "2" } }, { "type" : "eddi://ai.labs.parser.corrections.mergedTerms" } ]}, "config" : null }
Type | Description
---|---
eddi://ai.labs.parser.dictionaries.integer | matches all positive integers
eddi://ai.labs.parser.dictionaries.decimal | matches decimal numbers with '.' as well as ',' as a fractional separator
eddi://ai.labs.parser.dictionaries.punctuation | matches common punctuation
eddi://ai.labs.parser.dictionaries.email | matches an email address with the regex (\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b)
eddi://ai.labs.parser.dictionaries.time | matches time formats such as 01:20, 01h20, 22:40, 13:43:23
eddi://ai.labs.parser.dictionaries.ordinalNumber | matches English ordinal numbers such as 1st, 2nd, 3rd, 4th, 5th, ...
eddi://ai.labs.parser.dictionaries.regular | URI of a regular dictionary resource: eddi://ai.labs.regulardictionary/regulardictionarystore/regulardictionaries/<UNIQUE_ID>?version=<VERSION>
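The parser configuration from step 2 can also be assembled programmatically, which makes it harder to forget the placeholder replacement. The sketch below is illustrative only: `build_parser_config` is a hypothetical helper, and the correction settings are copied verbatim (as strings) from the JSON above.

```python
import json

# Built-in dictionary types listed in the table above.
BUILTIN_DICTIONARIES = [
    "integer", "decimal", "punctuation", "email", "time", "ordinalNumber",
]


def build_parser_config(dictionary_uri):
    """Return the parser configuration as a dict, referencing one regular dictionary."""
    # Guard against submitting the template with its placeholders still in place.
    if "<UNIQUE_ID>" in dictionary_uri or "<VERSION>" in dictionary_uri:
        raise ValueError("replace <UNIQUE_ID> and <VERSION> in the dictionary URI")

    dictionaries = [
        {"type": "eddi://ai.labs.parser.dictionaries." + name}
        for name in BUILTIN_DICTIONARIES
    ]
    dictionaries.append({
        "type": "eddi://ai.labs.parser.dictionaries.regular",
        "config": {"uri": dictionary_uri},
    })

    corrections = [
        {"type": "eddi://ai.labs.parser.corrections.stemming",
         "config": {"language": "english", "lookupIfKnown": "false"}},
        {"type": "eddi://ai.labs.parser.corrections.levenshtein",
         "config": {"distance": "2"}},
        {"type": "eddi://ai.labs.parser.corrections.mergedTerms"},
    ]

    return {
        "extensions": {"dictionaries": dictionaries, "corrections": corrections},
        "config": None,  # serialized as JSON null
    }
```

`json.dumps(build_parser_config(uri))` then yields the body to POST to /parserstore/parsers.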
3. The last step is to use the parser with the configuration created above
a. POST to /api/v1/parser/<PARSER_ID>?version=<VERSION>
b. In the body, send the plain text you would like to have parsed
c. The parser will return expressions representing the elements of your plain text
Keep in mind that this parser is made for human dialogs, not for parsing full-text documents
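A call to the parser from step 3 might be prepared like this. Again a hedged sketch with the standard library: `build_parse_request` is a hypothetical helper, the base URL is a placeholder, and the `text/plain` content type is an assumption based on the instruction to send plain text.

```python
import urllib.request
from urllib.parse import quote


def build_parse_request(base_url, parser_id, version, text):
    """Prepare (but do not send) the POST that submits plain text to a parser."""
    url = "{}/api/v1/parser/{}?version={}".format(
        base_url, quote(parser_id, safe=""), int(version)
    )
    return urllib.request.Request(
        url=url,
        data=text.encode("utf-8"),
        # Assumption: the endpoint accepts a text/plain body.
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )
```

For example, `build_parse_request("http://localhost:7070", parser_id, version, "hello good afternoon")` prepares a short, dialog-style input; `urllib.request.urlopen` would send it and return the response containing the expressions.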