Dictionaries Cheat Sheet
Function Summary
The key functions you can incorporate into a Dictionary are summarised below.
- Wordlist Definition – creates a wordlist
- Comments – add comments to a dictionary
- Example – embed test data into a dictionary
- Special Characters – escape special characters
- Features – add useful information to a tagged text reference (entity)
- Case Sensitivity – title, upper, lower, exact (default: all)
- Context – check for terms in (or not in) the text block before annotating
- Generalisation
  - Plurals – include plurals
  - Reduce (Spelling Variations) – include spelling variations
Example
Below is an example of a simple wordlist in a dictionary, where each word listed is tagged as Illicit Drugs.
#wordlist tag:Illicit Drugs
fentanyl
heroin
morphine
opium
oxycodone
ecstasy
cocaine
Sample Text:
Fentanyl, morphine, opium and oxycodone were also discovered during the raid of the heroin dealer's home.
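Result (highlighting not reproduced here; Fentanyl, morphine, opium, oxycodone and heroin are each tagged as Illicit Drugs):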
Fentanyl, morphine, opium and oxycodone were also discovered during the raid of the heroin dealer's home.
Function Cheat Sheet
The syntax for each of the Dictionary functions is summarised below. More detailed explanations and examples are provided in later topics.
Wordlist Definition
Creates a wordlist, giving it a name.
#wordlist <label>:<name>
<label>
- tag (creates a text reference)
- any text (creates an annotation with the text as the label; there is no visible text reference in the document. Can be referenced in EES.)
- no label (creates an annotation without a label, listed under the generic label wordlist; there is no visible text reference in the document.)
<name>
- the text that appears after the highlighted text reference when using tag as the label.
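The definition that produced the result below is not shown in this sheet. A minimal sketch, assuming a hypothetical wordlist named Animal with the single entry cat, might look like this:

```text
#wordlist tag:Animal
cat
```

With tag as the label, the name Animal would be the text shown after the highlighted word.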
Result:
A cat is an animal.
Comments
You can add comments into your dictionary.
Single Line Comments (//)
Can be on their own line or follow an entry.
Multi Line Comments (/* */)
Can be on a single line or span several lines.
/*******************************/
/* Drugs – Illicit Drugs */
/*******************************/
#wordlist tag:Illicit Drugs
// add more illicit drugs
heroin
ecstasy // check for different names
morphine
opium
oxycodone
/* See separate list for prescription drugs */
Example
You can embed sample text inside the EXAMPLE tags to use when testing the dictionary.
/*
#EXAMPLE
Text placed between the example tags will be used as test input when testing the dictionary.
#EXAMPLE END
*/
/*
#EXAMPLE
During the attack, a bomb and several missiles were used to disable the aircraft carrier.
#EXAMPLE END
*/
Special Characters
Special characters included within a wordlist entry need to be entered in a specific way (known as escaping special characters).
Special characters include:
- # (hash)
- * (asterisk)
- , (comma)
- // (two forward slashes)
- */ (asterisk forward slash)
- /* (forward slash asterisk)
Escaping Special Characters
Use double quotes to escape special characters.
"#ab" // tags #ab
"a , b" // tags a , b
"a * b" // tags a * b
"a // b" // tags a // b
Escaping " and \ in escaped text
To use " or \ inside double quotes, put a \ (backslash) before the " or \.
"a \" b" // tags a " b
"\\ab" // tags \ab
Features
You can add additional information to text references created by dictionaries, called features.
A text reference can have more than one feature added.
Set Feature:
#feature <label>:<type>:<value>
Clear Feature:
#clear feature <label>
<type>
- string (default, can be omitted)
- boolean
- integer
- long
- float
- double
Setting a new feature with the same label automatically overrides the previous feature.
A feature is applied to all the following entries until the feature is overridden or cleared.
#wordlist tag:Weapon
#feature:"Weapon Type" "Artillery"
artillery
#clear feature:"Weapon Type"
#feature:"Explosive Type" "Remote Detonation"
bombs
#clear feature:"Explosive Type"
#feature:"Missile Type" "Long Range"
missiles
#cond:context missile
trident
Conditions
Conditions control which entries are annotated, depending on whether the condition is met. There are two types of conditions you can add: case and context.
Set Condition:
#cond:<type> <+/-><condition> <condition> <condition>
Clear Condition:
#clear cond:<type>
- #cond on a new line clears the #cond on a previous line of the same type.
- You can include multiple <conditions> of the same type on the same line.
- When you want to stop using a condition, you can clear it.
<type>
- case - defines the case sensitivity of the entry.
- context - defines terms that must or must not be in the text block with the entry.
<+/->
- Each condition defaults to the plus symbol.
- Since the plus symbol is the default, you only need to add the minus symbol when you want to exclude a condition.
- You can combine positive and negative conditions on the same line.
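As a sketch of combining a positive and a negative condition on one line (the terms aircraft and toy are illustrative assumptions, not taken from the product documentation), the intent here is that the entry is only annotated when aircraft appears in the text block and toy does not:

```text
#wordlist tag:Aircraft
#cond:context aircraft -toy
"Boeing 747"
#clear cond:context
```

The plus symbol is omitted before aircraft because it is the default.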
#wordlist tag:SynonymsForForces
#cond:context -line
front
#clear cond:context
#cond:case exact
Force
Revolution Force
#clear cond:case
#wordlist tag:Aircraft
#cond:context aircraft plane planes aircrafts flew fly flying
"Boeing 747"
"Lockheed L-1011 TriStar"
"Macchi C.205 Veltro"
"Airbus A380"
"Macchi C.202 Folgore"
"Reggiane Re.2000 Falco I"
"Fiat G.50 Freccia"
Case Sensitivity
Case Sensitivity is a type of Condition, described above.
By default all case variations are annotated. To limit annotations to only certain cases, you can include a conditional case statement.
Set case:
#cond:case <+/-><condition> <condition> <condition>
Clear case:
#clear cond:case
- If not defined, all case variations are annotated.
<condition>
- title – capitalised version of entries only
- exact – entries exactly as written
- upper – fully uppercase version of entries
- lower – fully lowercase version of entries
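A minimal sketch of a case condition, assuming a hypothetical Organisation wordlist in which an acronym should only be annotated when it is written fully in uppercase:

```text
#wordlist tag:Organisation
#cond:case upper
who
#clear cond:case
```

Under this condition WHO would be annotated, while who and Who would not.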
Context
Setting a context condition allows you to only annotate an entry if a word or phrase exists (or does not exist) in the same text block as the entry.
Set Context:
#cond:context <+/-><condition> <condition> <condition>
Clear Context:
#clear cond:context
<+/->
- If not defined, the term must be in the text block (the plus sign is assumed by default).
- See Conditions above.
<term>
- If a term contains more than one word, enclose it in double quotes (e.g. "New Zealand").
- You can include multiple terms, leaving a space between terms.
#wordlist tag:Aircraft
#cond:context aircraft plane planes fly flew flying
"Boeing 747"
"Lockheed L-1011 TriStar"
"Macchi C.205 Veltro"
"Airbus A380"
/*
#EXAMPLE
There were reports of multiple aircraft flying around the front line, with one witness stating they saw a Boeing 747!
While not as large as an Airbus A380, the Boeing 747 is an unexpected sight to see at the front.
#EXAMPLE END
*/
Plurals
By default, if the entry is singular only the singular form is annotated. If the entry is plural, only the plural is annotated.
If you want to include both singular and plural forms of a word, you need to enable plurals.
#generalize:plural <true/false>
Default: false (plurals are not included).
#wordlist tag:Weapons
#generalize:plural true
fighter jet
bomb
missile
#generalize:plural false
aircraft carrier
Sample Text:
During the attack, a fighter jet, bombs and missiles were used to disable the aircraft carriers.
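Result (highlighting not reproduced here; fighter jet, bombs and missiles are tagged as Weapons, while aircraft carriers is not, because plurals are switched off for that entry):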
During the attack, a fighter jet, bombs and missiles were used to disable the aircraft carriers.
Reduce (Spelling Variations)
Names often have many spelling variations. To include spelling variations, you can enable the Reduce function.
#generalize:reduce <true/false>
Clear Reduce:
#clear generalize: reduce
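The wordlist behind the sample below is not shown in this sheet. A minimal sketch, assuming a hypothetical Person wordlist with one entry per name and Reduce switched on, might look like this:

```text
#wordlist tag:Person
#generalize:reduce true
Mohammad
Khalid
Michael
#generalize:reduce false
```

The intent is that spelling variations of each entry are annotated as well. Which variations are actually matched depends on the Reduce function itself.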
Sample Text:
Mohammad
Mohamad
Muhammad
Muhamed
Khalid
Khaled
Michael
Michel
Michal
Result:
Mohammad
Mohamad
Muhammad
Muhamed
Khalid
Khaled
Michael
Michel
Michal
Columns
Using columns, you can apply unique functions to each entry.
For example, for each entry, you can apply unique feature values, specify case and set conditions.
Set Columns:
#cols <col-definition>, <col-definition>, <col-definition>
Clear Columns:
#clear cols
One of the columns must be text, representing the dictionary entry.
#wordlist tag:Weapon
#cols text, feature:"Threat Level", feature:"Threat Value"
artillery, medium, 4
bombs, medium, 4
#clear cols //clear the previous column structure
#cols text, feature:"Top Speed"
missiles, 200mph
aircraft carrier, 30knots