Dictionaries

What is a Dictionary?

Dictionaries provide a simple way of creating text references, such as entities, when documents are ingested.

A Dictionary is a collection of wordlists. Each wordlist contains a list of words or phrases that you want to identify, such as types of weapons, names of illicit drugs, names of specific organisations and war crime indicators. Each word or phrase is referred to as an entry.
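
The relationship between a Dictionary, its wordlists and their entries can be pictured with a short Python sketch. This is an illustration of the concept only, not Sintelix Dictionary syntax or the product's implementation: the `TextReference` class, the `find_entries` function, the wordlist names and the sample entries are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class TextReference:
    """Hypothetical record of one entry found in a document."""
    wordlist: str  # name of the wordlist the entry belongs to
    entry: str     # the word or phrase that was matched
    start: int     # character offset where the match starts
    end: int       # character offset just past the match


def find_entries(text, dictionary):
    """Return a TextReference for every occurrence of every entry."""
    refs = []
    for wordlist, entries in dictionary.items():
        for entry in entries:
            # Match the entry as a whole word or phrase, ignoring case.
            pattern = r"\b" + re.escape(entry) + r"\b"
            for m in re.finditer(pattern, text, flags=re.IGNORECASE):
                refs.append(TextReference(wordlist, entry, m.start(), m.end()))
    return sorted(refs, key=lambda r: r.start)


# A Dictionary modelled as wordlist name -> list of entries (made-up data).
dictionary = {
    "Weapon": ["rifle", "hand grenade"],
    "Organisation": ["Acme Relief Fund"],
}

text = "The Acme Relief Fund reported a rifle and a hand grenade."
refs = find_entries(text, dictionary)
for ref in refs:
    print(ref.wordlist, "|", ref.entry, "|", ref.start, ref.end)
```

Each result ties a span of document text back to the wordlist it came from, which is roughly the role a text reference plays after ingestion.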

Access

Select Configurations > Dictionaries to view the Dictionaries available in the current Project.

How it is applied

Dictionaries are added to the Document Processing Configuration, which in turn is added to the Ingestion Configuration.

Dictionaries and EES

Entity Extraction Scripts (EESs) allow for more advanced control over entity extraction. An EES is a Sintelix configuration for marking up and creating connections between document text using a highly configurable scripting syntax. EESs work faster when Dictionaries are used to create the initial text references.

Capabilities

Dictionaries offer the following capabilities:

  • match single words or multi-word phrases
  • add features to entries
  • control case sensitivity
  • apply context sensitivity: check whether a term appears in the same block with (or without) other words
  • include or exclude plurals
  • include or exclude alternative spellings for names
  • escape special characters.
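
To make some of these options concrete, the following Python sketch shows how case sensitivity, plurals, context words and escaped special characters could affect whether an entry matches a block of text. It is a conceptual sketch under stated assumptions, not Sintelix syntax or behaviour: the `entry_matches` function and its parameters are hypothetical, and the plural handling is a deliberately naive "s"/"es" suffix rule.

```python
import re


def _contains(word, block, flags):
    """True if `word` occurs in `block` as a whole word or phrase."""
    # re.escape makes special characters such as "(" or "." match literally.
    pattern = r"(?<!\w)" + re.escape(word) + r"(?!\w)"
    return re.search(pattern, block, flags) is not None


def entry_matches(entry, block, *, case_sensitive=False, include_plurals=False,
                  with_words=(), without_words=()):
    """Hypothetical check of one entry against one block of text."""
    flags = 0 if case_sensitive else re.IGNORECASE
    pattern = re.escape(entry)
    if include_plurals:
        pattern += r"(?:s|es)?"  # naive English plural, for illustration only
    pattern = r"(?<!\w)" + pattern + r"(?!\w)"
    if re.search(pattern, block, flags) is None:
        return False
    # Context sensitivity: required words must be present in the block,
    # excluded words must be absent (both compared case-insensitively).
    if not all(_contains(w, block, re.IGNORECASE) for w in with_words):
        return False
    if any(_contains(w, block, re.IGNORECASE) for w in without_words):
        return False
    return True


print(entry_matches("rifle", "Two rifles were seized."))
print(entry_matches("rifle", "Two rifles were seized.", include_plurals=True))
print(entry_matches("ICE", "ice cream", case_sensitive=True))
print(entry_matches("ice", "ice cream truck", without_words=["cream"]))
print(entry_matches("AK-47 (modified)", "an AK-47 (modified) rifle"))
```

The context check is what lets an ambiguous term such as "ice" be accepted only when the surrounding block supports the intended meaning.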

Edit and Test Dictionaries

You can edit and test Dictionaries, EESs and Document Processing Scripts using the Code Editor and the Text Graph Analyser.

Test the Dictionary: Code Editor

The text editor provides code highlighting. To view the shortcut list of codes, press CTRL+Space.

To test, select the Save & Test button at the bottom of the Code Editor. You can select sample text or a sample document to test the code on and see the resulting output along with a detailed breakdown of the text graph.

For more information on using the Code Editor, see Code Editor.

Test the Dictionary: Text Graph Analyser

Video

Click on the image below to watch a video for a quick introduction to defining and testing your Dictionary using the Code Editor and Text Graph Analyser.

For more information on using the Text Graph Analyser, see Text Graph Analyser.