Topics by Key Phrases

Purpose

Topic Detection by Key Phrase Grouping looks for key phrases in the content of each Document. The configuration settings control how key phrases are identified.

A Topic is generated by clustering key phrases according to how they are shared between Documents.

How it is applied

You apply Network Expansions when viewing a Network.

See Expand the Network.

Access

Topic Detection by Key Phrase Grouping is a built-in Network Expansion.

  1. Select Configurations > Network Creation.

  2. Select the Network Expansion configuration.

Manage Configurations

See Manage Configurations for more information on managing configurations, including opening, creating, renaming, copying, exporting, and importing configurations.

Process

You can modify the configuration to meet your requirements. Once you have completed your modifications, select Save.

The configuration is divided into two sections:

  • Key-phrase Generation

  • Topic Generation

Key-phrase Generation

Option

Description

Node Type

The name of the Node Type that is created.

If you change the name, you will need to modify the Ontology for the existing Node Type (Key-phrase) or add a new Node Type to the Ontology.

Weight by Lengths

Assigns a weighting based on the number of words in the phrase.

Example:

Entering: 1,5,10

Means:

  • A 1-word phrase is given a weight of 1

  • A 2-word phrase is given a weight of 5

  • A 3-word phrase is given a weight of 10
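The mapping in the example above can be sketched in a few lines of Python. This is only an illustration of the positional lookup, not the product's implementation: the function names are hypothetical, and the handling of phrases longer than the list (reusing the last weight) is an assumption.

```python
def parse_weights(setting: str) -> list[float]:
    """Parse a comma-separated Weight by Lengths value such as "1,5,10"."""
    return [float(part) for part in setting.split(",") if part.strip()]


def phrase_weight(phrase: str, weights: list[float]) -> float:
    """Return the weight for a phrase based on its number of words.

    Assumption: a phrase with more words than there are weights reuses
    the last weight. The actual behaviour for that case is not documented.
    """
    word_count = len(phrase.split())
    if word_count == 0 or not weights:
        return 0.0
    # Position 1 in the setting applies to 1-word phrases, position 2 to
    # 2-word phrases, and so on.
    index = min(word_count, len(weights)) - 1
    return weights[index]


weights = parse_weights("1,5,10")
print(phrase_weight("network", weights))              # 1.0
print(phrase_weight("network expansion", weights))    # 5.0
print(phrase_weight("key phrase grouping", weights))  # 10.0
```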

Black & White

Select a Dictionary configuration containing a Whitelist and/or Blacklist to be applied to Topic detection.

You can create a Dictionary configuration to include a:

  • Whitelist of words and phrases that are to be included. Topic detection will still find other words and phrases that are not in the Whitelist.

  • Blacklist of words and phrases which must be excluded from phrases.

Select the Create a Black & Whitelist Dictionary button to open a proforma Dictionary configuration ready for you to add words and phrases.

You can modify the Dictionary configuration as required. For example, if you do not want to include a Whitelist, simply remove the Whitelist from the Dictionary configuration.

Average Separation

You can limit the number of key phrases generated by setting an average separation.

For example, if you set it to 100, a key phrase is generated on average every 100 words.

Limit

Sets the maximum number of key phrases generated across the Network.

You can adjust this number based on the size of the Network.
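To get a feel for how Average Separation and Limit interact, a rough estimate can be worked out as follows. This is a back-of-envelope sketch, not the product's calculation: the function name is hypothetical, and it assumes that each Document yields about one key phrase per Average Separation words and that Limit acts as a simple cap on the total.

```python
def estimate_key_phrases(doc_word_counts: list[int],
                         average_separation: int,
                         limit: int) -> int:
    """Estimate how many key phrases a Network might produce.

    Assumptions: roughly one key phrase per `average_separation` words in
    each Document (rounded down), with the total capped at `limit`.
    """
    if average_separation <= 0:
        raise ValueError("average_separation must be a positive number")
    per_document = [count // average_separation for count in doc_word_counts]
    return min(sum(per_document), limit)


# Three Documents of 5,000, 1,200 and 300 words, Average Separation of 100.
print(estimate_key_phrases([5000, 1200, 300], 100, 1000))  # 65
print(estimate_key_phrases([5000, 1200, 300], 100, 40))    # 40
```

If the estimate is well above the Limit, many candidate key phrases are likely to be dropped, so it may be worth raising the Limit or increasing the Average Separation.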

Starting Header (optional)

Only analyse the text after the specified Header text.

Ending Header (optional)

Stop analysing the text after the specified Header text.

Colocation

Identify the Target Nodes for creating Colocation links between the generated Topic node and other nodes.

Topic Generation

Option

Description

Node Type

The name of the Node Type that is created.

If you change the name, you will need to modify the Ontology for the existing Node Type (Key-phrase) or add a new Node Type to the Ontology.

Clustering

The configuration for how key phrases are clustered into Topics.

Min. Key-phrases

Defines the minimum number of Key-phrases needed to form a Topic.

Topic to Document Links

The name of the link between the Topic and the Documents it was formed from.

Topic to Entity Links

The name of the link between the Topic and connected Entities.

Proximity Links

Choose whether to create links between Topics that appear in close proximity to each other in Documents.
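
The role of the Min. Key-phrases setting can be illustrated with a small Python sketch. The clustering method the product uses is set by the Clustering configuration and is not described here, so this example substitutes a deliberately simple stand-in: key phrases that share at least one Document are grouped together. All names and sample data are hypothetical.

```python
from collections import defaultdict


def cluster_key_phrases(doc_phrases: dict[str, set[str]],
                        min_key_phrases: int) -> list[set[str]]:
    """Group key phrases that share a Document, then apply the minimum size.

    Stand-in method (an assumption, not the product's algorithm): two key
    phrases end up in the same cluster if a chain of shared Documents
    connects them.
    """
    parent: dict[str, str] = {}  # union-find parent map over key phrases

    def find(item: str) -> str:
        parent.setdefault(item, item)
        while parent[item] != item:
            parent[item] = parent[parent[item]]  # path halving
            item = parent[item]
        return item

    def union(a: str, b: str) -> None:
        parent[find(a)] = find(b)

    for phrases in doc_phrases.values():
        ordered = sorted(phrases)
        for phrase in ordered:
            find(phrase)  # register every phrase, even if it stands alone
        for other in ordered[1:]:
            union(ordered[0], other)

    clusters: dict[str, set[str]] = defaultdict(set)
    for phrase in list(parent):
        clusters[find(phrase)].add(phrase)

    # Clusters with fewer key phrases than the minimum do not become Topics.
    return [c for c in clusters.values() if len(c) >= min_key_phrases]


docs = {
    "doc1": {"solar panel", "energy storage"},
    "doc2": {"energy storage", "grid operator"},
    "doc3": {"football"},
}
topics = cluster_key_phrases(docs, min_key_phrases=2)
print(topics)  # one cluster of three key phrases; "football" is dropped
```

With min_key_phrases set to 1, the single-phrase cluster containing "football" would also be kept, which shows how a higher minimum filters out thin Topics.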