Concept: Documents

Documents

Documents are the fundamental building block of Sintelix. Once data has been ingested into Sintelix, it is converted into a standardised, text-based Document.

Key Concepts

To get a better understanding of the Sintelix core process and its key concepts, including Documents, view the Sintelix Core Process video.

Ingestion

Source data in many different formats can be loaded into Sintelix. The process of adding files to Sintelix is called Ingestion (see Concept: Ingestion).

Standardised Documents

During ingestion, each source file is converted into a standardised text-based Document.

Collections of Documents

Once ingested, the standardised Documents can be stored in a Collection. See Concept: Collection.

Sintelix always creates a Document when loading data, whether the data is structured or unstructured. However, you can set the Ingestion configuration to discard the Documents and feed only the information extracted from them into a Network.

Mark Up in Documents

During ingestion, Sintelix identifies information of interest (based on Ingestion Configuration settings) and highlights it as Entities and Text References, along with Links.

These Entities and Text References are categorised based on Ontology configuration settings. For example, Sintelix looks for words that might be a person's name and highlights them as a "Person" entity. It performs a similar process to pick out other information of interest, such as Organisations and Locations, as illustrated below.
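As a rough illustration of what marked-up text amounts to, the sketch below models a Document as plain text plus a list of categorised spans. This is not the Sintelix data model or API: the class names `Document` and `Entity`, the `mark` and `surface` methods, and the sample sentence are all hypothetical, and entities are located here by simple substring search rather than by any real recognition process.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A highlighted span of text with an ontology category (hypothetical model)."""
    start: int
    end: int
    category: str


@dataclass
class Document:
    """Standardised text plus the entities marked up in it (hypothetical model)."""
    text: str
    entities: list = field(default_factory=list)

    def mark(self, phrase, category):
        # Locate the first occurrence of the phrase; real entity recognition
        # is far more sophisticated than a substring search.
        start = self.text.find(phrase)
        if start == -1:
            return None
        entity = Entity(start, start + len(phrase), category)
        self.entities.append(entity)
        return entity

    def surface(self, entity):
        # Return the highlighted text that the entity covers.
        return self.text[entity.start:entity.end]


doc = Document("Ahmad works for Acme Ltd in Sydney.")
doc.mark("Ahmad", "Person")
doc.mark("Acme Ltd", "Organisation")
doc.mark("Sydney", "Location")

for entity in doc.entities:
    print(entity.category, "->", doc.surface(entity))
```

Storing character offsets rather than copies of the text keeps the mark-up tied to the standardised Document, which is one plausible reason a text-based standard form is useful.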

Network Creation: Clustering

Clustering groups similar things together.

Entities identified in documents are grouped (clustered) to become nodes in a Network, and the relationships between these entities become links within the Network.

This example shows the colour codes used in a document to mark up Nodes and Links. Nodes are document Entities that have been grouped to form a single identity on a Network. An Entity is highlighted text in a Document that is represented as a node in a Network. The number on an entity indicates the number of links that the entity has. In this case, it identifies that Ahmad is Ali's brother.
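The idea of clustering entities into nodes and carrying their relationships over as links can be sketched in a few lines. This is a deliberately naive illustration, not how Sintelix clusters: it assumes two mentions refer to the same identity when their normalised name and category match exactly, and the function names, document IDs, and the "brother of" label are hypothetical.

```python
from collections import defaultdict


def normalise(name):
    # Lower-case and collapse whitespace so "Ahmad" and "ahmad " compare equal.
    return " ".join(name.lower().split())


def cluster_entities(mentions):
    """Group (doc_id, name, category) mentions into nodes.

    Assumption: identical normalised name + category means the same identity.
    """
    nodes = defaultdict(set)
    for doc_id, name, category in mentions:
        nodes[(normalise(name), category)].add(doc_id)
    return dict(nodes)


def build_links(relations):
    """Turn (source, label, target) relations between entities into node links."""
    return {(normalise(s), label, normalise(t)) for s, label, t in relations}


mentions = [
    ("doc1", "Ahmad", "Person"),
    ("doc1", "Ali", "Person"),
    ("doc2", "ahmad", "Person"),
]
relations = [("Ahmad", "brother of", "Ali")]

nodes = cluster_entities(mentions)
links = build_links(relations)

# Count the links attached to each node, analogous to the number shown on an entity.
link_counts = {
    key: sum(1 for s, _, t in links if key[0] in (s, t))
    for key in nodes
}

for (name, category), docs in nodes.items():
    print(name, category, sorted(docs), "links:", link_counts[(name, category)])
```

Here the two mentions of Ahmad from different documents collapse into one node, and the single relationship gives both Ahmad and Ali a link count of one.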

Viewing Documents

Documents can be accessed from multiple locations, including Collections, Network Tables and Network Graphs (see Accessing documents). Documents are viewed in the Documents Pane.