Concept: Ingestion

The Ingestion configuration defines what happens to source data when it is loaded into Sintelix.

Ingestion

The Ingestion configuration defines:

  • how source data is converted into Documents

    Sintelix always creates a Document when loading data (both structured and unstructured). However, you can set the Ingestion configuration to discard the Documents and feed only the information extracted from them into a Network.

  • what information is extracted from structured and unstructured data (Text References, Entities and Links)

  • which Network(s), if any, to populate with the resulting extracted information

  • which Ontology to apply (the Ontology identifies the icons, colours and default views for extracted Entities)

  • how to classify and tag ingested Documents.
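
As a rough illustration only, the settings listed above can be pictured as a single record. The class name, field names and the describe helper below are hypothetical and are not part of Sintelix; they only mirror the items in the list.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class IngestionConfig:
    """Hypothetical model of the settings listed above; not a Sintelix API."""
    name: str
    # False means Documents are not kept and only extracted information is fed into a Network.
    keep_documents: bool = True
    # What is extracted from structured and unstructured data.
    extract: List[str] = field(
        default_factory=lambda: ["Text References", "Entities", "Links"]
    )
    # Networks to populate; an empty list means no Network is populated.
    networks: List[str] = field(default_factory=list)
    # Ontology supplying icons, colours and default views.
    ontology: Optional[str] = None
    # Classification tags applied to ingested Documents.
    tags: List[str] = field(default_factory=list)


def describe(config: IngestionConfig) -> str:
    """Summarise what would happen to source data under this configuration."""
    documents = "kept" if config.keep_documents else "discarded after extraction"
    targets = ", ".join(config.networks) if config.networks else "no Network"
    return f"{config.name}: Documents {documents}; feeds {targets}"
```

For instance, describe(IngestionConfig(name="db-load", keep_documents=False, networks=["Main"])) summarises a configuration that feeds a Network without retaining Documents.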

Default Ingestion Configuration

Every Collection has a default Ingestion configuration assigned.

When you Add Documents to a Collection, its default Ingestion configuration is selected automatically.

Changing the Default

If you select a different Ingestion configuration when loading data, the selected Ingestion configuration becomes the default.
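
The behaviour described here can be sketched as a small model. This is not Sintelix code: the Collection class and its add_documents method are hypothetical names, used only to show that the configuration chosen for a load replaces the Collection's default.

```python
class Collection:
    """Hypothetical stand-in for a Collection; not a Sintelix API."""

    def __init__(self, name, default_config):
        self.name = name
        self.default_config = default_config
        self.loads = []  # (document, configuration used) pairs

    def add_documents(self, documents, config=None):
        # With no explicit choice, the current default is used.
        chosen = config if config is not None else self.default_config
        for document in documents:
            self.loads.append((document, chosen))
        # The configuration used for this load becomes the new default.
        self.default_config = chosen
        return chosen

    def configs_used(self):
        # More than one entry here is the situation the Warning describes.
        return sorted({config for _document, config in self.loads})
```

In this model, a Collection whose configs_used() returns more than one name has had Documents added under different configurations.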

Warning

If you change the Ingestion configuration after Documents have already been added to a Collection, you may get inconsistent results, especially when the information is consolidated into a Network. This can happen, for example, if the different Ingestion configurations extract different information from the Documents, apply a different Ontology, or both.

Recommended Practice

If you need to use different Ingestion configurations, create a separate Collection for each configuration. These Collections can still populate a single Network.

Have a single Ontology for the Collections feeding into a Network, to ensure consistent results in Networks.

For example, you may have a separate Collection for each of the following, all feeding extracted information into a single Network:

  • structured data from a database

  • information harvested from the Internet

  • reports ingested in PDF format.

The Ontology assigned to a Network is the Ontology in the Ingestion configuration used to create the Network. Once a Network has been created, you cannot change its Ontology. To use a different Ontology, you would need to delete the Network and reprocess the Collections feeding it.
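
The single-Ontology recommendation can be checked with a short script if you keep your own record of which Collections feed which Network. The sketch below assumes such a record exists as plain tuples; the function name and the tuple layout are hypothetical and do not come from Sintelix.

```python
from collections import defaultdict


def ontology_conflicts(feeds):
    """Return Networks that are fed by more than one Ontology.

    feeds: iterable of (collection_name, network_name, ontology_name) tuples,
    an assumed hand-maintained record rather than anything exported by Sintelix.
    """
    ontologies_by_network = defaultdict(set)
    for _collection, network, ontology in feeds:
        ontologies_by_network[network].add(ontology)
    # Keep only Networks where the feeding Collections disagree on the Ontology.
    return {
        network: sorted(ontologies)
        for network, ontologies in ontologies_by_network.items()
        if len(ontologies) > 1
    }
```

An empty result means every Network in the record is fed by Collections that share one Ontology.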

If you added structured data to a Collection without retaining the Documents, there are no Documents to reprocess, so you will need to reload the structured data into the Network.

Linked Configurations

The Ingestion configuration also assigns other configurations to ingestion processing, including: