Skip to main content Link Search Menu Expand Document (external link)

Data Sources

A Data Source allows you to define where and how you are pulling data from a communication channel.

  1. Overview
  2. Data Sources List
    1. Generic Data Sources
    2. Email Data Sources
    3. Chat Data Sources
    4. Voice Data Sources
    5. Mobile Data Sources
    6. Collaboration Data Sources
    7. Archive Data Sources
    8. People / HR Data Sources
  3. Data Source Details
    1. Sections of a Data Source
    2. Data Source Auto-Disable
    3. Discovery of Monitored Individuals

Overview

A Data Source stores the configuration necessary to retrieve data from a communication channel, process that data, and ingest it into Relativity Trace.

Data Sources List

This list covers currently available Data Sources.

Generic Data Sources

If your specific data source is not found on this page, Trace has numerious capabilities to support your data using the below methods.

Type Data Source Notes Data Transfer Method
Generic EML Drop For deliverying daily EML exports from various systems Trace Shipper OR R1 SFTP
Generic RSMF Drop For deliverying chat-like data (social, mobile, etc…) Trace Shipper OR R1 SFTP
Generic Mailbox with 3rd party data For deliverying data from mobile (and other) providers such as TeleMessage who deliver their data to customer mailbox Email Data Sources
Generic Zip Drop For already processed structured data Trace Shipper OR R1 SFTP
Generic Generic Audio Data For audio data Trace Shipper OR R1 SFTP
Generic Monitored Individuals from any source using Trace ZipDrop For custom HR / People data Trace Shipper OR R1 SFTP

Email Data Sources

Type Data Source Data Collection Method Data Transfer Method
Email Microsoft O365 Email and Calendar Cloud-to-cloud Relativity Cloud Collect
Email Microsoft O365 Mail Archive Mailbox Cloud-to-cloud Relativity Cloud Collect
Email Google Mail Cloud-to-cloud Relativity Cloud Collect
Email Bloomberg Mail Cloud-to-cloud OR R1 SFTP Drop Relativity Cloud Collect
Email Microsoft Exchange Server On-premises software required - Relativity Collect On-Premises Trace Shipper

Chat Data Sources

Type Data Source Data Collection Method Data Transfer Method
Chat Bloomberg Chat and PChat Cloud-to-cloud OR R1 SFTP Drop Relativity Cloud Collect
Chat ICE Chat Cloud-to-cloud (via Relativity Collect) Relativity Cloud Collect
Chat Refinitiv Eikon Chat Cloud-to-cloud (via Relativity Collect) Relativity Cloud Collect
Chat Symphony On-premises software required - Merge1 Trace Shipper
Chat Skype for Business On-premises software required - Merge1 Trace Shipper
Chat Microsoft O365 Teams Chat Cloud-to-cloud Relativity Cloud Collect
Chat FXConnect On-premises software required - Merge1 Trace Shipper
Chat Cisco WebEx Teams Chat On-premises software required - Merge1 Trace Shipper
Chat ServiceNow On-premises software required - Merge1 Trace Shipper
Chat Google Chat Cloud-to-cloud Relativity Cloud Collect
Chat Salesforce Chatter On-premises software required - Merge1 Trace Shipper
Chat Slack Enterprise Chat Cloud-to-cloud Relativity Cloud Collect
Chat Microsoft Yammer On-premises software required - Merge1 Trace Shipper
Chat Facebook Workplace On-premises software required - Merge1 Trace Shipper
Chat YieldBroker On-premises software required - Merge1 Trace Shipper

Voice Data Sources

Type Data Source Data Collection Method Data Transfer Method
Voice Zoom Audio On-premises software required - Intelligent Voice Trace Shipper OR R1 SFTP
Voice Symphony Audio On-premises software required - Intelligent Voice Trace Shipper OR R1 SFTP
Voice WebEx Teams Audio On-premises software required - Intelligent Voice Trace Shipper OR R1 SFTP
Voice Vodafone On-premises software required - Intelligent Voice Trace Shipper OR R1 SFTP
Voice Avaya On-premises software required - Intelligent Voice Trace Shipper OR R1 SFTP
Voice Cloud 9 On-premises software required - Intelligent Voice Trace Shipper OR R1 SFTP
Voice Verba/Verint On-premises software required - Intelligent Voice Trace Shipper OR R1 SFTP
Voice Mitel On-premises software required - Intelligent Voice Trace Shipper OR R1 SFTP
Voice Liquid Voice On-premises software required - Intelligent Voice Trace Shipper OR R1 SFTP
Voice O2 On-premises software required - Intelligent Voice Trace Shipper OR R1 SFTP
Voice Microsoft Teams Audio On-premises software required - Intelligent Voice Trace Shipper OR R1 SFTP
Voice Skype for Business Audio On-premises software required - Intelligent Voice Trace Shipper OR R1 SFTP

Mobile Data Sources

Trace supports picking up mobile data from customer’s mailbox. In other words, mobile data is delivered to customer’s mailbox and then picked up by Trace from the mailbox.

Type Data Source Capture Methods Data Collection Method Data Transfer Method Notes License
Mobile WhatsApp via Native WhatsApp; via MS Teams; via Slak; via Leap Work (iOS and Android) LeapXpert delivers to customer mailbox Email Data Sources   Additional license with LeapXpert is required
Mobile WeChat via MS Teams; via Leap Work (iOS and Android) LeapXpert delivers to customer mailbox Email Data Sources   Additional license with LeapXpert is required
Mobile WeChat miniapp via Leap Work (iOS and Android) LeapXpert delivers to customer mailbox Email Data Sources   Additional license with LeapXpert is required
Mobile SMS/MMS via native SMS/MMS app on the phone; via MS Teams; via Leap Work (iOS and Android) LeapXpert delivers to customer mailbox Email Data Sources   Additional license with LeapXpert is required
Mobile iMessage via native iMessage app on iphone LeapXpert delivers to customer mailbox Email Data Sources iMessage capture requires user Apple ID and synchronizing the data from iCloud Additional license with LeapXpert is required
Mobile Telegram via native Telegram app LeapXpert delivers to customer mailbox Email Data Sources   Additional license with LeapXpert is required
Mobile Signal via Leap Work (iOS and Android) LeapXpert delivers to customer mailbox Email Data Sources   Additional license with LeapXpert is required
Mobile WeCom via native WeCom app LeapXpert delivers to customer mailbox Email Data Sources   Additional license with LeapXpert is required
Mobile LINE via Leap Work (iOS and Android) LeapXpert delivers to customer mailbox Email Data Sources   Additional license with LeapXpert is required

Collaboration Data Sources

Type Data Source Data Collection Method Data Transfer Method
Collaboration OneDrive for Business On-premises software required - Merge1 Trace Shipper
Collaboration SharePoint On-premises software required - Merge1 Trace Shipper
Collaboration Google Drive Cloud-to-cloud Relativity Cloud Collect
Collaboration Box On-premises software required - Merge1 Trace Shipper
Collaboration AWS S3 On-premises software required - Merge1 Trace Shipper
Collaboration Dropbox On-premises software required - Merge1 Trace Shipper

Archive Data Sources

Type Data Source Notes Data Collection Method Data Transfer Method
Archive Proofpoint   On-premises software required - Relativity Collect On-Premises Trace Shipper
Archive Enterprise Vault   On-premises software required - Relativity Collect On-Premises Trace Shipper
Archive Smarsh Data must be recieved via Scheduled Export configured in Smarsh N / A R1 SFTP

People / HR Data Sources

In Trace People / HR data is refered to as Monitored Individuals. A Monitored Individual is a person within the organization whose communications are being analyzed for misconduct.

Monitored Individuals are used as a unit of billing by Relativity Trace. Generally a Relativity Trace license will specify a number of Monitored Individuals available and the number of data sources they can be used on.

Type Data Source Data Collection Method Data Transfer Method
Monitored Individual Microsoft Azure Active Directory Cloud-to-cloud Relativity Cloud Collect
Monitored Individual Microsoft Active Directory On-premises software required - Relativity Collect On-Premises Trace Shipper OR R1 SFTP

Data Source Details

Sections of a Data Source

  1. General: this tab houses general identifying information and status for the data source. These fields are described in further detail below.

    • Data Source Type: Type of the data source
    • Name: The name of the Data Source
    • Document Type Name: A non-required name that will propagate to the Trace Type field on the documents that come in through this Data Source
      • If this field is left empty, the name of the Data Source will be used instead
    • Provider Type: The type fo communications that are being collected (Audio, Written, etc.)
    • Alternative Monitored Individual Identifier Field: Field which can be used to have different Monitored Individual identifier used when retrieving data from different data sources
    • Ingestion Profile: Ingestion Profile used to load data from this Data Source
    • Start Date: Date from which data will be pulled/pushed into Relativity
    • End Date: Optional date to which data will be pulled/pushed into Relativity.

      • If both dates are present, data between Start Date and End Date will be collected. If Ingestion State is later than Start Date, then only data between Ingestion State and End Date will be collected.
      • If Start Date is present but End Date is empty, data between Start Date and Now will be collected. If Ingestion State is later than Start Date, then only data between Ingestion State and Now will be collected.
      • If Start Date is empty but End Date is present, data between Ingestion State and End Date will be collected.
      • If neither Start Date nor End Date is present, data between Ingestion State and Now will be collected.
      • See Data Retrieval for more details.
    • Last Runtime (UTC): The timestamp when this Data Source was last executed
    • Enabled Time: The timestamp when this Data Source was last enabled
    • Disabled Reason: An explanation for why a data source was automatically disabled by the system
    • Status: The last status message recorded by the Data Source
    • Last Error Date: Timestamp of the last time this Data Source failed, if it happened recently (based on Last Error Retention in Hours setting under Data Source Specific Fields)
    • Last Error: Error message from the last time this Data Source failed, if it happened recently (based on Last Error Retention in Hours setting under Data Source Specific Fields)
  2. Settings: Configures standard settings required for the specific Data Source Type. These settings can be found on specific data source documentation pages.

  3. Trace Monitored Individuals: Configures which monitored individual’s data should be retrieved from the data source. See Monitored Individuals for more information.

  4. Data Transformations: Determines which data transformations to apply to documents prior to ingestion into Relativity by this data source. See Data Transformations for more information.

  5. Data Batches: The data batches which have been generated by this data source. See Data Batches for more information.

  6. Advanced Configuration: Different data source types have different configuration options. This section updates dynamically to allow access to these configuration options. See Advanced Configuration and the documentation of your specific Data Source Type for more information.

  7. Console

    • Enable/Disable Data Source: Enables (or disables) data retrieval for a particular data source.
    • Reset Data Source: Disables and resets data source to retrieve data from the specified Start Date.

      Depending on Import settings, enabling a reset Data Source could duplicate data in the Workspace.

  8. Advanced Configuration

This section contains additional settings which are not associated with specific Relativity Fields. The settings described here are common across all Data Source Types. Type-specific settings are documented under their respected Data Source sections.

  • Aip Application Id: This parameter is deprecated. Leave it empty.
  • Aip Tenant Id: This parameter is deprecated. Leave it empty.
  • Frequency In Minutes: Number of minutes worth of data to pull for each attempted data pull via Data Retrieval Task.
  • Merge Batches During Cold Start: When set to True it will merge initial Data Batches into ont, big Data Batch.
  • Max Number Of Batches To Merge: Input Value to control number of hours collected per Data Batch created, dependent on Frequency in Minutes value. This parametr is used when Merge Batches During Cold Start is set to True.

See Data Retrieval for more details about Frequency In Minutes, Merge Batches During Cold Start and Max Number Of Batches To Merge.

  • Collect Job Timeout In Minutes: 1440 (default) – Time interval after which a Data Batch will be automatically moved from Retrieving to Abandoned state.
  • Collection Period Offset In Minutes: 0 (default) – Modify Collection Period by adding offset in minutes to both Start and End Date. This parameter is used to collect data that are available to be retrieved with some delay e.g. 24 hours.
  • Only Retrieve Natives And Copy To Folder: Relative path to the fileshare folder (e.g. BloombergMailDrop\Drop). This parameter is used to improve performance of ingesting big portion of data e.g. 300,000 files (typical scenario for Bloombertg Chat and Mail). When the parameter is set, the Data Source will only retrieve files and store those files in the given folder. Neither enrichmnet nor ingestion will be performed on those files. To complete enrichment and ingestion, another Data Source (Globanet type of Data Source) needs to be created and configured to point to the folder. Eventually, there will be two Data Sources:
    • Retrieving - For retrieving files via Collect.
    • Processing - for enriching and ingestion those files in smaller chunks (1000 files each).
  • Password Bank Used to specify known passwords to attempt while encountering protected native files. Multiple passwords can be separated by the pipe character, |. Passwords containing the pipe character are supported through escaping the pipe character with a second pipe. Pipes are always escaped left to right.

    Example Password Bank: passw0rd|Trace1234!|aaa|bb|cccc||dd||eee|||ff|||ggg||||hhh||||| Yields the following passwords: - passw0rd - Trace1234! - aaa - bb - cccc|dd|eee| - ff| - ggg||hhh||

  • Extraction Thread Count: The number of documents to extract in parallel.
  • Enrich Documents: Whether or not to extract metadata and children from original documents. Valid values: true or false
  • Embedded File Behavior: Embedded files are defined as attachments without file names. Most commonly these are in-line images. This setting changes the import behavior for embedded files. Valid options are:
    • Import - Import all embedded files (top level and child) as separate documents in Relativity Trace.
    • DoNotImportFromAttachments - Import embedded files from top level documents only. Do not extract embedded files from child documents.
    • DoNotImport - Do not import any embedded files.

      Both the Import and DoNotImportFromAttachments settings will greatly increase document volumes in Relativity Trace.

  • Discover Monitored Individuals: See Discovery of Monitored Individuals
  • Include Monitored Individuals Not Linked to Data Source: See Discovery of Monitored Individuals
  • Discover Monitored Individuals Ignores Case: See Discovery of Monitored Individuals
  • Last Error Retention In Hours: The length of time to persist any message in the Last Error field.
  • Health Check Failure Window Length in Minutes: See [Data Source Auto-Disable] (#data-source-auto-disable)

  • Ingestion State: Timestamp of last Data Source execution.

    This parameter is only visible on **Data Source Layout (dev)** Layout. {: .info}
    
  • Retry Policy: Data Batch Automatic Custom Retry Policy. Defined as intervals in minutes between each Data Batch retry. It overwrites Instance Settings Retry Policy. If empty - Instance Settings Retry Policy will be used. Example values: [720,720,360,180,180] or []

Data Source Auto-Disable

Trace will automatically disable data sources that are identified as unhealthy or have critical configuration errors that will require intervention by the user. Trace will automatically disable a data source for the following reasons:

  • Data source has not had any successful data batches in the number of minutes configured on the Health Check Failure Window Length in Minutes field (if not set, default is 24 hours)
  • Globanet data source is enabled without enabling Globanet (Merge1) at the workspace level

Auto-disabled data sources will have their Disabled Reason field populated to show that it was disabled by the system. The data source will also have error details outlining the failures that caused the system to disable it.

Discovery of Monitored Individuals

Some Data Sources combine data from several places into a single import flow. In that scenario, it may not be clear which Monitored Individual is the source of a given document and no Monitored Individual will be tagged. To address this issue, Trace has introduced the Discover Monitored Individuals option on every Data Source. If enabled, Trace will look inside of the document and tag Monitored Individuals defined on the Data Source if they are found in headers inside the document. Monitored Individuals are recognized by identifier and all secondary identifiers.

There is also the option to discover Monitored Individuals that are not linked to the Data Source with the setting Include Monitored Individuals Not Linked To Data Source. If Discover Monitored Individuals is false, this setting will take no action. If Discover Monitored Individuals is true and Include Monitored Individuals Not Linked To Data Source is false, this setting will take no action and it will only discover Monitored Individuals that are linked to that Data Source. If Discover Monitored Individuals is true and Include Monitored Individuals Not Linked To Data Source is true, it will use all of the Monitored Individuals in the workspace to tag documents.

By default, Monitored Individual discovery ignores case in the domain portion of the email address but not the name portion. For example, John.DOE@URL.COM will match John.DOE@url.com, but not john.doe@url.com.

To ignore case in the entire email address during Monitored Individual discovery, use the Discover Monitored Individuals Ignores Case setting. For example, John.DOE@URL.COM will match always John.DOE@url.com, but only match john.doe@url.com if Discover Monitored Individuals Ignores Case is set to true.

Monitored Individual Discovery On Merge1 Data Sources

Merge1’s EWS Data Source only looks for Monitored Individuals in the X-UserMailbox header of an email. This header is provided by Merge1 and typically contains exactly one Monitored Individual.

Monitored Individual Discovery On Other Data Sources

All other data sources discover Monitored Individuals based on the FROM, TO, CC, and BCC headers. Any Monitored Individual on the Data Source with an identifier (primary or secondary) contained in any of these headers will be associated with the document.

Supported File Formats

Discovery of monitored individuals is based on finding the email addresses of monitored individuals in the headers of an email file. Therefore, it will only work properly on .eml, .msg, and .rsmf (Relativity Short Message Format) files. Any other file format is not currently supported.

Usage of Alternative Monitored Individual Identifier Field

Alternative Monitored Individual Identifier field give possibility to choose Fixed-Length Text field from Monitored Individual which we indicate to be main identifier to retrieve data from source. After retrieve, Monitored Individuals are always link to proper Identifier field.