Message Crawler - Data Tools

Message Crawler is not just a conversion tool; it is an application that lets users analyze and modify data so that it conforms to required standards. It includes a number of tools that perform various data manipulations.

About Attachment Identifier

Most tools require a field that identifies attachments. A typical data set has a Message Type field that contains values such as Message or Attachment. When running a tool that requires this field, select the appropriate field name and the value that marks an attachment. This allows attachments to be processed together with their main message.


Deduplicate Data

Data Tools-Dedup.jpg

If your collection contains duplicate messages, you can use the data deduplication tool to remove them. MS Teams is one data type that often has duplicates. To use the tool, select the fields that, when identical between two messages, indicate that the entire message is a duplicate. It is recommended to select multiple fields, for example: From, To, Time Stamp, Body of Message. It is also a good idea to let the tool tag duplicates rather than remove them. Once duplicates are tagged, you can review the results and use the Filter menu to exclude tagged items. Removal is not recommended because a mistake is hard to notice once the items are gone.
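
The tagging approach can be illustrated with a minimal Python sketch. This is not Message Crawler's actual implementation; the function name `tag_duplicates`, the `Duplicate` tag field, and the sample records are assumptions made for illustration only.

```python
def tag_duplicates(messages, key_fields, tag_field="Duplicate"):
    """Tag each message whose selected fields match an earlier message.

    Hypothetical helper: the first occurrence is tagged False, every
    later occurrence with the same field values is tagged True.
    """
    seen = set()
    for msg in messages:
        # The tuple of selected field values acts as the duplicate key.
        key = tuple(str(msg.get(field, "")).strip() for field in key_fields)
        msg[tag_field] = key in seen
        seen.add(key)
    return messages


# Illustrative sample data (not from a real collection).
messages = [
    {"From": "Ann", "To": "Bob", "Time Stamp": "2023-01-05 10:15:00", "Body": "Hi"},
    {"From": "Ann", "To": "Bob", "Time Stamp": "2023-01-05 10:15:00", "Body": "Hi"},
    {"From": "Bob", "To": "Ann", "Time Stamp": "2023-01-05 10:16:00", "Body": "Hello"},
]

tagged = tag_duplicates(messages, ["From", "To", "Time Stamp", "Body"])

# Filtering on the tag keeps the originals available for review.
kept = [m for m in tagged if not m["Duplicate"]]
```

Because the duplicates are only tagged, all three records remain in `tagged` and a wrong field selection can still be corrected by re-running with different fields.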


Generate Conversation

Data Tools -Conversation.jpg

The Generate Conversation tool helps group messages based on criteria you specify. It generates a hash value from the fields you select; all messages with the same hash belong to the same conversation. There are 3 configurable areas where you can select data for hash generation. You do NOT need to specify all 3.

#1 Static Field: This is a field whose value is static, such as a room number or channel name.

#2 Names: A unique list of names is generated and sorted alphabetically. Be sure to specify the correct names delimiter.

#3 Date: The date field can be formatted to the required value before the hash value is generated. Altering the formatting mask allows you to group messages by any part of the date field: date, month or year.

Text Cleanup: You can use a regex to clean up the text string before the hash value is generated.

Return text string instead of hash: Normally a hash value is generated and saved to a field. Some users prefer to see a readable string, either for quality control or for aesthetic reasons. In that case, select this check box.
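
The options above can be sketched in Python as follows. This is only an illustration of the idea, not the tool's actual algorithm: the function `conversation_key`, its parameters, the `|` separator, the input date format, and the choice of SHA-256 are all assumptions.

```python
import hashlib
import re
from datetime import datetime


def conversation_key(msg, static_field=None, name_fields=(), names_delimiter=";",
                     date_field=None, date_mask="%Y-%m-%d",
                     cleanup_regex=None, return_text=False):
    """Build a conversation key from up to three optional areas (hypothetical)."""
    parts = []
    if static_field:
        # 1. Static field, e.g. a room number or channel name.
        parts.append(str(msg.get(static_field, "")).strip())
    if name_fields:
        # 2. Unique names from all selected fields, sorted alphabetically.
        names = set()
        for field in name_fields:
            for name in str(msg.get(field, "")).split(names_delimiter):
                if name.strip():
                    names.add(name.strip())
        parts.append(names_delimiter.join(sorted(names)))
    if date_field:
        # 3. Date reduced by a formatting mask (assumed input format below).
        parsed = datetime.strptime(msg[date_field], "%Y-%m-%d %H:%M:%S")
        parts.append(parsed.strftime(date_mask))
    text = "|".join(parts)
    if cleanup_regex:
        # Optional text cleanup before hashing.
        text = re.sub(cleanup_regex, "", text)
    if return_text:
        return text
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Two messages between the same people on the same day (illustrative data).
a = {"From": "Bob", "To": "Ann; Cid", "Sent": "2023-01-05 10:15:00"}
b = {"From": "Ann", "To": "Cid;Bob", "Sent": "2023-01-05 18:40:00"}

options = dict(name_fields=("From", "To"), date_field="Sent")
readable = conversation_key(a, return_text=True, **options)
same_conversation = conversation_key(a, **options) == conversation_key(b, **options)
```

Sorting the names makes the key independent of who sent the message, and a mask of `%Y-%m` or `%Y` instead of `%Y-%m-%d` would group by month or year.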


Date Format

The Date Format tool formats a date/time field. You can create a sort date by selecting Sortable Date. This produces a date field that, when sorted as text, puts documents in chronological order. You can also round the time to the nearest increment. This is useful if you have text messages that are duplicates but whose timestamps are 1-2 seconds apart; run this tool before running deduplication.

Data Tools-DateFormat.jpg
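
As a rough illustration of both options, the Python sketch below rounds timestamps to a fixed increment and renders them as a sortable string. The `yyyyMMddHHmmss`-style layout, the 10-second increment, and the helper names are assumptions; the manual does not specify the exact output format Message Crawler uses.

```python
from datetime import datetime, timedelta


def round_to_increment(moment, seconds):
    """Round a datetime to the nearest multiple of `seconds` since midnight."""
    midnight = moment.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed = (moment - midnight).total_seconds()
    rounded = int(elapsed / seconds + 0.5) * seconds
    return midnight + timedelta(seconds=rounded)


def sortable_date(moment):
    """Format so that plain text sorting matches chronological order."""
    return moment.strftime("%Y%m%d%H%M%S")


# Two copies of the same text message, captured 2 seconds apart.
first = datetime(2023, 1, 5, 10, 15, 29)
second = datetime(2023, 1, 5, 10, 15, 31)

first_key = sortable_date(round_to_increment(first, 10))
second_key = sortable_date(round_to_increment(second, 10))
```

After rounding, both copies carry the same timestamp and can be matched by deduplication. Two timestamps that fall on opposite sides of a rounding boundary (for example :34 and :36 with a 10-second increment) would still differ, so the results are worth reviewing.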

Time Zone Convert

The Time Zone Convert tool takes a date and time field and converts it from one time zone to another. Take care not to convert the date and time twice: if you convert dates to the required time zone in Message Crawler, use UTC as the processing time zone in Relativity.

Data Tools-TimeZone.jpg
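
The risk of converting twice can be shown with a short Python sketch. It uses the standard `zoneinfo` module (Python 3.9 or later, which needs time zone data available on the system); the function name `convert_time_zone`, the date format, and the sample zones are assumptions for illustration.

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def convert_time_zone(text, source_zone, target_zone, fmt="%Y-%m-%d %H:%M:%S"):
    """Interpret `text` in source_zone and return it expressed in target_zone."""
    naive = datetime.strptime(text, fmt)
    aware = naive.replace(tzinfo=ZoneInfo(source_zone))
    return aware.astimezone(ZoneInfo(target_zone)).strftime(fmt)


# A message sent at 15:00 UTC, converted once to New York time.
converted_once = convert_time_zone(
    "2023-01-05 15:00:00", "UTC", "America/New_York"
)

# If a later step treats the converted value as UTC and converts again,
# the time is shifted a second time and is now wrong.
converted_twice = convert_time_zone(
    converted_once, "UTC", "America/New_York"
)
```

Here `converted_once` is the intended local time, while `converted_twice` is five hours too early, which is the kind of double shift that can occur when a downstream platform applies its own conversion to already-converted values.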

Name Normalization

The Name Normalization tool is used to reconcile names in your dataset. The same person may be referred to by different names, which is common if you collected data from multiple phones. Select all fields that contain names, specify the names delimiter, and click the Create Names List button. Every unique name in your dataset is then listed in a table. To change a name, enter the new value in the column next to it and click the Replace Names button. New columns are created, allowing you to QC the results of your work.

Data Tools-NameNormalization.jpg
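
The two steps, building the names list and replacing names into new columns, might look like the following Python sketch. The helper names, the ` (Normalized)` column suffix, and the sample records are hypothetical and do not reflect how Message Crawler names its output columns.

```python
def create_names_list(messages, name_fields, delimiter=";"):
    """Collect every unique name found in the selected fields."""
    names = set()
    for msg in messages:
        for field in name_fields:
            for name in str(msg.get(field, "")).split(delimiter):
                if name.strip():
                    names.add(name.strip())
    return sorted(names)


def replace_names(messages, name_fields, replacements,
                  delimiter=";", suffix=" (Normalized)"):
    """Write normalized names to new columns, leaving originals intact."""
    for msg in messages:
        for field in name_fields:
            originals = [n.strip() for n in str(msg.get(field, "")).split(delimiter)
                         if n.strip()]
            # Names without a replacement are carried over unchanged.
            msg[field + suffix] = delimiter.join(
                replacements.get(n, n) for n in originals
            )
    return messages


# Illustrative data: the same people appear under different names.
messages = [
    {"From": "J. Smith", "To": "Ann Lee;+1 555 0100"},
    {"From": "John Smith", "To": "Ann Lee"},
]

names = create_names_list(messages, ["From", "To"])
replacements = {"J. Smith": "John Smith", "+1 555 0100": "Bob Ray"}
normalized = replace_names(messages, ["From", "To"], replacements)
```

Writing the result to separate columns keeps the original values next to the normalized ones, so each replacement can be checked during QC.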