More Like This / Similar Files

Nx offers a "find more like this" feature that discovers files with similar content and record types. The feature supports Dark Data Analysis through Nx's integrated Artificial Intelligence (AI) engines. Found under the Data Inventory table within the Data Catalog Dashboard, More Like This identifies files whose attributes and context are similar to those of the files you select.

Once you have verified that the files are similar, you can classify them directly from the dashboard. The feature is also useful for quality control checks on classifications that have already been applied.

1. From the Data Catalog Dashboard, apply queries and filters to generate the specific group of files for content ingestion and analysis. Refer to the Dashboard Filters and Queries section for details.

2. Scroll down to the Data Inventory table, select the files for which you want to identify duplicates or similar files, and click More Like This.


3. Select the Data Sources in which you want to find similar documents and click Preview. In this example, we selected just one reference file and used the default settings, which produced the results below:


  • Filename lists each file that matches the reference file. In the example above, the upper right corner indicates 17 Records Found that are similar to the file named “TV Yearend Accrual FYE14 Final.xlsx”.
  • Content provides a preview of the contents of each file. The initial text shown for each file is another indicator of the similarity between the files. With the proper permissions, selecting See more… renders a full view of the file’s contents for deeper inspection and verification.
  • Relevance Score is the similarity measure, calculated automatically by the AI engine, between the reference file and each file in this list.
  • Classification Label is the contextual classification previously applied to each file. In this example all of the labels are exactly the same, which suggests that the files were labeled correctly. If there were a deviation among the labels, you could select the checkbox just left of the Filename column for each file that needs to be reclassified and then follow the procedure in Assign Classification Labels.

4. You can add more columns to this view by clicking the Edit Columns button.

Select the checkboxes for the fields you want to add to the view.


5. You can classify files from this view. Select the files by clicking the checkbox next to each file name, click the Classify button, choose the classification from the dropdown, and click Apply. To remove a classification, select the files and click the Reset Classification button.

6. You can also assign tags to files from this view. Select the files by clicking the checkbox next to each file name, click the Tag button, choose the tag from the list, and click Apply. To remove tags from files that are already tagged, select the files and click the Reset Tag button.
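
For readers who want an intuition for the Relevance Score and Classification Label columns described in step 3, the following is a minimal sketch of how a similarity ranking and a label consistency check could work. It is not the Nx implementation: this guide does not describe how the AI engine computes Relevance Score, so cosine similarity over word counts is used here purely as an assumed stand-in. The function names (`more_like_this`, `label_deviations`) and the input shapes are hypothetical and are not part of any Nx API.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def more_like_this(reference_text, candidates):
    """Rank candidate files by similarity to the reference, highest first.

    `candidates` maps a filename to its extracted text (hypothetical input;
    an assumed stand-in for whatever the real engine indexes).
    """
    ref_vec = Counter(tokenize(reference_text))
    scored = [
        (name, cosine_similarity(ref_vec, Counter(tokenize(text))))
        for name, text in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def label_deviations(labels):
    """Return filenames whose label differs from the most common label.

    `labels` maps a filename to its classification label (hypothetical input).
    """
    if not labels:
        return []
    majority, _ = Counter(labels.values()).most_common(1)[0]
    return sorted(name for name, label in labels.items() if label != majority)
```

In this sketch, a file that shares most of its words with the reference scores close to 1 and a file with no words in common scores 0. Files returned by `label_deviations` correspond to the rows you would select for reclassification in the dashboard.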