Duplicates Finder Plugin


The Diskover Duplicates Finder Plugin (aka dupes finder) post-processes an index to check for duplicates across all file systems or a subset thereof.

This plugin is included with the following Diskover editions:

  • Community Edition
  • Essential
  • Professional
  • Enterprise
  • AJA Diskover Media Edition
  • Life Science

Visit our Solutions page for more details about our different editions.

Category: Data integrity

Function: Post-index results are findable, searchable, reportable, and actionable

Overview

The Diskover Duplicates Finder Plugin, aka dupes finder, post-processes indices to check for duplicates across all file systems or subsets thereof. The plugin supports xxhash, md5, sha1, and sha256 hash values for comparing files, allowing for precise validation.
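
As an illustration of hash-based comparison in general, and not the plugin's actual implementation, the sketch below hashes files in chunks and groups paths that share a digest. The function names `file_digest` and `find_duplicates` are hypothetical. Only algorithms available in Python's standard `hashlib` are covered (md5, sha1, sha256); xxhash would need a third-party package and is left out here.

```python
import hashlib
from collections import defaultdict


def file_digest(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a file in chunks so large files are never loaded whole."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicates(paths, algorithm="sha256"):
    """Return {digest: [paths]} for every digest shared by two or more files."""
    by_digest = defaultdict(list)
    for path in paths:
        by_digest[file_digest(path, algorithm)].append(str(path))
    # Keep only digests that more than one file produced.
    return {d: sorted(p) for d, p in by_digest.items() if len(p) > 1}
```

Two files are reported together only when their full contents produce the same digest, which is why a stronger hash such as sha256 gives more confidence than a fast one at the cost of more CPU time.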

The plugin is designed for multiple use cases:

  • To check for duplicate files across a single file system or all file systems (single or multiple indices) and mark the file docs that are dupes in the index or indices.
  • To calculate file checksums/hashes for duplicate files only or for all files, and store the hashes in the file docs in the index.
  • To customize reports that locate all duplicate files for clean-up efforts and/or to secure sensitive information.
  • To feed the results into workflows for automated cleanup.

Calculating file hash checksums is expensive in both CPU and disk I/O. The dupes finder provides configuration options to control which files in the index get a hash calculated and are marked as dupes. The harvested hash values can be stored for duplicates only or for all files.
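
One common way to limit hashing cost, shown here purely as an illustration and not as the plugin's own logic or configuration, is to hash only files whose size falls within set bounds and is shared by at least one other file. The function `hash_candidates`, the parameters `min_size` and `max_size`, and the `(path, size)` record shape are all hypothetical.

```python
from collections import defaultdict


def hash_candidates(records, min_size=1, max_size=None):
    """Pick the paths worth hashing from (path, size_in_bytes) records.

    A file can only have a duplicate if another file has the same size,
    so files with a unique size are skipped without ever being read.
    """
    by_size = defaultdict(list)
    for path, size in records:
        if size < min_size:
            continue  # too small to be worth hashing
        if max_size is not None and size > max_size:
            continue  # too large for the configured budget
        by_size[size].append(path)
    return sorted(p for group in by_size.values() if len(group) > 1 for p in group)
```

Because file sizes are already known from the index, a filter like this costs no extra disk reads, and only the surviving candidates need a checksum.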

The dupes finder also provides further optimization mechanisms, as described in the Diskover Configuration and Administration Guide.

Extra Fields Indexed by Diskover

Diskover can currently index four different hash values and adds an extra field for duplicates found. This extra metadata is viewable in the file attributes window and is also searchable, reportable, and actionable.

             MD5        SHA1        SHA256        XXHASH        DUPES
Field name   hash.md5   hash.sha1   hash.sha256   hash.xxhash   is_dupe
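
To show how these fields could drive a duplicates report, the sketch below groups file docs by hash value. It assumes each doc is a plain dictionary in which the dotted field `hash.md5` appears as a nested `hash` object and `is_dupe` holds a truthy value for duplicates. The `path` key and the function `duplicate_report` are hypothetical, and the real document layout may differ.

```python
from collections import defaultdict


def duplicate_report(docs, hash_type="md5"):
    """Group docs flagged is_dupe by the chosen hash value."""
    groups = defaultdict(list)
    for doc in docs:
        digest = doc.get("hash", {}).get(hash_type)
        # Skip docs that are not flagged or lack the requested hash.
        if doc.get("is_dupe") and digest:
            groups[digest].append(doc["path"])
    return {digest: sorted(paths) for digest, paths in groups.items()}
```

Each key in the result is one hash value and each value is the list of paths sharing it, which is the shape a clean-up report or an automated workflow would typically consume.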

Duplicates View Via the Search Page

This example shows identical files found by the Dupes Finder Plugin, as displayed in the Diskover search page.

Duplicates View Via the File Attributes

In this example, two file attributes windows are displayed side by side, showing that the hash values are the same and that the files are therefore flagged as duplicates. At the bottom of the window, you can also see the value 1 for the number of duplicates found.


Technical Documentation

End User Documentation

Not available – this plugin is meant for technical users

Not available at the moment.

Available here.
