Duplicate Invoice Detection

When managing Vendor Invoices, it's essential to confirm that each Vendor Invoice Gathered into Discovery, along with any data Extracted from it, is unique. Discovery handles duplicate Vendor Invoices in different ways depending on which plan features you use. Duplicate detection is built into both the Gather and Extract phases of Discovery. We'll briefly address each below.


Gather

The primary objective of the Gather feature is to aggregate Vendor Invoice documents into Discovery from one or more sources. While there are several methods of Gathering Vendor Invoices, the method used to detect duplicates is the same for all of them.

No matter the type, every digital file is made up of digital information (0s and 1s) arranged in a sequence unique to that file. If another file has exactly the same underlying digital information, it can be treated as a duplicate. With this in mind, Discovery computes an MD5 hash of each uploaded Vendor Invoice file, which acts as a digital signature of the file's exact contents. If two files have the same MD5 signature, the second Vendor Invoice is not allowed to be Gathered into Discovery.
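The check described above can be illustrated with a minimal Python sketch. It is not Discovery's actual implementation: the `GatherStore` class, its `gather` method, and the file names are hypothetical, and the sketch assumes signatures are simply kept in memory.

```python
import hashlib


def md5_signature(data: bytes) -> str:
    """Return the MD5 hex digest of a file's raw bytes."""
    return hashlib.md5(data).hexdigest()


class GatherStore:
    """Hypothetical in-memory store that rejects byte-identical files."""

    def __init__(self) -> None:
        # Maps MD5 signature -> name of the first file seen with it.
        self._seen = {}

    def gather(self, filename: str, data: bytes) -> bool:
        """Return True if the file is accepted, False if it is a duplicate."""
        signature = md5_signature(data)
        if signature in self._seen:
            return False  # same bytes were already Gathered
        self._seen[signature] = filename
        return True


store = GatherStore()
contents = b"%PDF-1.7 example invoice bytes"  # stand-in for real file contents
first = store.gather("invoice-march.pdf", contents)
second = store.gather("invoice-march-copy.pdf", contents)
```

Here `first` is accepted and `second` is rejected, even though the file names differ, because only the file contents feed the signature.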

If a duplicate Vendor Invoice is presented to Discovery by way of a Batch Import, you will see a prompt like the one below:

You can retry the upload, but if the file is unaltered it will again be detected as a duplicate and will not be Gathered into Discovery.

Because files can differ in ways that are imperceptible to humans, it is possible for two Vendor Invoice files that are technically unique, but represent the same invoice, to both be Gathered into Discovery.

For example, suppose you Gathered an original Vendor-generated Invoice PDF into Discovery on March 1st. On March 3rd, someone took a photograph of that same PDF and attempted to Gather the image into Discovery. These two documents would pass the MD5 check, because each has unique underlying digital information: they were generated at different times using different technology. Both would be permitted to be Gathered into Discovery even though they're the same Vendor Invoice from a working perspective. Duplicates that pass the MD5 check in this way are rare, but they are possible.
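To make this limitation concrete, the sketch below hashes two made-up byte strings standing in for the original PDF and the later photograph. The byte contents are invented for illustration and are not real invoice files.

```python
import hashlib

# Stand-ins for two files that show the same invoice to a human reader.
original_pdf = b"%PDF-1.7 invoice 1001 total 250.00"
photo_of_pdf = b"\xff\xd8\xff\xe0 JPEG photo of invoice 1001 total 250.00"

sig_pdf = hashlib.md5(original_pdf).hexdigest()
sig_photo = hashlib.md5(photo_of_pdf).hexdigest()

# The signatures differ because the underlying bytes differ, so a
# byte-level check treats both files as unique.
passes_gather_check = sig_pdf != sig_photo
```

Because the hash depends only on the bytes, any difference in how a file was produced yields a different signature, whatever the invoice looks like on screen.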

We ask that the duplicate detection employed during the Gather phase not be considered infallible, but rather an additional deterrent alongside human efforts.


Extract

Duplicate detection is also performed during the Extract phase of Vendor Invoice handling. Before an Extracted Vendor Invoice can be processed, it is checked for uniqueness using the combination of the Vendor Account number and the invoice number. If a duplicate Vendor Invoice makes it past the duplicate check performed during Gather, these two identifying numbers should flag the second as a duplicate. This check occurs after the Vendor Account number and invoice number are detected, or after they are provided when editing the Extracted Vendor Invoice.
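A check on this pair of numbers might look like the following sketch. The `ExtractIndex` class, the invoice IDs, and the normalization (trimming whitespace and ignoring case) are assumptions made for illustration; the article does not say how Discovery compares the values.

```python
from typing import Dict, Optional, Tuple


def invoice_key(vendor_account_number: str, invoice_number: str) -> Tuple[str, str]:
    """Build the composite key. Trimming and upper-casing are assumptions."""
    return (vendor_account_number.strip().upper(), invoice_number.strip().upper())


class ExtractIndex:
    """Hypothetical index of Extracted invoices keyed by (account, invoice)."""

    def __init__(self) -> None:
        self._by_key: Dict[Tuple[str, str], str] = {}

    def check(self, invoice_id: str, vendor_account_number: str,
              invoice_number: str) -> Optional[str]:
        """Return the ID of the original invoice if this one looks like a
        duplicate, otherwise record it and return None."""
        key = invoice_key(vendor_account_number, invoice_number)
        original = self._by_key.get(key)
        if original is not None and original != invoice_id:
            return original  # possible duplicate: point at the original
        self._by_key.setdefault(key, invoice_id)
        return None


index = ExtractIndex()
first_result = index.check("inv-1", "ACCT-77", "1001")
second_result = index.check("inv-2", "acct-77 ", "1001")
third_result = index.check("inv-3", "ACCT-77", "1002")
```

In this sketch `second_result` is `"inv-1"`, the original invoice, while `third_result` is `None` because the invoice number differs even though the account matches.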

When a possible duplicate Vendor Invoice is detected, you'll be notified with a yellow banner citing the original Vendor Invoice.

After review, you may mark the Vendor Invoice as a duplicate of a Vendor Invoice that has already been Extracted. This prevents the duplicate from being processed further, and its data will not be included in data exports. If you do not mark the Vendor Invoice as a duplicate, you will be able to process it through to Audit, and any data attached to it will appear in data exports.
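The effect of marking a duplicate on data exports can be pictured with a short sketch. The `ExtractedInvoice` record and its `duplicate_of` field are hypothetical names, not Discovery's data model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExtractedInvoice:
    invoice_id: str
    amount: float
    # Hypothetical field: ID of the original invoice when marked as a duplicate.
    duplicate_of: Optional[str] = None


def exportable(invoices: List[ExtractedInvoice]) -> List[ExtractedInvoice]:
    """Keep only invoices that have not been marked as duplicates."""
    return [inv for inv in invoices if inv.duplicate_of is None]


invoices = [
    ExtractedInvoice("inv-1", 250.00),
    ExtractedInvoice("inv-2", 250.00, duplicate_of="inv-1"),  # marked after review
    ExtractedInvoice("inv-3", 75.50),
]
exported_ids = [inv.invoice_id for inv in exportable(invoices)]
```

Only `inv-1` and `inv-3` end up in `exported_ids`. Had `inv-2` been left unmarked, it would have been exported as well, and its amount would have been counted twice.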