Help Center

Topic: Recognition

How are duplicate uploads detected?

Last updated: 12 August 2025

PaperSurvey uses two methods to automatically detect and prevent duplicate survey uploads, ensuring your data remains accurate even when files are accidentally uploaded multiple times.

Method 1: Unique page identifiers

This method applies to surveys with unique page identifiers enabled. Learn more about unique identifiers.

How identifier checking works

Each page contains three pieces of information:

  • Unique page ID (e.g., 91)
  • Page number (e.g., 1)
  • Survey ID (e.g., 991)

When you upload a document, the system checks whether a page with the same combination of identifiers has already been processed. If one is found, the new upload is flagged as a duplicate.
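The identifier check can be pictured as a lookup against the set of pages already processed. The sketch below is illustrative only and is not PaperSurvey's implementation: the `PageKey` type, the `find_duplicates` function, and the in-memory set are assumptions made for the example, and the values reuse the sample identifiers listed above.

```python
from typing import NamedTuple


class PageKey(NamedTuple):
    # Hypothetical record holding the three identifiers read from a page.
    unique_page_id: int
    page_number: int
    survey_id: int


def find_duplicates(uploaded: list[PageKey], processed: set[PageKey]) -> list[PageKey]:
    """Return the uploaded pages whose identifiers were already processed."""
    duplicates = []
    for key in uploaded:
        if key in processed:
            duplicates.append(key)  # same three identifiers seen before
        else:
            processed.add(key)  # remember the page for later uploads
    return duplicates


processed_pages: set[PageKey] = set()
first = find_duplicates([PageKey(91, 1, 991)], processed_pages)
second = find_duplicates([PageKey(91, 1, 991), PageKey(92, 1, 991)], processed_pages)
```

In this sketch the first upload reports no duplicates, while the second reports only the page that repeats all three identifiers.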

Benefits of identifier checking

  • Prevents accidental re-uploads - Same scanned file uploaded twice
  • Catches re-scanned pages - Even if scanned differently
  • Maintains data integrity - No double-counting of responses

When duplicates are incorrectly flagged

Sometimes legitimate surveys are flagged as duplicates because their pages carry the same identifiers as pages that were already processed. This happens when:

  • Multiple copies are printed from the same PDF
  • Test prints are uploaded before the actual surveys
  • Pages are reprinted after errors

Your options:

  1. Mark as resolved - Use when the duplicate is truly not needed
  2. Retry processing - Process the flagged pages as new responses
  3. Disable unique identifiers - Ignore the QR codes entirely
  4. Allow duplicates - Keep identifiers but disable duplicate checking

Method 2: File hash comparison

This method works for all surveys, regardless of settings.

How hash checking works

Before processing any document:

  1. The system calculates a SHA-1 hash (digital fingerprint) of each page
  2. It compares that hash against the hashes of all previously uploaded pages
  3. It blocks processing if an exact match is found
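The three steps above can be sketched with Python's standard `hashlib` module. This is a minimal illustration under stated assumptions, not the service's actual code: the function names, the in-memory `seen` set, and the sample bytes are hypothetical, and how PaperSurvey stores hashes between uploads is not described here.

```python
import hashlib


def page_fingerprint(page_bytes: bytes) -> str:
    """Return the SHA-1 hex digest of a page's raw bytes."""
    return hashlib.sha1(page_bytes).hexdigest()


def should_process(page_bytes: bytes, seen_hashes: set[str]) -> bool:
    """Return False on an exact match; otherwise record the hash and return True."""
    digest = page_fingerprint(page_bytes)
    if digest in seen_hashes:
        return False  # exact digital copy of an earlier page
    seen_hashes.add(digest)
    return True


seen: set[str] = set()  # assumed store of hashes from earlier uploads
page = b"example page bytes"  # placeholder for the real page data
first_upload = should_process(page, seen)
second_upload = should_process(page, seen)
```

Here the first call accepts the page and the second call, given identical bytes, rejects it.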

Benefits of hash checking

  • Prevents exact file duplicates - Same PDF uploaded multiple times
  • Works automatically - No configuration needed
  • Cross-upload protection - Detects duplicates across different upload sessions

Limitations to consider

  • Different scans = different hashes - Re-scanning creates a new hash
  • Any edit changes the hash - Even rotating the image
  • Not effective for paper variations - Only catches exact digital copies
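These limitations follow from how a hash works: it depends on every byte of the input. The short sketch below uses made-up byte strings as stand-ins for scan data (an assumption for illustration) to show that an identical copy matches while a one-byte difference does not.

```python
import hashlib

original = b"scanned page bytes"
exact_copy = bytes(original)  # byte-for-byte identical copy
rescanned = b"scanned page bytes."  # stand-in for a second scan: one byte differs

original_hash = hashlib.sha1(original).hexdigest()
copy_matches = original_hash == hashlib.sha1(exact_copy).hexdigest()
rescan_matches = original_hash == hashlib.sha1(rescanned).hexdigest()
```

Only `copy_matches` is true, which is why a re-scan or an edited image of the same paper page has to be caught by the unique page identifiers instead.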

Managing duplicate detection

Reviewing flagged duplicates

  1. Go to your survey's Uploads page
  2. Look for entries marked as "Duplicate"
  3. Click on each to review details
  4. Choose the appropriate action

Disabling duplicate detection

In Survey Settings, you can:

  • Toggle "Allow duplicates" to disable all checking
  • Disable unique page identifiers specifically
  • Process duplicates on a case-by-case basis

Best practices

  • Review flagged duplicates before processing
  • Keep duplicate detection enabled for data quality
  • Use unique identifiers for large-scale surveys
  • Train staff on proper scanning procedures
