Help Center
Topic: Recognition
How are duplicate uploads detected?
Last updated: 12 August, 2025
PaperSurvey uses two methods to automatically detect and prevent duplicate survey uploads, ensuring your data remains accurate even when files are accidentally uploaded multiple times.
Method 1: Unique page identifiers
This method applies to surveys with unique page identifiers enabled. Learn more about unique identifiers.
How identifier checking works
Each page contains three pieces of information:
- Unique page ID (e.g., 91)
- Page number (e.g., 1)
- Survey ID (e.g., 991)
When you upload a document, our system checks whether any page with these exact identifiers has already been processed. If a match is found, the new upload is flagged as a duplicate.
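The check described above can be pictured as a lookup on the three identifiers taken together. The following is a minimal Python sketch of that idea, not PaperSurvey's actual implementation; the names `PageKey` and `IdentifierChecker` are hypothetical and exist only for illustration.

```python
from typing import NamedTuple


class PageKey(NamedTuple):
    """The three identifiers read from a page (hypothetical structure)."""
    unique_page_id: int
    page_number: int
    survey_id: int


class IdentifierChecker:
    """Remembers every identifier triple that has already been processed."""

    def __init__(self) -> None:
        self._seen: set[PageKey] = set()

    def check(self, key: PageKey) -> bool:
        """Return True if the page is a duplicate; otherwise record it."""
        if key in self._seen:
            return True
        self._seen.add(key)
        return False


checker = IdentifierChecker()
first = checker.check(PageKey(91, 1, 991))   # new page, recorded
second = checker.check(PageKey(91, 1, 991))  # same identifiers again
other = checker.check(PageKey(92, 1, 991))   # different unique page ID
```

In this sketch the second call reports a duplicate because all three values match, while the third does not because the unique page ID differs. This also illustrates why two printouts of the same PDF are flagged: they carry identical identifiers.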
Benefits of identifier checking
- Prevents accidental re-uploads - Same scanned file uploaded twice
- Catches re-scanned pages - Even if scanned differently
- Maintains data integrity - No double-counting of responses
When duplicates are incorrectly flagged
Sometimes legitimate surveys are flagged as duplicates. This happens when:
- Multiple copies are printed from the same PDF
- Test prints are uploaded before the actual surveys
- Pages are reprinted after errors
Your options:
- Mark as resolved - Use when the duplicate is truly not needed
- Retry processing - Process the flagged pages as new responses
- Disable unique identifiers - Ignore the identifier QR codes on the pages entirely
- Allow duplicates - Keep identifiers but disable duplicate checking
Method 2: File hash comparison
This method works for all surveys, regardless of settings.
How hash checking works
Before processing any document, the system:
- Calculates a SHA-1 hash (a digital fingerprint) of each page
- Compares it against the hashes of all previously uploaded pages
- Blocks processing if an exact match is found
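The steps above can be sketched in Python using the standard `hashlib` module. This is an illustration under stated assumptions rather than PaperSurvey's actual code: the function names `page_fingerprint` and `find_duplicate_pages` are hypothetical, pages are assumed to be available as raw bytes, and an in-memory set stands in for the store of previously uploaded hashes.

```python
import hashlib


def page_fingerprint(page_bytes: bytes) -> str:
    """Return the SHA-1 hex digest of one page's raw bytes."""
    return hashlib.sha1(page_bytes).hexdigest()


def find_duplicate_pages(pages: list[bytes], known_hashes: set[str]) -> list[int]:
    """Return the indexes of pages whose hash is already in known_hashes.

    Hashes of new pages are added to known_hashes, so later uploads
    (and later pages in the same upload) are compared against them too.
    """
    duplicates = []
    for index, page in enumerate(pages):
        digest = page_fingerprint(page)
        if digest in known_hashes:
            duplicates.append(index)   # exact match: a caller would block this page
        else:
            known_hashes.add(digest)
    return duplicates


known: set[str] = set()
first_upload = find_duplicate_pages([b"page A", b"page B"], known)
second_upload = find_duplicate_pages([b"page B", b"page C"], known)
```

Here the first upload contains no duplicates, while the second upload reports its first page (index 0) because its bytes match a page from the earlier session.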
Benefits of hash checking
- Prevents exact file duplicates - Same PDF uploaded multiple times
- Works automatically - No configuration needed
- Cross-upload protection - Detects duplicates across different upload sessions
Limitations to consider
- Different scans = different hashes - Re-scanning creates a new hash
- Any edit changes the hash - Even rotating the image
- Not effective for paper variations - Only catches exact digital copies
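A short Python illustration of these limitations, using made-up byte strings in place of real scan files: a byte-for-byte copy produces the same SHA-1 hash, while even a one-byte difference (standing in for a rotation or a re-scan) produces a different one.

```python
import hashlib

original = b"scanned page contents"   # placeholder for a real file's bytes
exact_copy = bytes(original)          # byte-for-byte identical copy
edited = original + b" "              # stands in for any change to the file

same = hashlib.sha1(original).hexdigest() == hashlib.sha1(exact_copy).hexdigest()
different = hashlib.sha1(original).hexdigest() != hashlib.sha1(edited).hexdigest()
```

Both `same` and `different` evaluate to `True`, which is why hash comparison catches repeated uploads of the same file but not a fresh scan of the same paper page.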
Managing duplicate detection
Reviewing flagged duplicates
- Go to your survey's Uploads page
- Look for entries marked as "Duplicate"
- Click each entry to review its details
- Choose the appropriate action
Disabling duplicate detection
In Survey Settings, you can:
- Toggle "Allow duplicates" to disable all checking
- Disable unique page identifiers specifically
- Process duplicates on a case-by-case basis
Best practices
- Review flagged duplicates before processing
- Keep duplicate detection enabled for data quality
- Use unique identifiers for large-scale surveys
- Train staff on proper scanning procedures