feat(documents): prevent duplicate document uploads within an organization #140

Merged
CorentinTh merged 2 commits into main from document-hash on Feb 28, 2025

CorentinTh (Member) commented on Feb 28, 2025

closes #114

CorentinTh requested a review from Copilot on Feb 28, 2025, 22:33

Copilot AI (Contributor) left a comment

PR Overview

This PR prevents duplicate document uploads by computing a SHA-256 hash of each file's content and using it to detect duplicates within an organization. Key changes include:

  • Adding a new SHA256 hash utility and tests for computing file hashes.
  • Implementing an early duplicate check in the use case and repository, with corresponding unit tests.
  • Updating the database schema and error handling to enforce a unique document constraint per organization.
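
As a rough illustration of the hashing and early duplicate check summarized above, here is a minimal sketch. It is not the PR's code: `sha256Hex`, `StoredDocument`, and `findDuplicate` are hypothetical names, and an in-memory array stands in for the repository lookup. Only `createHash` from Node's `node:crypto` module is a real API.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical helper: hex-encoded SHA-256 digest of a file's content.
function sha256Hex(content: Uint8Array | string): string {
  return createHash('sha256').update(content).digest('hex');
}

// Hypothetical shape of a stored document record (assumption, not the real schema).
type StoredDocument = { organizationId: string; hash: string; name: string };

// Early duplicate check: look for a document in the same organization
// whose stored hash matches the hash of the incoming content.
function findDuplicate(
  documents: StoredDocument[],
  organizationId: string,
  content: Uint8Array | string,
): StoredDocument | undefined {
  const hash = sha256Hex(content);
  return documents.find(d => d.organizationId === organizationId && d.hash === hash);
}
```

Because the lookup is scoped by organization, the same file uploaded to two different organizations would not count as a duplicate in this sketch.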

Reviewed Changes

| File | Description |
| --- | --- |
| `apps/papra-server/src/modules/documents/documents.services.test.ts` | Added tests for SHA256 hash computation. |
| `apps/papra-server/src/modules/documents/documents.usecases.test.ts` | Added tests for duplicate document handling and error propagation. |
| `apps/papra-server/src/modules/documents/documents.services.ts` | Exposed the SHA256 hash function for file processing. |
| `apps/papra-server/src/modules/documents/documents.repository.test.ts` | Updated tests to include the new hash field and duplicate handling. |
| `apps/papra-client/src/modules/documents/documents.composables.tsx` | Updated client-side error handling for duplicate documents. |
| `apps/papra-server/src/modules/documents/documents.repository.ts` | Integrated safe DB insertion and unique constraint error handling. |
| `apps/papra-server/src/modules/documents/documents.usecases.ts` | Implemented early duplicate checking and file cleanup on insertion failure. |
| `apps/papra-server/src/modules/documents/documents.table.ts` | Updated the schema with unique and indexed constraints on the document hash. |
| `apps/papra-server/src/modules/documents/documents.errors.ts` | Added a dedicated error factory for duplicate document errors. |
| `apps/papra-server/src/modules/documents/documents.table.test.ts` | Updated tests to include the new hash field in schema validation. |
| `apps/papra-server/src/modules/tags/tags.repository.test.ts` | Adjusted tests to account for the new document hash field. |
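
The repository and use-case rows above mention unique-constraint error handling and file cleanup on insertion failure. The sketch below shows one way that flow could look; it is an assumption-laden illustration, not the merged implementation. Every name (`DuplicateDocumentError`, `InMemoryDocuments`, `InMemoryStorage`, `createDocument`) is hypothetical, and Maps stand in for the database and the file storage.

```typescript
// Hypothetical error type for a duplicate upload.
class DuplicateDocumentError extends Error {
  constructor() {
    super('Document already exists in this organization');
    this.name = 'DuplicateDocumentError';
  }
}

type NewDocument = { organizationId: string; hash: string; storageKey: string };

// Stand-in for the documents table; mimics a unique index on (organizationId, hash).
class InMemoryDocuments {
  private rows = new Map<string, NewDocument>();

  insert(doc: NewDocument): void {
    const key = `${doc.organizationId}:${doc.hash}`;
    if (this.rows.has(key)) throw new DuplicateDocumentError();
    this.rows.set(key, doc);
  }
}

// Stand-in for the file storage backend.
class InMemoryStorage {
  files = new Map<string, Uint8Array>();
  save(key: string, content: Uint8Array): void { this.files.set(key, content); }
  delete(key: string): void { this.files.delete(key); }
}

// Store the file, then insert the record; if the insert is rejected,
// delete the file that was just written so no orphan is left behind.
function createDocument(
  db: InMemoryDocuments,
  storage: InMemoryStorage,
  doc: NewDocument,
  content: Uint8Array,
): void {
  storage.save(doc.storageKey, content);
  try {
    db.insert(doc);
  } catch (error) {
    storage.delete(doc.storageKey);
    throw error;
  }
}
```

Keeping the uniqueness rule in the storage layer, in addition to an early check, means two concurrent uploads of the same file cannot both succeed even if both pass the early check.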

Copilot reviewed 14 out of 14 changed files in this pull request and generated 1 comment.

cloudflare-workers-and-pages bot commented Feb 28, 2025

Deploying papra-demo with Cloudflare Pages

Latest commit: d847c69
Status: ✅  Deploy successful!
Preview URL: https://f0c604cb.papra-demo.pages.dev
Branch Preview URL: https://document-hash.papra-demo.pages.dev

cloudflare-workers-and-pages bot commented Feb 28, 2025

Deploying papra-client with Cloudflare Pages

Latest commit: d847c69
Status: ✅  Deploy successful!
Preview URL: https://9408708f.papra.pages.dev
Branch Preview URL: https://document-hash.papra.pages.dev

…t.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

cloudflare-workers-and-pages bot commented Feb 28, 2025

Deploying papra-docs with Cloudflare Pages

Latest commit: d847c69
Status: ✅  Deploy successful!
Preview URL: https://0d679d12.papra-2op.pages.dev
Branch Preview URL: https://document-hash.papra-2op.pages.dev

CorentinTh self-assigned this on Feb 28, 2025
CorentinTh merged commit f78d42c into main on Feb 28, 2025
6 checks passed
CorentinTh deleted the document-hash branch on Feb 28, 2025, 22:41

Linked issue: Prevent duplicates documents (unique field hash sha256)