Fast Near-Duplicate Image Search and Delete using pHash, t-SNE and KDTree.
Updated Nov 22, 2022 - Python
Advanced Duplicate File Finder for Python
Takes an input CSV and produces a CSV of duplicate records. Then the input CSV is cleansed to remove duplicates.
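The CSV workflow described above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: the function names `split_duplicates` and `dedupe_csv_text`, the idea of choosing key columns, and the keep-first-occurrence policy are all assumptions made for the example.

```python
import csv
import io


def split_duplicates(rows, key_fields):
    """Split rows into (unique, duplicates), keeping the first row seen per key.

    `key_fields` is a hypothetical parameter: the columns that define
    when two records count as the same.
    """
    seen = set()
    unique, duplicates = [], []
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        if key in seen:
            duplicates.append(row)
        else:
            seen.add(key)
            unique.append(row)
    return unique, duplicates


def dedupe_csv_text(text, key_fields):
    """Return (cleansed_csv, duplicates_csv) as strings for the given CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    fieldnames = reader.fieldnames
    unique, duplicates = split_duplicates(reader, key_fields)

    def dump(rows):
        out = io.StringIO()
        writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)
        return out.getvalue()

    return dump(unique), dump(duplicates)
```

Calling `dedupe_csv_text(text, ["email"])` would return the cleansed CSV and a second CSV holding only the later repeats; writing both to files is left out to keep the sketch short.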
Command Line Interface for deplicate
Searches two separate folders for duplicates, allowing duplicated files to be removed from one folder while keeping the other intact.
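One plausible way to implement a two-folder comparison like this is to hash file contents, sketched below. The helper names (`file_digest`, `find_cross_folder_duplicates`, `remove_duplicates`), the use of SHA-256, and the `dry_run` flag are assumptions for illustration; the listed project may work differently.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_cross_folder_duplicates(keep_dir, prune_dir):
    """List files under prune_dir whose content also exists under keep_dir."""
    keep_hashes = {
        file_digest(p) for p in Path(keep_dir).rglob("*") if p.is_file()
    }
    return sorted(
        p
        for p in Path(prune_dir).rglob("*")
        if p.is_file() and file_digest(p) in keep_hashes
    )


def remove_duplicates(keep_dir, prune_dir, dry_run=True):
    """Delete duplicates from prune_dir only; keep_dir is never modified.

    With dry_run=True (the default) nothing is deleted and the
    candidate paths are only reported.
    """
    duplicates = find_cross_folder_duplicates(keep_dir, prune_dir)
    if not dry_run:
        for path in duplicates:
            path.unlink()
    return duplicates
```

Defaulting to a dry run is a deliberate choice in this sketch, since deletion cannot be undone; hashing every file is simple but could be sped up by comparing file sizes first.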
A powerful data preprocessing application that simplifies the task of preparing data for machine learning models.
Data Manipulation of Biopic Dataset
A Python package to monitor your downloads folder and remove duplicate files.
My Data Cleaning Library
Finds the duplicate items in a list, if any.
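Finding the repeated items in a list can be sketched with the standard library alone. The function name `find_duplicates` is hypothetical, and the sketch assumes the items are hashable; it is not taken from the listed repository.

```python
from collections import Counter


def find_duplicates(items):
    """Return each item that occurs more than once, in first-seen order."""
    counts = Counter(items)
    # Counter preserves insertion order, so results follow first appearance.
    return [item for item, count in counts.items() if count > 1]
```

An empty result means the list had no duplicates. For unhashable items such as nested lists, a pairwise comparison or a conversion to tuples would be needed instead.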