Fast, content-based duplicate file detector with cache and more!

MarcinOrlowski/dhunter

Introduction

dhunter (pronounced "The Hunter") is a [d]uplicate [hunter] utility designed to help scan and process large sets of files. It matches duplicates by file content and uses smart caching to speed up directory scanning, change detection, and processing.

Features

  • Content-based file matching (SHA-256)
  • Designed to work with a lot of data:
    • caches folder scanning results for quick reuse/rescan
    • directory scanning can be aborted and resumed at any moment
  • Smart content filters:
    • Ignores zero-length files and symlinks
    • Ignores folders like .git, .cvs, .svn
    • Supports file size based (min and/or max) filtering
    • Per-folder exclusion via a .dhunterignore file
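
To illustrate the idea behind content-based matching with the filters listed above, here is a minimal sketch in plain Python. It is not dhunter's actual code or API: the names `sha256_of`, `find_duplicates`, `IGNORED_DIRS`, `min_size` and `max_size` are hypothetical and chosen for this example only. Caching, resumable scans and `.dhunterignore` handling are left out.

```python
import hashlib
import os
from collections import defaultdict

# Hypothetical constant for this sketch; mirrors the folders named above.
IGNORED_DIRS = {".git", ".cvs", ".svn"}


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root, min_size=1, max_size=None):
    """Group files under `root` by content hash; keep groups of 2 or more."""
    by_hash = defaultdict(list)
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune ignored folders in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in IGNORED_DIRS]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks
            size = os.path.getsize(path)
            if size == 0 or size < min_size:
                continue  # skip zero-length and too-small files
            if max_size is not None and size > max_size:
                continue  # skip too-large files
            by_hash[sha256_of(path)].append(path)
    return {h: sorted(paths) for h, paths in by_hash.items() if len(paths) > 1}
```

Calling `find_duplicates("/some/folder", min_size=1024)` would return a dictionary that maps each SHA-256 digest to the list of paths sharing that content. Hashing every file is the simplest approach. A real tool would likely avoid rehashing unchanged files, which is where a scan cache helps.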

Credits and license

  • Written and copyrighted ©2018-2019 by Marcin Orlowski
  • dhunter is open-source software licensed under the MIT license
