This tool is designed for people who spend a significant amount of time proofreading PDF files. It compares two PDF files and reports the differences between them. The comparison results it generates make it quick to identify pixel and text discrepancies between the files.
Sample Comparison Results:
For pixel differences between two PDF files, the tool generates a comparison result made up of four images. In the top two images, a red overlay marks the areas where pixels differ. Two additional images below make the differences more evident. If the bottom-left image is pure white or the bottom-right image is pure black, there are no differences between the two PDFs.
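To make the "pure white or pure black means no differences" rule concrete, here is a minimal sketch of a pixel comparison. It is not the tool's actual code: the function names are hypothetical, pages are assumed to be already rasterized to grayscale and stored as lists of rows, and the mapping of the raw and inverted difference to the two bottom images is an assumption.

```python
# Hypothetical sketch, not the tool's implementation.
# Assumption: each page is a grayscale raster, a list of rows of ints in 0..255.

def diff_image(page_a, page_b):
    """Return the per-pixel absolute difference (0 means the pixels match)."""
    if len(page_a) != len(page_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(page_a, page_b)
    ):
        raise ValueError("pages must have the same dimensions")
    return [
        [abs(a - b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(page_a, page_b)
    ]


def invert(image):
    """Invert a grayscale image so that 'no difference' becomes white."""
    return [[255 - p for p in row] for row in image]


def is_identical(page_a, page_b):
    """True when the difference image is pure black (all zeros)."""
    return all(p == 0 for row in diff_image(page_a, page_b) for p in row)


page_1 = [[255, 255, 0], [255, 0, 255]]
page_2 = [[255, 255, 0], [255, 255, 255]]

print(diff_image(page_1, page_2))          # [[0, 0, 0], [0, 255, 0]]
print(invert(diff_image(page_1, page_1)))  # [[255, 255, 255], [255, 255, 255]]
print(is_identical(page_1, page_2))        # False
```

Under these assumptions, a difference image that is all zeros (black), or its inverse that is all 255 (white), corresponds to two pages with identical pixels.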
The tool also marks all recognizable text in the PDF with colored masks. Each color has a specific meaning:
- Green: The word remains unchanged.
- Orange: Both the font size and color of the word have changed.
- Red: The word was added or modified.
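The color legend above can be read as a small classification rule. The sketch below is illustrative only: the `Word` record, the `classify` function, and the matching of words by their text are all assumptions, not the tool's real data model. The legend only describes the case where both font size and color change, so a word with a single changed attribute is labeled green here, which is also an assumption.

```python
from typing import NamedTuple


class Word(NamedTuple):
    """Hypothetical word record; the real tool's structure may differ."""
    text: str
    size: float
    color: str


def classify(old_words, new_words):
    """Label each word in new_words as 'green', 'orange', or 'red'."""
    # Pool of old words still available for matching, keyed by their text.
    pool = {}
    for word in old_words:
        pool.setdefault(word.text, []).append(word)

    labels = []
    for word in new_words:
        candidates = pool.get(word.text)
        if not candidates:
            labels.append("red")  # no matching old word: added or modified
            continue
        old = candidates.pop(0)
        if old.size != word.size and old.color != word.color:
            labels.append("orange")  # both font size and color changed
        else:
            labels.append("green")  # treated as unchanged (assumption)
    return labels


old = [Word("Hello", 12, "black"), Word("world", 12, "black")]
new = [
    Word("Hello", 12, "black"),
    Word("world", 14, "blue"),
    Word("again", 12, "black"),
]
print(classify(old, new))  # ['green', 'orange', 'red']
```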
Please follow the steps below:
- Clone the GitHub Repository: Clone the repository using the following command:
git clone https://github.com/VintLin/pdf-comparator.git
- Set up Python Environment: Open the "pdf-comparator" project directory and make sure you have Python 3.8 or higher. Create and activate a virtual environment with the following commands, replacing "venv" with your preferred environment name:
cd pdf-comparator
python3 -m venv venv
source venv/bin/activate  # on Windows: venv\Scripts\activate
- Install Dependencies: Install the required dependencies by running the following command:
pip3 install -r requirements.txt
- Run the Code Directly: Compare PDF files by running the following command:
python3 -m pdfcomparator "/compare_file_1.pdf" "/compare_file_2.pdf" "/result_folder/"
- Build an Executable: You can also build an executable with cx_Freeze if needed (after a successful build, the executable can be found in "/build/"):
python3 setup.py build
- Run the Executable: Compare PDF files by running the following command with the executable:
./pdfcomparator.exe "/compare_file_1.pdf" "/compare_file_2.pdf" "/result_folder/"
This program accepts the following command line arguments:
- file1 (required): Path to input file 1. Please provide the path to the first file you want to compare.
- file2 (required): Path to input file 2. Please provide the path to the second file you want to compare.
- output_folder (required): Path to the output folder. Comparison results will be saved in this folder.
- --cache or -c: Optional argument for specifying a cache path. If a cache path is specified, the program will use caching to accelerate the comparison process. Caching is not enabled by default.
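For readers who want to see how an interface like this is typically declared, here is a minimal sketch using the standard library's `argparse`. It mirrors the arguments listed above but is an assumption about structure, not the project's actual source; `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented arguments (illustrative only)."""
    parser = argparse.ArgumentParser(prog="pdfcomparator")
    parser.add_argument("file1", help="Path to input file 1")
    parser.add_argument("file2", help="Path to input file 2")
    parser.add_argument("output_folder", help="Folder where results are saved")
    # Caching is off unless a path is given, so the default is None.
    parser.add_argument(
        "--cache", "-c",
        default=None,
        metavar="PATH",
        help="Optional cache path; enables caching when set",
    )
    return parser


args = build_parser().parse_args(["a.pdf", "b.pdf", "out/", "-c", "/tmp/cache"])
print(args.file1, args.file2, args.output_folder, args.cache)
# a.pdf b.pdf out/ /tmp/cache
```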
Here are some usage examples:
# Perform comparison
python3 -m pdfcomparator file1.pdf file2.pdf output_folder/
# Perform comparison and enable caching
python3 -m pdfcomparator file1.pdf file2.pdf output_folder/ --cache /path/to/cache
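When many file pairs need to be compared, the same commands can be driven from a script. The sketch below only assembles and runs the documented command line; `build_command` and `run_comparison` are hypothetical helpers, and it assumes `pdfcomparator` is importable by the interpreter running the script.

```python
import subprocess
import sys


def build_command(file1, file2, output_folder, cache=None):
    """Assemble the documented command line as an argument list."""
    cmd = [
        sys.executable, "-m", "pdfcomparator",
        str(file1), str(file2), str(output_folder),
    ]
    if cache is not None:
        cmd += ["--cache", str(cache)]  # caching is only enabled on request
    return cmd


def run_comparison(file1, file2, output_folder, cache=None):
    """Run one comparison; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_command(file1, file2, output_folder, cache), check=True)


# Show the arguments that would be passed, without running anything.
print(build_command("file1.pdf", "file2.pdf", "output_folder/")[1:])
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces.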
- Source Code Licensing: Our project's source code is licensed under the MIT License. This license permits the use, modification, and distribution of the code, subject to certain conditions outlined in the MIT License.
- Project Open-Source Status: The project is open-source, but this designation is intended primarily for non-commercial purposes. We encourage community collaboration and contributions for research and non-commercial applications; any use of the project's components for commercial purposes requires a separate licensing agreement.
If you have any questions or feedback, or would like to get in touch, please email us at vintonlin@gmail.com.