Make order of files in repaired wheel deterministic #507

Open
wants to merge 3 commits into
base: main

@bemoody bemoody commented Aug 6, 2024

Currently, when running auditwheel repair, the contents of the output whl file are unpredictable:

  • The order of entries in the zip archive is unpredictable.
  • The order of lines in the "RECORD" file is unpredictable.

In both cases, the order is dependent on the order of entries returned by os.walk.

This is a problem for build reproducibility: provided that the build process is sufficiently well defined, different people should be able to run the same process on different machines and get byte-for-byte identical outputs.

Note that when setuptools or wheel generates a whl file, it sorts the entries in a similar way (see WheelFile.write_files in wheel.wheelfile). The code here won't produce exactly the same order as setuptools does, but that shouldn't be a problem.

Benjamin Moody added 2 commits August 5, 2024 20:53
In order to make the output zip file reproducible (independent of the
underlying filesystem's directory traversal order), sort each list of
subdirectories and each list of files before adding them to the zip
file.

(Note that we want to sort the dirs list in place, causing os.walk to
traverse the subdirectories in order.)
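
For illustration, a minimal sketch of the approach this commit message describes. The function name `write_tree_sorted` is hypothetical and this is not the actual auditwheel code; it only shows why the in-place sort of `dirs` matters.

```python
import os
import zipfile


def write_tree_sorted(src_dir, zip_path):
    """Zip src_dir, visiting directories and files in sorted order.

    Hypothetical helper for illustration only. Returns the archive
    names in the order they were written.
    """
    written = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, dirs, files in os.walk(src_dir):
            # Sort in place: os.walk reuses this same list object to decide
            # which subdirectories to descend into next, and in what order.
            # Rebinding (dirs = sorted(dirs)) would not affect the traversal.
            dirs.sort()
            for name in sorted(files):
                path = os.path.join(root, name)
                arcname = os.path.relpath(path, src_dir).replace(os.sep, "/")
                zf.write(path, arcname)
                written.append(arcname)
    return written
```

This only fixes the entry order. Other sources of variation in the archive, such as file timestamps, are outside the scope of this sketch.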
In order to make the output zip file reproducible (independent of the
underlying filesystem's directory traversal order), sort each list of
subdirectories and each list of files while we are generating the
RECORD file.

(Note that we want to sort the dirs list in place, causing os.walk to
traverse the subdirectories in order.)
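
The same idea applied to RECORD generation can be sketched as follows. The helper name `record_lines` and its parameters are assumptions made for this example, not auditwheel's API. It also simplifies the format by writing lines directly, so paths containing commas or quotes would need proper CSV quoting in real code.

```python
import base64
import hashlib
import os


def record_lines(wheel_dir, record_relpath):
    """Return RECORD lines for wheel_dir in a deterministic order.

    Hypothetical helper for illustration only. record_relpath is the
    archive-relative path of the RECORD file itself, which is listed
    without a hash or size.
    """
    lines = []
    for root, dirs, files in os.walk(wheel_dir):
        dirs.sort()  # in place, so os.walk descends in sorted order
        for name in sorted(files):
            path = os.path.join(root, name)
            rel = os.path.relpath(path, wheel_dir).replace(os.sep, "/")
            if rel == record_relpath:
                lines.append(f"{rel},,")
                continue
            with open(path, "rb") as f:
                data = f.read()
            # RECORD uses URL-safe base64 of the digest, without '=' padding.
            digest = base64.urlsafe_b64encode(hashlib.sha256(data).digest())
            digest = digest.rstrip(b"=").decode("ascii")
            lines.append(f"{rel},sha256={digest},{len(data)}")
    return lines
```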

codecov bot commented Aug 11, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.28%. Comparing base (14c4282) to head (a393a17).
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #507      +/-   ##
==========================================
+ Coverage   92.25%   92.28%   +0.02%     
==========================================
  Files          20       20              
  Lines        1266     1270       +4     
  Branches      305      305              
==========================================
+ Hits         1168     1172       +4     
  Misses         56       56              
  Partials       42       42              

Member

@mayeut mayeut left a comment

Thanks for the PR

According to https://peps.python.org/pep-0427/#recommended-archiver-features, it is recommended to place the .dist-info folder at the end of the archive.

If we're ensuring the order for build reproducibility, could this be taken into account, please?

If the wheel metadata files are physically located at the end of the
zip file, this allows other tools to modify the metadata without
rewriting the entire archive.
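
One possible way to combine a deterministic order with .dist-info placed last, as a sketch only. The function name `archive_order` is hypothetical, and sorting by full archive path is an assumption of this example rather than what the PR implements (the PR sorts within the os.walk traversal).

```python
def archive_order(names):
    """Sort archive names deterministically, with .dist-info entries last.

    Hypothetical helper for illustration only. names are "/"-separated
    archive paths.
    """
    def key(name):
        top_level = name.split("/", 1)[0]
        # False sorts before True, so everything outside the .dist-info
        # directory comes first; ties are broken by the path itself.
        return (top_level.endswith(".dist-info"), name)

    return sorted(names, key=key)
```

Because the key depends only on the names, the result is the same regardless of the order in which the filesystem reported the entries.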

bemoody commented Sep 16, 2024

Sure, that makes sense and is easy to do.

2 participants