-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor transmute() on top of new "create" module #90
Conversation
Co-authored-by: jaimergp <jaimergp@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I can see, the refactor is equivalent to the old code. Adding a default encoding will hopefully fix the occasional encoding errors that were reported.
conda_package_streaming/conda_fmt.py
Outdated
self.info_tar = info_tar | ||
self.is_info = is_info | ||
|
||
def addfile(self, tarinfo, fileobj=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or we could delegate to two other TarFile instances (info, pkg) and never call self.addfile()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think two TarFile instances are preferrable, even if a slight memory overhead it would reduce the risk of introducing some file access edge case with a custom tarfile subclass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
File access? OTOH this subclass prevents us from assigning an unused file handle to the theoretical third wrapper TarFile
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I don't have strong feelings about it
] | ||
|
||
intersphinx_mapping = {"python": ("https://docs.python.org/3", None)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Automatically links our documentation to the documentation of any Python standard library member we reference, either as subclass or in a type annotation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some questions and suggestions
conda_package_streaming/conda_fmt.py
Outdated
|
||
|
||
@contextmanager | ||
def conda_builder( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def conda_builder( | |
def conda_v2_file( |
Technically this creates a v2 version of the conda format, the first having been the tar.bz2 file format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO .conda is the conda format and tar.bz2 is just tar.bz2; why look backward
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I'm not sure, the tar.bz2 format isn't a conda file format?
conda_package_streaming/conda_fmt.py
Outdated
self.info_tar = info_tar | ||
self.is_info = is_info | ||
|
||
def addfile(self, tarinfo, fileobj=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think two TarFile instances are preferrable, even if a slight memory overhead it would reduce the risk of introducing some file access edge case with a custom tarfile subclass.
# increase to reduce speed and increase compression (levels above 19 use much | ||
# more memory) | ||
ZSTD_COMPRESS_LEVEL = 19 | ||
# increase to reduce compression and increase speed | ||
ZSTD_COMPRESS_THREADS = 1 | ||
|
||
CONDA_PACKAGE_FORMAT_VERSION = 2 | ||
|
||
# Account for growth from "2 GB of /dev/urandom" to not exceed ZIP64_LIMIT after | ||
# compression | ||
CONDA_ZIP64_LIMIT = zipfile.ZIP64_LIMIT - (1 << 18) - 1 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a last thought, could you check if those constants are used in any code in github search? Might be worth keeping them around
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Search turns up empty. We already import the two ZSTD_ constants here, only removing the package version and esoteric zip64 limit.
Description
I wrote this so that I could consider creating
.conda
from scratch using aTarFile
subclass that was backed by files on the filesystem instead of a pre-existingTarFile
. It also decomposestransmute
to use the same streaming data structure as the rest of this library.We also add the encoding= fix here, since it is simple and would not otherwise merge cleanly.