-
Notifications
You must be signed in to change notification settings - Fork 37
RFC for Microsoft.PowerShell.Archive v2.0.0 #117
base: master
Are you sure you want to change the base?
Conversation
updated spelling of a word Co-authored-by: Flavien <fmichaleczek@gmail.com>
Great job on the RFC! 🙂 A few personal thoughts:
|
Hello There! I love it! I have a few suggestions. One of the use cases that our organization uses extensively is to be able to make changes it singular items in a zip file without having to expand and re-compress. For Example:
Sources: https://docs.microsoft.com/en-us/dotnet/api/system.io.compression.ziparchive?view=net-6.0 That's my two-cents. I'll head back to my corner. |
/azp list |
CI/CD Pipelines for this repository: |
/azp run PowerShell.Microsoft.PowerShell.Archive |
Azure Pipelines successfully started running 1 pipeline(s). |
Co-authored-by: Steve Lee <slee@microsoft.com>
Thanks!
When
Using |
You would be able to add files to or replaces files in the archive using the |
Ah, I missed that. Then it makes sense. 👍
As I mentioned I don't have a preference really, but it should be consistent. Default to a archive-named folder would match current default and Windows behavior though, so probably the better way to go. 🙂 |
Why isn't this RFC in the https://github.com/PowerShell/PowerShell-RFC repo for more visibility? |
rfc/RFCNNNN-Archive-Module-v2.md
Outdated
Compress-Archive [-Path] <string[]> [-DestinationPath] <string> [-CompressionLevel {Optimal | NoCompression | Fastest | SmallestSize}] [-PassThru] [-Format {zip | tar | tgz}] [-Filter <string[]>] [-Flatten] [-WhatIf] [-Confirm] [<CommonParameters>] | ||
|
||
Compress-Archive [-Path] <string[]> [-DestinationPath] <string> -Overwrite [-CompressionLevel {Optimal | NoCompression | Fastest | SmallestSize}] [-PassThru] [-Format {zip | tar | tgz}] [-Filter <string[]>] [-Flatten] [-WhatIf] [-Confirm] [<CommonParameters>] | ||
|
||
Compress-Archive [-Path] <string[]> [-DestinationPath] <string> -Update [-CompressionLevel {Optimal |NoCompression | Fastest | SmallestSize}] [-PassThru] [-Format {zip | tar | tgz}] [-Filter <string[]>] [-Flatten] [-WhatIf] [-Confirm] [<CommonParameters>] | ||
|
||
Compress-Archive [-DestinationPath] <string> -LiteralPath <string[]> [-CompressionLevel {Optimal | NoCompression | Fastest | SmallestSize}] [-PassThru] [-Format {zip | tar | tgz}] [-Filter <string[]>] [-Flatten] [-WhatIf] [-Confirm] [<CommonParameters>] | ||
|
||
Compress-Archive [-DestinationPath] <string> -LiteralPath <string[]> -Overwrite [-CompressionLevel {Optimal | NoCompression | Fastest | SmallestSize}] [-PassThru] [-Format {zip | tar | tgz}] [-Filter <string[]>] [-Flatten] [-WhatIf] [-Confirm] [<CommonParameters>] | ||
|
||
Compress-Archive [-DestinationPath] <string> -LiteralPath <string[]> -Update [-CompressionLevel {Optimal | NoCompression | Fastest | SmallestSize}] [-PassThru] [-Format {zip | tar | tgz}] [-Filter <string[]>] [-Flatten] [-WhatIf] [-Confirm] [<CommonParameters>] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Compress-Archive [-Path] <string[]> [-DestinationPath] <string> [-CompressionLevel {Optimal | NoCompression | Fastest | SmallestSize}] [-PassThru] [-Format {zip | tar | tgz}] [-Filter <string[]>] [-Flatten] [-WhatIf] [-Confirm] [<CommonParameters>] | |
Compress-Archive [-Path] <string[]> [-DestinationPath] <string> -Overwrite [-CompressionLevel {Optimal | NoCompression | Fastest | SmallestSize}] [-PassThru] [-Format {zip | tar | tgz}] [-Filter <string[]>] [-Flatten] [-WhatIf] [-Confirm] [<CommonParameters>] | |
Compress-Archive [-Path] <string[]> [-DestinationPath] <string> -Update [-CompressionLevel {Optimal |NoCompression | Fastest | SmallestSize}] [-PassThru] [-Format {zip | tar | tgz}] [-Filter <string[]>] [-Flatten] [-WhatIf] [-Confirm] [<CommonParameters>] | |
Compress-Archive [-DestinationPath] <string> -LiteralPath <string[]> [-CompressionLevel {Optimal | NoCompression | Fastest | SmallestSize}] [-PassThru] [-Format {zip | tar | tgz}] [-Filter <string[]>] [-Flatten] [-WhatIf] [-Confirm] [<CommonParameters>] | |
Compress-Archive [-DestinationPath] <string> -LiteralPath <string[]> -Overwrite [-CompressionLevel {Optimal | NoCompression | Fastest | SmallestSize}] [-PassThru] [-Format {zip | tar | tgz}] [-Filter <string[]>] [-Flatten] [-WhatIf] [-Confirm] [<CommonParameters>] | |
Compress-Archive [-DestinationPath] <string> -LiteralPath <string[]> -Update [-CompressionLevel {Optimal | NoCompression | Fastest | SmallestSize}] [-PassThru] [-Format {zip | tar | tgz}] [-Filter <string[]>] [-Flatten] [-WhatIf] [-Confirm] [<CommonParameters>] | |
Compress-Archive [-Path] <string[]> [-Destination] <string> [-Collision {NoClobber, Overwrite, Update } [-CompressionLevel {Optimal | NoCompression | Fastest | SmallestSize}] [-PassThru] [-Format {zip | tar | tgz}] [-Filter <string[]>] [-Flatten] [-WhatIf] [-Confirm] [<CommonParameters>] | |
Compress-Archive [-LiteralPath] <string[]> [-Destination] <string> [-Collision {NoClobber, Overwrite, Update} [-CompressionLevel {Optimal | NoCompression | Fastest | SmallestSize}] [-PassThru] [-Format {zip | tar | tgz}] [-Filter <string[]>] [-Flatten] [-WhatIf] [-Confirm] [<CommonParameters>] |
-
There are way to many parameter sets to be user friendly and the only difference is
-Update
and-Overwrite
switch parameters. These two switch parameters change the collision behavior when the destination already exists.- Only two parameter sets are needed
-Path
and-LiteralPath
-Overwrite
and-Update
parameters replaced with an enum parameter-Collision { NoClobber, Overwrite, Update }
.NoCobber
would be the default value and have the behavior described in the RFC whenUpdate
orOverwrite
is not specified.Update
andOverwrite
would have the behavior described in the RFC.
- Only two parameter sets are needed
-
Parameter
-DestinationPath
should be renamed to-Destination
to match core cmdlets likeCopy-Item
andMove-Item
. ADestinationPath
alias can be added for backwards compatibility. -
Parameter
-LiteralPath
should havePosition = 0
-
Parameter
-Destination
should havePosition = 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree about the number of parameter sets. I think -Collision
can be confusing to users because collision could refer to destination file collision or archive entry collision. I propose -WriteMode { Create, Update, Overwrite}
as an alternative.
Parameter -DestinationPath should be renamed to -Destination to match core cmdlets like Copy-Item and Move-Item. > A DestinationPath alias can be added for backwards compatibility.
How do you feel about keeping -DestinationPath
and using -Destination
as an alias?
Parameter -LiteralPath should have Position = 0
Parameter -Destination should have Position = 1
Since -Path
has position 0, -LiteralPath
cannot have position = 0. I agree with putting -DestinationPath
after -LiteralPath
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existing function uses DestinationPath
. So that must be a name or an alias or existing scripts will break .
-Destination
is unambiguous as an abbreviation so there is no need to add it as an alias.
The existing function has Path as position 0 allowing compress-archive *.ps1
Swapping path
and literalpath
would break existing scripts which don't formally write -path
(and there will be many)
Adding complexity with more parameter sets should be avoided but fixing decisions made in the original model will usually do more harm than good
By the same logic existing scripts use -Force and -Update. Breaking these won't be acceptable
That does result in sets for Path or LiteralPath x update or force .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the parameter sets. The position of -LiteralPath
is changed so that it now comes before -DestinationPath
. Also, the number of parameter sets is reduced.
rfc/RFCNNNN-Archive-Module-v2.md
Outdated
Expand-Archive [-Path] <string> [[-DestinationPath] <string>] [-Overwrite] [-PassThru] [-Filter <string[]>] [-WhatIf] [-Confirm] [<CommonParameters>] | ||
|
||
Expand-Archive [[-DestinationPath] <string>] -LiteralPath <string> [-Overwrite] [-PassThru] [-Filter <string[]>] [-WhatIf] [-Confirm] [<CommonParameters>] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Expand-Archive [-Path] <string> [[-DestinationPath] <string>] [-Overwrite] [-PassThru] [-Filter <string[]>] [-WhatIf] [-Confirm] [<CommonParameters>] | |
Expand-Archive [[-DestinationPath] <string>] -LiteralPath <string> [-Overwrite] [-PassThru] [-Filter <string[]>] [-WhatIf] [-Confirm] [<CommonParameters>] | |
Expand-Archive [-Path] <string> [[-Destination] <string>] [-Overwrite] [-PassThru] [-Filter <string[]>] [-WhatIf] [-Confirm] [<CommonParameters>] | |
Expand-Archive [-LiteralPath] <string> [[-Destination] <string>] [-Overwrite] [-PassThru] [-Filter <string[]>] [-WhatIf] [-Confirm] [<CommonParameters>] | |
-
Parameter
-DestinationPath
should be renamed to-Destination
to match core cmdlets likeCopy-Item
andMove-Item
. ADestinationPath
alias can be added for backwards compatibility. -
Parameter
-LiteralPath
should havePosition = 0
-
Parameter
-Destination
should havePosition = 1
-
Parameter
-Overwrite
should be named-Force
to match core cmdlets or if that is unacceptable aForce
parameter alias added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parameter -DestinationPath should be renamed to -Destination to match core cmdlets like Copy-Item and Move-Item. A DestinationPath alias can be added for backwards compatibility.
Parameter -LiteralPath should have Position = 0
Parameter -Destination should have Position = 1
See my comment above.
Parameter -Overwrite should be named -Force to match core cmdlets or if that is unacceptable a Force parameter alias added.
PowerShell is moving away from -Force
because it is vague. -Overwrite
is more specific and preferred.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ayousuf23 the only time Force
is an issue is when it's not instantly clear what it will do and if it does more than one thing. In this case that is neither the concern or issue. Install-Module
is the perfect example of bad Force
design where it does too many things, reinstalls the module, disables certain checks, and suppresses prompts. PowerShell design works best when the commands are consistent. Force
has been used since the beginning and does have some issues like I described above but if the user has been using PowerShell for any length of time, they will first try Force
. As such if the PowerShell team is unwilling to change the name then be nice to your long time users by adding a Force
alias.
P.S. None of the current built-in commands have Overwrite
as a parameter or alias. There is however, a NoOverwrite
alias for a few NoClobber
parameters. I'd recommend the team add Overwrite
aliases to existing Force
parameters where it makes sense to be more consistent if Overwrite
will be the new standard parameter for this behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ayousuf23 Although there may be a desire to reduce the use of the -Force
switch, it is well understood (don't confirm, do overwrite) and is embedded in many existing scripts.
The way to resolve the conflict between compatibility and fixing things which the original design didn't get quite right would be to use new command names.
We don't take an archive and compress it. Really the command should be New-Archive
, and have Add-ArchiveFile
, Remove-ArchiveFile
and Update-ArchiveFile
commands for changing an existing file and Show-ArchiveFile
for listing its contents.
Expand
doesn't allow a subset of files to be extracted, copy-archiveFile
would be ideal for that task but I'm not sure if that is as good a name for the common use of expand everything, but (again) changing the name sidesteps the problem of breaking existing scripts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way to resolve the conflict between compatibility and fixing things which the original design didn't get quite right would be to use new command names.
Good idea!
We don't take an archive and compress it. Really the command should be New-Archive, and have Add-ArchiveFile, Remove-ArchiveFile and Update-ArchiveFile commands for changing an existing file and Show-ArchiveFile for listing its contents.
This is an interesting idea. I need to think about it some more before I get back to you.
Expand doesn't allow a subset of files to be extracted, copy-archiveFile would be ideal for that task but I'm not sure if that is as good a name for the common use of expand everything, but (again) changing the name sidesteps the problem of breaking existing scripts.
We can add a parameter to Expand-Archive
that allows a subset of files to be extracted. This way it won't break existing scripts and it can still be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather this module keep using -Force
for compatibility's sake.
This RFC is for this module rather than for PowerShell itself |
It does (as long as the archive does not have only one top-level folder and no other top-level items). There is a scenario where nesting the output of |
@SydneyhSmith / @SteveL-MSFT This module is shipped with PowerShell and as such should be treated the same as any other core module. The goal of a RFC is to get as many people to provide feedback as possible. There are over 100 people watching the RFC repo while this repo has 20. |
rfc/RFCNNNN-Archive-Module-v2.md
Outdated
|
||
Second, the module has limited performance compared to other archive software. | ||
Writing the next version of the module in C# instead of PowerShell Script is expected to improve the overall performance of the module. | ||
The module has limited cross-platform support because archive entries are written in an OS-specifc way due to different characters being used as path seperators in different OSs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should pivot what is being said here.
- There is new support in .NET for file types other than ZIP.
- Updating the PowerShell functionality to use the new functionality in .NET there provides as chance to address some existing problems.
- Performance is dictated by .NET - although calling from C# will reduce some of the overhead the time to do the compression or expansion is not going to change. If .NET's implementation is slower than 7Zip's , PowerShell won't fix that
- Supported formats are dictated by .NET, the goal should be to support any format supported by .NET but not to add formats.
- Where .NET offers options (like compression levels) and it makes sense to support them, the module should do so, but it shouldn't be a goal to support everything, nor to go above and beyond the options .NET exposes
- If the new module is bundled with PowerShell it is imperative that the module does not hinder the adoption of newer versions of PowerShell by breaking existing scripts
- Conversely if it fixes issues with wildcards (beyond users not knowing they need to use
-literalpath
) and improves compatibility it helps adoption of newer versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the description based on your feedback.
``` | ||
|
||
The archive preserves the relative structure of the input path. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will break existing scripts, which allow an absolute a relative path to be specified and will only create the MyFolder
folder in the Zip file.
At present Compress-Archive -Path MyGrandparentFolder/MyParentFolder/MyFolder
treats MyFolder
as the item to be compressed and creates a top level folder in the ZIP file,
but Compress-Archive -Path MyGrandparentFolder/MyParentFolder/MyFolder/*.*
puts the items in MyFolder at the top level of the zip.
Preserving the intermediate levels is useful, but should not be the default and should be enabled by a switch.
A zip archive is created with the name `destination.zip`. | ||
The archive contains all the files in or descendents of MyFolder. | ||
The archive **does not** retain the directory structure since `-Flatten` is specified. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-Flatten
implies all the files are in the same folder.
Is this actually intended as a DoNotBreakExistingScripts
switch ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it is not intended as a switch that maintains backwards compatibility.
#### **Relative Path Structure Preservation** | ||
|
||
When valid path(s) is supplied to the `-Path` or `-LiteralPath` parameter, the relative structure of the path is preserved as long as the path is not absolute, and does not contain ~ or .. (the path can contain other wildcard characters such as *, [, ], ?, and `). | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above that preserving the path will be a breaking change for existing scripts
In a PATH parameter File[12]
means File1
or File2
in a literal path [12] is allowed to mean the twelfth file of a a set. It is important that users can still use (e.g) "Log[0-9]*.txt" to include files with "log" a digit and anything else. but exclude "Logarithms.txt" - I think this probably understood, but it does not come through reading the text.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The -LiteralPath
parameter behaves the same as before in regard to interpreting wildcard characters literally (i.e., not expanding wildcard characters).
``` | ||
|
||
A zip archive is created with the name `destination.zip`. | ||
Notice the cmdlet sees the `.zip` extension and uses the `zip` format. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this apply to the other file extensions ? If -format
is only required when NOT using a standard extension that is a bonus.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-Format
is an option parameter. If -DestinationPath
has an extension that matches a supported format (e.g., .zip
, or .tar
), then the cmdlet will determine the archive as the format that matches the extension.
As a PowerShell user, | ||
I can expand archives quickly and compress files and folders quickly, | ||
so that I can reduce the cost of operating my server. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would drop this last point, as performance is not under the control of the module. Costs are only reduced by having fewer servers, or fewer people, and it's hard to see that faster compression would reduce head count or machine count.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated this point.
Our aim is to keep RFCs for modules in their own repositories. The PowerShell-RFC repository is used for RFCs for PowerShell as a whole. |
I updated the RFC with reduced parameter sets. There is a minor change to the condition in which relative path structure preservation is done. |
The existing command uses
Changing
Whatever the current behaviour is, there will be scripts that rely on it.
instead of
So we then do
|
@@ -256,7 +250,7 @@ Note that the `Documents` folder is not retained in the archive. | |||
|
|||
Example: `Compress-Archive -Path Documents -DestinationPath destination.zip -Flatten` creates an archive which only contains the files in or descended from `Documents` and these files are the top-level items. | |||
|
|||
The `-Flatten` parameter can be used with `-Update`. | |||
The `-Flatten` parameter can be used with `-WriteMode Update`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-Update
is used in existing scripts, so is needed to avoid breaking them.
-Writemode
is better , so the code needs to have an Update
switch with the DontShow flag set, and if specified this will set the value Writemode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same goes with -Force
to avoid a breaking change a hidden force parameter that sets WriteMode
to Overwrite
.
As a PowerShell user, | ||
I can expand archives quickly and compress files and folders quickly, | ||
so that I can reduce the cost of operating my server. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a design flaw with Zip which goes back to its origins on MS DOS. It stores the date/time as displayed - i.e. as local time, not UTC time. And when the file is extracted the time is set as local time. This means if compress and expand happen on machines with different time zone settings the time stamp on the files changes. Changing the file format to fix this is absolutely out of scope, however a -SourceTimeZone parameter could allow the unzip operation to create correctly dated files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with -SourceTimeZone
is that it seems like it is only needed for Zip and not other archive formats. How can we make this work for other formats as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If other formats store the as "wall time time with the offset from utc" - equivalent to Get-Date -Format "o"
or always use the same timezone (UTC , Pacific, whatever), then the best course is either to show a warning and ignore any time adjustment parameter, or possibly to apply it only if a -Force
or a similar switch is used. Zip has never become time zone aware, because either it would mean new files couldn't be unzipped by old programs, or files would get different dates depending on which version unzipped them.
Applying a fixed offset to all files doesn't work, because daylight savings time moves the clock in opposite directions in the Northern and Southern Hemisphere and is applied at different dates. .NET timezones are smart enough to know that what offset applies in particular timezone at a given time. (I wrote about it :-) https://jhoneill.github.io/2022/03/07/Timezones.html )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good idea. I think we can keep this on the backlog and add this feature as time permits.
PR Summary
This PR adds an RFC for the next version of the archive module. The RFC includes support for additional archive formats, relative path preservation, and more. The RFC addresses new features and usability enhancements for the archive module. Please read the RFC for additional details. Please offer feedback by August 6, 2022.
PR Context
The purpose of this PR is to get community feedback on the RFC.