ArchiveFlow is a fluent API for streamlined, efficient processing of file archives and unarchived files. It lets you focus on your processing logic instead of on file and zip handling code.
- Fluent interface for easy configuration and usage.
- Support for both zipped and unzipped file processing.
- Support for .zip, .7z, and .rar archives.
- Customizable file filtering based on extensions and custom predicates.
- Options for reading files as text, binary, or streams.
- Parallel processing capabilities with configurable degrees of parallelism.
- Extensible design for future enhancements.
- Exception handling for robust processing.
- Support for a wide range of frameworks.
To use ArchiveFlow in your project, add the following package to your dependencies:
dotnet add package ArchiveFlow
Here's a simple example to get you started with ArchiveFlow. It processes every entry in the archive files in the specified folder as text. By default, ArchiveFlow processes all entries in the archive files in the folder (non-recursive) and ignores non-archive files.
var builder = new FileProcessorBuilder()
    .FromFolder("./your/path")
    .ProcessAsText((f, t) =>
    {
        // f describes the current file, t holds its text content
        // Your text processing logic here
    });

builder.Build().ProcessFiles();
Here's a more advanced example. It processes XML files in the specified folder and its subfolders, both loose files and entries inside archives, and only considers archives modified within the last 10 days. It skips files whose name contains "ReturnValue" and parses the text of every other file as XML. It also sets the maximum degree of parallelism to the number of processors on the machine and ignores exceptions caused by corrupted zip files.
using System.Collections.Concurrent;
using System.Linq;
using System.Xml.Linq;

// Use a concurrent dictionary because files are processed on multiple threads
var dict = new ConcurrentDictionary<string, byte>();

var builder = new FileProcessorBuilder()
    .FromFolder("/folder/with/xmlfiles_archived_or_not", FolderSelect.RootAndSubFolders)
    .SetArchiveSearch(ArchiveSearch.SearchInAndOutsideArchives)
    // Only open archives modified in the last 10 days
    .FromZipWhere((z) => z.LastModified > DateTime.Now.AddDays(-10))
    // Skip files whose name contains "ReturnValue"
    .WhereFile((f) => !f.FileName.Contains("ReturnValue"))
    .ProcessAsText((f, t) =>
    {
        XDocument xdoc = XDocument.Parse(t);
        (string? id, string? name) =
            (xdoc.Descendants("Id").FirstOrDefault()?.Value,
             xdoc.Descendants("Name").FirstOrDefault()?.Value);
        // The dictionary is used as a thread-safe set of "id_name" keys
        dict.TryAdd($"{id}_{name}", 0);
    })
    .WithMaxDegreeOfParallelism(Environment.ProcessorCount)
    .HandleExceptionWith((f, ex) =>
    {
        if (f.Extension == ".zip" && ex is InvalidOperationException)
        {
            // Ignore these exceptions for zip files (corrupted zip)
            return true;
        }
        return false;
    });

builder.Build().ProcessFiles();
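To see what ends up in the dictionary, the key-building step from the ProcessAsText lambda can be tried on its own, without ArchiveFlow. The sketch below is only an illustration: the helper name BuildKey and the sample XML strings are made up for this example and are not part of the library.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper (not part of ArchiveFlow): mirrors the key-building
// logic of the ProcessAsText lambda so it can be run in isolation.
static string BuildKey(string xmlText)
{
    XDocument xdoc = XDocument.Parse(xmlText);
    string? id = xdoc.Descendants("Id").FirstOrDefault()?.Value;
    string? name = xdoc.Descendants("Name").FirstOrDefault()?.Value;
    // A missing element yields null, which interpolates as an empty string.
    return $"{id}_{name}";
}

// Sample documents invented for illustration.
string complete = "<Item><Id>42</Id><Name>Widget</Name></Item>";
string missingName = "<Item><Id>7</Id></Item>";

Console.WriteLine(BuildKey(complete));     // prints "42_Widget"
Console.WriteLine(BuildKey(missingName));  // prints "7_"

// The dictionary acts as a thread-safe set: TryAdd returns false
// when the key was already added, so duplicates are dropped.
var keys = new ConcurrentDictionary<string, byte>();
Console.WriteLine(keys.TryAdd(BuildKey(complete), 0));  // prints "True"
Console.WriteLine(keys.TryAdd(BuildKey(complete), 0));  // prints "False"
```

Note that a file without an Id or Name element still produces a key (for example "7_"), so you may want to check for null values before adding if such files should be skipped.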
Check out this fiddle for a working example: https://dotnetfiddle.net/sIwHrW
Contributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Distributed under the MIT License.
Dominique Biesmans - https://www.linkedin.com/in/dominiquebiesmans/
Project Link: https://github.com/domibies/archive-flow