packing-box/dataset-packed-pe

 
 

Dataset of packed PE files

This is a fork of the dataset at https://github.com/chesvectain/PackingData in which mislabeled or duplicated samples have been cleaned up (e.g. UPX-packed samples found in the not-packed folder, or samples with the same hash in both the packer and not-packed folders).
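The duplicate-hash check mentioned above can be sketched as follows. This is only an illustration, not the procedure actually used to sanitize this dataset: the choice of SHA-256, the helper names (`sha256_of`, `cross_folder_duplicates`) and the assumed layout (one top-level folder per packer plus a not-packed folder) are all assumptions.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cross_folder_duplicates(root):
    """Return {digest: sorted folder names} for every digest that occurs
    in more than one top-level folder under root (assumed layout)."""
    root = Path(root)
    seen = defaultdict(set)
    for path in root.rglob("*"):
        parts = path.relative_to(root).parts
        # Skip directories and files lying directly in root (e.g. README.md).
        if not path.is_file() or len(parts) < 2:
            continue
        seen[sha256_of(path)].add(parts[0])
    return {h: sorted(folders) for h, folders in seen.items() if len(folders) > 1}
```

Calling `cross_folder_duplicates("path/to/dataset")` returns the digests that would need manual review; deciding which copy to keep is left to the user.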

It also includes a folder named outliers containing samples that we identified as likely to disturb our models: they were sorted among the not-packed samples but show characteristics of packed data. This dataset can be used for training machine learning models tailored to PE executable packing.

The labels folder contains a Python script that generates labels from the packer categories listed in the table in the packed folder's README.md, together with the resulting JSON dictionaries.
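To give an idea of what such label generation might look like, here is a minimal sketch. It is not the script shipped in the labels folder: the function names, the assumed layout (`<root>/<folder>/<sample>`) and the category values passed in are hypothetical placeholders, and the real categories are those in the packed folder's README.md.

```python
import json
from pathlib import Path


def build_labels(root, categories):
    """Map each sample file name to the category of its top-level folder.

    categories is a hypothetical {folder name: category} dictionary;
    samples in folders missing from it are left unlabeled.
    """
    root = Path(root)
    labels = {}
    for path in sorted(root.rglob("*")):
        parts = path.relative_to(root).parts
        # Skip directories and files lying directly in root.
        if not path.is_file() or len(parts) < 2:
            continue
        if parts[0] in categories:
            labels[path.name] = categories[parts[0]]
    return labels


def write_labels(labels, destination):
    """Dump the label dictionary as indented, key-sorted JSON."""
    with open(destination, "w", encoding="utf-8") as handle:
        json.dump(labels, handle, indent=2, sort_keys=True)
```

Keying the dictionary by file name assumes names are unique across folders; if they are not, the relative path or the file hash would be a safer key.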

⭐ Related Projects

You may also like these:

Example of a visualization created with Bintropy: