hachoir-parser: package of the most common file format parsers written for the Hachoir framework
foreni-packages/hachoir-parser
hachoir-parser is a package of the most common file format parsers written
for the Hachoir framework. Not all parsers are complete: some are very good
and others are poor, for example parsing only the first level of the tree.

A perfect parser has no "raw" field: with a perfect parser you can know the
meaning of *each* bit. Some good (but not perfect ;-)) parsers:

* Matroska video
* Microsoft RIFF (AVI video, WAV audio, CDA file)
* PNG picture
* TAR and ZIP archive

The GnomeKeyring parser requires the Python Crypto module:
http://www.amk.ca/python/code/crypto.html

Website: http://bitbucket.org/haypo/hachoir/wiki/hachoir-parser

hachoir-parser 1.3.4 (2010-07-26)
=================================

* update matroska parser to support WebM videos

hachoir-parser 1.3.3 (2010-04-15)
=================================

* fix setup.py: don't use with statement to stay compatible with python 2.4

hachoir-parser 1.3.2 (2010-03-01)
=================================

* Include the README file in the tarball
* setup.py reads the README file instead of using README.py to break the
  build dependency on hachoir-core

hachoir-parser 1.3.1 (2010-01-28)
=================================

* Create MANIFEST.in to include extra files: README.py, README.header,
  tests/run_testcase.py, etc.
* Create an INSTALL file

hachoir-parser 1.3 (2010-01-20)
===============================

* New parsers:

  - BLP: Blizzard Image
  - PRC: Palm resource

* HachoirParserList() is no longer a singleton: use
  HachoirParserList.getInstance() to get a singleton
* Add tags optional argument to createParser(), it can be used for example
  to force a parser
* Fix ParserList.print_(): first argument is now the title and not 'out'.
  If out is not specified, use sys.stdout.
* MP3: support encapsulated objects (GEOB in ID3)
* Create a dictionary: Windows codepage => charset name (CODEPAGE_CHARSET)
* ASN.1: support boolean and enum types; fix bit string parser
* MKV: use textHandler()
* AVI: create index parser, use file size header to detect padding at the end
* ISO9660: strip nul bytes in application name
* JPEG: add ICC profile chunk name
* PNG: fix transparency parser (tRNS)
* BPLIST: support empty value for markers 4, 5 and 6
* Microsoft Office summary: support more codepages (CP874, Windows 1250..1257)
* tcpdump: support ICMPv6 and IPv6
* Java: add bytecode parser, support JDK 1.6
* Python: parse lnotab content, fill a string table for the references
* MPEG Video: parse many more chunks
* MOV: Parse file type header, create the right MIME type

hachoir-parser 1.2.1 (2008-10-16)
=================================

* Improve OLE2 and MS Office parsers:

  - support small blocks
  - fix the charset of the summary properties
  - summary property integers are unsigned
  - use TimedeltaWin64 for the TotalEditingTime field
  - create minimum Word document parser

* Python parser: support magic numbers of Python 3000 with the keyword-only
  arguments
* Create Apple/NeXT Binary Property List (BPLIST) parser, written by
  Robert Xiao
* MPEG audio: reject files with neither a valid frame nor an ID3 header
* Skip subfiles in JPEG files

hachoir-parser 1.2 (2008-09-03)
===============================

* Create FLAC parser, written by Esteban Loiseau
* Create Action Script parser used in Flash parser, written by Sebastien Ponce
* Create Gnome Keyring parser: able to parse the stored passwords using
  Python Crypto if the main password is written in the code :-)
* GIF: support text extension field; parse image content (LZW compressed data)
* Fix charset of IPTC string (guess it, it's not always ISO-8859-1)
* TIFF: Sebastien Ponce improved the parser: parse image data, add many
  tags, etc.
* MS Office: guess the charset for summary strings since it could be
  ISO-8859-1 or UTF-8

hachoir-parser 1.1 (2008-04-01)
===============================

Main changes: add "EFI Platform Initialization Firmware Volume" (PIFV) and
"Microsoft Windows Help" (HLP) parsers.

Details:

* MPEG audio:

  - add createContentSize() to support hachoir-subfile
  - support file starting with ID3v1
  - if file doesn't contain any frame, use ID3v1 or ID3v2 to create the
    description

* EXIF:

  - use "count" field value
  - create RationalInt32 and RationalUInt32
  - fix for empty value
  - add GPS tags

* JPEG:

  - support Ducky (APP12) chunk
  - support Comment chunk
  - improve validate(): make sure that first 3 chunk types are known

* RPM: use bzip2 or gzip handler to decompress content
* S3M: fix some parser bugs
* OLE2: reject negative block index (or special block index)
* ip2name(): catch KeyboardInterrupt and don't resolve next addresses
* ELF: support big endian
* PE: createContentSize() works on PE program, improve resource section
  detection
* AMF: stop mixed array parser on empty key

hachoir-parser 1.0 (2007-07-11)
===============================

Changes:

* OLE2: Support files bigger than 6 MB (support many DIFAT blocks)
* OLE2: Add createContentSize() to guess content size
* LNK: Improve parser (now able to parse the whole file)
* EXE PE: Add more subsystem names
* PYC: Support Python 2.5c2
* Fix many spelling mistakes

Minor changes:

* PYC: Fix long integer parser (negative number), add (disabled) code to
  disassemble bytecode, use self.code_info to avoid replacing self.info
* OLE2: Add ".msi" file extension
* OLE2: Fix to support documents generated on Mac
* EXIF: set max IFD entry count to 1000 (instead of 200)
* EXIF: don't limit BYTE/UNDEFINED IFD entry count
* EXIF: add "User comment" tag
* GIF: fix image and screen description
* bzip2: catch decompressor error to be able to read trailing data
* Fix file extensions of AIFF
* Windows GUID use new TimestampUUID60 field type
* RIFF: convert class constant names to upper case
* Fix RIFF: don't replace self.info method
* ISO9660: Write parser for terminator content

Parser list
===========

Archive
-------

* 7zip: Compressed archive in 7z format
* ace: ACE archive
* bzip2: bzip2 archive
* cab: Microsoft Cabinet archive
* gzip: gzip archive
* mar: Microsoft Archive
* rar: Roshal archive (RAR)
* rpm: RPM package
* tar: TAR archive
* unix_archive: Unix archive
* zip: ZIP archive

Audio
-----

* aiff: Audio Interchange File Format (AIFF)
* fasttracker2: FastTracker2 module
* flac: FLAC audio
* itunesdb: iPod iTunesDB file
* midi: MIDI audio
* mod: Uncompressed amiga module
* mpeg_audio: MPEG audio version 1, 2, 2.5
* ptm: PolyTracker module (v1.17)
* real_audio: Real audio (.ra)
* s3m: ScreamTracker3 module
* sun_next_snd: Sun/NeXT audio

Container
---------

* asn1: Abstract Syntax Notation One (ASN.1)
* matroska: Matroska multimedia container
* ogg: Ogg multimedia container
* ogg_stream: Ogg logical stream
* real_media: RealMedia (rm) Container File
* riff: Microsoft RIFF container
* swf: Macromedia Flash data

File System
-----------

* ext2: EXT2/EXT3 file system
* fat12: FAT12 filesystem
* fat16: FAT16 filesystem
* fat32: FAT32 filesystem
* iso9660: ISO 9660 file system
* linux_swap: Linux swap file
* msdos_harddrive: MS-DOS hard drive with Master Boot Record (MBR)
* ntfs: NTFS file system
* reiserfs: ReiserFS file system

Game
----

* blp1: Blizzard Image Format, version 1
* blp2: Blizzard Image Format, version 2
* lucasarts_font: LucasArts Font
* spiderman_video: The Amazing Spider-Man vs. The Kingpin (Sega CD) FMV video
* zsnes: ZSNES Save State File (only version 143)

Image
-----

* bmp: Microsoft bitmap (BMP) picture
* gif: GIF picture
* ico: Microsoft Windows icon or cursor
* jpeg: JPEG picture
* pcx: PC Paintbrush (PCX) picture
* png: Portable Network Graphics (PNG) picture
* psd: Photoshop (PSD) picture
* targa: Truevision Targa Graphic (TGA)
* tiff: TIFF picture
* wmf: Microsoft Windows Metafile (WMF)
* xcf: Gimp (XCF) picture

Misc
----

* 3do: renderdroid 3d model.
* 3ds: 3D Studio Max model
* bplist: Apple/NeXT Binary Property List
* chm: Microsoft's HTML Help (.chm)
* gnomekeyring: Gnome keyring
* hlp: Microsoft Windows Help (HLP)
* lnk: Windows Shortcut (.lnk)
* ole2: Microsoft Office document
* pcf: X11 Portable Compiled Font (pcf)
* pdf: Portable Document Format (PDF) document
* tcpdump: Tcpdump file (network)
* torrent: Torrent metainfo file
* ttf: TrueType font

Program
-------

* elf: ELF Unix/BSD program/library
* exe: Microsoft Windows Portable Executable
* java_class: Compiled Java class
* pifv: EFI Platform Initialization Firmware Volume
* prc: Palm Resource File
* python: Compiled Python script (.pyc/.pyo files)

Video
-----

* asf: Advanced Streaming Format (ASF), used for WMV (video) and WMA (audio)
* flv: Macromedia Flash video
* mov: Apple QuickTime movie
* mpeg_ts: MPEG-2 Transport Stream
* mpeg_video: MPEG video, version 1 or 2

Total: 78 parsers
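
The introduction describes a perfect parser as one with no "raw" field. As a rough illustration only, the sketch below models a parsed file as a tree of fields and measures how many bits are described by non-raw leaves. It does not use the Hachoir API: `Field`, `leaves`, `coverage` and `is_perfect` are hypothetical names made up for this example, and the field sizes are invented.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Field:
    """Hypothetical stand-in for a parsed field (not a Hachoir class)."""
    name: str
    size_bits: int = 0            # size of a leaf; ignored when children exist
    is_raw: bool = False          # True when the parser does not interpret it
    children: List["Field"] = field(default_factory=list)


def leaves(node: Field) -> Iterator[Field]:
    """Yield every leaf field below node, depth first."""
    if not node.children:
        yield node
        return
    for child in node.children:
        yield from leaves(child)


def coverage(root: Field) -> float:
    """Return the fraction of bits that belong to non-raw leaf fields."""
    total = sum(leaf.size_bits for leaf in leaves(root))
    raw = sum(leaf.size_bits for leaf in leaves(root) if leaf.is_raw)
    if total == 0:
        return 1.0
    return (total - raw) / total


def is_perfect(root: Field) -> bool:
    """A parser output is 'perfect' here when no leaf is raw."""
    return all(not leaf.is_raw for leaf in leaves(root))


# Invented example: a 48-bit header that is fully described,
# followed by 48 bits the parser leaves as raw data.
example = Field("file", children=[
    Field("header", children=[
        Field("magic", size_bits=32),
        Field("version", size_bits=16),
    ]),
    Field("payload", size_bits=48, is_raw=True),
])

print(coverage(example))
print(is_perfect(example))
```

A parser that only handles the first level of the tree would, in this model, leave most of the file in a few large raw leaves and score a low coverage value.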