Given frequency data, produces Huffman codes for the unique messages.
- Node.java: Data structure for storing the Huffman tree.
- PairNode.java: Data structure used by PairingHeap.java.
- BinaryHeap.java: Binary heap implementation.
- PairingHeap.java: Pairing heap implementation.
- D_aryHeap.java: D-ary heap implementation, e.g. a 4-way heap.
- Gen_huffman_code.java: Compares the performance of the priority queues above for Huffman coding.
Usage:
~:src$ java Gen_huffman_code ../sample_input_large.txt
Reading input file ... Building Freq Table ... Done.
Building Huffman Tree using Binary Heap ...
Run:1.. 2.. 3.. 4.. 5.. 6.. 7.. 8.. 9.. 10.. Done
Average Time: 1663.0
Building Huffman Tree using Pairing Heap ...
Run:1.. 2.. 3.. 4.. 5.. 6.. 7.. 8.. 9.. 10.. Done
Average Time: 3417.0
Building Huffman Tree using 4-ary Heap ...
Run:1.. 2.. 3.. 4.. 5.. 6.. 7.. 8.. 9.. 10.. Done
Average Time: 1291.0
In this run the 4-ary heap is the fastest (lowest average time over 10 runs), so the encoder builds the Huffman tree with a 4-ary heap.
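For readers unfamiliar with D-ary heaps, the following is a minimal sketch of an array-backed d-ary min-heap. It is not the code in D_aryHeap.java: the class name `DaryMinHeapSketch`, its methods, and the use of plain `long` keys are assumptions made for illustration. With d = 4 the tree has roughly half as many levels as a binary heap, which may be one factor behind the timings above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a d-ary min-heap; not the repository's D_aryHeap.java.
class DaryMinHeapSketch {
    private final int d;                              // children per node, e.g. 4
    private final List<Long> items = new ArrayList<>();

    DaryMinHeapSketch(int d) {
        if (d < 2) throw new IllegalArgumentException("d must be >= 2");
        this.d = d;
    }

    int size() { return items.size(); }

    void insert(long key) {
        items.add(key);
        int i = items.size() - 1;
        // Sift up: the parent of index i is (i - 1) / d.
        while (i > 0) {
            int parent = (i - 1) / d;
            if (items.get(parent) <= items.get(i)) break;
            swap(i, parent);
            i = parent;
        }
    }

    long removeMin() {
        if (items.isEmpty()) throw new IllegalStateException("heap is empty");
        long min = items.get(0);
        long last = items.remove(items.size() - 1);
        if (!items.isEmpty()) {
            items.set(0, last);
            int i = 0;
            // Sift down: the children of index i are d*i + 1 .. d*i + d.
            while (true) {
                int smallest = i;
                for (int k = 1; k <= d; k++) {
                    int child = d * i + k;
                    if (child < items.size() && items.get(child) < items.get(smallest)) {
                        smallest = child;
                    }
                }
                if (smallest == i) break;
                swap(i, smallest);
                i = smallest;
            }
        }
        return min;
    }

    private void swap(int a, int b) {
        Long tmp = items.get(a);
        items.set(a, items.get(b));
        items.set(b, tmp);
    }
}
```

Calling `new DaryMinHeapSketch(4)` gives a 4-way heap; repeated `removeMin()` calls return the keys in ascending order.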
- encoder.java: Produces code_table.txt, which contains the codes for all unique messages, and encoded.bin, the binary file holding the codes for the input file.
Usage:
~:src$ java encoder ../sample_input_large.txt
Reading input file ...
Building Freq Table ...
Building huffman tree... Done.
Generating code_table.txt .. Done.
Generating encoded.bin .. Done.
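The tree-building and code-generation steps reported above can be sketched as follows. This is an illustrative version only: it uses `java.util.PriorityQueue` in place of the repository's heaps, and the names `HuffmanCodeSketch`, `TreeNode`, `buildTree`, and `codeTable`, as well as the choice of `String` messages, are assumptions rather than the actual contents of encoder.java or Node.java.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical sketch: frequency table -> Huffman tree -> code table.
class HuffmanCodeSketch {
    // Assumed node layout; the repository's Node.java may differ.
    static final class TreeNode {
        final long freq;
        final String symbol;          // null for internal nodes
        final TreeNode left, right;

        TreeNode(long freq, String symbol, TreeNode left, TreeNode right) {
            this.freq = freq;
            this.symbol = symbol;
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() { return left == null && right == null; }
    }

    // Repeatedly merge the two least frequent nodes until one root remains.
    static TreeNode buildTree(Map<String, Long> freq) {
        if (freq.isEmpty()) throw new IllegalArgumentException("empty frequency table");
        PriorityQueue<TreeNode> pq =
                new PriorityQueue<>((a, b) -> Long.compare(a.freq, b.freq));
        for (Map.Entry<String, Long> e : freq.entrySet()) {
            pq.add(new TreeNode(e.getValue(), e.getKey(), null, null));
        }
        while (pq.size() > 1) {
            TreeNode a = pq.poll();
            TreeNode b = pq.poll();
            pq.add(new TreeNode(a.freq + b.freq, null, a, b));
        }
        return pq.poll();
    }

    // Walk the tree, appending '0' for a left edge and '1' for a right edge.
    static Map<String, String> codeTable(TreeNode root) {
        Map<String, String> table = new HashMap<>();
        assign(root, "", table);
        return table;
    }

    private static void assign(TreeNode n, String prefix, Map<String, String> table) {
        if (n.isLeaf()) {
            // A single-message input would otherwise get an empty code.
            table.put(n.symbol, prefix.isEmpty() ? "0" : prefix);
            return;
        }
        assign(n.left, prefix + "0", table);
        assign(n.right, prefix + "1", table);
    }
}
```

The exact bit patterns depend on how ties between equal frequencies are broken, so two correct implementations can produce different tables with the same total encoded length.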
As a proof of concept, we reconstruct the input from code_table.txt and encoded.bin.
- decoder.java: Takes encoded.bin and code_table.txt as input and produces decoded.txt, which should be identical to the sample_input_large.txt given to the encoder.
Usage:
~:src$ java decoder encoded.bin code_table.txt
Building huffman tree from code_table.txt ..
Reading code_table.txt .. Done.
Reading encoded.bin .. Done.
Generating decoded.txt .. Done.
Run diff decoded.txt ../sample_input_large.txt to check correctness; no output means the files are identical.
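The decoding step (rebuilding a tree from the code table, then walking it bit by bit) can be sketched as below. This is a simplified illustration, not decoder.java: the names `HuffmanDecodeSketch`, `TrieNode`, `buildTrie`, and `decode` are hypothetical, the code table is passed as an in-memory map, and the bits are given as a string of '0' and '1' characters instead of being read from the packed bytes of encoded.bin, whose on-disk format is not described here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: code table -> decoding tree -> decoded messages.
class HuffmanDecodeSketch {
    static final class TrieNode {
        TrieNode zero, one;
        String symbol;                // non-null only at leaves
    }

    // Insert every code into a binary trie; each code ends at a leaf.
    static TrieNode buildTrie(Map<String, String> codeTable) {
        TrieNode root = new TrieNode();
        for (Map.Entry<String, String> e : codeTable.entrySet()) {
            TrieNode cur = root;
            for (char bit : e.getValue().toCharArray()) {
                if (bit == '0') {
                    if (cur.zero == null) cur.zero = new TrieNode();
                    cur = cur.zero;
                } else {
                    if (cur.one == null) cur.one = new TrieNode();
                    cur = cur.one;
                }
            }
            cur.symbol = e.getKey();
        }
        return root;
    }

    // Follow one edge per bit; emit a message and restart at each leaf.
    static List<String> decode(TrieNode root, String bits) {
        List<String> out = new ArrayList<>();
        TrieNode cur = root;
        for (int i = 0; i < bits.length(); i++) {
            cur = bits.charAt(i) == '0' ? cur.zero : cur.one;
            if (cur == null) {
                throw new IllegalArgumentException("bit sequence not in code table");
            }
            if (cur.symbol != null) {
                out.add(cur.symbol);
                cur = root;
            }
        }
        if (cur != root) throw new IllegalArgumentException("trailing incomplete code");
        return out;
    }
}
```

Because Huffman codes are prefix-free, the decoder never needs a separator between codes: reaching a leaf is enough to know a message has ended.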
Clone the repository and build:
~:$ git clone https://github.com/vishalkg/huffman_coding.git
~:$ cd huffman_coding/src
~:src$ make