This is an implementation of the Huffman coding algorithm.
How to use: As shown in main.py, create an object of the Huffman class. Then call Huffman.encode(filename) to encode a file, and Huffman.decode() to decode the encoded text. The encoded text is written to encoded_text.txt, and the decoded text to decoded_text.txt.
Hamming encoding is also included for error checking and correction. Hamming is implemented as follows:
- For encoding: the Huffman-encoded bits are split into 4-bit blocks; the last block may hold fewer than 4 bits.
- For decoding: the Hamming-encoded text is split into 7-bit blocks (every 4 bits are encoded as 7 bits, with 3 redundant bits added).
- To check the Hamming code, some errors are added on purpose (Huffman.__generateErrorCodes()).
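The block splitting described above can be sketched as follows. This is only an illustration: the helper name split_blocks is hypothetical and is not taken from the repository code.

```python
def split_blocks(bits, size):
    """Split a bit string into consecutive blocks of `size` bits.

    The last block may be shorter than `size`.
    """
    return [bits[i:i + size] for i in range(0, len(bits), size)]


# 4-bit blocks before Hamming encoding; the last block has only 2 bits
print(split_blocks("1011001110", 4))  # ['1011', '0011', '10']
```

The same helper with size 7 would split Hamming-encoded text for decoding.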
How Huffman encoding works:
- Find the frequency of every character.
- Make a node for every character; these nodes will form the tree.
- Add the nodes to a heap, because at every iteration we need the nodes with the minimum frequency.
- Build the binary tree and get its root node (for the exact details, see the code or a YouTube tutorial).
- Starting from the root node, walk the tree and assign a code to every leaf node, which gives a map from each character to its code.
- Iterate over the text and rewrite it using the map we created.
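The steps above can be sketched roughly as follows. This is a minimal sketch, not the code from this repository: the names Node, build_tree, build_codes and encode are assumptions made for illustration, and the text is assumed to be non-empty.

```python
import heapq
from collections import Counter
from itertools import count


class Node:
    """Tree node: a leaf stores a character, an internal node stores two children."""

    def __init__(self, freq, char=None, left=None, right=None):
        self.freq = freq
        self.char = char
        self.left = left
        self.right = right


def build_tree(text):
    """Build the Huffman tree for a non-empty text and return its root node."""
    tie = count()  # tie-breaker so the heap never has to compare Node objects
    heap = [(freq, next(tie), Node(freq, char))
            for char, freq in Counter(text).items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two nodes with the smallest frequencies
        f1, _, a = heapq.heappop(heap)
        f2, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), Node(f1 + f2, left=a, right=b)))
    return heap[0][2]


def build_codes(node, prefix="", codes=None):
    """Walk the tree: left adds '0', right adds '1'; leaves receive the code."""
    if codes is None:
        codes = {}
    if node.char is not None:
        codes[node.char] = prefix or "0"  # a one-character alphabet still needs a bit
    else:
        build_codes(node.left, prefix + "0", codes)
        build_codes(node.right, prefix + "1", codes)
    return codes


def encode(text):
    """Return the encoded bit string, the code map and the tree root."""
    root = build_tree(text)
    codes = build_codes(root)
    return "".join(codes[ch] for ch in text), codes, root


bits, codes, root = encode("abracadabra")
print(len(bits))  # 23 bits instead of 11 * 8 = 88
```

The counter in the heap entries is a design choice of this sketch: it keeps the heap ordering well defined when two nodes have the same frequency.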
How decoding works:
- Iterate over the encoded text, starting from the root node.
- If the current bit is 0, go to the left node; otherwise go to the right node.
- Check whether the current node is a leaf. If it is, write the character that this node stores and make the root the current node again.
- Repeat steps 2 and 3 until the end of the file is reached.
- At the end we get our original text.
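The decoding loop above can be sketched like this. It is an illustration only: the Node class and the decode function are hypothetical names, and the small tree is built by hand (a is 0, b is 10, c is 11) so that the example stays self-contained.

```python
class Node:
    """Tree node: a leaf stores a character, an internal node stores two children."""

    def __init__(self, char=None, left=None, right=None):
        self.char = char
        self.left = left
        self.right = right


def decode(bits, root):
    """Walk the tree bit by bit and emit a character at every leaf."""
    out = []
    node = root
    for bit in bits:
        # 0 means go left, anything else means go right
        node = node.left if bit == "0" else node.right
        if node.char is not None:  # leaf reached: emit and restart at the root
            out.append(node.char)
            node = root
    return "".join(out)


# Hand-built tree for illustration: a -> 0, b -> 10, c -> 11
root = Node(left=Node("a"), right=Node(left=Node("b"), right=Node("c")))
print(decode("010110", root))  # prints "abca"
```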
How we implement Hamming:
For encoding with Hamming we use the general algorithm:
- Find the number of parity bits.
- Place the parity bits at the indexes that are powers of 2.
- Generate the matrix.
- Find the values of the parity bits.
For error checking and correction:
- Do the same as in encoding.
- Build a binary number from the values of the parity bits.
- If it is 0, no errors were found.
- Otherwise the error is at the position given by the value of that binary number.
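Both procedures above can be sketched as follows. This is a simplified sketch under stated assumptions, not the repository code: the function names hamming_encode and hamming_correct are hypothetical, even parity is assumed, positions are counted from 1, at most one flipped bit per block is assumed, and the parity checks are computed directly instead of through an explicit matrix.

```python
def hamming_encode(data):
    """Encode a block of data bits with even-parity bits at positions 1, 2, 4, ..."""
    m = len(data)
    r = 0
    while 2 ** r < m + r + 1:  # smallest r with 2^r >= m + r + 1
        r += 1
    n = m + r
    code = [0] * (n + 1)  # index 0 is unused so that positions are 1-based
    bits = iter(data)
    for pos in range(1, n + 1):
        if pos & (pos - 1):  # not a power of two, so this is a data position
            code[pos] = int(next(bits))
    for i in range(r):
        p = 1 << i
        # Parity bit p covers every position whose binary form has bit p set
        code[p] = sum(code[pos] for pos in range(1, n + 1) if pos & p) % 2
    return "".join(map(str, code[1:]))


def hamming_correct(code):
    """Return (data bits, syndrome); a nonzero syndrome is the corrected position."""
    n = len(code)
    bits = [0] + [int(b) for b in code]
    syndrome = 0
    p = 1
    while p <= n:
        # A failed parity check contributes its position value to the syndrome
        if sum(bits[pos] for pos in range(1, n + 1) if pos & p) % 2:
            syndrome += p
        p <<= 1
    if 0 < syndrome <= n:
        bits[syndrome] ^= 1  # flip the erroneous bit back
    data = "".join(str(bits[pos]) for pos in range(1, n + 1) if pos & (pos - 1))
    return data, syndrome


print(hamming_encode("1011"))      # 0110011
print(hamming_correct("0110111"))  # ('1011', 5): bit 5 was flipped
```

For a 4-bit block this produces the 7-bit code mentioned earlier (4 data bits plus 3 parity bits); a shorter last block simply yields a shorter code word.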