The Huffman algorithm
Huffman codes can be used to compress information. Instead of storing each character in a file as an 8-bit ASCII value, we will store the more frequently occurring characters using fewer bits and less frequently occurring characters using more bits. Doing this, on average, should decrease the filesize. It uses Greedy algorithm to build a Huffman tree. Decoding is done by reading in the file bit by bit.
Summary
Huffman codes can be used to compress information. Instead of storing each character in a file as an 8-bit ASCII value, we will store the more frequently occurring characters using fewer bits and less frequently occurring characters using more bits. Doing this, on average, should decrease the filesize. It uses Greedy algorithm to build a Huffman tree. Decoding is done by reading in the file bit by bit.
Things to Remember
- Huffman codes can be used to compress information.
- Instead of storing each character in a file as an 8-bit ASCII value, we will store the more frequently occurring characters using fewer bits and less frequently occurring characters using more bits.
- Doing this, on average, should decrease the filesize.
- It uses Greedy algorithm to build a Huffman tree.
- Decoding is done by reading in the file bit by bit.
MCQs
No MCQs found.
Subjective Questions
No subjective questions found.
Videos
No videos found.

The Huffman algorithm
Huffman Codes
Huffman codes can be used to compress information. The basic idea is that instead of storing each character in a file as an 8-bit ASCII value, we will store the more frequently occurring characters using fewer bits and less frequently occurring characters using more bits. Doing this, on average, should decrease the filesize.
Huffman Coding Algorithm
As an example, let's take the string: "duke blue devils" which is to be encoded using the Huffman Algorithm.
- We first do a frequency count of the characters:
e:3, d:2, u:2, l:2, space:2, k:1, b:1, v:1, i:1, s:1
- Next, we use a Greedy algorithm to build up a Huffman tree.
We start with the nodes for each character.

- We then pick the nodes with the smallest frequency and combine them together to form a new node.
The selection of these nodes is the Greedy part.
- The two selected nodes are removed from the set but replace by the combined node.
- This continues until we have only 1 node left in the set.










- Now we assign codes to the tree by placing a 0 on every left branch and a 1 on every right branch.
- A traversal of the tree from root to leaf give the Huffman code for that particular leaf character
Note that no code is the prefix of another code.

These codes are then used to encode the string.
Thus,“duke blue devils” turns into:
010 011 1110 00 101 11110 100 011 00 101 010 00 11111 1100 100 1101
When grouped into 8-bit bytes:
01001111 10001011 11101000 11001010 10001111 11100100 1101xxxx
Thus, it takes 7 bytes of space compared to 16 characters * 1 byte/char = 16 bytes uncompressed.
Decoding
Decoding or Uncompressing works by reading the file bit by bit
- Start at the root of the tree
- If a 0 is read, head left
- If a 1 is read, head right
- When a leaf is reached decode that character and start over again at the root of the tree
Thus, we need to save Huffman Table information as a header in the compressed file, which doesn't add a significant amount of size to the file for large files.
References
1.http://www.cs.uwec.edu/~stevende/cs252/programs/huffman.ppt
2. G. Brassard and P. Bratley, “Fundamentals of Algorithms”
3. R. L. Kruse, B. P. Leung, C. L. Tondo, “Data Structure and Program designin C”
4. L, Robert. "Data Structures and Algorithms in24Hours"
Lesson
Trees
Subject
Computer Engineering
Grade
Engineering
Recent Notes
No recent notes.
Related Notes
No related notes.