Digital Sound & Music: Concepts, Applications, & Science, Chapter 5, last updated 6/25/2013
7. Use Huffman encoding on the resulting 576 quantized MDCT coefficients.
The result of the MDCT is 576 coefficients representing the amplitudes of 576 frequency
subbands. Huffman encoding is applied at this point to further compress the signal.
Huffman encoding is a compression method that assigns shorter codes to symbols that
appear more often in the signal or file being encoded and longer codes for symbols that appear
infrequently. Here’s a sketch of how it works. Imagine that you are encoding 88 notes from a
piano keyboard. You can use 7 bits for each note. (Six bits would be too few, since with six bits
you can represent only different notes.) In a particular music file, you have 100
instances of notes. The number of instances of each is recorded in the table below. Not all 88
notes are used, but this is not important to the example. (This is just a contrived example, so
don’t try to make sense of it musically.)
Note Instances
of Note
C4 31
B4 20
F3 16
D4 11
G4 8
E4 5
A3 4
B3 3
F4 2
There are 100 notes to be encoded here. If each requires 7 bits, that’s 700 bits for the
encoded file. It’s possible to reduce the size of this encoded file if we use fewer than 7 bits for
the notes that appear most often. We can choose a different encoding by building what’s called a
Huffman tree. We begin by creating a node (just a circle in a graph) for each of the notes in the
file and the number of instances of that note, like this:
(The initial placement of the nodes isn’t important, although a good placement can make the tree
look neater after you’ve finished joining nodes.) Now the idea is to combine the two nodes that
have the smallest value, making a node above, marking it with the sum of the chosen nodes’
values, and joining the new node with lines to the nodes below. (The lines connecting the nodes
are called arcs.) B3/3 and F4/2 have the smallest values, so we join them, like this:
Previous Page Next Page