Digital Sound & Music: Concepts, Applications, & Science, Chapter 5, last updated 6/25/2013

66

7. Use Huffman encoding on the resulting 576 quantized MDCT coefficients.

The result of the MDCT is 576 coefficients representing the amplitudes of 576 frequency

subbands. Huffman encoding is applied at this point to further compress the signal.

Huffman encoding is a compression method that assigns shorter codes to symbols that

appear more often in the signal or file being encoded and longer codes for symbols that appear

infrequently. Here’s a sketch of how it works. Imagine that you are encoding 88 notes from a

piano keyboard. You can use 7 bits for each note. (Six bits would be too few, since with six bits

you can represent only different notes.) In a particular music file, you have 100

instances of notes. The number of instances of each is recorded in the table below. Not all 88

notes are used, but this is not important to the example. (This is just a contrived example, so

don’t try to make sense of it musically.)

Note Instances

of Note

C4 31

B4 20

F3 16

D4 11

G4 8

E4 5

A3 4

B3 3

F4 2

There are 100 notes to be encoded here. If each requires 7 bits, that’s 700 bits for the

encoded file. It’s possible to reduce the size of this encoded file if we use fewer than 7 bits for

the notes that appear most often. We can choose a different encoding by building what’s called a

Huffman tree. We begin by creating a node (just a circle in a graph) for each of the notes in the

file and the number of instances of that note, like this:

(The initial placement of the nodes isn’t important, although a good placement can make the tree

look neater after you’ve finished joining nodes.) Now the idea is to combine the two nodes that

have the smallest value, making a node above, marking it with the sum of the chosen nodes’

values, and joining the new node with lines to the nodes below. (The lines connecting the nodes

are called arcs.) B3/3 and F4/2 have the smallest values, so we join them, like this: