Digital Sound & Music: Concepts, Applications, & Science, Chapter 5, last updated 6/25/2013
Notice that the left arc from each node has been labeled with a 0 and the right with a 1. The
nodes at the bottom represent the notes to be encoded. You can get the code for a note by
reading the 0s and 1s on the arcs that take you from the top node to one of the notes at the
bottom. For example, to get to B4, you follow the arcs labeled 0 0. Thus, the new encoding for
B4 is 0 0, which requires only two bits. To get to B3, you follow the arcs labeled 1 1 1 1 1 0.
The new encoding for B3 is 1 1 1 1 1 0, which requires six bits. All the new encodings are given
in the table below:
Note   Instances of Note   Code      Number of Bits
C4     31                  10        2
B4     20                  00        2
F3     16                  110       3
D4     11                  010       3
G4     8                   011       3
E4     5                   1110      4
A3     4                   11110     5
B3     3                   111110    6
F4     2                   111111    6
Note that not all notes are encoded in the same number of bits. This is not a problem for
the decoder because no code is a prefix of any other, so the decoder can tell where each
code ends without knowing its length in advance. With the encoding that we just derived, which
uses fewer bits for notes that occur frequently, we need only 31*2 + 20*2 + 16*3 + 11*3 + 8*3 +
5*4 + 4*5 + 3*6 + 2*6 = 277 bits as opposed to the 700 needed previously.
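The prefix property and the bit count above can be checked with a short script. The following is a minimal sketch in Python, not part of the original text: the note counts and codes are copied from the table, while the function names (is_prefix_free, total_bits, encode, decode) are illustrative choices.

```python
# Note -> (instances of note, code), copied from the table above.
TABLE = {
    "C4": (31, "10"),
    "B4": (20, "00"),
    "F3": (16, "110"),
    "D4": (11, "010"),
    "G4": (8, "011"),
    "E4": (5, "1110"),
    "A3": (4, "11110"),
    "B3": (3, "111110"),
    "F4": (2, "111111"),
}


def is_prefix_free(codes):
    """Return True if no code is a prefix of a different code."""
    return not any(a != b and b.startswith(a) for a in codes for b in codes)


def total_bits(table):
    """Sum of (instances * code length) over all notes."""
    return sum(count * len(code) for count, code in table.values())


def encode(notes, table):
    """Concatenate the codes of the given notes into one bit string."""
    return "".join(table[note][1] for note in notes)


def decode(bits, table):
    """Read bits left to right, emitting a note whenever a full code is seen."""
    lookup = {code: note for note, (_, code) in table.items()}
    notes, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:
            notes.append(lookup[current])
            current = ""
    if current:
        raise ValueError("bit string ends in the middle of a code")
    return notes


if __name__ == "__main__":
    codes = [code for _, code in TABLE.values()]
    print(is_prefix_free(codes))      # True
    print(total_bits(TABLE))          # 277
    bits = encode(["B4", "B3", "C4"], TABLE)
    print(bits)                       # 0011111010
    print(decode(bits, TABLE))        # ['B4', 'B3', 'C4']
```

Because the codes are prefix-free, the decoder never needs to look ahead: the first code that matches the bits read so far is the only one that can match.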
This illustrates the basic concept of Huffman encoding. However, it is realized in a
different manner in the context of MP3 compression. Rather than generate a Huffman table
based on the data in a frame, MP3 uses a number of predefined Huffman tables defined in the
MPEG standard. A variable in the header of each frame indicates which of these tables is to be
used.
Steps 5, 6, and 7 can be done iteratively. After an iteration, the compressor checks that
the noise level is acceptable and that the proper bit rate has been maintained. If there are more
bits that could be used, the quantizer step size can be reduced. If the bit limit has been exceeded,
quantization must be done more coarsely. The level of distortion in each band is also analyzed.
If the distortion is not below the masking threshold, then the scale factor for the band can be
8. Put the encoded data into a properly formatted frame in the bit stream.
The header of each frame is as shown in Table 5.4. The data part of each frame consists
of the scaled, quantized MDCT values.
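The iterative adjustment described for steps 5, 6, and 7 can be illustrated with a toy loop. The following Python sketch is only an illustration under stated assumptions and is not the procedure defined by the MPEG standard: the bit cost in estimate_bits is a made-up stand-in for Huffman coding, the step size grows by an arbitrary factor, and the masking-threshold check on each band is omitted. All function and parameter names are hypothetical.

```python
def quantize(values, step):
    """Uniformly quantize each value to an integer multiple of step."""
    return [round(v / step) for v in values]


def estimate_bits(quantized):
    """Stand-in bit cost (an assumption, not real Huffman coding):
    1 bit for a zero, otherwise 1 sign bit plus the magnitude bits."""
    return sum(1 if q == 0 else 1 + abs(q).bit_length() for q in quantized)


def rate_loop(values, bit_budget, step=1.0, growth=1.25, max_iter=100):
    """Quantize more coarsely until the estimated size fits the budget."""
    for _ in range(max_iter):
        quantized = quantize(values, step)
        bits = estimate_bits(quantized)
        if bits <= bit_budget:
            return step, quantized, bits
        step *= growth  # bit limit exceeded: use a coarser quantizer
    raise RuntimeError("bit budget not reached within max_iter iterations")


if __name__ == "__main__":
    # Made-up coefficient values, for illustration only.
    coefficients = [12.0, -7.5, 3.2, 0.4, -0.1, 25.0]
    step, quantized, bits = rate_loop(coefficients, bit_budget=15)
    print(step, quantized, bits)
```

A real encoder would also compare the distortion in each band with the masking threshold after each pass; this sketch covers only the bit-budget side of the loop.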
AAC compression, the successor to MP3, uses similar encoding techniques but improves
on MP3 by offering more sampling rates (8 to 96 kHz), more channels (up to 48), and arbitrary
bit rates. Filtering is done solely with the MDCT, with improved frequency resolution for
signals without transients and improved time resolution for signals with transients. Frequencies
over 16 kHz are better preserved. The overall result is that many listeners find AAC files to have
better sound quality than MP3 for files compressed at the same bit rate.