Lossless Audio Compression
Lossless audio compression allows one to preserve an exact copy of
one's audio files, in contrast to the irreversible changes from lossy
compression techniques such as Vorbis and MP3.
Compression ratios are similar to those for generic lossless data
compression (around 50–60% of original size), and substantially less
than for lossy compression (which typically yields 5–20% of original
size). The primary uses of lossless encoding are:
- Archives: For archival purposes, one naturally wishes to maximize quality.
- Editing: Editing lossily compressed data leads to digital generation loss, since the decoding and re-encoding introduce artifacts at each generation. Thus audio engineers use lossless compression.
- Audio quality: Being lossless, these formats completely avoid compression artifacts. Audiophiles thus favor lossless compression.
A specific application is to store lossless copies of audio, and
then produce lossily compressed versions for a digital audio player. As
formats and encoders improve, one can produce updated lossily
compressed files from the lossless master.
As file storage and communications bandwidth have become less
expensive and more available, lossless audio compression has become
more popular.
Formats
Shorten was an early lossless format; newer ones include Free Lossless Audio Codec (FLAC), Apple's Apple Lossless, MPEG-4 ALS, Monkey's Audio, and TTA.
Some audio formats feature a combination of a lossy format and a
lossless correction; this allows stripping the correction to easily
obtain a lossy file. Such formats include MPEG-4 SLS (Scalable to Lossless), WavPack, and OptimFROG DualStream.
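The idea of a lossy layer plus a lossless correction can be illustrated with a toy sketch. It assumes the lossy layer is nothing more than a coarse quantization of each sample, which is far simpler than what the formats named above actually do; `split_hybrid`, `merge_hybrid`, and the `step` parameter are hypothetical names chosen for illustration.

```python
def split_hybrid(samples, step=64):
    """Split integer samples into a coarse (lossy) layer and a correction layer."""
    # Round each sample to the nearest multiple of `step` using integer arithmetic.
    lossy = [(x + step // 2) // step * step for x in samples]
    # The correction is whatever the lossy layer got wrong.
    correction = [x - q for x, q in zip(samples, lossy)]
    return lossy, correction


def merge_hybrid(lossy, correction):
    """Add the correction back to recover the original samples exactly."""
    return [q + c for q, c in zip(lossy, correction)]


samples = [1000, -1234, 37, 20000]
lossy, correction = split_hybrid(samples)

print("lossy layer:", lossy)
print("correction: ", correction)
print("exact again:", merge_hybrid(lossy, correction) == samples)
```

Dropping `correction` leaves a usable but approximate signal, while keeping both layers reproduces the input exactly. The correction values stay small (within half a step), which is what makes them cheap to store.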
Some formats are associated with a technology, such as:
- Direct Stream Transfer, used in Super Audio CD
- Meridian Lossless Packing, used in DVD-Audio, Dolby TrueHD, Blu-ray and HD DVD
Difficulties in lossless compression of audio data
It is difficult to maintain all the data in an audio stream and
achieve substantial compression. First, the vast majority of sound
recordings are highly complex, recorded from the real world. As one of
the key methods of compression is to find patterns and repetition, more
chaotic data such as audio doesn't compress well. In a similar manner, photographs
compress less efficiently with lossless methods than simpler
computer-generated images do. Interestingly, even
computer-generated sounds can contain very complicated waveforms
that present a challenge to many compression algorithms. This is due to
the nature of audio waveforms, which are generally difficult to
simplify without a (necessarily lossy) conversion to frequency
information, as performed by the human ear.
The second reason is that the values of audio samples
change very quickly, so repeated strings of consecutive bytes rarely
occur and generic data compression algorithms do not work well for
audio. However, convolution with the filter [-1 1] (that is,
taking the first difference) tends to slightly whiten (decorrelate,
make flat) the spectrum, thereby allowing traditional lossless
compression at the encoder to do its job; integration at the decoder
restores the original signal. Codecs such as FLAC, Shorten and TTA use linear prediction
to estimate the spectrum of the signal. At the encoder, the estimator's
inverse is used to whiten the signal by removing spectral peaks while
the estimator is used to reconstruct the original signal at the decoder.
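The first-difference step described above can be sketched as follows. This is a minimal illustration, not how any of the named codecs store their residuals: it assumes 16-bit samples, uses `zlib` as a stand-in for the "traditional lossless compression" stage, and tests on a synthetic tone with slight noise. The function names are hypothetical.

```python
import math
import random
import struct
import zlib


def first_difference(samples):
    """Convolve with [-1 1]: each output is x[n] - x[n-1], with x[-1] taken as 0."""
    previous = 0
    residual = []
    for x in samples:
        residual.append(x - previous)
        previous = x
    return residual


def integrate(residual):
    """Running sum, the exact inverse of first_difference."""
    total = 0
    samples = []
    for d in residual:
        total += d
        samples.append(total)
    return samples


def packed_size(values):
    """Bytes produced by zlib for the values stored as 16-bit little-endian integers."""
    raw = struct.pack("<%dh" % len(values), *values)
    return len(zlib.compress(raw, 9))


# Synthetic test signal (an assumption): a 220 Hz tone at 44.1 kHz plus slight noise.
rng = random.Random(0)
signal = [
    round(10000 * math.sin(2 * math.pi * 220 * n / 44100)) + rng.randint(-3, 3)
    for n in range(4096)
]

residual = first_difference(signal)
restored = integrate(residual)

print("lossless round trip:", restored == signal)
print("zlib size, raw samples:", packed_size(signal))
print("zlib size, first difference:", packed_size(residual))
```

Because neighbouring samples are close in value, the differences span a much narrower range than the samples themselves, so the generic compressor has less to encode. A linear predictor generalizes this by estimating each sample from several previous ones rather than only the last.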
Evaluation criteria
Lossless audio codecs have no quality issues, so their usability can be judged by:
- Speed of compression and decompression
- Degree of compression
- Software and hardware support
- Robustness and error correction
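The first two criteria are straightforward to measure. The sketch below shows one possible harness; it uses `zlib` purely as a stand-in for a lossless codec and a placeholder byte string instead of real audio, and `evaluate` is a hypothetical helper, so the numbers it prints say nothing about any actual audio format.

```python
import time
import zlib


def evaluate(data, level):
    """Time a compress/decompress round trip and report the size ratio."""
    start = time.perf_counter()
    packed = zlib.compress(data, level)
    compress_seconds = time.perf_counter() - start

    start = time.perf_counter()
    unpacked = zlib.decompress(packed)
    decompress_seconds = time.perf_counter() - start

    return {
        "level": level,
        "lossless": unpacked == data,  # must always be True for a lossless codec
        "ratio": len(packed) / len(data),  # compressed size / original size
        "compress_seconds": compress_seconds,
        "decompress_seconds": decompress_seconds,
    }


# Placeholder input; a real comparison would read decoded PCM audio instead.
data = bytes(range(256)) * 400

for level in (1, 6, 9):
    result = evaluate(data, level)
    print(
        f"level {result['level']}: ratio {result['ratio']:.3f}, "
        f"compress {result['compress_seconds']:.4f} s, "
        f"decompress {result['decompress_seconds']:.4f} s"
    )
```

Timings vary from run to run and machine to machine, so a fair comparison would average several runs over a representative set of recordings.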
Lossy Audio Compression
Lossy audio compression is used in an extremely wide range of
applications. In addition to the direct applications (MP3 players or
computers), digitally compressed audio streams are used in most video
DVDs; digital television; streaming media on the internet;
satellite and cable radio; and increasingly in terrestrial radio
broadcasts. Lossy compression typically achieves far greater
compression than lossless compression (data of 5 percent to 20 percent
of the original stream, rather than 50 percent to 60 percent), by
discarding less-critical data.
The innovation of lossy audio compression was to use psychoacoustics
to recognize that not all data in an audio stream can be perceived by
the human auditory system. Most lossy compression reduces perceptual
redundancy by first identifying sounds which are considered
perceptually irrelevant, that is, sounds that are very hard to hear.
Typical examples include high frequencies, or sounds that occur at the
same time as louder sounds. Those sounds are coded with decreased
accuracy or not coded at all.
While removing or reducing these 'unhearable' sounds may account for
a small percentage of bits saved in lossy compression, the real savings
come from a complementary phenomenon: noise shaping.
Reducing the number of bits used to code a signal increases the amount
of noise in that signal. In psychoacoustics-based lossy compression,
the real key is to 'hide' the noise generated by the bit savings in
areas of the audio stream that cannot be perceived. This is done by,
for instance, using very small numbers of bits to code the high
frequencies of most signals - not because the signal has little
high-frequency information (though this is often true as well), but
rather because the human ear can only perceive very loud signals in
this region, so that softer sounds 'hidden' there simply aren't heard.
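The premise that fewer bits mean more noise can be checked numerically. The sketch below is an illustration under simple assumptions: a single sine wave as the test signal, plain uniform rounding as the quantizer, and no psychoacoustic model or noise shaping at all. `quantize` and `snr_db` are hypothetical helper names.

```python
import math


def quantize(x, bits):
    """Round x (expected in [-1, 1]) to a uniform grid with 2**(bits - 1) steps per unit."""
    scale = 2 ** (bits - 1)
    return round(x * scale) / scale


def snr_db(bits, n=10000):
    """Signal-to-noise ratio, in dB, of a sine wave quantized to `bits` bits."""
    signal_power = 0.0
    noise_power = 0.0
    for i in range(n):
        x = 0.9 * math.sin(2 * math.pi * 0.01234 * i)
        error = quantize(x, bits) - x  # the noise introduced by rounding
        signal_power += x * x
        noise_power += error * error
    return 10 * math.log10(signal_power / noise_power)


for bits in (16, 12, 8, 4):
    print(f"{bits:2d} bits: SNR about {snr_db(bits):.1f} dB")
```

Each bit removed costs roughly 6 dB of signal-to-noise ratio. A perceptual coder accepts that cost, spending fewer bits where the resulting noise is least likely to be heard.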
If reducing perceptual redundancy does not achieve sufficient
compression for a particular application, further lossy compression
may be required. Depending on the audio source, this still may not produce
perceptible differences. Speech, for example, can be compressed far more
than music. Most lossy compression schemes allow compression parameters
to be adjusted to achieve a target rate of data, usually expressed as a
bit rate.
Again, the data reduction will be guided by some model of how important
the sound is as perceived by the human ear, with the goal of efficiency
and optimized quality for the target data rate. (There are many
different models used for this perceptual analysis, some better suited
to different types of audio than others.) Hence, depending on the
bandwidth and storage requirements, the use of lossy compression may
result in a perceived reduction of the audio quality that ranges from
none to severe, but generally an obviously audible reduction in quality
is unacceptable to listeners.
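To relate a target bit rate to the size percentages quoted earlier, the arithmetic can be sketched as below. The source parameters (44.1 kHz, 16 bits, two channels, as on an audio CD) and the target rates are assumptions chosen for illustration, and both function names are hypothetical.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Bit rate of uncompressed PCM audio, in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def fraction_of_original(target_bit_rate, original_bit_rate):
    """Compressed size as a fraction of the uncompressed size."""
    return target_bit_rate / original_bit_rate


cd_rate = pcm_bit_rate(44100, 16, 2)  # assumed source: CD-style stereo PCM

for target_kbps in (64, 128, 256):
    share = fraction_of_original(target_kbps * 1000, cd_rate)
    print(f"{target_kbps} kbit/s is {share:.1%} of {cd_rate / 1000:.1f} kbit/s")
```

Under these assumptions a 128 kbit/s stream is about 9% of the uncompressed rate, which falls inside the 5 to 20 percent range given for lossy compression.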
Because data is removed during lossy compression and cannot be
recovered by decompression, some people prefer to avoid lossy
compression for archival storage. Hence, as noted, even those who use
lossy compression (for portable audio applications, for example) may
wish to keep a losslessly compressed archive for other applications. In
addition, the technology of compression continues to advance, and
achieving a state-of-the-art lossy compression would require one to
begin again with the lossless, original audio data and compress with
the new lossy codec. The nature of lossy compression (for both audio
and images) results in increasing degradation of quality if data are
decompressed, then recompressed using lossy compression.