Over the past year or so, we’ve talked quite a bit about digital photography and video production technologies in our Tech Tip section of the DMS newsletter. But we haven’t mentioned much about digital audio. We’ll address that this month with a primer on digital audio terminology.
Let’s start by breaking digital audio into two categories: uncompressed and compressed. This distinction separates professional, master-quality audio recordings (uncompressed) from the Web-based audio and consumer audio formats we use with our iPods (compressed).
This month, we’ll stick to discussing uncompressed audio and leave compressed audio for December.
When a rock band, orchestra or voiceover artist records in a studio, their audio is usually recorded using a computer-based recording system such as Digidesign’s Pro Tools. Basic professional audio recording standards require capturing audio signals via analog-to-digital converters using pulse code modulation at a 44.1 kHz sampling rate and a 16-bit sample depth. Let’s define these standards.
An analog-to-digital (A/D) converter is a bit of electronic circuitry used to transform an analog audio input signal, such as the sound of a musical instrument being fed through a microphone, into a digital representation of that signal that can be read and manipulated by a computer. (A digital-to-analog, or D/A, converter does the reverse job on playback.) The small audio input jack on your computer is linked to an A/D converter. Recording studios use much higher quality versions.
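Pulse Code Modulation (PCM) is the method used by the A/D converter to represent uncompressed digital audio as a series of 0’s and 1’s: the signal’s level is measured at regular intervals and each measurement is stored as a binary number. You might think of it as a type of computer language for audio.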
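To make the idea of PCM a little more concrete, here is a minimal Python sketch of what the converter does in principle: measure a signal at regular intervals and store each measurement as a binary number. The function name `pcm_encode`, the 440 Hz test tone and the half-scale amplitude are our own illustrative choices, and the "analog" input is simulated with a sine wave rather than read from real hardware.

```python
import math
import struct

SAMPLE_RATE = 44_100  # samples per second (the standard rate discussed in this article)
BIT_DEPTH = 16        # bits stored for each sample


def pcm_encode(frequency_hz, duration_s, amplitude=0.5):
    """Return signed 16-bit little-endian PCM bytes for a simulated sine tone.

    Illustrative helper only: a real converter measures a voltage,
    whereas here the "analog" signal is computed with math.sin.
    """
    max_level = 2 ** (BIT_DEPTH - 1) - 1          # largest positive value: 32767
    n_samples = round(SAMPLE_RATE * duration_s)   # how many measurements to take
    samples = []
    for n in range(n_samples):
        t = n / SAMPLE_RATE                       # time of this measurement, in seconds
        level = amplitude * math.sin(2 * math.pi * frequency_hz * t)
        samples.append(round(level * max_level))  # store the level as a whole number
    # "h" packs each value as a 16-bit signed integer; "<" selects little-endian order.
    return struct.pack(f"<{n_samples}h", *samples)


# One hundredth of a second of a 440 Hz tone.
pcm_bytes = pcm_encode(440.0, 0.01)
print(len(pcm_bytes), "bytes of PCM data")
```

Each sample occupies two bytes here because 16 bits equals 2 bytes, so the number of bytes is simply twice the number of samples.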
Sampling frequency (or sampling rate) refers to the number of individual samples recorded for each second of audio. Just as film and video systems record continuous motion using a series of individual still frames (at frequencies from 24 to 60 frames per second), digital audio is recorded using a series of individual samples to represent continuous sound. However, we’re much more sensitive to subtle changes in sound than we are to changes in motion; this requires us to use an audio sampling frequency much higher than that of film or video. Sampling theory tells us that higher sampling rates are required to record higher-pitched sounds. In fact, a sampling rate at least twice the frequency of the highest-pitched sound we wish to record must be employed in order to record lifelike digital audio. Lower-pitched sounds have longer wavelengths and higher-pitched sounds have shorter wavelengths. Human ears can respond to sounds with wavelengths as small as about 1.7 cm, which translates to a sound wave frequency of around 20,000 Hz (20 kHz). Therefore, in order to record the full spectrum of sounds humans can hear, we must use a sampling frequency of at least 40,000 Hz (40 kHz). Because the filters required for recording and playing back digital audio cannot cut off sharply at exactly half the sampling rate, and so affect the top portion of an audio signal, a slightly higher sampling frequency of 44.1 kHz was standardized, enabling accurate recording and playback of the full 20 kHz audio spectrum. Modern high-end audio recording systems can utilize even higher sampling frequencies such as 48 kHz, 96 kHz or even 192 kHz. However, very few people can distinguish the difference between these recordings and the standard 44.1 kHz recordings.
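The arithmetic behind these numbers can be sketched in a few lines of Python. This is a rough illustration rather than a definitive model: the function names are our own, and the speed of sound (343 meters per second, roughly dry air at room temperature) is an assumption that the article does not state. The last function illustrates what goes wrong when the sampling rate is too low: a pure tone above half the sampling rate is recorded as if it were a lower-pitched tone, an effect usually called aliasing.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value: dry air at about 20 degrees C


def frequency_from_wavelength(wavelength_m):
    """Frequency in Hz of a sound wave with the given wavelength in meters."""
    return SPEED_OF_SOUND_M_S / wavelength_m


def minimum_sampling_rate(highest_frequency_hz):
    """Lowest sampling rate able to capture the given frequency (twice that frequency)."""
    return 2 * highest_frequency_hz


def highest_recordable_frequency(sampling_rate_hz):
    """Highest frequency a given sampling rate can capture (half the rate)."""
    return sampling_rate_hz / 2


def apparent_frequency(tone_hz, sampling_rate_hz):
    """Frequency a sampled pure tone appears to have once it has been sampled.

    Tones above half the sampling rate fold back to a lower frequency.
    """
    nearest_multiple = round(tone_hz / sampling_rate_hz) * sampling_rate_hz
    return abs(tone_hz - nearest_multiple)


print(frequency_from_wavelength(0.017))     # a 1.7 cm wavelength is roughly 20 kHz
print(minimum_sampling_rate(20_000))        # 40000
print(highest_recordable_frequency(44_100)) # 22050.0
print(apparent_frequency(30_000, 44_100))   # 14100: a 30 kHz tone is recorded as 14.1 kHz
```

The gap between 20 kHz and the 22.05 kHz limit of a 44.1 kHz recording is the headroom left for the filters mentioned above.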
The bit depth (or sample depth) of an individual audio sample determines how accurately the subtleties and dynamics of an audio source can be recorded. (It should not be confused with bit rate, which measures how much data is used per second of audio.) Reflecting on your binary code basics, a one-bit sample - represented by a single 0 or 1 - means the sampled sound is either fully off or fully on. Not much dynamic range or variation there. Two bits provide four options (00, 01, 10, 11), which might translate to off, soft, medium and loud. Extrapolating that out to 16 bits, we get 65,536 different options for representing the loudness level of a sound, which is usually enough subtlety and variation to be considered lifelike. This also gives us a dynamic range of about 96 decibels, almost enough to accurately record the full sound of an orchestra performing The Planets.
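The figures above follow directly from the number of bits. As a small Python sketch (the function names are ours, and the decibel formula is the usual theoretical estimate for linear PCM, which ignores real-world noise and converter quality):

```python
import math


def loudness_levels(bit_depth):
    """Number of distinct values a sample of the given bit depth can hold."""
    return 2 ** bit_depth


def dynamic_range_db(bit_depth):
    """Theoretical dynamic range in decibels: 20 * log10(number of levels)."""
    return 20 * math.log10(loudness_levels(bit_depth))


for bits in (1, 2, 16):
    print(bits, "bits:", loudness_levels(bits), "levels,",
          round(dynamic_range_db(bits), 1), "dB")
```

Each additional bit doubles the number of levels and adds about 6 decibels of dynamic range, which is how 16 bits arrives at roughly 96 decibels.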
To recap, A/D converters use PCM to convert analog sound to uncompressed digital audio using binary code. The sampling rate of a recording determines what frequencies can be captured to a digital audio file, and a recording’s bit depth determines how much volume variation and dynamic range can be captured by each sample of an audio recording.
Next month, we’ll see how compressing audio by limiting these standard recording settings creates much smaller file sizes appropriate for posting on the Internet or loading on our portable music devices.