By Adam Ots
Audio dynamics processing is one of the more difficult processing types to understand. It is easy to conceptualize a standard volume control and the effect it has on audio. It is also relatively easy to understand equalization (EQ) controls and hear the effect on an output. Dynamics processing is not as straightforward to comprehend or to implement, yet it is just as important within a professional environment. This article by Adam Ots seeks to unravel some of the mystery, and later sections deal specifically with the dynamics processor built into Ots Labs audio software applications.
To oversimplify, dynamics processing deals with the level (volume) of audio. Unlike a standard volume knob, however, which is a static control that is generally set and left, a dynamics processor actively deals with the audio passing through, making adjustments to the level as required. The magnitude and speed of the adjustments made to the audio level depend upon the audio itself. A dynamics processor strives to contain audio levels within certain predefined limits. A good processor achieves this transparently, meaning that the perceived change to the audio is negligible and audio quality is not compromised.
Dynamics processors are used within all major forms of broadcast (radio, television) and during the production of commercial audio (records, CDs, etc), as well as live events (concerts, music presentation at clubs, etc). Their use dates back to the first half of last century, and dynamics processing is an absolutely vital form of processing for most environments. This article is more concerned with the dynamics processing that occurs during presentation, rather than that applied during the mastering process (part of the production stage) of music. During presentation of audio material, a dynamics processor will allow you to smoothly and transparently restrict audio output to be within a manageable range. Venue-specific factors determine the range that is desired, the upper limit of audio level that is acceptable, etc. In essence, a dynamics processor is all about audio level control.
How a Dynamics Processor Helps You
How can a dynamics processor help you? Whether you play music to a hall full of people or to a few dinner guests in your dining room, a dynamics processor can greatly enhance the experience. In many cases it is absolutely imperative. All music varies in its use of dynamics. In the recording studio, some songs are mastered to sound punchy over a radio station, others are light and have an "air" about them. Some musical pieces have a fairly consistent audio level throughout their entire duration, others start slowly and quietly and build to a loud and aggressive climax. These attributes all form part of the inherent use of dynamics within a musical recording. This is the way the artist or mastering engineer intended the musical piece to be heard. If you just want to listen to that piece of music in all its glory, you would sit yourself down in your favorite chair in front of your best hi-fi system, crank up the volume, and listen to the recording without any alterations whatsoever. This is the ultimate way to listen to a particular song! Even an entire album (not a compilation or greatest hits album) can be enjoyed this way, since all songs are usually mastered within the one context. For most people however, this is something seldom done, and is generally not practical for most day-to-day uses of music.
Background Music
Since all albums are mastered differently, as soon as you start to play one song, then another from another album, then another, and so on, you are moving between pieces of music which contain different (sometimes vastly different) dynamics attributes. The reference levels will often be different as will the inherent dynamics. The classic example is to load a multi-disc CD player with five different CDs, some recorded from the 80s, some from the 90s, some current generation, etc, press "random play" and listen to the result. You certainly will not obtain a smooth musical experience. Add to this different styles or genres of music, and the mish-mash gets even worse. You set the volume correctly for the first piece that is playing and go and sit down with your guests, then the next song comes on and blasts you all out, or at the very least is uncomfortably loud, intruding on the conversation. You lower the volume to get it "right" for that piece and sit back down. The next song comes on and you are all straining your ears to hear whether there is still music playing, and what the song is! A dynamics processor solves these problems.
When you are playing songs from different artists, or even from the same artist but different albums, you are going to encounter varied dynamics. An album is only ever mastered within the context of itself. Unfortunately, unlike with movie soundtracks, no one standard or set of parameters is adhered to by mastering engineers when they master a CD. Play a track from an album alongside songs from other albums and you are crossing a line. Even within the context of a single song, the inherent dynamics may be unacceptable for your particular application. Take a song like Stairway to Heaven, which starts out very quietly and ends with a climax. As dinner music, this song will probably be either too quiet at the start, or too offensively loud at the end. You can enjoy all music, even in the context of background music, if you use a dynamics processor. The processor will constantly adjust the audio levels for you, maintaining a smooth output at all times. You simply set the volume once, and forget about it!
DJing to a Large Audience
Ok, so a dynamics processor is great for background music. What about playing to a party hall full of people? The same principles apply. Although the music may not be in the context of merely background or mood music, you still want some form of regulation. If it plays too loudly it may distort or even damage your equipment, or exceed the threshold of pain for human hearing. If it plays too quietly, people will feel disconcerted, especially when it changes from loud to quiet. Often a DJ is trying to maintain a very loud level, but must not exceed a certain level for equipment/safety/comfort reasons. Only a dynamics processor can seamlessly achieve this, without missing a beat, and all while allowing different songs to blend smoothly into each other. You are now in effect the mastering engineer: you are creating a new audio production, one which consists of all of the songs in your playlist, an entity in itself, and which is designed to be heard within the context of your venue. If you beat mix, then while you are overlapping two songs which both contain a heavy bass-beat, you don't need to worry about the bass overloading your equipment or sounding too solid -- the dynamics processor takes care of this and the results are smooth.
Broadcast Use
What about broadcast use? If you run a professional radio station, then you already know about dynamics processors. They are a legal requirement as they prevent overmodulation of the radio spectrum, which causes interference to other stations. They also provide a "signature sound" which allows the station to sound tight, punchy, or whatever, no matter what material they are playing. Big stations spend a lot of money getting their sound just right, as it is proven that this attracts listeners, and more listeners means more advertising dollars. People who listen to the radio don't want to have to constantly ride their volume knob! If you run an internet radio station, you can still benefit from a dynamics processor, and will want to, if you want your sound to be on par with the big boys. Your listeners want to hear good music, not think about it being too loud or too quiet. And it's important to remember that these days most people listen to the radio as an accompaniment to some other activity, not as an activity in itself. Therefore you don't want to jolt them around too much. Since you are playing many different songs from different albums, you will need a dynamics processor to achieve smooth levels. You may also want to try and develop a signature sound.
Automotive Environment
What about automotive use? Music in cars is very interesting. Since cars are an inherently noisy environment, you can never really hear and appreciate the full natural dynamics of a commercial studio recording. Ever notice how great your CD system sounds when you're stopped at the traffic lights, or sitting idle in your driveway with the engine off? That's because the background noise is minimal and so a greater dynamic range can be heard and appreciated. Most of the time you are not sitting in your driveway, so you have to make do with the limited audio dynamic range environment that a running motor vehicle's cabin provides. When listening to CDs, this often means that you'll find yourself turning the quiet parts up, and the louder parts down. Pretty annoying, and certainly not the best way to enjoy music. Interestingly, this is one of the reasons why commercial FM stations process their audio fairly heavily -- they have worked out that their greatest listener base is people driving cars, and therefore the available dynamic range at the "venue" -- in the car -- is limited, so they process aggressively to keep things very smooth and regulated. The quiet parts of songs can still easily be heard above the engine noise. The loud parts do not peak out and cause distortion on your system, or become offensively loud. You set the volume once and forget about it -- except when you get to the lights and everything sounds too loud again -- the environment has changed -- but that's a different matter!
Types of Processors
Dynamics processors come in many different types and flavours. Traditionally, they are a piece of hardware -- a black box with electronic wizardry inside. Prices vary from hundreds of dollars for an inexpensive and low quality unit to about $15,000 for a top-of-the-line broadcast unit that a commercial radio station would use. The term "dynamics processor" loosely means a processor which controls or regulates audio dynamics (changing levels). There are different types of processors which come under this umbrella and a good dynamics processor will generally contain many or all of them.
An AGC (Automatic Gain Control) is a processor which smoothly raises or lowers the volume of audio relatively slowly over time, seeking a specific target volume, and operating within a specific range of allowable gain.
A Compressor is similar to an AGC, except it generally operates much more quickly, making adjustments to the audio level within milliseconds, rather than seconds. Its adjustments are based on how much the input audio level is exceeding a preset threshold. The more the input audio is exceeding the threshold, the more aggressively the compressor will bring the audio level down to compensate.
A Limiter is similar to a compressor, except that it virtually instantaneously brings down the level of any audio exceeding a preset threshold, such that it does not exceed this threshold at all at the output. This basically guarantees that the audio level will never exceed a particular level, no matter how loud the incoming signal is.
For completeness it should also be noted that there are other processors within the dynamics processor family such as the Expander, which is basically the inverse of a compressor, and the Gate. These are of little relevance to this discussion however.
Naturally the quality of dynamics processors varies greatly, as does their intended use. Some are designed specifically for musicians, for use on a particular instrument, and may offer only compression and/or limiting. The type of processors we are concerned with in this article are those designed for presentation of pre-recorded audio. This is a more difficult type of processor to design and build because it is important that the underlying dynamics of a musical item are preserved (transparency), while still allowing for the required level of control. These types of processors are usually more expensive, and generally the overall level of audio quality increases with cost, as more sophisticated DSP (digital signal processing) algorithms are employed.
The Ots Labs Dynamics Processor
The Dynamics Processor which ships with Ots Labs audio applications is an entirely software implementation. These days even many hardware dynamics processors -- especially the more expensive ones -- are actually software implementations internally (they contain a DSP chip or chips which are executing program code). In software many things can be achieved which are impossible or more difficult/expensive to achieve in a pure hardware implementation. The Ots Labs processor is a full-suite dynamics processor comprising an AGC, Compressor and Limiter. The audio quality is excellent, as attested independently by numerous customer testimonials, and it is used around the world in a wide range of environments and applications among thousands of installations. Like any processor, knowing how to correctly set up and use it is key to obtaining the best results.
Fortunately this is made easier by the fact that the processor contains a number of presets which satisfy most environments, and also because the Ots Labs audio pipeline creates a regulated environment which is more conducive to great results. Still, for those that want to go beyond using the presets, an understanding of all controls is invaluable.
The Presets
First let's look at the presets: Lounge, DJ, Party, Office and Radio. Simplistically speaking, from left to right these presets increase in the aggressiveness of processing that they invoke. The Lounge preset actually disables the dynamics processor, except for the limiter, which is always on for safety. The Radio preset heavily compresses all audio and provides quite a punchy sound, albeit at the cost of slightly increased distortion for some source material. The default preset, Party, is the most well-balanced, providing a good amount of level regulation while still maintaining the music's natural dynamics and a high degree of transparency. The DJ preset is suitable for applications where you do not want the AGC to bring up low levels, as the AGC is disabled for this preset. If you do not wish to learn about the specifics of each control, don't worry, you can still achieve excellent results! Use the Party preset, or if you have a more specific application, use the preset which sounds best for that application. However, for those who have a thirst for more knowledge, read on!
Understanding the Decibel Scale
To set up a dynamics processor, it is important to have some understanding of the industry standard unit of audio measurement, the dB (decibel). The dB is not technically a unit at all, but rather a scale. A logarithmic scale was chosen because the ear does not perceive audio loudness linearly, but rather logarithmically. If you double the output power of an amplifier from 25 watts RMS to 50 watts RMS, and leave all other variables the same, the audience will not perceive the sound to be twice as loud, but rather just a little bit louder. This is because the ear is designed to hear very quiet sounds, such as a pin drop, right through to very loud sounds, such as a jet engine. In order to take in such a vast range, the ear perceives sounds based upon a logarithmic scale. You have to output about 10 times the power in order for something to sound twice as loud. A 250 watt amplifier will sound roughly twice as loud as a 25 watt one. In a sense, the dB scale allows you to deal with sound power levels as if they were on a linear scale. Increasing your output by 3 dB will always sound the same amount louder, irrespective of what the original output dB level was. By contrast, the difference perceived by adding 3 watts of power to your output will depend upon what the original output power was.
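The relationship between power and dB described above can be checked numerically. The following is a minimal sketch using the standard definition of a power ratio in decibels, 10 * log10(P2 / P1); the function names are illustrative only and are not part of any Ots Labs software.

```python
import math


def power_ratio_to_db(new_watts: float, reference_watts: float) -> float:
    """Level change in dB when power goes from reference_watts to new_watts."""
    return 10.0 * math.log10(new_watts / reference_watts)


def db_to_power_ratio(db: float) -> float:
    """Inverse conversion: the power multiplier that a dB change represents."""
    return 10.0 ** (db / 10.0)


# Doubling the power (25 W to 50 W) is only about a 3 dB increase.
print(round(power_ratio_to_db(50, 25), 2))    # 3.01
# Ten times the power (25 W to 250 W) is a 10 dB increase.
print(round(power_ratio_to_db(250, 25), 2))   # 10.0
# The same 3 dB step applies at any starting power, e.g. 100 W to 200 W.
print(round(power_ratio_to_db(200, 100), 2))  # 3.01
```

Note how a fixed number of added watts gives a different dB change depending on the starting power, whereas a fixed power ratio always gives the same dB change.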
There are different dB scales, depending on what you are trying to measure. For example, if you are measuring absolute sound pressure levels in a room, you would use the dB SPL scale. This scale starts at 0 dB (roughly the threshold of human hearing), and increases upwards as the measured sound pressure levels are greater (louder). Every 6 dB of increase doubles the sound pressure, while an increase of about 10 dB (ten times the power) sounds about twice as loud to a human. For example, a sound at 28 dB will sound about twice as loud as a sound at 18 dB. Accordingly, a sound at 64 dB will sound about twice as loud as a sound at 54 dB. This scale is great for measuring absolute levels, as they exist within physical space (the air), but is not so good for measuring audio levels on recording mediums such as CDs, tapes, records, etc.
For recording mediums, the accepted standard is to use a negative scale which starts at 0 dB, and works backwards to negative values, such as -12 dB, -60 dB, etc. In this scale, 0 dB normally represents the maximum or peak level of audio that the particular recording medium can represent without distortion. All sounds lower than this peak level are measured with negative dB values. Explaining why this scale is better suited to recording mediums is beyond the scope of the article, but a hint is that it has to do with the fact that different recording mediums have different dynamic ranges (the difference between the loudest and quietest sounds) that they can represent/store, and it's almost always the peak level (0 dB) that you are interested in matching when converting between different formats and mediums.
The Ots Labs dynamics processor also uses this scale. In your mind, think of 0 dB as being the loudest possible sound that your computer's sound card can output (you may have to turn your sound card's output volume up for this to truly be the case). Then think of any negative dB value as simply a lower level, relative to this peak maximum output level. For example, -6 dB is about half the signal amplitude of 0 dB, -12 dB is about half the amplitude of -6 dB, and so on. In most parts of the Ots Labs audio pipeline, -96 dB is considered to be absolute zero (no sound). This is in line with the dynamic range offered by the Red Book Audio CD medium. A good comprehension of the above concepts is vital to properly understanding the various dynamics processor controls.
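To make the negative scale more concrete, here is a small sketch that converts between dB values relative to peak and linear sample amplitude, where 1.0 stands for the peak (0 dB) level. It assumes the usual amplitude convention of 20 * log10(amplitude), which the article does not spell out, and the function names are hypothetical. It also shows where the -96 dB figure comes from for 16-bit audio such as Red Book CD.

```python
import math


def db_to_amplitude(db: float) -> float:
    """Linear amplitude factor for a level in dB relative to peak (0 dB = 1.0)."""
    return 10.0 ** (db / 20.0)


def amplitude_to_db(amplitude: float) -> float:
    """Level in dB relative to peak for a linear amplitude (must be > 0)."""
    return 20.0 * math.log10(amplitude)


def pcm_dynamic_range_db(bits: int) -> float:
    """Approximate dynamic range of linear PCM with the given bit depth."""
    return 20.0 * math.log10(2 ** bits)


print(round(db_to_amplitude(-6.0), 3))      # 0.501, about half the peak amplitude
print(round(db_to_amplitude(-12.0), 3))     # 0.251, about a quarter
print(round(pcm_dynamic_range_db(16), 2))   # 96.33, hence roughly -96 dB as "no sound"
```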
Audio Flow
When audio enters the Ots Labs Dynamics Processor it first has the input gain applied to it, as set by the input gain control. After this it flows through the AGC module. The output of the AGC is fed into the Compressor module, and this output is then fed into the Limiter module. The final output is then scaled down (if the output gain is set to a value lower than 0 dB), and then fed to the sound card. It is important that a dynamics processor is the last component in an audio pipeline, as it is responsible for ensuring that audio levels never exceed 0 dB, which would result in clipping distortion. Other Ots Labs processors, such as the Graphic Equalizer, are always placed earlier in the pipeline.
The AGC, Compressor and Limiter modules can each be independently disabled (bypassed), by pressing the module's button at the top of the processor. When a module is bypassed, it is as if that module was not present in the processor's audio pipeline. It is usually unwise to bypass the Limiter as clipping distortion can result. This is even more likely if the input gain control is set to a positive value.
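The order of stages and the effect of bypassing a module can be sketched as a simple chain that operates on a single level value in dB. This is only an illustration of the flow described above, not the Ots Labs implementation: the Module class and function names are hypothetical, the AGC and Compressor are pass-through stand-ins, and the Limiter is modelled as an ideal 0 dB ceiling.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Module:
    name: str
    process: Callable[[float], float]  # maps a level in dB to a level in dB
    bypassed: bool = False


def run_processor(level_db: float, input_gain_db: float,
                  modules: Sequence[Module], output_gain_db: float = 0.0) -> float:
    """Apply input gain, then each non-bypassed module in order, then output gain."""
    level = level_db + input_gain_db
    for module in modules:
        if not module.bypassed:  # a bypassed module is simply skipped
            level = module.process(level)
    # The output gain only ever scales down, so positive values are ignored.
    return level + min(output_gain_db, 0.0)


def make_chain(limiter_bypassed: bool = False) -> List[Module]:
    """Build AGC -> Compressor -> Limiter with placeholder behaviour."""
    return [
        Module("AGC", lambda db: db),          # stand-in: no gain change
        Module("Compressor", lambda db: db),   # stand-in: no compression
        Module("Limiter", lambda db: min(db, 0.0), bypassed=limiter_bypassed),
    ]


# A -3 dB peak with +6 dB of input gain would reach +3 dB without the Limiter.
print(run_processor(-3.0, 6.0, make_chain()))                       # 0.0
print(run_processor(-3.0, 6.0, make_chain(limiter_bypassed=True)))  # 3.0, i.e. clipping
```

The second result illustrates why bypassing the Limiter is risky when the input gain is positive: nothing remains in the chain to keep the level at or below 0 dB.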
Automatic Gain Control
In the AGC processor, there are five controls: threshold, target, scope, attack and release. These are explained as follows:
The threshold control sets the dB level, above which, the AGC will actively operate. While the input audio level is below the set threshold level, the AGC will not actively adjust its internal volume control. If the input level remains below the threshold for more than five seconds, the AGC's internal volume control (as can be seen by the yellow marker on the dynamics processor gauge) will slowly center back to 0 dB (no gain). A sensible default for this control is -48 dB.
The target control sets the target dB level that the AGC will seek. If the input audio level is below the target, then the AGC will raise its internal volume control to approach the target level. If the input audio level is above the target, then the AGC will lower the volume. Think of the target as the level at which you want the audio to be, and the AGC will seek out this level. You will normally want this control to be set at -18 dB plus whatever value you have for the input gain control on the very left of the dynamics processor. For example, if the input gain is set to +3 dB, then you will want to set the target to -15 dB. This configuration makes the assumption that most input source material has an average level of -18 dB, which is true of most material flowing through the Ots Labs audio pipeline.
The scope control places constraints on the AGC's internal volume control. A scope value of 3 dB will mean that the AGC is only allowed to boost or cut the volume by 3 dB. If this means that it is unable to reach the target value you have set, then so be it, it will go as far as you have allowed it and no further. This is useful for having some light gain control without it getting out of hand if input levels vary too much and you don't want the AGC to make sweeping changes to the volume.
The attack control specifies the rate in seconds at which the AGC will adjust the volume in a downwards direction when seeking the target. A lower value will mean that the loud parts of input material will more quickly be reduced.
The release control is the same as the attack control, except that it dictates the rate of volume adjustment when moving in the upwards direction (raising the volume). For good results you normally want a slower release value than what you use for the attack value. Using good attack and release values for your material or application is part of the secret to obtaining clean and transparent results, while still obtaining the regulation you need.
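The way these five controls interact can be sketched as a single update step for the AGC's internal volume control. This is a rough model under stated assumptions, not the actual Ots Labs algorithm: attack and release are treated as time constants of a simple exponential approach, the five-second re-centering behaviour is left out, and the function name and default values (other than the -48 dB threshold and -18 dB target mentioned above) are hypothetical.

```python
import math


def agc_step(gain_db: float, input_db: float, dt_s: float, *,
             threshold_db: float = -48.0, target_db: float = -18.0,
             scope_db: float = 6.0, attack_s: float = 2.0,
             release_s: float = 8.0) -> float:
    """Return the AGC gain (in dB) after dt_s seconds at the given input level."""
    # Below the threshold the AGC does not actively adjust its gain.
    if input_db < threshold_db:
        return gain_db
    # Gain needed to land on the target, constrained by the scope.
    wanted = target_db - input_db
    wanted = max(-scope_db, min(scope_db, wanted))
    # Moving down uses the attack time, moving up uses the release time.
    time_constant = attack_s if wanted < gain_db else release_s
    alpha = 1.0 - math.exp(-dt_s / time_constant)
    return gain_db + alpha * (wanted - gain_db)


# Quiet material at -30 dB needs +12 dB to reach the target, but a 3 dB scope
# means the gain settles at +3 dB and goes no further.
gain = 0.0
for _ in range(1000):
    gain = agc_step(gain, -30.0, 0.1, scope_db=3.0)
print(round(gain, 2))  # 3.0
```

With a release time longer than the attack time, as recommended above, the gain in this model comes down on loud passages faster than it climbs back up on quiet ones.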
Compressor
The Compressor also has five controls: threshold, knee, ratio, attack and release. These are explained as follows:
The threshold control sets the dB level, above which, the Compressor will actively operate. While the input to the Compressor is below the set threshold level, the Compressor will not compress (reduce) the audio flowing through it. A sensible default for this control is -15 dB, but the optimum value really depends on the settings of other controls.
The knee control sets a range (which begins at the threshold point and extends upwards) in which the compression ratio applied is gradually introduced. This type of compression is known as soft-knee compression. If you set the knee to 0 dB, then you are essentially running the Compressor as a hard-knee compressor. Generally you will obtain smoother results by using a non-zero knee value (i.e. greater than 0 dB). The value you should use depends on other settings and the overall sound you are trying to achieve.
The ratio control is central to a compressor. It sets the amount of compression that occurs when the audio level exceeds the threshold. If the ratio is 2:1, and the audio level exceeds the threshold by 6 dB, then the audio will be reduced by 3 dB. If the ratio was 3:1, then a 6 dB excess would be brought down by 4 dB. If the ratio was 6:1, then a 6 dB excess would be brought down by 5 dB. Therefore the higher the ratio, the greater the amount that excess levels will be reduced by. Bear in mind that if you are using a non-zero knee value, then where you read "exceeds the threshold" in the above description, you should interpret it to mean "exceeds the threshold plus the knee value". What happens throughout the actual knee range is a product of both the specific knee value and the ratio value.
The attack control specifies the rate in milliseconds at which the Compressor will reduce the audio level. A lower value will mean that excesses above the threshold will be more quickly reduced.
The release control is the same as the attack control, except that it dictates the rate of increase of audio levels (when backing off from level reduction, because the audio is now under the threshold). For good results you normally want a slower release value than what you use for the attack value. As with the AGC, using good attack and release values for your material or application is part of the secret to obtaining clean and transparent results. Do not overlook the other Compressor controls, however, which are equally important.
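The ratio arithmetic given above can be verified with a short sketch of the steady-state (static) behaviour of a hard-knee compressor, i.e. with the knee set to 0 dB and ignoring attack and release. The function names are illustrative; what happens inside a non-zero knee range is not specified in this article, so it is deliberately not modelled here.

```python
def compressor_reduction_db(input_db: float, threshold_db: float, ratio: float) -> float:
    """Level reduction in dB for a hard-knee compressor with the given ratio (n:1)."""
    excess = input_db - threshold_db
    if excess <= 0.0:
        return 0.0  # at or below the threshold nothing is compressed
    # With ratio n:1 the excess is divided by n, so the reduction is the rest.
    return excess * (1.0 - 1.0 / ratio)


def compressor_output_db(input_db: float, threshold_db: float, ratio: float) -> float:
    """Output level in dB after the reduction has been applied."""
    return input_db - compressor_reduction_db(input_db, threshold_db, ratio)


# A 6 dB excess over a -15 dB threshold, at the three ratios discussed above.
for ratio in (2.0, 3.0, 6.0):
    print(ratio, round(compressor_reduction_db(-9.0, -15.0, ratio), 2))
# 2.0 3.0
# 3.0 4.0
# 6.0 5.0
```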
Limiter
The Limiter has just one control, threshold. When the audio level is below the threshold value, the Limiter does nothing. When the audio level exceeds the threshold, the Limiter almost instantaneously brings the level down such that it does not exceed the threshold. This prevents clipping distortion. Because the Limiter is much more aggressive in the way that it reduces audio levels (unlike the Compressor), it is better to allow the Compressor to do most of the audio conditioning, and allow the Limiter to be invoked only occasionally, such as when an unusually large change in levels occurs. When correctly set up, or when using the Dynamics Processor presets, this is the case. The Limiter's threshold control should almost always be set to 0 dB. There is no tangible advantage (except if trying to achieve special effects) in using a lower threshold value for the Limiter -- all you are essentially doing is wasting the available dynamic range of your output.
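As a rough illustration of the Limiter's job, the sketch below scales a short block of samples so that its peak never exceeds the threshold. It is a deliberately crude model under assumptions of its own: samples are floats where 1.0 represents 0 dB, the whole block is scaled by one gain value, and there is no look-ahead or smoothing as a real limiter would have. The function name is hypothetical.

```python
from typing import List, Sequence


def limit_block(samples: Sequence[float], threshold_db: float = 0.0) -> List[float]:
    """Scale a block of samples so its peak does not exceed the threshold."""
    ceiling = 10.0 ** (threshold_db / 20.0)  # threshold as a linear amplitude
    peak = max((abs(s) for s in samples), default=0.0)
    if peak <= ceiling:
        return list(samples)  # below the threshold the limiter does nothing
    gain = ceiling / peak     # just enough reduction to sit on the ceiling
    return [s * gain for s in samples]


print(limit_block([0.5, -2.0, 1.0]))  # [0.25, -1.0, 0.5]
print(limit_block([0.1, -0.2]))       # [0.1, -0.2], untouched
```

Lowering threshold_db in this model simply lowers the ceiling, which is the wasted output range referred to above.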
Processor Gauges
The Dynamics Processor meters and gauges allow you to visually understand what is happening within the processor. The input and output level meters are fairly obvious, showing the instantaneous audio level in dB at both the entry and exit point of the processor. Note that the input meter is "wired" before the input gain control. The difference between the input and output meters indicates the overall change (boost or cut) of the audio levels of audio passing through the processor.
The AGC gauge -- the yellow block -- shows the current level in dB of the AGC's internal volume control. If this is above 0 dB, then the AGC is currently boosting the volume; if it is under 0 dB, then the AGC is attenuating the volume. The blue meter shows compression in dB. The level of compression indicated is the sum of both level reduction within the Compressor module and level reduction within the Limiter module.
Note that if you temporarily disable the Limiter module then the output level meter will change from green to red whenever any clipping occurs. This is handy for testing your AGC/Compressor settings to determine whether you are relying too much upon the Limiter (which results in a little increased distortion). In an ideal configuration, the limiter will only be invoked occasionally or on certain source material, rather than constantly or regularly.
The oscilloscope control at the bottom shows the overall instantaneous audio level alteration being performed by the Dynamics Processor. Imagine all changes made by all three modules being summed together and displayed -- this is what the oscilloscope is displaying. That's why when you bypass the processor altogether, the oscilloscope will flat-line.
Tuning the Dynamics Processor
The specific settings for each of the Dynamics Processor's controls will be determined by what you are trying to achieve and the type of material you are playing. There is no "perfect" setting as such, although the presets generally offer a good compromise for most standard setups.
However, using the above information and having a clear idea of what you are trying to achieve, you can tune the processor to your specific needs. In addition to the information presented above, an important thing to keep in mind is that most source material that arrives via the Ots Labs audio pipeline averages at about -18 dB. This is a key point that will influence your design decisions. (Note that if you use the Graphic Equalizer, and do not compensate for your EQ settings by using the EQ's output gain control, then you may be altering the average audio level to something other than -18 dB. Keep this in mind, or better still, always balance the EQ by using its output gain control to compensate for your individual EQ band settings.)
Assuming therefore that the input level to the Dynamics Processor is generally hovering around -18 dB, you can work forwards from there, setting input gain, thresholds, targets, ratios, etc, "visualizing" what will occur at each stage when a peak (such as a drum beat) flows through, what will happen to low levels, etc. Remember that what you do with the AGC will affect what happens with the Compressor, as you may be causing the level to be more often or less often above the Compressor's threshold. You want to be aware of this, as it will determine what specific Compressor threshold you use to achieve your result. A good processor configuration will be balanced throughout. This means that you wouldn't, for example, set the AGC to target a -21 dB level, but have the input gain set to +6 dB. This is because -18 dB (the average input level) plus 6 dB is -12 dB, which means for the AGC to achieve your target, it would have to constantly be attenuating levels by 9 dB (on average). This is not balanced! It is out of whack by 9 dB, and should be corrected. The same principles apply to the Compressor module.
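The balance check in the example above is simple arithmetic and can be written down as a small helper. This sketch assumes the -18 dB average input level stated in the article; the function names are illustrative only.

```python
def required_agc_gain_db(target_db: float, input_gain_db: float,
                         average_input_db: float = -18.0) -> float:
    """Average gain the AGC must apply to hit its target (negative = attenuation)."""
    level_entering_agc = average_input_db + input_gain_db
    return target_db - level_entering_agc


def balanced_target_db(input_gain_db: float, average_input_db: float = -18.0) -> float:
    """AGC target that needs no average correction for the given input gain."""
    return average_input_db + input_gain_db


# The unbalanced example: +6 dB input gain with a -21 dB target.
print(required_agc_gain_db(-21.0, 6.0))  # -9.0, a constant 9 dB of attenuation
# A balanced alternative for the same input gain.
print(balanced_target_db(6.0))           # -12.0
# The earlier +3 dB input gain example with its -15 dB target is balanced.
print(required_agc_gain_db(-15.0, 3.0))  # 0.0
```

A result close to 0 dB suggests the AGC will spend its time correcting genuine variations in the material rather than fighting the input gain.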
At the end of the day, it is how you perceive the output that is all-important. If you are happy with a particular group of settings, and they meet your needs, then this is paramount. However, by understanding the functionality offered by each control and the above principles, you can design your Dynamics Processor configurations with more purpose and will obtain first-class results. If much of this is too daunting for you, then you may be better off using the presets, as incorrect or unbalanced settings will usually be worse than the presets, even if you don't immediately detect it through your sound system -- just as with many areas of audio quality, much of audio dynamics is very subtle.
Conclusion
Dynamics Processing is a vital tool within the professional audio arena, used at almost all stages of audio production, reproduction and broadcast. Even music lovers enjoying music in their home can benefit from this powerful tool. Ots Labs audio applications include the powerful Ots Labs Dynamics Processor. Preset configurations can be used to obtain excellent results for those who do not wish to delve further into the world of dynamics processing. For those who seek out and grasp the fundamentals, custom configurations can be created to achieve first-class results for a specific purpose or venue.