A New Kind of Control Room.
-----------------------------------------------------
When taking a brief for a control room design, acousticians are often told, "I don't need anything fancy, I just want to be able to rely on what I'm hearing." It doesn't matter whether the budget is a couple of hundred or a couple of hundred grand: the requirement stays the same, and only the degree of precision changes.
Whether it's a shoestring home setup or a fully featured commercial studioplex, the primary requirement is one of monitoring accuracy, with all the other requirements following on after that. (You don't, for instance, need high levels of isolation if you're happy to work entirely on cans.)
Of course, it's never quite that simple. You probably also want the room to be comfortable enough to work eight hours at a stretch, big enough to fit the whole band in for the final mix, able to have its equipment changed without altering the monitoring accuracy, sound the same in all parts of the room, and provide mixes which sound the same when you take them to another studio, take them home, or play them in the car.
Over the years, various attempts have been made to design control rooms which tell the absolute truth about the music. Consensus now seems to be that, in fact, there is no such thing as absolute truth, because what you hear is always going to depend on the environment in which you listen to it. So instead, room designers seek to create "neutral" rooms, which impose as little as possible of their own character on a sound, whilst still providing a viable working environment for the engineer.
All sorts of clever (and some not so clever) tricks have been used in the quest for neutrality, and Early Sound Scattering looks set to be the next big thing in achieving it.
-----------------------------------------------------
The Sixties: Dead Rooms
In the beginning was rock and roll. And perfboard and Rockwool.
And the producer said "Let there be the direct sound and nog-all else."
And lo it got the job done.
But it sounded horrible, and engineers could only work in them for twenty minutes at a stretch because they were almost anechoic, and the human animal can't cope with that.
Well OK maybe it wasn't that bad but you get the picture.
-----------------------------------------------------
The Seventies: Rettinger and Eastlake
Stereo happened, and people started getting interested in knowing what was actually going on in the room. The best rooms had rough stone front walls with the monitors set flush into them, and very deep absorption at the back. The front side-walls and ceiling were raked to prevent flutter echoes. The hard front end provided the occupants with a few reflections, giving them some acoustical perspective, so it didn't feel like they had their heads in boxes of cotton wool.
Typical decay times varied from about 380ms for a room of 100 cubic metres, to 430ms for 200 cubic metres.
But no two rooms sounded quite alike, nor did any two places in the same room.
They tried equalizers, and made them look the same on an analyzer, but still they sounded different.
-----------------------------------------------------
The Eighties: Davis' LEDE Room and Reflection Control
Live end dead end was all the rage in the eighties. By making the area at the front of the room almost anechoic, Davis and others opened up a new realm of realism in studio monitoring. The secret lay in the initial time gap, between the direct sound and the first reflections; make this long enough and the brain can separate off the room acoustic, and ignore it. Result: a truly neutral room, where what you hear is exactly the same in any other LEDE room. To preserve operator comfort, the rear wall had to be hard, but not cause a slapback echo. Based on numerical theory by Manfred Schroeder, new diffuse treatments were developed, particularly by Peter D'Antonio at RPG, to break up the echo, but still return the energy to the room as a short decay. For a fuller explanation of Schroeder diffusers, see panel 1.
Reflection control was basically the same concept as live end dead end, and was developed as a solution to the conflicting requirements of having a completely absorbent front end and the usual need for a studio window in the front of the room. By dint of careful geometry, you can arrange for the reflections off the glass to miss the mix position, giving the illusion of complete absorption. With even more careful geometry, you can have many reflections from the front surfaces of the room, all of them missing the mix position, forming a reflection free zone. For the zone to remain reflection free, the rest of the room needs to be anechoic, or at least highly absorbent.
The effect is stunning, provided you sit in exactly the right place and nobody puts a rack of keyboards behind you and you don't want any effects racks or tape machines or anything else in the rear half of your control room.
LEDE and RFZ rooms all seek to achieve essentially the same objective: a room which imposes none of its own character upon the signal. They do this primarily by not allowing any of the early reflections to reach the engineer's ears.
This poses a problem when you want to put any kit in the room, because you unavoidably get reflections off it which the room designer wasn't expecting.
-----------------------------------------------------
One logical alternative to the LEDE/RFZ approach is to build a room in which the characteristic reflections are so uniformly random that they have no character to impose.
The ESS control room is one which features a highly diffusive front end, including the walls into which the monitors are built, which scatters the early sound. The body of the room is absorbent, with most of the LF absorption provided by damped membrane panels.
The room can be made fairly live compared to older control rooms, with a flat frequency response and good stereo imaging, both of which remain stable right to the rear corners of the room.
The concept of surrounding the monitors with diffusers was an invention born of mother necessity. The Amek 9098 is a large piece of kit with an essentially flat surface. Most desks are. The difference with the Amek is that the first one was pre-sold to Lisa Stansfield and Ian Devaney to go in their existing studio in Rochdale.
The studio building had originally been constructed some years previously, when cash was a bit tighter, and it wasn't particularly big. To cut a long story short, the RFZ geometry just wouldn't work. The desk would have had to go too close to the speakers so an alternative was needed. The original purpose of all the diffusers was to produce enough early energy to mask the desk reflection and so reduce comb filter effects.
Although using such an untried trick in a full scale project sounds like a bit of a gamble, the project was undertaken with confidence, since at worst it could have turned out equal to the best rooms of the seventies, but with the advantage of having the erratically random stone replaced by statistically perfectly random diffusers.
While refitting of the room in Rochdale was under way, Ian and Lisa realized that they needed a writing facility at their Dublin home, and the basement room they chose for this purpose was less than ideal in shape.
The same approach to side-step the geometry problem was used, and when the monitors were run up for the first time, Ian thought there was something odd about the room. It was a while before he worked out that what was odd was that it was quieter at the back of the room, and nothing else.
When the room in Rochdale was completed a few months later, Ian and Lisa found they could transfer material from one room to another with complete confidence, despite the two rooms being radically different in shape and size.
Stereo Imaging
A common assumption about diffusion is that, by smearing the signal in time as well as space, the stereo image is bound to be destroyed utterly.
This, however, has turned out not to be the case. Stereo image is a psycho-acoustic illusion: a trick played on the brain and ears. The ears gather whatever information they can, and the brain makes whatever sense it can of that information. When the information is conflicting, the brain fails to make sense of it, and the illusion is lost.
The information of most interest to the brain is the level difference between left and right ears, but timing is also very important. If the timing information conflicts with the level information then the image disappears.
Reflections assist the brain in localizing a sound source, but that is not the aim when trying to form a stereo image. Scrambling the timing information makes it more difficult to localize the loudspeaker itself, leaving the level information, uncontradicted, to provide the image.
The resulting image, while not quite as dramatic as that found in a well set up RFZ room, is reliable regardless of changes of equipment in the rear of the room, and extends the full width of the desk and right to the back wall.
Frequency Response
The most readily grasped measure of a control room's "quality" is its steady state frequency response, as shown on a spectrum analyzer with a pink noise signal source.
Although popular in the late seventies, the use of equalizers to compensate for room acoustics is now generally frowned upon, except in certain circumstances. In particular, if you flush mount speakers which were designed to be free standing, a bass lift will result, because the speaker is radiating into a hemispherical space the same power which it was designed to radiate omnidirectionally. In this instance a bass cut may be applied in the feed to the amplifier; a steady state remedy for a steady state anomaly. Using a graphic equalizer for this task is unwise, as each filter causes all sorts of phase shifts at its turnover points, causing a loss of definition at the bottom end. A simple "bypassed pad" first order bass cut causes the least possible disturbance to the phase at low frequency.
Provided your speakers have been built right, the steady state frequency response of your system depends mainly on the room's decay time response. To achieve a flat frequency response the decay time of the room must be approximately equal in each octave band. Equal decay times may be achieved at mid and high frequencies by specifying suitable absorbent treatments for the walls and ceiling. Typical absorbers in this frequency range include foam tiles, drapes and soft furnishings, and mineral or glass fibre matting up to 200mm thick.
Deep trapping, Helmholtz and membrane absorbers and resonant pipes may be used to control low frequency decay, but because low frequency propagation is primarily by excitation of room resonances, close attention must also be paid to the shape of the room. The room proportions (the ratios of height to width to length) should closely approach one of the ideal ratios given in panel 2, which distribute the resonances evenly with respect to frequency. In non-rectangular rooms, the averaged dimensions should still be made to fit one of these ratios, as the non-rectangularity will essentially only damp the resonances, and not eliminate them. For a fuller explanation of resonances and details of the ratios, see panel 2.
Comb filtering is the effect where a delayed signal cancels with the direct signal at frequencies where the path difference is an odd number of half wavelengths. The depth of the cancellation notch depends on the difference in level between the two signals, with complete cancellation if they are exactly equal. The effect on the sound varies throughout the room because the extra distance the reflected ray travels varies. By spreading the reflection out in time, Schroeder diffusers close to the loudspeakers provide a highly effective method of minimizing the effect.
Where the primary reflection is from a diffuse surface the reflection will be markedly reduced in level, as the energy is being dispersed in many directions, and so much smaller cancellations will be produced.
Moreover, because the spatial diffusion is accompanied by temporal diffusion, the notches are dramatically damped, to the point of non-existence. If the primary reflection is not from a diffuse surface, it will itself be fed from the diffuse area at the speaker, with much the same effect.
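For readers who like to see the numbers, the comb filter arithmetic described above can be sketched in a few lines of Python. This is only an illustration for a single specular reflection, assuming a speed of sound of 344m/s; the function names and the example figures are invented for the purpose and do not come from any measured room.

```python
import math

SPEED_OF_SOUND = 344.0  # m/s, assumed value


def notch_frequencies(path_difference_m, f_max_hz=20000.0):
    """Frequencies (Hz) at which the path difference is an odd
    number of half wavelengths, up to f_max_hz."""
    notches = []
    k = 0
    while True:
        f = (2 * k + 1) * SPEED_OF_SOUND / (2.0 * path_difference_m)
        if f > f_max_hz:
            break
        notches.append(f)
        k += 1
    return notches


def ripple_db(reflection_level_db):
    """Return (peak, notch) in dB relative to the direct sound alone,
    for one reflection arriving reflection_level_db relative to the
    direct sound (negative means quieter than the direct sound)."""
    r = 10.0 ** (reflection_level_db / 20.0)  # linear amplitude ratio
    peak = 20.0 * math.log10(1.0 + r)
    notch = 20.0 * math.log10(1.0 - r) if r < 1.0 else float("-inf")
    return peak, notch


if __name__ == "__main__":
    # Hypothetical desk reflection travelling 0.344m further than the direct sound
    print(notch_frequencies(0.344, f_max_hz=3000.0))
    # Strong reflection versus one dispersed to 20dB below the direct sound
    print(ripple_db(-2.0))
    print(ripple_db(-20.0))
```

A reflection 20dB below the direct sound produces less than 1dB of ripple either way, which is why dispersing the reflected energy matters so much.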
Examining the impulse response of an ESS room compared with a similarly dimensioned RFZ room reveals that the inevitable desk reflection has changed from a tall spike into a squat hump. This translates into the frequency domain as exchanging deep, narrow notches in the HF region, of up to about 15dB depth, for about 2dB of gentle ripple. The improvement in HF phase coherence that removing the deep notches provides is hard to quantify, but hifi-buff words like clarity, naturalness and transparency spring to mind.
Spatial Uniformity
The use of loudspeakers with a highly hemispherical output is central to the ESS design, in order that sufficient energy is delivered onto the diffusers close to them. This, in turn, means that off-axis listeners will receive a very similar direct sound spectrum to those on axis. This, of course, is nothing new, and soft dome speakers have been increasingly popular since the early eighties.
However, anyone who has ever studied any physics knows that two point sources in phase produce fringing effects, which will cause a room to have a different frequency response at every point in space. This is basically the same problem as the comb filter effect described above, except now we're talking about two sources and a spatial anomaly, rather than one source, a reflection, and a frequency domain anomaly. If you haven't ever noticed this, try listening to some 1kHz tone in mono on two speakers, and move your listening position from side to side by a foot or so. The level changes dramatically, as does the apparent direction as you move through the fringes, or hot spots.
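The fringing described above is easy to model for two idealised point sources. The sketch below is a free-field approximation only (no room, no diffusers), and the speaker spacing, listening distance and function name are assumptions chosen for illustration.

```python
import cmath
import math

SPEED_OF_SOUND = 344.0  # m/s, assumed value


def mono_level_db(x_m, freq_hz=1000.0, spacing_m=2.0, distance_m=2.0):
    """Level in dB, relative to the centre-line position, of a mono tone
    played on two in-phase point sources.

    x_m        sideways offset of the listener from the centre line
    spacing_m  distance between the two speakers (hypothetical)
    distance_m distance from the speaker baseline to the listener (hypothetical)
    """
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND  # wavenumber

    def pressure(x):
        total = 0j
        for speaker_x in (-spacing_m / 2.0, spacing_m / 2.0):
            r = math.hypot(x - speaker_x, distance_m)
            total += cmath.exp(-1j * k * r) / r  # spherical spreading
        return abs(total)

    return 20.0 * math.log10(pressure(x_m) / pressure(0.0))


if __name__ == "__main__":
    # Walk sideways in 2cm steps over about a foot
    for step in range(0, 16):
        x = step * 0.02
        print(f"{x:4.2f} m  {mono_level_db(x):6.1f} dB")
```

With these assumed dimensions the level falls into a deep hole well within a foot of the centre line, which matches the listening test suggested above.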
The big difference with the ESS room is that this fringing is almost completely absent. The diffusers close to the speakers effectively convert the speakers to large plane sources, which do not suffer from the same constructive and destructive interference effects, removing the biggest obstacle to achieving consistency of frequency response throughout the room.
Also, the imaging benefits from this removal of hot spots because the level differences at the ears are more likely to resemble those at the speakers.
Many control rooms also exhibit a nasty bass lift close to the back wall. In any closed space, close to the boundaries, you get a rise in level at low frequency due to the pressure zone effect. The use of damped membrane absorbers, especially on the rear wall where the effect is most pronounced, minimizes this problem. The mathematics of why this works is beyond the scope of this article, but concerns the phase shift at which the membrane reradiates the energy it fails to absorb.
Decay Time
The decay time in the control room greatly affects the comfort of the engineer, and too short a decay can cause fatigue after quite a short time.
In 1977, Rettinger determined that the perceived liveness of a room depends upon the ratio of the decay time to the room volume, and suggested an ideal relationship for a recording control room.
Since then, probably due to increasing awareness of the need for engineer comfort, rooms have tended to be built to be slightly more live than this, and ESS rooms are normally designed to have a decay about 20% longer than that suggested by Rettinger.
When calculating control room decay times, Sabine's simple formula is inadequate, as the room is both highly absorbent and non-uniformly covered. Accordingly, Eyring's formula resolved along three axes after Fitzroy is recommended. See panel 3.
Repeatability
The biggest factor which makes reflection-control rooms different from each other, given that the designer intended them to sound identical, is the assortment of other kit that ends up in the room.
If the accuracy of the room relies upon freedom from early reflections, one reflection from behind the engineer makes a huge difference to the overall sound, and variations in position or size of the racks, trolleys, and keyboard stands will cause no end of variation in the room acoustic.
If, instead, these extraneous arrivals are just a minute part of a cloud of diffuse arrivals, the effect of changing them, within reasonable limits, is negligible, and therefore two quite different room layouts can sound almost identical.
So there you have it.
Add enough smooth randomness to any imperfect system, and the imperfections virtually disappear. These rooms really work, and give a good representation of what your mix will sound like away from the studio.
They're pleasant to work in, and can be tailored to suit even quite modest construction budgets without greatly compromising performance.
-----------------------------------------------------
Schroeder Diffusers
A Schroeder Diffuser is a structure comprising a number of wells of different, carefully chosen depths.
As a ray of sound strikes the irregular surface, instead of bouncing off it like a mirror, it bounces out of each well at a slightly different time. The result is many small reflections, spread out in both time and space. The frequencies at which it operates as a diffuser depend upon its dimensions, with the lower limit being that frequency where the deepest well is a quarter wavelength, and the upper limit being where the period of the structure is equal to half a wavelength.
The operating range of a single diffuser is limited to about four octaves, because if the deepest well is deeper than about fifteen times its width, it begins to behave as a diaphragmatic absorber.
The way it actually works is a bit complicated, but here goes. Any wavefront travelling in a particular direction may be considered as being made up of an infinite number of side by side omnidirectional "secondary wavelets". The direction of propagation of the wave depends on the spatial arrangement of the notional sources of these wavelets, or on their phase relationship. (Same thing, really). If a wave is reflected by a Schroeder diffuser, each well produces a reflection at a slightly different time, due to its different depth. The phasing of these reflected wavelets is what determines the direction of the reflected wave, and if the diffuser is correctly designed, the reflected wave will depart in many directions. (A theoretical diffuser having infinite wells will reflect the wave in a perfect hemidisc.)
The wells are arranged in a cyclic sequence, and the best sequences consist of a prime number of wells per cycle.
A number of ways of determining the well depths have been tried over the years, but by far the most popular is the quadratic residue sequence. (I tried to get away without using the words 'quadratic' and 'residue', but I just couldn't help it). To quote Dr Peter D'Antonio, these sequences "have the unique property that the Fourier transform of the exponentiated sequence values has constant magnitude in the diffraction directions".
The well depths are given by
$$d_h = \frac{(h^2 \bmod N)\,L}{2N}$$
where d is the depth, h is the well number (counting from 0 to N-1), N is the prime number on which the sequence is based, and L is the wavelength of the lowest operating frequency.
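As a rough illustration, the quadratic residue well depths can be worked out in a few lines of Python. This sketch assumes the usual textbook form of the sequence, depth = (h squared modulo N) x L / 2N, and a speed of sound of 344m/s; the function names and the 500Hz example are hypothetical.

```python
SPEED_OF_SOUND = 344.0  # m/s, assumed value


def quadratic_residues(prime_n):
    """One cycle of the quadratic residue sequence: h*h mod N for h = 0..N-1."""
    return [(h * h) % prime_n for h in range(prime_n)]


def qrd_well_depths(prime_n, lowest_freq_hz):
    """Well depths in metres for one cycle of the diffuser.

    prime_n        prime number on which the sequence is based
    lowest_freq_hz lowest operating (design) frequency
    """
    wavelength = SPEED_OF_SOUND / lowest_freq_hz
    unit_depth = wavelength / (2.0 * prime_n)
    return [s * unit_depth for s in quadratic_residues(prime_n)]


if __name__ == "__main__":
    print(quadratic_residues(7))
    for h, d in enumerate(qrd_well_depths(7, 500.0)):
        print(f"well {h}: {d * 1000:.0f} mm")
```

Note that the sequence is symmetrical about its centre, so a seven-well cycle only needs three different non-zero depths.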
-----------------------------------------------------
Room Shape
Low frequencies, from about 200Hz down in typical sized control rooms, behave very differently from higher ones. High frequencies travel like a light ray: in a straight line from the speaker to your ear. With low frequencies the speaker dumps energy into the room, exciting the room's natural resonances, and it is these resonances that then couple into your ears.
If the room shape is such that its resonances, or modes, are all bunched together, then at some frequencies there will be a big lift in what you hear, while at others, where the room does not respond, there will be a big dip.
The modes of a room come in three flavours, axial, tangential, and oblique.
Axial modes occur, as their name suggests, along the axes of the room; front to back, side to side, and floor to ceiling. They are the easy ones to predict the frequency of, because they occur at all multiples of the frequency at which the length, width or height of the room is half a wavelength.
Tangential modes are a bit harder to calculate, as they take in any two pairs of opposite surfaces, and oblique modes are even worse, making the grand tour of all six surfaces.
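If you really need to know the frequency of a particular mode,
$$f = \frac{c}{2}\sqrt{\left(\frac{n_x}{L_x}\right)^2 + \left(\frac{n_y}{L_y}\right)^2 + \left(\frac{n_z}{L_z}\right)^2}$$
where f is in Hz, c is the speed of sound (about 344m/s), L is the room dimension along each axis and n is the order of the mode along that axis. For an axial mode two of the three orders are zero, and for a tangential mode one of them is.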
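Listing the modes by hand gets tedious very quickly, so here is a minimal Python sketch that does it for a rectangular room. It assumes the standard rectangular-room mode formula and rigid walls; the room dimensions and function names are made up for the example.

```python
import math
from itertools import product


def mode_frequency(orders, dims_m, c=344.0):
    """Frequency in Hz of the mode with orders (nx, ny, nz) in a
    rectangular room of dimensions dims_m = (Lx, Ly, Lz) in metres."""
    return (c / 2.0) * math.sqrt(sum((n / l) ** 2 for n, l in zip(orders, dims_m)))


def classify(orders):
    """Axial modes involve one axis, tangential two, oblique all three."""
    active_axes = sum(1 for n in orders if n > 0)
    return {1: "axial", 2: "tangential", 3: "oblique"}[active_axes]


def room_modes(dims_m, f_max_hz=300.0, c=344.0):
    """Sorted list of (frequency, orders, kind) for every mode up to f_max_hz."""
    # Highest order worth trying on each axis: n * c / (2L) <= f_max
    max_orders = [int(2.0 * l * f_max_hz / c) for l in dims_m]
    modes = []
    for orders in product(*(range(m + 1) for m in max_orders)):
        if orders == (0, 0, 0):
            continue
        f = mode_frequency(orders, dims_m, c)
        if f <= f_max_hz:
            modes.append((f, orders, classify(orders)))
    return sorted(modes)


if __name__ == "__main__":
    # Hypothetical 4.0 x 3.0 x 2.5 metre room
    for f, orders, kind in room_modes((4.0, 3.0, 2.5), f_max_hz=120.0):
        print(f"{f:6.1f} Hz  {orders}  {kind}")
```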
The point of all this is that, unless your room is a really bad shape, you're not actually all that interested in the frequencies of all these modes, only in how evenly spread out they are. If they're poorly spread out, then where they clump together the room will show a response peak, and low levels at other frequencies.
In smallish rooms, the region which tends to suffer the most in this way is from about 50Hz to 150Hz, right where you need the most reliable response for mixing.
The maths is too complicated to go into here, but if the ratios of height to width to length (in any order) are 1.14:1.39:1 or 1.28:1.54:1 or 1.60:2.33:1 (Sepmeyer's Golden Ratios, often attributed incorrectly to Bolt), then the modes will be perfectly spaced, and LF response is pretty much guaranteed to be smooth.
A much better way of predicting the response is actually to calculate the frequencies of every mode up to about 300Hz, and the decay time of each of those modes. From the decay time, you can calculate the resonant bandwidth, and so see, when plotted, how well adjacent modes overlap.
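The overlap check can be sketched as follows. This is a simplification: it uses the commonly quoted approximation that a mode's half-power bandwidth is about 2.2 divided by its decay time, and it applies a single decay time to every mode, whereas the method described above uses a separate decay time for each. The function names and example frequencies are hypothetical.

```python
def modal_bandwidth_hz(decay_time_s):
    """Approximate half-power bandwidth of a mode with the given RT60."""
    return 2.2 / decay_time_s


def uncovered_gaps(mode_freqs_hz, decay_time_s):
    """Pairs of adjacent modes whose spacing exceeds the modal bandwidth,
    i.e. places where the room response is likely to dip."""
    bandwidth = modal_bandwidth_hz(decay_time_s)
    freqs = sorted(mode_freqs_hz)
    return [(lo, hi) for lo, hi in zip(freqs, freqs[1:]) if hi - lo > bandwidth]


if __name__ == "__main__":
    # Made-up mode list and a 0.4 second low frequency decay time
    print(modal_bandwidth_hz(0.4))
    print(uncovered_gaps([43.0, 57.3, 68.8, 71.7, 81.1, 86.0], 0.4))
```

A shorter low frequency decay widens each mode, so the same set of mode frequencies leaves fewer gaps in a well damped room.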
-----------------------------------------------------
Sabine, Eyring, and Fitzroy
The simplest way to predict the decay time of a room is by Sabine's formula,
$$T = \frac{0.161\,V}{A}$$
where T is the RT60 decay time in seconds, V is the volume of the room in cubic metres, and A is the total absorption in the room, in metric Sabins.
This is fine for predicting decay times in fairly reverberant spaces, where the absorption is evenly distributed and the average absorption coefficient is no more than about 0.2. You just add up all the areas of absorber multiplied by their coefficients to get the value for A, and out pops your answer. In large spaces A should also include an allowance for absorption by the air, which depends on temperature, frequency, and relative humidity.
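The adding-up is trivial to automate. Here is a minimal Python sketch of the Sabine calculation in metric units (constant 0.161); the surface areas and coefficients are invented purely for illustration, and air absorption is ignored.

```python
def total_absorption(surfaces):
    """Total absorption A in metric Sabins.

    surfaces is a list of (area_m2, absorption_coefficient) pairs.
    """
    return sum(area * coeff for area, coeff in surfaces)


def sabine_rt60(volume_m3, absorption_sabins):
    """RT60 in seconds from Sabine's formula, T = 0.161 V / A."""
    return 0.161 * volume_m3 / absorption_sabins


if __name__ == "__main__":
    # Hypothetical 100 cubic metre room at one octave band
    surfaces = [
        (40.0, 0.05),  # plastered walls
        (25.0, 0.30),  # carpeted floor
        (25.0, 0.10),  # ceiling
    ]
    a = total_absorption(surfaces)
    print(f"A = {a:.1f} Sabins, T = {sabine_rt60(100.0, a):.2f} s")
```

Repeat the calculation with the coefficients for each octave band to see how even the decay is across the spectrum.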
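Eyring improved upon Sabine's formula to make it applicable to less reverberant spaces, by treating the waves as being absorbed only at the surfaces:
$$T = \frac{0.161\,V}{-S\,\ln(1 - \bar{a})}$$
where S is the total surface area of the room in square metres and \bar{a} is the average absorption coefficient (A divided by S).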
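Fitzroy, on the other hand, improved upon Sabine by allowing the absorbent material to be distributed unevenly:
$$T = \frac{0.161\,V}{S^2}\left(\frac{S_x^2}{A_x} + \frac{S_y^2}{A_y} + \frac{S_z^2}{A_z}\right)$$
where Sx, Sy and Sz are the areas of absorber projected onto the three axes of the room, Ax, Ay and Az are the corresponding absorptions in metric Sabins, and S is the total surface area.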
Since control rooms tend to be fairly dead and non-uniform, it seems logical to replace the Sabine expression in each term of Fitzroy's formula with Eyring's, and this has been found to produce results which correspond well with measured values.
The large quantity of sums involved in calculating decay times at octave centres suggests the use of either a spreadsheet or a dedicated computer program to enable a design by trial process.
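A dedicated program need not be elaborate. The Python sketch below shows one way of putting Eyring's expression into each term of Fitzroy's formula. It is an assumption-laden illustration rather than the author's own method: it takes the commonly cited form of Fitzroy's equation, treats each axis as the pair of opposite surfaces facing along it, and uses made-up areas and coefficients.

```python
import math


def fitzroy_eyring_rt60(volume_m3, axis_pairs):
    """RT60 in seconds using Fitzroy's weighting with Eyring terms.

    axis_pairs is three (area_m2, mean_absorption_coefficient) tuples,
    one for each pair of opposite surfaces (side walls, front and rear
    walls, floor and ceiling). Coefficients must be below 1.0.
    """
    total_area = sum(area for area, _ in axis_pairs)
    rt60 = 0.0
    for area, alpha in axis_pairs:
        # Eyring decay time as if the whole room had this axis's coefficient
        eyring_term = 0.161 * volume_m3 / (-total_area * math.log(1.0 - alpha))
        # Weight it by the share of the total surface on this axis
        rt60 += (area / total_area) * eyring_term
    return rt60


if __name__ == "__main__":
    # Hypothetical 5.0 x 4.0 x 2.8 metre room, one octave band
    volume = 5.0 * 4.0 * 2.8
    pairs = [
        (2 * 5.0 * 2.8, 0.35),  # side walls
        (2 * 4.0 * 2.8, 0.50),  # front and rear walls
        (2 * 5.0 * 4.0, 0.20),  # floor and ceiling
    ]
    print(f"T = {fitzroy_eyring_rt60(volume, pairs):.2f} s")
```

When all three coefficients are equal the result collapses to plain Eyring, which is a handy sanity check before trying different treatments on different surfaces.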
-----------------------------------------------------

Lisa Stansfield's Gracieland Studio:
Note the three sizes of diffuser to cover different frequency ranges and the raked panels above the monitors.
Martin Price (formerly of 808 State) commissioned the design of this room to produce the 1996 Kaliphz album.
The simplified geometry of the room significantly reduced construction costs.
This O2R/Sadie production room at LBS, Stockport is further simplified, but still benefits from its diffusive front end.
-----------------------------------------------------
The Author
Andrew Parry has been involved in professional audio since the early eighties. After a few years of running and maintaining PA rigs of various shapes and sizes, he started building custom gadgets and systems for stage and studio use, whilst also offering maintenance services to small studios in the Manchester area. He ran the technical department of a well known studio equipment retailer for five years, during which time he studied acoustics in his spare time, in order to provide a fuller service to his customers. Since 1990 he has been an independent studio consultant, designing, installing and maintaining studios, and still based in Manchester.
He can be reached on 08700 788346, or by e-mail at andy@electroacoustics.co.uk.