"They are here" vs. "You are there"


Sometimes a system sounds like "they are here." That is, it sounds like the performance is taking place IN YOUR LISTENING ROOM.

Sometimes a system sounds like "you are there." That is, it sounds like you have been transported to SOME OTHER ACOUSTICAL SPACE where the performance is taking place.

Two questions for folks:

1. Do you prefer the experience of "they are here" or "you are there"?

2. What characteristics of recordings, equipment, and listening rooms account for the differences in the sound of "they are here" vs. "you are there"?
bryoncunningham

Showing 9 responses by cbw723

Personally, I find the "you are there" sensation a bit disorienting. I know where I am when I'm listening, and it's not in a jazz club, concert hall, or stadium. So when I hear the cues that suggest those places, I find them distracting and they distance me from the music. Maybe that's why I prefer studio-recorded material: it sounds like the music is there with me for my personal enjoyment.

In my case, the acoustic treatment is easy: just make the room a bit on the acoustically dead side. For the "you are there" experience, I'd think you'd want to make your room's acoustics a bit like the venue of interest (without getting carried away). A jazz venue is small and a bit bright, a concert hall is cavernous, a stadium is, well, an acoustic nightmare. So I think you could probably tailor your room in one way or another to suit a particular kind of venue, but that might have consequences for other types of recordings and venues.

Maybe Hesson11 is on to something. Maybe the best answer is a processor (that reproduces the ambience), and a few surround speakers. Many surround processors have this capability, and have settings for various venue effects. One could fork one's 2-channel line outs into a processor or receiver, and use it for its surround capabilities only. A lot of people are already effectively doing the same thing with their subs (substitute "low pass filter" for "processor"). Hmm, an easy enough experiment to try...

Bryon writes:
1. If an audiophile listens predominantly to one type of music, he should design his listening room (when possible) to approximate the typical characteristics of the recording spaces for that type of music, so as to promote the illusion that "he is there" for the music he usually listens to.

I think this depends not only on the venue of preference, but also on the recordings. I alluded to this before when I suggested one not get carried away. In situations where the ambience cues are subtle or absent, having room reinforcement would likely be beneficial. But in cases where the cues are already strong, reinforcement could become excessive.

2. If an audiophile listens to a wide range of music, he should design his listening room (when possible) to be neutral, so as to promote the illusion that "he is there" for as many kinds of recording spaces as possible, acknowledging that the more neutral the room, the less likely it is to approximate the recording space of any particular type of music.

Again, I think this works in the case of strong cues, but with weak or absent cues, and hard-to-duplicate room acoustics, electronic enhancement may be the way to go. Surround speakers could produce concert hall acoustics even in a smallish room.

In summary, the electronic approach could provide reinforcement that varied by degree, depending on how much was needed, and could support a variety of venue configurations. You could, for example, put a studio-recorded session in a big concert hall (but, of course, at some point you are going to start creating distortions that can't be ignored).

Finally, I'm not sure how much the playback system's coloration is an issue. Assuming the system is good enough to produce playback with a convincing live or nearly live sound (as judged by the system's owner/primary listener), it seems unlikely that the ambience cues are going to be distorted to a point that they become an impediment to a "you are there" experience.

The idea of creating listening room ambience by electronic means is appealing in theory. In practice, however, the limited experience I have had with professional reverb processors from high end manufacturers was not favorable.

I agree that even state-of-the-art processing might not hold up to close scrutiny if it were examined on its own. My thinking is that it may be sufficient when limited to surrounds. As long as the bulk of the sound is coming from the (unprocessed) mains, the processing may not be audible. Given that the cues are themselves the subjects of "analog processing" (i.e., they are things like reflected sounds and room reverb), it may be possible to find a good balance. This approach is, of course, used with movie soundtracks all the time. But, not having tried it with two-channel music, I can't say if the results would be satisfactory. It's just a hypothesis.

However, I believe that colorations in equipment can be a real obstacle to the presentation of ambient cues during playback. I became convinced of this when making component changes in my own system that simultaneously resulted in (1) greater neutrality, judged by independent criteria; and (2) greater audibility of the ambient cues of recordings.

"Coloration" as we've discussed in the past, is a broad category. Since the ambience cues tend to be subtle (except for things like applause), the thing most likely to make them more audible is detail. But detail is a two-edged sword: coloration can obscure it, and coloration can enhance it. So:
1. Some colorations may only have negative effects on the cues. Noise may be an example: reducing the noise floor of the system is an approach that is always positive.
2. Some colorations may be neutral with respect to ambience cues (at least within the usual constraints of high-end systems). THD may be an example. Limited dynamics may be another.
3. Some colorations may enhance the cues. Excessive brightness comes to mind. You get lots of detail in bright systems -- to the point that the ambience cues will practically jump out of the speakers and punch you in the head -- but such systems are not particularly neutral (though they are preferred by some listeners).

Bryon says:
But I wonder whether those colorations would contribute to the illusion that “you are there.” My suspicion is that the answer is often 'no.' That is to say, colorations that enhance ambient cues might nevertheless fail to contribute to the illusion that “you are there” because they might also make the music sound less “real.” I, for one, have a hard time experiencing a bright system as one in which “I am there.”

I considered this when I posted, but I think it is probably very listener-dependent. I have a preference for tonal balance even if it comes at the expense of some detail. But others have a preference for detail. This explains the existence of equipment that makes me want to run screaming from the showroom (and, I suppose on the flip side, equipment that makes the detail-lovers want to fall asleep). For the detail-lover, the increase in detail may add to the realism and the "you are there" experience, despite what you or I might think is an unnaturally colored system. But, of course, I'm talking about two points in what is almost certainly a continuum of listeners, and everyone likely has their own idea where realism starts and ends, and how they weight the various tradeoffs in putting together a system.

Increasing resolution is not the same thing as increasing “perceived detail,” since the latter may be increased, as you pointed out, by changing a system’s frequency response (i.e. making the system brighter). Increasing resolution is a matter of increasing either (1) format resolution, or (2) equipment resolution.

I guess that depends on how precisely you define your terms and how you measure the results. If you define resolution in purely technical terms, then you could increase the resolution of your source, and thereby your system, but that could have no audible result (because, for example, the signal-to-noise ratio of your overall system may be the limiting factor). So "resolution" then says something about your gear, but nothing about your sound, and is therefore disconnected from realism, ambience cues, and the "you are there" experience. But if you appeal to audible results, then "perceived detail" is one potential measure of resolution, and therefore may contribute to realism, etc.
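
To put rough numbers on the signal-to-noise point, here is a minimal Python sketch. It assumes ideal quantization noise for the source (roughly 6.02 dB per bit plus 1.76 dB for a full-scale sine), a hypothetical 90 dB signal-to-noise ratio for the rest of the chain, and uncorrelated noise sources whose powers simply add. The function names and the 90 dB figure are illustrative assumptions, not measurements of any real system.

```python
import math

def ideal_quantization_snr_db(bits):
    """Textbook SNR of an ideal N-bit quantizer for a full-scale sine wave."""
    return 6.02 * bits + 1.76

def combined_snr_db(*snrs_db):
    """Combine stages by adding their (assumed uncorrelated) noise powers."""
    noise_power = sum(10 ** (-snr / 10) for snr in snrs_db)
    return -10 * math.log10(noise_power)

CHAIN_SNR_DB = 90.0  # hypothetical figure for everything after the source

snr_16 = combined_snr_db(ideal_quantization_snr_db(16), CHAIN_SNR_DB)
snr_24 = combined_snr_db(ideal_quantization_snr_db(24), CHAIN_SNR_DB)

print(f"16-bit source: {snr_16:.2f} dB overall")
print(f"24-bit source: {snr_24:.2f} dB overall")
print(f"gain from the higher-resolution source: {snr_24 - snr_16:.2f} dB")
```

Under these assumptions the overall figure moves by well under 1 dB when the source goes from 16 to 24 bits, because the rest of the chain sets the limit. Whether a change that small is audible is a separate question.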

I do not think of resolution this way, and I don’t think most audiophiles do either. The term ‘resolution’ is used by audiophiles to describe both a characteristic of an individual COMPONENT and a characteristic of a whole SYSTEM. Hence the term ‘resolution’ says something about how a system sounds. I am not claiming ownership of the term ‘resolution.’ I am expressing what I believe to be the prevailing use of the term among audiophiles. For the purposes of this discussion, I will stipulate a definition of ‘resolution’: The absolute limit of information about the music that a format, component, or system can present.

You kind of make my point while simultaneously avoiding addressing it. If resolution is determined by audible metrics, then "perceived detail" is likely one of them. And ambience cues live in the detail.

If you take an information-theoretic approach to resolution -- as you seem to imply with your definition -- then I think you will be unhappy. The overwhelming majority of the information is in the high frequencies. Given the way human hearing works, you would get vastly more information by dumping the low frequencies entirely in favor of enhancing the highs -- you'd maximize the information about the music, but the result wouldn't be music. So I think some other definition is in order.

Which gets us back to my earlier point: the experience (you are there) is subjective. For some people a brighter system might provide it better than a more neutral system. And for those people, the realism obtained might outweigh the realism lost.

Bryon, regarding your recent post on ambience cues, directionality and listening rooms, I think you may be overlooking some aspects of what is going on with respect to the cues in the recording versus the cues from the listening room.

Consider doing the playback in exactly the same space as the recording. You set up the speakers and the equipment to optimally reproduce the soundstage, and put the listener in the position of the microphone that recorded the performance. Thus, your listening space exactly reproduces the recording space. Is this the optimal space for creating the “you are there” experience? I don’t think so, but it illustrates some issues:

1) Consider a single drum hit. From the optimal listening position, the stereo effect tells you that there is a drum set on the stage, left of center. What does the wall directly to the right of the speakers see? It sees two sources (the left and right speakers), separated in time by an amount set by the distance between the speakers. The reflections along the wall will see a delay between the two sources that varies something like the sine of the takeoff angle. The same for the left wall, other objects in the room, etc. This effect does not exist in the original performance. These echoes come to your ears as something other than what the single source on the recording produced. Let’s call it “source distortion.”

2) Now let’s replace the pair of speakers with a single speaker in the position of the drum set. The drum hit now behaves as a single source: the direct wave travels from the speaker to the listener as it should, and then hits (say) the back wall and comes back to the listener at exactly the same time as the echo in the recording gets to the listener as a direct wave. Thus, you have achieved your goal of reinforcing the primary cue. But the recorded echo itself then travels to the rear wall and comes back to the listener as a secondary echo that did not exist in the original performance. Let’s call this “echo distortion.”

3) Of course, your room is not exactly the configuration of the recording room, so on top of #1 and #2, you hear your primary room echo and the echo on the recording at different times. Let’s call this “temporal distortion.”
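
For anyone who wants rough numbers on the delay described in point 1), here is a minimal Python sketch. It assumes two ideal point-source speakers in free space, a speed of sound of 343 m/s, and a "takeoff angle" measured from the straight-ahead axis at the midpoint between the speakers; that reading of the angle, the function names, and the example positions are all assumptions made for illustration.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly 20 C

def arrival_delay_s(point, spacing_m):
    """Exact arrival-time difference (left speaker minus right) at a point.

    Speakers sit at (-spacing/2, 0) and (+spacing/2, 0); point is (x, y) in meters.
    """
    x, y = point
    left_path = math.hypot(x + spacing_m / 2, y)
    right_path = math.hypot(x - spacing_m / 2, y)
    return (left_path - right_path) / SPEED_OF_SOUND_M_S

def far_field_delay_s(point, spacing_m):
    """Far-field approximation: spacing * sin(angle) / c."""
    x, y = point
    sin_angle = x / math.hypot(x, y)  # angle measured from the forward (y) axis
    return spacing_m * sin_angle / SPEED_OF_SOUND_M_S

# Hypothetical setup: speakers 2 m apart, right wall 2.5 m from the center line.
for y in (0.5, 1.0, 2.0, 4.0):
    wall_point = (2.5, y)
    exact_ms = 1000 * arrival_delay_s(wall_point, 2.0)
    approx_ms = 1000 * far_field_delay_s(wall_point, 2.0)
    print(f"wall point y={y} m: exact {exact_ms:.2f} ms, sine rule {approx_ms:.2f} ms")
```

The delay is zero straight ahead and grows toward spacing divided by the speed of sound as the wall point moves toward the line through the two speakers. In a real room the sine rule is only approximate near the speakers, so treat the output as a feel for the scale of the effect rather than a prediction.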

In general, if you got the ambience cues on the recording to be omnidirectional in your listening space, you would end up with a) primary echoes from your listening room that were stronger than the secondary recorded echoes, and thus dominant, b) recorded ambience cues reflected by your room that arrived at your ears too late (i.e., the reflected ambience cues would be out of sync with the ambience cues radiated directly from the speakers), and c) many of the reflections suffering from source distortion.

I see this as a continuum. If you succeed in recreating a recording space perfectly, you get source and echo distortion with it. If your space is some average of the spaces you prefer (say, a generic jazz club), or you listen to recordings recorded in more than one place, you’ll also get temporal distortion. If you manage to suppress echo and temporal distortion (or the recording has weak ambience cues), then the direct echoes from your room will dominate, and you’ll actually get a “they are here” effect, rather than the desired “you are there” effect. If you suppress your room so that the recorded cues dominate, you get “you are there” cues but they’ll be bidirectional (but only if the recording has sufficient cues -- if it doesn’t you may get a somewhat dead or recording studio sound).

So you have a range of recordings (from heavy cues to none), and a range of rooms (from live to dead), but it doesn’t seem possible to have an optimal room for both ends of the spectrum (which I think you’ve said), and it doesn’t seem possible to get time/phase correct omnidirectional ambience cues that aren’t dominated by your room, rather than the recording (short of electronic intervention, which you and Learsfool have said is not desirable).

To sum up, I think to the extent that you succeed in making the ambience cues from the recording omnidirectional, they’ll be mis-timed, out of phase, and probably polarity flipped. And that is on top of all of the very strong room cues that you will necessarily generate to get the recorded cues to be omnidirectional. Or, to put it another way, I don’t think it is possible to get the recorded cues to be omnidirectional without seriously compromising the “you are there” effect.

So, my theory:
1) Strong recorded cues + live room = a mess tending toward “they are here”
2) Strong recorded cues + dead room = “you are there” but with bidirectional cues
3) Weak recorded cues + live room = “they are here” but if the room is sufficiently like the recording space, you approximate “you are there” for that space
4) Weak recorded cues + dead room = “they are here” (or in a studio)

All of this comes with the caveat that what I say may be true for certain kinds of cues and not others.

...I composed my response, above, before seeing Bryon's most recent post. But I think everything still stands.

Bryon, I agree that experimentation is really the only way to answer some of these questions, and likely the only way to find an ideal listening environment for a person’s particular taste. The alternative is hiring someone with the experience to design a room based on your expressed preferences, though even that might take a few iterations or adjustments, since it is unlikely to be right on the first pass (unless you’ve already heard exactly what you want and can point to it and say “I want that”).

My point was mostly about the difficulty of getting the cues on the recording to be omnidirectional. If you achieve it, I think you also get a whole bunch of extra stuff from your room that you probably don’t want and that would likely swamp the recorded cues. And even then, to the extent that the cues on the recording are omnidirectional, they’ll be mistimed and out of phase. I’m not sure it’s physically possible (outside of electronic intervention) to get the cues *on the recording* to be both omnidirectional and realistic-sounding.

The various kinds of room colorations you mention, what you are calling “source distortion,” “echo distortion,” and “temporal distortion,” are definitely things to be addressed. But it seems to me that these are precisely the kinds of things that an acoustically treated room DOES address. “Source distortion” is typically addressed by absorption or diffusion at the first order reflection points on the side walls and the ceiling. “Echo distortion” is typically addressed with diffusion behind the speakers. “Temporal distortion” is typically addressed by balancing the ratio of absorption to diffusion to achieve a specific reverberation time.

Right, I agree. But my point is again about the ambience cues in the recording. The primary signal in the music is generally going to dominate, and the cues are softer, lower SNR, and more diffuse. So, if you succeed in taming the distortions I mentioned for the primary, you also greatly diminish the omnidirectional nature of the cues -- probably completely out of existence. If you don’t succeed in taming the primary reflections, then they’re likely to overwhelm the reflected cues. But this is an argument from theory, and there may be some middle ground where it could work.

My view is that omnidirectional ambient cues are more valuable than strictly accurate ambient cues for creating the illusion that "you are there." Having said that, I guess I’m not as skeptical as you, Cbw, about the possibility of constructing a listening space whose acoustics allow for omnidirectional ambient cues that are REASONABLY ACCURATE to the recording.

If I understand you correctly, I think you are saying that one can, effectively, simulate ambience cues that approximate the cues on the recording, but are not sourced from the cues on the recording. If that’s the case, I agree (with the caveat that if the cues on the recording are strong and not well-matched to the room, you are likely to get a mess). To achieve this, you will be structuring your listening space to create a certain ambience. If that matches well with your music, you may have a very pleasing “live” sound. If it doesn’t, well, you’ll have to learn to live with it (or maybe have some movable absorption panels that can deaden the room effect when it’s not desirable).

I think, though, that purists will not like this approach. To the extent that you are creating ambience cues from the listening room, you are obscuring information on the recording. Learsfool, for example, might not like this approach for his listening, since he’s expressed a strong preference to hear precisely what is on the recording down to the differentiation of concert halls on fifty-year-old records. That probably wouldn't be possible in a room that was not very dead, or with a soundfield that was not very focused.

Bryon, I agree with most everything in your recent post. I would like to clarify one point that I tried (probably unsuccessfully) to make in my most recent post. You say:

-“reactive room” is a listening space with significant ambient cues. Hence a listening space that significantly interacts with the ambient cues of the recording during playback. A.k.a., a “live room.”

My point is that a reactive room reacts to everything in the signal, not just the ambience cues. Thus, with the drum hit I was talking about, the direct wave reaches the microphone first* as a primary signal, then come the echoes, reverb, etc. in its wake. The cues come later, smaller in amplitude, and more stretched in time than the primary signal. So a room that reacts to the cues will always react also to the primary signal, and that signal will generally be stronger than the cues.

*While it is technically possible for a signal to reach the microphone before the direct wave, I don't think it is a big factor in most recordings.
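
As a rough illustration of why the cues arrive later and weaker than the primary signal, here is a minimal Python sketch. It assumes simple spherical spreading (amplitude falling as 1 over distance), a speed of sound of 343 m/s, and a single reflection that keeps a hypothetical 80% of its amplitude; the path lengths and the `arrival` helper are made up for the example.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly 20 C

def arrival(path_m, reflection_coeff=1.0):
    """Return (delay in ms, amplitude relative to the level at 1 m) for a path."""
    delay_ms = 1000.0 * path_m / SPEED_OF_SOUND_M_S
    amplitude = reflection_coeff / path_m  # spherical spreading: 1/r
    return delay_ms, amplitude

# Hypothetical recording: drum 3 m from the microphone, one wall bounce of 11 m total.
direct = arrival(3.0)
echo = arrival(11.0, reflection_coeff=0.8)

print(f"direct wave: {direct[0]:.1f} ms, relative amplitude {direct[1]:.3f}")
print(f"first echo:  {echo[0]:.1f} ms, relative amplitude {echo[1]:.3f}")
```

With these made-up numbers the echo trails the direct wave by tens of milliseconds and arrives at a fraction of its amplitude, which is the sense in which a reactive room always has more primary signal than cue to react to.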