Depth is a somewhat crude illusion, certainly not as precise as lateral placement. In most systems, center-image instruments and soloists tend to sound as if they are located at the wall behind the speakers. Instruments and soloists located far left or right tend to sound more like they are located at the speaker. The difference between the back wall and the plane of the front of the speakers more or less becomes the perceived depth.
There are certain recording cues that add to the sense of depth. If you hear a soloist singing up close to the microphone, with very little reverberation (the natural decay of echoes of that voice), the sense is that the person is closer to you; the farther away that person is, the higher the percentage of reverberant energy (vis-a-vis direct energy) and the greater the sense of distance. It is not often that this type of reverberant decay is well captured (or faked) in recordings. Other cues include changes in the tonal balance of instruments and singers when they are closer or farther away (e.g., higher-frequency sibilance is softened with distance).
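To put rough numbers on the direct-versus-reverberant cue, here is a minimal sketch. It assumes a textbook idealization, not anything measured from a real recording: direct sound falls off with the inverse square of distance, while the reverberant field stays roughly constant throughout the room. The function names and the 1 m default "critical distance" (the distance at which direct and reverberant energy are equal) are my own placeholders.

```python
import math

def direct_to_reverb_db(distance_m, critical_distance_m=1.0):
    """Direct-to-reverberant energy ratio in dB.

    Assumes inverse-square falloff for the direct sound and a constant
    (diffuse) reverberant level. critical_distance_m is a hypothetical
    room value: the distance where the two energies are equal.
    """
    if distance_m <= 0 or critical_distance_m <= 0:
        raise ValueError("distances must be positive")
    return 20.0 * math.log10(critical_distance_m / distance_m)

def reverberant_fraction(distance_m, critical_distance_m=1.0):
    """Share of total energy at the mic that is reverberant (0..1)."""
    if distance_m <= 0 or critical_distance_m <= 0:
        raise ValueError("distances must be positive")
    direct_over_reverb = (critical_distance_m / distance_m) ** 2
    return 1.0 / (1.0 + direct_over_reverb)

if __name__ == "__main__":
    for d in (0.25, 0.5, 1.0, 2.0, 4.0):
        print(f"{d:5.2f} m  D/R = {direct_to_reverb_db(d):6.1f} dB  "
              f"reverberant share = {reverberant_fraction(d):.0%}")
```

Under these assumptions, each doubling of the distance from the mic costs about 6 dB of direct-to-reverberant ratio, which fits the sense of a voice moving back into the room.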
It is easiest to hear these things with simpler recordings done by companies that care about recording quality (e.g., Chesky Records). They have some test CDs that include samples of their music and stereo demonstration tracks that are pretty good at demonstrating depth (one track captures depth cues by having a speaker, who also taps a tambourine, simply move progressively away from a microphone).
Whenever imaging or soundstage is mentioned, I like to remind people about these resources. The following provide tests with which one may determine whether one's system actually images, or reproduces a soundstage, as recorded.
I.e., on the Chesky sampler/test CD, David explains in detail his position on the stage and his distance from the mics as he strikes a tambourine (Depth Test).
The LEDR test tells you, before each segment, what to expect if your system performs well.
The Chesky CD contains a number of tests in addition to the LEDR.
Yes, this Chesky CD is the one I referred to. Most music CDs cannot be so simply recorded and would not produce such specific cues for depth. Mostly, good recordings don’t convey specific information about the depth position of the person or instrument, but they do give one a sense of that person or instrument playing in its own space and not just pasted into the same flat plane. This separation, and the sense that some instruments are at or in front of the plane of the speakers and some are at or behind the plane of the back wall, is good depth presentation.
Only a layman’s experience to share. The experience is one of psychoacoustics: creating enough of an illusion to make it easier for you to imagine depth. Width, on the other hand, is mostly a function of how the recording engineer balances the left and right output for a given signal/instrument.
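For anyone curious what "balancing between left and right" can look like in numbers, here is a small sketch of a constant-power pan law, one common way to do it. I am not claiming any particular engineer or console uses exactly this curve; the function name and the -1 to +1 position scale are just illustrative choices.

```python
import math

def constant_power_pan(position):
    """Return (left_gain, right_gain) for a mono source.

    position: -1.0 = hard left, 0.0 = center, +1.0 = hard right
    (a hypothetical scale chosen for this example).
    The gains follow a sine/cosine curve so that left**2 + right**2
    stays at 1, keeping the total power steady across the stage.
    """
    if not -1.0 <= position <= 1.0:
        raise ValueError("position must be between -1 and 1")
    angle = (position + 1.0) * math.pi / 4.0  # maps to 0 .. pi/2
    return math.cos(angle), math.sin(angle)

if __name__ == "__main__":
    for pos in (-1.0, -0.5, 0.0, 0.5, 1.0):
        left, right = constant_power_pan(pos)
        print(f"pos {pos:+.1f}: L = {left:.3f}, R = {right:.3f}")
```

At center both channels sit at about 0.707 (roughly 3 dB down each), which is why a centered voice does not jump in level compared with one panned to a single speaker. Level-only panning like this places sounds between the speakers, not outside them.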
For an amazing image width illusion (no idea how/why it works), listen to Roger Waters' album Amused to Death. Just the first 2 or 3 tracks. You’ll hear a dog barking far to the right of your right speaker, almost as if it’s coming from directly to your right side. You’ll hear spoken dialogue (track 2, as I recall) that comes from beside you to the left. I use that album to set up my speakers. The illusion is almost spooky when dialed in.
Sound "appears" to come from far left and right, well outside the speaker placement width.
It also appears behind the speakers and in front of the speakers. Apparent height is also heard, and if you have a nice setup you can actually hear sound that appears to come from behind the listener.
You don't need expensive equipment to hear this. A definitely-not-audiophile, fully studio-manufactured recording I use for a lot of demos is Led Zeppelin II, side one. If you can't hear the swirling up and behind your head on track 1, your speakers/room are not set up properly.
Amused to Death was processed using something called Q-Sound. This is a clever processing that mimics what happens when sound from sources to the extreme left or right, or even behind, hits the listener's head. If, for example, a source is located directly at your left ear, the sound first hits your left ear; the head shades the source from directly hitting your right ear, but the sound diffracts and travels around the outside of your head anyway and reaches the right ear. That sound hitting the right ear is different in timing (phase) and in frequency balance, and your brain knows how to interpret those differences as a source to the extreme left. Q-Sound reproduces this effect and then, for the extreme left example, injects out-of-phase information into the right channel to cancel parts of its signal to achieve a simulation of this sort of effect. Very clever. Not many recordings are encoded this way, but there are enough recordings with similar cues that cause instruments to appear well outside the location of the speakers.
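To give a feel for the size of the timing difference between the ears, here is a rough sketch using the classic spherical-head (Woodworth) approximation. This is not Q-Sound's actual processing, which I can't speak to; it only estimates the natural delay the brain is used to hearing. The 8.75 cm head radius and 343 m/s speed of sound are assumed typical values.

```python
import math

SPEED_OF_SOUND_M_S = 343.0   # assumed: air at about 20 C
HEAD_RADIUS_M = 0.0875       # assumed: typical adult head radius

def interaural_time_difference(azimuth_deg,
                               head_radius_m=HEAD_RADIUS_M,
                               speed_m_s=SPEED_OF_SOUND_M_S):
    """Extra travel time (seconds) to the far ear for a distant source.

    Woodworth spherical-head approximation:
        ITD = (a / c) * (theta + sin(theta))
    azimuth_deg: 0 = straight ahead, 90 = directly to one side.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth must be between 0 and 90 degrees")
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_m_s) * (theta + math.sin(theta))

if __name__ == "__main__":
    for az in (0, 15, 30, 60, 90):
        itd_us = interaural_time_difference(az) * 1e6
        print(f"{az:2d} degrees: about {itd_us:.0f} microseconds")
```

With these assumed numbers, a source directly to one side reaches the far ear roughly two thirds of a millisecond late. Delays that small, together with the frequency shading by the head, are the kind of cue this sort of processing has to imitate.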
The same phase-shifting and frequency-response changes from sound hitting the head at different angles also give height cues. This is something demonstrated well by the Chesky Test CD described above. They create an artificial series of test signals (the LEDR test Rodman99999 described above) that appear to rise up out of one speaker, move to a point almost overhead, then descend back into the other speaker. The effect is not as pronounced if the speaker does not have good phase coherency, or if the speaker location has a lot of nearby surfaces reflecting sound and confusing the carefully constructed cancelling signals. This helps the listener find speaker placements that minimize such interference, which should help with the imaging of the system.