Electrical/mechanical representation of instruments and space


Help, I'm stuck at the juncture of physics, mechanics, electricity, psycho-acoustics, and the magic of music.

I understand that the distinctive sound of a note played by an instrument consists of a fundamental frequency plus a particular combination of overtones in varying amplitudes, and that the combination can be graphed as a particular, nuanced two-dimensional waveform shape. Then you add a second instrument playing, say, a third above the note of the first instrument, and its unique waveform shape represents that instrument's sound. When I'm in the room with both instruments, I hear two instruments because my ear (or rather two ears, separated by the width of my head) can discern that there are two sound sources.
But let's think about recording those sounds with a single microphone. The microphone's diaphragm moves and converts changes in air pressure to an electrical signal. The microphone is hearing a single set of air pressure changes, consisting of a single, combined wave from both instruments. And the air pressure changes occur in two domains, frequency and amplitude (sure, it's a very complicated interaction, but still capable of being graphed in two dimensions). Now we record the sound, converting it to electrical energy, stored in some analog or digital format. Next, we play it back, converting the stored information to electrical and then mechanical energy, manipulating the air pressure in my listening room (let's play it in mono from a single full-range speaker for simplicity).
How can a single waveform, emanating from a single point source, convey the sound of two instruments, maybe even in a convincing 3D space? The speaker conveys amplitude and frequency only, right? So, what is it about amplitude or frequency that carries spatial information for two instruments/sound sources? And of course, that is the simplest example I can design. How does a single mechanical system, transmitting only variations in amplitude and frequency, convey an entire orchestra and choir as separate sound sources, each with its unique tonal character? And then add to that the waveforms of reflected sounds that create a sense of space and position for each of the many sound sources?
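One way to look at the two-instrument case is with a small numerical sketch. It assumes idealized "instruments" made of a fundamental plus two overtones, with made-up amplitudes, a just-intonation major third (440 Hz and 550 Hz), and an arbitrary 8 kHz sample rate; none of these values come from a real recording. The point it illustrates is that the microphone signal is just one pressure value per instant, the sum of both sources, and yet every partial of both instruments can still be measured in that single summed waveform. It says nothing about how the ear groups those partials into separate sources or about spatial cues.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for illustration)
N = 800             # 0.1 s of signal, so every test frequency fits a whole number of cycles

def tone(freq_hz, harmonic_amps):
    """One idealized instrument: a fundamental plus overtones with the given amplitudes."""
    return [
        sum(a * math.sin(2 * math.pi * freq_hz * (k + 1) * n / SAMPLE_RATE)
            for k, a in enumerate(harmonic_amps))
        for n in range(N)
    ]

def magnitude_at(signal, freq_hz):
    """Amplitude of a single frequency component, measured with a one-bin DFT."""
    re = sum(s * math.cos(2 * math.pi * freq_hz * n / SAMPLE_RATE)
             for n, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
             for n, s in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)

# Two hypothetical timbres: same kind of recipe, different overtone mix.
instrument_a = tone(440, [1.0, 0.5, 0.25])  # A4
instrument_b = tone(550, [0.8, 0.1, 0.4])   # a major third above (ratio 5:4)

# What a single microphone captures: one number per instant, the sum of both.
mic = [a + b for a, b in zip(instrument_a, instrument_b)]

# Every partial of both instruments is still measurable in the one summed signal.
for f in (440, 880, 1320, 550, 1100, 1650, 700):
    print(f"{f:5d} Hz  amplitude {magnitude_at(mic, f):.3f}")
```

The 700 Hz row is a control: neither instrument contains that frequency, so its measured amplitude comes out at essentially zero, while the six partials come back at the amplitudes that went in.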

77jovian
If one were to take someone from the hinterlands who'd never heard anything but natural sounds and play them the finest HiFi in the world, they would have no idea whatsoever that the sound was anything but some chromy illuminated beast.

If they then heard the same piece in another location with different hardware but the same distortions, they would be astounded that two such beasts had the same song.

As usual, Mr Kait gets it wrong. What Mr. Feynman actually said is “Hell, if I could explain it to the average person, it wouldn’t have been worth the Nobel prize.”

Oh, and Kait's questions are so far off the mark that one has to wonder about everything he ever writes.
That is what I said, Mr. Eels. Try to calm down. 🤪 Ever heard of Valium?

But getting back to the question posed by the OP, what the speakers produce is a function of the electronics, cabling, power cord, fuse, room treatments - everything. So obviously the ability to produce the full orchestra with all the details including the venue acoustic information in a coherent audio waveform without the usual distortion and noise is a huge challenge. I know what some of you are thinking - What noise and distortion? 😳
I know that’s what you’re thinking. You’re wrong. As usual these days, if you don’t mind my saying so too much, Mr. Bluster. 🤡