You are completely right, for sure!
But that said, what you describe is only the starting point, the necessary system synergy, not the end result. The end result cannot be reached without mechanical acoustic room controls, DSP, and "tweaks" of the mechanical, electrical, and acoustical kind...
That is why musicality and resolution are not basic acoustics concepts but notions inherited from gear marketing (the old analog versus digital debate, solid state versus tubes, etc.)... The basic acoustics concepts are "timbre" perception in recording and playback, spatial sound cues, and "apparent source width" and "listener envelopment", that is, the ASW/LV ratio, or immersiveness...
All our work begins after the purchase, within our budget limits, of a "resolving and musical" system for a specific controlled or uncontrolled room and for specific or anonymous ears/brain... And the difference between "specific" and "anonymous" ears/brain, and between controlled and uncontrolled system/room acoustic coupling, makes all the difference in the world...
Small room acoustics, by the way, differ completely from great hall acoustics...
Also, although we can guess whether a system is more or less good from a YouTube recording, a truly effective judgment implies being in the owner's room. This is so true that I do not feel left behind with my peanuts-cost system when I listen to most costlier systems/rooms, save some with astounding acoustics, which are very evident and really costlier than most...
My speaker system is satisfying to me for its price, at the very minimum threshold of acoustical satisfaction... My top headphone, the hybrid AKG K340, could not beat my first acoustic system/room, which I lost when I sold my house and with it my dedicated normal room, but it now beats my second, smaller one (nearfield listening in an acoustic basement corner)...
To me, musicality is like art: it is in the ear of the beholder. Resolution, by contrast, is tangible and directly related to accuracy. For a home audio system to accurately reproduce music, it must be highly resolving. From a technical perspective, we do not listen to pure sinusoidal tones but to complex composite sound waves. To reproduce these complex electrical and acoustical music signals, the system must possess extended bandwidth at both frequency extremes, because these signals are made up of both correlated and uncorrelated frequencies, both inside and outside the audible frequency spectrum.
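To put a rough number on the bandwidth point, here is a minimal sketch in Python. It assumes a 1 kHz square wave as a stand-in for a "complex composite" signal (my choice for illustration, not something from the posts above) and the function names are hypothetical. It rebuilds the wave from its sine harmonics up to a given bandwidth limit and measures how far the result is from the ideal waveform. It only shows that waveform error shrinks as bandwidth grows; it says nothing about what is audible.

```python
import math

def square_partial_sum(t, f0, max_freq):
    """Band-limited square wave: sum of odd sine harmonics of f0 up to max_freq (Hz)."""
    total = 0.0
    k = 1
    while k * f0 <= max_freq:
        total += math.sin(2 * math.pi * k * f0 * t) / k
        k += 2  # a square wave contains only odd harmonics
    return 4 / math.pi * total

def rms_error(f0, max_freq, samples=2000):
    """RMS difference between an ideal unit square wave and its band-limited version,
    evaluated at evenly spaced points over one period."""
    period = 1.0 / f0
    err = 0.0
    for n in range(samples):
        t = (n + 0.5) * period / samples
        ideal = 1.0 if t < period / 2 else -1.0
        err += (ideal - square_partial_sum(t, f0, max_freq)) ** 2
    return math.sqrt(err / samples)

if __name__ == "__main__":
    # Illustrative bandwidth limits in Hz (assumed values, not measurements).
    for limit in (1_000, 5_000, 20_000):
        print(f"bandwidth {limit:>6} Hz -> RMS error {rms_error(1_000, limit):.3f}")
```

With only the 1 kHz fundamental allowed through, the "square wave" is just a sine; each wider limit lets more harmonics in and the error drops accordingly.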

