"Musicality" is a meaningless word inasmuch as it amounts to "The Way I Like To Hear Music". It could encompass anything, depending on the individual.
The rock-solid standard for truth in sound is being physically present in the same space as the performer, listening to an unamplified performance. We might then compare that with the sound of the same performance, in the same space, heard with the assistance of microphones and amplification/transducers.
That is one comparison.
Everything changes once the recording has been captured. We now assess the recording as an entirely different entity, even if it is recorded and played back in the same space. In one instance the sound reaches our ears direct from the mouth/instruments with no intervening process. In the other, after recording, we listen to a different thing: related to the in-person experience by being a capture of it, but no longer the same thing. An animal unto itself.
So we listen to recordings, but not in a standardised way. We listen on whatever device(s) we have to hand. We listen to a facsimile on facsimile decoders. The 'excitement' is on the facsimile. No facsimile decoding system adds excitement. The best a facsimile decoder can do is accurately convey the facsimile.
Any value judgement of the facsimile decoder, whatever it is, will ultimately be subjective, because at this point in time we cannot determine through instruments alone whether a facsimile reproduction system is absolutely accurate. Nor is there any single verified accurate facsimile decoding system and room to use as a benchmark.
So here we all are muddling about with diverse equipment and diverse rooms listening to diverse facsimiles. Where is the scientific benchmark to make absolute assessments there?
We shall not get anywhere by abandoning science, but we don't yet have a scientific answer to what absolute accuracy is in regard to facsimile decoding. We have measurements we can make. We have auditions we can make. We haven't arrived at an absolute. So at this stage, bereft of an absolutely accurate playback system and an absolutely accurate playback space, everything remains a compromise.
I think our ears are good enough to rule out the truly nasty playback options and good enough to pick out the exceptional ones. The wrench in the works, which throws all comparisons of playback options into meaninglessness, is that we each listen in different rooms. Even if we had a standardised absolute playback system and standardised absolute playback facsimiles, once they are used in diverse listening spaces, all listening results become specific to the particular space.
We cannot yet think in terms of absolutes with regard to replay of recordings. There's a gamut of standards in the recordings. There's a gamut of standards in replay systems and there's a gamut of rooms in which we place those systems.
We muddle onward. No-one is in possession of the truth. All are somewhere on a spectrum of less accurate to more accurate.