Jules Coleman, a noted professor of jurisprudence (a philosopher, not a lawyer), wrote an article on the subject some years ago. It disappeared from the Net but is now back up. Here it is: https://www.inner-magazines.com/audiophilia/musical-value-and-its-reproduction/
Coleman's thesis is similar to yours: why elevate the recording (or more accurately, an LP or other "finished" mixed-down product taken from the recording) to a reference when it does not represent the actual performance? This assumes that an actual performance of the instruments in the same room occurred (as opposed to a multitrack conglomeration of overdubs).
The "recording" doesn't necessarily reflect the actual performance (assuming one took place). Yes, it may be the best we have, but is "accuracy" to the medium the right measure?
Very seldom do we hear recordings of live performances we attended. Even when we do, where you were positioned in the room makes a difference.
Coleman rejects the notion that an audio reproduction system should be a forensic tool. He doesn't offer a clear alternative other than to state that a "good" system should be emotionally engaging.
Yes, that's a vague measure.
I know that as my serious listening has evolved, I listen differently than I once did. I'm no longer looking for "spectacularity" (think: audiophile warhorse) but instead for a more convincing illusion of real instruments in the room. I can do this successfully with small-scale jazz combos in a fairly large room. Trying to recreate a full symphony or the impact of King Crimson live is much harder; achieving that scale has much to do with the size of the room as well as the capability of the equipment.

