Is there a measurement that correlates with cohesive/pinpoint imaging?

I am currently using single-driver Omega alnico speakers, which have the most coherent imaging that I've ever heard.  However, if I wanted to compare them with other speakers (including multi-way speakers with crossovers) in that regard, are there any specific measurements to look at?  Would the measured delay between driver signals in a multi-way speaker be a useful proxy?

The problem with attempting to use head-related transfer functions at the recording level is that it requires a near-perfect playback setup (specific to the recording, unless you are using headphones), which is not readily achievable, or achievable at all, in most people's systems. Until we substantially change the listening system away from 2-channel speakers, we are pretty much stuck with what is possible now, though advanced DSP algorithms and controlled reflections show promise in advancing what can be done with 2 channels. Sorry, analog just is not going to cut it for those advancements.
Within the framework of two-channel audio, it is not simply timing information that is important, but relative timing information, i.e. the difference in arrival time between your two ears. To that end, a large baffle will not change the timing differential of the primary wavefront travelling from one speaker to your two ears. Maybe you meant something else?
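To put a rough number on that relative timing, here is a minimal sketch in Python. It assumes straight-line paths from a point source to each ear and ignores head shadowing and diffraction entirely. The function name, the 0.18 m ear spacing and the speaker position are illustrative assumptions, not measurements.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly room temperature


def interaural_time_difference(source_x, source_y, ear_spacing=0.18):
    """Arrival-time difference in seconds between the right and left ear.

    The ears sit at (-ear_spacing/2, 0) and (+ear_spacing/2, 0) and the
    source is a point at (source_x, source_y), all in metres.
    Straight-line paths only: the head itself is ignored (a simplification).
    A positive result means the left ear hears the sound first.
    """
    half = ear_spacing / 2.0
    d_left = math.hypot(source_x + half, source_y)
    d_right = math.hypot(source_x - half, source_y)
    return (d_right - d_left) / SPEED_OF_SOUND_M_S


# Hypothetical left speaker, 1.2 m to the left and 2.5 m in front of the listener.
itd = interaural_time_difference(-1.2, 2.5)
print(f"ITD from the left speaker: {itd * 1e6:.0f} microseconds")
```

Note that the baffle size does not appear anywhere in this geometry, which is the point being made about the primary wavefront.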

Time and phase. Study how the highly evolved ear-brain system localizes food and threat sounds and you will understand that it relies on small timing differences. Add in low diffraction, because a large baffle destroys time information while functioning as a mechanical averaging machine (frequency response).

Single drivers do not really image coherently. IM distortion from one cone moving at widely different frequencies at once ruins perceived cohesiveness, not to mention different parts of the driver behaving differently at different frequencies.

Imaging in most recordings is purely conceptual. It is manufactured. Most live recordings have nothing remotely like imaging either, with the very odd exception, and even then only just. That makes left-right manufactured, and depth too, pretty much. Height? It is not there. It just is not in the recording.

Audiophiles will convince themselves of a million reasons why "imaging" is better or worse, right down to fuses. It makes me laugh. Imaging is almost exclusively your room and your speakers and, of course, the recording. Really awful electronics can impact imaging, but we are talking about the last little bit, and most people are not remotely there.

Want pinpoint imaging ... go into an anechoic chamber or wear headphones. I know, not the answer you were looking for. Headphones are better, but an anechoic chamber can be a good substitute. Sounds a bit like crap, though.

So, can you measure imaging? No, but you can measure the impact of the speakers, room and electronics on what reaches the ears and get a relatively good impression of what the imaging is likely to be. What I can’t tell you is whether you will like it. It is a trade-off between imaging and ambience in the real world with speakers.
Perfect frequency response from both sides sounds like it would be super important, but in practice, for what most people describe as imaging, it is not. Symmetrical installation in general is far more important.
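As one illustration of measuring what reaches the ears, here is a rough sketch of a symmetry check. This is my own example, not a standard procedure: it assumes you have already captured an impulse response from each speaker at the listening position as plain lists of samples, and it finds the lag where the two line up best by brute-force cross-correlation. The name best_lag and the toy data are hypothetical.

```python
def best_lag(a, b, max_lag):
    """Return the lag in samples at which b best lines up with a.

    A positive result means b arrives later than a. Brute-force
    cross-correlation, fine for short windows around the direct sound.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, x in enumerate(a):
            j = i + lag
            if 0 <= j < len(b):
                score += x * b[j]
        if score > best_score:  # on ties, the first (most negative) lag wins
            best, best_score = lag, score
    return best


# Toy data: the right response is the left one delayed by two samples.
left = [0.0, 0.0, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0]
right = [0.0, 0.0, 0.0, 0.0, 1.0, 0.5, 0.0, 0.0]
lag = best_lag(left, right, max_lag=4)
print(f"Right speaker arrives {lag} sample(s) after the left")
```

A lag near zero suggests the two speakers are about the same distance from the listening position. It says nothing about the reflections, which need to be looked at separately.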

What most people describe as imaging is not exact placement, which has no meaning at all really. If I have a left / right volume difference and something shifts 6" to the left or right, that is not going to be critical.

What most describe as imaging is an ability to visualize an exact spot for the sound source, not just "over there", though a lot of people like "over there". A well-designed cross-over is going to provide a time-coincident wavefront, given a suitable speaker/listener distance, and with digital cross-overs that is certainly very easy.
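For the digital case, the arithmetic is simple enough to sketch. This assumes you know how much farther one driver's acoustic centre is from the listener than the other's, and that you delay the nearer driver to match. The function name, the 48 kHz sample rate and the example offset are assumptions for illustration, and real acoustic centres shift with frequency, so treat the result as a starting point.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly room temperature


def alignment_delay(offset_m, sample_rate_hz=48000):
    """Delay to apply to the nearer driver so both wavefronts arrive together.

    offset_m is the extra path length in metres from the farther driver's
    acoustic centre to the listener. Returns the delay in seconds and the
    same delay rounded to whole samples at the given sample rate.
    """
    if offset_m < 0:
        raise ValueError("offset_m must be zero or positive")
    delay_s = offset_m / SPEED_OF_SOUND_M_S
    return delay_s, round(delay_s * sample_rate_hz)


# Hypothetical woofer whose acoustic centre sits 34.3 mm behind the tweeter's.
delay_s, delay_samples = alignment_delay(0.0343)
print(f"Delay the tweeter by {delay_s * 1e6:.0f} microseconds ({delay_samples} samples)")
```

At 48 kHz one sample is about 21 microseconds, so whole-sample delays are coarse for small offsets. Many DSP cross-overs offer finer steps, but check what your own unit supports.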