What tests would you like all speaker reviewers to do for their reviews?

What qualitative or quantitative tests do you think should be performed regularly on all speakers?  
Maybe like “how fatiguing is it with certain gear and cables?”  

Any other ideas?
Impulse response using square waves. Surprisingly many speakers fail badly at this simple test. The Quad 57 and 63 both do well, as do the Walsh and Magneplanar designs. This is a test of coherency: one pulse in should produce one pulse out. Multi-driver speakers have difficulty with this!
John Atkinson of Stereophile does the best quantitative testing of speakers! Unlike electronics, speakers have not evolved to the point where they can hide their flaws from careful measurements!
impulse, waterfall, then frequency response...

Since 1977....
while designing for an easy amplifier load....
I’d like to see an impedance graph that covers the entire frequency range.  This “Nominal Impedance” number that’s commonly quoted is next to useless IMO, and impedance levels and frequencies are critical to know in choosing the proper amp. 
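As a rough illustration of why a single nominal figure hides what the amp actually sees, here is a minimal Python sketch. The sweep values and the 8-ohm nominal rating are invented for the example, not measurements of any real speaker, and the sketch ignores phase angle, which also affects how hard a load is to drive.

```python
# Hypothetical impedance sweep: (frequency in Hz, impedance magnitude in ohms).
# Invented values for illustration only, not measurements of a real speaker.
SWEEP = [
    (20, 9.5), (45, 32.0), (80, 6.1), (120, 3.2),
    (400, 5.0), (2500, 14.0), (10000, 6.5), (20000, 8.0),
]


def minimum_impedance(points):
    """Return the (frequency, ohms) pair with the lowest impedance magnitude."""
    return min(points, key=lambda point: point[1])


def current_amps(volts, ohms):
    """Ohm's law, treating the load as purely resistive: I = V / R."""
    return volts / ohms


nominal_ohms = 8.0  # assumed spec-sheet figure
drive_volts = 2.83  # roughly 1 W into 8 ohms

freq, ohms = minimum_impedance(SWEEP)
print(f"Minimum: {ohms} ohms at {freq} Hz")
print(f"Current at nominal: {current_amps(drive_volts, nominal_ohms):.2f} A")
print(f"Current at minimum: {current_amps(drive_volts, ohms):.2f} A")
```

With these made-up numbers the amp has to supply well over twice the current at the dip that the nominal rating would suggest, which is the kind of thing only a full graph shows.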
Exactly @tomic601. Danny Richie performs those tests on all the loudspeakers he gets in (and has designed and offered as kits), and displays the results in his GR Research YouTube videos. Very illuminating!
I’d like them to describe their preferred sound types - we all have a preference - up front. Then I’d like them to say how the speaker under test satisfies their preferences. How can they honestly do anything else?

I hadn't heard of this test.  Does anybody perform it in their reviews?
Have the speakers shipped over to me. I will give them a full workup and listen to them and take them apart. The speakers would be sent back with a full report on what needs to be fixed. 

If ALL speakers did this, I bet 99% of speakers would not be permitted in the marketplace, as they would fail my examination.
For my ears:
Dynamic compression: the difference in output at, say, 70 dB vs. 90 dB.

Distortion at the same.
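One way to put a number on dynamic compression, sketched here in Python under the assumption that it is defined as the shortfall between the increase in drive level and the increase in measured SPL. The function name and the readings are hypothetical.

```python
def compression_db(drive_low_db, drive_high_db, spl_low_db, spl_high_db):
    """Shortfall between the rise in drive level and the rise in measured SPL."""
    expected_rise = drive_high_db - drive_low_db
    measured_rise = spl_high_db - spl_low_db
    return expected_rise - measured_rise


# Hypothetical readings: drive raised by 20 dB, output only goes
# from 70.0 dB SPL to 88.8 dB SPL instead of the expected 90.0.
loss = compression_db(0.0, 20.0, 70.0, 88.8)
print(f"Compression: {loss:.1f} dB")
```

A perfectly linear speaker would return 0 dB here; anything above that is output lost at the higher level.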

Gee, if we accept the premise that every loudspeaker must be custom tuned to each listener's ears, of what value to one person is another listener's impressions of a given loudspeaker?

That question is rendered moot by the fact that the premise is utter nonsense. For each listener, the same ears are hearing both live and reproduced sound. Some will follow my reasoning, at least one won't.
Live sound is not the same as reproduced sound. Why is that hard to understand? Your argument is therefore not valid.
@bdp24, I get your point. Sound is sound - pressure variations with respect to time. How it's processed by the ear, nerves, brain system is what we hear. Correcting a speaker for the hearing mechanism processing does not make sense to me, because any heard sound already has been processed by the same mechanism within each person's ear-brain system.
This is all wishful thinking.
As for the original question by the OP, I think we all need to agree on the objective of what a speaker should do.

From what I read in textbooks before, the objective may be stated as, 'A speaker should reproduce the electrical signal received at its inputs exactly, as a pressure variation output.' Obviously, we all know from experience that this does not happen exactly, and that speakers are less accurate in this respect than electronics (pre-amps, amps). So we need a test to show that the speaker output waveform is the same as the electrical waveform input to the speaker.
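A minimal Python sketch of what such a comparison could look like, assuming the input and the measured output are already time-aligned and sampled at the same rate. The error metric (relative RMS residual after removing the best-fit gain) and the test signal are illustrative choices, not a standard method, and the sketch says nothing about the microphone or the room.

```python
import math


def waveform_error(reference, measured):
    """Relative RMS error of `measured` after removing the best-fit gain.

    0.0 means the output is an exact scaled copy of the input.
    """
    if len(reference) != len(measured):
        raise ValueError("signals must have the same length")
    ref_energy = sum(x * x for x in reference)
    gain = sum(x * y for x, y in zip(reference, measured)) / ref_energy
    residual = sum((y - gain * x) ** 2 for x, y in zip(reference, measured))
    return math.sqrt(residual / sum(y * y for y in measured))


# Hypothetical test: a sine input, and an "output" that is attenuated
# and has a small third-harmonic component added.
n = 1000
sine = [math.sin(2 * math.pi * 10 * i / n) for i in range(n)]
output = [0.5 * s + 0.05 * math.sin(2 * math.pi * 30 * i / n)
          for i, s in enumerate(sine)]

print(f"Relative error: {waveform_error(sine, output):.3f}")
```

Removing the gain first means a speaker is not penalised simply for being quieter or louder than the reference; only changes in waveform shape count.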

Problem: what about the microphone used to test the speaker?... What is its transfer function? What room do we test the speaker in? Etc.

We all know that measurement of a speaker is not so easy. I suggest we look at the work of Floyd Toole, et al., as guidance for further discussion and work.
I would like to see them play different varieties of music for a change.
Spinorama (CTA-2034), near-field driver response, early reflections, in-room response, beamwidth, horizontal and vertical directivity, waterfall, distortion at 96 dB SPL, fundamental and harmonic distortion, maybe some 360° vertical and horizontal polars.
most of those are the same thing with different names
Jesus, dude, go away. You have no clue; they each tell you something different. One shows FR on and off axis, port resonance and woofer resonance, others floor and ceiling bounce, etc. It’s mostly all from using a Klippel, but each gives you different info.
Test the limit of the power handling, and crossover parts, wiring.
push em to the limit, as MANY OF US DO!

Maybe after multiple failures they will put better parts in the crossover, so it can handle being pushed to the manufacturer's RMS and peak power recommendations.
The only meaningful test of a speaker is how it sounds (in my room with my equipment and listening to my music). Nothing else matters.
There’s a reason reviewers don’t do any of this. First, they review speakers in their homes, so the sound is room dependent to begin with. Second, each of us has our own subjective sense of what sounds best.
thorough battery of stereo imaging tests to check stability of stereo image at various angles outside of the classic equilateral triangle. aside from the maggie tympani III, the only other speaker i've heard which offers stable stereo imaging no matter where in the room you sit or stand, is the bose cinemate sr-1. also there should be tests of volume versus odd order harmonic distortion to ferret out speakers whose sound hardens at higher volumes. also bass extension versus power handling. some speakers can reproduce the 32' pipes at low volumes but turn up the dial just a bit and you get doubling or worse. 
Reviewer invites 5 of his audiophile friends over. They are blindfolded and listen to the speaker being reviewed, mixed in with 4 other speakers, rotated and sound matched, and they list what they like and dislike about each.
Removing as much bias as possible gives the best result.
Comparing blind tests to other speakers would tell me a lot about my likelihood of enjoying that speaker.
symphony orchestras playing dense large scale symphonic works familiar to most listeners. comment on transparency, compression, glare, smearing. speakers that do well here should be able to handle just about anything. a review without this is useless to me.
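For the rotation part of a blind comparison like this, a small Python sketch of one way to hide identities: shuffle the line-up and hand listeners only anonymous labels. The speaker names, the function name, and the seed are placeholders, and level matching would still have to be done separately.

```python
import random
import string


def blind_labels(speakers, seed=None):
    """Shuffle the speakers and map anonymous labels (A, B, C, ...) to them."""
    order = list(speakers)
    random.Random(seed).shuffle(order)
    return dict(zip(string.ascii_uppercase, order))


# Hypothetical line-up: the speaker under review plus four comparison models.
lineup = ["Review speaker", "Comparison 1", "Comparison 2",
          "Comparison 3", "Comparison 4"]

key = blind_labels(lineup, seed=2024)  # the key stays with whoever runs the test
print(sorted(key))                     # listeners only ever see these labels
```

Listeners write their likes and dislikes against the labels, and the key is only revealed once all the notes are in.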
The acid test: listen.
+3  for John Atkinson.  I've read him since he started more than 40 years ago.
In this time he has tested some 1,000 speakers and has built an unequalled authority.
Well I wish all but a few would go away. But I am sick of reviewers stating that one speaker is suited for a particular type of music and not all types. Measurements are all but useless when bad measurements exhibited by a favored manufacturer are explained as inconsequential. 

@epz - really like this suggestion!  Group reviews with each listener describing their impressions would be very refreshing!
To all: Is there a published guide to understanding John Atkinson's standard measurements for the less experienced of us?
@erik_squires I like your suggestion -- shouldn't be too complicated.

I think all speaker reviewers should attest that their hearing aids are all equally calibrated.
@djones - thanks for the link!  strange that stereophile doesn't publish something similar for their measurements...
I would like a room response graph with a distortion spectrum at a realistic sound pressure level. It would really give a lot more information about how the speakers behave in a real room, at a real distance, at a realistic music volume level.
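To show how a distortion spectrum turns into a single figure, here is a short Python sketch that computes total harmonic distortion from the level of the fundamental and its harmonics. The dB SPL readings are made up for the example.

```python
import math


def thd_percent(fundamental_db, harmonic_dbs):
    """Total harmonic distortion in percent, from levels in dB SPL."""
    # Each harmonic's power relative to the fundamental, summed.
    power_ratio = sum(10 ** ((h - fundamental_db) / 10) for h in harmonic_dbs)
    return 100 * math.sqrt(power_ratio)


# Hypothetical readings: 90 dB SPL fundamental, 2nd harmonic at 50 dB,
# 3rd harmonic at 44 dB.
reading = thd_percent(90.0, [50.0, 44.0])
print(f"THD: {reading:.2f} %")
```

A harmonic 40 dB below the fundamental works out to 1 % on its own, which is a handy reference point when reading these graphs.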
I agree with Bluemoodriver: they have to be tested with music you like. I’ve heard speakers sound bad with certain genres. Not sure why. Amps make a huge difference too. I’ve never heard a YouTube reviewer say a speaker isn’t worth buying. Once I heard the guy say something close. I’ve had some beat gear and it wasn’t cheap. 
  • On axis frequency response (common)
  • Phase response (common)
  • Off axis up and down on many points and out to 90 degrees (helps for room response determination) - preferably without smoothing so that anomalies are more evident
  • Accelerometer on several cabinet points with frequency sweep
  • Distortion versus frequency at peak up to the woofer cross-over frequency, then at -10, -20, and -30 dB across the whole frequency spectrum.
  • Run speaker at 20% rated power with pink noise and then quickly re-run the -10 dB sweep above.
  • Distortion with a tone at the bottom and top of the frequency range of each driver, and then stepped to 1/4 and 3/4 of the frequency range.
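A small Python sketch of how the spot frequencies and drive levels in the list above could be worked out. It assumes "1/4 and 3/4 of the frequency range" means points along a logarithmic frequency axis by default (a linear option is included, since the list does not say which), and the 100 Hz to 1600 Hz driver band is a made-up example.

```python
def spot_frequencies(low_hz, high_hz, log_spacing=True):
    """Bottom, 1/4, 3/4 and top of a driver's band, in Hz."""
    fractions = (0.0, 0.25, 0.75, 1.0)
    if log_spacing:
        # Assumption: positions are taken along a logarithmic frequency axis.
        return [low_hz * (high_hz / low_hz) ** f for f in fractions]
    return [low_hz + f * (high_hz - low_hz) for f in fractions]


def level_to_voltage_ratio(level_db):
    """Convert a relative level in dB to a drive-voltage ratio."""
    return 10 ** (level_db / 20)


# Hypothetical midrange driver covering 100 Hz to 1600 Hz.
print([round(f) for f in spot_frequencies(100.0, 1600.0)])
for level in (-10, -20, -30):
    print(f"{level} dB -> {level_to_voltage_ratio(level):.3f} x peak voltage")
```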
I appreciate the in-room response tests with a couple of competitors' measurements thrown in. It helps give meaning to listening impressions.