@cleeds Fair enough, I overstated the conclusion. But the study did conclude that the perceived differences were potentially the result of psychoacoustic or psychological effects:
“The lack of perceived differences due to burning in is further supported by the fact that changes were reported in identical test sequences (A–A and B–B), while no differences were observed in cross-sequences (A–B and B–A). This suggests that the observed variations may stem from a subconscious desire to detect differences or from the emotional state of the expert group”.

