Why Do So Many Audiophiles Reject Blind Testing Of Audio Components?


Because it was scientifically proven to be useless more than 60 years ago.

A speech scientist by the name of Irwin Pollack conducted an experiment in the early 1950s. In a blind ABX listening test, he asked people to distinguish minimal pairs of consonants (like “r” and “l”, or “t” and “p”).

He found out that listeners had no problem telling these consonants apart when they were played back immediately one after the other. But as he increased the pause between the playbacks, the listener’s ability to distinguish between them diminished. Once the time separating the sounds exceeded 10-15 milliseconds (approximately 1/100th of a second), people had a really hard time telling obviously different sounds apart. Their answers became statistically no better than a random guess.

If you are interested in the science of these things, here’s a nice summary:

Categorical and noncategorical modes of speech perception along the voicing continuum

Since then, the experiment was repeated many times (last major update in 2000, Reliability of a dichotic consonant-vowel pairs task using an ABX procedure.)

So reliably recognizing the difference between similar sounds in an ABX environment is impossible. With a 15ms playback gap, the listener’s guess becomes no better than random. This happens because humans don't have any meaningful waveform memory. We cannot exactly recall the sound itself, and rely on various mental models for comparison. It takes time and effort to develop these models, thus making us really bad at playing the "spot the sonic difference right here and now" game.

Also, please note that the experimenters were using the sounds of speech. Human ears have significantly better resolution and discrimination in the speech spectrum. If a comparison method is not working well with speech, it would not work at all with music.

So the “double blind testing” crowd is worshiping an ABX protocol that was scientifically proven more than 60 years ago to be completely unsuitable for telling similar sounds apart. And they insist all the other methods are “unscientific.”

The irony seems to be lost on them.

Why do so many audiophiles reject blind testing of audio components? - Quora
artemus_5
Harman Int'l uses blind testing quite frequently to develop cost effective products that the market will consume.

Audiophiles reject blind testing out of fear. Fear of what? It's pretty obvious. The Oz syndrome.
Did you even read what you posted? Here, let me help!

In addition, the discussion emphasizes the usefulness of the ABX approach for testing clinical populations.

The results are interpreted as providing evidence for separate auditory and phonetic levels of discrimination in speech perception.

The obtained one- and two-step functions for both ABX and 4IAX tests are consistently better than the predicted discrimination functions, although the form of the obtained and predicted functions do match each other reasonably well.


The testing had absolutely nothing to do with blind testing by the way. ABX is just one of many test procedures used. Preference testing, pair testing, triads, etc.


Guess what, our brain can only detect timing differences out to 0.5 milliseconds. Does that mean that we can’t discriminate audio signals longer than 0.5 milliseconds? If you don’t understand what you are reading then it is best not to comment with authority. I don’t ask my mechanics to interpret my x-rays for a reason!


Here let me illustrate how flawed your logic is.  Audiophiles regularly claim that they can instantly tell the difference from one cable to another because the soundstage got wider, instruments better defined, etc.  Most of that is embedded in first arrival information, stuff on the order of milliseconds. By the logic you attempted above, you should not even have been able to remember a difference!  But you did. Why? Because we don't remember waveforms, we remember the impacts of them, but the accuracy of those memories decay too. So if I play something now, and play it again 30 seconds later, and something in the image shifts 5 degrees, you will notice it.  But if I played one now, and another in a week, you would not be able to accurately identify a shift and the result would be random.


p.s. The test in the literature is a discrimination test, like positional accuracy tests. It tests a very specific processing feature of our auditory system. The funny thing is, tests like this within the domain of audio reproduction don't even need ABX testing. I simply have to test with 1 cable, look at my results, then repeat the test with a different cable and look at my results. If they are the same, the cable made no difference.  Does not matter how long our audio memory is.  Again, don't take your medical x-rays to your mechanic.

edgewound
67 posts
04-29-2021 12:21pm
Harman Int'l uses blind testing quite frequently to develop cost effective products that the market will consume.

Audiophiles reject blind testing out of fear. Fear of what? It's pretty obvious. The Oz syndrome.




Fear and ignorance.

steakster
1,141 posts
04-29-2021 12:48pm
There aren’t any equations for touch, smell, feel, hear or taste.


That would explain why the food industry places so much emphasis on tests equivalent to ABX testing if not much more rigorous. They have whole societies and technical disciplines in place for the science of testing, and they use blind tests almost exclusively for taste. Pepsi Challenge anyone ...

The notion that blind testing for audio is an absolute test is absurd, and on so many levels. There is abundant literature (although not enough) on the frailty and limitations of blind testing in all matters of research. (That doesn’t mean that blind testing doesn’t have its place in audio, but it’s useless for most audiophiles. It’s tedious. Time consuming. Boring. And still prone to errors.)

One of the best examinations of blind testing is: "Intentional Ignorance: A History of Blind Assessment and Placebo Controls in Medicine" by Ted J. Kaptchuk, published by Johns Hopkins University Press. In recounting the history he explores some of the nuances of scientific testing in general. This is a scholarly, peer-reviewed article, so there’s not much point debating it here. But he concludes with this:

"The adoption of blind assessment in medicine has had as much to do with shifting political, moral, and rhetorical agendas and technical research design issues as with scientific standards of evidence ... blind assessment has also been a vehicle to confer social authority and moral legitimacy ..."


He writes that blind testing has a "concealed history" and that part of its "shadowy past is the intense fervor and absolute authority with which modern biomedicine advocates it ... the justification is ’self-authenticating.’ Concealed history augments the appearance of an obvious transcendent truth. Questions are discouraged. It becomes less something molded by interests, and more an unquestioned resource upon which any interest must draw, if it ever hopes for an accolade of objectivity."

The eternal chorus of those who demand that users here submit to blind testing are merely exercising their religious beliefs. If they were truly interested in science, they’d be discussing blind testing in scientific forums, where content such as I cited here is germane.

@cleeds 

The notion that blind testing for audio is an absolute test is absurd, and on so many levels. There is abundant literature (although not enough) on the frailty and limitations of blind testing in all matters of research. (That doesn’t mean that blind testing doesn’t have its place in audio, but it’s useless for most audiophiles.

This is the POV of Paul McGowan of PS Audio. Yes, he uses blind tests in design, but not in listening to music.

Blind audio testing – PS Audio

@edgewound
 
Harman Int'l uses blind testing quite frequently...
Yes they do. So does Paul McGowan (see above link). But it is meant for the manufacturer's purposes. Harman also trains their testers HOW to listen. And HERE is where most people fail. They either do not have a system which is sensitive enough to make any difference or they don't know what they are listening for. Plus other things which Paul mentions in the video (above).
"...and they use blind tests almost exclusively for taste. Pepsi Challenge anyone ..."

Blind testing is how we got that total failure of "The New Coke".

EDIT: And the Bose 901.
To answer your question on the title of the OP: because "blind testing" is not a thing. It's a catch phrase the snake oil screechers throw in your face every time you say something sounds better than something else. Anything.


how much do you want to know about it?

here is a Stereophile article about the highs and lows of Blind Testing.

https://www.stereophile.com/features/141/index.html

personally i have zero interest in blind testing as tool for system building.

but i have plenty of experience with doing it. and it’s very flawed as a process.

for 20 years i have been a judge in a speaker building contest every other year with our local audio club. there are 3-4 judges and a curtain is set up and the speakers are set up behind that curtain. we have a sheet where we keep score and run through some cuts.

for this event it’s maybe the only process. i can tell you that listening to 8-15 sets of speakers will give you a headache. you are in a forced hearing situation so you are not allowing the music to come to you. how you feel about the music has to be ignored.

i would never choose that for my own decision making. i want to be relaxed and allow my mind to settle without any stress and get to my zen state, then i start to pay attention to what i'm feeling about what i'm hearing. if there is an unknown in the chain that fact takes away my complete concentration and ease.
My only complaint about ABX is that, if the source material does not change, ear fatigue sets in VERY, very quickly.  

I volunteered for an ABX speaker wire test at Klipsch HQ back in '06.  The first five rounds, I was perfect.  5 for 5 identifying the more expensive wire versus the lamp cord.  

My accuracy, as the test continued, began to deteriorate, as my ears desensitized to the source material and it all began to blur together, hearing the same small segment of the same musical passage over and over again.  I finished the test 13/20.  So I barely did better than a coin flip on the last 15.  

Rotating the source material, and also ensuring that the source material is familiar to the listener, can seriously mitigate ear fatigue, making the outcomes more reliable. 

One of the things not discussed in any of these ABX papers is the subjects.  The average person does not care about music nearly as much as, for example, the folks on this forum.  Hell, the average person thinks Bose systems sound great.  If these are your test subjects, of course ABX isn't going to be a useful test on them, when the differences they are looking for are extremely subtle. 
I have no issues with how others choose their components...I would assume they don't care how I choose mine...
My systems have the right loudspeakers, placed perfectly into the room, run by the right amp, with the right source and material. But it takes time to get this right. Change anything and you might be back to square one. Sometimes a few weeks of trying different things is required to get it right. Long term evaluation is the only accepted way to evaluate audio gear. The snapshot of ABX testing is not reliable as most ABX testing results show.  
The only changes in my system that I've been able to detect "instantly" are changes in volume (of at least 1/2 db) and fairly significant changes in tone.

But my system is good enough at this point that it's pretty rare I introduce something new that has this kind of effect. Most changes are more subtle and affect the emotional connection I get with the music as much (or more) than easily identified "audiophile terms".

However, once I've listened to a new component/cable/acoustic treatment/speaker position for a while, I can start to identify aspects of the sound that are different. Once I know what to listen for, it's usually not hard to hear the differences when I switch back. 

But even in cases where the differences are not easily identifiable, if I'm enjoying the music more but don't understand why, that's really all that counts. And the enjoyment part is not always the case - there are times when I'll make a change that I think should be an improvement, but after a while, I find myself wanting to turn the music off even if I can't identify what's wrong. 

These are the reasons I will never make a decision to change something in my system based on an ABX test (unless of course I could switch back and forth over the course of days, but this has never been practical). 
Long term evaluation is the only accepted way to evaluate audio gear. The snapshot of ABX testing is not reliable as most ABX testing results show.
 


Two falsehoods in two sentences. Care to try for 3?
It's really a depressing question. Why do so many people reject/fear science? 
jerkface
I volunteered for an ABX speaker wire test at Klipsch HQ ... My accuracy, as the test continued, began to deteriorate, as my ears desensitized to the source material and it all began to blur together ...
I’ve had similar experiences as an ABX subject. I still think blind testing has value, even though it’s not likely to be of much use to audiophiles.

Here’s another scholarly, objective evaluation that explores the frailty of blind testing in audio (referenced in the Stereophile article):

"The conventional .05 significance level used to analyze typical listening tests can produce a much larger risk of concluding that audible differences are inaudible than concluding that inaudible differences are audible ... resulting in strong systematic bias against those who believe differences are clearly audible between well designed components that are spectrally equated and not overdriven."

cleeds
3,773 posts
04-29-2021 1:59pm
The notion that blind testing for audio is an absolute test is absurd, and on so many levels. There is abundant literature (although not enough) on the frailty and limitations of blind testing in all matters of research. (That doesn’t mean that blind testing doesn’t have its place in audio, but it’s useless for most audiophiles. It’s tedious. Time consuming. Boring. And still prone to errors.)



It's amazing that you could read this article, though I don't think you did, I think you are quoting others' excerpts, and reach this conclusion!


THE AUTHOR IS NOT ADVOCATING AGAINST BLIND TESTING!  Can I be any more clear? What he is advocating against is poor quality of testing, such that the results are taken as absolute, without any consideration of whether the test implementation truly met the goals, and the opaqueness that often surrounds these tests!
i’ve challenged blind testing advocates to show me a system that equals or exceeds the performance of my system using only blind testing as a system building method.

all i heard was crickets. zero response. blind testers don’t assemble systems using blind testing. they just have pre-conceived opinions. so why even pay attention to them? i don’t.
Gee @cleeds , nice selective posting there. You know there are AES members and people with access to research literature here ...

This is a convention paper, not a journal paper, which means it does not go through the normal peer review of a formal journal paper.

https://secure.aes.org/forum/pubs/conventions/?elib=11480


The conventional .05 significance level used to analyze typical listening tests can produce a much larger risk of concluding that audible differences are inaudible than concluding that inaudible differences are audible, resulting in strong systematic bias against those who believe differences are clearly audible between well designed components that are spectrally equated and not overdriven. This paper discusses ways to equalize error risks, introduces a quantitative measure of a listening test’s fairness, discusses implications for literature reviewers, and presents a statistical table enabling readers to conduct equal-error analyses without calculations.
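
To make the two error risks in that abstract concrete, here is a minimal Python sketch. The 16-trial test length and the 70% true hit rate are illustrative assumptions chosen for this example, not figures taken from this thread, and the function names are made up for the sketch.

```python
from math import comb

def tail(correct, trials, p):
    """P(X >= correct) for X ~ Binomial(trials, p)."""
    return sum(comb(trials, k) * p**k * (1 - p)**(trials - k)
               for k in range(correct, trials + 1))

def critical_score(trials, alpha=0.05):
    """Smallest score that a pure guesser reaches with probability <= alpha."""
    for score in range(trials + 1):
        if tail(score, trials, 0.5) <= alpha:
            return score
    return None  # test is too short for any score to reach significance

TRIALS = 16          # assumed test length
TRUE_HIT_RATE = 0.7  # assumed listener who answers correctly on 70% of trials

needed = critical_score(TRIALS)
# Type 1 risk: a guesser passes anyway (inaudible judged audible).
type_1 = tail(needed, TRIALS, 0.5)
# Type 2 risk: the assumed listener fails anyway (audible judged inaudible).
type_2 = 1 - tail(needed, TRIALS, TRUE_HIT_RATE)

print(f"score needed: {needed}/{TRIALS}")
print(f"type 1 risk: {type_1:.3f}")
print(f"type 2 risk: {type_2:.3f}")
```

Under these assumptions the second risk comes out far larger than the first, which is the asymmetry the abstract describes. Longer tests or a larger true hit rate shrink that gap, so the numbers say something about short tests, not about blind testing as such.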


mikelavigne
1,658 posts
04-29-2021 3:19pm
i've challenged blind testing advocates to show me a system that equals or exceeds the performance of my system using only blind testing as a system building method.



That does not even make sense.
The notion that blind testing for audio is an absolute test is absurd, and on so many levels. There is abundant literature (although not enough) on the frailty and limitations of blind testing in all matters of research. (That doesn’t mean that blind testing doesn’t have its place in audio, but it’s useless for most audiophiles.


No, there is not abundant literature that says blind testing is bad. You will have a hard time finding any.  There is literature that deals with bad testing that is blind, but not the basic concept of blind testing.  Every example given in this thread claims to show blind testing is bad, but not one of them actually does. 

djones51
3,869 posts
04-29-2021 3:12pm
It's really a depressing question. Why do so many people reject/fear science?

To quote Disney, "because when everyone is super, no one is super".    Bonus points if you can identify the reference without Google.


I volunteered for an ABX speaker wire test at Klipsch HQ back in '06. The first five rounds, I was perfect. 5 for 5 identifying the more expensive wire versus the lamp cord.  

My accuracy, as the test continued, began to deteriorate, as my ears desensitized to the source material and it all began to blur together, hearing the same small segment of the same musical passage over and over again. I finished the test 13/20. So I barely did better than a coin flip on the last 15.
 

13/20 across a range of test subjects would be statistically significant, but this points to bad test design, and not any error in blind testing. The result actually had nothing to do with blind testing at all, but an ABX test where listener fatigue set in. Any good analysis of results would also look at grouping to determine if there was a listener fatigue element. This goes back to the opacity of testing, all results and methods should be published.
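
As a rough check on the numbers in this exchange, here is a minimal Python sketch of a one-sided binomial test against coin-flip guessing. The pooled case of ten listeners each scoring 13/20 is a hypothetical added for illustration; the thread reports only one listener.

```python
from math import comb

def p_value_at_least(correct, trials, chance=0.5):
    """Probability of scoring `correct` or more out of `trials` by guessing."""
    return sum(comb(trials, k) * chance**k * (1 - chance)**(trials - k)
               for k in range(correct, trials + 1))

# Scores reported in the quoted post.
whole_test = p_value_at_least(13, 20)  # 13/20 overall
first_five = p_value_at_least(5, 5)    # 5/5 in the opening rounds
last_15 = p_value_at_least(8, 15)      # the remaining 8/15

# Hypothetical: ten listeners who each score 13/20, pooled together.
pooled = p_value_at_least(130, 200)

print(f"13/20, one listener: p = {whole_test:.3f}")
print(f"5/5, first rounds:   p = {first_five:.3f}")
print(f"8/15, later rounds:  p = {last_15:.3f}")
print(f"130/200, pooled:     p = {pooled:.6f}")
```

Splitting the score into early and late rounds is the kind of grouping that would show a fatigue effect, though choosing the split after seeing the results weakens any conclusion drawn from it.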
@cleeds 

I’ve had similar experiences as an ABX subject. I still think blind testing has value, even though it’s not likely to be of much use to audiophiles.

Disagree.  The more discerning ears of the audiophile are far more useful in ABX tests.  I pointed out the criticisms I had for the way audio ABX tests are conducted.  It doesn't defeat the utility value of audio ABX tests, just points toward some changes in approach that would increase their utility value.

I'll grant you, there are some audiophiles out there who will never be convinced by the most perfectly conducted ABX testing.  And most aspects of an audiophile's system cannot be easily ABX'ed, at least not at home.  One can ABX a source, such as a CD player, fairly easily, as most preamps have multiple inputs and can easily be switched between them.  Interconnects are a bit more challenging, and the nearly impossible test is ABX-ing a power cable, because now we're into having multiple amplifiers and some sort of switching device between them in order to verify a difference in sound between two power cables. 

Which is, again, why I do my best not to piss on people who choose to spend their money on these sorts of upgrades.  Without a serious A/B, never mind A/B/X test, there's no way to prove them right or wrong.  I prefer to spend my money on things that will demonstrably improve my system.  Maybe once my room is as close to perfect as possible, I've swapped out the crossovers and the tweeters on my speakers, I've found the right cost/benefit balance on my speaker wire and interconnects, and am satisfied with the signal chain of DAC/preamp/amp I've installed, I'll consider playing around with last-mile stuff like that.  But probably not. 

I'm naive about this debate, but if someone is willing to help me, IF ABX testing is useful at all, in what circumstances is ABX most useful and when is it least useful?

I did a lot of A/B testing of speakers, DACs, etc. to choose equipment. Sometimes I heard a big difference that mattered and sometimes I caught myself inventing a difference so I could have something to tie-break two items. 
dletch2
No, there is not abundant literature that says blind testing is bad.
I don’t think anyone has claimed that, and it’s interesting that you equate "frailty and limitations" with "bad."

There is abundant literature that details the fallibility of blind testing, some of which has been linked in this thread. For the measurementalists here, blind testing is a religion; it is perfect and absolute. The results, oddly, are to be accepted on "blind faith." That was Kaptchuk’s point - which you’d understand if you actually read his paper.
"because when everyone is super, no one is super".   Bonus points if you can identify the reference without Google.
I needed Google. 
jerkface
The more discerning ears of the audiophile are far more useful in ABX tests.
Maybe. But this has not been shown in any of the legitimate, scientific blind listening tests with which I'm familiar. However, it has been consistently shown that trained listeners - those who were instructed in advance what to listen for - were more likely to be able to detect differences.
However, it has been consistently shown that trained listeners - those who were instructed in advance what to listen for - were more likely to be able to detect differences.

See, to me, audiophiles are people who have already trained themselves on how to listen to equipment and observe differences. 

That said, if you have studies out there where audiophiles were no more effective at detecting differences in audio equipment in blind listening tests than the average Joe on the street, I'd love to take a look at them.  Always game to be enlightened. 

Blind testing has its place. Just not in a "Personal Stereo System".
We as individuals are NOT trying to prove to others what we hear is good to them only US. Why try to please the masses because a VERY few want to prove a point.

The issue is not the fact that the testing is good or bad, it just doesn’t have the final say or in my case ANY say in my system building.. No reason for me to use the method.. I like what I like, no need to prove it to anybody. Recommend maybe, BUT I don’t have to prove something is better to anyone.

If I was trying to prove something WAS actually different, trickery by setting terms for testing really isn’t accurate. We are humans and we can be TRICKED. Our hearing is as much a part of seeing when in concert together. NOW add feeling and smell along with a little taste of popcorn, holy moly you got a full blown skewed test because someone POPPED popcorn and YOU smelled it..

That same analogy is why a trained ear for some things is ABSOLUTE. There is no other way to gather data into YOUR brain.. You have to listen with your ears, and feel through your bottom, chest, face and hands. ALL are different collection points.. ALL adding to your hearing, perception.

You really can’t measure it, but you can REMEMBER it.. Remember the phone call from a 50 year old friend, you haven’t heard from in 45 years.

You know exactly who it is as soon as they speak.. We have the ability to hear the difference, the question is does every one have that ability?

7 billion people on the planet, one person calls you from 45 years ago. You know exactly who it is...there’s a blind test for ya. That actually has a purpose.. Go figure.. Hello good buddy long time no hear, BUT I can SEE you in my mind’s eye.. plain as day... Memory is memory.

Have you ever remembered the tune but forgot the words, so you add your own.. Thirty years later you’re still mumblin’ the wrong lyrics to the same TUNE.. Wrong memory works too, you have to be able to DISCERN the difference.. Tough for some.. Actually a lot.. You have to be able to remember when you’re WRONG.. For some it is just impossible.. No names ay!!!

Regards
@hilde45
I'm naive about this debate, but if someone is willing to help me, IF ABX testing is useful at all, in what circumstances is ABX most useful and when is it least useful?

Blind audio testing – PS Audio
How many Artemus?

Do you have a loose figure (I assume that you do;-).

DeKay
By virtue of being on these forums and making recommendations, YOU ARE, and you know OHM (my new short form for you), that many on here are very forward in trying to convince others purely on sighted testing.

<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>

Listen I didn't say some folks ears didn't have an agenda. I'm just saying my ears have to please ME.. And right after me you're first...  

I like to share affordable alternatives but VERY close to the same performance at sometimes literally 100 X less. 100.00 vs 10,000.00 usd, same cable .. no kidding.. Might be missing a wooden block or two..

Blind testing doesn't need to be used, for 100 vs 10,000.00 LOL who cares..  Some people that is half a week of Starbucks. take the plunge..
Why we all keep arguing with a sick same dude that keeps coming back here over and over and over and over under multiple usernames that keeps getting banned is beyond my understanding. Guilty as charged
Whatever side of the trenches you have positioned yourself in. Not my business 
I’ve had a few customers...musicians...that are actually blind, unsighted.

The interesting thing about such people...especially musicians...is they actually listen with their ears. I’ve made repairs and adjustments for them based on what they can hear, not what they can see. It’s rarely the most expensive or esoteric thing either. Simplicity works very well in this regard.
They reject it because it doesn't make any sense. It has as many psychological pitfalls as sighted comparisons.
When I followed Andrew Jones’...an actual physicist by training/education... career from TAD/Pioneer to ELAC, I last heard his masterpiece TAD Reference One before he went to ELAC...where he showcased the diminutive Navis ARB-51... I was expecting diminutive sound from such relatively small drivers. It ended up like Spud Webb against Shaq. Simply huge performance from such a small footprint. Was it the same visceral experience as the commanding presence of the $85,000/pair TADs to the $2,200/pair internally triamped ELAC ARB-51? No. Was it a difference worth the $81,000? Oh hell no. That would be a fun blind test demo to listen to, where you know nothing of what’s coming through the blind screen until the reveal, including pricing. Even Andrew was surprised at what he’s accomplished.