Reviews with all double blind testing?


In the July 2005 issue of Stereophile, John Atkinson discusses his debate with Arnold Krueger, who Atkinson suggests fundamentally wants only double-blind testing of all products in the name of science. Atkinson goes on to discuss his early advocacy of such methodology and his realization that its conclusion, that all amps sound the same, proved incorrect in the long run. Atkinson's double-blind test involved listening to three amps, so it apparently was not the typical same/different comparison advocated by proponents of blind testing.

I have been party to three blind tests and several "shootouts," which were not blind and thus left each component with its advocates, since everyone knew which was playing. None of these ever produced a consensus. Two of the three db tests were same/different comparisons; neither resulted in a conclusion that people could consistently hear a difference. The third was a comparison of about six preamps. Here there was a substantial consensus that the Bozak preamp surpassed more expensive preamps, with many designers of those preamps involved in the listening. In both kinds of test there were individuals at odds with the overall conclusion, and in no case were those involved a random sample. In no case were more than 25 people involved.

I have never heard of an instance where "same versus different" methodology concluded that there was a difference, but apparently comparisons of multiple amps, preamps, etc. can result in one being generally preferred. I suspect, however, that those advocating db mean only the "same versus different" methodology. Do the advocates of db really expect that the outcome will always be that people can hear no difference? If so, is it that conclusion which underlies their advocacy, rather than the supposedly scientific basis for db? Some advocates claim that if a db test found people capable of hearing a difference, they would no longer be critical, but is this sincere?

Atkinson puts it this way: the double-blind-test advocates would rather be right than happy, while their opponents would rather be happy than right.

Tests of statistical significance also get involved here. Some people can hear a difference, but if they are insufficient in number to achieve statistical significance, then proponents say we must accept the null hypothesis that there is no audible difference. This is all invalid, as the samples are never random samples and seldom, if ever, of substantial size. Since the tests properly apply only to random samples, and statistical significance is greatly enhanced by large samples, nothing in the typical db test works to yield the result that people can hear a difference. This suggests that the conclusion, and not the methodology or a commitment to "science," is the real purpose.
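The sample-size point can be made concrete. Under the usual normal approximation to the binomial (a sketch of my own, not anything from Atkinson's article), the hit rate a panel must achieve to clear the conventional .05 significance bar shrinks toward pure chance as the number of trials grows:

```python
from math import sqrt
from statistics import NormalDist

# Hit rate needed for one-sided p < .05 when guessing would give 50%,
# using the normal approximation to the binomial.
z = NormalDist().inv_cdf(0.95)  # one-sided .05 critical value, about 1.645
for n in (20, 100, 1000, 25000):
    threshold = 0.5 + z * sqrt(0.25 / n)  # required proportion correct
    print(n, round(threshold, 3))
```

With 20 trials a listener must be right about 68% of the time to register; with 25,000 trials, roughly 50.5% suffices. So the choice of sample size largely decides in advance what kind of difference can ever be declared "significant."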

Without db testing, the advocates suggest, those who hear a difference are deluding themselves: the placebo effect. But were we to use db with something other than the same/different technique, and people consistently chose the same component, would we not conclude that they are not delusional? This would test another hypothesis: that some can hear better than others.

I am probably like most subjectivists, as I really do not care what the outcomes of db testing might be. I buy components that I can afford and that satisfy my ears as realistic. Certainly some products satisfy the ears of more people, and sometimes these are not the positively reviewed or heavily advertised products. Again it strikes me, at least, that this should not happen in the world that the objectivists see. They see the world as full of greedy charlatans who use advertising to sell expensive items which are no better than much cheaper ones.

Since my occupation is as a professor and scientist, some among the advocates of double blind might question my commitment to science. My experience with same/different double-blind experiments suggests to me a flawed methodology. A double-blind multiple-component design, especially with a hypothesis that some people are better able to hear a difference, would be more pleasing to me, but even here, I do not think anyone would buy on the basis of such experiments.

To use Atkinson’s phrase, I am generally happy and don’t care if the objectivists think I am right. I suspect they have to have all of us say they are right before they can be happy. Well tough luck, guys. I cannot imagine anything more boring than consistent findings of no difference among wires and components, when I know that to be untrue. Oh, and I have ordered additional Intelligent Chips. My, I am a delusional fool!
tbg

Showing 39 responses by tbg

Perhaps because the neuroscientists/psychoacousticians don't intend their testing to deal with what most accurately replicates music, as the experimental context necessitates tight and brief controls.
Mankind, believing the Bible, ignored the massive bones that kept being discovered. Jefferson charged Lewis and Clark to find out whether such large creatures still lived along the Missouri River. Yes, we are all victims of our underlying theories. Once Darwin explained evolution, we retheorized where such bones might have come from.

What does this have to do with DBTesting? Nothing.
Rja, I fully suspect you are right. We would have to run experiments to find out.
I am somewhat unhappy that I spoke of J.A. in my post, as he brings along a lot of baggage. Many of you who have posted above seem sincerely to believe that better-conceived db tests would yield recommendations of some components or cables. My reading of what I have seen posted is that many of those advocating db testing expect a conclusion that there are no differences, and thus that one should buy the cheapest. This seems to have been J.A.'s experience in the 3-amp comparison, but in my limited experience such comparisons under db do yield a recommendation, as in the Bozak instance.

Fundamentally, I have no confidence in same/different db comparisons with too small a sample and too much dependence on statistical significance tests. A conclusion that all amps sound the same, or that all cables sound the same, is just too at odds with my experience to be acceptable. Perhaps when you randomly assign some subjects to the drug and others to the placebo, double-blind testing makes research-design sense. But I do not concede that db testing is the fundamental essence of the scientific method. Experimentally, a control-group design makes sense, but double-blind testing is seldom necessary. Often it takes great originality to cope with subjects knowing they are being experimented on. The Hawthorne studies at Western Electric are the best example of this.

I also really wonder how A, B, and C comparisons of amps, etc. using double blind would be done and reported. How would the random sample be drawn, and where would the subjects assemble? And would we need to assess the relationship between more qualified listeners and others?

There are some reviewers whose opinions I am responsive to, as they have previously said things consistent with what I hear. With double-blind testing there would be no reviewers, I presume.
Pabelson, you added greatly to my historical understanding of double-blind testing. Can you please give citations for the instances where same/different tests yielded differences? I think something is fundamentally wrong with the research design unless there are such instances, including just single run-throughs of the signal.

I am quite uncomfortable with the idea that finding a single person who can hear differences 15 out of 20 times would be convincing. I do not know how you can set a level here. Why 15 out of 20?
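For what it's worth, the 15-of-20 criterion is presumably just the smallest score whose one-sided binomial tail probability under pure guessing falls below the conventional .05. A quick sketch of my own, using the exact binomial:

```python
from math import comb

def p_at_least(k, n, p=0.5):
    """One-sided binomial tail: chance of k or more correct answers
    in n same/different trials if the listener is purely guessing."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# 14/20 misses the conventional .05 cutoff; 15/20 just clears it.
print(round(p_at_least(14, 20), 4))  # 0.0577
print(round(p_at_least(15, 20), 4))  # 0.0207
```

So 15 is not arbitrary once the .05 convention is granted, though the choice of 20 trials, and of .05 itself, remains the experimenter's.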

All of the instances where I participated in same/different db testing were too quick, and there was too high a probability of the respondent simply guessing. I also felt that the testing bore little resemblance to the listening experience. By contrast, the A, B, C, etc. comparison using double blind was more analogous to the listening experience. As I said, because of this, I would be interested in such testing. Here I would again suggest a testable hypothesis: whether those with long experience working with music perform differently.

Advancing the field. Yes, that would be nice. I have seen quality components, IMHO, be ignored because of the cachet of name-brand manufacturers. I have little question that the field has advanced greatly during the 40 years that I have been involved, especially in digital. Someone has suggested that manufacturers use double-blind testing all the time, but in my experience they do not. There is also the voicing of components by such notable designers as Kondo, etc. I presently am overwhelmed by the Shindo Labs 301 turntable. All of this is without the aid of double-blind testing.

I have no doubt that some proponents of dbt are sincere, just as I am sure that the overwhelming number of instances in which a small sample is unable to hear a difference leads others to embrace db because it fits their preconceived judgments, especially if they cannot afford more expensive gear.

I also still say that reviews would be very curious with dbt. Would you start with 100 amps being compared and then each month add another? Would anyone buy such a magazine or use it for judging what they will buy? Would manufacturers concede that product D is indeed better and withdraw their amps?
Pabelson, I must admit that I had not known of the Stereo Review's db tests. Out of curiosity I will have to look them up. Are there others?

I teach statistics. Apart from making judgments about a population from a random sample, the concept of a confidence interval has no meaning. We never can reach the conclusion "...that the listener really heard a difference, and wasn't just guessing lucky." With a random sample of sufficient size, you can test at a significance level of .05, which might mean that your experimental group's mean response was right 15 out of 20 times. This is why I ask about this number in the absence of a random sample. 15 out of 20 may impress you, but it has no basis in statistics.

I also do not understand the notion that db testing is unneeded for, "components where differences are undeniable." Undeniable by whom?

I grow less convinced that db testing has any potential for shedding light on the evaluation of stereo equipment.

Tvad, were there good db testing procedures, I would think we would have to assess whether some listeners were better evaluators than others. As I said earlier, I still think review magazines would be boring and that most audiophiles would ignore the results, if any were positive.
As I have said at least 5 times, your statement that, "ABX test, generally regarded in scientific circles as the gold standard for determining audible differences" is not true. But neither of us will ever convince the other, so why don't we just drop it. I can accept your statement that those advocating it would not be numerous enough to justify a magazine so it is a moot point.
Pabelson, frankly I don't care enough about this question to expend the time necessary to do such work. I am more concerned with finding a great loudspeaker.

I just do not understand the expectation that all individuals are the same in these tests. It is not statistical significance, it is improbability that you are talking about.

How do you know when you wrongfully reject the null hypothesis?
Pabelson, why do you willfully ignore the truth? It is only your CLAIM that DBT gives the reality, when you say "DBT--because it usefully separates reality from illusion." Certainly you don't claim that DBTesting is isomorphic to reality. It is a well-structured experiment that differs greatly from what we normally hear and how we hear it. I would say that DBT is an illusion of reality, and that reality would be found in the amp that most listeners preferred, especially were personal ownership and manufacturer hidden.

I suspect that this discussion has gone as far as it can. You insist that double-blind same/different testing is valid, and I say it is not, because it is an invalid assessment of people's hearing differences and saying what they like. I am no more saying "I like what I like, so I reject the test" than you are saying "I know there are no differences among amps, etc., so anything that shows otherwise is not science as represented by DBT."
I started the thread because I am curious about those who doubt others' abilities to hear the benefits of some components and wires. As many proponents can point to few examples of DBT and nevertheless seem confident of the results, I assumed that they saw DBT as endorsing their personal beliefs. Furthermore, my personal experiences with DBT same/different setups have been that I too could not be confident that my responses were anything other than random. But my experiences with single-blind tests, in which several components were compared, have been more favorable, with a substantial consensus on a surprising best component.

Speakers have always been a problem for me. Some are better in some regards and others in other areas. I suspect that, within the limits of what we can afford, all of us pick our poison.

I did read your referenced article and found it very interesting and troublesome, as I use a Murata super tweeter, which only comes in at 15 kHz and extends to 100 kHz. I am 66 and have only limited hearing above 15 kHz, yet in a demonstration I heard the benefits of the super tweeter, even though there was little sound and no music coming from the super tweeter when the main speakers were turned off. Everyone else in the demonstration heard the difference also. I know the common response by advocates of DBT is that we were influenced by knowing when they were on.

I must admit that I am confident of what I heard and troubled by my not hearing a difference in a DBT. Were this my area of research rather than my hobby, I would no doubt focus on the task at hand for subjects in DBTs as well as the testing apparatus. My confidence is still in human ears, and I suspect that this is where we differ. I guess it is a question of the validity of the test.

For a sincere DBTer, such as yourself, I am not being truculent. For those embracing DBT as simple self-endorsement, I am dismissive.
Gregm, I do not know how many out there experienced the Murata demonstration at CES 2004, but it was a great deal like what you describe. Initially, the speakers played a passage. Then the super tweeters were used and the passage replayed. The ten people in the audience all expressed a preference for the use of the super tweeters. There was much conversation but ultimately someone asked to hear the super tweeter only. The demonstrator said, we already were hearing it.

When we all refocused on the sound, all that we could hear was an occasional spit, tiz, snap. There was no music at all. The Muratas come in at 15k Hz. I left and dragged several friends back for a second demonstration with exactly the same results.

Would there be any benefit to having this done single or double blind? I don't think so. Do we need an understanding of how we hear such high-frequency information, without which it might be a placebo or Hawthorne effect? I don't.

But this experience is quite at odds with the article that Pabelson cited. What is going on? I certainly don't know, save to suggest that there is a difference in what is being asked of subjects in the two tests.
Rouvin, bingo! Validity is the missing concern with DBTs. I also entirely subscribe to your question about where DBTing fits into the reviews that audiophiles want. As I have said, I cannot imagine a DBT audio magazine.

I am troubled by your comments that some DBTing has given positive results. Can you please cite these examples?
Where is your evidence? Perhaps by your definition which is not widely shared.
Pabelson and wattsboss, I agree with both of you as my first posting would suggest. I am getting on with my search for a better speaker than the twenty or so that I have tried thus far, and I cannot imagine how DBTesting would help me at all in this quest.

In science we are interested in testing hypotheses to move along human understanding. In engineering we are seeking to apply what is known, limited though it may be. Audio is an engineering problem, and there is no one right way to come up with the best speaker. When validly applied, experiments using blinds are useful for excluding alternative hypotheses. This is not a science, however.

Also, while I read reviews, it is usually those of reviewers whose opinions I have learned to value because my replications of their work have reached the same conclusions. I fully realize that their testing is sharply restricted by the limited time and setups they have. If my testing yields results I like, whether or not I am delusional, I buy and am happy. I suspect that others would share my conclusions, but it is not a big deal if they do not.
leme, I am not at all interested in DBTesting, as I know from personal experience that there are substantial differences between both cables and amps. This is why I would have to say there is real conceptual invalidity to DBTesting. Furthermore, I really don't care what the results would be, but I suspect that a disproportionate percentage of the time DBTests accept the null hypothesis.

Pabelson, I did not mean to say that I put much stake in what a reviewer may say even were I to have agreed with him in the past.

Bigjoe, certainly you can dismiss DBT if you find it invalid. Science has to be persuasive, not merely orthodox. And as I keep saying, this is not a hypothesis-testing circumstance; it is a personal-preference situation. Science is supposed to be value free, with personal biases not influencing findings, but taste is free of such limitations or the need to defend them.
Qualia, you state, "So, if two amps cannot be distinguished unless you're looking at the faceplates, why buy the more expensive one? Now who finds fault with that reasoning?" My point is that a DBTesting finding of "no difference" is not necessarily no difference. It is not a valid methodology, as it is at odds with what people hear even when they cannot see the faceplates. Furthermore, I can hear a difference, and my tastes are all that matters. This is not a matter of scientific demonstration.
One thing about being over 60 is that the style of thought in society has changed but yours has not. When I was a low-paid assistant professor and wanted ARC equipment for my audio system, I just had to tell myself that I could not afford it, not that it was just hype and fancy faceplates or bells and whistles, and that everyone knows there is no difference among amps, preamps, etc. DBT plays a role here. Since it finds people can hear no differences and carries the label of "science," it confirms the no-difference hopes of those unable to afford what they want. My generation's attitudes now result in criticizing other people's buying decisions as "delusional."

I certainly have bought expensive equipment whose sound I hated (Krell) and sold immediately and others (Cello) that I really liked. I have also bought inexpensive equipment that despite the "good buy" conclusion in reviews proved nothing special in my opinion (Radio Shack personal cd player). There is a very low correlation between cost and performance, but there are few inexpensive components that stand out (47 Labs) as good buys. This is not to deny that there are marginal returns for the money you spend, but the logic of being conscious of getting your money's worth really leads only to the cheapest electronics probably from Radio Shack as each additional dollar spent above these costs gives you only limited improvement.

DBTesting, in my opinion, is not the meaning of science; it is a method that can be used in testing hypotheses. In drug testing, since the intervention entails giving a drug, the control group would notice that they are getting nothing and thus could not benefit. Thus we have the phony pill, the placebo. The science is the controlled, randomly assigned pretest/posttest control design and the hypothesis, based on earlier research and observations of data, that it is designed to answer with the testing.

If we set aside the question of whether audio testing should be dealt with scientifically, probably most people would say that not knowing who made the equipment you hear would exclude your prior expectations about how quality manufacturers' equipment might sound. Simple A/B comparisons of two or even three amps with someone responsible for setting levels is not DBT. Listening sessions need to be long enough, and with a broad enough range of music, to allow a well-based judgment. In my experience, this does remove the inevitable bias of those who own one of the pieces and want to confirm the wisdom of their purchase, but more importantly it does result in one amp being fairly broadly confirmed as "best sounding." I would value participation in such comparisons, but I don't know whether I would value reading about them.

I cannot imagine a money making enterprise publishing such comparisons or a broad readership for them. I also cannot imagine manufacturers willingly participating in these. The model here is basically that of Consumers Reports, but with a much heavier taste component. Consumers Reports continues to survive and I subscribe, but it hardly is the basis of many buying decisions.

My bottom line is that DBT is not the definition of science; same/different comparisons are not the definition of DBT; any methodology that overwhelmingly produces a "no difference" finding, despite most people hearing a difference between amps, is clearly a flawed methodology that is not going to convince people; and finally, people do weigh information from tests and reviews in their buying decisions, but they also have their personal biases. No mumbo-jumbo about DBTesting is ever going to remove this bias.
Sorry, Pabelson, I don't think an appeal to the acceptance of a method used in perceptual psychology demonstrates no differences. When there is controversy over a finding, which demonstrably there is, something other than same/different DBTesting would be needed unless those of you persisting in advocating DBT wish to continue to be ignored. I am afraid your argument that DBT proves humans cannot hear the minor differences runs counter to most people's experiences. As I said before, buying decisions don't hinge on scientific proof, and it is an interesting question why some seem so committed to the belief that audio is all snake oil. Perhaps a psychologist should look into that phenomenon.
Pabelson, no, I have no interest in whether a reviewer can hear a difference between components using DBT in the usual same/different format. Since I don't think it is valid, I would rather continue my present procedure of finding reviewers whose reviews prove on target in my estimation. Frankly, I don't think there are enough DBT proponents out there to make a magazine using them viable.
Pabelson, you owe us an explanation of how two amps that replicate music differently can sound identical in the restrictions of DBTesting.
Shadorne, I understand your moderate position. Please take no offense when I simply say that I strongly suspect that DBT is an invalid test of what people hear. I am not really concerned either that some, myself included, cannot hear differences in the typical same-versus-different format so commonly used in DBT. People do hear differences when double-blind testing is just a "which do you prefer" among amps A, B, and C. I don't really have much trust in many reviewers and don't need their inability to hear differences in same/different tests to be convinced.

I absolutely concur that we need to be wary of exorbitantly high-priced equipment and of rave reviews and claims by salesmen and reviewers. But we should equally be ready to hear true quality in some more expensive equipment. Quality parts cost money, and research and design work has to be paid for. Often enough I have heard expensive gear that truly is excellent in my opinion and with which I remain thrilled. My Reimyo PAT777 amp and Shindo Labs turntable are but two examples. I also have a relatively inexpensive line stage, phono stage, and universal player that are at least the equals of much more expensive equipment, again in my opinion.

I once heard a $350,000 amp at CES. I listened with no intention of ever buying one. It was the best-sounding amp I ever heard. The Stereophile reviewer also loved it, but its measurements looked bad, and so they dismissed it. The objectivists ranted that Stereophile should not even have reviewed it. I ranted that they should have heeded their ears rather than their inadequate instruments. I still would not consider buying it, largely because I just cannot afford it.
Qualia, you say, "Why anyone, whether or not they think DBT is the *final* word, would ignore DBT as a way of determining where to spend their own money (speakers, room treatment first, then other stuff) is beyond me." It is totally beyond me why anyone would distrust what they hear enough to rely on DBT. If you wish, say that I just choose to dump cash even when there are no differences. Basically, I find DBT invalid and have to proceed otherwise, hoping that I can hear a side-by-side comparison of what I am interested in. On occasion I have been able to bring the desired components into my own home and do a comparison; sometimes I can rely on the ears of others I trust, one being a reviewer, one a distributor, one or two being manufacturers, but most just audiophiles; and sometimes I just take a flyer, as with the RealityCheck cdr burner. As I have repeatedly said, this is not a matter of rejecting science; it is a matter of rejecting a methodology, as it obviously lacks face or conceptual validity. Also, as with automobiles and wine, I do not base my buying decisions on double-blind tests.
Pabelson, data is the heart of science. To gather it one has to have operationalizations of the concepts in your hypothesis which involves methodology. Your distinction is not meaningful.

You are always justifying DBTs as often used in perceptual psychology. Such appeals are unscientific appeals to authority. There are many reasons to believe that as applied to audio gear, this methodology does not validly assess the hypothesis that some components sound better.

You, sir, also have no evidence that is intersubjectively transmissible. Furthermore, as I have said repeatedly, I would not care anyway. I buy what I like and need not prove anything to you or others wrapping themselves in the notion that they are the scientists and those who take exception to them are unscientific.
Citations please, Pabelson. I don't follow this literature any longer but your mere saying we know is not convincing.
Qualia, yes, there is some minor DBTesting in wine, but as in audio no one pays any attention to it. As in audio, tastes rather than DBT rule the buying decision. Please understand that I see nothing wrong with your making decisions based on this methodology, but I do resent those of your school calling others "anti-science" or fools.
What is laughable in the Psych. Dept. is not thereby authoritative. Data has to be presented to justify that DBT validly assesses sound differences among components. DBT lacks face validity, as most people can hear differences. You, sir, are the one guilty of scientific error, no matter how much you protest that others are pseudoscientific.

But more fundamentally, we are not engaged in science in picking wine, cars, clothing, houses, wives, or audio equipment, so Charlie is right. Put this to bed. Neither of us is convincing the other, nor ever will.
Rouvin, I substantially agree, of course. I agree moreover about the liabilities of publish or perish in academia and its effect on research, even though I am in a field with no commercial interests, other than public polling.

I do study public policy also, including the impact of creationism or intelligent design as it is now called. It is awkward to get good state data on science degrees issued before and after adoption of anti-evolution policies, but the worst states in terms of failing to teach evolution have not experienced a decline in science degrees. They never had many in Kansas, for example. It is much like abortion restrictions, the states that adopt such restrictions are those with few abortions and experience no decline thereafter. Where abortion is common, no politician would risk introducing a restriction or voting for one.

I too have been struck by why those advocating DBT seem to think that anyone need bother paying attention to results when buyers obviously hear a difference that causes them to buy. Anyone who trusts reviewers for anything more than suggestions of what you might want to give a listen is bound to be disappointed.
Pabelson, I do wish it would die, but you continue to misrepresent what science is and who best represents it. There is no evidence anywhere, including with your sacred DBTesting, that demonstrates that cables don't sound different. You are not the authority who can declare that science proved something decades ago. No finding is ever proven; rather, it is tentatively accepted until further data, or studies using different methodologies, suggest an alternative hypothesis. Robustness also is not of much use except to suggest that replications have often been done.

It is just the case that I will not concede that scientists or anyone has shown my better sounding cables are indistinguishable from zip-cord. Any fool's testing would indicate that is untrue, even if only in sighted comparisons.
For someone who claims to be knowledgeable about research methods, you seem woefully insensitive to the need for your measures to validly assess the theoretical concept they are supposed to measure. Instead you make very unscientific appeals to authority, which is perhaps the worst scientific infraction.

You have demonstrated that there is insufficient tangible data to dismiss criticisms of DBT as inapplicable to questions of what sounds best that could be shared among customers. Until the obvious disparity between what people hear and what DBT shows is resolved, no one is going to make buying decisions based on DBT. Perhaps you do, but I doubt it.

I am off to CES, so I will not be monitoring further useless appeals to authority.
As I have said before too many times, were DBTs used that were not same/different tasks, and were they to show no differences, many who view this as science would be inclined to accept that as a valid measure of sounding different and better. Same-or-different questions over brief periods do not give results that have face validity.

Again, this discussion should be laid to rest. Your evidence and appeals to "what scientists already knew" authority are not the way to make your conclusions broadly accepted. Again, were this a matter of what would cure cancer, etc., there probably would be the need to resolve what is an appropriate test, but it is not. As such it is not relevant to discussions on Audiogon or AudioAsylum.
Pabelson, perhaps we just have a language difference. I would certainly concede that a coin coming up heads 15 out of 20 tosses is improbable. This probability is at the root of statistical inference, which, of course, seeks to assess support for a hypothesis about the population from a sample. There is always the possibility that the sample is unrepresentative and that we might wrongly reject the null hypothesis when it is actually true.

I just think the proper hypothesis should be that a sample of people can hear a difference between cables or amps. The null hypothesis is that they cannot.
It would be very difficult with a sample of one to achieve statistical significance, so you are apt to accept the null hypothesis. However, a sample of 25,000 would assure you statistical significance.

I am only concerned that the choice of sample size may be determined by what the researcher's intended finding might be. I think it is a far more interesting hypothesis to suggest that those with "better ears" would do better. I don't think most audiophiles would be convinced, or should be convinced, that all amps or wires sound the same.
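The sample-size point can be put in numbers. A sketch (my own hypothetical figures, not from the thread) of how the same modest hit rate fares under an exact one-sided binomial test at two different trial counts:

```python
from math import comb

def binom_tail(n, k):
    """Exact P(X >= k) for X ~ Binomial(n, 1/2), the null of pure guessing."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

# Suppose a listener is genuinely right 55% of the time.
# 11 correct of 20 trials is nowhere near significant...
p_small = binom_tail(20, 11)      # roughly 0.41 -- the null survives
# ...but the same 55% rate over 2000 trials is overwhelming.
p_large = binom_tail(2000, 1100)  # far below 0.001 -- the null is rejected
print(round(p_small, 2), p_large < 0.001)
```

In other words, a researcher who runs 20 trials and one who runs 2000 can reach opposite conclusions about the very same listener, which is the worry raised above.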
Agaffer, I agree. I have participated in DBTs several times and have found hearing differences in such short-term comparisons to be difficult, even though after long-term listening to several of the units I clearly preferred one.

I think the real question is why do short-term comparisons with others yield "no difference" results while other circumstances yield "great difference" results. Advocates of DBT say, of course, that this reveals the placebo effect in the more open circumstances where people know what unit is being played. I think there are other hypotheses, however. Double blind tests over a long term with no one else present in private homes would exclude most alternative hypotheses.

The real issue, however, is whether any or many of us care what these results might be. If we like it, we buy it. If not, we don't. This is the bottom line. DBT assumes that we have to justify our purchases to others as in science; we do not have to do so.
You said, "As you said, what does it matter to you if scientists say your cables are indistinguishable from zipcord?" I would take that to mean that you meant this.

I merely would state that I and many others reject that DBT validly assesses sonic difference among cables, etc. Where is your demonstration of face validity or any demonstration of validity?

My faculty room was also amused that I had any confidence in experiments, which they view as neither isomorphic with nor generalizable to real life. They are always on my case for approving Psych. proposals that rely on the forced participation of students taking Psych. courses. They are enamoured with econometric modeling, usually assuming that humans are rational. I have never found that humans maximize much, other than perhaps taking the lazy way out, such as voting for the political party they adopted from their parents.
Pabelson, you say, "If you can distinguish two amps with flat frequency response and low distortion in a blind test, you will be the first." This means one of two things: either there are no differences among amps, or DB testing does not allow humans to judge the differences. To accept the former means that quality parts, innovative power supplies, careful construction, and generally good design contribute little or nothing, and that humans are hopelessly delusional.

As I have posted, I very much suspect the methodology is invalid. In the research that I do, I cannot imagine peers accepting a methodology that so often accepts the null hypothesis that nothing matters. Since my research so often suggests that states enacting seatbelt laws, a .08 blood alcohol standard for intoxication, higher per-capita education spending to compete with other states, or concealed-handgun laws all have no effect on the problems to which they are directed, I know the wrath directed at my methodology, which unfortunately cannot include an experiment where half of the states, randomly drawn, have a law or action and the other half do not. Here many want to accept that governmental actions matter. In audio many want to accept that amps don't matter. I think other methodologies should be used to assess both prior convictions.
Gregm, I think you are absolutely right that too many of us want others to bless our choices, be it for wine, women, or audio. I stopped going to audio society meetings in New York because too many of the conversations were "mine is bigger than yours" conversations. Having discussion groups on the internet is no different.

My objection to those advocating DBTesting is that they want to use a questionable methodology to say in effect "mine is every bit as good as yours and I paid less." Science does not condone their saying this and I don't really care whether it does or not.

Pabelson, you say, "DBT--because it usefully separates reality from illusion." My only real question is whether that "reality" is a false one, one that we don't hear when listening. This is why I suggest it is invalid and does not merit acceptance of the findings.
Okay, objectivists, one more try. I have participated in same/different DBTs and found that I could not hear differences. I have also participated in double blind tests that merely selected which preamp sounded best. In this case differences were obvious and most agreed on which preamp we preferred. I valued neither testing but the latter was more fun.

I am engaged in a social science and teach research methods at the graduate and undergraduate level, so I am not anti-science. But there is good science and bad. More importantly, there is the question of whether the concepts in the hypothesis are actually tested by the variables in the data. I am merely stating that I am unconvinced that questions such as whether amps differ in their sound are validly assessed by the short-term same/different methodology commonly associated with DBTs.

A methodology that fails to find differences among amps, wire, etc. that are heard by so many, even in double blind circumstances, is not convincing. It may soothe those who cannot afford more expensive equipment, who can dismiss those who buy it as just impressed with faceplates or bells and whistles or sold by hype, but it does not prove that they are delusional.

I don't mind people keying their behavior on the most common "no difference" findings of DBTs, but objectivists' feelings of superiority based on bad science are unjustified and likely to convince very few.

I have really failed, since my first posting, to convey why DBTesting has not caught hold and why so many of us could not care less that it hasn't. No amount of casting aspersions on subjectivists as unscientific will convince us, and obviously no amount of patience in presenting my perspective will convince you. So why don't we just drop the issue and get back to enjoying life?
Pabelson, I doubt any reviewer could "pass" the DBT. This is because of the methodology. Any substitution of methods other than same/different would likely result in rejection of the results by DBT proponents, as Gregadd says. Subjectivists would no doubt ignore reviewers "failing" the test. Nothing would be proven to anyone's satisfaction by the entire effort, so what is the point?

Somehow you seem to believe that reviewers are the arbiters of quality, leading customers around like sheep. As is often noted, the "best components" issues outsell other issues. I do not know whether this "proves" the influence of reviewers or magazines. Some may just be keeping count of where their equipment falls.