Amir and Blind Testing


Let me start by saying I like watching Amir from ASR, so please let’s not get harsh or the thread will be deleted. Many times, Amir has noted that when we insert a new component into our system, our brains go into (to paraphrase) “analytical mode” and we start hearing imaginary improvements. He has reiterated this many times, saying that when he switched to an expensive cable he heard improvements, but when he switched back to the cheap one he also heard improvements, because the brain switches from “music enjoyment mode” to “analytical mode.” Following this logic, which I agree with, wouldn’t blind testing, or any A/B testing, be compromised because our brains are always in analytical mode and therefore feeding us inaccurate data? It seems to me you need to relax for a few hours at least and listen to a variety of music before your brain can accurately assess whether something is an actual improvement. Perhaps A/B testing is a strawman, because the human brain is not a spectrum analyzer. We are too affected by our biases to come up with any valid data. Maybe.

chayro

Showing 25 responses by noske

@djones51 Mine flanks the TV as I don’t really have anywhere else to put the speakers.

I didn’t say, but same here, 2ch stereo HT. I’m comforted to know that it has no measurable effect. (On a 6 foot long table from a school chemistry classroom. It’s tall and the correct depth - I don’t like this kneeling/squatting business to push buttons, connect cables, etc. Blah)

@chayro I guess I was talking solid state amps. I don’t know if tube amps ever pretended to have the minuscule distortion measurements that the Japanese gear you mention was achieving.

I wonder if the very low distortion amps would stand up to scrutiny on Amir’s bench with all the extra parameters we now know are important to sound quality.

Following this logic, which I agree with, wouldn’t blind testing, or any A/B testing be compromised because our brains are always in analytical mode and therefore feeding us inaccurate data?

The data is the signal. The brain is what analyses that signal; it doesn’t do the feeding.

Some form of blindfolding is necessary by definition, and a near instantaneous switching mechanism needs to be used. In a perfect test environment you wouldn’t be aware of when it has been switched back and forth.

Together with dB level matching - that can be the tricky bit (that’s an understatement), and the nuances people have reported may well be because this step isn’t done correctly.
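To make the level-matching point concrete, here is a minimal sketch of how a match might be checked from measured voltages. The function names and the ~0.1 dB tolerance are my own illustrative choices (0.1 dB is a figure commonly cited for serious blind tests), not anything specified in this thread.

```python
import math

def db_difference(v_a: float, v_b: float) -> float:
    """Level difference in dB between two measured RMS voltages."""
    return 20 * math.log10(v_a / v_b)

def is_level_matched(v_a: float, v_b: float, tolerance_db: float = 0.1) -> bool:
    """True if the two sources are matched within the given dB tolerance."""
    return abs(db_difference(v_a, v_b)) <= tolerance_db

# Example: 2.000 V vs 2.010 V measured at the speaker terminals
print(round(db_difference(2.010, 2.000), 3))  # ~0.043 dB
print(is_level_matched(2.010, 2.000))         # True - close enough
```

The point of the sketch is how tight the tolerance really is: a half-volt mismatch on a 2 V signal is over 2 dB, easily audible as "louder sounds better".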

That’s a good start.

 

But if it were so simple, and the bias so strong, then why have I, and countless others, no doubt, been disappointed with components when the opposite would be expected?

If what was promoted turned out to be marketing and snake oil, and the earth didn’t move for you? That’s common, yes, if people are honest.

Alternatively, you may have preferences, and the reviewer/influencer or whoever it was that led you to have this preconceived expectation had other preferences. And again, that is common.

I’d like to think these observations are reasonably self-evident, but perhaps not.

 

Neither of those hypotheticals would explain why a cable might not meet expectations, as, according to many, they shouldn’t make any difference at all!

My two possibilities are clearly not the only reasons, I just threw in a couple of suggestions.

What do you suggest could be a reason that a cable may not meet expectations in the context of this thread?  Is this a common experience?

Incidentally, the level matching thing I wrote about earlier may not be such an issue with cables, although someone with specific experience may correct me.

I remember reading somewhere (and now I forget where) that A/B testing relies on our short term memory, which isn't the best method.

Relying on short term memory could prove problematic for certain individuals in some demographics.

Would you mind repeating your question?  I had to think about it.

 

I read somewhere Amir is a famous Egyptian movie star who grows prize winning watermelons in his spare time. It's a confirmed fact.

This anecdote has merit as a credible and verifiable snippet of gossip (for the erudite, technically known as hearsay).

Amir is the high priest 

He is also a French-Israeli singer and songwriter.  A man of many talents.

 

He also knew that a new piece of equipment might sound spectacular at the onset, only to become fatiguing after a few hours or even days, no matter how "good" the measured data were.

This fatigue aspect is an issue with audio. Not just with audio, but I digress.

This is where describing measurements as good, bad, or anything else is incorrect. It is data.

What can be read into the data matters.

The characteristics of amplification which contribute to fatigue may be measured and therefore predicted.

You mention THD amongst other things - yes, and some aspects are pleasing, and others are grating to the brain. (And some serve to mask certain issues in the recording process, but that’s another topic).

This is perhaps one reason why measurements are preferred over blind testing.

As for blind testing, however messy it is even at the best of times, I would lower the threshold to exclude the enjoyment or pleasing factor. "Does it sound different?" is a more realistic objective.

Measurements provided by Amir indicate to me which bits of gear I may or may not enjoy owning. Others have different preferences. The data in itself is neither good nor bad - it is information.

 

But, wait - how do you evaluate the results of each test? How do the tests correlate to the SQ characteristics most important to you?

There are some tutorials available (both written and video) to help understand what the tests mean and therefore how they may be evaluated.

Understanding that might also answer your second question - there are things to watch out for, and are displayed on colorful pictures. I like pictures.

I also know that my preferences do not align well with many who contribute to ASR.

This is no cause to be disrespectful of the combined knowledge of the contributors, many of whom are electrical engineers, scientists and PhDs, and have contributed in some manner to the design and building of audio and associated gear.

As with many things in life, it is valuable to learn the ability to pick and choose that which is useful to you.   

@reven6e And guess what: he could clearly hear a difference. His wife could hear a difference. 

Please supply a link to this video review.  I'd enjoy watching it, not least because I've not witnessed Amir bringing his wife in for a hearing experiment - was it an A/B blind test?

@reven6e Actually it was just a simple request, not an invitation to provide a summary on all the good things about ASR that audiophiles are uncomfortable with.

I'm also not familiar with the whole Matrix/spoon philosophy - do you have links at ASR for that I could have a look at?

So now I’m back thinking that going in paying attention to how you feel listening to the component rather than listening is likely a better way to test gear. 

And so we are back to square one.  This is exactly what audiophiles often do, and what blind testing is designed to eliminate.

people who refuse to use their senses and rely on tech measurements, exclusively

This is focusing on tools, and ignoring another variable - preferences.

An analogy. Two people like red cars - same preferences.

One will investigate the properties (measurements) of the paint and say it is a red that is OK. The other will use their eyes and say that it is a red that is OK.

Now - let’s change things a bit. The first person now likes silver cars, and the second still likes red cars.

Different preferences, *and* different methods of arriving at their preferred solution.

I would say with some certainty that the second person who likes red could also use properties (measurements).

Examples - the recent Carver amp. Nelson Pass’s Amp Camp Amp kit. Some other things that Amir places in the red corner of his rankings - useful information!

@bruce19 But much of the audiophile pursuit is just pushing boundaries a few percent or less at a time. That is where the really big money gets spent and ironically that is where data is almost never presented.

Yes - step outside the echo chamber that is so entrenched by decades of big money, deflection, dishonesty, manipulation and other unethical norms of behavior.   It won't hurt very much :-)

@redlenses03 His methods have been debunked numerous times, depending on where you sit in obj vs sub viewpoint.

Why should the legitimacy of methods used for the purpose that they are intended depend on where you sit?

To advocate such a view in aspects of western civilization would result in bias, unfairness and a breakdown in tolerance and the rule of law.

As one person said on the first page of that link -

Just... Whatever you do, please don’t let mr A back in. Natural justice? Giving the man a chance to respond? Bugger all that. There is no end to a conversation with A: let’s not have one.

 

@mijostyn The result of all this is an entire industry based on deception. As long as it is not my money why should I care? 

Two ideas there - of course the spending proclivities of people outside of your personal circle are none of anyone's business.

However, due to the absence of prudent market regulation, an industry built on deception is far from optimal.  Due reward and incentives are not forthcoming to those with novel and robust products and who by definition do not partake of the deception cup.  Innovation in these pursuits is relatively stifled.

That's why I care.

@daveinpa And I'm sure you can have crappy measurements and it still sounds great. But it's nice to see good measurements for something you've spent a lot on.

I wouldn't even go as far as saying that it's nice to see *good* measurements for something, because that word is in itself a judgement, as is *crappy*.

It's just nice to see measurements.   Only a few years ago all we had were specs, which as we now know could be of limited use at best (at worst, deceptive).

What companies making novel and robust audio equipment are being stifled?

Should they currently

- be a company

- making product

- with a known brand name

they are probably already in the public domain, displayed at a store nearby alongside other established mainstream brand names, and have managed to overcome the inherent obstacles associated with the audio industry.

I'm thinking rather of new and emerging technologies and ideas, and the word "relatively" was used deliberately.  It's a fluid concept, one which is getting dangerously off-topic.

@redlenses03 Actually, it was a mistake to even respond to this topic,

No, I would welcome an impartial review and assessment of ASR. One which is inherently hostile to the approach only serves to alienate interested people seeking truths.

The latest link you provided was to a very brief discussion of transients about which I know nothing, and should that be a material shortcoming in the approach then a discussion may be valuable. These things have probably been thought of and perhaps discarded. Or something? Who knows.

The limitations of SINAD beyond a certain level are known and hardly need elaborating upon.

@rtorchia But, if one has spent ~~a small fortune~~ thousands on ~~cables~~ high-end audio gear one may well be hostile to Amir and measurements, and insist that one hears auditory phantasms. 

Quite so (along with your other observations), and I took the liberty of changing your quote slightly so that it has a wider audience amongst audiophiles.

@chayro We had tons of equipment back in the 70s with the lowest measurable distortion possible. One of the founding principles of high end audio was to prioritize sound over specs.

Perhaps (I thought it was more like the early 80s, and as always I am happy to be corrected), but I suspect that technology and knowledge have advanced since then to accommodate such characteristics in general without the pain of fingernails on blackboards.

edit - folk who have amps from the 70s may recall, for example, amps such as Marantz, Sansui and many others being capacitor coupled. I’d like to see those (refurbished, of course) put over the measuring bench.

@chayro OK, granted, I stand corrected. I didn’t know that. I wonder what kind of gear they used at discos?

Incidentally, I do notice from some pics here that some folk still do have their setups flanking the TV, perhaps for home theatre reasons.

@adambennette not much discussion here about statistics... well, it’s not an exciting subject really.

Until an appreciation of stats demonstrates how often they may be used incorrectly to draw conclusions that have no basis. That’s just a general observation, so moving right along....

If in fact there is a difference and 40% say there is none, this is saying that those 40% of people have less than optimal hearing. Isn’t that conceding that it is a poor test audience, not to be relied upon?

If we are to trust our ears (whatever that means, despite it being some kind of mantra), ought not 100% agree that there is a difference?

And if prior to the test an unknown portion of the audience cannot trust their ears, on what basis can it then be said after the test that A and B are in fact different? These hard of hearing people may be saying there is a difference when none exists.

The good thing about the one person test is that the variability in people's hearing is removed - if you have "poor" hearing, any A/B test will be subject to a query and be discounted.

If "good" hearing, then multiple tests would need to be done in accordance with good practice (not so easy, as it happens) - finding some measure of confidence would involve probability theory, but there do exist common-sense rules of thumb, which I don’t really like much.
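The probability-theory point can be sketched briefly. For a single listener doing repeated forced-choice (ABX-style) trials, the chance of scoring at least so many correct by pure guessing is a one-sided binomial sum. The function below and the 12-of-16 example are my own illustration of that idea, not a protocol described in this thread; 12/16 is simply a commonly quoted rule-of-thumb pass mark.

```python
from math import comb

def p_value(correct: int, trials: int, p_chance: float = 0.5) -> float:
    """Probability of getting at least `correct` right out of `trials`
    by pure guessing (one-sided binomial test, chance level p_chance)."""
    return sum(comb(trials, k) * p_chance**k * (1 - p_chance)**(trials - k)
               for k in range(correct, trials + 1))

# Often-quoted rule of thumb: 12 correct out of 16 trials
print(round(p_value(12, 16), 4))  # ~0.0384, under the usual 0.05 threshold
```

So "12 of 16" isn't magic; it's just the smallest score where guessing becomes an implausible explanation at the conventional 5% level.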

Clear as mud?

edit - by the way, there is no requirement that the individual/s tested be into music/audio gear, whatever.  The best test subjects would be teenagers or even slightly younger.  Just sayin'

The only thing your test would reveal is that under those test conditions, 40 percent of the subjects heard no difference. There’s no data to support your conclusion.

Under strict test conditions. This is a given. I do say that this isn’t easy.

Common sense suggests that something may be learnt from a test when there is a difference and 40% say there is none.  It is not a "dead" number - to an analyst it speaks information.

What would be more interesting (to some, anyway) is where there is no difference and 40% (or 60% or even just 5%) said there is a difference. Could any conclusions be drawn from this? Perhaps the test wasn’t blind or conducted properly (this includes a person who is partial to the outcome conducting the test)? Hmmm. Correctly, this aspect is conceded in the comment.

Given that ears are apparently to be trusted. As is often advocated by many good folk here.

In any event, individual tests are preferred for reasons.