How can a 40 watt amp outshine a 140 watt amp?

My question is this: I see $6,000 integrated amplifiers with 40 watts per channel. How is this better than my Pioneer Elite SC-35 at 140 watts per channel? What am I going to hear that's different with, let's say, a Manley Labs Stingray II? I obviously don't understand the basics involved, and if someone could explain or point me in the right direction, I would greatly appreciate it.

I would like to set up a nice two channel analog system. I really can't afford the aforementioned Stingray, what is "out there" in the 2.5 grand range?
It all boils down to this. If the 1st watt doesn't sound good, why would you want 139 more of them!
To answer your questions, you need to go listen to some equipment other than what you have now. If the sound is better and worth the money to you, then go for it. Griffithds' answer was classic.
Mystertee...there are so many variables that explain differences in sound quality, but the basic differences between high-end audio and the midfi equipment you can buy at a large retailer like Best Buy are in the amplifier topology, the circuit design, the quality of the parts used, the quality of the enclosure the amplifier sits in (aimed at shielding the circuitry from RFI and EMI and at reducing vibrations), etc. In short, the designers of high-end audio equipment are optimizing sound quality, while the designers of midfi equipment are designing to a price point that focuses on the number of features they can stuff into a piece of equipment (because that is how those units are marketed) and on simple measures like a power rating (which has no bearing on sound quality, but the public has been conditioned to think more is better).

As to your question of what you will get with that Stingray, the answer is to experience it for yourself. Visit a few local high-end audio stores (which you hopefully have in your area), ask to audition a few systems at various price points, and you can experience what high-end audio is all about. Take some music that you are very familiar with, play it when auditioning, and pay attention to how resolving the system is (do you hear details you did not hear before on your system?), whether it is timbrally much more real/accurate (do instruments and vocals sound more natural/real vs. an electronic reproduction of them?), whether the systems present a soundstage with dense images floating from the speakers within that soundstage, etc. The only way to understand the difference is to experience it.

As to your final question, what's out there in the $2.5K range for an amplifier, integrated amplifier, or receiver (depending on what you want/need), the answer is a lot. To give appropriate recommendations, we will need to know a lot more about what you are looking for: what speakers you will be using with the amp/receiver, how big your room is, what music you like listening to, and what source(s) you will be using (CD, computer-based audio, turntable, tuner, etc.).

Hope this helps.

The Richard Clark Amp Challenge is a listening test intended to show that as long as a modern audio amplifier is operated within its linear range (below clipping), the differences between amps are inaudible to the human ear. Because thousands of people have taken the test, the test is significant to the audiophile debate over audibility of amplifier differences. This document was written to summarize what the test is, and answer common questions about the test. Richard Clark was not involved in writing this document.
The challenge

Richard Clark is an audio professional. Like many audiophiles, he originally believed the magazines and marketing materials that different amplifier topologies and components colored the sound in unique, clearly audible ways. He later did experiments to quantify and qualify these effects, and was surprised to find them inaudible when volume and other factors were matched.

His challenge is an offer of $10,000 of his own money to anyone who could identify which of two amplifiers was which, by listening only, under a set of rules that he conceived to make sure they both measure “good enough” and are set up the same. Reports are that thousands of people have taken the test, and none has passed the test. Nobody has been able to show an audible difference between two amps under the test rules.

This article will attempt to summarize the important rules and ramifications of the test, but for clarity and brevity some uncontroversial, obvious, or inconsequential rules are left out of this article. The full rules, from which much of this article was derived, are available here and a collection of Richard's comments are available here.
Testing procedure

The testing uses an ABX test device where the listener can switch between hearing amplifier A, amplifier B, and a randomly generated amplifier X which is either A or B. The listener's job is to decide whether source X sounds like A or B. The listener inputs their guess into a computerized scoring system, and they go on to the next identification. The listener can control the volume, within the linear (non-clipped) range of the amps. The listener has full control over the CD player as well. The listener can take as long as they want to switch back and forth between A, B, and X at will.

Passing the test requires two sets of 12 correct identifications, for a total of 24 correct identifications. To speed things up, a preliminary round of 8 identifications, sometimes done without levels or other parameters perfectly matched, is a prerequisite.
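The scoring described above can be sketched in a few lines of Python. This is only an illustration, not Richard Clark's actual scoring system: the function names are hypothetical, and a "listener" is modeled as a function that is handed the hidden identity of X and returns a guess, which stands in for whatever the person actually hears.

```python
import random

def run_abx_set(listener, trials=12, rng=None):
    """Run one set of ABX trials and return the number of correct answers."""
    rng = rng or random.Random()
    correct = 0
    for _ in range(trials):
        x = rng.choice("AB")        # hidden identity of X for this trial
        guess = listener(x)         # listener answers "A" or "B"
        correct += (guess == x)     # True counts as 1, False as 0
    return correct

def passes_challenge(listener, rng=None):
    """Pass only if both sets of 12 are perfect (24 of 24 correct)."""
    rng = rng or random.Random()
    return all(run_abx_set(listener, 12, rng) == 12 for _ in range(2))

# Hypothetical listeners, for illustration only.
def perfect_listener(x):
    return x                        # always identifies X correctly

def guessing_listener(x):
    return random.choice("AB")      # ignores X and flips a coin

if __name__ == "__main__":
    print("perfect listener passes:", passes_challenge(perfect_listener))
    print("guesser score in one set:", run_abx_set(guessing_listener))
```

A guesser will typically land near 6 of 12 in any one set, which is why the pass mark is a perfect score rather than a majority.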

Richard Clark normally has CD source, amplifiers, high quality home audio speakers, and listening environment set up in advance. But if the listener requests, they can substitute whatever source, source material, amplifiers, speakers (even headphones), and listening environment they prefer, within stipulated practical limits. The source material must be commercially available music, not test signals. Richard Clark stipulates that the amplifiers must be brand name, standard production, linear voltage amplifiers, and they must not fail (e.g. thermal shutdown) during the test.
Amplifier requirements

The amplifiers in the test must be operated within their linear power capacity. Power capacity is defined as clipping or 2% THD 20Hz to 10kHz, whichever is less. This means that if one amplifier has more power (Watts) than the other, the amplifiers will be judged within the power range of the least powerful amplifier.

The levels of both left and right channels will be adjusted to match to within 0.05 dB. Polarity of connections must be maintained so that the signal is not inverted. Left and Right cannot be reversed. Neither amplifier can exhibit excessive noise. Channel separation of the amps must be at least 30 dB from 20Hz to 20kHz.
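To get a feel for how tight a 0.05 dB match is, the tolerance can be converted to a plain ratio. This is a small sketch using the standard decibel definitions (20·log10 for voltage, 10·log10 for power); the helper names are made up for the example.

```python
def db_to_voltage_ratio(db):
    """Voltage (amplitude) ratio corresponding to a level difference in dB."""
    return 10 ** (db / 20)

def db_to_power_ratio(db):
    """Power ratio corresponding to a level difference in dB."""
    return 10 ** (db / 10)

tolerance_db = 0.05
v = db_to_voltage_ratio(tolerance_db)
p = db_to_power_ratio(tolerance_db)
print(f"{(v - 1) * 100:.2f}% voltage, {(p - 1) * 100:.2f}% power")
# prints "0.58% voltage, 1.16% power"
```

In other words, the two amplifiers' output voltages have to agree to within roughly six parts in a thousand.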

All signal processing circuitry (e.g. bass boost, filters) must be turned off, and if the amplifier still exhibits nonlinear frequency response, an equalizer will be set by Richard Clark and inserted inline with one of the amps so that they both exhibit identical frequency response. The listener can choose which amplifier gets the equalizer.
How many people have taken the challenge?

Richard Clark says over a couple thousand people have taken the test, and nobody has passed. He used to do the test for large groups of people at various audio seminars, and didn't charge individuals to do the test, which accounted for the vast majority of the people who did the test. Around 1996 was the last of the big tests, and since then he has done the test for small numbers of people on request, for a charge ($200 for unaffiliated individuals, $500 for people representing companies).
When did the challenge start?

Sometime around the year 1990. Richard Clark says in a post on 7/2004 that the test with the $10,000 prize started about 15 years ago.
What were the results of the test?

Nobody has ever successfully passed the test. Richard Clark says that generally the number of correct responses was about the same as the number of incorrect responses, which would be consistent with random guessing. He says in large groups he never observed variation more than 51/49%, but for smaller groups it might vary as much as 60/40%. He doesn't keep detailed logs of the responses because he said they always show random responses.
Is two sets of 12 correct responses a stringent requirement?

Yes. Richard Clark intentionally made the requirements strict because with thousands of people taking the test, even random guessing would eventually cause someone to pass the test if the bar was set low. Since he is offering his own $10,000 to anyone who will pass the test, he wants to protect against the possibility of losing it to random guessing.

However, if the listener is willing to put up their own money for the test as a bet, he will lower the requirements from 12 correct down to as low as 6 correct.

Richard Clark has said “22 out of 24 would be statistically significant. In fact it would prove that the results were audible. Any AVERAGE score more than 65% would do so. But no one has even done that”.
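The odds behind this section can be worked out with a basic binomial calculation. The sketch below assumes each identification by a pure guesser is an independent 50/50 coin flip, and it uses 2,000 test takers as a stand-in for "over a couple thousand"; both are assumptions for illustration, not figures from Richard Clark.

```python
from math import comb

def p_at_least(k, n, p=0.5):
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Chance that one pure guesser gets all 24 identifications right.
p_pass = 0.5 ** 24

# Chance that at least one of many guessers passes (takers is an assumption).
takers = 2000
p_any = 1 - (1 - p_pass) ** takers

# Chance that a pure guesser scores 22 or better out of 24.
p_22 = p_at_least(22, 24)

print(f"one guesser passing 24/24:        {p_pass:.2e}")
print(f"any of {takers} guessers passing:  {p_any:.2e}")
print(f"one guesser scoring >= 22 of 24:   {p_22:.2e}")
```

Under these assumptions a single guesser passes about once in 16.8 million attempts, and even 22 of 24 happens by luck only about 18 times in a million, which is consistent with the statement that such a score would be significant.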
Do most commercially available amplifiers qualify for this test, even tube amplifiers and class D amplifiers?

Yes. Nearly all currently available amplifiers have specs better than what are required for the test. Tube amplifiers generally qualify, as do full range class D amplifiers. It is not clear whether Richard Clark would allow sub amplifiers with a limited frequency response.
Besides taking Richard Clark's word, how can the results of the test be verified?

Many car audio professionals have taken the test and/or witnessed the test being taken in audio seminars, so there isn't much doubt that the test actually existed and was taken by many people. One respected professional who has taken and witnessed the test is Mark Eldridge. Because the test has been discussed widely on audio internet forums, if there were people who passed the test it seems likely that we would have heard about it. Sometimes there are reports of people who believe they passed the test, but upon further examination it turns out that they only passed the preliminary round of 8 tests, where levels were not matched as closely as for the final test.
How can audio consumers use the results of this test?

When purchasing an amplifier, they can ignore the subjective sound quality claims of marketers. Many amplifier marketers will claim or imply that their amplifiers have some special topology, materials, or magic that makes the sound clearly superior to other amps at all volume levels. Many consumers pay several times more than they otherwise would for that intangible sound quality they think they are getting. This test indicates that the main determinant of sound quality is the amount of power the amplifier can deliver. When played at 150W, an expensive 100W measured amplifier will clip and sound worse than a cheap 200W measured amp.
Does this mean all amps sound the same in a normal install?

No. Richard Clark is very careful to say that amps usually do not sound the same in the real world. The gain setting of an amplifier can make huge differences in how an amplifier sounds, as can details like how crossovers or other filters are set. When played very loud (into clipping), the amplifier with more power will generally sound better than a lower powered amp.

Most people perceive slight differences in amplitude as quality differences rather than loudness. The louder component sounds “faster, more detailed, more full”, not just louder. This perceptual phenomenon is responsible for many people thinking they liked the sound of a component when really they just liked the way it was set up.
I changed amps in my system to another one with the same measured power and I hear a sound quality difference. Does this show that the test results are invalid?

No. Installing a new amplifier involves setting the gains and crossovers, and any slight change you make to those settings is going to affect how things sound.
Is adding an equalizer just a way of “dumbing down” the better amplifier?

Richard Clark allows the equalizer to be added to whichever amplifier the listener wants. It can be added to the amplifier that the listener perceives as the weaker amplifier. The EQ is most likely to be used when comparing a tube amplifier (which exhibits slight high frequency rolloff) to a solid state amplifier. In that case Richard Clark says he can usually fashion an equalizer out of just a resistor and/or capacitor which for just a few dollars makes the solid state amplifier exhibit the same rolloff as the tube amplifier, and therefore sound the same. If the tube amplifier really sounded better, then modifying the solid state amplifier to sound indistinguishable from it for a few bucks should be a great improvement.
How might allowing clipping in the test affect the results?

It's impossible to know for sure because that would be a different test that has not been done. But Richard Clark seems to think that in clipping, conventional amplifiers would sound about the same, and tube amplifiers would sound different from solid state amplifiers.

Richard Clark reported that he did some preliminary experiments to determine how clipping sounds on different amplifiers. He recorded the amplifier output using special equipment at clipping, 12 dB over clipping, 18 dB over clipping, and 24 dB over clipping. Then he normalized the levels and listened. His perception was that with the same amount of overdrive, the conventional amplifiers sounded the same. With the same amount of overdrive the tube amplifiers sounded worse than the conventional amplifiers. On the basis of that experiment, he said “I believe I am willing to modify my amplifier challenge to allow any amount of clipping as long as the amplifiers have power ratings (actual not advertised) within 10% of each other. This would have to exclude tube amplifiers as they seem to sound much worse and it is obvious”.
If a manufacturer reports false power ratings, will that interfere with the test?

No. The test is based on measured power, not rated power.
Does this mean that there is no audible difference between sources, or between speakers?

No. There are listening tests that show small but significant differences among some sources (for instance early CD players versus modern CD players). And speakers typically have 25% or more harmonic distortion. Most everyone agrees that differences among speakers are audible.
Does the phrase "a watt is a watt" convey what this test is about?

Not quite but close. Richard Clark has stated that some amplifiers (such as tubes) have nonlinear frequency response, so a watt from them would not be the same as a watt from an amplifier with flat frequency response.
Do the results indicate I should buy the cheapest amp?

No. You should buy the best amplifier for your purpose. Some of the factors to consider are: reliability, build quality, cooling performance, flexibility, quality of mechanical connections, reputation of manufacturer, special features, size, weight, aesthetics, and cost. Buying the cheapest amplifier will likely get you an unreliable amplifier that is difficult to use and might not have the needed features. The only factor that this test indicates you can ignore is sound quality below clipping.

If you have a choice between a well built reliable low cost amp, and an expensive amplifier that isn't reliable but has a better reputation for sound quality, it can be inferred from this test that you would get more sound for your money by choosing the former.
Do home audio amps qualify for the test?

Yes. In the 2005 version of the test rules, Richard explicitly allows 120V amplifiers in a note at the end.
How can people take the test?

They should contact Richard Clark for the details. As of 2006 Richard Clark is reported to not have a public email account, and David Navone handles technical inquiries for him. Most likely they will need to pay a testing fee and get themselves to his east coast facility.
Is this test still ongoing?

As of early 2006, there have not been any recent reports of people taking the test, but it appears to still be open to people who take the initiative to get tested.
Do the results prove inaudibility of amplifier differences below clipping?

It's impossible to scientifically prove the lack of something. You cannot prove that there is no Bigfoot monster, because no matter how hard you look, it is always possible that Bigfoot is in the place you didn't look. Similarly, there could always be an amplifier combination or listener for which the test would show an audible difference. So from a scientific point of view, the word “prove” should not be used in reference to the results of this test.

What the test does do is give a degree of certainty that such an audible difference does not exist.
What do people who disagree with the test say?

Some objections that have been raised about the test:

* Richard Clark has a strong opinion on this issue and therefore might bias his reports.
* In the real world people use amps in the clipping zone, and the test does not cover that situation.
* Some audible artifacts are undetectable individually, but when combined with other artifacts they may become audible as a whole. For instance cutting a single graphic EQ level by one dB may not be audible, but cutting lots of different EQ levels by the same amount may be audible. Maybe the amps have defects that are only audible when combined with the defects from a particular source, speaker, or system.
* Some listeners feel that they can't relax enough to notice subtle differences when they have to make a large number of choices such as in this test.
* There is a lack of organized results. Richard Clark only reports his general impressions of the results, but did not keep track of all the scores. He does not know exactly how many people have taken the test, or how many of the people scored “better than average”.
* If someone scored significantly better than average, which might mean that they heard audible differences, it is not clear whether Richard Clark followed up and repeated the test enough times with them to verify whether the score was statistically significant.

Is there one sentence that can describe what the test is designed to show?

When compared evenly, the sonic differences between amplifiers operated below clipping are below the audible threshold of human hearing.
At one point in the history of audio there really was a struggle to get enough watts out of an amp to drive the speakers of the day to sufficient volume. I am thinking of a period from around the dawn of home stereo, "the golden age", that extended well into the period of relatively inefficient speakers, the mid 70s maybe? In any event you are right that this makes very little sense these days, but the last 35 years or more of experience have done little to convince anyone.
Indeed, I have heard tales of audio people trying their modestly powered tube amps in shops against high power SS, much to the amazement of those shops' own staffs. That the first bad watt is just as bad as the last one is true. Great saying, BTW.
I had wondered the same thing too, and found that current means more than watts (you rarely see current in specs). I had a Bryston B100 rated at 180w @4ohms. I then bought an Octave V70SE rated 70w @4ohms. The Octave is tubes, which is different from solid state. I also found that the Naim line of integrateds seems to have a lot more current than the rated watts (50 to 80w) would suggest; if you're interested in solid state, a used one should be right in your price range.

The biggest difference I hear with the Octave is much more control in the music. The loudness is the same. Since the Octave is tubes I can change them out and get a different sound like more or less bass - mids and highs depending on the tubes (type and manufacturer). For example with either the KT88's or the 6550's the bass just kills my Bryston (and Bryston is known for decent bass). With the EL34's not as much bass as the Bryston but killer mid range and highs. So I can't say what to expect in your system with your room - speakers and source. That's what I hear in my system but I think Cmalak summed it up the best in general.

Good luck

Unfortunately the Octave is a little beyond your budget too at almost $7K but a killer unit for the money.
Schipo, I only have 1 problem with Mr Clark's conclusions. That being, he may have been comparing 2 very similar types of amps. I would like to see the results of comparing something like my Bryston B100 and my Octave V70SE. To me there is no comparison and the differences are (punch you in the nose) obvious. Granted, the msrp of my Bryston was 5k vs 7k for my Octave.

So my stance stays - There is a great deal of differences with just the amp BUT speakers - source and ROOM make a huge difference. I think I have heard that referred to as system synergy. I'm just trying to answer Mystertee's question with what I have experienced not what someone else thinks. I don't mean to knock your reply because I agree with most of your previous posts.
Xti16: thank you for your reply to The Richard Clark Amp Challenge. I find it interesting reading, and I myself believe that in all probability, in the real world, most people (and that's including myself) cannot tell the much more expensive amp from the budget one at normal listening levels. It's a great possibility that at clipping I would. I thank you all; I only posted to keep the conversation going.
Interesting reading about Richard Clark's test, BUT does that mean the amps sounded the BEST they could? In other words, does making two amps sound the same result in equally bad or equally good sound?
I believe in an additional objection to the Richard Clark test method.

The Speaker.

All speakers certainly do not react the same to all loads. I'm not even talking the difference between speakers which have a preference to current or voltage source amps.
Many speakers have wacky characteristics and huge phase swings, even if impedance would appear to be within reason. Worst case? Combine a phase shift of 50 or 60 degrees with an impedance dip to say.....3 ohms. Low power amps or amps which don't like this kind of behavior need not apply. Even if a tube amp would meet Clark's criteria, I'm sure if you compared a heavily capacitive speaker, you'd notice a difference.

Perhaps, when running within limits amps do sound alike, but for some amps and speaker combinations, those limits are either at low levels or maybe even frequency limited. I'd love to hear my panels with 6 or 8 SET watts.

And finally, while I hate to drag Bob Carver into this, I think he demonstrated differences between amps based on Transfer Function. He would somehow connect 2 amps to the same speaker. Any sound the speaker made was the DIFFERENCE between the amplifiers' transfer functions. After adjustment, the speaker would 'null' and at least for THAT speaker, the amps could be considered identical.

And Xti, the difference between 5k and 7k is negligible. Any manufacturer should be pretty much able to do their best work for that kind of money. Toss something like $700 worth of Onkyo A-9555 into the mix and see what'cha get. You can probably get the Onkyo for what would amount to a TRUE nutty hi-end guy's pocket change.
If all amps sound the same, I sure have wasted a lot of money upgrading and changing amps all these years. I wish someone would have told me about this sooner. :)
Once upon a time, I’ll bet that arguments asserting that the world was flat were nothing less than academic blockbusters. Why are we still debating this? If you can’t hear a difference between amplifiers, then there isn’t any....and you’re deaf!
A 40 watt amp can outshine a 140 watt amp with proper speaker/amp matching; this has been my experience.
Phaelon: "If you can’t hear a difference between amplifiers, then there isn’t any....and you’re deaf!" That statement is now getting very tired. Please tell us why no one has passed the test. I am not in any way saying that when pushing amps towards clipping the listener will not hear a difference, but at normal levels there just might not be much or any. I am not running the test. So please contact Richard Clark and try to collect the 10k.
Wow, you "guys" rock! I've had a few "duh" moments after reading your responses, of course it makes sense to figure the size of the listening room and speaker type into the equation. Obviously, there are a lot of factors to think about, and I can take everyone's suggestions (which I will do to some extent) but I need to get my butt out to a true audiophile type store to make comparisons based on my own ears. I figured I could surf the net and find out everything I needed to know, I can to some extent, but I have to now do the leg (ear) work. I live near Cleveland OH, I used to go to Audio Craft but they've gone down hill a little (IMHO) if they're still in existence. Bottom line, I've learned that watts is quality not quantity (I think I've mentioned something like that to my wife once or twice, just not about watts) ;) are smiley faces allowed here? I love Griffithds' quote "If the 1st watt doesn't sound good, why would you want 139 more of them!" (first duh moment).

There is so much to learn (crossover considerations, etc) I can see already that this site will help me immensely. Thanks to all who responded, I look forward to additional comments.
It is useless debating the quality of amps alone, they are only heard when they are driving speakers. On speakers that are easy to drive there is less difference, on hard to drive ones more. Don't spend money on things you can't hear; I hear differences IN MY SYSTEM so I have a reasonably expensive amp; my friend drives his Quad 57s with a rebuilt tube Heathkit; they are not sensitive to changes in amps. You pays your money and makes your choice; on some very difficult loads the old Classe 25 watt amp would outdrive amps 10x as powerful, IT ALL DEPENDS!
the output transformers!
Sorry, but I can't get past the first response.

02-20-11: Griffithds
It all boils down to this. If the 1st watt doesn't sound good, why would you want 139 more of them!


Am I really reading that all amps sound alike here on Agon and some actually think this is true. Come on - really?

This test and article must be a joke of some kind right?

My goodness I hear differences as true and real as day is from night with my eyes. Someone said if you can't hear the difference, then that is true for you. I agree and that is the bottom line. If you can't hear it, then no reason to invest in better amps! Just buy a 100 watt 1980's Sears MCS series amp for $35 and be done with it.

Strange a person with such ears would even bother with this site or any high end gear?
It is all in the match of the amp to your speaker, and the speaker sensitivity. If your speakers are 90dB sensitive and you have a modest sized room, a 40wpc amp should play loud enough, provided you don't want insane levels. Remember that going from 40 to 80wpc only gives you 3dB, and from 80 to 160wpc another 3dB, a mere click or two on the volume control.
SS integrated amps to consider, among others: Odyssey Khartago, NAD, Bryston. Tubes: PrimaLuna, Manley, Vincent, CJ, Quicksilver separates and ARC.
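The arithmetic behind the "3dB per doubling" point is easy to check. The sketch below uses the standard 10·log10 power relationship and assumes the sensitivity figure means dB SPL at 1 watt and 1 meter; it ignores room gain, listening distance, and speaker compression, so treat the SPL number as a rough upper bound for one speaker rather than a prediction.

```python
import math

def db_gain(watts_from, watts_to):
    """Level change in dB when amplifier power changes."""
    return 10 * math.log10(watts_to / watts_from)

def spl_at_1m(sensitivity_db, watts):
    """Rough SPL at 1 m, assuming sensitivity is quoted at 1 W / 1 m."""
    return sensitivity_db + 10 * math.log10(watts)

print(round(db_gain(40, 80), 2))       # 3.01
print(round(db_gain(40, 140), 2))      # 5.44
print(round(spl_at_1m(90, 40), 1))     # 106.0
```

So the jump from the 40 watt amp to the 140 watt receiver in the original question is worth a little under 5.5 dB of extra headroom on paper.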
Surely we must have some members on here that live near Cleveland and could help Mystertee hear a variety of equipment (probably set up with more care than many high end stores). Any volunteers? We've got a chance to win a convert here! I bet you make a new friend too.
Schipo, I don’t know why people don’t pass Clark’s test. But I do know that tests can be constructed to conclude in misleading results. Heck, forget about amplifiers sounding different. What about tube rolling? I hear remarkable differences from same value tubes.
Phaelon: I don't know myself, but they all seem to fail...I don't think the testing was constructed to mislead but to teach. I remember Arthur Salvatore writing somewhere, when he opened a very expensive mono amp, asking why so much for so little in parts. Maybe the ride is more expensive when it comes to high-end? And only deep pockets need apply.
I have an idea about what is going on w the Clark test. I am not a statistician, but I do not think that you need 2 sets of 12 out of 12 correct answers to reject the null hypothesis that the 2 amplifiers sound the same. I would think that the chance of getting even one set of 12 100% correct is pretty low, esp. when you have audio memory/human perception issues. It's been a long time since I thought about/learned experimental design, but I think what we are talking about here is the difference between a false positive (hearing a difference when there is none) and a false negative (not hearing a difference when there is one). One of them is called type 1 error and the other is type 2 error. I'll bet that if someone here knows a statistician or experimental design specialist, they could work out the numbers pretty easily. I'd be surprised if the chance of getting 12/12 correct, twice in a row, isn't pretty darn remote.
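For what it's worth, the false negative side of this can be roughed out without a statistician. In the usual terminology a false positive is the type 1 error and a false negative is the type 2 error. The sketch below assumes a listener who really does hear a difference and gets each trial right independently with some fixed hit rate; the hit rates are made-up values for illustration, not measurements of anyone.

```python
def p_perfect(hit_rate, trials=24):
    """Chance of getting every one of `trials` identifications right."""
    return hit_rate ** trials

# Hypothetical per-trial hit rates for a listener who can hear a difference.
for rate in (0.5, 0.7, 0.9, 0.95, 0.99):
    passed = p_perfect(rate)
    print(f"hit rate {rate:.2f}: passes {passed:.4f}, fails {1 - passed:.4f}")
```

Under these assumptions, even someone who is right 9 times out of 10 passes only about 8% of the time, so the 24 of 24 bar guards very well against false positives at the cost of a high false negative rate.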
02-20-11: Schipo
Richard Clark normally has CD source, amplifiers, high quality home audio speakers, and listening environment set up in advance. But if the listener requests, they can substitute whatever source, source material, amplifiers, speakers (even headphones), and listening environment they prefer, within stipulated practical limits.

Does this mean that Richard Clark will set up the test in my own listening room, with my own equipment? I presume the answer is no.

And that is what is wrong with the test.

I believe that the ABX nature of the test is an illusion. To call it an ABX test is to say, among other things, that only one variable changes. That is of course true in one respect, namely that the amp is the only component that is switched during the test. Hence the test appears to involve a single variable change.

But there is another respect in which hundreds of variables have changed the moment you sit down to take the test, namely the TESTING SYSTEM ITSELF is different from your own. That is the reason, I suspect, why no one can pass the test.

By conducting the test with a system that a participant is not extensively familiar with, the participant’s auditory frame of reference is eliminated. Without it, detecting the manipulation of a single variable change is hopeless.

An analogy: If you put a dish I’ve never eaten in front of me and ask me to tell you if it has ingredient X, I may not be able to do so with any reliability greater than chance. That may be true even if I know what ingredient X tastes like. But if the dish is one my wife has cooked once a week for three years, I can instantly tell you if it has a different ingredient. The reason: Familiarity with the dish.

That’s what missing from Richard Clark’s test: familiarity with the testing system (including the listening room).

Bringing one or two familiar components to the test isn’t enough to make the testing system truly familiar, since literally hundreds of variables are still new to the participant, so Richard Clark's accommodations to participants gives the appearance of scientific rigor without actually providing it.

Swampwalker: you're making sense when you state 12/12 correct, twice in a row, is pretty darn remote. I would gather that if Clark lowered the numbers from 12/12 correct, there would be a better chance of passing the test. The more you play and listen, the more confused audio memory/human perception becomes?
A 40 watt amp can outperform a 140 watt amp in two ways. First, a watt is current delivered into a load, and the load is the speaker. But the speaker's impedance varies with frequency. As the impedance offered by the speaker drops, the amp has to put out more current to hold the same output voltage. If the 40 watt rating is based on 8 ohms, the amp would have to put out 80 watts into 4 ohms and 160 watts into 2 ohms. Only if the 40 watt amp has a capable power supply can that happen. That requires an expensive power supply (bigger transformer, higher quality and capacity of filtering caps). The Pioneer 140 watt may not have that kind of a power supply. It is rated at 140 watts at 8 ohms, but if the speaker load drops to 4 ohms, the amp may not put out 280 watts to drive the load; instead the power supply only puts out, say, 200 watts, and at 2 ohms it says "I give up" and just clips.

The second factor is the quality of the output devices. The more expensive transistors are more linear and stable and distort the signal to a far lesser degree. High quality output devices in a lower-wattage amp will definitely outperform a mass market higher powered amp.

When comparing amps, it is critical to note the rating at 4-ohms and, if able, 2-ohms in addition to the standard 8-ohms. The ability of an amp to put out current to a varying load maintains its sound quality throughout the audio band. But that is what drives up the price of an amp -- the power supply and output devices.
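
The arithmetic in this post can be sketched in a few lines of Python. This is only an illustration, assuming the amp behaves as an ideal voltage source (so power doubles each time the load halves) and using the 200 watt supply ceiling from the example above; the function name and the `supply_limit_watts` parameter are made up for the sketch and do not describe any real amplifier.

```python
def power_into_load(rated_watts_8ohm, load_ohms, supply_limit_watts=None):
    """Power delivered into load_ohms by an amp rated into 8 ohms.

    Assumption: the amp is an ideal voltage source, so P = V^2 / R and the
    output voltage set by the 8-ohm rating stays constant as the load drops.
    supply_limit_watts is a hypothetical ceiling imposed by the power supply.
    """
    ideal = rated_watts_8ohm * 8.0 / load_ohms  # V^2 = rated * 8, then divide by R
    if supply_limit_watts is None:
        return ideal
    return min(ideal, supply_limit_watts)


if __name__ == "__main__":
    for load in (8, 4, 2):
        small = power_into_load(40, load)                          # stiff supply
        big = power_into_load(140, load, supply_limit_watts=200)   # limited supply
        print(f"{load} ohms: 40 W amp -> {small:.0f} W, 140 W amp -> {big:.0f} W")
```

With these assumptions the 40 watt amp keeps doubling as the load halves, while the 140 watt amp stops at its supply ceiling. A real amp would clip rather than sit neatly at the ceiling.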
Don't forget, Gs5556 (catchy name, BTW!) that the amps under test are NOT run to clipping and are carefully level matched. In real world use, your objection is doubtless correct, but for purposes of this test? I'm less clear.

I think all the evidence needed is in front of us.

First, Carver pretty much proved amps sound different. He had to null his amp to the reference amp to make them indistinguishable. As far as I'm concerned that is 'game set and match' for the 'all amps sound alike' school.

Second, that null was valid ONLY for a particular speaker, though probably close on quite a few others.
The reason? Not only impedance but phase.
Try the same pair of 'nulled' amps on 3 speakers.... 1. A full range, single driver 2. Some Maggies 3. Some B&W from the '8' series.
I'll bet the null doesn't survive all speakers.

Now, keep in mind that the Clark test under discussion specifies NOT clipping the amp(s) under test. A 40 watt amp driving 83 dB speakers with an impedance dip to 3 ohms at some wacky phase angle will almost surely clip. I'd be surprised if it didn't, if the level were above 'low'.

I would conclude that with a benign speaker load it may very well BE impossible to distinguish 2 amps. Stereophile thought they could make such distinctions when challenging Carver. I'd be curious to know if Carver ALSO knew the speaker or had one of the test speakers with which to do his adjustments?

There are many subtle cues to telling gear under A/B tests apart. Even the best poker player can have a 'tell'. I'd suggest that the full test rules have a clue. Perhaps the exact level matching?
Easy: connect a 40 watt rated light bulb to each amp. Run a 50/60 Hz sine wave at 2 to 3 volts RMS into each amp. Eventually the 140 watt amp will blow the light bulb. There you have it. The 40 watt wins.
Thanks Gs5556 and Dpac996 for some very useful insight. I have a lot to learn, but it's going to be fun and I think AudiogoN will be THE place for my education.
"If you can't hear it, then no reason to invest in better amps!" by Grannyring.

Now there is a quote and a half! Let that guide you thru the Sea of Snake Oil.

I'll add that while navigating thru that sea, there may be differences that you won't hear on the first tune but that will, upon further listening, become apparent. If those differences are worth it to you then go for it. Vote with your wallet.
I just don't understand why it is that we all love this hobby if so much of it is truly unbelievable, as in "Snake Oil," fixed delusions, etc. I think we all believe that there are real differences in some aspects of audio or else we would join the silent majority of listeners content with whatever they use for sound reproduction.
Mechans: every industry has snake oil; are we to believe that ours doesn't?
Mechans, I don't believe that the hobby is unbelievable. Snake oil is eventually exposed as such, but for the majority of us, there are real differences. IMO, each individual must trust their own conclusions drawn upon their listening/hearing experience. What else matters? Can anyone honestly tell you that you aren't hearing what you say you are hearing?
I guess I worded my response poorly. I intended to convey that despite our knowing that some aspects are Snake Oil, we know that there are real differences between products. It is that belief that makes this hobby fun. By saying belief I don't want to convey that believing is unfounded. It is just remarkable that we know some of this hobby is absurd but we still love it.
The number of watts alone says little about an amp, other than perhaps how loud it can go with a particular pair of speakers (depending on how efficient the speakers are) before its published total harmonic distortion specs are exceeded, assuming those are accurate.

All the other things that go into designing and building the amp, including the details beyond THD that determine how the amp really distorts and sounds, are how.
Bravo, Griffithds! Churchill would have been proud of that perfect answer:)
So, I've resisted responding here for quite a while, but temptation has overcome me. My first thought is that the challenge is pretty silly, in that it relies on crippling the better amp, or jacking around with the lesser amp to make the sounds least able to be discriminated. What's left is exactly what most listeners don't care about, which is whether the sound fits some arbitrary standard, rather than whether it sounds lifelike.

Several comments above have made similar points, so I'll add another observation a little more technical. The test requires 24 judgements to be correct. If we are willing to assume that each judgement is statistically independent (arguable, but not terribly germane), then the probability of passing the test if you can detect exactly no difference between the amps is roughly .00000006, a pretty stringent test.

That is, if the probability of choosing the better amp is exactly .5 (we are just flipping a coin to make our choice), then the probability of passing the test (by chance) is less than .0000001. Let's call the probability of detecting a difference on any given trial "p", and the probability of passing the test "P". In our example, p=.5 and P<.0000001, if there is exactly no difference between the amps.

Now, suppose there is a small, but hard to detect difference between the amps. Since we have to introduce a variable source signal (music), we cannot just compare one sine wave signal to another, and we are unable to compare the amps to each other with 100% accuracy. The music thus introduces uncertainty into the comparison. If this uncertainty is large, or the difference between the amps is small, p will be near .5 (might as well flip a coin). If the uncertainty is small, and the difference between the two amps is large, then p will approach 1.0.

Note that the challenge demands a correct call on every trial, which in effect forces p to equal 1.0. In other words, the challenge is based on the assumption that ANY difference in amps should make it possible to detect a difference in EVERY case. Looking at it from this point of view, the fact that the challenge has never been overcome is just a statistical artifact of the design of the challenge. For those who have had a stat course, the design has almost no statistical power when the signals from the amps are pretty close, or the uncertainty introduced into the signal by the music is large. The design is strongly (!) biased in favor of the null hypothesis of no difference.

Suppose we allow there to be some difference between the amps, but not enough to be detected every time, say, a p value of .6, meaning that we only can detect the difference about 60% of the time. Now, P (the probability of winning the challenge) is less than .00001, still very unlikely. But notice that there is a real difference between the amps. It's obscured by our jacking around with the signals, our confusion induced by the variability of the music, and the fact that we require perfect performance on each trial, but the difference between the amps is still very real.

What situation would lead us to be able to pass the test more often than not? We would have to be able to detect the difference on every trial more than 97% (p greater than .97) of the time--an extraordinary level of performance for an ambiguous stimulus.

The bottom line is that the challenge is primarily a statistical artifact based on the fallacy of accepting the null hypothesis. We cannot conclude that there is exactly no difference between the amps, because we can never prove that p is exactly .5. All we have proven is that we can set up an experiment with enough ambiguity, and so little statistical power, that the result is a foregone conclusion. The prize money is safe for quite some time.
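
For anyone who wants to check the figures in this post, here is a minimal Python sketch. It assumes, as the post does, 24 statistically independent trials that must all be correct; the function names are invented for the illustration.

```python
TRIALS = 24  # two rounds of 12, all of which must be correct


def pass_probability(p, trials=TRIALS):
    """P: chance of getting every trial right, given per-trial success rate p.

    Assumes the trials are independent, so the probabilities multiply.
    """
    return p ** trials


def per_trial_rate_needed(target=0.5, trials=TRIALS):
    """Smallest per-trial rate p for which P reaches the target."""
    return target ** (1.0 / trials)


if __name__ == "__main__":
    for p in (0.5, 0.6, 0.9):
        print(f"p = {p}: P = {pass_probability(p):.8f}")
    print(f"p needed to pass half the time: {per_trial_rate_needed():.4f}")
```

Pure guessing (p=.5) gives P of roughly .00000006, p=.6 still gives P below .00001, and passing more often than not needs p above .97, which matches the numbers quoted above.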
Nicely done, Mcphersn.
Thanks Bryon. An intellectually honest way of setting up the experiment would have been to test whether a listener ever gets it right more often than chance. A totally different analysis, and a money losing proposition for him, I suspect.
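The "more often than chance" analysis mentioned here could look something like the sketch below. It is only one possible version, assuming independent trials and a one-sided binomial test against coin flipping; the function name and the score of 18 out of 24 are hypothetical examples, not anything from the actual challenge.

```python
from math import comb


def chance_tail(correct, trials=24):
    """Probability of scoring at least `correct` out of `trials` by pure guessing.

    One-sided binomial tail with p = 0.5 on every trial (an assumption:
    the trials are treated as independent coin flips).
    """
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    # Hypothetical listener who gets 18 of 24 right: far from perfect,
    # but unlikely to happen by guessing alone.
    print(f"P(at least 18 of 24 by chance) = {chance_tail(18):.4f}")
    print(f"P(24 of 24 by chance)          = {chance_tail(24):.8f}")
```

Under this kind of scoring a listener would not need a perfect run to show a real difference, only a score that guessing rarely produces.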
My objection is simple and, to my thinking, is the 'trick' to being unable to distinguish between 2 amps: the equaliser. As soon as that is inserted in line, it is MODIFYING the sound of the amplifier it's inline with to be identical, as far as electronic testing is concerned. He cheats by removing that difference.