Paralleled Transistors

Is there any truth to the argument that many paralleled output transistors, despite strong attempts to match them closely, will smear music signals because they are not identical? How about those designers using only N-channel MOSFET pairs rather than complementary P-channel devices? Just curious whether using larger, more powerful MOSFETs, and thus fewer pairs, is better in any way than, let’s say, 12 smaller pairs (24 devices) per channel? Thanks for helping me to understand.
Using multiple bipolar transistors not only spreads the heat among them but also lowers the collector current each one carries. That improves linearity, since a transistor's current gain drops at larger collector currents (beta droop). Putting transistors in parallel also lowers output impedance without affecting stability, because each device keeps the same emitter resistor. MOSFETs don’t have "beta droop" and suffer less stress at high current, while putting them in parallel increases the chance of parasitic oscillations, so perhaps paralleling is only a good idea for bipolar transistors. Of course, you have to use more than one if the available transistors can’t deliver the required power. Perhaps we have a member who designs amps who could chime in?
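To put rough numbers on the current-sharing and output-impedance points above, here is a minimal sketch, not a design tool. It assumes an idealized emitter-follower output stage with perfectly matched devices and ignores the driver's source impedance and beta droop itself. The function names, the 0.22 ohm emitter resistor and the 6 A total current are illustrative assumptions, not values from any particular amplifier.

```python
# Simplified small-signal model of N paralleled, perfectly matched
# bipolar emitter followers. All names and values are illustrative.

THERMAL_VOLTAGE = 0.02585  # volts, kT/q at roughly 300 K


def per_device_current(total_current_a, n_devices):
    """Collector current carried by each device, assuming equal sharing."""
    return total_current_a / n_devices


def output_impedance(total_current_a, n_devices, emitter_resistor_ohm):
    """Approximate output impedance in ohms.

    Each device contributes its intrinsic emitter resistance (VT / Ic)
    in series with its own emitter resistor; the N branches are in parallel.
    """
    i_each = per_device_current(total_current_a, n_devices)
    r_e = THERMAL_VOLTAGE / i_each
    return (r_e + emitter_resistor_ohm) / n_devices


if __name__ == "__main__":
    total_current = 6.0      # amps, assumed total output current
    emitter_resistor = 0.22  # ohms per device, assumed
    for n in (1, 2, 6, 12):
        i_each = per_device_current(total_current, n)
        z_out = output_impedance(total_current, n, emitter_resistor)
        print(f"{n:2d} devices: {i_each:.2f} A each, Zout ~ {z_out * 1000:.1f} mOhm")
```

In this model the intrinsic part of the impedance (VT divided by total current) does not change with the number of devices; the reduction comes from the emitter resistors ending up in parallel, while each device runs at a lower current where its gain holds up better.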
If this is true, we've never seen it borne out in distortion, square-wave, or any other measurements.

On the contrary, ultra-simple circuits often have the most "color" added by distortion or high output impedance.
The people over here (including Nelson Pass) would probably have some relevant experience and opinions:
Pass is a good example: while the brand has many fans, its uber-simple designs do not win everyone over on sound quality.

I think if the idea of transistor smearing were accurate, or a convincing win, no one would buy any amp that used more than one pair. Instead, these types of designs, and their tube equivalents, remain a niche sound.

I encourage the OP to find a FirstWatt kit and make his own, see what conclusions he comes to. :)