Inside Audio: David Moulton
Blind Testing in the Audio Realm
Alert readers will recall that last month I began
a discussion of how we try to determine what it is that we humans
can really hear. I noted that people often report hearing "awesome"
differences from small changes (like going from 16 to 20 bits, for instance),
but that when they are tested in a "blind" way, they
can't seem to make out the difference at all. How can this be?
This month, we'll take a quick look at blind testing.
To begin with, we humans are quirky creatures.
Our conscious and subconscious minds behave in unruly and pesky
ways. Our perceptions of the world around us are multifaceted,
complex, cross-integrated, extraordinarily rich in information,
and also heavily "edited." In fact, our raw perceptual
information is overwhelmingly complex and noisy, and some sort
of perceptual "editing" is necessary to keep us from
drowning in sensory noise.
With the editing that occurs at various neurological
and precognitive stages of our perception, however, we also experience
a loss of ... well, objectivity. Our edited perceptions tell us
quickly what we think we need to know. Such perceptions take into
account all of what we think we need to know about the
stimulus in question. Sometimes these edited perceptions get us
into trouble.
THIS IS A TEST
To give you an example, let's go through the following
exercise: Say, out loud, the following sentence: "Compared
to Microphone B, Microphone A sounds really dull and lifeless."
Not very hard to say, is it? Anybody could say
that, and mean it, too!
Now, let's try another sentence. [Disclaimer:
Any mention of particular brand names is strictly for the purpose
of hypothetical comparison.] Say this one loud and clear
(I dare you!):
"Compared to a Shure microphone, a Neumann
microphone sounds really dull and lifeless."
A little harder to say, isn't it? Especially if
you have a knowledgeable colleague listening.
Why is this?
It is because we "know," as a matter
of professional competence, that Neumann microphones are better-sounding
than Shure microphones. (Ah, the power of brand identities ...)
How do we know? From our own experience with microphones
(whether or not we've ever directly compared a Shure to a Neumann),
gossip, group-think, perceived value, relative cost, mythology,
etc. This sum total of "knowledge" helps us quickly
make the acceptable professional decisions that allow us to survive.
This sum total of "knowledge" is also
the basis for bias and prejudice. For better or worse, it is lacking
in what we like to think of as scientific rigor and objectivity.
Welcome to blind testing.
ALL BEING EQUAL
When we wish to determine the answer to the question:
"All other things being equal, which microphone sounds better
to human listeners, a Shure BX58 or a Neumann Y49?" the "all
other things being equal" part of the question (which is
seldom stated, but always implied) requires that we take steps
to make sure all other things are equal when we perform
an experiment that compares the two microphones.
Because we already "know" that Neumanns
are better than Shures, that knowledge colors our perceptions.
How strong is this effect? Overwhelming, apparently.
In 1994, Floyd Toole published a report on this effect, which found,
in a study of loudspeakers, that brand identity was the strongest
force affecting perceived audio quality, far stronger than audio
performance.
So, we conceal the identity of the devices under
test, and rename them with neutral names such as Microphone A
and Microphone B. This is what we mean by "blind testing."
The test listener should not know the identity
of the items under test. Then, when he or she reports that, "compared
to Microphone A, Microphone B sounds really dull and lifeless,"
we can reasonably assume that he or she is not drawing on other,
prejudicial "experience" in making that judgment.
NOT ENOUGH
Sadly, this isn't enough. It turns out that we
humans are pernicious and sneaky enough to screw things up anyway.
If the person administering the listening test knows the identity
of the two microphones, he or she may, by either inadvertent or
advertent conduct, cause the blind listener to prefer one microphone
over the other. So, if we are going to be thorough about this,
we need to conceal the identity of the microphones from the
person administering the test as well. This is what we mean by
"double-blind testing."
So, if we are going to rule out prejudices and
biases, however well-founded they might be, we need to use double-blind
testing, or so it seems. This requires a lot of complexity and
expense, but at least we've found out a truth that stands up "when
all other things are equal," which is a cornerstone of scientific
method.
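For the programmers in the audience, the concealment described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not a standard procedure: the function name assign_blind_labels and the idea of a "sealed key" held by a third party are hypothetical. The point is simply that neither the listener nor the person running the test sees anything but neutral labels until the scoring is done.

```python
import random


def assign_blind_labels(devices, seed=None):
    """Shuffle the devices and hide them behind neutral labels.

    Returns (labels, key). The labels are all that the listener and the
    test administrator get to see. The key maps each label back to the
    real device and is assumed to be held by a third party until the
    listening is over (hypothetical arrangement, for illustration only).
    """
    rng = random.Random(seed)
    shuffled = list(devices)
    rng.shuffle(shuffled)  # random order, so "A" is not always the same brand
    labels = [f"Microphone {chr(ord('A') + i)}" for i in range(len(shuffled))]
    key = dict(zip(labels, shuffled))
    return labels, key


labels, sealed_key = assign_blind_labels(["Shure", "Neumann"])
print(labels)  # the neutral names handed to the administrator and listener
```

If only the listener is kept away from sealed_key, the test is blind. If the administrator is kept away from it too, the test is double-blind.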
Unfortunately, all of this gets a little tougher
when we start trying to measure very small differences, or trying
to find out if there is an audible difference at all. When we
compare two microphones, the differences are generally pretty
large and at least reasonably obvious, both in terms of objective
measurement and perceived sound quality.
REAL OR IMAGINED
When we go digging for the real or imagined differences
between, for instance, 20 and 24 bits, or two audio cables, the
acoustical or electronic measured objective differences are really
very small. And with such small differences, the complexity of
the blind test begins to actually affect the results.
It boils down to this obvious but inescapable fact:
It is harder to correctly answer questions whose answers we don't
know than questions whose answers we do know. Setting aside the
obvious issues of prejudice, bias and cheating for a moment, we
will get "correct" answers more often when we "know"
the answers than when we don't.
I've seen this effect a lot when doing my Golden
Ears seminars (I publish a set of audio ear training CDs called
"Golden Ears," and often present ear-training seminars
using them). Listeners asked to identify the difference between
two versions of the same recorded excerpt will have real trouble,
at first, hearing that one version is 3 dB louder than the other.
Once they are told and shown that such a difference exists, they
find it "obvious."
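To put a number on that 3 dB, here is a small sketch of the standard decibel arithmetic. The function names are mine, chosen only for illustration.

```python
def db_to_amplitude_ratio(db):
    """Convert a level difference in dB to an amplitude (voltage) ratio."""
    return 10 ** (db / 20)


def db_to_power_ratio(db):
    """Convert a level difference in dB to a power ratio."""
    return 10 ** (db / 10)


print(round(db_to_amplitude_ratio(3.0), 3))  # 1.413
print(round(db_to_power_ratio(3.0), 3))      # 1.995
```

In other words, the "hard to hear" version has roughly 41 percent more signal amplitude, or about twice the power, of the other one.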
So, when we try to measure really small differences,
we can reasonably expect to find (and do in fact find) that blind
(and double-blind) tests yield more negative results than nonblind
tests, due both to bias effects and also to the confidence effect
of "knowing" the answer. The insight is that blind tests
are "harder" than "sighted" tests.
Interestingly, this doesn't mean that blind tests
are necessarily more rigorous than sighted tests. Because they
represent a testing context that is comparatively more difficult,
and because they represent a listening situation that is different
from the end-use situation where we wish to apply our findings,
the results from blind tests may not prove to be perfectly reproducible
or relevant.
BIAS EFFECTS
What we can be sure of is that blind tests
are not contaminated by bias effects. Usually, that benefit more
than outweighs the small error caused by the "lack of confidence"
effect.
So, for these very small differences, I personally
take the pragmatic (and lazy) approach. I use blind testing, and
figure any errors due to "loss of confidence" are so
small that they aren't worth worrying about. It generally works
pretty well and relieves me from the problem of having to account
for the small problem it causes.
Such pragmatism is widely accepted in the testing
community, because the alternative (sorting out the bias effect
present in sighted tests) is prohibitively expensive and time-consuming,
if it is even possible.
So, I recommend that you depend on blind (or better,
double-blind) testing to find out answers to questions about the
audibility of effects like 96 kHz sampling rates or 24-bit words.
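If you do run such a test, one common way to read the score is to ask how likely it is that pure guessing would have done as well. The sketch below is my own illustration, not something this column prescribes: it assumes each trial is a two-way choice (so a guess is right half the time), and the figures of 12 correct out of 16 are made up for the example.

```python
from math import comb


def chance_probability(correct, trials, p_guess=0.5):
    """Probability of getting at least `correct` right out of `trials`
    by guessing alone, assuming independent trials with a fixed chance
    `p_guess` of guessing each one correctly."""
    return sum(
        comb(trials, k) * p_guess**k * (1 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )


# Hypothetical result: a listener picks correctly on 12 of 16 blind trials.
print(round(chance_probability(12, 16), 3))  # 0.038
```

A small number suggests the listener really is hearing something. A large one means the score looks a lot like coin-flipping, which, given the "lack of confidence" effect, is not quite the same as proving there is nothing to hear.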
In the next issue, we'll look at our curious choice
of words like "amazing" to describe small differences
that we can barely hear. Thanks for listening.
Dave Moulton is an audio guy in Groton, Mass.,
who likes to get blind on weekends. And, just so you know, he
thinks Shure makes GREAT microphones. You can complain to him
about anything at dmoulton@ma.ultranet.com.