|
Observations are
made of the workings of the universe. This sounds big and
grand, but an observation can be something as simple as, "My tummy
hurts." Observations represent the taking in of information
from the world. Observations may be direct, taken in by
a person's own senses; you probably trust observations you
personally have made with your own senses. However, there are
a couple of ways that observations can be indirect. One
type of indirect observation involves the use of technology.
Machines allow you to see things you can't really see (like with
microscopes or telescopes), or hear things outside of a human
hearing range, or even detect things our senses aren't built for
(like magnetic fields). We trust the machines to give us
accurate information. The other type of indirect observation
involves using other people's observations, when they tell or write
about them. We can develop ideas based on what someone else
has seen, or heard, or detected with a machine. |
Sometimes
even our senses can be fooled: optical illusions.
Illusion types, with links.
Artistic images taken with microscopes.
The unreliability of eyewitnesses. |
|
From those observations,
explanations are formed. Such an explanation is a
hypothesis. If you see something happen and think you
know why it happened, you have formed a hypothesis. For
science, though, a good hypothesis needs to have two critical
features:
A good hypothesis should lead
to good predictions. "If this hypothesis is true,
then this should happen..." It isn't enough to ask,
"What'll happen when we do this...?" You need to
produce predictions for the second critical feature -
A good hypothesis should be testable.
This is part of the basic concept of science: it's all about
the testing of ideas. It may look like science, it might sound
like science, but if it isn't testable, it isn't really science.
Often an idea is too big or complex to test, and must be split into
testable bits. |
The myths on Mythbusters are hypotheses; the tests are
sort of scientific. |
|
Tests of hypotheses
follow particular forms. The "meat" of science is
designing the tests for hypotheses. A good test takes a
lot of skill and imagination, and carrying them out often involves
making adjustments as things veer away from the plan. Tests
may take the form of controlled experiments, which usually
take place in laboratories, and field tests, which happen out
in the world and are trickier to design. It is common to use
models as substitutes for subjects that can't really be
tested in a lab: mice may be used to see what a new drug's
toxicity levels are, or computer simulations are used for weather
and climate systems.
Tests should be focused. A test should address a
particular aspect of a question, and aspects within the question
must be clearly defined. "Is being friendly to a stranger
likely to get them to help you?" is an interesting question, but to
test it, you need some particular behavior that will be your
"friendly" term, and a clear idea of what form of "help" you'll be
looking for. Even the "stranger" part of the test is open to
definition: how different from your tester do you want your
strangers to be? Sometimes two tests that seem to have
conflicting results actually had very different definitions for what
looked like the same factor.
Tests require a comparison test. The
test above needs a comparison, maybe two - will you get help from a
stranger if you are not friendly, or are even unfriendly? If
you can, then the help doesn't seem related to your friendliness at
all. You could only know that with comparison tests. The
classic comparison in experiments is called a control,
and the classic control test duplicates the experimental test, with
the object being tested removed. The object being tested is
the experimental variable (or one of them, but
the only one we'll be discussing here). Many experiments can't
follow this classic pattern, since removing the tested object by
itself may not be possible, and many control tests just vary the
variable, or check the impact of confounding factors (defined
below). In field tests, running real controls or even good
comparisons can be impractical or impossible; this makes the
conclusions from such tests less reliable.
Tests should
address recognizable confounding factors. There are almost
always aspects of a test that might affect your results but
aren't what you're testing - those are confounding factors.
Many confounding factors are part of the experimental procedure, and
their effects on results are called artifacts. For
instance, testing a new drug requires two test groups - both get
"treated," but the controls don't get the drug in the pill or shot.
They must get the treatment, though, to control for the placebo
effect: just the act of treating people will improve the
conditions of some members of the test group, enough to show up in
the results. If both groups get treated, it's assumed that the
placebo effect is equal in the two groups. As drug tests have
developed over the last century, part of the control design involved
a single blind, where the case patients and control patients
were not told which group they were in. This made sense, since
knowing whether your treatment was "real" or not would affect the
placebo effect. Then a researcher found that if the
administering doctors know who is in which group, they can subtly
give it away to the patients, and tests became double blind,
where neither the patients nor the doctors know who is in which
group (the treatments are randomly split up before the doctors get
them). All sorts of
things can be confounding factors, and sometimes they aren't
recognized until the tests are under way. A common confounding
factor is investigator bias: researchers see what they
expect to see. A philosophical concept called postmodernism
addresses how a person's own internal influences, from personality,
upbringing, and culture, can strongly affect the way they see the
world; this can also affect how researchers form their
hypotheses, design their experiments, and see their own results.
There are often ethical limitations on what may be done,
which is a type of postmodern bias.
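The random split behind a double-blind test, described above, can be
sketched in a few lines of Python. This is only an illustration: the
function name assign_blinded, the "KIT-" labels, and the participant
IDs are all made up for the example, and real trials use more
careful schemes than an even shuffle.

```python
import random

def assign_blinded(participant_ids, seed=None):
    """Randomly split participants into two groups, hidden behind kit labels.

    Hypothetical helper for illustration; not a real trial protocol.
    """
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2

    # The key says who is really in which group. It stays sealed
    # until the test is over.
    key = {pid: ("drug" if i < half else "placebo")
           for i, pid in enumerate(shuffled)}

    # Doctors and patients only ever see a kit label. Labels are
    # numbered in ID order, so they carry no hint about the group.
    labels = {pid: f"KIT-{n:03d}"
              for n, pid in enumerate(sorted(key), start=1)}
    return key, labels

if __name__ == "__main__":
    key, labels = assign_blinded([f"P{i:02d}" for i in range(10)], seed=1)
    for pid in sorted(labels):
        print(pid, labels[pid])  # what the doctors see: no group names
```

Because the shuffle decides the groups and the labels are assigned
separately, nobody handling the kits can work out who is getting the
drug.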
Tests should be reproducible by others.
If other people can't repeat your experiment and get similar
results, then something odd is going on - you could be "steering"
your results without being aware of it, or your particular test has
an unrecognized confounding factor that changes for other testers. |
A page on experimental design (uses some slightly different terms).
Mythbusters talking-to-plants design, with control
but limited scope.
Plus, the test develops
a big confounding factor.
Experimental design in psychology experiments.
Example of confounding factor effects.
How confounding factors can affect cancer research.
Control group definition, with example.
A study comparing placebos: fake pills against fake acupuncture.
Why is the placebo effect stronger now than it used to be?
Using single and double-blind in a different context. |
|
Getting reliable results
is somewhat affected by chance - if you test a drug on one person
and it really helps them, or the one person dies, how much do you
know about the effects of your drug? Could you even say that
the drug caused the death? Good tests require repetition.
A reliable drug test should use as many subjects as possible to
reduce the impact of the "oddball" results. Sometimes it's not
a lot of subjects, but just doing the test over and over to see how
often certain results occur.
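A small simulation can show why more subjects help. The sketch below
assumes a made-up drug that truly helps 60% of people, runs many
pretend trials at several group sizes, and measures how much the
observed success rate bounces around from trial to trial. The
function names and the 60% figure are assumptions for the example,
not data from any real study.

```python
import random
import statistics

def observed_rate(true_rate, n_subjects, rng):
    """Fraction of subjects who improve in one simulated trial."""
    improved = sum(rng.random() < true_rate for _ in range(n_subjects))
    return improved / n_subjects

def spread_of_estimates(true_rate, n_subjects, n_trials=500, seed=0):
    """How much the observed rate varies across repeated trials
    (standard deviation of the per-trial rates)."""
    rng = random.Random(seed)
    rates = [observed_rate(true_rate, n_subjects, rng)
             for _ in range(n_trials)]
    return statistics.pstdev(rates)

if __name__ == "__main__":
    # Hypothetical drug that really helps 60% of people.
    for n in (1, 10, 100, 1000):
        print(n, "subjects: spread =", round(spread_of_estimates(0.6, n), 3))
```

With one subject, every trial says the drug works either 0% or 100%
of the time; as the groups get bigger, the trials agree with each
other more and more closely.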
Results are
usually statistically analyzed. Data is gathered -
something is counted or measured through the course or at the end of
the test, but how do you know what the numbers mean? There are
many ways to process the numbers, some of which are particularly
used for certain types of tests. This variety makes checking the
conclusions difficult, especially if it isn't entirely clear just
how the numbers have been crunched. This is another way that
two apparently similar tests can come to very different conclusions.
Statistics can also be used to distort results until they look like
they support the hypothesis, and sometimes the researchers don't
even know they've done it - they have just changed statistical
methods until they have gotten a "good fit" with their data.
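As one example of "processing the numbers," here is a sketch of a
permutation test, which asks: if the treatment made no difference,
how often would shuffling the group labels give a gap at least as
big as the one we saw? This is just one of the many methods
mentioned above, chosen here because it is easy to follow; the pain
scores are invented for the example.

```python
import random
import statistics

def permutation_p_value(treated, control, n_shuffles=5000, seed=0):
    """Share of random relabelings whose difference in group means is
    at least as large (in either direction) as the observed one."""
    rng = random.Random(seed)
    observed = statistics.mean(treated) - statistics.mean(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)

    at_least_as_big = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # pretend the group labels were arbitrary
        diff = (statistics.mean(pooled[:n_treated])
                - statistics.mean(pooled[n_treated:]))
        if abs(diff) >= abs(observed):
            at_least_as_big += 1
    return at_least_as_big / n_shuffles

if __name__ == "__main__":
    # Invented pain-relief scores (higher = more relief).
    drug_group = [8, 9, 7, 9, 8, 9]
    placebo_group = [3, 4, 2, 3, 4, 3]
    print("p-value:", permutation_p_value(drug_group, placebo_group))
```

A small value means chance alone rarely produces a gap that large.
Note that picking the method before looking at the data matters:
trying one analysis after another until something "fits" is exactly
the distortion described above.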
Because so much math is used in science, scientists
prefer quantitative data, data in number form, to
qualitative data, in a more subjective form. If you were
in a study of painkillers, it is likely that you would be asked to
rate your pain on some sort of defined number scale - your pain is
qualitative, but it would be converted into quantitative data. |
Trying to figure out if results are reliable, reproducible.
Introduction to experimental statistics.
|