Animal Ecology (Zoology 441 at UT Martin)
Lecture: Population Size Estimation, Sampling, Error, and Statistics
in Ecology
Goals: To learn how the scientific method is applied in ecology, what
kinds of errors can exist, how statistics are used to help deal with
random error, and why replication is important.
The Lecture
Errors in Scientific Studies fall into two major categories:
1. Random error occurs when we take a sample of a variable population,
and by chance the sample does not perfectly represent the real population.
This is always present to some degree, because populations are naturally
variable. Random errors:
-
unpredictably make estimates either too large or too small
-
have larger effects if our sample size is small, so to decrease the
impact of random error, increase sample size.
2. Systematic error occurs when something about the way we sample
results in estimates that are consistently incorrect in some direction
(either consistently too large or consistently too small). Systematic
error:
-
creates bias: directional tendency towards some incorrect conclusion
-
is caused by the sampling method, so increasing sample size does not
help get rid of it
-
is best avoided by taking random samples of whatever we are studying so
we do not introduce any bias into our sampling method
-
can be dealt with, to some extent, if we are aware of it: if we know our
results may have bias in some direction, we can evaluate them as possibly
too small or possibly too large (depending on the direction of the bias).
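The contrast between the two kinds of error can be sketched with a small
simulation. This is only an illustration under made-up assumptions: the
population of body lengths, the 5 mm bias, and the function names
estimate_mean and typical_error are all hypothetical, not part of any
real study. It repeats a "study" many times at several sample sizes and
reports the average signed error and the average absolute error.

```python
import random
import statistics

rng = random.Random(441)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 body lengths (mm), true mean near 100
population = [rng.gauss(100, 15) for _ in range(10_000)]
true_mean = statistics.mean(population)

def estimate_mean(n, bias=0.0):
    """Mean of a random sample of n individuals, plus a fixed bias
    (for example, a ruler that always reads 5 mm too long)."""
    sample = rng.sample(population, n)
    return statistics.mean(sample) + bias

def typical_error(n, bias, trials=500):
    """Repeat the study many times; return (average signed error,
    average absolute error) of the estimates."""
    errors = [estimate_mean(n, bias) - true_mean for _ in range(trials)]
    signed = statistics.mean(errors)
    absolute = statistics.mean([abs(e) for e in errors])
    return signed, absolute

for n in (5, 50, 500):
    unbiased_signed, unbiased_abs = typical_error(n, bias=0.0)
    biased_signed, biased_abs = typical_error(n, bias=5.0)
    print(f"n={n:3d}  random error only: signed={unbiased_signed:+.2f}, "
          f"absolute={unbiased_abs:.2f}  |  with bias: "
          f"signed={biased_signed:+.2f}, absolute={biased_abs:.2f}")
```

With random error alone, the absolute error should shrink as n grows
while the signed error stays near zero. With the bias added, the signed
error should stay near 5 mm no matter how large the sample is.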
Another kind of mistake that occurs in ecological studies is a blunder:
we might measure something wrong, write it down wrong, etc. Blunders
probably occur in every study, but careful work keeps them to a minimum.
Statistics in Ecology:
-
help us deal with natural variability, and the random error it causes
-
can be used to describe and summarize variable information
-
main use in ecology: determine the probability that apparent patterns in
our data are real (non-random; not just caused by random error and natural
variability)
-
convention (based on what has worked in the past): use statistics to determine
the probability that apparent trends are random. If this probability
is less than 5%, then we'll consider the pattern to be real (not random).
Otherwise, we'll consider it to be random.
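One way to see what "the probability that an apparent trend is random"
means is a permutation test, sketched below. This is a swapped-in
illustration, not necessarily the test any particular study would use.
The counts and the function name permutation_p_value are hypothetical.
The idea: if forest type does not matter, the labels "deciduous" and
"coniferous" are arbitrary, so we shuffle them many times and ask how
often chance alone gives a difference at least as large as the one we
observed.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_p_value(group_a, group_b, n_shuffles=5000, seed=441):
    """Fraction of random relabelings whose difference in means is at
    least as large (in absolute value) as the observed difference."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # reassign the category labels at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_large += 1
    return at_least_as_large / n_shuffles

# Hypothetical mammal counts from 8 deciduous and 8 coniferous plots
deciduous = [12, 15, 14, 10, 13, 16, 11, 14]
coniferous = [8, 9, 12, 7, 10, 9, 11, 8]

p = permutation_p_value(deciduous, coniferous)
verdict = "real (not random)" if p < 0.05 else "random"
print(f"p = {p:.4f}; by the 5% convention we treat the pattern as {verdict}")
```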
Definitions of ecologically important concepts in statistics:
-
p: the letter p means the probability that a trend or pattern at least as
strong as the one observed would arise purely from random chance (that is,
when nothing ecologically interesting is going on). In ecological papers,
you will see, after the results of a statistical test are presented, a
statement in parentheses that says (p<0.05) or (p>0.05). (p<0.05) means
that random chance alone would produce a trend this strong less than 5%
of the time.
-
Statistical significance (sometimes just called "significance"
or referred to as "significant") means that there is less than 5% chance
that an observed pattern is random (so it means, in words, what (p<0.05)
says in symbols).
-
statistical power is the ability to detect a trend if there really
is one. To understand why statistical power is important, consider
the following. Sometimes a real, non-random pattern can exist in
nature but our test does NOT find it to be statistically significant. Random
error has a big impact if we have a small sample size or if there is a
high level of natural variability; in these situations random error
obscures real trends so that we do not detect them as statistically
significant trends even if they are real. So we need to consider statistical
power to find out the chance that we WILL detect any trends that are really
there. We increase statistical power by decreasing random error --
remember that we do that by increasing sample size.
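Statistical power can be estimated by simulation, as in the rough sketch
below. Everything here is an assumption made for illustration: the true
difference of 3 units, the two levels of natural variability, and the
function name simulated_power. To stay within the Python standard
library it uses a simple z-test that treats the standard deviation as
known, which is a simplification of what a real analysis would do. Power
is the fraction of simulated studies in which a trend that really exists
is detected as statistically significant.

```python
import random
from statistics import NormalDist, fmean

def simulated_power(n, true_diff, sd, trials=2000, alpha=0.05, seed=441):
    """Fraction of simulated two-group studies (n per group) in which a
    real difference of true_diff is found significant at level alpha."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for 5%
    se = sd * (2 / n) ** 0.5  # standard error of a difference in means
    detected = 0
    for _ in range(trials):
        group_a = [rng.gauss(0, sd) for _ in range(n)]
        group_b = [rng.gauss(true_diff, sd) for _ in range(n)]
        z = (fmean(group_b) - fmean(group_a)) / se
        if abs(z) > z_crit:
            detected += 1
    return detected / trials

for sd in (2, 6):  # low and high natural variability
    for n in (5, 10, 20, 30):
        power = simulated_power(n, true_diff=3, sd=sd)
        print(f"variability sd={sd}, n={n:2d} per group: power = {power:.2f}")
```

Under these assumptions, power should rise with sample size and fall
when natural variability is high, which is the pattern described above.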
Sample size considerations: how large is a large enough sample and
what does it mean if we can't get a sample that large?
-
"Rule of thumb" in ecology: if we have a sample of at least 30 of whatever
we're sampling for a statistical test, we are very likely to have adequate
power to detect any trends that are present. Most of the time, a
sample of 20 will give adequate power.
-
if we have a sample of fewer than 20 of whatever we're sampling for our
statistical test, we may not have enough power to detect trends that are
really present. Interpretation of our results in this case depends
on the outcome: if we DO find a statistically significant trend, then we
conclude that there is a real trend, just as we would with a larger sample,
because we have found this trend even though low power made it hard to
find. If we do NOT find a statistically significant trend, then we
do not know if there is a trend or not -- we cannot draw firm conclusions.
Methods for determining the level of power, "power analyses," exist, and
may help evaluate our power in situations with low sample size, but the
best thing to do is try to get a sample size large enough so low power
isn't a problem.
Replication of categories being studied: Often, in ecological
studies, we want to determine whether something we observe about the animals
we are studying is associated with some factor in the environment that
may change from individual to individual, area to area, year to year, etc.
We often categorize different areas, years, or individuals as different
types. For example, we could categorize lakes as "shallow" or "deep,"
forests as "deciduous" or "coniferous," years as "wet" or "dry," and
individuals as "juvenile" or "adult." We then try to determine whether
something about the
distribution or abundance of the animal we are studying is related to the
differences among categories (for example, we could ask whether reproductive
rate of some fish species was greater in deep or shallow lakes, or whether
abundance of a species of mammal was greater in deciduous or coniferous
forests, or whether insect survival was higher in wet or dry years.)
When we do this, it is important to sample several independent situations
of each category. That is, we would have to sample several different
shallow lakes and several different deep lakes, or several deciduous forests
and several coniferous forests, or several wet and several dry years.
This is necessary because more than one factor differs between
areas or years. One deciduous forest and one coniferous forest might
also differ in aspects of climate, or human disturbance, or any of many
other factors. A difference between them could depend on ANY of these
factors, not just the one (coniferous vs. deciduous) that we're trying
to study. To conclude that a difference, or trend, is likely to depend
on the factor that we are trying to study we need to sample several areas
of each category; if we do that, and still observe a trend, it becomes
less likely that the trend is related to some other factor that happens
to be true of one particular area but is not generally associated with
that category of area. Note that this is a different problem from
getting a large sample to get statistical power. We could take 50
samples from one deep lake and 50 samples from one shallow lake and have
excellent power to detect any differences between these lakes, but we would
NOT know if those differences are present because one lake is shallow and
the other is deep -- some other difference between the lakes could cause
any trends we observe.
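The lake example can be sketched as a simulation. In this hypothetical
setup, depth has NO effect at all: every lake simply gets its own random
baseline, standing in for climate, disturbance, and other factors. The
numbers, the function names, and the use of a pooled two-sample t-test
with hard-coded critical values are all assumptions made for
illustration. The sketch compares how often each design wrongly reports
a "depth effect."

```python
import random
from statistics import fmean, variance

# Two-sided 5% critical values of Student's t for the degrees of freedom used
T_CRIT = {8: 2.306, 98: 1.984}

def t_statistic(a, b):
    """Pooled two-sample t statistic for two equal-sized groups."""
    n = len(a)
    pooled_var = (variance(a) + variance(b)) / 2
    return (fmean(a) - fmean(b)) / (2 * pooled_var / n) ** 0.5

def sample_lake(rng, n, lake_sd=5.0, within_sd=3.0):
    """n measurements from one lake with its own random baseline."""
    baseline = rng.gauss(50, lake_sd)
    return [rng.gauss(baseline, within_sd) for _ in range(n)]

def false_alarm_rates(trials=1000, seed=441):
    """Return the fraction of studies finding a 'significant' depth
    effect for (one lake per category, five lakes per category)."""
    rng = random.Random(seed)
    one_lake_each = 0
    five_lakes_each = 0
    for _ in range(trials):
        # Design A: 50 samples from one deep lake and one shallow lake
        deep, shallow = sample_lake(rng, 50), sample_lake(rng, 50)
        if abs(t_statistic(deep, shallow)) > T_CRIT[98]:
            one_lake_each += 1
        # Design B: 5 lakes per category, 10 samples per lake;
        # each lake's mean counts as a single data point
        deep_means = [fmean(sample_lake(rng, 10)) for _ in range(5)]
        shallow_means = [fmean(sample_lake(rng, 10)) for _ in range(5)]
        if abs(t_statistic(deep_means, shallow_means)) > T_CRIT[8]:
            five_lakes_each += 1
    return one_lake_each / trials, five_lakes_each / trials

rate_a, rate_b = false_alarm_rates()
print(f"one lake per category:   false 'depth effect' in {rate_a:.0%} of studies")
print(f"five lakes per category: false 'depth effect' in {rate_b:.0%} of studies")
```

Under these assumptions, the one-lake design should report a difference
in most studies even though depth does nothing, because it is really
detecting the difference between two particular lakes. The replicated
design should report one only about as often as the 5% convention allows.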
How many different independent situations (years, areas, etc.) do we
have to sample? Here the "rule of thumb" is: if we have five situations
of each type we're in great shape -- trends are very likely to reflect
the factor we're interested in. We're probably OK with at least three replicates.
If we have fewer than three replicates of each situation we start to worry
that factors other than the ones we are trying to observe may be causing
the trends we see.
When you're considering an ecological study, such as the ones we discuss
in class, the one you critique, and the ones you study for your term paper,
you need to consider these aspects of sampling and consider the problems
that might be present if the researchers were not able to meet some of
the sampling criteria that are recommended. In addition, if field
experiments are performed, you should evaluate how well controlled they
are, and if lab experiments are performed, consider whether important factors
were left out. For any ecological study, it is impossible to study
ALL the factors that might be affecting some situation -- that's why ecology
can be considered "science under the worst possible conditions" -- so you
should always think of other factors that were not tested or considered
that might result in the same trends that are described in a study.