Data Management

Exercise 1
Data Management
Ecology Laboratory
Submitted by
Gerardo, Mary Antonette
Maguslog, Justine
Salumbre, Renz
Surquia, Joseph
Introduction
Data management is essential in the study of ecology. It relies on various statistical
methods and is valuable in decision-making, especially in scientific studies that use
controlled experiments.
To gain information, data are first collected. A datum is a value that a variable
assumes. The collected data are then subjected to a statistical test that fits the
experimental design and the hypothesis involved.
Most statistical tests are designed so that error is minimized, if not completely
prevented. Their value rests on the fact that studying a whole population is rarely
feasible, so a test uses a sample to draw inferences about the population it came from.
In other words, a statistical test provides a reasoned generalization about a population
and reveals its salient features.
There are now many tools for carrying out statistical tests. With the advent of
advanced spreadsheet software, data management has become routine. Programs such as
Microsoft Excel, Apple Numbers, Minitab and SPSS, as well as some advanced open-source
spreadsheets for Linux, make it easier to follow a given statistical protocol. This does
not mean that these programs are free of complexity. An understanding of commands and
special program functions is required, and since no two programs are built the same way,
the commands vary widely. Learning them, though, will certainly prove advantageous.
The objectives of the Data Management experiment are: 1) to learn some of the
principles and techniques of data management; and 2) to become familiar with the use of
the computer and spreadsheet.
Materials and Methodology
The materials used in the experiment were bond paper, graphing paper, pencils,
erasers, a scientific calculator, and a personal computer or laptop with spreadsheet
software.
Before statistical tests were performed, a general method in hypothesis testing
was followed. First, the null and alternative hypotheses were stated; second, the level of
significance was selected; third, the critical value and the rejection region were
determined; fourth, the decision rule was stated; fifth, the test statistic was computed; and
finally, a decision whether or not to reject the null hypothesis was made.
In practice exercise I, a Student’s t-test was performed to see whether there is a
significant difference in the growth of oat coleoptiles treated with indole acetic acid
(IAA) in comparison with untreated controls. The group used Microsoft Excel, with the
first two columns of the worksheet labeled Control and IAA.
Formula 1. t-test formula
In practice exercise II, Microsoft Excel was used to plot and print an XY graph of
larval growth in Noctua pronuba using the given data.
As for practice exercise III, Microsoft Excel was also used to plot and print a
graph of “Growth in Pisces hallucigenia” based on the equation (Formula 2) $w = aL^{b}$,
where w is weight in grams and is the dependent variable, a is the coefficient of
proportionality (0.001), L is length in mm and is the independent variable, and b is 3.
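The length-weight equation can be checked with a few lines of Python. This is only a sketch: the function name `weight_from_length` is made up for illustration, and the lengths (50 mm to 200 mm in 15 mm steps) are the ones listed in the data for problem 3.

```python
def weight_from_length(length_mm, a=0.001, b=3):
    """Return weight in grams from length in mm using w = a * L**b."""
    return a * length_mm ** b

# Lengths from 50 mm to 200 mm in 15 mm steps, as in the exercise data.
lengths = list(range(50, 201, 15))
table = [(length, weight_from_length(length)) for length in lengths]

for length, weight in table:
    print(f"{length:>4} mm  {weight:>10.3f} g")
```

Because b is 3, doubling the length multiplies the weight by eight, which is why the curve steepens so quickly.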
For practice exercise IV, the length-weight equation in the previous problem was
plotted using a log-log plot.
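Taking logarithms of both sides of the length-weight equation gives log w = log a + b log L, which is a straight line. The sketch below, which assumes a = 0.001 and b = 3 as stated above and uses an ordinary least-squares fit written out by hand, illustrates that the slope of the log-log line recovers b and the intercept recovers log10(a).

```python
import math

a, b = 0.001, 3
lengths = list(range(50, 201, 15))
log_length = [math.log10(length) for length in lengths]
log_weight = [math.log10(a * length ** b) for length in lengths]

# Ordinary least-squares fit of log weight on log length.
n = len(log_length)
mean_x = sum(log_length) / n
mean_y = sum(log_weight) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(log_length, log_weight))
         / sum((x - mean_x) ** 2 for x in log_length))
intercept = mean_y - slope * mean_x

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

A slope close to 3 and an intercept close to -3 are what the equation predicts, so the log-log plot is a convenient way to read off both constants from a straight line.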
For practice exercise V, a species effort curve was made. It involves taking
samples, identifying, and counting species in a sample. The cumulative number of species
was plotted against the number of samples.
For practice exercise VI, a histogram was made with the data on Age Distribution
of Male Perch in Lake Windermere, England in the year 1966.
For practice exercise VII, a graph of population growth of Selenastrum
capricornutum was plotted using the exponential growth equation (Formula 3)
$N_t = N_0 e^{rt}$, where $N_0$ is the initial population size, r is the growth rate and t
is time. In this graph, the population size was taken as the dependent variable and time
in days as the independent variable.
In practice exercises VIII and IX, ANOVA and the Kruskal-Wallis test were
performed, respectively.
Results and Discussion
1.
Control IAA
10.1 11.8
9.8 12.7
10.3 11.2
10.2 13.0
9.9 12.9
10.5 13.2
10.7 13.5
10.0 12.6
10.7 13.9
9.8 13.9
Table 1.1 Data for problem 1
N       Control   Control²   IAA     IAA²
1       10.1      102.01     11.8    139.24
2       9.8       96.04      12.7    161.29
3       10.3      106.09     11.2    125.44
4       10.2      104.04     13.0    169
5       9.9       98.01      12.9    166.41
6       10.5      110.25     13.2    174.24
7       10.7      114.49     13.5    182.25
8       10.0      100        12.6    158.76
9       10.7      114.49     13.9    193.21
10      9.8       96.04      13.9    193.21
Total   102       1041.46    128.7   1663.05
Mean    10.2                 12.87
Table 1.2 Squares of the Control and IAA values with their respective means
Standard Deviation
Control IAA
0.343187671 0.86158768
Table 1.3 Standard Deviation of Control and IAA
Graph 1.1 Sample vs. Coleoptile length in Control (x-axis: Sample; y-axis: Coleoptile length)
Manual computation of the t-test gave a result of -8.97, while Microsoft Excel
generated 0.37589951. The conclusion was to reject the null hypothesis.
Graph 1.2 Sample vs. Coleoptile Length in IAA (panel title: Treated with IAA; x-axis: Sample; y-axis: Coleoptile length)
The t-test assesses whether the means of two groups are statistically different
from each other. This analysis is appropriate whenever the means of two groups are to be
compared.
The t-test gives the probability of observing a difference between the two means at
least as large as the one measured if chance alone were at work. It is customary to call
the difference 'significant' when this probability is less than 0.05, meaning that it is
unlikely to be caused by chance alone.
The result of the t-test shows that there is a significant difference in the growth of
coleoptiles treated with IAA and untreated controls.
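The computation can be reproduced with Python's standard library. This is a sketch under two assumptions that are not stated in the report: the pooled-variance form of the two-sample t statistic is used (the report's Formula 1 is not reproduced in the text), and the two-tailed critical value for 18 degrees of freedom at the 0.05 level is read from a standard t table. With the control mean taken as 102/10 = 10.2, the statistic comes out slightly different from the hand-computed -8.97.

```python
import math
import statistics

control = [10.1, 9.8, 10.3, 10.2, 9.9, 10.5, 10.7, 10.0, 10.7, 9.8]
iaa = [11.8, 12.7, 11.2, 13.0, 12.9, 13.2, 13.5, 12.6, 13.9, 13.9]

def pooled_t(x, y):
    """Two-sample Student's t statistic with a pooled variance estimate."""
    nx, ny = len(x), len(y)
    pooled_var = (((nx - 1) * statistics.variance(x)
                   + (ny - 1) * statistics.variance(y)) / (nx + ny - 2))
    standard_error = math.sqrt(pooled_var * (1 / nx + 1 / ny))
    return (statistics.mean(x) - statistics.mean(y)) / standard_error

t_stat = pooled_t(control, iaa)
df = len(control) + len(iaa) - 2

# Assumption: two-tailed critical value for df = 18 at alpha = 0.05,
# taken from a standard t table.
T_CRITICAL = 2.101

reject_null = abs(t_stat) > T_CRITICAL
print(f"t = {t_stat:.2f}, df = {df}, reject H0: {reject_null}")
```

The sign of t only reflects the order in which the two groups are subtracted; the decision rule compares its absolute value with the critical value.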
2.
Instar   Mean body length (mm)
1        2.52
2        4.3
3        6.62
4        10.35
5        15.14
6        23.36
7        35.90
Table 2.1 Data for problem 2
Graph 2.1 Instar vs. Mean Body Length (x-axis: Instar; y-axis: Mean body length in mm)
This graph was used to test the null hypothesis that the growth rate is linear
and does not change as the caterpillar grows. The graph shows that body length increases
with each instar, but not by a constant amount: the growth rate is not linear and it
changes as the caterpillar grows. The relationship between instar and body length is
approximately exponential, which means that even when growth seems slow in the short
run, it becomes impressively fast in the long run.
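One quick way to tell linear from exponential growth is to compare the differences and the ratios between successive instars. Under linear growth the differences would be roughly constant; under exponential growth the ratios are. The following sketch applies this check to the body lengths given for problem 2.

```python
body_length_mm = [2.52, 4.3, 6.62, 10.35, 15.14, 23.36, 35.90]

pairs = list(zip(body_length_mm, body_length_mm[1:]))
increments = [later - earlier for earlier, later in pairs]
ratios = [later / earlier for earlier, later in pairs]

for instar, (increment, ratio) in enumerate(zip(increments, ratios), start=2):
    print(f"instar {instar}: +{increment:.2f} mm, x{ratio:.2f}")
```

The increments keep growing while the ratios stay in a narrow band, which is the pattern expected when each instar is a roughly fixed multiple of the previous one.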
3.
Length in mm   Weight in grams
50             125
65             274.625
80             512
95             857.375
110            1331
125            1953.125
140            2744
155            3723.875
170            4913
185            6331.625
200            8000
Table 3.1 Data for problem 3
Graph 3.1 Length-Weight Relationship in the growth of Pisces hallucigenia (x-axis: length in mm; y-axis: weight in grams)
4.
Log length in mm   Log weight in grams
1.698970004        2.096910013
1.812913357        2.43874007
1.903089987        2.709269961
1.977723605        2.933170816
2.041392685        3.124178055
2.096910013        3.290730039
2.146128036        3.438384107
2.190331698        3.570995095
2.230448921        3.691346764
2.267171728        3.801515185
2.301029996        3.903089987
Table 4.1 Data for problem 4 in log
Graph 4.1 Log Length and Log Weight Relationship in the growth of Pisces hallucigenia (x-axis: Log length in mm; y-axis: Log weight in grams)

5.
Number of   Cumulative     Number of   Cumulative     Number of   Cumulative
samples     # of species   samples     # of species   samples     # of species
1           6              10          22             40          28
2           8              15          24             50          28
3           14             20          25             80          29
5           19             30          27             100         29
Table 5.1 Data for problem 5
Graph 5.1 Species effort curve (x-axis: # of samples; y-axis: cumulative # of species)
From the species effort curve, the group concluded the following: 1) the most common
species are found first; 2) the most dominant species control the whole population;
3) intensive sampling is necessary in order to approach the true number of species,
since the curve levels off only after many samples; and 4) the curve depends primarily
on two factors, namely the community or area being sampled and the method of trapping.
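The way a species effort curve is built from raw samples can be sketched as follows. The raw samples behind the curve are not given in the report, so the species labels and sample contents below are invented purely for illustration, and the function name `species_effort_curve` is hypothetical.

```python
def species_effort_curve(samples):
    """Return the cumulative number of distinct species after each sample."""
    seen = set()
    curve = []
    for sample in samples:
        seen.update(sample)
        curve.append(len(seen))
    return curve

# Hypothetical samples: each set holds the species identified in one sample.
samples = [
    {"A", "B", "C"},
    {"A", "B", "D"},
    {"A", "C", "E"},
    {"B", "C"},
    {"A", "F"},
]

print(species_effort_curve(samples))
```

Common species such as "A" appear in the first samples, while rare ones such as "F" only turn up later, so the curve rises steeply at first and then flattens.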
6.
Age (years)   % of male    Age (years)   % of male    Age (years)   % of male
              perch pop.                 perch pop.                 perch pop.
2             0            6             9            10            3
3             2            7             60           11            6
4             0            8             6            12            0
5             2            9             12
Table 6.1 Data for problem 6
Graph 6.1 Histogram of Age Distribution of Male Perch in Lake Windermere, England (x-axis: Age in Years; y-axis: Percentage of Male Perch Population)
Male perch are aged using scales, otoliths, spines, and opercles. The histogram shows
that 7-year-old male perch were the most abundant in Lake Windermere, England in
1966. This suggests that many of the male perch hatched 7 years earlier survived, and
that the cohorts hatched 2, 4 and 12 years earlier must have suffered heavy mortality,
such as a rampant fish kill, so that few or none of them remained.
The study uses male perch and not female perch because male perch are more
stable than female perch. Also, the age of male perch is easier to identify because
male perch exhibit standard length, weight and markings at a specific age.
A histogram was used in this study instead of a pie chart because a histogram
displays zero values, whereas a pie chart does not. The zero values are significant in
this study because they carry information, as discussed above.
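The point about zero values can be illustrated with a plain-text histogram built from the problem 6 percentages. This is a sketch only; the helper name `text_histogram` and the bar scale are arbitrary choices.

```python
# Percent of the male perch population by age, from the data for problem 6.
age_percent = {2: 0, 3: 2, 4: 0, 5: 2, 6: 9, 7: 60,
               8: 6, 9: 12, 10: 3, 11: 6, 12: 0}

def text_histogram(data, scale=2):
    """Build one bar line per age class; empty classes still get a row."""
    return [f"{age:>2} | {'#' * (pct // scale)} {pct}%"
            for age, pct in sorted(data.items())]

modal_age = max(age_percent, key=age_percent.get)
empty_classes = [age for age, pct in sorted(age_percent.items()) if pct == 0]

print("\n".join(text_histogram(age_percent)))
print("most abundant age:", modal_age)
print("age classes with no fish:", empty_classes)
```

Every age class keeps its place on the axis even when its bar is empty, whereas a pie chart would simply omit those classes.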
7.
$N_0 = (5382)(2.7)^{(1.5)(0)}$ = 5382 cells/mL
$N_1 = (5382)(2.7)^{(1.5)(1)}$ = 23878 cells/mL
$N_2 = (5382)(2.7)^{(1.5)(2)}$ = 105934 cells/mL
$N_3 = (5382)(2.7)^{(1.5)(3)}$ = 469981 cells/mL
Time (days)   Population size (cells/mL)
0             5382
1             23878
2             105934
3             469981
Table 7.1 Data for problem 7
Graph 7.1 Population Growth of Selenastrum capricornutum (x-axis: Time in Days; y-axis: Population Size)
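The population sizes for problem 7 can be recomputed with a short script. It is a sketch that assumes the computations used an initial density of 5382 cells/mL, a growth rate of 1.5 per day, and e rounded to 2.7, which is what the reported figures are consistent with. The `base` parameter is an illustrative addition so that the effect of that rounding can be seen.

```python
import math

def population(n0, r, t, base=math.e):
    """Exponential growth N_t = N_0 * base**(r * t); base defaults to e."""
    return n0 * base ** (r * t)

N0 = 5382   # initial density in cells/mL, from the exercise data
R = 1.5     # growth rate per day, as used in the computations

# Passing base=2.7 mirrors the rounding of e used in the hand computation.
rounded_e = [round(population(N0, R, t, base=2.7)) for t in range(4)]
exact_e = [round(population(N0, R, t)) for t in range(4)]

print("e taken as 2.7:", rounded_e)
print("e at full precision:", exact_e)
```

Because the exponent grows with time, the small difference between 2.7 and e compounds, so the two series drift further apart on each successive day.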

8. ANALYZE THE GIVEN DATA USING ANOVA
A B C D
78 78 79 77
88 78 73 69
87 83 79 75
88 81 75 70
83 78 77 74
82 81 78 83
81 81 80 80
80 82 78 75
80 76 83 76
89 76 84 75
Table 8.1. Data for problem set 8
One-way ANOVA: C1, C2, C3, C4
Source DF SS MS F P
Factor 3 341.9 114.0 9.01 0.000
Error 36 455.6 12.7
Total 39 797.5
S = 3.557 R-Sq = 42.87% R-Sq(adj) = 38.11%
Individual 95% CIs For Mean Based on Pooled StDev
Level N Mean StDev -+---------+---------+---------+--------
C1 10 83.600 4.033 (------*-----)
C2 10 79.400 2.503 (------*-----)
C3 10 78.600 3.307 (------*-----)
C4 10 75.400 4.142 (-----*------)
-+---------+---------+---------+--------
73.5 77.0 80.5 84.0
Pooled StDev = 3.557

Fig. 8.1. Minitab-generated analysis of variance
H0                      µ1 = µ2 = µ3 = µ4
HA                      Not all of µ1, µ2, µ3, µ4 are equal
Level of Significance   α = 0.05
Critical Value
Conclusion
Table 8.2. Summary of results
The Analysis of Variance (ANOVA) is a statistical technique that makes use of the
F-test to test a hypothesis concerning the means of more than two populations. In an
experiment, observations made under the same conditions may still exhibit a degree of
variability. In such cases, ANOVA is used to assess whether the differences among the
group means exceed this variability. In ANOVA, the total variation is accounted for and
subsequently partitioned among the factors of interest to the observer or experimenter
and the residual error.
There are assumptions made in using ANOVA. These assumptions are similar to
those of the t-test and the F statistic. The basic assumption that must first be satisfied
is that the data are normally distributed with a common variance; otherwise another
test, such as the nonparametric Kruskal-Wallis test, is performed.
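The sums of squares and the F ratio in the Minitab output can be recomputed directly from the problem 8 data. The sketch below partitions the total variation into a between-group (Factor) part and a within-group (Error) part; the function name `one_way_anova` is made up for illustration.

```python
groups = {
    "A": [78, 88, 87, 88, 83, 82, 81, 80, 80, 89],
    "B": [78, 78, 83, 81, 78, 81, 81, 82, 76, 76],
    "C": [79, 73, 79, 75, 77, 78, 80, 78, 83, 84],
    "D": [77, 69, 75, 70, 74, 83, 80, 75, 76, 75],
}

def one_way_anova(samples):
    """Return (SS_between, SS_within, F) for a list of samples."""
    all_values = [v for s in samples for v in s]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(s) / len(s) for s in samples]
    # Between-group variation: group means around the grand mean.
    ss_between = sum(len(s) * (m - grand_mean) ** 2
                     for s, m in zip(samples, means))
    # Within-group variation: observations around their own group mean.
    ss_within = sum((v - m) ** 2 for s, m in zip(samples, means) for v in s)
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_stat

ss_b, ss_w, f_stat = one_way_anova(list(groups.values()))
print(f"SS factor = {ss_b:.1f}, SS error = {ss_w:.1f}, F = {f_stat:.2f}")
```

The degrees of freedom are 3 for the factor (four groups minus one) and 36 for the error (40 observations minus four groups), matching the DF column of the output.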
9. ANALYZE THE GIVEN DATA USING KRUSKAL-WALLIS TEST
A B C D
78 78 79 77
88 78 73 69
87 83 79 75
88 81 75 70
83 78 77 74
82 81 78 83
81 81 80 80
80 82 78 75
80 76 83 76
89 76 84 75
Table 9.1. Data for problem set 9
A    Rank    B    Rank    C    Rank    D    Rank
78   16.5    78   16.5    79   20.5    77   12.5
88   38.5    78   16.5    73   3       69   1
87   37      83   33.5    79   20.5    75   6.5
88   38.5    81   27.5    75   6.5     70   2
83   33.5    78   16.5    77   12.5    74   4
82   30.5    81   27.5    78   16.5    83   33.5
81   27.5    81   27.5    80   23.5    80   23.5
80   23.5    82   30.5    78   16.5    75   6.5
80   23.5    76   10      83   33.5    76   10
89   40      76   10      84   36      75   6.5
Table 9.2 Data and ranking
Σ(Ti²/ni) = 18909.40, where Ti is the sum of the ranks in group i and ni is its size
H = 15.36
H0                      µ1 = µ2 = µ3 = µ4
HA                      Not all of µ1, µ2, µ3, µ4 are equal
Level of Significance   α = 0.05
Critical Value          7.81
Conclusion              Reject null hypothesis
Table 9.3. Summary of results.
The Kruskal-Wallis test is a nonparametric test serving as an alternative to
ANOVA. This test is used to detect differences in location among more than two
population distributions based on independent random sampling. In this respect, the
Kruskal-Wallis test is similar to ANOVA, except that it is appropriate when the data
cannot be assumed to have a normal distribution and/or a problem with
heteroscedasticity arises.
This test is commonly performed when there is one attribute variable and one
measurement variable, and the measurement variable does not meet the normality
assumption of ANOVA. If the original data set actually consists of one attribute variable
and one ranked variable, an ANOVA cannot be performed.
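The ranking and the H statistic can be reproduced with the sketch below. It pools the four groups, gives tied values the average of the ranks they occupy, and computes H without a tie correction, which is the form that matches the reported value. The function names are illustrative, not taken from any library.

```python
groups = [
    [78, 88, 87, 88, 83, 82, 81, 80, 80, 89],  # A
    [78, 78, 83, 81, 78, 81, 81, 82, 76, 76],  # B
    [79, 73, 79, 75, 77, 78, 80, 78, 83, 84],  # C
    [77, 69, 75, 70, 74, 83, 80, 75, 76, 75],  # D
]

def average_ranks(values):
    """Map each value to its rank, giving ties their mean rank."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # Sorted positions i+1 .. j share the average rank.
        rank_of[ordered[i]] = (i + 1 + j) / 2
        i = j
    return rank_of

def kruskal_wallis(samples):
    """Return (H, rank sums, sum of T_i**2 / n_i), with no tie correction."""
    pooled = [v for s in samples for v in s]
    n = len(pooled)
    rank_of = average_ranks(pooled)
    rank_sums = [sum(rank_of[v] for v in s) for s in samples]
    sum_term = sum(t ** 2 / len(s) for t, s in zip(rank_sums, samples))
    h = 12 / (n * (n + 1)) * sum_term - 3 * (n + 1)
    return h, rank_sums, sum_term

h_stat, rank_sums, sum_term = kruskal_wallis(groups)
CRITICAL_VALUE = 7.81  # critical value reported in the summary table
print(f"rank sums = {rank_sums}")
print(f"H = {h_stat:.2f}, reject H0: {h_stat > CRITICAL_VALUE}")
```

Statistical packages often apply a correction for ties, which would give a slightly larger H; the uncorrected form is kept here so that the result can be compared with the hand computation.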
Conclusion
Data management is necessary for the interpretation of data. There are many
methods of analyzing data, such as the Student’s t-test, Analysis of Variance and the
Kruskal-Wallis test. There are many criteria against which the data can be checked so
that an appropriate test can be chosen. There are also various ways of transforming the
data without compromising the integrity of the collected data, and many ways of
representing the results, such as a histogram, a scatter plot or a simple line graph.
There are many software packages specialized for statistical data analysis. One of
the most common is Microsoft Excel. Other programs such as SPSS and Minitab are much
more sophisticated in that they are dedicated to statistics only.
