
Jennifer Gonzales

Math 2040-001
November 24, 2015
Statistical Report to determine if there is a correlation between the age of
a person and the number of animals they currently own

Purpose of the Study


As a person gets older, they are pressed with a larger workload or they may want to
take on additional responsibilities in an effort to expand their impact on the world. Is
this also the case when it comes to caring for an animal? Pets and animals do
require discipline and responsibility after all. The purpose of this study was to
determine if it is possible for a correlation to exist between the age of a person and
how many animals they currently own. I chose this study because I recently
lost a pet and was curious to see whether people acquire more animals as they age.

Study Design
Our question was tested by taking a random sample of 50 individuals from various
locations and backgrounds. These individuals were asked their age and how many
animals they currently owned. To keep the sample random, an online survey was
created. To put the participants at ease, the responses were kept anonymous. The
survey was posted to Facebook, shared by a few friends, and reached respondents
around the country. My two partners also collected responses in person by
asking co-workers. We received a total of 138 responses. To introduce further
randomness, we shuffled the responses in an Excel spreadsheet, numbered them,
and then used the =RANDBETWEEN(1,138) function to select the 50 responses
required for this study.
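The selection step can be sketched in Python as follows. This is only an illustration under assumptions, not the spreadsheet procedure itself: the function name `draw_sample` and the seed value are hypothetical. One difference is worth noting: a =RANDBETWEEN(1,138) formula can return the same number more than once, whereas `random.sample` draws without replacement, so all 50 selected response numbers are distinct.

```python
import random


def draw_sample(total_responses: int, sample_size: int, seed=None) -> list[int]:
    """Pick `sample_size` distinct response numbers from 1..total_responses."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    population = range(1, total_responses + 1)
    # random.sample draws without replacement, so no response is picked twice.
    return sorted(rng.sample(population, sample_size))


# Hypothetical seed; 138 responses and a sample of 50 come from the study design.
selected = draw_sample(total_responses=138, sample_size=50, seed=2015)
print(selected)
```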

Data, Calculations and Graphs


Data Collected; n_total = 50

(-- marks a y value that is not legible in the table.)

  n   Age (x)   # of pets (y)  |   n   Age (x)   # of pets (y)  |   n   Age (x)   # of pets (y)
  1     15          1          |  18     26          3          |  35     33          1
  2     17         --          |  19     27         --          |  36     34         --
  3     17         --          |  20     27         --          |  37     35         --
  4     18         --          |  21     28         --          |  38     35         --
  5     19         --          |  22     28         --          |  39     35         --
  6     19         --          |  23     28         --          |  40     37         --
  7     20         --          |  24     29         --          |  41     37         45
  8     20         --          |  25     29         --          |  42     38         --
  9     20         --          |  26     30         --          |  43     39         12
 10     20         --          |  27     30         --          |  44     40         --
 11     21         --          |  28     30         --          |  45     44         --
 12     23         --          |  29     30         --          |  46     45         --
 13     24         --          |  30     31         --          |  47     46         --
 14     24         --          |  31     31         --          |  48     50         --
 15     25         --          |  32     32         --          |  49     57         --
 16     26         --          |  33     32         --          |  50     67         --
 17     26         --          |  34     33         --          |

First Variable, x = age of the person sampled

Statistic                       Value
Mean                            30.54
Sample Standard Deviation       10.379
Minimum, Q0                     15
Quartile 1, Q1                  24
Median, Q2                      29.5
Quartile 3, Q3                  35
Maximum, Q4                     67
Range, Q4 - Q0                  52
Mode                            20
Interquartile Range             11
Lower Fence                     7.5
Upper Fence                     51.5

Formulas used:

$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$

$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$

$IQR = Q_3 - Q_1$

$LF = Q_1 - 1.5(IQR)$

$UF = Q_3 + 1.5(IQR)$


Outliers include age 57 and age 67.
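The summary statistics for age can be reproduced with a short Python sketch. The `ages` list is read off the data table above. The quartile helper is an assumption about method: it takes Q1 and Q3 as the medians of the lower and upper halves of the sorted data, which reproduces the reported values of 24 and 35, though other quartile conventions can give slightly different numbers. The name `median_of_halves_quartiles` is hypothetical.

```python
import statistics

# Ages of the 50 sampled individuals, as listed in the data table.
ages = [
    15, 17, 17, 18, 19, 19, 20, 20, 20, 20,
    21, 23, 24, 24, 25, 26, 26, 26, 27, 27,
    28, 28, 28, 29, 29, 30, 30, 30, 30, 31,
    31, 32, 32, 33, 33, 34, 35, 35, 35, 37,
    37, 38, 39, 40, 44, 45, 46, 50, 57, 67,
]


def median_of_halves_quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]   # lower half of the sorted data
    upper = data[-half:]  # upper half (skips the middle value when n is odd)
    return statistics.median(lower), statistics.median(upper)


mean_age = statistics.mean(ages)
sd_age = statistics.stdev(ages)      # sample standard deviation (n - 1)
median_age = statistics.median(ages)
q1, q3 = median_of_halves_quartiles(ages)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [a for a in ages if a < lower_fence or a > upper_fence]

print(f"mean = {mean_age:.2f}, s = {sd_age:.3f}, median = {median_age}")
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
print(f"fences = ({lower_fence}, {upper_fence}), outliers = {outliers}")
```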

Second Variable, y = number of pets the person currently owns


Statistic                       Value
Mean                            2.76
Sample Standard Deviation       6.495
Minimum, Q0                     0
Quartile 1, Q1                  1
Median, Q2                      1
Quartile 3, Q3                  2
Maximum, Q4                     45
Range, Q4 - Q0                  45
Mode                            1
Interquartile Range             1
Lower Fence                     -0.5
Upper Fence                     3.5

Formulas used:

$\bar{y} = \frac{y_1 + y_2 + \cdots + y_n}{n}$

$s = \sqrt{\frac{\sum (y_i - \bar{y})^2}{n - 1}}$

$IQR = Q_3 - Q_1$

$LF = Q_1 - 1.5(IQR)$

$UF = Q_3 + 1.5(IQR)$


Outliers include 4 pets, 8 pets, 12 pets, and 45 pets.

The linear correlation coefficient is a measure of the strength and direction of the
linear relation between two quantitative variables. The sample correlation coefficient
is represented by r and is calculated using the following formula:

$$r = \frac{\sum \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right)}{n - 1}$$

where $\bar{x}$ is the sample mean of the first variable, $s_x$ is the sample standard
deviation of the first variable, $\bar{y}$ is the sample mean of the second variable, $s_y$ is
the sample standard deviation of the second variable, and n is the number of
individuals in the sample. The linear correlation coefficient is always between -1
and 1, inclusive. The closer r is to +1, the stronger the evidence of a positive
association between the two variables. The closer r is to -1, the stronger the
evidence of a negative association between the two variables. Using the data from the
charts above, this formula can be solved for our particular sample.

$$r = \frac{432.48}{(10.379)(6.495)(50 - 1)}$$

$$r = \frac{432.48}{3303.169}$$

$$r = 0.1309$$
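The arithmetic above can be checked with a small Python sketch. It assumes that 432.48 is the sum of the cross-deviations, $\sum (x_i - \bar{x})(y_i - \bar{y})$, which is what the formula implies once $s_x$ and $s_y$ are factored out of the sum. The function names and the three-point toy data set are hypothetical and are used only to show the general formula at work.

```python
import statistics


def r_from_summary(sum_cross_dev, s_x, s_y, n):
    """r from the summed cross-deviations and the two sample standard deviations."""
    return sum_cross_dev / (s_x * s_y * (n - 1))


def pearson_r(xs, ys):
    """r computed from raw data as the sum of z-score products over (n - 1)."""
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    s_x, s_y = statistics.stdev(xs), statistics.stdev(ys)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return total / (len(xs) - 1)


# Values reported in this study.
r = r_from_summary(432.48, s_x=10.379, s_y=6.495, n=50)
print(round(r, 4))

# Toy data (not from the study): a perfectly linear relation gives r close to 1.
toy_r = pearson_r([1, 2, 3], [2, 4, 6])
print(toy_r)
```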
Recall that a linear function is written in slope-intercept form as y = mx + b,
where m is the slope and b is the y-intercept. The same form applies to
correlations, with the slope found from the r value. With r solved, we can determine
the line of regression, the equation for which is given by:

$$\hat{y} = b_1 x + b_0$$

where $b_1 = r \frac{s_y}{s_x}$ represents the slope of the line and
$b_0 = \bar{y} - b_1 \bar{x}$ is the y-intercept of this line.

[Figure: Linear Correlation. Scatter plot of # of pets/animals (vertical axis, 0 to 45)
against person's age (horizontal axis, 10 to 70), with trend line f(x) = 0.08x + 0.26.]

Our line of regression is represented by:

$$\hat{y} = 0.0819x + 0.2577$$
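The regression coefficients can be recomputed from the summary statistics with the sketch below. It uses the slope relation $b_1 = r(s_y/s_x)$ together with the usual least-squares intercept $b_0 = \bar{y} - b_1\bar{x}$. The function names are hypothetical, and because the inputs are rounded, the intercept agrees with the reported 0.2577 only to about three decimal places.

```python
def regression_line(r, s_x, s_y, x_bar, y_bar):
    """Return (slope, intercept) of the least-squares line from summary statistics."""
    b1 = r * s_y / s_x        # slope
    b0 = y_bar - b1 * x_bar   # intercept: the line passes through (x_bar, y_bar)
    return b1, b0


# Summary values reported in this study.
r = 432.48 / (10.379 * 6.495 * (50 - 1))
b1, b0 = regression_line(r, s_x=10.379, s_y=6.495, x_bar=30.54, y_bar=2.76)


def predict_pets(age):
    """Predicted number of pets for a given age under the fitted line."""
    return b1 * age + b0


print(f"slope = {b1:.4f}, intercept = {b0:.4f}")
print(f"predicted pets at the mean age: {predict_pets(30.54):.2f}")
```

Evaluating the line at the mean age returns the mean number of pets, which is a quick consistency check on the two coefficients.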

Error Analysis
There are many variables which could have skewed our results. Since we asked for
the current number of animals or pets a person has, the values could be inflated
if the respondent were a breeder, for example if their dog had just had puppies
that they do not plan to keep long term. Because the poll was shared on social
media, it is possible that not all age groups were well represented. I was the
original poster, so a good share of the responses came from friends who were
around the same age. Because most respondents own very few pets, our lower fence
for the number of pets is a negative number. In reality, this is an impossible
value since we cannot own a negative number of pets. I also feel that it is
important to consider the size of a person's family. It may be possible that a
larger family would support a larger number of animals.

Data Analysis
When reviewing the distribution charts of each variable, we can see that both the
age group and the number of pets owned histograms are right-tailed, or skewed right.
As noted in the error analysis, social media may not have been the ideal method of
data collection, so seeing the age group histogram skewed is not surprising. It shows
that the median age was 29.5 years old with a range of 20 years for a majority of the
sample. Upon review of the distribution chart of the second variable, number of
pets, it was difficult to make the distribution appear normal without exceeding a pet
grouping of 4. Since a majority of the responses were in the lower range, the first
column alone represents 92% of the sample. The linear correlation graph shows a
slight positive slope, and the outliers are clearly visible.
The r value is what is important in our study. It indicates whether there is a
linear relationship between the two variables. The r value turned out to be 0.1309,
as shown in the calculations. Because the value is positive, any relationship
between the two variables is positive. However, the relationship is very weak, if
it exists at all. Consider this: is it possible that I received an unusual sample
which made a relationship appear to exist when it really does not? To determine
this, we need to compare the absolute value of r from our sample with a critical r
value. The critical value helps us decide whether the sample collected is just a
chance occurrence or a true representation of the relationship between the two
variables. Our textbook does not include critical values for samples as large as 50.
With a bit of online searching, I could find only two values: 0.312 for a sample
size of 40 and 0.254 for a sample size of 60. The critical value for a sample of 50
should lie between these two numbers. When the absolute value of r is below the
critical value, there is no statistically significant linear correlation. Whichever
sample size I select, my r value of 0.1309 falls well below the critical value.
Therefore, we can conclude that there is no linear correlation that is
statistically significant.
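The comparison against the critical values can be written out as a brief Python sketch. The decision rule assumed here is the usual one: a linear correlation is treated as significant only when the absolute value of r exceeds the critical value. The function name `is_significant` is hypothetical, and the two critical values are the ones quoted above.

```python
def is_significant(r, critical_value):
    """True when |r| exceeds the critical value for the sample size."""
    return abs(r) > critical_value


r = 0.1309
# Critical values quoted in the text, keyed by sample size.
critical_values = {40: 0.312, 60: 0.254}

results = {n: is_significant(r, cv) for n, cv in critical_values.items()}
for n, significant in results.items():
    verdict = "significant" if significant else "not significant"
    print(f"n = {n}: |r| = {abs(r)} vs {critical_values[n]} -> {verdict}")
```

Since r is below both bracketing values, the conclusion does not depend on which of the two sample sizes is used as the reference.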
Conclusions
The purpose of this study was to determine if it is possible for a correlation to exist
between the age of a person and how many animals they currently own. Based on
the very low value I obtained for r, we can see that there is no statistically
significant linear correlation. A linear correlation coefficient close to 0 does not
imply that there is no relation at all, just no linear relation. It is possible there
may be another type of relationship between these two variables. If I were to
re-collect data, I would try an alternative method and try to represent both farmers
and non-farmers alike. Taken literally, the regression line for my sample says that
for every 1 year a person ages, they obtain an additional 0.0819 of a pet. In other
words, a person gains one pet or animal roughly every 12 years (1/0.0819 is about
12.2). I could see this being realistic because something may happen to one's
pet/animal due to old age (or by eating it if it is a farm animal) and the owner
ends up replacing it with another.
