Applied Multivariate Statistical Analysis

Chang xinfeng
Department of Statistics
What Is Multivariate Analysis?

Statistical methodology to analyze data with measurements on many variables.

[Figure: a process that turns an input into an output, acted on by controllable factors and uncontrollable factors.]
Why Learn Multivariate Analysis?

Any explanation of a social or physical phenomenon must be tested by gathering and analyzing data.
The complexity of most phenomena requires an investigator to collect observations on many different variables.
This course is concerned with statistical methods designed to elicit information from these kinds of data sets.
Course Outline
Introduction
Matrix Algebra and Random Vectors
Sample Geometry and Random Samples
Multivariate Normal Distribution
Inference about a Mean Vector
Comparison of Several Multivariate Means
Multivariate Linear Regression Models
Principal Components
Factor Analysis and Inference for Structured Covariance Matrices
Canonical Correlation Analysis
Discrimination and Classification
Clustering, Distance Methods, and Ordination


Textbook

R. A. Johnson and D. W. Wichern, Applied Multivariate Statistical Analysis, 6th ed., Prentice Hall, 2006.
References
J. F. Hair, Jr., B. Black, B. Babin, R. E. Anderson, and R. L. Tatham, Multivariate Data Analysis, 6th ed., Prentice Hall, 2006.
D. C. Montgomery, Design and Analysis of Experiments, 6th ed., John Wiley, 2005.
T. W. Anderson, An Introduction to Multivariate Statistical Analysis, 3rd ed., John Wiley, 2003.

Major Uses of Multivariate Analysis

Data reduction or structural simplification
Sorting and grouping
Investigation of the dependence among variables
Prediction
Hypothesis construction and testing

What is a Multivariate Data Set?

Univariate statistics is concerned with a single random scalar variable Y.
In multivariate analysis, we are concerned with the joint analysis of multiple dependent variables.
These variables can be represented using matrices and vectors.

Example

\[
X = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
\]
Example

Suppose that the ith person in the sample has height = 175 cm, forearm length = 25.5 cm and foot length = 27 cm. In vector notation these observed data could be written as:

\[
X_i = \begin{pmatrix} x_{i1} \\ x_{i2} \\ x_{i3} \end{pmatrix} = \begin{pmatrix} 175 \\ 25.5 \\ 27.0 \end{pmatrix}
\]
Definitions of Matrix and Vector
A matrix is a two-dimensional array of numbers or formulas.
A vector is a matrix with either only one column
or only one row.
A column vector has only one column.
A row vector has only one row.

The dimension of a matrix is expressed as number of rows × number of columns.
For instance, a matrix with 10 rows and 3 columns is said to be a 10 × 3 matrix.
The vectors written in Example 1 above are 3 × 1 matrices.

A square matrix is one for which the numbers of rows and columns are the same.
For instance, a 4 × 4 matrix is a square matrix.

Example
Example 2: A selection of four receipts from a university bookstore was obtained in order to investigate the nature of book sales. Each receipt provided, among other things, the number of books sold and the total amount of each sale. Let the first variable be total dollar sales and the second variable be number of books sold. Then we can regard the corresponding numbers on the receipts as four measurements on two variables. Suppose the data, in tabular form, are

Variable 1 (dollar sales): 42 52 48 58
Variable 2 (number of books): 4 5 4 3

A data matrix

\[
X = \begin{pmatrix} 42 & 4 \\ 52 & 5 \\ 48 & 4 \\ 58 & 3 \end{pmatrix}
\]

X has four rows and two columns.

The Data Matrix in Multivariate Problems

Usually the observed data are represented by a matrix in which the rows are observations and the columns are variables.
The usual notation is n = the number of observed units (people, animals, companies, etc.) and p = the number of variables measured on each unit.

Example

Example 3: Suppose that we have scores for n = 6 college students who have taken the verbal and the science subtests of the College Qualification Test (CQT). We have p = 2 variables: (1) the verbal score and (2) the science score for each student. The data matrix is the following 6 × 2 matrix:

\[
X = \begin{pmatrix} 41 & 26 \\ 39 & 26 \\ 53 & 21 \\ 67 & 33 \\ 61 & 27 \\ 67 & 29 \end{pmatrix}
\]

Each row gives data for a student in the sample.
To repeat, the rows are observations and the columns are variables.
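
The rows-as-observations convention can be sketched in a few lines of Python. This is only an illustration: storing the matrix as a list of rows is an assumption, and the helper names row and column are hypothetical, not part of the course material.

```python
# Data matrix from Example 3: each row is one student (an observation),
# each column is one variable (verbal score, science score).
X = [
    [41, 26],
    [39, 26],
    [53, 21],
    [67, 33],
    [61, 27],
    [67, 29],
]

n = len(X)       # number of observed units (rows)
p = len(X[0])    # number of variables (columns)


def row(matrix, i):
    """Return observation i (0-based) as a list of its p measurements."""
    return list(matrix[i])


def column(matrix, j):
    """Return variable j (0-based) as a list of its n measurements."""
    return [r[j] for r in matrix]


print(f"X is a {n} x {p} matrix")
print("first student:", row(X, 0))
print("verbal scores:", column(X, 0))
```

Reading a row gives everything measured on one student, while reading a column gives one variable across the whole sample.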
Notation notes

\[
x_i = \begin{pmatrix} x_{i1} \\ x_{i2} \end{pmatrix}
\]
Note
Considering data in the form of a matrix facilitates the exposition of the subject matter and allows numerical calculations to be performed in an orderly and efficient manner.
The efficiency is twofold, with gains in both:
(1) describing numerical calculations as operations on matrices;
(2) implementing the calculations on computers, which now use many languages and statistical packages to perform array operations.
Transpose of a Matrix

Symmetric Matrices

Adding Two Matrices

Two matrices may be added if and only if they have the same dimensions (the same number of rows and the same number of columns as each other). To add two matrices, add corresponding elements (in terms of location).
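
A minimal Python sketch of this rule, assuming matrices are stored as lists of rows. The function name add_matrices and the sample values are illustrative only.

```python
def add_matrices(a, b):
    """Add two matrices given as lists of rows.

    Raises ValueError unless both matrices have the same number of rows
    and the same number of columns.
    """
    same_rows = len(a) == len(b)
    same_cols = same_rows and all(len(ra) == len(rb) for ra, rb in zip(a, b))
    if not same_cols:
        raise ValueError("matrices must have the same dimensions")
    # Add the elements that sit in the same location.
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


A = [[1, 2], [3, 4]]        # illustrative 2 x 2 matrices
B = [[10, 20], [30, 40]]
print(add_matrices(A, B))
```

Calling add_matrices on a 2 × 2 matrix and a 1 × 3 matrix raises an error, which mirrors the "if and only if" condition.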

Multiplying a Matrix by a Scalar

Definition: The word scalar is a synonym for a numerical constant. (In matrix terms, a scalar is a matrix with one row and one column.)
To multiply a matrix by a scalar, multiply each element in the matrix by the scalar.
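
As a small illustration in Python, again assuming a list-of-rows representation and a hypothetical helper name scale_matrix:

```python
def scale_matrix(c, a):
    """Multiply every element of matrix a (a list of rows) by the scalar c."""
    return [[c * x for x in row] for row in a]


A = [[42, 4], [52, 5]]     # illustrative values
print(scale_matrix(2, A))
```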

Multiplication of Matrices

The Identity Matrix

Definition: An identity matrix is a square matrix that has the value 1 in each main diagonal position (from upper left to bottom right) and has the value 0 in all other locations.

For example, the 3 × 3 identity matrix is

\[
I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]
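
The identity matrix plays the role of the number 1 in matrix multiplication. The sketch below builds it and checks that property on a small matrix. The helper names identity and matmul are illustrative, and matmul uses the standard row-by-column product, which is an assumption here rather than something quoted from these slides.

```python
def identity(n):
    """Return the n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Row-by-column product of a (m x k) and b (k x q), both lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("columns of a must equal rows of b")
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


X = [[42, 4], [52, 5], [48, 4]]   # illustrative 3 x 2 matrix
print(identity(3))
print(matmul(identity(3), X) == X)   # multiplying by I on the left leaves X unchanged
print(matmul(X, identity(2)) == X)   # and so does multiplying by I on the right
```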

Matrix Inverse

The calculation of an inverse for large matrices is a laborious process that we'll leave to the computer. For 2 × 2 matrices, however, the formula is relatively simple.

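For

\[
A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\]

the inverse is

\[
A^{-1} = \frac{1}{a_{11}a_{22} - a_{12}a_{21}} \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix}
\]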

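For

\[
A = \begin{pmatrix} 10 & 6 \\ 8 & 5 \end{pmatrix}
\]

the inverse is

\[
A^{-1} = \frac{1}{10 \cdot 5 - 6 \cdot 8} \begin{pmatrix} 5 & -6 \\ -8 & 10 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 5 & -6 \\ -8 & 10 \end{pmatrix} = \begin{pmatrix} 2.5 & -3 \\ -4 & 5 \end{pmatrix}
\]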
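
The 2 × 2 formula is short enough to check by machine. The following Python sketch applies it to the matrix from this example and multiplies the result back against A. The function name inverse_2x2 is hypothetical, and the singular-matrix check is an added safeguard that the slides do not discuss.

```python
def inverse_2x2(a):
    """Invert a 2 x 2 matrix given as [[a11, a12], [a21, a22]]."""
    (a11, a12), (a21, a22) = a
    det = a11 * a22 - a12 * a21          # denominator of the formula
    if det == 0:
        raise ValueError("matrix is singular; no inverse exists")
    return [[a22 / det, -a12 / det],
            [-a21 / det, a11 / det]]


A = [[10, 6], [8, 5]]
A_inv = inverse_2x2(A)
print(A_inv)                             # [[2.5, -3.0], [-4.0, 5.0]]

# Multiplying A by its inverse should give the 2 x 2 identity matrix.
product = [
    [sum(A[i][k] * A_inv[k][j] for k in range(2)) for j in range(2)]
    for i in range(2)
]
print(product)
```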

Orthogonal matrix

Determinant

Eigenvalues and eigenvectors

Definition: If we have a p × p matrix A, we are going to have p eigenvalues, λ1, λ2, ..., λp. They are obtained by solving the equation below:

\[
|A - \lambda I| = 0
\]

Definition: The corresponding eigenvectors e1, e2, ..., ep are obtained by solving the expression below:

\[
(A - \lambda_j I) e_j = 0
\]

Note: This system does not have a unique solution, since any scalar multiple of a solution is also a solution. So, to obtain a unique solution we will often require that ej transposed times ej is equal to 1. Or, if you like, the sum of the squared elements of ej is equal to 1.

\[
e_j' e_j = 1
\]

Example: To illustrate these calculations, consider the 2 × 2 matrix A shown below:

\[
A = \begin{pmatrix} 1 & 5 \\ 5 & 1 \end{pmatrix}
\]

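Then, using the definition of the eigenvalues, we must calculate the determinant of A minus λ times the identity matrix and set it equal to zero.

\[
|A - \lambda I| = \left| \begin{pmatrix} 1 & 5 \\ 5 & 1 \end{pmatrix} - \lambda \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \right| = 0
\]

\[
\begin{vmatrix} 1-\lambda & 5 \\ 5 & 1-\lambda \end{vmatrix} = (1-\lambda)(1-\lambda) - (5)(5) = 0
\]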
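
Expanding the determinant condition makes the two roots easy to read off. The intermediate factorization is not written out in the slides and is filled in here as a worked step:

```latex
\[
(1-\lambda)^2 - 25 = \lambda^2 - 2\lambda - 24 = (\lambda - 6)(\lambda + 4) = 0
\]
```

The product is zero only when λ = 6 or λ = −4.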
Solving this equation gives the following solutions:

\[
\lambda_1 = 6, \qquad \lambda_2 = -4
\]

Next, to obtain the corresponding eigenvectors, we must solve the system of equations below:

\[
(A - \lambda_j I) e_j = 0
\]

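\[
\begin{cases}
(1-\lambda_j)\, e_{j1} + 5\, e_{j2} = 0 \\
5\, e_{j1} + (1-\lambda_j)\, e_{j2} = 0 \\
e_{j1}^2 + e_{j2}^2 = 1
\end{cases}
\]

we get

\[
e_1 = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{pmatrix}, \qquad
e_2 = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{pmatrix}
\]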
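
The whole calculation for this example can be reproduced with a short Python sketch. It is restricted, by assumption, to symmetric 2 × 2 matrices with a nonzero off-diagonal entry, and the function name eigen_symmetric_2x2 is hypothetical. It solves the characteristic equation with the quadratic formula and then scales each eigenvector to unit length.

```python
import math


def eigen_symmetric_2x2(a):
    """Eigenvalues and unit-length eigenvectors of a symmetric 2 x 2 matrix.

    Assumes a = [[a11, a12], [a12, a22]] with a12 != 0.
    Returns a list of (eigenvalue, (e1, e2)) pairs, largest eigenvalue first.
    """
    (a11, a12), (a21, a22) = a
    if a12 != a21 or a12 == 0:
        raise ValueError("expects a symmetric matrix with a nonzero off-diagonal entry")
    # |A - lambda I| = lambda^2 - (a11 + a22) lambda + (a11 a22 - a12 a21) = 0
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    root = math.sqrt(trace * trace - 4 * det)
    eigenvalues = [(trace + root) / 2, (trace - root) / 2]
    pairs = []
    for lam in eigenvalues:
        # The first row of (A - lambda I) e = 0 reads (a11 - lam) e1 + a12 e2 = 0,
        # so e is proportional to (a12, lam - a11).
        v = (a12, lam - a11)
        length = math.hypot(*v)          # scale so that e'e = 1
        pairs.append((lam, (v[0] / length, v[1] / length)))
    return pairs


A = [[1, 5], [5, 1]]
for lam, e in eigen_symmetric_2x2(A):
    print(lam, e)
```

The sign of an eigenvector is arbitrary, so a routine may return the negative of the vectors shown in the slides and still be correct.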

Homework
P103: 2.2, 2.3, 2.5, 2.8, 2.9 (a), 2.24
