Applied Multivariate Statistical Analysis: Chang Xinfeng Department of Statistics

What Is Multivariate Analysis?
Statistical methodology to analyze data with measurements on many variables.
[Figure: a process with input and output, influenced by controllable and uncontrollable factors]
Why Learn Multivariate Analysis?
An explanation of a social or physical phenomenon must be tested by gathering and analyzing data.
The complexity of most phenomena requires an investigator to collect observations on many different variables.
This course is concerned with statistical methods
designed to elicit information from these kinds of
data sets.
Course Outline
Introduction
Matrix Algebra and Random Vectors
Sample Geometry and Random Samples
Multivariate Normal Distribution
Inference about a Mean Vector
Comparison of Several Multivariate Means
Multivariate Linear Regression Models
Principal Components
Factor Analysis and Inference for Structured Covariance Matrices
Canonical Correlation Analysis
Discrimination and Classification
Clustering, Distance Methods, and Ordination

Textbook
R. A. Johnson and D. W. Wichern, Applied Multivariate Statistical Analysis, 6th ed., Prentice Hall, 2006.
References
J. F. Hair, Jr., B. Black, B. Babin, R. E. Anderson, and R. L. Tatham, Multivariate Data Analysis, 6th ed., Prentice Hall, 2006.
D. C. Montgomery, Design and Analysis of Experiments, 6th ed., John Wiley, 2005.
T. W. Anderson, An Introduction to Multivariate Statistical Analysis, 3rd ed., John Wiley, 2003.
Major Uses of Multivariate Analysis
Data reduction or structural simplification
Sorting and grouping
Investigation of the dependence among variables
Prediction
Hypothesis construction and testing
What is a Multivariate Data Set?
Univariate statistics is concerned with a single random scalar variable Y.
In multivariate analysis, we are concerned with the joint analysis of multiple dependent variables.
These variables can be represented using matrices and vectors.
Example
$$\mathbf{X} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$$
Example
Suppose that the ith person in the sample has height = 175 cm, forearm length = 25.5 cm and foot length = 27 cm. In vector notation these observed data could be written as:
$$\mathbf{X}_i = \begin{pmatrix} x_{i1} \\ x_{i2} \\ x_{i3} \end{pmatrix} = \begin{pmatrix} 175 \\ 25.5 \\ 27.0 \end{pmatrix}$$
Definitions of Matrix and Vector
A matrix is a two-dimensional array of numbers or formulas.
A vector is a matrix with either only one column
or only one row.
A column vector has only one column.
A row vector has only one row.
The dimension of a matrix is expressed as number of rows × number of columns.
For instance, a matrix with 10 rows and 3 columns is said to be a 10 × 3 matrix.
The vectors written in Example 1 above are 3 × 1 matrices.
A square matrix is one for which the numbers of rows and columns are the same.
For instance, a 4 × 4 matrix is a square matrix.
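As a small illustration of dimensions, the sketch below stores the observation vector from the height example as an array and reads off its shape. The use of NumPy and the helper name `is_square` are assumptions made for illustration; they are not part of the course material.

```python
import numpy as np

# Observation vector from the height example (cm): a 3 x 1 column vector.
x_i = np.array([[175.0], [25.5], [27.0]])


def is_square(matrix):
    """Hypothetical helper: True when a 2-D array has as many rows as columns."""
    rows, cols = matrix.shape
    return rows == cols


print(x_i.shape)                    # (3, 1)
print(is_square(x_i))               # False
print(is_square(np.zeros((4, 4))))  # True
```

The `shape` attribute follows the same rows-by-columns convention used in the slides.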
Example
Example 2: A selection of four receipts from a university bookstore was obtained in order to investigate the nature of book sales. Each receipt provided, among other things, the number of books sold and the total amount of each sale. Let the first variable be total dollar sales and the second variable be number of books sold. Then we can regard the corresponding numbers on the receipts as four measurements on two variables. Suppose the data, in tabular form, are
Variable 1 (dollar sales): 42 52 48 58
Variable 2 (number of books): 4 5 4 3
A data matrix
$$\mathbf{X} = \begin{pmatrix} 42 & 4 \\ 52 & 5 \\ 48 & 4 \\ 58 & 3 \end{pmatrix}$$
with four rows and two columns
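One way to build this data matrix from the tabulated receipts is sketched below. NumPy and the variable names `dollar_sales` and `books_sold` are illustrative assumptions, not something the slides prescribe.

```python
import numpy as np

# The two variables as tabulated in Example 2 (one value per receipt).
dollar_sales = [42, 52, 48, 58]
books_sold = [4, 5, 4, 3]

# Stack the variables side by side so each row is one receipt
# and each column is one variable.
X = np.column_stack([dollar_sales, books_sold])

print(X.shape)  # (4, 2)
print(X[1])     # second receipt: [52  5]
```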
The Data Matrix in Multivariate Problems
Usually the observed data are represented by a matrix in which the rows are observations and the columns are variables.
The usual notation is n = the number of observed
units (people, animals, companies, etc.) and p =
number of variables measured on each unit.
Example
Example 3: Suppose that we have scores for n = 6 college students who have taken the verbal and the science subtests of the College Qualification Test (CQT). We have p = 2 variables: (1) the verbal score and (2) the science score for each student.
The data matrix is the following 6 × 2 matrix:
$$\mathbf{X} = \begin{pmatrix} 41 & 26 \\ 39 & 26 \\ 53 & 21 \\ 67 & 33 \\ 61 & 27 \\ 67 & 29 \end{pmatrix}$$
Each row gives data for a student in the sample.
To repeat: the rows are observations, the columns are variables.
Notation notes
$$\mathbf{x}_i = \begin{pmatrix} x_{i1} \\ x_{i2} \end{pmatrix}$$
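To connect this notation to the CQT data matrix of Example 3, the following sketch pulls out the observation vector for one student. NumPy and the choice of student i = 3 are assumptions made purely for illustration.

```python
import numpy as np

# CQT data matrix from Example 3: rows are students,
# columns are (verbal score, science score).
X = np.array([
    [41, 26],
    [39, 26],
    [53, 21],
    [67, 33],
    [61, 27],
    [67, 29],
])

n, p = X.shape  # n observed units, p variables

i = 3                          # illustrative choice, counting students from 1
x_i = X[i - 1].reshape(p, 1)   # observation vector x_i as a p x 1 column

print(n, p)           # 6 2
print(x_i.tolist())   # [[53], [21]]
```

Note that the slides count observations from 1, while array indices start at 0, hence `X[i - 1]`.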
Note
Considering data in the form of a matrix facilitates the exposition of the subject matter and allows numerical calculations to be performed in an orderly and efficient manner. The gains come from:
describing numerical calculations as operations on matrices;
implementing the calculations on computers, which now use many languages and statistical packages to perform array operations.
Transpose of a Matrix
Symmetric Matrices
Adding Two Matrices
Two matrices may be added if and only if they have the same dimensions (the same number of rows and the same number of columns as each other). To add two matrices, add the elements in corresponding locations.
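The rule above can be sketched as follows. The function name `add_matrices` and the example values are hypothetical. The explicit shape check is a deliberate choice: NumPy's broadcasting would otherwise accept some pairs of arrays with different shapes, which is not matrix addition in the sense defined here.

```python
import numpy as np


def add_matrices(A, B):
    """Add two matrices element by element, insisting on equal dimensions."""
    A = np.asarray(A)
    B = np.asarray(B)
    if A.shape != B.shape:
        raise ValueError(
            f"cannot add a {A.shape} matrix to a {B.shape} matrix"
        )
    return A + B


# Illustrative values only.
A = [[1, 2], [3, 4]]
B = [[10, 20], [30, 40]]
print(add_matrices(A, B).tolist())  # [[11, 22], [33, 44]]
```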
Multiplying a Matrix by a Scalar
Definition: The word scalar is a synonym for a numerical constant. (In matrix terms, a scalar is a matrix with one row and one column.)
To multiply a matrix by a scalar, multiply each element in the matrix by the scalar.
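A minimal sketch of scalar multiplication, assuming NumPy; the matrix and the scalar 0.5 are illustrative values, not taken from this slide.

```python
import numpy as np

# Illustrative matrix and scalar.
A = np.array([[10.0, 6.0],
              [8.0, 5.0]])
c = 0.5

# Every element of A is multiplied by the scalar c.
cA = c * A

print(cA.tolist())  # [[5.0, 3.0], [4.0, 2.5]]
```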
Multiplication of Matrices
The Identity Matrix
Definition: An identity matrix is a square matrix that has the value 1 in each main diagonal position (from upper left to bottom right) and has the value 0 in all other locations.
For example, the 3 × 3 identity matrix is
$$I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
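As a quick check of this definition, the sketch below builds the 3 × 3 identity matrix and confirms the standard property that multiplying a matrix by the identity leaves it unchanged. NumPy and the test matrix `A` are assumptions chosen for illustration.

```python
import numpy as np

# 3 x 3 identity: ones on the main diagonal, zeros elsewhere.
I = np.eye(3)

# Arbitrary 3 x 3 matrix used only to illustrate the property.
A = np.arange(9).reshape(3, 3)

print(I.tolist())                # [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(np.array_equal(A @ I, A))  # True
print(np.array_equal(I @ A, A))  # True
```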

Matrix Inverse
The calculation of an inverse for large matrices is a laborious process that we'll leave to the computer. For 2 × 2 matrices, however, the formula is relatively simple.
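For
$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$$
the inverse is
$$A^{-1} = \frac{1}{a_{11}a_{22} - a_{12}a_{21}} \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix}$$
For
$$A = \begin{pmatrix} 10 & 6 \\ 8 & 5 \end{pmatrix}$$
the inverse is
$$A^{-1} = \frac{1}{10 \cdot 5 - 6 \cdot 8} \begin{pmatrix} 5 & -6 \\ -8 & 10 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 5 & -6 \\ -8 & 10 \end{pmatrix} = \begin{pmatrix} 2.5 & -3 \\ -4 & 5 \end{pmatrix}$$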
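The 2 × 2 formula can be checked numerically. The sketch below is one possible implementation, assuming NumPy; the function name `inverse_2x2` is hypothetical, and the guard against a zero denominator is an addition that reflects the fact that the formula is undefined when a11·a22 − a12·a21 = 0.

```python
import numpy as np


def inverse_2x2(A):
    """Invert a 2 x 2 matrix with the closed-form formula (hypothetical helper)."""
    (a11, a12), (a21, a22) = np.asarray(A, dtype=float)
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("matrix is singular; the inverse does not exist")
    return np.array([[a22, -a12],
                     [-a21, a11]]) / det


A = np.array([[10.0, 6.0],
              [8.0, 5.0]])
A_inv = inverse_2x2(A)

print(A_inv.tolist())                        # [[2.5, -3.0], [-4.0, 5.0]]
print(np.allclose(A @ A_inv, np.eye(2)))     # True
print(np.allclose(A_inv, np.linalg.inv(A)))  # True
```

For larger matrices, a library routine such as `np.linalg.inv` is the practical route, in line with the remark about leaving the work to the computer.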
Orthogonal matrix
Determinant
Eigenvalues and eigenvectors
Definition: A p × p matrix A has p eigenvalues, $\lambda_1, \lambda_2, \dots, \lambda_p$. They are obtained by solving the equation
$$|A - \lambda I| = 0$$
Definition: The corresponding eigenvectors $e_1, e_2, \dots, e_p$ are obtained by solving
$$(A - \lambda_j I)\, e_j = 0$$
Note: This does not have a unique solution. So, to obtain a unique solution we will often require that $e_j$ transposed times $e_j$ is equal to 1. Or, if you like, the sum of the squared elements of $e_j$ is equal to 1.
$$e_j' e_j = 1$$
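The unit-length requirement amounts to dividing a vector by the square root of its sum of squares. A minimal sketch, assuming NumPy; the helper name `normalize` and the input vector (1, 1) are illustrative.

```python
import numpy as np


def normalize(v):
    """Rescale v so that the sum of its squared elements is 1 (hypothetical helper)."""
    v = np.asarray(v, dtype=float)
    length = np.sqrt(v @ v)  # square root of the sum of squared elements
    if length == 0:
        raise ValueError("the zero vector cannot be normalized")
    return v / length


e = normalize([1.0, 1.0])

# Both elements equal 1 / sqrt(2), and e'e is 1 up to rounding error.
print(np.allclose(e, 1 / np.sqrt(2)))  # True
print(np.isclose(e @ e, 1.0))          # True
```

Even after normalization the sign is still free: if e satisfies the condition, so does −e.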
Example: To illustrate these calculations, consider the 2 × 2 matrix A shown below:
$$A = \begin{pmatrix} 1 & 5 \\ 5 & 1 \end{pmatrix}$$
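Then, using the definition of the eigenvalues, we must calculate the determinant of A minus $\lambda$ times the identity matrix and set it equal to zero.
$$|A - \lambda I| = \left| \begin{pmatrix} 1 & 5 \\ 5 & 1 \end{pmatrix} - \lambda \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \right| = \begin{vmatrix} 1 - \lambda & 5 \\ 5 & 1 - \lambda \end{vmatrix} = (1 - \lambda)(1 - \lambda) - (5)(5) = 0$$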
Here we will take the following solutions:
$$\lambda_1 = 6, \qquad \lambda_2 = -4$$
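For completeness, the algebra leading from the determinant equation to these two values can be written out as follows, with $\lambda$ denoting an eigenvalue of the matrix with rows (1, 5) and (5, 1):

$$(1 - \lambda)^2 - 25 = 0 \;\Longrightarrow\; 1 - \lambda = \pm 5 \;\Longrightarrow\; \lambda = 1 \mp 5$$

which gives $\lambda = 6$ and $\lambda = -4$. Equivalently, expanding the left-hand side gives $\lambda^2 - 2\lambda - 24 = (\lambda - 6)(\lambda + 4) = 0$.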
Next, to obtain the corresponding eigenvectors, we must solve the system of equations below:
$$(A - \lambda_j I)\, e_j = 0$$
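$$\begin{pmatrix} 1 - \lambda_j & 5 \\ 5 & 1 - \lambda_j \end{pmatrix} \begin{pmatrix} e_{j1} \\ e_{j2} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \qquad e_{j1}^2 + e_{j2}^2 = 1$$
we get
$$e_1 = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{pmatrix}, \qquad e_2 = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{pmatrix}$$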
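The hand calculation for the matrix with rows (1, 5) and (5, 1) can be cross-checked numerically. This is a sketch assuming NumPy; `np.linalg.eigh` is used because the matrix is symmetric, and it returns eigenvalues in ascending order, so the ordering differs from the slides. The signs of the computed eigenvectors may also differ from a hand solution, since both e and −e have unit length.

```python
import numpy as np

A = np.array([[1.0, 5.0],
              [5.0, 1.0]])

# eigh is intended for symmetric matrices; eigenvalues are returned in
# ascending order and the eigenvectors are the columns of the second output.
eigenvalues, eigenvectors = np.linalg.eigh(A)

print(eigenvalues)  # approximately -4 and 6

for lam, e in zip(eigenvalues, eigenvectors.T):
    # Each pair satisfies A e = lambda e, with e scaled so that e'e = 1.
    print(np.allclose(A @ e, lam * e), np.isclose(e @ e, 1.0))
```

Every element of both eigenvectors has absolute value 1/√2, matching a unit-length solution of the system.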
Homework
P103: 2.2, 2.3, 2.5, 2.8, 2.9 (a), 2.24