cis515-15-spectral-clust-chap3-3
Aug 26, 2018
© All Rights Reserved

Our presentation addresses the following points (to our knowledge, these points are not clearly articulated in the literature).

1. The choice of a matrix representation for partitions on the set of vertices. Such a representation should be scale-invariant.

2. Necessary and sufficient conditions for such matrices to represent a partition.

CHAPTER 3. GRAPH CLUSTERING

We describe a partition (A_1, ..., A_K) of the set of vertices V by an N × K matrix X = [X^1 · · · X^K] whose columns X^1, ..., X^K are indicator vectors of the partition (A_1, ..., A_K).

The vector X^j is of the form

X^j = (x^j_1, ..., x^j_N),

where a_j, b_j are any two distinct real numbers and, for i = 1, ..., N,

x^j_i = a_j if v_i ∈ A_j,   x^j_i = b_j if v_i ∉ A_j.

3.3. K-WAY CLUSTERING USING NORMALIZED CUTS

Such a matrix X is called a partition matrix by Yu and Shi.

We would like to have a scale-invariant representation to make the denominators of the Rayleigh ratios go away.

Since the A_j are pairwise disjoint blocks whose union is V, some conditions on X are required to reflect these properties, but we will worry about this later.

We want to pick a_j and b_j in order to express the normalized cut Ncut(A_1, ..., A_K) as a sum of Rayleigh ratios.

Then the problem can be converted to a more convenient form, by chasing the denominators in the Rayleigh ratios, and by expressing the objective function in terms of the trace of a certain matrix.

The solutions of the relaxed problem are right-invariant under multiplication by a K × K orthogonal matrix.

If we let d = vol(V) and α_j = vol(A_j), then

(X^j)^T D X^j = α_j a_j^2 + (d − α_j) b_j^2.


In general cut(A_j, \bar{A}_j) ≠ cut(A_k, \bar{A}_k) if j ≠ k, and since

Ncut(A_1, ..., A_K) = Σ_{j=1}^K cut(A_j, \bar{A}_j) / vol(A_j),

we need to have

cut(A_j, \bar{A}_j) / vol(A_j) = (X^j)^T L X^j / (X^j)^T D X^j,   j = 1, ..., K,

so that

μ(X) = Ncut(A_1, ..., A_K) = Σ_{j=1}^K cut(A_j, \bar{A}_j) / vol(A_j) = Σ_{j=1}^K (X^j)^T L X^j / (X^j)^T D X^j.
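The equality between the cut/volume form and the Rayleigh-ratio form of Ncut can be checked numerically. The sketch below assumes numpy; the 5-vertex weighted graph and the 2-block partition are made-up illustrative data, with a_j = 1 and b_j = 0:

```python
import numpy as np

# Hypothetical symmetric weight matrix W on 5 vertices (not from the text).
W = np.array([
    [0, 3, 2, 0, 0],
    [3, 0, 2, 0, 0],
    [2, 2, 0, 1, 0],
    [0, 0, 1, 0, 4],
    [0, 0, 0, 4, 0],
], dtype=float)
d = W.sum(axis=1)              # degree of each vertex
D = np.diag(d)
L = D - W                      # unnormalized graph Laplacian

# Partition into K = 2 blocks: A1 = {v1, v2, v3}, A2 = {v4, v5} (0-indexed).
blocks = [np.array([0, 1, 2]), np.array([3, 4])]

# Direct computation: Ncut = sum_j cut(Aj, complement of Aj) / vol(Aj).
ncut_direct = 0.0
for A in blocks:
    comp = np.setdiff1d(np.arange(5), A)
    cut = W[np.ix_(A, comp)].sum()
    vol = d[A].sum()
    ncut_direct += cut / vol

# Rayleigh-ratio computation with indicator vectors X^j (a_j = 1, b_j = 0).
ncut_rayleigh = 0.0
for A in blocks:
    x = np.zeros(5)
    x[A] = 1.0
    ncut_rayleigh += (x @ L @ x) / (x @ D @ x)

print(ncut_direct, ncut_rayleigh)   # the two values agree
```

The agreement relies on the identities x^T L x = cut(A, complement of A) and x^T D x = vol(A) for an indicator vector x of A.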


If b_j ≠ 0, then

a_j = ((2α_j − d) / (2α_j)) b_j.

The formulae are simpler with b_j = 0, so we will opt for the choice b_j = 0, j = 1, ..., K. In that case,

(X^j)^T D X^j = α_j a_j^2.


If we want (X^j)^T D X^j = 1, we can pick

a_j = 1/√α_j = 1/√vol(A_j),   j = 1, ..., K.

This choice also corresponds to the scaled partition matrix used in Yu [16] and Yu and Shi [17].

For example, the matrix X representing the partition of V = {v_1, v_2, ..., v_10} into the four blocks

{{v_2, v_4, v_6}, {v_1, v_5}, {v_3, v_8, v_10}, {v_7, v_9}}

is shown below:

        | 0    a_2  0    0   |
        | a_1  0    0    0   |
        | 0    0    a_3  0   |
        | a_1  0    0    0   |
    X = | 0    a_2  0    0   |
        | a_1  0    0    0   |
        | 0    0    0    a_4 |
        | 0    0    a_3  0   |
        | 0    0    0    a_4 |
        | 0    0    a_3  0   |
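A partition matrix like this one is easy to build programmatically. A minimal sketch, assuming numpy; the scale factors a_j = 1, 2, 3, 4 are an arbitrary illustrative choice, not values from the text:

```python
import numpy as np

# Partition of {v1, ..., v10} into four blocks (vertices numbered from 1).
blocks = [[2, 4, 6], [1, 5], [3, 8, 10], [7, 9]]
a = [1.0, 2.0, 3.0, 4.0]       # arbitrary nonzero scales a_1, ..., a_4

N, K = 10, len(blocks)
X = np.zeros((N, K))
for j, block in enumerate(blocks):
    for v in block:
        X[v - 1, j] = a[j]     # row v-1 gets the single nonzero entry a_j

# Each row has exactly one nonzero entry (the blocks partition V),
# and X^T X = diag(n_1 a_1^2, ..., n_K a_K^2) is diagonal.
print((X != 0).sum(axis=1))
print(X.T @ X)
```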

Let us now consider the necessary and sufficient conditions for a matrix X to represent a partition of V.

The fact that the blocks A_j are pairwise disjoint is captured by the orthogonality of the X^i:

(X^i)^T X^j = 0,   1 ≤ i, j ≤ K, i ≠ j.   (∗)


When we express the problem in terms of Rayleigh ratios, conditions on the quantities (X^i)^T D X^i show up, and it is more convenient to express the orthogonality conditions using the quantities (X^i)^T D X^j instead of the (X^i)^T X^j, because these various conditions can be combined into a single condition involving the matrix X^T D X.

Since the diagonal entries of D are positive, and because the nonzero entries in each column of X have the same sign, for any i ≠ j, the condition

(X^i)^T X^j = 0

is equivalent to

(X^i)^T D X^j = 0.   (∗∗)


Observe that the conditions (∗) (equivalently (∗∗)) are equivalent to the fact that every row of X has at most one nonzero entry.

The fact that the union of the A_j is V is captured by the fact that each row of X must have some nonzero entry (every vertex appears in some block).

It is not immediately obvious how to state this condition in matrix form.

Since every row of X has a single nonzero entry a_j, we have

X^T X = diag(n_1 a_1^2, ..., n_K a_K^2),

where n_j = |A_j| is the number of elements in A_j, the jth block of the partition.


A necessary and sufficient condition for the columns of X to be linearly independent is

det(X^T X) ≠ 0.

The reason why this condition involves the determinant and is scale-invariant stems from the observation that not only

X^T X = diag(n_1 a_1^2, ..., n_K a_K^2),

but also

X^T 1_N = (n_1 a_1, ..., n_K a_K)^T,

so that

(X^T X)^{-1} X^T 1_N = (1/a_1, ..., 1/a_K)^T,

and thus

X (X^T X)^{-1} X^T 1_N = 1_N.   (†)

When the columns of X are linearly independent, (X^T X)^{-1} X^T is the pseudo-inverse X^+ of X. Consequently, condition (†) can also be written as

X X^+ 1_N = 1_N.
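Condition (†) and its pseudo-inverse form can be checked numerically. A small sketch, assuming numpy; the 6-vertex partition and the scales a_j below are hypothetical illustrative data:

```python
import numpy as np

# Hypothetical partition of 6 vertices into 3 blocks, with arbitrary scales.
blocks = [[0, 1], [2, 3, 4], [5]]
a = [0.5, 2.0, -3.0]
N, K = 6, 3
X = np.zeros((N, K))
for j, block in enumerate(blocks):
    X[block, j] = a[j]

ones = np.ones(N)
# X (X^T X)^{-1} X^T is the orthogonal projection onto the range of X;
# condition (†) says it fixes 1_N, i.e. 1_N lies in the range of X.
proj = X @ np.linalg.inv(X.T @ X) @ X.T
print(np.allclose(proj @ ones, ones))        # True

# Since the columns of X are linearly independent, (X^T X)^{-1} X^T is the
# pseudo-inverse X^+, so (†) can also be written X X^+ 1_N = 1_N.
Xplus = np.linalg.pinv(X)
print(np.allclose(X @ Xplus @ ones, ones))   # True
```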


But X X^+ is the matrix of the orthogonal projection of R^N onto the range of X (see Gallier [6], Section 14.1), so the condition X X^+ 1_N = 1_N is equivalent to the fact that 1_N belongs to the range of X.

Indeed, the columns of a solution X satisfy the equation

a_1^{-1} X^1 + · · · + a_K^{-1} X^K = 1_N.

These solutions are invariant under multiplication by a nonzero scalar, because the Rayleigh ratio is scale-invariant, and it is crucial to take advantage of this fact to make the denominators go away.


If we let

X = { [X^1 ... X^K] | X^j = a_j (x^j_1, ..., x^j_N), x^j_i ∈ {1, 0}, a_j ∈ R, X^j ≠ 0 },

then the set of matrices representing partitions of V into K blocks is

K = { X = [X^1 · · · X^K] | X ∈ X,
      (X^i)^T D X^j = 0, 1 ≤ i, j ≤ K, i ≠ j,
      X (X^T X)^{-1} X^T 1 = 1 }.


Since the Rayleigh ratios are scale-invariant, the solutions are really K-tuples of points in RP^{N−1}, so our solution set is really

P(K) = { (P(X^1), ..., P(X^K)) | [X^1 · · · X^K] ∈ K }.

We now have our first formulation of K-way clustering of a graph using normalized cuts, called problem PNC1 (the notation PNCX is used in Yu [16], Section 2.1):


K-way Clustering of a graph using Normalized Cut, Version 1:

Problem PNC1

minimize    Σ_{j=1}^K (X^j)^T L X^j / (X^j)^T D X^j
subject to  (X^i)^T D X^j = 0,  1 ≤ i, j ≤ K, i ≠ j,
            X (X^T X)^{-1} X^T 1 = 1,
            X ∈ X.

The solutions we are seeking are K-tuples (P(X^1), ..., P(X^K)) of points in RP^{N−1} determined by their homogeneous coordinates X^1, ..., X^K.


Remark: Because

(X^j)^T L X^j = (X^j)^T D X^j − (X^j)^T W X^j = vol(A_j) − (X^j)^T W X^j,

instead of minimizing

μ(X^1, ..., X^K) = Σ_{j=1}^K (X^j)^T L X^j / (X^j)^T D X^j,

we can maximize

ε(X^1, ..., X^K) = Σ_{j=1}^K (X^j)^T W X^j / (X^j)^T D X^j,

since

ε(X^1, ..., X^K) = K − μ(X^1, ..., X^K).
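The identity ε = K − μ follows from L = D − W: for each block, the two Rayleigh ratios share a denominator and their numerators sum to it. A quick numerical check, assuming numpy; the 4-vertex graph and partition are made up for illustration:

```python
import numpy as np

# Hypothetical symmetric weight matrix on 4 vertices.
W = np.array([
    [0, 2, 1, 0],
    [2, 0, 1, 0],
    [1, 1, 0, 3],
    [0, 0, 3, 0],
], dtype=float)
D = np.diag(W.sum(axis=1))
L = D - W                         # L = D - W

blocks = [[0, 1], [2, 3]]         # K = 2 blocks
K = len(blocks)
mu = eps = 0.0
for block in blocks:
    x = np.zeros(4)
    x[block] = 1.0                # indicator vector of the block
    denom = x @ D @ x
    mu += (x @ L @ x) / denom     # Rayleigh ratio of L
    eps += (x @ W @ x) / denom    # Rayleigh ratio of W

print(mu, eps, mu + eps)          # mu + eps equals K
```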


Minimizing μ(X^1, ..., X^K) is equivalent to maximizing ε(X^1, ..., X^K), but from a practical point of view, it is preferable to maximize ε(X^1, ..., X^K).

This is because minimal solutions of μ are computed from (unit) eigenvectors corresponding to the K smallest eigenvalues of L_sym = D^{−1/2} L D^{−1/2} (by multiplying these eigenvectors by D^{1/2}).

However, numerical methods for computing eigenvalues and eigenvectors of a symmetric matrix do much better at computing largest eigenvalues.

The eigenvalues of L_sym listed in increasing order correspond to the eigenvalues of I − L_sym = D^{−1/2} W D^{−1/2} listed in decreasing order.


Moreover, v is an eigenvector of L_sym for the ith largest eigenvalue ν_i iff v is an eigenvector of I − L_sym for the (N + 1 − i)th largest eigenvalue 1 − ν_i.

Therefore, we can compute the largest eigenvalues of I − L_sym = D^{−1/2} W D^{−1/2} and their eigenvectors.

Since the eigenvalues of L_sym are in the range [0, 2], the eigenvalues of 2I − L_sym = I + D^{−1/2} W D^{−1/2} are also in the range [0, 2] (that is, I + D^{−1/2} W D^{−1/2} is positive semidefinite).

The problem can be converted to a more convenient form, by chasing the denominators in the Rayleigh ratios, and by expressing the objective function in terms of the trace of a certain matrix.


For any orthogonal K × K matrix R, any symmetric N × N matrix A, and any N × K matrix X = [X^1 · · · X^K], the following properties hold:

(1) μ(X) = tr(Λ^{-1} X^T L X), where

    Λ = diag((X^1)^T D X^1, ..., (X^K)^T D X^K).

(2) If (X^1)^T D X^1 = · · · = (X^K)^T D X^K = α^2, then

    μ(X) = μ(XR) = (1/α^2) tr(X^T L X).

(3) The condition X^T A X = α^2 I is preserved if X is replaced by XR.

(4) The condition X (X^T X)^{-1} X^T 1 = 1 is preserved if X is replaced by XR.
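Property (1) can be confirmed numerically: the trace form tr(Λ^{-1} X^T L X) picks out exactly the K Rayleigh ratios. A sketch assuming numpy; the random symmetric weight matrix and the 3-block partition are illustrative data:

```python
import numpy as np

# Hypothetical random symmetric weight matrix on 6 vertices.
rng = np.random.default_rng(0)
W = rng.random((6, 6))
W = (W + W.T) / 2
np.fill_diagonal(W, 0.0)
D = np.diag(W.sum(axis=1))
L = D - W

# Partition matrix for blocks {0,1}, {2,3}, {4,5} with scales a_j = 1.
X = np.zeros((6, 3))
for j, block in enumerate([[0, 1], [2, 3], [4, 5]]):
    X[block, j] = 1.0

# mu(X) as a sum of Rayleigh ratios.
mu = sum((X[:, j] @ L @ X[:, j]) / (X[:, j] @ D @ X[:, j]) for j in range(3))

# mu(X) as tr(Lambda^{-1} X^T L X), Lambda = diag((X^j)^T D X^j).
Lam = np.diag([X[:, j] @ D @ X[:, j] for j in range(3)])
trace_form = np.trace(np.linalg.inv(Lam) @ X.T @ L @ X)

print(mu, trace_form)   # equal
```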

Since the conditions in PNC1 are scale-invariant, we are led to the following formulation of our problem:


minimize    tr(X^T L X)
subject to  (X^i)^T D X^j = 0,  1 ≤ i, j ≤ K, i ≠ j,
            (X^j)^T D X^j = 1,  1 ≤ j ≤ K,
            X (X^T X)^{-1} X^T 1 = 1,
            X ∈ X.

The first two constraints can be combined into the single condition

X^T D X = I,

leading to the following formulation of our minimization problem:


K-way Clustering of a graph using Normalized Cut, Version 2:

Problem PNC2

minimize    tr(X^T L X)
subject to  X^T D X = I,
            X (X^T X)^{-1} X^T 1 = 1,  X ∈ X.

Since problem PNC2 requires the constraint X^T D X = I to be satisfied, it does not have the same set of solutions as problem PNC1.


Problem PNC2 is equivalent to problem PNC1, in the sense that for every minimal solution (X^1, ..., X^K) of PNC1, the K-tuple

(((X^1)^T D X^1)^{−1/2} X^1, ..., ((X^K)^T D X^K)^{−1/2} X^K)

is a minimal solution of PNC2 (with the same minimum for the objective functions), and for every minimal solution (Z^1, ..., Z^K) of PNC2, the K-tuple (λ_1 Z^1, ..., λ_K Z^K) is a minimal solution of PNC1, for all λ_i ≠ 0, i = 1, ..., K (with the same minimum for the objective functions).

In other words, problems PNC1 and PNC2 have the same set of minimal solutions as K-tuples of points (P(X^1), ..., P(X^K)) in RP^{N−1} determined by their homogeneous coordinates X^1, ..., X^K.


The normalized cut problem has a geometric interpretation in terms of the graph drawings discussed in Section 2.1.

Find a minimal energy graph drawing X in R^K of the weighted graph G = (V, W) such that:

1. The matrix X is orthogonal with respect to the inner product ⟨−, −⟩_D in R^N induced by D, with

   ⟨x, y⟩_D = x^T D y,   x, y ∈ R^N.

2. No vertex v_i ∈ V is assigned to the origin of R^K (the zero vector 0_K).

3. Every vertex v_i is assigned a point of the form (0, ..., 0, a_j, 0, ..., 0) on some axis (in R^K).

4. Every axis in R^K is assigned at least some vertex.


The problem can be put in the same form as the problem for graph drawings (R^T R = I) by making the change of variable Y = D^{1/2} X, or equivalently X = D^{−1/2} Y.

Indeed,

tr(X^T L X) = tr(Y^T D^{−1/2} L D^{−1/2} Y),

so we use the normalized Laplacian L_sym = D^{−1/2} L D^{−1/2} instead of L; we have

X^T D X = Y^T Y = I,

and conditions (2), (3), (4) are preserved under the change of variable Y = D^{1/2} X, since D^{1/2} is invertible.

However, conditions (2), (3), (4) are hard to enforce, especially condition (3).
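The effect of the change of variable is easy to verify numerically. A sketch assuming numpy; the random graph and the random matrix X are illustrative data:

```python
import numpy as np

# Hypothetical random symmetric weight matrix on 5 vertices.
rng = np.random.default_rng(1)
W = rng.random((5, 5))
W = (W + W.T) / 2
np.fill_diagonal(W, 0.0)
d = W.sum(axis=1)
D = np.diag(d)
L = D - W
Lsym = np.diag(d ** -0.5) @ L @ np.diag(d ** -0.5)   # normalized Laplacian

X = rng.random((5, 2))            # arbitrary N x K matrix
Y = np.diag(d ** 0.5) @ X         # change of variable Y = D^{1/2} X

# tr(X^T L X) = tr(Y^T Lsym Y) and X^T D X = Y^T Y.
print(np.isclose(np.trace(X.T @ L @ X), np.trace(Y.T @ Lsym @ Y)))
print(np.allclose(X.T @ D @ X, Y.T @ Y))
```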


Observe that the columns of X are pairwise orthogonal with respect to both the Euclidean inner product and the inner product ⟨−, −⟩_D, so condition (1) is redundant, except for the fact that it prescribes the norm of the columns; but this is not essential due to the projective nature of the solutions.

The main difficulty with problem PNC2 is that it is very difficult to enforce the condition X ∈ X. Also, the solutions are not preserved by arbitrary rotations, but only by very special rotations which leave X invariant (they exchange the axes).


We relax the problem by dropping the condition that X ∈ X, and we obtain

Problem (∗2)

minimize    tr(X^T L X)
subject to  X^T D X = I,
            X (X^T X)^{-1} X^T 1 = 1.

However, since the solutions we are ultimately seeking are solutions of problem PNC1, the preferred relaxation is the one obtained from problem PNC1 by dropping the condition X ∈ X and simply requiring that X^j ≠ 0, for j = 1, ..., K:


Problem (∗1)

minimize    Σ_{j=1}^K (X^j)^T L X^j / (X^j)^T D X^j
subject to  (X^i)^T D X^j = 0,  1 ≤ i, j ≤ K, i ≠ j,
            X^j ≠ 0,  1 ≤ j ≤ K,
            X (X^T X)^{-1} X^T 1 = 1.

At first glance, it is not clear that X^T X is invertible in (∗1) and (∗2). However, since the columns of X are nonzero and pairwise orthogonal, they must be linearly independent, so X has rank K and X^T X is invertible.


Every solution Z = [Z^1, ..., Z^K] of problem (∗1) yields a solution of problem (∗2) by normalizing each Z^j by ((Z^j)^T D Z^j)^{1/2}, and conversely, for every solution Z = [Z^1, ..., Z^K] of problem (∗2), the K-tuple [λ_1 Z^1, ..., λ_K Z^K] is a solution of problem (∗1), where λ_j ≠ 0 for j = 1, ..., K.

Furthermore, for every orthogonal matrix R ∈ O(K) and for every solution X of (∗2), the matrix XR is also a solution of (∗2).

However, since the denominators (X^j)^T D X^j must all have the same value in order to have μ(X) = μ(XR), in general, if X is a solution of (∗1), the matrix XR is not necessarily a solution of (∗1).


Still, if X is a solution of (∗2), then for every R ∈ O(K), XR is a solution of both (∗2) and (∗1), and since (∗1) is scale-invariant, for every diagonal invertible matrix Λ, the matrix XRΛ is a solution of (∗1).

Therefore, every solution Z of (∗2) yields a whole family of solutions of problem (∗1); namely, all matrices of the form ZRΛ, where R ∈ O(K) and Λ is a diagonal invertible matrix.

This fact will be useful when we look for a discrete solution X "close" to a solution Z of the relaxed problem (∗2).

Observe that any matrix of the form ZRΛ, with R ∈ O(K) and Λ a diagonal invertible matrix, has columns that are nonzero and pairwise orthogonal with respect to D.


Recall that the condition X (X^T X)^{-1} X^T 1 = 1 is equivalent to X X^+ 1 = 1, which is also equivalent to the fact that 1 is in the range of X.

If we make the change of variable Y = D^{1/2} X, or equivalently X = D^{−1/2} Y, the condition that 1 is in the range of X becomes the condition that D^{1/2} 1 is in the range of Y, which is equivalent to

Y Y^+ D^{1/2} 1 = D^{1/2} 1.

If Y^T Y = I, then Y^+ = Y^T, and we obtain the following problem:


Problem (∗∗2)

minimize    tr(Y^T L_sym Y)
subject to  Y^T Y = I,
            Y Y^T D^{1/2} 1 = D^{1/2} 1.

We obtain a solution Z of problem (∗2) from a solution Y of problem (∗∗2) by Z = D^{−1/2} Y.

The minimum of tr(Y^T L_sym Y) over all N × K matrices Y satisfying Y^T Y = I is equal to the sum ν_1 + · · · + ν_K of the first K eigenvalues of L_sym = D^{−1/2} L D^{−1/2}.


The Poincaré separation theorem (see Section A.3) guarantees that the sum of the K smallest eigenvalues of L_sym is a lower bound for tr(Y^T L_sym Y).

Without the second constraint, the minimum of problem (∗∗2) is achieved by any K unit eigenvectors (u_1, ..., u_K) associated with the smallest eigenvalues

0 = ν_1 ≤ ν_2 ≤ · · · ≤ ν_K

of L_sym.

We may assume that our graph is connected (otherwise, we work with each connected component), in which case Y^1 = D^{1/2} 1 / ‖D^{1/2} 1‖_2, because 1 is in the nullspace of L.


Then D^{1/2} 1 is in the range of Y, so the condition

Y Y^T D^{1/2} 1 = D^{1/2} 1

is satisfied, and Z = D^{−1/2} Y yields a minimum of our relaxed problem (∗2) (the second constraint is satisfied because 1 is in the range of Z).

The columns of Z are associated with the eigenvalues 0 = ν_1 ≤ ν_2 ≤ · · · ≤ ν_K, and Z^1 = 1 / ‖D^{1/2} 1‖_2.


By construction, (Z^i)^T D Z^j = 0 whenever i ≠ j. In particular, (Z^j)^T D Z^1 = 0 for j ≥ 2; since Z^1 is a positive multiple of 1, this means that Σ_i d_i z^j_i = 0, and thus that each Z^j has both some positive and some negative coordinate, for j = 2, ..., K.

The above does not guarantee that Z^i and Z^j are orthogonal (w.r.t. the Euclidean inner product), but we can obtain a solution of problems (∗2) and (∗1) achieving the same minimum for which distinct columns Z^i and Z^j are simultaneously orthogonal and D-orthogonal, by multiplying Z by some K × K orthogonal matrix R on the right.


Indeed, the K × K symmetric matrix Z^T Z can be diagonalized by some orthogonal K × K matrix R as

Z^T Z = R Σ R^T,

where Σ is diagonal. Then (ZR)^T (ZR) = R^T Z^T Z R = Σ is diagonal, so the columns of ZR are pairwise orthogonal; the matrix ZR is still a solution of (∗2) and (∗1), and tr((ZR)^T L (ZR)) = tr(Z^T L Z).


Since Y has linearly independent columns (in fact, orthonormal columns) and since Z = D^{−1/2} Y, the matrix Z also has linearly independent columns, so Z^T Z is positive definite and the entries in Σ are all positive.

Instead of forming Z^T Z and diagonalizing it, the matrix R can be found by computing an SVD of Z.
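Putting the pieces together, here is a minimal sketch of the relaxed computation, assuming numpy; the two-cluster graph is made-up illustrative data. It computes Y from the K smallest eigenvectors of L_sym, sets Z = D^{−1/2} Y, and finds the rotation R by diagonalizing Z^T Z (an SVD of Z yields such an R as its right singular vectors):

```python
import numpy as np

# Hypothetical graph: two dense 4-cliques joined by one weak edge.
B = np.ones((4, 4)) - np.eye(4)
W = np.block([[B, np.zeros((4, 4))], [np.zeros((4, 4)), B]])
W[3, 4] = W[4, 3] = 0.1
d = W.sum(axis=1)
L = np.diag(d) - W
Lsym = np.diag(d ** -0.5) @ L @ np.diag(d ** -0.5)

K = 2
evals, evecs = np.linalg.eigh(Lsym)   # eigenvalues in increasing order
Y = evecs[:, :K]                      # K smallest eigenvectors, Y^T Y = I
Z = np.diag(d ** -0.5) @ Y            # solution of the relaxed problem (*2)

# Z^T D Z = I, but Z^T Z need not be diagonal; rotate to fix that:
# diagonalize Z^T Z = R Sigma R^T and replace Z by ZR.
Sigma, R = np.linalg.eigh(Z.T @ Z)
ZR = Z @ R

print(np.allclose(ZR.T @ np.diag(d) @ ZR, np.eye(K)))   # still D-orthonormal
M = ZR.T @ ZR
print(np.allclose(M - np.diag(np.diag(M)), 0.0))        # now diagonal
```

In a full clustering pipeline, the rows of ZR would then be snapped to a discrete partition matrix X ∈ X "close" to ZR, as discussed above.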
