Sunteți pe pagina 1din 50

Wayfinding in

Social Networks

David Liben-Nowell
Carleton College
dlibenno@carleton.edu

Microsoft Research–Cambridge | 7 December 2007


Workshop on Online Social Networks
1
Analyzing Social Networks, pre-1995

Social Network Analysis: Old S˜ool


social networks have been around for 100K+ years!
before the web, hard to acquire (surveys, interviews, ...).
but many interesting, relevant, generalizable observations!

2
Zachary’s Karate Club

12

8 13 11 [Zachary 1977]
22 Recorded interactions in a
18 17

5 karate club for 2 years.


inst
2 6
During observation,
25 adminstrator/instructor
4
32
3
7
conflict developed
28 ⇒ broke into two clubs.
26
14
29
10
9 20 Who joins which club?
24
admin

33 Split along
31

21 administrator/instructor
30

19
minimum cut (!)
15
16
27 23

3
Zachary’s Karate Club

12

8 13 11 [Zachary 1977]
22 Recorded interactions in a
18 17

5 karate club for 2 years.


inst
2 6
During observation,
25 adminstrator/instructor
4
32
3
7
conflict developed
28 ⇒ broke into two clubs.
26
14
29
10
9 20 Who joins which club?
24
admin

33 Split along
31

21 administrator/instructor
30

19
minimum cut (!)
15
16
27 23

4
Zachary’s Karate Club

12

8 13 11 [Zachary 1977]
22 Recorded interactions in a
18 17

5 karate club for 2 years.


inst
2 6
During observation,
25 adminstrator/instructor
4
32
3
7
conflict developed
28 ⇒ broke into two clubs.
26
14
29
10
9 20 Who joins which club?
24
admin

33 Split along
31

21 administrator/instructor
30

19
minimum cut (!)
15
16
27 23

5
Zachary’s Karate Club

12

8 13 11 [Zachary 1977]
22 Recorded interactions in a
18 17

5 karate club for 2 years.


inst
2 6
During observation,
25 adminstrator/instructor
4
32
3
7
conflict developed
28 ⇒ broke into two clubs.
26
14
29
10
9 20 Who joins which club?
24
admin

33 Split along
31

21 administrator/instructor
30

19
minimum cut (!)
15
16
27 23

6
Zachary’s Karate Club

12

8 13 11 [Zachary 1977]
22 Recorded interactions in a
18 17

5 karate club for 2 years.


inst
2 6
During observation,
25 adminstrator/instructor
4
32
3
7
conflict developed
28 ⇒ broke into two clubs.
26
14
29
10
9 20 Who joins which club?
24
admin

33 Split along
31

21 administrator/instructor
30

19
minimum cut (!)
15
16
27 23

7
X Why to think algorithmically about social networks
The small-world phenomenon

X Models of the navigable small world


X Some conclusions and musings
Milgram: Six Degrees of Separation
Social Networks as Networks: [Milgram 1967]

People given letter, asked to forward to one friend.


Source: random residents of Omaha, NE;
Target: stockbroker near Boston, MA.
Of completed chains, averaged six hops to reach target.
8
Milgram: The Explanation?

“the small-world problem”

Why is a random Omahaian close to a Boston stockbroker?

Standard (pseudosociological, pseudomathematical) explanation:


(Erdős/Rényi) random graphs have small diameter.

Bogus! In fact, many bogosities:


degree distribution
clustering coefficients
...

9
High School Friendships

White
Black
Other

Self-reported high school friendships. [Moody 2001]


10
High School Friendships

[Moody 2001]
11
Homophily

homophily: a person x’s friends tend to be ‘similar’ to x.

One explanation for high clustering: (semi)transitivity of similarity.

x, y both friends of u ≈⇒ x and u similar; y and u similar


≈⇒ x and y similar
≈⇒ x and y friends
12
Navigability of Social Networks

[Kleinberg 2000]
Milgram experiment shows more than small diameter:
People can construct short paths!

Milgram’s result is algorithmic, not existential.

Theorem [Kleinberg 2000]: No local-information algorithm


can find short paths in Watts/Strogatz networks.

13
Homophily and Greedy Applications

homophily: a person x’s friends tend to be similar to x.


Key idea: getting closer in “similarity space”
⇒ getting closer in “graph-distance space”

[Killworth Bernard 1978] (“reverse small-world experiment”)


[Dodds Muhamed Watts 2003]
In searching a social network for a target,
most people chose the next step because of
“geographical proximity” or “similarity of occupation”
(more geography early in chains; more occupation late.)

Suggests the greedy algorithm in social-network routing:


if aiming for target t, pick your friend who’s ‘most like’ t.

14
Greedy Routing

Greedy algorithm:
if aiming for target t, pick your friend who’s ‘most like’ t.

Geography: greedily route based on distance to t.


Occupation: ≈ greedily route based on distance
in the (implicit) hierarchy of occupations.

Want Pr [u, v friends] to decay smoothly as d(u, v) increases.


(Need social ‘cues’ to help narrow in on t.
Not just homophily! Can’t just have many disjoint cliques.)

15
The LiveJournal Community

www.livejournal.com “Baaaaah,” says Frank.

Online blogging community.


Currently 14.3 million users; ∼1.3 million in February 2004.

LiveJournal users provide:


disturbingly detailed accounts of their personal lives.
profiles (birthday, hometown, explicit list of friends)

Yields a social network, with users’ geographic locations.


(∼500K people in the continental US.)
[DLN Novak Kumar Raghavan Tomkins 2005]
16
LiveJournal
0.1% of LJ friendships

[DLN Novak Kumar Raghavan Tomkins 2005]


17
Distance versus LJ link probability

1e-03

1e-04
link probability

1e-05

1e-06

1e-07
10 km 100 km 1000 km
distance

[DLN Novak Kumar Raghavan Tomkins 2005]


18
Analyzing Social Networks, pre-1995

Social Network Analysis: Old S˜ool


social networks have been around for 100K+ years!
before the web, hard to acquire (surveys, interviews, ...).
but many interesting, relevant, generalizable observations!

Social Network Analysis: New Sch00l


automatically extract networks without having to ask.
phone calls, emails, online communities, ...
big! but are these really social networks?
19
The Hewlett-Packard Email Community

[Adamic Adar 2005]

Corporate research community.


Captured email headers over ∼3 months.
Define friendship as ≥ 6 emails u → v and ≥ 6 emails v → u.

Yields a social network (n = 430),


with positions in the corporate hierarchy.
20
Emails and the HP Corporate Hierarchy

[Adamic Adar 2005]

black: HP corporate hierarchy gray: exchanged emails.

21
Emails and the HP Corporate Hierarchy

[Adamic Adar 2005]

So what?

22
X Why to think algorithmically about social networks
X The small-world phenomenon
Models of the navigable small world

X Some conclusions and musings


Requisites for Navigability

1e-03

1e-04

link probability
1e-05

1e-06

1e-07
10 km 100 km 1000 km
distance

[Kleinberg 2000]:
for a social network to be navigable without global knowledge,
need ‘well-scattered’ friends (to reach faraway targets)
need ‘well-localized’ friends (to home in on nearby targets)

23
Kleinberg: Navigable Social Networks

[Kleinberg 2000]

put n people on a k-dimensional grid


connect each to its immediate neighbors
1
add one long-range link per person; Pr[u → v] ∝ d(u,v) α.

24
Navigability of Social Networks

put n people on a k-dimensional grid


connect each to its immediate neighbors
1
add one long-range link per person; Pr[u → v] ∝ d(u,v) α.

Theorem [Kleinberg 2000]: (short = polylog(n))

If α 6= k
then no local-information algorithm can find short paths.

If α = k
then people can find short—O(log2 n)—paths using
the greedy algorithm.

25
Geography’s Role in LiveJournal

[DLN Novak Kumar Raghavan Tomkins 2005]

By simulating the Milgram experiment,


we find that LJ is navigable via geographically greedy routing.

By Kleinberg’s theorem,
navigable 2D geographic mesh ⇒ Pr[u → v] ∝ d(u, v)−2.

Original goal of this research:


verify that Pr[u → v] ∝ d(u, v)−2 in LiveJournal.

26
Distance versus link probability
1e-03

ε + 1/d
1e-04
link probability

1e-05

1e-06

1/d
1e-07
10 km 100 km 1000 km
distance

shows Pru,v [u is friends with v | d(u, v) = d]


Kleinberg’s 1/d2 highly unsupported!
Not really linear! Link probability levels out to ∼ 5 × 10−6.

27
The LiveJournal Odyssey

Dot shown for every


inhabited location
in LiveJournal network.

Circles are centered


on Ithaca, NY.

Each successive circle’s


population increases by 50,000.

Uniform population ⇒ radii would decrease quadratically.


(actually mostly increase!)

People don’t live on a uniform grid!


28
Why does distance fail?

cb cb cb cb cb
b cbcb c
b
bb
c cb
b
ccbcb cbcb
cb cb cbcb cb cbcb cb cb cbcb
b
c c
b cb ccb
b c
bcb
b cb
b cb
cbcb
c
b cb
cb
c
bcb cb c
cb
b
cb
cb cb
c
bc
b
cb cb cbcb cb
cb cbc
cbcb ccb
b cbcbc
b c
bc
b cb
b c
b c
b c c c cb
b c
bc
b c
b c
bc
b
cb cb c
b c
b
c
b cb c
cc b
b
cb
b
ccb cb
b ccbcb ccb
b cb cb cb cb
cb
b
cb c
cb
cb
b
cb cb
cb
bcbcb
c
cb cb
cbcb c cb
c
cb
b cb
cb
b
cb
b ccb
cb
cb
b
c
b ccb
b
ccb
b
c
b
cb
cb cb
b cbcb
cb
cc b
b
b
c
c
b
cb
b c
b c
b c c
bc c
b c
b
ccb
b c
b c
bc c
b c
b c
c
b c
b cb
b c c
b c
b
cb
b cb
ccb cb
b cb
b cbcbcb cbcbcb cb cb
bcb cb
b cb cb cbcb cb cbcbcb
c
b cb
b c
cb
b cbcb c c
bcb
cb cb
c
b cb c
b c
b
cb
bcb
cb ccbc cb
b cb
c
cb
b
c
b
cb
c
b cbcb ccb
b
c
b c c
bcb
c
b c c
b cb c
c
b c
b c
b c
b
c
b c
bc
b c
b c
bc
b
c
b c
b
c
b c
b c
b c
b c
b c c
b c
b c
b c
b c
bc
bc
b c
b c
b c
b c
b c c
bc
b c
b c
b c
b c
b c
b cb
b c
cb
b cb
b cb
cb c b
c cb
cb
b
ccb cb
ccb cb cb cb
cb cb cb
b cb
cb
ccbcb
cbcbcb cb
b cb
cbcb cb cb
cb cb c
b
cb
b cb
cb
cb
ccb c
bc
b c
bcb
bc b
c
b cb c
b c
b
cb
b c
b c
b c
bcb
b
c
bcb
c c
b cb
cb
c
b
c
b c
b cb c
b c
b c
bc
b c
b c
b cbcb cccb
cb
b
c
b
cb
c
b cb
b cbc cb
c c
b
cb
b ccb cb
cb
cb cb cc b
b
cb ccb
cb
b cb cb
cb cb
ccbcb cb
cbcb cb
cbcb
bcb
ccb
b cbcb cb cb
cb cb
ccb cb
b cbcb cb cb
b cb
cbcb cb c
b
cb
b
cb
b
cb
ccb
b cb
cb ccb
b cb
c
b cb
b c b cb
c c
b c
b c
b b
c
b
cb
b c c
bcb cbc
b
cb
c c
b c
b c
b b cb
cc
b
c
b c
bc
b c
b cb
cb cb c c
b c
500 m cbcb cb cbcb cb cbcb cb
cb cb cb
cb
cb
ccb cb
b cb cb bcb
cb cb
500 m cb
cbcb
cb cb cb cb
cb
ccb
cb
b cb cb
cbcb
cb
cb cb
cbcb cb cbc
b
cb
b cbccb
cb cb
cb c
b c
b c
b
c
b b c c
b
cbc c
bc
b cb
b
c
b c
b
cc
b cb
b
c
b
cbccb
b cb cb cc
b c
bc
b
c
b c
b c
b c
b
b
c
bc
b c
cb
b cb
cb c
c
b c
bc
b c
b c
b c
b cb
b cc
c c
b c
b c
b c
b c
b
c
b c
b
cb
b cb cb
cb
cb cb
b cb
b cb cb cb
cb cbcbcb
b cbcb
ccb
b cb cb
b cb
cb
cb cb cb
bcb cb cb cb
b cb cbcb
bcbcbcb cc b
cb cb
b
cb
b c c
bc
b c
b cb
c
bcc
b cb
c
b c
b
c c
b
c
b c
b c
b
cb
b c c
b c
bc c
b c c
b c
b c
bc
b
cb c c
b c
b cb
cc
b c
b c
c
b
cb
b cb
cc
b c
b
b
c
bcb
b cb
bcb
cbcbccb
cb c
c
cb
b c
b
cb c
b c
b c
b c
b c
b c
bc
b c
b c
b c
b
c
b c
b c
bc
b c
b
c
b c
b c
b c
bc
b c
b c
b c
b c
b c
b c
b c
b c
b c
bc
b c
bc c
b
c
b c
b cb
b c
b c
b cb
c
cb
bcb cbccb
b
cb
cb
cb
c
b c
c
b cb
bcbcb
cb c cb
b cb
cbcbcb
c c
b
cb
bcb
c
b c
b c
b
cb
c
b
cbcb ccb
b cb
c
cb
cb
b
c
c
b ccb
bcb cb
cb cbcbcb
cb
c
b
cb
c
b cbcc
b c
b cb
bcb cbcb cb
cb
b cbcb c cb
b
c
b
c c
b cb
b c
b cb
b c
b c
b c
b c
b c
b c
cb
b cb cb c b
ccb
b cbcb
cb cb
cb cb
cb
c b
ccb
cc b
b
cb cc b
cb ccb
b
cb
b
cc b
cb
b
ccb cb cb
cb
cb
cb
b ccb
cb cb
cb
cb
cbcb cb
cc b
b
c
b
c

Population density varies widely across the US!

and : best friends in Minnesota, strangers in Manhattan.


29
Rank-Based Friendship

How do we handle non-uniformly distributed populations?

Instead of distance, use rank as fundamental quantity.

rankA(B) := |{C : d(A, C) < d(A, B)}|

How many people live closer to A than B does?

Rank-Based Friendship : Pr[A is a friend of B] ∝ 1/rankA(B).


Probability of friendship ∝ 1/(number of closer candidates)
c
b c
b cb
bcb cb
c b cbcb c b cb
c b
c b
c b c b
c cb
cb c b
c b c bcbc bc bc bcbcbcbc bc bc bc bcbc bc
bc bcbcbcbcbcbcbcbcbcbcbcbc bcbcbcbcbcbc bcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbc bcbcbcbc bc bcbcbcbcbcbcbc bcbc bc bcbcbc
cc bc b cb c b cb cb
cbc
cb
bb
c b cbcb
cb cb
b c
b cb
bccb
b cbc bcbcbc bc
b
bc bcbc bcbc
cbbcbc bcbc bcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbc bcbcbcbcbcbcbcbcbc
bc
bb
cc c b
b c cb
b c
b c
bc
b c
c
b c
b c
b c
bc
b c
b c
b bcbc bcbc bcbcbcbc bc bc bcbcbcbcbcbc bcbc bc
c
b c b
b c b
b c c
b
c
b c
c
b
cb
b cb
cb
c
b cb
b c c
b bcbcbcbc bcbcbcbcbcbcbcbcbcbcbc bcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbc bcbcbcbcbc bcbc bcbcbcbcbcbcbc bcbcbcbcbcbc
c
b bb
cc c
b cb
b cc
b cb
bcb c b cb
cb cb
b cb
cb
b c bc bc bcbc bcbcbcbcbcbc bcbc bcbcbc bcbcbc bcbcbcbcbc bcbcbcbc bc
c
b c
b cbc
bc b
cb cb cb cb
cb
b
cb cb
cb c b cbcb
cb cc b ccbcb c b c bc bc bcbc bc bc
bcbcbcbc bcbc bcbcbcbcbcbcbcbcbcbcbcbcbc bc bcbcbcbcbc bcbcbcbcbcbcbcbcbcbcbcbc bcbcbcbcbcbcbc bcbc bcbcbc
bc bc
b
c c c
b c
b c bcbcbcbcbcbc bc bc
cb
b cb
cb c c
b cb
b c
b
cb c
b
cb
cc
b cb
bc c
b b bcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbc bcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbc
c c
b b c b cc b cb cb
cb
c b c b c cb
bc
c bc bcbc bc bc
cb
c cbc b c b cb b bcbcbcbcbcbcbcbcbcbcbcbc bcbcbcbc bcbc bcbcbcbc bcbcbcbcbcbcbcbcbcbcbcbc bc bcbcbcbcbcbcbcbcbcbcbcbcbcbc bcbc
b c
b
b
c
b
cb
b cb c b cb c b
b c bc bcbc bcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbc bcbc bcbcbcbcbcbcbc bc bcbcbcbc
cb
b c
b cb
b c
b cb
bc b c
b c b
b c b c
cc b
b c c c
b bc bcbcbc bcbc bc bc bc bcbcbcbc bc bcbcbc bc bc bc
c
b c c
b cc cc b cb
b c cb
cb c b
b cb
b c bc bcbcbcbcbcbc bcbc bc
c
b c
b cb
b b
c
b cb c
b c
b c
b c
b c
b c
bc
bc
b c
b c
b
bc
bcbcbcbc bc bcbcbcbcbcbcbcbc bcbcbcbcbcbcbcbcbcbcbcbcbcbcbcbc bcbcbcbc bcbcbcbcbcbcbc bcbcbcbcbcbcbcbcbcbcbc
cb
b c b
b c b
cb
cb cc b cb
b bc bcbc bcbcbc bc bcbcbcbcbcbcbcbcbcbcbcbcbc bc bcbc bcbcbcbcbcbcbcbcbcbcbcbc
c b cb cbcbc c b
b cb cc b
b c cb
b cb
bcb cb
cb cbcb cb
bcbcb
cb c bcbcbc bc
cb c
b c
b c
b c
b c
b c
b c
b c
b c
b c
b bcbcbc
c b
b c cc b
cb cb
c b cb
ccb cb
cb ccbccb c
b
bcbcbcbc bc bcbcbc bc bc bcbc bc bc bc c
b c
b c
b
c
b
c
b cb
b cb c
b c
b cb
b c cb cb cb
c b cb ccb
b c
b c
b
bc bcbcbcbcbcbcbcbcbcbcbcbcbcbc bcbcbcbcbcbcbcbc bc bcbcbcbcbc bcbcbcbcbcbcbc bcbcbcbc bcbc bcbcbc
c
b c
b cbcb cb
c
bc b
cb cb
c c
b
cb
b cc c
cbcbc bc bc bcbc bc bc bcbc bc bc bcbc bcbc bcbcbcbcbcbc bcbcbcbcbc
c
b c
b c
b b c
b cb
b c b
c b
b ccbc b cb b c
b bc bcbc bcbc bc bcbc bcbc bc bc bcbc bc bc

30
Relating Rank and Distance
Rank-Based Friendship: Pr[A is a friend of B] ∝ 1/rankA(B).

Kleinberg (k-dim grid): Pr[A is a friend of B] ∝ 1/d(A, B)k .

Uniform k-dimensional grid:

radius-d ball volume ≈ dk


1/rank ≈ 1/dk

For a uniform grid, rank-based friendship


has (essentially) same link probabilities as Kleinberg.

31
Population Networks

A rank-based population network consists of:


a k-dimensional grid L of locations.

a population P of people, living at points in L (n := |P |).

a set E ⊆ P × P of friendships:
one edge from each person in each ‘direction’
one edge from each person, chosen by rank-based friendship

e.g.,
locations rounded
to the nearest
integral point in
longitude/latitude.

32
The Real Theorem

Theorem: For any n-person rank-based population network


in a k-dimensional grid, k = Θ(1), for any source s ∈ P
and for a randomly chosen target t ∈ P ,
the expected length (over t) of Greedy (s, loc(t)) is O(log3 n).

Intuition: difficulty of halving distance to isolated target t


is canceled by low probability of choosing t.

Real theorem: not just for grids.


(use doubling dimension of metric space instead of k).

33
Short Paths and Rank-Based Friendships

Theorem: For any n-person population network in a k-dim grid,


for any source s ∈ P and a randomly chosen target t ∈ P ,
the expected length (over t) of Greedy (s, t) is O(log3 n).

Theorem [Kleinberg 2000]: For any n-person uniform-density


population network, any source s, and any target t,
the length of Greedy (s, t) is O(log2 n) with high probability.

Lose: expectation (not whp).


Lose: another log factor.
Gain: arbitrary population densities.
Gain? holds in real networks?

34
Ranks and Friendships in LiveJournal
1e-01

1e-02
link probability

1e-03

1e-04

1e-05

1e-06

1e-07
1 10 100 1000 10000 100000
rank

shows Pru[u is friends with the v : ranku(v) = r]

35
Ranks and Friendships in LiveJournal
1e-01

1e-02 r −1.2 + ε
link probability

1e-03

1e-04

1e-05

1e-06 r −1.15

1e-07
1 10 100 1000 10000 100000
rank

shows Pru[u is friends with the v : ranku(v) = r]


very close to 1/r, as required for rank-based friendship!
again, must correct for nongeographic friends.

35
Ranks and Friendships in LiveJournal
1e-01
1e-01 1e-02

link probability
1e-02 1e-03

1e-04

1e-03 1e-05
link probability minus ε

1e-06

1e-04 1e-07
1 10 100 1000 10000 100000
rank

1e-05
1e-06 1e-01
1e-02

1e-07 1e-03

link probability minus ε


1e-04
1e-05

1e-08 1e-06
1e-07
1e-08

1e-09 r −1.25
1e-09
1e-10
1e-11

1e-10 1 10 100 1000 10000 100000


rank

1e-11
1 10 100 1000 10000 100000
rank
shows probability of rank-r friendship, less ε = 5.0 × 10−6.

36
Ranks and Friendships in LiveJournal
1e-01
1e-01 1e-02

link probability
1e-02 1e-03

1e-04

1e-03 1e-05
link probability minus ε

1e-06

1e-04 1e-07
1 10 100 1000 10000 100000
rank

1e-05
1e-06 1e-01
1e-02

1e-07 1e-03

link probability minus ε


1e-04
1e-05

1e-08 1e-06
1e-07
1e-08

1e-09 r −1.25
1e-09
1e-10
1e-11

1e-10 1 10 100 1000 10000 100000


rank

1e-11
1 10 100 1000 10000 100000
rank
shows probability of rank-r friendship, less ε = 5.0 × 10−6.
LJ “location resolution” is city-only.
average u’s ranks {r, . . . , r + 1300} are in the same city
⇒ we’ll average probabilities over ranks {r, . . . , r + 1300}

36
Ranks and Friendships in LiveJournal
1e-01
1e-01 1e-02

link probability
1e-03
link probability minus ε, averaged

1e-02 1e-04

1e-05

1e-03 1e-06

1e-07
1 10 100 1000 10000 100000

1e-04 rank

1e-05 1e-01
1e-02
1e-03
1e-06

link probability minus ε


1e-04
1e-05
1e-06

1e-07 1e-07
1e-08

r −1.30
1e-09
1e-10

1e-08 1e-11
1 10 100 1000 10000 100000
rank

1e-09
1 10 100 1000 10000 100000 1e-01

link probability minus ε, averaged


1e-02
rank 1e-03
1e-04
1e-05
1e-06
1e-07
1e-08
1e-09
1 10 100 1000 10000 100000
rank
still very close to 1/r
(though not absolutely perfect)

37
X Why to think algorithmically about social networks
X The small-world phenomenon
X Models of the navigable small world
Some conclusions and musings
Geographic/Nongeographic Friendships
1e-03 1e-03

link probability minus ε


1e-04 1e-04

link probability
1e-05 1e-05

1e-06 1e-06

1e-07 1e-07
10 km 100 km 1000 km 10 km 100 km 1000 km
distance distance

good estimate of friendship probability:


or Pr[u → v] ≈ ε + f (d(u, v)) for ε ≈ 5.0 × 10−6.
‘ε friends’ (nongeographic) ‘f (d) friends’ (geographic).

LJ: E[number of u’s “ε” friends] = ε · 500,000 ≈ 2.5.


LJ: average degree ≈ 8.

∼5.5/8 ≈ 66% of LJ friendships are “geographic,” 33% are not.

38
A Puzzle to Ponder

This talk:
66% of LJ friendships form through geographic processes.
Geography is very important, even in a virtual community.
The ∼5.5 ∼rank-based geographic friends per person
suffice to account for the small world.

Not in this talk:


Why should networking people care?

Also not in this talk:


And what about the other 2.5 nongeographic friends?
But why does geography matter so much?
‘virtualized’ real-world friendships?
39
Routing Choices

In real life, many ways to choose a next step when searching!

Geography: greedily route based on distance to t.


Occupation: ≈ greedily route based on distance
in the (implicit) hierarchy of occupations.
Age, hobbies, alma mater, ...
Popularity: choose people with high outdegree.
[Adamic Lukose Puniyani Huberman 2001]
[Kim Yoon Han Jeong 2002] ...

What does ’closest’ mean in real life?


How do you weight various ‘proximities’ ?

minimum over all proximities? [Dodds Watts Newman 2002]


a more complicated combination?
40
A Puzzle to Ponder

This talk:
66% of LJ friendships form through geographic processes.
Geography is very important, even in a virtual community.
The ∼5.5 ∼rank-based geographic friends per person
suffice to account for the small world.

Not in this talk:


Why should networking people care?

Also not in this talk:


And what about the other 2.5 nongeographic friends?
But why does geography matter so much?
‘virtualized’ real-world friendships?
41
Open Directions

A half-sociological, half-computational question ...

Why should rank-based friendship hold, even approximately?


Are there natural processes that generate it?

E.g., a generative process based on “geographic interests”?

42
Thank you!

David Liben-Nowell
dlibenno@carleton.edu
http://cs.carleton.edu/faculty/dlibenno
Some references
Milgram 1967
“The Small World Problem.” Psychology Today.

Kleinberg 2000
“The Small-World Phenomenon: An Algorithmic Perspective.” STOC’00.

Dodds Muhamad Watts 2003


“An Experimental Study of Search in Global Social Networks.” Science.

Liben-Nowell Novak Kumar Raghavan Tomkins 2005


“Geographic Routing in Social Networks.” Proc. Natl. Acad. Sci.

Kumar Liben-Nowell Tomkins 2006


“Navigating Low-Dimensional and Hierarchical Population Networks.” ESA’06.

Adamic Adar 2005


“How to search a social network.” Social Networks.

43

S-ar putea să vă placă și