
Performance Analysis of Hadoop Link Prediction

Yuxiao Dong ydong1@nd.edu

Casey Robinson crobins9@nd.edu

Jian Xu jxu5@nd.edu

Introduction

[Figure: motivating examples from Facebook and Twitter — for a given user, which of the candidate links (marked "?") will form, and which (marked "X") will not?]

Problem Statement
In a network G = (V, E, X), for a particular user v_s and a set of candidates C to which v_s may create a link, find a predictive function f : (V, E, X, v_s, C) → Y, where Y = {y_1, y_2, ..., y_|C|} is the set of inferred results for whether user v_s would create links with the users in C.
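As one concrete reading of this definition, the sketch below frames f as a small Java interface. The Network and LinkPredictor types, their method names, and the Boolean labels are illustrative assumptions only; the slides do not define an API.

import java.util.List;
import java.util.Set;

// Illustrative types only, not from the slides.
interface Network<V> {                      // G = (V, E, X)
    Set<V> vertices();                      // V
    Set<V> neighbors(V v);                  // E, stored as adjacency lists
    double[] attributes(V v);               // X, per-vertex attributes
}

// f : (V, E, X, v_s, C) -> Y, one inferred result per candidate in C.
interface LinkPredictor<V> {
    List<Boolean> predict(Network<V> g, V source, List<V> candidates);
}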

Challenges
Real networks are large:
- > 1 billion users on Facebook (Oct. 2012)
- > 500 million users on Twitter (Jul. 2012)
- > 175 million users on LinkedIn (Jun. 2012)
Big data makes prediction even slower.

Our Solution
- Divide the problem into smaller problems
- Represent the graph as adjacency lists suitable for MapReduce
- Use distributed computing: Hadoop on the Data Intensive Science Cluster

[Figure: standard MapReduce data flow — input splits → map → sort → merge → reduce → output parts]

Link Prediction Framework


[Figure: link prediction framework — Prepare (vertex count, edge count, split data to build the probe set, degree statistics) → AdjList → LP Score → Probe Score and Non-Existent Score → AUC]
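To make the pipeline concrete, here is a minimal Hadoop driver sketch that chains the stages as separate MapReduce jobs, each reading the previous stage's output directory. The stage names, paths, and the use of identity map/reduce are placeholder assumptions; the framework's actual job classes are not shown in the slides.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class LinkPredictionDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder stage names mirroring the framework diagram.
        String[] stages = {"adjlist", "lp-score", "probe-score", "auc"};
        Path input = new Path(args[0]);              // raw edge list
        for (String stage : stages) {
            Job job = Job.getInstance(conf, "link prediction: " + stage);
            job.setJarByClass(LinkPredictionDriver.class);
            // Each real stage would set its own Mapper/Reducer classes here;
            // this sketch leaves the default identity map/reduce in place.
            Path output = new Path(args[1] + "/" + stage);
            FileInputFormat.addInputPath(job, input);
            FileOutputFormat.setOutputPath(job, output);
            if (!job.waitForCompletion(true)) {
                System.exit(1);                      // stop the chain on failure
            }
            input = output;                          // next stage reads this output
        }
    }
}

Running the stages as independent jobs also lets each stage choose its own number of reducers.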

Algorithm Design
[Figure: worked example on a 7-vertex graph — the first Mapper/Reducer pass turns the edge list into adjacency lists; the second Mapper/Reducer pass emits each candidate pair together with the shared neighbor that generated it, and the reducer merges these records into common-neighbor lists (e.g. pair 3,4 shares neighbors 1 and 2)]
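Below is a minimal Hadoop sketch of the second stage (not the authors' code). It assumes the first job has already written one tab-separated adjacency list per line, e.g. "1<TAB>2,3,4", and that a candidate pair is scored by its number of common neighbors. Class names and I/O paths are illustrative.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CommonNeighborScore {

    // Input line: "v<TAB>n1,n2,..." — v is a shared neighbor of every
    // pair (n_i, n_j) drawn from its adjacency list.
    public static class PairMapper extends Mapper<Object, Text, Text, Text> {
        private final Text pair = new Text();
        private final Text shared = new Text();

        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] parts = value.toString().split("\t");
            if (parts.length < 2) return;
            shared.set(parts[0]);
            String[] nbrs = parts[1].split(",");
            for (int i = 0; i < nbrs.length; i++) {
                for (int j = i + 1; j < nbrs.length; j++) {
                    String a = nbrs[i].trim(), b = nbrs[j].trim();
                    if (a.isEmpty() || b.isEmpty()) continue;
                    // Normalize so (u,w) and (w,u) reach the same reducer.
                    pair.set(Long.parseLong(a) < Long.parseLong(b)
                            ? a + "," + b : b + "," + a);
                    context.write(pair, shared);
                }
            }
        }
    }

    // Merge the shared-neighbor records for a pair and count them.
    public static class ScoreReducer extends Reducer<Text, Text, Text, IntWritable> {
        private final IntWritable score = new IntWritable();

        @Override
        protected void reduce(Text pair, Iterable<Text> sharedNeighbors, Context context)
                throws IOException, InterruptedException {
            int count = 0;
            for (Text ignored : sharedNeighbors) count++;   // common-neighbor count
            score.set(count);
            context.write(pair, score);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "common neighbor scores");
        job.setJarByClass(CommonNeighborScore.class);
        job.setMapperClass(PairMapper.class);
        job.setReducerClass(ScoreReducer.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Normalizing the pair key mirrors the merge step in the worked example, where the records (3,4,1) and (3,4,2) end up at the same reducer.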

Data Sets
Name           Nodes       Edges        Relative Size
HepPh          12,008      237,010      1x
ND Web         325,729     1,497,134    7.14x
Live Journal   4,847,571   68,993,773   357.78x

Approach

- Black Box: vary the number of reducers and the data size
- Time Breakdown: which step(s) dominate the running time?

[Figure: per-step running time as a percentage of the total for HEP Ph, ND Web, and Live Journal]

Resource Monitoring

Where are the bottlenecks?

Machine Specifications

- 26 nodes
- 32 GB RAM
- 12x 2 TB SATA disks (4 dedicated to Hadoop storage)
- 2x 8-core Intel Xeon E5620 CPUs @ 2.40 GHz
- Gigabit Ethernet

Monitoring Tools
Resource    Command
CPU         iostat -c 1
Disk        iostat -d 1
Network     netstat -c -I

Monitoring Implementation
# Start iostat on every cluster node before submitting the Hadoop job.
for q in $(seq -w 1 26); do
    ./ssh.exp disc$q.crc.nd.edu crobins9 $p
    date >> /tmp/cpu.out
    (iostat -c 1 >> /tmp/cpu.out) &
done

# Submit the link prediction job and wait for it to finish.

# Stop the monitors on every node.
for q in $(seq -w 1 26); do
    ./ssh.exp disc$q.crc.nd.edu crobins9 $p
    ps aux | grep iostat | awk '{print $2}' | xargs kill -9
done

# Collect the logs from every node.
for q in $(seq -w 1 26); do
    ./scp.exp disc$q.crc.nd.edu crobins9 $p
done

CPU

[Figure: cluster CPU usage (%) over the roughly 7000 s run]

Disk

[Figure: disk blocks read and written (in units of 1k blocks) over the roughly 7000 s run, with the LP Score and AUC phases marked]

Network

[Figure: network data received and sent (Mb/s) over the roughly 7000 s run, with the LP Score and AUC phases marked]

Conclusions and Future Improvements

Sampling-based AUC estimation:

    // Estimate AUC by sampling: compare the score of a randomly chosen link
    // from one set against the score of a randomly chosen link from the other.
    int n = 13000000;
    double[] left = new double[n];     // scores of one comparison set (e.g. probe links)
    double[] right = new double[n];    // scores of the other set (e.g. non-existent links)
    Random rand = new Random();
    Random rand1 = new Random();
    int n1 = 0, n2 = 0;
    int m = 3 * n;
    for (int i = 0; i < m; i++) {
        int index1 = rand.nextInt(n);
        int index2 = rand1.nextInt(n);
        double leftScore = left[index1];
        double rightScore = right[index2];
        if (leftScore > rightScore) {
            n1++;                      // left sample scored strictly higher
        } else if (Math.abs(leftScore - rightScore) < 1E-6) {
            n2++;                      // tie
        }
    }
    double AUC = (n1 + 0.5 * n2) / m;

Some Conclusions

- Hadoop is useful once the data grows beyond roughly 1 GB
- About 6 reducers worked well; prefer multiple jobs with fewer reducers each
