
Agenda

Introduction to HIHO
Hadoop In
Hadoop Out
Next Steps
Questions

HUG Noida July 2010 © Nube Technologies, 2010


What is Hadoop?
Framework for distributed processing
Supports Map Reduce programming
Add more machines, get more computing power



HIHO

A framework to connect Hadoop to data sources and sinks
Open source, Apache Licensed
Hosted at http://code.google.com/p/hiho



HIHO Data Flow

[Diagram: HIHO sits between the Data Source and Hadoop, carrying Data In and Data Out]



Hadoop In

Specify connection parameters
Specify a query to fetch data from the database
Tell the framework how fetching is to be split across Hadoop mappers
Choose the format in which the data will reside in Hadoop
Run

Hadoop In - Concept
Input Query
select employee.id, employee.name, employee.salary, designations.designation
from employee, designations
where employee.designationId = designations.id and employee.isMarried = ?
AND $CONDITIONS



Hadoop In - Concept
Split Column: the column on which the fetch is split across mappers, for example employee.id

Bounding Query:
select min(id), max(id) from employee
Actual number of mappers to use
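
The split idea above can be sketched in a few lines. This is only an illustration, not HIHO's actual code: it assumes a numeric split column, assumes the bounding query has already returned its minimum and maximum, and assumes the $CONDITIONS placeholder is replaced by a range predicate for each mapper. The function names `split_ranges` and `mapper_queries` are hypothetical.

```python
from typing import List, Tuple


def split_ranges(lo: int, hi: int, num_mappers: int) -> List[Tuple[int, int]]:
    """Divide the inclusive range [lo, hi] into at most num_mappers
    contiguous, non-overlapping inclusive ranges of near-equal size."""
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    if hi < lo:
        return []
    total = hi - lo + 1
    # Never create more ranges than there are distinct values.
    n = min(num_mappers, total)
    base, extra = divmod(total, n)
    ranges = []
    start = lo
    for i in range(n):
        # The first `extra` ranges take one additional value each.
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def mapper_queries(query: str, split_column: str,
                   lo: int, hi: int, num_mappers: int) -> List[str]:
    """Return one query per mapper, with $CONDITIONS replaced by a
    range predicate on the split column (an assumed substitution)."""
    return [
        query.replace(
            "$CONDITIONS",
            f"{split_column} >= {start} AND {split_column} <= {end}",
        )
        for start, end in split_ranges(lo, hi, num_mappers)
    ]
```

With bounds 1 and 10 and three mappers, this sketch yields the ranges (1, 4), (5, 7) and (8, 10), so each mapper fetches a disjoint slice of the table.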



Hadoop In - Format

Delimited, choice of delimiter


Avro
Can plug in your own format too
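
As a rough illustration of the delimited option, the sketch below turns fetched rows into delimited text with a caller-chosen delimiter. The function name `to_delimited` is hypothetical, and the sketch deliberately does no escaping, so a field that contains the delimiter would corrupt the record.

```python
from typing import Iterable, Sequence


def to_delimited(rows: Iterable[Sequence[object]], delimiter: str = ",") -> str:
    """Render each row as one line of text, joining its fields with the
    chosen delimiter. No escaping is applied to the field values."""
    return "".join(
        delimiter.join(str(value) for value in row) + "\n"
        for row in rows
    )
```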



Hadoop Out - Sending data from Hadoop to a database
Text files
Uses MySQL extensions
Entire file loaded in one go



Hadoop Out - Configuration

Database connection
Name of table
File format – fields separated by, lines terminated by
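
The slides do not name the MySQL extension used. Assuming it is MySQL's LOAD DATA INFILE statement, which loads a whole text file in one statement and takes exactly these field and line settings, the three configuration items could be combined as in the sketch below. The helper names and the use of the LOCAL keyword are assumptions, not taken from HIHO.

```python
import re

# Characters that must be backslash-escaped inside a MySQL string literal.
_ESCAPES = {"\\": "\\\\", "'": "\\'", "\n": "\\n", "\t": "\\t", "\r": "\\r"}


def sql_literal(value: str) -> str:
    """Quote a Python string as a single-quoted MySQL string literal."""
    return "'" + "".join(_ESCAPES.get(ch, ch) for ch in value) + "'"


def build_load_statement(path: str, table: str,
                         fields_separated_by: str = ",",
                         lines_terminated_by: str = "\n") -> str:
    """Build a LOAD DATA LOCAL INFILE statement for one text file.
    Hypothetical helper: only plain identifiers are accepted as table names."""
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError(f"unsupported table name: {table!r}")
    return (
        f"LOAD DATA LOCAL INFILE {sql_literal(path)} "
        f"INTO TABLE `{table}` "
        f"FIELDS TERMINATED BY {sql_literal(fields_separated_by)} "
        f"LINES TERMINATED BY {sql_literal(lines_terminated_by)}"
    )
```

The resulting string would then be executed over the configured database connection. Restricting the table name to a plain identifier keeps the sketch from having to handle identifier quoting.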



Next Steps

Building a community
Escaping delimited text
Write to more databases
Use the web as a data source
Integration with HBase
Integration with Hive



Questions

