Pre-requisites
DataNode Pre-requisites
Configuration
Configuration(2)
In core-site.xml, for the NameNode address,
instead of using localhost, change it to the IP address or hostname of the machine.
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hdfstmp</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://masternode-hostname:54310</value>
  </property>
</configuration>
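The hostname used in fs.default.name must resolve on every node of the cluster, or the slaves will not be able to reach the NameNode. A quick check (masternode-hostname is the same placeholder as in the XML above):

```shell
# Run on the master and on each slave; if the lookup fails,
# add the master's entry to /etc/hosts (or fix DNS).
getent hosts masternode-hostname || echo "add masternode-hostname to /etc/hosts"
```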
Configuration(3)
Similarly, in mapred-site.xml, for the JobTracker
address, instead of using localhost, change it to
the IP address or hostname of the machine.
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>jobtracker-hostname:54311</value>
  </property>
</configuration>
Configuration(4)
In the slaves file, replace localhost with the
machines where we want to run the slaves,
providing the hostname of each slave on its own line.
The master needs to know all the slaves, so that it
can start the required daemons remotely on them.
hostname-of-slave-1
hostname-of-slave-2
hostname-of-slave-3
Share Configurations
To each of the slaves we transfer the same set of configuration
files (the XML configuration files alone are enough) through secure copy
(scp). We give this to every slave, since each slave needs to
know the master, so that it can send its block reports and
heartbeats.
$ cd /data/hadoop/conf
$ scp *.xml slave-ip-address:/data/hadoop/conf
Run the above command on the master machine, once for each
slave. (Note: ensure that you are in the directory containing those
config XMLs.)
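Rather than typing the scp once per slave, the copy can be scripted by reading the slaves file itself. A minimal sketch, assuming the /data/hadoop/conf layout used above; the scp command is only echoed (dry run), so drop the `echo` to actually copy:

```shell
#!/bin/sh
# Sketch: push the XML configs to every slave listed in the slaves file.
# Dry run: each scp command is printed, not executed.
push_confs() {
  conf_dir="$1"                   # e.g. /data/hadoop/conf
  while read -r slave; do
    [ -z "$slave" ] && continue   # skip blank lines
    echo scp "$conf_dir"/*.xml "$slave:$conf_dir"
  done < "$conf_dir/slaves"
}
```

Usage on the master: `push_confs /data/hadoop/conf`.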
You can log in to each of the slaves and verify that the files were
transferred successfully.
Start Cluster
On the master node, we start the cluster.
$ cd /data/hadoop
$ bin/start-all.sh
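After start-all.sh returns, jps on the master should list the master-side daemons (names as in Hadoop 1.x, which the scripts and config files in this guide correspond to):

```shell
$ jps
NameNode
SecondaryNameNode
JobTracker
```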
Start Cluster(2)
On each of the slave nodes, we can see that the
DataNode and the TaskTracker daemons are
running.
$ jps
DataNode
TaskTracker
Enough, huh?
Now, we need to see how to scale out the cluster.
SCALING OUT
EXISTING CLUSTER
[Diagram: the MASTER with its existing slaves S1, S2, and S3; a new slave S4 is added, and the slaves file gains its hostname.]

slaves:
hostname-of-slave-1
hostname-of-slave-2
hostname-of-slave-3
hostname-of-new-slave
$ cd /data/hadoop/conf
$ scp *.xml new-slave-ip-address:/data/hadoop/conf
Start Daemons
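On the new slave itself, the slave-side daemons can be brought up individually, without restarting the whole cluster. A sketch using the Hadoop 1.x per-daemon script, assuming the same /data/hadoop install path as above:

```shell
# Run on the new slave.
cd /data/hadoop
bin/hadoop-daemon.sh start datanode     # joins HDFS, reports to the NameNode
bin/hadoop-daemon.sh start tasktracker  # joins MapReduce, reports to the JobTracker

# Optionally, run the balancer on the master afterwards so existing
# blocks are spread onto the new DataNode:
# bin/start-balancer.sh
```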
Scaled out