
Single Node Hadoop 2.x Setup on CentOS
Step by Step Guide
Presented By Big Data Developer

Table of Contents
1. Installing Oracle Virtual Box
2. Downloading CentOS
3. Steps to Install CentOS
4. Adding More Users in the CentOS
5. Starting NameNode, DataNode, ResourceManager, NodeManager and JobHistoryServer
5.1 Starting NameNode
5.2 Starting DataNode
5.3 Starting ResourceManager
5.4 Starting NodeManager
5.5 Starting JobHistoryServer
6. Check Health of Daemons


1. Installing Oracle Virtual Box


Download and install the Oracle Virtual Box from the below link.
Link: https://drive.google.com/file/d/0Bxr27gVaXO5sRXdxQVpEUmhCZ3c/view?
usp=sharing

2. Downloading CentOS
Download and install CentOS from the below link:
Link: https://drive.google.com/file/d/0Bxr27gVaXO5sRU0yVFVQM0FvLU0/view?
usp=sharing

Note: The above step downloads CentOS as a compressed file. Unzip it with any
archiving tool by right-clicking the file and selecting the option Extract Here.

3. Steps to Install CentOS:


1. Click on the New option and then enter the below details as shown in the
screenshot.
Name: Type in any name for your VM.
Type: Select Linux from the drop-down list.
Version: Select Other Linux (64-bit) from the drop-down list.


2. Click Next.
On clicking Next, a prompt appears to set the RAM size for the VM.
3. Increase the RAM up to 4096 MB if the system has 8 GB of RAM, and up to
2048 MB if it has 4 GB.


4. Click on Next to get the hard disk selection option; choose the third option,
i.e., Use an existing virtual hard disk file.
5. Click on the folder icon to browse to the location where the unzipped file of
CentOS is kept.


6. Select the imported VM and click on the Start button to start the VM.


On starting the VM, a prompt appears asking for the login credentials.


7. Type username and password as follows:
User name: tom
Password: tomtom

8. Open the terminal and log in as the root user to get administrator permissions.
Type the password as follows:
Password: tomtom


4. Adding More Users in the CentOS:


1. Add more users in the CentOS by using the command adduser acadgild.

2. Set the password of the added user by using the command passwd acadgild.
Refer to the below screenshot.

Note: Enter any password for the created user.


3. Disable the firewall in the CentOS using the below command:
service iptables stop
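Note that service iptables stop only stops the firewall for the current session. If you also want it to stay off after a reboot (a common choice for a local learning VM), chkconfig handles that on CentOS 6:

```shell
# Stop iptables for the current session
sudo service iptables stop
# Prevent iptables from starting again at boot (CentOS 6 init scripts)
sudo chkconfig iptables off
```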

4. Add the user acadgild into the sudoers file to give administrator rights to the
created user.


5. Type the command visudo from root and add the users as shown in the
screenshot.

Note: To type in the above file, enter insert mode by pressing i on the keyboard,
add the users to the sudoers file, then press the Esc key to leave insert mode
and type :wq to save and exit.
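The screenshot is not reproduced here, but the entry added via visudo usually looks like the following (the root line is already present in the file; the acadgild line is the one you add):

```text
root      ALL=(ALL)       ALL
acadgild  ALL=(ALL)       ALL
```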
Reboot the machine and then login to the created user.
1. Use the below link to download the JDK in the VM using the browser present in
CentOS.
Link: http://www.oracle.com/technetwork/java/javase/downloads/jdk8-
downloads-2133151.html
On clicking the above link, a screen prompts for selecting the required version.
2. Select the option shown with the red colored arrow symbol.

On clicking the above option, download will start and get saved in Downloads
folder.


3. Move the above file into the /home/acadgild directory using the mv command.

4. Untar the JDK archive by using the command shown in the screenshot.

5. Enter the command ls to see the extracted jdk in the same folder
/home/acadgild.
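Assuming the archive downloaded in step 2 is named something like jdk-8u65-linux-x64.tar.gz (the exact name depends on the version you picked), steps 3 to 5 amount to:

```shell
# Move the archive from Downloads into the home directory
mv ~/Downloads/jdk-8u*-linux-x64.tar.gz /home/acadgild/
cd /home/acadgild
# Extract it; this creates a directory such as jdk1.8.0_65
tar -xzf jdk-8u*-linux-x64.tar.gz
# Confirm the extracted JDK directory is present
ls
```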

6. Download the hadoop file using the below link and then copy the file from
Downloads folder to /home/acadgild directory.
Link: http://mirrors.sonic.net/apache/hadoop/common/hadoop-2.6.0/
On clicking the above link, the below screen will prompt you to select a file.
7. Select the file with tar.gz extension.


8. Untar the downloaded Hadoop file using the below command; refer to the below
screenshot.
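The extraction follows the same pattern as the JDK (this assumes the archive was moved from Downloads to /home/acadgild first):

```shell
cd /home/acadgild
# Extract Hadoop; this creates the hadoop-2.6.0/ directory
tar -xzf hadoop-2.6.0.tar.gz
```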

9. Update the .bashrc file with the required environment variables, including the
Java and Hadoop paths.
Note: Update the paths to the ones present in your system.
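The screenshot's contents are not reproduced here; a typical set of .bashrc additions looks like the following (the JDK directory name jdk1.8.0_65 is an example; use whatever ls showed after extraction):

```shell
# Example ~/.bashrc additions for a single-node Hadoop setup
export JAVA_HOME=/home/acadgild/jdk1.8.0_65      # example path; match your JDK
export HADOOP_HOME=/home/acadgild/hadoop-2.6.0
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```

After editing, run source ~/.bashrc (or open a new terminal) so the variables take effect.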


10. Create two directories to store NameNode metadata and DataNode blocks as
shown below:
mkdir -p $HOME/hadoop/namenode
mkdir -p $HOME/hadoop/datanode

Note: Change the permissions of the directory to 755.


chmod 755 $HOME/hadoop/namenode
chmod 755 $HOME/hadoop/datanode
11. Change the directory to the location where hadoop is installed.


12. Open hadoop-env.sh and add the Java home path and Hadoop home path
in it.


Note: Update the Java version to the one present in your system; in our case the
version is 1.8.
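For reference, the lines added to hadoop-env.sh typically look like this (the paths are examples; match your own layout):

```shell
# In hadoop-2.6.0/etc/hadoop/hadoop-env.sh (example paths)
export JAVA_HOME=/home/acadgild/jdk1.8.0_65
export HADOOP_HOME=/home/acadgild/hadoop-2.6.0
```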
13. Open core-site.xml using the below command from the path shown in the
screenshot.

14. Add the below properties in between the configuration tags of core-site.xml.
(fs.default.name is the older name for this property; in Hadoop 2.x fs.defaultFS
is preferred, though both are accepted.)

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>


15. Open the hdfs-site.xml and add the following lines in between configuration
tags.

<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/acadgild/hadoop/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/acadgild/hadoop/datanode</value>
</property>
</configuration>


16. Open yarn-site.xml and add the following lines in between the
configuration tags.

<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>


17. Copy the mapred-site.xml template into mapred-site.xml and then add the
following properties as shown in the screenshot.
cp mapred-site.xml.template mapred-site.xml

<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>


18. Log in as the root user and then install the openssh server in CentOS; refer
to the below screenshot. (On CentOS 6 the usual command is yum install openssh-server.)

19. Generate ssh key for hadoop user.


Command: ssh-keygen -t rsa
Refer the below screenshot.
Note: Press Enter after typing the command ssh-keygen -t rsa, and press Enter
again when it asks for the file in which to save the key and for the
passphrase.


20. Copy the public key from the .ssh directory into the authorized_keys file.
Change the directory to .ssh and then type the below command to copy the key
into the authorized_keys file.


21. Type the command ls to check whether the authorized_keys file has been
created.

22. To verify that the key has been copied, type the command
cat authorized_keys

23. Change the permissions of the authorized_keys file.


chmod 600 .ssh/authorized_keys
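Steps 19 to 23 can be sketched end to end. The block below runs in a scratch directory so it is safe to try anywhere; on the VM the directory would be ~/.ssh and the key files keep the same names:

```shell
# Scratch directory standing in for ~/.ssh (assumption, for safe demonstration)
DIR=$(mktemp -d)
# Generate an RSA key pair non-interactively (-N "" gives an empty passphrase)
ssh-keygen -t rsa -N "" -f "$DIR/id_rsa" -q
# authorized_keys is a file, not a folder: append the public key to it
cat "$DIR/id_rsa.pub" >> "$DIR/authorized_keys"
# Restrict permissions, as sshd requires
chmod 600 "$DIR/authorized_keys"
ls "$DIR"
```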


24. Start (or restart) the sshd service by typing the below command.


Command: sudo service sshd start

25. To start all the daemons, follow the below steps:


1. Format the NameNode:
Command: hadoop namenode -format
(In Hadoop 2.x this command is deprecated in favor of hdfs namenode -format;
both still work.)

5. Starting NameNode, DataNode, ResourceManager, NodeManager and JobHistoryServer
NOTE: Change the directory to the sbin directory of hadoop before starting the daemons.
5.1 Starting NameNode
Change the directory to the location of hadoop:
Command: cd hadoop-2.6.0/sbin
Now type the below command to start the NameNode:
Command: ./hadoop-daemon.sh start namenode


5.2 Starting DataNode


Command: ./hadoop-daemon.sh start datanode

5.3 Starting ResourceManager


Command: ./yarn-daemon.sh start resourcemanager

5.4 Starting NodeManager


Command: ./yarn-daemon.sh start nodemanager

5.5 Starting JobHistoryServer


Command: ./mr-jobhistory-daemon.sh start historyserver
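The five start commands from this section, in one place (run from the sbin directory):

```shell
cd ~/hadoop-2.6.0/sbin
./hadoop-daemon.sh start namenode
./hadoop-daemon.sh start datanode
./yarn-daemon.sh start resourcemanager
./yarn-daemon.sh start nodemanager
./mr-jobhistory-daemon.sh start historyserver
```

As an alternative, Hadoop 2.x also ships start-dfs.sh and start-yarn.sh in the same sbin directory, which start the HDFS and YARN daemons together.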


6. Check Health of Daemons


Check whether all the daemons have started by typing the command jps (which is part of the JDK).
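If all the daemons are running, the jps output should look something like this (the process IDs will differ on your machine):

```text
2481 NameNode
2575 DataNode
2729 ResourceManager
2823 NodeManager
2915 JobHistoryServer
3001 Jps
```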

Your Hadoop Cluster is now ready!
