x Setup
on CentOS
Step by Step Guide Presented By
Big Data Developer
Table of Contents
1. Installing Oracle Virtual Box
2. Downloading CentOS
3. Steps to Install CentOS
4. Adding More Users in the CentOS
   Add more users in the CentOS by using the command
5. Starting NameNode, DataNode, ResourceManager, NodeManager and JobHistoryServer
   5.1 Starting NameNode
   5.2 Starting DataNode
   5.3 Starting ResourceManager
   5.4 Starting NodeManager
   5.5 Starting JobHistoryServer
6. Check Health of Daemons
2. Downloading CentOS
Download CentOS from the link below:
Link: https://drive.google.com/file/d/0Bxr27gVaXO5sRU0yVFVQM0FvLU0/view?usp=sharing
Note: The link downloads CentOS as a compressed file. Unzip it with any archive tool by right-clicking the file and selecting the option Extract here.
2. Click Next.
After you click Next, a prompt appears for setting the RAM size of the VM.
3. Set the RAM to 2048 MB if the host system has 4 GB of RAM, or up to 4 GB if the host system has 8 GB of RAM.
4. Click Next to reach the hard disk options and choose the third option, i.e. use an existing virtual hard drive file.
5. Click the folder icon and browse to the location where the unzipped CentOS file is kept.
6. Select the imported VM and click on the Start button to start the VM.
8. Open the terminal and log in as the root user to get administrator permissions.
1. Type the password as follows:
Password: tomtom
2. Set the password of the added user with the command passwd acadgild.
Refer to the screenshot below.
4. Add the user acadgild to the sudoers file to give the created user administrator rights.
5. Run the command visudo as root and add the user as shown in the screenshot.
Note: To edit the file, enter insert mode by pressing i on the keyboard, add the user to the sudoers file, press Esc to leave insert mode, and then type :wq to save and exit.
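A minimal sketch of the entry, assuming the goal is full sudo rights for acadgild: the rule `acadgild ALL=(ALL) ALL` is an assumption about what the guide intends, and the scratch file is only an illustration for checking the syntax before typing the line into visudo.

```shell
# Assumed entry: grants acadgild full sudo rights (the guide's exact line is not shown).
sudoers_entry='acadgild ALL=(ALL) ALL'

# Draft the entry in a scratch file (illustrative; the real edit is made inside visudo).
scratch_file="$(mktemp)"
printf '%s\n' "$sudoers_entry" > "$scratch_file"

# visudo -c -f checks the syntax of the given file without modifying /etc/sudoers.
if command -v visudo >/dev/null 2>&1; then
    visudo -c -f "$scratch_file" || echo "sudoers syntax check failed" >&2
fi
```

If the check passes, the same line can be added below the existing root entry in the file that visudo opens.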
Reboot the machine and then log in as the created user.
1. Use the link below to download the JDK inside the VM, using the browser available in CentOS.
Link: http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
The page prompts you to select the required version.
2. Select the option marked with the red arrow.
The download starts and the file is saved in the Downloads folder.
3. Move the downloaded file into the /home/acadgild directory using the mv command.
4. Untar the JDK archive to extract the Java files, using the command shown in the screenshot.
5. Enter the command ls to see the extracted JDK in the same folder, /home/acadgild.
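The move and extract steps above can be sketched as a small helper. The name `unpack_into` is hypothetical, and the archive name in the example call is a placeholder, since the exact JDK file name depends on the version you downloaded.

```shell
# Extract a .tar.gz archive into a target directory, creating the directory if needed.
# unpack_into is a hypothetical helper name, not part of the guide.
unpack_into() {
    archive="$1"
    dest="$2"
    mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}

# Example calls (jdk-8uNN-linux-x64.tar.gz is a placeholder file name):
# mv "$HOME/Downloads/jdk-8uNN-linux-x64.tar.gz" /home/acadgild/
# unpack_into /home/acadgild/jdk-8uNN-linux-x64.tar.gz /home/acadgild
# ls /home/acadgild
```

The same helper would apply to any other .tar.gz archive placed in /home/acadgild.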
6. Download the Hadoop file using the link below and then copy the file from the Downloads folder to the /home/acadgild directory.
Link: http://mirrors.sonic.net/apache/hadoop/common/hadoop-2.6.0/
The linked page lists the files available for download.
7. Select the file with tar.gz extension.
8. Untar the downloaded Hadoop file; refer to the screenshot below for the command.
9. Update the .bashrc file with the required environment variables, including the Java and Hadoop paths.
Note: Use the paths present on your system.
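As an illustration of the kind of lines that go into .bashrc, the sketch below sets the usual Java and Hadoop variables. All paths are assumptions: `jdk1.8.0_NN` is a placeholder for the JDK directory actually extracted, and `hadoop-2.6.0` is assumed to be the directory produced by the Hadoop archive.

```shell
# Assumed locations; replace them with the directories present on your system.
export JAVA_HOME="/home/acadgild/jdk1.8.0_NN"      # placeholder JDK directory name
export HADOOP_HOME="/home/acadgild/hadoop-2.6.0"   # assumed Hadoop directory name
export HADOOP_CONF_DIR="$HADOOP_HOME/etc/hadoop"   # Hadoop 2.x keeps its config files here

# Make the java, hadoop, and daemon scripts available on the command line.
export PATH="$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
```

After saving the file, running source ~/.bashrc (or opening a new terminal) makes the variables take effect.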
10. Create two directories to store NameNode metadata and DataNode blocks as
shown below:
mkdir -p $HOME/hadoop/namenode
mkdir -p $HOME/hadoop/datanode
12. Open hadoop-env.sh and add the Java home path and the Hadoop home path to it.
Note: Use the Java version present on your system; in our case the version is 1.8.
13. Open core-site.xml using the command below, from the path shown in the screenshot.
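The property to add in core-site.xml is not listed at this point in the guide. A common single-node choice, given here only as an assumption, is to point `fs.defaultFS` at a local NameNode on port 9000. The sketch writes a sample file to a scratch directory so it can be reviewed before anything is copied into the real configuration directory.

```shell
# Write a sample core-site.xml to a scratch directory for review.
# fs.defaultFS = hdfs://localhost:9000 is an assumed single-node value, not taken from the guide.
sample_dir="$(mktemp -d)"
cat > "$sample_dir/core-site.xml" <<'EOF'
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
EOF

# Show the result so it can be compared with the file under the Hadoop configuration directory.
cat "$sample_dir/core-site.xml"
```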
15. Open hdfs-site.xml and add the following lines between the configuration tags.
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/acadgild/hadoop/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/acadgild/hadoop/datanode</value>
</property>
</configuration>
16. Open yarn-site.xml and add the following lines between the configuration tags.
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
17. Copy mapred-site.xml.template to mapred-site.xml and then add the following properties, as shown in the screenshot.
cp mapred-site.xml.template mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
18. Log in as the root user and install the OpenSSH server on CentOS.
Refer to the screenshot below for the installation command.
20. Copy the public key from the .ssh directory into the authorized_keys file.
Change to the .ssh directory and then type the command below to copy the key into the authorized_keys file.
21. Type the command ls to check whether the authorized_keys file has been created.
22. To confirm that the keys have been copied, type the command
cat authorized_keys
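The key-copy step can be sketched as a small helper. This assumes a key pair generated with ssh-keygen under its default RSA file names (id_rsa and id_rsa.pub); the helper name `authorize_key` is hypothetical, and the chmod call is an addition not mentioned in the guide, included because sshd commonly rejects key files with loose permissions.

```shell
# Append a public key to an authorized_keys file and restrict the file's permissions.
# authorize_key is a hypothetical helper name, not part of the guide.
authorize_key() {
    pub_key="$1"
    auth_file="$2"
    cat "$pub_key" >> "$auth_file" && chmod 600 "$auth_file"
}

# Example call, assuming the default key name produced by ssh-keygen:
# authorize_key "$HOME/.ssh/id_rsa.pub" "$HOME/.ssh/authorized_keys"
# cat "$HOME/.ssh/authorized_keys"
```

Appending with >> rather than overwriting keeps any keys that are already authorized.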