Hadoop Configuration
Nithin Mohan
AM.EN.U4CSE15143
Environment
Ubuntu 18.10
Java 8 (Hadoop 2.9.2 is not compatible with newer JDKs such as 15 or 16)
Hadoop 2.9.2 (Any Stable Release)
Step 1 - Install Java
Verify that Java 8 is installed:
java -version
After installing Java on a Linux system, you must set the JAVA_HOME and JRE_HOME
environment variables, which many Java applications use to locate the Java libraries at
runtime.
For example, with the Oracle Java 8 package, add:
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export JRE_HOME=/usr/lib/jvm/java-8-oracle/jre
Step 2 - Create Hadoop User
A) Create a dedicated, non-root account for Hadoop:
adduser hduser
passwd hduser
su - hduser
ssh localhost
exit
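Hadoop's control scripts log in over SSH, so `ssh localhost` should succeed without a password prompt. If it does not, key-based authentication can be set up along these lines (a minimal sketch; paths are the OpenSSH defaults):

```shell
# Create the SSH directory and a passphrase-less key pair if none exists yet
mkdir -p ~/.ssh
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
# Authorize the public key for logins to this machine
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
```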
Step 3 - Download Hadoop
wget http://www-eu.apache.org/dist/hadoop/common/hadoop-2.9.2/hadoop-2.9.2.tar.gz
tar xzf hadoop-2.9.2.tar.gz
mv hadoop-2.9.2 hadoop
Step 4 - Configure Hadoop
First, we need to set the environment variables used by Hadoop. Edit the ~/.bashrc file and
append the following values at the end of the file:
nano ~/.bashrc
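The values themselves depend on where Hadoop was extracted; assuming the ~/hadoop directory created in the previous step, a typical set of entries looks like this:

```shell
# Hadoop installation root (adjust if you extracted elsewhere)
export HADOOP_HOME=$HOME/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
# Put the Hadoop binaries and scripts on the PATH
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
```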
source ~/.bashrc
hadoop version
cd $HADOOP_HOME/etc/hadoop/
nano hadoop-env.sh
In the file, set JAVA_HOME to your Java installation by adding the line:
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
nano core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
nano hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///home/hduser/hadoop/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///home/hduser/hadoop/hdfs/datanode</value>
</property>
</configuration>
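The two local directories referenced above must exist and be writable by hduser; they can be created up front (assuming hduser's home directory, as in the configuration):

```shell
# Local storage for HDFS metadata (namenode) and block data (datanode)
mkdir -p ~/hadoop/hdfs/namenode ~/hadoop/hdfs/datanode
```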
Hadoop 2.9.2 ships only a template for this file, so copy it first:
cp mapred-site.xml.template mapred-site.xml
nano mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
nano yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
C) Format the Namenode
Now format the namenode using the following command; the output should confirm that the
storage directory (/home/hduser/hadoop/hdfs/namenode) has been successfully formatted.
hdfs namenode -format
Step 5 - Start Hadoop Cluster
Let’s start the Hadoop cluster using the scripts provided by Hadoop.
start-dfs.sh
start-yarn.sh
jps
If everything started correctly, jps should list the NameNode, DataNode, SecondaryNameNode,
ResourceManager and NodeManager processes.
cd $HADOOP_HOME
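The commands below assume a local file named input.txt; if you do not have one, a small sample can be created first (the content here is hypothetical):

```shell
# Sample input for the wordcount job
printf 'hello hadoop\nhello world\n' > input.txt
```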
hdfs dfs -mkdir -p input
hdfs dfs -put input.txt input
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar wordcount input output
hdfs dfs -ls output
hdfs dfs -cat output/part-r-00000
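For reference, the wordcount example simply tallies how often each word occurs in the input; conceptually it does the same thing as this short Python sketch of the map (split into words) and reduce (sum per word) phases:

```python
from collections import Counter

def wordcount(text: str) -> dict:
    # Map: split the text into words; reduce: count occurrences of each word
    return dict(Counter(text.split()))

print(wordcount("hello hadoop hello world"))
# → {'hello': 2, 'hadoop': 1, 'world': 1}
```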