Clustering of Hadoop Environment
Preparation: Install one Hadoop system on VMware Player/Workstation and clone it to create the second node.
We will create a two-node cluster with one Namenode and two Datanodes (the master also runs a Datanode).
Open the master VM and go to the hadoop/conf directory
$ cd hadoop/conf
List the contents of the directory to view various configuration files.
$ ls
capacity-scheduler.xml
configuration.xsl
core-site.xml
hadoop-env.sh
hadoop-metrics.properties
hadoop-policy.xml
hdfs-site.xml
log4j.properties
mapred-site.xml
masters
slaves
slaves.multi
ssl-client.xml.example
ssl-server.xml.example
Open core-site.xml for editing in vi:
Set fs.default.name to the Namenode (master) IP address with port 9000.
$ vi core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://192.168.197.128:9000</value>
</property>
</configuration>
Save and Exit
:wq!
Open hdfs-site.xml for editing in vi:
Set dfs.replication to 2 so that each block is stored on both Datanodes.
$ vi hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file -->
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
Save and Exit
:wq!
Open mapred-site.xml for editing in vi:
Set mapred.job.tracker to the master IP address with port 9001.
$ vi mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>hdfs://192.168.197.128:9001</value>
</property>
</configuration>
Save and Exit
:wq!
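The site files edited above all use the same <property> layout, so their values can be read back with a short script to catch typos before the slave is configured. The following is a minimal sketch in Python, not part of Hadoop itself: the helper names read_properties and namenode_endpoint are illustrative, and the embedded XML simply repeats the core-site.xml shown earlier.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def read_properties(xml_text):
    """Return the <name>/<value> pairs of a Hadoop site file as a dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        # Skip incomplete <property> elements instead of failing on them.
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def namenode_endpoint(props):
    """Split fs.default.name into (scheme, host, port)."""
    uri = urlparse(props["fs.default.name"])
    return uri.scheme, uri.hostname, uri.port


# Same content as the core-site.xml entered on the master.
CORE_SITE = """<?xml version="1.0"?>
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://192.168.197.128:9000</value>
</property>
</configuration>
"""

if __name__ == "__main__":
    print(namenode_endpoint(read_properties(CORE_SITE)))
```

To check the real files, pass the text of hadoop/conf/core-site.xml, hdfs-site.xml or mapred-site.xml to read_properties and look up the property name set in each step.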
Open masters file for editing in vi:
Enter IP address of master.
$ vi masters
192.168.197.128
Save and Exit
:wq!
Open slaves file for editing in vi:
Enter the IP addresses of both nodes, one per line, so that the master and the slave each run a Datanode.
$ vi slaves
192.168.197.128
192.168.197.134
Save and Exit
:wq!
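The slaves file and the dfs.replication value have to agree: with a replication factor of 2, fewer than two Datanodes would leave every block under-replicated. The sketch below, in Python, checks that relationship. It is an illustration under two assumptions: the function names are made up for this example, and entries are expected to be IP addresses as in this walkthrough (hostnames are also valid in a slaves file and would be reported here).

```python
import ipaddress


def parse_hosts(text):
    """Return the non-empty lines of a masters or slaves file."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def check_slaves(slaves_text, replication):
    """Return a list of human-readable problems; an empty list means OK."""
    hosts = parse_hosts(slaves_text)
    problems = []
    for host in hosts:
        try:
            ipaddress.ip_address(host)
        except ValueError:
            problems.append("not an IP address: " + host)
    datanodes = len(set(hosts))
    if datanodes < replication:
        problems.append(
            "dfs.replication=%d but only %d Datanode(s) listed"
            % (replication, datanodes)
        )
    return problems


if __name__ == "__main__":
    # The two addresses entered in the master's slaves file.
    print(check_slaves("192.168.197.128\n192.168.197.134\n", 2))
```

An empty list means every entry parsed as an IP address and there are at least as many distinct Datanodes as the replication factor.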
Switch to the slave terminal now:
Check the IP address of the slave
$ ifconfig
...
inet addr:192.168.197.134 Bcast:192.168.197.255 Mask:255.255.255.0
...
Note down the address and clear the screen
$ clear
Go to the configuration directory: hadoop/conf
$ cd hadoop/conf
List the contents to see various configuration files.
$ ls
capacity-scheduler.xml
configuration.xsl
core-site.xml
hadoop-env.sh
hadoop-metrics.properties
hadoop-policy.xml
hdfs-site.xml
log4j.properties
mapred-site.xml
masters
slaves
slaves.multi
ssl-client.xml.example
ssl-server.xml.example
Clear the screen
$ clear
Open core-site.xml for editing in vi:
Set fs.default.name to the Namenode (master) IP address with port 9000.
$ vi core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://192.168.197.128:9000</value>
</property>
</configuration>
Save and Exit
:wq!
Open hdfs-site.xml for editing in vi:
Set dfs.replication to 2, the same value as on the master.
$ vi hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file -->
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
Save and Exit
:wq!
Open mapred-site.xml for editing in vi:
Set mapred.job.tracker to the master IP address with port 9001.
$ vi mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>hdfs://192.168.197.128:9001</value>
</property>
</configuration>
Save and Exit
:wq!
Open masters file for editing in vi:
Enter IP address of master.
$ vi masters
192.168.197.128
Save and Exit
:wq!
Open slaves file for editing in vi:
Enter IP address of slave.
$ vi slaves
192.168.197.134
Save and Exit
:wq!
Switch to the master terminal
Generate SSH keys for passwordless communication between nodes.
$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop-user/.ssh/id_rsa):
/home/hadoop-user/.ssh/id_rsa already exists.
Overwrite (y/n)? y
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop-user/.ssh/id_rsa.
Your public key has been saved in /home/hadoop-user/.ssh/id_rsa.pub.
The key fingerprint is:
00:ba:38:a2:a9:da:33:71:37:8d:56:e3:31:7c:ea:6f hadoop-user@hadoop-desk
Append the public key (not the private key) to the master's own authorized_keys file.
[ master@domain ]: cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Copy the public key from the master to the slave node.
[ master@domain ]: ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop-user@192.168.197.134
Now try logging into the machine, with "ssh 'hadoop-user@192.168.197.134'", and check in:
.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
Log in to the slave from the master to confirm that no password is requested.
[ master@domain ]: ssh hadoop-user@192.168.197.134
...
[ slave@domain ]:
Exit after login.
On the master node, start all the Hadoop daemons
[ master@domain ]: start-all.sh
......
Hadoop starts successfully on the master
On the slave node, start all the Hadoop daemons
[ slave@domain ]: start-all.sh
......
Hadoop starts successfully on the slave
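Whether the daemons really came up can be checked with the JDK's jps tool, which prints one "<pid> <name>" line per running Java process. The following Python sketch compares such output against an expected set of daemon names. The expected sets are assumptions for a Hadoop 1.x layout like this one (the master listed in both the masters and slaves files), and the function names are illustrative; adjust the sets to whatever your nodes are meant to run.

```python
# Assumed daemon sets for this two-node layout (Hadoop 1.x process names).
MASTER_DAEMONS = {"NameNode", "SecondaryNameNode", "JobTracker",
                  "DataNode", "TaskTracker"}
SLAVE_DAEMONS = {"DataNode", "TaskTracker"}


def running_daemons(jps_output):
    """Extract process names from jps output (lines of '<pid> <name>')."""
    names = set()
    for line in jps_output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].isdigit():
            names.add(parts[1])
    return names


def missing_daemons(jps_output, expected):
    """Return the expected daemon names that jps did not report, sorted."""
    return sorted(expected - running_daemons(jps_output))


if __name__ == "__main__":
    import subprocess
    # Requires jps (shipped with the JDK) on the PATH of the current node.
    output = subprocess.run(["jps"], capture_output=True, text=True).stdout
    print("missing on master:", missing_daemons(output, MASTER_DAEMONS))
```

An empty list means every expected daemon is running on that node; on the slave, pass SLAVE_DAEMONS instead.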
On the master, list the contents of HDFS
[ master@domain ]: hadoop fs -ls /
......
On the slave, list the contents of HDFS
[ slave@domain ]: hadoop fs -ls /
......
You will observe the same content on both master and slave.
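Comparing the two listings by eye works for a handful of entries; for longer listings the comparison can be scripted. The Python sketch below extracts the paths from the text printed by hadoop fs -ls and compares the two sets. It assumes the usual listing layout (a "Found N items" header followed by one line per entry that starts with a permission string and ends with the path) and paths without spaces; the function names are illustrative.

```python
def listed_paths(ls_output):
    """Return the set of paths found in `hadoop fs -ls` output."""
    paths = set()
    for line in ls_output.splitlines():
        fields = line.split()
        # Entry lines start with a permission string such as
        # "-rw-r--r--" or "drwxr-xr-x"; the header line does not.
        if len(fields) >= 8 and fields[0][0] in "-d":
            paths.add(fields[-1])
    return paths


def same_listing(master_output, slave_output):
    """True when both nodes report exactly the same set of paths."""
    return listed_paths(master_output) == listed_paths(slave_output)
```

Save the output of the listing command on each node to a file and pass both texts to same_listing; a result of False means the nodes are not looking at the same file system.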
Create a sample file.
vi sample.txt
...
:wq!
Copy file to HDFS
[ master@domain ]: hadoop fs -copyFromLocal sample.txt /
If the copy fails because the Namenode is in safe mode, turn safe mode off and run the copy again.
[ master@domain ]: hadoop dfsadmin -safemode leave
[ master@domain ]: hadoop fs -copyFromLocal sample.txt /
List the HDFS contents to verify the copy
[ master@domain ]: hadoop fs -ls /
......
......
You will observe the new file. Cat the contents of the file (it was copied to the HDFS root, so use the absolute path)
[ master@domain ]: hadoop fs -cat /sample.txt
......
......
Switch to Slave Node
List the HDFS contents to verify that the file is visible from the slave
[ slave@domain ]: hadoop fs -ls /
......
......
You will observe the new file. Cat the contents of the file
[ slave@domain ]: hadoop fs -cat /sample.txt
......
......
You will observe that the distributed file system is working properly on both nodes!