Install and Run Hadoop YARN in 10 Easy Steps

Preamble
If you’re interested in playing with Apache Hadoop’s MRv2 (a.k.a. YARN), you’ve probably looked for ways to set it up on a single node.
On the Apache Hadoop YARN home page, you will find instructions for setting up a single-node cluster. Unfortunately, those instructions make some pre-setup assumptions (e.g. that you have installed hadoop-common/hadoop-hdfs and exported various environment variables) that may not apply to you unless you already have a Hadoop development environment set up.
If you are interested in a simple installation from tarball, read on.
My Environment
- Single node cluster (flavor of Linux)
- Java 1.6
Step 1 : Download a tarball and define $YARN_HOME
- Download a Hadoop-0.23.x (or later) tarball from here
- Note : The Hadoop-2.0.x (or later) branch also contains YARN but is currently in alpha, so I downloaded 0.23.1
- Unpack the tarball and define $YARN_HOME to point at the unpacked directory
> export YARN_HOME=/Users/sianand/Apps/hadoop-0.23.1
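Before moving on, it can help to confirm that $YARN_HOME really points at an unpacked tarball. The following is a minimal sketch, not part of Hadoop: the function name check_yarn_home is hypothetical, and the list of paths is simply the set of scripts and directories used later in this post.

```shell
# Return 0 if the given directory looks like an unpacked Hadoop tarball,
# i.e. it contains the launcher scripts and config directory used in this post.
# (check_yarn_home is a hypothetical helper, not a Hadoop command.)
check_yarn_home() {
  dir="$1"
  for f in bin/hadoop bin/hdfs bin/yarn etc/hadoop; do
    if [ ! -e "$dir/$f" ]; then
      echo "not found: $dir/$f"
      return 1
    fi
  done
  echo "ok: $dir"
}
```

Call it as: check_yarn_home "$YARN_HOME"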
Step 2 : Alter core-site.xml
> cd $YARN_HOME
> vi ./etc/hadoop/core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
Step 3 : Create local directories for the namenode and datanode
> mkdir /Users/sianand/yarn_data
> mkdir /Users/sianand/yarn_data/hdfs
> mkdir /Users/sianand/yarn_data/hdfs/namenode
> mkdir /Users/sianand/yarn_data/hdfs/datanode
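The same directory tree can be created in one call with mkdir -p, which creates missing parent directories and does not complain if a directory already exists. A small sketch, assuming you want to pass the base directory as an argument (the function name make_hdfs_dirs is hypothetical):

```shell
# Create the namenode and datanode directories under a base directory.
# mkdir -p creates any missing parents and succeeds if they already exist.
# (make_hdfs_dirs is a hypothetical helper name.)
make_hdfs_dirs() {
  base="$1"
  mkdir -p "$base/hdfs/namenode" "$base/hdfs/datanode"
}
```

For the paths used in this post, that would be: make_hdfs_dirs /Users/sianand/yarn_data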
Step 4 : Format the namenode
> cd $YARN_HOME
> bin/hadoop namenode -format
> cd $YARN_HOME
> vi ./etc/hadoop/hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/Users/sianand/yarn_data/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/Users/sianand/yarn_data/hdfs/datanode</value>
  </property>
</configuration>
> cd $YARN_HOME
> ./bin/hdfs namenode
> ./bin/hdfs secondarynamenode
> ./bin/hdfs datanode
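Each of these commands runs in the foreground, so run as written each one needs its own terminal. If you would rather launch them from a single shell, one option is to background each process and send its output to a log file. This is only a sketch: start_bg is a hypothetical helper, and the log paths in the usage line are arbitrary choices, not Hadoop defaults.

```shell
# Start a command in the background, appending its stdout and stderr to a
# log file, and print the PID so the process can be stopped later with kill.
# (start_bg is a hypothetical helper; it is not part of Hadoop.)
start_bg() {
  log="$1"
  shift
  nohup "$@" >>"$log" 2>&1 &
  echo $!
}
```

For example: start_bg /tmp/namenode.log "$YARN_HOME/bin/hdfs" namenode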
> cd $YARN_HOME
> vi ./etc/hadoop/yarn-env.sh
export HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-$YARN_HOME/etc/hadoop}"
export HADOOP_COMMON_HOME="${HADOOP_COMMON_HOME:-$YARN_HOME}"
export HADOOP_HDFS_HOME="${HADOOP_HDFS_HOME:-$YARN_HOME}"
> cd $YARN_HOME
> vi ./etc/hadoop/yarn-site.xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce.shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
> cd $YARN_HOME
> vi ./etc/hadoop/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
> cd $YARN_HOME
> ./bin/yarn resourcemanager
> ./bin/yarn nodemanager
sianand-mn:hadoop-0.23.1 sianand$ jps
95579 ResourceManager
94607 NameNode
6815 Jps
94801 DataNode
95723 NodeManager
94950 SecondaryNameNode
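Rather than eyeballing the jps listing, you can check it mechanically. The sketch below reads jps output on stdin and reports any of the five daemons from this post that is not listed. The function name check_daemons is hypothetical, and it assumes the default jps format of one "PID ClassName" pair per line.

```shell
# Read jps output on stdin and report any expected daemon that is missing.
# Returns 0 when all five daemons are present, 1 otherwise.
# Assumes each jps line is "<pid> <name>"; check_daemons is a hypothetical helper.
check_daemons() {
  out=$(cat)
  missing=0
  for d in NameNode SecondaryNameNode DataNode ResourceManager NodeManager; do
    # Compare the whole name field so NameNode does not match SecondaryNameNode.
    if ! printf '%s\n' "$out" | awk '{print $2}' | grep -qx "$d"; then
      echo "missing: $d"
      missing=1
    fi
  done
  return "$missing"
}
```

Usage: jps | check_daemons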
> cd $YARN_HOME
> bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-0.23.1.jar pi -Dmapreduce.clientfactory.class.name=org.apache.hadoop.mapred.YarnClientFactory -libjars share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-0.23.1.jar 16 10000
- For Hadoop 0.23.1, the scripts under $YARN_HOME/sbin worked on Linux (RHEL 5), but not on Mac OS X
- My examples use the executables under $YARN_HOME/bin instead