Technical: Disk – IO – Performance – Disk Benchmark – Tools – DiskMark

Introduction

I have a couple of internal disks and some network SAN storage, and I wanted to know how they compare.

So I looked on the Net for free disk I/O benchmarking and profiling tools.

Tools

  • Iometer
  • Atto
  • CrystalDiskMark

Tried a couple of the tools, but the one I settled on is DiskMark from NetworkDLS.

Download

Download the tool from http://www.networkdls.com/Software/View/DiskMark/

Install

Install the tool

Usage

Once the tool is installed, launch it.

There are 4 main settings you want to touch on.

Disk Drive

All the disks on the system are listed in the “Disk Drive” drop-down box.

Set Size

This is the payload unit size, expressed in bytes.

You can use simple math when entering the value.

 Size                     Value (bytes)
 1 KB                     1024
 64 KB                    64 * 1024
 1 MB (1000 KB)           1000 * 1024
 1 GB (1000 * 1000 KB)    1000 * 1000 * 1024
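
As a quick sanity check of the arithmetic, here is a minimal shell sketch that evaluates the same expressions one would type into the Set Size box. The size_bytes helper is a name made up for illustration; it is not part of DiskMark.

```shell
# Evaluate a size expression such as "64 * 1024" and print the byte count.
# size_bytes is a hypothetical helper, not something DiskMark provides.
size_bytes() {
  echo $(( $1 ))
}

size_bytes "1024"                 # 1024
size_bytes "64 * 1024"            # 65536
size_bytes "1000 * 1024"          # 1024000
size_bytes "1000 * 1000 * 1024"   # 1024000000
```

Note that 1000 * 1024 is 1,024,000 bytes, a little under a binary megabyte (1024 * 1024 = 1,048,576 bytes).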

Rounds

This is the number of times to process each payload.

Runs

This is the number of repetition cycles.  Values are gathered for each cycle and, once all the runs have completed, averages are calculated.

Sample Result

Trial Run:

Here is the configuration of our Trial Run:

TrialRun

Disk C:

DiskMark - Drive C

Disk D:

DiskMark - Drive D (v2)

Disk X:

DiskMark - Drive X

Disk Y:

DiskMark - Drive Y

Disk Z:

DiskMark - Drive Z

Analyze Result

Here are the results, side by side:

Disk  Disk Time  Write Performance  Read Performance
C     274.79       9.34 MB/s        128.32 MB/s
D     272.47       9.42 MB/s        125.60 MB/s
X      22.74     115.99 MB/s        287.97 MB/s
Y      25.56     101.97 MB/s        254.61 MB/s
Z      25.24     106.47 MB/s        277.02 MB/s

Conclusion

Using the Open Source Tool, DiskMark, we are quickly able to determine that our internal drives (C: & D:) deliver only about 9.4 MB/s when writing data.

Whereas the SAN Storage drives (X:, Y: & Z:) deliver between 102 and 116 MB/s, or about 108 MB/s on average.

These are not upper-limit numbers, as the tool is probably not writing data in parallel, using read-ahead cache, etc.

The numbers are simply guidance for comparing apples to apples, so to speak.
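
The write averages can be recomputed from the results table. Here is a minimal sketch, with the table rows copied in by hand; mean_write is a hypothetical helper name, and the grouping of drives into internal (C, D) and SAN (X, Y, Z) follows the conclusion above.

```shell
# Rows copied from the results table: disk, disk time, write MB/s, read MB/s.
results='C 274.79 9.34 128.32
D 272.47 9.42 125.60
X 22.74 115.99 287.97
Y 25.56 101.97 254.61
Z 25.24 106.47 277.02'

# Mean write throughput (column 3) for the drive letters given in $1, e.g. "CD".
mean_write() {
  printf '%s\n' "$results" | awk -v drives="$1" '
    index(drives, $1) { sum += $3; n++ }
    END { if (n) printf "%.2f\n", sum / n }'
}

mean_write CD     # internal drives: 9.38
mean_write XYZ    # SAN drives:      108.14
```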

References

References – DiskMark

http://lifehacker.com/5824265/diskmark-is-a-free-and-easy-hard-drive-benchmark-tool

References – winsat

http://blog.dv411.com/2011/01/disk-test-quickie-windows.html

Staying Centered and Alive

It has been a bit of a melancholic couple of weeks for me.

Like they say, you’re rarely the only one:

Frank X. Shaw, Corporate Vice President of Corporate Communications at Microsoft:

http://blogs.technet.com/b/microsoft_blog/archive/2013/05/10/staying-centered.aspx

There are many advantages to living in a world that is mostly connected. Feedback is immediate. Weak signals are easily amplified. Voices can be heard.

Of course, every benefit has a drawback.

In this world where everyone is a publisher, there is a trend to the extreme – where those who want to stand out opt for sensationalism and hyperbole over nuanced analysis.

In this world where page views are currency, heat is often more valued than light.

Stark black-and-white caricatures are sometimes more valued than shades-of-gray reality.

It’s been a week like that, from a couple of unlikely sources. :)

So let’s pause for a moment and consider the center.

Though dated, here is Drake’s take.  It is the opening track on his sophomore album:

Drake – Over my Dead Body

http://www.dailymotion.com/video/xm88az_drake-over-my-dead-body-official-audio-take-care_music#.UZUSUis6WsY

But you know who really stayed alive and well? Those 3 girls and their familia in Cleveland.

Leaning on Drake:

Those girls wear crowns…

We all like to be heard

       … as it is the spaces between Living and the lived.

Technical: Hadoop/Cloudera (v4.2.1) – Installation on CentOS (32 bit [x32])

Introduction

Here are quick preparation, processing, and validation steps for installing Cloudera – Hadoop (v4.2.1) on 32-bit CentOS.

Blueprint

I am using Cloudera’s fine documentation, “http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/4.2.0/CDH4-Quick-Start/cdh4qs_topic_3_2.html”, as a basis.

It is very good documentation, but I stumbled a lot, both for lack of background and from glossing over important details.  And so I chose to write things down.

Environment Constraints

Here are the constraints that I have to work with:

  • My lab PC is an old Dell
  • It has a 32-bit processor
  • And so I can only install a 32-bit Linux & Cloudera Distro

Concepts

Here are a couple of concepts that we will utilize:

  • File System – Linux – Stickiness

Concepts : File System – Linux – Stickiness

Background

Sticky Bit

http://en.wikipedia.org/wiki/Sticky_bit

The most common use of the sticky bit today is on directories. When the sticky bit is set, only the item’s owner, the directory’s owner, or the superuser can rename or delete files. Without the sticky bit set, any user with write and execute permissions for the directory can rename or delete contained files, regardless of owner. Typically this is set on the /tmp directory to prevent ordinary users from deleting or moving other users’ files.

In Unix symbolic file system permission notation, the sticky bit is represented by the letter t in the final character-place.

Set Stickiness

http://en.wikipedia.org/wiki/Sticky_bit

The sticky bit can be set using the chmod command and can be set using its octal mode 1000 or by its symbol t (s is already used by the setuid bit). For example, to add the bit on the directory /tmp, one would type chmod +t /tmp. Or, to make sure that directory has standard tmp permissions, one could also type chmod 1777 /tmp.

To clear it, use chmod -t /tmp or chmod 0777 /tmp (using numeric mode will also change directory tmp to standard permissions).

Is Stickiness set?

http://en.wikipedia.org/wiki/Sticky_bit

In Unix symbolic file system permissions notation, the sticky bit is represented by the letter t in the final character-place. For instance, in our Linux Environment, the /tmp directory, which by default has the sticky-bit set, shows up as:

  $ ls -ld /tmp
  drwxrwxrwt   4 root     sys          485 Nov 10 06:01 /tmp
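
To see the bit come and go without touching the real /tmp, here is a minimal sketch that works on a throwaway directory. The mode_of helper is a name made up for this example.

```shell
# Work in a throwaway directory so nothing on the real system is touched.
demo_dir=$(mktemp -d)

# Print the mode string (first 10 characters of ls -ld), e.g. "drwxrwxrwt".
mode_of() {
  ls -ld "$1" | cut -c1-10
}

chmod 1777 "$demo_dir"   # world-writable plus sticky bit
mode_of "$demo_dir"      # drwxrwxrwt

chmod -t "$demo_dir"     # clear only the sticky bit
mode_of "$demo_dir"      # drwxrwxrwx

rmdir "$demo_dir"
```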

Prerequisites – Operating System

Introduction

Listed below are Cloudera’s stated minimal requirements (in the areas of Operating System, Database, and JDK).

For the bare minimum install we are targeting, we do not need a database; it is only kept in for completeness.  And, even when needed, the database itself can be on another server outside of the Cloudera node or Cluster.

Operating System

http://www.cloudera.com/content/support/en/documentation/cdh4-documentation/cdh4-documentation-v4-latest.html

  • Redhat – Redhat Enterprise Linux (v5.7 –> 64-bit, v6.2 –> 32 and 64 bit)
  • Redhat – CentOS (v5.7 –> 64-bit, v6.2 –> 32 and 64 bit)
  • Oracle Linux  (v5.6 –> 64-bit)
  • SUSE Linux Enterprise Server (SLES) (v11 with SP1 –> 64 bit)
  • Ubuntu / Debian (Ubuntu – Lucid 10.04 [LTS] –> 64 bit)
  • Ubuntu / Debian (Ubuntu - Precise 12.04 [LTS] –> 64 bit)
  • Ubuntu / Debian (Debian – Squeeze 6.03 –> 64 bit)

What does all this mean:

  • The only 32-bit OSes supported are in the Red Hat family.  For Red Hat Enterprise Linux or CentOS, v5.7 is supported on 64-bit only; 32-bit support requires v6.2

Databases

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/4.2.0/CDH4-Requirements-and-Supported-Versions/cdhrsv_topic_2.html

  • Oozie (MySQL v5.5, PostgreSQL v8.4, Oracle 11gR2)
  • Hue (MySQL v5.5, PostgreSQL v8.4)
  • Hive (MySQL v5.5, PostgreSQL v8.3)

Java /JDK

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/4.2.0/CDH4-Requirements-and-Supported-Versions/cdhrsv_topic_3.html

  • Jdk 1.6 –> 1.6.0_31
  • jdk 1.7 –> 1.7.0_15

Prerequisites – Networking – Name ID

Introduction

Ensure that your Network Names are unique and they are what you want them to be.

Validate Hostname

Use hostname.



    Syntax:

         hostname

    Sample:

         hostname

Output:

Network - Hostname - Get

Set Hostname

If the hostname is not what you expected, please set it using one of the many resources available on the Net.

Prerequisites – Networking – Domain Name & FQDN

Introduction

Get Domain Name and FQDN (Fully qualified domain name)

Get Domain Name (using hostname)



    Syntax:

         hostname --domain

    Sample:

         hostname  --domain

 

Get Domain Name (using resolv.conf)



    Syntax:

         cat /etc/resolv.conf

    Sample:

         cat  /etc/resolv.conf

Output:

resolv.conf

Explanation:

  • In the file /etc/resolv.conf, your domain name is the entry prefixed by domain 
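
To pull out just that entry, here is a minimal sketch. The resolv_domain helper and the sample file contents are made up for illustration; on a real host the helper would simply be run against /etc/resolv.conf.

```shell
# Print the value of the first "domain" line in a resolv.conf-style file.
# Defaults to /etc/resolv.conf when no file is given.
resolv_domain() {
  awk '$1 == "domain" { print $2; exit }' "${1:-/etc/resolv.conf}"
}

# Demo against a throwaway sample file; the values are invented.
sample=$(mktemp)
cat > "$sample" <<'EOF'
search lab.example
domain lab.example
nameserver 192.0.2.10
EOF

resolv_domain "$sample"    # lab.example
rm -f "$sample"
```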

Get Hostname (FQDN)

Get FQDN (Fully qualified hostname).



    Syntax:

         hostname --fqdn

    Sample:

         hostname --fqdn

Output:

Network - Hostname - Get (FQDN)

Interpretation:

  • I pinged the DNS Server and discovered it was offline.  It is a Windows machine that was patched yesterday and, unfortunately, this particular machine needs a key to be pressed before it fully comes back online…Never figured out what is up with the BIOS

Set Domain Name

Good Resources on the Net:

Prerequisites – Networking – Name Resolution

Introduction

As Hadoop is fundamentally a testament to Network Clustering and Collaborative Engineering, your working hosts have to have TCP/IP verifiably working.

Validate Hostname



    Syntax:

         ping <hostname> 

    Sample:

         ping rachel 

Network -- ping hostname -- rachel

Since we got an error message stating “unknown host <hostname>”, we need to go to our DNS Server and make sure that it has an “A” record for the host.

Our DNS is a Windows DNS Server, and it was relatively easy to create the “A” record:

 

network-hostname-dns-rachel

 

 

 

Went back and checked to ensure that our DNS Resolution is good:

Network - hostname - dns (rachel) -- resolved

Prerequisites – wget

Introduction – wget

To download files over HTTP without a browser, straight from the command shell, we chose to use wget.

Install – wget

sudo yum -y install wget

 

Prerequisites – lsof

Introduction – lsof

lsof is roughly the Linux counterpart of Sysinternals’ Process Monitor.  It lets us see the files and network ports that a process has open.

Install – lsof

sudo yum -y install lsof

Prerequisites – Java

Here are the steps for validating that we have the right Java JDK installed.

Java – Minimal Requirements

We need Java and we need one of the latest versions (JDK 1.6 or JDK 1.7).

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-Requirements-and-Supported-Versions/cdhrsv_topic_3.html

  • For JDK 1.6, CDH4 is certified with 1.6.0_31
  • For JDK 1.7, CDH4.2 and later is certified with 1.7.0_15

Is Java installed?

Is Java installed on our box, and if so what version?

java -version

Output:

java - command not found
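
The same check can be scripted so that a missing Java is reported cleanly rather than as a shell error. This is a minimal sketch; have_cmd is a hypothetical helper name.

```shell
# Succeeds if the named command can be found on the PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd java; then
  # java -version writes to stderr, so fold it into stdout.
  java -version 2>&1 | head -n 1
else
  echo "java: not installed or not on the PATH"
fi
```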

Get URL for Java (+JDK +JRE)

To get to the Java download, please visit:

http://www.oracle.com/technetwork/java/javase/downloads/index.html

Please note that you do not want just the JRE, but JDK (which has the JRE bundled with it).

Thus click on JDK.

As of today (2013-05-12), the latest available JDK is 7U21.

java-downloads-url

Further down on that same download page, we will notice that there is a separate download file for each OS and bitness.

java-downloads-osandbitness

As we have a 32-bit Linux that is able to use rpm, we want to capture the URL for  jdk-7u21-linux-i586.rpm.

That URL ends up being

http://download.oracle.com/otn-pub/java/jdk/7u21-b11/jdk-7u21-linux-i586.rpm

Download Oracle/Sun Java JDK

Googled for help and found:

How to automate and download and Installation of Java JDK on Linux:

http://stackoverflow.com/questions/10268583/how-to-automate-download-and-instalation-of-java-jdk-on-linux

wget --no-cookies --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2Ftechnetwork%2Fjava%2Fjavase%2Fdownloads%2Fjdk6-downloads-1637591.html;"  "http://download.oracle.com/otn-pub/java/jdk/7u21-b11/jdk-7u21-linux-i586.rpm"

Error:



Resolving download.oracle.com... 96.17.108.106, 96.17.108.163
Connecting to download.oracle.com|96.17.108.106|:443... connected.
ERROR: certificate common name ‘a248.e.akamai.net’ doesn’t match requested host name ‘download.oracle.com’.
To connect to download.oracle.com insecurely, use --no-check-certificate.

And a series of other problems followed, until I took care to make the following changes:

  • Changed the URL from http: to https:
  • Added the option “--no-check-certificate”

And then ended up with a working syntax:


wget --no-cookies --no-check-certificate  --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com" "https://download.oracle.com/otn-pub/java/jdk/7u21-b11/jdk-7u21-linux-i586.rpm" -O "jdk-7u21-linux-i586.rpm"

Installed Oracle/Sun Java JDK

Syntax:

   yum install <jdk-rpm>

Sample:

   yum install jdk-7u21-linux-i586.rpm

Output:

java-jdk-installed

Install of Java JDK was successful!

Validated Install of Oracle/Sun Java JDK

Syntax:

   java -version

Sample:

   java -version

 Output:

java-jdk-version

Install Cloudera Bin Installer (./cloudera-manager-installer.bin)

Disclaimer

Please do not go down this road on a 32-bit system.

It will not work, as cloudera-manager-installer.bin is 64-bit software and will not run on a 32-bit OS.

This section is preserved merely for completeness, and as a placeholder.

 

Resource

As we are targeting v4.x, we should look at http://archive.cloudera.com/cm4/installer/

As of 2013-04-11, here is the folder view of what Cloudera has available:

cloudera-installers-folderList

We want the latest folder:

cloudera-installers-folderList (latest)

Download

Download using wget

Here is the download specification:

  • Download URL: http://archive.cloudera.com/cm4/installer/latest/cloudera-manager-installer.bin
  • Output File: /tmp/cloudera-manager-installer.bin

wget http://archive.cloudera.com/cm4/installer/latest/cloudera-manager-installer.bin   -O /tmp/cloudera-manager-installer.bin

Validate FileInfo

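Validate FileInfo (file <file>).  Plain file prints the description shown below; with the -i option it would print only the MIME type.


file cloudera-manager-installer.bin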

Cloudera - Installer - file



cloudera-manager-installer.bin: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.9, not stripped

Explanation:

  • The installer is a 64-bit (x86-64) ELF executable, so it cannot run on our 32-bit OS

Prepare downloaded file

Use chmod to make the file executable.


Syntax:

   chmod u+x <file>

Sample:

  chmod u+x cloudera-manager-installer.bin

Run Installer

Run installer



sudo ./cloudera-manager-installer.bin

Unfortunately, got an error message:

 



./cloudera-manager-installer.bin: ./cloudera-manager-installer.bin: cannot execute binary file

Verified that we cannot install cloudera-manager-installer.bin on a 32-bit OS; the OS has to be 64-bit.
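
This mismatch can be caught before running the installer by comparing what file reports against the machine type. Here is a minimal sketch; both helper names are made up, and the machine value is hard-coded to i686 as an assumption (on a real host it would come from uname -m).

```shell
# Succeeds if a `file` description names a 64-bit ELF binary.
is_64bit_elf() {
  case "$1" in
    *"ELF 64-bit"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeeds if a machine name (as printed by uname -m) is a 32-bit x86 name.
is_32bit_x86() {
  case "$1" in
    i386|i486|i586|i686) return 0 ;;
    *) return 1 ;;
  esac
}

# Description copied from the file output above (shortened).
desc='cloudera-manager-installer.bin: ELF 64-bit LSB executable, x86-64, version 1 (SYSV)'
machine=i686   # assumed value; on a real host use: machine=$(uname -m)

if is_64bit_elf "$desc" && is_32bit_x86 "$machine"; then
  echo "64-bit binary on a 32-bit OS: it will not run"
fi
```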

Manage Yum Repository –  Cloudera

Background

Once you find yourself using packages from a specific Vendor, and correspondingly its repository, quite a bit, I advise you to add that Vendor to your repository configuration.

Basically, you want to be able to do the following:

  • Make your machine aware that it can safely access said repository for packages you request
  • Confirm that you trust the vendor’s GPG key

Is the Cloudera repository configured?

Repository definition files (*.repo) are saved in the /etc/yum.repos.d/ folder.

Check folder

ls /etc/yum.repos.d/

Output:

Folder List -- etc:yum.repos.d:

Trust Vendor – Cloudera

 

Trust Vendor (Cloudera) by trusting its GPG Key.



Syntax:

 sudo rpm --import <key>

Sample:

 sudo rpm --import \
   http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera

The command prints nothing on success, so the lack of feedback is expected.  To double-check, “rpm -q gpg-pubkey” lists the keys that have been imported.

Review Vendor Repo File

Check folder

ls /etc/yum.repos.d/

Output:

Folder List -- etc:yum.repos.d: [v2]

Obviously, we now have the cloudera-cdh4.repo file in the /etc/yum.repos.d folder

Review Contents of Vendor Repo File

Review Repo File Contents

cat /etc/yum.repos.d/cloudera-cdh4.repo

Output:

View - Vendor - Repo file

Explanation:

 

Decision Time

There are a few critical decisions you have to make:

  • What is your topology – A single system or a distributed system?
  • MapReduce or Yarn

Topology – Pseudo Distributed / Cluster

If you will be using a single node, Cloudera terms this Pseudo-Distributed mode.  On the other hand, if you will be using multiple nodes, Cloudera terms this a Cluster.

MapReduce (MRv1) or Yarn (MRv2)

What is the difference between MapReduce and Yarn?

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/4.2.0/CDH4-Installation-Guide/cdh4ig_topic_11_4.html

MapReduce has undergone a complete overhaul and CDH4 now includes MapReduce 2.0 (MRv2). The fundamental idea of MRv2′s YARN architecture is to split up the two primary responsibilities of the JobTracker — resource management and job scheduling/monitoring — into separate daemons: a global ResourceManager (RM) and per-application ApplicationMasters (AM). With MRv2, the ResourceManager (RM) and per-node NodeManagers (NM), form the data-computation framework. The ResourceManager service effectively replaces the functions of the JobTracker, and NodeManagers run on slave nodes instead of TaskTracker daemons. The per-application ApplicationMaster is, in effect, a framework specific library and is tasked with negotiating resources from the ResourceManager and working with the NodeManager(s) to execute and monitor the tasks. For details of the new architecture, see Apache Hadoop NextGen MapReduce (YARN).

Can we install both MapReduce (v1) and Yarn (MapReduce v2)?

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/4.2.0/CDH4-Quick-Start/cdh4qs_topic_3_1.html

For installations in pseudo-distributed mode, there are separate conf-pseudo packages for an installation that includes MRv1 (hadoop-0.20-conf-pseudo) or an installation that includes YARN (hadoop-conf-pseudo). Only one conf-pseudo package can be installed at a time: if you want to change from one to the other, you must uninstall the one currently installed.

Which of them shall I use?

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/4.2.0/CDH4-Installation-Guide/cdh4ig_topic_4_1.html

Cloudera does not consider the current upstream MRv2 release stable yet, and it could potentially change in non-backwards-compatible ways. Cloudera recommends that you use MRv1 unless you have particular reasons for using MRv2, which should not be considered production-ready.

What is our decision?

  • We will go with Map Reduce [MRv1]

Installation File Matrix

To keep ourselves honest, let us prepare a quick checklist of RPMs.

If we go with a Pseudo Install, then we look for the RPMs that have pseudo in their name.

Mode                Component              RPM
Pseudo Distributed  Map Reduce v1          hadoop-0.20-conf-pseudo
Pseudo Distributed  Map Reduce v2 (Yarn)   hadoop-conf-pseudo

On the other hand, if you would like a Cluster Install, then please follow the instructions documented in http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/4.2.0/CDH4-Installation-Guide/cdh4ig_topic_4_4.html

Cluster installs have to be performed one component at a time, on each host, and are beyond the scope of our current posting.

Install Pseudo Install

Background

We have chosen the following install path:

  • Pseudo Mode
  • MapReduce v1

Let us go get and install hadoop-0.20-conf-pseudo

Review RPM Package (hadoop-0.20-conf-pseudo)

Before we install this package, let us quickly review and make sure that it is what we want:

  • Get general info
  • Get a quick dependency list

General Info

Before we even have the file, let us check the package’s info while the package is at rest (on the Vendor’s web site):



Syntax:

   yum info --nogpgcheck  <package-name>

Sample:

   yum info --nogpgcheck  hadoop-0.20-conf-pseudo

Output:

Hadoop - yum -info -- hadoop-conf-pseudo

Explanation:

  • Name :- hadoop-0.20-conf-pseudo
  • Architecture :- i386
  • Repo :- cloudera-cdh4
  • Summary :- Hadoop installation in pseudo-distribution mode with MRv1

Repoquery Info

You can use “repoquery --list” to check on your package prior to downloading it.

Beforehand, make sure that you have installed the yum-utils package (“sudo yum install yum-utils”).

Run repoquery:



Syntax:

   repoquery --list <package-name>

Sample:

   repoquery --list hadoop-0.20-conf-pseudo

 

Dependency Info

Dependency info



Syntax:

   yum deplist --nogpgcheck  <package-name>

Sample:

   yum deplist --nogpgcheck  hadoop-0.20-conf-pseudo

Output:

Hadoop - yum -deplist -- hadoop-conf-pseudo

Explanation:

Here are the dependencies:

  • hadoop-0.20-mapreduce-tasktracker
  • hadoop-hdfs-datanode
  • hadoop-hdfs-namenode
  • hadoop-0.20-mapreduce-jobtracker
  • hadoop-hdfs-secondarynamenode
  • /bin/sh (bash)
  • hadoop (hadoop base)

Install Rpm (hadoop-0.20-conf-pseudo)

Install rpm



Syntax:

   sudo yum install <package-name>

Sample:

    sudo yum install hadoop-0.20-conf-pseudo

Output:

hadoop-conf-pseudo -- install (confirmation?)

We respond in the affirmative….

And, the installation completed:

hadoop-conf-pseudo -- install (afirmative)

Post Installation Review – File System

Background

In Linux, it is commonly said that “everything is a file”.

And, so let us begin by reviewing the File System (FS).

Review our package files (rpm -ql)

Show files installed by our RPM:



Syntax:

   rpm -ql <package-name> 

Sample:

   rpm -ql hadoop-0.20-conf-pseudo 

Output:

 hadoop-conf-pseudo --ql

Explanation:

  • We have the pseudo MapReduce v1 configuration files (*.xml)
  • We have the base components folders (/var/lib/hadoop, /var/lib/hdfs)

Review our configuration files

Where are those configuration files?

Glad you asked.



Syntax:

   ls -la <configuration>

Sample:

   ls -la /etc/hadoop/conf.pseudo.mr1

Output:

List Folder -- etc-hadoop-conf.pseudo.mr1

Support for various versions

Background

To maximize flexibility, CDH supports having different versions installed side by side.  But keep in mind that only one version can be active at a time.

The Alternatives framework underpins this support.

Alternatives

Introduction

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/4.2.0/CDH4-Quick-Start/cdh4qs_topic_3_2.html

The Cloudera packages use the alternatives framework for managing which Hadoop configuration is active. All Hadoop components search for the Hadoop configuration in /etc/hadoop/conf.

Review

Is alternatives actually in effect?

There are a couple of things we can check:

  • alternatives --display <name>
  • update-alternatives --display <name>

alternatives --display



Syntax:

   sudo alternatives --display <file-name>

Sample:

   sudo alternatives --display hadoop

update-alternatives --display



Syntax:

   sudo update-alternatives --display <file-name>

Sample:

   sudo update-alternatives --display hadoop

Conclusion

Alternatives does not appear to be in play…
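
One more check worth trying is to follow /etc/hadoop/conf itself, since the alternatives framework works through symbolic links under /etc/alternatives. The sketch below rebuilds such a chain in a throwaway directory so it can be run anywhere. The link name hadoop-conf is an assumption on my part (it may be the name Cloudera registers, which would also explain why displaying “hadoop” found nothing), and active_conf is a made-up helper name.

```shell
# Build a throwaway copy of the kind of link chain alternatives maintains.
root=$(mktemp -d)
mkdir -p "$root/etc/hadoop/conf.pseudo.mr1" "$root/etc/alternatives"
ln -s "$root/etc/hadoop/conf.pseudo.mr1" "$root/etc/alternatives/hadoop-conf"
ln -s "$root/etc/alternatives/hadoop-conf" "$root/etc/hadoop/conf"

# Follow every link in the chain and print the final directory name.
active_conf() {
  basename "$(readlink -f "$1")"
}

active_conf "$root/etc/hadoop/conf"    # conf.pseudo.mr1
rm -rf "$root"
```

On the real host, the equivalent check would be readlink -f /etc/hadoop/conf.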

Component Level – User & Group – Review

Background

Let us do a quick review of our users and groups.

Hadoop – Users

Check the user file (/etc/passwd) for the well known Hadoop user accounts:

Quick, Quick, what are the well known user accounts:

  • hdfs
  • mapred
  • zookeeper

Syntax:

  cut -d: -f1 /etc/passwd | egrep "xxx|yyy|zzz"

Sample:

  cut -d: -f1 /etc/passwd | egrep "hdfs|mapred|zookeeper"

 

Output:

hadoop-conf-pseudo - match users

What does our little code do:

  • The file is /etc/passwd
  • Use cut, passing in the delimiter (:), to get the first field (the user name) of each line in /etc/passwd
  • Match on any of the supplied users
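
One caveat: egrep matches substrings, so a pattern like hdfs would also match an account named hdfs-backup. Here is a minimal sketch of an exact-match variant; match_users is a hypothetical helper and the sample passwd entries are invented for the demo.

```shell
# Print the account names in a passwd-format file that exactly match
# one of the wanted names (no substring matches).
match_users() {
  file=$1
  shift
  for name in "$@"; do
    awk -F: -v want="$name" '$1 == want { print $1 }' "$file"
  done
}

# Demo against a throwaway passwd-format file; the entries are invented.
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
hdfs:x:495:493:Hadoop HDFS:/var/lib/hadoop-hdfs:/bin/bash
hdfs-backup:x:600:600::/home/hdfs-backup:/bin/bash
mapred:x:494:492:Hadoop MapReduce:/usr/lib/hadoop-0.20-mapreduce:/bin/bash
EOF

match_users "$sample" hdfs mapred zookeeper   # prints hdfs, then mapred
rm -f "$sample"
```

On a real host the first argument would be /etc/passwd.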

Hadoop – Groups

Browse the group file (/etc/group) for the well known Hadoop groups and user accounts:

What are the well known Groups and what is their membership:

  • hadoop
  • hdfs
  • mapred
  • zookeeper

Syntax:

  egrep -i "xxx|yyy|zzz" /etc/group

Sample:

  egrep -i "hadoop|hdfs|mapred|zookeeper" /etc/group

 

What does our little code do:

  • The file is /etc/group
  • Match on any of the supplied groups

hadoop-conf-pseudo - match users and groups

Explanation:

  • Obviously, we have a group named hadoop, and its members are hdfs and mapred

Component Level – Review & Configuration – HDFS – NameNode – File System Format

 

Background

Here are a few things you should do to initialize HDFS Name Node.

HDFS – NameNode – Format

On the Hadoop HDFS Name Node, let us go ahead and format the NameNode:


sudo -u hdfs hadoop namenode -format

 

Explanation

  • The HDFS NameNode service runs under the hdfs account, so to act on its behalf we sudo to that user name

Output (Screen Shot):

hadoop-conf-pseudo -- hadoop namenode -format

Explanation:

  • We are able to format our namenode
  • Our default replication is 1
  • The File System is owned by the hdfs user
  • And, the File System ownership group is supergroup
  • Permission is enabled
  • High Availability (HA) is not enabled
  • We are in Append Mode
  • Our storage directory is /var/lib/hadoop-hdfs/cache/hdfs/dfs/name  

 

Reformat?

If the NameNode File System is already formatted and you issue an HDFS format request, you will be asked to confirm that you want to re-format.

Screen shot:

HDFS - Reformat?

Text Output:

 

Re-format filesystem in Storage Directory /var/lib/hadoop-hdfs/cache/hdfs/dfs/name ? (Y or N)

Component Level – Review & Configuration – HDFS – Name Node – Temp Folder

 

Background

Like any other File System, HDFS needs a temp folder.

HDFS – Create and Grant Permissions to the Temp Folder (/tmp)

Let us create the HDFS /tmp folder and grant permissions on it.

Here are the particulars:

  • The HDFS folder name :- /tmp
  • The HDFS Permission :- 1777

Syntax:

  sudo -u hdfs hadoop fs -mkdir /tmp

  sudo -u hdfs hadoop fs -chmod -R 1777 /tmp

Sample:

  sudo -u hdfs hadoop fs -mkdir /tmp

  sudo -u hdfs hadoop fs -chmod -R 1777 /tmp

Explanation:

  • sudo as hdfs, issue fileSystem (fs) make-directory (mkdir) /tmp
  • Change the permissions to 1777: everyone can read, write, and execute, and the sticky bit is set

 

HDFS – Validate Folder (/tmp) Creation and Permission Set

Let us review HDFS:/tmp existence and permission set:

Introduction:

To gain access to HDFS, we do the following:

  • sudo as hdfs
  • We invoke hadoop fs
  • Our payload is -ls (list)
  • Argument: -d (list the directory itself and not its contents)
  • And, we are targeting the /tmp folder

Syntax:

  sudo -u hdfs hadoop fs -ls -d /tmp

Sample:

  sudo -u hdfs hadoop fs -ls -d /tmp

Output:

Hadoop - hdfs - ls - :tmp

Explanation:

  • HDFS :/tmp folder exists
  • Owner (hdfs) can read/write/execute
  • Group (supergroup) can read/write/execute
  • Everyone can read/write/execute and the sticky bit is set (t last character in the file permissions column)

Component Level – Review & Configuration – MapReduce System Directories

Background

There are quite a few HDFS folders that MapReduce needs.

HDFS – MapReduce Folders

Let us create and grant the HDFS:{MapReduce} folders:

  • Create new HDFS Folder {/var/lib/hadoop-hdfs/cache/mapred/mapred/staging}
  • Set permissions of /var/lib/hadoop-hdfs/cache/mapred/mapred/staging to 1777 – World writable and sticky-bit
  • Change the owner of  /var/lib/hadoop-hdfs/cache/mapred/mapred and sub-directories to user mapred


Syntax:

  sudo -u hdfs hadoop fs -mkdir -p \
     /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
  sudo -u hdfs hadoop fs -chmod 1777 \
    /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
  sudo -u hdfs hadoop fs -chown -R mapred \
    /var/lib/hadoop-hdfs/cache/mapred

Sample:

  sudo -u hdfs hadoop fs -mkdir -p \
     /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
  sudo -u hdfs hadoop fs -chmod 1777 \
    /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
  sudo -u hdfs hadoop fs -chown -R mapred \
    /var/lib/hadoop-hdfs/cache/mapred

 

HDFS – Validate Folder {MapReduce} Creation and Permission Set

Let us review HDFS:/var/lib/hadoop-hdfs/cache/mapred existence and permission set:

Syntax:

  sudo -u hdfs hadoop fs -ls -R /var/lib/hadoop-hdfs/cache/mapred

Sample:

  sudo -u hdfs hadoop fs -ls -R /var/lib/hadoop-hdfs/cache/mapred

Output:

cloudera-cdh-4--hdfs--folder--ls

Explanation:

HDFS :/var/lib/hadoop-hdfs/cache/mapred/mapred folder

  • Owner (mapred) can read/write/execute
  • Group (supergroup) can read/execute (but not write)
  • Everyone can read/execute (but not write)

HDFS :/var/lib/hadoop-hdfs/cache/mapred/mapred/staging folder

  • Owner (mapred) can read/write/execute
  • Group (supergroup) can read/write/execute
  • Everyone can read/write/execute and the sticky bit is set (t last character in the file permissions column)
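
To tie the permission strings in these listings back to the octal values used with chmod, here is a minimal sketch that converts one to the other. The mode_to_octal helper is made up for illustration, and the two sample strings are the shapes described in the explanations above rather than text copied from the screenshots.

```shell
# Convert a 10-character mode string (as shown by ls -l or hadoop fs -ls)
# into its four-digit octal form, e.g. drwxrwxrwt -> 1777.
mode_to_octal() {
  printf '%s\n' "$1" | awk '{
    special = 0
    digits = ""
    for (i = 0; i < 3; i++) {
      trip = substr($0, 2 + 3 * i, 3)      # owner, group, then other
      v = 0
      if (substr(trip, 1, 1) == "r") v += 4
      if (substr(trip, 2, 1) == "w") v += 2
      c = substr(trip, 3, 1)
      if (c == "x" || c == "s" || c == "t") v += 1
      if (c ~ /[sS]/ && i == 0) special += 4   # setuid
      if (c ~ /[sS]/ && i == 1) special += 2   # setgid
      if (c ~ /[tT]/ && i == 2) special += 1   # sticky bit
      digits = digits v
    }
    print special digits
  }'
}

mode_to_octal drwxr-xr-x   # 0755
mode_to_octal drwxrwxrwt   # 1777
```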

CDH Services

Prepare Inventory of CDH Services

Service List

Here is our expected Service List.

Component                         Service Name
HDFS – Name Node (Primary)        hadoop-hdfs-namenode
HDFS – Name Node (Secondary)      hadoop-hdfs-secondarynamenode
HDFS – Data Node                  hadoop-hdfs-datanode
Hadoop-MapReduce – Job Tracker    hadoop-0.20-mapreduce-jobtracker
Hadoop-MapReduce – Task Tracker   hadoop-0.20-mapreduce-tasktracker

Using chkconfig to list Hadoop Services


Syntax:

   # list all services
   sudo chkconfig --list 

   # list specific services, based on name
   sudo chkconfig --list | grep -i <service-name>

Sample:

   sudo chkconfig --list | grep -i "^hadoop"

Screen Shot:

hadoop-conf-pseudo -- Services

Explanation:

The services are auto-started starting from run-level 3.
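
To read the run-levels off without squinting at the screen, here is a minimal sketch that condenses chkconfig-style lines into the levels that are switched on. The on_levels helper is hypothetical, and the sample line is an illustrative stand-in for real chkconfig output, not a copy of the screenshot.

```shell
# For each "name 0:off 1:off ... 6:off" line on stdin, print the name
# followed by the run-levels that are switched on.
on_levels() {
  awk '{
    out = ""
    for (i = 2; i <= NF; i++) {
      split($i, kv, ":")
      if (kv[2] == "on") out = out (out == "" ? "" : ",") kv[1]
    }
    print $1 ": " (out == "" ? "none" : out)
  }'
}

# Illustrative input; on a real host pipe in: sudo chkconfig --list | grep -i "^hadoop"
printf '%s\n' 'hadoop-hdfs-namenode 0:off 1:off 2:off 3:on 4:on 5:on 6:off' | on_levels
```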

Using /etc/init.d


Syntax:

   for service in /etc/init.d/<service-name>; do echo $service; done

Sample:

   for service in /etc/init.d/hadoop*; do echo $service; done

Screen Shot:

Service -- :etc:init.d

Starting CDH Services

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/4.2.0/CDH4-Installation-Guide/cdh4ig_topic_27_1.html

Component: hadoop-hdfs-namenode
Command:   sudo /sbin/service hadoop-hdfs-namenode start
Log:       /var/log/hadoop-hdfs/hadoop-hdfs-namenode-<hostname>.log
           /var/log/hadoop-hdfs/hadoop-hdfs-namenode-<hostname>.out

Component: hadoop-hdfs-secondarynamenode
Command:   sudo /sbin/service hadoop-hdfs-secondarynamenode start
Log:       /var/log/hadoop-hdfs/hadoop-hdfs-secondarynamenode-<hostname>.out

Component: hadoop-hdfs-datanode
Command:   sudo /sbin/service hadoop-hdfs-datanode start
Log:       /var/log/hadoop-hdfs/hadoop-hdfs-datanode-<hostname>.out

Component: hadoop-0.20-mapreduce-jobtracker
Command:   sudo /sbin/service hadoop-0.20-mapreduce-jobtracker start
Log:       /var/log/hadoop-0.20-mapreduce/hadoop-hadoop-jobtracker-<hostname>.out

Component: hadoop-0.20-mapreduce-tasktracker
Command:   sudo /sbin/service hadoop-0.20-mapreduce-tasktracker start
Log:       /var/log/hadoop-0.20-mapreduce/hadoop-hadoop-tasktracker-<hostname>.out

Start Services – Using /etc/init.d

Look for items in the /etc/init.d/ folders that have hadoop in their names and start them.


Syntax:

   for service in /etc/init.d/<service-name>; do sudo $service start; done

Sample:

  for service in /etc/init.d/hadoop-*; do sudo $service start; done
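
The one-line loop keeps going silently when a service fails to start. Here is a minimal sketch of a variant that reports each outcome; run_init_scripts is a hypothetical helper, and the demo uses stub scripts in a throwaway directory rather than the real /etc/init.d.

```shell
# Run every executable hadoop-* script in a directory with one action,
# reporting each outcome. Returns non-zero if any script failed.
run_init_scripts() {
  dir=$1
  action=$2
  failed=0
  for script in "$dir"/hadoop-*; do
    [ -x "$script" ] || continue
    if "$script" "$action"; then
      echo "ok: ${script##*/} $action"
    else
      echo "FAILED: ${script##*/} $action" >&2
      failed=1
    fi
  done
  return "$failed"
}

# Demo with two stub scripts, one of which fails on purpose.
demo=$(mktemp -d)
printf '#!/bin/sh\nexit 0\n' > "$demo/hadoop-hdfs-namenode"
printf '#!/bin/sh\nexit 1\n' > "$demo/hadoop-hdfs-datanode"
chmod +x "$demo"/hadoop-*
run_init_scripts "$demo" start
rm -rf "$demo"
```

On a real host this would be run as root against /etc/init.d, with start, stop, or status as the action.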

Errors:

Here are some errors we received because I chose not to follow instructions or skipped over some steps.

One thing I have to learn about Linux, or Enterprise Systems in general, is that “brevity in Instructions is sacrosanct”, and you should make sure that you follow every step; or Google for help, and hopefully someone else has made the same mistakes and posted the specific errors and their resolution.

Errors – HDFS-NameNode

Here are HDFS Name Node errors.

The log file is

  • Syntax –> /var/log/hadoop-hdfs/hadoop-hdfs-namenode-<hostname>.log
  • Sample –>  /var/log/hadoop-hdfs/hadoop-hdfs-namenode-rachel.log

Error due to name resolution failure

Specific Errors:

  • ERROR org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Error getting localhost name.
  • java.net.UnknownHostException: <hostname>: <hostname>
  • at java.net.InetAddress.getLocalHost(InetAddress.java:1466)

Screen Dump:



ERROR org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Error getting localhost name. Using 'localhost'...
java.net.UnknownHostException: rachel: rachel
at java.net.InetAddress.getLocalHost(InetAddress.java:1466)
at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.getHostname(MetricsSystemImpl.java:496)
at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.configureSystem(MetricsSystemImpl.java:435)
at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.configure(MetricsSystemImpl.java:431)
at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.start(MetricsSystemImpl.java:180)
at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.init(MetricsSystemImpl.java:156)
at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.init(DefaultMetricsSystem.java:54)
at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.initialize(DefaultMetricsSystem.java:50)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1140)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1205)
Caused by: java.net.UnknownHostException: rachel
at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:894)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1286)
at java.net.InetAddress.getLocalHost(InetAddress.java:1462)
... 9 more

Explanation

  • Add the hostname to your DNS server, or map it to the machine's IP address in /etc/hosts
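Before restarting the NameNode it is worth confirming that the name is now known. Here is a minimal sketch that checks a hosts-format file for a given name. The function name hosts_file_has_name is my own, and it only inspects the file, not DNS.

```shell
# Sketch: succeed if a host name appears in a hosts-format file.
# $1 = host name, $2 = hosts file (defaults to /etc/hosts).
hosts_file_has_name() {
    awk -v name="$1" '
        { sub(/#.*/, "") }                  # drop trailing comments
        { for (i = 2; i <= NF; i++)         # field 1 is the IP address
              if ($i == name) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "${2:-/etc/hosts}"
}

# Usage (not run here):
#   hosts_file_has_name "$(hostname)" || echo "add $(hostname) to /etc/hosts or DNS"
```

On most Linux systems, getent hosts "$(hostname)" covers both /etc/hosts and DNS in a single check, following the lookup order the system is configured with.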

Error due to HDFS being in an inconsistent state


FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /var/lib/hadoop-hdfs/cache/hdfs/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:296)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:202)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:592)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:435)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:397)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:399)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:433)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:609)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:590)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1141)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1205)

INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1

INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG: 

Specific Errors:

  • org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
  • org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /var/lib/hadoop-hdfs/cache/hdfs/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.

Screen Dump:

hadoop-cosf-pseudo -- hdfs -- inconsistent state

Explanation

  • Format the HDFS NameNode. This should be run on the primary NameNode; formatting discards any existing HDFS metadata, so it suits a fresh install:
    sudo -u hdfs hadoop namenode -format
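Since formatting throws away existing metadata, I like to look at the storage directory named in the error first. Here is a minimal sketch that classifies the directory. The function name name_dir_state and the four labels it prints are my own, not something Hadoop reports.

```shell
# Sketch: classify a NameNode storage directory before deciding to format.
# Prints one of: missing, unreadable, empty, in-use (labels are my own).
name_dir_state() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing"
    elif [ ! -r "$dir" ]; then
        echo "unreadable"       # rerun with enough privileges to look inside
    elif [ -z "$(ls -A "$dir" 2>/dev/null)" ]; then
        echo "empty"
    else
        echo "in-use"
    fi
}

# Usage (not run here; the directory is owned by hdfs, so sudo may be needed):
#   name_dir_state /var/lib/hadoop-hdfs/cache/hdfs/dfs/name
```

A result of missing or empty matches the "does not exist or is not accessible" message; in-use suggests looking at permissions before reaching for a format.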
    

     

Errors – HDFS-DataNode

Here are HDFS Data Node errors.

The log file is

  • Syntax –> /var/log/hadoop-hdfs/hadoop-hdfs-datanode-<hostname>.log
  • Sample –> /var/log/hadoop-hdfs/hadoop-hdfs-datanode-rachel.log

Error due to a host name resolution failure

Screen Shot:



[dadeniji@rachel conf]$ cat /var/log/hadoop-hdfs/hadoop-hdfs-datanode-rachel.log

2013-05-13 15:24:32,357 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:

/************************************************************
STARTUP_MSG: Starting DataNode

STARTUP_MSG:   host = java.net.UnknownHostException : rachel: rachel

STARTUP_MSG:   args = []

STARTUP_MSG:   version = 2.0.0-cdh4.2.1

STARTUP_MSG:   build = file:///data/1/jenkins/workspace/generic-package-centos32-6/topdir/BUILD/hadoop-2.0.0-cdh4.2.1/src/hadoop-common-project/hadoop-common -r 144bd548d481c2774fab2bec2ac2645d190f705b; compiled by 'jenkins' on Mon Apr 22 10:26:05 PDT 2013

STARTUP_MSG:   java = 1.7.0_21
************************************************************/

2013-05-13 15:24:32,895 WARN org.apache.hadoop.hdfs.server.common.Util: Path /var/lib/hadoop-hdfs/cache/hdfs/dfs/data should be specified as a URI in configuration files. Please update hdfs configuration.

2013-05-13 15:24:33,962 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain

java.net.UnknownHostException: rachel: rachel
	at java.net.InetAddress.getLocalHost(InetAddress.java:1466)
	at org.apache.hadoop.security.SecurityUtil.getLocalHostName(SecurityUtil.java:223)
	at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:243)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1694)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1719)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1872)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1893)
Caused by: java.net.UnknownHostException: rachel
	at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
	at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:894)
	at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1286)
	at java.net.InetAddress.getLocalHost(InetAddress.java:1462)
	... 6 more

2013-05-13 15:24:33,987 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1

2013-05-13 15:24:34,006 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG: 

/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at java.net.UnknownHostException: rachel: rachel
************************************************************/

Explanation

  • Make sure that the host name resolves; for this specific host, the host name is rachel

Error due to required service not running


WARN org.apache.hadoop.hdfs.server.common.Util: Path /var/lib/hadoop-hdfs/cache/hdfs/dfs/data should be specified as a URI in configuration files. Please update hdfs configuration.
WARN org.apache.hadoop.metrics2.impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-datanode.properties,hadoop-metrics2.properties
INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Configured hostname is rachel
INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened streaming server at /0.0.0.0:50010
INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Balancing bandwith is 1048576 bytes/s
INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
INFO org.apache.hadoop.http.HttpServer: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
INFO org.apache.hadoop.http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context datanode
INFO org.apache.hadoop.http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
INFO org.apache.hadoop.http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened info server at 0.0.0.0:50075
INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dfs.webhdfs.enabled = false
INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50075
INFO org.mortbay.log: jetty-6.1.26.cloudera.2
INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50075
INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 50020
INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened IPC server at /0.0.0.0:50020
INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Refresh request received for nameservices: null
INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting BPOfferServices for nameservices: <default>
WARN org.apache.hadoop.hdfs.server.common.Util: Path /var/lib/hadoop-hdfs/cache/hdfs/dfs/data should be specified as a URI in configuration files. Please update hdfs configuration.
INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool <registering> (storage id unknown) service to localhost/127.0.0.1:8020 starting to offer service
INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting
INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)

Explanation

  • It looks like we are failing when trying to communicate with host localhost, port 8020
  • So what is supposed to be listening on port 8020?
  • A quick Google for “Hadoop” and port 8020 landed us at http://blog.cloudera.com/blog/2009/08/hadoop-default-ports-quick-reference/ and the listening service is the Hadoop NameNode
  • So let us go make sure that the Hadoop NameNode is running and listening on port 8020
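The DataNode gives up after a fixed number of retries, so it helps to wait for the NameNode port ourselves before starting anything that depends on it. Here is a minimal sketch of a generic retry helper. The name retry_until is my own, and the probe used in the usage line (nc -z) is an assumption: netcat may not be installed on a minimal system.

```shell
# Sketch: rerun a probe command until it succeeds or attempts run out.
# $1 = maximum attempts, $2 = seconds to sleep between attempts,
# remaining arguments = the probe command and its arguments.
retry_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        # Sleep only if another attempt is coming.
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Usage (not run here; mirrors maxRetries=10, sleepTime=1 from the log):
#   retry_until 10 1 nc -z localhost 8020 \
#       && sudo /sbin/service hadoop-hdfs-datanode start
```

Passing the probe as arguments keeps the helper independent of whichever port-checking tool happens to be available.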

Errors – MapReduce – Job Tracker

The log file is

  • Syntax –> /var/log/hadoop-0.20-mapreduce/hadoop-hadoop-jobtracker-<hostname>.log
  • Sample –> /var/log/hadoop-0.20-mapreduce/hadoop-hadoop-jobtracker-rachel.log

Error due to MapReduce / File System permission error

Specific Errors:

  • INFO org.apache.hadoop.mapred.JobTracker: Creating the system directory
  • WARN org.apache.hadoop.mapred.JobTracker: Failed to operate on mapred.system.dir (hdfs://localhost:8020/var/lib/hadoop-hdfs/cache/mapred/mapred/system) because of permissions.
  • WARN org.apache.hadoop.mapred.JobTracker: This directory should be owned by the user 'mapred (auth:SIMPLE)'
  • WARN org.apache.hadoop.mapred.JobTracker: Bailing out ...
  • org.apache.hadoop.security.AccessControlException: Permission denied: user=mapred, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x
  • Caused by: org.apache.hadoop.ipc.RemoteException (org.apache.hadoop.security.AccessControlException): Permission denied: user=mapred, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x
  • FATAL org.apache.hadoop.mapred.JobTracker: org.apache.hadoop.security.AccessControlException: Permission denied: user=mapred, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x

Screen Dump:



INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
INFO org.apache.hadoop.mapred.JobTracker: JobTracker up at: 8021
INFO org.apache.hadoop.mapred.JobTracker: JobTracker webserver: 50030
INFO org.apache.hadoop.mapred.JobTracker: Creating the system directory
WARN org.apache.hadoop.mapred.JobTracker: Failed to operate on mapred.system.dir (hdfs://localhost:8020/var/lib/hadoop-hdfs/cache/mapred/mapred/system) because of permissions
WARN org.apache.hadoop.mapred.JobTracker: This directory should be owned by the user 'mapred (auth:SIMPLE)'

WARN org.apache.hadoop.mapred.JobTracker: Bailing out ...
org.apache.hadoop.security.AccessControlException: Permission denied: user=mapred, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:205)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:186)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:135)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4684)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:4655)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:2996)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:2960)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:2938)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:648)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:417)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44096)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1695)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1691)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)

Explanation

  • We are having HDFS file system permission problems
  • Are the folders missing, or do they exist and only their permissions are wrong?
  • I remembered that the Cloudera docs cover the HDFS MapReduce folder permissions at length.  Let us go review and apply those permissions
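As best I recall, the Cloudera quick start creates the MapReduce directories in HDFS along the following lines. Treat the exact commands as an assumption and check them against the Cloudera docs for your release. The run wrapper, the create_mapred_dirs name, and the DRY_RUN switch are my own additions, so that the commands are only printed unless you opt in.

```shell
# Sketch: create the MapReduce directories in HDFS and hand them to mapred.
# Commands are printed by default; set DRY_RUN=0 to actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# $1 = base directory in HDFS (defaults to the path seen in the log above).
create_mapred_dirs() {
    base=${1:-/var/lib/hadoop-hdfs/cache/mapred}
    run sudo -u hdfs hadoop fs -mkdir -p "$base/mapred/staging"
    run sudo -u hdfs hadoop fs -chmod 1777 "$base/mapred/staging"
    run sudo -u hdfs hadoop fs -chown -R mapred "$base"
}

# Usage (not run here):
#   create_mapred_dirs              # print the commands for review
#   DRY_RUN=0 create_mapred_dirs    # execute them
```

Printing first makes it easy to compare the commands with the documentation before touching HDFS.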

Configuring init to start core Hadoop Services

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/4.2.0/CDH4-Installation-Guide/cdh4ig_topic_27_2.html

Stopping Hadoop Services

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/4.2.0/CDH4-Installation-Guide/cdh4ig_topic_27_3.html

Post Installation Review

Services – Review

Commands – service --status-all



sudo service --status-all | egrep -i "jobtracker|tasktracker|Hadoop"

Output:

hadoop-conf-pseudo - service --status-all

Commands – tcp/ip service (listening)


sudo lsof -Pnl +M -i4 -i6 | grep LISTEN

Tried running lsof, but got an error message:


ps  -aux 2> /dev/null | grep "java"

Screen Dump (lsof: command not found):

lsof -- command not found

Once lsof was installed (via the instructions given previously), the command succeeded.

Output:

hadoop-conf-pseudo -- services - listening

Explanation:

Explanation – Java

  • We have quite a few listening Java processes
  • The java processes are listening on TCP/IP ports between 50010 and 50090; specifically 50010, 50020, 50030, 50060, 50070, 50075
  • And, also ports 8010 and 8020
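Rather than reading the ports off the screen, the lsof output can be boiled down to a sorted port list. Here is a minimal sketch, assuming the usual lsof column layout where the command name is the first field and the address:port is the field just before (LISTEN). The function name listening_ports is my own.

```shell
# Sketch: list the ports a given command is listening on.
# Reads `lsof -Pnl -i` style output on stdin; $1 = command name (default java).
listening_ports() {
    awk -v cmd="${1:-java}" '
        $1 == cmd && $NF == "(LISTEN)" {
            addr = $(NF - 1)
            sub(/.*:/, "", addr)    # keep what follows the last colon
            print addr
        }
    ' | sort -n -u
}

# Usage (not run here):
#   sudo lsof -Pnl +M -i4 -i6 | listening_ports
#   sudo lsof -Pnl +M -i4 -i6 | listening_ports sshd
```

Stripping everything up to the last colon also copes with IPv6 addresses such as [::1]:631.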

Explanation - Auxiliary Services

  • sshd (port 22)
  • cupsd (port 631)

Commands – ps (running java applications)


ps -eo pri,pid,user,args | grep -i "java" | grep -v "grep" | awk '{printf "%-10s %-10s %-10s %-120s \n ", $1, $2, $3,  $4}'

Output:

ps -java programs (v2)

Interpretation:

  • Each Java process carries a flag such as -Dproc_secondarynamenode, -Dproc_namenode, or -Dproc_jobtracker in its arguments; this flag maps to a specific Hadoop service
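The -Dproc_ flag makes it easy to get a one-line-per-service summary. Here is a minimal sketch that reads process arguments on stdin and prints the distinct role names. The function name hadoop_roles is my own.

```shell
# Sketch: extract the distinct Hadoop role names from process arguments.
# Reads `ps -eo args` style output on stdin.
hadoop_roles() {
    grep -o -- '-Dproc_[A-Za-z0-9_]*' | sed 's/^-Dproc_//' | sort -u
}

# Usage (not run here):
#   ps -eo args | hadoop_roles
```

A service missing from the output is a quick hint about which log file to open next.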

Operational Errors

Operational Errors – HDFS – Name Node

Operational Errors – HDFS – Name Node – Security – Permission Denied



mkdir: Permission denied: user=dadeniji, access=WRITE, inode="/user/dadeniji":hdfs:supergroup:drwxr-xr-x

Validate:

Check the permissions for HDFS under /user folder:


sudo -u hdfs hadoop fs -ls /user

We received:

hdfs -- Hadoop -- fs -ls

Explanation:

  • My folder, /user/dadeniji, is still owned by hdfs.

Let us go change it:


sudo -u hdfs hadoop fs -chown $USER /user/$USER

Validate Fix:


hadoop fs -ls /user/$USER

Output:

hdfs -- Hadoop -- fs -ls (fixed) [v2]
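In hindsight the Permission denied message already carries everything needed: the user, the access requested, and the owner, group and mode of the path. Here is a minimal sketch that splits such a message into labelled fields. The function name parse_denied and the output layout are my own.

```shell
# Sketch: split an HDFS "Permission denied" message into labelled fields.
# Reads messages on stdin; lines that do not match are dropped.
parse_denied() {
    sed -n 's/.*user=\([^,]*\), access=\([A-Z_]*\), inode="\([^"]*\)":\([^:]*\):\([^:]*\):\(.*\)$/user=\1 access=\2 path=\3 owner=\4 group=\5 mode=\6/p'
}

# Usage (not run here):
#   hadoop fs -mkdir /user/$USER/test 2>&1 | parse_denied
```

When the user and the owner differ and the mode grants nothing to group or others, a chown like the one shown earlier is the likely fix.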

Operational Errors – HDFS – DataNode

13/05/16 15:59:08 ERROR security.UserGroupInformation: PriviledgedActionException as:dadeniji (auth:SIMPLE) cause:org.apache.hadoop.security.AccessControlException: Permission denied: user=<username>, access=EXECUTE, inode="/var/lib/hadoop-hdfs/cache/mapred/mapred/staging/<username>":mapred:supergroup:drwx------



13/05/16 15:59:08 ERROR security.UserGroupInformation: PriviledgedActionException as:dadeniji (auth:SIMPLE) cause:org.apache.hadoop.security.AccessControlException: Permission denied: user=dadeniji, access=EXECUTE, inode="/var/lib/hadoop-hdfs/cache/mapred/mapred/staging/dadeniji":mapred:supergroup:drwx------
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:205)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:161)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:128)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4684)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkTraverse(FSNamesystem.java:4660)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:2911)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:673)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:643)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44128)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1695)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1691)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1689)

When we issued:



sudo -u hdfs hadoop fs -ls  \
   /var/lib/hadoop-hdfs/cache/mapred/mapred/staging

We received:

fs -- check staging--dadeniji (v2)

Explanation:

  • For my personal HDFS staging folder (/var/lib/hadoop-hdfs/cache/mapred/mapred/staging/dadeniji), the permission set is drwx------.
  • In other words, the owner (mapred) is the only account that has any permissions.
  • The Cloudera docs are quite prophetic about these types of errors:

    Installing CDH4 in Pseudo-Distributed Mode
    Starting Hadoop and Verifying it is Working Properly:
    Create mapred system directories
    http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/4.2.0/CDH4-Quick-Start/cdh4qs_topic_3_2.html
    If you do not create /tmp properly, with the right permissions as shown below, you may have problems with CDH components later. Specifically, if you don’t create /tmp yourself, another process may create it automatically with restrictive permissions that will prevent your other applications from using it.

Let us go correct it:

As it is merely a staging folder, let us remove it and hope the system re-creates it:



sudo -u hdfs hadoop fs -rm -r \
   /var/lib/hadoop-hdfs/cache/mapred/mapred/staging/dadeniji

Once corrected, we can run the MapReduce jobs.


Technical : Microsoft – Windows Developer rant – “We are slower than other operating systems”

Like writers, programmers write because they have to. Whether creating software, prose, technical documentation, blogs, or emails, programmers write because they have to.

They have to express themselves.

It all started off with a small comment on Hacker News, posted here:

https://news.ycombinator.com/item?id=5689391
There is not much discussion about Windows internals, not only because they are not shared, but also because quite frankly the Windows kernel evolves slower than the Linux kernel in terms of new algorithms implemented. For example it is almost certain that Microsoft never tested I/O schedulers, process schedulers, filesystem optimizations, TCP/IP stack tweaks for wireless networks, etc, as much as the Linux community did. One can tell just by seeing the sheer amount of intense competition and interest amongst Linux kernel developers to research all these areas.

The net result of that is a generally acknowledged fact that Windows is slower than Linux when running complex workloads that push network/disk/cpu scheduling to its limit: https://news.ycombinator.com/item?id=3368771 A really concrete and technical example is the network throughput in Windows Vista which is degraded when playing audio! http://blogs.technet.com/b/markrussinovich/archive/2007/08/2…

Note: my post may sound I am freely bashing Windows, but I am not. This is the cold hard truth. Countless of multi-platform developers will attest to this, me included. I can’t even remember the number of times I have written a multi-platform program in C or Java that always runs slower on Windows than on Linux, across dozens of different versions of Windows and Linux. The last time I troubleshooted a Windows performance issue, I found out it was the MFT of an NTFS filesystem was being fragmented; this to say I am generally regarded as the one guy in the company who can troubleshoot any issue, yet I acknowledge I can almost never get Windows to perform as good as, or better than Linux, when there is a performance discrepancy in the first place.

 

Like all good write-ups, it continued here:

http://blog.zorinaq.com/?e=74
I contribute to the Windows Kernel.  We are slower than other Operating Systems.  Here is why.

I was explaining on Hacker News why Windows fell behind Linux in terms of operating system kernel performance and innovation. And out of nowhere an anonymous Microsoft developer who contributes to the Windows NT kernel wrote a fantastic and honest response acknowledging this problem and explaining its cause. His post has been deleted! Why the censorship? I am reposting it here. This is too insightful to be lost. [Edit: The anonymous poster himself deleted his post as he thought it was too cruel and did not help make his point, which is about the social dynamics of spontaneous contribution.

However he let me know he does not mind the repost at the condition I redact the SHA1 hash info, which I did.][Edit: A second statement, apologetic, has been made by the anonymous person. See update at the bottom.]

"""

I’m a developer in Windows and contribute to the NT kernel. (Proof: the SHA1 hash of revision #102 of [Edit: filename redacted] is [Edit: hash redacted].) I’m posting through Tor for obvious reasons.

Windows is indeed slower than other operating systems in many scenarios, and the gap is worsening. The cause of the problem is social. There’s almost none of the improvement for its own sake, for the sake of glory, that you see in the Linux world.

Granted, occasionally one sees naive people try to make things better. These people almost always fail. We can and do improve performance for specific scenarios that people with the ability to allocate resources believe impact business goals, but this work is Sisyphean. There’s no formal or informal program of systemic performance improvement. We started caring about security because pre-SP3 Windows XP was an existential threat to the business. Our low performance is not an existential threat to the business.

See, component owners are generally openly hostile to outside patches: if you’re a dev, accepting an outside patch makes your lead angry (due to the need to maintain this patch and to justify in in shiproom the unplanned design change), makes test angry (because test is on the hook for making sure the change doesn’t break anything, and you just made work for them), and PM is angry (due to the schedule implications of code churn). There’s just no incentive to accept changes from outside your own team. You can always find a reason to say “no”, and you have very little incentive to say “yes”.

There’s also little incentive to create changes in the first place. On linux-kernel, if you improve the performance of directory traversal by a consistent 5%, you’re praised and thanked. Here, if you do that and you’re not on the object manager team, then even if you do get your code past the Ob owners and into the tree, your own management doesn’t care. Yes, making a massive improvement will get you noticed by senior people and could be a boon for your career, but the improvement has to be very large to attract that kind of attention. Incremental improvements just annoy people and are, at best, neutral for your career. If you’re unlucky and you tell your lead about how you improved performance of some other component on the system, he’ll just ask you whether you can accelerate your bug glide.

Is it any wonder that people stop trying to do unplanned work after a little while?

Another reason for the quality gap is that that we’ve been having trouble keeping talented people. Google and other large Seattle-area companies keep poaching our best, most experienced developers, and we hire youths straight from college to replace them. You find SDEs and SDE IIs maintaining hugely import systems. These developers mean well and are usually adequately intelligent, but they don’t understand why certain decisions were made, don’t have a thorough understanding of the intricate details of how their systems work, and most importantly, don’t want to change anything that already works.

These junior developers also have a tendency to make improvements to the system by implementing brand-new features instead of improving old ones. Look at recent Microsoft releases: we don’t fix old features, but accrete new ones. New features help much more at review time than improvements to old ones.

(That’s literally the explanation for PowerShell. Many of us wanted to improve cmd.exe, but couldn’t.)

More examples:

  • We can’t touch named pipes. Let’s add %INTERNAL_NOTIFICATION_SYSTEM%! And let’s make it inconsistent with virtually every other named NT primitive.
  • We can’t expose %INTERNAL_NOTIFICATION_SYSTEM% to the rest of the world because we don’t want to fill out paperwork and we’re not losing sales because we only have 1990s-era Win32 APIs available publicly.
  • We can’t touch DCOM. So we create another %C#_REMOTING_FLAVOR_OF_THE_WEEK%!
  • XNA. Need I say more?
  • Why would anyone need an archive format that supports files larger than 2GB?
  • Let’s support symbolic links, but make sure that nobody can use them so we don’t get blamed for security vulnerabilities (Great! Now we get to look sage and responsible!)
  • We can’t touch Source Depot, so let’s hack together SDX!
  • We can’t touch SDX, so let’s pretend for four releases that we’re moving to TFS while not actually changing anything!
  • Oh god, the NTFS code is a purple opium-fueled Victorian horror novel that uses global recursive locks and SEH for flow control. Let’s write ReFs instead. (And hey, let’s start by copying and pasting the NTFS source code and removing half the features! Then let’s add checksums, because checksums are cool, right, and now with checksums we’re just as good as ZFS? Right? And who needs quotas anyway?)
  • We just can’t be fucked to implement C11 support, and variadic templates were just too hard to implement in a year. (But ohmygosh we turned “^” into a reference-counted pointer operator. Oh, and what’s a reference cycle?)

Look: Microsoft still has some old-fashioned hardcore talented developers who can code circles around brogrammers down in the valley. These people have a keen appreciation of the complexities of operating system development and an eye for good, clean design. The NT kernel is still much better than Linux in some ways — you guys be trippin’ with your overcommit-by-default MM nonsense — but our good people keep retiring or moving to other large technology companies, and there are few new people achieving the level of technical virtuosity needed to replace the people who leave. We fill headcount with nine-to-five-with-kids types, desperate-to-please H1Bs, and Google rejects. We occasionally get good people anyway, as if by mistake, but not enough. Is it any wonder we’re falling behind? The rot has already set in.

"""

Edit: This anonymous poster contacted me, still anonymously, to make a second statement, worried by the attention his words are getting:

"""

All this has gotten out of control. I was much too harsh, and I didn’t intend this as some kind of massive exposé. This is just grumbling. I didn’t appreciate the appetite people outside Microsoft have for Kremlinology. I should have thought through my post much more thoroughly. I want to apologize for presenting a misleading impression of what it’s like on the inside.

First, I want to clarify that much of what I wrote is tongue-in-cheek and over the top — NTFS does use SEH internally, but the filesystem is very solid and well tested. The people who maintain it are some of the most talented and experienced I know. (Granted, I think they maintain ugly code, but ugly code can back good, reliable components, and ugliness is inherently subjective.) The same goes for our other core components. Yes, there are some components that I feel could benefit from more experienced maintenance, but we’re not talking about letting monkeys run the place. (Besides: you guys have systemd, which if I’m going to treat it the same way I treated NTFS, is an all-devouring octopus monster about crawl out of the sea and eat Tokyo and spit it out as a giant binary logfile.)

In particular, I don’t have special insider numbers on poaching, and what I wrote is a subjective assessment written from a very limited point of view — I watched some very dear friends leave and I haven’t been impressed with new hires, but I am *not* HR. I don’t have global facts and figures. I may very well be wrong on overall personnel flow rates, and I shouldn’t have made the comment I did: I stated it with far more authority than my information merits.

Windows and Microsoft still have plenty of technical talent. We do not ship code that someone doesn’t maintain and understand, even if it takes a little while for new people to ramp up sometimes. While I have read and write access to the Windows source and commit to it once in a while, so do tens and tens of thousands of other people all over the world. I am nobody special. I am not Deep Throat. I’m not even Steve Yegge. I’m not the Windows equivalent of Ingo Molnar. While I personally think the default restrictions placed on symlinks limited their usefulness, there *was* a reasoned engineering analysis — it wasn’t one guy with an ulterior motive trying to avoid a bad review score. In fact, that practically never happens, at least consciously. We almost never make decisions individually, and while I maintain that social dynamics discourage risk-taking and spontaneous individual collaboration, I want to stress that we are not insane and we are not dysfunctional. The social forces I mentioned act as a drag on innovation, and I think we should do something about the aspects of our culture that I highlighted, but we’re far from crippled. The negative effects are more like those incurred by mounting an unnecessary spoiler on a car than tearing out the engine block. What’s indisputable fact is that our engineering division regularly runs and releases dependable, useful software that runs all over the world. No matter what you think of the Windows 8 UI, the system underneath is rock-solid, as was Windows 7, and I’m proud of having been a small part of this entire process.

I also want to apologize for what I said about devdiv. Look: I might disagree with the priorities of our compiler team, and I might be mystified by why certain C++ features took longer to implement for us than for the competition, but seriously good people work on the compiler. Of course they know what reference cycles are. We’re one of the only organizations on earth that’s built an impressive optimizing compiler from scratch, for crap’s sake.

Last, I’m here because I’ve met good people and feel like I’m part of something special. I wouldn’t be here if I thought Windows was an engineering nightmare. Everyone has problems, but people outside the company seem to infuse ours with special significance. I don’t get that. In any case, I feel like my first post does wrong by people who are very dedicated and who work quite hard. They don’t deserve the broad and ugly brush I used to paint them.

P.S. I have no problem with family people, and want to retract the offhand comment I made about them. I work with many awesome colleagues who happen to have children at home. What I really meant to say is that I don’t like people who see what we do as more of a job than a passion, and it feels like we have a lot of these people these days. Maybe everyone does, though, or maybe I’m just completely wrong.


Technical: Linux – User Administrator – Granting SysAdmin access


Introduction

Running certain applications is restricted to the root user, or to users who are able to acquire administrative privileges.

Thus, to manage a system successfully, you need to be able to log in as root or as one of the accounts that can act in its place.

Which programs run with “root” privileges?

Programs that always run with their owner’s privileges, whoever launches them, show an s in place of the x in the owner execute flag when viewed using ls -la.  When the owner is root, the program runs as root.


  # check the /bin folder and list files whose permission string contains "-rws"
  ls -la /bin/* | grep -i "\-rws"

There are a couple of things you want to note:

  • You need to escape the - symbol when searching for -rws, otherwise grep reads it as the start of an option; you escape the - character with a back-slash (\)
  • Notice that we are looking at the first three permission letters, which make up the permission set for the owner
  • r: the owner is able to read the file
  • w: the owner is able to write or over-write the file
  • s: this position usually holds x, to indicate that the owner can execute the file.  When it holds s rather than x, whoever executes the file takes on the identity of the file’s owner (the setuid bit)
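
The reading of the owner flags described above can be sketched as a small shell function.  This is only an illustration: owner_exec_flag is a made-up helper name, and it works on a permission string that you pass in rather than on a live file.

```shell
#!/bin/sh
# Classify the owner-execute position of an "ls -l" style permission string.
# owner_exec_flag is a hypothetical helper name, not a standard utility.
owner_exec_flag() {
    mode=$1
    # Layout: file type (1 char), then owner rwx, group rwx, other rwx.
    # Drop the first three characters, then keep the next one (owner execute).
    rest=${mode#???}
    flag=${rest%"${rest#?}"}
    case $flag in
        s) echo "setuid, executable" ;;
        S) echo "setuid, not executable" ;;
        x) echo "executable" ;;
        -) echo "not executable" ;;
        *) echo "unknown" ;;
    esac
}

owner_exec_flag "-rwsr-xr-x"   # a setuid program
owner_exec_flag "-rwxr-xr-x"   # an ordinary program
```

On a live system, find /bin -perm -4000 is another way to list the files that have the setuid bit set.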

Taking on the root role via membership in the wheel group

By convention, Linux uses a group named wheel as a surrogate group whose members can take on the role of the administrator.

Where did the name wheel come from?

http://en.wikipedia.org/wiki/Wheel_%28Unix_term%29

In computing, the term wheel refers to a user account with a wheel bit, a system setting that provides additional special system privileges that empower a user to execute restricted commands that ordinary user accounts cannot access.  The term is derived from the slang phrase big wheel, referring to a person with great power or influence.

What is the “Wheel Group”?

http://en.wikipedia.org/wiki/Wheel_%28Unix_term%29

Modern Unix systems use user groups to control access privileges. The wheel group is a special user group used on some Unix systems to control access to the su command, which allows a user to masquerade as another user (usually the super user).

Adding user to the wheel group

We can modify user accounts via the graphical interface or via a command-shell utility such as usermod.

Command Shell – Utility – usermod

To modify user accounts, Linux relies on the usermod utility.  Here are a few quick points:

  • The file’s full name is /usr/sbin/usermod
  • One can change the user’s home directory via the -d (--home) option
  • One can change the user’s primary group via the -g (--gid) option
  • One can wholly replace the user’s supplementary group membership via the -G (--groups) option
  • One can add to the user’s existing groups by combining the -a (--append) option with -G
  • One can change the user’s shell by using the -s (--shell) option
  • One can unlock an account by using the -U (--unlock) option

Usermod – Add user to the wheel group

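To add our user (dadeniji, in this case) to the wheel group, please do the following:

Syntax:

   usermod -g <group-name> <username>

Sample:

   usermod -g wheel dadeniji

Note that -g makes wheel the user’s primary group.  To keep the existing primary group and add wheel as a supplementary group instead, use usermod -a -G wheel dadeniji.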

Thankfully, you get clear, indicative messages when the group or user name does not exist on the system:

  • user does not exist
  • group does not exist

usermod

When things are good, we get no feedback.

usermodGood

Groups – Review User Group Membership

Get user groups


Syntax:

   groups <username>

Sample:

   groups dadeniji

Output:

listUserGroups

Groups – List all users in a group

List all users in a group


Sample:

   grep :`grep ^wheel /etc/group | cut -d: -f3`: /etc/passwd

Output:

listAllUsersInAGroup

Explanation of Script:

  • The surrounding back-ticks (`) mean that the inner command is run first and its output is substituted into the outer command, rather than displayed on the console
  • What does the inner command do?  grep ^wheel /etc/group gets the line in /etc/group that starts with the word wheel.  In its entirety that line reads “wheel:x:10:”
  • The output of “grep ^wheel /etc/group” is piped (|) to the cut utility.  The syntax “cut -d: -f3” gets the third field using colon (:) as the delimiter.  So when we ask for the third field of “wheel:x:10:”, we get back 10, which is the group ID for wheel
  • Please note that you need the colons (:) around the inner command; without them I got an extraneous row, as in the code and output pasted below:

Code (code and console output):


Command:
    grep -e  "`grep ^wheel /etc/group | cut -d: -f3`"  /etc/passwd

Output:
    uucp:x:10:14:uucp:/var/spool/uucp:/sbin/nologin
    games:x:12:100:games:/usr/games:/sbin/nologin
    dadeniji:x:500:10:Daniel Adeniji:/home/dadeniji:/bin/bash

-------------------------------------------------------------------------------
Command:
    grep -e  :"`grep ^wheel /etc/group | cut -d: -f3`":  /etc/passwd

Output:
   uucp:x:10:14:uucp:/var/spool/uucp:/sbin/nologin
   dadeniji:x:500:10:Daniel Adeniji:/home/dadeniji:/bin/bash

Output (Screen shot):

listAllUsersInAGroupCorrection

Explanation:

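  • Without the colons (:), one will see the extra record for games.  The games group ID is not 10, but 100
  • Even with the colons, uucp is still listed.  Its group ID is 14, but its user ID (the third field of /etc/passwd) is 10, and the pattern matches :10: anywhere on the line
  • /etc/passwd records only each user’s primary group, so users who hold wheel as a supplementary group are not listed by this command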
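
A stricter variant compares only the group ID field of /etc/passwd, and also reads the member list kept in /etc/group.  The sketch below is a minimal illustration that runs against sample files rather than the live system.  The function name group_members is made up, the sample rows are taken from the output above, and jdoe is a hypothetical user added to show a supplementary member.

```shell
#!/bin/sh
# Sample data modelled on the output above; jdoe is a hypothetical extra user.
workdir=$(mktemp -d)
cat > "$workdir/passwd" <<'EOF'
uucp:x:10:14:uucp:/var/spool/uucp:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
dadeniji:x:500:10:Daniel Adeniji:/home/dadeniji:/bin/bash
jdoe:x:501:501:J Doe:/home/jdoe:/bin/bash
EOF
cat > "$workdir/group" <<'EOF'
wheel:x:10:jdoe
games:x:100:
EOF

# Print every user that belongs to a group, one per line.
# $1 = group name, $2 = group file, $3 = passwd file
group_members() {
    # Look up the numeric group ID (field 3) of the named group (field 1).
    gid=$(awk -F: -v g="$1" '$1 == g { print $3 }' "$2")
    {
        # Primary members: field 4 of passwd equals the group ID exactly.
        awk -F: -v gid="$gid" '$4 == gid { print $1 }' "$3"
        # Supplementary members: comma-separated list in field 4 of group.
        awk -F: -v g="$1" '$1 == g && $4 != "" { print $4 }' "$2" | tr ',' '\n'
    } | sort -u
}

group_members wheel "$workdir/group" "$workdir/passwd"
```

Because the comparison is made on a single field, uucp (user ID 10) and games (group ID 100) no longer show up.  On a live system, getent group wheel prints the group’s entry, including its supplementary members.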

Ensure that wheel has sudo access via customization of sudoers

Why bother with sudoers?

If an account tries to access sudo without membership in the wheel group or the wheel group is not fully configured for sudo access via the sudoers file, then the error message pasted below will come up:

Output (Text):

<account> is not in the sudoers file.  This incident will be reported.


Output (Screenshot):

no-sudo-access

Email

The next time you, the root user, access your system, you will get a nice little notification telling you that you have mail waiting:

You have mail in /var/spool/mail/root

 

To view the email, issue something like:

tail /var/spool/mail/root

  

Screen shot:

EmailNotification

The email sent by the gossip delegator (sudo) is quite straightforward.  The areas covered include:

Email Header:

  • To: root
  • From: dadeniji (in our case)
  • Auto-Submitted: auto-generated
  • Subject: **SECURITY information for <hostname>
  • Message-Id: ******
  • Date: *****


Email Contents:



rachel : May 12 17:41:27 : dadeniji : user NOT in sudoers ; TTY=pts/2 ; PWD=/home/dadeniji ; USER=root ; COMMAND=/bin/ls
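
The alert line is made up of KEY=value pairs separated by semi-colons, so it is easy to pick apart.  Here is a rough sketch; sudo_alert_field is a made-up helper name, and it assumes the layout shown in the line above (it would mis-split a command that itself contains a semi-colon).

```shell
#!/bin/sh
# Pull one KEY=value field out of a sudo alert line.
# sudo_alert_field is a hypothetical helper name.
# $1 = field name (TTY, PWD, USER or COMMAND), $2 = the alert line
sudo_alert_field() {
    # Put each ";"-separated piece on its own line, then print the value of
    # the piece that starts with "<name>=", trimming surrounding spaces.
    printf '%s\n' "$2" | tr ';' '\n' | sed -n "s/^ *$1=\(.*[^ ]\) *\$/\1/p"
}

line='rachel : May 12 17:41:27 : dadeniji : user NOT in sudoers ; TTY=pts/2 ; PWD=/home/dadeniji ; USER=root ; COMMAND=/bin/ls'
sudo_alert_field COMMAND "$line"
sudo_alert_field USER "$line"
```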

Using visudo

Launch visudo:

visudo

Look for the lines that reference the wheel group:

Shipped:



## Allows people in group wheel to run all commands
# %wheel        ALL=(ALL)       ALL

## Same thing without a password
# %wheel        ALL=(ALL)       NOPASSWD: ALL

  • The statements are refreshingly well documented
  • I suggest that you un-comment the line that references wheel but makes no mention of NOPASSWD

Revised:



## Allows people in group wheel to run all commands
%wheel        ALL=(ALL)       ALL

## Same thing without a password
# %wheel        ALL=(ALL)       NOPASSWD: ALL

Corrected:

visudo - wheels (corrected)
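
To double-check the edit without eyeballing the file, one can test whether a %wheel rule is active, that is, not commented out.  The sketch below is an assumption-laden illustration: wheel_rule_state is a made-up helper name, and it reads sudoers-style text from standard input (here the shipped and revised snippets above) rather than the real /etc/sudoers.

```shell
#!/bin/sh
# Report whether sudoers-style text (read from standard input) has an active
# %wheel rule.  wheel_rule_state is a hypothetical helper name.
wheel_rule_state() {
    awk '
        /^[ \t]*%wheel[ \t]/ {            # active line (not starting with #)
            if ($0 ~ /NOPASSWD/) nopass = 1; else pass = 1
        }
        END {
            if (pass)        print "enabled"
            else if (nopass) print "enabled without password"
            else             print "disabled"
        }'
}

shipped='## Allows people in group wheel to run all commands
# %wheel        ALL=(ALL)       ALL'

revised='## Allows people in group wheel to run all commands
%wheel        ALL=(ALL)       ALL

## Same thing without a password
# %wheel        ALL=(ALL)       NOPASSWD: ALL'

printf '%s\n' "$shipped" | wheel_rule_state
printf '%s\n' "$revised" | wheel_rule_state
```

On a live system, visudo -c checks the sudoers file for syntax errors without opening an editor.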

Validation


sudo ls -la *

Output:

sudo (corrected)

Now when we issue sudo <command> and supply our account’s (dadeniji) password, we are good.


Technical – Hadoop – Hive – What is the Version # of Hive Service and Clients that you are running?


Introduction

Hadoop is a speeding bullet.  You look online, Google for things, try it out, and sometimes you hit, but often you miss.

What do I mean by that?

Well this evening I was trying to play with Hive; specifically using Sqoop to import a table from MS SQL Server into Hive.

A bit of background: my MS SQL Server table has a couple of columns declared as datetime.

I ran the Sqoop statement pasted below:



sqoop import \
--connect "jdbc:sqlserver://sqlServerLab;database=DEMO" \
--username "dadeniji" \
--password "l1c0na" \
--driver "com.microsoft.sqlserver.jdbc.SQLServerDriver" \
-m 1 \
--hive-import \
--hive-table "customer" \
--table "dbo.customer" \
--split-by "customerID"

 

The above command basically gives the following instruction set:

  • Via the JDBC driver (jdbc:sqlserver), connect to the SQL Server instance (sqlServerLab) and database DEMO
  • Use the following SQL Server credentials: username dadeniji, password l1c0na
  • JDBC driver’s class name: com.microsoft.sqlserver.jdbc.SQLServerDriver
  • Number of map tasks: 1 (-m 1)
  • Sqoop operation: import into Hive (--hive-import)
  • Hive table: customer
  • SQL Server table: dbo.customer
  • Split-by column: customerID

In the Sqoop console log output, I noticed a couple of warnings:



INFO manager.SqlManager: Executing SQL statement: 
SELECT t.* FROM dbo.customer AS t WHERE 1=0

WARN hive.TableDefWriter: Column InsertTime had to be cast to a less 
precise type in Hive

WARN hive.TableDefWriter: Column salesDate had to be cast to a less 
precise type in Hive

Processing

Explore MS SQL Server

So I quickly went back and looked at my SQL Server table:

  use [Demo];
  exec sp_help 'dbo.customer';

Output:

Hadoop - Sqoop - MS SQL Server - dbo.customer

The output is congruent with my thoughts:

  • The InsertTime is a datetime column
  • The salesDate is a datetime column

Explore Hive

Launch Hive:

In shell, issue “hive” to initiate Hive Shell:


hive

List all tables:

To confirm that a corresponding table has been created in Hive, use show tables:


show tables;

Output:

Hadoop - Sqoop - Client - Show tables

Display Table Structure (customer):

Display table structure using describe:


Syntax:
    describe <table-name>;

Sample:

    describe customer;

Output:

Hadoop - Sqoop - Client - Describe -- customer

Explanation:

  • So it is obvious that our two original MS SQL Server datetime columns (InsertTime and salesDate) were not brought in as a date and time type, but as String

So I am wondering why.

Hive Datatype Support

I know that Timestamp was not one of the original datatypes supported by Hive.  It was added in Hive version 0.8.0.

This is noted in:

HortonWorks – Hive – Language Manual – Datatypes
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.0.2/ds_Hive/language_manual/datatypes.html
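
Once we know the Hive version we are running, checking it against 0.8.0 has one catch: the comparison must be done number by number, because as plain text “0.10” sorts before “0.8”.  A minimal sketch follows; version_ge is a made-up helper name, and it only understands purely numeric dotted versions.

```shell
#!/bin/sh
# Succeed (exit status 0) when dotted version $1 is at or above version $2.
# version_ge is a hypothetical helper name; it compares numeric parts only.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, ".")
        nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            # Missing parts count as 0, so 0.10 equals 0.10.0.
            xi = x[i] + 0
            yi = y[i] + 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

if version_ge "0.10.0" "0.8.0"; then
    echo "0.10.0 is at or above 0.8.0"
else
    echo "0.10.0 is below 0.8.0"
fi
```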

Determine Hive Version

There are a couple of ways to get Hive’s server and client version numbers.

Determine Hive Version – Command Shell – Using ps

Issue ps aux and filter for hive:



ps aux | grep -i "hive"

Output (Screen shot):

Hadoop - Hive - Version -- ps --aux

Output (Text):



hive     13767  0.0  1.9 841080 159768 ?       Sl   Apr15  17:00 /usr/java/default/bin/java -Xmx256m -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/hadoop/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/hadoop -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console -Djava.library.path=/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx268435456 -Djava.net.preferIPv4Stack=true -Xmx268435456 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:-CMSConcurrentMTEnabled -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/hive/lib/hive-service-0.10.0-cdh4.2.0.jar org.apache.hadoop.hive.metastore.HiveMetaStore -p 9083

410      13853  0.0  0.2 2207844 22824 ?       Ss   Apr15   0:00 postgres: hive hive 10.0.4.1(56963) idle            

410      13854  0.0  0.1 2206552 8388 ?        Ss   Apr15   0:00 postgres: hive hive 10.0.4.1(56964) idle            

dadeniji 18749  0.0  1.8 814332 152732 pts/0   Sl+  May10   0:21 /usr/java/default/bin/java -Xmx256m -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/hadoop/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/hadoop -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console -Djava.library.path=/opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx268435456 -Xmx268435456 -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/bin/../lib/hive/lib/hive-cli-0.10.0-cdh4.2.0.jar org.apache.hadoop.hive.cli.CliDriver

  • We have 4 processes bearing the “hive” name

Service Process

  • It is identifiable as a Hive Service via its name hive-service*.jar
  • It is running under the “hive” account name.  Its Process ID is 13767.  One of the Jar files referenced is /opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/hive/lib/hive-service-0.10.0-cdh4.2.0.jar 
  • The Cloudera Version# is 4.2 and Hive Version# is 0.10

Client Process

  • It is identifiable as a Hive Client via its name hive-cli*.jar
  • It is running under my username (dadeniji), as I kicked it off.  Its Process ID is 18749.  One of the Jar files referenced is /opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/bin/../lib/hive/lib/hive-cli-0.10.0-cdh4.2.0.jar
  • The Cloudera Version# is 4.2 and Hive Version# is 0.10

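PostgreSQL Processes

  • The Hive metastore keeps its data in a PostgreSQL database; the two postgres processes are idle connections from the hive user to the hive database
  • The processes are running under account 410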
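
Reading the version numbers off the jar file names by eye works, but it can also be scripted.  The sketch below assumes the naming pattern seen in the ps output above (hive-<component>-<hive version>-cdh<cdh version>.jar); hive_jar_versions is a made-up helper name.

```shell
#!/bin/sh
# Split a Hive jar file name into its component, Hive and CDH version parts.
# hive_jar_versions is a hypothetical helper name.
hive_jar_versions() {
    # Drop the directory part, then match hive-<component>-<hive>-cdh<cdh>.jar
    basename "$1" |
        sed -n 's/^hive-\([a-z]*\)-\([0-9.]*\)-cdh\([0-9.]*\)\.jar$/component=\1 hive=\2 cdh=\3/p'
}

hive_jar_versions /opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/hive/lib/hive-service-0.10.0-cdh4.2.0.jar
hive_jar_versions /opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/bin/../lib/hive/lib/hive-cli-0.10.0-cdh4.2.0.jar
```

Names that do not follow the pattern produce no output, which makes it easy to spot a layout that the sketch does not cover.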

Determine Hive Version – Cloudera Manager Admin Console & Command Shell

  • Launch a web browser
  • Connect to the Admin console (http://<clouderaManagerServices>:<port>).  In our case http://hadoopCMS:7180, as the Cloudera Manager Service is running on a machine named hadoopCMS and we kept the default port# of 7180
  • The initial screen displayed is the Service Status page (/cmf/services/status)
  • Click on the service we are interested in (hive1)
  • The service’s specific “Status and Health Summary” screen is displayed.  In this case, the “Hive1 – Status and Health Summary” page
  • In the row labelled “Hive MetaStore Server”, click on the link underneath the “Status” column
  • This will bring you to the “hivemetastore” summary page
  • For each Hive host, Hive process information and links to the Hive logs are displayed
  • On the “Show Recent Logs” row, click on the “Full Stdout” log
  • The stdout.log appears.  Here is a breakdown of what it provides

stdout.log


Mon Apr 15 21:06:24 UTC 2013
using /usr/java/default as JAVA_HOME

using 4 as CDH_VERSION

using /opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/hive 
    as HIVE_HOME

using /var/run/cloudera-scm-agent/process/22-hive-HIVEMETASTORE 
    as HIVE_CONF_DIR

using /opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/hadoop 
    as HADOOP_HOME

using /var/run/cloudera-scm-agent/process/22-hive-HIVEMETASTORE/hadoop-conf as HADOOP_CONF_DIR

Starting Hive Metastore Server
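
Each setting in the log follows the form “using <value> as <NAME>”, so the log can be turned into NAME=value pairs.  Here is a rough sketch; log_settings is a made-up helper name, and it assumes each setting sits on a single line (the wrapped lines shown above would need to be joined first).

```shell
#!/bin/sh
# Turn the "using <value> as <NAME>" lines of the log into NAME=value pairs.
# log_settings is a hypothetical helper name; input is read from standard input.
log_settings() {
    sed -n 's/^using \(.*\) as \([A-Z_]*\)$/\2=\1/p'
}

sample='Mon Apr 15 21:06:24 UTC 2013
using /usr/java/default as JAVA_HOME
using 4 as CDH_VERSION
using /opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/hive as HIVE_HOME
Starting Hive Metastore Server'

printf '%s\n' "$sample" | log_settings
```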

Java version

We quickly see that JAVA_HOME is defined as /usr/java/default.

To see what files constitute /usr/java/default

  ls /usr/java/default

Output:

Hadoop - Clopudera Manager - Java Version

Explanation:

  • /usr/java/default is symbolically linked to /usr/java/latest
  • /usr/java/latest is symbolically linked to /usr/java/jdk1.7.0_17 
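
A chain of links like this can be followed in one step with readlink -f.  The sketch below recreates the same chain in a scratch directory so that it can be run anywhere; the scratch layout is a stand-in for /usr/java, not the real thing.

```shell
#!/bin/sh
# Recreate the link chain described above inside a scratch directory,
# then resolve it.  The directory layout is a stand-in for /usr/java.
root=$(mktemp -d)
mkdir "$root/jdk1.7.0_17"
ln -s "$root/jdk1.7.0_17" "$root/latest"
ln -s "$root/latest" "$root/default"

# readlink -f follows every link in the chain and prints the final target.
resolved=$(readlink -f "$root/default")
basename "$resolved"
```

On the server itself, readlink -f /usr/java/default would print the JDK folder that JAVA_HOME finally points at.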

Cloudera Distribution version

Based on the log line below, the CDH Version is 4

using 4 as CDH_VERSION

Hive Home

Based on the log line below, the Hive Home is /opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/hive

using /opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/hive as HIVE_HOME

Again, let us return to the command shell and see what files are in /opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/hive

Please append /lib to get to the jar files, and list only the jar files that have hive in their names.



ls /opt/cloudera/parcels/CDH-4.2.0-1.cdh4.2.0.p0.10/lib/hive/lib/hive*.jar

Output:

Hadoop - Cloudera Manager - Server - ls Sqoop Jar files

Cloudera Manager Admin Console – Service Status

Hadoop - Cloudera Manager - Services - Status

Cloudera Manager Admin Console – Hive1 – Status and Health Summary

Hadoop - Cloudera Manager - Services - Status and Health Summary

Cloudera Manager Admin Console – Hive1 – Status Summary

Hadoop - Cloudera Manager - Hive - Status

Cloudera Manager Admin Console – Hive1 – Status Summary – Log – Stdout.log 

Hadoop - Cloudera Manager - Hive - Status - Log - stdout

Conclusion

Since Timestamp support arrived in Hive 0.8.0 and we are running Hive 0.10, our version of Hive should support the Timestamp datatype.

The problem is therefore more likely to be with the version of Sqoop we have running, or with Sqoop’s ability to detect SQL Server’s datetime datatype or datetime data representation in general.


Technical: Hadoop – ZooKeeper – Client – Cloudera


Introduction

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-Installation-Guide/cdh4ig_topic_21.html

ZooKeeper is a high-performance coordination service for distributed applications. It exposes common services — such as naming, configuration management, synchronization, and group services – in a simple interface so you don’t have to write them from scratch. You can use it off-the-shelf to implement consensus, group management, leader election, and presence protocols. And you can build on it for your own, specific needs.

What are we trying to do

Review the ZooKeeper client bundled with Cloudera Hadoop distribution.  The tool is known as ZooKeeper-Client.

Configuration Validation

Folder – /etc/init.d/*cloudera*


ls -la /etc/init.d/*cloudera*

Screen Shot:

Listing -- etc--init-d--cloudera

Service – Status all



sudo service --status-all 2> /dev/null | grep -i "cloudera"

Screen Shot:

Service -- status-all

ZooKeeper Client

ZooKeeper Client – Launch Shell

To launch the ZooKeeper client shell, please issue:


zookeeper-client

Output:

zookeeper-client Launch

Press Enter to get a command prompt.

ZooKeeper Client – Help

To get a listing of commands:


help

Output:

zookeeper-help

ZooKeeper Client – Quit

To close the shell, issue the Quit command.


quit

Output:

ZooKeeper -- Quit

ZooKeeper Client – Connect

To connect to another ZooKeeper server, issue connect:



Syntax:
   connect <hostname>:<portNumber>

Sample:
   connect hadoopHR:2181

Output:

ZooKeeper - Client - ConnectTo

ZooKeeper Client – ls

ZooKeeper’s primary objects are znodes, which are laid out like folders and files.  To get a list of folders and files, issue:

Syntax:

   ls <folder-name>

Sample:

   ls /
   ls /hbase
   ls /zookeeper

On our install, the base folders are /hbase (created by HBase) and /zookeeper.

Output:

ls /hbase

Zookeeper - Client -- ls hbase

 ls /zookeeper

Zookeeper - Client -- ls zookeeper

ZooKeeper Client – Create Folder

To create your own folders (the second argument is the data stored in the folder):

Syntax:

   create <folder-name> <data>

Sample:

   create  /corporate corp 
   create  /corporate/HR  corpHR

Zookeeper - Client - Create Folders

ZooKeeper Client – Remove Folder

To remove your own folders (rmr removes a folder and everything beneath it):

Syntax:

   rmr <folder-name>

Sample:

   rmr  /corpSec8

ZooKeeper - Client - Remove Folder

ZooKeeper Client – getAcl

To get permissions, issue the getAcl command.

To get the permission set for folder /corporate/HR:

Syntax:

   getAcl <folder-name>

Sample:

   getAcl  /corporate/HR

Output:

Zookeeper - Client - getAcl

Explanation:

  • Scheme -> world
  • User -> anyone (default and only allowable user)
  • Permission –> crdwa
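
Each letter of the permission set stands for one right: c (create), r (read), d (delete), w (write) and a (admin, the right to change the ACL itself).  A small sketch that spells a permission string out follows; describe_perms is a made-up helper name.

```shell
#!/bin/sh
# Spell out a ZooKeeper permission string such as "crdwa".
# describe_perms is a hypothetical helper name.
describe_perms() {
    perms=$1
    out=""
    while [ -n "$perms" ]; do
        # Peel off the first letter of what is left.
        rest=${perms#?}
        letter=${perms%"$rest"}
        perms=$rest
        case $letter in
            c) name="create" ;;
            r) name="read" ;;
            d) name="delete" ;;
            w) name="write" ;;
            a) name="admin" ;;
            *) name="unknown($letter)" ;;
        esac
        out="${out}${out:+ }${name}"
    done
    echo "$out"
}

describe_perms crdwa
describe_perms crdw
```

Note that a permission set without the letter a, such as crdw, leaves nobody with the right to change the ACL afterwards.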

To get permission set for folder /advert:


Syntax:

   getAcl <folder-name>

Sample:

   getAcl  /advert

Output:

ZooKeeper - Client - getAcl - Digest Authentication (folder advert)

Explanation:

  • Scheme -> digest
  • User -> dadeniji
  • Password –> safetec
  • Permission –> crdw

ZooKeeper Client – setAcl

To set permissions, issue the setAcl command.

We have included the setAcl commands simply as an engineering exercise.  I discourage employing them for the following reasons:

  • The folder can become totally inaccessible
  • They are indomitable

Indomitable:

  • They cannot be removed; there is no resetAcl API
  • They are not cumulative; each setAcl replaces the previous permission set

ZooKeeper Client – setAcl – Scheme (Host)

To set permissions for specific hosts, or for hosts that are in the same domain, use the host scheme.

Everyone whose host name ends in corp.com:

Syntax:

   setAcl <folder-name> host:<domain-name>:<permission-set>

Sample:

   setAcl /advert host:corp.com:crwda

The host whose FQDN is appServer1.corp.com:

Syntax:

   setAcl <folder-name> host:<hostname>:<permission-set>

Sample:

   setAcl /advert host:appServer1.corp.com:cdrwa

ZooKeeper Client – setAcl – Scheme (IP Address)

To assign all permissions to a specific IP Address (10.0.4.70):

Syntax:

   setAcl   <folder-name> ip:<ipAddress>:<permission-set>

Sample:

   setAcl  /corpSec7 ip:10.0.4.70:cdrwa

ZooKeeper - Client - setAcl - IPAddress (folder corpSec7)

To validate that things are good, review the permission set with getAcl:

Syntax:

   getAcl   <folder-name>

Sample:

   getAcl  /corpSec7

Output:

ZooKeeper - Client - getAcl - IPAddress (folder corpSec7)

ZooKeeper Client – setAcl – Scheme (World)

Anyone

For the following use case scenario:

  • Folder -> /advert
  • Authentication Provider -> world
  • User -> anyone (the only valid user is the “anyone” user)
  • Permission -> crdwa

Syntax:

   setAcl <folder-name> world:anyone:<permission-set>

Sample:

   create /advert /advert
   setAcl /advert world:anyone:crdwa

Output:

Zookeeper - Client - setAcl - world

 

ZooKeeper Client – setAcl – Digest Authentication

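To restrict the /corporate folder to a specific user name and password, use the digest scheme.

For the following use case scenario:

  • Folder -> /corporate
  • Authentication Provider -> digest
  • User -> dadeniji
  • User password -> waTER
  • Permission -> crdwa

Syntax:

   setAcl <folder-name> digest:<username>:<password>:<permission-set>

Sample:

   setAcl /corporate digest:dadeniji:waTER:crdwa

Note that ZooKeeper treats the third component as the Base64-encoded SHA-1 digest of <username>:<password>, not as the clear-text password.  Passing clear text, as in the sample, stores an ACL that no password can satisfy, which is one way a folder becomes inaccessible.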

Output:

Zookeeper - Client - setAcl - Digest Authentication

For the following use case scenario:

  • Folder -> /advert
  • Authentication Provider -> digest
  • User -> dadeniji
  • User password -> safetec
  • Permission -> crdw

Syntax:

   setAcl <folder-name> digest:<username>:<password>:<permission-set>

Sample:

   setAcl /advert digest:dadeniji:safetec:crdw


Output:

ZooKeeper - Client - setAcl - Digest Authentication (folder advert)
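
For the digest scheme, ZooKeeper compares the identity stored in the ACL with <username>:Base64(SHA-1(<username>:<password>)), so the value handed to setAcl is normally that hashed form rather than the clear-text password.  Here is a sketch of how the hashed form can be produced with common command-line tools.  zk_digest_id is a made-up helper name, and the sketch assumes sha1sum and base64 are installed.

```shell
#!/bin/sh
# Build the identity ZooKeeper stores for the digest scheme:
#   <user>:base64(sha1("<user>:<password>"))
# zk_digest_id is a hypothetical helper name.
zk_digest_id() {
    user=$1
    pass=$2
    # sha1sum prints the hash as 40 hex characters; keep only the hash.
    hex=$(printf '%s' "$user:$pass" | sha1sum | cut -d' ' -f1)
    # Convert each pair of hex characters back to a raw byte, then Base64 it.
    hash=$(
        while [ -n "$hex" ]; do
            rest=${hex#??}
            pair=${hex%"$rest"}
            hex=$rest
            printf "\\$(printf '%03o' "0x$pair")"
        done | base64
    )
    printf '%s:%s\n' "$user" "$hash"
}

zk_digest_id dadeniji safetec
```

The printed value would take the place of dadeniji:safetec in the setAcl command.  A client then identifies itself in the ZooKeeper shell with addauth digest dadeniji:safetec before working with the folder.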

ZooKeeper Client – Stat

Get folder Stats:

Syntax:

   stat <folder-name>

Sample:

   stat  /hbase
   stat  /zookeeper

Output:

ZooKeeper - Client - Stat

Error Messages:

Error Message – NoAuthException


Exception in thread "main" org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth for /corpSec7
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.setACL(ZooKeeper.java:1375)
    at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:733)
    at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:593)
    at org.apache.zookeeper.ZooKeeperMain.executeLine(ZooKeeperMain.java:365)
    at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:323)
    at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:282)

Explanation:

  • One always has to be careful when setting permissions
  • And, it seems once they are set, it is difficult to change them

Logging:

Logging – Log File – Location

ZooKeeper log files are kept in /var/log/zookeeper

Logging – Log File – Name

The naming convention for log file is:

zookeeper-cmf-zookeeper1-SERVER-<FQDN>.log
