Setup HBase
In order to use OpenTSDB, you need to have HBase up and running. This page will help you get started with a simple, single-node HBase setup, which is good enough to evaluate OpenTSDB or monitor small installations. If you need scalability and reliability, you will need to set up a full HBase cluster.
You can copy-paste all the following instructions directly into a terminal.
Set up a single-node HBase instance
If you already have an HBase cluster, skip this step. If you're going to be using fewer than 5-10 nodes, stick to a single node. Deploying HBase on a single node is easy and can help get you started with OpenTSDB quickly. You can always scale to a real cluster and migrate your data later.
wget http://www.apache.org/dist/hbase/hbase-0.96.2/hbase-0.96.2-hadoop1-bin.tar.gz
tar xfz hbase-0.96.2-hadoop1-bin.tar.gz
cd hbase-0.96.2-hadoop1
At this point, you are ready to start HBase (without HDFS) on a single node. But before starting it, I recommend using the following configuration:
hbase_rootdir=${TMPDIR-'/tmp'}/tsdhbase
iface=lo`uname | sed -n s/Darwin/0/p`
cat >conf/hbase-site.xml <<EOF
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///$hbase_rootdir/hbase-\${user.name}/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.dns.interface</name>
    <value>$iface</value>
  </property>
  <property>
    <name>hbase.regionserver.dns.interface</name>
    <value>$iface</value>
  </property>
  <property>
    <name>hbase.master.dns.interface</name>
    <value>$iface</value>
  </property>
</configuration>
EOF
Make sure to adjust the value of hbase_rootdir if you want HBase to store its data somewhere more durable than a temporary directory. The default is to use /tmp, which means you'll lose all your data whenever your server reboots. The remaining settings are less important and simply force HBase to stick to the loopback interface (lo0 on Mac OS X, or just lo on Linux), which simplifies things when you're just testing HBase on a single node.
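For example, to keep the data on durable disk instead of /tmp, you could point hbase_rootdir at a persistent path before generating the config file (the path below is purely illustrative):
# Illustrative only: any persistent directory you own will do.
hbase_rootdir=/var/opt/tsdhbase
mkdir -p "$hbase_rootdir"
Then re-run the cat >conf/hbase-site.xml step above so the new path is picked up.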
Now start HBase:
./bin/start-hbase.sh
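To verify that HBase came up (optional; the exact output varies by version), you can pipe a status command into the HBase shell:
# Should report one live server on a healthy single-node setup.
echo "status" | ./bin/hbase shell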
Using LZO
There is no reason not to use LZO with HBase. Except in rare cases, the CPU cycles spent on LZO compression and decompression pay for themselves by saving you time otherwise wasted on extra I/O. This is certainly true for OpenTSDB, where LZO can easily compress OpenTSDB's binary data by 3 to 4x. Installing LZO is simple and is done as follows.
Prerequisites
In order to build hadoop-lzo, you need to have Ant installed as well as liblzo2 with development headers:
apt-get install ant liblzo2-dev              # Debian/Ubuntu
yum install ant ant-nodeps lzo-devel.x86_64  # RedHat/CentOS/Fedora
brew install lzo                             # Mac OS X
Compile & Deploy
Thanks to our friends at Cloudera for maintaining the Hadoop-LZO package:
git clone git://github.com/cloudera/hadoop-lzo.git
cd hadoop-lzo
CLASSPATH=path/to/hadoop-core-1.0.4.jar CFLAGS=-m64 CXXFLAGS=-m64 ant compile-native tar
hbasedir=path/to/hbase
mkdir -p $hbasedir/lib/native
cp build/hadoop-lzo-0.4.14/hadoop-lzo-0.4.14.jar $hbasedir/lib
cp -a build/hadoop-lzo-0.4.14/lib/native/* $hbasedir/lib/native
Restart HBase and make sure you create your tables with COMPRESSION => 'LZO'.
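For example, enabling LZO when creating OpenTSDB's data table from the HBase shell might look like this (the table name tsdb and column family t follow OpenTSDB's default schema; adjust if yours differs):
# Hypothetical example: create the tsdb table with an LZO-compressed family.
echo "create 'tsdb', {NAME => 't', COMPRESSION => 'LZO'}" | ./bin/hbase shell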
Common gotchas:
- Where to find hadoop-core-1.0.4.jar? On a normal, production HBase install, it will be under HBase's lib/ directory. In your development environment it may be stashed under HBase's target/ directory; use find to locate it.
- On Mac OS X, you may get "error: Native java headers not found. Is $JAVA_HOME set correctly?" when configure is looking for jni.h, in which case you need to insert CPPFLAGS=-I/System/Library/Frameworks/JavaVM.framework/Versions/Current/Headers before CLASSPATH in the 3rd command above (the one that invokes ant).
- On RedHat/CentOS/Fedora you may have to specify where Java is by adding JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk.x86_64 (or similar) to the ant command-line, before the CLASSPATH.
- On RedHat/CentOS/Fedora, if you get the weird error message "Cause: the class org.apache.tools.ant.taskdefs.optional.Javah was not found." then you need to install the ant-nodeps package.
- The build may fail with [javah] Error: Class org.apache.hadoop.conf.Configuration could not be found. in which case you need to apply this change to build.xml.
- On Ubuntu, the build may fail to compile the code with LzoCompressor.c:125:37: error: expected expression before ',' token. As per HADOOP-2009, the solution is to add LDFLAGS='-Wl,--no-as-needed' to the command-line.
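Once the jar and native libraries are in place, you can check that HBase actually loads the codec with its built-in compression test utility (the file argument is just a scratch path):
# Exercises the LZO codec end-to-end; completes without an exception if it works.
./bin/hbase org.apache.hadoop.hbase.util.CompressionTest file:///tmp/lzo-test lzo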
Migrating to a real HBase cluster
TBD. In short:
- Shut down all your TSDs.
- Shut down your single-node HBase cluster.
- Copy the directories named tsdb and tsdb-uid from your local filesystem to the HDFS cluster backing your real HBase cluster (see the sketch after this list).
- Run ./bin/hbase org.jruby.Main ./bin/add_table.rb /hdfs/path/to/hbase/tsdb and again for the tsdb-uid directory.
- Restart your real HBase cluster (sorry).
- Restart your TSDs after making sure they now use your real HBase cluster.
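A minimal sketch of the copy step, assuming the default hbase_rootdir from the single-node setup above and /hbase as your cluster's HBase root on HDFS (both paths are assumptions; adjust to your layout):
# Source: the single-node data dir from the config earlier (hbase-<user>
# matches the Java user.name, usually your login). Destination: cluster root.
hadoop fs -copyFromLocal /tmp/tsdhbase/hbase-$USER/hbase/tsdb /hbase/tsdb
hadoop fs -copyFromLocal /tmp/tsdhbase/hbase-$USER/hbase/tsdb-uid /hbase/tsdb-uid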
Putting HBase in production
TBD. In short:
- Stay on a single node unless you can deploy HBase on at least 5 machines, preferably at least 10.
- Make sure you have LZO installed and make sure it’s enabled for the tables used by OpenTSDB.
- TBD…