Excerpted from the documentation at opentsdb.net:
tcollector
tcollector is a client-side process that gathers data from local collectors and pushes the data to OpenTSDB. You run it on all your hosts, and it does the work of sending each host's data to the TSD.
OpenTSDB is designed to make it easy to collect and write data to it. It has a simple protocol, simple enough for even a shell script to start sending data. However, to do so reliably and consistently is a bit harder. What do you do when your TSD server is down? How do you make sure your collectors stay running? This is where tcollector comes in.
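To make the "simple protocol" concrete, here is a minimal sketch in Python of building and sending one datapoint with OpenTSDB's line-based put command. The function names, the host name web01, and the sample values are hypothetical; the port 4242 matches the TSD start-up command used later in this post. The sketch deliberately has no retry or buffering logic, which is the part tcollector adds.

```python
import socket
import time


def format_put(metric, value, tags, timestamp=None):
    """Build one protocol line: put <metric> <timestamp> <value> <tag=value ...>."""
    if not tags:
        raise ValueError("OpenTSDB expects at least one tag per datapoint")
    if timestamp is None:
        timestamp = int(time.time())
    tag_text = " ".join(f"{key}={val}" for key, val in sorted(tags.items()))
    return f"put {metric} {int(timestamp)} {value} {tag_text}\n"


def send_datapoint(line, host="localhost", port=4242, timeout=5.0):
    """Open a TCP connection to the TSD, write one line, and close.

    If the TSD is down this simply raises an exception: the datapoint is lost.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(line.encode("ascii"))


# Hypothetical sample values, with a fixed timestamp so the output is stable.
line = format_put("df.bytes.used", 1024, {"host": "web01", "mount": "/"},
                  timestamp=1356998400)
print(line, end="")  # put df.bytes.used 1356998400 1024 host=web01 mount=/
```

Calling send_datapoint(line) against a running TSD would deliver the point; everything around that call (reconnecting, queuing while the TSD is down, restarting collectors) is what the rest of this section describes.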
Tcollector does several things for you:
Runs all of your data collectors and gathers their data
Does all of the connection management work of sending data to the TSD
You don't have to embed all of this code in every collector you write
Does de-duplication of repeated values
Handles all of the wire protocol work for you, as well as future enhancements
Deduplication
Typically you want to gather data about everything in your system. This generates a lot of datapoints, the majority of which don't change very often over time (if ever). However, you want fine-grained resolution when they do change. Tcollector remembers the last value and timestamp that was sent for all of the time series for all of the collectors it manages. If the value doesn't change between sample intervals, it suppresses sending that datapoint. Once the value does change (or 10 minutes have passed), it sends the last suppressed value and timestamp, plus the current value and timestamp. In this way all of your graphs and such are correct. Deduplication typically reduces the number of datapoints TSD needs to collect by a large fraction. This reduces network load and storage in the backend. A future OpenTSDB release however will improve on the storage format by using RLE (among other things), making it essentially free to store repeated values.
Collecting lots of metrics with tcollector
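The suppression rule described above can be sketched as follows. This is an illustration of the behaviour as the text states it, not tcollector's actual code; the Deduplicator class, its method names, and the series key are hypothetical.

```python
DEDUP_INTERVAL = 600  # seconds; the "10 minutes" mentioned in the text


class Deduplicator:
    """Hold back repeated values per time series (hypothetical sketch)."""

    def __init__(self, interval=DEDUP_INTERVAL):
        self.interval = interval
        # series key -> [last value, timestamp last sent, held (ts, value) or None]
        self.state = {}

    def process(self, series, timestamp, value):
        """Return the list of (timestamp, value) points to send for this sample."""
        entry = self.state.get(series)
        if entry is None:
            # First sample of a series is always sent.
            self.state[series] = [value, timestamp, None]
            return [(timestamp, value)]

        last_value, last_sent, held = entry
        if value == last_value and timestamp - last_sent < self.interval:
            # Unchanged and still inside the window: hold the point back.
            entry[2] = (timestamp, value)
            return []

        # Value changed or the window expired: send the held point first so a
        # graph shows the old value lasting right up to the change.
        out = []
        if held is not None:
            out.append(held)
        out.append((timestamp, value))
        self.state[series] = [value, timestamp, None]
        return out


dedup = Deduplicator()
sent = []
for ts, value in [(0, 5), (15, 5), (30, 5), (45, 7)]:
    sent.extend(dedup.process("df.bytes.used mount=/", ts, value))
print(sent)  # [(0, 5), (30, 5), (45, 7)]
```

Four samples go in and three come out: the repeat at t=15 is dropped entirely, while the repeat at t=30 is only delayed until the change at t=45 makes it relevant.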
Collectors in tcollector can be written in any language. They just need to be executable and output the data to stdout. Tcollector will handle the rest. The collectors are placed in the collectors directory. Tcollector iterates over every directory named with a number in that directory and runs all the collectors in each directory. If you name the directory 60, then tcollector will try to run every collector in that directory every 60 seconds. Use the directory 0 for any collectors that are long-lived and run continuously. Tcollector will read their output and respawn them if they die. Generally you want to write long-lived collectors since that has less overhead. OpenTSDB is designed to have lots of datapoints for each metric (for most metrics we send datapoints every 15 seconds).
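As an illustration of what such a collector might look like, here is a minimal sketch in Python. It prints one line per metric to stdout in the form "metric timestamp value tag=value". The metric names reuse the df.bytes.* names registered later in this post; the function names and the mount tag are assumptions for the example, and the real dfstat.py collector is more thorough.

```python
import shutil
import sys
import time


def sample_lines(mount="/", now=None):
    """Return one output line per metric for the given mount point."""
    ts = int(time.time()) if now is None else int(now)
    usage = shutil.disk_usage(mount)  # named tuple: total, used, free (bytes)
    return [f"df.bytes.{name} {ts} {getattr(usage, name)} mount={mount}"
            for name in ("total", "used", "free")]


def run(interval=15, iterations=None):
    """Emit samples every `interval` seconds; iterations=None loops forever."""
    count = 0
    while iterations is None or count < iterations:
        for line in sample_lines():
            print(line)
        sys.stdout.flush()  # output goes through a pipe, so flush each round
        count += 1
        if iterations is None or count < iterations:
            time.sleep(interval)


run(iterations=1)  # a single pass, for demonstration only
```

A long-lived collector placed in the 0 directory would call run() with no iteration limit; a script placed in a directory such as 60 would make one pass and exit, leaving the scheduling to tcollector.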
If there are any non-numeric named directories in the collectors directory, then they are ignored. We've included a lib and etc directory for library and config data used by all collectors.
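The directory convention can be sketched like this: numeric directory names become run intervals in seconds, and everything else is skipped. The function name is hypothetical and this is not tcollector's own scanning code; the demonstration builds a throwaway directory tree rather than touching a real installation.

```python
import os
import tempfile


def find_collector_dirs(collectors_root):
    """Map each numerically named subdirectory to its interval in seconds."""
    intervals = {}
    for name in sorted(os.listdir(collectors_root)):
        path = os.path.join(collectors_root, name)
        if os.path.isdir(path) and name.isascii() and name.isdigit():
            intervals[path] = int(name)  # 0 marks long-lived collectors
        # anything else (lib, etc, stray files) is ignored
    return intervals


# Demonstration on a temporary directory tree.
with tempfile.TemporaryDirectory() as root:
    for name in ("0", "60", "lib", "etc"):
        os.mkdir(os.path.join(root, name))
    found = find_collector_dirs(root)
    summary = sorted((os.path.basename(path), secs) for path, secs in found.items())
    print(summary)  # [('0', 0), ('60', 60)]
```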
Installation of tcollector
You need to clone tcollector from GitHub:
git clone git://github.com/OpenTSDB/tcollector.git
and edit the 'tcollector/startstop' script to set the following variables:
TSD_HOST=dns.name.of.tsd
TCOLLECTOR_PATH=path/to/tcollector
To avoid having to run mkmetric for every metric that tcollector tracks, you can start TSD with the --auto-metric flag. This is useful to get started quickly, but it's not recommended to keep this flag in the long term, to avoid accidental metric creation.
Running instructions:
git clone git://github.com/OpenTSDB/tcollector.git
Configure tcollector
Edit tcollector/startstop:
TSD_HOST=localhost
TCOLLECTOR_PATH=/usr/hadoop/tcollector
Run tcollector
Start HBase
Start the TSD
./build/tsdb tsd --port=4242 --staticroot=build/staticroot --cachedir=/tmp/tsdtmp --zkquorum=localhost
Register the metrics
./build/tsdb mkmetric df.bytes.total df.bytes.used df.bytes.free df.inodes.total df.inodes.used df.inodes.free
Run tcollector
cd /usr/hadoop/tcollector
./startstop start
cp collectors/0/dfstat.py dfstat.py
./dfstat.py
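When a collector is run by hand like this, each line it prints should have the shape "metric timestamp value tag=value ...". A small checker along the following lines can confirm that before tcollector takes over. The pattern is a simplified assumption written for this post, not the validation tcollector itself performs, and the sample line is made up.

```python
import re

# Simplified pattern for illustration; tcollector's own checks may differ.
LINE_RE = re.compile(
    r"^(?P<metric>[-_./a-zA-Z0-9]+)\s+"
    r"(?P<timestamp>\d+)\s+"
    r"(?P<value>-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
    r"(?P<tags>(?:\s+[-_./a-zA-Z0-9]+=[-_./a-zA-Z0-9]+)*)\s*$"
)


def parse_line(line):
    """Return (metric, timestamp, value, tags), or None if the line is malformed."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    tags = dict(pair.split("=", 1) for pair in match.group("tags").split())
    return (match.group("metric"), int(match.group("timestamp")),
            float(match.group("value")), tags)


# Made-up sample line in the expected shape.
good = parse_line("df.bytes.used 1356998400 1024 mount=/ fstype=ext4")
bad = parse_line("not a datapoint")
print(good)
print(bad)  # None
```

Feeding the collector's real output through parse_line, line by line, flags anything that is not a well-formed datapoint.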
Inspect the HBase table records
scan 'tsdb-uid'
scan 'tsdb'