Four winning ways to monitor machines through Web interfaces

System administrators need to keep an eye on their servers to make sure things are running smoothly. If they find a problem, they need to see when it started, so investigations can focus on what happened at that time. That means logging information at regular intervals and having a quick way to analyse this data. Here's a look at several tools that let you monitor one or more servers from a Web interface.

Each tool has a slightly different focus, so we'll take them all for a spin to give you an idea of which you might like to install on your machine. The language and design they use to perform statistic logging can have an impact on efficiency. For example, collectd is written in C and runs as a daemon, so it does not have to create any new processes in order to gather system information. Other collection programs might be written in Perl and be regularly spawned from cron. While your disk cache will likely contain the Perl interpreter and any Perl modules used by the collection program, the system will need to spawn one or more new processes regularly to gather the system information.

RRDtool

The tools we'll take a look at often build on other tools, most notably RRDtool, which stores time series data and graphs it.

The RRDtool project focuses on making the storage of new time series data have a low impact on the disk subsystem. You might at first think this is no big deal. If you are just capturing a few values every 5-10 seconds, appending them to the end of a file shouldn't even be noticeable on a server. Then you may start monitoring the CPU (load, nice values, each core -- maybe as many as 16 individual values), memory (swap, cache sizes -- perhaps another five), the free space on your disks (20 values?) and a collection of metrics from your UPS (perhaps the most interesting 10 values). Even without considering network traffic, you can be logging 50 values from your system every 10 seconds.

RRDtool is designed to write these values to disk in 4KB blocks instead of as little 4- or 8-byte writes, so the system does not have to perform work every logging interval. If another tool wants all the data, it can issue a flush, ensuring that all the data RRDtool might be caching is stored to disk. RRDtool is also smart about telling the Linux kernel what caching policy to use on its files. Since data is written 4KB at a time, it doesn't make sense to keep it around in caches, as it will be needed again only if you perform analysis, perhaps using the RRDtool graph command.

As system monitoring tools write to files often, you might like to optimize where those files are stored. On conventional spinning disks, filesystem seeks are more expensive than sequentially reading many blocks, so when the Linux kernel needs to read a block it also reads some of the subsequent blocks into cache just in case the application wants them later. Because RRDtool files are written often, and usually only in small chunks the size of single disk blocks, you might do better turning down readahead for the partition you are storing your RRD files on. You can minimize the block readahead value for a device to two disk blocks with the blockdev program from util-linux-ng with a command like blockdev --setra 16 /dev/sdX. Turning off atime updates and using writeback mode for the filesystem and RAID will also help performance. RRDtool provides advice for tuning your system for RRD.
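Concretely, the tuning described above might look like the commands below. /dev/sdX and the mount point are placeholders for your own RRD partition, and the commands need root:

```shell
# Readahead is counted in 512-byte sectors: 16 sectors = 8KB,
# i.e. two 4KB filesystem blocks.
blockdev --getra /dev/sdX      # inspect the current readahead value
blockdev --setra 16 /dev/sdX   # shrink it for the RRD partition

# Stop atime updates on the filesystem holding the RRD files
# (mount point is an example).
mount -o remount,noatime /var/lib/rrd
```

Note that the --setra value does not persist across reboots, so you would normally put it in a boot script as well.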

collectd

The collectd project is designed for repeatedly collecting information about your systems. While the tarball includes a Web interface to analyse this information, the project says this interface is a minimal example and cites other projects such as Cacti for those who are looking for a Web interface to the information gathered by collectd.

There are collectd packages for Debian Etch and Fedora 9, and a 1-Click install for openSUSE. The application is written in C and is designed to run as a daemon, making it a low-overhead application that is capable of logging information at short intervals without significant impact on the system.

When you are installing collectd packages you might like to investigate the plugin packages too. One of the major strengths of collectd is support through plugins for monitoring a wide variety of information about systems, such as databases, UPSes, general system parameters, and NFS and other server performance. Unfortunately the plugins pose a challenge to the packagers. For openSUSE you can simply install the plugins-all package. The version (4.4.x) packaged for Fedora 9 is too old to include the PostgreSQL plugin. The Network UPS Tools (NUT) plugin is not packaged for Debian, openSUSE, or Fedora.

The simplest way to resolve this for now is to build collectd from source, configuring exactly the plugins that you are interested in. Some of the plugins that are not commonly packaged at the time of writing but that you may wish to use include NUT, netlink, postgresql, and iptables. Installation of collectd follows the normal ./configure; make; sudo make install process, but your configure line will likely be very long if you specify which plugins you want compiled. The installation procedure and plugins I selected are shown in the below command block. I used the init.d startup file provided in the contrib directory and had to change a few of the paths because of the private prefix I used to keep the collectd installation under a single directory tree. Note that I also had to build a private copy of iproute2 in order to get the libnetlink library on Fedora 9.

 
 
$ cd ./collectd-4.5.1
$ ./configure --prefix=/usr/local/collectd \
    --with-perl-bindings=INSTALLDIRS=vendor \
    --without-libiptc --disable-ascent --disable-static \
    --enable-postgresql --enable-mysql --enable-sensors \
    --enable-email --enable-apache --enable-perl \
    --enable-unixsock --enable-cpu --enable-nut \
    --enable-xmms --enable-notify_email --enable-notify_desktop \
    --disable-ipmi --with-libnetlink=/usr/local/collectd/iproute2-2.6.26
$ make
...
$ sudo make install
$ su
# install -m 700 contrib/fedora/init.d-collectd /etc/init.d/collectd
# vi /etc/init.d/collectd
...
CONFIG=/usr/local/collectd/etc/collectd.conf
...
daemon /usr/local/collectd/sbin/collectd -C "$CONFIG"
# chkconfig collectd on

If more of the optional collectd plugins are packaged in the future, you may be able to install your desired build of collectd without having to resort to building from source.

Before you start collectd, take a look at etc/collectd.conf and make sure you like the list of plugins that are enabled and their options. The configuration file defines a handful of global options followed by some LoadPlugin lines that nominate which plugins you want collectd to use. Configuration of each plugin is done inside a <Plugin foo>...</Plugin> scope. You should also check in your configuration file that the rrdtool plugin is enabled and that the DataDir parameter is set to a directory that exists and which you are happy to have variable data stored in.
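A minimal collectd.conf along the lines described might look like this. The plugin selection, interval, and DataDir path are examples only, chosen to match the private prefix used earlier:

```
Interval 10

LoadPlugin cpu
LoadPlugin memory
LoadPlugin rrdtool

<Plugin rrdtool>
    DataDir "/usr/local/collectd/var/lib/collectd/rrd"
</Plugin>
```

Plugins that take no options need only the LoadPlugin line; the rrdtool plugin gets its own scope because DataDir must point at a writable directory.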

When you have given a brief look at the plugins that are enabled and their options, you can start collectd by running service collectd start as root.

To see the information collectd gathers you have to install the Web interface or another project such as Cacti. The below steps should install the basic CGI script that is supplied with collectd. The screenshot that follows the steps shows the script in action.

 
 
# yum install rrdtool-perl
# cp contrib/collection.conf /etc/
# vi /etc/collection.conf
datadir: "/var/lib/collectd/rrd/"
libdir: "/usr/local/collectd/lib/collectd/"
# cp collection.cgi /var/www/cgi-bin/
# chgrp apache /var/www/cgi-bin/collection.cgi

If you want to view your collectd data on a KDE desktop, check out kcollectd, a young but already useful project. You can also integrate the generated RRDtool files with Cacti, though the setup process is very long-winded for each graph.

With collectd the emphasis is squarely on monitoring your systems, and the provided Web interface is offered purely as an example that might be of interest. Being a daemon written in C, collectd can also run with minimal overhead on the system.



Cacti

While collectd emphasizes data collection, Cacti is oriented toward providing a nice Web front end to your system information. Whereas collectd runs as a daemon and collects its information every 10 seconds without spawning processes, Cacti runs a PHP script every five minutes to collect information. (These time intervals for the two projects are the defaults and are both user configurable.) The difference in default values gives an indication of how frequently each project thinks system information should be gathered.
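On packaged installations the five-minute interval typically comes from a cron entry along these lines; the user name and poller path vary by distribution and are shown here only as examples:

```
*/5 * * * * cacti php /usr/share/cacti/poller.php > /dev/null 2>&1
```

Each run of the poller spawns PHP afresh, which is the process-creation overhead the collectd comparison above refers to.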

Cacti is packaged for Etch, Fedora 9, and openSUSE 11. I used the Fedora packages on a 64-bit Fedora 9 machine.

Once you have installed Cacti, you might get the following error when you try to visit http://localhost/cacti if your packages have not set up a database for you. The Cacti Web site has detailed instructions to help you set up your MySQL database and configure Cacti to connect.

 
 
FATAL: Cannot connect to MySQL server on 'localhost'. Please make sure you have specified a valid MySQL database name in 'include/config.php'
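The site's database setup boils down to something like the following commands. The database name, user, password, and the path to the cacti.sql schema file are all examples here; check your package's documentation for the real locations:

```shell
mysqladmin -u root -p create cacti
mysql -u root -p cacti < /usr/share/doc/cacti/cacti.sql
mysql -u root -p -e "GRANT ALL ON cacti.* TO 'cactiuser'@'localhost' \
    IDENTIFIED BY 'secret'; FLUSH PRIVILEGES;"
```

After that, the same database name, user, and password go into include/config.php so Cacti can connect.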

When you first connect to your Cacti installation in a Web browser you are presented with a wizard to complete the configuration. Cacti presents you with the paths to various tools, SNMP settings, and the PHP binary, and asks which version of rrdtool you have. Although Cacti found my rrdtool, I still had to tell it explicitly the version of rrdtool I had. While this information was easy to supply, a button on the wizard offering to execute rrdtool and figure it out from the --version string would have been a plus.

After the paths and versions are collected, Cacti asks you to log in using the default username and password. When you log in you immediately have to change the admin user's password.

You are then taken to the console, where you can manage devices, create graphs, and view your graphs. This might lead you to believe that there are no graphs already created. Clicking on the graphs tab, you should see that you already have a small collection of graphs: Memory Usage, Load Average, Logged in Users, and Processes. The graphs view is shown in the screenshot below.

Additional information-gathering scripts available for Cacti let you expand what information Cacti can monitor. For example, they let you collect the load and input and output voltage of UPS devices.

Most of these projects' Web interfaces allow you to view your statistics using predefined time intervals such as hour, day, and week. Cacti goes a big step further and allows you to specify the exact interval that you are interested in through the Web interface.

Cacti offers the most functional and polished Web interface among these projects. It lets you select the time interval displayed on your graphs from more predefined settings than the other tools, and it is the only one of them that lets you explicitly nominate a custom start and end time for your graph.




Monitorix

Monitorix shows you system information at a glance in three graphs: a central one on the left to give overview information and two smaller graphs on the right to give related details. It includes a Perl daemon that collects the statistics for your systems and a CGI Web interface that allows you to analyse the data.

There are no Monitorix packages in the Fedora, openSUSE, or Debian repositories. The Monitorix download page offers a noarch RPM file as well as a tarball for non-Red Hat/Fedora-based distributions. I used version 1.2.1 on a 64-bit Fedora 9 machine, installed from the noarch RPM file. If you are installing on a Debian-based system, in the monitorix-1.2.1 tarball there is an install.sh file that will copy the files to the correct path for your distribution, and an un_install.sh file, should you decide to remove Monitorix. You need to install Perl bindings for RRDtool (rrdtool-perl on Fedora 9, librrds-perl on Debian, and rrdtool on openSUSE 11) in order to use Monitorix.

Once you have the files installed in the right place, either by installing the RPM file or running install.sh, you can start collecting information by running service monitorix start. You should also be able to visit http://localhost/monitorix/ and be offered a collection of graphs to choose from (or just nominate to see them all).


Monitorix doesn't include a plugin system but has built-in support for monitoring CPU, processes, memory, context switches, temperatures, fan speeds, disk IO, network traffic, demand on services such as POP3 and HTTP, interrupt activity, and the number of users attached to SSH and Samba. A screenshot of Monitorix displaying daily graphs is shown below. There are 10 main graph panels, even though you can see only about 1.5 here.

You can configure Monitorix by editing the /etc/monitorix.conf file, which is actually a Perl script. The MNT_LIST option allows you to specify as many as seven filesystems to monitor. The REFRESH_RATE setting sets how many seconds before a Web browser should automatically refresh its contents when viewing the Monitorix graphs. You can also use Monitorix to monitor many machines by setting MULTIHOST="Y" and listing the servers you would like to contact in the SERV_LIST setting, as shown below. Alternatively you can list entire subnets to monitor using the PC_LIST and PC_IP options, examples of which are included in the sample monitorix.conf file.

 
 
MULTIHOST="Y"
our @SERV_LIST=("server number one", "http://192.168.1.10",
                "server number two", "http://192.168.1.11");

Having each graph panel made up of a main graph on the left and two smaller graphs on the right allows Monitorix to convey a fair amount of related information in a compact space. In the screenshot, the load is shown in the larger graph on the left, with the number of active processes and memory allocation in the two smaller graphs on the right. Scrolling down, one of the graph panels shows network services demand. This panel has many services shown in the main graph and POP3 and WWW as smaller graphs on the right. Unfortunately, the selection of POP3 seems to be hard-coded on line 2566 of monitorix.cgi, where SERVU2 explicitly uses POP3, so if you want to monitor an IMAP mail service instead you are out of luck.

Monitorix is easy to install and get running, and its three-graph-per-row aggregate presentation gives you a good high-level view of your system. Unfortunately some things are still hard-coded into the CGI script, so you have a somewhat limited ability to change the Web interface unless you want to start hacking on the script. The lack of packages in distribution repositories might also turn away many users.



Munin

The Munin project is fairly clearly split into the gather and analyse functionality. This lets you install just the package to gather information on many servers and have a single central server to analyse all the gathered information. Munin is also widely packaged, making setup and updates fairly simple.

Munin is written in Perl, ships with a collection of plugins,supports many versions of Unix operating systems, and has a searchable plugin site.

Munin is in the Fedora 9 repository and Debian Etch, and is available as a 1-Click install for openSUSE 11. Again, I used the version from the repository for a 64-bit Fedora 9 machine. The project provides two packages: the munin-node package includes all of the monitoring functionality, and the munin package supports gathering information from machines running munin-node and graphing it via a Web interface. If you have a network of machines, you probably only want to install munin-node on most of them and munin on one to perform analysis on all the collected data.

The main configuration file for munin-node, /etc/munin/munin-node.conf, lets you define where log files are kept, what user to run the monitoring daemon as, what address and port the daemon should bind to, and which hosts are allowed to connect to that address and port to download the collected data. In the default configuration only localhost is allowed to connect to a munin-node.
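A minimal munin-node.conf implementing that default, plus one extra analysis host, might look like the sketch below; the 192.168.1.20 address is an example standing in for your central munin server:

```
log_file /var/log/munin-node/munin-node.log
user root
port 4949

# "allow" takes a regular expression matched against the address of
# the connecting host; list every host permitted to fetch data.
allow ^127\.0\.0\.1$
allow ^192\.168\.1\.20$
```

The munin package on the central server then polls port 4949 on each node at its own interval and does the graphing.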

You configure plugins through individual configuration files in /etc/munin/plugin-conf.d. Munin-node for Fedora is distributed with about three dozen plugins for monitoring a wide range of system and device information.

When you visit http://localhost/munin, Munin displays an overview page showing you links to all the nodes that it knows about and including links to specific features of the nodes, such as disk, network, NFS, and processes. Clicking on a node name shows you a two-column display. Each row shows a graph with the daily statistics on the left and weekly on the right. Clicking on either graph in a row takes you to a details page showing that data for the day, week, month, and year. At the bottom of the details page a short textual display gives more details about the data, including notification of irregular events. For example, in the below screenshot of the details page for free disk space, you can see a warning that one of the filesystems has become quite full.

To get an idea of how well a system or service is running on a daily or weekly basis, the Munin display works well. Unfortunately, the Web interface does not allow you to drill into the data. For example, you might like to see a specific two-hour period from yesterday, but you can't get that graph from Munin.

The plugins site for Munin is quite well done, allowing you to see an example graph for many of the plugins before downloading. A drawback to the plugins site is the search interface, which is very category-oriented. Some full text search would aid users in finding an appropriate plugin. For example, to find the NUT UPS monitoring plugin you have to select either Sensors or "ALL CATEGORIES" first; just being able to throw UPS or NUT into a text box would enable quick cherry-picking of plugins.

A major advantage of Munin is that it ships as separate packages for gathering and analysing information, so you don't need to install a Web server on each node. The additional information at the bottom of the details page should also prod you if some statistic has a value that you should really pay attention to.

Final words

Of these four applications, Cacti offers the best Web interface, letting you select the time interval displayed on your graphs from several predefined settings and also letting you explicitly nominate the start and end time you are interested in. By contrast, in collectd, the emphasis is squarely on monitoring your systems, and the provided Web interface is offered purely as an example that might be of interest. Given that collecting and analysing data can be thought of as separate tasks, it would be wonderful if collectd and Cacti could play well together. Unfortunately, setting up Cacti to use collectd-generated files is a long, manual, error-prone process. While both Cacti and collectd are useful projects by themselves, I can't help but think that the combination of the two would be greater than its parts. Monitorix and Munin are easy to install and offer a quick overview of a host, but Monitorix's three-graph-per-row aggregate presentation gives you a better high-level view of your system.

Which one might be best for you? If you spend a lot of time in data analysis or if you plan to allow non-administrators to get a glance of the system statistics, Cacti might be a good project to look into first. If you want to gather information on a system that is already under heavy load, see if collectd can be run without disrupting your system. Munin's support for gathering information from many nodes using different application packages makes it interesting if you are monitoring a small group of similar machines. If you have a single server and want a quick overview of what is happening, either Cacti or Monitorix is worth checking out first.


