1. System environment:

Server OS: CentOS 5.3 x64

nginx version: nginx-0.8.54

nagios version: nagios-3.2.3

2. Disable unneeded services:

chkconfig --level 2345 cups off
chkconfig --level 2345 ip6tables off
chkconfig --level 2345 iptables off
chkconfig --level 2345 netfs off
chkconfig --level 2345 nfslock off
chkconfig --level 2345 portmap off
chkconfig --level 2345 rhnsd off
chkconfig --level 2345 rpcsvcgssd off
chkconfig --level 2345 rpcidmapd off
chkconfig --level 2345 smartd off
chkconfig --level 2345 xfs off
chkconfig --level 2345 bluetooth off
chkconfig --level 2345 hidd off
chkconfig --level 2345 pand off
chkconfig --level 2345 dund off
chkconfig --level 2345 capi off
chkconfig --level 2345 firstboot off
chkconfig --level 2345 kudzu off
chkconfig --level 2345 mcstrans off
chkconfig --level 2345 pcscd off
chkconfig --level 2345 restorecond off
chkconfig --level 2345 rpcgssd off
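The same list can be driven from a loop. The following is only a sketch: the function name disable_services and the DRY_RUN switch are made up for illustration, and by default the commands are printed rather than executed so the list can be reviewed first.

```shell
#!/bin/sh
# Services taken from the list above.
SERVICES="cups ip6tables iptables netfs nfslock portmap rhnsd rpcsvcgssd rpcidmapd smartd xfs bluetooth hidd pand dund capi firstboot kudzu mcstrans pcscd restorecond rpcgssd"

# disable_services: print (DRY_RUN=1, the default) or run (DRY_RUN=0)
# one chkconfig command per service. DRY_RUN is a hypothetical switch.
disable_services() {
    for svc in $SERVICES; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "chkconfig --level 2345 $svc off"
        else
            chkconfig --level 2345 "$svc" off
        fi
    done
}

disable_services
```

Run it as root with DRY_RUN=0 once the printed commands look right.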

3. Download the required packages:

All source packages needed for the installation are mirrored on this site:

http://www.jiankli.com/download/nginx-0.8.54.tar.gz

http://www.jiankli.com/download/pcre-8.01.tar.gz

http://www.jiankli.com/download/FCGI-0.67.tar.gz

http://www.jiankli.com/download/FCGI-ProcManager-0.18.tar.gz

http://www.jiankli.com/download/nagios-3.2.3.tar.gz

http://www.jiankli.com/download/nagios-cn-3.2.3.tar.bz2

http://www.jiankli.com/download/nagios-plugins-1.4.13.tar.gz

http://www.jiankli.com/download/nrpe-2.8.1.tar.gz

http://www.jiankli.com/download/php-5.2.17.tar.gz
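To avoid fetching the files one at a time, the URLs can be generated from the file names. This is a sketch only: list_urls is a hypothetical helper, and it merely prints the URLs listed above so that nothing is downloaded until the output is piped into a downloader.

```shell
#!/bin/sh
# Base URL and file names copied from the list above.
BASE_URL="http://www.jiankli.com/download"
PACKAGES="nginx-0.8.54.tar.gz pcre-8.01.tar.gz FCGI-0.67.tar.gz FCGI-ProcManager-0.18.tar.gz nagios-3.2.3.tar.gz nagios-cn-3.2.3.tar.bz2 nagios-plugins-1.4.13.tar.gz nrpe-2.8.1.tar.gz php-5.2.17.tar.gz"

# list_urls: print one download URL per package.
list_urls() {
    for pkg in $PACKAGES; do
        echo "$BASE_URL/$pkg"
    done
}

list_urls
```

To actually download into /usr/src, the output could be piped to wget, for example: list_urls | (cd /usr/src && xargs -n 1 wget).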

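4. Install nginx:

Before installing nginx, install the pcre and zlib packages, which nginx needs for URL rewriting, regular expressions, and page compression.

(1) First install pcre: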

cd /usr/src

tar xzf pcre-8.01.tar.gz

cd pcre-8.01

./configure --prefix=/usr/local/pcre

make && make install

(2) Then install nginx:

useradd www

cd /usr/src

tar xzf nginx-0.8.54.tar.gz

cd nginx-0.8.54

./configure --prefix=/usr/local/nginx-0.8 --with-http_stub_status_module --with-openssl=/usr/ --with-pcre=/usr/src/pcre-8.01 --user=www --group=www

make && make install
[nginx note: --with-pcre=/usr/src/pcre-8.01 must point to the directory where the pcre source was unpacked, not to the pcre installation path. Otherwise the build fails with:

make[1]: *** [/usr/local/pcre/Makefile] Error 127]

nginx is now installed.

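5. Install Nagios (the Chinese-localized Nagios build is installed and configured the same way):

Create the nagios user and the nagcmd group, and add both nagios and www to the nagcmd command group:
useradd nagios

groupadd nagcmd

usermod -a -G nagcmd nagios

usermod -a -G nagcmd www

cd /usr/src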

tar xzf nagios-3.2.3.tar.gz

cd nagios-3.2.3

./configure --with-command-group=nagcmd --prefix=/usr/local/nagios

make all

make install

make install-init

make install-config

make install-commandmode
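Nagios is now installed.
To verify that the program was installed correctly, change to the installation path (here /usr/local/nagios) and check that the five directories etc, bin, sbin, share, and var exist. If they do, the program was installed correctly.

bin
Nagios executables; the main file here is nagios
etc
Nagios configuration files; right after installation it holds only a few *.cfg-sample files
sbin
Nagios CGI files, that is, the files needed to execute external commands
share
Nagios web page files
var
Nagios log files, the PID file, and similar runtime files

Next, compile and install the Nagios plugins (nagios-plugins):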
cd /usr/src

tar zxvf nagios-plugins-1.4.13.tar.gz

cd nagios-plugins-1.4.13

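./configure --prefix=/usr/local/nagios --with-nagios-user=nagios --with-nagios-group=nagios

make && make install

The plugins are now installed.

ls /usr/local/nagios/libexec
lists the installed plugin files; all plugins are installed under the libexec directory.

6. Install the Perl FCGI modules [first install PHP built with FastCGI support]:

(1) Install PHP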

cd   /usr/src

tar xzf  php-5.2.17.tar.gz

cd php-5.2.17

./configure --prefix=/usr/local/php --with-config-file-path=/usr/local/php/etc --with-mysql=/usr/local/mysql --with-mysqli=/usr/local/mysql/bin/mysql_config --with-iconv-dir=/usr/local --with-freetype-dir --with-jpeg-dir --with-png-dir --with-zlib --with-libxml-dir=/usr --enable-xml --disable-rpath --enable-discard-path --enable-safe-mode --enable-bcmath --enable-shmop --enable-sysvsem --enable-inline-optimization --with-curl --with-curlwrappers --enable-mbregex --enable-fpm --enable-sockets

make && make install

Installation complete. [Note: --enable-fastcgi can be omitted here, whereas earlier versions need it. Pick the options above to suit your own setup; if the build fails, track down the cause from the specific error message.]

(2) Install FCGI
cd /usr/src

tar -zxvf FCGI-0.67.tar.gz

cd FCGI-0.67

perl Makefile.PL

make

make install
(3) Install FCGI-ProcManager:
cd /usr/src

tar -xzf FCGI-ProcManager-0.18.tar.gz

cd FCGI-ProcManager-0.18

perl Makefile.PL

make

make install
That completes the module installation.

mkdir -p /usr/local/nagios/share/nagios

ln -s /usr/local/nagios/share/images /usr/local/nagios/share/nagios/images

ln -s /usr/local/nagios/share/stylesheets /usr/local/nagios/share/nagios/stylesheets     # prevents images from failing to display
Next, set up the CGI wrapper script and the nginx configuration file:
vi /usr/local/nagios/bin/perl-cgi.pl

#!/usr/bin/perl
use FCGI;
#perl -MCPAN -e 'install FCGI'
use Socket;

#this keeps the program alive or something after exec'ing perl scripts
END()   { }
BEGIN() { }
*CORE::GLOBAL::exit = sub { die "fakeexit\nrc=" . shift() . "\n"; };
eval q{exit};
if ($@) {
    exit unless $@ =~ /^fakeexit/;
}
&main;

sub main {
    #$socket = FCGI::OpenSocket( ":3461", 10 ); #use IP sockets
    $socket = FCGI::OpenSocket( "/var/run/nagios.sock", 10 );
    #use UNIX sockets - user running this script must have w access to the 'nginx' folder!!
    $request = FCGI::Request( \*STDIN, \*STDOUT, \*STDERR, \%ENV, $socket );
    if ($request) { request_loop() }
    FCGI::CloseSocket($socket);
}

sub request_loop {
    while ( $request->Accept() >= 0 ) {
        #processing any STDIN input from WebServer (for CGI-GET actions)
        $env               = $request->GetEnvironment();
        $stdin_passthrough = '';
        $req_len           = 0 + $ENV{CONTENT_LENGTH};
        if ( $ENV{REQUEST_METHOD} eq 'GET' ) {
            $stdin_passthrough .= $ENV{'QUERY_STRING'};
        }

        #running the cgi app
        if (   ( -x $ENV{SCRIPT_FILENAME} )    #can I execute this?
            && ( -s $ENV{SCRIPT_FILENAME} )    #Is this file empty?
            && ( -r $ENV{SCRIPT_FILENAME} )    #can I read this file?
          )
        {
            #http://perldoc.perl.org/perlipc.html#Safe-Pipe-Opens
            #report the error only when the pipe open itself fails
            open $cgi_app, '-|', $ENV{SCRIPT_FILENAME}, $stdin_passthrough
              or do {
                print("Content-type: text/plain\r\n\r\n");
                print "Error: CGI app returned no output - Executing $ENV{SCRIPT_FILENAME} failed !\n";
              };
            if ($cgi_app) { print <$cgi_app>; close $cgi_app; }
        }
        else {
            print("Content-type: text/plain\r\n\r\n");
            print "Error: No such CGI app - $req_len - $ENV{CONTENT_LENGTH} - $ENV{REQUEST_METHOD} - $ENV{SCRIPT_FILENAME} may not exist or is not executable by this process.\n";
        }
    }
}

Save and exit, then make the script executable:
chmod +x /usr/local/nagios/bin/perl-cgi.pl

The server block of my nginx.conf is as follows:

server
{
    listen 80;
    server_name 192.168.2.79;
    root /usr/local/nagios/share;
    index index.php;
    auth_basic "You Name";

    auth_basic_user_file /usr/local/nagios/etc/htpasswd;
    # Note: nginx accepts log_format only in the http context, so place this directive in the enclosing http block.
    log_format nagios '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" $http_x_forwarded_for';
    access_log /usr/local/nginx/nagios.log nagios;
    location ~ .*\.(php|php5)?$
    {
        #fastcgi_pass unix:/tmp/php-cgi.sock;
        fastcgi_pass 127.0.0.1:9000;

        fastcgi_index index.php;
        include fcgi.conf;
    }
    location ~ \.cgi$ {
        root /usr/local/nagios/sbin;
        rewrite ^/nagios/cgi-bin/(.*)\.cgi /$1.cgi break;
        fastcgi_index index.cgi;
        fastcgi_pass unix:/var/run/nagios.sock;

        fastcgi_param SCRIPT_FILENAME /usr/local/nagios/sbin$fastcgi_script_name;
        fastcgi_param QUERY_STRING $query_string;

        fastcgi_param REMOTE_ADDR $remote_addr;
        fastcgi_param REMOTE_PORT $remote_port;
        fastcgi_param REQUEST_METHOD $request_method;
        fastcgi_param REQUEST_URI $request_uri;
        fastcgi_param REMOTE_USER $remote_user;

        # One line here was highlighted in red in the original post (presumably the REMOTE_USER line above). It is not present by default, and without it the Nagios pages report errors; the specific errors are listed at the end of this article.

        #fastcgi_param SCRIPT_NAME $fastcgi_script_name;
        fastcgi_param SERVER_ADDR $server_addr;
        fastcgi_param SERVER_NAME $server_name;
        fastcgi_param SERVER_PORT $server_port;
        fastcgi_param SERVER_PROTOCOL $server_protocol;
        fastcgi_param SERVER_SOFTWARE nginx;
        fastcgi_param CONTENT_LENGTH $content_length;
        fastcgi_param CONTENT_TYPE $content_type;
        fastcgi_param GATEWAY_INTERFACE CGI/1.1;
        fastcgi_param HTTP_ACCEPT_ENCODING gzip,deflate;
        fastcgi_param HTTP_ACCEPT_LANGUAGE zh-cn;
    }
}

The content of fcgi.conf is as follows:

fastcgi_param  GATEWAY_INTERFACE  CGI/1.1;
fastcgi_param  SERVER_SOFTWARE    nginx;

fastcgi_param  QUERY_STRING       $query_string;
fastcgi_param  REQUEST_METHOD     $request_method;
fastcgi_param  CONTENT_TYPE       $content_type;
fastcgi_param  CONTENT_LENGTH     $content_length;

fastcgi_param  SCRIPT_FILENAME    $document_root$fastcgi_script_name;
fastcgi_param  SCRIPT_NAME        $fastcgi_script_name;
fastcgi_param  REQUEST_URI        $request_uri;
fastcgi_param  DOCUMENT_URI       $document_uri;
fastcgi_param  DOCUMENT_ROOT      $document_root;
fastcgi_param  SERVER_PROTOCOL    $server_protocol;

fastcgi_param  REMOTE_ADDR        $remote_addr;
fastcgi_param  REMOTE_PORT        $remote_port;
fastcgi_param  SERVER_ADDR        $server_addr;
fastcgi_param  SERVER_PORT        $server_port;
fastcgi_param  SERVER_NAME        $server_name;

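# PHP only, required if PHP was built with --enable-force-cgi-redirect
fastcgi_param  REDIRECT_STATUS    200;

The FastCGI parameters in the nginx configuration above can be consolidated into fcgi.conf.

7. Create a nagiosadmin user:

This account is used to log in to the Nagios web interface.

Make a note of the password you set; you will need it shortly.
This step relies on Apache's htpasswd tool, so run the following command on a machine that has Apache installed:
/usr/local/apache2/bin/htpasswd -c /usr/local/nagios/etc/htpasswd nagiosadmin
Enter the same password twice.
Start Nagios. Run chkconfig --add nagios to register it as a service, and chkconfig --level 35 nagios on to start it at boot. [Either disable SELinux and iptables, or adjust the iptables rules to allow access to Nagios.]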

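8. Configure Nagios

A freshly installed Nagios keeps its configuration files in /usr/local/nagios/etc (the original post showed a screenshot of that directory here).
First rename these files, for example cgi.cfg-sample to cgi.cfg, with the command cp cgi.cfg-sample cgi.cfg, and copy the remaining *.cfg-sample files to *.cfg in the same way. Starting with Nagios 2.6, you can run ../bin/nagios -v nagios.cfg to verify that the program works without modifying localhost.cfg (in Nagios 2.5 and earlier the minimal configuration file was minimal.cfg, and it needed several edits before validation succeeded). This minimal configuration cannot meet real needs, of course, so the existing configuration files must be modified and some custom ones added. We proceed in two steps: modify the existing files first, then add the custom ones.
(I) Modify the configuration files
The main Nagios configuration file is nagios.cfg, so we start there. Edit nagios.cfg with vi, comment out the line cfg_file=/usr/local/nagios/etc/localhost.cfg, and then uncomment the following lines (the // annotations are explanatory only and do not belong in the file):
cfg_file=/usr/local/nagios/etc/contactgroups.cfg  // path of the contact group configuration file
cfg_file=/usr/local/nagios/etc/contacts.cfg       // path of the contact configuration file
cfg_file=/usr/local/nagios/etc/hostgroups.cfg     // path of the host group configuration file
cfg_file=/usr/local/nagios/etc/hosts.cfg          // path of the host configuration file
cfg_file=/usr/local/nagios/etc/services.cfg       // path of the service configuration file
cfg_file=/usr/local/nagios/etc/timeperiods.cfg    // path of the time period configuration file
Change check_external_commands=0 to check_external_commands=1; this allows operations such as restarting Nagios or stopping host/service checks from the web interface. Change command_check_interval from its default of -1 to command_check_interval=10s (choose the command check interval to suit your situation, neither too long nor too short). That is essentially all that needs changing in the main configuration file. After these changes you will notice that hosts.cfg and the other referenced files do not exist in /usr/local/nagios/etc; they are created by hand later.
The second file to modify is cgi.cfg, which controls the CGI scripts. First make sure use_authentication=1. Many articles recommend setting use_authentication to "0" to disable authentication; that is a very bad idea. Next set default_user_name=vincent; the remaining changes are listed below: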
authorized_for_system_information=nagiosadmin,vincent
authorized_for_configuration_information=nagiosadmin,vincent
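authorized_for_system_commands=vincent
authorized_for_all_services=nagiosadmin,vincent
authorized_for_all_hosts=nagiosadmin,vincent
authorized_for_all_service_commands=nagiosadmin,vincent
authorized_for_all_host_commands=nagiosadmin,vincent

Separate multiple users with commas. Where do these user names come from? They are created by running /usr/local/apache/bin/htpasswd -c /usr/local/nagios/etc/htpasswd vincent. Note that -c creates the password file and overwrites an existing one, so omit -c when adding a user to a file that already exists. Do not list users that do not exist in the password file, and for security keep the number of authorized users small.
The third file to modify is misccommands.cfg, which defines the commands that send alert SMS messages and alert e-mails. Modify it as follows: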
#host-notify-by-sms   // send an SMS alert
define command {
command_name      host-notify-by-sms
command_line      /usr/local/bin/sms_send "Host $HOSTSTATE$ alert for $HOSTNAME$! on '$DATETIME$' " $CONTACTPAGER$
}
#service notify by sms  // send an SMS alert
define command {
command_name     service-notify-by-sms
command_line     /usr/local/bin/sms_send "'$HOSTADDRESS$' $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$" $CONTACTPAGER$
}
The e-mail notification commands for hosts and services are already in the file and need no changes. The SMS and e-mail notification blocks can also be placed in commands.cfg instead; the effect is the same.
(II) Add new configuration files
First create a simple timeperiods.cfg with the following content:
define timeperiod{
timeperiod_name 24x7
alias           24 Hours A Day, 7 Days A Week
sunday          00:00-24:00
monday          00:00-24:00
tuesday         00:00-24:00
wednesday       00:00-24:00
thursday        00:00-24:00
friday          00:00-24:00
saturday        00:00-24:00
}
This definition is self-explanatory and needs no further comment. Monitoring around the clock (7x24) is recommended.
The second manually created file is contacts.cfg, in the following format:
define contact {
contact_name                   sa    ; must not contain spaces
alias                          system administrator
service_notification_period    24x7
host_notification_period       24x7
service_notification_options   w,u,c,r
host_notification_options      d,u,r
service_notification_commands  service-notify-by-sms,service-notify-by-email  ; commands defined in misccommands.cfg
host_notification_commands     host-notify-by-email,host-notify-by-sms  ; commands defined in misccommands.cfg
email                          vincent@163.com
pager                          13333333333 ; mobile number that receives the alert SMS
}     ; do not leave out this closing brace
define contact {
contact_name                   vincent
alias                          system administrator
service_notification_period    24x7
host_notification_period       24x7
service_notification_options   w,u,c,r
host_notification_options      d,u,r
service_notification_commands  service-notify-by-sms,service-notify-by-email
host_notification_commands     host-notify-by-email,host-notify-by-sms
email                          vincent@sohu.com
pager                          13312345678
}
The file above defines two contacts; to add more, append them in the same format. The values of service_notification_options and host_notification_options mean the following. For services: w = warning, u = unknown, c = critical, r = recovery. For hosts: d = down, u = unreachable, r = recovery. Note that the host and service options differ slightly.
The third manually created file is contactgroups.cfg. It builds on contacts.cfg and is somewhat simpler; its format is as follows:
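define contactgroup {
contactgroup_name    sagroup  ; must not contain spaces
alias                system administrator group
members              sa,vincent  ; this example has two members
}
Separate multiple members with commas. If there are more contact groups, append them to the file in the same format.
Now the key player takes the stage: the hosts.cfg configuration file. Below is the basic form of the two hosts I defined: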
#define monitor  host
############################################
# Wangjing IDC servers                     #
############################################
define host {
host_name                  nagios-server
alias                      nagios server
address                    61.x..x.49
contact_groups             sagroup ; comma-separated list of contact groups defined in contactgroups.cfg
check_command              check-host-alive
max_check_attempts         5
notification_interval      10    ; adjustable, find a suitable value by testing
notification_period        24x7
notification_options       d,u,r
}
define host {
host_name                  24-25
alias                      server 24-25
address                    202.X.24.25
contact_groups             sagroup
check_command              check-host-alive ; send an alert as soon as the host goes down
max_check_attempts         5
notification_interval      10
notification_period        24x7
notification_options       d,u,r
}
Append more hosts one by one in the same format. Tip: for a contiguous IP range, it is best to write a script that generates hosts.cfg. To ease later maintenance, use readable comments in the file wherever possible (such as the # Wangjing IDC servers # banner in this example).
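A generator for a contiguous address range could look like the sketch below. Everything specific in it is an assumption made for illustration: the function name gen_hosts, the host-N naming scheme, and the 192.0.2.x documentation addresses. The directive values are copied from the sample definitions above.

```shell
#!/bin/sh
# gen_hosts PREFIX FIRST LAST
# Print one Nagios host definition for each address PREFIX.FIRST .. PREFIX.LAST.
# The host-N names are hypothetical; adapt them to your own naming scheme.
gen_hosts() {
    prefix=$1
    i=$2
    last=$3
    while [ "$i" -le "$last" ]; do
        cat <<EOF
define host {
host_name                  host-$i
alias                      server $i
address                    $prefix.$i
contact_groups             sagroup
check_command              check-host-alive
max_check_attempts         5
notification_interval      10
notification_period        24x7
notification_options       d,u,r
}
EOF
        i=$((i + 1))
    done
}

# Example: three hosts in a documentation-only address range.
gen_hosts 192.0.2 25 27
```

Redirect the output to a file and review it before adding it to hosts.cfg.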
Another heavyweight configuration file is services.cfg; without it, nothing is monitored. A sample file follows:
#service definition
###########################################
#  Wangjing IDC servers service for host-live  #
###########################################
define service {
host_name               nagios-server  ; defined in hosts.cfg
service_description     check-host-alive
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup  ; defined in contactgroups.cfg
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check-host-alive  ; check whether the host is alive
}
define service {
host_name               74-210
service_description     check_tcp 80
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_tcp!80 ; check whether the service on TCP port 80 is up
}
When writing these definitions, note that check_tcp and the port to be monitored must be separated by "!". If there are many services, consider generating the file with a script.
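Such a generator might be sketched as follows. The function name gen_tcp_services and its argument order are assumptions made for illustration; the directive values come from the sample above, and the host names passed in must already exist in hosts.cfg.

```shell
#!/bin/sh
# gen_tcp_services PORT HOST...
# Print one check_tcp service definition per host for the given TCP port.
gen_tcp_services() {
    port=$1
    shift
    for host in "$@"; do
        cat <<EOF
define service {
host_name               $host
service_description     check_tcp $port
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_tcp!$port
}
EOF
    done
}

# Example: monitor TCP port 80 on the two hosts defined in hosts.cfg above.
gen_tcp_services 80 nagios-server 24-25
```

Calling the function once per port and appending the output to one file gives a complete services.cfg for simple TCP checks.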
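The host group file hostgroups.cfg is optional. It builds on hosts.cfg, and its format is as follows:
define hostgroup {
hostgroup_name  sa-servers
alias           sa servers
members         nagios-server,24-25,24-26  ; separate multiple hosts with commas
}
Append further host groups in the same format. The original post followed this with a screenshot of a host group.

Start nginx and confirm that FastCGI is listening on port 9000. Then start screen, run ./perl-cgi.pl >/dev/null inside it so the wrapper keeps running in the background, and press Ctrl+A followed by D to detach from screen.

Next, run chmod 777 /var/run/nagios.sock and restart Nagios with service nagios restart.

Run /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg to check all configuration files for correctness. With luck, the output ends with: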
Total Warnings: 0
Total Errors:   0
Things look okay - No serious problems were detected during the pre-flight check
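When the check is run from a script, the two totals can be tested automatically. The helper below is a sketch: preflight_ok is a hypothetical name, and it relies only on the "Total Warnings:" and "Total Errors:" lines shown above. It treats any warning as a failure, which may be stricter than you want.

```shell
#!/bin/sh
# preflight_ok: read the output of "nagios -v" on stdin and succeed only
# if both totals were found and both are zero.
preflight_ok() {
    awk '
        /^Total Warnings:/ { w = $3; seen_w = 1 }
        /^Total Errors:/   { e = $3; seen_e = 1 }
        END { exit !(seen_w && seen_e && w == 0 && e == 0) }
    '
}

# Demonstration with the sample output from above.
printf 'Total Warnings: 0\nTotal Errors:   0\n' | preflight_ok && echo "config OK"
```

On a real system the input would come from the verification command itself, for example: /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg | preflight_ok && service nagios restart.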

 

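9. Installation complete; open the web interface:

Browsing to http://192.168.2.79 brings up a login prompt. Enter the user name nagiosadmin and its password, and you will see that Nagios monitors localhost by default.

The page displays correctly (the original post showed screenshots here), and opening a service's details to disable its notifications raises no errors.

That completes the nginx + Nagios environment. What remains is configuring the monitored clients, which a follow-up article will cover.

The setup described here has been tested. Along the way I ran into a whole series of problems, such as 403, 502, and 504 errors, and solved them one at a time. Only by actually doing something do you find out how much there is to learn. Errors such as the following came up: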

(1) It seems that you have chosen to not use the authentication functionality of the CGIs.

I don't want to be personally responsible for what may happen as a result of allowing

unauthorized users to issue commands to Nagios, so you'll have to disable this safeguard if you

are really stubborn and want to invite trouble.

(2) It appears as though you do not have permission to view information for any of the hosts you requested...

If you believe this is an error, check the HTTP server authentication requirements for accessing this CGI
and check the authorization options in your CGI configuration file

For a fix, see: http://hi.baidu.com/shengit/blog/item/b21b770965c6e8de62d986c0.html

Articles consulted for this post:
http://bbs.linuxtone.org/thread-4441-1-1.html

http://bbs.linuxtone.org/thread-7404-1-1.html

http://www.comeonsa.com/

http://sery.blog.51cto.com/10037/20520