On each target server:

1. install the nrpe "service" package:

sudo apt-get install nagios-nrpe-server   # (I know, the "server" in the name is unfortunate)

2. copy nrpe_local.cfg to /etc/nagios/nrpe_local.cfg

For convenience, here is the text of nrpe_local.cfg (if cut and paste is easier than scp):

##########################################################
#
# nrpe_local.cfg
#
# this is the companion cfg file to nrpe.cfg; it contains all of
# the customized settings for dataone
#
##########################################################


## the ip address of monitor.dataone.org
##########################################
allowed_hosts=129.24.0.12


## the following are typical commands to be mapped
########################################################
command[check_disk]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10%
command[check_users]=/usr/lib/nagios/plugins/check_users -w 20 -c 50
command[check_load]=/usr/lib/nagios/plugins/check_load -w 5,4,3 -c 10,6,4
command[check_zombie_procs]=/usr/lib/nagios/plugins/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/lib/nagios/plugins/check_procs -w 250 -c 400

(end of nrpe_local.cfg)


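Before restarting the daemon, it can be worth confirming that the copied file really lists the monitor's address. The following is only a minimal sketch: the helper names allowed_hosts_of and host_is_allowed are made up for illustration, and it assumes allowed_hosts is a single comma-separated line as in the file above.

```shell
# Print the allowed_hosts value from an NRPE config file (first match only).
allowed_hosts_of() {
    sed -n 's/^allowed_hosts=//p' "$1" | head -n 1
}

# Succeed only if the given address appears in the comma-separated
# allowed_hosts list (spaces around the commas are ignored).
host_is_allowed() {
    allowed_hosts_of "$1" | tr -d ' ' | tr ',' '\n' | grep -Fqx "$2"
}

# Example:
#   host_is_allowed /etc/nagios/nrpe_local.cfg 129.24.0.12 && echo "monitor is allowed"
```

If the example prints nothing, the file is either missing or does not name the monitor, and the test in step 5 will most likely fail.
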
3. restart the nrpe daemon with:

sudo /etc/init.d/nagios-nrpe-server restart

4. open up the firewall for port 5666:

sudo ufw allow from 129.24.0.12 to any port 5666 proto tcp

5. from the central monitoring server, test communication with:

check_nrpe -H <target server IP> -c check_users
GOOD:
USERS OK - 1 users currently logged in...
BAD:
CHECK_NRPE: Socket timeout after 10 seconds.
(see http://www.debianclusters.org/index.php/Nagios_NRPE_Addon_Installation_and_Configuration
for more troubleshooting of errors, toward the bottom of the page)
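When several target servers have been set up, the test in step 5 can be looped. This is a rough sketch rather than a finished tool: status_name and check_hosts are illustrative names, the check_nrpe path in CHECK_NRPE is an assumption (override it if the plugin lives elsewhere), and the addresses in the example are placeholders. It relies only on the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, anything else UNKNOWN).

```shell
# Map a Nagios plugin exit code to its conventional status name.
status_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Assumed plugin location; set CHECK_NRPE in the environment to override.
CHECK_NRPE="${CHECK_NRPE:-/usr/lib/nagios/plugins/check_nrpe}"

# Run check_users against every host given as an argument and print
# one line per host: address, status name, and the plugin's own output.
check_hosts() {
    for host in "$@"; do
        output=$("$CHECK_NRPE" -H "$host" -c check_users 2>&1)
        code=$?
        echo "$host: $(status_name "$code") - $output"
    done
}

# Example (placeholder addresses):
#   check_hosts 192.0.2.10 192.0.2.11
```

A host that reports anything other than OK is a candidate for rechecking allowed_hosts and the firewall rule from the earlier steps.
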