Task #175: Identify an infrastructure monitoring framework - Infrastructure - DataONE Tasks

Task #175

Identify an infrastructure monitoring framework

Added by Dave Vieglais about 15 years ago. Updated over 14 years ago.

Status:

Closed

Priority:

High

Assignee:

Dave Vieglais

Category:

d1_monitor

Target version:

CCI-0.4

Start date:

Due date:

% Done:

100%

Milestone:

None

Product Version:

Story Points:

Sprint:

Description

A monitoring system will soon be needed to track resource use and system availability across the DataONE infrastructure. It would help identify system outages and heavy load on particular components (e.g. CNs running at maximum memory, or MNs limited by bandwidth).

The selected system should support a plugin architecture so that additional metrics can be added to the monitor (e.g. total data sets in system, or data sets per node).

A few candidates are:

"nagios":http://www.nagios.org/
"ganglia":http://ganglia.sourceforge.net/ (more suited to LAN configurations)
"zabbix":http://www.zabbix.com
"zenoss":http://community.zenoss.org
"rrdtool":http://oss.oetiker.ch/rrdtool/ (generic timeseries recording and viewing - not a monitor)
"cacti":http://www.cacti.net/

History

#1 Updated by Matthew Jones about 15 years ago

We have Nagios set up at NCEAS for monitoring all of our servers. It works well and allows custom monitoring scripts, which makes it very flexible. Given our time constraints, I suggest we simply use that for now to minimize time to deployment; the last thing we need is more tasks. Nick may already have it set up for cn-dev; I'll check with him.
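
As a rough illustration of the custom monitoring scripts mentioned above, here is a minimal sketch of a Nagios-style check written in Python. It relies only on the standard Nagios plugin conventions (exit codes 0/1/2/3 for OK/WARNING/CRITICAL/UNKNOWN, and performance data after a `|` in the output line). The metric name `mem_used_pct`, the thresholds, and the `read_memory_percent` helper are assumptions made for the example and are not taken from the NCEAS or DataONE configuration.

```python
"""Sketch of a Nagios-style check plugin for a custom metric (illustrative only)."""

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(value, warn, crit):
    """Map a metric value to a status code, treating higher values as worse."""
    if value is None:
        return UNKNOWN
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def format_output(name, value, warn, crit):
    """Return (status, line); the text after '|' is Nagios performance data."""
    status = evaluate(value, warn, crit)
    if status == UNKNOWN:
        return status, f"{name.upper()} UNKNOWN - no value available"
    line = (
        f"{name.upper()} {LABELS[status]} - {name}={value}"
        f" | {name}={value};{warn};{crit}"
    )
    return status, line


def read_memory_percent():
    """Hypothetical metric source: percent of memory in use (Linux /proc/meminfo)."""
    try:
        fields = {}
        with open("/proc/meminfo") as fh:
            for row in fh:
                key, rest = row.split(":", 1)
                fields[key] = int(rest.split()[0])
        total = fields["MemTotal"]
        available = fields["MemAvailable"]
        return round(100.0 * (total - available) / total, 1)
    except (OSError, KeyError, ValueError, ZeroDivisionError):
        # Any failure to read the metric is reported as UNKNOWN by the caller.
        return None


def main():
    """Print one status line and return the exit code Nagios expects."""
    status, line = format_output("mem_used_pct", read_memory_percent(), warn=80, crit=95)
    print(line)
    return status

# A deployed plugin would end with: import sys; sys.exit(main())
```

A metric such as the total number of data sets in the system could be reported the same way by swapping `read_memory_percent` for a function that queries the relevant service; that function would have to be written against whatever interface is actually available.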

#2 Updated by Dave Vieglais over 14 years ago

Currently using Cacti for monitoring.
