Re: Monitor depth
> One thing I rarely see discussed about
> monitoring is how thoroughly each server
> is tested. For instance, for SMTP,
Many of the systems discussed, Nagios at the very least, base their monitoring on an open plugin system. One can certainly write a plugin to test a service comprehensively; it's just a matter of doing it.
Some plugins just require a little thought to implement. For instance, on a host with several virtual websites, it's not enough to reach the IP and speak HTTP; you need to know that a particular site is being served. An easy setup is to use check_http to request a specific page on each site and raise a critical error on a 404 (page not found). Simple, and it works.
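To make the virtual-host idea concrete, here is a rough sketch of that kind of check written as a standalone script rather than with check_http itself. The function names (classify_status, check_vhost) and the choice to treat other 4xx codes as warnings are my own assumptions for illustration; the exit codes 0/1/2/3 follow the usual Nagios plugin convention.

```python
import http.client

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify_status(status):
    """Map an HTTP status code to a plugin state.

    Assumption for this sketch: 404 and any 5xx are critical,
    other 4xx codes are warnings, everything else is OK.
    """
    if status == 404 or status >= 500:
        return CRITICAL
    if status >= 400:
        return WARNING
    return OK


def check_vhost(ip, hostname, path="/", port=80, timeout=10):
    """Connect to the IP, but ask for one specific virtual host.

    Returns (state, message); a plugin wrapper would print the
    message and exit with the state.
    """
    try:
        conn = http.client.HTTPConnection(ip, port, timeout=timeout)
        # The explicit Host header selects the virtual site on a shared IP.
        conn.request("GET", path, headers={"Host": hostname})
        status = conn.getresponse().status
        conn.close()
    except (OSError, http.client.HTTPException) as exc:
        # No answer at all is at least as bad as a 404.
        return CRITICAL, f"HTTP CRITICAL - {hostname}{path}: {exc}"
    state = classify_status(status)
    return state, f"HTTP {LABELS[state]} - {hostname}{path} returned {status}"
```

You would call check_vhost once per site with the shared IP and that site's hostname and known page, then print the message and exit with the returned state so the monitoring system picks it up.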
Re: Polling is very important
> Nope. With a sniffer on the host or a
> mirrored port you would see any problem
> immediately, rather than waiting for a
> polling period.
AGAIN, if the host can't communicate back to the NMS, you won't know it's dead. You need some kind of polling going: at the very least, a check that traps (or whatever mechanism reports back to the NMS) still work.
> I'm not ruling out polling completely,
> just saying it should not be the core
> way of finding problems on the
No, your original point was that polling was worthless. You are now agreeing with my point: you need a polling mechanism somewhere, if for no other reason than to cross-check that certain services are still active (SNMP traps are being sent and received; event messages are being sent, received and processed; etc.). I'm not talking about the service itself; you're correct that an active monitor (I don't like the term 'real time') would be a plus for some services. But you need something to make sure your reporting infrastructure is working, and that means periodic "I'm alive" messages, i.e. polling.
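For what it's worth, the "I'm alive" cross-check can be sketched in a few lines. This is only an illustration under my own assumptions: HeartbeatMonitor, record, stale and max_silence are made-up names, and how the heartbeat actually arrives (trap, event message, whatever) is left out entirely.

```python
import time


class HeartbeatMonitor:
    """Track the last "I'm alive" message per source and flag silent ones."""

    def __init__(self, max_silence):
        # Seconds a source may stay quiet before it is considered stale.
        self.max_silence = max_silence
        self.last_seen = {}

    def record(self, source, now=None):
        """Note that a heartbeat from `source` just arrived."""
        self.last_seen[source] = time.monotonic() if now is None else now

    def stale(self, now=None):
        """Return the sorted sources that have been silent for too long."""
        now = time.monotonic() if now is None else now
        return sorted(
            source
            for source, seen in self.last_seen.items()
            if now - seen > self.max_silence
        )
```

The NMS would call record() whenever a heartbeat comes in and run stale() on a timer. That timer is the polling: a source that can no longer report back shows up there, even though it never sent an alert.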