Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and grids. It is based on a hierarchical design targeted at federations of clusters. Ganglia is currently in use on over 500 clusters around the world and has scaled to handle clusters with 2000 nodes.
GroundWork Monitor Community Edition gives you insight into your computing infrastructure, letting you see the current and historical states of all your computers (servers, desktops, and laptops), network devices, services (such as TCP/IP and Web services), and applications (such as mail servers and database apps). You can choose to be alerted via pager, SMS, email, or phone when something goes awry, and even set up automatic restarts or failovers.
DSM Lite offers a single user login to manage email, MySQL, DNS, etc. from a single interface for an unlimited number of domains. It allows companies to offer a no-cost solution for entry-level server plans, or to provide Web-based management and automatic updates of critical Linux-based infrastructure servers.
BigDaddy is a program for monitoring servers. It is similar to Nagios, with the added benefit of also monitoring and controlling the crontab (or any scheduled application) across an entire fleet of servers. The application comes in the form of a daemon for monitoring and reporting, as well as an easy-to-use Web-based GUI for controlling monitoring, viewing timelines of incidents, filing incidents, and graphing statistics. The application is extensible with any sort of monitoring module, and notification is based on a five-step escalation process.
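As a rough illustration of what a five-step escalation could look like, here is a minimal sketch. BigDaddy's actual steps and interfaces are not described here, so the step names, the Incident type, and the rule of advancing one step per consecutive failed check are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical ladder: the description only says there are five steps,
# not what each step is or who gets notified.
ESCALATION_STEPS = ["step-1", "step-2", "step-3", "step-4", "step-5"]


@dataclass
class Incident:
    name: str
    failed_checks: int = 0  # consecutive failed checks so far (assumed trigger)


def escalation_step(incident: Incident) -> Optional[str]:
    """Return the step to notify for the current failure count.

    Returns None while the check is healthy. Once the failure count
    passes the last step, the incident stays at the final step.
    """
    if incident.failed_checks <= 0:
        return None
    index = min(incident.failed_checks, len(ESCALATION_STEPS)) - 1
    return ESCALATION_STEPS[index]
```

In this sketch, an incident with three consecutive failed checks maps to the third step, and anything beyond five failures remains at the fifth.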
NetOculus is a network monitoring system that provides all the functions of the well-known monitoring system MRTG and adds a number of benefits of its own. It can automatically monitor any kind of detectable alteration in a computer network (and in relatively separated network areas), and it can efficiently notify staff about hardware state changes. Analytical information is aggregated. Specific pieces of hardware are associated with the staff members who are responsible for them. The solutions to solved problems are reported to the staff for further use.