check_openmanage is a plugin for Nagios that checks the hardware health of Dell servers running OpenManage Server Administrator (OMSA). The plugin can be used remotely with SNMP or locally with NRPE, check_by_ssh, or similar. It checks the health of the storage subsystem, power supplies, memory modules, temperature probes, etc., and gives an alert if any of the components are faulty or operate outside normal parameters.
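As a rough illustration of how a health-check plugin of this kind reports to Nagios, the sketch below folds per-component states into a single status line and a return code. It relies only on the standard Nagios plugin convention (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN). The `summarize` function, the component names, and the rule that the worst state wins are assumptions made for this example; this is not check_openmanage's actual code or output format.

```python
from enum import IntEnum


class Status(IntEnum):
    """Standard Nagios plugin return codes."""
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


def summarize(components):
    """Reduce {component name: Status} to (overall status, status line).

    Assumption of this sketch: CRITICAL outranks WARNING, which
    outranks UNKNOWN, and only the worst group is listed.
    """
    if not components:
        return Status.UNKNOWN, "UNKNOWN - no component data"
    for level in (Status.CRITICAL, Status.WARNING, Status.UNKNOWN):
        bad = sorted(name for name, st in components.items() if st == level)
        if bad:
            return level, f"{level.name} - {', '.join(bad)}"
    return Status.OK, f"OK - {len(components)} components healthy"


# Hypothetical readings; a real plugin would obtain these via SNMP or OMSA.
sample = {
    "fan 1": Status.OK,
    "power supply 2": Status.CRITICAL,
    "temperature probe 0": Status.WARNING,
}
status, message = summarize(sample)
print(message)
```

A real plugin would print the status line and then exit with `int(status)`, which is the value Nagios uses to decide whether to alert.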
memtester is a user-space utility for testing a computer's memory subsystem for faults. It does a good job of finding intermittent and non-deterministic faults, and it includes many tests to help catch borderline memory. memtester should compile and run on any 32- or 64-bit Unix or Unix-like system.
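The general idea behind a write-and-verify memory test can be sketched in a few lines. The example below writes fixed bit patterns into a buffer, reads them back, and records mismatches. It is a toy illustration of the technique only and not memtester's algorithm: the `pattern_test` function, the chosen patterns, and the `StuckBitBuffer` class (which simulates a faulty cell) are all hypothetical, and a Python buffer does not exercise physical memory the way a dedicated tester does.

```python
class StuckBitBuffer:
    """Simulated memory in which one byte has bits stuck at 1 (hypothetical)."""

    def __init__(self, size, bad_offset, stuck_mask):
        self._data = bytearray(size)
        self._bad_offset = bad_offset
        self._stuck_mask = stuck_mask

    def __len__(self):
        return len(self._data)

    def __getitem__(self, i):
        return self._data[i]

    def __setitem__(self, i, value):
        # The faulty cell forces the stuck bits on, whatever is written.
        if i == self._bad_offset:
            value |= self._stuck_mask
        self._data[i] = value


def pattern_test(buf, patterns=(0x00, 0xFF, 0x55, 0xAA)):
    """Fill buf with each pattern, read it back, and collect mismatches.

    Returns a list of (offset, expected, actual) tuples.
    """
    errors = []
    for pattern in patterns:
        for i in range(len(buf)):
            buf[i] = pattern
        for i in range(len(buf)):
            actual = buf[i]
            if actual != pattern:
                errors.append((i, pattern, actual))
    return errors


healthy = bytearray(64)
faulty = StuckBitBuffer(64, bad_offset=7, stuck_mask=0x01)
print(pattern_test(healthy))  # no mismatches expected on a plain bytearray
print(pattern_test(faulty))   # mismatches at offset 7 for patterns with bit 0 clear
```

Alternating patterns such as 0x55 and 0xAA are used here because together they drive every bit to both 0 and 1, so a bit stuck at either value shows up under at least one pattern.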
check_hpasm is a plugin for Nagios that checks the hardware health of Hewlett-Packard ProLiant servers. It requires the hpasm package to be installed. The plugin checks the health of processors, power supplies, memory modules, fans, and CPU and board temperatures, and alerts you if any of these components is faulty or operates outside its normal parameters.
GroundWork Monitor Community Edition can give you insight into your computing infrastructure, allowing you to see the current and historical states of all of your computers (servers, desktops, and laptops), all of your network devices, all of your services (like TCP/IP and Web services), and all of your applications (like mail servers and database apps). You can choose to be alerted via pager, SMS, email, or phone when something goes awry, and even set up automatic restarts or failovers.
memtest86+ is a memory tester based on memtest86 v3.0. It provides an up-to-date version of this useful tool and aims to be as reliable as the original. It has been fixed to work on AMD64 systems, and it properly detects all current CPUs and motherboard chipsets. It supports ECC polling for AMD64, i875P, and E7205, and displays some useful settings for the most popular chipsets.