Fully automated eagle eye – monitoring IT business processes efficiently
Eight arms and 42 eyes: that is what a system administrator would need to keep an eye on every system at all times. How high is the network load? Is the CPU idling peacefully right now? Is the Java process eating up lots of RAM again? Is it still possible to log in via the website? Are e-mails piling up on the mail server?
Continuous manual checking is practically impossible, yet constant oversight is essential if day-to-day business is to run smoothly. Static checks of individual values may be the norm, but the monitoring of business-related IT processes ought to be as dynamic as your business.
As a rule, it is not a single value that determines whether a service is available to the user, but a group of values or even a dynamic process. What use is delivering a portal's homepage if a "service unavailable" message appears instead of the login form, or if the login takes several minutes even though the user-specific content is available within a few milliseconds? These dynamic processes can be monitored automatically as well. With suitably chosen thresholds, the administrator can even spot problems before the users' tolerance limit is exceeded and they reach for the phone. Such monitoring does not have to be carried out only internally; if services are available to external users, it can and should also be done from the outside, because the connection to the outside world plays an important role. A misconfigured router or firewall can easily cut a service off from the outside world without anyone noticing.
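The idea of putting thresholds on a dynamic step such as the login can be sketched in a few lines of Python. This is only a minimal illustration, not a finished plugin: the function names `classify` and `run_check`, the threshold parameters and the `step` callable (which would perform the actual login against your portal) are all assumptions made for this example. The exit codes 0, 1 and 2 follow the usual Nagios plugin convention for OK, WARNING and CRITICAL.

```python
import time
from typing import Callable, Tuple

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def classify(succeeded: bool, duration_s: float,
             warn_s: float, crit_s: float) -> Tuple[int, str]:
    """Map the outcome of one login attempt to a Nagios-style status.

    A failed step is always CRITICAL; a successful but slow step is
    WARNING or CRITICAL depending on the (hypothetical) thresholds.
    """
    if not succeeded:
        return CRITICAL, "CRITICAL - login step failed"
    if duration_s >= crit_s:
        return CRITICAL, f"CRITICAL - login took {duration_s:.2f}s"
    if duration_s >= warn_s:
        return WARNING, f"WARNING - login took {duration_s:.2f}s"
    return OK, f"OK - login took {duration_s:.2f}s"


def run_check(step: Callable[[], bool],
              warn_s: float, crit_s: float) -> Tuple[int, str]:
    """Time an arbitrary step and classify the result.

    `step` is a placeholder for the real login; it should return True
    on success. Any exception is treated as a failed step.
    """
    start = time.monotonic()
    try:
        succeeded = step()
    except Exception:
        succeeded = False
    return classify(succeeded, time.monotonic() - start, warn_s, crit_s)
```

With a warning threshold set well below the users' tolerance limit, such a check would raise a WARNING while the login is merely slow, before it becomes CRITICAL. Run from a host outside your own network, the same check would also notice a service that has been cut off by a router or firewall.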
Standard tools like Nagios and Icinga can cover these requirements. Anyone who wants to avoid the effort of setting up complex test scenarios can let tools like cucumber-nagios do part of the work: tests are described in a form that anybody can read and can be integrated into Nagios. Alternatively, monitoring can be outsourced to a partner, who can take on not only monitoring but also troubleshooting and reporting.
Just how available is my application really?
Reporting in particular usually happens only once there is a feeling that a service is no longer sufficiently available. It is then up to the administrators, usually at very short notice, to compile management-ready statistics that show the availability of a service unambiguously. This is no easy task, especially if the monitoring does not treat the complex service structure as an integrated whole but monitors individual services and performance parameters separately. This is where so-called multi-checks come in: they link individual parameters appropriately and thus give a realistic picture of availability. The ultimate challenge is to compile the required reports automatically, so that administrators only have to deal with actual disruptions. Service Level Agreements in particular regularly require such automated reports to gauge the quality of the service. This is not just about bare figures but about money, so these reports must not only be self-explanatory but, above all, correct. Solutions in this area are as individual as the services whose availability is to be monitored.
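Why a multi-check gives a more realistic figure than individual checks can be shown with a small calculation. The following Python sketch rests on assumptions made purely for illustration: each sample records the result of several component checks at one point in time, the check names `homepage`, `login` and `content` are invented, and the service counts as available only if every component check succeeded. Real multi-checks may link parameters in other ways.

```python
from typing import Dict, List

Sample = Dict[str, bool]  # check name -> check succeeded at this point in time


def service_available(sample: Sample) -> bool:
    """Multi-check rule assumed here: all component checks must be OK."""
    return all(sample.values())


def availability_percent(samples: List[Sample]) -> float:
    """Share of samples in which the service as a whole was available."""
    if not samples:
        raise ValueError("no samples to evaluate")
    up = sum(1 for s in samples if service_available(s))
    return 100.0 * up / len(samples)


def per_check_percent(samples: List[Sample]) -> Dict[str, float]:
    """Availability of each component check, viewed in isolation."""
    if not samples:
        raise ValueError("no samples to evaluate")
    names = samples[0].keys()
    return {n: 100.0 * sum(1 for s in samples if s[n]) / len(samples)
            for n in names}


# Invented sample data: four measurement points, two with a partial failure.
samples = [
    {"homepage": True, "login": True,  "content": True},
    {"homepage": True, "login": False, "content": True},
    {"homepage": True, "login": True,  "content": False},
    {"homepage": True, "login": True,  "content": True},
]

print(per_check_percent(samples))     # each check in isolation
print(availability_percent(samples))  # the service as users experience it
```

In this invented data set no single check falls below 75 percent, yet the service as a whole was usable in only half of the samples. A report built from the combined figure is the one that reflects what users actually experienced.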
Our team of IT service consultants, together with their partners, will be happy to assist you in adapting or introducing a solution.