Server virtualization helps organizations use their available hardware resources more efficiently. Most server virtualization products come with some built-in monitoring tools, but these are limited to individual metrics of the virtual servers and therefore do not meet the requirements of comprehensive server monitoring. In this blog, I will show you how to correctly monitor virtual servers and virtualization platforms such as VMware vSphere, Citrix XenServer or Microsoft Hyper-V.
What is meant by server virtualization and virtual servers?
In contrast to hardware servers, virtual servers do not have dedicated physical hardware of their own; they share hardware resources with other systems. To make this possible, they run in an abstraction layer between the hardware and the user, in which the actual physical resources are divided among the virtual systems.
The most common method of virtualizing servers is to use a hypervisor. Such systems allow hardware resources such as processors, RAM and hard disk space to be shared among several virtual servers (which are then known as virtual machines). Server operating systems such as Windows or Linux distributions run on these virtual machines, which in turn can host server applications. The best-known providers of hypervisors are VMware, Citrix and Microsoft.
In addition to classic server virtualization, more and more organizations are using containers or container orchestration platforms such as Kubernetes. With this approach, companies virtualize individual applications rather than a server operating system. Container monitoring is similar to the monitoring of virtual servers, but includes additional aspects that need to be taken into account.
What belongs in virtual server monitoring
When hardware is decoupled from individual subsystems, it becomes more difficult to identify the causes of problems because complex connections are created between different software and hardware levels. It is therefore important to properly consider the dependencies in monitoring. Also, compared to infrastructures that just rely on physical servers, you must expect to monitor more systems when you are using virtualized servers. Therefore, you need a monitoring tool that can automatically detect and monitor virtual machines. In addition to the virtualization platform and the virtual machines, you should also keep an eye on the storage and the network components.
This may sound complicated, but the right monitoring solution takes a lot of this work off your hands. It is not just about identifying the systems as hosts and services, but also about representing the systems accurately in the monitoring and providing precise alerts that tell you where to look for the source of a problem.
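To illustrate why dependencies matter for precise alerts, here is a minimal sketch in Python. It assumes a hypothetical mapping from each system to the system it depends on (a virtual machine to its hypervisor, a hypervisor to its switch) and reports only those failed systems whose parent is still up. The function name, the mapping and the host names are invented for this example; this is not how Checkmk implements it.

```python
def root_causes(parents, down):
    """Return the failed systems whose parent has not failed as well.

    parents: dict mapping a system to the system it depends on (hypothetical data)
    down:    iterable of systems that are currently unreachable
    """
    down = set(down)
    # A system is a likely root cause if its parent is not in the failed set.
    return sorted(system for system in down if parents.get(system) not in down)


# Hypothetical topology: two VMs on one hypervisor, one VM on another.
parents = {
    "vm-a": "esx-1",
    "vm-b": "esx-1",
    "vm-c": "esx-2",
    "esx-1": "switch-1",
    "esx-2": "switch-1",
}

alerts = root_causes(parents, down=["vm-a", "vm-b", "esx-1"])
print(alerts)
```

With this input, only the hypervisor is reported, because the two failed virtual machines depend on it. One alert pointing at the host is more useful than three alerts of equal weight.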
Platforms such as VMware vSphere, Citrix XenServer or Microsoft Hyper-V provide monitoring data that Checkmk can collect using special agents. In contrast to the Checkmk agents installed on the server operating system, the special agents run directly on the Checkmk server. This gives you an initial basis for monitoring your virtual servers within a few seconds: the virtual machines are automatically created in Checkmk as services. You can of course also monitor the virtual machines with normal Checkmk agents via the operating system and include them as separate hosts in the monitoring. Furthermore, as an all-in-one platform, Checkmk offers precise insights into all relevant areas.
Monitoring the shared storage of the virtual machines is important, for example, because a virtual server only checks whether the connected storage has sufficient space for its own applications. If it does, the virtual server reserves this space on the storage. The storage, however, does not count the reservation as occupied space, because only completed write operations count for it. It raises an alarm only when the space actually used reaches a critical value, and it ignores reservations made by the virtual machines.
If several virtual machines hit a performance peak at the same time, more storage space can be reserved than is actually available. The checking mechanisms of the storage and of the individual virtual machines are not sufficient in this situation. If the virtual machines then write data to the storage, the storage system detects the over-allocation too late. There is a risk of data loss and perhaps even a system crash.
Checkmk monitors the virtual machines, the storage and the virtualization platform, and would detect the problem immediately by comparing all data sources. In this instance, the monitoring data from the virtualization platform would show Checkmk that the sum of the space reserved by all virtual machines exceeds the space available on the storage. Checkmk also automatically sets appropriate threshold values for all areas. This allows you to set up monitoring for all hosts in a few minutes without being swamped by alarms or false positives.
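The comparison described above can be sketched in a few lines of Python. The example contrasts the view of the storage, which only counts written data, with the view of the virtualization platform, which also knows the reservations. All names, the figures and the 80 % warning ratio are assumptions chosen for illustration; they are not Checkmk defaults or Checkmk code.

```python
from dataclasses import dataclass


@dataclass
class VirtualMachine:
    name: str
    reserved_gb: float  # space the VM has reserved on the shared storage
    written_gb: float   # space the VM has actually written so far


def check_datastore(capacity_gb, vms, warn_ratio=0.8):
    """Compare written data (storage view) with reservations (platform view)."""
    written = sum(vm.written_gb for vm in vms)
    reserved = sum(vm.reserved_gb for vm in vms)
    return {
        # The storage only looks at completed writes.
        "storage_view_ok": written < warn_ratio * capacity_gb,
        # The platform data reveals whether reservations exceed the capacity.
        "overcommitted": reserved > capacity_gb,
        "written_gb": written,
        "reserved_gb": reserved,
    }


# Hypothetical datastore with 1000 GB and three VMs reserving 400 GB each.
vms = [
    VirtualMachine("vm-a", reserved_gb=400, written_gb=150),
    VirtualMachine("vm-b", reserved_gb=400, written_gb=200),
    VirtualMachine("vm-c", reserved_gb=400, written_gb=100),
]
result = check_datastore(capacity_gb=1000, vms=vms)
print(result)
```

In this example the storage on its own sees no problem, since only 450 GB of 1000 GB have been written, while the reservations add up to 1200 GB. Only the combination of both data sources exposes the over-allocation before the virtual machines start writing.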
Bear in mind, too, that the connections between your virtual machines, switches and storage are probably redundant. Checkmk captures this information as well by monitoring the hypervisor, since the hypervisor manages the interfaces. This makes your monitoring more intelligent. You can also tailor monitoring access and permissions for each Checkmk user according to their area of responsibility, the level of detail they need and the security clearance required for the information.
Virtualized environments in particular are more dynamic, so it is important that you can automate their monitoring. Checkmk offers numerous features here, such as the Dynamic Configuration Daemon (DCD), which makes manual maintenance of hosts superfluous. The DCD allows Checkmk to add hosts to and remove hosts from the monitoring automatically, based on data from VMware, Kubernetes and other sources.
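The basic idea behind this kind of automation can be sketched in Python as a comparison between the hosts currently in the monitoring and the hosts a data source reports. This is only a conceptual sketch with hypothetical function and host names; it does not show the actual implementation or configuration of the DCD.

```python
def plan_host_sync(monitored, discovered):
    """Work out which hosts to add to and remove from the monitoring.

    monitored:  host names currently in the monitoring
    discovered: host names reported by the data source (e.g. a hypervisor)
    """
    monitored, discovered = set(monitored), set(discovered)
    return {
        "add": sorted(discovered - monitored),     # new VMs not yet monitored
        "remove": sorted(monitored - discovered),  # VMs that no longer exist
    }


plan = plan_host_sync(
    monitored=["vm-a", "vm-b"],
    discovered=["vm-b", "vm-c"],
)
print(plan)
```

Here "vm-c" would be added and "vm-a" removed, while "vm-b" stays untouched. Running such a comparison on a schedule keeps the monitoring in step with an environment in which virtual machines are created and deleted frequently.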
In addition to virtual servers, more and more organizations are making use of cloud servers. Monitoring these servers via Checkmk's special agents is also possible. You can find more details on the subject of cloud server monitoring in the next blog in this series.