4.2. Monitoring the Compute Cluster

After you create the compute cluster, you can monitor it on the COMPUTE > Overview screen.

The compute cluster status is displayed at the top of the screen and can be one of the following:

HEALTHY
All compute cluster components and nodes operate normally.
CONFIGURING
The compute cluster configuration (the default CPU model for VMs or the number of compute nodes) is changing.
WARNING
The compute cluster operates normally but some issues have been detected.
CRITICAL
The compute cluster has encountered a critical problem and is not operational.

The charts show CPU, RAM, and storage usage; the number of virtual machines grouped by status; the virtual machines with the highest resource consumption; and compute-related alerts.

4.2.1. Used CPUs Chart

This chart displays CPU utilization of the compute cluster. The following statistics are available:

System
The number of logical cores used by system and storage services on all nodes in the compute cluster.
VMs
The number of logical cores used by virtual machines on all nodes in the compute cluster.
Free
The number of unused logical cores on all nodes in the compute cluster.
Fenced
The number of CPUs on all fenced nodes in the compute cluster.
Total
The total number of logical cores on all nodes in the compute cluster.
Provisioned vCPUs
The number of vCPUs provisioned for all VMs in the compute cluster.
../_images/stor_image138_ac.png

A similar chart is available for each individual node in the compute cluster.
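
The Total and Provisioned vCPUs figures can be combined into a single overcommitment ratio. The following is a minimal sketch, assuming you read both numbers off the chart yourself; the function name and the sample values are hypothetical and are not part of the product.

```python
def vcpu_overcommit_ratio(provisioned_vcpus: int, total_logical_cores: int) -> float:
    """Return the number of provisioned vCPUs per logical core.

    Both arguments are assumed to be copied from the Used CPUs chart
    ("Provisioned vCPUs" and "Total").
    """
    if total_logical_cores <= 0:
        raise ValueError("total_logical_cores must be positive")
    return provisioned_vcpus / total_logical_cores


# Hypothetical readings: 96 vCPUs provisioned on 48 logical cores.
ratio = vcpu_overcommit_ratio(96, 48)
print(f"vCPU overcommitment: {ratio:.1f}x")
```

A ratio above 1 means that more vCPUs have been provisioned than there are logical cores in the compute cluster.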

4.2.2. Reserved RAM Chart

This chart displays RAM utilization of the compute cluster. The following statistics are available:

System
The amount of RAM reserved for system and storage services on all nodes in the compute cluster.
VMs
The amount of RAM provisioned for all VMs in the compute cluster.
Free
The amount of free RAM on all nodes in the compute cluster.
Fenced
The amount of RAM on all fenced nodes in the compute cluster.
Total
The total amount of RAM on all nodes in the compute cluster.
Used by VMs
The amount of RAM actually used by all VMs in the compute cluster.
../_images/stor_image139_ac.png

A similar chart is available for each individual node in the compute cluster.
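
The chart distinguishes RAM provisioned for VMs from RAM actually used by them. As a rough illustration of that difference, the sketch below computes the share of provisioned RAM that is in use. The function name, the GiB unit, and the sample values are assumptions made for the example.

```python
def ram_in_use_fraction(used_by_vms_gib: float, provisioned_for_vms_gib: float) -> float:
    """Return the fraction of provisioned VM RAM that VMs actually use.

    Inputs are assumed to be the "Used by VMs" and "VMs" values
    from the Reserved RAM chart, expressed in the same unit.
    """
    if provisioned_for_vms_gib <= 0:
        return 0.0  # nothing provisioned, so nothing can be in use
    return used_by_vms_gib / provisioned_for_vms_gib


# Hypothetical readings: 32 GiB used out of 128 GiB provisioned.
print(f"{ram_in_use_fraction(32, 128):.0%} of provisioned RAM is in use")
```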

4.2.3. Provisioned Storage Chart

This chart shows usage of storage space by the compute cluster. The following statistics are available:

Used
The amount of storage space actually occupied by data in all volumes provisioned in the compute cluster.
Free
The amount of unused space in all volumes provisioned in the compute cluster.
Total
The total size of volumes provisioned in the compute cluster.
Free physical space
The amount of physical space available in the storage cluster.
../_images/stor_image140_ac.png
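
Free space in volumes and free physical space are different figures, so it can be useful to compare them. The sketch below is one hedged way to do so. The `redundancy_factor` parameter is an assumption introduced for this example: it stands for how many bytes of physical space one byte of volume data consumes, and it is not a value shown on the chart.

```python
def volumes_can_fill(free_in_volumes_gib: float,
                     free_physical_gib: float,
                     redundancy_factor: float = 1.0) -> bool:
    """Check whether the free space in all volumes could be written in full.

    free_in_volumes_gib -- the "Free" value from the chart
    free_physical_gib   -- the "Free physical space" value from the chart
    redundancy_factor   -- hypothetical physical-to-logical space multiplier
    """
    required_physical = free_in_volumes_gib * redundancy_factor
    return required_physical <= free_physical_gib


# Hypothetical readings: 200 GiB free in volumes, 300 GiB free physically,
# and an assumed threefold physical footprint for volume data.
print(volumes_can_fill(200, 300, redundancy_factor=3.0))
```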

4.2.4. VM Status Chart

The VM Status chart shows the total number of virtual machines in the compute cluster and groups them by status, which can be one of the following:

Running
The number of virtual machines that are up and running.
In progress
The number of virtual machines that are in a transitional state: building, restarting, migrating, etc.
Stopped
The number of virtual machines that are suspended or powered off.
Error
The number of virtual machines that have failed. You can reset such VMs to their last stable state.
../_images/stor_image141_ac.png

To see a full list of virtual machines filtered by the chosen status, click the number next to the status icon.
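
The grouping described above can be sketched as a simple lookup followed by a count. The state names in the mapping are hypothetical labels chosen for this example; only the four chart groups come from the list above.

```python
from collections import Counter

# Hypothetical VM state names mapped to the groups shown on the chart.
STATE_TO_GROUP = {
    "running": "Running",
    "building": "In progress",
    "restarting": "In progress",
    "migrating": "In progress",
    "suspended": "Stopped",
    "powered_off": "Stopped",
    "failed": "Error",
}


def group_vm_states(states):
    """Count VMs per chart group; an unknown state raises KeyError."""
    return Counter(STATE_TO_GROUP[state] for state in states)


counts = group_vm_states(["running", "running", "migrating", "suspended", "failed"])
for group in ("Running", "In progress", "Stopped", "Error"):
    print(group, counts[group])
```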

4.2.5. Top VMs Chart

The Top VMs chart lists virtual machines with the highest resource consumption sorted by CPU, RAM, or Storage in descending order. To switch between lists, click the desired resource.

../_images/stor_image142_ac.png

To see a full list of virtual machines in the compute cluster, click Show all.
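
Conceptually, each list on the chart is a descending sort on one resource, cut off after the first few entries. The sketch below illustrates the idea; the record layout, the field names, and the `limit` parameter are assumptions made for the example and do not reflect how the product stores this data.

```python
def top_vms(vms, resource, limit=5):
    """Return up to `limit` VMs with the highest value of `resource`."""
    return sorted(vms, key=lambda vm: vm[resource], reverse=True)[:limit]


# Hypothetical VM records with made-up usage figures.
vms = [
    {"name": "vm-a", "cpu": 12.5, "ram": 4096, "storage": 40},
    {"name": "vm-b", "cpu": 80.0, "ram": 2048, "storage": 120},
    {"name": "vm-c", "cpu": 45.0, "ram": 8192, "storage": 10},
]

for vm in top_vms(vms, "cpu", limit=2):
    print(vm["name"], vm["cpu"])
```

Passing `"ram"` or `"storage"` instead of `"cpu"` corresponds to switching between the lists on the chart.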

4.2.6. Alerts Chart

The Alerts chart lists all the alerts related to the compute cluster sorted by severity. Alerts include the following:

Critical
The compute cluster has encountered a critical problem. For example, one or more of its components have been unavailable for more than 10 seconds or some resource has exceeded its soft limit.
Warning
The compute cluster is experiencing issues that may affect its performance. For example, one or more of its components operate slowly or some resource is approaching its soft limit.
Other
Some other issue has happened with the compute cluster. For example, its license is about to expire or has expired.

To see a full list of compute-related alerts, click Show all.
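
Sorting by severity can be illustrated with a rank lookup. This is a minimal sketch under stated assumptions: the alert records and their fields are hypothetical, and the only detail taken from the text above is the order Critical, Warning, Other.

```python
# Lower rank means higher severity, following the order in the list above.
SEVERITY_RANK = {"Critical": 0, "Warning": 1, "Other": 2}


def sort_alerts(alerts):
    """Return alerts ordered from most to least severe.

    sorted() is stable, so alerts of equal severity keep their input order.
    """
    return sorted(alerts, key=lambda alert: SEVERITY_RANK[alert["severity"]])


# Hypothetical alerts with made-up messages.
alerts = [
    {"severity": "Other", "message": "License is about to expire"},
    {"severity": "Critical", "message": "Component unavailable"},
    {"severity": "Warning", "message": "Component operates slowly"},
]

for alert in sort_alerts(alerts):
    print(alert["severity"], "-", alert["message"])
```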