When managing several Kubernetes/OpenShift clusters, one also has to deal with issues like resource and cost optimisation. How can we establish a better balance between the resources that are available and those that are allotted to workloads? How can new clusters be provisioned more quickly? What steps can be taken to achieve a more independent lifecycle management for a fleet of clusters? All these questions have a significant impact on how a large number of clusters is maintained and managed. This article gives an overview of the solution that is being pursued with HyperShift.
Overview
In a system landscape that is based on multiple Kubernetes/OpenShift clusters, you typically have a defined setup flavour, each with a number of control nodes and a set of worker nodes, depending on the resource demands. Since OpenShift 4.x the number of control nodes is fixed at three separate nodes. This means that, in addition to worker nodes, a typical cluster has dedicated control nodes to manage the entire cluster. It also implies that the corresponding resources (CPU, memory, storage, up to additional hardware) must be included in the design and provisioning activities. In such setups, the question arises why the resource allocation between management (control) and workload is so inefficient. This is intensified when there are many small clusters whose worker resources are equivalent to or below the resources needed for the control nodes.
Another issue is the time taken for cluster provisioning. A consumer primarily needs the worker nodes in the cluster, but has to wait until all control nodes are provisioned as well.
OpenShift offers a selection of existing installation options - the form factors - to address these issues and optimisations. These will be briefly discussed in the course of this article.
Optimising resources is not the only challenge when operating a fleet of clusters. Other issues are:
- how to reduce the resource consumption of recurring services like the image registry, the logging/monitoring stack etc.
- as well as how to reduce the effort to manage and operate them
- in terms of operation, an independent lifecycle management between control and worker nodes would be helpful
- another dimension is supporting different architectures and mixing them in a cluster
The main points are to deal with cost and resource consumption, to be flexible in provisioning, and to manage the nodes more efficiently. And this is where Hosted Control Planes / HyperShift comes in.
The following sections give an overview of the different form factors and then discuss the fundamental concept of HyperShift, along with the benefits that may result.
Preparation
The preparation only includes the installation of the Cryostat Operator. In OpenShift this is a simple installation from the OperatorHub:
- Search for `Cryostat`
- Install in a new namespace
- Afterwards create an instance (default values are totally fine)
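For reference, the instance can also be created declaratively instead of via the UI. The following is only a minimal sketch: the `operator.cryostat.io/v1beta1` apiVersion and the `minimal`/`enableCertManager` fields are assumptions based on the operator's current API, and the namespace is just an example.

# Minimal Cryostat instance (sketch; the default values offered in the UI create an equivalent object)
apiVersion: operator.cryostat.io/v1beta1
kind: Cryostat
metadata:
  name: cryostat
  namespace: cryostat          # the namespace the operator was installed into
spec:
  minimal: false               # full deployment, including Grafana and the JFR datasource
  enableCertManager: false     # assumed field; enable it if cert-manager is available in the cluster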
This results in a Cryostat instance with a route to the UI and a Grafana instance, to which Cryostat uploads a subset of the recorded events.
Info: Grafana is protected and the credentials are stored in a `Secret`; see the usage instructions for details.
Record
To record the events from a running containerized Java application, the following points are necessary:
- The application should expose a JMX endpoint
- Cryostat has to discover this application and endpoint
If a Quarkus application is used, JMX is always enabled in JVM mode. Only the right environment variables are needed to expose the endpoint in the Kubernetes Deployment:
env:
  - name: JAVA_OPTS_APPEND
    value: >-
      -Dcom.sun.management.jmxremote
      -Dcom.sun.management.jmxremote.port=9998
      -Dcom.sun.management.jmxremote.rmi.port=9998
      -Djava.rmi.server.hostname=quarkus-playground-s2i.demo-jfr
      -Dcom.sun.management.jmxremote.authenticate=false
      -Dcom.sun.management.jmxremote.ssl=false
      -Dcom.sun.management.jmxremote.local.only=false
Consider the following configurations:
- Define a port for JMX and use it in the Kubernetes `Service` definition of the application as well (see the Deployment sketch below for the container side)
- Define the hostname as `<service-name>.<namespace>`. The communication between Cryostat and the app stays inside the cluster, so no (external) IP is needed here
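To put the pieces together, the relevant part of the application's Deployment could look like the following sketch. The name quarkus-playground-s2i and the namespace demo-jfr are taken from the example above; the labels and the image reference are placeholders.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: quarkus-playground-s2i
  namespace: demo-jfr
spec:
  replicas: 1
  selector:
    matchLabels:
      app: quarkus-playground-s2i        # placeholder label
  template:
    metadata:
      labels:
        app: quarkus-playground-s2i
    spec:
      containers:
        - name: quarkus-playground-s2i
          image: quay.io/example/quarkus-playground:latest   # placeholder image
          ports:
            - containerPort: 9998        # the JMX port, exposed again by the Service below
              name: jfr-jmx
          env:
            - name: JAVA_OPTS_APPEND     # the JMX flags shown above
              value: >-
                -Dcom.sun.management.jmxremote
                -Dcom.sun.management.jmxremote.port=9998
                -Dcom.sun.management.jmxremote.rmi.port=9998
                -Djava.rmi.server.hostname=quarkus-playground-s2i.demo-jfr
                -Dcom.sun.management.jmxremote.authenticate=false
                -Dcom.sun.management.jmxremote.ssl=false
                -Dcom.sun.management.jmxremote.local.only=false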
As mentioned above, the `Service` should contain the JMX port:
ports:
  #...
  - port: 9998
    targetPort: 9998
    protocol: TCP
    name: jfr-jmx
Consider here that the JMX port carries the name `jfr-jmx`. This is the name that Cryostat expects. It is helpful in case Cryostat is running in the same namespace as the application/container: in this case, Cryostat would discover the new target application automatically.
Usually, Cryostat runs in its own namespace or should record various applications in different namespaces. For that reason, one has to register the target applications manually via the Cryostat UI, using a URL like the following as an example:
service:jmx:rmi:///jndi/rmi://quarkus-playground-s2i.demo-jfr:9998/jmxrmi
The same applies here: the Service name and namespace are used to reach the JMX endpoint of the application.
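The general pattern of this connection URL is the following, where the port is the JMX port defined in the Service:

service:jmx:rmi:///jndi/rmi://<service-name>.<namespace>:<jmx-port>/jmxrmi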
After selecting the Target JVM in the Cryostat UI and opening the Recording tab, one has the option to define the duration of the recording and the detail level.
After the recording, Cryostat provides a short analysis report with the main pain points. Details are available in the context menu of the recording, like a link to Grafana, an HTML report, or the option to download the JFR file.
A more comprehensive analysis can be carried out with JDK Mission Control. For this, the recording file (`.jfr`) only has to be imported.
Outlook
JDK Flight Recorder (JFR) and JDK Mission Control are not really new, but Cryostat still is to some extent. Red Hat and the community are still improving and extending the feature set, for example:
- GraalVM Native Image support
- Better integration in OpenShift like RBAC/OAuth support
- Determining workloads via labels, automated rules, etc.
- or asserting JFR events with JfrUnit
Summary
JDK Flight Recorder (JFR) is a mature profiling solution and, because of its minimal overhead, it is ideal for use in production environments. And thanks to Cryostat, this can also be done in an OpenShift environment.
All this enables a more transparent environment through the capability to simply analyse containerized JVM applications. Memory leaks - bye bye.
The relevant configuration is integrated in the quarkus-playground project.
References
- Cryostat Mainpage
- Cryostat GitHub
- Article series from Red Hat about Cryostat
- Java Mission Control
- quarkus-playground GitHub