Open Cluster Management (OCM) is a community-driven project focused on multicluster and multicloud scenarios for Kubernetes applications. In OCM, the multicluster scheduling capabilities are provided by Placement. As we have talked about in the previous article Using the Open Cluster Management Placement for Multicluster Scheduling, you can use Placement to filter clusters by label or claim selector. Placement also provides some default prioritizers which can be used to sort and select the most suitable clusters.
Among the default prioritizers are ResourceAllocatableCPU and ResourceAllocatableMemory, which sort clusters based on their allocatable CPU and memory. However, when considering resource-based scheduling, the limitation is that allocatable CPU and memory are static values that don't change even if the cluster is running out of resources. In some cases, the prioritizer also needs extra data to calculate the score of a managed cluster. For example, there may be a requirement to schedule based on resource monitoring data from the cluster. For these reasons, we need a more extensible way to support scheduling based on customized scores.
The following features introduced in this article are based on Open Cluster Management v0.7.0 and also delivered in Red Hat Advanced Cluster Management for Kubernetes 2.5.
What is Placement extensible scheduling?
OCM Placement introduces the AddOnPlacementScore API to support scheduling based on customized scores. This API can be used by Placement and can store the customized scores. For more details on the definitions of AddOnPlacementScore, see types_addonplacementscore.go. See the following example (the score values are illustrative):

apiVersion: cluster.open-cluster-management.io/v1alpha1
kind: AddOnPlacementScore
metadata:
  name: default
  namespace: cluster1
status:
  conditions:
  - lastTransitionTime: "2021-10-28T08:31:39Z"
    message: AddOnPlacementScore updated successfully
    reason: AddOnPlacementScoreUpdated
    status: "True"
    type: AddOnPlacementScoreUpdated
  validUntil: "2021-10-29T18:31:39Z"
  scores:
  - name: "cpuAvailable"
    value: 66
  - name: "memAvailable"
    value: 55

- conditions: Contains the different condition statuses for this AddOnPlacementScore.
- validUntil: Defines the valid time of the scores. After this time, the scores are considered invalid by Placement. Nil means no expiration. The controller owning this resource should keep the scores up to date.
- scores: Contains a list of score names and values of this managed cluster. In the above example, the API contains a list of customized scores: cpuAvailable and memAvailable.
All the customized score information is stored in status, as we don't expect users to update it.
- As a score provider, a third-party controller could run on either the hub or a managed cluster to maintain the lifecycle of AddOnPlacementScore and update the score in its status.
- As a user, you need to know the resource name (default) and the customized score name (memAvailable) to specify them in the placement YAML to select clusters. For example, a placement can select the top three clusters with the highest memAvailable score.
- In Placement, if the user defines the scoreCoordinate type as AddOn, the Placement controller gets the AddOnPlacementScore resource with the name "default" in each cluster's namespace, reads the score "cpuAvailable" in the score list, and uses that score to sort clusters.
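For illustration, a Placement of that shape might look like the following sketch. It is based on the OCM Placement API; the metadata name, namespace, and weight here are illustrative assumptions, and the API version may vary by release:

```yaml
apiVersion: cluster.open-cluster-management.io/v1beta1
kind: Placement
metadata:
  name: placement
  namespace: ns1
spec:
  numberOfClusters: 3
  prioritizerPolicy:
    mode: Exact
    configurations:
      - scoreCoordinate:
          type: AddOn
          addOn:
            # resourceName/scoreName point at the AddOnPlacementScore
            # resource and the entry in its status.scores list.
            resourceName: default
            scoreName: memAvailable
        weight: 1
```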
You can refer to the enhancements to learn more about the design. In the design, lifecycle maintenance (create, update, and delete) of the AddOnPlacementScore custom resource is not covered, as we expect the customized score provider itself to manage it. In this article, we use an example to show you how to implement a third-party controller to update your own scores and extend the multicluster scheduling capabilities with your own scores.
How to implement a customized score provider
The example code is in the resource-usage-collect GitHub repository. It provides the score of the cluster's available CPU and memory, which can reflect the cluster's real-time resource utilization. It is developed with the OCM addon-framework and can be installed as an add-on plugin to update customized scores in AddOnPlacementScore. See the Add-on Developer Guide to learn more about how to develop an add-on.
The resource-usage-collect add-on follows the hub-agent architecture described below.
The resource-usage-collect add-on contains a controller and an agent.
- The resource-usage-collect-controller runs on the hub cluster. It is responsible for creating the ManifestWork for the resource-usage-collect-agent in each cluster namespace.
- On each managed cluster, the work agent watches the ManifestWork and installs the resource-usage-collect-agent. The resource-usage-collect-agent is the core part of this add-on; it creates an AddOnPlacementScore for each cluster on the hub cluster and refreshes the scores and validUntil every 60 seconds.
Once the AddOnPlacementScore is ready, you can specify the customized score in a Placement to select clusters.
The workflow and logic of the resource-usage-collect add-on are easy to understand. The following steps will help you get started:
Prepare an OCM environment with 2 ManagedClusters
- Run the setup dev environment by kind script to prepare an environment by running the following command:
curl -sSL https://raw.githubusercontent.com/open-cluster-management-io/OCM/main/solutions/setup-dev-environment/local-up.sh | bash
- Run the following command to confirm that two ManagedClusters and a default ManagedClusterSet are created:
$ clusteradm get clusters
NAME ACCEPTED AVAILABLE CLUSTERSET CPU MEMORY KUBERNETES VERSION
cluster1 true True default 24 49265496Ki v1.23.4
cluster2 true True default 24 49265496Ki v1.23.4
$ clusteradm get clustersets
NAME BOUND NAMESPACES STATUS
default 2 ManagedClusters selected
- Run the following commands to bind the default ManagedClusterSet to the default namespace:
clusteradm clusterset bind default --namespace default
$ clusteradm get clustersets
NAME BOUND NAMESPACES STATUS
default default 2 ManagedClusters selected
Install the resource-usage-collect add-on
- Run the following command to clone the source code:
git clone firstname.lastname@example.org:JiahaoWei-RH/resource-usage-collect.git
- Run the following commands to prepare the image:
# get imagebuilder first
go get email@example.com
export PATH=$PATH:$(go env GOPATH)/bin
# build image
- Run the following command to deploy the resource-usage-collect add-on:
- Run the following commands to verify the installation:
On the hub cluster, verify that the resource-usage-collect-controller pod is running.
$ kubectl get pods -n open-cluster-management | grep resource-usage-collect-controller
resource-usage-collect-controller-55c58bbc5-t45dh 1/1 Running 0 71s
On the hub cluster, verify that the AddOnPlacementScore is generated for each managed cluster.
$ kubectl get addonplacementscore -A
NAMESPACE NAME AGE
cluster1 resource-usage-score 3m23s
cluster2 resource-usage-score 3m24s
The AddOnPlacementScore status should contain a list of scores, as follows:
$ kubectl get addonplacementscore -n cluster1 resource-usage-score -oyaml
...
status:
  scores:
  - name: cpuAvailable
    value: 12
  - name: memAvailable
    value: ...
If the AddOnPlacementScore is not created, or there are no scores in the status, go into the managed cluster and check whether the resource-usage-collect-agent pod is running well by running the following command:
$ kubectl get pods -n default | grep resource-usage-collect-agent
resource-usage-collect-agent-5b85cbf848-g5kqm 1/1 Running 0 2m
Select clusters with the customized scores
If everything is running correctly, you can try to create a Placement and select clusters with the customized scores.
- Create a Placement to select one cluster with the highest cpuAvailable score:
cat << EOF | kubectl apply -f -
apiVersion: cluster.open-cluster-management.io/v1beta1
kind: Placement
metadata:
  name: placement1
  namespace: default
spec:
  numberOfClusters: 1
  clusterSets:
    - default
  prioritizerPolicy:
    mode: Exact
    configurations:
      - scoreCoordinate:
          type: AddOn
          addOn:
            resourceName: resource-usage-score
            scoreName: cpuAvailable
        weight: 1
EOF
- Verify the Placement decision.
$ kubectl describe placementdecision -n default | grep Status -A 3
Cluster Name: cluster1
Cluster1 is selected by the Placement.
Run the following commands to get the customized score in AddOnPlacementScore and the cluster score set by Placement. You can see that the cpuAvailable score is 12 in AddOnPlacementScore, and that this value is also the cluster score in the Placement events, which indicates that the Placement is using the customized score to select clusters.
$ kubectl get addonplacementscore -A -o=jsonpath='{range .items[*]}{.metadata.namespace}{"\t"}{.status.scores}{"\n"}{end}'
$ kubectl describe placement -n default placement1 | grep Events -A 10
Type Reason Age From Message
---- ------ ---- ---- -------
Normal DecisionCreate 50s placementController Decision placement1-decision-1 is created with placement placement1 in namespace default
Normal DecisionUpdate 50s placementController Decision placement1-decision-1 is updated with placement placement1 in namespace default
Normal ScoreUpdate 50s placementController cluster1:12 cluster2:12
Now you know how to install the resource-usage-collect add-on and consume the customized score to select clusters. Next, let’s take a deeper look at some key points when you consider implementing a customized score provider.
Where to run the customized score provider
The customized score provider could run on either the hub or a managed cluster. Based on your user stories, you should be able to tell whether the controller is best placed on the hub or on a managed cluster.
In our example, the customized score provider is developed with the addon-framework, which follows the hub-agent architecture. The resource-usage-collect-agent is the real score provider. It is installed on each managed cluster, retrieves the available CPU and memory of the managed cluster, calculates a score, and updates it in AddOnPlacementScore. The resource-usage-collect-controller just takes care of installing the agent.
In other cases, for example, if you want to use the metrics from Thanos to calculate a score for each cluster, then the customized score provider only needs to be placed on the hub, as Thanos has all the metrics collected from each managed cluster.
How to maintain the AddOnPlacementScore CR lifecycle
In our example, the code to maintain the AddOnPlacementScore CR is in pkg/addon/agent/agent.go.
When should the score be created?
The AddOnPlacementScore CR can be created together with the existence of a ManagedCluster, or on demand for the purpose of reducing objects on the hub.
In our example, the add-on creates an AddOnPlacementScore for each managed cluster if it does not exist, and a score is calculated when creating the CR for the first time.
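The create-if-absent flow can be sketched in Go. The scoreStore interface and in-memory fake below are illustrative assumptions standing in for the real clientset calls against the hub; only the resource name resource-usage-score comes from the walkthrough above:

```go
package main

import (
	"errors"
	"fmt"
)

// scoreStore abstracts the hub API calls the agent needs. A real agent
// would back this with the OCM clientset; this interface is an assumption
// made for illustration.
type scoreStore interface {
	Get(namespace, name string) (map[string]int32, error)
	Create(namespace, name string, scores map[string]int32) error
	Update(namespace, name string, scores map[string]int32) error
}

var errNotFound = errors.New("not found")

// ensureScore creates the AddOnPlacementScore CR the first time a managed
// cluster is seen and updates it afterwards, mirroring the create-if-absent
// behavior described above.
func ensureScore(s scoreStore, namespace string, scores map[string]int32) error {
	const name = "resource-usage-score"
	if _, err := s.Get(namespace, name); errors.Is(err, errNotFound) {
		return s.Create(namespace, name, scores)
	} else if err != nil {
		return err
	}
	return s.Update(namespace, name, scores)
}

// memStore is a tiny in-memory fake used only to exercise ensureScore.
type memStore struct{ data map[string]map[string]int32 }

func (m *memStore) Get(ns, name string) (map[string]int32, error) {
	if v, ok := m.data[ns+"/"+name]; ok {
		return v, nil
	}
	return nil, errNotFound
}

func (m *memStore) Create(ns, name string, s map[string]int32) error {
	m.data[ns+"/"+name] = s
	return nil
}

func (m *memStore) Update(ns, name string, s map[string]int32) error {
	m.data[ns+"/"+name] = s
	return nil
}

func main() {
	store := &memStore{data: map[string]map[string]int32{}}
	// First call creates the CR, second call updates it.
	_ = ensureScore(store, "cluster1", map[string]int32{"cpuAvailable": 12})
	_ = ensureScore(store, "cluster1", map[string]int32{"cpuAvailable": 15})
	fmt.Println(store.data["cluster1/resource-usage-score"]["cpuAvailable"]) // prints 15
}
```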
When should the score be updated?
We recommend that you set validUntil when updating the score so that the Placement controller can know whether the score is still valid in case it fails to be updated for a long time.
The score could be updated when your monitoring data changes, or when you need to update it before it expires.
In our example, in addition to recalculating and updating the score every 60 seconds, the update will also be triggered when the node or pod resource in the managed cluster changes.
How to calculate the score
The code to calculate the score is in pkg/addon/agent/calculate.go. A valid score must be in the range -100 to 100. You need to normalize the scores before updating them in AddOnPlacementScore.
When normalizing the score, there are two cases to consider:
The score provider knows the max and min value of the customized scores.
In this case, it is easy to achieve smooth mapping by using a formula. If the actual value is X, and X is in the interval [min, max], then
score = 200 * (X - min) / (max - min) - 100
The score provider doesn’t know the max and min value of the customized scores.
In this case, you need to set a max and min value by yourself, as without a max and min value, it is not possible to map a single value X to the range [-100, 100].
When X is greater than this max value, the cluster can be considered healthy enough to deploy applications, and the score can be set as 100. And if X is less than the min value, the score can be set as -100.
if X >= max
score = 100
if X <= min
score = -100
In our example, the resource-usage-collect-agent running on each managed cluster doesn't have a holistic view to know the max/min value of the CPU/memory usage of all the clusters, so we manually set the max value as MAXMEMCOUNT in the code, and the min value is set as 0. The score calculation formula can be simplified as follows:
score = X / max * 100
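Putting the two cases together, a minimal Go sketch of the normalization might look like this. The function names are ours for illustration, not taken from calculate.go:

```go
package main

import "fmt"

// normalizeScore maps a raw value x onto the [-100, 100] range expected by
// AddOnPlacementScore, clamping values that fall outside [min, max].
func normalizeScore(x, min, max float64) float64 {
	if x >= max {
		return 100
	}
	if x <= min {
		return -100
	}
	return 200*(x-min)/(max-min) - 100
}

// simplifiedScore is the special case described above, where the min value
// is 0 and only the max value is set manually.
func simplifiedScore(x, max float64) float64 {
	if x >= max {
		return 100
	}
	return x / max * 100
}

func main() {
	fmt.Println(normalizeScore(50, 0, 100))  // prints 0 (midpoint of the range)
	fmt.Println(normalizeScore(150, 0, 100)) // prints 100 (clamped)
	fmt.Println(simplifiedScore(12, 100))    // prints 12
}
```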
In this article, we introduced what Placement extensible scheduling is and used an example to show how to implement a customized score provider. This article also listed three key points the developer needs to consider when implementing a third-party score provider. After reading this article, you should have a clear view of how Placement extensible scheduling can help you extend the multicluster scheduling capabilities.
All the features introduced in this article are based on Open Cluster Management v0.7.0 and are also delivered in Red Hat Advanced Cluster Management for Kubernetes 2.5. The latest features will continue to be documented in Extend the multicluster scheduling capabilities with placement.
Feel free to ask questions in the Open-cluster-management-io GitHub community or contact us by using Slack.