Imbalance of CPU and memory requests from 2022-03-31 15:20 CEST to 2022-04-19 17:00 CEST

Updates

Effective May 1st, 2022, we’ll factor CPU requests into the pricing of APPUiO Cloud should they exceed the underlying platforms’ memory to CPU ratio. See products.docs.vshn.ch for a detailed description of how this will affect pricing.

If your workload doesn’t exceed the underlying platforms’ ratio, nothing will change in the pricing. When you deploy a workload which violates the platforms’ ratio, you’ll notice a message in the oc CLI stating: “Current memory to CPU ratio of MYVALUE/core in this namespace is below the fair use ratio of XGi/core.” If you see this message, you should tweak your settings before May 1st so that your invoice isn’t affected.
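To check a workload before deploying it, the comparison behind that message can be sketched in a few lines of Python. This is an illustration only, not the code APPUiO Cloud runs: the function names are hypothetical, and the fair use value of 4 GiB per core in the example is taken from the cloudscale.ch ratio mentioned in the March 31 update, so substitute the value shown in your own warning.

```python
def memory_per_core_gib(memory_request_mib: float, cpu_request_millicores: float) -> float:
    """Return the requested memory in GiB per requested CPU core."""
    if cpu_request_millicores <= 0:
        raise ValueError("CPU request must be positive")
    memory_gib = memory_request_mib / 1024
    cores = cpu_request_millicores / 1000
    return memory_gib / cores


def below_fair_use(memory_request_mib: float,
                   cpu_request_millicores: float,
                   fair_use_gib_per_core: float) -> bool:
    """True if the requests ask for less memory per core than the fair use ratio."""
    ratio = memory_per_core_gib(memory_request_mib, cpu_request_millicores)
    return ratio < fair_use_gib_per_core


# Assumed example values: 4 GiB/core fair use ratio.
print(memory_per_core_gib(512, 500))   # 512 MiB with 500 millicores -> 1.0 GiB/core
print(below_fair_use(512, 500, 4.0))   # True: a warning would be expected
print(below_fair_use(2048, 250, 4.0))  # False: 8 GiB/core is above the ratio
```

Lowering the CPU request or raising the memory request both move the ratio upward; which one is appropriate depends on what the application actually uses.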

Should you have any questions or need help defining the best matching ratio, don’t hesitate to contact us. We’ll also continue to work on improving this situation, as described in the earlier update.

April 19, 2022 · 17:00 CEST

Update

Thanks to all of you who have looked into this and acted upon it.

We were able to have a closer look at the situation and come up with a plan.

Our initial thought was to just outright forbid CPU requests not fitting the memory to CPU ratio of the platform. We understand, however, that this would be a bad user experience. We also understand that there are legitimate cases for higher CPU requests. Instead, only warnings will be shown, which you will be able to suppress when you have one of those legitimate use cases. You might have already noticed them.

To compensate for the additional resources we have to allocate to the cluster, we have found a way to factor CPU requests into the pricing of APPUiO Cloud. The specifics of this will be communicated soon. Those changes will not impact the current month’s invoice.

The invoice currently shows only one number. This is abstract and can be hard, if not impossible, to understand. We will extend the invoice with additional details to show whether you are billed due to effective usage, memory requests, or CPU requests.

On top of that, we will look into the Vertical Pod Autoscaler Recommender (VPA). If this works as anticipated, it will give you a suggestion on how to tune requests and limits of your applications based on their actual use. Or, if you are so inclined, you can let the VPA automatically apply those recommendations for you.

In summary, what will happen next?

We will soon publish documentation about how to suppress those warnings.
The April invoice might show additional details on what exactly you got billed for. This will definitely be the case for May.
We will then update the product description, to explain how CPU requests will be factored into the APPUiO Cloud pricing.
The new pricing will be communicated during the month of April, and will start to take effect in May.

April 8, 2022 · 17:24 CEST

Essential

APPUiO Cloud is billed based on memory. CPU is defined as “fair use”. But what does fair use mean? The underlying virtual machines on cloudscale.ch have a memory to CPU ratio of 4 GiB per CPU. Consequently, the workload on APPUiO Cloud has to fit into that ratio. That ratio has gone off balance in the last few days.

The cause of that imbalance is not the effective CPU usage, but the configured CPU requests on a Pod level. A lot of requests were set to levels far above what the application actually needs. In order to ensure all workloads can be scheduled, we had to spin up more and more machines, which are now doing nothing. This is not economically viable, and we have to take action.
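A rough sketch of why oversized CPU requests force additional machines: the scheduler has to fit the sum of all requests, so the cluster size is driven by whichever resource is requested most relative to node capacity. The node size (4 cores, 16 GiB) and the request totals below are hypothetical numbers chosen to match the 4 GiB per CPU ratio; the model ignores system reservations and packing inefficiencies.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


def nodes_needed(total_cpu_millicores: int, total_memory_mib: int,
                 node_cpu_millicores: int, node_memory_mib: int) -> tuple[int, int]:
    """Return (nodes needed by CPU requests, nodes needed by memory requests)."""
    by_cpu = ceil_div(total_cpu_millicores, node_cpu_millicores)
    by_memory = ceil_div(total_memory_mib, node_memory_mib)
    return by_cpu, by_memory


# Hypothetical cluster: nodes with 4 cores and 16 GiB each.
# Hypothetical requests: 40 cores and 64 GiB in total (1.6 GiB per core).
by_cpu, by_memory = nodes_needed(40_000, 65_536, 4_000, 16_384)
print(by_cpu)                  # 10 nodes to satisfy the CPU requests
print(by_memory)               # 4 nodes to satisfy the memory requests
print(max(by_cpu, by_memory))  # 10 nodes must run, although memory needs only 4
```

In this example, six of the ten nodes exist only to back CPU requests, and since billing is based on memory, nothing pays for them.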

To begin with, we would like to explain the effect and purpose of requests and how CPU and memory requests differ.

A resource request can be translated as: “My workload requires AT LEAST that amount of resources”. This can also be rephrased as: “My workload won’t work with fewer resources”. If a workload requires on average 200 millicores, it makes sense to set the request to 200 millicores. Setting it to anything higher will be a waste of resources. This is because the system will reserve that amount of resources for your workload and will not allocate it to other workloads. Assume all workloads are scheduled with a request 25% above what is actually used. We would end up reserving an extra 25% of resources that nobody uses.
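The arithmetic behind the 25% example can be sketched as follows, using the 200 millicore workload from above with a request set 25% higher. The function name is hypothetical and the numbers are purely illustrative.

```python
def reserved_but_unused(usage_millicores: int, request_millicores: int) -> int:
    """CPU that is reserved for the workload but not actually used."""
    return max(request_millicores - usage_millicores, 0)


usage = 200    # average usage in millicores
request = 250  # request set 25% above usage

idle = reserved_but_unused(usage, request)
print(idle)            # 50 millicores reserved but idle
print(idle / usage)    # 0.25: the overhead relative to actual usage
print(idle / request)  # 0.2: the share of the reservation that sits idle
```

Setting the request to the average usage (200 millicores here) brings the idle reservation down to zero.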

It is also important to understand the difference between memory and CPU in this context. Memory is a non-compressible resource: if an application cannot allocate enough, it will crash. CPU, on the other hand, is more forgiving. If CPU becomes scarce, the application will still work. It just takes more time to finish its calculations.
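The difference can be illustrated with a deliberately simplified model, assuming a purely CPU-bound job whose runtime scales inversely with the CPU it receives. Both functions are hypothetical helpers for illustration, not a description of how the platform schedules or throttles workloads.

```python
def cpu_bound_runtime_seconds(cpu_seconds_of_work: float, available_cores: float) -> float:
    """Less CPU means a longer runtime, but the job still finishes."""
    if available_cores <= 0:
        raise ValueError("available_cores must be positive")
    return cpu_seconds_of_work / available_cores


def fits_in_memory(needed_mib: int, available_mib: int) -> bool:
    """Memory cannot be stretched: either the application fits or it fails."""
    return needed_mib <= available_mib


print(cpu_bound_runtime_seconds(10, 1.0))   # 10.0 seconds with a full core
print(cpu_bound_runtime_seconds(10, 0.25))  # 40.0 seconds with 250 millicores
print(fits_in_memory(600, 512))             # False: the application would crash
```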

On a shared platform like APPUiO Cloud, probability tells us that not everybody will use lots of CPU all the time, so usage averages out. Even if your workloads use more than requested, we make sure that there is enough available.

This is why we have chosen to bill APPUiO Cloud based on memory. Because CPU is granted on a “fair use” basis, nobody is paying for that overhead.

So what does this mean?

We ask all APPUiO Cloud users to check their CPU requests and set them to lower levels where possible. Remember, request only what is really needed. Further, we will look into ways to enforce the memory to CPU ratio we get from the underlying infrastructure provider. We are also looking into ways to bill CPU requests that exceed that ratio.

We will let you know as soon as we have new information to share.

Detailed documentation on Kubernetes resource management (requests and limits) can be found at https://kb.vshn.ch/rancher/explanations/kubernetes_resource_management.html

March 31, 2022 · 15:42 CEST
