Cost Control Dashboard
The platform comes with the cost control dashboard enabled by default, offering insights into potential cost savings through virtual clusters.
To track allocations and calculate savings for workloads running inside virtual clusters the platform deploys and manages Prometheus and OpenCost on each connected cluster. Prometheus uses a time series database that requires persistent volume storage to retain metrics over time. The platform-managed Prometheus is configured to collect the minimum metrics required for the cost control dashboard. OpenCost provides CPU and Memory allocation metrics for the deployed workloads.
In addition to the Prometheus instance on the connected cluster, the cluster hosting the platform has a global Prometheus installed that is configured to aggregate metrics from all connected clusters. This utilizes the connected cluster private network so no additional networking configuration is required.
By default, the Cost Control Dashboard requests 60Gi persistent volumes for each connected cluster's metrics, and an additional 60Gi persistent volume for global metric aggregation. See the Configuration section for details on enabling/disabling the dashboard or configuring its resource needs.
Configuration​
Although the platform manages Prometheus and OpenCost automatically, certain
settings can be configured to better suit specific environments or needs. The
primary configuration is part of the platform constControl
config.
To turn off the cost control dashboard, which is enabled by default, set costControl.enabled
to false
:
config:
costControl:
enabled: false
The remaining configuration customizes the cost settings, global Prometheus, and connected cluster OpenCost and Prometheus. These are detailed in the following sections.
Cost settings​
The platform provides initial cost defaults as shown below. These can be customized for your environment.
config:
costControl:
settings:
# Average CPU cost
averageCPUPricePerNode:
price: 31
timePeriod: Monthly
# Average Memory cost
averageRAMPricePerNode:
price: 31
timePeriod: Monthly
# Monthly or Yearly cost for Hosted Control Planes
controlPlanePricePerCluster:
price: 900
timePeriod: Yearly
Global observability​
These settings control, the global Prometheus's availability, retention, storage, cpu, and memory configuration. The default settings are provided below for reference, and may be customized.
config:
costControl:
global:
metrics:
# Number of replicas for the Prometheus Deployment
replicas: 1
# Prometheus TSDB retention settings: https://prometheus.io/docs/prometheus/latest/storage/#operational-aspects
retention: 1y
# CPU and Memory Requests and Limits for the Prometheus Deployment
resources:
requests:
cpu: 500m
memory: 6Gi
limits:
cpu: '2'
memory: 6Gi
# Storage Settings for Persistent Volumes
storage:
size: 60Gi
# Valid values:
# - null: use the default storage class
# - "": disable automatic storage provisioning
# - name of the preferred configured storage class
storageClass: null
retention
refers to Prometheus's TSDB retention settings. Details regarding valid units can be found in the
prometheus documentation
storageClass
configures the storage class that should be used for provisioning of the Prometheus deployment's
StatefulSet
. Valid values are:
null
: use the cluster's default storage class""
(empty string): turn off automatic provisioning- The name of a pre-configured storage class on the cluster
Overridden resources
are not merged with the defaults to avoid unexpected changes from Helm charts used by the
platform to manage Prometheus. Therefore, all requests and limits that you wish to apply should be provided.
Cluster observability​
These settings control the Prometheus's configuration for each connected cluster. These are applied uniformly across multiple clusters, but can be customized using Cluster Specific Configuration.
config:
costControl:
cluster:
metrics:
# Number of replicas for the Prometheus Deployment
replicas: 1
# Prometheus TSDB retention settings: https://prometheus.io/docs/prometheus/latest/storage/#operational-aspects
retention: 1y
# CPU and Memory Requests and Limits for the Prometheus Deployment
resources:
requests:
cpu: 500m
memory: 3Gi
limits:
cpu: '1'
memory: 3Gi
# Storage Settings for Persistent Volumes
storage:
size: 60Gi
# Valid values:
# - null: use the default storage class
# - "": disable automatic storage provisioning
# - name of the preferred configured storage class
storageClass: null
Overridden resources
are not merged with the defaults to avoid unexpected changes from Helm charts used by the
platform to manage Prometheus. Therefore, all requests and limits that you wish to apply should be provided.
Cluster cost control​
These settings control the OpenCost deployment on each connected cluster. While there are many possible ways to configure OpenCost, the Platform configures it with the minimum settings required for the dashboard. As such only availability, CPU, and memory settings may be configured.
config:
costControl:
cluster:
opencost:
# Number of replicas for the OpenCost Deployment
replicas: 1
# CPU and Memory Requests and Limits for the OpenCost Deployment
resources:
requests:
cpu: 500m
memory: 3Gi
limits:
cpu: '1'
memory: 3Gi
Overridden resources
are not merged with the defaults to avoid unexpected changes from Helm charts used by the
platform to manage OpenCost. Therefore, all requests and limits that you wish to apply should be provided.
Other cluster settings​
For individual cluster configuration, the Cluster resource allows for further configuration specific to each cluster.