Version: v4.3 Stable

Cost Control Dashboard

The platform comes with the cost control dashboard enabled by default, offering insights into potential cost savings through virtual clusters.

info

This feature is available from the platform version v4.2.0 and was introduced in vCluster version v0.22.0

To track allocations and calculate savings for workloads running inside virtual clusters the platform deploys and manages Prometheus and OpenCost on each connected cluster. Prometheus uses a time series database that requires persistent volume storage to retain metrics over time. The platform-managed Prometheus is configured to collect the minimum metrics required for the cost control dashboard. OpenCost provides CPU and Memory allocation metrics for the deployed workloads.

In addition to the Prometheus instance on the connected cluster, the cluster hosting the platform has a global Prometheus installed that is configured to aggregate metrics from all connected clusters. This utilizes the connected cluster private network so no additional networking configuration is required.

Storage requirements

By default, the Cost Control Dashboard requests 60Gi persistent volumes for each connected cluster's metrics, and an additional 60Gi persistent volume for global metric aggregation. See the Configuration section for details on enabling/disabling the dashboard or configuring its resource needs.

Network requirements

In air-gapped or private GKE clusters it may be necessary to add a firewall rule allowing incoming traffic to port 9090. This port is used when aggregating metrics from connected clusters.

Configuration

Although the platform manages Prometheus and OpenCost automatically, certain settings can be configured to better suit specific environments or needs. The primary configuration is part of the platform constControl config.

To turn off the cost control dashboard, which is enabled by default, set costControl.enabled to false:

Turning off the Cost Control Dashboard
config:
  costControl:
    enabled: false

The remaining configuration customizes the cost settings, global Prometheus, and connected cluster OpenCost and Prometheus. These are detailed in the following sections.

Cost settings

The platform provides initial cost defaults as shown below. These can be customized for your environment.

Default Cost Settings
config:
  costControl:
    settings:
      # Average CPU cost
      averageCPUPricePerNode:
        price: 31
        timePeriod: Monthly
      # Average Memory cost
      averageRAMPricePerNode:
        price: 31
        timePeriod: Monthly
      # Monthly or Yearly cost for Hosted Control Planes
      controlPlanePricePerCluster:
        price: 900
        timePeriod: Yearly

Global observability

These settings control, the global Prometheus's availability, retention, storage, cpu, and memory configuration. The default settings are provided below for reference, and may be customized.

Default Global Prometheus Settings
config:
  costControl:
    global:
      metrics:
        # Number of replicas for the Prometheus Deployment
        replicas: 1
        # Prometheus TSDB retention settings: https://prometheus.io/docs/prometheus/latest/storage/#operational-aspects
        retention: 1y
        # CPU and Memory Requests and Limits for the Prometheus Deployment
        resources:
          requests:
            cpu: 500m
            memory: 6Gi
          limits:
            cpu: '2'
            memory: 6Gi
        # Storage Settings for Persistent Volumes
        storage:
          size: 60Gi
          # Valid values:
          # - null:  use the default storage class
          # - "": disable automatic storage provisioning
          # - name of the preferred configured storage class
          storageClass: null

retention refers to Prometheus's TSDB retention settings. Details regarding valid units can be found in the prometheus documentation

storageClass configures the storage class that should be used for provisioning of the Prometheus deployment's StatefulSet. Valid values are:

null: use the cluster's default storage class
"" (empty string): turn off automatic provisioning
The name of a pre-configured storage class on the cluster

Overriding Resources

Overridden resources are not merged with the defaults to avoid unexpected changes from Helm charts used by the platform to manage Prometheus. Therefore, all requests and limits that you wish to apply should be provided.

Cluster observability

These settings control the Prometheus's configuration for each connected cluster. These are applied uniformly across multiple clusters, but can be customized using Cluster Specific Configuration.

Default Cluster Prometheus Settings
config:
  costControl:
    cluster:
      metrics:
        # Number of replicas for the Prometheus Deployment
        replicas: 1
        # Prometheus TSDB retention settings: https://prometheus.io/docs/prometheus/latest/storage/#operational-aspects
        retention: 1y
        # CPU and Memory Requests and Limits for the Prometheus Deployment
        resources:
          requests:
            cpu: 500m
            memory: 3Gi
          limits:
            cpu: '1'
            memory: 3Gi
        # Storage Settings for Persistent Volumes
        storage:
          size: 60Gi
          # Valid values:
          # - null:  use the default storage class
          # - "": disable automatic storage provisioning
          # - name of the preferred configured storage class
          storageClass: null

Overriding Resources

Cluster cost control

These settings control the OpenCost deployment on each connected cluster. While there are many possible ways to configure OpenCost, the Platform configures it with the minimum settings required for the dashboard. As such only availability, CPU, and memory settings may be configured.

Default OpenCost Settings
config:
  costControl:
    cluster:
      opencost:
        # Number of replicas for the OpenCost Deployment
        replicas: 1
        # CPU and Memory Requests and Limits for the OpenCost Deployment
        resources:
          requests:
            cpu: 500m
            memory: 3Gi
          limits:
            cpu: '1'
            memory: 3Gi

Overriding Resources

Overridden resources are not merged with the defaults to avoid unexpected changes from Helm charts used by the platform to manage OpenCost. Therefore, all requests and limits that you wish to apply should be provided.

Other cluster settings

For individual cluster configuration, the Cluster resource allows for further configuration specific to each cluster.

Configuration​

Cost settings​

Global observability​

Cluster observability​

Cluster cost control​

Other cluster settings​

Configuration

Cost settings

Global observability

Cluster observability

Cluster cost control

Other cluster settings