Optimizing Development and Testing Costs: Kubernetes Scaling with the FinOps Scaling Operator

The Cost of Idle Development Resources

Kubernetes provides incredible flexibility for application development and testing. However, development and testing environments often suffer from a common problem: resource idleness. During non-business hours, or when teams in certain timezones are offline, these environments can consume significant resources (CPU, memory) even when they’re not actively used. This translates to unnecessary costs, especially in cloud-based Kubernetes deployments. FinOps principles dictate that we should minimize this waste and align resource consumption with actual usage.

The FinOps Scaling Operator offers a solution by automating the scaling of Kubernetes Deployments based on customizable schedules. This allows teams to dynamically adjust resource allocation, ensuring availability when needed and minimizing costs during periods of inactivity.

Repo Link: https://github.com/chrissam/finops-scaling-operator

Use Case: Development and Testing Environments

This operator is particularly well-suited for optimizing resource usage in non-production Kubernetes clusters or namespaces, where development, testing, and CI/CD workloads are deployed. These environments often exhibit predictable usage patterns:

  • Business Hour Activity: High resource utilization during standard business hours when developers and testers are actively working.
  • Off-Hour Idleness: Low or no activity during evenings, weekends, or holidays.
  • Distributed Teams: Organizations with distributed teams across multiple timezones may experience staggered usage patterns, with resources being heavily used in one timezone while idle in another.

The FinOps Scaling Operator helps to address the challenge of off-hour idleness and timezone-based usage variations in these scenarios.

Key Features

Here’s a breakdown of the key features of the FinOps Scaling Operator:

  • Customizable Schedules: Define scaling schedules with specific days of the week and time ranges. This allows you to precisely align resource allocation with development and testing schedules, including considerations for different timezones.
  • Timezone Support: Specify timezones for your schedules, ensuring accurate scaling regardless of your team’s location or the cluster’s geographic location.
  • Granular Control: Manage scaling policies at the namespace level or configure scaling for individual Deployments. This enables fine-grained control over resource allocation for different microservices or test suites.
  • Exclusion Rules: Exclude specific namespaces or Deployments from automatic scaling. This is important for critical services that must remain running 24/7, such as CI/CD runners or shared databases.
  • Forceful Scaling: Enforce a default scaling schedule in namespaces without explicitly defined policies. This provides a safety net to ensure that resources are scaled down during off-peak times, even if individual teams haven’t configured specific policies.
  • Easy Rollback: Because the operator stores the original replica count before scaling down, it’s easy to quickly scale Deployments back up when needed, either manually or by adjusting the schedule.
  • Global Configuration: Use the FinOpsOperatorConfig resource to define global settings for the operator, such as excluded namespaces and the scaling check interval.

Custom Resources: Defining Scaling Policies

The FinOps Scaling Operator uses Custom Resources (CRs) to allow users to define their scaling configurations:

FinOpsScalePolicy

The FinOpsScalePolicy CR is where you define the scaling schedules and target Deployments. Users create these policies in their namespaces to specify when and how their applications should scale.

Here’s a comprehensive example that demonstrates multiple features:

apiVersion: finops.devopsideas.com/v1alpha1
kind: FinOpsScalePolicy
metadata:
  name: team-x-scaling-policy
  namespace: team-x
spec:
  timezone: "America/Los_Angeles" # Pacific Time
  defaultSchedule:
    days: ["Mon", "Tue", "Wed", "Thu", "Fri"] # Weekdays
    startTime: "17:00" # 5 PM PT
    endTime: "09:00" # 9 AM PT
  deployments:
    - name: app-a
      minReplicas: 1 # Scale to 1 replica during off-hours
      optOut: false
    - name: app-b
      minReplicas: 0 # Scale to 0 replicas on weekends
      schedule:
        days: ["Sat", "Sun"]
        startTime: "00:00"
        endTime: "24:00"
      optOut: false
    - name: ci-cd-runner
      minReplicas: 2 # Maintain 2 replicas
      optOut: true # Exclude from scaling
    - name: nightly-job
      minReplicas: 0 # Scale to 0, but only late at night
      schedule:
        days: ["*"] # Every day
        startTime: "23:00"
        endTime: "05:00"
      optOut: false

Explanation:

  • spec.timezone: The timezone (“America/Los_Angeles”) is crucial for schedule accuracy, especially in distributed teams.
  • spec.defaultSchedule: A default schedule (weekdays, 5 PM to 9 AM PT) applies to Deployments unless they define their own.
    • days: [“Mon”, “Tue”, “Wed”, “Thu”, “Fri”] – Standard weekdays are specified. Use [“*”] for every day.
    • startTime / endTime: Times define the scaling window.
  • spec.deployments: A list of per-Deployment settings.
    • name: app-a: “app-a” uses the defaultSchedule.
      • minReplicas: 1: Scales down to 1 replica (but not lower) during off-hours.
      • optOut: false: This Deployment is subject to scaling. (Optional; defaults to false.)
    • name: app-b: “app-b” overrides the defaultSchedule.
      • minReplicas: 0: Scales down to zero replicas.
      • schedule: Defines a different schedule (weekends only).
      • optOut: false: This Deployment is subject to scaling.
    • name: ci-cd-runner: “ci-cd-runner” is excluded from scaling.
      • optOut: true: This prevents the operator from scaling this Deployment. Essential for critical services.
    • name: nightly-job: “nightly-job” has a specific schedule.
      • schedule: Scales down every day between 11 PM and 5 AM.
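
The example above exercises most of the policy’s features. A team that only needs the shared default window can write a much shorter policy. The following is a sketch using the same fields shown above; the names (team-y, api-server, frontend) and the Europe/London timezone are illustrative, and optOut is omitted because it defaults to false:

apiVersion: finops.devopsideas.com/v1alpha1
kind: FinOpsScalePolicy
metadata:
  name: team-y-scaling-policy # illustrative name
  namespace: team-y # illustrative namespace
spec:
  timezone: "Europe/London" # schedule is evaluated in the team's local time
  defaultSchedule: # off-hours window during which minReplicas applies
    days: ["Mon", "Tue", "Wed", "Thu", "Fri"]
    startTime: "18:00" # 6 PM local time
    endTime: "08:00" # 8 AM local time
  deployments:
    - name: api-server # illustrative Deployment name
      minReplicas: 0 # scale to zero during the off-hours window
    - name: frontend # illustrative Deployment name
      minReplicas: 0

Because the timezone is set per policy, teams in other regions can apply equivalent policies in their own namespaces with their own local windows, supporting the staggered usage patterns described earlier.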

FinOpsOperatorConfig

The FinOpsOperatorConfig CR provides global configuration options for the operator and is usually managed by cluster administrators. Only one instance of FinOpsOperatorConfig can exist in a cluster, and it must be created in the same namespace where the controller runs.

Example:

apiVersion: finops.devopsideas.com/v1alpha1
kind: FinOpsOperatorConfig
metadata:
  name: global-config
  namespace: scaling-operator-system
spec:
  excludedNamespaces:
    - kube-system
    - kube-public
  maxParallelOperations: 5
  checkInterval: "5m"
  forceScaleDown: true
  forceScaleDownSchedule:
    days: ["*"]
    startTime: "18:00"
    endTime: "08:00"
  forceScaleDownTimezone: "America/New_York"

In this example:

  • The operator is configured to exclude the kube-system and kube-public namespaces from scaling. It is crucial to carefully consider which namespaces to exclude, as forceScaleDown can affect any namespace not explicitly excluded.
  • The maxParallelOperations is set to 5, limiting the number of concurrent scaling operations to avoid overwhelming the Kubernetes API server.
  • The checkInterval is set to “5m”, meaning the operator will check for scaling needs every 5 minutes.
  • forceScaleDown is set to true, and a forceScaleDownSchedule is provided. This is a powerful feature that, if used incorrectly, could disrupt critical applications. When forceScaleDown is enabled, any namespace without a FinOpsScalePolicy will be scaled down according to the specified schedule (in this case, every day between 6 PM and 8 AM Eastern Time). Use this feature with extreme caution and ensure that all critical namespaces and deployments are either excluded or have their own FinOpsScalePolicy to prevent unintended downtime; a sketch of such a protective policy follows this list.
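
For instance, a namespace hosting shared services can publish a minimal policy whose sole purpose is to opt its Deployments out of scaling, so neither the policy’s own schedule nor the force-scale-down fallback ever touches them. This is only a sketch based on the fields shown earlier; the names (shared-services, shared-postgres) are illustrative, and whether a defaultSchedule is still required when every Deployment opts out depends on the operator’s CRD validation, so it is included here for safety:

apiVersion: finops.devopsideas.com/v1alpha1
kind: FinOpsScalePolicy
metadata:
  name: shared-services-protection # illustrative name
  namespace: shared-services # illustrative namespace hosting critical workloads
spec:
  timezone: "America/New_York"
  defaultSchedule: # included for completeness; opted-out Deployments ignore it
    days: ["Mon", "Tue", "Wed", "Thu", "Fri"]
    startTime: "18:00"
    endTime: "08:00"
  deployments:
    - name: shared-postgres # illustrative critical Deployment
      minReplicas: 1
      optOut: true # never scaled by the operator, regardless of schedules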

Installation

You can install the FinOps Scaling Operator either with make commands (if you’re building from source) or, preferably, via a Helm chart. Refer to the Installation section of the README for details.

Benefits of Automated Scaling for Development and Testing

Automating Kubernetes scaling in development and testing environments with the FinOps Scaling Operator offers several key benefits:

  • Significant Cost Reduction: Scaling down idle development and testing deployments during off-hours dramatically reduces resource consumption, leading to substantial cost savings in cloud environments.
  • Optimized Resource Allocation: Ensures that resources are available when developers and testers need them, improving productivity, while avoiding unnecessary allocation when they are not.
  • Faster Environment Availability: The operator’s ability to quickly scale deployments back up minimizes wait times when developers or testers start their workday.  
  • Support for Distributed Teams: Timezone-aware scheduling enables efficient resource sharing across globally distributed teams, maximizing resource utilization around the clock.  
  • Simplified Environment Management: Automates the tedious task of manually scaling deployments, freeing up DevOps and platform engineering teams to focus on higher-value activities.
  • Improved Budget Predictability: By aligning resource consumption with predictable usage patterns, the operator helps to improve budget forecasting and control in development and testing projects.

The FinOps Scaling Operator is a powerful tool for organizations seeking to optimize their Kubernetes resource consumption in development, testing, and other non-production environments. By automating Deployment scaling, it enables significant cost reductions, improved resource allocation, and increased team productivity.