Configuring Cluster-Wide KubeArchive Policies

This document explains how to configure cluster-wide policies using ClusterKubeArchiveConfig. These policies apply to all namespaces that have a KubeArchiveConfig resource.

Prerequisites

KubeArchive installed and running in a cluster (Installation)
Cluster administrator permissions
Familiarity with basic KubeArchiveConfig concepts (Configuring KubeArchive)

The ClusterKubeArchiveConfig Resource

To configure cluster-wide KubeArchive policies, create a ClusterKubeArchiveConfig custom resource. Only one ClusterKubeArchiveConfig is allowed per cluster, and it must be named kubearchive. A ClusterKubeArchiveConfig has this general form:

---
apiVersion: kubearchive.org/v1
kind: ClusterKubeArchiveConfig
metadata:
  name: kubearchive
spec:
  resources: [...] (1)

(1) `spec.resources` is a list of elements, each defining cluster-wide rules for a specific kind.

`selector`: Selecting Resources

The selector key within a ClusterKubeArchiveConfig defines which resources a rule applies to. It requires two keys: kind and apiVersion. Each entry in spec.resources requires a selector:

---
apiVersion: kubearchive.org/v1
kind: ClusterKubeArchiveConfig
metadata:
  name: kubearchive
spec:
  resources:
    - selector:
        apiVersion: v1
        kind: Pod
      ...
    - selector:
        apiVersion: batch/v1
        kind: Job
      ...
    - selector:
        apiVersion: apps/v1
        kind: Deployment
      ...

Each entry in spec.resources supports archiveWhen, deleteWhen, archiveOnDelete, and keepLastWhen. The archiveWhen, deleteWhen, and archiveOnDelete keys each accept a string containing an expression in the CEL language; keepLastWhen accepts a list of rules, each with its own CEL expression. When a resource matched by selector changes or gets deleted, KubeArchive evaluates the expressions. Each expression must evaluate to either true or false.
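As a minimal sketch, the entry below combines two of these keys for a single kind. It assumes that several keys may be set on the same spec.resources entry, as the namespace-level KubeArchiveConfig example later in this document does; the expressions themselves are taken from other examples on this page.

```yaml
---
apiVersion: kubearchive.org/v1
kind: ClusterKubeArchiveConfig
metadata:
  name: kubearchive
spec:
  resources:
    - selector:
        apiVersion: batch/v1
        kind: Job
      # Archive a job as soon as it reports a start time.
      archiveWhen: has(status.startTime)
      # Delete (and archive) a job once it reports a completion time.
      deleteWhen: has(status.completionTime)
```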

`archiveWhen`: Archiving Resources

The archiveWhen key defines when KubeArchive should archive resources cluster-wide. The following example configures KubeArchive to archive all pods when they match the status.phase == "Succeeded" condition:

---
apiVersion: kubearchive.org/v1
kind: ClusterKubeArchiveConfig
metadata:
  name: kubearchive
spec:
  resources:
    - selector:
        apiVersion: v1
        kind: Pod
      archiveWhen: status.phase == "Succeeded"

archiveWhen: "true" is also a valid expression that configures KubeArchive to archive the resource every time it is updated.

archiveWhen is processed by both the controller and vacuums. Resources processed by the controller are handled immediately after Kubernetes sends the event. Resources processed by vacuums are handled based on how often the vacuum operations are scheduled to run.

This policy will apply to all namespaces that have a KubeArchiveConfig resource, in addition to any namespace-specific archiveWhen rules.
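Because archiveWhen is a CEL expression, conditions can be combined with the standard CEL operators. The following sketch is illustrative only: it assumes you want to archive pods that have finished in either terminal phase, and it wraps the expression in single quotes so that YAML does not interpret the brackets and double quotes.

```yaml
---
apiVersion: kubearchive.org/v1
kind: ClusterKubeArchiveConfig
metadata:
  name: kubearchive
spec:
  resources:
    - selector:
        apiVersion: v1
        kind: Pod
      # Archive pods that finished, whether they succeeded or failed.
      archiveWhen: 'has(status.phase) && status.phase in ["Succeeded", "Failed"]'
```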

`deleteWhen`: Deleting Resources

The deleteWhen key enables automatic deletion of resources from the Kubernetes cluster based on a CEL expression. This helps keep the cluster free of resources that are no longer needed.

KubeArchive archives resources deleted using deleteWhen.

deleteWhen is processed by the controller only. Resources are handled immediately after Kubernetes sends the event.

While deleteWhen is not used in vacuum operations, similar functionality can be achieved using keepLastWhen with a count of 0.

For example, these two approaches have similar effects but different timing:

Using deleteWhen (processed immediately by controller):

deleteWhen: has(status.completionTime)

Using keepLastWhen (processed during vacuum operations):

keepLastWhen:
  - name: delete-completed
    when: "has(status.completionTime)"
    count: 0

The deleteWhen approach deletes resources immediately when they match the condition, while the keepLastWhen approach with count 0 deletes matching resources only when vacuums run.

The following ClusterKubeArchiveConfig configures KubeArchive to delete (and archive) all completed jobs across the cluster:

---
apiVersion: kubearchive.org/v1
kind: ClusterKubeArchiveConfig
metadata:
  name: kubearchive
spec:
  resources:
    - selector:
        apiVersion: batch/v1
        kind: Job
      deleteWhen: has(status.completionTime)

`keepLastWhen`: Keeping the Last N Resources

The keepLastWhen key provides a way to automatically retain only the most recent N resources that match specific criteria, while deleting older ones. This is particularly useful for managing resources like completed jobs, where you want to keep only the latest few for reference while cleaning up older ones.

KubeArchive archives resources deleted using keepLastWhen.

keepLastWhen is processed by vacuums only. Resources are handled based on how often the vacuum operations are scheduled to run, not immediately after Kubernetes sends events.

Unlike the other fields, keepLastWhen consists of an array of rules, each with:

name - A unique identifier for the rule that namespaces can reference for overrides
when - A CEL expression defining which resources to consider
count - Number of resources to keep (must be greater than or equal to 0; when 0, all matching resources are deleted)
sortBy - (optional) Field to sort by (defaults to metadata.creationTimestamp)

Basic Example

The following ClusterKubeArchiveConfig configures KubeArchive to keep only the last 10 completed jobs cluster-wide:

---
apiVersion: kubearchive.org/v1
kind: ClusterKubeArchiveConfig
metadata:
  name: kubearchive
spec:
  resources:
    - selector:
        apiVersion: batch/v1
        kind: Job
      keepLastWhen:
        - name: completed-jobs
          when: "has(status.completionTime)"
          count: 10

Custom Sort Field

You can sort by fields other than creation timestamp, such as the resource name:

keepLastWhen:
  - name: completed-jobs-by-name
    when: "has(status.completionTime)"
    count: 5
    sortBy: "metadata.name"

Multiple Rules

Multiple rules can be specified to handle different types of resources with different retention policies:

keepLastWhen:
  - name: daily-backups
    when: "metadata.name.startsWith('daily-backup-')"
    count: 7
  - name: weekly-backups
    when: "metadata.name.startsWith('weekly-backup-')"
    count: 4
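For context, the fragments above can be placed in a complete resource. The sketch below assumes the backups run as Jobs whose names carry the hypothetical daily-backup- and weekly-backup- prefixes, and it additionally requires a completion time so that only finished jobs are considered for cleanup.

```yaml
---
apiVersion: kubearchive.org/v1
kind: ClusterKubeArchiveConfig
metadata:
  name: kubearchive
spec:
  resources:
    - selector:
        apiVersion: batch/v1
        kind: Job
      keepLastWhen:
        # Keep the 7 most recent completed daily backup jobs.
        - name: daily-backups
          when: "has(status.completionTime) && metadata.name.startsWith('daily-backup-')"
          count: 7
        # Keep the 4 most recent completed weekly backup jobs.
        - name: weekly-backups
          when: "has(status.completionTime) && metadata.name.startsWith('weekly-backup-')"
          count: 4
          # Optional; this is the documented default, written out for clarity.
          sortBy: "metadata.creationTimestamp"
```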

Namespace Overrides

Namespaces can override cluster-wide keepLastWhen rules to apply stricter retention policies. The name field in each cluster rule allows namespaces to reference and override specific rules.

See the KubeArchiveConfig documentation for detailed information and examples on using overrides.

`archiveOnDelete`: Archiving on Deletion From the Cluster

The archiveOnDelete key enables KubeArchive to work with other applications that clean up resources. This allows you to use specialized deletion tools while still archiving resources with KubeArchive.

archiveOnDelete is processed by the controller only. Resources are handled immediately after Kubernetes sends the event.

The following ClusterKubeArchiveConfig configures KubeArchive to archive pods when they are deleted from the cluster, but only if they match the condition status.phase == "Succeeded". Failed pods that get deleted are not archived:

---
apiVersion: kubearchive.org/v1
kind: ClusterKubeArchiveConfig
metadata:
  name: kubearchive
spec:
  resources:
    - selector:
        apiVersion: v1
        kind: Pod
      archiveOnDelete: status.phase == "Succeeded"

Interaction With Namespace Filters

Global filters configured in ClusterKubeArchiveConfig only take effect in namespaces that contain a KubeArchiveConfig custom resource. In each such namespace, the global and namespace filters for a given kind are combined with OR logic: an action is taken if either the global expression or the namespace expression evaluates to true.

For example, if you have this ClusterKubeArchiveConfig:

---
apiVersion: kubearchive.org/v1
kind: ClusterKubeArchiveConfig
metadata:
  name: kubearchive
spec:
  resources:
    - selector:
        apiVersion: batch/v1
        kind: Job
      deleteWhen: has(status.failed)
    - selector:
        apiVersion: v1
        kind: Pod
      archiveOnDelete: "true"

And the following KubeArchiveConfig in the namespace "my-team":

---
apiVersion: kubearchive.org/v1
kind: KubeArchiveConfig
metadata:
  name: kubearchive
  namespace: my-team
spec:
  resources:
    - selector:
        apiVersion: batch/v1
        kind: Job
      archiveWhen: has(status.startTime)
      deleteWhen: has(status.completionTime)

In the "my-team" namespace, KubeArchive will:

Archive jobs when they have status.startTime (namespace rule)
Delete jobs when they have status.completionTime OR status.failed (namespace rule OR cluster rule)
Archive pods when they are deleted (cluster rule)
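The combined behavior listed above can be written out as a single set of rules. This is a conceptual view only, assuming the OR combination described in this section: KubeArchive does not create such a resource, and the fragment is not meant to be applied to a cluster.

```yaml
# Rules that effectively apply in the "my-team" namespace (illustration only).
resources:
  - selector:
      apiVersion: batch/v1
      kind: Job
    # Namespace rule.
    archiveWhen: has(status.startTime)
    # Namespace rule OR cluster rule.
    deleteWhen: has(status.completionTime) || has(status.failed)
  - selector:
      apiVersion: v1
      kind: Pod
    # Cluster rule.
    archiveOnDelete: "true"
```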

Next Steps

Learn more about namespace-specific configuration in Configuring KubeArchive
Explore cel.dev to learn more about the CEL expression language

All ClusterKubeArchiveConfig and KubeArchiveConfig resources must be named "kubearchive".
