Logging Integrations
Overview
KubeArchive supports logging, but it is not a logging system itself and does not implement logging. Instead, KubeArchive integrates with existing logging systems and provides URLs for retrieving log files from the logging system for a specific Kubernetes resource.
It is important to note that logs are tied to Pods. When a user requests the logs
for a Tekton PipelineRun, what they expect to get back are the logs attached to the
Pods that were part of the PipelineRun. Similar cases exist for requesting logs for
Jobs and CronJobs. KubeArchive handles this seamlessly for the user.
KubeArchiveConfig Configuration
KubeArchive retrieves log URLs using the owner references field of a resource.
When logs for a resource are requested, a query is made to find all the resources
that have that initial resource as an owner. Then each resource returned is
processed similarly, eventually building up a list of Pods and from those a
list of log file links. This generic approach works for any resource.
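The traversal described above can be sketched as follows. This is an illustrative outline, not KubeArchive's actual code: the resource shapes and names are hypothetical, and a real implementation would query the archive database rather than scan an in-memory list.

```python
# Illustrative sketch of owner-reference traversal: starting from one
# resource, repeatedly find resources that list it as an owner, until
# Pods are reached. Data and names below are hypothetical.

def collect_pods(root_uid, resources):
    """Walk owner references and return the names of Pods owned
    (directly or transitively) by the resource with uid `root_uid`."""
    pods = []
    frontier = [root_uid]
    while frontier:
        uid = frontier.pop()
        for res in resources:
            owner_uids = {o["uid"] for o in res["metadata"].get("ownerReferences", [])}
            if uid in owner_uids:
                if res["kind"] == "Pod":
                    pods.append(res["metadata"]["name"])
                else:
                    frontier.append(res["metadata"]["uid"])
    return pods

# A PipelineRun (uid "pr-1") owns a TaskRun, which owns a Pod:
archive = [
    {"kind": "TaskRun", "metadata": {"uid": "tr-1",
        "ownerReferences": [{"uid": "pr-1"}]}},
    {"kind": "Pod", "metadata": {"name": "step-pod", "uid": "pod-1",
        "ownerReferences": [{"uid": "tr-1"}]}},
]
print(collect_pods("pr-1", archive))  # ['step-pod']
```

Because the walk only relies on owner references, the same logic serves PipelineRuns, Jobs, CronJobs, or any other resource that ultimately owns Pods.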
A KubeArchiveConfig must be configured to support this: the initial resource and
any dependent resources, all the way down to and including the Pods, must all
be archived.
---
apiVersion: kubearchive.org/v1
kind: KubeArchiveConfig
metadata:
  name: kubearchive
  namespace: test
spec:
  resources:
    - deleteWhen: has(status.completionTime)
      selector:
        apiVersion: "batch/v1"
        kind: Job
    - archiveOnDelete: true
      selector:
        apiVersion: "v1"
        kind: Pod
In this example, the Job is configured to be archived and deleted when
the status contains a "completionTime" key. When that deletion happens,
Kubernetes will in turn delete the associated Pod. Since we have
configured archiveOnDelete for Pods to be true, KubeArchive will archive
the Pod itself and generate the URLs for all the associated logs.
---
apiVersion: kubearchive.org/v1
kind: KubeArchiveConfig
metadata:
  name: kubearchive
  namespace: test
spec:
  resources:
    - selector:
        apiVersion: tekton.dev/v1
        kind: PipelineRun
      deleteWhen: has(status.completionTime)
    - selector:
        apiVersion: tekton.dev/v1
        kind: TaskRun
      archiveOnDelete: true
    - selector:
        apiVersion: v1
        kind: Pod
      archiveOnDelete: has(body.metadata.labels["tekton.dev/pipeline"])
In this example the following happens:
- PipelineRuns are archived when they complete.
- TaskRuns are archived when they are deleted.
- Pods are archived when they are deleted and are also part of a Tekton Pipeline.
Configuration
|
The logging configuration is read once at startup. Changes to the logging
configuration take effect only after the components that read it are restarted.
|
The logging configuration is split into three Kubernetes resources:
- kubearchive-logging-writer ConfigMap — used by the Sink to generate log URLs and query metadata at archival time.
- kubearchive-logging-reader ConfigMap — used by the API Server to know how to query each logging backend.
- kubearchive-logging Secret — used by the API Server to authenticate requests to the logging backends.
The API Server mounts the reader ConfigMap and the secret together as a projected volume. The Sink only mounts the writer ConfigMap.
Writer ConfigMap
The kubearchive-logging-writer ConfigMap is used by the Sink. It contains entries
for generating log URLs and query metadata when resources are archived. The key
LOG_URL is required and specifies the base URL of the logging backend. Other
keys define template variables whose values are extracted from the resource using
CEL expressions.
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubearchive-logging-writer
  namespace: kubearchive
data:
  POD_ID: "cel:metadata.uid" (1)
  QUERY: "kubernetes.pod_id:{POD_ID} AND kubernetes.container_name:{CONTAINER_NAME}" (2)
  LOG_URL: "https://my-logging-backend.example.com:9200" (3)
| 1 | Values prefixed with cel: are CEL expressions evaluated against the resource body.
{POD_ID} and {CONTAINER_NAME} are substituted at URL generation time. |
| 2 | QUERY is stored alongside the log URL and made available to the reader at query time.
The {CONTAINER_NAME} variable is always provided by KubeArchive. |
| 3 | The base URL of the logging backend. This URL is used as a key to match a provider in the reader ConfigMap and the secret headers. |
Additional supported keys include NAMESPACE, START, and END, whose values
are also stored and made available for variable substitution at query time.
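The writer entries above can be thought of as a two-pass resolution: first the `cel:` values are evaluated against the resource body, then `{VAR}` placeholders in the remaining values are substituted. The sketch below illustrates that idea only; it is not KubeArchive's implementation, and a plain dotted-path lookup stands in for real CEL evaluation.

```python
# Illustrative sketch (not KubeArchive's actual code) of resolving the
# writer ConfigMap at archival time. "cel:" values are evaluated against
# the resource body (dict lookup stands in for CEL here), then {VAR}
# placeholders are substituted from the resolved values.

import re

def resolve_entries(config, resource, extra):
    values = dict(extra)  # e.g. CONTAINER_NAME, always provided by KubeArchive
    for key, val in config.items():
        if val.startswith("cel:"):
            # Stand-in for CEL: walk the dotted path into the resource body.
            obj = resource
            for part in val[len("cel:"):].split("."):
                obj = obj[part]
            values[key] = obj
        else:
            values[key] = val
    # Substitute {VAR} placeholders; unknown placeholders are left intact.
    return {k: re.sub(r"\{(\w+)\}",
                      lambda m: str(values.get(m.group(1), m.group(0))), v)
            for k, v in values.items()}

config = {
    "POD_ID": "cel:metadata.uid",
    "QUERY": "kubernetes.pod_id:{POD_ID} AND kubernetes.container_name:{CONTAINER_NAME}",
    "LOG_URL": "https://my-logging-backend.example.com:9200",
}
resource = {"metadata": {"uid": "abc-123"}}
resolved = resolve_entries(config, resource, {"CONTAINER_NAME": "step-main"})
print(resolved["QUERY"])
# kubernetes.pod_id:abc-123 AND kubernetes.container_name:step-main
```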
Reader ConfigMap
The kubearchive-logging-reader ConfigMap is used by the API Server. It contains
a single key LOG_PROVIDERS with a YAML value that defines how to query each
logging backend. The top-level keys in the YAML are the base URLs of the logging
backends (matching the LOG_URL from the writer ConfigMap). Each backend defines
a tail and/or full endpoint configuration.
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubearchive-logging-reader
  namespace: kubearchive
data:
  LOG_PROVIDERS: |
    https://my-logging-backend.example.com:9200: (1)
      tail: (2)
        reverse: true (3)
        path: /index/_search (4)
        method: GET (5)
        params: (6)
          q: ${QUERY}
          size: ${TAIL_LINES}
          sort: "@timestamp:desc"
        json-path: "$.hits.hits[*]._source.message" (7)
      full: (8)
        reverse: false
        path: /index/_search
        method: GET
        params:
          q: ${QUERY}
          size: 10000
          sort: "@timestamp:asc"
        json-path: "$.hits.hits[*]._source.message"
| 1 | The base URL must match the LOG_URL in the writer ConfigMap. |
| 2 | tail defines the endpoint used when the tailLines query parameter is provided. |
| 3 | When reverse is true, the API Server buffers all results and reverses
them before returning. Buffering also occurs when the tailLines query parameter
is set, so that only the last N lines are returned. When neither condition applies,
results are streamed directly to the client as they are read. |
| 4 | The path appended to the base URL. |
| 5 | HTTP method. Supports GET (with query parameters) and POST (with JSON body). |
| 6 | Query parameters for GET requests. For POST requests, use body instead.
Template variables like ${QUERY}, ${TAIL_LINES}, ${START}, ${END},
and ${NAMESPACE} are substituted at request time. |
| 7 | Optional JSONPath expression applied to the response body to extract log lines. |
| 8 | full defines the endpoint used when no tailLines parameter is provided. |
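The buffering behavior described in callout 3 can be sketched as a small generator. This is an illustrative model only, not the API Server's actual implementation; the ordering of the reverse and tail steps here is an assumption.

```python
# Illustrative sketch of the tail/reverse behaviour: when `reverse` is
# set or `tailLines` is given, all lines are buffered and post-processed;
# otherwise they are streamed through unchanged.

def serve_logs(lines, reverse=False, tail_lines=None):
    if not reverse and tail_lines is None:
        # Streaming case: yield lines as they are read from the backend.
        yield from lines
        return
    buffered = list(lines)          # buffer the whole result set
    if reverse:
        buffered.reverse()          # flip backend order (e.g. desc -> asc)
    if tail_lines is not None:
        buffered = buffered[-tail_lines:]  # keep only the last N lines
    yield from buffered

lines = ["l1", "l2", "l3", "l4"]
print(list(serve_logs(lines)))                # ['l1', 'l2', 'l3', 'l4']
print(list(serve_logs(lines, reverse=True)))  # ['l4', 'l3', 'l2', 'l1']
print(list(serve_logs(lines, reverse=True, tail_lines=2)))
```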
Secret
Authentication headers for each logging backend are stored in a kubearchive-logging
Secret under a single HEADERS key. The value is a YAML document that maps each
backend base URL to its required HTTP headers.
---
apiVersion: v1
kind: Secret
metadata:
  name: kubearchive-logging
  namespace: kubearchive
type: Opaque
stringData:
  HEADERS: | (1)
    https://my-logging-backend.example.com:9200: (2)
      Authorization: "Basic YWRtaW46cGFzc3dvcmQ=" (3)
| 1 | The HEADERS key contains a YAML document with per-backend headers. |
| 2 | The base URL must match the LOG_URL in the writer ConfigMap. |
| 3 | HTTP headers sent with every request to this backend. |
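Conceptually, the API Server matches the request's base URL against this mapping and attaches the corresponding headers to each backend query. The sketch below is hypothetical: it assumes the HEADERS YAML has already been parsed into a dict, and the `build_request` helper is invented for illustration.

```python
# Hypothetical sketch of applying per-backend headers at query time.
# The parsed HEADERS document maps base URL -> header dict; the real
# API Server parses this YAML from the mounted Secret.

headers_by_backend = {
    "https://my-logging-backend.example.com:9200": {
        "Authorization": "Basic YWRtaW46cGFzc3dvcmQ=",
    },
}

def build_request(log_url, path, params):
    """Assemble a GET request description for one logging backend."""
    headers = headers_by_backend.get(log_url, {})
    query = "&".join(f"{k}={v}" for k, v in params.items())
    return {"url": f"{log_url}{path}?{query}", "headers": headers}

req = build_request(
    "https://my-logging-backend.example.com:9200",
    "/index/_search",
    {"q": "kubernetes.pod_id:abc-123", "size": "100"},
)
print(req["headers"]["Authorization"])  # Basic YWRtaW46cGFzc3dvcmQ=
```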
Supported Logging Systems
KubeArchive currently integrates with Elasticsearch, Splunk, and Loki.
|
Because the reader ConfigMap and the secret are keyed by base URL, multiple
logging backends can coexist at the same time. This is useful when migrating
from one backend to another — both can be configured in parallel until the
migration is complete.
|
Elasticsearch
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubearchive-logging-writer
data:
  POD_ID: "cel:metadata.uid"
  QUERY: "kubernetes.pod_id:{POD_ID} AND kubernetes.container_name:{CONTAINER_NAME}"
  LOG_URL: "https://kubearchive-es-http.elastic-system.svc.cluster.local:9200"
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubearchive-logging-reader
data:
  LOG_PROVIDERS: |
    https://kubearchive-es-http.elastic-system.svc.cluster.local:9200:
      tail:
        reverse: true
        path: /fluentd/_search
        method: GET
        params:
          q: ${QUERY}
          _source_includes: message
          sort: "@timestamp:desc"
          size: ${TAIL_LINES}
        json-path: "$.hits.hits[*]._source.message"
      full:
        reverse: false
        path: /fluentd/_search
        method: GET
        params:
          q: ${QUERY}
          _source_includes: message
          sort: "@timestamp:asc"
          size: 10000
        json-path: "$.hits.hits[*]._source.message"
Splunk
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubearchive-logging-writer
data:
  POD_ID: "cel:metadata.uid"
  QUERY: 'search * | spath "kubernetes.pod_id" | search "kubernetes.pod_id"="{POD_ID}" | spath "kubernetes.container_name" | search "kubernetes.container_name"="{CONTAINER_NAME}" | sort time | table "message"'
  LOG_URL: "https://splunk-single-standalone-service.splunk-operator.svc.cluster.local:8089"
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubearchive-logging-reader
data:
  LOG_PROVIDERS: |
    https://splunk-single-standalone-service.splunk-operator.svc.cluster.local:8089:
      tail:
        reverse: false
        path: /services/search/jobs/export
        method: GET
        params:
          search: ${QUERY} | head ${TAIL_LINES}
          output_mode: json
        json-path: "$.result.message"
      full:
        reverse: false
        path: /services/search/jobs/export
        method: GET
        params:
          search: ${QUERY}
          output_mode: json
        json-path: "$.result.message"
---
apiVersion: v1
kind: Secret
metadata:
  name: kubearchive-logging
stringData:
  HEADERS: |
    https://splunk-single-standalone-service.splunk-operator.svc.cluster.local:8089:
      Authorization: "Basic YWRtaW46cGFzc3dvcmQ="
Loki
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubearchive-logging-writer
data:
  NAMESPACE: "cel:metadata.namespace"
  POD_ID: "cel:metadata.uid"
  START: "cel:status.?startTime == optional.none() ? int(now()-duration('1h'))*1000000000: status.startTime"
  QUERY: '{stream="{NAMESPACE}"} | pod_id = `{POD_ID}` | container = `{CONTAINER_NAME}`'
  LOG_URL: "http://loki-gateway.grafana-loki.svc.cluster.local:80"
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubearchive-logging-reader
data:
  LOG_PROVIDERS: |
    http://loki-gateway.grafana-loki.svc.cluster.local:80:
      tail:
        reverse: true
        path: /loki/api/v1/query_range
        method: GET
        params:
          query: ${QUERY}
          start: ${START}
          limit: ${TAIL_LINES}
          direction: backward
        json-path: "$.data.result[*].values[*][1]"
      full:
        reverse: false
        path: /loki/api/v1/query_range
        method: GET
        params:
          query: ${QUERY}
          start: ${START}
          limit: 10000
          direction: forward
        json-path: "$.data.result[*].values[*][1]"
---
apiVersion: v1
kind: Secret
metadata:
  name: kubearchive-logging
stringData:
  HEADERS: |
    http://loki-gateway.grafana-loki.svc.cluster.local:80:
      X-Scope-OrgID: "kubearchive"