Skip to content

Detector configuration options and examples

This document covers configuration of a Detector resource.

When configuring a Detector resource there are a few required fields and optional fields

Minimal Spec

An example minimal spec with only required fields would be as follows:

apiVersion: monitoring.amitdebachar/v1alpha1
kind: Detector
metadata:
  name: detector-name
spec:

  ## (required) The detector image
  image: "amitde7896/anomaly-operator:latest-detector"

  ## (required) Promethus HTTP API endpoint 
  prom_url: "http://prometheus.monitoring.svc.cluster.local"

  ## (required) Evaluation interval in minutes in which the Detector
  ## will query Prometheus and analyze for anomalies
  ## Note: use float for under 1 min interval e.g. "0.01"  
  interval_mins: "15"

  ## (required) List of PromQL expressions which the detector 
  ## will query and evaluate using the configured Prometheus endpoint and interval   
  queries:

    ## (required) Query name which will appear in the logs and metrics 
    ## in order to correlate the detected anomaly to the configured query  
  - name: "sum_pods_running_anomaly"

    ## (required) Query PromQL expression
    query: 'sum(kube_pod_status_phase{phase=~"Running", pod=~"application-pod-.*"}) > 1'

    ## (required) parse_datetime formated text such as - "1m" "4h" "2d" "3w"
    ## Past period for which the detector will train and learn the trend
    ## Note: bigger value means longer query time
    train_window: "14d"

Custom Anomaly Spec

You can tune some parameters in order to avoid misclassifying an anomaly.

An example custom anomaly spec with optional fields would be as follows:

apiVersion: monitoring.amitdebachar/v1alpha1
kind: Detector
metadata:
  name: detector-name
spec:
  ...
  queries:
  - name: "sum_pods_running_anomaly"
    query: 'sum(kube_pod_status_phase{phase=~"Running", pod=~"application-pod-.*"}) > 1'
    train_window: "14d"

    ## (optional) parse_datetime formated text such as - "1m" "4h" "2d" "3w"
    ## Past period for which the detector will evaluate anomalies based on "train_window" period
    ## Default: "1h" 
    detection_window_hours: "2h"

    ## (optional) float value which tunes the Prophet parameter "changepoint_prior_scale"
    ## https://facebook.github.io/prophet/docs/trend_changepoints.html#adjusting-trend-flexibility
    ## This will affect how flexible or stiff the model is between change points (date points)
    ## Note: Lower is stiffer Higher is more flexible  
    ## Default: "0.05" 
    flexibility: "10"

    ## (optional) integer value which is used for PromQL request in the "step" paramter
    ## This will affect how many data points the detector is getting from Prometheus for a given timeframe
    ## Note: Higher means longer query time and more sensitive to anomalies
    ## Default: number of hours in "train_window"
    ## Disclaimer: setting too small value for a longer "train_window" can cause PromQL to fail
    resolution: 1400

    ## (optional) integer value represent precetange 
    ## which is used to increase/decrease the buffer the detector will append to the detection threshold 
    ## Note: Higher means less likely to find anomalies
    ## Default: 100
    buffer_pct: 150

Custom Pod Spec

You can override the pod template and spec used for the detector pod.

Warning

The override should be mindful of the default pod spec that is created by the operator.

It is reccommended to at least use the same default spec and add more options and not change the existing ones such as env section and volumeMounts/volumes.

Change at your own risk.

Warning

Both image and the pod_spec.spec.containers.*.image are required in this case but only the containers section will determine the image used in the created deployment

An example custom pod spec would be as follows:

apiVersion: monitoring.amitdebachar/v1alpha1
kind: Detector
metadata:
  name: detector-name
spec:
  image: "..."
  ...
  pod_spec:
    spec:
      containers:
        - env:
            - name: LOG_LEVEL
              value: INFO
          image: amitde7896/anomaly-operator:latest-detector
          imagePullPolicy: Always
          name: custom-detector-spec
          ports:
            - containerPort: 9090
              name: http
              protocol: TCP
          resources:
            limits:
              cpu: 1500m
              memory: 1G
            requests:
              cpu: 1500m
              memory: 1G
          volumeMounts:
            - mountPath: /app/config.yaml
              name: detector-name
              subPath: detector-name-conf.yaml
      serviceAccount: detector-name
      serviceAccountName: detector-name
      volumes:
        - configMap:
            defaultMode: 420
            name: detector-name
          name: detector-name
  ...
  queries:
  ...