Monitoring and Observability

SAPL Node exposes health, metrics, and decision data through standard Spring Boot Actuator and Micrometer interfaces. There is no proprietary monitoring agent. Use your existing observability stack (Prometheus, Grafana, Loki, ELK, or any tool that consumes these standard interfaces).

PDP Health Indicator

The PDP reports one of three operational states:

| State | Meaning | Health Status |
|-------|---------|---------------|
| LOADED | Policies compiled and active. The PDP is fully operational. | UP |
| STALE | A hot reload failed, but the PDP is still serving decisions from the previous valid configuration. | UP (with warning) |
| ERROR | No valid configuration loaded. The PDP cannot make valid authorization decisions and serves INDETERMINATE. | DOWN |

In multi-tenant deployments, the health indicator aggregates the state of all PDP instances. If all instances are LOADED, health is UP. If any instance is STALE while none are ERROR, health is UP with a warning detail. If any instance is ERROR, health is DOWN.
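The aggregation rule can be sketched in a few lines of Python. This illustrates the rule as described above and is not the node's actual implementation. The function name `aggregate_health` is hypothetical, and the returned strings mirror the Health Status column of the state table.

```python
VALID_STATES = {"LOADED", "STALE", "ERROR"}


def aggregate_health(states):
    """Aggregate per-PDP states into one overall health status string."""
    states = list(states)
    unknown = set(states) - VALID_STATES
    if unknown:
        raise ValueError(f"unknown PDP state(s): {sorted(unknown)}")
    if "ERROR" in states:
        # A single PDP without a valid configuration takes health DOWN.
        return "DOWN"
    if "STALE" in states:
        # Still serving decisions, but from an outdated configuration.
        return "UP (with warning)"
    return "UP"


print(aggregate_health(["LOADED", "STALE"]))  # UP (with warning)
```

ERROR takes precedence over STALE, so a deployment with one failed tenant reports DOWN even if every other tenant is healthy.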

The health endpoint returns detail fields for each PDP instance:

| Field | Description |
|-------|-------------|
| state | Current operational state (LOADED, STALE, or ERROR). |
| configurationId | Identifier of the active configuration. Absent in ERROR state. |
| combiningAlgorithm | The combining algorithm in use, with votingMode, defaultDecision, and errorHandling fields. Absent in ERROR state. |
| documentCount | Number of SAPL documents in the active configuration. |
| lastSuccessfulLoad | Timestamp of the last successful configuration load. |
| lastFailedLoad | Timestamp of the last failed configuration load. Absent if no failure occurred. |
| lastError | Error message from the last failed load. Absent if no failure occurred. |

Example health response with one loaded and one stale PDP:

{
  "status": "UP",
  "components": {
    "pdp": {
      "status": "UP",
      "details": {
        "warning": "One or more PDPs are serving stale policies",
        "pdps": {
          "default": {
            "state": "LOADED",
            "configurationId": "v42",
            "combiningAlgorithm": {
              "votingMode": "PRIORITY_PERMIT",
              "defaultDecision": "DENY",
              "errorHandling": "ABSTAIN"
            },
            "documentCount": 12,
            "lastSuccessfulLoad": "2026-03-10T08:15:30Z"
          },
          "staging": {
            "state": "STALE",
            "configurationId": "v5",
            "combiningAlgorithm": {
              "votingMode": "PRIORITY_DENY",
              "defaultDecision": "DENY",
              "errorHandling": "PROPAGATE"
            },
            "documentCount": 3,
            "lastSuccessfulLoad": "2026-03-10T07:00:00Z",
            "lastFailedLoad": "2026-03-10T08:10:00Z",
            "lastError": "Parse error in staging-policy.sapl at line 5"
          }
        }
      }
    }
  }
}

Detail fields are only visible to authenticated users. The default configuration uses show-details: when-authorized. See Security for securing actuator endpoints.
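A monitoring script can use the detail fields to report which PDP instances are not fully operational. The following is a minimal sketch that assumes the response layout of the example above. The function name `degraded_pdps` is hypothetical. When details are hidden from an unauthenticated caller, the function returns an empty mapping.

```python
import json


def degraded_pdps(health):
    """Map PDP id -> lastError (or None) for every PDP that is not LOADED."""
    pdps = (
        health.get("components", {})
        .get("pdp", {})
        .get("details", {})
        .get("pdps", {})
    )
    return {
        pdp_id: info.get("lastError")
        for pdp_id, info in pdps.items()
        if info.get("state") != "LOADED"
    }


# Trimmed copy of the example health response shown above.
sample = json.loads("""
{
  "status": "UP",
  "components": {"pdp": {"status": "UP", "details": {"pdps": {
    "default": {"state": "LOADED", "configurationId": "v42"},
    "staging": {"state": "STALE", "configurationId": "v5",
                "lastError": "Parse error in staging-policy.sapl at line 5"}
  }}}}
}
""")

print(degraded_pdps(sample))
# {'staging': 'Parse error in staging-policy.sapl at line 5'}
```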

Actuator Endpoints

| Endpoint | Auth Required | Description |
|----------|---------------|-------------|
| /actuator/health | No | Overall health status. Returns UP or DOWN for load balancers. |
| /actuator/health/liveness | No | Kubernetes liveness probe. Reports whether the JVM process is alive. |
| /actuator/health/readiness | No | Kubernetes readiness probe. Reports whether the node is ready to accept traffic. |
| /actuator/info | Yes | PDP configuration details: configType, index, configPath, policiesPath. |
| /actuator/prometheus | Yes | Prometheus metrics scrape endpoint. |

Health endpoints are unauthenticated so Kubernetes probes work without credentials. The info and prometheus endpoints require authentication to prevent information disclosure.

Kubernetes Probes

Configure liveness, readiness, and startup probes for Kubernetes deployments:

apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
        - name: sapl
          image: ghcr.io/heutelbeck/sapl-node:4.0.0
          ports:
            - containerPort: 8443
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8443
            initialDelaySeconds: 15
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8443
            initialDelaySeconds: 10
            periodSeconds: 5
          startupProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8443
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 12

The startup probe gives the PDP time to compile policies before liveness checks begin. With the values above, the maximum startup time is 65 seconds (initialDelaySeconds + periodSeconds * failureThreshold = 5 + 5 * 12 = 65). Once the startup probe succeeds, Kubernetes switches to the liveness and readiness probes.
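To size the startup probe for a larger policy set, the same formula can be inverted. The sketch below only restates the arithmetic from the paragraph above. The function names and the 120 second budget are illustrative assumptions, not recommended values.

```python
import math


def max_startup_seconds(initial_delay, period, failure_threshold):
    """Startup budget per the formula above: delay + period * threshold."""
    return initial_delay + period * failure_threshold


def failure_threshold_for(budget_seconds, initial_delay, period):
    """Smallest failureThreshold whose budget covers budget_seconds."""
    return max(1, math.ceil((budget_seconds - initial_delay) / period))


print(max_startup_seconds(5, 5, 12))     # 65
print(failure_threshold_for(120, 5, 5))  # 23
```

With initialDelaySeconds and periodSeconds left at 5, a failureThreshold of 23 yields a budget of 5 + 5 * 23 = 120 seconds.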

The liveness probe detects a hung JVM process. The readiness probe gates traffic until the node is ready. Liveness is independent of the PDP compilation state: a node that is still loading policies is alive but not yet ready.

Decision Metrics

SAPL Node registers four custom Micrometer metrics, exposed in Prometheus format, that cover the golden signals for PDP decision traffic:

| Metric | Type | Tags | Description |
|--------|------|------|-------------|
| sapl.decisions | Counter | decision (PERMIT, DENY, INDETERMINATE, NOT_APPLICABLE) | Total authorization decisions by outcome. |
| sapl.decision.first.latency | Timer | none | Time from subscription to first decision. |
| sapl.subscriptions.active | Gauge | none | Currently active SSE streaming subscriptions. |
| sapl.subscription.duration | Timer | none | Total lifetime of completed subscriptions. |

These metrics cover both one-shot (decide-once) and streaming (decide) endpoints. Standard Spring Boot HTTP metrics (http.server.requests) are also available for request-level monitoring.
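The metric names above use Micrometer's dotted notation. In the Prometheus scrape output they appear under different series names. The sketch below assumes Micrometer's default Prometheus naming convention (dots become underscores, counters gain a `_total` suffix, timers are reported in seconds as `_count`, `_sum`, and `_max` series). The helper `prometheus_series` is hypothetical, so verify the names against the actual output of /actuator/prometheus before building dashboards on them.

```python
def prometheus_series(name, meter_type):
    """Return the series names expected for a Micrometer meter (assumed defaults)."""
    base = name.replace(".", "_")
    if meter_type == "counter":
        return [base + "_total"]
    if meter_type == "timer":
        return [f"{base}_seconds_{suffix}" for suffix in ("count", "sum", "max")]
    if meter_type == "gauge":
        return [base]
    raise ValueError(f"unsupported meter type: {meter_type}")


for meter, kind in [
    ("sapl.decisions", "counter"),
    ("sapl.decision.first.latency", "timer"),
    ("sapl.subscriptions.active", "gauge"),
    ("sapl.subscription.duration", "timer"),
]:
    print(meter, "->", ", ".join(prometheus_series(meter, kind)))
```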

Enable metrics in application.yml:

io.sapl.pdp.embedded:
  metrics-enabled: true

SAPL Node enables metrics by default. When embedding the PDP as a library, metrics-enabled defaults to false. When metrics are disabled, nothing is recorded and the runtime overhead is effectively zero: the property is read once at startup into a final boolean, which lets the JIT compiler eliminate the dead metric recording branches.

Configure Prometheus to scrape the metrics endpoint:

scrape_configs:
  - job_name: sapl
    metrics_path: /actuator/prometheus
    basic_auth:
      username: prometheus
      password: secret
    static_configs:
      - targets: ['sapl:8443']

The prometheus endpoint requires authentication. Use a dedicated service account with Basic Auth or API key credentials. See Security for credential generation.
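For ad hoc checks outside Prometheus, the endpoint can be read with any HTTP client that supports Basic Auth. The following is a rough sketch using only the Python standard library. The URL scheme, host, and credentials are placeholders taken from the scrape configuration above, and the series name `sapl_decisions_total` assumes Micrometer's default Prometheus naming for the sapl.decisions counter.

```python
import base64
import re
import urllib.request


def build_scrape_request(url, username, password):
    """Build a GET request carrying an HTTP Basic Auth header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


_SERIES = re.compile(r"^sapl_decisions_total\{([^}]*)\}\s+(\S+)")
_DECISION = re.compile(r'decision="([^"]*)"')


def decisions_by_outcome(exposition_text):
    """Sum sapl_decisions_total samples per value of the decision label."""
    totals = {}
    for line in exposition_text.splitlines():
        series = _SERIES.match(line)
        if not series:
            continue  # comment line or a different metric
        label = _DECISION.search(series.group(1))
        if label:
            outcome = label.group(1)
            totals[outcome] = totals.get(outcome, 0.0) + float(series.group(2))
    return totals


# Usage (placeholder URL and credentials; requires a reachable node):
# request = build_scrape_request("https://sapl:8443/actuator/prometheus", "prometheus", "secret")
# with urllib.request.urlopen(request) as response:
#     print(decisions_by_outcome(response.read().decode("utf-8")))
```

A rising share of INDETERMINATE outcomes in this breakdown is worth alerting on, since the PDP serves INDETERMINATE when it has no valid configuration.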

Info Endpoint

The /actuator/info endpoint returns PDP configuration under the sapl key:

{
  "sapl": {
    "configType": "BUNDLES",
    "index": "NAIVE",
    "configPath": "/policies",
    "policiesPath": "bundles"
  }
}

This endpoint requires authentication and is intended for operational dashboards and inventory systems.

Decision Logging

The PDP emits structured JSON log entries via the reporting interceptor. Each entry contains the authorization subscription (subject, action, resource, environment), the decision (PERMIT, DENY, INDETERMINATE, NOT_APPLICABLE), and any obligations or advice attached to the decision.

Enable subscription lifecycle logging with two properties:

io.sapl.pdp.embedded:
  print-subscription-events: true
  print-unsubscription-events: true

These log when a new authorization subscription starts and when it ends. This is useful for tracking active clients and debugging connection lifecycle issues.

Filtering, retention, and alerting on decision log entries are handled by your log infrastructure (Loki, ELK, Fluentd, CloudWatch). The PDP does not push logs to any external service.

Evaluation Diagnostics

Four properties control diagnostic output during policy evaluation:

| Property | Description |
|----------|-------------|
| print-trace | Logs the full JSON evaluation trace on each decision. Shows every evaluation step the PDP performed. |
| print-json-report | Logs a JSON evaluation report on each decision. More compact than the full trace. |
| print-text-report | Logs a human-readable text report on each decision. Shows which policies matched, how each was evaluated, and why the combining algorithm produced its result. |
| pretty-print-reports | Pretty-prints JSON in logged traces and reports. |

Enable all diagnostics during development:

io.sapl.pdp.embedded:
  print-trace: true
  print-json-report: true
  print-text-report: true
  pretty-print-reports: true

The text report is the most useful diagnostic tool for understanding why a particular decision was reached. It provides a step-by-step view of the evaluation process in a format designed for human consumption.

Disable all diagnostic properties in production. They produce significant log volume under load and are intended for development and staging environments only. See Configuration for the full property reference.
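A deployment pipeline can guard against diagnostics leaking into production with a simple check. The sketch below assumes the configuration has already been parsed into a dictionary that mirrors the application.yml layout shown above, with io.sapl.pdp.embedded as a single top-level key. The function name `enabled_diagnostics` is hypothetical, and other ways of writing the same properties (nested keys, environment variables) are not covered.

```python
DIAGNOSTIC_KEYS = (
    "print-trace",
    "print-json-report",
    "print-text-report",
    "pretty-print-reports",
)


def enabled_diagnostics(config):
    """Return the diagnostic properties that are switched on in config."""
    embedded = config.get("io.sapl.pdp.embedded", {})
    return [key for key in DIAGNOSTIC_KEYS if embedded.get(key) is True]


production_config = {
    "io.sapl.pdp.embedded": {
        "metrics-enabled": True,
        "print-text-report": True,  # left over from a debugging session
    }
}

leftovers = enabled_diagnostics(production_config)
if leftovers:
    print("Diagnostics still enabled:", ", ".join(leftovers))
```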