In today’s complex, distributed systems, understanding the journey of requests through various services is essential for troubleshooting, performance tuning, and security auditing. Tracing provides visibility into the system’s behavior, offering insights that are critical for maintaining and improving application performance and reliability.

Benefits of Tracing

  • Enhanced Visibility: Tracing illuminates the path of requests across service boundaries, helping identify bottlenecks and inefficiencies in microservices architectures.
  • Improved Debugging: By providing a granular view of requests, tracing simplifies debugging, allowing developers to pinpoint the source of errors and latency issues.
  • Performance Optimization: Tracing data helps in identifying slow operations and performance anomalies, guiding optimization efforts for better resource utilization and response times.
  • Security and Compliance: Tracing can be used to monitor and audit access to sensitive data, ensuring compliance with security policies and regulations.

Advantages of the KoalaOps Tracing Architecture

Our setup includes ArgoCD, Grafana, Grafana Tempo, and an OpenTelemetry Collector. Each tool offers unique benefits for a comprehensive tracing and observability solution:

  • ArgoCD for GitOps: Manages clusters and applications through GitOps, ensuring Kubernetes clusters’ state matches Git configurations for automated deployments.
  • Grafana: Offers data visualization in dashboards with customizable views, alerting, and community plugins for enhanced functionality.
  • Grafana Tempo: Provides high-volume trace storage with cost-effective scalability and simplified troubleshooting by integrating with Grafana.
  • Otel Collector: Collects and exports telemetry data in a vendor-agnostic manner, offering flexible data handling and resource efficiency.

This combination enhances system observability and leverages each component’s strengths for improved reliability, performance, and security.

Installation Guide

This guide outlines the prerequisites and steps required to set up our recommended tracing tools on your platform, utilizing ArgoCD and several open-source addons including Grafana, Grafana Tempo, and the Otel Collector.

Prerequisites

Ensure you have ArgoCD installed on your cluster. If not, please refer to the Control Plane GitOps section on the KoalaOps platform which will guide you through the installation.

Grafana Tempo Addon Installation

The recommended practice is to run Grafana from a management cluster that displays data from all other workload clusters.

Configuration for the Management Cluster

  1. Select the Grafana Tempo addon from the KoalaOps GitOps section.
  2. Enable the addon specifically for the management cluster.
  3. Input the following custom YAML configuration to set up the primary management cluster instance:
    enabled: true
    version: 1.0.1
    addonParameters:
      standardBackend: true
      createSecrets: true
      mirrorSecretsToAllClusters: true
    valuesObject:
      secretName: tempo-auth
      ingress:
        enabled: true
        host: {tempo.example.com}
    

Make sure to replace the host value in the ingress section with your own host name.

Configuration for all other workload clusters

  1. Enable the Grafana Tempo addon for each cluster intended to act as a client.
  2. Use a simplified YAML configuration for the workload clusters:
    enabled: true
    version: 0.1.0
    addonParameters:
      standardClient: true
    valuesObject:
      secretName: tempo-auth
      ingress:
        enabled: true
        host: {tempo.example.com}

Make sure the host value in the ingress section points to the primary Grafana Tempo installation on the management cluster.

Grafana Addon

  1. Choose the Grafana addon in the KoalaOps GitOps section.
  2. Locate the management cluster row and use the tick box on the left to enable the addon for it.
  3. Edit the custom values YAML and add the following configuration:
    enabled: true
    version: 0.1.0
    addonParameters:
      ingressHost: {grafana.example.com}
    

The Grafana addon is required only on the management cluster, not on the client clusters.

OtelCollector Addon

This addon needs to be installed on all clusters.

  1. Choose the OtelCollector addon in the KoalaOps GitOps section.
  2. Use the tick boxes on the left to enable the addon for all clusters.
  3. Edit the custom values YAML and add the following configuration for each of the workload clusters:
    enabled: true
    version: 0.1.0
    addonParameters:
      tempo:
        enabled: true
        insecureHost: false
        hostAndPort: {tempo.example.com}
        basicAuth:
          enabled: true
          secretName: tempo
    
  4. Edit the custom values YAML and add the following configuration for the management cluster:
    enabled: true
    version: 0.1.0
    addonParameters:
      tempo:
        enabled: true
        insecureHost: true
        hostAndPort: {tempo.example.com}:4317
        basicAuth:
          enabled: false
    

For the management cluster, the hostAndPort field should point to port 4317, the default OTLP gRPC port.
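For orientation, the addon parameters above resemble settings found in a hand-written OpenTelemetry Collector configuration. The sketch below is illustrative only and is not the configuration the KoalaOps addon generates: the mapping of hostAndPort, insecureHost, and basicAuth to collector settings is an assumption, as are the port 443, the TEMPO_USERNAME and TEMPO_PASSWORD environment variables, and the use of the basicauth extension from the collector's contrib distribution.

```yaml
# Illustrative sketch of a workload-cluster collector config (assumptions noted inline).
extensions:
  basicauth/tempo:
    client_auth:
      username: ${env:TEMPO_USERNAME}   # assumed to be injected from the auth secret
      password: ${env:TEMPO_PASSWORD}   # assumed to be injected from the auth secret

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317          # applications send OTLP/gRPC here
      http:
        endpoint: 0.0.0.0:4318          # applications send OTLP/HTTP here

processors:
  batch: {}                             # batch spans before exporting

exporters:
  otlp:
    endpoint: tempo.example.com:443     # roughly what hostAndPort expresses; port 443 is assumed
    tls:
      insecure: false                   # roughly what insecureHost: false expresses
    auth:
      authenticator: basicauth/tempo    # roughly what basicAuth.enabled: true expresses

service:
  extensions: [basicauth/tempo]
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
```

On the management cluster the equivalent sketch would drop the auth block and set insecure to true, matching the addon values shown for that cluster.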

Note that it may take 5-15 minutes for the installation of all addons to complete.

Accessing Grafana

To log in to Grafana, open the host name that was defined as ingressHost in the Grafana addon:

  • Username: admin
  • Password Retrieval:
    kubectl get secret grafana -o jsonpath="{.data.admin-password}" -n observability | base64 --decode ; echo
    

Important: For security reasons, it’s strongly recommended to change the default admin password and create additional user accounts as needed through the Grafana UI.

OpenTelemetry Code Instrumentation

OpenTelemetry provides a set of APIs, libraries, agents, and instrumentation tools to capture distributed traces and metrics from your application. It supports various programming languages, allowing developers to collect and export telemetry data (like traces and metrics) to analysis tools in a standardized way.

OpenTelemetry supports instrumentation for many popular programming languages. See the full list, and the steps required to add support to your code, here: https://opentelemetry.io/docs/languages/
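Once an application is instrumented, most OpenTelemetry SDKs can be pointed at a collector through the standard OTEL_* environment variables. The fragment below is a minimal sketch of the container section of a Kubernetes Deployment. The service name my-service, the image, and the collector address otel-collector.observability.svc.cluster.local are hypothetical placeholders; check the actual Service name and namespace created by the OtelCollector addon in your cluster.

```yaml
# Fragment of a Deployment pod spec (container section only); names are placeholders.
containers:
  - name: my-service                      # hypothetical application container
    image: my-service:latest              # hypothetical image
    env:
      - name: OTEL_SERVICE_NAME           # name shown for this service in traces
        value: my-service
      - name: OTEL_TRACES_EXPORTER        # export traces over OTLP
        value: otlp
      - name: OTEL_EXPORTER_OTLP_PROTOCOL # gRPC matches port 4317
        value: grpc
      - name: OTEL_EXPORTER_OTLP_ENDPOINT # assumed in-cluster collector address
        value: http://otel-collector.observability.svc.cluster.local:4317
```

Keeping the endpoint in environment variables rather than in code means the same image can run unchanged on any cluster where the collector is installed.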