Set Up Distributed Tracing in Grey Matter on Kubernetes

Prerequisites

  1. An existing Grey Matter deployment.

  2. kubectl access to the Kubernetes cluster.

  3. The greymatter CLI with access to the Fabric mesh.

Overview

  1. Install the trace server

  2. Configure a Sidecar to emit traces

  3. Verify traces are collected

Steps

1. Install Jaeger

For this walkthrough we'll use Jaeger as the trace backend. For convenience, we'll use the all-in-one Docker image provided by the Jaeger project. This image bundles a simple server and web UI, which suits small deployments. Larger production deployments should use a more resilient deployment strategy.

We'll launch the server with the configuration below. The Jaeger server is exposed through a Kubernetes Service so that all Sidecars can address it directly. It accepts Zipkin-format spans on port 9411, which is the port we'll need to set on the Sidecar.

apiVersion: v1
kind: Service
metadata:
  name: jaeger
  labels:
    app: jaeger
spec:
  ports:
  - port: 9411
    targetPort: 9411
    name: trace
  - port: 16686
    targetPort: 16686
    name: ui
  selector:
    app: jaeger
  type: NodePort
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jaeger
spec:
  selector:
    matchLabels:
      app: jaeger
  replicas: 1
  template:
    metadata:
      labels:
        app: jaeger
    spec:
      containers:
      - name: jaeger
        image: jaegertracing/all-in-one
        imagePullPolicy: Always
        ports:
        - name: trace
          containerPort: 9411
        - name: ui
          containerPort: 16686
        env:
        - name: COLLECTOR_ZIPKIN_HTTP_PORT
          value: "9411"
        - name: QUERY_BASE_PATH
          value: "/apps/trace"
        - name: LOG_LEVEL
          value: "debug"

Save the Kubernetes manifest above (a Service and a Deployment) to a file named jaeger.yaml and create the resources:

kubectl apply -f ./jaeger.yaml
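Before moving on, it can help to confirm that the Deployment finished rolling out and that the Service exists. The following is a minimal sketch, assuming the resources were created in the current namespace; the function name wait_for_jaeger and the default timeout are illustrative choices, not part of Grey Matter or Jaeger.

```shell
# Wait for the jaeger Deployment to become ready, then show its Service.
# wait_for_jaeger and the 120s default are illustrative (assumptions).
wait_for_jaeger() {
  # Blocks until the rollout completes or the timeout expires.
  kubectl rollout status deployment/jaeger --timeout="${1:-120s}" || return 1
  # Lists the Service so ports 9411 and 16686 can be confirmed.
  kubectl get service jaeger
}
```

Calling wait_for_jaeger 60s waits up to one minute and then prints the Service, including the node ports assigned by the NodePort type.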

2. Turn on tracing for the Edge

Now that the trace server is running, we can configure a service to emit a trace for each request. This requires restarting the Sidecar with new runtime options, because the address of the trace server must be defined when the Sidecar starts and cannot be set through the service mesh afterwards.

We can set up the Sidecar with the trace server by editing the current deployment:

kubectl edit deployment edge

Then add the following lines to the Sidecar's environment variables:

- name: TRACING_ENABLED
  value: "true"
- name: TRACING_ADDRESS
  value: "jaeger"
- name: TRACING_PORT
  value: "9411"
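As a non-interactive alternative to kubectl edit, the same three variables can be set with kubectl set env. This is a sketch under one loud assumption: the container name "sidecar" is hypothetical, so check the edge Deployment for the real name of the Sidecar container before running it. Changing the pod template this way triggers a new rollout, which also restarts the Sidecar.

```shell
# Hypothetical container name (assumption); override it to match your Deployment.
SIDECAR_CONTAINER="${SIDECAR_CONTAINER:-sidecar}"

# Sets the tracing variables on the Sidecar container of the edge Deployment.
enable_edge_tracing() {
  kubectl set env deployment/edge \
    --containers="$SIDECAR_CONTAINER" \
    TRACING_ENABLED=true \
    TRACING_ADDRESS=jaeger \
    TRACING_PORT=9411
}
```

Limiting the change with --containers avoids adding the variables to any other container in the same pod.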

After editing the deployment, we're ready to turn on tracing on the edge Sidecar using the greymatter CLI. Edit the ingress listener for the edge Sidecar:

greymatter edit listener edge-listener

Then add the following config, which turns on tracing for all requests that pass through this listener:

"tracing_config": {
  "ingress": true
}

3. Verify Trace Collection

At this point everything is set up. Sending more requests through the edge (such as refreshing the browser or calling a service) will trigger a trace to be collected.
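To generate a few traces without a browser, requests can be sent from the command line. The sketch below is illustrative only: the function name send_test_requests is made up for this example, and the edge URL is a placeholder you must supply. If your edge requires TLS client certificates, the matching curl options (such as --cert and --key) would need to be added.

```shell
# Sends a number of requests to the edge and prints each HTTP status code.
# $1: URL of the edge (placeholder, supply your own); $2: request count (default 5).
send_test_requests() {
  edge_url="$1"
  request_count="${2:-5}"
  sent=0
  while [ "$sent" -lt "$request_count" ]; do
    # -s hides progress output, -o /dev/null discards the response body,
    # and -w prints only the status code followed by a newline.
    curl -s -o /dev/null -w '%{http_code}\n' "$edge_url"
    sent=$((sent + 1))
  done
}
```

Each request that passes through the edge listener should produce one trace.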

We can watch this in the logs of the trace server (substitute the name of your own Jaeger pod):

kubectl logs -f jaeger-5dc85d4bbd-7whzl
{"level":"debug","ts":1588867245.6727643,"caller":"handler/thrift_span_handler.go:130","msg":"Zipkin span batch processed by the collector.","span-count":1}
{"level":"debug","ts":1588867245.6729214,"caller":"app/span_processor.go:148","msg":"Span written to the storage by the collector","trace-id":"e6681bca2d0b0bac","span-id":"e6681bca2d0b0bac"}
{"level":"debug","ts":1588867250.6756654,"caller":"handler/thrift_span_handler.go:130","msg":"Zipkin span batch processed by the collector.","span-count":1}
{"level":"debug","ts":1588867250.6758242,"caller":"app/span_processor.go:148","msg":"Span written to the storage by the collector","trace-id":"25cb6d784522de88","span-id":"25cb6d784522de88"}
{"level":"debug","ts":1588867255.6768801,"caller":"handler/thrift_span_handler.go:130","msg":"Zipkin span batch processed by the collector.","span-count":1}
{"level":"debug","ts":1588867255.67703,"caller":"app/span_processor.go:148","msg":"Span written to the storage by the collector","trace-id":"615a316d024c403c","span-id":"615a316d024c403c"}
{"level":"debug","ts":1588867260.6802413,"caller":"handler/thrift_span_handler.go:130","msg":"Zipkin span batch processed by the collector.","span-count":1}
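The pod name in the command above is specific to one cluster. A minimal sketch that avoids it is to address the Deployment instead and count the "Span written" entries. The function names here are illustrative, and the match string is taken from the log lines shown above, so it may differ in other Jaeger versions.

```shell
# Counts "Span written" log entries read from standard input.
count_written_spans() {
  grep -c '"msg":"Span written to the storage by the collector"'
}

# Reads recent logs from a pod of the jaeger Deployment and counts stored spans.
# $1: number of log lines to inspect (default 200).
recent_span_count() {
  kubectl logs deployment/jaeger --tail="${1:-200}" | count_written_spans
}
```

A count that grows as you send requests through the edge indicates that spans are reaching the collector.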

We can also view the traces in the Jaeger UI. To quickly view the UI, port-forward a local port to the UI port of the Jaeger pod.

NOTE: The trace UI can also be set up as a Service in the mesh and routed through the Edge like all other services. See the guide for deploying a service for a walkthrough of those steps.

kubectl port-forward $(kubectl get pod | grep jaeger | cut -d" " -f1) 16686
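The command above finds the pod by parsing the output of kubectl get pod. A shorter variant, sketched here with an illustrative function name, forwards to the jaeger Service created in step 1 and lets kubectl pick a pod behind it.

```shell
# Forwards a local port to the "ui" port (16686) of the jaeger Service.
# $1: local port to listen on (default 16686).
forward_jaeger_ui() {
  kubectl port-forward service/jaeger "${1:-16686}:16686"
}
```

The command runs in the foreground until interrupted with Ctrl-C.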

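Using your browser, navigate to localhost:16686. Because the deployment above sets QUERY_BASE_PATH to /apps/trace, the UI is served under that path: localhost:16686/apps/trace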

Figure: Jaeger Trace UI