Troubleshooting

SPIRE Setup

Note that all containers in the SPIRE server pod must be running before any other pods in the installation are created. Otherwise, pods that already exist when the server is installed will not have identities created for them. If you suspect this was not the case during installation, try deleting all pods (other than the SPIRE ones) and letting new ones come up.
If both the server and agent appear to be behaving correctly, the problem could be with the sidecars connecting via SDS. In that case, follow the steps to troubleshoot SPIRE sidecar configurations.

SPIRE Server

To verify that entries are being created for identities in the mesh, run:
kubectl exec -it server-0 -n spire -c server -- /opt/spire/bin/spire-server entry show -registrationUDSPath /run/spire/socket/registration.sock
You'll see a list of all of the SPIFFE identities that exist in the mesh. If the identity of your service (or of any/all services) is missing, that service has no registration entry and cannot be issued an identity. First try deleting any pod that is missing from the list of entries. When the new pod is created, rerun the command to see if the entry is now there. If it is not, there is likely a deeper problem with the SPIRE server and its permissions within your environment.
The list is complete when it contains one entry for each of the core services, one entry with SPIFFE ID spiffe://<spire-trust-domain>/agent for each node in the k8s cluster, and one entry for each service that you have launched into the mesh. If all of these are present, exit the server container and check on the SPIRE agents.
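Comparing the entry list against the services you expect can be tedious by eye. The following is a minimal sketch of that comparison, not part of the Grey Matter or SPIRE tooling. It assumes that the output of entry show has been saved as text and that each entry is printed with a line of the form "SPIFFE ID : spiffe://...". The function names and the list of service names are hypothetical.

```python
import re
from typing import Iterable, List, Set

# Assumption: `spire-server entry show` prints one "SPIFFE ID : <uri>" line per entry.
SPIFFE_ID_LINE = re.compile(r"^[ \t]*SPIFFE ID[ \t]*:[ \t]*(spiffe://\S+)[ \t]*$", re.MULTILINE)


def registered_ids(entry_show_output: str) -> Set[str]:
    """Collect every SPIFFE ID that appears in the saved `entry show` output."""
    return set(SPIFFE_ID_LINE.findall(entry_show_output))


def missing_ids(entry_show_output: str, trust_domain: str, services: Iterable[str]) -> List[str]:
    """Return the expected identities that have no registration entry."""
    expected = {f"spiffe://{trust_domain}/{name}" for name in services}
    return sorted(expected - registered_ids(entry_show_output))
```

Any identity returned by missing_ids corresponds to a pod worth deleting and recreating, as described above.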

SPIRE Agent

To check that the agents are behaving properly, get the name of any agent pod with kubectl get pods -n spire, then run:
kubectl exec -it <agent-pod-name> -n spire -- /opt/spire/bin/spire-agent api fetch -socketPath /run/spire/socket/agent.sock
If you do not see an SVID listed for the agent, attestation from the agent to the server failed.

SPIRE Sidecar Configuration

In a SPIRE-enabled deployment, your sidecars should be configured to get their certificates from the SPIRE server via the secret configuration field on their listener objects. Using the greymatter CLI, run greymatter get listener <sidecar-listener> for your sidecar's listener. It should have a secret object configured.
If the /stats admin endpoint in the verify mTLS section indicated values for ssl.fail_verify_san and/or you saw TLS error: 268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED in the sidecar debug logs, check that the sidecar listener secret has the correct subject_names configured. This value should be a list of all SPIFFE identities that can communicate with the sidecar. If the sidecar only needs to be reached by edge, the only value should be spiffe://<spire-trust-domain>/edge. If other services need to make egress requests to this sidecar, the list should also include each of their identities.
If this is not the problem, verify that your sidecar has received its certs: exec into the container and run curl localhost:8001/certs. You should see something like the following:
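The subject_names check can be expressed as a small lookup. This is only a sketch: it assumes the listener object returned by greymatter get listener has been parsed from JSON into a dictionary, and that subject_names is a list nested directly under the secret object. That nesting and the function name are assumptions for illustration, so adjust them to the actual shape of your listener.

```python
def allows_caller(listener: dict, caller_id: str) -> bool:
    """Report whether caller_id appears in the listener secret's subject_names.

    Assumed shape (hypothetical): {"secret": {"subject_names": ["spiffe://..."]}}
    """
    secret = listener.get("secret") or {}
    subject_names = secret.get("subject_names") or []
    return caller_id in subject_names
```

If this returns False for an identity that needs to reach the sidecar, that identity is a candidate cause of the verification failures.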
{
  "certificates": [
    {
      "ca_cert": [
        {
          "path": "\u003cinline\u003e",
          "serial_number": "5e6bb7c3",
          "subject_alt_names": [],
          "days_until_expiration": "3400",
          "valid_from": "2020-03-13T16:41:39Z",
          "expiration_time": "2030-03-11T16:41:39Z"
        }
      ],
      "cert_chain": [
        {
          "path": "\u003cinline\u003e",
          "serial_number": "e47a12e8c054b2c537d8ee647a3a359d",
          "subject_alt_names": [
            {
              "uri": "spiffe://quickstart.greymatter.io/fibonacci"
            }
          ],
          "days_until_expiration": "0",
          "valid_from": "2020-11-17T21:54:00Z",
          "expiration_time": "2020-11-17T22:54:10Z"
        }
      ]
    }
  ]
}
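A response like this can be summarized programmatically. The sketch below assumes the body returned by curl localhost:8001/certs has been captured as a string with the same field names as the example. It lists each SPIFFE URI found in a cert_chain together with its expiration time. The function name is hypothetical.

```python
import json
from typing import List, Optional, Tuple


def cert_chain_identities(certs_json: str) -> List[Tuple[str, Optional[str]]]:
    """Return (uri, expiration_time) pairs for every URI SAN in every cert_chain."""
    data = json.loads(certs_json)
    found = []
    for bundle in data.get("certificates", []):
        for cert in bundle.get("cert_chain", []):
            for san in cert.get("subject_alt_names", []):
                if "uri" in san:
                    found.append((san["uri"], cert.get("expiration_time")))
    return found
```

An empty result means the sidecar has not received a certificate carrying a SPIFFE identity.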
If the certificates list is empty, there is a problem getting certs from the SPIRE agents. Run curl localhost:8001/config_dump and check to see if there are any dynamic_warming_secrets.
If there are no secrets in the certificates list, and no dynamic_warming_secrets, your listener secret configuration is likely missing or the sidecar is incorrectly configured. Go back to this step and verify that the secret and mesh configs are set correctly.
If your dynamic_warming_secrets section is not empty, as in the following example, the sidecar has requested a secret that has not been delivered to it:
"dynamic_warming_secrets": [
  {
    "name": "spiffe://quickstart.greymatter.io/fibonacci",
    "version_info": "uninitialized",
    "last_updated": "2020-11-18T17:49:39.419Z",
    "secret": {
      "@type": "type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret",
      "name": "spiffe://quickstart.greymatter.io/fibonacci"
    }
  }
]
There is likely a misconfigured listener secret. Check the values listed for "name" in the dynamic_warming_secrets. The only value should be the identity of that particular sidecar: spiffe://<spire-trust-domain>/<sidecar-name>. For example, for the fibonacci sidecar it should be spiffe://<spire-trust-domain>/fibonacci, and for edge it should be spiffe://<spire-trust-domain>/edge.
If this section contains only the sidecar's own identity, try deleting the pod and retrying the request once a new one comes up.
If this section contains an identity for a different sidecar, a secret is misconfigured. Check this sidecar's ingress listener object with greymatter get listener <listener-key> and verify that the value in secret_name is its own identity, spiffe://<spire-trust-domain>/<service-name> (e.g. spiffe://<spire-trust-domain>/fibonacci for fibonacci). If the sidecar has an egress route to another sidecar in the mesh (for example, edge to the fibonacci cluster), the cause could instead be a misconfigured cluster secret. In this case, also check any egress cluster objects with greymatter get cluster <cluster-key> and verify again that the value in secret_name is the sidecar's own identity, spiffe://<spire-trust-domain>/<service-name>.
If everything in the sidecar's listener and cluster secret configurations looks correct and the above steps don't indicate any problems, try the troubleshooting SPIRE server section and see if there is an entry for your service in the server. If there isn't, try uninstalling your service/sidecar and following this guide again step by step.
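The comparison of warming secret names against the sidecar's own identity can be sketched as follows. It assumes the output of curl localhost:8001/config_dump has been parsed from JSON, and it searches the whole document for dynamic_warming_secrets rather than relying on a fixed position. The function names are hypothetical.

```python
from typing import Any, List


def warming_secret_names(node: Any) -> List[str]:
    """Recursively collect the "name" of every entry under dynamic_warming_secrets."""
    names: List[str] = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "dynamic_warming_secrets" and isinstance(value, list):
                names.extend(s["name"] for s in value if isinstance(s, dict) and "name" in s)
            else:
                names.extend(warming_secret_names(value))
    elif isinstance(node, list):
        for item in node:
            names.extend(warming_secret_names(item))
    return names


def foreign_warming_secrets(config_dump: Any, trust_domain: str, sidecar_name: str) -> List[str]:
    """Return warming secret names that are not this sidecar's own identity."""
    own = f"spiffe://{trust_domain}/{sidecar_name}"
    return [name for name in warming_secret_names(config_dump) if name != own]
```

A non-empty result from foreign_warming_secrets points at a listener or cluster secret_name that should be checked.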

Other Issues

If you are still running into issues and need assistance, please contact us at Grey Matter Support.