Set Up Rate Limiting in Grey Matter on Kubernetes

Envoy allows you to configure how many requests per second a service can field. This is useful for preventing DDoS attacks and for making sure a server's resources aren't easily overrun.

Unlike circuit breakers, which are configured per cluster, rate limiting is configured across multiple listeners or clusters. Circuit breakers are good for avoiding cascading failures caused by a bad downstream host. However, as the Envoy documentation describes, they fall short when:

[A] large number of hosts are forwarding to a small number of hosts and the average request latency is low (e.g., connections/requests to a database server). If the target hosts become backed up, the downstream hosts will overwhelm the upstream cluster. In this scenario it is extremely difficult to configure a tight enough circuit breaking limit on each downstream host such that the system will operate normally during typical request patterns but still prevent cascading failure when the system starts to fail. Global rate limiting is a good solution for this case.

This is because rate limiting sets a global number of requests per second across a service, independent of the number of instances configured for a cluster. This guide shows how to configure rate limiting on the edge node of a Grey Matter deployment.
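To see why per-host limits are hard to tune, here is a back-of-the-envelope sketch in Python. The numbers are purely illustrative, not from the source: a per-host circuit-breaker budget scales with the size of the calling fleet, while a global limit does not.

```python
# Illustrative numbers only: compare per-host circuit-breaker limits,
# which scale with the size of the calling fleet, against one global limit.

def total_allowed(per_host_limit: int, num_hosts: int) -> int:
    """Max requests/second the target can receive under per-host limits."""
    return per_host_limit * num_hosts

db_capacity = 500      # what the target (e.g. a database) can handle
per_host_limit = 10    # a per-host circuit-breaker budget

# With 20 callers the per-host budget looks safe...
assert total_allowed(per_host_limit, 20) <= db_capacity

# ...but scale the callers to 100 and the same budget overwhelms the target.
assert total_allowed(per_host_limit, 100) > db_capacity

# A global rate limit stays fixed no matter how many callers there are:
global_limit = 400
for num_hosts in (20, 100, 1000):
    assert min(global_limit, total_allowed(per_host_limit, num_hosts)) <= db_capacity
```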

Pre-requisites

  1. The greymatter CLI set up against a running Fabric mesh.

  2. An existing Grey Matter deployment running on Kubernetes (tutorial)

  3. kubectl or oc set up with access to the cluster

Steps

1. Deploy Ratelimit Service

Rate limiting relies on an external service to track and enforce the current number of requests per second. For this example we use Envoy's open source rate limit service, which is based on Lyft's original rate limiting service. This blog post is a good overview of the architecture.
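At its core, the rate limit service increments a counter in Redis keyed by descriptor and the current time window, and rejects requests once the counter exceeds the limit. A minimal Python sketch of the idea, using an in-memory dict as a stand-in for Redis (this is illustrative, not the service's actual code):

```python
# In-memory stand-in for the "increment a counter per time window"
# pattern the rate limit service uses with Redis; a sketch, not its code.
counters: dict[str, int] = {}

def should_rate_limit(domain: str, key: str, value: str,
                      limit_per_second: int, now: float) -> bool:
    """Return True if this request is over the limit and should be rejected."""
    window = int(now)  # one-second fixed window
    cache_key = f"{domain}_{key}_{value}_{window}"  # e.g. edge_path_/_1599170080
    counters[cache_key] = counters.get(cache_key, 0) + 1
    return counters[cache_key] > limit_per_second

# Three requests arriving within the same second, limit 1 request/second:
print([should_rate_limit("edge", "path", "/", 1, now=1599170080.5)
       for _ in range(3)])  # first allowed, the rest rejected
```

When the window (the epoch second) rolls over, a fresh key is used and the counter effectively resets.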

Begin by creating a deployment for the rate limit service by applying these configs to Kubernetes. Note that these examples use the default namespace – this may differ for your deployment. Additionally, be sure to replace the Redis password secret with your own:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: ratelimit
  name: ratelimit
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ratelimit
  template:
    metadata:
      labels:
        app: ratelimit
    spec:
      serviceAccountName: default
      containers:
        - name: ratelimit
          image: "envoyproxy/ratelimit:v1.4.0"
          imagePullPolicy: IfNotPresent
          env:
            - name: USE_STATSD
              value: "false"
            - name: LOG_LEVEL
              value: "debug"
            - name: REDIS_SOCKET_TYPE
              value: "tcp"
            - name: REDIS_URL
              value: "redis.default.svc:6379"
            - name: RUNTIME_ROOT
              value: "/"
            - name: RUNTIME_SUBDIRECTORY
              value: "ratelimit"
            - name: REDIS_AUTH
              valueFrom:
                secretKeyRef:
                  name: redis-password
                  key: password
          command: ["/bin/sh", "-c"]
          args: ["mkdir -p /ratelimit/config && cp /data/ratelimit/config/config.yaml /ratelimit/config/config.yaml && cat /ratelimit/config/config.yaml && /bin/ratelimit"]
          ports:
            - name: server
              containerPort: 8081
            - name: debug
              containerPort: 6070
          volumeMounts:
            - name: ratelimit-config
              mountPath: /data/ratelimit/config
              readOnly: true
      volumes:
        - name: ratelimit-config
          configMap:
            name: ratelimit
---
kind: Service
apiVersion: v1
metadata:
  name: ratelimit
  labels:
    app: ratelimit
spec:
  ports:
    - name: server
      port: 8081
      protocol: TCP
      targetPort: 8081
    - name: debug
      port: 6070
      protocol: TCP
      targetPort: 6070
  selector:
    app: ratelimit
  sessionAffinity: None
  type: ClusterIP
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: ratelimit
  namespace: default
data:
  config.yaml: |-
    ---
    domain: edge
    descriptors:
      - key: path
        value: "/"
        rate_limit:
          unit: second
          requests_per_unit: 1

When applied with kubectl, this service acts as a central hub that limits requests in the domain "edge" to 1 request per second.
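The config.yaml in the ConfigMap above maps a descriptor (a key/value pair) to a limit within the domain. The lookup the service performs can be sketched in Python against a config shaped like the one above (the code is illustrative, not the service's implementation; a descriptor with no value is treated as a wildcard):

```python
# Illustrative lookup of a rate limit from a config shaped like the
# ConfigMap above (domain -> descriptors -> rate_limit).
config = {
    "domain": "edge",
    "descriptors": [
        {"key": "path", "value": "/",
         "rate_limit": {"unit": "second", "requests_per_unit": 1}},
    ],
}

def find_limit(config, key, value):
    """Return the rate_limit block matching a descriptor, if any."""
    for d in config["descriptors"]:
        # A descriptor without a value matches any value for that key.
        if d["key"] == key and d.get("value") in (value, None):
            return d.get("rate_limit")
    return None

print(find_limit(config, "path", "/"))     # the 1 req/s limit
print(find_limit(config, "path", "/foo"))  # None: no matching descriptor
```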

2. Configure Grey Matter Sidecar To Use Ratelimit Filter

In order to configure rate limiting on a sidecar, we need to add a cluster that points to the rate limit service. As a convenience, the Grey Matter Sidecar allows a cluster to be defined using environment variables. The following sample configuration defines a cluster ratelimit that points to our deployed ratelimit service:

...
tcp_cluster:
  type: 'value'
  value: 'ratelimit'
tcp_host:
  type: 'value'
  value: 'ratelimit.default.svc.cluster.local'
tcp_port:
  type: 'value'
  value: '8081'
...

Make sure that this cluster appears on whichever sidecar you have configured. It is listed under the ratelimit cluster at localhost:8001/clusters.
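One way to check is to fetch the admin /clusters endpoint and scan for the cluster name. The endpoint returns plain text with '::'-separated fields, the cluster name first; the sketch below parses output shaped like that (the sample text is illustrative, not captured from a real sidecar):

```python
# Hedged sketch: extract cluster names from Envoy admin /clusters output,
# which consists of '::'-separated lines starting with the cluster name.
# The sample text below is illustrative, not captured from a real sidecar.
sample = """\
ratelimit::default_priority::max_connections::1024
ratelimit::10.2.3.4:8081::health_flags::healthy
local::default_priority::max_connections::1024
"""

def cluster_names(clusters_text: str) -> set:
    return {line.split("::", 1)[0] for line in clusters_text.splitlines() if line}

print("ratelimit" in cluster_names(sample))  # True
```

In practice you would feed this the response of `curl localhost:8001/clusters` from inside the pod.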

Now let's update the listener config for the sidecar we've configured. Edit the listener with the Grey Matter CLI and add the following attributes:

...
"active_network_filters": [
  "envoy.rate_limit"
],
"network_filters": {
  "envoy_rate_limit": {
    "stat_prefix": "edge",
    "domain": "edge",
    "failure_mode_deny": true,
    "descriptors": [
      {
        "entries": [
          {
            "key": "path",
            "value": "/"
          }
        ]
      }
    ],
    "rate_limit_service": {
      "grpc_service": {
        "envoy_grpc": {
          "cluster_name": "ratelimit"
        }
      }
    }
  }
},
...

You should see requests to the ratelimit cluster succeeding on the sidecar's localhost:8001/clusters endpoint. You should also see a log line in the ratelimit pod for every request, since the log level is set to debug. If more than 1 request arrives in a second, the extra requests are rejected until the next one-second window begins.
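From the client's point of view, a burst under a 1 request / second limit with failure_mode_deny plays out roughly like this (an illustrative fixed-window simulation, not real sidecar traffic):

```python
# Illustrative: simulate the status codes a client would see under a
# fixed-window limit of 1 request/second. Not real sidecar behavior.
LIMIT = 1

def simulate(timestamps):
    """Map request arrival times to 200 (allowed) or 429 (rate limited)."""
    counts = {}
    statuses = []
    for t in timestamps:
        window = int(t)  # one-second window
        counts[window] = counts.get(window, 0) + 1
        statuses.append(200 if counts[window] <= LIMIT else 429)
    return statuses

# Four requests in the same second, then one in the next second:
print(simulate([10.0, 10.1, 10.2, 10.9, 11.0]))  # [200, 429, 429, 429, 200]
```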

3. Trust but Verify

Since we set the limit to 1 request / second, we can test it by spamming the sidecar. Make a series of curl requests to the edge. The response code for rejected requests should be 429, although if TLS is enabled this often surfaces as an SSL error instead. You should also see something like the following logs in the ratelimit pod:

time="2020-09-03T21:54:40Z" level=debug msg="returning normal response"
time="2020-09-03T21:54:40Z" level=debug msg="cache key: edge_path_/_1599170080 current: 1"
time="2020-09-03T21:54:40Z" level=debug msg="returning normal response"
time="2020-09-03T21:54:40Z" level=debug msg="starting get limit lookup"
time="2020-09-03T21:54:40Z" level=debug msg="looking up key: path_/"
time="2020-09-03T21:54:40Z" level=debug msg="found rate limit: path_/"
time="2020-09-03T21:54:40Z" level=debug msg="starting cache lookup"
time="2020-09-03T21:54:40Z" level=debug msg="looking up cache key: edge_path_/_1599170080"
time="2020-09-03T21:54:40Z" level=debug msg="cache key: edge_path_/_1599170080 current: 3"

This shows that the ratelimit service is registering calls from the edge domain and checking them against the configured limit of 1 request / second. If you wish to raise the limit for a production environment, 100 requests / second is a reasonable starting point.
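The cache key in those log lines encodes the domain, the descriptor, and the epoch second of the current window. A quick way to pull the counter values out of such logs (the regex assumes the log format shown above):

```python
import re

# Parse 'cache key: <domain>_<key>_<value>_<epoch> current: <n>' entries
# from ratelimit debug logs like the ones shown above.
logs = """\
time="2020-09-03T21:54:40Z" level=debug msg="cache key: edge_path_/_1599170080 current: 1"
time="2020-09-03T21:54:40Z" level=debug msg="cache key: edge_path_/_1599170080 current: 3"
"""

pattern = re.compile(r'cache key: (\S+)_(\d+) current: (\d+)')
for line in logs.splitlines():
    m = pattern.search(line)
    if m:
        descriptor, window, count = m.group(1), int(m.group(2)), int(m.group(3))
        print(descriptor, window, count)  # e.g. edge_path_/ 1599170080 1
```

A counter climbing past the configured requests_per_unit within a single window is exactly when the service starts returning OVER_LIMIT.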
