Set Up Rate Limiting

Envoy lets you configure how many requests per second a service will accept. This is useful for preventing DDoS attacks and, more generally, for making sure a server's resources aren't easily overrun.

Unlike circuit breakers, which are configured per cluster, rate limiting can be enforced globally across multiple listeners or clusters. Circuit breakers are good at containing the failure of a single bad upstream host. However, as the Envoy documentation describes, there is a case where circuit breaking alone is hard to tune:

[A] large number of hosts are forwarding to a small number of hosts and the average request latency is low (e.g., connections/requests to a database server). If the target hosts become backed up, the downstream hosts will overwhelm the upstream cluster. In this scenario it is extremely difficult to configure a tight enough circuit breaking limit on each downstream host such that the system will operate normally during typical request patterns but still prevent cascading failure when the system starts to fail. Global rate limiting is a good solution for this case.

This is because global rate limiting caps the total requests per second for a service, independent of how many instances are configured for a cluster. This guide shows how to configure rate limiting on the edge node of a Grey Matter deployment running in an edge namespace.

Prerequisites

  1. The greymatter CLI set up with a running Fabric mesh.

  2. An existing Grey Matter deployment running on Kubernetes (tutorial).

  3. kubectl or oc set up with access to the cluster.

Steps

1. Deploy Ratelimit Service

Rate limiting relies on an external service to track and enforce the current number of requests per second. For this example we use Envoy proxy's open-source rate limit service, which is based on Lyft's original rate limiting service. This blog post is a good overview of the architecture.

Begin by creating a deployment for the rate limit service by applying the following configs to Kubernetes. Note that these examples use edge as the namespace, which may differ for your deployment. Also be sure to replace the Redis password with your own:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: ratelimit
    cluster: edge.ratelimit
  name: ratelimit
  namespace: edge
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ratelimit
      cluster: edge.ratelimit
  template:
    metadata:
      labels:
        app: ratelimit
        cluster: edge.ratelimit
    spec:
      serviceAccountName: edge
      containers:
        - name: ratelimit
          image: "envoyproxy/ratelimit:v1.4.0"
          imagePullPolicy: IfNotPresent
          env:
            - name: USE_STATSD
              value: "false"
            - name: LOG_LEVEL
              value: "debug"
            - name: REDIS_SOCKET_TYPE
              value: "tcp"
            - name: REDIS_URL
              value: "redis.edge.svc:6379"
            - name: REDIS_AUTH
              valueFrom:
                secretKeyRef:
                  name: redis
                  key: password
            - name: RUNTIME_ROOT
              value: "/"
            - name: RUNTIME_SUBDIRECTORY
              value: "ratelimit"
          command: ["/bin/sh", "-c"]
          args: ["mkdir -p /ratelimit/config && cp /data/ratelimit/config/config.yaml /ratelimit/config/config.yaml && cat /ratelimit/config/config.yaml && /bin/ratelimit"]
          ports:
            - name: server
              containerPort: 8081
            - name: debug
              containerPort: 6070
          volumeMounts:
            - name: ratelimit-config
              mountPath: /data/ratelimit/config
              readOnly: true
      volumes:
        - name: ratelimit-config
          configMap:
            name: ratelimit
---
kind: Service
apiVersion: v1
metadata:
  name: ratelimit
  namespace: edge
spec:
  ports:
    - name: server
      port: 8081
      protocol: TCP
      targetPort: 8081
    - name: debug
      port: 6070
      protocol: TCP
      targetPort: 6070
  selector:
    cluster: edge.ratelimit
  sessionAffinity: None
  type: ClusterIP
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: ratelimit
  namespace: edge
data:
  config.yaml: |-
    ---
    domain: edge
    descriptors:
      - key: path
        value: "/"
        rate_limit:
          unit: second
          requests_per_unit: 100
---
apiVersion: v1
kind: Secret
metadata:
  creationTimestamp: null
  name: redis
  namespace: edge
stringData:
  # stringData accepts a plaintext value and Kubernetes stores it base64-encoded;
  # if you use the data field instead, the value must already be base64-encoded.
  password: REPLACE_THIS_PASSWORD_WITH_YOUR_OWN

Once applied with kubectl, this service acts as the central decision point that limits requests in the "edge" domain to 100 requests per second.
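
Assuming the manifests above are saved to a local file (ratelimit.yaml here is just a placeholder name), they can be applied and verified along these lines:

# Apply the ratelimit Deployment, Service, ConfigMap, and Secret
kubectl apply -f ratelimit.yaml

# Confirm the pod comes up in the edge namespace
kubectl -n edge get pods -l app=ratelimit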

2. Deploy Redis

We also need a Redis cache to keep track of the rate limit counters, which is configured with a second deployment:

kind: Deployment
apiVersion: apps/v1
metadata:
  labels:
    cluster: edge.redis
  name: redis
  namespace: edge
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      cluster: edge.redis
      deployment: redis
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        cluster: edge.redis
        deployment: redis
    spec:
      containers:
        - name: redis
          image: docker.io/centos/redis-32-centos7
          env:
            - name: REDIS_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: redis
                  key: password
          imagePullPolicy: IfNotPresent
          resources:
            limits:
              cpu: 200m
              memory: 500Mi
            requests:
              cpu: 100m
              memory: 128Mi
      imagePullSecrets:
        - name: index.docker.io
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
---
kind: Service
apiVersion: v1
metadata:
  name: redis
  namespace: edge
spec:
  ports:
    - name: server
      port: 6379
      protocol: TCP
      targetPort: 6379
  selector:
    cluster: edge.redis
  sessionAffinity: None
  type: ClusterIP
---
apiVersion: v1
kind: Secret
metadata:
  creationTimestamp: null
  name: redis
  namespace: edge
stringData:
  # This is the same redis Secret created in step 1; use the same password in both places
  # (or omit this object if the Secret already exists in the edge namespace).
  password: PASSWORD

With both deployments applied, the rate limit service should now be reachable, and its logs should show it starting up. To check for errors, the rate limit image also exposes a debug endpoint on port 6070. Forward this port and open http://localhost:6070/rlconfig to see the loaded configuration (it should match the 100 requests per second ConfigMap above), or http://localhost:6070/debug/pprof/ for stack traces.
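
For example, the debug port can be forwarded and queried like this (a sketch, using the deployment and namespace names from the manifests above):

# Forward the ratelimit debug port to your workstation
kubectl -n edge port-forward deploy/ratelimit 6070:6070

# In another terminal, view the loaded rate limit configuration
curl http://localhost:6070/rlconfig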

3. Configure Grey Matter Sidecar To Use Ratelimit Filter

To enable rate limiting on a sidecar, the sidecar needs an additional cluster that points to the rate limit service. As a convenience, the Grey Matter Sidecar allows a cluster to be defined through environment variables. The following sample configuration defines a cluster named ratelimit that points to our deployed ratelimit service:

...
tcp_cluster:
  type: 'value'
  value: 'ratelimit'
tcp_host:
  type: 'value'
  value: 'ratelimit.edge.svc.cluster.local'
tcp_port:
  type: 'value'
  value: '8081'
...

Make sure the cluster shows up on whichever sidecar you have configured: it should appear as ratelimit on the sidecar's localhost:8001/clusters admin endpoint.
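
One way to check, assuming the sidecar's admin interface listens on port 8001 as noted above (the pod name is a placeholder):

# Forward the sidecar's admin port and look for the ratelimit cluster
kubectl -n edge port-forward <your-sidecar-pod> 8001:8001
curl -s http://localhost:8001/clusters | grep ratelimit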

Now let's update the listener config for the sidecar we've configured. Edit the listener with the Grey Matter CLI and add the following attributes:

...
"active_network_filters": [
  "envoy.rate_limit"
],
"network_filters": {
  "envoy_rate_limit": {
    "stat_prefix": "edge",
    "domain": "edge",
    "failure_mode_deny": true,
    "descriptors": [
      {
        "entries": [
          {
            "key": "path",
            "value": "/"
          }
        ]
      }
    ],
    "rate_limit_service": {
      "grpc_service": {
        "envoy_grpc": {
          "cluster_name": "ratelimit"
        }
      }
    }
  }
},
...

On the sidecar's localhost:8001/clusters endpoint you should now see successful requests to the ratelimit cluster. Because the log level is set to debug, the ratelimit pod should also log every request it evaluates. Once more than 100 requests arrive within a second, further requests are rejected until the rate drops back under the 100 requests per second threshold.
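
To watch those rate limit decisions as they happen, tail the service's logs (names as defined in the manifests above):

kubectl -n edge logs -f deploy/ratelimit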

This can be tested by lowering the limit to 1 request per second in the ConfigMap and spamming the sidecar. Requests over the limit should return a 429 response code, although if TLS is enabled the rejection often surfaces as an SSL error instead.
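
A quick way to generate that traffic from a shell, assuming the edge is reachable at https://EDGE_HOST (a placeholder) and no client certificate is required:

# Fire 20 requests back to back and print only the status codes;
# once the limit is exceeded you should start seeing 429s
for i in $(seq 1 20); do
  curl -k -s -o /dev/null -w "%{http_code}\n" https://EDGE_HOST/
done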
