Introduction to GSL

What is GSL?

Greymatter Specification Language, GSL for short, is a declarative domain-specific language designed for application networking in a modern mesh-like topology. It focuses on boilerplate reduction, natural object relationships, and drop-in option customization without compromising the rich feature set supported by greymatter. By doing so, we hope to empower users from across the experience spectrum to make complex configuration changes rapidly and with confidence.

The CUE language forms the foundation of GSL. CUE is a data configuration language with roots in both the field of linguistics and internal Google projects. This gives CUE a unique set of advantages compared to other data configuration languages. Namely, it was born in environments steeped in extreme complexity and scale. If you want to read more about the origins or technical details, we recommend the official CUE documentation site.

By writing GSL in CUE, we pass all of its key features to you, including schema validation and versioning, a powerful unification engine, familiar JSON-like syntax, and no inheritance features to trip over. We'll explore CUE in a bit more detail later, but all you need to know is that every value you write in GSL is type checked, written without consideration for whitespace, and obvious--the data lives right in front of you, there is no multi-layer inheritance hell or templates on top of templates.

Next, we'll dive into what GSL looks like, how it's structured, and what the main elements are. Before we continue, note that although all GSL is CUE, not all CUE is GSL. Since you write GSL in CUE, you can extend it using any CUE construct. This property may become useful when your organization requires additional abstraction or organization-wide defaults.

The Basics of GSL

Since GSL utilizes CUE, it's helpful to know the basics of CUE syntax and behaviors. This section assumes little to no CUE experience and so attempts to explain CUE-isms as they come up. Most concepts should be familiar if you come from a programming background. That being said, if this is your first time with CUE, we recommend you read the CUE documentation, if not before then at least after reading this article.

CUE Terminology

CUE syntax looks similar to JSON with some slight improvements. Objects, called structs, don't require strings around keys or commas after values. Additionally, you don't need a top-level object holding every CUE value. CUE values are types, which sounds strange but means that you can assign a key to a type like string or to a literal value "Hello, World".
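
As a minimal sketch of the syntax described above (the field names here are made up for illustration and are not part of GSL):

```cue
// No enclosing braces are needed at the top level of a file.
greeting: "Hello, World" // a literal value
nickname: string         // a type; any string satisfies it

// A struct: keys need no quotes and fields need no trailing commas.
server: {
	host: "localhost"
	port: int
}
```

A field left as a bare type, such as nickname, is valid CUE but is not yet concrete. It has to be narrowed to a literal value before the configuration can be exported as data such as JSON.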

CUE types include:

  • bool booleans

  • string strings

  • int integers (and other numerical types)

  • [Type] arrays supporting mixed types

  • {key: value} structs

  • #Definition: { key: value } definitions

  • _ anything

  • _|_ nothing, which is a stand-in for an error in CUE

Most of these types should look familiar. Of special note are #Definitions. Definitions are like blueprints. They describe the shape of data. If you come from an Object-Oriented Programming background, definitions can be thought of as classes although in no way is CUE OO.
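
To make the blueprint idea concrete, here is a small sketch in plain CUE. The #Fruit definition and its fields are hypothetical and exist only for illustration. It uses the & operator, which is explained later in the Upstreams section.

```cue
// #Fruit is a hypothetical definition; it describes a shape, not data.
#Fruit: {
	name:  string
	ripe:  bool
	seeds: int
}

// apple is checked against the shape described by #Fruit.
apple: #Fruit & {
	name:  "apple"
	ripe:  true
	seeds: 5
}

// The following would be rejected because seeds must be an int:
// pear: #Fruit & {name: "pear", ripe: false, seeds: "many"}
```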

The Service Construct

Since GSL describes an application's networking behavior we structure all configuration around the atomic configuration object called the service construct. The service construct exposes service-level options like health checks and cataloging information. It also forms the start of the application networking construct hierarchy.

Application Networking Construct Hierarchy

The networking construct hierarchy is the collection of "sub-atomic" configuration objects linked together to form end-to-end request handling pipelines. These sub-atomic configuration constructs are (from parent to child):

  1. Listeners
  2. Routes
  3. Upstreams

Each sub-atomic object has a one-to-many relationship with its children, starting at the service. In other words, the service can have multiple listeners which can have multiple routes which can have multiple upstreams. By combining all these elements, we can express complex networking principles while enforcing human-readable models. We'll dig deeper into what these objects do later.

Service Construct Structure

To declare a service you need to invoke the service construct. The service construct is governed by its respective definition:

#Service: {
	name: string

	// ... configurable metadata ...
	
	ingress: [ListenerName]: #Listener
	egress: [ListenerName]: #Listener

	edge: #SpecialRoute
}

From top to bottom:

  • name is a string referring to the service name. This value is extremely important. It must match the string your service announces itself as in the mesh, else it will not be associated with this configuration.
  • ingress is a template struct that maps any number of listener names to listener constructs. Listeners found in the ingress struct will only handle incoming connections.
  • egress is exactly the same as ingress but for egress listeners. There are some slight differences in the final generated code, but you don't need to worry about them. Just note that ingress listeners will only handle incoming connections and egress listeners will only handle outgoing connections.
  • edge refers to the connection bridge between the service and its edge proxy. It's a "special route" definition that looks like a GSL route construct.

CUE-ism: What's a template? A template is a struct (object) that takes any number of keys and maps them to a singular struct type. We use them for attaching any number of listeners to services, routes to listeners, and upstreams to routes.
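
The CUE documentation calls this construct a pattern constraint. The sketch below shows one in plain CUE, using simplified stand-ins for #Service and #Listener. The port field and the listener names are assumptions made for illustration and do not reflect the real GSL definitions.

```cue
// Simplified, hypothetical stand-ins for the GSL definitions.
#Listener: {
	port: int
}

#Service: {
	name: string
	// Any string key is accepted; every value must be a #Listener.
	ingress: [string]: #Listener
}

apple: #Service & {
	name: "apple"
	ingress: {
		"apple": port: 10808
		"admin": port: 9000
	}
}
```

Any number of keys can be added under ingress, but each value is still checked against #Listener, so a listener with port: "9000" (a string) would be rejected.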

The service also includes other options allowing for customization over health checking, its dashboard information, and more. For the sake of simplicity we'll focus on required fields.

Listeners

Listeners describe the connection point for a request. Listeners, well, listen for connections. To define a listener, add its name to the ingress struct:

ingress: {
	"listener name": {
		// Listener config
	}
	"next listener name": {
		// Listener config
	}
}

Every service must have one listener named after the service itself. If the service is named "apple" then an ingress listener must also be named "apple". This service-named listener represents the primary ingress point used for metrics and audit collection.

CUE-ism: If CUE supports non-quoted keys, why do we wrap the listener name in quotes? Since these keys are user generated we wrap them in quotes to separate them from GSL keys as a convention. You can, if you really really want, depart from the convention.

There are two sub-types of listener, an #HTTPListener and a #TCPListener. We declare a listener to be either one by embedding it.

ingress: {
	"myServiceName": {
		#HTTPListener
	}
}

or

ingress: {
	"myServiceName": {
		#TCPListener
	}
}

CUE-ism: Embedding occurs when you refer to a value (struct or definition) from within another struct or definition. CUE spreads the keys and values into the encompassing data structure.
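
A minimal sketch of embedding in plain CUE follows. #HTTPDefaults and its fields are hypothetical and are not part of GSL.

```cue
// #HTTPDefaults is a hypothetical definition, not part of GSL.
#HTTPDefaults: {
	protocol: "http"
	port:     int | *8080 // 8080 is the default when no port is given
}

listener: {
	#HTTPDefaults // embedded: its fields become part of listener
	name: "apple"
}
```

Evaluating listener yields a single struct that contains protocol, port, and name side by side, which is the same mechanism used when #HTTPListener or #TCPListener is embedded in a listener.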

This changes the structure of the listener slightly. For starters, HTTP listeners have routes whereas TCP listeners do not. Some filters only work for HTTP as well. Once you define the type of listener, you can build out routing for HTTP or the direct upstream for TCP. We'll stick with the HTTP listener.

Routes

Route definitions specify exactly that, a route from someplace to another place. Specifically, they link HTTP listeners to an upstream based on whether or not the incoming request URL matches the pattern specified by the route name. To add a route to a listener, use the routes key and provide a route name:

ingress: {
	"myServiceName": {
		#HTTPListener
		routes: {
			"/": { 
				// Route configuration goes here
			}
		}
	}
}

The route name you choose is quite important as greymatter will use it to match incoming requests. In this case, the route name is the catch-all "/" which matches all requests. Just like listeners, you can have multiple routes:

...
routes: {
	// matches any URL with prefix /goToA
	"/goToA": {}
	"/goToB": {}
	// you can also use regular expressions!
	#"/us-(east|west)"#: {
		// You do need to signal that the path should match a regex
		// pattern instead of the prefix
		route_match: match_type: "regex"
	}
}
...

CUE-ism: Wrapping a string with the # symbol denotes a "raw string", meaning you can write special characters without escaping them.
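
The difference is easiest to see with a pattern that contains a backslash. In this sketch the pattern itself is made up for illustration:

```cue
// Both fields hold the same regular expression text.
escaped: "^/v\\d+/users$"  // regular string: the backslash must be doubled
raw:     #"^/v\d+/users$"# // raw string: written as-is

same: escaped == raw // true
```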

Routes also have options for redirects, retries, header injection, and route rewriting. These are mostly optional or have defaults, so you can skip them if need be.

Upstreams

At the bottom of the hierarchy and the terminal point of the request (from the proxy's point of view), is the upstream. An upstream represents the host to send traffic to. Upstreams attach to routes and provide the proxy a host value to send the request to. To associate an upstream with a route, add an upstream to a route's upstreams key.

...
routes: {
	"/": {
		upstreams: {
			"my upstream name": #DefaultUpstream
		}
	}
}

For a TCP listener, upstreams attach directly on the listener:

"myTcpListener": {
	upstream: {
		"myTcpService": #DefaultUpstream
	}
}

A valid upstream requires a host value. There are two ways to specify this value. You can either rely on service discovery to resolve the host IP address and port or configure a host manually with a domain name or raw IP. If the upstream service is part of the mesh then it should be discoverable. In this case, find the upstream's service name (the value in the service file name field) and use it as the name of the upstream. For example, if we are configuring an edge and want to send all requests to the apple service where the location of the apple service is dynamically discovered then we would write this:

--- apple.cue ---
#Service & {
	name: "apple"
	...
}

--- edge.cue ---
{
	...
	"/": {
		upstreams: 
			// This matches the apple.cue name field
			"apple": #DefaultUpstream
	}
	...
}

Note the matching "apple" strings. GSL by default attempts to populate the upstream host dynamically and will drive the necessary configuration off the upstream name. If, however, the apple service were external to the mesh or a local service, an instances array is required:

...
upstreams: 
	"apple": #DefaultUpstream & {
		instances: [
			{
				host: "127.0.0.1"
				port: 8000
			}
		]
	}

The upstream name is arbitrary for statically configured hosts.

CUE-ism: What does #DefaultUpstream & { ... } mean? The & (unification operator) unifies two CUE values, usually a definition and a struct, as in this example. Unifying two values produces a single value that satisfies the constraints of both. For A & B to unify, every field that appears in both must have compatible values (a field cannot be both a string and an integer, for instance); fields that appear on only one side are carried into the result. Because definitions are closed, unification also fails if the struct contains a field that the definition does not declare or otherwise allow. This can be a tricky concept to grasp until you use it a few times. In GSL, & typically boils down to taking a definition and filling it in with concrete data.
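
The following sketch shows unification succeeding and failing in plain CUE. #Upstream here is a simplified, hypothetical stand-in and not GSL's own definition. The failing cases are commented out so the file still evaluates.

```cue
// #Upstream is a simplified, hypothetical stand-in, not GSL's own definition.
#Upstream: {
	host: string
	port: int | *80 // defaults to 80 when the data does not set it
}

// Unifies: both fields are compatible with the definition.
static: #Upstream & {
	host: "127.0.0.1"
	port: 8000
}

// Unifies: port is absent from the data, so the default applies.
defaulted: #Upstream & {
	host: "example.internal"
}

// Would fail: host cannot be both a string and an integer.
// conflict: #Upstream & {host: 8000}

// Would fail: #Upstream is closed and does not declare "hots".
// typo: #Upstream & {hots: "127.0.0.1"}
```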

Other than specifying the destination host (either implicitly or explicitly), you can also control whether the connection between the service and the upstream is secured with TLS, includes circuit breaking, traffic splitting or shadowing, uses HTTP/2, or is load balanced.

Putting it all together

So now that we've explored the 4 application networking constructs in GSL, let's see a full service file put together.

package services

// import v1 of GSL and alias it as gsl
import (
	gsl "greymatter.io/gsl/v1"
	globals "my.module/greymatter/globals:globals"
)

// invoke the Service construct, note the gsl. prefix, telling CUE to search in the gsl package
myService: gsl.#Service & {
	context: myService.#NewContext & globals
	
	name: "myservice"

	ingress: {
		"myservice": {
			gsl.#HTTPListener
			routes: {
				"/": {
					upstreams: {
						"local": {
							instances: [
								{
									host: "127.0.0.1"
									port: 8080
								}
							]
						}
					}
				}
			}
		}
	}
	egress: {
		"myApi": {
			gsl.#HTTPListener
			port: 8081
			routes: {
				"/": {
					upstreams: {
						"myApi": gsl.#DefaultUpstream
					}
					
				}
			}
		}
	}
}

exports: MyService: myService

There are a few new odds and ends here that we didn't cover above. Let's start at the top with CUE package management and break them down.

package services

// import v1 of GSL and alias it as gsl
import (
	gsl "greymatter.io/gsl/v1"
	globals "my.module/greymatter/globals:globals"
)

At the top of the file, you have to declare the file package and import GSL. CUE has a thin package management layer which includes operations like declaring, importing, and auto-unifying packages. A package is just a namespaced collection of CUE files. We bundle up all of GSL as a dependency package that service files can import (if set up with greymatter init). Whenever using a GSL built-in, you must prefix it with gsl., just like this, gsl.#Service & {.

Inside the service construct we also create a context. You don't need to worry about this too much other than to know that a service context provides certain values, like global values, to the internal workings of the service definition.

myService: gsl.#Service & {
	context: myService.#NewContext & globals
	// ... rest of the service ...
}

The final new concept lies at the end of the file with the export statement. For GSL to register your service, it must be exported. To export a service use this syntax:

exports: NameOfService: ServiceField

In our case it would be:

exports: MyService: myService

And that's it: a full service file describing a service that sends all incoming requests on port 10808 to port 8080 on the local machine and allows outgoing requests on port 8081 to a service called 'myApi', relying on greymatter to figure out how to reach 'myApi'. That's the application networking construct hierarchy in essence. For more complex services, you simply scale up the number of listeners, routes, or upstreams as needed.

Although the config is small, we can scaffold and automate most of it, like the import statements, by using greymatter init. That command spits out directory structures and file contents that match the conventions we set up and ultimately lets you get to real configuration faster.