This guide covers the necessary steps to upgrade an existing Grey Matter 1.2 installation running on Kubernetes to Grey Matter 1.3. The procedure detailed below was run against an AWS EKS cluster but should work for other Kubernetes-based systems.
The Grey Matter mesh shown in this guide is for demonstration/proof of concept purposes only and is not intended for production use. Contact Grey Matter Customer Support for more information on a production deployment.
While these instructions should work against all versions of AWS EKS, they have been specifically tested and confirmed against versions 1.15, 1.16, 1.17, and 1.18.
helm v3
kubectl
Grey Matter credentials requested via Grey Matter Support
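Before starting, it can help to confirm that the command-line prerequisites are on the PATH. The following is a minimal sketch; `require_tools` is a hypothetical helper name and is not part of Grey Matter or the migration scripts.

```shell
#!/bin/sh
# Hypothetical helper: report any required command that is not on the PATH.
# Returns 0 when every named tool is found, 1 otherwise.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing prerequisite: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage for this guide:
#   require_tools helm kubectl && helm version --short
```

With Helm 3 installed, `helm version --short` should print a version string beginning with `v3`.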
kubectl get pods
The output should look similar to the example below. Check that all the pods are in the Running or Completed state. You may see more pods than listed, depending on whether other services are deployed.
NAME                                     READY   STATUS      RESTARTS   AGE
catalog-6c768b578f-t2vmv                 2/2     Running     0          4m47s
catalog-init-xgxgd                       0/1     Completed   3          4m14s
control-7849f77964-gszdt                 1/1     Running     0          6m30s
control-api-0                            2/2     Running     0          6m30s
control-api-init-cjxrx                   0/1     Completed   0          6m29s
dashboard-b8bff9c69-rzp2z                2/2     Running     0          4m47s
data-0                                   2/2     Running     0          5m11s
data-internal-0                          2/2     Running     0          5m11s
data-mongo-0                             1/1     Running     0          5m11s
edge-fd9fc77c5-6zjr2                     1/1     Running     0          5m35s
internal-data-mongo-0                    1/1     Running     0          5m11s
internal-jwt-security-7f4bbdd995-f2czx   2/2     Running     0          6m30s
internal-redis-cc7f64cd-p2gl7            1/1     Running     0          6m30s
jwt-security-757d67f477-5tqxb            2/2     Running     2          6m30s
postgres-slo-0                           1/1     Running     0          4m47s
prometheus-0                             2/2     Running     0          4m47s
redis-7957ffffd4-qshxh                   1/1     Running     0          6m30s
slo-5fbd879488-87jcl                     2/2     Running     0          4m47s
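On a busy cluster the pod list can be long, so filtering it may be easier than reading it by eye. The sketch below assumes the default column layout of `kubectl get pods` (STATUS in the third column); `unsettled_pods` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get pods --no-headers` output on stdin
# and print the name of every pod whose STATUS is neither Running nor Completed.
unsettled_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}

# Example usage:
#   kubectl get pods --no-headers | unsettled_pods
```

No output means every listed pod is in the Running or Completed state.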
Download the migration scripts from the Grey Matter Nexus Repository.
Clone the Grey Matter helm-charts GitHub repo and set the branch to release-2.3.
Download and install the Grey Matter CLI.
git clone git@github.com:greymatter-io/helm-charts.git
cd helm-charts && git switch release-2.3
Set the environment type. In the current helm-charts directory, edit global.yaml and change the following setting according to the installed environment type (the options are eks, kubernetes, or openshift). In this example, we will be using eks.
global:
  environment: eks
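A quick read-back can catch a typo in this setting before any chart is applied. This is a rough sketch, assuming the value is unquoted and sits on the same line as the first `environment:` key in the file; `check_environment` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the first `environment:` value found in a YAML
# file and fail unless it is one of the three options named in this guide.
# Assumption: the value is unquoted and on the same line as the key.
check_environment() {
  env_value=$(awk '$1 == "environment:" { print $2; exit }' "$1")
  case $env_value in
    eks|kubernetes|openshift)
      echo "$env_value"
      ;;
    *)
      echo "unexpected environment: '$env_value'" >&2
      return 1
      ;;
  esac
}

# Example usage from the helm-charts directory:
#   check_environment global.yaml
```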
Untar the migration scripts that you downloaded in the previous step.
tar -xvf gm-1.3-upgrade.tar.gz
cd gm-1.3-upgrade
Source the environment script to set the appropriate environment variables. The script will attempt to determine the host:port of the mesh load balancer. If it cannot find it, you will be prompted to enter the value manually. In that case, use an existing hostname and port or external IP and port.
source ./environment
This script is targeted for POSIX shells like Bash. If running in Zsh, you'll need to emulate a POSIX shell.
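One way to get POSIX behavior in Zsh is its `emulate` builtin. The wrapper below is a sketch rather than part of the migration scripts; `source_posix` is a hypothetical name, and it assumes the file path contains a slash (as `./environment` does).

```shell
#!/bin/sh
# Hypothetical wrapper: source a file with sh emulation when running under
# Zsh, and source it normally in any other shell.
# Pass a path that contains a slash (for example ./environment) so the
# dot command does not search the PATH for the file.
source_posix() {
  env_file=$1
  if [ -n "${ZSH_VERSION:-}" ]; then
    # In Zsh, `emulate sh -c` evaluates the command string in the current
    # shell with sh emulation switched on.
    emulate sh -c '. "$env_file"'
  else
    . "$env_file"
  fi
}

# Example usage from the gm-1.3-upgrade directory:
#   source_posix ./environment
```

Because the file is sourced in the current shell, any variables it exports remain available to the later steps.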
If subsequent commands time out or you receive errors, double-check the values of the environment variables by running env | grep GREYMATTER_. You can override any of them manually to suit your needs.
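A small check can flag variables that are unset before the later scripts fail on them. In the sketch below, `check_vars` is a hypothetical helper, and the variable names in the usage line are taken from the sample output later in this guide; the full set the scripts require is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: report which of the named variables are unset or empty.
# Only pass trusted variable names, since the name is expanded with eval.
check_vars() {
  rc=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "not set: $name" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Example usage (names assumed from the sample output in this guide):
#   check_vars GREYMATTER_API_HOST GREYMATTER_API_SSLCERT GREYMATTER_API_SSLKEY
```

To override a value manually, export it again, for example `export GREYMATTER_API_HOST=<host>:<port>` with your own host and port.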
Switch to the mesh directory and run the getmesh.sh script.
cd mesh && ./getmesh.sh
The getmesh.sh script should have created several directories containing the mesh configs. Run the following commands to inspect the generated files. The goal is to confirm that each file contains the captured config rather than an error message or other unexpected output.
cat cluster/cluster.json
cat domain/domain.json
cat listener/listener.json
cat proxy/proxy.json
cat route/route.json
cat shared_rules/shared_rules.json
cat zone/zone.json
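The same inspection can be roughly automated. The sketch below is only a heuristic: it checks that each file is non-empty and starts with `{` or `[`, which catches empty files and plain-text error messages but does not validate the JSON. `looks_like_json` and `check_configs` are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the file is non-empty and its first
# non-whitespace character is { or [. This is a heuristic, not a JSON parser.
looks_like_json() {
  [ -s "$1" ] || return 1
  first=$(tr -d ' \t\r\n' < "$1" | cut -c1)
  case $first in
    '{'|'[') return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: check every file given and return 1 if any is suspect.
check_configs() {
  rc=0
  for f in "$@"; do
    if looks_like_json "$f"; then
      echo "ok: $f"
    else
      echo "suspect: $f" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Example usage from the mesh directory:
#   check_configs cluster/cluster.json domain/domain.json listener/listener.json \
#     proxy/proxy.json route/route.json shared_rules/shared_rules.json zone/zone.json
```

Any file reported as suspect is worth opening by hand before continuing, since these exports are what gets reapplied after the upgrade.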
Switch to the catalog directory and run the getcatalog.sh script.
cd ../catalog && ./getcatalog.sh
The command should display output similar to the below indicating a successful connection to the Catalog service. Note that some paths/URLs will be different.
--> GREYMATTER_CONSOLE_LEVEL: debug
--> GREYMATTER_API_HOST: a6a0ea8ea86e343e2825e59383797d12-1005777820.us-east-1.elb.amazonaws.com:10808
--> GREYMATTER_API_SSLCERT: /Users/mike/Code/gm-upgrade-1.3/certs/quickstart.crt
--> GREYMATTER_API_SSLKEY: /Users/mike/Code/gm-upgrade-1.3/certs/quickstart.key
--> GREYMATTER_API_SSL: true
--> GREYMATTER_API_INSECURE: true
* * * Connection was made to Grey Matter Control API * * *
--> GREYMATTER_CONSOLE_LEVEL: debug
--> GREYMATTER_API_HOST: a6a0ea8ea86e343e2825e59383797d12-1005777820.us-east-1.elb.amazonaws.com:10808
--> GREYMATTER_API_SSLCERT: /Users/mike/Code/gm-upgrade-1.3/certs/quickstart.crt
--> GREYMATTER_API_SSLKEY: /Users/mike/Code/gm-upgrade-1.3/certs/quickstart.key
--> GREYMATTER_API_SSL: true
--> GREYMATTER_API_INSECURE: true
* * * Connection was made to Catalog Api API * * *
The getcatalog.sh script should have created two directories containing the Catalog configs. Run the following commands to inspect the generated files. The goal is to confirm that each file contains the captured config rather than an error message or other unexpected output.
cat clusters/clusters.json
cat zones/zones.json
Switch to the fabric directory in the helm-charts repo and run the Grey Matter fabric upgrade.
cd ../../helm-charts/fabric && make upgrade-fabric
You should get output similar to the below indicating that the Helm chart was applied successfully.
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "greymatter" chart repository
Update Complete. ⎈Happy Helming!⎈
Saving 4 charts
Deleting outdated charts
[mike@orion:~/Code/helm-charts/fabric]$ make upgrade-fabric
rm -f ./charts/*
echo "target hit package-fabric"
target hit package-fabric
helm dep up .
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "greymatter" chart repository
Update Complete. ⎈Happy Helming!⎈
Saving 4 charts
Deleting outdated charts
helm upgrade fabric . --set=global.waiter.service_account.create=false -f ../global.yaml --no-hooks
coalesce.go:199: warning: destination for resources is a table. Ignoring non-table value <nil>
coalesce.go:199: warning: destination for resources is a table. Ignoring non-table value <nil>
coalesce.go:199: warning: destination for resources is a table. Ignoring non-table value <nil>
Release "fabric" has been upgraded. Happy Helming!
NAME: fabric
LAST DEPLOYED: Tue Nov 24 15:10:57 2020
NAMESPACE: default
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
Grey Matter 3.0.0 has been installed.
fabric deployed to namespace "default" at 03:10:57 on 11/24/03
NOTE: It may take a few minutes for the installation to become stable.
You can watch the status of the pods by running 'kubectl get pods -w -n default'
Run this command to check the status of the Kubernetes pods.
kubectl get pod
Confirm that all of the pods are in the Running or Completed state before continuing, similar to what is shown below. You may need to rerun the previous command a few times, or add the -w flag to it, until everything stabilizes.
NAME                            READY   STATUS      RESTARTS   AGE
catalog-6c768b578f-t2vmv        2/2     Running     0          22h
catalog-init-xgxgd              0/1     Completed   3          22h
control-6db59845fd-d4tx9        1/1     Running     0          2m5s
control-api-0                   2/2     Running     0          92s
control-api-init-cjxrx          0/1     Completed   0          22h
dashboard-b8bff9c69-rzp2z       2/2     Running     0          22h
data-0                          2/2     Running     0          22h
data-internal-0                 2/2     Running     0          22h
data-mongo-0                    1/1     Running     0          22h
edge-fd9fc77c5-6zjr2            1/1     Running     0          22h
internal-data-mongo-0           1/1     Running     0          22h
jwt-redis-db87bcc4c-v6dtw       1/1     Running     0          2m5s
jwt-security-5c96d95749-2htgs   2/2     Running     0          2m5s
mesh-redis-0                    1/1     Running     0          2m5s
postgres-slo-0                  1/1     Running     0          22h
prometheus-0                    2/2     Running     0          22h
slo-5fbd879488-87jcl            2/2     Running     0          22h
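Instead of rerunning the command by hand, the wait can be scripted as a polling loop. The sketch below assumes the default `kubectl get pods` column layout (READY second, STATUS third) and treats a pod as settled when it is Completed, or Running with all of its containers ready. `wait_until_settled` is a hypothetical helper name, and the attempt count and delay in the usage line are arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: poll a pod-listing command until every pod is settled.
#   $1 = maximum number of attempts
#   $2 = seconds to sleep between attempts
#   remaining arguments = command that prints pods without a header row
# Returns 0 once all pods are settled, 1 if the attempts run out.
# Note: an empty pod list counts as settled.
wait_until_settled() {
  attempts=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" | awk '
        $3 == "Completed" { next }
        $3 != "Running"   { bad = 1; next }
        { split($2, ready, "/"); if (ready[1] != ready[2]) bad = 1 }
        END { exit (bad ? 1 : 0) }'; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example usage: up to 30 checks, 10 seconds apart.
#   wait_until_settled 30 10 kubectl get pods --no-headers
```

Passing the listing command as arguments keeps the loop independent of the namespace or context flags you may need to add to `kubectl`.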
After the Grey Matter fabric portion of the upgrade, the mesh configs that were captured in step 4 need to be reapplied.
cd ../../gm-1.3-upgrade
cd mesh && ./populate.sh
You should see a series of configs being applied, each followed by its output; this takes a minute or two to complete. There should be no errors if everything is working and connected properly.
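Because the output scrolls by quickly, saving it to a file and scanning it afterwards may be easier than watching for errors live. This is a rough sketch: the exact wording of errors from populate.sh is not documented here, so matching the word "error" is an assumption and may also flag harmless lines. `log_has_no_errors` and the populate.log file name are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: fail if a saved log contains the word "error"
# (case-insensitive), printing the matching lines with their line numbers.
# Returns 0 when nothing matches, 1 on a match, 2 if the log is unreadable.
log_has_no_errors() {
  [ -r "$1" ] || return 2
  if grep -n -i 'error' "$1"; then
    return 1
  fi
  return 0
}

# Example usage from the mesh directory:
#   ./populate.sh 2>&1 | tee populate.log
#   log_has_no_errors populate.log
```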
At this point, you will not be able to access the dashboard until the Grey Matter edge is updated, so we will do that now.
cd ../../helm-charts/edge && helm upgrade edge . -f ../global.yaml --set edge.ingress.type=LoadBalancer
You should receive the below output indicating that the Helm chart was successfully applied.
Release "edge" has been upgraded. Happy Helming!
NAME: edge
LAST DEPLOYED: Tue Nov 24 15:31:22 2020
NAMESPACE: default
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
Grey Matter Edge 1.5.1 has been installed.
NOTE: It may take a few minutes for the installation to become stable.
You can watch the status of the pods by running 'kubectl get pods -w -n default'
Run this command to check the status of the Kubernetes pods.
kubectl get pod
Confirm that all of the pods are in the Running or Completed state before continuing, similar to what is shown below. You may need to rerun the previous command a few times, or add the -w flag to it, until everything stabilizes.
NAME                            READY   STATUS      RESTARTS   AGE
catalog-6c768b578f-t2vmv        2/2     Running     0          23h
catalog-init-xgxgd              0/1     Completed   3          23h
control-6db59845fd-d4tx9        1/1     Running     0          48m
control-api-0                   2/2     Running     0          48m
control-api-init-cjxrx          0/1     Completed   0          23h
dashboard-b8bff9c69-rzp2z       2/2     Running     0          23h
data-0                          2/2     Running     0          23h
data-internal-0                 2/2     Running     0          23h
data-mongo-0                    1/1     Running     0          23h
edge-ff5578dbf-5xkfw            1/1     Running     0          68s
internal-data-mongo-0           1/1     Running     0          23h
jwt-redis-db87bcc4c-v6dtw       1/1     Running     0          48m
jwt-security-5c96d95749-2htgs   2/2     Running     0          48m
mesh-redis-0                    1/1     Running     0          48m
postgres-slo-0                  1/1     Running     0          23h
prometheus-0                    2/2     Running     0          23h
slo-5fbd879488-87jcl            2/2     Running     0          23h
Once all the pods are in running/completed state, confirm that you can access the Grey Matter dashboard again.
We will now upgrade Grey Matter sense. Once we do this, the Catalog service will lose its knowledge of the services due to changing the persistence to Redis. The step after this one will remedy this.
Switch to the sense directory and upgrade Grey Matter sense.
cd ../sense && make upgrade-sense
You should get the below output if the Helm charts were applied successfully.
rm -f ./charts/*
helm dep up .
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "greymatter" chart repository
Update Complete. ⎈Happy Helming!⎈
Saving 3 charts
Deleting outdated charts
--disable-openapi-validation
kubectl delete rolebinding prometheus-sa-rolebinding
rolebinding.rbac.authorization.k8s.io "prometheus-sa-rolebinding" deleted
helm upgrade sense . --set=global.waiter.service_account.create=false -f ../global.yaml --no-hooks --install --disable-openapi-validation
Release "sense" has been upgraded. Happy Helming!
NAME: sense
LAST DEPLOYED: Mon Dec 7 14:13:09 2020
NAMESPACE: default
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
Grey Matter 3.0.1 has been installed.
sense deployed to namespace "default" at 02:13:09 on 12/07/02
NOTE: It may take a few minutes for the installation to become stable.
You can watch the status of the pods by running 'kubectl get pods -w -n default'
Run this command to check the status of the Kubernetes pods.
kubectl get pod
Confirm that all of the pods are in the Running or Completed state before continuing, similar to what is shown below. You may need to rerun the previous command a few times, or add the -w flag to it, until everything stabilizes.
NAME                            READY   STATUS      RESTARTS   AGE
catalog-6c768b578f-t2vmv        2/2     Running     0          23h
catalog-init-xgxgd              0/1     Completed   3          23h
control-6db59845fd-d4tx9        1/1     Running     0          48m
control-api-0                   2/2     Running     0          48m
control-api-init-cjxrx          0/1     Completed   0          23h
dashboard-b8bff9c69-rzp2z       2/2     Running     0          23h
data-0                          2/2     Running     0          23h
data-internal-0                 2/2     Running     0          23h
data-mongo-0                    1/1     Running     0          23h
edge-ff5578dbf-5xkfw            1/1     Running     0          68s
internal-data-mongo-0           1/1     Running     0          23h
jwt-redis-db87bcc4c-v6dtw       1/1     Running     0          48m
jwt-security-5c96d95749-2htgs   2/2     Running     0          48m
mesh-redis-0                    1/1     Running     0          48m
postgres-slo-0                  1/1     Running     0          23h
prometheus-0                    2/2     Running     0          23h
slo-5fbd879488-87jcl            2/2     Running     0          23h
Due to the change in persistence mechanisms, the Catalog service will need to be re-populated with the entries that were exported in a prior step.
cd ../../gm-1.3-upgrade/catalog && ./populate.sh
You should see a series of configs being applied, each followed by its output. There should be no errors if everything is working and connected properly. Once the command completes, confirm that you have access to the Grey Matter dashboard and that the cards for the various services are present and green.
Since the persistence mechanism for business impacts was migrated to Redis, you need to run a script in order to copy over the existing business impacts.
First, navigate to the Nexus repository containing the different versions of the script using the below URL.
https://nexus.greymatter.io/#browse/browse:raw:release%2Fbi-migration
Once logged in, you will see three versions of the script with different extensions: .darwin (for macOS), .linux (for Linux), and .windows (for Windows). Click the version appropriate for your computer, then click the link next to Path in the panel on the right to download it.
Before running the script, we will need to set some environment variables.
export SLO_ROUTE=/services/slo/latest
export CATALOG_ROUTE=/services/catalog/latest
Then, navigate to the location where the script was downloaded and run the script. In the example below, we are running the .darwin version of the script from the Mac default download location (~/Downloads). You may need to change the commands accordingly if working with another OS.
cd ~/Downloads && ./migrate-business-impacts.darwin
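On other systems, the matching script extension can be derived from `uname -s`. The sketch below is illustrative: `bi_script_suffix` is a hypothetical helper, the .linux and .windows file names are assumed to follow the same pattern as the .darwin example above, and the Windows patterns assume a Unix-like shell such as Git Bash, MSYS2, or Cygwin.

```shell
#!/bin/sh
# Hypothetical helper: map a `uname -s` value to the script extension
# described in this guide (darwin, linux, or windows).
bi_script_suffix() {
  case $1 in
    Darwin) echo darwin ;;
    Linux) echo linux ;;
    MINGW*|MSYS*|CYGWIN*) echo windows ;;
    *)
      echo "unrecognized OS: $1" >&2
      return 1
      ;;
  esac
}

# Example usage, from the directory holding the downloaded script:
#   suffix=$(bi_script_suffix "$(uname -s)") && ./migrate-business-impacts."$suffix"
```

If the shell reports that permission is denied, the downloaded file may need to be made executable first with `chmod +x`.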
You should see output similar to the below. The number of clusters updated may differ based on what's installed in the Grey Matter mesh and the business impact levels.
Updated cluster Grey Matter Catalog 1.0.7 with business impact: 'high'
Updated cluster Grey Matter Control API 1.4.3 with business impact: 'critical'
Updated cluster Grey Matter Edge 1.4.5 with business impact: 'critical'
Updated cluster Grey Matter JWT Security 1.1.1 with business impact: 'critical'
Finished. 4 clusters updated.
At this point, the upgrade is complete. Check the dashboard and services to make sure everything looks like it should.