Grey Matter APIer is a real-time data transformation service which exposes a RESTful JSON API from a backing Excel or CSV file.
JSON has become a ubiquitous data-interchange format, offering excellent trade-offs in brevity, expressibility, and parse-ability. Although modern web APIs utilize JSON for these reasons, many traditional data source collection practices naturally gravitate towards data record formats such as Excel spreadsheets or CSV files. To address this disconnect, Grey Matter APIer operates as a real-time data-transformation microservice, converting tabular data into a JSON API for interoperability with modern web technologies.
APIer combines performance with robust error handling and automatic documentation generation. APIer has a wide range of configuration options, with minimal prerequisites, providing for simplicity and portability while offering your organization a unique and complimentary function within the Grey Matter Data Mesh Platform.
APIer is written in Python's FastAPI which gives it unique benefits of an auto-generated OpenAPI specification and input type-checking, among others. It's also lightweight and ephemeral because APIer keeps a minimal state. Upon startup, it parses through the provided environment variables. It attempts to request a remote spreadsheet and defaults to an exponential backoff procedure if the first request fails. Upon a successful request, it parses the spreadsheet and caches the result. This allows for rapid request response times, while still maintaining data freshness. The default cache refresh rate is set reasonably, but depending on your needs it can be configured with
APIer's row-column transformation scheme converts every row in a spreadsheet to a JSON object. Each row's columns become a new field in the JSON object, where the column header becomes the key name, and the row's column value becomes the key's value. APIer does not perform any sanitation or transformation on the values or the column names, other than converting "None" types to proper JSON.
APIer treats all spreadsheets as a potential collection of sheets. This aligns perfectly with Excel spreadsheets, and its open source equivalents, which can contain many sheets with potentially different internal structures. This abstraction still holds for CSV files, at the expense of being slightly contrived. As a result, CSV files have a fake "base" sheet which contains the entire spreadsheet. This has the upside of being more consistent, however, potentially enabling simpler programmatic interaction.
At this time, APIer does not support sheet-level environment variable customization. This is only a problem if the structure of two sheets in a spreadsheet is radically different. That is, Sheet 1 starts at Column N and Row A whereas Sheet 2 starts at Column M and Row B. There are a couple of workarounds:
Standardize the spreadsheets. This is the best option, unless you lack complete control over the source, the sheets cannot be reconciled to a common format, or if the structure must remain to support other dependencies or institutional concerns.
Set up multiple APIer instances for each sheet, or collection of similar sheets. This strategy works due to the
SHEET_NAMES environment variable. If the first two sheets follow the same row-column structure and the last two sheets follow the same row-column structure you can set up one APIer with the first sheets configured together and another instance with the last two sheets configured together.
The mechanism behind the
q query parameter is a hardened and secured python
eval function. It accepts any valid Python expression, assuming it doesn't violate the function's preconditions, for instance, calling anything found in
Read the full API docs for more details.
The URL of the spreadsheet to load. This must be a direct download link.
The file format type of the remote spreadsheet. This field is not case sensitive. It will be auto-populated if the file URL ends in:
This field is only valid for Excel spreadsheets. An Excel spreadsheet can contain many sheets. This variable allows for selecting a subset of those. Leave the variable empty to expose all sheets.
Used by APIer to skip over specific rows. Rows are 0-indexed.
A list of comma separated numbers representing the columns from the spreadsheet to use. Columns are 0-indexed.
The port the server listens on.
The host that the server will bind to.
Used by the documentation site to fetch resources from the server behind a proxy with a route prefix. Set this value if your application will not be exposed at the root URL of your domain. Ensure to include as trailing slash.
The time in seconds until the cache should refresh. A value of 0 disables the cache.
A descriptive name relating to the spreadsheet which will be injected into the OpenAPI specification.
The path where APIer will serve the docs from.