Consequences of HTTP APIs for configuration management systems

September 3, 2023 - Reading time: 4 minutes

Difference between OS and HTTP API resource management

By infrastructure we mean here systems whose resources are accessible via HTTP REST APIs, in contrast to the OS-level resources that traditional configuration management systems work with.

In an OS, every resource we manage (files, users, groups, permissions, networks etc.) can be identified by an ID supplied by the configuration system itself, for example a file path or a user or group name. This means that resources can later be managed using the same IDs that are already present in the configuration.

In contrast, REST APIs use POST semantics to create resources. As a result, IDs are assigned by the system under management (the API server) at the moment of resource creation. To map logical configuration IDs (e.g. resource types and names) to server-side IDs, we need a mapping that has to be maintained alongside the configuration.
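The mapping can be illustrated with a minimal sketch. Everything here is hypothetical: `FakeApiServer` stands in for a POST-based API, and `ensure` and the `type.name` key format are names invented for this example, not part of any real tool.

```python
import itertools


class FakeApiServer:
    """Hypothetical stand-in for a POST-based REST API; not a real service."""

    def __init__(self):
        self._counter = itertools.count(1)
        self.resources = {}

    def post(self, resource_type, body):
        # The server, not the client, picks the ID at creation time.
        server_id = f"{resource_type}-{next(self._counter):04d}"
        self.resources[server_id] = dict(body)
        return server_id


def ensure(server, state, resource_type, name, body):
    """Create the resource once and remember which server ID it received."""
    key = f"{resource_type}.{name}"  # logical ID as written in the configuration
    if key not in state:
        state[key] = server.post(resource_type, body)
    return state[key]


server = FakeApiServer()
state = {}  # logical ID -> server-side ID; must outlive a single run

first = ensure(server, state, "network", "backend", {"cidr": "10.0.0.0/24"})
second = ensure(server, state, "network", "backend", {"cidr": "10.0.0.0/24"})
```

If `state` were lost between the two calls, the second call would POST again and the server would end up with a duplicate network under a different ID.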

State management

The ID mapping state needs to be:

Durable, as it is read every time the configuration is applied or checked for pending work.
Globally accessible to any system running configuration management.
Serialized on access, to avoid concurrent mutation.
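As a rough sketch of the durability and serialization requirements, the mapping could be kept in a JSON file guarded by an advisory lock. This assumes a Unix-like system (`fcntl`), and the helper name `locked_state` is made up for this example. A local file does not satisfy the global accessibility requirement; a real setup would presumably need a shared store with its own locking.

```python
import fcntl
import json
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def locked_state(path):
    """Yield the ID mapping under an exclusive lock and persist it on exit."""
    lock_path = path + ".lock"
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until other holders release
        try:
            state = {}
            if os.path.exists(path):
                with open(path) as f:
                    state = json.load(f)
            yield state
            # Write to a temporary file and rename it, so a crash
            # never leaves a half-written state file behind.
            tmp_path = path + ".tmp"
            with open(tmp_path, "w") as f:
                json.dump(state, f)
                f.flush()
                os.fsync(f.fileno())
            os.replace(tmp_path, path)
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


state_path = os.path.join(tempfile.mkdtemp(), "state.json")

with locked_state(state_path) as state:
    state["network.backend"] = "network-0001"

with locked_state(state_path) as state:
    reloaded = dict(state)
```

If the body of the `with` block raises, the state file is left untouched and the lock is still released.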

Another side effect of this need for state is the difficulty of adopting IaC for existing systems whose resources are not yet recorded in the state. In practice it is best to start from scratch with new infrastructure, so that the mappings between logical configuration names and server-side IDs are recorded in the state as resources are created, instead of trying to manually map new configuration onto existing objects.

Runtime data flow and expression engine

Another serious consequence and complication of POST-based APIs is the need to pass resource IDs to dependent resources at runtime, while the configuration is being applied. The IDs of resources that are yet to be created are not known when the configuration is written and evaluated. They only become available during application, as the infrastructure allocates them. Any resource that refers to another resource by ID needs that ID passed to it dynamically during application.

These dynamically allocated IDs may need some extra processing (e.g. being put into collections) before they can be passed to dependent resources, which creates the need for expression evaluation capabilities at runtime.
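A minimal sketch of this runtime data flow, under stated assumptions: `Ref`, `resolve`, `refs_in`, `apply` and `fake_post` are hypothetical names, the server is simulated in memory, and the "expression engine" is reduced to substituting references inside lists and dicts.

```python
import itertools
from graphlib import TopologicalSorter


class Ref:
    """Placeholder for another resource's server-side ID, filled in at apply time."""

    def __init__(self, name):
        self.name = name


def resolve(value, ids):
    """Tiny 'expression engine': substitute Refs, descending into lists and dicts."""
    if isinstance(value, Ref):
        return ids[value.name]
    if isinstance(value, list):
        return [resolve(item, ids) for item in value]
    if isinstance(value, dict):
        return {key: resolve(item, ids) for key, item in value.items()}
    return value


def refs_in(value):
    """Collect the names of all resources a value refers to."""
    if isinstance(value, Ref):
        return {value.name}
    if isinstance(value, list):
        return set().union(*(refs_in(item) for item in value))
    if isinstance(value, dict):
        return set().union(*(refs_in(item) for item in value.values()))
    return set()


def apply(config, post):
    """Create resources in dependency order, passing allocated IDs downstream."""
    graph = {name: refs_in(body) for name, body in config.items()}
    ids = {}
    for name in TopologicalSorter(graph).static_order():
        ids[name] = post(resolve(config[name], ids))
    return ids


counter = itertools.count(1)
created = {}


def fake_post(body):
    # Hypothetical server: allocates the ID only when the resource is created.
    server_id = f"id-{next(counter)}"
    created[server_id] = body
    return server_id


config = {
    "balancer": {"targets": [Ref("web1"), Ref("web2")]},
    "web1": {"size": "small"},
    "web2": {"size": "small"},
}
ids = apply(config, fake_post)
```

The balancer is written first in the configuration but can only be created last, once both web servers have received their IDs and those IDs have been collected into the `targets` list.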

Alternatives

REST HTTP APIs could instead use the PUT verb, where the client decides on the identity of the resource. The server would either use the client-provided IDs directly to identify resources, or internally store the mapping between client-provided IDs and its own server-side IDs.

This way we could avoid the need for state, data flow and expression engines, as all IDs would be known when the configuration is written or evaluated.
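A minimal sketch of the PUT-based alternative, again with a hypothetical in-memory server (`FakePutServer` and the URL paths are invented for illustration):

```python
class FakePutServer:
    """Hypothetical API where the client names the resource in the URL path."""

    def __init__(self):
        self.resources = {}

    def put(self, path, body):
        # Create-or-replace: repeating the same request leaves the same result.
        self.resources[path] = dict(body)
        return path


server = FakePutServer()
config = {
    "/networks/backend": {"cidr": "10.0.0.0/24"},
    # The reference is a plain string, known before anything exists.
    "/servers/web1": {"network": "/networks/backend"},
}

for _ in range(2):  # applying twice needs no state and creates no duplicates
    for path, body in config.items():
        server.put(path, body)
```

Because the identity is part of the request, applying the configuration repeatedly converges on the same set of resources without any client-side mapping.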

Another approach is to use infrastructure systems that support storing the mapping themselves. For example, most (but not all) AWS resources can be labeled (AWS calls these labels tags). Labels set by the configuration can then be used to discover AWS resource IDs, so that the configuration identifies resources by label only. The AWS resource IDs, and how they are discovered from labels, become an implementation detail of the resource.
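The label-based approach might look roughly like the sketch below. It deliberately uses a hypothetical in-memory API (`FakeTaggedApi`) rather than any real AWS SDK call, and the `config-name` label key is an assumption made for this example.

```python
import itertools


class FakeTaggedApi:
    """Hypothetical API that assigns IDs on creation but stores client-supplied labels."""

    def __init__(self):
        self._counter = itertools.count(1)
        self.resources = {}

    def create(self, body, labels):
        server_id = f"res-{next(self._counter)}"
        self.resources[server_id] = {"body": body, "labels": dict(labels)}
        return server_id

    def find(self, labels):
        """Return the IDs of resources carrying all the given labels."""
        return [
            server_id
            for server_id, resource in self.resources.items()
            if all(resource["labels"].get(k) == v for k, v in labels.items())
        ]


def ensure(api, name, body):
    """Look the resource up by its label; create it only if it is missing."""
    labels = {"config-name": name}
    matches = api.find(labels)
    if len(matches) > 1:
        raise RuntimeError(f"label {name!r} is ambiguous: {matches}")
    return matches[0] if matches else api.create(body, labels)


api = FakeTaggedApi()
first = ensure(api, "backend-network", {"cidr": "10.0.0.0/24"})
second = ensure(api, "backend-network", {"cidr": "10.0.0.0/24"})
```

Here the server itself plays the role of the state store: the mapping from logical name to server-side ID is recovered by a lookup instead of being kept on the client.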

Relation with RPC

A similar issue exists in RPC systems. Some systems, like Cap'n Proto or 9P, allow clients to assign IDs to the responses of RPC calls and therefore to pipeline multiple calls, without having to wait for the actual response values before issuing the next call.

In the context of IaC, this would let us create inter-dependent resources in parallel, because the IDs are known up front.