Cloud Configuration: You’re doing it wrong

Problems of complex configuration and how to fix them

All applications need configuration. With a small number of deployable units (DUs), it is manageable, but as the number approaches 100 or more, the configuration complexity grows, and that is for a single environment. Each additional environment multiplies it. Consider a simple example: 10 deployable units, each with 20 configuration items, deployed across five environments, gives 10 × 20 × 5 = 1,000 configuration items to manage.

Why is this a problem?

  1. Whenever a configuration item is added to one DU, we need to make sure the setting is propagated to every environment.
  2. As the number of DUs increases, managing the complexity becomes harder and harder.
  3. Not every configuration value needs to differ between environments, so how do we keep track of which ones do?
  4. If each DU uses a different configuration mechanism, it becomes even harder to keep track of.
  5. The Twelve-Factor App methodology quite rightly calls for configuration to live with the environment, not the code, so file-based configuration shipped with the application (the default in most programming languages) is a bad idea.
  6. While secrets are technically configuration, most configuration systems are not secure enough to hold them. The result is either poorly secured secrets stored in configuration or two separate sources of truth for application values.
  7. If your system is multi-tenant and each tenant has its own configuration and secrets (per environment), the problem grows in proportion to the number of tenants.
  8. Troubleshooting a problem in an environment can be quite challenging, even when the logs identify the missing or incorrect configuration or secret value. Adding to the complexity, some configuration values are used only under certain circumstances, in particular code paths.
  9. Auditing can become problematic, especially when determining retrospectively how the system was configured, because most configuration engines do a poor job of reporting the configuration as it stood at a given point in time.

How do we solve these issues?

  1. Have one source of truth for both secrets and configurations. 
  2. Be able to compare keys across environments, including the ability to detect missing or duplicate values.
  3. Be able to retrieve keys and values as they existed at a given point in time.
  4. Organize keys into multiple sets of configuration and secrets; these are referred to as “key spaces.”
  5. Enforce access control by environment and key space, both to maintain an audit trail of changes and to prevent disclosure of configuration and secrets in production environments.
  6. Support a baseline configuration that can be overridden selectively by environment. A good use case is log level, or any other control over the level of detail captured by logging and auditing systems.
  7. Be disciplined. Secrets and configurations are core to a successful system, so they should be treated as such.
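
Points 2, 3, and 6 above can be sketched in a few lines of Python. This is a minimal in-memory illustration under stated assumptions, not a real configuration tool: the `KeySpace` class, its `set_override` and `resolve` methods, the `missing_keys` helper, and every key, value, and date shown are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class KeySpace:
    """Hypothetical key space: a shared baseline plus per-environment overrides."""

    baseline: dict[str, str] = field(default_factory=dict)
    # environment -> key -> [(changed_at, value), ...], kept oldest first
    overrides: dict[str, dict[str, list[tuple[datetime, str]]]] = field(default_factory=dict)

    def set_override(self, env: str, key: str, value: str, at: datetime) -> None:
        """Record an override for one environment without discarding earlier values."""
        versions = self.overrides.setdefault(env, {}).setdefault(key, [])
        versions.append((at, value))
        versions.sort(key=lambda version: version[0])

    def resolve(self, env: str, at: Optional[datetime] = None) -> dict[str, str]:
        """Return the effective configuration for env as of `at` (default: now)."""
        at = at or datetime.now(timezone.utc)
        effective = dict(self.baseline)  # start from the baseline (point 6)
        for key, versions in self.overrides.get(env, {}).items():
            # Keep only the values that had been set by the requested time (point 3).
            applied = [value for changed_at, value in versions if changed_at <= at]
            if applied:
                effective[key] = applied[-1]
        return effective


def missing_keys(space: KeySpace, envs: list[str], at: Optional[datetime] = None) -> dict[str, list[str]]:
    """For each environment, list keys that exist elsewhere but not there (point 2)."""
    resolved = {env: space.resolve(env, at) for env in envs}
    all_keys = set().union(*(config.keys() for config in resolved.values()))
    return {env: sorted(all_keys - config.keys()) for env, config in resolved.items()}


# Example data (all values are made up for illustration).
january = datetime(2024, 1, 1, tzinfo=timezone.utc)
march = datetime(2024, 3, 1, tzinfo=timezone.utc)
june = datetime(2024, 6, 1, tzinfo=timezone.utc)

space = KeySpace(baseline={"log_level": "INFO", "timeout_s": "30"})
space.set_override("dev", "log_level", "DEBUG", at=january)
space.set_override("prod", "db_url", "postgres://prod-db/app", at=january)
space.set_override("prod", "timeout_s", "10", at=june)

dev = space.resolve("dev", at=june)            # baseline with log_level overridden
prod_before = space.resolve("prod", at=march)  # timeout_s still the baseline "30"
prod_after = space.resolve("prod", at=june)    # timeout_s now "10"
gaps = missing_keys(space, ["dev", "prod"], at=june)  # dev has no db_url
```

In this sketch only the overrides carry history; the baseline is a plain dictionary. A real system would version the baseline as well, persist the history, and apply the access control described in point 5.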

With great software comes great responsibility. Enterprise configuration management, done with discipline and following best practices, will help avoid a lot of pain, reduce risk, and make deployments smoother and troubleshooting easier.