topoconfig

toposource-enhanced uniconfig remastered

Motivation

Configs can be complex. Let's try to make them a little more convenient. Topoconfig: enhancing config declarations with graphs

Config mess

Many years ago configs were pretty simple. They looked more or less like .properties-files or INI-files, simple kv-maps with sections or composite keys to bring some kind of context:

# https://docs.oracle.com/cd/E23095_01/Platform.93/ATGProgGuide/html/s0204propertiesfileformat01.html

# You are reading a comment in ".properties" file.
! The exclamation mark can also be used for comments.
# Lines with "properties" contain a key and a value separated by a delimiting character.
# There are 3 delimiting characters: '=' (equal), ':' (colon) and whitespace (space, \t and \f).
website = https://en.wikipedia.org/
language : English
topic .properties files
# A word on a line will just create a key with no value.
empty

; last modified 1 April 2001 by John Doe
[owner]
name = John Doe
organization = Acme Widgets Inc.

[database]
; use IP address in case network name resolution is not working
server = 192.0.2.62     
port = 143
file = "payroll.dat"

At the same time, another part of the configuration was supplied from environment variables or CLI parameters, reflecting the idea of dynamic settings.

Now, ironically, we use dotenv files:

# https://hexdocs.pm/dotenvy/0.2.0/dotenv-file-format.html
S3_BUCKET=YOURS3BUCKET
SECRET_KEY=YOURSECRETKEYGOESHERE

Even then, the resolution logic began to penetrate into the app layer.

// Just an illustration. This problem existed before JS was invented

const config = require('config')
const logLevel = process.env.DEBUG ? 'trace' : config.get('log.level') || 'info'
//...
const dbConfig = config.get('Customer.dbConfig')
db.connect(dbConfig, ...)

if (config.has('optionalFeature.detail')) {
  const detail = config.get('optionalFeature.detail')
  //...
}

When centralized configuration management arrived, settings were partially moved to remote storage. A local pre-config (entrypoints, db credentials) was used to fetch the rest. Configuration assembly became multi-stage.

Later, specialized systems such as Vault added another layer: the environment now holds an access token and selects an entrypoint by running mode, a POST request then reveals the credentials profile, and that data is mixed into the rest of the config.

Here's how uniconfig obtains secrets from the vault storage:

{
  "data": {
    "secret": "$vault:data"
  },
  "sources": {
    "vault": {
      "data": {
        "data": {
          "method": "GET",
          "url": "$url:",
          "opts": {
            "headers": {
              "X-Vault-Token": "$token:auth.client_token"
            }
          }
        },
        "sources": {
          "url": {
            "data": {
              "data": {
                "data": {
                  "name": "$pkg:name",
                  "space": "openapi",
                  "env": "$env:ENVIRONMENT_PROFILE_NAME",
                  "vaultHost": "$env:VAULT_HOST",
                  "vaultPort": "$env:VAULT_PORT"
                },
                "template": "{{=it.env==='production' ? 'https': 'http'}}://{{=it.vaultHost}}:{{=it.vaultPort}}/v1/secret/applications/{{=it.space}}/{{=it.name}}"
              },
              "sources": {
                "env": {
                  "pipeline": "env"
                },
                "pkg": {
                  "pipeline": "pkg"
                }
              }
            },
            "pipeline": "datatree>dot"
          },
          "token": {
            "data": {
              "data": {
                "method": "POST",
                "url": "$url:",
                "opts": {
                  "json": {
                    "role": "$pkg:name",
                    "jwt": "$jwt:"
                  }
                }
              },
              "sources": {
                "pkg": {
                  "pipeline": "pkg"
                },
                "jwt": {
                  "data": {
                    "data": {
                      "data": {
                        "tokenPath": "$env:TOKEN_FILE",
                        "defaultTokenPath": "/var/run/secrets/kubernetes.io/serviceaccount/token"
                      },
                      "template": "{{=it.tokenPath || it.defaultTokenPath}}"
                    },
                    "sources": {
                      "env": {
                        "pipeline": "env"
                      }
                    }
                  },
                  "pipeline": "datatree>dot>file"
                },
                "url": {
                  "data": {
                    "data": {
                      "data": {
                        "env": "$env:ENVIRONMENT_PROFILE_NAME",
                        "vaultHost": "$env:VAULT_HOST",
                        "vaultPort": "$env:VAULT_PORT"
                      },
                      "template": "{{=it.env==='production' ? 'https': 'http'}}://{{=it.vaultHost}}:{{=it.vaultPort}}/v1/auth/kubernetes/login"
                    },
                    "sources": {
                      "env": {
                        "pipeline": "env"
                      },
                      "pkg": {
                        "pipeline": "pkg"
                      }
                    }
                  },
                  "pipeline": "datatree>dot"
                }
              }
            },
            "pipeline": "datatree>http>json"
          }
        }
      },
      "pipeline": "datatree>http>json"
    }
  }
}

Meanwhile, formats have been evolving (JSON5, YAML) and config entry points keep changing. Fortunately, these fluctuations are covered by tools like cosmiconfig.

[
  'package.json',
  `.${moduleName}rc`,
  `.${moduleName}rc.json`,
  `.${moduleName}rc.yaml`,
  `.${moduleName}rc.yml`,
  `.${moduleName}rc.js`,
  `.${moduleName}rc.ts`,
  `.${moduleName}rc.mjs`,
  `.${moduleName}rc.cjs`,
  `.config/${moduleName}rc`,
  `.config/${moduleName}rc.json`,
  `.config/${moduleName}rc.yaml`,
  `.config/${moduleName}rc.yml`,
  `.config/${moduleName}rc.js`,
  `.config/${moduleName}rc.ts`,
  `.config/${moduleName}rc.cjs`,
  `${moduleName}.config.js`,
  `${moduleName}.config.ts`,
  `${moduleName}.config.mjs`,
  `${moduleName}.config.cjs`,
]

Configs are still trying to be declarative, but they can't. Templates appeared first.

template:
    metadata:
      annotations:
        cni.projectcalico.org/ipv4pools: '["${APP_NAME}"]'
        vault.hashicorp.com/agent-init-first: "true"
        vault.hashicorp.com/agent-inject: "true"
        vault.hashicorp.com/secrets-injection-method: "env"
        vault.hashicorp.com/secrets-type: "static"
        vault.hashicorp.com/agent-inject-secret-${APP_NAME}: secret-v2/applications/${DEPLOYMENT_NAMESPACE}/${APP_NAME}
        vault.hashicorp.com/agent-inject-template-${APP_NAME}: |
          {{ with secret "secret-v2/applications/${DEPLOYMENT_NAMESPACE}/${APP_NAME}" }}
            {{- range $secret_key, $secret_value := .Data.data }}
            export {{ $secret_key }}={{ $secret_value }}
            {{- end }}
          {{ end }}
        vault.hashicorp.com/auth-path: ${AUTH_PATH}
        vault.hashicorp.com/role: ${APP_NAME}

Then came templates inside templates, with command and script invocations inside a dynamic DSL wrapped into matrices.

      - uses: actions/cache@v3
        id: yarn-cache
        with:
          path: ${{ needs.init.outputs.yarn-cache-dir }}
          key: ${{ runner.os }}-yarn-${{ hashFiles('**/yarn.lock') }}
          restore-keys: |
            ${{ runner.os }}-yarn-

      - name: Restore artifact from cache (if exists)
        uses: actions/cache@v3
        with:
          path: artifact.tar
          key: artifact-${{ needs.init.outputs.checksum }}

      - name: Check artifact
        if: always()
        id: check-artifact
        run: echo "::set-output name=exists::$([ -e "artifact.tar" ] && echo true || echo false)"

As we can see, syntax complexity increases as the cost of declarativeness. It's still unclear how this problem can be mitigated. Perhaps new specialized formats will appear or more strict forms (schemas) of using existing ones will be introduced.

Budget loss

Anyway, ::$([ is definitely not an optimal solution. It is confusing, fragile and overcomplicated for most developers. For example, here is how a Python engineer fought against kube.yaml:

fix vault in kube yaml Jul 04	XS		
fix vault in kube yaml Jul 04	XS
fix vault in kube yaml Jul 04	XS
fix vault in kube yaml Jul 04	XS
fix vault in kube yaml Jul 04	XS
fix vault in kube yaml Jul 04	XS
fix vault in kube yaml Jul 04	XS
fix vault in kube yaml Jul 04	XS
fix vault in kube yaml Jul 03	XS
fix vault in kube yaml Jul 03	XS
fix vault in kube yaml Jul 03	XS
fix vault in kube yaml Jul 03	XS
fix vault in kube yaml Jul 03	XS
fix vault in kube yaml Jul 03	XS
...

This is guessing rather than configuring. At company scale, such exercises waste significant resources. The experience is also almost single-use: it cannot be formalized or passed on except by copy-paste. Every time we see the same thing, just with a different number of attempts.

What we need

The overcomplexity problem seems to arise from combining resolving, processing and accessing data in one structure, although the whole of programming theory / CS instructs us to do exactly the opposite. Separation of concerns: imagine a config which explicitly divides value resolutions, compositions and operations.

{
  "data": "<how to expose values>",
  "sources": "<how to resolve values>",
  "cmds": "<available cmds/ops/fns>"
}

Let data represent how the result structure may be built once all the required transformations have been made, like a mapping.

{
  "data": {
    "a": {
      "b": "$b.some.nested.prop.value.of.b",
      "c": "$external.prop.of.prop"
    }
  }
}

Templating is based on regular substring replacement:

String.format("foo %s", "bar")                   // gives 'foo bar'
// But positional contract is enhanced with named refmap
String.format("foo $a $b $a", {"a": "A", "b": "B"}) // returns 'foo A B A'
//            ↑ data chunks ↑ sources map
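
The named-ref substitution above can be sketched in a few lines of TypeScript. This is only an illustration, not topoconfig's actual implementation: the `format` helper and the `RefMap` type are hypothetical names, and the sketch assumes refs are plain identifiers prefixed with `$`.

```typescript
// Hypothetical helper (not part of topoconfig): replaces $name refs in a
// template with values taken from a named map.
type RefMap = Record<string, string | number>

function format(template: string, refs: RefMap): string {
  return template.replace(/\$([A-Za-z_]\w*)/g, (_match: string, name: string) => {
    // Fail loudly on refs that have no value instead of leaving them in place
    if (!Object.prototype.hasOwnProperty.call(refs, name)) {
      throw new Error(`Unknown ref: $${name}`)
    }
    return String(refs[name])
  })
}

console.log(format('foo $a $b $a', { a: 'A', b: 'B' })) // foo A B A
```

Throwing on an unknown ref is a design choice of this sketch: a typo in a ref name surfaces at resolution time rather than leaking a raw `$name` into the result.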

Let sources describe how to obtain and process values for referencing in the data map, like reducing pipelines.

{
  "sources": {
    "a": "<pipeline 1>",
    "b": "<pipeline 2>"
  }
}

Let a pipeline compose actions in a natural ~~human~~ dev-readable format, like a CLI:

cmd param > cmd2 param param > ... > cmd3

Let intermediate values be referenced by lateral (bubbling concept) or nested contexts.

{
  "sources": {
    "a" : "cmd param",
    "b": "cmd $a" // b refers to a
  }
}

Apply DAG walker for consistency checks and processing.
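
One way to picture the DAG walk is a depth-first topological sort over the `$`-refs found in each source. The sketch below is an assumption-laden illustration, not the library's walker: `refsOf` and `resolutionOrder` are hypothetical names, every source is assumed to be a pipeline string, and every `$name` is assumed to point at another source.

```typescript
type Sources = Record<string, string>

// Collect the names referenced as $name or $name.some.path in a pipeline string.
function refsOf(pipeline: string): string[] {
  const re = /\$([A-Za-z_]\w*)/g
  const names: string[] = []
  let m: RegExpExecArray | null
  while ((m = re.exec(pipeline)) !== null) names.push(m[1])
  return names
}

// Return source names ordered so that each one comes after its dependencies.
// Throws on unknown refs and on cycles (the consistency checks).
function resolutionOrder(sources: Sources): string[] {
  const order: string[] = []
  const state: Record<string, 'visiting' | 'done'> = {}

  const visit = (name: string, path: string[]): void => {
    if (state[name] === 'done') return
    if (state[name] === 'visiting') {
      throw new Error(`Cyclic reference: ${[...path, name].join(' -> ')}`)
    }
    if (!Object.prototype.hasOwnProperty.call(sources, name)) {
      throw new Error(`Unknown source: ${name}`)
    }
    state[name] = 'visiting'
    for (const dep of refsOf(sources[name])) visit(dep, [...path, name])
    state[name] = 'done'
    order.push(name) // dependencies are pushed before their dependents
  }

  for (const name of Object.keys(sources)) visit(name, [])
  return order
}

console.log(resolutionOrder({
  c: 'get $b > assert type number',
  b: 'json $a',
  a: 'file ./file.json utf8'
})) // [ 'a', 'b', 'c' ]
```

The resulting order is the sequence in which pipelines could be executed, so that every `$ref` already has a value by the time it is needed.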

🚧 Status

Working draft. The API may change significantly.

Key features

Declarative notation. Atomic transformations. No syntax bloating by design.
Injecting values using dot-prop paths
Explicit CLI-like pipelines
Customizable transformers (aka cmds)

Install

yarn add topoconfig@draft

Usage

import {topoconfig} from 'topoconfig'
import * as cmds from '@topoconfig/cmds' // optional

const config = await topoconfig({
  // define functions to use in pipelines: sync or async
  cmds: {
    foo: () => 'bar',
    baz: async (v) => v + 'qux',
    ...cmds
  },
  // pipelines to resolve intermediate variables
  sources: {
    a: 'foo > baz', // pipeline returns 'barqux'
    b: {            // b refers to b.data
      data: {
        c: {
          d: 'e'
        }
      }
    }
  },
  // output value
  data: {
    // $name.inner.path populates var ref with its value
    x: '$b.c.d',  // 'e'
    y: {
      z: '$a'     // 'barqux'
    }
  }
})

Customization

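Just as bash allows you to use any commands from the environment, so does topoconfig. Declare custom handlers for your pipelines. A real-world usage example may look like this:

// Third-party helpers used by the custom cmds below
import fs from 'node:fs/promises'
import dot from 'dot'
import lodash from 'lodash'
import minimist from 'minimist'
import {topoconfig} from 'topoconfig'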

const config = await topoconfig({
  data: {
    foo:      '$a',
    url:      'https://some.url',
    param:    'regular param value',
    num:      123,
    pwd:      '\\$to.prevent.value.inject.use.\.prefix',
    a: {
      b:      '$b.some.nested.prop.value.of.b',
      c:      '$external.prop.of.prop'
    },
    log: {
      level:  `$loglevel`
    }
  },
  sources: {
    a:        'file ./file.json utf8',
    b:        'json $a',
    c:        'get $b > assert type number',
    cwd:      'cwd',
    schema:   'file $cwd/schema.json utf8 > json',
    external: 'fetch http://foo.example.com > get .body > json > get .prop > ajv $schema',
    extended: 'extend $b $external',
    loglevel: 'find $env.LOG_LEVEL $argv.log-level $argv.log.level info',
    template: `dot {{? $name }}
<div>Oh, I love your name, {{=$name}}!</div>
{{?? $age === 0}}
<div>Guess nobody named you yet!</div>
{{??}}
You are {{=$age}} and still don't have a name?
{{?}} > assert $foo`,
  },
  cmds: {
    // http://olado.github.io/doT/index.html
    dot:      (...chunks) => dot.template(chunks.join(' '))({}),
    extend:   Object.assign,
    cwd:      () => process.cwd(),
    file:     (file, opts) => fs.readFile(file, opts),
    json:     JSON.parse,
    get:      lodash.get,
    argv:     () => minimist(process.argv.slice(2)),
    env:      () => process.env,
    find:     (...args) => args.find(Boolean),
    fetch:    async (url) => {
      const res = await fetch(url)
      const code = res.status
      const headers = Object.fromEntries(res.headers)
      const body = await res.text()

      return {
        res,
        headers,
        body,
        code
      }
    },
    //...
  }
})

You can also use the default @topoconfig/cmds preset as a shortcut or create your own. No limitations.

import {topoconfig} from 'topoconfig'
import * as cmds from '@topoconfig/cmds'

const config = await topoconfig<ReturnType<typeof cmds.conf>>({
  cmds,
  data: '$output',
  sources: {
    // resolve a config file name by env profile 
    env: 'env',
    name: 'dot {{ $env.ENVIRONMENT_PROFILE_NAME || "config" }}.json',

    // read the config as json
    config: 'file $name > json',
    // read its schema
    schema: 'file schema.json > json',

    // and finally wrap the result with Conf API
    // https://github.com/antongolub/misc/tree/master/packages/topoconfig/cmds#conf
    output: 'conf $config $schema',
  }
})

Implementation Notes

export type TData = number | string | { [key: string]: TData } | { [key: number]: TData }
export type TCmd = (...opts: any[]) => any
export type TCmds = Record<string | symbol, TCmd>
export type TConfigDeclaration = {
  data: TData,
  sources?: Record<string, string | TConfigDeclaration>
  cmds?: TCmds
}

TConfigDeclaration defines two main sections, data and sources, plus an optional cmds map:

data describes how to build the result value based on the bound sources: it populates $-prefixed refs with their values in every place.
sources is a map that declares how to resolve intermediate values through a composition of cmd calls: fetching data from a remote, reading from a file, converting, etc.

{
  "data": "$res",
  "sources": {
    "res": "fetch https://example.com > get .body > json"
  }
}
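
How data gets populated from already-resolved sources can be sketched as a recursive walk. This is a simplified illustration rather than the library's code: `inject` and `getByPath` are hypothetical names, and the sketch only handles strings that consist of a single whole ref such as `$b.c.d`.

```typescript
type Data = number | string | { [key: string]: Data }

// Read a nested value by path segments, e.g. getByPath(obj, ['c', 'd']).
function getByPath(value: unknown, path: string[]): unknown {
  return path.reduce<unknown>(
    (acc, key) =>
      acc !== null && typeof acc === 'object'
        ? (acc as Record<string, unknown>)[key]
        : undefined,
    value
  )
}

// Walk the data tree and replace every whole-string $ref with its value.
function inject(data: Data, vars: Record<string, unknown>): unknown {
  if (typeof data === 'number') return data
  if (typeof data === 'string') {
    const m = /^\$([A-Za-z_]\w*)((?:\.\w+)*)$/.exec(data)
    if (!m) return data // a regular value, not a ref
    const path = m[2] ? m[2].slice(1).split('.') : []
    return getByPath(vars[m[1]], path)
  }
  const out: Record<string, unknown> = {}
  for (const key of Object.keys(data)) out[key] = inject(data[key], vars)
  return out
}

// Values as they would look after the sources from the Usage section are resolved
const vars = { a: 'barqux', b: { c: { d: 'e' } } }
console.log(inject({ x: '$b.c.d', y: { z: '$a' } }, vars))
// { x: 'e', y: { z: 'barqux' } }
```

Because a whole-string ref is replaced by the referenced value itself, non-string values (objects, numbers) keep their type in the output.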

cmd is a provider that performs a specific action.

type TCmd = (...opts: any[]) => any

directive is a template for defining a value transformation pipeline.

// fetch http://foo.example.com > get .body > json > get .prop
// ↑ cmd ↑opts                  ↑ pipes delimiter

Pipings

The first queued cmd operates only on explicitly declared params: cmd foo bar invokes cmd('foo', 'bar'). Each subsequent cmd receives the result of the previous call as its first argument, followed by its own declared params: cmd1 foo bar > cmd2 baz is transformed to cmd2(cmd1('foo', 'bar'), 'baz').
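
The piping rule can be sketched as a small reducer. This is a minimal illustration under stated assumptions, not the library's parser: `runPipeline` is a hypothetical name, cmds are assumed to be synchronous, params are split on whitespace without quoting, and `$ref` substitution is left out.

```typescript
type Cmd = (...opts: any[]) => any
type Cmds = Record<string, Cmd>

// Split a directive on '>' and fold the chunks left to right.
function runPipeline(directive: string, cmds: Cmds): unknown {
  return directive
    .split('>')
    .map((chunk) => chunk.trim().split(/\s+/))
    .reduce<unknown>((prev, [name, ...params], index) => {
      const cmd = cmds[name]
      if (typeof cmd !== 'function') throw new Error(`Unknown cmd: ${name}`)
      // The first cmd gets only its declared params;
      // every later cmd gets the previous result first.
      return index === 0 ? cmd(...params) : cmd(prev, ...params)
    }, undefined)
}

const cmds: Cmds = {
  foo: () => 'bar',
  baz: (v: string) => v + 'qux',
  cmd1: (a: string, b: string) => a + b,
  cmd2: (prev: string, suffix: string) => `${prev}:${suffix}`
}

console.log(runPipeline('foo > baz', cmds))               // barqux
console.log(runPipeline('cmd1 foo bar > cmd2 baz', cmds)) // foobar:baz
```

Supporting async cmds would mean awaiting each step inside the fold; it is omitted here to keep the sketch short.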

Next steps

Add ternaries: cmd ? cmd1 > cmd2 ... : cmd
Handle or statement: cmd > cmd || cmd > cmd
🚧 Provide commands presets: import {cmds} from 'topoconfig/cmds' or @topoconfig/cmds
Provide lazy-loading for cmds:

{
  cmds: {
    foo: 'some/package',
    bar: './local/plugin.js'
  }
}

Provide pipeline factories as cmd declaration.

{
  cmds: {
    readjson: 'path $0 resolve > file $1 > json'
  }
}

Use vars as cmd refs:

{
  sources: {
    files: 'glob ./*.*',
    reader: 'detect $files',
    foo: 'file $files.0 > $reader'
  }
}

Bring something like watchers to trigger graph re-resolution from the specified vertex.

Refs and Inspirations

License

MIT
