0.0.2 • Published 4 years ago

arqe v0.0.2

License: MIT

arqe - Abstract Relational Query Engine

General purpose engine for making anything available through a relational query interface.

Provides:

  • A query engine that handles parsing, planning, and execution.
  • A pretty easy API for mounting your own custom tables into the system.
  • Standard tables for things like accessing the filesystem, network, caching, etc.

Current project status: Not ready to use in any useful way yet.

TODO: Planning to add some brief usage examples here.

Inspirations and Influences

Goals

  • Unify and simplify the modern application development process.
  • Store code as structured relational data.
  • Unified database for all the metadata around code: configs, feature flags, branching, versioning, a/b traffic splitting, etc.
  • Eliminate the "build step" as we know it, instead have a single fluid system that understands stages.
  • Functional-reactive database with efficient change propagation.
  • Enable amazing developer dashboards and tools.

Elaboration on: Unified system for code metadata

With modern practices, a typical software product is a combination of a few things:

  • The source code
  • Configuration data
  • Versioning / deployment state

Each of these pieces of data can vary across multiple dimensions:

  • Source code varies across branches.
  • Configuration varies across environments or branches.
  • Versioning state has different styles of how it changes, such as canary or blue/green deployments.
  • Additionally there can be per-user code or configuration changes, also called traffic splitting or A/B testing.

Within that there are some finer-grained breakdowns. Configuration can be divided into "slow-changing" data (such as deployed XML / YAML files) and "fast-changing" data (such as feature flags / feature toggles).

The old-fashioned perspective is to think of just the source code as "the program", and everything else as a less significant detail. But increasingly, we're realizing that the real "program" is the amalgamation of everything listed above. Especially with the popularity of microservices, where versioning / deployment effects become a lot more significant, we're finding that the program running on our desk is a distant representation of the experience that our users get.

So the problem statement is:

1) The boundaries between those various systems are poorly defined. For example, it's fuzzy whether a piece of data should live in code or in configuration.
2) A lot of these systems have similar capabilities; the common thread is that some data varies across some dimensions. But typically those systems are implemented with completely different technologies, which leads to a lot of complexity and duplicated effort.

So this project tries to solve the problem by implementing a single database layer that stores everything mentioned above. A defining feature in this system is that any piece of data can vary across multiple dimensions. The dimensions might be: branches, environments, data centers, canaries, traffic splits.
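To make "any piece of data can vary across multiple dimensions" a bit more concrete, here is a minimal sketch in TypeScript. It is not Arqe's API: the names `DimensionalStore`, `Scope`, `set`, and `get` are hypothetical, and the "most specific scope wins" rule is an assumption chosen for illustration.

```typescript
// Hypothetical names throughout; this is an illustration, not Arqe's API.

// A scope pins a value to some dimensions, e.g. { environment: "prod" }.
// An empty scope means the value applies everywhere.
type Scope = Record<string, string>;

interface Entry<T> {
  key: string;
  scope: Scope;
  value: T;
}

class DimensionalStore<T> {
  private entries: Entry<T>[] = [];

  set(key: string, scope: Scope, value: T): void {
    this.entries.push({ key, scope, value });
  }

  // Return the value whose scope agrees with the context on every dimension
  // it names, preferring the scope that names the most dimensions.
  // Ties go to the entry that was stored first.
  get(key: string, context: Scope): T | undefined {
    let best: Entry<T> | undefined;
    for (const entry of this.entries) {
      if (entry.key !== key) continue;
      const dims = Object.keys(entry.scope);
      const matches = dims.every((d) => context[d] === entry.scope[d]);
      if (!matches) continue;
      if (best === undefined || dims.length > Object.keys(best.scope).length) {
        best = entry;
      }
    }
    return best?.value;
  }
}

// One setting, stored once per scope instead of in three separate systems.
const store = new DimensionalStore<number>();
store.set("timeoutMs", {}, 1000);
store.set("timeoutMs", { environment: "prod" }, 500);
store.set("timeoutMs", { environment: "prod", canary: "true" }, 250);

const devTimeout = store.get("timeoutMs", { environment: "dev", branch: "main" });
const prodTimeout = store.get("timeoutMs", { environment: "prod", branch: "main" });
const canaryTimeout = store.get("timeoutMs", { environment: "prod", canary: "true" });
```

In this sketch the same lookup serves what would otherwise be a default in code, an environment config file, and a canary override. A real system would also need a rule for conflicting scopes of equal specificity, which is left out here.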

Elaboration on: Removing the build system as we know it

Typical software projects have another duplication of effort - using completely different technologies for the build process compared to the application itself. This means we have roughly double the toolset to manage. And, it leads to awkward situations where we might want to move a certain task from build-time to runtime, or vice-versa, but it's not as easy as it should be.

  • If we want the system to be more "live", where we can change anything and see the results quickly, then we need to move tasks from build-time to runtime. Potentially the entire application might need to rebuild while we're running it.
  • If we want to have good runtime performance, we often need to move tasks from runtime to build-time. A specific example in the web world is taking client-side HTML rendering and moving it to happen ahead of time on the server (SSR).

With Arqe we're hoping to build a single unified system for managing data. For each data query, the system could understand whether it can be performed at build-time versus runtime. And since the stage is a reified attribute, we could add more build stages as needed. Then the system knows enough that it can efficiently pregenerate any arbitrary derived data for a certain stage.
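As a rough sketch of what a reified stage could look like, the TypeScript below tags each derived query with the earliest stage at which it can be answered, then pregenerates everything available at a given stage. This is an assumption-laden illustration rather than Arqe's implementation: the stage names, `DerivedQuery`, `earliestStage`, and `pregenerate` are all hypothetical.

```typescript
// Hypothetical names; a sketch of the idea, not Arqe's implementation.

// Stages are plain data, ordered earliest first, so more can be added.
const stages = ["build", "deploy", "runtime"] as const;
type Stage = (typeof stages)[number];

interface DerivedQuery {
  name: string;
  earliestStage: Stage; // first stage at which all of the query's inputs are known
  compute: () => unknown;
}

// Evaluate every query that can already be answered at `stage`,
// and leave the rest to be computed at a later stage.
function pregenerate(queries: DerivedQuery[], stage: Stage): Map<string, unknown> {
  const limit = stages.indexOf(stage);
  const results = new Map<string, unknown>();
  for (const q of queries) {
    if (stages.indexOf(q.earliestStage) <= limit) {
      results.set(q.name, q.compute());
    }
  }
  return results;
}

const queries: DerivedQuery[] = [
  { name: "renderedHomepage", earliestStage: "build", compute: () => "<h1>Home</h1>" },
  { name: "regionEndpoint", earliestStage: "deploy", compute: () => "https://eu.example.com" },
  { name: "currentUserGreeting", earliestStage: "runtime", compute: () => "Hello, visitor" },
];

const atBuild = pregenerate(queries, "build");
const atDeploy = pregenerate(queries, "deploy");
```

Moving a task between build-time and runtime then amounts to changing its `earliestStage` value, rather than porting it between two toolchains.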