@gratico/satya

Satya is a distributed database that uses Apache Arrow as its storage format and aims to support both OLTP (online transaction processing) and OLAP (online analytical processing) workloads. Its approach is similar to, but independent of, the one described by Tianyu Li et al.: it employs a columnar format and builds an MVCC transaction manager on top of it.
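As a rough illustration of what an MVCC transaction manager does, the sketch below keeps a chain of committed versions per key, lets each transaction read from the snapshot taken when it began, and validates optimistically at commit time (first committer wins). Every name here (`MvccStore`, `Transaction`, `readAt`, and so on) is hypothetical and chosen for this example; it is not Satya's API, and it uses a plain in-memory map rather than Arrow buffers.

```typescript
// Hypothetical sketch of snapshot isolation with optimistic validation.
// None of these names come from Satya itself.
interface Version<T> {
  value: T;
  commitTs: number; // commit timestamp that produced this version
}

class MvccStore<T> {
  private versions = new Map<string, Version<T>[]>(); // oldest to newest
  private clock = 0; // timestamp of the most recent commit

  begin(): Transaction<T> {
    // A transaction sees everything committed up to this point.
    return new Transaction<T>(this, this.clock);
  }

  readAt(key: string, snapshotTs: number): T | undefined {
    const chain = this.versions.get(key) ?? [];
    // Walk from newest to oldest and return the first visible version.
    for (let i = chain.length - 1; i >= 0; i--) {
      const version = chain[i];
      if (version !== undefined && version.commitTs <= snapshotTs) {
        return version.value;
      }
    }
    return undefined;
  }

  commit(snapshotTs: number, writes: Map<string, T>): boolean {
    // Validation: abort if any written key changed after our snapshot.
    let conflict = false;
    writes.forEach((_value, key) => {
      const chain = this.versions.get(key) ?? [];
      const latest = chain[chain.length - 1];
      if (latest !== undefined && latest.commitTs > snapshotTs) conflict = true;
    });
    if (conflict) return false;

    // Install new versions; old ones stay readable by older snapshots.
    const commitTs = ++this.clock;
    writes.forEach((value, key) => {
      const chain = this.versions.get(key) ?? [];
      chain.push({ value, commitTs });
      this.versions.set(key, chain);
    });
    return true;
  }
}

class Transaction<T> {
  private store: MvccStore<T>;
  private snapshotTs: number;
  private writes = new Map<string, T>(); // buffered until commit

  constructor(store: MvccStore<T>, snapshotTs: number) {
    this.store = store;
    this.snapshotTs = snapshotTs;
  }

  read(key: string): T | undefined {
    // Read your own writes first, then fall back to the snapshot.
    if (this.writes.has(key)) return this.writes.get(key);
    return this.store.readAt(key, this.snapshotTs);
  }

  write(key: string, value: T): void {
    this.writes.set(key, value);
  }

  commit(): boolean {
    return this.store.commit(this.snapshotTs, this.writes);
  }
}

// Usage: two transactions start from the same snapshot and write the same key.
const store = new MvccStore<number>();
const setup = store.begin();
setup.write("balance", 100);
setup.commit();

const t1 = store.begin();
const t2 = store.begin();
t1.write("balance", 150);
const t1Committed = t1.commit();
const t2Read = t2.read("balance"); // still reads from t2's own snapshot
t2.write("balance", 90);
const t2Committed = t2.commit(); // fails validation: t1 committed first
```

The second writer is rejected rather than blocked, which is the usual trade-off of optimistic schemes: no locks are held while a transaction runs, at the cost of retries under contention.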

The unconventional choice of columnar storage (versus row storage) is aimed at supporting OLAP workloads. Columnar storage introduces write amplification in typical OLTP workloads, so Satya employs a number of techniques (for example, doing away with Arrow's variable-length types by using dictionary encoding) to make it suitable for such workloads.
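To make the dictionary-encoding idea concrete, here is a minimal sketch in plain TypeScript. It assumes a column is represented as an append-only list of distinct strings plus a fixed-width `Int32Array` of indices; the `DictionaryColumn` type and the helper functions are hypothetical names for this example and do not use the apache-arrow library or Satya's own code. The point it illustrates is that updating a row touches a single 4-byte slot, whereas a variable-length layout (an offsets buffer plus a contiguous data buffer) may need later bytes shifted when a value changes length.

```typescript
// Hypothetical sketch: dictionary-encode a string column so that every
// row occupies one fixed-width slot. Not Satya's actual implementation.
interface DictionaryColumn {
  dictionary: string[]; // distinct values, append-only
  indices: Int32Array; // one fixed-width code per row
}

function encodeDictionary(values: string[]): DictionaryColumn {
  const dictionary: string[] = [];
  const lookup = new Map<string, number>();
  const indices = new Int32Array(values.length);
  values.forEach((value, row) => {
    let code = lookup.get(value);
    if (code === undefined) {
      // First time we see this value: assign the next code.
      code = dictionary.length;
      dictionary.push(value);
      lookup.set(value, code);
    }
    indices[row] = code;
  });
  return { dictionary, indices };
}

// In-place update: only the row's index slot is rewritten. A linear
// dictionary scan keeps the sketch short; a real system would keep a map.
function updateRow(column: DictionaryColumn, row: number, value: string): void {
  if (row < 0 || row >= column.indices.length) {
    throw new RangeError(`row ${row} is out of bounds`);
  }
  let code = column.dictionary.indexOf(value);
  if (code === -1) {
    code = column.dictionary.length;
    column.dictionary.push(value);
  }
  column.indices[row] = code;
}

function decodeRow(column: DictionaryColumn, row: number): string {
  const code = column.indices[row];
  const value = code === undefined ? undefined : column.dictionary[code];
  if (value === undefined) throw new RangeError(`row ${row} is out of bounds`);
  return value;
}

// Usage: overwrite row 1 with a value that has a different length.
const city = encodeDictionary(["Pune", "Oslo", "Pune", "Lima"]);
updateRow(city, 1, "Lima");
const updated = decodeRow(city, 1);
```

In this sketch the dictionary only grows, so unused entries (here "Oslo") linger until some compaction step removes them; how Satya handles that is not covered by this README.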

The choice of Apache Arrow as a storage format is motivated by a desire to have a composable database system with extremely loose coupling. It therefore supports either [https://github.com/duckdb/duckdb] or [https://github.com/apache/arrow-datafusion] as its query engine.

** High level goals

Full ACID compliance
- Snapshot Isolation
- Optimistic Concurrency Control
  - Multi-Version Concurrency Control
Apache Arrow compatible columnar storage format
Pluggable storage backend (file system, Redis, or anything else)
Pluggable consensus manager
- CAS (compare-and-swap) using SharedArrayBuffer in browsers
- Paxos in other environments
Support for Larger-than-Memory Databases
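
The compare-and-swap goal listed above can be sketched with the standard `Atomics.compareExchange` call on an `Int32Array` backed by a `SharedArrayBuffer`. The layout (slot 0 holding the latest committed version number) and the `tryAdvance` helper are assumptions made for this example, not Satya's consensus manager. For brevity both "writers" run on one thread; in a browser they would typically be Web Workers sharing the same buffer, which requires the page to be cross-origin isolated.

```typescript
// Hypothetical sketch: agree on the next committed version with CAS.
// Slot 0 of the shared array holds the latest committed version number.
const shared = new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT);
const head = new Int32Array(shared);

function tryAdvance(view: Int32Array, expected: number, next: number): boolean {
  // compareExchange returns the value that was stored before the call;
  // the swap happened only if that value equals what we expected.
  return Atomics.compareExchange(view, 0, expected, next) === expected;
}

// Both writers observed version 0 before trying to commit.
const seen = Atomics.load(head, 0);
const writerA = tryAdvance(head, seen, 1); // wins the race
const writerB = tryAdvance(head, seen, 2); // loses: slot no longer holds 0
const finalVersion = Atomics.load(head, 0);
```

A losing writer would normally reload the current version, re-validate its changes against it, and retry.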

** References

[https://db.cs.cmu.edu/papers/2020/p534-li.pdf]
[https://15799.courses.cs.cmu.edu/fall2013/static/papers/p731-sikka.pdf]
