0.8.3 • Published 2 years ago

@gratico/satya v0.8.3

License: UNLICENSED
Last release: 2 years ago

Satya is a distributed database that uses Apache Arrow as its storage format and aims to support both OLTP (online transaction processing) and OLAP (online analytical processing) workloads. It takes an approach similar to, but independent of, the one described by Tianyu Li et al.: it adopts a columnar format and builds an MVCC transaction manager on top of it.
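As a rough illustration of what an MVCC transaction manager layered over a column can look like, here is a minimal in-memory sketch. It is not Satya's actual API: the names MvccColumn, TxManager and Tx, the single numeric column, and the "first committer wins" validation rule are all assumptions made for the example.

```typescript
// Hypothetical sketch, not Satya's API: snapshot isolation with optimistic
// validation over a single numeric column.
type Version = { value: number; beginTs: number };

class MvccColumn {
  // One version chain per row slot, oldest first, newest last.
  private chains: Version[][] = [];

  // Return the newest version committed at or before the snapshot.
  read(row: number, snapshotTs: number): number | undefined {
    const chain = this.chains[row] || [];
    for (let i = chain.length - 1; i >= 0; i--) {
      if (chain[i].beginTs <= snapshotTs) return chain[i].value;
    }
    return undefined;
  }

  // Commit timestamp of the newest version of a row (0 if never written).
  lastCommitTs(row: number): number {
    const chain = this.chains[row];
    return chain && chain.length > 0 ? chain[chain.length - 1].beginTs : 0;
  }

  install(row: number, value: number, commitTs: number): void {
    if (!this.chains[row]) this.chains[row] = [];
    this.chains[row].push({ value, beginTs: commitTs });
  }
}

class Tx {
  readonly snapshotTs: number;
  readonly writes = new Map<number, number>(); // private write buffer
  private column: MvccColumn;
  private manager: TxManager;

  constructor(snapshotTs: number, column: MvccColumn, manager: TxManager) {
    this.snapshotTs = snapshotTs;
    this.column = column;
    this.manager = manager;
  }

  // Reads see the transaction's own writes first, then its snapshot.
  read(row: number): number | undefined {
    return this.writes.has(row)
      ? this.writes.get(row)
      : this.column.read(row, this.snapshotTs);
  }

  write(row: number, value: number): void {
    this.writes.set(row, value);
  }

  commit(): boolean {
    return this.manager.commit(this);
  }
}

class TxManager {
  private clock = 0;
  private column: MvccColumn;

  constructor(column: MvccColumn) {
    this.column = column;
  }

  begin(): Tx {
    return new Tx(this.clock, this.column, this);
  }

  // Optimistic validation: abort if any written row was committed by
  // another transaction after this transaction's snapshot was taken.
  commit(tx: Tx): boolean {
    let conflict = false;
    tx.writes.forEach((_value, row) => {
      if (this.column.lastCommitTs(row) > tx.snapshotTs) conflict = true;
    });
    if (conflict) return false;
    const commitTs = ++this.clock;
    tx.writes.forEach((value, row) => this.column.install(row, value, commitTs));
    return true;
  }
}
```

In this sketch, two transactions that start from the same snapshot and write the same row cannot both commit: the second one fails validation and would have to retry, while readers keep seeing the version that was current when they began.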

The unconventional choice of columnar storage (rather than row storage) is aimed at supporting OLAP workloads. Columnar storage introduces write amplification in typical OLTP workloads, so Satya employs a number of techniques (for example, doing away with Arrow's variable-length types by using dictionary encoding) to make it suitable for such OLTP workloads.
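To show why dictionary encoding helps here, the following sketch stores a string column as fixed-width integer codes plus a dictionary, so updating a row overwrites one 4-byte slot instead of shifting variable-length data. The class name DictionaryColumn and the use of -1 as a "no value" marker are assumptions for illustration; Arrow itself tracks nulls with a validity bitmap, and Satya's real layout may differ.

```typescript
// Hypothetical sketch: a string column stored as fixed-width dictionary codes.
class DictionaryColumn {
  private dictionary: string[] = [];           // code -> value
  private lookup = new Map<string, number>();  // value -> code
  private codes: Int32Array;                   // one fixed-width slot per row

  constructor(capacity: number) {
    // -1 marks an empty slot (an assumption made for this sketch).
    this.codes = new Int32Array(capacity).fill(-1);
  }

  // Return the existing code for a value, or append it to the dictionary.
  private intern(value: string): number {
    let code = this.lookup.get(value);
    if (code === undefined) {
      code = this.dictionary.length;
      this.dictionary.push(value);
      this.lookup.set(value, code);
    }
    return code;
  }

  // An update rewrites a single 4-byte slot, whatever the string's length.
  set(row: number, value: string): void {
    this.codes[row] = this.intern(value);
  }

  get(row: number): string | undefined {
    const code = this.codes[row];
    return code < 0 ? undefined : this.dictionary[code];
  }

  dictionarySize(): number {
    return this.dictionary.length;
  }
}
```

Repeated values share one dictionary entry, and replacing a short string with a much longer one leaves every other row's slot untouched.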

The choice of Apache Arrow as the storage format is motivated by a desire for a composable database system with extremely loose coupling. Satya can therefore use either DuckDB [https://github.com/duckdb/duckdb] or DataFusion [https://github.com/apache/arrow-datafusion] as its query engine.

** High-level goals

  • Full ACID compliance
    • Snapshot Isolation - Optimistic Concurrency Control
      - Multi-Version Concurrency Control
  • Apache Arrow compatible columnar storage format
  • Pluggable storage backend (file system, Redis, or anything else)
  • Pluggable consensus manager
    • CAS (compare-and-swap) using SharedArrayBuffer in browsers
    • Paxos in other environments
  • Support for Larger-than-Memory Databases
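
The compare-and-swap goal listed above can be sketched with the standard Atomics API over a SharedArrayBuffer. This is only an illustration of the primitive, not Satya's consensus manager: the class name CasVersionRegister and the idea of guarding a single "current version" counter are assumptions. Note that browsers generally expose SharedArrayBuffer only to cross-origin isolated pages.

```typescript
// Hypothetical sketch: a shared version counter advanced by compare-and-swap.
class CasVersionRegister {
  private cell: Int32Array;

  // Workers that receive the same SharedArrayBuffer see the same counter.
  constructor(buffer: SharedArrayBuffer = new SharedArrayBuffer(4)) {
    this.cell = new Int32Array(buffer);
  }

  current(): number {
    return Atomics.load(this.cell, 0);
  }

  // Atomically move the counter from `expected` to `next`.
  // Atomics.compareExchange returns the value that was in the cell before
  // the call, so the swap succeeded only if that value equals `expected`.
  tryAdvance(expected: number, next: number): boolean {
    return Atomics.compareExchange(this.cell, 0, expected, next) === expected;
  }
}
```

A committer that loses the race gets false back and can reload current() before retrying, which pairs naturally with optimistic concurrency control.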

** References