2.10.92 • Published 2 years ago

pg-tube v2.10.92

Weekly downloads
-
License
UNLICENSED
Repository
-
Last release
2 years ago

pg-tube: collect touch events from tables

  • "Tube" is internally a queue table with batches of touch event ids ("pods") from some table/shard.
  • Ideologically, "tube" is a user-space logical replication slot which only records touch events (ids).
  • Tubes can be added (tube_ensure_exists(tube, partitions)) and removed (tube_ensure_absent(tube)) at run-time. When removing, all the tables are automatically detached from the tube.
  • Tubes and attached tables can be enumerated using tube_list() function.
  • Each tube table has 5 main columns: 1) some auto-incremental sequence, 2) shard (some number which is assigned to a table when it's attached), 3) operation type (with a special operations '}' to denote the backfill ending), 4) an array (chunk) of bigint ids, and 5) some optional JSON payload.
  • Pods can be "regular" (contain non-empty ids) and "control" (have empty ids and non-empty payload). Control pods are: "backfill_schedule", "backfill_end".
  • An arbitrary table can be "attached" to a tube (tube_table_ensure_attached(tube, table, shard)). When attaching, you must also provide some "shard number" value which will then be recorded in the tube along with ids. (If shard number is not provided, pg-tube tries to infer it from the table name's numeric suffix.) Once attached, all insert/update/delete operations will result into adding a "touch" pod to the corresponding tube.
  • A table can also be "detached" from a tube (with tube_table_ensure_detached(tube, table)).
  • To run the initial piping, backfill function should be called in separate transactions, one after another (tube_backfill_step1(), tube_backfill_step2() and then tube_backfill_step3(step1_result, tube, table, order_col, shard)). This will bulk-insert all the ids from the table to the tube (op='B') record. Ordering by the provided column in step3 is important, because for encrypted databases (e.g. ECIES), ephemeral shared key for neighbor rows will likely be cached with short expiration time. So the closer the updates we process are in time, the higher is the chance to have a Diffie-Hellman hit.
  • There is tube_stats() function which shows all the details about the current tubes structure and content.

In its nature, tubes are eventual-consistent, mainly because of enrichment process which takes time and delivers the results with unpredictable latency. It means that there is absolutely no way to rely on some external source of order in the events (like row version), and the only way to solve it all is to never replay the same id concurrently. We also strictly rely on the fact that the downstream for the replay should be eventual-consistent and e.g. never reorder writes A and B for the same id once an ack from write A is received.

The application code should constantly monitor all existing tubes. One tube may correspond to one logical unit of work (e.g. we may have a live ES index, a spare ES index and a cache-invalidation tube, thus 3 tubes). For each tube, it spawns a beforehand-known number of separate replication stream workers, each worker processing only ids from their own set of shards (e.g. a "shard % 3" formula can be used as a function to allocate different touch events to different workers with numbers 0, 1, 2). Each worker reads ids from the tube, processes them in a strictly serial way ordering from lowest seq to highest (so it's guaranteed that there is never a concurrency when processing the same id) and removes from the tube. To reach eventual consistency, the workers must do enrichment (i.e. loading the actual data from the DB after receiving a touch event for some id). It's critical that each shard is processed by exactly one worker (no parallelism within one shard), and that the same id is never processed concurrently; having these assumptions allows us to NOT keep record versions in the tube and just rely on natural eventual consistency ordering (we can't even count on whether a record in the tube corresponds to a deletion or to an insert/update).

Backfill events (op='B') are processed in the exact same way as insert/update/delete/touch events, but with lower priority (the application uses ORDER BY op='B', seq clause in SELECT which matches the index exactly). Also, once a control pod with type=backfill_end, start_seq=S payload is received, it signals the worker that pods with seq < S are not needed on the downstream anymore, and they should be removed (garbage collected).

Testing

Use yarn test and yarn test:db to run the automated tests.

2.10.90

2 years ago

2.10.91

2 years ago

2.10.92

2 years ago

2.10.70

2 years ago

2.10.71

2 years ago

2.10.72

2 years ago

2.10.73

2 years ago

2.10.74

2 years ago

2.10.75

2 years ago

2.10.76

2 years ago

2.10.77

2 years ago

2.10.79

2 years ago

2.10.80

2 years ago

2.10.81

2 years ago

2.10.82

2 years ago

2.10.83

2 years ago

2.10.84

2 years ago

2.10.85

2 years ago

2.10.86

2 years ago

2.10.87

2 years ago

2.10.88

2 years ago

2.10.89

2 years ago

2.10.53

2 years ago

2.10.54

2 years ago

2.10.55

2 years ago

2.10.56

2 years ago

2.10.57

2 years ago

2.10.58

2 years ago

2.10.59

2 years ago

2.10.60

2 years ago

2.10.61

2 years ago

2.10.62

2 years ago

2.10.63

2 years ago

2.10.64

2 years ago

2.10.65

2 years ago

2.10.67

2 years ago

2.10.68

2 years ago

2.10.69

2 years ago

2.10.30

2 years ago

2.10.31

2 years ago

2.10.32

2 years ago

2.10.33

2 years ago

2.10.34

2 years ago

2.10.35

2 years ago

2.10.36

2 years ago

2.10.37

2 years ago

2.10.38

2 years ago

2.10.39

2 years ago

2.10.40

2 years ago

2.10.41

2 years ago

2.10.42

2 years ago

2.10.43

2 years ago

2.10.44

2 years ago

2.10.45

2 years ago

2.10.46

2 years ago

2.10.47

2 years ago

2.10.48

2 years ago

2.10.49

2 years ago

2.10.1

2 years ago

2.10.2

2 years ago

2.10.10

2 years ago

2.10.11

2 years ago

2.10.0

2 years ago

2.10.12

2 years ago

2.10.13

2 years ago

2.10.14

2 years ago

2.10.15

2 years ago

2.10.16

2 years ago

2.10.17

2 years ago

2.10.18

2 years ago

2.10.19

2 years ago

2.10.9

2 years ago

2.10.7

2 years ago

2.10.8

2 years ago

2.10.5

2 years ago

2.10.6

2 years ago

2.10.3

2 years ago

2.10.4

2 years ago

2.10.20

2 years ago

2.10.21

2 years ago

2.10.22

2 years ago

2.10.23

2 years ago

2.10.24

2 years ago

2.10.25

2 years ago

2.10.27

2 years ago

2.10.28

2 years ago

2.10.29

2 years ago

2.9.53

2 years ago

2.9.56

2 years ago

2.9.57

2 years ago

2.9.54

2 years ago

2.9.55

2 years ago

2.9.58

2 years ago

2.9.59

2 years ago

2.9.60

2 years ago

2.9.61

2 years ago

2.10.50

2 years ago

2.10.51

2 years ago

2.10.52

2 years ago

2.9.41

2 years ago

2.9.42

2 years ago

2.9.40

2 years ago

2.9.45

2 years ago

2.9.46

2 years ago

2.9.43

2 years ago

2.9.44

2 years ago

2.9.49

2 years ago

2.9.47

2 years ago

2.9.48

2 years ago

2.9.52

2 years ago

2.9.50

2 years ago

2.9.51

2 years ago

2.9.34

2 years ago

2.9.35

2 years ago

2.9.32

2 years ago

2.9.33

2 years ago

2.9.38

2 years ago

2.9.39

2 years ago

2.9.36

2 years ago

2.9.37

2 years ago

2.9.31

2 years ago

2.9.30

2 years ago

2.9.29

2 years ago

2.9.28

2 years ago

2.9.27

2 years ago

2.9.26

2 years ago

2.9.25

2 years ago

2.9.24

2 years ago

2.9.23

2 years ago

2.9.22

2 years ago

2.9.20

2 years ago

2.9.19

2 years ago

2.9.18

2 years ago