13.0.5 • Published 4 years ago

@gived/yjs v13.0.5

Weekly downloads
4
License
MIT
Repository
github
Last release
4 years ago

Yjs

A CRDT framework with a powerful abstraction of shared data

Yjs is a CRDT implementation that exposes its internal data structure as shared types. Shared types are common data types like Map or Array with superpowers: changes are automatically distributed to other peers and merged without merge conflicts.

Yjs is network agnostic (p2p!), supports many existing rich text editors, offline editing, version snapshots, undo/redo and shared cursors. It scales well with an unlimited number of users and is well suited for even large documents.

:construction_worker_woman: If you are looking for professional support to build collaborative or distributed applications ping us at yjs@tag1consulting.com.

Table of Contents

Overview

This repository contains a collection of shared types that can be observed for changes and manipulated concurrently. Network functionality and two-way-bindings are implemented in separate modules.

Bindings

NameCursorsBindingDemo
ProseMirror                                                  y-prosemirrordemo
Quilly-quilldemo
CodeMirrory-codemirrordemo
Monacoy-monacodemo
Acey-acedemo
Textareay-textareademo
DOMy-domdemo

Providers

Setting up the communication between clients, managing awareness information, and storing shared data for offline usage is quite a hassle. Providers manage all that for you and are the perfect starting point for your collaborative app.

Getting Started

Install Yjs and a provider with your favorite package manager:

npm i yjs y-websocket

Start the y-websocket server:

PORT=1234 node ./node_modules/y-websocket/bin/server.js

Example: Observe types

const yarray = doc.getArray('my-array')
yarray.observe(event => {
  console.log('yarray was modified')
})
// every time a local or remote client modifies yarray, the observer is called
yarray.insert(0, ['val']) // => "yarray was modified"

Example: Nest types

Remember, shared types are just plain old data types. The only limitation is that a shared type must exist only once in the shared document.

const ymap = doc.getMap('map')
const foodArray = new Y.Array()
foodArray.insert(0, ['apple', 'banana'])
ymap.set('food', foodArray)
ymap.get('food') === foodArray // => true
ymap.set('fruit', foodArray) // => Error! foodArray is already defined

Now you understand how types are defined on a shared document. Next you can jump to the demo repository or continue reading the API docs.

API

import * as Y from 'yjs'

Shared Types

Y.Doc

const doc = new Y.Doc()

Y.Doc Events

Document Updates

Changes on the shared document are encoded into document updates. Document updates are commutative and idempotent. This means that they can be applied in any order and multiple times.

Example: Listen to update events and apply them on remote client

const doc1 = new Y.Doc()
const doc2 = new Y.Doc()

doc1.on('update', update => {
  Y.applyUpdate(doc2, update)
})

doc2.on('update', update => {
  Y.applyUpdate(doc1, update)
})

// All changes are also applied to the other document
doc1.getArray('myarray').insert(0, ['Hello doc2, you got this?'])
doc2.getArray('myarray').get(0) // => 'Hello doc2, you got this?'

Yjs internally maintains a state vector that denotes the next expected clock from each client. In a different interpretation it holds the number of structs created by each client. When two clients sync, you can either exchange the complete document structure or only the differences by sending the state vector to compute the differences.

Example: Sync two clients by exchanging the complete document structure

const state1 = Y.encodeStateAsUpdate(ydoc1)
const state2 = Y.encodeStateAsUpdate(ydoc2)
Y.applyUpdate(ydoc1, state2)
Y.applyUpdate(ydoc2, state1)

Example: Sync two clients by computing the differences

This example shows how to sync two clients with the minimal amount of exchanged data by computing only the differences using the state vector of the remote client. Syncing clients using the state vector requires another roundtrip, but can safe a lot of bandwidth.

const stateVector1 = Y.encodeStateVector(ydoc1)
const stateVector2 = Y.encodeStateVector(ydoc2)
const diff1 = Y.encodeStateAsUpdate(ydoc1, stateVector2)
const diff2 = Y.encodeStateAsUpdate(ydoc2, stateVector1)
Y.applyUpdate(ydoc1, diff2)
Y.applyUpdate(ydoc2, diff1)

Relative Positions

This API is not stable yet

This feature is intended for managing selections / cursors. When working with other users that manipulate the shared document, you can't trust that an index position (an integer) will stay at the intended location. A relative position is fixated to an element in the shared document and is not affected by remote changes. I.e. given the document "a|c", the relative position is attached to c. When a remote user modifies the document by inserting a character before the cursor, the cursor will stay attached to the character c. insert(1, 'x')("a|c") = "ax|c". When the relative position is set to the end of the document, it will stay attached to the end of the document.

Example: Transform to RelativePosition and back

const relPos = Y.createRelativePositionFromTypeIndex(ytext, 2)
const pos = Y.createAbsolutePositionFromRelativePosition(relPos, doc)
pos.type === ytext // => true
pos.index === 2 // => true

Example: Send relative position to remote client (json)

const relPos = Y.createRelativePositionFromTypeIndex(ytext, 2)
const encodedRelPos = JSON.stringify(relPos)
// send encodedRelPos to remote client..
const parsedRelPos = JSON.parse(encodedRelPos)
const pos = Y.createAbsolutePositionFromRelativePosition(parsedRelPos, remoteDoc)
pos.type === remoteytext // => true
pos.index === 2 // => true

Example: Send relative position to remote client (Uint8Array)

const relPos = Y.createRelativePositionFromTypeIndex(ytext, 2)
const encodedRelPos = Y.encodeRelativePosition(relPos)
// send encodedRelPos to remote client..
const parsedRelPos = Y.decodeRelativePosition(encodedRelPos)
const pos = Y.createAbsolutePositionFromRelativePosition(parsedRelPos, remoteDoc)
pos.type === remoteytext // => true
pos.index === 2 // => true

Y.UndoManager

Yjs ships with an Undo/Redo manager for selective undo/redo of of changes on a Yjs type. The changes can be optionally scoped to transaction origins.

const ytext = doc.getArray('array')
const undoManager = new Y.UndoManager(ytext)

ytext.insert(0, 'abc')
undoManager.undo()
ytext.toString() // => ''
undoManager.redo()
ytext.toString() // => 'abc'

Example: Stop Capturing

UndoManager merges Undo-StackItems if they are created within time-gap smaller than options.captureTimeout. Call um.stopCapturing() so that the next StackItem won't be merged.

// without stopCapturing
ytext.insert(0, 'a')
ytext.insert(1, 'b')
um.undo()
ytext.toString() // => '' (note that 'ab' was removed)
// with stopCapturing
ytext.insert(0, 'a')
um.stopCapturing()
ytext.insert(0, 'b')
um.undo()
ytext.toString() // => 'a' (note that only 'b' was removed)

Example: Specify tracked origins

Every change on the shared document has an origin. If no origin was specified, it defaults to null. By specifying trackedTransactionOrigins you can selectively specify which changes should be tracked by UndoManager. The UndoManager instance is always added to trackedTransactionOrigins.

class CustomBinding {}

const ytext = doc.getArray('array')
const undoManager = new Y.UndoManager(ytext, new Set([42, CustomBinding]))

ytext.insert(0, 'abc')
undoManager.undo()
ytext.toString() // => 'abc' (does not track because origin `null` and not part
                 //           of `trackedTransactionOrigins`)
ytext.delete(0, 3) // revert change

doc.transact(() => {
  ytext.insert(0, 'abc')
}, 42)
undoManager.undo()
ytext.toString() // => '' (tracked because origin is an instance of `trackedTransactionorigins`)

doc.transact(() => {
  ytext.insert(0, 'abc')
}, 41)
undoManager.undo()
ytext.toString() // => '' (not tracked because 41 is not an instance of
                 //        `trackedTransactionorigins`)
ytext.delete(0, 3) // revert change

doc.transact(() => {
  ytext.insert(0, 'abc')
}, new CustomBinding())
undoManager.undo()
ytext.toString() // => '' (tracked because origin is a `CustomBinding` and
                 //        `CustomBinding` is in `trackedTransactionorigins`)

Example: Add additional information to the StackItems

When undoing or redoing a previous action, it is often expected to restore additional meta information like the cursor location or the view on the document. You can assign meta-information to Undo-/Redo-StackItems.

const ytext = doc.getArray('array')
const undoManager = new Y.UndoManager(ytext, new Set([42, CustomBinding]))

undoManager.on('stack-item-added', event => {
  // save the current cursor location on the stack-item
  event.stackItem.meta.set('cursor-location', getRelativeCursorLocation())
})

undoManager.on('stack-item-popped', event => {
  // restore the current cursor location on the stack-item
  restoreCursorLocation(event.stackItem.meta.get('cursor-location'))
})

Miscellaneous

Typescript Declarations

Yjs has type descriptions. But until this ticket is fixed, this is how you can make use of Yjs type declarations.

{
  "compilerOptions": {
    "allowJs": true,
    "checkJs": true,
  },
  "maxNodeModuleJsDepth": 5
}

Yjs CRDT Algorithm

Conflict-free replicated data types (CRDT) for collaborative editing are an alternative approach to operational transformation (OT). A very simple differenciation between the two approaches is that OT attempts to transform index positions to ensure convergence (all clients end up with the same content), while CRDTs use mathematical models that usually do not involve index transformations, like linked lists. OT is currently the de-facto standard for shared editing on text. OT approaches that support shared editing without a central source of truth (a central server) require too much bookkeeping to be viable in practice. CRDTs are better suited for distributed systems, provide additional guarantees that the document can be synced with remote clients, and do not require a central source of truth.

Yjs implements a modified version of the algorithm described in this paper. I will eventually publish a paper that describes why this approach works so well in practice. Note: Since operations make up the document structure, we prefer the term struct now.

CRDTs suitable for shared text editing suffer from the fact that they only grow in size. There are CRDTs that do not grow in size, but they do not have the characteristics that are benificial for shared text editing (like intention preservation). Yjs implements many improvements to the original algorithm that diminish the trade-off that the document only grows in size. We can't garbage collect deleted structs (tombstones) while ensuring a unique order of the structs. But we can 1. merge preceeding structs into a single struct to reduce the amount of meta information, 2. we can delete content from the struct if it is deleted, and 3. we can garbage collect tombstones if we don't care about the order of the structs anymore (e.g. if the parent was deleted).

Examples:

  1. If a user inserts elements in sequence, the struct will be merged into a single struct. E.g. array.insert(0, ['a']), array.insert(0, ['b']); is first represented as two structs ([{id: {client, clock: 0}, content: 'a'}, {id: {client, clock: 1}, content: 'b'}) and then merged into a single struct: [{id: {client, clock: 0}, content: 'ab'}].
  2. When a struct that contains content (e.g. ItemString) is deleted, the struct will be replaced with an ItemDeleted that does not contain content anymore.
  3. When a type is deleted, all child elements are transformed to GC structs. A GC struct only denotes the existence of a struct and that it is deleted. GC structs can always be merged with other GC structs if the id's are adjacent.

Especially when working on structured content (e.g. shared editing on ProseMirror), these improvements yield very good results when benchmarking random document edits. In practice they show even better results, because users usually edit text in sequence, resulting in structs that can easily be merged. The benchmarks show that even in the worst case scenario that a user edits text from right to left, Yjs achieves good performance even for huge documents.

State Vector

Yjs has the ability to exchange only the differences when syncing two clients. We use lamport timestamps to identify structs and to track in which order a client created them. Each struct has an struct.id = { client: number, clock: number} that uniquely identifies a struct. We define the next expected clock by each client as the state vector. This data structure is similar to the version vectors data structure. But we use state vectors only to describe the state of the local document, so we can compute the missing struct of the remote client. We do not use it to track causality.

License and Author

Yjs and all related projects are MIT licensed.

Yjs is based on my research as a student at the RWTH i5. Now I am working on Yjs in my spare time.

Fund this project by donating on Patreon or hiring me for professional support.