0.6.1 • Published 14 days ago

@sisujs/meta-cbot v0.6.1

License
ISC
Repository
gitlab

SisuJs Meta-Cbot

Welcome to the documentation for the meta-cbot module of SisuJs. It features two components designed to enhance web development by improving type handling and communication between server and browser. The components are:

  • Cbot (Character Based Object Transport): A type validating protocol designed for web data transport, intended to replace JSON for server-browser communication.
  • Meta-model: A framework for creating reflective types in Javascript/Typescript, enhancing type enforcement and data integrity.

For more details, refer to the API documentation or contact me via sisujs@sisujs.fi. I also appreciate any feedback to make this documentation and project better.

Rationale for creating Cbot and Meta-model

Both Cbot and the Meta-model emerged to counter limitations of JSON and of the Javascript object model in general. They are intended to solve a couple of real limitations.

Limitations of JSON

JSON (JavaScript Object Notation) is widely used as a data interchange format, but it exhibits several limitations due to its limited types and format:

  • Lack of Enforced Validation: With JSON, the burden of data validation rests entirely on the receiver. The sender can transmit any data structure, leaving the receiver to manually verify that the data conforms to the expected schema.
  • Inefficiency in Data Representation: JSON is not optimized for data transport. It was originally designed as a simple data notation, which becomes inefficient in transport scenarios. For example, if property names are lengthy and repetitive, they consume an excessive amount of space in messages.
  • Limited Data Types: JSON inherently supports only basic data types, that is, strings, numbers, booleans, null, objects, and arrays. Types such as dates, big integers, or custom classes require additional encoding and handling.
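
The type limitation is easy to reproduce with nothing but the standard JSON functions. The following minimal sketch (the variable names are only for illustration) round-trips a Date and a Set, and tries to serialize a BigInt:

```typescript
// A value using types that JSON has no notation for.
const original = {
  birthdate: new Date("1920-01-02T00:00:00.000Z"),
  tags: new Set(["sci-fi", "classic"]),
};

const roundTripped = JSON.parse(JSON.stringify(original));

// The Date comes back as a plain ISO string, not as a Date instance.
const dateSurvived: boolean = roundTripped.birthdate instanceof Date; // false
const dateType: string = typeof roundTripped.birthdate; // "string"

// The Set is serialized as an empty object, so its members are lost.
const setKeys: number = Object.keys(roundTripped.tags).length; // 0

// A BigInt cannot be serialized at all: JSON.stringify throws a TypeError.
let bigintError = "";
try {
  JSON.stringify({ big: BigInt(10) });
} catch (e) {
  bigintError = e instanceof TypeError ? "TypeError" : "other";
}
```

Nothing in the JSON text itself tells the receiver that `birthdate` was ever a date, which is why the extra handling always ends up in application code.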

Limitations of the Javascript object model

  • Lack of Enforced Types: As Javascript objects are essentially just maps from property names to values, there is no language-level type validation available. To achieve this, a separate meta-model or schema is required.
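
As an illustration of this point, a Typescript interface gives no protection at runtime either, because it is erased during compilation. In the sketch below (the `Person` interface and the `looksLikePerson` helper are made up for the example), a value with the wrong shape passes through a cast unnoticed:

```typescript
interface Person {
  name: string;
  birthdate: Date;
}

// The cast satisfies the compiler, but nothing is checked when this runs.
const person = JSON.parse('{"name": 42, "birthdate": "not a date"}') as Person;

const nameType: string = typeof person.name; // "number", despite the declared string
const isRealDate: boolean = person.birthdate instanceof Date; // false

// Validation therefore has to be written by hand, or driven by a separate schema.
function looksLikePerson(value: any): value is Person {
  return (
    value !== null &&
    typeof value === "object" &&
    typeof value.name === "string" &&
    value.birthdate instanceof Date
  );
}

const valid: boolean = looksLikePerson(person); // false
```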

Cbot (Character Based Object Transport)

To address the limitations of JSON, I introduce Cbot (Character Based Object Transport), a protocol with the following design choices:

  • Character based protocol: Unlike many other advanced protocols such as gRPC, Cbot is character-based, making it suitable for native communication over HTTP. It operates on a line-based protocol where each line is separated by newlines and can contain one or more operations. These operations collectively construct the desired object tree.
  • Enforced data types: Cbot is able to ensure that data structures adhere to a defined schema at the protocol level. This ensures that a received message is exactly what it was intended to be.
  • Machine first message format: Cbot is designed to be primarily machine readable, with straightforward encoding. It also minimizes message sizes by optimizing repetitive property names.
  • Support for extended data types: Cbot natively supports a broader range of data types, including Date, BigInt, Set, and Map. This eliminates the need for manual encoding and handling of complex data types. In the future, it could also incorporate the Temporal API once that is actually released.

Meta-model

The Meta-model serves as a framework designed to manage and enforce types within Javascript and Typescript environments, where native objects lack comprehensive reflective information. This lack limits interactions to basic operations like iterating over property names and function calls, without inherent type support. The component's most important features include:

  • Complementing Cbot: The Meta-model complements Cbot by providing the necessary infrastructure to enforce and map intended types for each object during serialization and deserialization.

  • Typed objects with external APIs: When dealing with databases or third-party APIs that require JSON for data transport or storage, the Meta-model can convert typed objects to plain JavaScript objects and vice versa. This includes:

    • Adaptable encoding: During the encoding process, it is possible to alter objects to ensure compatibility with JSON, such as modifying data structures or adding additional metadata necessary for reconstructing the original typed objects upon decoding. Practical examples are polymorphic objects that may be hard to handle.
    • Support for extended types: The Meta-model adds support for extended datatypes which are not originally supported by JSON, although some transformations may require the use of manual encoders due to the limitations of JSON.
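
To make the idea of "adding metadata for reconstruction" concrete, here is a hand-written sketch that uses only the standard JSON replacer and reviver hooks. It is not the Meta-model API; the `$type` tag and the `encode`/`decode` helpers are assumptions made up for this example:

```typescript
// Hypothetical tag name that marks values needing reconstruction.
const TAG = "$type";

function encode(value: unknown): string {
  // A function expression is used so that `this` is the holder object.
  // Date.prototype.toJSON has already run on `v`, so the raw value is read from the holder.
  return JSON.stringify(value, function (this: any, key: string, v: any) {
    const raw = this[key];
    if (raw instanceof Date) return { [TAG]: "Date", value: raw.toISOString() };
    if (raw instanceof Set) return { [TAG]: "Set", value: Array.from(raw) };
    return v;
  });
}

function decode(text: string): any {
  // The reviver runs bottom-up, so nested tagged values are rebuilt first.
  return JSON.parse(text, (_key: string, v: any) => {
    if (v !== null && typeof v === "object") {
      if (v[TAG] === "Date") return new Date(v.value);
      if (v[TAG] === "Set") return new Set(v.value);
    }
    return v;
  });
}

const restored = decode(
  encode({
    birthdate: new Date("1920-01-02T00:00:00.000Z"),
    authors: new Set(["Isaac Asimov"]),
  })
);
```

The cost of this approach is that every sender and receiver must agree on the tagging convention, which is the kind of bookkeeping a shared model is meant to take over.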

Is Cbot Available on Other Platforms (Java, Go, etc.)?

Currently, Cbot is only implemented as an npm library. However, expanding support to include other platforms such as Java, Go, and others would be appreciated. If you are interested in developing implementations for these or any other languages, please feel free to reach out for support.

Are Cbot and Meta-model production ready?

Totally, yes. These concepts have already been in use for a long time in a private project, so making them public is the next step. I also decided to start version numbering from 0.6, which leaves some room for enhancement before declaring version 1.0.

Tiny Tutorial

Instead of trying to explain all aspects of the protocol, I thought it is more meaningful to just show a couple of examples which will provide a usable overview of what these features are about. This approach also gives you an easy starting point if you choose to try Cbot in your own project. And it is really simple as demonstrated below.

Untyped example: Library

Cbot can be used as a simple replacement for JSON.stringify() and JSON.parse(). In this example I will be modeling a library with three concepts: Author, Book and Catalog. I will also use interfaces to clarify the model, but objects are kept untyped.

Below is the used model:

interface Author {
  name: string
  birthdate: Date
}

interface Book {
  title: string
  isbn: string
  author: Author
  publicationYear: number
}

interface Catalog {
  genre: string
  books: Book[]
  authors: Set<Author>
}

Notice that because Cbot natively supports dates and sets, I decided to use them in the model as an example.

Next, let's define some actual values for the model:

const isaac:Author = {
  name: "Isaac Asimov",
  birthdate: new Date("1920-01-02")
}

const foundation:Book = {
  title: "Foundation",
  isbn: "978-0-394-51330-7",
  publicationYear: 1951,
  author: isaac
}

const iRobot:Book = {
  title: "I, Robot",
  isbn: "978-0-394-51331-4",
  publicationYear: 1950,
  author: isaac
}

const catalog:Catalog[] = [
  {
    genre: "Science Fiction",
    books: [
      foundation,
      iRobot
    ],
    authors: new Set([isaac])
  }
]

In order to serialize something, one needs an instance of Cbot in some static variable. You should always create just one instance for each schema and use it globally.

In this case a totally schemaless version is created.

import { Cbot } from "@sisujs/common"

const cbot = Cbot.getInstance();

Now we can actually serialize an object to a string and vice versa:

const serialized = cbot.serialize(foundation);
const deserialized:Book = cbot.deserialize(serialized);
console.log(serialized);

And the console log will produce the following output:

1
E
A  title
B  JKFoundation
A !isbn
B !JK978-0-394-51330-7
A "publicationYear
B "Ic1951
A #author
B #E
A $name
B $JKIsaac Asimov
A %birthdate
B %If1920-01-02T00:00:00.000Z
F
F

Because this protocol is meant to be a machine-readable character stream, it is not easy to decipher what is going on. For that, it needs to be visualized in a more human-readable format. There are two categories for this:

  • Disassembly: This displays all opcodes and values in an assembly-code-like format. This format may be useful for debugging or for understanding what is going on.

  • YAML: YAML is probably the visualization you actually want to use, and it displays the actual object tree with values. I chose YAML over JSON because it is more expressive and it also allows comments to be added for clarification.

    Also within YAML, there are options for simple and full YAML. Simple YAML displays only the data, whereas full YAML shows more meta-information.

So let's first take a disassembly view of the output:

console.log(cbot.visualize(serialized, VisualizationMode.DISASSEMBLY));

Which outputs:

MCSM 
OBJB (plain)
  DEFN 0 title
  ASGV 0 (title) STRN SSTR Foundation
  DEFN 1 isbn
  ASGV 1 (isbn) STRN SSTR 978-0-394-51330-7
  DEFN 2 publicationYear
  ASGV 2 (publicationYear) NATV FLOAT64 1951
  DEFN 3 author
  ASGV 3 (author) OBJB (plain)
    DEFN 4 name
    ASGV 4 (name) STRN SSTR Isaac Asimov
    DEFN 5 birthdate
    ASGV 5 (birthdate) NATV ISO_DATETIME 1920-01-02T00:00:00.000Z
  OBJE
OBJE

What you can see here is that every line starts with a specific operation code. Let's find out what each command means:

  • MCSM (Model Checksum)
    • Each message typically begins with a checksum, which the receiver can use to determine whether the sender is using the same schema. If a different checksum is detected, the message is rejected. In this case the checksum is empty, because there is no schema.
  • OBJB (Object Begin)
    • This opcode begins a new object. In this case a plain object is created.
  • DEFN (Define)
    • This opcode creates a relation between an id and a key. The id is a number encoded as a 2-character string, and the key is any string that needs to be referenced later.
  • ASGV (Assign value)
    • This opcode assigns a property value to an object. In this case it has an id = 0 that represents property name title. The rest of the line tells what is the actual value.
  • STRN (String)
    • This opcode denotes the beginning of a string. The protocol has two types of strings, which is needed because strings have no length restrictions and may also contain newlines, which need to be encoded.
  • SSTR (Simple string)
    • This opcode denotes a simple string that fits on one line and does not contain newlines.
  • NATV (Native value)
    • Denotes the beginning of a value that the protocol supports natively.
  • FLOAT64
    • Denotes a 64-bit float value.
  • ISO_DATETIME
    • Denotes an ISO 8601 date and time value.
  • OBJE (Object End)
    • Ends an object
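
To tie the raw message and the disassembly together, here is a rough sketch of a disassembler for this one sample. The opcode characters in the tables are inferred purely by lining up the serialized output with the disassembly shown above, so they cover only what this sample contains. The id decoding (two base-95 digits starting from the space character) is an assumption that happens to fit the ids 0 to 5 seen here. None of this is the library's own code.

```typescript
// The serialized sample from above, one array entry per line.
const SAMPLE = [
  "1", "E",
  "A  title", "B  JKFoundation",
  "A !isbn", "B !JK978-0-394-51330-7",
  'A "publicationYear', 'B "Ic1951',
  "A #author", "B #E",
  "A $name", "B $JKIsaac Asimov",
  "A %birthdate", "B %If1920-01-02T00:00:00.000Z",
  "F", "F",
].join("\n");

// Inferred from the sample only; not the full protocol.
const OPS: Record<string, string> = { "1": "MCSM", A: "DEFN", B: "ASGV", E: "OBJB", F: "OBJE" };
const VALUE_TYPES: Record<string, string> = {
  JK: "STRN SSTR",
  Ic: "NATV FLOAT64",
  If: "NATV ISO_DATETIME",
};

// ASSUMPTION: ids are two base-95 digits, with the space character as zero.
function decodeId(two: string): number {
  return (two.charCodeAt(0) - 32) * 95 + (two.charCodeAt(1) - 32);
}

function disassemble(message: string): string[] {
  const keys = new Map<number, string>(); // id -> property name, filled by DEFN
  const out: string[] = [];
  let depth = 0;
  for (const line of message.split("\n")) {
    if (line === "") continue;
    const op = OPS[line.charAt(0)];
    if (op === undefined) throw new Error("unknown opcode: " + line.charAt(0));
    const rest = line.slice(1);
    if (op === "OBJE") depth--;
    const pad = "  ".repeat(depth);
    if (op === "MCSM") {
      out.push(pad + "MCSM " + rest);
    } else if (op === "OBJB") {
      out.push(pad + "OBJB (plain)");
      depth++;
    } else if (op === "OBJE") {
      out.push(pad + "OBJE");
    } else {
      const id = decodeId(rest.slice(0, 2));
      const body = rest.slice(2);
      if (op === "DEFN") {
        keys.set(id, body);
        out.push(pad + "DEFN " + id + " " + body);
      } else if (body === "E") {
        // ASGV whose value is a nested object
        out.push(pad + "ASGV " + id + " (" + keys.get(id) + ") OBJB (plain)");
        depth++;
      } else {
        const type = VALUE_TYPES[body.slice(0, 2)];
        if (type === undefined) throw new Error("unknown value type: " + body.slice(0, 2));
        out.push(pad + "ASGV " + id + " (" + keys.get(id) + ") " + type + " " + body.slice(2));
      }
    }
  }
  return out;
}

const listing = disassemble(SAMPLE);
```

For real use, `cbot.visualize()` is the tool to reach for; the sketch is only meant to show that each line is one small, self-describing operation.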

The same can also be visualized as YAML:

console.log(cbot.visualize(serialized, VisualizationMode.SIMPLE_YAML));

Which outputs:

title: Foundation
isbn: 978-0-394-51330-7
publicationYear: 1951
author:
  name: Isaac Asimov
  birthdate: 1920-01-02T00:00:00.000Z

Adding untyped schema

In the disassembly view, I previously mentioned the DEFN opcodes. These opcodes come into play when the protocol first encounters an unknown property name. At that point, it assigns a new 2-character long id value to that name. And if the same property name appears again within the same message, the id is reused during serialization.

This mechanism requires that each new message re-establish these definitions. Therefore, it can be beneficial to predefine property names in advance, allowing for the omission of these definitions during the serialization process.
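
The per-message bookkeeping described above can be sketched roughly as follows. The `MessageKeys` class and the `plan` function are hypothetical names for illustration, and the textual output only mimics the disassembly notation; this is not how the library is implemented internally.

```typescript
// Hypothetical per-message key table: ids are handed out in order of first appearance.
class MessageKeys {
  private readonly ids = new Map<string, number>();

  // Returns the id for a key and whether a definition must be emitted first.
  use(key: string): { id: number; isNew: boolean } {
    const known = this.ids.get(key);
    if (known !== undefined) return { id: known, isNew: false };
    const id = this.ids.size;
    this.ids.set(key, id);
    return { id, isNew: true };
  }
}

// Lists the operations needed for a sequence of property names within one message.
function plan(keys: string[]): string[] {
  const table = new MessageKeys();
  const ops: string[] = [];
  for (const key of keys) {
    const { id, isNew } = table.use(key);
    if (isNew) ops.push("DEFN " + id + " " + key);
    ops.push("ASGV " + id + " (" + key + ")");
  }
  return ops;
}

// Two books in one message: the second book reuses the ids defined by the first.
const ops = plan(["title", "author", "title", "author"]);
// DEFN 0 title, ASGV 0 (title), DEFN 1 author, ASGV 1 (author), ASGV 0 (title), ASGV 1 (author)
```

Because the table starts empty for every message, the DEFN lines are paid again each time, which is the overhead that predefined keys remove.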

This is the classless version of a schema. Now, let's define a new Cbot instance:

const cbot2 = Cbot.getInstance({
  staticKeys: [
    'name',
    'birthdate',
    'title',
    'isbn',
    'author',
    'publicationYear',
    'books',
    'genre',
    'authors'
  ]
});

I have now configured a set of statically defined keys which no longer need to be defined within each message. However, this configuration does not prevent the creation of new message-based definitions for property names that aren't included in the predefined keys.

Additionally, since this configuration represents a schema-full version, both the sender and receiver must use the same configuration to ensure correct operation.

We can now use cbot2 for serialization.

console.log(cbot2.visualize(
  cbot2.serialize(foundation),
  VisualizationMode.DISASSEMBLY));

And, its disassembly view will be as follows:

MCSM 123a4e7d
OBJB (plain)
  ASGV 8 (title) STRN SSTR Foundation
  ASGV 5 (isbn) STRN SSTR 978-0-394-51330-7
  ASGV 7 (publicationYear) NATV FLOAT64 1951
  ASGV 0 (author) OBJB (plain)
    ASGV 6 (name) STRN SSTR Isaac Asimov
    ASGV 2 (birthdate) NATV ISO_DATETIME 1920-01-02T00:00:00.000Z
  OBJE
OBJE

So within this disassembly view, you can see that the Model Checksum now has a value and all DEFN opcodes are missing. The checksum is not intended to be particularly secure or complex; it is used more as a quick sanity check.
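
A rough sketch of how static keys and a schema checksum could work together is given below. Two things here are assumptions. First, ids are assigned in sorted key order, which matches the ids in the disassembly above (author = 0, isbn = 5, title = 8) but is not stated anywhere as a rule. Second, FNV-1a is used purely as a stand-in hash, since the actual checksum algorithm is not described in this document, so the values it produces will not match the real ones.

```typescript
// ASSUMPTION: static keys are numbered in sorted order.
function staticIds(keys: string[]): Map<string, number> {
  const ids = new Map<string, number>();
  [...keys].sort().forEach((key, index) => ids.set(key, index));
  return ids;
}

// ASSUMPTION: 32-bit FNV-1a over the sorted key list, as a stand-in for the real checksum.
// Character codes are mixed in directly, which is adequate for ASCII property names.
function checksum(keys: string[]): string {
  const text = [...keys].sort().join("\n");
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

const staticKeys = [
  "name", "birthdate", "title", "isbn", "author",
  "publicationYear", "books", "genre", "authors",
];

const ids = staticIds(staticKeys);
const senderSum = checksum(staticKeys);
// A receiver configured with an extra key computes a different value and can reject the message.
const receiverSum = checksum([...staticKeys, "rating"]);
```

The design point is that the comparison is cheap: one short string at the top of the message tells the receiver whether the rest can be interpreted with its own key table.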

Finally, just for fun, let's visualize the catalog variable in full YAML, which displays more metadata.

console.log(cbot2.visualize(
  cbot2.serialize(catalog),
  VisualizationMode.FULL_YAML));

Which outputs:

# Model Checksum: 123a4e7d
# Array
- genre: !!str Science Fiction
  books:
    # Array
    - title: !!str Foundation
      isbn: !!str 978-0-394-51330-7
      publicationYear: !!float 1951
      author:
        name: !!str Isaac Asimov
        birthdate: !!timestamp 1920-01-02T00:00:00.000Z
    - title: !!str I, Robot
      isbn: !!str 978-0-394-51331-4
      publicationYear: !!float 1950
      author:
        name: !!str Isaac Asimov
        birthdate: !!timestamp 1920-01-02T00:00:00.000Z
  authors:
    # Set
    - name: !!str Isaac Asimov
      birthdate: !!timestamp 1920-01-02T00:00:00.000Z

Conclusion

This example demonstrated that Cbot can serve as a zero-configuration replacement for JSON while offering the benefits of using more native types. The decision to use it, however, is yours to make.

Although Cbot can function in this manner, it is not the primary reason I have been developing this protocol for several years. Rather, the key motivation is its ability to utilize a typed schema at the protocol level itself.

Typed example: Library

This example employs the same concepts as the untyped one but with actual types. It necessitates the creation of classes and a corresponding Meta-model. While the Meta-model has its own distinct use cases, they are not within the scope of this example.

The first step is to create a namespace. This namespace does not correspond in any way to Typescript namespaces, but it is an important tool for categorizing the Meta-model into usable groups. For instance, all Sisu-related classes are in the namespace sisujs.

So let's import some stuff and create a namespace.

import { Meta, Namespace, Value } from '@sisujs/common';

const NS = Namespace.of('library').init();

The init() function ensures that this is the first time a namespace with this name is declared, thus preventing namespace pollution.

And the next thing to do is to create the model:

@NS.class('Author')
class Author {

  @Value.string()
  name: string

  @Value.date()
  birthdate: Date
}

@NS.class('Book')
class Book {

  @Value.string()
  title: string

  @Value.string()
  isbn: string

  @Value.of(Author)
  author: Author

  @Value.int32()
  publicationYear: number

  @Value.number()
  rating: number
}

@NS.class('Catalog')
class Catalog {

  @Value.string()
  genre: string

  @Value.array().of(Book)
  books: Book[]

  @Value.set().of(Author)
  authors: Set<Author>
}

NS.seal([
  Author,
  Book,
  Catalog
]);

What can be observed here is that classes and all their properties require a corresponding decorator, which repeats the declared type for the Meta-model's use. It is also entirely possible to have (transient) properties without a decorator, and these will not be included in the model.

Also, I used int32 as the value type for publicationYear. Although Javascript itself does not support integers, the protocol does, and it has its own rules for dealing with this case.

The last thing is to call the seal() function with all declared types, which prevents adding further declarations.
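
The declare-once and seal behaviour described here can be pictured with a small stand-in registry. `MiniNamespace` and its `register` method are hypothetical and exist only for this sketch; the real `Namespace` class may behave differently in its details.

```typescript
// Hypothetical stand-in for a namespace registry; not the SisuJs implementation.
class MiniNamespace {
  private static readonly registry = new Map<string, MiniNamespace>();
  private readonly classes = new Map<string, Function>();
  private initialized = false;
  private sealed = false;
  readonly name: string;

  private constructor(name: string) {
    this.name = name;
  }

  // Returns the namespace with the given name, creating it on first use.
  static of(name: string): MiniNamespace {
    let ns = MiniNamespace.registry.get(name);
    if (ns === undefined) {
      ns = new MiniNamespace(name);
      MiniNamespace.registry.set(name, ns);
    }
    return ns;
  }

  // Fails if some other module has already declared this namespace.
  init(): this {
    if (this.initialized) throw new Error("namespace '" + this.name + "' is already declared");
    this.initialized = true;
    return this;
  }

  register(typeName: string, ctor: Function): void {
    if (this.sealed) throw new Error("namespace '" + this.name + "' is sealed");
    this.classes.set(typeName, ctor);
  }

  // After sealing, no further types can be added.
  seal(): void {
    this.sealed = true;
  }
}

class Shelf {}
class Lamp {}

const ns = MiniNamespace.of("demo").init();
ns.register("Shelf", Shelf);
ns.seal();
```

With this shape, a second `MiniNamespace.of("demo").init()` or a `register()` call after `seal()` fails immediately instead of silently changing a model that other code already relies on.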

The next step is to create the values.

// Created manually

const isaac = new Author();

isaac.name = "Isaac Asimov";
isaac.birthdate = new Date('1920-01-02');

// Created by (partial) template

const foundation = Meta.of(Book, {
  title: "Foundation",
  isbn: "978-0-394-51330-7",
  publicationYear: 1951,
  author: isaac,
  rating: 5.6
});

const iRobot = Meta.of(Book, {
  title: "I, Robot",
  isbn: "978-0-394-51331-4",
  publicationYear: 1950,
  author: isaac
});

const catalog = [
  Meta.of(Catalog, {
    genre: "Science Fiction",
    books: [foundation, iRobot],
    authors: new Set([isaac])
  })
];

This demonstrates two ways of creating typed objects, whether by manually assigning values or using the template method.
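
Conceptually, the template style amounts to creating an instance and copying the given properties onto it. The `createFrom` helper and the `Novel` class below are made-up names for a minimal sketch of that idea; `Meta.of` itself may well do more, such as validating values against the model.

```typescript
// Hypothetical helper: builds a real class instance from a partial template.
function createFrom<T extends object>(ctor: new () => T, template: Partial<T>): T {
  return Object.assign(new ctor(), template);
}

class Novel {
  title: string = "";
  rating: number | null = null;
}

// Only the given properties are set; the rest keep their defaults.
const novel = createFrom(Novel, { title: "I, Robot" });
const isNovel: boolean = novel instanceof Novel; // true, unlike a plain object literal
```

The important difference from an object literal is that the result is an instance of the class, so anything keyed on the constructor can find its type.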

The next thing to do is to get a proper Cbot-instance:

const cbot = Cbot.getInstance({
  namespaces: [NS]
});

Now, instead of adding a staticKeys-parameter, I've added the namespace I am interested in. This illustrates why namespacing is important. You may have types for multiple purposes, and you may not want to expose them all publicly. In such cases, a different namespace can be used for the private types.

And now we are ready to begin the serialization process. There are no significant differences here compared to the untyped case, so I will simply demonstrate what the disassembly and full YAML representations reveal.

First with disassembly:

console.log(cbot.visualize(
  cbot.serialize(foundation),
  VisualizationMode.DISASSEMBLY));

Which outputs:

MCSM 12451c9c
OBJB 7 (library.Book)
  ASGV 0 (author) OBJB 6 (library.Author)
    ASGV 2 (birthdate) NATV ISO_DATETIME 1920-01-02T00:00:00.000Z
    ASGV 9 (name) STRN SSTR Isaac Asimov
  OBJE
  ASGV 5 (isbn) STRN SSTR 978-0-394-51330-7
  ASGV 10 (publicationYear) NATV INT32 1951
  ASGV 11 (rating) NATV FLOAT64 5.6
  ASGV 12 (title) STRN SSTR Foundation
OBJE

The output is practically the same as before. The difference is that objects now also have an id, which corresponds to the type name of the object.

And lastly YAML:

console.log(cbot.visualize(
  cbot.serialize(catalog),
  VisualizationMode.FULL_YAML));

Which outputs:

# Model Checksum: 12451c9c
# Array
- # library.Catalog
  authors:
    # Set
    - # library.Author
      birthdate: !!timestamp 1920-01-02T00:00:00.000Z
      name: !!str Isaac Asimov
  books:
    # Array
    - # library.Book
      author:
        # library.Author
        birthdate: !!timestamp 1920-01-02T00:00:00.000Z
        name: !!str Isaac Asimov
      isbn: !!str 978-0-394-51330-7
      publicationYear: !!int 1951
      rating: !!float 5.6
      title: !!str Foundation
    - # library.Book
      author:
        # library.Author
        birthdate: !!timestamp 1920-01-02T00:00:00.000Z
        name: !!str Isaac Asimov
      isbn: !!str 978-0-394-51331-4
      publicationYear: !!int 1950
      rating: null
      title: !!str I, Robot
  genre: !!str Science Fiction

Again, the main difference here is that the representation contains more comments to specify the correct types.

And that's it! There is practically nothing more to show here. These examples are fully usable in your project as is, and the API documentation contains more information. And do provide feedback to improve the documentation further.


Although, there is one more thing for the old-school guys. You can also create the Meta-model entirely manually, without decorators, like this:

var NS = Namespace.of('library');

function Author() {
  this.name = null;
  this.birthdate = null;
}

Value.string()(Author, 'name');
Value.date()(Author, 'birthdate');
NS.class('Author')(Author);
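
The manual form works because a decorator is ultimately just a function that receives its target. The sketch below shows that idea in isolation. The `valueKind` factory and the `metadata` store are hypothetical and are not the real `Value` implementation.

```typescript
type PropertyKind = "string" | "date";

// Hypothetical metadata store, keyed by constructor function.
const metadata = new Map<Function, Map<string, PropertyKind>>();

// A decorator factory: returns a function that records the kind of one property.
function valueKind(kind: PropertyKind) {
  return (target: Function, property: string): void => {
    let props = metadata.get(target);
    if (props === undefined) {
      props = new Map<string, PropertyKind>();
      metadata.set(target, props);
    }
    props.set(property, kind);
  };
}

// An old-school constructor function, as in the example above.
function Writer(this: any) {
  this.name = null;
  this.birthdate = null;
}

// Calling the returned function by hand has the same effect as decorator syntax would.
valueKind("string")(Writer, "name");
valueKind("date")(Writer, "birthdate");

const described = metadata.get(Writer);
```

Whether the call is written with `@` syntax or by hand, the recorded metadata ends up the same, which is why the manual style needs no compiler support for decorators.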