0.1.0 • Published 5 years ago

baudot v0.1.0


Baudot.js


A Baudot encoder and decoder for node.js and the browser.

Sports a programmatic and a command line interface.

Comes with ITA1, ITA2 and US-TTY alphabets, and you can bring your own.

Command line usage

You can either install the package globally, or use npx:

npm install baudot --global
echo -n 'HELLO, WORLD!' | npx baudot encode | hexdump
0000000 14 01 12 12 18 1b 0c 04 1f 13 18 0a 12 09 1b 0d
0000010

echo -n 'HELLO, WORLD!' | npx baudot encode | npx baudot decode
HELLO, WORLD!

echo -n 'HELLO; WORLD$' | npx baudot encode -v us-tty | npx baudot decode -v us-tty
HELLO; WORLD$

Programmatic usage

npm install baudot
# or yarn add baudot

import {encoder, decoder, US_TTY} from 'baudot';

// Defaults to ITA2
const encode = encoder();
const decode = decoder();

console.log(encode('H'));
// >>> [20]

console.log(encode('!'));
// >>> [27, 13]
// Note the switch to Figure Set

console.log(encode('HELLO, WORLD!'));
// >>> [31, 20, 1, 18, 18, 24, 27, 12, 4, 31, 19, 24, 10, 18, 9, 27, 13]
// Note the shift back to Letter Set at the beginning (the previous call left the
// encoder in the Figure Set), and the shifts to Figure Set before `,` and `!`

console.log(decode(encode('HELLO, WORLD!')));
// >>> "HELLO, WORLD!"

console.log(
  decoder(US_TTY)(
    encoder(US_TTY)('HELLO; WORLD$')
  )
);
// >>> "HELLO; WORLD$"

Troubleshooting

Decoding US_TTY correctly:

OPERATORS HAD TO MAINTAIN A "STEADY RHYTHM"; THE USUAL SPEED OF OPERATION WAS 30 WORDS PER MINUTE

Trying to decode US_TTY as ITA2:

OPERATORS HAD TO MAINTAIN A +STEADY RHYTHM+= THE USUAL SPEED OF OPERATION WAS 30 WORDS PER MINUTE

Trying to decode ITA1 as ITA2/US_TTY:

U
MEYUMHTJEKTYUTGEIVYEIVTETHY
EK TMJ YJGTYJ
TSHSE5£

KTUCTU
TPUMKHTPEHT
MTGIVSY
Z

Trying to decode ITA2/US_TTY as ITA1:

UYLAWYPZGJUYWAGYRIBO AE

Missing a figure or letter shift somewhere:

OPERATORS HAD TO MAINTAIN A "53-$6 4#65#."; 5#3 77-) 033$ 9! 9034-589, 2- 30 WORDS PER MINUTE

Trying to encode lower-case letters:

O             30 W P M
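
The lower-case result above follows from the encoder ignoring any character that is not in the selected alphabet. One way to catch this early is to upper-case the input and list whatever would still be dropped. The sketch below is only an illustration: `prepareInput` and the tiny stand-in alphabet `TINY` are hypothetical names, not part of baudot.

```javascript
// Hypothetical helper, not part of baudot: upper-cases the input and reports
// which characters a given alphabet (an array of two sets) does not contain.
function prepareInput(str, alphabet) {
  const known = new Set(alphabet.flat());
  const text = str.toUpperCase();
  // Collect each unsupported character once, in order of first appearance.
  const dropped = [...new Set([...text].filter((ch) => !known.has(ch)))];
  return { text, dropped };
}

// A deliberately tiny stand-in alphabet, just for this demonstration.
const TINY = [
  ['A', 'B', 'O', 'P', ' '],
  ['1', '2', '3', '0', ' '],
];

const { text, dropped } = prepareInput('Op 30 wpm', TINY);
console.log(text);
// >>> "OP 30 WPM"
console.log(dropped);
// >>> ["W", "M"]
```

With the real package, the same helper could presumably be given one of the built-in alphabets, and the returned text passed on to an encoder.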

API

Alphabet

type Alphabet = string[][];

An array of two sets of characters: the letter set first, then the figure set. Each set lists its characters in code order.

encoder()

(alphabet: Alphabet = ITA2) =>
  (str: string) => number[]

Returns an encoder function that, when fed a Unicode string, produces an array of Baudot-encoded words, one for each symbol. Characters that are not part of the selected alphabet are ignored.

The encoder will keep track of the current Letter or Figure Set, and will produce the appropriate LS and FS shift codes when fed characters from the other set.
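
The shift tracking described above can be sketched in a few lines. This is not the package's implementation, only a minimal model that reproduces the outputs shown under "Programmatic usage". The names `sketchEncoder`, `ITA2_SKETCH`, and the `FS`/`LS` marker strings are assumptions made for the example; the table itself is copied from this README.

```javascript
// Hypothetical markers for the shift symbols; the real package exports its own.
const FS = '<FS>';
const LS = '<LS>';

// ITA2 as listed in this README: letter set first, then figure set.
const ITA2_SKETCH = [
  [
    '\0', 'E', '\n', 'A', ' ', 'S', 'I', 'U',
    '\r', 'D', 'R', 'J', 'N', 'F', 'C', 'K',
    'T', 'Z', 'L', 'W', 'H', 'Y', 'P', 'Q',
    'O', 'B', 'G', FS, 'M', 'X', 'V', LS,
  ], [
    '\0', '3', '\n', '-', ' ', "'", '8', '7',
    '\r', '\x05', '4', '\x07', ',', '!', ':', '(',
    '5', '+', ')', '2', '£', '6', '0', '1',
    '9', '?', '&', FS, '.', '/', '=', LS,
  ],
];

function sketchEncoder(alphabet = ITA2_SKETCH) {
  const shiftSymbols = [LS, FS]; // shiftSymbols[i] selects set i
  let current = 0;               // start in the letter set
  return (str) => {
    const words = [];
    for (const ch of str) {
      let code = alphabet[current].indexOf(ch);
      if (code === -1) {
        // Not in the current set: try the other one.
        const other = 1 - current;
        code = alphabet[other].indexOf(ch);
        if (code === -1) continue; // not in the alphabet at all: ignored
        // Emit the shift code as it appears in the current set, then switch.
        words.push(alphabet[current].indexOf(shiftSymbols[other]));
        current = other;
      }
      words.push(code);
    }
    return words;
  };
}

const encode = sketchEncoder();
console.log(encode('H'));
// >>> [20]
console.log(encode('!'));
// >>> [27, 13]
console.log(encode('HELLO, WORLD!'));
// >>> [31, 20, 1, 18, 18, 24, 27, 12, 4, 31, 19, 24, 10, 18, 9, 27, 13]
```

Because the current set lives in the closure, the third call starts with a Letter Set shift (31): the second call left the encoder in the Figure Set. The space after the comma needs no shift because it exists in both sets.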

decoder()

(alphabet: Alphabet = ITA2) =>
  (word: number | Iterable<number>) => string

Returns a decoder function that, when fed one or more Baudot-encoded words, one for each symbol, produces a Unicode string containing the decoded data. Words that are not valid indexes into the selected alphabet are ignored.

The decoder will keep track of the current Letter or Figure Set, and will use it to decode words. When fed an LS or FS code, it will shift to the appropriate set.
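
The decoding side can be modelled the same way. Again, this is a minimal sketch under stated assumptions rather than the package's code: `sketchDecoder`, `ITA2_TABLE`, and the two marker strings are hypothetical names, and the table is copied from this README.

```javascript
// Hypothetical markers for the shift symbols; the real package exports its own.
const SHIFT_TO_FIGURES = '<FS>';
const SHIFT_TO_LETTERS = '<LS>';

// ITA2 as listed in this README: letter set first, then figure set.
const ITA2_TABLE = [
  [
    '\0', 'E', '\n', 'A', ' ', 'S', 'I', 'U',
    '\r', 'D', 'R', 'J', 'N', 'F', 'C', 'K',
    'T', 'Z', 'L', 'W', 'H', 'Y', 'P', 'Q',
    'O', 'B', 'G', SHIFT_TO_FIGURES, 'M', 'X', 'V', SHIFT_TO_LETTERS,
  ], [
    '\0', '3', '\n', '-', ' ', "'", '8', '7',
    '\r', '\x05', '4', '\x07', ',', '!', ':', '(',
    '5', '+', ')', '2', '£', '6', '0', '1',
    '9', '?', '&', SHIFT_TO_FIGURES, '.', '/', '=', SHIFT_TO_LETTERS,
  ],
];

function sketchDecoder(alphabet = ITA2_TABLE) {
  let current = 0; // start in the letter set
  return (word) => {
    // Accept either a single word or any iterable of words.
    const words = typeof word === 'number' ? [word] : word;
    let out = '';
    for (const w of words) {
      const symbol = alphabet[current][w];
      if (symbol === undefined) continue; // not a valid index: ignored
      if (symbol === SHIFT_TO_LETTERS) current = 0;
      else if (symbol === SHIFT_TO_FIGURES) current = 1;
      else out += symbol;
    }
    return out;
  };
}

const decode = sketchDecoder();
console.log(decode([31, 20, 1, 18, 18, 24, 27, 12, 4, 31, 19, 24, 10, 18, 9, 27, 13]));
// >>> "HELLO, WORLD!"
```

Dropping a single shift code from the input is enough to reproduce the "missing a figure or letter shift" symptom from the Troubleshooting section, since every following word is looked up in the wrong set.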

Included alphabets

All built-in alphabets can be directly imported from the main package, and contain both letter (LS) and figure (FS) sets:

import {encoder, decoder} from 'baudot';
import {ITA1, ITA2, US_TTY} from 'baudot';

const encode = encoder(US_TTY);
const decode = decoder(US_TTY);

decode(encode('£$'));
// >>> "$"

Binary  Hex  Dec  ITA1 LS  ITA1 FS  ITA2 LS  ITA2 FS  US_TTY LS  US_TTY FS
00000   00   0    <NUL>    <NUL>    <NUL>    <NUL>    <NUL>      <NUL>
00001   01   1    A        1        E        3        E          3
00010   02   2    E        2        <LF>     <LF>     <LF>       <LF>
00011   03   3    <CR>     <CR>     A        -        A          -
00100   04   4    Y        3        <SPC>    <SPC>    <SPC>      <SPC>
00101   05   5    U        4        S        '        S          <BEL>
00110   06   6    I        <DC1>    I        8        I          8
00111   07   7    O        5        U        7        U          7
01000   08   8    <FS>     <SPC>    <CR>     <CR>     <CR>       <CR>
01001   09   9    J        6        D        <ENQ>    D          $
01010   0A   10   G        7        R        4        R          4
01011   0B   11   H        +        J        <BEL>    J          '
01100   0C   12   B        8        N        ,        N          ,
01101   0D   13   C        9        F        !        F          !
01110   0E   14   F        <DC2>    C        :        C          :
01111   0F   15   D        0        K        (        K          (
10000   10   16   <SPC>    <LS>     T        5        T          5
10001   11   17   <LF>     <LF>     Z        +        Z          "
10010   12   18   X        ,        L        )        L          )
10011   13   19   Z        :        W        2        W          2
10100   14   20   S        .        H        £        H          #
10101   15   21   T        <DC3>    Y        6        Y          6
10110   16   22   W        ?        P        0        P          0
10111   17   23   V        '        Q        1        Q          1
11000   18   24   <DEL>    <DEL>    O        9        O          9
11001   19   25   K        (        B        ?        B          ?
11010   1A   26   M        )        G        &        G          &
11011   1B   27   L        =        <FS>     <FS>     <FS>       <FS>
11100   1C   28   R        -        M        .        M          .
11101   1D   29   Q        /        X        /        X          /
11110   1E   30   N        <DC4>    V        =        V          ;
11111   1F   31   P        %        <LS>     <LS>     <LS>       <LS>

Custom alphabets

Alphabets are two sets of 32 characters, one set for letters, and one for figures. Characters in each set are ordered by their code, from code 0 to code 31.

The LS and FS symbols are special: they won't become part of the final output, but they will cause the decoder to switch from one set to another. Conversely, when trying to encode a character from a different set, the appropriate LS or FS will also be encoded.

import {
  encoder, decoder,
  NUL, ENQ, BEL, LF, CR, FS, LS,
  DC1, DC2, DC3, DC4, DEL,
} from 'baudot';

const ITA2_LOWERCASE = [
  [
    /* LS:   0-7 */  NUL, `e`, LF,  `a`, ` `, `s`, `i`, `u`,
    /* LS:  8-15 */  CR,  `d`, `r`, `j`, `n`, `f`, `c`, `k`,
    /* LS: 16-23 */  `t`, `z`, `l`, `w`, `h`, `y`, `p`, `q`,
    /* LS: 24-31 */  `o`, `b`, `g`, FS,  `m`, `x`, `v`, LS ,
  ], [
    /* FS:   0-7 */  NUL, `3`, LF,  `-`, ` `, `'`, `8`, `7`,
    /* FS:  8-15 */  CR,  ENQ, `4`, BEL, `,`, `!`, `:`, `(`,
    /* FS: 16-23 */  `5`, `+`, `)`, `2`, `£`, `6`, `0`, `1`,
    /* FS: 24-31 */  `9`, `?`, `&`, FS,  `.`, `/`, `=`, LS ,
  ],
];

console.log(
  decoder(ITA2_LOWERCASE)(
    encoder(ITA2_LOWERCASE)('hello, world!')
  )
);
// >>> "hello, world!"
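
Since a custom alphabet is written by hand, an off-by-one in either set is easy to make. The sketch below shows one possible sanity check before handing an alphabet to `encoder()` or `decoder()`. `checkAlphabet` is a hypothetical helper, not part of baudot, and the rule that the letter set must contain FS and the figure set must contain LS is an assumption based on how the built-in tables are laid out.

```javascript
// Hypothetical helper, not part of baudot: returns a list of problems found
// in a custom alphabet, or an empty list if it looks well-formed.
function checkAlphabet(alphabet, lsSymbol, fsSymbol) {
  if (!Array.isArray(alphabet) || alphabet.length !== 2) {
    return ['expected exactly two sets (letters, then figures)'];
  }
  const problems = [];
  alphabet.forEach((set, i) => {
    if (!Array.isArray(set) || set.length !== 32) {
      problems.push(`set ${i} should have 32 entries, one per 5-bit code`);
    }
  });
  if (problems.length > 0) return problems;
  // Assumption: each set needs a way to reach the other one.
  if (!alphabet[0].includes(fsSymbol)) problems.push('letter set has no FS code');
  if (!alphabet[1].includes(lsSymbol)) problems.push('figure set has no LS code');
  return problems;
}

// Stand-in markers and filler sets, just for this demonstration.
const LS_MARK = '<LS>';
const FS_MARK = '<FS>';
const good = [new Array(32).fill('x'), new Array(32).fill('y')];
good[0][27] = FS_MARK;
good[1][31] = LS_MARK;

console.log(checkAlphabet(good, LS_MARK, FS_MARK));
// >>> []
console.log(checkAlphabet([good[0].slice(0, 31), good[1]], LS_MARK, FS_MARK));
// >>> ["set 0 should have 32 entries, one per 5-bit code"]
```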

Missing features

  • Replace unknown characters with the most similar alternative
  • Optional bit-packing into 8-bit bytes
  • Optional automatic re-encoding of figure shifts after spaces
  • Optional automatic reset to Letter Set after spaces
  • Set shift locking to access the third and fourth code page
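
Until bit-packing is available, it can be done on the encoder's output. The sketch below is one possible approach and not part of baudot: `packWords` is a hypothetical name, and the choices to pack most significant bit first and to zero-pad the last byte are assumptions.

```javascript
// Hypothetical helper, not part of baudot: packs 5-bit words into 8-bit bytes,
// most significant bit first, zero-padding the final byte.
function packWords(words) {
  const bytes = [];
  let buffer = 0; // pending bits, newest at the low end
  let bits = 0;   // number of pending bits
  for (const word of words) {
    buffer = (buffer << 5) | (word & 0x1f);
    bits += 5;
    while (bits >= 8) {
      bits -= 8;
      bytes.push((buffer >> bits) & 0xff);
    }
    buffer &= (1 << bits) - 1; // keep only the bits not yet written
  }
  if (bits > 0) bytes.push((buffer << (8 - bits)) & 0xff); // zero-pad the tail
  return bytes;
}

console.log(packWords([20, 1]));
// >>> [160, 64]
```

Here the two words 10100 and 00001 become the bytes 10100000 and 01000000. Zero padding is ambiguous whenever five or more padding bits remain, because 00000 is also the NUL code, so a real implementation would likely need to carry the word count separately.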

License

Copyright (c) 2020, zenoamaro <zenoamaro@gmail.com>

Licensed under the MIT LICENSE.
