1.0.3 • Published 3 months ago

utf64 v1.0.3

utf-64

A terse, human-readable, URL-safe encoding for JSONish strings.

Overview

Use this when you need to encode a string to make it URL-safe, but you also want to keep it as small and readable as possible (unlike base64). For example:

Input string         base64                      utf64
Hello                SGVsbG8=                    YHello
"Hello!"             IkhlbGxvISI=                AYHelloGA
{"Hello":"world"}    eyJIZWxsbyI6IndvcmxkIn0=    MAYHelloAFAworldAN
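The base64 column above can be reproduced with Python's standard library. The sketch below also prints the percent-encoded form of each input; that extra column is an addition for comparison and is not part of the original table. The helper name `b64` is made up for this example.

```python
import base64
from urllib.parse import quote

samples = ['Hello', '"Hello!"', '{"Hello":"world"}']


def b64(text):
    """Standard base64 of the UTF-8 bytes, as in the base64 column."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


for s in samples:
    # quote(..., safe="") percent-encodes every reserved character,
    # which is what a URL would otherwise need for these inputs.
    print(f"{s!r:22} base64={b64(s)!r:28} percent={quote(s, safe='')!r}")
```

Both alternatives are URL-transportable, but neither stays short and readable at the same time, which is the gap UTF-64 aims to fill.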

I made this because I wanted to build a web API with a nice JSON schema that could also be cached by a CDN. To make it cacheable, I had to use the GET method; but GET can't (portably) have a request body, so this means all the API parameters need to be packed into the URL. UTF-64 is a fire-and-forget way to solve this problem.

UTF-64 uses the very permissive 0BSD licence so you can freely use this code & spec anywhere. I picked 0BSD as it seems to be the "public domain equivalent" most widely accepted by corporations, e.g. Google has a specific exception permitting the use of 0BSD.

Installation & usage

JavaScript

npm install utf64

import * as utf64 from "utf64";

console.log(utf64.encode("Hello!")); // YHelloG
console.log(utf64.decode("YHelloG")); // Hello!

Python

pip install utf64

import utf64

print(utf64.encode("Hello!"))  # YHelloG
print(utf64.decode("YHelloG"))  # Hello!

Go

go get utf64.moreplease.com

package main

import (
	"fmt"
	"utf64.moreplease.com"
)

func main() {
	fmt.Println(utf64.Encode("Hello!"))
	result, err := utf64.Decode("YHelloG")
	if err != nil {
		panic(err)
	}
	fmt.Println(result)
}

Rust

Kindly contributed by Mark Musante (@mjmusante)

cargo add utf64

use utf64::*;

fn main() {
    println!("{}", "Hello!".encode_utf64().unwrap());
    match "YHelloG".decode_utf64() {
        Ok(result) => println!("{result}"),
        Err(e) => panic!("{e}"),
    }
}

Command-line tool

The JS package includes a utf64 command-line tool:

npm install -g utf64
utf64 'Hello!'
utf64 -d YHelloG

Specification

Output is encoded using base64url-compatible characters: _ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-

utf64                    Decoded as
_                        As-is
a to z                   As-is
0 to 9                   As-is
ABCDEFGHIJKLMNOPQRSTU    Mapped to: "',.;:!?()[]{}#=+-*/\
V                        Newline
W                        Space
X                        Prefix for Unicode 0-63. For example, "Xk" is "%" (U+0025)
Y                        Prefix for Unicode 64-127. For example, "Y_" is "@" (U+0040)
Z                        Prefix for Unicode 128+. The following characters are interpreted as UTF-8, reduced to 6-bit bytes by stripping the redundant top two bits. For example, "ZhBr" is "€" (UTF-8 [11]100010 [10]000010 [10]101100)
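To make the table concrete, here is a minimal encoder sketch in Python written directly from the rules above. It is not the code shipped in the `utf64` package. The names `encode_sketch`, `ALPHABET`, `PUNCTUATION`, `SPECIAL` and `AS_IS` are invented for this example, and it assumes that each code point at or above 128 gets its own `Z` prefix, which the table implies but does not state outright.

```python
# The 64 output characters; a character's position is its 6-bit value.
ALPHABET = "_ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-"

# Decoded forms of the letters A..U, in table order.
PUNCTUATION = "\"',.;:!?()[]{}#=+-*/\\"

# Single-letter codes: A..U for punctuation, V for newline, W for space.
SPECIAL = {ch: ALPHABET[i + 1] for i, ch in enumerate(PUNCTUATION)}
SPECIAL["\n"] = "V"
SPECIAL[" "] = "W"

# Characters that pass through unchanged.
AS_IS = set("_abcdefghijklmnopqrstuvwxyz0123456789")


def encode_sketch(text):
    out = []
    for ch in text:
        cp = ord(ch)
        if ch in AS_IS:
            out.append(ch)
        elif ch in SPECIAL:
            out.append(SPECIAL[ch])
        elif cp < 64:
            # X prefix: code points 0-63
            out.append("X" + ALPHABET[cp])
        elif cp < 128:
            # Y prefix: code points 64-127 (includes uppercase letters)
            out.append("Y" + ALPHABET[cp - 64])
        else:
            # Z prefix (assumed once per code point): UTF-8 bytes with the
            # top two bits stripped, one output character per byte.
            six_bit = (b & 0x3F for b in ch.encode("utf-8"))
            out.append("Z" + "".join(ALPHABET[v] for v in six_bit))
    return "".join(out)


print(encode_sketch('{"Hello":"world"}'))
```

Run against the examples in this README, the sketch gives `YHello` for `Hello`, `Xk` for `%`, `Y_` for `@` and `ZhBr` for `€`.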

See test.json for tests that (hopefully) cover all the edge cases, for both valid and invalid encodings.
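For the reverse direction, a decoder can be sketched from the same table. This is an illustration only and has not been checked against test.json: the name `decode_sketch` is invented, and its error handling reflects assumptions (a bare `-` outside a prefix is rejected, a truncated prefix is rejected, and non-canonical but well-formed escapes are accepted), so the real implementations may be stricter or looser.

```python
ALPHABET = "_ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-"
PUNCTUATION = "\"',.;:!?()[]{}#=+-*/\\"  # decoded forms of A..U
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def decode_sketch(encoded):
    out = []
    i, n = 0, len(encoded)
    while i < n:
        ch = encoded[i]
        i += 1
        if ch not in INDEX:
            raise ValueError(f"invalid character {ch!r}")
        if ch == "_" or ch.islower() or ch.isdigit():
            out.append(ch)                      # as-is
        elif "A" <= ch <= "U":
            out.append(PUNCTUATION[INDEX[ch] - 1])
        elif ch == "V":
            out.append("\n")
        elif ch == "W":
            out.append(" ")
        elif ch in "XY":
            if i >= n or encoded[i] not in INDEX:
                raise ValueError(f"truncated or invalid {ch} escape")
            base = 0 if ch == "X" else 64
            out.append(chr(base + INDEX[encoded[i]]))
            i += 1
        elif ch == "Z":
            if i >= n or encoded[i] not in INDEX:
                raise ValueError("truncated or invalid Z escape")
            lead = INDEX[encoded[i]]
            # The stripped lead byte still reveals the UTF-8 length:
            # 0xxxxx = 2 bytes, 10xxxx = 3 bytes, 110xxx = 4 bytes.
            if lead < 0b100000:
                length = 2
            elif lead < 0b110000:
                length = 3
            elif lead < 0b111000:
                length = 4
            else:
                raise ValueError("invalid UTF-8 lead byte in Z escape")
            chunk = encoded[i:i + length]
            if len(chunk) < length or any(c not in INDEX for c in chunk):
                raise ValueError("truncated or invalid Z escape")
            # Restore the "11" and "10" prefixes, then decode as UTF-8.
            raw = bytes([0xC0 | lead] + [0x80 | INDEX[c] for c in chunk[1:]])
            out.append(raw.decode("utf-8"))     # UnicodeDecodeError is a ValueError
            i += length
        else:
            # Only "-" reaches here; assumed invalid outside a prefix.
            raise ValueError("unexpected '-' outside an X, Y or Z escape")
    return "".join(out)


print(decode_sketch("MAYHelloAFAworldAN"))
```

Under these assumptions, `decode_sketch("YHelloG")` returns `Hello!` and `decode_sketch("ZhBr")` returns `€`, matching the examples above.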
