String-to-unicode-variant2

𝗎҉ toUnicodeVariant

JavaScript function to convert a string into different kinds of ⓤⓝⓘⓒⓞⓓⓔ variants.

toUnicodeVariant is an attempt to utilize Unicode in a structured, organized and logical manner.

browser

<script src="path/to/toUnicodeVariant.js"></script>

nodejs

const toUnicodeVariant = require('path/to/toUnicodeVariant.js')

Usage

Pass a string and the name of a variant (or its alias), and you get the converted Unicode string in return:

toUnicodeVariant(string, variant, combinings)
...
toUnicodeVariant('monospace', 'm') //like first row below

Variant	Alias	Description	Example
monospace	m	Monospace	𝚖𝚘𝚗𝚘𝚜𝚙𝚊𝚌𝚎
bold	b	Bold text	𝐛𝐨𝐥𝐝
italic	i	Italic text	𝑖𝑡𝑎𝑙𝑖𝑐
bold italic	bi	bold+italic text	𝒃𝒐𝒍𝒅 𝒊𝒕𝒂𝒍𝒊𝒄
script	c	Handwriting style	𝓈𝒸𝓇𝒾𝓅𝓉
bold script	bc	Bolder handwriting	𝓫𝓸𝓵𝓭 𝓼𝓬𝓻𝓲𝓹𝓽
gothic	g	Gothic (fraktur)	𝔤𝔬𝔱𝔥𝔦𝔠
gothic bold	bg	Gothic in bold	𝖌𝖔𝖙𝖍𝖎𝖈 𝖇𝖔𝖑𝖉
doublestruck	d	Outlined text	𝕕𝕠𝕦𝕓𝕝𝕖𝕤𝕥𝕣𝕦𝕔𝕜
sans	s	Sans-serif style	𝗌𝖺𝗇𝗌
bold sans	bs	Bold sans-serif	𝗯𝗼𝗹𝗱 𝘀𝗮𝗻𝘀
italic sans	is	Italic sans-serif	𝘪𝘵𝘢𝘭𝘪𝘤 𝘴𝘢𝘯𝘴
bold italic sans	bis	Bold italic sans-serif	𝙗𝙤𝙡𝙙 𝙞𝙩𝙖𝙡𝙞𝙘 𝙨𝙖𝙣𝙨
circled	o	Letters within circles	ⓒⓘⓡⓒⓛⓔⓓ
circled negative	on	-- negative	🅒🅘🅡🅒🅛🅔🅓
squared	q	Letters within squares	🅂🅀🅄🄰🅁🄴🄳
squared negative	qn	-- negative	🆂🆀🆄🅰🆁🅴🅳
paranthesis	p	Letters within parentheses	⒫⒜⒭⒠⒩⒯⒣⒠⒮⒤⒮
fullwidth	w	Wider monospace font	ｆｕｌｌｗｉｄｔｈ
flags	f	Regional codes	🇩🇰 🇺 🇳 🇮 🇨 🇴 🇩 🇪
numbers dot	nd	Numbers with trailing dot	⒈⒉⒊⒋
numbers comma	nc	Numbers with trailing comma	🄂🄃🄄🄅
number double circled	ndc	Numbers within double circle	⓵⓶⓷⓸
roman	r	Roman numerals	Ⅰ, Ⅱ, ⅯⅯⅩⅩⅢ
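
Some of the alphabetic variants in the table occupy contiguous code point ranges, so the conversion can be sketched as adding a fixed offset to each character. The block below is a minimal sketch under that assumption, not the library's actual source. `sketchOffsets` and `sketchVariant` are hypothetical names, and only bold and monospace are covered because those two ranges have no gaps.

```javascript
// Hypothetical sketch, NOT the library's source.
// Each entry lists the code point of 'A', 'a' and '0' in that style
// (Mathematical Alphanumeric Symbols block).
const sketchOffsets = {
  bold: [0x1d400, 0x1d41a, 0x1d7ce],
  monospace: [0x1d670, 0x1d68a, 0x1d7f6],
};

function sketchVariant(str, variant) {
  const starts = sketchOffsets[variant];
  if (!starts) return str; // unknown variant: return the input unchanged
  const [upper, lower, digit] = starts;
  let out = '';
  for (const ch of str) {
    const cp = ch.codePointAt(0);
    if (cp >= 0x41 && cp <= 0x5a) {
      out += String.fromCodePoint(upper + cp - 0x41); // A-Z
    } else if (cp >= 0x61 && cp <= 0x7a) {
      out += String.fromCodePoint(lower + cp - 0x61); // a-z
    } else if (cp >= 0x30 && cp <= 0x39) {
      out += String.fromCodePoint(digit + cp - 0x30); // 0-9
    } else {
      out += ch; // everything else passes through untouched
    }
  }
  return out;
}

console.log(sketchVariant('monospace 123', 'monospace'));
```

Variants with gaps in their ranges (italic, script, gothic, doublestruck) would need per-character exceptions, which this sketch leaves out.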

Combining with underline, strike and other diacritical marks

The converted text can be combined with a broad range of diacritical marks:

toUnicodeVariant('underlined', 'bold italic', 'underline-double')//𝒖̳𝒏̳𝒅̳𝒆̳𝒓̳𝒍̳𝒊̳𝒏̳𝒆̳𝒅̳

You can control the space between characters by using space-combinings. In the above table, the halo- and enclose- samples are rendered together with a space-en to make them look nicer.

Combinings can be combined

You can use two, three or more combinings, either by passing a comma-separated string or by passing an array of strings:

toUnicodeVariant('The quick brown fox jumps ...', 'sans', 'underline, overline, strike')
toUnicodeVariant('The quick brown fox jumps ...', 'sans', ['underline', 'overline', 'strike'])

𝖳̶̲̅𝗁̶̲̅𝖾̶̲̅ ̶̲̅𝗊̶̲̅𝗎̶̲̅𝗂̶̲̅𝖼̶̲̅𝗄̶̲̅ ̶̲̅𝖻̶̲̅𝗋̶̲̅𝗈̶̲̅𝗐̶̲̅𝗇̶̲̅ ̶̲̅𝖿̶̲̅𝗈̶̲̅𝗑̶̲̅ ̶̲̅𝗃̶̲̅𝗎̶̲̅𝗆̶̲̅𝗉̶̲̅𝗌̶̲̅ ̶̲̅𝗈̶̲̅𝗏̶̲̅𝖾̶̲̅𝗋̶̲̅ ̶̲̅𝗍̶̲̅𝗁̶̲̅𝖾̶̲̅ ̶̲̅𝗅̶̲̅𝖺̶̲̅𝗓̶̲̅𝗒̶̲̅ ̶̲̅𝖽̶̲̅𝗈̶̲̅𝗀̶̲̅

You can also use shorthand aliases, or mix aliases and full names: 'u,o,s', ['u','o','strike'], etc.
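
To illustrate the idea, combinings can be sketched as appending one or more combining code points after every character. This is a minimal sketch and not the library's code: `sketchMarks` and `sketchCombine` are hypothetical names, and the three code points used (U+0332 combining low line, U+0305 combining overline, U+0336 combining long stroke overlay) are assumptions about which marks the names stand for.

```javascript
// Hypothetical sketch, NOT the library's source.
// Maps full names and aliases to an assumed combining code point.
const sketchMarks = new Map([
  ['underline', '\u0332'], ['u', '\u0332'],
  ['overline', '\u0305'], ['o', '\u0305'],
  ['strike', '\u0336'], ['s', '\u0336'],
]);

function sketchCombine(str, combinings) {
  // Accept either 'underline, strike' or ['underline', 'strike'].
  const names = Array.isArray(combinings)
    ? combinings
    : String(combinings).split(',');
  // Unknown names are silently ignored in this sketch.
  const marks = names.map((name) => sketchMarks.get(name.trim()) || '').join('');
  // Array.from iterates by code point, so astral characters stay intact.
  return Array.from(str, (ch) => ch + marks).join('');
}

console.log(sketchCombine('fox', 'u, o, s'));
console.log(sketchCombine('fox', ['u', 'o', 'strike']));
```

Both calls produce the same string, since the comma-separated form is split and trimmed into the same list of names.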

Special chars

Language-specific special characters like ç, ò or ø are not supported by any Unicode "variant", and almost certainly never will be. The script and gothic fonts are in fact just various kinds of mathematical symbols (see references below). For many of the variants, converting a special character like ø will at best look odd and at worst ruin the entire string (this varies by reader / browser).

However, by using the base Latin character as a fallback and injecting the matching diacritical marks, we can experimentally try to mimic some language-specific characters. Adding diacritics fails with the figurative variants, but it works okay with most of the rest.

toUnicodeVariant('üničode', 'bold italic') //𝒖̈𝒏𝒊𝒄̌𝒐𝒅𝒆
toUnicodeVariant('ÜNIĈODE', 'bold italic') //𝑼𝑵𝑰𝑪𝑶𝑫𝑬
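
The fallback idea can be sketched with canonical decomposition: `normalize('NFD')` splits a character like ü into its base letter and a combining mark, the base letter is converted, and the mark is kept after it. This is a minimal sketch for the bold italic range only. `sketchBoldItalic` is a hypothetical name, and whether the library works this way internally is an assumption.

```javascript
// Hypothetical sketch, NOT the library's source.
const BOLD_ITALIC_UPPER = 0x1d468; // MATHEMATICAL BOLD ITALIC CAPITAL A
const BOLD_ITALIC_LOWER = 0x1d482; // MATHEMATICAL BOLD ITALIC SMALL A

function sketchBoldItalic(str) {
  let out = '';
  // After NFD, each accented letter is a base letter followed by its marks.
  for (const ch of str.normalize('NFD')) {
    const cp = ch.codePointAt(0);
    if (cp >= 0x41 && cp <= 0x5a) {
      out += String.fromCodePoint(BOLD_ITALIC_UPPER + cp - 0x41);
    } else if (cp >= 0x61 && cp <= 0x7a) {
      out += String.fromCodePoint(BOLD_ITALIC_LOWER + cp - 0x61);
    } else {
      out += ch; // combining marks and unsupported characters are kept as-is
    }
  }
  return out;
}

console.log(sketchBoldItalic('üničode'));
```

Characters without a canonical decomposition, such as ø, pass through unchanged in this sketch, which matches the caveat that they cannot be styled.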

Additions, limitations

Besides the limitations you can see in the various compatibility tables above, some variants offer extra unique features, while other variants are reduced to a single feature.

Ⅻ roman, continued

If you pass a number (integer) instead of a string, that number is automatically converted to a Roman numeral before being converted to Unicode:

 toUnicodeVariant(2023, 'roman') //ⅯⅯⅩⅩⅢ
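
As an illustration, the romanizing step can be sketched as a greedy subtraction over a value table that uses the Unicode Number Forms glyphs (U+2160 to U+216F). This is a minimal sketch, not the library's source: `sketchRomanTable` and `sketchRoman` are hypothetical names, and the 1 to 3999 limit is an assumption of the sketch.

```javascript
// Hypothetical sketch, NOT the library's source.
// Values 1-9 use the precomposed glyphs (for example U+2162 for three),
// larger values are built from the single-letter glyphs.
const sketchRomanTable = [
  [1000, '\u216f'], [900, '\u216d\u216f'], [500, '\u216e'], [400, '\u216d\u216e'],
  [100, '\u216d'], [90, '\u2169\u216d'], [50, '\u216c'], [40, '\u2169\u216c'],
  [10, '\u2169'], [9, '\u2168'], [8, '\u2167'], [7, '\u2166'], [6, '\u2165'],
  [5, '\u2164'], [4, '\u2163'], [3, '\u2162'], [2, '\u2161'], [1, '\u2160'],
];

function sketchRoman(n) {
  if (!Number.isInteger(n) || n < 1 || n > 3999) {
    throw new RangeError('expected an integer between 1 and 3999');
  }
  let out = '';
  let rest = n;
  for (const [value, glyph] of sketchRomanTable) {
    // Take each value as many times as it fits, largest first.
    while (rest >= value) {
      out += glyph;
      rest -= value;
    }
  }
  return out;
}

console.log(sketchRoman(2023));
```

For 2023 the sketch emits five glyphs: two for 1000, two for 10, and the single precomposed glyph for 3.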

flags, f

a-z and A-Z only. Based on the highly specialized regional indicator symbols (see references below, U1F100.pdf). To use it, pass a string with whitespace between each character (otherwise expect odd output; there is no fallback to monospace):

toUnicodeVariant('U N I C O D E', 'f') //🇺 🇳 🇮 🇨 🇴 🇩 🇪

However, if you pass a string that contains a country code, or even the code of some international organizations, many readers will render the corresponding flag instead:

toUnicodeVariant('DK EU UN', 'flags') //🇩🇰 🇪🇺 🇺🇳
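
The behaviour above follows from how regional indicator symbols work: the letters A to Z map to U+1F1E6 through U+1F1FF, and a renderer may draw two adjacent indicators as a flag. The block below is a minimal sketch of that mapping. `sketchFlags` is a hypothetical name and this is not the library's source.

```javascript
// Hypothetical sketch, NOT the library's source.
const REGIONAL_INDICATOR_A = 0x1f1e6; // REGIONAL INDICATOR SYMBOL LETTER A

function sketchFlags(str) {
  let out = '';
  for (const ch of str.toUpperCase()) {
    const cp = ch.codePointAt(0);
    if (cp >= 0x41 && cp <= 0x5a) {
      out += String.fromCodePoint(REGIONAL_INDICATOR_A + cp - 0x41);
    } else {
      out += ch; // whitespace and other characters separate the indicators
    }
  }
  return out;
}

// Separated letters stay single indicators; adjacent pairs may render as flags.
console.log(sketchFlags('U N I C O D E'));
console.log(sketchFlags('DK EU UN'));
```

Whether a pair is drawn as a flag is decided by the reader or browser, not by the string itself, which is why the whitespace matters.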

Reset a converted string

Use String.normalize()

See https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/normalize

'𝖆𝖇𝖈𝖉𝖊𝖋𝖌𝖍𝖎𝖏𝖐𝖑𝖒𝖓𝖔𝖕𝖖𝖗𝖘𝖙𝖚𝖛𝖜𝖝𝖞𝖟'.normalize('NFKC') //or NFKD

returns abcdefghijklmnopqrstuvwxyz
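
Note that compatibility normalization folds the styled letters back to plain ones but leaves combining marks, such as an underline, in place. A minimal sketch of a helper that also strips those marks is given below. `sketchReset` is a hypothetical name and not part of the library, and it assumes the marks come from the Combining Diacritical Marks block (U+0300 to U+036F).

```javascript
// Hypothetical helper, NOT part of the library.
function sketchReset(str) {
  return str
    .normalize('NFKD') // fold styled letters back to their plain forms
    .replace(/[\u0300-\u036f]/g, ''); // drop combining marks U+0300..U+036F
}

// A bold 'a' with a combining low line, followed by a bold 'b'.
const styled = String.fromCodePoint(0x1d41a) + '\u0332' + String.fromCodePoint(0x1d41b);
console.log(sketchReset(styled));
```

Because NFKD also decomposes genuine accented letters, this sketch removes their accents too (é becomes e), so it only suits text where that loss is acceptable.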

Test

Browser: test/browser.html
Node: test$ node node.js

These tests show all variants and their coverage of a-z, A-Z and 0-9, along with flag combinations. For reference, in Chrome (Ubuntu 20.04, 112.x) the variants look like this:

Alternatively, review the sample output in test/result-sample.html.txt. Try it out in different browsers; coverage differs significantly between them.