0.4.1 • Published 10 years ago

unicoderegexp

Regular expressions for Unicode character classes (letter, punctuation, number, etc.) and helper functions for composing them.

Used by the purify library.

The module exports the following RegExps, each consisting of a single character class:

  • letter
  • mark
  • number
  • punctuation
  • symbol
  • separator
  • other
  • visible
  • printable

var unicodeRegExp = require('unicoderegexp');

unicodeRegExp.visible.test("a"); // true
unicodeRegExp.visible.test(" "); // false
unicodeRegExp.visible.test("\u00a0"); // false -- a non-breaking space is not visible

To validate an entire string you need to build a new RegExp:

var visibleStringRegExp = new RegExp('^' + unicodeRegExp.visible.source + '*$');
visibleStringRegExp.test("foobar"); // true
visibleStringRegExp.test("foo bar"); // false because of the space
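
The same pattern can be wrapped in a small helper. The following is a minimal sketch, not part of unicoderegexp: the function name `wholeStringRegExp` and its `allowEmpty` parameter are hypothetical, and a plain ASCII class stands in for the exported RegExps so that the snippet runs on its own. It also shows that the `*` quantifier accepts the empty string, while `+` requires at least one character.

```javascript
// Hypothetical helper (an assumption for illustration, not exported by unicoderegexp).
// Builds an anchored RegExp from a RegExp whose source is a single character class.
function wholeStringRegExp(characterClassRegExp, allowEmpty) {
    var quantifier = allowEmpty ? '*' : '+';
    return new RegExp('^' + characterClassRegExp.source + quantifier + '$');
}

// Stand-in character class; with the library you would pass e.g. unicodeRegExp.visible.
var lowercaseAscii = /[a-z]/;

var maybeEmpty = wholeStringRegExp(lowercaseAscii, true);
var nonEmpty = wholeStringRegExp(lowercaseAscii, false);

maybeEmpty.test('foobar'); // true
maybeEmpty.test('foo bar'); // false because of the space
maybeEmpty.test(''); // true, "*" also matches zero characters
nonEmpty.test(''); // false
```

Whether the empty string should count as valid depends on the use case, so the quantifier is left as a parameter here.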

The module also provides helper functions for composing character classes:

unicodeRegExp.removeCharacterFromCharacterClassRegExp(/[æøå]/, 'æ'); // /[\u00f8\u00e5]/
unicodeRegExp.spliceCharacterClassRegExps(/[a-b]/, /[c-d]/); // /[a-bc-d]/
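
To illustrate the idea behind splicing, here is a rough sketch of how two character classes could be merged by concatenating the text between their brackets. This is an assumption about the general technique, not the library's actual implementation; the function name `spliceSimpleCharacterClasses` is hypothetical, and the sketch only accepts a single non-negated character class per argument.

```javascript
// Hypothetical sketch (not the library's code): merge the bodies of
// several single-character-class RegExps into one character class.
function spliceSimpleCharacterClasses() {
    var bodies = Array.prototype.map.call(arguments, function (regExp) {
        // Accept only "[...]" with no unescaped "]" inside the brackets.
        var match = /^\[((?:[^\]\\]|\\[\s\S])*)\]$/.exec(regExp.source);
        if (!match || match[1].charAt(0) === '^') {
            throw new Error('Expected a single non-negated character class: ' + regExp);
        }
        return match[1];
    });
    return new RegExp('[' + bodies.join('') + ']');
}

var combined = spliceSimpleCharacterClasses(/[a-b]/, /[c-d]/);
combined.source; // "[a-bc-d]"
combined.test('c'); // true
combined.test('e'); // false
```

Negated classes are rejected because concatenating their bodies would change the meaning of the result.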

The info about which characters belong to which classes was taken from the XRegExp library and its Unicode plugin.