Rex_regex NPM | npm.io

rex_regex-js

JavaScript manipulation of regular expressions

Want to use it? Want to get a glimpse? Want to participate?

Presentation

This module aims to simplify the task of creating BIG regexes with a dynamic aspect (variables can be set, then changed at any time).

It was originally written to build a parser (see example).

I find the name kind of fun, since Rex Reges seems to mean king of the kings in Latin.

USE

Works in the browser or in Node.

Add it with a script tag and use the global object rex_regex in the browser.

In Node, use require. For simplicity, the variable is given the same name here.

const rex_regex = require("rex_regex");//or path to the file

elements

rex_regex creates elements. To create an element, for example a group matching the word "hello", call either

var group = rex_regex.group("hello")
//or
var group = new rex_regex.group("hello")

All elements share these properties and functions:

group.raw; // "hello"
group.text; // "(hello)"
group.regex("g"); // /(hello)/g

group.one(); // new element (hello)
group.any(); // new element (hello)*
group.many(); // new element (hello)+
group.some(3,5); // new element (hello){3,5}
  group.some(3); // new element (hello){3}
  group.some(3,Infinity); // new element (hello){3,}
  // thanks mfix22
  // https://github.com/mfix22/rexrex
  // if I saw your module earlier I could have used it instead of coding this
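
The quantifier calls above correspond to standard regex quantifier suffixes. Here is a minimal sketch of that mapping in plain JavaScript. It is not the module's actual code: quantifierSuffix and repeatGroup are hypothetical helper names, and the only behaviour assumed is what the comments above document.

```javascript
// Hypothetical helper (not part of rex_regex): builds the suffix that
// some(min, max) is documented to append.
function quantifierSuffix(min, max) {
  if (max === undefined) return "{" + min + "}"; // some(3)          -> {3}
  if (max === Infinity) return "{" + min + ",}"; // some(3,Infinity) -> {3,}
  return "{" + min + "," + max + "}";            // some(3,5)        -> {3,5}
}

// Hypothetical helper: wraps raw text in a group and appends a suffix.
function repeatGroup(raw, suffix) {
  return "(" + raw + ")" + suffix;
}

const exact = repeatGroup("hello", quantifierSuffix(3));
const ranged = repeatGroup("hello", quantifierSuffix(3, 5));
const open = repeatGroup("hello", quantifierSuffix(3, Infinity));

// The resulting strings are ordinary regex sources.
const rangedRegex = new RegExp("^" + ranged + "$");
console.log(exact, ranged, open);
console.log(rangedRegex.test("hellohellohello")); // three repetitions fit {3,5}
console.log(rangedRegex.test("hellohello"));      // two repetitions do not
```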

For now there are 3 element types :

rex_regex.chars()

The chars element is a sequence of characters.

Be careful: escaped backslashes do not work in chars, because a \ is skipped by the any, many and some functions so that it can be used to escape other characters. So don't do rex_regex.chars("\\\\") and blame me.

The any, many and some functions apply to each character separately: for ab, many gives a+b+, and so on.

var a = rex_regex.chars("ab")

a.many().text; // "a+b+"
a.some(5).regex("m"); // /a{5}b{5}/m
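
To make the backslash warning concrete, here is a small sketch of how a per-character quantifier could behave. This is an assumption about the approach, not the module's source: quantifyEachChar is a hypothetical function that appends the suffix after every character except a backslash.

```javascript
// Hypothetical sketch (an assumption, not the module's source): append a
// quantifier suffix after every character, but never right after a backslash,
// so that an escape such as \. stays attached to the character it escapes.
function quantifyEachChar(raw, suffix) {
  let out = "";
  for (const ch of raw) {
    out += ch;
    if (ch !== "\\") out += suffix; // backslashes are skipped
  }
  return out;
}

const plain = quantifyEachChar("ab", "+");     // value: a+b+
const escaped = quantifyEachChar("\\.a", "*"); // value: \.*a*
const broken = quantifyEachChar("\\\\", "+");  // value: \\ (no quantifier added)

console.log(plain, escaped, broken);
```

Under this assumption, an escaped backslash never receives its quantifier, which would explain why chars("\\\\") does not behave as one might hope.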

rex_regex.set()

The set element is a set of characters, for example [ab9]

When calling it, DON'T write the brackets.

The any, many and some functions apply to the whole set: [ab9]*

Sets have a special operator, not, to be used carefully because it is poorly coded. It returns a new set element with ^ at the beginning. Improving this would require adding properties to the element, such as negated:true; I don't know yet.

var a = rex_regex.set("ab9")

a.any().text; // "[ab9]*"
a.some(5,7).regex("g"); // /[ab9]{5,7}/g

a.not().text; // "[^ab9]"

rex_regex.group()

The group element is a group of characters, for example (ab9)

When calling it, DON'T write the brackets.

There is no (? ) or (?! ) or anything of this kind, because I didn't need it, but if you want to participate, please feel free. Improving this would require adding properties to the element, such as lazy:true; I don't know yet.

var a = rex_regex.group("ab9")

a.raw; // "ab9"
a.any().text; // "(ab9)*"
a.some(5,Infinity).regex(); // /(ab9){5,}/
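
The difference between the three element types shows up once a quantifier is applied. The sketch below uses only native regex literals, so it does not depend on the module; the three patterns are what any() should produce for "ab9" according to the rules described above, which is an inference from this README rather than verified module output.

```javascript
// Pattern sources inferred from the documented behaviour of any().
const charsPattern = /^a*b*9*$/; // chars: each character repeated, in order
const setPattern = /^[ab9]*$/;   // set: any mix of the three characters
const groupPattern = /^(ab9)*$/; // group: the whole sequence repeated

const samples = ["aab99", "9ba", "ab9ab9"];
const results = samples.map((s) => ({
  sample: s,
  chars: charsPattern.test(s),
  set: setPattern.test(s),
  group: groupPattern.test(s),
}));

// aab99  -> chars: true,  set: true, group: false
// 9ba    -> chars: false, set: true, group: false
// ab9ab9 -> chars: false, set: true, group: true
console.log(results);
```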

chaining

Every function except .regex() returns a new rex_regex._core.Element. Furthermore, when creating a new element you can pass either a string or a rex_regex Element (its text will be taken), and several arguments are fine. This allows you to chain calls as in the example.

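var regexp = rex_regex.chars(
  rex_regex.group("ab").many(), "|", // elements and plain strings can be mixed
  rex_regex.set("0-9").any()
).regex("g");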

why ?

Personally, I needed to have flexible variables in a regex, so I just coded it and that's all.


example

Let's build a simple parser which separates words, spaces and hashtags into separate groups, with only letters allowed in hashtags:

// variable definitions -------------------
var hashtagC = rex_regex.chars("#");
var hashtagAuthorizedS = rex_regex.set("a-zA-Z");
var wordS = rex_regex.set("\\w");
var spacesS = rex_regex.set("\\s\\t\\n")

// making groups -------------------------
var hashtagGroup = rex_regex.group(
    hashtagC,
    hashtagAuthorizedS.many()
  );
var wordGroup = rex_regex.group(
    wordS.many()
  );
var spaceGroup = rex_regex.group(
    spacesS.many()
  );
var otherGroup = rex_regex.group(
  ".+"// i agree it's easier to do like that sometimes
)

var bigRegex =
  rex_regex.chars(
    hashtagGroup,"|", // or
    wordGroup,"|",
    spaceGroup,"|",
    otherGroup
  ).regex("g")

// gives /(#[a-zA-Z]+)|([\w]+)|([\s\t\n]+)|(.+)/g
// try it here
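
To see what the generated regex does with real input, here is a small tokenizer sketch. It uses the regex quoted above as a literal, so it runs without the module; tokenize and groupNames are hypothetical names introduced for illustration.

```javascript
// The regex quoted above, written as a literal so no library is needed.
const bigRegex = /(#[a-zA-Z]+)|([\w]+)|([\s\t\n]+)|(.+)/g;

// Hypothetical labels, one per capture group, in order.
const groupNames = ["hashtag", "word", "space", "other"];

// Walk through every match and report which capture group produced it.
function tokenize(input) {
  const tokens = [];
  for (const match of input.matchAll(bigRegex)) {
    const index = match.slice(1).findIndex((g) => g !== undefined);
    tokens.push({ type: groupNames[index], value: match[0] });
  }
  return tokens;
}

const tokens = tokenize("hi #tag you!");
// types in order: word, space, hashtag, space, word, other
console.log(tokens);
```

Since exactly one alternative matches at a time, the index of the only defined capture group tells you the token type.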

Of course, typing the regex directly is so much FASTER, but you would have to think a bit if one day you wanted to change the # to a @. (Not really, in fact... but it helps to see what you are doing; at least it seems so to me.)

This example is very simple. I coded this module to produce the following regex:

([#@][^ \t\r\n:#>;\-]+;|[:#][^ \t\r\n:#>;\-,\\<=+*%°ç^_`\-&|([{~}\]\)§!?$£¤€.]*[>;\-])|
(:[/!@#'":]|[:#][^ \t\r\n:#>;\-,\\<=+*%°ç^_`\-&|([{~}\]\)§!?$£¤€.]+)|
([/!@#'":];|[^ \t\r\n:#>;\-,\\<=+*%°ç^_`\-&|([{~}\]\)§!?$£¤€.]+[>;\-])|
([ \t]+)|
(\r|\n|\r\n)|
([^ \t\r\n:#>;\-]+)|
([^ \t\r\n]+)

I would never have had the patience to write this without a tool like rex_regex. And imagine if one day I wanted to change some characters?? I say, headache!


Making Of

Pull rules

If you want to participate, you are most welcome.

Here are the few rules to keep it coherent.

Priorities in writing

Let's keep some guidelines across the code:
- readable first
  today computers are powerful, so let's write something easy to read, with spaces, line breaks and lots of comments, even if it costs something; those who need to can minify
- flexible first ex aequo
  let's store as many core parameters as possible and assemble them in logical order, so they can be adapted later (e.g. the regex creator instead of a regex string)
- fast third
  making the code fast comes after readability, but it's cool too
- light last
  computers are POWERFUL today; I prefer big objects with clear property names, and parsing a few strings shouldn't kill your memory

What's done

See USE.

What's next (to-do list)

Adding properties to sets and groups, like

negated:true
lazy:true
named:true
name:"name"
...
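
As a rough idea of how such properties could drive the output, here is a design sketch in plain JavaScript. None of this is implemented in the module as documented here: renderSet, renderGroup and renderQuantifier are hypothetical functions, and the property names are simply taken from the list above.

```javascript
// Hypothetical design sketch: these functions are not part of rex_regex.

function renderSet(el) {
  // negated:true puts ^ right after the opening bracket
  return "[" + (el.negated ? "^" : "") + el.raw + "]";
}

function renderGroup(el) {
  // named:true with a name produces a named capture group (ES2018 syntax)
  const prefix = el.named ? "?<" + el.name + ">" : "";
  return "(" + prefix + el.raw + ")";
}

function renderQuantifier(suffix, el) {
  // lazy:true appends ? so the quantifier matches as little as possible
  return suffix + (el.lazy ? "?" : "");
}

const negatedSet = renderSet({ raw: "ab9", negated: true });
const namedGroup = renderGroup({ raw: "\\d+", named: true, name: "n" });
const lazyAny = renderGroup({ raw: "ab" }) + renderQuantifier("*", { lazy: true });

console.log(negatedSet); // [^ab9]
console.log(namedGroup); // (?<n>\d+)
console.log(lazyAny);    // (ab)*?
```

Keeping the flags as properties and building the text only at the end would fit the "flexible first" rule, since a flag could then be changed after the element has been created.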


regular expressions regexp js

0.1.1

7 years ago

0.1.0

7 years ago

rex_regex v0.1.1
