3.1.0 • Published 3 years ago

@csstools/tokenizer v3.1.0

License
CC0-1.0

CSS Tokenizer

This tool lets you tokenize CSS according to the CSS Syntax Specification. Tokenizing CSS means separating a string of CSS into its smallest semantic parts, otherwise known as tokens.

This tool is intended to be used in other tools on the front and back end. It seeks to maintain:

  • 100% compliance with the CSS syntax specification. ✨
  • 100% code coverage. 🦺
  • 100% static typing. 💪
  • 1kB maximum contribution size. 📦
  • Superior quality over Shark P. 🦈

Usage

Add the CSS tokenizer to your project:

npm install @csstools/tokenizer

Tokenize CSS in JavaScript:

import { tokenize } from '@csstools/tokenizer'

for (const token of tokenize(cssText)) {
  console.log(token) // logs an individual CSSToken
}

Tokenize CSS in classical NodeJS:

const { tokenizer } = require('@csstools/tokenizer')

let iterator = tokenizer(cssText), iteration

while (!(iteration = iterator()).done) {
  console.log(iteration.value) // logs an individual CSSToken
}
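
The loop above treats the result of tokenizer(cssText) as a function that returns { done, value } records each time it is called. As a minimal sketch, assuming only that calling convention, a small generator can adapt such a function for use with for...of. The names toIterable and fakeNext are hypothetical, and fakeNext is a stand-in used here in place of the real tokenizer.

```javascript
// Hypothetical helper: wraps a "next"-style function (one that returns
// { done, value } records when called) in a generator usable with for...of.
function* toIterable(next) {
  let iteration
  while (!(iteration = next()).done) {
    yield iteration.value
  }
}

// Stand-in for tokenizer(cssText): hands out the items of an array one by one.
// This is NOT the real tokenizer; it only mimics the calling convention.
function fakeNext(items) {
  let index = 0
  return () =>
    index < items.length
      ? { done: false, value: items[index++] }
      : { done: true, value: undefined }
}

const collected = []
for (const value of toIterable(fakeNext(['a', 'b', 'c']))) {
  collected.push(value)
}
console.log(collected)
```

If the assumption holds, toIterable(tokenizer(cssText)) could be passed anywhere an iterable of tokens is expected, such as Array.from.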

Tokenize CSS in client-side scripts:

<script type="module">

import { tokenize } from 'https://unpkg.com/@csstools/tokenizer?module'

for (const token of tokenize(cssText)) {
  console.log(token) // logs an individual CSSToken
}

</script>

Tokenize CSS in classical client-side scripts:

<script src="http://unpkg.com/@csstools/tokenizer"></script>
<script>

const tokens = Array.from(tokenizeCSS(cssText)) // an array of CSSTokens

</script>

How it works

The CSS tokenizer separates a string of CSS into tokens.

interface CSSToken {
  /** Position in the string at which the token was retrieved. */
  tick: number

  /** Number identifying the kind of token. */
  type:
    | 1 // Symbol
    | 2 // Comment
    | 3 // Space
    | 4 // Word
    | 5 // Action
    | 6 // Atword
    | 7 // Hash
    | 8 // String
    | 9 // Number
  
  /** Character code (when a Symbol, otherwise -1) */
  code: number

  /** Lead, like the opening of a comment or the quotation mark of a string. */
  lead: string

  /** Data, like the numbers before a unit, or the letters after an at-sign. */
  data: string

  /** Tail, like the unit of a number, or the closing of a comment. */
  tail: string
}

As an example, the CSS string @media would become an Atword token where @ and media are recognized as distinct parts of that token. As another example, the CSS string 5px would become a Number token where 5 and px are recognized as distinct parts of that token. As a final example, the string 5px 10px would become three tokens: the Number mentioned before (5px), a Space token that represents a single space (" "), and then another Number token (10px).
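
The examples above can be written out as token objects. This is a hand-written sketch rather than captured tokenizer output: the lead, data, and tail values follow the interface and the prose, but the tick values and the choice to place the space character in data are assumptions. TYPE_NAMES and describeToken are hypothetical helpers.

```javascript
// Names for the numeric token types, copied from the CSSToken interface.
const TYPE_NAMES = {
  1: 'Symbol', 2: 'Comment', 3: 'Space', 4: 'Word', 5: 'Action',
  6: 'Atword', 7: 'Hash', 8: 'String', 9: 'Number',
}

// Hypothetical helper: labels a token and rejoins its three parts.
function describeToken(token) {
  const text = token.lead + token.data + token.tail
  return `${TYPE_NAMES[token.type]} ${JSON.stringify(text)}`
}

// Hand-written tokens for "@media" and "5px 10px" (assumed values, see above).
const atword = { tick: 0, type: 6, code: -1, lead: '@', data: 'media', tail: '' }
const numbers = [
  { tick: 0, type: 9, code: -1, lead: '', data: '5', tail: 'px' },
  { tick: 3, type: 3, code: -1, lead: '', data: ' ', tail: '' },
  { tick: 4, type: 9, code: -1, lead: '', data: '10', tail: 'px' },
]

console.log(describeToken(atword))
console.log(numbers.map(describeToken))
```

Under these assumptions, joining lead, data, and tail across the three tokens gives back the original string 5px 10px.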

Benchmarks

As of August 23, 2021, these benchmarks were averaged from my local machine:

Benchmark: Tailwind CSS
  ┌────────────────────────────────────────────────────┬────────┬────────┬────────┐
  │                      (index)                       │   ms   │ ms/50k │ tokens │
  ├────────────────────────────────────────────────────┼────────┼────────┼────────┤
  │ CSSTree 1 x 7.55 ops/sec ±11.49% (24 runs sampled) │ 132.48 │ 13.87  │ 477434 │
  │ PostCSS 8 x 13.78 ops/sec ±2.73% (39 runs sampled) │ 72.56  │  3.88  │ 935267 │
  │ Tokenizer x 17.09 ops/sec ±1.09% (47 runs sampled) │ 58.52  │  3.09  │ 948045 │
  └────────────────────────────────────────────────────┴────────┴────────┴────────┘

Benchmark: Bootstrap
  ┌──────────────────────────────────────────────────┬──────┬────────┬────────┐
  │                     (index)                      │  ms  │ ms/50k │ tokens │
  ├──────────────────────────────────────────────────┼──────┼────────┼────────┤
  │ CSSTree 1 x 118 ops/sec ±2.39% (77 runs sampled) │ 8.5  │  13.1  │ 32425  │
  │ PostCSS 8 x 408 ops/sec ±0.10% (96 runs sampled) │ 2.45 │  2.4   │ 51170  │
  │ Tokenizer x 288 ops/sec ±0.14% (93 runs sampled) │ 3.48 │  2.92  │ 59566  │
  └──────────────────────────────────────────────────┴──────┴────────┴────────┘
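
The ms/50k column appears to be the ms column normalized to 50,000 tokens. The source does not state this, but the figures in the first table are consistent with that reading. A minimal sketch of the arithmetic, with msPer50k as a hypothetical helper name:

```javascript
// Assumed formula: time per run scaled to 50,000 tokens, rounded to 2 decimals.
function msPer50k(ms, tokens) {
  return Math.round((ms / tokens) * 50000 * 100) / 100
}

// ms and token counts copied from the first benchmark table.
const rows = [
  { name: 'CSSTree 1', ms: 132.48, tokens: 477434 },
  { name: 'PostCSS 8', ms: 72.56, tokens: 935267 },
  { name: 'Tokenizer', ms: 58.52, tokens: 948045 },
]

for (const row of rows) {
  console.log(row.name, msPer50k(row.ms, row.tokens))
}
```

Normalizing this way matters because the three tools split the same stylesheet into different numbers of tokens, so raw ms alone does not compare like with like.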

Development

You wanna take a deeper dive? Awesome! Here are a few useful development commands.

npm run build

The build command creates all the files needed to run this tool in many different JavaScript environments.

npm run build

npm run benchmark

The benchmark command builds the project and then tests its performance compared to PostCSS. These benchmarks are run against Bootstrap and Tailwind CSS.

npm run benchmark

npm run test

The test command tests the coverage and accuracy of the tokenizer.

As of September 26, 2020, this tokenizer has 100% test coverage:

npm run test