1.0.1 • Published 7 months ago

@evologi/fixed-width v1.0.1

License
MIT

fixed-width

A fixed-width file format toolset with streaming support and flexible options.

Features

  • Flexible: lots of options
  • Zero dependencies: small footprint
  • Native streaming support
  • Well tested: code coverage above 90%
  • Support big datasets: production ready
  • Native ESM support: future proof
  • TypeScript support

Fixed width file format

The fixed-width file format is a simple method to store data in a file. A separator divides rows (typically a newline char), and every field has a fixed width within the row.

This format is useful for seeking data directly in the file without any parsing procedure, but it has no compression mechanism. As a result, it can use more memory (both RAM and disk) than other file formats (like JSON or XML) need to store the same data.
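
The direct-seek property can be illustrated with a minimal sketch that does not use this library. It assumes a single-byte encoding, a one-byte \n separator, and rows of exactly 15 chars (the layout of the example file below); readRow is a hypothetical helper name.

```javascript
// Hypothetical helper: read the n-th row (0-based) without scanning earlier rows.
// Assumes every row is exactly rowWidth bytes followed by an eolWidth-byte separator.
function readRow (buffer, n, rowWidth, eolWidth = 1) {
  const start = n * (rowWidth + eolWidth)
  return buffer.subarray(start, start + rowWidth).toString('latin1')
}

const data = Buffer.from('alice       024\nbob         030\n', 'latin1')

// The second row starts at byte 16, so no parsing of the first row is needed.
const secondRow = readRow(data, 1, 15)
console.log(secondRow) // 'bob         030'
```

The offset is plain arithmetic, which is why the format allows random access; a multi-byte encoding or a variable separator would break this assumption.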

But, in the year [insert current year here] A.D., you can still find someone using such a file format. Fear not, my friend! This Node.js library provides a toolset to parse and generate fixed-width files.

Example file

The following is an example of fixed-width data with two fields: username and age. The first field has a width of 12 chars, and the second one has a width of 3 chars and is right-aligned.

username    age
alice       024
bob         030
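
To make the layout concrete, here is a minimal sketch that cuts one of the lines above into its raw fields by hand. It does not use this library; sliceLine and layout are hypothetical names, and the values are left untrimmed on purpose.

```javascript
// Field layout taken from the description above: property name and width only.
const layout = [
  { property: 'username', width: 12 },
  { property: 'age', width: 3 }
]

// Hypothetical helper: cut one line into raw (still padded) field values.
function sliceLine (line, fields) {
  const result = {}
  let offset = 0
  for (const field of fields) {
    result[field.property] = line.slice(offset, offset + field.width)
    offset += field.width
  }
  return result
}

const raw = sliceLine('alice       024', layout)
console.log(raw) // { username: 'alice       ', age: '024' }
```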

Install

npm i @evologi/fixed-width

Usage

import { parse, stringify, Parser, Stringifier } from '@evologi/fixed-width'

parse(input, options)

It parses a text or a buffer into an array of elements. If it receives a string or a buffer, it returns an array of all parsed items. If it receives an iterable (sync or async), it returns the same kind of iterable.

  • input <String> | <Buffer> | <Iterable> | <AsyncIterable> The raw data to parse.
  • options <Object> See options section.
  • Returns: <Array> | <Iterable> | <AsyncIterable>

import { parse } from '@evologi/fixed-width'

const text = 'alice       024\nbob         030\n'

const users = parse(text, {
  eol: '\n',
  fields: [
    {
      property: 'username',
      width: 12
    },
    {
      cast: value => parseInt(value, 10),
      property: 'age',
      width: 3
    }
  ]
})

// [{ username: 'alice', age: 24 }, { username: 'bob', age: 30 }]
console.log(users)

stringify(input, options)

It serializes an array of elements into a string. If the argument is an array, the output will be a string, and the whole conversion is performed immediately and in memory. If the argument is another kind of iterable (sync or async), the output will be the same kind of iterable.

  • input <Array> | <Iterable> | <AsyncIterable>
  • options <Object> See options section.
  • Returns: <String> | <Iterable> | <AsyncIterable>

import { stringify } from '@evologi/fixed-width'

const users = [
  { username: 'alice', age: 24 },
  { username: 'bob', age: 30 }
]

const text = stringify(users, {
  eof: true,
  eol: '\n',
  fields: [
    {
      align: 'left',
      property: 'username',
      width: 12
    },
    {
      align: 'right',
      pad: '0',
      property: 'age',
      width: 3
    }
  ]
})

// 'alice       024\nbob         030\n'
console.log(text)

Parser.stream(options)

It returns a Transform stream that accepts strings (or buffers) as input and emits the parsed elements.

import { Parser } from '@evologi/fixed-width'

const stream = Parser.stream({
  eol: '\n',
  fields: [
    {
      property: 'username',
      width: 12
    },
    {
      cast: value => parseInt(value, 10),
      property: 'age',
      width: 3
    }
  ]
})

stream
  .on('error', err => console.error(err))
  .on('data', data => console.log(data))
  .on('end', () => console.log('end'))

stream.write('alice       024\nbob         030')
stream.end()

Stringifier.stream(options)

It returns a Transform stream that accepts objects (or arrays) as input and emits the serialized strings.

import { Stringifier } from '@evologi/fixed-width'

const stream = Stringifier.stream({
  eof: true,
  eol: '\n',
  fields: [
    {
      align: 'left',
      property: 'username',
      width: 12
    },
    {
      align: 'right',
      pad: '0',
      property: 'age',
      width: 3
    }
  ]
})

let text = ''

stream
  .on('error', err => console.error(err))
  .on('data', data => { text += data.toString() })
  .on('end', () => {
    // 'alice       024\nbob         030\n'
    console.log(text)
  })

stream.write({ username: 'alice', age: 24 })
stream.write({ username: 'bob', age: 30 })
stream.end()

new Parser(options)

It creates a Parser instance. This object is useful when a custom optimized procedure is necessary. This object is used internally by the Node.js stream and the parse() function.

It exposes only two methods, and both are strictly synchronous.

Parser#write(stringOrBuffer)

It accepts a string or a buffer as input and returns an iterable that outputs all parsed objects up to the last completed row (line).

  • stringOrBuffer <String> | <Buffer>
  • Returns: <Iterable>

Parser#end()

It resets the Parser status and returns an iterable that may output the objects contained in the last partial row (line).

  • Returns: <Iterable>

import { Parser } from '@evologi/fixed-width'

const parser = new Parser({
  eol: '\n',
  fields: [
    {
      property: 'username',
      width: 12
    },
    {
      cast: value => parseInt(value, 10),
      property: 'age',
      width: 3
    }
  ]
})

const users = Array.from(
  parser.write('alice       024\nbob         030')
).concat(
  Array.from(
    parser.end()
  )
)

// [{ username: 'alice', age: 24 }, { username: 'bob', age: 30 }]
console.log(users)
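
The write()/end() contract described above (complete rows come out of write(), the trailing partial row comes out of end()) can be pictured with a small line buffer. This is a sketch under that reading, not the library's implementation: LineBuffer is a hypothetical class, it only splits on \n, and it yields raw lines instead of parsed objects.

```javascript
// Hypothetical class mimicking the write()/end() contract for raw lines only.
class LineBuffer {
  constructor () {
    this.pending = ''
  }

  // Yields every line completed by this chunk and keeps the remainder for later.
  * write (chunk) {
    const parts = (this.pending + chunk).split('\n')
    this.pending = parts.pop()
    yield * parts
  }

  // Yields the last partial line (if any) and resets the internal state.
  * end () {
    const rest = this.pending
    this.pending = ''
    if (rest.length > 0) {
      yield rest
    }
  }
}

const lineBuffer = new LineBuffer()
const lines = [
  ...lineBuffer.write('alice       0'),
  ...lineBuffer.write('24\nbob         030'),
  ...lineBuffer.end()
]
console.log(lines) // ['alice       024', 'bob         030']
```

Note that the first chunk ends in the middle of a row, so nothing is emitted until the second chunk completes it.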

new Stringifier(options)

It creates a Stringifier instance. This object is useful when a custom optimized procedure is necessary. This object is used internally by the Node.js stream and the stringify() function.

It exposes only two methods, and both are strictly synchronous.

Stringifier#write(obj)

Push an object to serialize. Returns the serialized text of the passed object, including new line terminators.

  • obj <*>
  • Returns: <String>

Stringifier#end()

Closes the serialization and returns the final string.

  • Returns: <String>

import { Stringifier } from '@evologi/fixed-width'

const stringifier = new Stringifier({
  eof: true,
  eol: '\n',
  fields: [
    {
      align: 'left',
      property: 'username',
      width: 12
    },
    {
      align: 'right',
      pad: '0',
      property: 'age',
      width: 3
    }
  ]
})

let text = ''
text += stringifier.write({ username: 'alice', age: 24 })
text += stringifier.write({ username: 'bob', age: 30 })
text += stringifier.end()

// 'alice       024\nbob         030\n'
console.log(text)

Options

encoding

Type: <String>

Default: "utf8"

The encoding used to handle strings and buffers. Only Node.js encodings are supported.

eol

Type: <String>

The End Of Line character that divides record rows. It will default to os.EOL for serialization. For parsing, the Parser will try to guess the correct line separator.
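
How the Parser guesses the separator is not documented here. As an illustration only, one possible heuristic is to take the first line break found in a sample of the text; guessEol is a hypothetical helper and the library's actual detection may differ.

```javascript
// Hypothetical heuristic: pick the first line separator found in the sample text.
// '\r\n' is listed first so a Windows line break is not mistaken for a lone '\r'.
function guessEol (sample) {
  const match = /\r\n|\n|\r/.exec(sample)
  return match ? match[0] : undefined
}

console.log(JSON.stringify(guessEol('alice       024\r\nbob         030'))) // "\r\n"
console.log(JSON.stringify(guessEol('alice       024\nbob         030'))) // "\n"
console.log(guessEol('alice       024')) // undefined
```

When the input format is known in advance, passing eol explicitly avoids any guessing.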

eof

Type: <Boolean>

Default: true

Appends the End Of File char. If true, an End Of Line character is added at the end of the file.

pad

Type: <String>

Default: " "

Values shorter than their field's width will be padded with this value while serializing. It's also the trimming value removed while parsing.

See trim, field.pad, and field.align options.
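
The interaction between padding and alignment while serializing can be sketched with the standard padStart and padEnd string methods. This is an illustration, not the library's code; padValue is a hypothetical helper and it does not handle values longer than the field.

```javascript
// Hypothetical helper: pad a value to a fixed width, as the pad and align options describe.
function padValue (value, width, pad = ' ', align = 'left') {
  const text = String(value)
  // Left-aligned values receive the padding on the right, and vice versa.
  return align === 'right' ? text.padStart(width, pad) : text.padEnd(width, pad)
}

const username = padValue('alice', 12)
const age = padValue(24, 3, '0', 'right')
console.log(username + age) // 'alice       024'
```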

trim

Type: <Boolean> | <String>

Default: true

It enables or disables value trimming while parsing. You can also specify partial trims with the "left" and "right" values. A false value will disable trimming.

The trimmed value corresponds to the field's padding value.

trim('004200', { pad: '0', trim: 'right' })
// the trimmed value will be '0042'
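
The four trim modes can be compared with a small standalone sketch. trimValue is a hypothetical helper (the library does not necessarily expose such a function), and it assumes a single-character padding value.

```javascript
// Hypothetical helper: strip leading and/or trailing pad characters.
// mode follows the trim option: true (both sides), 'left', 'right', or false.
function trimValue (value, pad = ' ', mode = true) {
  let start = 0
  let end = value.length
  if (mode === true || mode === 'left') {
    while (start < end && value[start] === pad) start++
  }
  if (mode === true || mode === 'right') {
    while (end > start && value[end - 1] === pad) end--
  }
  return value.slice(start, end)
}

console.log(trimValue('004200', '0', 'right')) // '0042'
console.log(trimValue('004200', '0', 'left')) // '4200'
console.log(trimValue('004200', '0', true)) // '42'
console.log(trimValue('004200', '0', false)) // '004200'
```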

from

Type: <Number>

Default: 1

The first line to consider while parsing (inclusive). It is a 1-based integer (one is the first line).

to

Type: <Number>

Default: Infinity

The last line to consider while parsing (inclusive). It is a 1-based integer (one is the first line).
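
Since both bounds are 1-based and inclusive, the from and to options behave like the following standalone sketch. selectLines is a hypothetical helper working on an array of raw lines, not part of the library.

```javascript
// Hypothetical helper: keep only the lines inside the 1-based inclusive range.
function selectLines (lines, from = 1, to = Infinity) {
  return lines.filter((_, index) => {
    const lineNumber = index + 1
    return lineNumber >= from && lineNumber <= to
  })
}

const rows = ['username    age', 'alice       024', 'bob         030']

// Skip the header line by starting from line 2.
const withoutHeader = selectLines(rows, 2)
console.log(withoutHeader) // ['alice       024', 'bob         030']

// from and to can point at the same line to select a single row.
const onlyAlice = selectLines(rows, 2, 2)
console.log(onlyAlice) // ['alice       024']
```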

allowLongerLines

Type: <Boolean>

Allow lines to be longer than the declared fields while parsing.

Default: true

allowShorterLines

Type: <Boolean>

Allow lines to be shorter than the declared fields while parsing.

Default: false
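
The two options above can be read as a simple length check against the total width of the declared fields. The sketch below follows that reading with the documented defaults; isLineLengthAccepted is a hypothetical helper, and the library raises an error instead of returning false.

```javascript
// Hypothetical helper: decide whether a line length is acceptable.
// Defaults mirror the documented ones: longer lines allowed, shorter lines rejected.
function isLineLengthAccepted (line, expectedWidth, options = {}) {
  const { allowLongerLines = true, allowShorterLines = false } = options
  if (line.length > expectedWidth) return allowLongerLines
  if (line.length < expectedWidth) return allowShorterLines
  return true
}

const width = 12 + 3 // username + age

console.log(isLineLengthAccepted('alice       024', width)) // true
console.log(isLineLengthAccepted('alice       024 extra', width)) // true
console.log(isLineLengthAccepted('alice', width)) // false
console.log(isLineLengthAccepted('alice', width, { allowShorterLines: true })) // true
```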

skipEmptyLines

Type: <Boolean>

Completely ignore all empty lines. This option does not change the behaviour of the allowShorterLines option.

Default: true

fields

Type: <Array>

This option is the only required one. It contains the specs for all the fields.

field.align

Type: <String>

Default: "left"

Field's value alignment. Can be "left" or "right".

field.cast

Type: <Function>

A casting function that accepts the raw string value and returns the parsed one. It also receives a context object as the second argument. Only used while parsing.

const options = {
  fields: [
    {
      width: 5,
      cast: (value, ctx) => {
        // value is always a string
        // ctx = { column: 1, line: 1, width: 5 }
        return parseInt(value, 10)
      }
    }
  ]
}

field.column

Type: <Number>

Field's column. This is a 1-based value (one is the first column). It defaults to the sum of all the previously defined fields' widths.

field.pad

Type: <String>

Field level padding value. It defaults to the global one.

field.property

Type: <String> | <Symbol>

This option controls the expected format of both input and output objects.

Parsing

By defining this option, the Parser will emit objects. If the option is omitted, the emitted values will be arrays.

Serializing

By defining this option, the Stringifier will expect objects. If the option is omitted, the expected values will be arrays.
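
The difference between the two record shapes can be sketched as follows. shapeRecord is a hypothetical helper, not the library's code, and it assumes that either every field declares a property or none does; how mixed definitions are handled is not covered here.

```javascript
// Hypothetical helper: shape raw values as an object or as an array.
// Assumption: either every field declares a property or none does.
function shapeRecord (values, fields) {
  const named = fields.length > 0 && fields.every(field => field.property !== undefined)
  if (!named) {
    return values
  }
  const record = {}
  fields.forEach((field, index) => {
    record[field.property] = values[index]
  })
  return record
}

const values = ['alice', 24]

const asObject = shapeRecord(values, [
  { property: 'username', width: 12 },
  { property: 'age', width: 3 }
])
const asArray = shapeRecord(values, [{ width: 12 }, { width: 3 }])

console.log(asObject) // { username: 'alice', age: 24 }
console.log(asArray) // ['alice', 24]
```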

field.width

Type: <Number>

Field's width. Required.

Errors

All errors that can occur during the parsing or serializing phase contain an error code. Error objects also contain enough info (properties) to debug the problem.

It's possible to detect custom fixed-width errors with their constructor:

import { FixedWidthError } from '@evologi/fixed-width'

try {
  // parse or stringify...
} catch (err) {
  if (err instanceof FixedWidthError) {
    console.log(err.code)
  } else {
    console.log('UNKNOWN_ERROR')
  }
}

UNEXPECTED_LINE_LENGTH

This error is raised when a line does not match the expected length (for example, when a partial line is found).

You can suppress this error with the allowLongerLines or allowShorterLines options.

EXPECTED_STRING_VALUE

This error is raised when a value cannot be serialized into a string.

FIELD_VALUE_OVERFLOW

This error is raised when a string value has a width that exceeds its field's width.