0.6.1 • Published 9 years ago

tito v0.6.1

Weekly downloads
101
License
CC0-1.0
Repository
github
Last release
9 years ago

tito

tito is a Node.js module and command-line utility for translating between tabular text streams in formats such as CSV, TSV, JSON and HTML tables. It stands for Tables In, Tables Out.

Formats

  • JSON: structured with JSONPath queries or newline-delimited (the default for input and output).
  • Comma-, tab-, and otherwise-delimited text, with support for custom column and row delimiters.
  • HTML tables, with support for targeted parsing with CSS selectors and formatted output.

Installation

Install it with npm:

npm install -g tito

Examples

Here are some examples of what tito can do:

Convert CSV to TSV

Use the --read and --write options to set the read and write formats:

tito --read csv data.csv --write tsv data.tsv

Or pipe data into and out of tito via stdio:

cat data.csv | tito --read csv --write tsv > data.tsv
Turn HTML tables into CSV

tito's html reader uses a streaming HTML parser and can target tables with CSS selectors:

curl -s "http://www.federalreserve.gov/releases/h15/current/" \
  | tito --read.format html --read.selector 'table.statistics' --write csv \
  > interest-rates.csv
Import structured JSON data from a URL into dat

tito can take structured JSON like this:

{
  "results": [
    { /* ... */ },
    // etc.
  ]
}

and turn it into newline-delimited JSON. Just set --read.format to json and --read.path to the JSONPath expression of your data elements. For the structure above, which is common to many REST APIs, you would use results.*. You could then use the following to import data from one such API into dat:

curl -s http://api.data.gov/some-data \
  | tito --read.format json --read.path 'results.*' \
  | dat import
Map and filter your data

The tito --map and --filter options allow you to perform streaming transformations on your data. Both options can either be specified as fof-compatible expressions or filenames.

tito --filter 'd => d.Year > 2000' \
  --map 'd => {{year: d.Year, region: d.Region, revenue: +d.Revenue}}' \
  --read csv data.csv

If you specify an existing filename for either --map or --filter, it will be require()d and its value passed to fof(). This means that you can specify map and filter transformations in JSON or JavaScript, e.g.:

{
  year: 'd => +d.Year',
  region: 'Region',
  revenue: 'd => +d.Revenue'
}

then, you could use this transformation with:

tito --map ./transform.json \
  --read csv --write json input.csv > output.json

Usage

This is the output of tito --help formats:

tito [options] [input] [output]

Options:
  --read, -r     the input format (see below)        [default: "ndjson"]
  --write, -w    the output format (see below)       [default: "ndjson"]
  --in, -i       the input filename                                     
  --out, -o      the output filename                                    
  --filter, -f   filter input by this data expression           [string]
  --map, -m      map input to this data expression              [string]
  --help, -h     Show this help message.                                
  --version, -v  Print the version and exit                             

Formats:

The following values may be used for the input and output format
options, --read/-r or --write/-w:

  tito --read csv --write tsv
  tito -r csv -w tsv

If you wish to specify format options, you must use the dot notation:

  tito --read.format csv --read.delim=, data.csv
  tito -r.format json -r.path='results.*' data.json
  tito data.ndjson | tito -w.format html -w.indent='  '

"csv": Read and write comma-separated (or otherwise-delimted) text
  Options:
  - "delimiter", "delim", "d": The field delimiter
  - "newline", "line", "n": The row delimiter
  - "quote", "q": The quote character

"tsv": Read and write tab-separated values
  Options:
  - "headers": 
  - "newline", "line", "n": The line separator character sequence

"ndjson": Read and write newline-delimted JSON
  Options:

"json": Read and write arrays from streaming JSON
  Options:
  - "path", "p": The JSONPath selector containing the data (read-only)
  - "open", "o": Output this string before streaming items (write-only)
  - "separator", "sep", "s": Output this string between items (write-only)
  - "close", "c": Output this string after writing all items (write-only)

"html": Read and write data from HTML tables
  Options:
  - "selector", "s": the CSS selector of the table to target (read-only)
  - "indent", "i": indent HTML with this string (write-only)
0.6.1

9 years ago

0.6.0

9 years ago

0.5.6

9 years ago

0.5.5

9 years ago

0.5.4

9 years ago

0.5.3

9 years ago

0.5.2

9 years ago

0.5.1

9 years ago

0.5.0

9 years ago

0.4.0

10 years ago

0.3.0

10 years ago

0.2.1

10 years ago

0.2.0

10 years ago

0.1.4

10 years ago

0.1.3

10 years ago

0.1.2

10 years ago

0.1.1

10 years ago