tito v0.6.1
tito
tito is a Node.js module and command-line utility for translating between tabular text streams in formats such as CSV, TSV, JSON and HTML tables. It stands for Tables In, Tables Out.
Formats
- JSON: structured with JSONPath queries or newline-delimited (the default for input and output).
- Comma-, tab-, and otherwise-delimited text, with support for custom column and row delimiters.
- HTML tables, with support for targeted parsing with CSS selectors and formatted output.
Installation
Install it with npm:
npm install -g tito
Examples
Here are some examples of what tito can do:
Convert CSV to TSV
Use the --read and --write options to set the read and write formats:
tito --read csv data.csv --write tsv data.tsv
Or pipe data into and out of tito via stdio:
cat data.csv | tito --read csv --write tsv > data.tsv
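Conceptually, a conversion like this reads rows in one format and writes them out in another. The following is a minimal JavaScript sketch of that idea, not tito's actual implementation: the helper names parseDelimited and formatDelimited are hypothetical, the sample data is made up, and quoting is ignored, so fields may not contain the delimiter or newlines.

```javascript
// Parse delimited text into row objects keyed by the header line.
// Simplification: no quoting, so fields may not contain the delimiter or newlines.
function parseDelimited(text, delimiter) {
  const lines = text.split(/\r?\n/).filter(line => line.length > 0);
  if (lines.length === 0) return [];
  const headers = lines[0].split(delimiter);
  return lines.slice(1).map(line => {
    const cells = line.split(delimiter);
    const row = {};
    headers.forEach((name, i) => { row[name] = cells[i]; });
    return row;
  });
}

// Serialize row objects back to delimited text, taking the columns from the first row.
function formatDelimited(rows, delimiter) {
  if (rows.length === 0) return '';
  const headers = Object.keys(rows[0]);
  const body = rows.map(row => headers.map(name => row[name]).join(delimiter));
  return [headers.join(delimiter), ...body].join('\n') + '\n';
}

// Made-up sample data: CSV in, TSV out.
const csv = 'Year,Region,Revenue\n2001,West,100\n1999,East,50\n';
const tsv = formatDelimited(parseDelimited(csv, ','), '\t');
process.stdout.write(tsv);
```

Real CSV needs quote handling, which is why a dedicated parser is preferable to splitting on commas.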
Turn HTML tables into CSV
tito's html reader uses a streaming HTML parser and can target tables with CSS selectors:
curl -s "http://www.federalreserve.gov/releases/h15/current/" \
| tito --read.format html --read.selector 'table.statistics' --write csv \
> interest-rates.csv
Import structured JSON data from a URL into dat
tito can take structured JSON like this:
{
"results": [
{ /* ... */ },
// etc.
]
}
and turn it into newline-delimited JSON. Just set --read.format to json and --read.path to the JSONPath expression of your data elements. For the structure above, which is common to many REST APIs, you would use results.*. You could then use the following to import data from one such API into dat:
curl -s http://api.data.gov/some-data \
| tito --read.format json --read.path 'results.*' \
| dat import
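To make the transformation concrete, here is a small JavaScript sketch of what selecting results.* and writing newline-delimited JSON amounts to. It is an illustration only: the response object and the toNdjson helper are hypothetical, and tito itself streams the input rather than loading it whole.

```javascript
// Hypothetical API response with the shape shown above.
const response = {
  results: [
    { id: 1, name: 'alpha' },
    { id: 2, name: 'beta' }
  ]
};

// Newline-delimited JSON: one JSON document per line, each line ending in "\n".
function toNdjson(items) {
  return items.map(item => JSON.stringify(item) + '\n').join('');
}

// Selecting `results.*` corresponds to taking every element of the `results` array.
const ndjson = toNdjson(response.results);
process.stdout.write(ndjson);
```

Each output line can be parsed on its own, which is what lets downstream tools consume the data as a stream.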
Map and filter your data
The tito --map and --filter options allow you to perform streaming transformations on your data. Both options can be specified either as fof-compatible expressions or as filenames.
tito --filter 'd => d.Year > 2000' \
--map 'd => {{year: d.Year, region: d.Region, revenue: +d.Revenue}}' \
--read csv data.csv
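In plain JavaScript terms, the two expressions above behave roughly like the callbacks passed to Array.prototype.filter and Array.prototype.map. The sketch below uses made-up rows and assumes the filter runs before the map, which the field names in the example suggest (the filter refers to Year, not year). It does not use fof or tito.

```javascript
// Hypothetical rows as a CSV reader would produce them: every value is a string.
const rows = [
  { Year: '1999', Region: 'East', Revenue: '50' },
  { Year: '2001', Region: 'West', Revenue: '100' },
  { Year: '2005', Region: 'North', Revenue: '75.5' }
];

// Filter: keep rows after 2000. The string Year is coerced to a number by `>`.
const keep = d => d.Year > 2000;

// Map: rename fields and convert Revenue to a number with unary plus.
const reshape = d => ({ year: d.Year, region: d.Region, revenue: +d.Revenue });

const result = rows.filter(keep).map(reshape);
console.log(result);
```

Note that year stays a string here because the map expression copies d.Year without the unary plus, while revenue becomes a number.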
If you specify an existing filename for either --map or --filter, it will be require()d and its value passed to fof(). This means that you can specify map and filter transformations in JSON or JavaScript, e.g.:
{
  "year": "d => +d.Year",
  "region": "Region",
  "revenue": "d => +d.Revenue"
}
If this is saved as transform.json, you could then use it with:
tito --map ./transform.json \
--read csv --write json input.csv > output.json
Usage
This is the output of tito --help formats:
tito [options] [input] [output]
Options:
  --read, -r     the input format (see below)          [default: "ndjson"]
  --write, -w    the output format (see below)         [default: "ndjson"]
  --in, -i       the input filename
  --out, -o      the output filename
  --filter, -f   filter input by this data expression  [string]
  --map, -m      map input to this data expression     [string]
  --help, -h     Show this help message.
  --version, -v  Print the version and exit
Formats:
The following values may be used for the input and output format
options, --read/-r or --write/-w:
tito --read csv --write tsv
tito -r csv -w tsv
If you wish to specify format options, you must use the dot notation:
tito --read.format csv --read.delim=, data.csv
tito -r.format json -r.path='results.*' data.json
tito data.ndjson | tito -w.format html -w.indent=' '
"csv": Read and write comma-separated (or otherwise-delimited) text
Options:
- "delimiter", "delim", "d": The field delimiter
- "newline", "line", "n": The row delimiter
- "quote", "q": The quote character
"tsv": Read and write tab-separated values
Options:
- "headers":
- "newline", "line", "n": The line separator character sequence
"ndjson": Read and write newline-delimited JSON
Options:
"json": Read and write arrays from streaming JSON
Options:
- "path", "p": The JSONPath selector containing the data (read-only)
- "open", "o": Output this string before streaming items (write-only)
- "separator", "sep", "s": Output this string between items (write-only)
- "close", "c": Output this string after writing all items (write-only)
"html": Read and write data from HTML tables
Options:
- "selector", "s": the CSS selector of the table to target (read-only)
- "indent", "i": indent HTML with this string (write-only)
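The write-only "open", "separator", and "close" options of the "json" format describe how streamed items are framed in the output. The JavaScript sketch below shows one way to read that description: the open string, then the items joined by the separator, then the close string. The writeJson helper and its default values are assumptions for illustration, not tito's documented defaults.

```javascript
// Frame items as: open + item (separator item)* + close.
// The default strings here are assumptions, chosen to produce a plain JSON array.
function writeJson(items, { open = '[', separator = ',\n', close = ']\n' } = {}) {
  return open + items.map(item => JSON.stringify(item)).join(separator) + close;
}

// Wrap the items in an object with a "results" key instead of a bare array.
const wrapped = writeJson(
  [{ a: 1 }, { a: 2 }],
  { open: '{"results":[', separator: ',', close: ']}' }
);
console.log(wrapped);
```

Choosing the framing strings so that the whole output parses as JSON is left to the caller in this sketch.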