0.1.1 • Published 3 months ago

@entryscape/csvw-js v0.1.1

Repository
bitbucket

csvw model

This library provides tools for generating RDF from CSV files and for validating CSV files against schemas, according to the W3C's specifications.

Installing

npm install @entryscape/csvw-js

Usage

The following two commands can be executed from the command line:

validatecsv csv schema rows

Or if the library has not been installed globally:

node bin/validateCSV.js csv schema rows

This command validates the given CSV data against the given schema. The parameter "csv" accepts a file path to the CSV data, and the parameter "schema" accepts a file path to a schema file. The parameter "rows" is optional and accepts a number that specifies the maximum number of CSV rows to validate. It also accepts "undefined", meaning unlimited, which is the default value. The command returns a report table listing any invalid rows and/or warnings.
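The handling of the optional "rows" parameter can be sketched as follows. This is only an illustration, not the library's actual code: the helper names parseRowLimit and takeRows are hypothetical, and the sketch assumes the limit arrives as a command-line string.

```javascript
// Hypothetical helpers; not part of the library's API.

// Turn the optional "rows" argument into a numeric limit.
// A missing argument or the literal string "undefined" means "no limit".
function parseRowLimit(arg) {
  if (arg === undefined || arg === 'undefined') {
    return Infinity;
  }
  const limit = Number(arg);
  if (!Number.isInteger(limit) || limit < 0) {
    throw new Error(`rows must be a non-negative integer, got "${arg}"`);
  }
  return limit;
}

// Keep at most `limit` rows for validation.
function takeRows(rows, limit) {
  return limit === Infinity ? rows : rows.slice(0, limit);
}

const sample = ['row 1', 'row 2', 'row 3'];
console.log(takeRows(sample, parseRowLimit('2')).length);       // 2
console.log(takeRows(sample, parseRowLimit(undefined)).length); // 3
```

Representing "unlimited" as Infinity keeps the comparison logic simple; the library itself may represent the default differently.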

An example report table is shown in the Examples section.

generateRDF csv schema options

This command generates RDF from the given CSV data and the given schema. The parameter "csv" accepts a file path to the CSV data, and the parameter "schema" accepts a file path to a schema file. The parameter "options" is optional and accepts an object with "dontIgnoreInvalidRows" and "dontIgnoreWarningRows". generateRDF returns only a report table if any row is invalid and "dontIgnoreInvalidRows" is true, or if any row has warnings and "dontIgnoreWarningRows" is true. The default value for both is false.
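The rule for when only the report table is returned can be expressed as a small predicate. The sketch below is an assumption-laden illustration: the function name returnsReportOnly and the report shape (an object with "invalids" and "warnings" arrays) are hypothetical and are not taken from the library.

```javascript
// Hypothetical sketch; the report shape ({ invalids, warnings }) is assumed.
function returnsReportOnly(report, options = {}) {
  // Both options default to false, as described above.
  const {
    dontIgnoreInvalidRows = false,
    dontIgnoreWarningRows = false,
  } = options;
  const hasInvalids = report.invalids.length > 0;
  const hasWarnings = report.warnings.length > 0;
  return (dontIgnoreInvalidRows && hasInvalids) ||
         (dontIgnoreWarningRows && hasWarnings);
}

const report = { invalids: [{ Row: 1 }], warnings: [] };
console.log(returnsReportOnly(report));                                  // false
console.log(returnsReportOnly(report, { dontIgnoreInvalidRows: true })); // true
console.log(returnsReportOnly(report, { dontIgnoreWarningRows: true })); // false
```

With the defaults, invalid rows and rows with warnings do not stop RDF generation; each option only takes effect when the corresponding kind of problem is actually present.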

The command returns an array with three elements: rdfxml, graph and report. rdfxml is the generated RDF serialized as RDF/XML, and graph is the generated RDF as a graph. The command also validates the data in the same way as the validatecsv command does, so the third element, report, is a report table identical to the one described earlier.

In order to view all possible parameters for a command, type:

validatecsv -h

or

generateRDF -h

Examples

A report table may look like this:

=== Invalids ===
┌─────────┬────────────┬─────────────────────────────────┬───────────┬────────┐
│ (index) │   Source   │             Message             │    Row    │ Column │
├─────────┼────────────┼─────────────────────────────────┼───────────┼────────┤
│    0    │ 'Datatype' │ 'length not equal to 5, got 10' │     1     │ 'date' │
└─────────┴────────────┴─────────────────────────────────┴───────────┴────────┘


=== Warnings ===
┌─────────┐
│ (index) │
├─────────┤
└─────────┘
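The box-drawing layout above resembles what Node.js's built-in console.table prints. Assuming (this is not confirmed by the source) that each section of the report is an array of row objects, a similar table could be printed as follows; the variable names are illustrative only.

```javascript
// Assumed report data: one array of row objects per section.
const invalids = [
  {
    Source: 'Datatype',
    Message: 'length not equal to 5, got 10',
    Row: 1,
    Column: 'date',
  },
];
const warnings = []; // no warnings, so only the "(index)" column is printed

console.log('=== Invalids ===');
console.table(invalids);
console.log('=== Warnings ===');
console.table(warnings);
```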

A validation command may look like this:

validatecsv ./test/skoldata-test.csv ./test/skoldata-test.json

An RDF-generation-command may look like this:

generateRDF ./example.csv ./example.json --dontIgnoreInvalidRows

Future improvements

Here are some potential improvements to the library:

  • pass more of the W3C tests
  • implement the ability to generate RDF from multiple different CSV files
  • improve flexibility within RDF generation, for example with an option to switch case sensitivity on and off for specific columns
  • support RDF generation for relative URIs
  • support character encodings other than UTF-8
  • implement browser support

Testing the library

Tests are downloaded from the W3C CSVW repository on GitHub via the following command:

yarn synctests

After the tests are downloaded they can be run by executing the following command:

yarn test

Note that, at the time of writing, 87 tests fail while 301 tests pass.