stencila-convert v0.37.0

Published 5 years ago • License: Apache-2.0 • Repository: GitHub

Convert: a format converter for reproducible documents


Stencila Converters allow you to convert between a range of formats commonly used for "executable documents" (those containing some type of source code or calculation).

Status

The following tables list the status of converters that have been developed, are in development, or are being considered for development. We'll be developing converters based on demand from users. So if you'd like to see a converter for your favorite format, look at the listed issues and comment under the relevant one. If there is no issue regarding the converter you need, create one.

Once the converters have been better tested, the plan is to integrate them into Stencila Desktop as a menu item, e.g. Save as... > Jupyter Notebook.

You can also provide your feedback on the friendly Stencila Community Forum and Stencila Gitter channel.

Documents, markup, and notebook formats

| Format | Status |
| --- | --- |
| Markdown | alpha |
| RMarkdown | alpha |
| LaTeX | alpha |
| HTML | alpha |
| PDF | - |
| Google Doc | alpha |

Tabular data and spreadsheet formats

| Format | Status |
| --- | --- |
| CSV | alpha |
| YAML front matter for CSV (CSVY) | #25 |
| Excel (.xlsx) | alpha |
| OpenDocument Spreadsheet | alpha |
| Tabular Data Package | alpha |

Other formats

| Format | Status |
| --- | --- |
| Reproducible PNG (rPNG) | alpha |

Demo

:sparkles: Coming soon!

Install

Convert is available as a pre-compiled, standalone command line tool (CLI), or as a Node.js package.

CLI

Windows

To install the latest release of the convert command line tool, download convert-win-x64.zip for the latest release and place it somewhere on your PATH.

MacOS

To install the latest release of the convert command line tool to /usr/local/bin just use,

curl -L https://raw.githubusercontent.com/stencila/convert/master/install.sh | bash

To install a specific version, append -s vX.X.X e.g.

curl -L https://raw.githubusercontent.com/stencila/convert/master/install.sh | bash -s v0.33.0

Or, if you'd prefer to do things manually, download convert-macos-x64.tar.gz for the latest release and then,

tar xvf convert-macos-x64.tar.gz
sudo mv -f stencila-convert /usr/local/bin # or wherever you like

Linux

To install the latest release of the convert command line tool to ~/.local/bin/ just use,

curl -L https://raw.githubusercontent.com/stencila/convert/master/install.sh | bash

To install a specific version, append -s vX.X.X e.g.

curl -L https://raw.githubusercontent.com/stencila/convert/master/install.sh | bash -s v0.33.0
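
Because the installer is piped straight into bash, it can be worth checking the version tag before running it. The following is a minimal sketch, assuming tags always follow the vX.X.X pattern shown above; the function names `is_version_tag` and `install_convert` are illustrative and not part of Convert.

```shell
# Return success if $1 looks like a release tag such as v0.33.0.
# (Helper name is illustrative; the vX.X.X pattern is taken from the text above.)
is_version_tag() {
  printf '%s\n' "$1" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+$'
}

# Run the documented installer for a specific version, refusing malformed tags.
install_convert() {
  version="$1"
  if ! is_version_tag "$version"; then
    echo "expected a tag like v0.33.0, got: $version" >&2
    return 1
  fi
  curl -L https://raw.githubusercontent.com/stencila/convert/master/install.sh | bash -s "$version"
}
```

With these defined, `install_convert v0.33.0` runs the same command as above, while `install_convert 0.33.0` stops with an error before anything is downloaded.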

Or, if you'd prefer to do things manually, or place Convert elsewhere, download convert-linux-x64.tar.gz for the latest release and then,

tar xvf convert-linux-x64.tar.gz
mv -f stencila-convert ~/.local/bin/ # or wherever you like
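
If ~/.local/bin is not already on your PATH, the shell will not find stencila-convert after the move. Here is a minimal sketch of one way to check and fix this for the current session; the helper name `path_contains` is illustrative and not part of Convert.

```shell
# Return success if directory $1 is an entry in PATH (helper name is illustrative).
path_contains() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Prepend ~/.local/bin for this session only if it is missing.
if ! path_contains "$HOME/.local/bin"; then
  PATH="$HOME/.local/bin:$PATH"
  export PATH
fi
```

To make the change permanent, add the same lines to your shell's startup file.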

Package

If you want to integrate Convert into another application or package, it is also available as a Node.js package:

npm install stencila-convert

Use

Example

stencila-convert document.md document.jats.xml
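
The same two-argument form can be applied to several files at once. The sketch below loops over the Markdown files in a directory and writes a JATS XML file next to each one. The function name `convert_all_md` and the `CONVERT_BIN` override are assumptions made for illustration; only the `stencila-convert input output` call comes from the example above.

```shell
# Convert every .md file in directory $1 to a sibling .jats.xml file.
# CONVERT_BIN is a hypothetical override, handy for testing or for a
# binary installed under a different name; it defaults to stencila-convert.
convert_all_md() {
  dir="$1"
  for src in "$dir"/*.md; do
    # Skip the unexpanded pattern when the directory has no .md files.
    [ -e "$src" ] || continue
    dest="${src%.md}.jats.xml"
    "${CONVERT_BIN:-stencila-convert}" "$src" "$dest" || return 1
  done
}
```

For example, `convert_all_md ./docs` would convert `./docs/intro.md` to `./docs/intro.jats.xml`, stopping at the first file that fails to convert.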

Use the --from and --to options to specify formats explicitly. For example:

| Option | Description |
| --- | --- |
| --to yaml | Convert into YAML format of Stencila Schema JSON. |
| --to tdp | Convert into Tabular Data Package JSON. |
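
As a rough illustration of the two options listed above, the wrapper below passes a chosen `--to` value through to the CLI. It assumes that, when no output file is given, the result is written to standard output, as the development example with `--to yaml` further down suggests. The function name `convert_to` and the `CONVERT_BIN` override are illustrative, and other `--to` values may exist that this sketch deliberately rejects.

```shell
# Convert file $2 using one of the --to values documented above.
# CONVERT_BIN is a hypothetical override; it defaults to stencila-convert.
convert_to() {
  format="$1"
  src="$2"
  case "$format" in
    yaml|tdp) ;;
    *)
      echo "not one of the --to values listed here: $format" >&2
      return 2
      ;;
  esac
  "${CONVERT_BIN:-stencila-convert}" "$src" --to "$format"
}
```

For example, `convert_to yaml simple.csv` runs `stencila-convert simple.csv --to yaml`.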

Help

To get an overview of the commands available use the --help option i.e.

stencila-convert --help

API documentation is available at https://stencila.github.io/convert.

Develop

Check how to contribute back to the project. All PRs are most welcome! Thank you!

Clone the repository and install a development environment:

git clone https://github.com/stencila/convert.git
cd convert
npm install

Run the test suite:

npm test

Or, run a single test file:

npx jest tests/xlsx.test.ts

To get coverage statistics:

npm run cover

Or, manually test conversion using ts-node and the cli.ts script:

npx ts-node --files src/cli tests/fixtures/datatable/simple/simple.csv --to yaml

If that is a bit slow, compile the TypeScript to JavaScript first and use node directly:

npm run build:ts
node dist/cli tests/fixtures/datatable/simple/simple.csv --to yaml

There's also a Makefile if you prefer to run tasks that way e.g.

make lint cover check

You can also test using the Docker image for a self-contained, host-independent test environment:

docker build --tag stencila/convert .
docker run stencila/convert

Roadmap

:sparkles: Coming soon!

Contribute

We 💕 contributions! All contributions: ideas 🤔, examples 💡, bug reports 🐛, documentation 📖, code 💻, questions 💬. See CONTRIBUTING.md for more on where to start.

We recognize all contributors - including those that don't push code! ✨

See also

:sparkles: Coming soon!

FAQ

:sparkles: Coming soon!

Acknowledgments

Convert relies on many awesome open source tools (see package.json for the complete list). We are grateful ❤ to their developers and contributors for all their time and energy. In particular, these tools do a lot of the heavy lifting 💪 under the hood.

Ajv: Ajv is "the fastest JSON Schema validator for Node.js and browser". Ajv is not only fast, it also has an impressive breadth of functionality. We use Ajv for the validate() and coerce() functions to ensure that ingested data is valid against the Stencila schema.
Frictionless Data: datapackage-js from the team at Frictionless Data is a JavaScript library for working with Data Packages. It does a lot of the work in converting between Tabular Data Packages and Stencila Datatables.
Pandoc: Pandoc is a "universal document converter". It's able to convert between an impressive number of formats for textual documents. Our TypeScript definitions for Pandoc's AST allow us to leverage this functionality from within Node.js while maintaining type safety. Pandoc powers our converters for Word, JATS and LaTeX. We have contributed to Pandoc, including developing its JATS reader.
Puppeteer: Puppeteer is a Node library which provides a high-level API to control Chrome. We use it to take screenshots of HTML snippets as part of generating rPNGs and we plan to use it for generating PDFs.
Remark: Remark is an ecosystem of plugins for processing Markdown. It's part of the unified framework for processing text with syntax trees, a similar approach to Pandoc but in JavaScript. We use Remark as our Markdown parser because of its extensibility.
SheetJS: SheetJS is a JavaScript library for parsing and writing various spreadsheet formats. We use their community edition to power converters for CSV, Excel, and OpenDocument Spreadsheet formats. They also have a pro version if you need extra support and functionality.

Many thanks ❤ to the Alfred P. Sloan Foundation and eLife for funding development of this tool.