1.0.7 • Published 6 years ago

ocn_convert v1.0.7

Weekly downloads
1
License
ISC
Repository
-
Last release
6 years ago

Text to Markdown Converter

script to convert text files to Ocean MD

Requirements

Node.js must run on your machine.

Usage

There are several ways to use this script. Commands are listed in linux format but should work with windows equiavlents unless otherwise noted.

  • run node src/oceanconvert.js
  • chmod the src/oceanconvert.js file to be executable, then ./src/oceanconvert.js
  • run node src/oceanconvert.js -a, then run oceanconvert (only on systems with /usr/local/bin in the $PATH)

To convert a file, run oceanconvert [options] [file]. To convert all files in a folder, you can run oceanconvert [options] *, or for only text files, oceanconvert [options] *.txt. If you are doing this, you'll probably want the -o or -p "path" options.

Some other possible commands:

  • find . -type f ! -name '.*' -exec oceanconvert -oer {} \;

Ocean Markdown Metadata

Ocean Markdown files use basic YAML Front Matter (YFM) for holding metadata about each file. In order to be valid YFM, for Ocean Markdown, the following conditions must hold:

  • YFM is preceeded and followed by lines containing only ---
  • YFM occurs only at the very beginning of the file, i.e. --- is the very first line
  • YFM does not contain blank lines
  • Single values consist of a field name and value separated by a colon and space, e.g.: field: value
  • Multiple values consist of a field name, a colon, and then an indented list of values

Here is an example:

---
singleItem: This is a string.
multipleItem:
  - first
  - second
number: 0
boolean: true
---

The following fields are recognized in all Ocean Markdown files:

FieldTypeReqDescription
idstring*a unique string that identifies the document (auto-generated)
accessenumyeseither "research", or "encumbered" for documents that cannot be scrolled.
authorstr/arrayyesthe author of the document
languagestringyesthe two-character language code of the document
priorityint, 5-10yeshow important it is (1 = most, 10 = least)
titlestringyesthe title of the document, in the language of the document
titleShortstring*a short title that won't break mobile design (required for long titles - TODO: define)
ocnmd_versionnumberyesthe version number for the ocean markdown spec used in the file, currently 1
sourceUrlstring*a link to the content, for display in search results (required for scraped content)
wordsCountint*word count of the document (auto-generated)

| Extended info: category | enum | | the religion to which the content relates (@TODO: get category names) coverUrl | string | | url linking to the representative image documentType | enum | | a document type (@TODO: define document types) editor | str/array | | who edited the document needsEditing | boolean | | if the text quality is bad, e.g. from OCR, mark this as true publicationName | string | | the name of the publication in which this document appeared publicationEdition | string | | the edition of a book year | int | | the year that the document was written | Primary texts and authors: authorAbrv | string | | abbreviated author name, only for central figures titleAbrv | string | | title abbreviation, e.g. GWB for Gleanings from the Writings of Baha'u'llah | Collection information: collectionTitle | string | | the title for the collection (required for items in a collection) collectionId | string | | a unique id for the collection, comprising the collectionTitle lowercased and dashed collectionCoverUrl | string | | url linking to the image for the collection | Language info: titleEn | string | | the title of the work, in English (required for books in other languages) originalLang | string | | the original language from which a translation was made searchLang | array | | an array of language codes to search for the document translationRef | string | | a string that is consistent across translations of a single document translator | str/array | | who translated the document | Audio: audio | boolean | | whether the item has audio audioUrl | str/array | | url(s) linking to the audio file(s) narrator | str/array | | the narrator for the audio file | Conversion info: _conversionOpts | object | | the settings used when converting the document (see oceanconvert.js) _convertedFrom | string | * | the file path or url from which the document was converted (see oceanconvert.js)

Below is a basic YFM template for new files that are being created by hand. It must go at the very beginning of the file.

---
author: 
title: 
language: en
sourceUrl: 
publicationName: 
year: 
translator: 
---

The full list of fields is as follows:

---
author: 
title: 
titleShort: 
access: 
language: en
priority: 9
ocnmd_version: 1
sourceUrl: 
category: 
coverUrl: 
documentType: 
editor: 
publicationName: 
publicationEdition: 
year: 
authorAbrv: 
titleAbrv: 
collectionTitle: 
collectionId: 
collectionCoverUrl: 
titleEn: 
originalLang: 
searchLang: 
translationRef: 
translator: 
audio: 
audioUrl: 
narrator: 
---

Ocean Markdown Cheat Sheet

Basic elementsDisplay

| Headers | # Header 1 | Header 1 | | ## Header 2 | Header 2 | | ### Header 3 | Header 3 | | #### Header 4 | Header 4 | | ##### Header 5 | Header 5 | | ###### Header 6 | Header 6 | | Emphasis | _italic text_ | italic | | **bold text** | bold text | | ~~strikethough~~ | strikethrough | | Links | [Link text](https://example.com) | Link text | | Blockquotes | > Blockquote text | Blockquote text | | >> Nested blockquote text | nested blockquote | | Text with block attribute.{.blockquote} | Text with block attribute. | | Horizontal Rules | *** or --- or ___ | | | -------- or ======== | | | Lists | * Bulleted item- Bulleted item+ Bulleted item | Bulleted itemBulleted itemBulleted item | | 1. Numbered item2. Numbered item | Numbered itemNumbered item | | 1. Numbered item··* nested item··* nested item2. Numbered item | Numbered itemnested itemnested itemNumbered item | Fixed Width Text | ```Fixed width``` | Fixed width | | ```Fixed width``` | Fixed width | | Tables | | Col 1   | Col 2        | Col 3   || ------- | :----------: | ------: || left    |   centered   |   right | | (a table) | | Footnotes | Footnote references[^1] in sentence.[^2] | | [^1]: Footnote text | | Page numbers | [pg 1] | pg 1 | | Block attributes | Block attributes follow the markdown-it-attrs rules. |{.class #id id="" or attr=""} | | This paragraph willhave a dropcap.{.dropcap} | This paragraph will have a dropcap. | | This paragraph willbe centered.{.center} | This paragraph will be centered. | | This paragraph willbe right aligned.{.right} | This paragraph will be right aligned. | | This is a verse of poetry{.verse} | This is a verse of poetry. | | This isa listwith linebreaks.{.list} | This is somea listwith linebreaks. | | This isa numbered block with dropcap. {#1.5} | 1.5 This is a numbered paragraph. | | This isa numbered block as an atrribute with dropcap. {id="1.5" .dropcap} | 1.5 This is a numbered paragraph. |

Other classes:

  • .ed: editor
  • .sig: signature line, e.g. on letters
  • .sit: exhortation or sitilcent
  • .noid: no paragraph number