0.42.1 • Published 3 years ago

@sugarcube/plugin-tika v0.42.1

@sugarcube/plugin-tika

Use the Apache Tika toolkit to detect and extract metadata and text from over a thousand different file types.

Installation

npm install --save @sugarcube/plugin-tika

This plugin also requires Java to be installed.
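Since the plugin depends on Java, it can help to confirm that a Java runtime is on the PATH before running a pipeline. The following is a minimal sketch; the helper name require_cmd is hypothetical and not part of sugarcube, and the README does not state which Java version is needed.

```shell
# Hypothetical helper: succeeds if the given command is found on the PATH.
require_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if require_cmd java; then
  # Print the installed Java version.
  java -version
else
  echo "Java not found: install a Java runtime before using the tika plugins" >&2
fi
```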

Plugins

tika_parse

Parse a list of files specified by the glob_pattern query type.

sugarcube -Q glob_pattern:files/**/*.pdf -p tika_parse

tika_links

This plugin iterates over all links in _sc_media and fetches the text and metadata for each link. Any errors thrown by the fetch are ignored.

tika_location

This plugin parses any location specified using the tika_location_field query type. It fetches the text and metadata of that location, e.g. a URL stored in a field of the unit.

sugarcube -Q google_search:Keith\ Johnstone \
          -Q tika_location_field:href \
          -p google_search,tika_location

The text and metadata are added to the _sc_media collection and also placed directly on the unit. For example, if the location field is href, the fields href_text and href_meta are added to the unit.

tika_export

Export the text and metadata that tika_location parses to a file.

sugarcube -Q google_search:Keith\ Johnstone \
          -p google_search,tika_location,tika_export \
          --tika.location_field href

Configuration Options:

  • tika.data_dir: The target directory in which to store all files. Defaults to ./data/tika_location.
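As an illustration, the export directory could be overridden on the command line. This sketch assumes that tika.data_dir can be passed as a flag in the same way as --tika.location_field in the example above; the README does not confirm this, and the path ./exports/tika is only a placeholder.

```shell
# Assumption: --tika.data_dir is accepted as a flag, like --tika.location_field.
# ./exports/tika is a placeholder path chosen for this example.
sugarcube -Q google_search:Keith\ Johnstone \
          -p google_search,tika_location,tika_export \
          --tika.location_field href \
          --tika.data_dir ./exports/tika
```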

License

GPL3 @ Christo
