6.0.0 • Published 4 years ago

@datafire/geneea v6.0.0

License
MIT

@datafire/geneea

Client library for Geneea Natural Language Processing

Installation and Usage

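npm install --save @datafire/geneea

let geneea = require('@datafire/geneea').create({
  user_key: "" // your Geneea user key
});

// Call any of the actions listed below; getInfo takes no parameters.
geneea.getInfo(null).then(data => {
  console.log(data);
});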

Description

<h2>API operations</h2>
<p>
    All API operations can analyze either supplied raw text or text extracted from a given URL.
    Optionally, you can supply additional information to make the result more precise, such as
    the language of the text or a particular text extractor for URL resources.
</p>
<p>The supported types of analyses are:</p>
<ul>
    <li><strong>lemmatization</strong> &longrightarrow;
        Finds the lemmata (base forms) of all words in the document.
    </li>
    <li><strong>correction</strong> &longrightarrow;
        Performs correction (diacritization) on all the words in the document.
    </li>
    <li><strong>topic detection</strong> &longrightarrow;
        Determines the topic of the document, e.g. finance or sports.
    </li>
    <li><strong>sentiment analysis</strong> &longrightarrow;
        Determines the sentiment of the document, i.e. how positive or negative the document is.
    </li>
    <li><strong>named entity recognition</strong> &longrightarrow;
        Finds named entities (such as persons, locations, dates, etc.) mentioned in the document.
    </li>
</ul>

<h2>Encoding</h2>
<p>The supplied text is expected to be in UTF-8 encoding; this is especially important for non-English texts.</p>

<h2>Returned values</h2>
<p>The API calls always return objects in serialized JSON format in UTF-8 encoding.</p>
<p>
    If any error occurs, the HTTP response code will be in the range <code>4xx</code> (client-side error) or
    <code>5xx</code> (server-side error). In this situation, the body of the response will contain information
    about the error in JSON format, with <code>exception</code> and <code>message</code> values.
</p>
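
<p>
    The error convention above can be sketched as a small helper. This is an illustration only:
    the function name <code>describeResponse</code> and the shape of the returned object are
    assumptions, while the <code>exception</code> and <code>message</code> fields come from the
    description above.
</p>

```javascript
// Summarize an API response from its HTTP status code and raw JSON body.
// Hypothetical helper: only `exception` and `message` are taken from the API description.
function describeResponse(status, bodyText) {
  const body = JSON.parse(bodyText);
  if (status >= 400 && status <= 599) {
    // 4xx means a client-side error, 5xx a server-side error.
    const kind = status < 500 ? 'client' : 'server';
    return { ok: false, kind, exception: body.exception, message: body.message };
  }
  return { ok: true, data: body };
}
```

<p>
    A caller might log <code>message</code> for <code>client</code> errors and retry later for
    <code>server</code> errors; that policy is a design choice, not something the API prescribes.
</p>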

<h2>URL limitations</h2>
<p>
    All requests are semantically <code>GET</code>. However, for longer texts, you may run into
    URL length limits. It is therefore always possible to issue a <code>POST</code> request instead,
    with all the parameters encoded as JSON in the request body.
</p>
<p>Example:</p>
<pre><code>
    POST /s1/sentiment
    Content-Type: application/json

    {"text":"There is no harm in being sometimes wrong - especially if one is promptly found out."}
</code></pre>
<p>This is equivalent to <code>GET /s1/sentiment?text=There%20is%20no%20harm...</code></p>
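
<p>
    The equivalence between the two forms can be illustrated with a short sketch. The helper name
    <code>buildRequests</code> and the returned object shape are assumptions made for this example;
    only the path, the <code>text</code> parameter, and the <code>Content-Type</code> header come
    from the example above.
</p>

```javascript
// Build the two equivalent request forms for one analysis call.
// Hypothetical helper; it only constructs request descriptions and sends nothing.
function buildRequests(path, params) {
  // GET form: every parameter is percent-encoded into the query string.
  const query = Object.keys(params)
    .map(k => encodeURIComponent(k) + '=' + encodeURIComponent(String(params[k])))
    .join('&');
  return {
    get: { method: 'GET', url: path + '?' + query },
    // POST form: the same parameters, serialized as a JSON body.
    post: {
      method: 'POST',
      url: path,
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(params)
    }
  };
}

const reqs = buildRequests('/s1/sentiment', { text: 'There is no harm' });
console.log(reqs.get.url);   // /s1/sentiment?text=There%20is%20no%20harm
console.log(reqs.post.body); // {"text":"There is no harm"}
```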

<h2>Request limitations</h2>
<p>
    The API also limits the size of HTTP requests. The maximum allowed size of any
    POST request body is <em>512 KiB</em>. For requests with a URL resource, the maximum allowed number of
    characters extracted from each such resource is <em>100,000</em>.
</p>
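
<p>
    A client can check the POST body limit before sending a request. The following is a minimal
    sketch for Node.js; the names <code>MAX_POST_BYTES</code> and <code>fitsPostLimit</code> are
    assumptions. It measures UTF-8 bytes rather than characters, since non-ASCII text takes more
    than one byte per character.
</p>

```javascript
// 512 KiB limit on POST request bodies, as stated above.
const MAX_POST_BYTES = 512 * 1024;

// Return true when the JSON-encoded parameters fit within the limit.
// Hypothetical helper; the size is measured in UTF-8 bytes, not characters.
function fitsPostLimit(params) {
  return Buffer.byteLength(JSON.stringify(params), 'utf8') <= MAX_POST_BYTES;
}
```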

<h2>Terms of Service</h2>
<p>
    By using the API, you agree to our
    <a href="https://www.geneea.com/terms.html" target="_blank">Terms of Service Agreement</a>.
</p>

<h2>More information</h2>
<p>
    <a href="https://help.geneea.com/index.html" target="_blank">
    The Interpretor Public Documentation
    </a>
</p>

Actions

getInfo

geneea.getInfo(null, context)

Input

This action has no parameters

Output

correctionGet

Possible options: An optional parameter diacritize, with values yes, no, or auto, indicates whether text diacritization will be performed. The default value is auto.

geneea.correctionGet({}, context)

Input

  • input object
    • id string: document ID
    • text string: raw document text
    • url string: document URL
    • extractor string (values: default, article, keep-everything): document extractor
    • language string: document language
    • returnTextInfo boolean

Output

correctionPost

Notes:
  • Valid JSON cannot contain raw newline characters; these have to be escaped. (See also the Interpretor documentation.)
  • Fields text and url are mutually exclusive.

Examples:
  • {"text": "Hello world!"}
  • {"url": "https://en.wikipedia.org/wiki/Pyrrhuloxia"}

Possible options: An optional parameter diacritize, with values yes, no, or auto, indicates whether text diacritization will be performed. The default value is auto.

geneea.correctionPost({}, context)
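
The two notes above (escaped newlines, and text and url being mutually exclusive) apply to every Post action. A minimal sketch of a body builder follows; the helper name buildBody is hypothetical, and requiring exactly one of the two fields is an assumption that goes slightly beyond "mutually exclusive".

```javascript
// Build a JSON request body from either raw text or a URL.
// Hypothetical helper, not part of @datafire/geneea.
function buildBody(input) {
  const hasText = typeof input.text === 'string';
  const hasUrl = typeof input.url === 'string';
  // Assumption: exactly one of the two fields must be present.
  if (hasText === hasUrl) {
    throw new Error('Supply exactly one of "text" or "url"');
  }
  // JSON.stringify escapes newlines inside string values as \n,
  // so the resulting body never contains a raw newline.
  return JSON.stringify(input);
}
```

Serializing with JSON.stringify instead of concatenating strings by hand avoids the newline problem altogether.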

Input

Output

entitiesGet

geneea.entitiesGet({}, context)

Input

  • input object
    • id string: document ID
    • text string: raw document text
    • url string: document URL
    • extractor string (values: default, article, keep-everything): document extractor
    • language string: document language
    • returnTextInfo boolean

Output

entitiesPost

Notes:
  • Valid JSON cannot contain raw newline characters; these have to be escaped. (See also the Interpretor documentation.)
  • Fields text and url are mutually exclusive.

Examples:
  • {"text": "Hello world!"}
  • {"url": "https://en.wikipedia.org/wiki/Pyrrhuloxia"}

geneea.entitiesPost({}, context)

Input

Output

lemmatizeGet

geneea.lemmatizeGet({}, context)

Input

  • input object
    • id string: document ID
    • text string: raw document text
    • url string: document URL
    • extractor string (values: default, article, keep-everything): document extractor
    • language string: document language
    • returnTextInfo boolean

Output

lemmatizePost

Notes:
  • Valid JSON cannot contain raw newline characters; these have to be escaped. (See also the Interpretor documentation.)
  • Fields text and url are mutually exclusive.

Examples:
  • {"text": "Hello world!"}
  • {"url": "https://en.wikipedia.org/wiki/Pyrrhuloxia"}

geneea.lemmatizePost({}, context)

Input

Output

sentimentGet

geneea.sentimentGet({}, context)

Input

  • input object
    • id string: document ID
    • text string: raw document text
    • url string: document URL
    • extractor string (values: default, article, keep-everything): document extractor
    • language string: document language
    • returnTextInfo boolean

Output

sentimentPost

Notes:
  • Valid JSON cannot contain raw newline characters; these have to be escaped. (See also the Interpretor documentation.)
  • Fields text and url are mutually exclusive.

Examples:
  • {"text": "Hello world!"}
  • {"url": "https://en.wikipedia.org/wiki/Pyrrhuloxia"}

geneea.sentimentPost({}, context)

Input

Output

topicGet

geneea.topicGet({}, context)

Input

  • input object
    • id string: document ID
    • text string: raw document text
    • url string: document URL
    • extractor string (values: default, article, keep-everything): document extractor
    • language string: document language
    • returnTextInfo boolean

Output

topicPost

Notes:
  • Valid JSON cannot contain raw newline characters; these have to be escaped. (See also the Interpretor documentation.)
  • Fields text and url are mutually exclusive.

Examples:
  • {"text": "Hello world!"}
  • {"url": "https://en.wikipedia.org/wiki/Pyrrhuloxia"}

geneea.topicPost({}, context)

Input

Output

status

geneea.status(null, context)

Input

This action has no parameters

Output

  • output string

Definitions

EntitiesResponse

  • EntitiesResponse object: Response for the named-entity recognition
    • entities required array: Found named entities in the document
    • id string: Unique identifier of the document
    • language required string: The language used for the document
    • text string: The raw text of the document which has been analysed

Entity

  • Entity object: The named entity
    • entity required string: Disambiguated and standardized form of the entity
    • links required object: Disambiguation links for the entity, e.g. its DBpedia page
    • sentiment number: Detected sentiment of the entity (value from -1.0 to 1.0)
    • textOffset required integer: Character offset in the text (starting from 0)
    • type required string: Detected type of the entity
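
As an illustration of how the entity, type, and textOffset fields fit together, the sketch below groups the standardized entity names by type, in order of appearance in the text. The helper name groupEntitiesByType is hypothetical; the input is assumed to have the EntitiesResponse shape described above.

```javascript
// Group standardized entity names by detected type, in order of appearance.
// Hypothetical helper operating on an EntitiesResponse-shaped object.
function groupEntitiesByType(response) {
  // A prototype-less object avoids collisions with names such as "constructor".
  const groups = Object.create(null);
  // Sort a copy by character offset so the caller's array is left untouched.
  const sorted = response.entities.slice().sort((a, b) => a.textOffset - b.textOffset);
  for (const e of sorted) {
    if (!groups[e.type]) groups[e.type] = [];
    groups[e.type].push(e.entity);
  }
  return groups;
}
```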

Entry«string,long»

  • Entry«string,long» object
    • key integer

Information about a user account.

Information_about_a_user_account.

  • Information_about_a_user_account. object
    • remainingQuotas array: Remaining quotas for the user account.
    • type string: Type (plan) of the user account.

Label

  • Label object: The topic label
    • confidence required number: Confidence (probability) of this label
    • label required string: The value of this label

LemmatizeResponse

  • LemmatizeResponse object: Response for the lemmatization
    • id string: Unique identifier of the document
    • language required string: The language used for the document
    • lemmatizedText required string: Lemmatized text of the document; individual tokens are separated by a space, and sentences are separated by a newline character
    • text string: The raw text of the document which has been analysed
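
Because lemmatizedText separates tokens with spaces and sentences with newlines, it can be split back into a nested array. A minimal sketch, with the hypothetical helper name parseLemmatizedText:

```javascript
// Split lemmatizedText into sentences, each an array of lemma tokens.
// Hypothetical helper; empty lines and empty tokens are dropped.
function parseLemmatizedText(lemmatizedText) {
  return lemmatizedText
    .split('\n')
    .filter(line => line.length > 0)
    .map(line => line.split(' ').filter(token => token.length > 0));
}
```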

Request

  • Request object: Request encapsulation for simple API version 1
    • extractor string (values: default, article, keep-everything): Optional. Text extractor to be used when analyzing an HTML document
    • id string: Optional. Unique identifier of the document
    • language string: Optional. The language of the document; auto-detection is used if omitted
    • options object: Optional. Additional options for the internal modules (key-value pairs)
    • returnTextInfo boolean: Optional. Indicates whether to return the source text within the response object
    • text string: The raw text to be analyzed; mutually exclusive with the 'url' parameter
    • url string: URL of a document to be analyzed; mutually exclusive with the 'text' parameter

Response for the text correction

Response_for_the_text_correction

  • Response_for_the_text_correction object
    • corrected boolean
    • correctedText required string: Corrected text of the document
    • diacritized boolean
    • id string: Unique identifier of the document
    • language required string: The language used for the document
    • text string: The raw text of the document which has been analysed

SentimentResponse

  • SentimentResponse object: Response for the sentiment analysis
    • id string: Unique identifier of the document
    • language required string: The language used for the document
    • sentiment required number: Detected sentiment of the document (value from -1.0 to 1.0)
    • text string: The raw text of the document which has been analysed
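
The sentiment value is a number from -1.0 to 1.0. Applications often map it to a coarse label; the sketch below shows one way to do that. The helper name sentimentLabel and the default threshold of 0.1 are assumptions chosen for illustration, not values defined by the API.

```javascript
// Map a sentiment score in [-1.0, 1.0] to a coarse label.
// Hypothetical helper; the threshold is an arbitrary illustrative choice.
function sentimentLabel(sentiment, threshold = 0.1) {
  // Reject non-numbers, NaN, and values outside the documented range.
  if (typeof sentiment !== 'number' || !(sentiment >= -1 && sentiment <= 1)) {
    throw new RangeError('sentiment must be a number from -1.0 to 1.0');
  }
  if (sentiment > threshold) return 'positive';
  if (sentiment < -threshold) return 'negative';
  return 'neutral';
}
```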

TopicResponse

  • TopicResponse object: Response for the topic detection
    • confidence required number: Confidence for the detected topic
    • id string: Unique identifier of the document
    • labels required array: Probabilistic distribution over possible topic labels
    • language required string: The language used for the document
    • text string: The raw text of the document which has been analysed
    • topic required string: Detected topic of the document
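
The labels array holds Label objects, each with a label and a confidence. A minimal sketch for picking the most probable one follows; the helper name mostLikelyLabel is hypothetical, and whether its result always matches the topic field is not stated in this documentation.

```javascript
// Return the Label object with the highest confidence, or null if there are none.
// Hypothetical helper operating on a TopicResponse-shaped object.
function mostLikelyLabel(response) {
  if (response.labels.length === 0) return null;
  return response.labels.reduce((best, current) =>
    current.confidence > best.confidence ? current : best
  );
}
```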

Versions

  • 6.0.0: 4 years ago
  • 5.0.0: 7 years ago
  • 4.0.0: 7 years ago
  • 3.0.0: 8 years ago
  • 2.0.3: 8 years ago
  • 2.0.2: 8 years ago
  • 2.0.1: 8 years ago
  • 2.0.0: 8 years ago
  • 0.0.3: 9 years ago
  • 0.0.1: 9 years ago