8.0.15007673299 • Published 5 months ago
Kodexa Document TypeScript SDK
A TypeScript implementation of the Kodexa Document model for working with structured documents.
Installation
npm install @kodexa/kodexa-document
Overview
The Kodexa Document TypeScript SDK models a structured document as a hierarchy of content nodes. It lets developers create, load, manipulate, and query such documents, and it includes a selector language (similar to XPath) for extracting the content that matches a given set of criteria.
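To make the selector idea concrete, the sketch below shows roughly what a descendant selector such as `//paragraph` means on a node hierarchy: visit every node from a starting point downward and keep those whose type matches. It is a standalone illustration, not the SDK's selector engine. `SketchNode` and `selectByType` are hypothetical names introduced here, and the real selector language presumably supports richer criteria than a type match.

```typescript
// Hypothetical stand-in for a content node: a type, optional content, children.
// This is NOT the SDK's ContentNode class.
interface SketchNode {
  nodeType: string;
  content?: string;
  children: SketchNode[];
}

// Collect every node at or below `root` whose type matches, in document order.
function selectByType(root: SketchNode, nodeType: string): SketchNode[] {
  const matches: SketchNode[] = [];
  const visit = (node: SketchNode): void => {
    if (node.nodeType === nodeType) {
      matches.push(node);
    }
    node.children.forEach((child) => visit(child));
  };
  visit(root);
  return matches;
}

// A small tree: one root holding two paragraphs and a nested section.
const tree: SketchNode = {
  nodeType: 'root',
  children: [
    { nodeType: 'paragraph', content: 'First', children: [] },
    {
      nodeType: 'section',
      children: [{ nodeType: 'paragraph', content: 'Nested', children: [] }],
    },
    { nodeType: 'paragraph', content: 'Last', children: [] },
  ],
};

const paragraphs = selectByType(tree, 'paragraph');
console.log(paragraphs.map((node) => node.content));
```

The nested paragraph is found even though it is not a direct child of the root, which is the behaviour the `//` prefix conveys in XPath-style selectors.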
Key Features
- Create and manipulate hierarchical document structures
- Add, update, and remove content nodes and features
- Query documents using a powerful selector language
- Tag content for classification and extraction
- Track document processing steps
- Store and retrieve external data
Usage Examples
Creating a Document
import { Document, DocumentMetadata } from '@kodexa/kodexa-document';
// Create a new document
const document = new Document(new DocumentMetadata());
// Create a root node
const rootNode = document.createNode('root', 'Root content');
document.contentNode = rootNode;
// Add child nodes
rootNode.addChild(document.createNode('paragraph', 'This is a paragraph'));
rootNode.addChild(document.createNode('paragraph', 'This is another paragraph'));
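Once a hierarchy like this exists, it can be walked depth-first through `getChildren()`. The helper below is a minimal sketch written against a structural interface rather than the SDK classes, so it runs on its own. `TreeNode` and `walk` are hypothetical names, and the sketch assumes `getChildren()` returns a plain array of nodes, which has not been checked against the SDK.

```typescript
// Hypothetical minimal shape: anything exposing getChildren() as an array.
interface TreeNode {
  getChildren(): TreeNode[];
}

// Depth-first, pre-order walk that reports each node together with its depth.
function walk(
  node: TreeNode,
  visit: (node: TreeNode, depth: number) => void,
  depth: number = 0,
): void {
  visit(node, depth);
  for (const child of node.getChildren()) {
    walk(child, visit, depth + 1);
  }
}

// Stub tree standing in for a root node with two paragraph children.
const stubChildren: TreeNode[] = [
  { getChildren: () => [] },
  { getChildren: () => [] },
];
const stubRoot: TreeNode = { getChildren: () => stubChildren };

let nodeCount = 0;
let maxDepth = 0;
walk(stubRoot, (_node, depth) => {
  nodeCount += 1;
  maxDepth = Math.max(maxDepth, depth);
});
console.log(`nodes: ${nodeCount}, max depth: ${maxDepth}`);
```

If the SDK's node type is structurally compatible with `TreeNode`, the same helper could be called with `rootNode` from the example above.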
Creating a Document from Text
import { Document } from '@kodexa/kodexa-document';
// Create a document from text
const document = Document.fromText('Hello World');
Querying Documents
import { Document } from '@kodexa/kodexa-document';
// Create a document with some content
const document = Document.fromText('Hello World');
// Select nodes using selectors
const nodes = document.select('//text');
// Select the first matching node
const firstNode = document.selectFirst('//text');
Adding Features to Nodes
import { Document } from '@kodexa/kodexa-document';
// Create a document with some content
const document = Document.fromText('Hello World');
// Add a feature to the root node
document.contentNode?.addFeature('metadata', 'language', 'en');
// Get features
const features = document.contentNode?.getFeatures();
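When reading a single feature back, a small helper can collapse the lookup into one value with a fallback. The sketch below relies only on `getFeature(featureType, name)` and `getValue()` as listed in the API Reference. `FeatureLike`, `FeatureSource`, and `featureValueOr` are hypothetical names, and the assumption that a missing feature comes back as `undefined` or `null` has not been verified against the SDK.

```typescript
// Hypothetical structural types covering only the calls used below.
interface FeatureLike {
  getValue(): unknown;
}

interface FeatureSource {
  getFeature(featureType: string, name: string): FeatureLike | null | undefined;
}

// Return the feature's value, or `fallback` when the node or feature is absent.
// The cast to T is unchecked; the caller is trusted to know the stored type.
function featureValueOr<T>(
  node: FeatureSource | null | undefined,
  featureType: string,
  name: string,
  fallback: T,
): T {
  const feature = node?.getFeature(featureType, name);
  if (feature === undefined || feature === null) {
    return fallback;
  }
  return feature.getValue() as T;
}

// Stub node that only knows the 'metadata'/'language' feature.
const stubNode: FeatureSource = {
  getFeature: (featureType: string, name: string) =>
    featureType === 'metadata' && name === 'language'
      ? { getValue: () => 'en' }
      : undefined,
};

const language = featureValueOr(stubNode, 'metadata', 'language', 'unknown');
const author = featureValueOr(stubNode, 'metadata', 'author', 'unknown');
console.log(language, author);
```

Accepting `null` or `undefined` for the node mirrors the optional chaining on `document.contentNode` in the example above, so the helper can be handed a possibly missing content node directly.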
Tagging Content
import { Document } from '@kodexa/kodexa-document';
// Create a document with some content
const document = Document.fromText('Hello World');
// Tag the content
document.contentNode?.tag('important', { confidence: 0.95 });
// Get tags
const tags = document.contentNode?.getTags();
API Reference
Document
The main class for working with documents.
- constructor(metadata?: DocumentMetadata, source?: SourceMetadata, ref?: string): Create a new document
- static fromText(text: string): Create a document from text
- createNode(nodeType: string, content?: string, virtual?: boolean): Create a new content node
- select(selector: string, params?: Record<string, any>): Select nodes using a selector
- selectFirst(selector: string, params?: Record<string, any>): Select the first matching node
- getRoot(): Get the root node of the document
- getSteps(): Get the processing steps
- setSteps(steps: Array<ProcessingStep>): Set the processing steps
- getExternalData(): Get external data
- setExternalData(externalData: Record<string, any>): Set external data
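Because `setExternalData` takes a whole record, updating a single key presumably means reading the current data, merging, and writing it back. The sketch below shows that pattern against a structural interface. `ExternalDataHolder`, `mergeExternalData`, and `StubHolder` are hypothetical names, and whether the SDK replaces or merges the record on `setExternalData` is an assumption here (replacement is assumed).

```typescript
// Hypothetical shape covering only the two documented external-data methods.
interface ExternalDataHolder {
  getExternalData(): Record<string, any>;
  setExternalData(externalData: Record<string, any>): void;
}

// Shallow-merge `patch` over the existing external data and store the result.
function mergeExternalData(
  holder: ExternalDataHolder,
  patch: Record<string, any>,
): Record<string, any> {
  const merged = { ...holder.getExternalData(), ...patch };
  holder.setExternalData(merged);
  return merged;
}

// In-memory stub standing in for a document.
class StubHolder implements ExternalDataHolder {
  private data: Record<string, any> = {};

  getExternalData(): Record<string, any> {
    return this.data;
  }

  setExternalData(externalData: Record<string, any>): void {
    this.data = externalData;
  }
}

const holder = new StubHolder();
mergeExternalData(holder, { source: 'upload' });
mergeExternalData(holder, { reviewed: true });
console.log(holder.getExternalData());
```

The merge is shallow, so nested objects under the same key are replaced rather than combined.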
ContentNode
Represents a node in the document hierarchy.
- constructor(document: Document, nodeType: string, id?: number, content?: string): Create a new content node
- getParent(): Get the parent node
- getChildren(): Get child nodes
- addChild(child: ContentNode, index?: number): Add a child node
- removeChild(contentNode: ContentNode): Remove a child node
- addFeature(featureType: string, name: string, value: any): Add a feature to the node
- getFeatures(): Get all features
- getFeature(featureType: string, name: string): Get a specific feature
- tag(name: string, options?: any): Add a tag to the node
- getTags(): Get all tags
- getTag(name: string): Get tags by name
- removeTag(name: string): Remove a tag
- select(selector: string, params?: Record<string, any>): Select nodes using a selector
ContentFeatureClass
Represents a feature associated with a content node.
- constructor(featureType: string, name: string, value: any): Create a new feature
- getValue(): Get the feature value
- toString(): Get a string representation of the feature
- toDict(): Convert the feature to a dictionary
Tag
Represents a tag applied to a content node.
- constructor(start?: number, end?: number, value?: string, uuid?: string, data?: any): Create a new tag
- toDict(): Convert the tag to a dictionary
Running Tests
To run the tests:
# From the lib/typescript directory
npm install
npm test
Building the Package
To build the package:
# From the lib/typescript directory
npm run build
License
ISC