unified-doc-util-annotate v1.0.5
unified-doc-util-annotate
unified-doc hast utility to annotate text nodes.
Install
npm install unified-doc-util-annotate
Use
Given a hast tree parsed from some HTML content:
import annotate from 'unified-doc-util-annotate';
const html = '<blockquote><strong>some</strong>\ncontent</blockquote>';
const hast = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'blockquote',
      children: [
        {
          type: 'element',
          tagName: 'strong',
          children: [
            {
              type: 'text',
              value: 'some',
            },
          ],
        },
        {
          type: 'text',
          value: '\ncontent',
        },
      ],
    },
  ],
};
const annotations = [
  { id: 'a', start: 1, end: 4, classNames: ['a', 'b'] },
];
console.log(annotate(hast, { annotations }));
Yields:
const hast = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'blockquote',
      children: [
        {
          type: 'element',
          tagName: 'strong',
          children: [
            {
              type: 'text',
              value: 's',
            },
            {
              type: 'element',
              tagName: 'mark',
              properties: {
                dataAnnotationId: 'a',
                id: 'a',
                classNames: ['a', 'b'],
              },
              children: [
                {
                  type: 'text',
                  value: 'ome',
                },
              ],
            },
          ],
        },
        {
          type: 'text',
          value: '\ncontent',
        },
      ],
    },
  ],
};
unified-doc-util-annotate also supports annotations that span multiple text nodes, overlapping annotations, and custom properties on the annotated mark nodes (classNames, style, and dataset attributes).
const annotations = [
  { id: 'a', start: 3, end: 8, classNames: ['a', 'b'], dataset: { category: 'A' } },
  { id: 'b', start: 6, end: 10, style: { background: 'red' } },
];
console.log(annotate(hast, { annotations }));
Yields:
const hast = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'blockquote',
      children: [
        {
          type: 'element',
          tagName: 'strong',
          children: [
            {
              type: 'text',
              value: 'som',
            },
            {
              type: 'element',
              tagName: 'mark',
              properties: {
                id: 'a',
                classNames: ['a', 'b'],
                dataAnnotationId: 'a',
                dataCategory: 'A',
              },
              children: [
                {
                  type: 'text',
                  value: 'e',
                },
              ],
            },
          ],
        },
        {
          type: 'element',
          tagName: 'mark',
          properties: {
            classNames: ['a', 'b'],
            dataAnnotationId: 'a',
            dataCategory: 'A',
          },
          children: [
            {
              type: 'text',
              value: '\nc',
            },
            {
              type: 'element',
              tagName: 'mark',
              properties: {
                id: 'b',
                dataAnnotationId: 'b',
                style: 'background: red',
              },
              children: [
                {
                  type: 'text',
                  value: 'on',
                },
              ],
            },
          ],
        },
        {
          type: 'element',
          tagName: 'mark',
          properties: {
            dataAnnotationId: 'b',
            style: 'background: red',
          },
          children: [
            {
              type: 'text',
              value: 'te',
            },
          ],
        },
        {
          type: 'text',
          value: 'nt',
        },
      ],
    },
  ],
};
API
function annotate(hast: Hast, annotations: Annotation[]): Hast
Accepts a valid hast tree and applies annotations. Returns a new tree.
Interfaces
interface Annotation {
  id: string;
  start: number;
  end: number;
  classNames?: string[];
  dataset?: Record<string, any>;
  data?: Record<string, any>;
  style?: Record<string, any>;
}
An Annotation is an object that requires the id, start, and end properties. Annotated mark nodes can be customized with the optional classNames, style, and dataset properties. Additional annotation data that is not used for rendering can be organized under the data property.
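As a rough illustration of how the optional properties might end up on a mark node, the sketch below reproduces the mapping visible in the example output above (dataset keys become data* properties, the style object becomes a string, and only the first mark of an annotation carries the id). The helper name toMarkProperties and its isFirstSegment parameter are hypothetical and are not part of this package's API; the package's real implementation may differ.

```javascript
// Hypothetical helper, inferred from the example output; not the package's API.
function toMarkProperties(annotation, isFirstSegment) {
  const { id, classNames, style, dataset } = annotation;
  const properties = { dataAnnotationId: id };

  // In the examples, only the first mark of an annotation carries the `id`.
  if (isFirstSegment) {
    properties.id = id;
  }
  if (classNames) {
    properties.classNames = classNames;
  }
  if (style) {
    // { background: 'red' } -> 'background: red'
    properties.style = Object.entries(style)
      .map(([key, value]) => `${key}: ${value}`)
      .join('; ');
  }
  // { category: 'A' } -> dataCategory: 'A'
  for (const [key, value] of Object.entries(dataset || {})) {
    const suffix = key.charAt(0).toUpperCase() + key.slice(1);
    properties[`data${suffix}`] = value;
  }
  return properties;
}

const annotationA = {
  id: 'a',
  start: 3,
  end: 8,
  classNames: ['a', 'b'],
  dataset: { category: 'A' },
};
const firstMark = toMarkProperties(annotationA, true);
const laterMark = toMarkProperties(annotationA, false);
console.log(firstMark, laterMark);
```

Note that the data property is deliberately ignored here, since the text above describes it as not used for rendering.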
The start and end properties are offset values relative to the textContent of the provided hast tree. The annotation algorithm uses these offsets to determine how to insert annotated mark nodes while preserving the semantic structure of the rest of the content.
The following pseudocode should aid this understanding:
const html = '<blockquote><strong>some</strong>\ncontent</blockquote>';
const textContent = 'some\ncontent';
const textNodes = ['some', '\ncontent'];
const textNodeOffsets = [
  { start: 0, end: 4 }, // from "[some]\ncontent"
  { start: 4, end: 12 }, // from "some[\ncontent]"
];
const hast = { ... };
const annotations = [
  { id: 'a', classNames: ['a', 'b'], start: 3, end: 8 },
  { id: 'b', style: { background: 'red' }, start: 6, end: 10 },
];
const annotatedTextSegments = [
  {
    value: 'som', // from "[som]e" text node
    textOffset: { start: 0, end: 3 },
    annotationIds: [], // not annotated
  },
  {
    value: 'e', // from "som[e]" text node
    textOffset: { start: 3, end: 4 },
    annotationIds: ['a'], // annotated by 'a'
  },
  {
    value: '\nc', // from "[\nc]ontent" text node
    textOffset: { start: 4, end: 6 },
    annotationIds: ['a'], // annotated by 'a'
  },
  {
    value: 'on', // from "\nc[on]tent" text node
    textOffset: { start: 6, end: 8 },
    annotationIds: ['a', 'b'], // annotated by 'a' + 'b'
  },
  {
    value: 'te', // from "\ncon[te]nt" text node
    textOffset: { start: 8, end: 10 },
    annotationIds: ['b'], // annotated by 'b'
  },
  {
    value: 'nt', // from "\nconte[nt]" text node
    textOffset: { start: 10, end: 12 },
    annotationIds: [], // not annotated
  },
];
const annotated = annotate(hast, { annotations });
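The pseudocode above can be turned into a small runnable sketch. It assumes a plain depth-first walk over the tree to compute text node offsets, then cuts each text node at every annotation boundary that falls inside it. The function names collectTextNodes and segmentTextNode are hypothetical and only illustrate the idea; they are not the package's internals.

```javascript
// Walk the tree in document order and record each text node's offsets
// relative to the textContent of the whole tree.
function collectTextNodes(node, acc = { nodes: [], offset: 0 }) {
  if (node.type === 'text') {
    const start = acc.offset;
    acc.offset += node.value.length;
    acc.nodes.push({ value: node.value, start, end: acc.offset });
  }
  for (const child of node.children || []) {
    collectTextNodes(child, acc);
  }
  return acc.nodes;
}

// Split one text node at every annotation boundary strictly inside it,
// and record which annotations cover each resulting segment.
function segmentTextNode(textNode, annotations) {
  const cuts = new Set([textNode.start, textNode.end]);
  for (const { start, end } of annotations) {
    for (const offset of [start, end]) {
      if (offset > textNode.start && offset < textNode.end) {
        cuts.add(offset);
      }
    }
  }
  const sorted = [...cuts].sort((a, b) => a - b);
  const segments = [];
  for (let i = 0; i < sorted.length - 1; i++) {
    const start = sorted[i];
    const end = sorted[i + 1];
    segments.push({
      value: textNode.value.slice(start - textNode.start, end - textNode.start),
      textOffset: { start, end },
      annotationIds: annotations
        .filter((a) => a.start < end && a.end > start)
        .map((a) => a.id),
    });
  }
  return segments;
}

const tree = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'blockquote',
      children: [
        {
          type: 'element',
          tagName: 'strong',
          children: [{ type: 'text', value: 'some' }],
        },
        { type: 'text', value: '\ncontent' },
      ],
    },
  ],
};
const sketchAnnotations = [
  { id: 'a', start: 3, end: 8 },
  { id: 'b', start: 6, end: 10 },
];
const segments = collectTextNodes(tree).flatMap((textNode) =>
  segmentTextNode(textNode, sketchAnnotations),
);
console.log(segments);
```

Each segment then only needs to be wrapped in one mark node per entry in annotationIds, which is how the surrounding element structure (here, the strong and blockquote elements) can stay intact.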