0.1.0-beta.1 • Published 4 years ago

ccgjs v0.1.0-beta.1

CCGjs

A combinatory categorial grammar (CCG) library for the web.

NOTE: Work in progress can be found on the development branch.

Requirements

  • Node.js ^= 12.18.2
  • NPM ^= 6.14.7

Installation

Development

Run npm install to install the dependencies, and we are all set.

Web Browser

Include this library in your HTML file.

<script type="text/javascript" src="/path/to/ccgjs"></script>

Replace /path/to/ccgjs with the CCGjs library URL. Then, use it:

<script type="text/javascript">
  const { CCG } = ccgjs;

  // do something
</script>

See examples/index.html for reference.

Available APIs

CCG.Reader

Reads and parses a machine-readable CCG derivation into a JavaScript object.

Usage:

const str = '(<T S 0 2> (<L S/NP PSP PSP Hi S/NP>) (<L NP NNP NNP Wisnu NP>))';
const reader = new CCG.Reader(str);
if (reader.read()) {
  console.log(reader.result);
}

The returned object looks like this:

{
  node: {
    type: 'T',
    ccgCat: 'S',
    head: 0,
    dtrs: 2,
  },
  left: {
    node: {
      type: 'L',
      ccgCat: 'S/NP',
      modPOSTag: 'PSP',
      origPOSTag: 'PSP',
      word: 'Hi',
      predArgCat: 'S/NP',
    },
  },
  right: {
    node: {
      type: 'L',
      ccgCat: 'NP',
      modPOSTag: 'NNP',
      origPOSTag: 'NNP',
      word: 'Wisnu',
      predArgCat: 'NP',
    },
  },
}
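
Because the result is a plain nested object with node, left, and right keys, it can be walked recursively. The following is a minimal sketch, assuming only the shape shown above; collectLeaves is a hypothetical helper and not part of CCGjs. The object literal stands in for reader.result.

```javascript
// Shape copied from the example output above; in real use this would be reader.result.
const result = {
  node: { type: 'T', ccgCat: 'S', head: 0, dtrs: 2 },
  left: {
    node: { type: 'L', ccgCat: 'S/NP', modPOSTag: 'PSP', origPOSTag: 'PSP', word: 'Hi', predArgCat: 'S/NP' },
  },
  right: {
    node: { type: 'L', ccgCat: 'NP', modPOSTag: 'NNP', origPOSTag: 'NNP', word: 'Wisnu', predArgCat: 'NP' },
  },
};

// Hypothetical helper (not a CCGjs API): gathers the leaf ('L') nodes.
// The left subtree is visited before the right one, so leaves come out in sentence order.
function collectLeaves(entry, leaves = []) {
  if (!entry) return leaves;
  if (entry.node.type === 'L') {
    leaves.push({ word: entry.node.word, ccgCat: entry.node.ccgCat });
    return leaves;
  }
  collectLeaves(entry.left, leaves);
  collectLeaves(entry.right, leaves);
  return leaves;
}

const leaves = collectLeaves(result);
console.log(leaves.map((leaf) => leaf.word).join(' ')); // prints "Hi Wisnu"
```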

We use PEG.js to build the parser. The parsing expression grammar can be found in the src/ccg.pegjs file, and the generated parser in the src/generated.pegjs.ts file.

Run npm run pegjs to generate the .ts parser from the .pegjs grammar.
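
To illustrate the node labels the grammar has to recognise, here is a rough sketch that splits a single label into the fields shown in the example output. It is not the PEG.js grammar from src/ccg.pegjs; parseLabel is a hypothetical helper, and it assumes that fields are separated by whitespace and never contain spaces themselves.

```javascript
// Hypothetical helper (not a CCGjs API): parses one node label such as
// '<T S 0 2>' or '<L S/NP PSP PSP Hi S/NP>' into a plain object.
function parseLabel(label) {
  const match = /^<(.*)>$/.exec(label.trim());
  if (!match) throw new SyntaxError(`Not a node label: ${label}`);

  const fields = match[1].split(/\s+/);
  const type = fields[0];

  // Internal node: <T ccgCat head dtrs>
  if (type === 'T' && fields.length === 4) {
    const [, ccgCat, head, dtrs] = fields;
    return { type, ccgCat, head: Number(head), dtrs: Number(dtrs) };
  }

  // Leaf node: <L ccgCat modPOSTag origPOSTag word predArgCat>
  if (type === 'L' && fields.length === 6) {
    const [, ccgCat, modPOSTag, origPOSTag, word, predArgCat] = fields;
    return { type, ccgCat, modPOSTag, origPOSTag, word, predArgCat };
  }

  throw new SyntaxError(`Unexpected node label: ${label}`);
}

const internal = parseLabel('<T S 0 2>');
const leaf = parseLabel('<L S/NP PSP PSP Hi S/NP>');
console.log(internal, leaf);
```

A real grammar also has to handle the nested parentheses around each node, which this sketch leaves out.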

CCG.Tree

Constructs a JavaScript tree object from a machine-readable CCG derivation parsed via CCG.Reader. It also builds useful metadata for later use.

Usage:

const str = '(<T S 0 2> (<L S/NP PSP PSP Hi S/NP>) (<L NP NNP NNP Wisnu NP>))';
const tree = new CCG.Tree(str);
console.log(tree);

The returned object looks like this:

Tree {
  metadata: {
    isParsed: true,
    sentence: 'Hi Wisnu',
    words: [ 'Hi', 'Wisnu' ],
    ccgCats: [ 'S/NP', 'NP' ],
    height: 2,
    nodes: [ [Object], [Object], [Object] ]
  },
  mappedIndexedWords: { '0': { value: [Object] }, '1': { value: [Object] } },
  root: {
    value: { type: 'T', ccgCat: 'S', head: 0, dtrs: 2 },
    left: { value: [Object] },
    right: { value: [Object] }
  }
}

For more information about the omitted [Object], see CCG.TreeTypes.Metadata, CCG.TreeTypes.IndexedWordMapper, and CCG.TreeTypes.Node.
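
As an illustration of how such metadata relates to the tree, the sketch below recomputes sentence, words, ccgCats, and height from a root of the shape shown above. It makes two assumptions: each leaf value carries the same fields as the CCG.Reader leaf node, and height counts levels (a leaf has height 1), which matches height: 2 in the example. summarize is a hypothetical helper, not how CCG.Tree is implemented.

```javascript
// Stand-in for tree.root, with the omitted [Object] values filled in by assumption.
const root = {
  value: { type: 'T', ccgCat: 'S', head: 0, dtrs: 2 },
  left: { value: { type: 'L', ccgCat: 'S/NP', modPOSTag: 'PSP', origPOSTag: 'PSP', word: 'Hi', predArgCat: 'S/NP' } },
  right: { value: { type: 'L', ccgCat: 'NP', modPOSTag: 'NNP', origPOSTag: 'NNP', word: 'Wisnu', predArgCat: 'NP' } },
};

// Hypothetical helper (not a CCGjs API): one depth-first pass collects the
// words and categories in sentence order and returns the height of each subtree.
function summarize(node) {
  const words = [];
  const ccgCats = [];

  function visit(current) {
    if (!current) return 0;
    if (current.value.type === 'L') {
      words.push(current.value.word);
      ccgCats.push(current.value.ccgCat);
      return 1; // assumption: a leaf counts as one level
    }
    // Arguments are evaluated left to right, so the left subtree is visited first.
    return 1 + Math.max(visit(current.left), visit(current.right));
  }

  const height = visit(node);
  return { sentence: words.join(' '), words, ccgCats, height };
}

const summary = summarize(root);
console.log(summary);
// { sentence: 'Hi Wisnu', words: [ 'Hi', 'Wisnu' ], ccgCats: [ 'S/NP', 'NP' ], height: 2 }
```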

toString

We can also turn the tree back into a machine-readable CCG derivation by calling tree.toString(). The returned string will be:

(<T S 0 2> (<L S/NP PSP PSP Hi S/NP>) (<L NP NNP NNP Wisnu NP>))
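
The round trip can be pictured as a recursive serialization of the tree. The following is a sketch only, assuming the root shape shown earlier and leaf values with the same fields as the CCG.Reader leaf node; serialize is a hypothetical helper and not necessarily how tree.toString() is implemented.

```javascript
// Stand-in for tree.root, with leaf values filled in by assumption.
const root = {
  value: { type: 'T', ccgCat: 'S', head: 0, dtrs: 2 },
  left: { value: { type: 'L', ccgCat: 'S/NP', modPOSTag: 'PSP', origPOSTag: 'PSP', word: 'Hi', predArgCat: 'S/NP' } },
  right: { value: { type: 'L', ccgCat: 'NP', modPOSTag: 'NNP', origPOSTag: 'NNP', word: 'Wisnu', predArgCat: 'NP' } },
};

// Hypothetical helper (not a CCGjs API): writes a node as
// (<label> child child), recursing into whichever children exist.
function serialize(node) {
  const v = node.value;
  if (v.type === 'L') {
    return `(<L ${v.ccgCat} ${v.modPOSTag} ${v.origPOSTag} ${v.word} ${v.predArgCat}>)`;
  }
  const children = [node.left, node.right]
    .filter(Boolean)
    .map((child) => serialize(child))
    .join(' ');
  return `(<T ${v.ccgCat} ${v.head} ${v.dtrs}> ${children})`;
}

const serialized = serialize(root);
console.log(serialized);
// (<T S 0 2> (<L S/NP PSP PSP Hi S/NP>) (<L NP NNP NNP Wisnu NP>))
```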

buildDerivations

Calling tree.buildDerivations() returns the structured CCG derivation built from the CCG.TreeTypes.Node objects. For the tree above, the returned Array<Array<CCG.TreeTypes.Derivation>> is:

[
  [
    { from: 0, to: 0, ccgCat: 'S/NP' },
    { from: 1, to: 1, ccgCat: 'NP' }
  ],
  [ { from: 0, to: 1, ccgCat: 'S', opr: '>' } ]
]

How to read?

CCG.TreeTypes.Metadata contains a words key. In this example, it is ['Hi', 'Wisnu'], meaning that the word Hi is at index 0 and the word Wisnu is at index 1. Each derivation's from and to values are indexes into this array, so the result above reads as:

  Hi     Wisnu
------ ---------
 S/NP     NP
--------------->
       S
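
The same reading can be done in code by slicing the words array with each from and to pair. This is a small sketch over the literal values shown above; describeDerivations is a hypothetical helper, not part of CCGjs.

```javascript
// Values copied from the examples above (metadata.words and buildDerivations()).
const words = ['Hi', 'Wisnu'];
const derivations = [
  [
    { from: 0, to: 0, ccgCat: 'S/NP' },
    { from: 1, to: 1, ccgCat: 'NP' },
  ],
  [{ from: 0, to: 1, ccgCat: 'S', opr: '>' }],
];

// Hypothetical helper (not a CCGjs API): for every step, lists the words it
// spans (from and to are inclusive), its category, and its operator if any.
function describeDerivations(words, derivations) {
  return derivations.map((row) =>
    row.map(({ from, to, ccgCat, opr }) => {
      const span = words.slice(from, to + 1).join(' ');
      return opr ? `${span} : ${ccgCat} (${opr})` : `${span} : ${ccgCat}`;
    })
  );
}

const described = describeDerivations(words, derivations);
console.log(described);
// [ [ 'Hi : S/NP', 'Wisnu : NP' ], [ 'Hi Wisnu : S (>)' ] ]
```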

CCG.DOM

Renders and manipulates a CCG.Tree as a DOM (document object model) directly in the browser. Currently, the only available method is createTable().

Usage:

const str = [
  '(<T Sf 1 2>',
  '(<T NP 0 2>',
  '(<L NP NNP NNP raam NP>)',
  '(<L NP\\NP PSP PSP ne NP\\NP>))',
  '(<T Sf\\NP 1 2>',
  '(<T NP 0 2>',
  '(<L NP NNP NNP mohan NP>)',
  '(<L NP\\NP PSP PSP ko NP\\NP>))',
  '(<T (Sf\\NP)\\NP 1 2>',
  '(<T NP 1 2>',
  '(<L NP/NP JJ JJ niilii NP/NP>)',
  '(<L NP NN NN kitaab NP>))',
  '(<L ((Sf\\NP)\\NP)\\NP VM VM dii ((Sf\\NP)\\NP)\\NP>))))',
].join(' ');
const dom = new CCG.DOM(str);
const table = dom.createTable();

// apply it directly
document.body.appendChild(table);

// or take the HTML string
console.log(table.outerHTML);
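
For intuition about how a derivation can map onto a table, the sketch below turns a words array and a derivations array into an HTML string, one row per derivation step, using colspan for steps that span several words. This is an assumption-laden illustration and not the markup that createTable() actually produces; derivationsToHtml is a hypothetical helper, and it does not escape HTML in words or categories.

```javascript
// Hypothetical helper (not a CCGjs API): builds a table string in which each
// derivation step occupies the columns from its 'from' index to its 'to' index.
function derivationsToHtml(words, derivations) {
  const cell = (text, span = 1) => `<td colspan="${span}">${text}</td>`;
  const wordRow = `<tr>${words.map((word) => cell(word)).join('')}</tr>`;

  const rows = derivations.map((row) => {
    const cells = [];
    let column = 0;
    const ordered = [...row].sort((a, b) => a.from - b.from);
    for (const { from, to, ccgCat } of ordered) {
      if (from > column) cells.push(cell('', from - column)); // pad a gap before this step
      cells.push(cell(ccgCat, to - from + 1));
      column = to + 1;
    }
    if (column < words.length) cells.push(cell('', words.length - column)); // pad the tail
    return `<tr>${cells.join('')}</tr>`;
  });

  return `<table>${wordRow}${rows.join('')}</table>`;
}

const html = derivationsToHtml(
  ['Hi', 'Wisnu'],
  [
    [
      { from: 0, to: 0, ccgCat: 'S/NP' },
      { from: 1, to: 1, ccgCat: 'NP' },
    ],
    [{ from: 0, to: 1, ccgCat: 'S', opr: '>' }],
  ]
);
console.log(html);
```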

What does it look like? Take a look at this JS Bin!

TODO

There is still a lot to do. The goal of this project is to enable interactive manipulation of CCG derivations directly in the web browser. By manipulation we mean, for example, the ability to interactively create, edit, and delete a CCG node in its tree. The direction of this project should be clear by now.

Contributing

Please refrain from contributing until this project is officially released. We will add CONTRIBUTING.md once we are ready.

Until then, both issues and pull requests will be ignored.

Acknowledgements

This JavaScript library is part of my undergraduate thesis. Hence, I would like to thank my supervisors (@aromadhony and @saidalfaraby) for their advice and guidance.

References

Hockenmaier, J., & Steedman, M. (2007). CCGbank: A corpus of CCG derivations and dependency structures extracted from the Penn Treebank. Computational Linguistics, 33(3), 355–396.

Ambati, B. R., Deoskar, T., & Steedman, M. (2018). Hindi CCGbank: A CCG treebank from the Hindi dependency treebank. Language Resources and Evaluation, 52, 67–100.

License

Licensed under the MIT License.