Aida-nlp NPM | npm.io

Check the demo

It's a chatbot running in the browser that uses Tensorflow.js, with the Web Speech API for speech to text and text to speech.
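For readers unfamiliar with the Web Speech API, the following is a minimal sketch of browser speech input and output. It is not taken from the Aida demo: the helper names `speak` and `listenOnce` and the default `en-US` language tag are assumptions made for illustration, and both helpers simply return `false` where the API is unavailable (for example outside a browser).

```typescript
// Hypothetical helpers illustrating the Web Speech API; not part of aida-nlp.

// Text to speech. Returns false when speech synthesis is unavailable.
function speak(text: string, lang: string = "en-US"): boolean {
  const g = globalThis as any;
  if (!g.speechSynthesis || !g.SpeechSynthesisUtterance) {
    return false;
  }
  const utterance = new g.SpeechSynthesisUtterance(text);
  utterance.lang = lang;
  g.speechSynthesis.speak(utterance);
  return true;
}

// Speech to text. Calls onText with the first recognized transcript.
// Returns false when speech recognition is unavailable.
function listenOnce(onText: (text: string) => void, lang: string = "en-US"): boolean {
  const g = globalThis as any;
  // Some browsers only expose the prefixed constructor.
  const Recognition = g.SpeechRecognition || g.webkitSpeechRecognition;
  if (!Recognition) {
    return false;
  }
  const recognition = new Recognition();
  recognition.lang = lang;
  recognition.interimResults = false; // only deliver final results
  recognition.onresult = (event: any) => {
    onText(event.results[0][0].transcript);
  };
  recognition.start();
  return true;
}
```

In a chatbot page, the transcript passed to `onText` would be sent to the model, and the reply would be handed to `speak`.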

Train online

You can train from the browser using Javascript and Tensorflow.js (using your local GPU resources), or from the browser using Python and Tensorflow with Keras thanks to Google Colaboratory's free TPUs. There is no need to set up a local environment, and the trained models can be saved for later use.

Local NPM package setup

Install the npm package:

yarn add aida-nlp

Create your chatito definition files. Here you define your intents and your possible sentence models in multiple .chatito files, and save them to a directory, e.g. ./chatito
Create a config file like aida_config.json where you define the path to your chatito definition files, the chatito dataset output path and the output path for the trained NLP models:

{
  "chatito": {
    "inputPath": "./chatito",
    "outputPath": "./dataset"
  },
  "aida": {
    "outputPath": "./model",
    "language": "en"
  }
}

Generate and encode the dataset for training: npx aida-nlp aida_config.json --action dataset. The dataset will be available at the configured output path.
Start training: npx aida-nlp aida_config.json --action train. The models will be saved at the configured output path.
Run npx aida-nlp aida_config.json --action test to try the trained models on the generated testing dataset.
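As a rough illustration of the steps above, the sketch below checks a config file for the four fields shown in the example and lists the three CLI invocations in the order they are run. The helper names `parseAidaConfig` and `aidaCommands` are hypothetical, and treating all four fields as required is an assumption; the package itself may accept other shapes.

```typescript
// Shape of aida_config.json as shown in the example config.
interface AidaConfig {
  chatito: { inputPath: string; outputPath: string };
  aida: { outputPath: string; language: string };
}

// Hypothetical helper: parse the JSON text and check that each field from the
// example is a non-empty string. Assumes all four fields are required.
function parseAidaConfig(jsonText: string): AidaConfig {
  const raw = JSON.parse(jsonText);
  const required: string[][] = [
    ["chatito", "inputPath"],
    ["chatito", "outputPath"],
    ["aida", "outputPath"],
    ["aida", "language"],
  ];
  for (let i = 0; i < required.length; i++) {
    const section = required[i][0];
    const key = required[i][1];
    const sectionValue = raw && raw[section];
    const value = sectionValue && sectionValue[key];
    if (typeof value !== "string" || value.length === 0) {
      throw new Error('aida_config: missing "' + section + "." + key + '"');
    }
  }
  return raw as AidaConfig;
}

// The three actions named in the steps, in the order they are run.
const AIDA_ACTIONS: string[] = ["dataset", "train", "test"];

// Hypothetical helper: build the command line for each action.
function aidaCommands(configPath: string): string[] {
  return AIDA_ACTIONS.map(function (action) {
    return "npx aida-nlp " + configPath + " --action " + action;
  });
}
```

Calling `aidaCommands("aida_config.json")` yields the dataset, train and test commands from the steps, which can then be run one after another from a shell.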

Local setup cloning the project

As an alternative to training online or using the npm package, you can set up the project locally. Clone the GitHub project and install the dependencies for Node and Python (assuming Node.js with yarn and Python 3 are installed):

Run yarn install from the ./typescript directory
Run pip3 install -r requirements.txt from the ./python directory

Create a dataset

Edit or create the chatito files inside ./typescript/examples/en/intents to customize the dataset generation as you need. You can read more about Chatito.

Then, from the ./typescript directory, run npm run dataset:en:process. This will generate many files in the ./typescript/public/models directory: the dataset, the dataset parameters, the testing parameters and the embeddings dictionary. (Note: Aida also supports Spanish. If you need another language, you can add it after first downloading the fastText embeddings for that language.)

Training

You can train from 3 local environments:

  - For Python: open `./python/main.ipynb` with jupyter notebook or jupyter lab. Python will load the custom settings generated in the dataset step and save the models in a TensorflowJS compatible format at the output directory.

  - For web browsers: from `./typescript` run `npm run web:start`. Then navigate to `http://localhost:8000/train` for the training web UI. After training, download the model to the `./typescript/public/pretrained/web` directory (NOTE: this will also generate and download a new dataset).

  - For Node.js: from `./typescript` run `npm run node:start`. This will load the previously generated dataset files from `./typescript/public/models`.

Technical Overview

Read the technical overview documentation.

Future ideas

Add tests
Add example that predicts from AWS Lambda
Experiment with multi-layer language models based on character features like bigrams or trigrams for transfer learning, probably using a custom BiLSTM or LSTM architecture similar to, but simpler than, Universal Language Model Fine-tuning for Text Classification (blog post).
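To make "character features like bigrams or trigrams" concrete, here is a minimal sketch of character n-gram extraction. It is only an illustration of the idea, not code from the project; the function name `charNgrams` and the lowercasing step are assumptions.

```typescript
// Hypothetical helper: split text into overlapping character n-grams.
// Works on UTF-16 code units, which is enough for plain ASCII input.
function charNgrams(text: string, n: number): string[] {
  if (Math.floor(n) !== n || n < 1) {
    throw new Error("n must be a positive integer");
  }
  const chars = text.toLowerCase().split("");
  const grams: string[] = [];
  // Slide a window of width n across the characters.
  for (let i = 0; i + n <= chars.length; i++) {
    grams.push(chars.slice(i, i + n).join(""));
  }
  return grams;
}

// Bigrams of "hello": he, el, ll, lo
const bigrams = charNgrams("hello", 2);
// Trigrams of "hello": hel, ell, llo
const trigrams = charNgrams("hello", 3);
```

Features like these could be mapped to integer ids and fed to an embedding layer in front of an LSTM, but that design is left open here.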

Author

Rodrigo Pimentel
