Manee : Thai / English General-purpose text classification tool

A simple, easy-to-use text classification library for Node.js, based on TNThai and ml. The analyzer supports Thai and English text. Text is converted to vectors using one-hot encoding. See Basic usage for more details.

Briefly, one-hot encoding represents a word in the text by a vector of the size of the vocabulary, where only the entry corresponding to the word is a one and all the other entries are zero.
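The idea can be illustrated with a small standalone sketch. This is not manee's internal code: the helper names (buildVocabulary, oneHot, tokenize) are hypothetical, and whitespace tokenization is an assumption that only suits English (Thai text needs a word segmenter such as the one TNThai provides).

```javascript
// Hypothetical helper: whitespace tokenization (a simplification for English only).
const tokenize = (text) => text.toLowerCase().split(/\s+/).filter(Boolean);

// Build a vocabulary (word -> index) from a list of tokenized texts.
function buildVocabulary(tokenizedTexts) {
  const vocab = new Map();
  for (const tokens of tokenizedTexts) {
    for (const token of tokens) {
      if (!vocab.has(token)) vocab.set(token, vocab.size);
    }
  }
  return vocab;
}

// One-hot vector for a single word: all zeros except a 1 at the word's index.
// A word outside the vocabulary yields an all-zero vector.
function oneHot(word, vocab) {
  const vector = new Array(vocab.size).fill(0);
  const index = vocab.get(word);
  if (index !== undefined) vector[index] = 1;
  return vector;
}

const sampleTexts = ['delivery problem parcel', 'parcel delivery notification'];
const vocab = buildVocabulary(sampleTexts.map(tokenize));
// vocab: delivery -> 0, problem -> 1, parcel -> 2, notification -> 3
console.log(oneHot('parcel', vocab)); // [ 0, 0, 1, 0 ]
```

The vector length equals the vocabulary size, so larger training sets produce longer and sparser vectors.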

Features

Trains on free-text strings in Thai or English.
Each label must be a string representing a category.
Only Multinomial Naive Bayes is supported.
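For readers unfamiliar with Multinomial Naive Bayes, the following is a minimal sketch of the technique, not manee's implementation. The class name TinyMultinomialNB, whitespace tokenization, and add-one (Laplace) smoothing are all assumptions made for illustration.

```javascript
// Minimal multinomial Naive Bayes over whitespace tokens (illustrative only).
class TinyMultinomialNB {
  constructor() {
    this.docCounts = new Map();  // label -> number of training documents
    this.wordCounts = new Map(); // label -> Map(word -> count)
    this.totalWords = new Map(); // label -> total number of words seen
    this.vocab = new Set();      // all distinct words seen in training
    this.totalDocs = 0;
  }

  static tokenize(text) {
    return text.toLowerCase().split(/\s+/).filter(Boolean);
  }

  train(texts, labels) {
    texts.forEach((text, i) => {
      const label = labels[i];
      this.totalDocs += 1;
      this.docCounts.set(label, (this.docCounts.get(label) || 0) + 1);
      if (!this.wordCounts.has(label)) this.wordCounts.set(label, new Map());
      const counts = this.wordCounts.get(label);
      for (const word of TinyMultinomialNB.tokenize(text)) {
        counts.set(word, (counts.get(word) || 0) + 1);
        this.totalWords.set(label, (this.totalWords.get(label) || 0) + 1);
        this.vocab.add(word);
      }
    });
  }

  classifyOne(text) {
    let best = null;
    let bestScore = -Infinity;
    for (const [label, docCount] of this.docCounts) {
      // score = log P(label) + sum over words of log P(word | label)
      let score = Math.log(docCount / this.totalDocs);
      const counts = this.wordCounts.get(label);
      // Add-one smoothing: unseen words get a small non-zero probability.
      const denom = (this.totalWords.get(label) || 0) + this.vocab.size;
      for (const word of TinyMultinomialNB.tokenize(text)) {
        score += Math.log(((counts.get(word) || 0) + 1) / denom);
      }
      if (score > bestScore) {
        bestScore = score;
        best = label;
      }
    }
    return best;
  }

  classify(texts) {
    return texts.map((t) => this.classifyOne(t));
  }
}

const nb = new TinyMultinomialNB();
nb.train(
  ['delivery problem parcel', 'parcel delivery notification', 'job application resume'],
  ['Spam', 'Spam', 'Good']
);
console.log(nb.classify(['parcel delivery', 'application for a job']));
// [ 'Spam', 'Good' ]
```

Scores are summed in log space to avoid numeric underflow when a text contains many words.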

Installation

npm install manee

npm install manee --save

Basic usage

const manee = require('manee');

const textClassifier = new manee();

const Texts = ["FedEx Parcel Support: Delivery Problem, 1st Attempt Hello  We've tried"
        ,"Package Delivery Notification Dear Customer,  Please review your parcel delivery label"
        ,"Improve trustmail SERP Position Very powerful SERP Booster Plan"
        ,"re: G Analytics traffic for trustmail hi Che%ap Social and Search traffic i%n Google Analyt*ics"
        ,"Delivery problem, parcel USPS Your item has arrived at the Post Office at  Mon, 03 Apr 2017 12:36:51 -0700"
        ,"สมัครงานตำแหน่ง IT Support Web เรียน ฝ่ายบุคคล กระผมมีความสนใจที่จะสมัครงานในตำแหน่ง สมัครงานในตำแหน่ง"
        ,"สมัครงานตำแหน่ง Production Supervisor (พระนครศรีอยุธยา) เรียนผู้จัดการฝ่ายบุคคล บริษัท "
        ,"Application (Planning) T. Maenumkhu A.Pluakdaeng Rayong 21140 April 12  2017 Personal Manager"
        ,"สมัครงานตำแหน่ง  ผู้จัดการแผนกบุคคล เรียน ผู้จัดการฝ่ายทรัพยากรบุคคล เนื่องจากดิฉันนางสาว มีความสนใจร่วม"
        ,"ส่งเอกสารสมัครงาน เรียน ฝ่ายบุคคล กระผมมีความประสงค์ที่จะสมัครงานในตำแหน่ง \" เจ้าหน้าที่ RD; \""]

const Labels = ["Spam", "Spam", "Spam", "Spam", "Spam", "Good", "Good", "Good", "Good", "Unknown"]

textClassifier.train(Texts, Labels)

textClassifier.classify(Texts)
//["Spam", "Spam", "Spam", "Spam", "Spam", "Good", "Good", "Good", "Good", "Unknown"]

textClassifier.evaluate()
/* Training Set has : 10 Samples
Correct Label :    Spam,Spam,Spam,Spam,Spam,Good,Good,Good,Good,Unknown
Classified Label : Spam,Spam,Spam,Spam,Spam,Good,Good,Good,Good,Unknown
Whole set evaluation : 100% */

textClassifier.save('test.model')

const newTextClassifier = new manee()

newTextClassifier.load('test.model')

newTextClassifier.classify(Texts)
//["Spam", "Spam", "Spam", "Spam", "Spam", "Good", "Good", "Good", "Good", "Unknown"]