ldawithmorelanguages v0.0.1

LDA

Latent Dirichlet allocation (LDA) topic modeling in JavaScript for Node.js. LDA is a machine learning algorithm that extracts topics and their related keywords from a collection of documents.

In LDA, a document may contain several different topics, each with its own related terms. The algorithm uses a probabilistic model to detect the number of topics you specify and to extract the keywords related to each one. For example, a document may contain topics that could be classified as beach-related and weather-related. The beach topic may contain related words such as sand, ocean, and water. Similarly, the weather topic may contain related words such as sun, temperature, and clouds.
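The beach and weather example can be sketched in a few lines of JavaScript. This is only an illustration of the mixture idea behind LDA, not this package's API and not an inference algorithm: the probabilities are invented, and the names `topics`, `docTopicMix`, `wordProbability`, and `topTerms` are hypothetical. It assumes the topic-word and document-topic distributions are already known, which is the part a real LDA implementation has to estimate from the documents.

```javascript
// Hypothetical topic-word distributions: P(word | topic).
// The numbers are made up for illustration; each topic sums to 1.
const topics = {
  beach: { sand: 0.4, ocean: 0.3, water: 0.2, sun: 0.1 },
  weather: { sun: 0.4, temperature: 0.3, clouds: 0.2, water: 0.1 },
};

// Hypothetical topic mixture for a single document: P(topic | document).
const docTopicMix = { beach: 0.75, weather: 0.25 };

// P(word | document) = sum over topics of
//   P(topic | document) * P(word | topic).
// Words a topic does not list are treated as having probability 0.
function wordProbability(word, mix, topicWordProbs) {
  let p = 0;
  for (const [topic, weight] of Object.entries(mix)) {
    p += weight * (topicWordProbs[topic][word] || 0);
  }
  return p;
}

// Return the n most probable terms of one topic, highest first.
function topTerms(topicName, n, topicWordProbs) {
  return Object.entries(topicWordProbs[topicName])
    .sort((a, b) => b[1] - a[1])
    .slice(0, n)
    .map(([term]) => term);
}

console.log(topTerms("beach", 3, topics));
console.log(wordProbability("sun", docTopicMix, topics));
```

A word such as "sun" or "water" can belong to both topics with different weights, which is why a document's overall word probabilities depend on its topic mixture rather than on a single topic.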

See http://en.wikipedia.org/wiki/Latent_Dirichlet_allocation

Author

Kory Becker http://www.primaryobjects.com

Based on the original JavaScript implementation at https://github.com/awaisathar/lda.js

Stop words for many additional languages were added from https://github.com/stopwords-iso/stopwords-iso