0.1.3 • Published 9 years ago

rdn-naive-bayes v0.1.3

License
MIT

CS 5860 - Naive Bayes Classifier

Ross Nordstrom
University of Colorado - Colorado Springs
CS 5860 - Machine Learning

Assignment

Write a program in a language of your choice that classifies datasets into two classes. The two classes here are Charles Dickens and Thomas Hardy.
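For orientation, a two-class text classifier of this kind can be sketched as a multinomial Naive Bayes model with add-one (Laplace) smoothing. This is a minimal illustration, not this project's actual implementation: the function names (`tokenize`, `train`, `classify`), the `{ label, text }` shape of a training point, and the tokenization rule are all assumptions made for the example.

```javascript
// Minimal multinomial Naive Bayes sketch (hypothetical names, not the package's API).

// Assumed tokenization: lowercase, keep runs of letters and apostrophes.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z']+/g) || [];
}

// Count documents and words per class. Each point is { label, text }.
function train(points) {
  const model = { classes: new Map(), vocab: new Set(), totalDocs: 0 };
  for (const { label, text } of points) {
    if (!model.classes.has(label)) {
      model.classes.set(label, { docs: 0, words: 0, counts: new Map() });
    }
    const cls = model.classes.get(label);
    cls.docs += 1;
    model.totalDocs += 1;
    for (const word of tokenize(text)) {
      cls.counts.set(word, (cls.counts.get(word) || 0) + 1);
      cls.words += 1;
      model.vocab.add(word);
    }
  }
  return model;
}

// Pick the class maximizing log P(class) + sum of log P(word | class).
function classify(model, text) {
  const words = tokenize(text);
  let best = null;
  let bestScore = -Infinity;
  for (const [label, cls] of model.classes) {
    let score = Math.log(cls.docs / model.totalDocs);
    for (const word of words) {
      const count = cls.counts.get(word) || 0;
      // Add-one smoothing so unseen words do not zero out the probability.
      score += Math.log((count + 1) / (cls.words + model.vocab.size));
    }
    if (score > bestScore) {
      bestScore = score;
      best = label;
    }
  }
  return best;
}
```

Working in log space avoids numeric underflow when the text blob is as long as a whole book.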

Assignment Details

Dataset

Beyond the required Dickens and Hardy books, additional datasets were taken from the UCI Machine Learning Repository. The datasets used are described below.

Datasets used, and their location in this project:

| Dataset | Source | Path | Type * |
| --- | --- | --- | --- |
| SMS | UCI - SMS Spam Collection | ./data/sms | inline |
| Badges | UCI - Badges | ./data/badges | inline |
| Main | Gutenberg - Dickens, Hardy | ./data/main | gutenberg |

Dataset Types: *

| Type | Description |
| --- | --- |
| inline | Dataset is stored as a single file in which each line represents a training point. The first word in each line is the class/category, while the rest of the line is a list of words used as the training "text blob." |
| gutenberg | Dataset is stored as a list of directories representing classes/categories (e.g. "dickens", "hardy"). Each file within a class directory represents a training point. These files are actually books, but are abstractly considered to be "text blobs," just like the inline dataset type. |
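The two layouts described above could be read into a common shape roughly as follows. This is a sketch under assumptions: the function names (`parseInlineDataset`, `loadGutenbergDataset`) and the `{ label, text }` result shape are hypothetical and are not taken from the package's source.

```javascript
const fs = require('fs');
const path = require('path');

// "inline": one training point per line; first word is the class,
// the remaining words form the text blob. Blank lines are skipped.
function parseInlineDataset(contents) {
  return contents
    .split(/\r?\n/)
    .filter((line) => line.trim() !== '')
    .map((line) => {
      const [label, ...words] = line.trim().split(/\s+/);
      return { label, text: words.join(' ') };
    });
}

// "gutenberg": one directory per class; every file inside a class
// directory is one training point (here, a whole book).
function loadGutenbergDataset(rootDir) {
  const points = [];
  for (const entry of fs.readdirSync(rootDir, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue;
    const classDir = path.join(rootDir, entry.name);
    for (const file of fs.readdirSync(classDir)) {
      const text = fs.readFileSync(path.join(classDir, file), 'utf8');
      points.push({ label: entry.name, text });
    }
  }
  return points;
}
```

Both loaders return the same array of labeled text blobs, so whatever consumes the data does not need to know which dataset type it came from.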

Usage

This project is intended to be used via the CLI, and is exposed as an NPM package.

Installation

From NPM:

npm install -g rdn-naive-bayes

From local:

git clone git@github.com:ross-nordstrom/cs5860-naive_bayes.git
cd cs5860-naive_bayes
npm install
npm link

Running

View Usage: Rather than duplicate the usage here, please see the tool's built-in help. In general, the tool expects to be given a dataset, which it divides into training and testing data.

rdn-naive-bayes -h
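The training/testing division mentioned above can be pictured with a sketch like the following. How the tool actually splits its input is not documented here, so the shuffle, the default 80/20 ratio, and the `splitDataset` name are all assumptions made for illustration.

```javascript
// Hypothetical helper: shuffle a copy of the points, then cut at `trainFraction`.
// `random` is injectable so the split can be made repeatable.
function splitDataset(points, trainFraction = 0.8, random = Math.random) {
  const shuffled = points.slice();
  // Fisher-Yates shuffle over the copy; the input array is left untouched.
  for (let i = shuffled.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [shuffled[i], shuffled[j]] = [shuffled[j], shuffled[i]];
  }
  const cut = Math.floor(shuffled.length * trainFraction);
  return { training: shuffled.slice(0, cut), testing: shuffled.slice(cut) };
}
```

Shuffling before the cut keeps a file that is sorted by class from placing one class entirely in the testing portion.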

Testing

npm install
npm test