1.2.4 • Published 7 years ago

Barfer v1.2.4

Barfer

This module provides a set of NLP tools, built on top of other modules, that detect:

  • Sentiment (English and Spanish) using trigrams and bigrams
  • Emoji sentiment
  • Intention
  • Topics
  • Context
  • Language detection (defaults to a whitelist of eng, spa, por, fra and ger)

Changes

* 1.2.4 - Improved stopwords and sentiment, general system speed improvement
* 1.2.0 - Added more default taggers
* 1.1.1 - Fixed some bugs
* 1.1.0 - Overhauled the core, removed a couple of modules, simplified logic, and optimized parsing

How to use

Start Barfer

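	// assumes the package is installed and exported under the name 'barfer'
	const Barfer = require( 'barfer' );

	const barfer = new Barfer( {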
		lang:
		{
			whitelist: [ 'spa' ] // works best when focused on a single language for now
			// whitelist: [ 'eng', 'spa' ]
		},
	} );

Additional configuration options

	const conf = {

		// process data as Twitter data
		twitter: true,

		// convert text to ASCII-only characters
		latinize: true,

		// define a very important target/topic
		target: 'some name',

		// add interesting topics for the tagger.
		interesting: [ 'some', 'other' ],

		// set the whitelist to an empty array if you want to bypass it
		lang:
		{
			whitelist: [ 'eng', 'spa', 'por', 'fra', 'ger' ]
		},

	};

Implement a term

	// this term is a default in Barfer

	barfer.addParameter( ( tokens ) => {

		const match = /^(car|bus|metro|train|plane|boat|taxi|bike|bicycle)\b/igm.exec(tokens.rest());

		if ( match !== null )
		{

			return {
				tag: 'vehicles',
				length: match[ 0 ].length,
				data: match[ 0 ].toLowerCase()
			};

		}

	} );
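
A tagger can be tried out on its own before it is registered. The sketch below assumes, based on the example above, that tokens.rest() returns the not-yet-consumed part of the input as a string and that a tagger returns either nothing or an object with tag, length and data. The currencyTagger name and the stubTokens helper are hypothetical and not part of Barfer.

```javascript
// Hypothetical tagger following the shape of the 'vehicles' example:
// it inspects the start of the remaining text and reports a match.
const currencyTagger = ( tokens ) => {

	const match = /^(usd|eur|gbp|mxn|jpy)\b/i.exec( tokens.rest() );

	if ( match !== null )
	{
		return {
			tag: 'currencies',
			length: match[ 0 ].length,
			data: match[ 0 ].toLowerCase()
		};
	}

	// no match: return nothing, as the 'vehicles' example does
	return undefined;

};

// Stand-in for the tokens object Barfer passes in (an assumption).
const stubTokens = ( text ) => ( { rest: () => text } );

const hit = currencyTagger( stubTokens( 'EUR 20 for the ticket' ) );
const miss = currencyTagger( stubTokens( 'twenty euros' ) );

console.log( hit );  // { tag: 'currencies', length: 3, data: 'eur' }
console.log( miss ); // undefined
```

Once it behaves as expected, it would be registered with barfer.addParameter( currencyTagger ).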

Study

	// twit is the text to analyze, e.g. the body of a tweet
	barfer.study( twit, ( err, data ) => {
		if ( err ) throw err;
		// data holds the full analysis result
	} );
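
If promises are preferred over callbacks, the call can be wrapped. This is a minimal sketch that assumes only what the example above shows: study( text, callback ) calls back with ( err, data ). The studyAsync helper and the fakeBarfer stand-in are hypothetical; the stand-in exists only so the sketch runs without Barfer installed.

```javascript
// Wrap the callback-style study() call in a Promise.
// Assumes study( text, callback ) calls back with ( err, data ).
const studyAsync = ( barfer, text ) => new Promise( ( resolve, reject ) => {
	barfer.study( text, ( err, data ) => {
		if ( err )
		{
			reject( err );
			return;
		}
		resolve( data );
	} );
} );

// Stand-in object (not the real library); replace with a Barfer instance.
const fakeBarfer = {
	study: ( text, callback ) => callback( null, { str: text.toLowerCase(), lang: 'eng' } )
};

const result = studyAsync( fakeBarfer, 'Hello World' );

result.then( ( data ) => console.log( data.str, data.lang ) );
```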

Output

	{
		str: 'rt @jonbershad: @realdonaldtrump fun. so you won\'t be giving us a date when you\'ll be discussing your massive conflicts of interest?',
		lang: 'eng',
		topics:
		 [ { count: 135,
				 length: 10,
				 stem: 'discuss',
				 text: 'discussing',
				 weight: 9.64,
				 action: true,
				 topic: true },
			 { count: 134,
				 length: 6,
				 stem: 'give',
				 text: 'giving',
				 weight: 9.57,
				 stopword: true,
				 action: true,
				 topic: true },
			 { count: 59,
				 length: 9,
				 stem: 'conflict',
				 text: 'conflicts',
				 weight: 4.21,
				 topic: true,
				 sentiment: -2,
				 negative: true },
			 { count: 35,
				 length: 7,
				 stem: 'massiv',
				 text: 'massive',
				 weight: 2.5,
				 topic: true },
			 { count: 34,
				 length: 6,
				 stem: 'youll',
				 text: 'youll',
				 weight: 2.42,
				 stopword: true,
				 topic: true },
			 { count: 29,
				 length: 9,
				 stem: 'interest',
				 text: 'interest',
				 weight: 2.07,
				 stopword: true,
				 topic: true,
				 sentiment: 1,
				 positive: true },
			 [length]: 6 ],
		tagger:
		{
			actions:
			{
				tag: 'actions',
				words:
				{
					giving: { text: 'giving', count: 1, action: true },
					discussing: { text: 'discussing', count: 1, action: true }
				}
			},
			topics:
			{
				tag: 'topics',
				words:
				{
					giving: { text: 'giving', data: { index: 0 } },
					youll: { text: 'youll', data: { index: 0 } },
					discussing: { text: 'discussing', data: { index: 0 } },
					massive: { text: 'massive', data: { index: 0 } },
					conflicts: { text: 'conflicts', data: { index: 1 } },
					interest: { text: 'interest', data: { index: 2 } }
				}
			},
			positive:
			{
				tag: 'positive',
				words:
				{
					interest:
					{
						text: 'interest',
						data: [ 'massive', 'conflicts', [length]: 2 ]
					}
				}
			},
			negative:
			{
				tag: 'negative',
				words:
				{
					conflicts:
					{
						text: 'conflicts',
						data: [ 'massive', 'interest', [length]: 2 ]
					}
				}
			}
		},
		rest: [ 'discussing', 'massive', [length]: 2 ],
		sentiment:
		{
			polarity: -1,
			positive: { score: 1, words: [ 'interest', [length]: 1 ] },
			negative: { score: -2, words: [ 'conflicts', [length]: 1 ] } },
		emojiSentiment:
		{
			polarity: 0,
			positive: { score: 0, emoji: [ [length]: 0 ] },
			negative: { score: 0, emoji: [ [length]: 0 ] }
		},
		twitter:
		{
			parsedAt: 1481743750671,
			mentions: [ 'jonbershad', 'realdonaldtrump', [length]: 2 ],
			hashtags: [ [length]: 0 ],
			cashtags: [ [length]: 0 ],
			replies: [ [length]: 0 ],
			urls: [ [length]: 0 ]
		},
		wordMap:
		{ '@jonbershad':
				{ count: 5,
					length: 12,
					stem: '@jonbershad',
					text: '@jonbershad',
					weight: 0.35,
					rest: true,
					mention: true },
			 '@realdonaldtrump':
				{ count: 5,
					length: 16,
					stem: '@realdonaldtrump',
					text: '@realdonaldtrump',
					weight: 0.35,
					rest: true,
					mention: true },
			 fun: { count: 1, length: 4, stem: 'fun', text: 'fun', weight: 0.07 },
			 you: { count: 1, length: 3, stem: 'you', text: 'you', weight: 0.07 },
			 wont:
				{ count: 1,
					length: 5,
					stem: 'wont',
					text: 'wont',
					weight: 0.07,
					stopword: true },
			 giving:
				{ count: 134,
					length: 6,
					stem: 'give',
					text: 'giving',
					weight: 9.57,
					stopword: true,
					action: true,
					topic: true },
			 date:
				{ count: 1,
					length: 4,
					stem: 'date',
					text: 'date',
					weight: 0.07,
					stopword: true },
			 when:
				{ count: 1,
					length: 4,
					stem: 'when',
					text: 'when',
					weight: 0.07,
					stopword: true },
			 youll:
				{ count: 34,
					length: 6,
					stem: 'youll',
					text: 'youll',
					weight: 2.42,
					stopword: true,
					topic: true },
			 discussing:
				{ count: 135,
					length: 10,
					stem: 'discuss',
					text: 'discussing',
					weight: 9.64,
					action: true,
					topic: true },
			 your:
				{ count: 1,
					length: 4,
					stem: 'your',
					text: 'your',
					weight: 0.07,
					stopword: true },
			 massive:
				{ count: 35,
					length: 7,
					stem: 'massiv',
					text: 'massive',
					weight: 2.5,
					topic: true },
			 conflicts:
				{ count: 59,
					length: 9,
					stem: 'conflict',
					text: 'conflicts',
					weight: 4.21,
					topic: true,
					sentiment: -2,
					negative: true },
			 interest:
				{ count: 29,
					length: 9,
					stem: 'interest',
					text: 'interest',
					weight: 2.07,
					stopword: true,
					topic: true,
					sentiment: 1,
					positive: true }
		}
	}
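
The result is a plain object, so it can be post-processed with ordinary array methods. The sketch below works on a trimmed copy of the sample output and assumes the field names shown there. The topTopics and summarizeSentiment helpers are hypothetical and not part of Barfer.

```javascript
// Trimmed copy of the sample output above.
const data = {
	topics: [
		{ text: 'discussing', weight: 9.64, action: true, topic: true },
		{ text: 'giving', weight: 9.57, stopword: true, action: true, topic: true },
		{ text: 'conflicts', weight: 4.21, topic: true, sentiment: -2, negative: true },
		{ text: 'massive', weight: 2.5, topic: true },
		{ text: 'youll', weight: 2.42, stopword: true, topic: true },
		{ text: 'interest', weight: 2.07, stopword: true, topic: true, sentiment: 1, positive: true }
	],
	sentiment: {
		polarity: -1,
		positive: { score: 1, words: [ 'interest' ] },
		negative: { score: -2, words: [ 'conflicts' ] }
	}
};

// Hypothetical helper: the n heaviest topics that are not stopwords.
const topTopics = ( topics, n ) => topics
	.filter( ( t ) => !t.stopword )
	.sort( ( a, b ) => b.weight - a.weight )
	.slice( 0, n )
	.map( ( t ) => t.text );

// Hypothetical helper: turn the numeric polarity into a label.
const summarizeSentiment = ( sentiment ) => {
	if ( sentiment.polarity > 0 ) return 'positive';
	if ( sentiment.polarity < 0 ) return 'negative';
	return 'neutral';
};

const top = topTopics( data.topics, 3 );
const label = summarizeSentiment( data.sentiment );

console.log( top );   // [ 'discussing', 'conflicts', 'massive' ]
console.log( label ); // negative
```

In this sample the overall polarity (-1) equals the positive score (1) plus the negative score (-2); whether that holds in general is not stated here.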

Note

This is still a proof of concept, but I'm working regularly on improving it.

See test/index.js for more examples of how to use Barfer.

Tests

Run VERBOSE=true npm test to run the tests and see all data output, or npm test to just run the tests.

License

See LICENSE for license info.