1.1.1 • Published 2 years ago

predictive-sentence-generator v1.1.1

License
MIT
Last release
2 years ago

Overview

predictive-sentence-generator is a package that generates random sentences based on source text. The grammar isn't great, and varies widely in accuracy based on the length/writing style of the source text. Despite this, many of the sentences sound like they could have been part of the source (though sometimes they actually are; see Common Issues). This sentence generator should be compatible with any (Latin alphabet) language, though I can't confirm as I only know English.

Simple Use

If you're impatient or don't really care about settings that much, here's the minimum you'll need to get started:

const psg = require('predictive-sentence-generator');
psg.load({
	type: 'preset',
	content: 'random'
});
console.log(psg.generate());

Walkthrough

First, require the sentence generator.

Next, run the load() method, which in this case loads a random preset (see the Presets and Load Settings sections below). This is usually undesirable, since the presets can range from Shakespearean language (Romeo and Juliet) to the scripts of Shrek 2 and the Bee Movie.

Finally, run the generate() method. Without any parameters, this generates a single, formatted sentence of any length. Depending on the source this can often be a single word — to prevent this, see the minLength property in Generate Settings.

Load Settings

Before generating any sentences, you must call the load() method. It tells the generator what to use as the source, and will throw an error if you try to do anything without calling it first. As seen above, this method has two required parameters: type and content. The method must be called with both, and will throw an error if one or both are missing.

type

Specifies the format of the content property.

'preset' declares the use of a preset; see the Presets section for more information.

'save file' loads a save from a file using fs; see Saving and Loading for more information.

'save url' loads a save from an external URL using sync-fetch.

'save internal' loads a save directly from the content property of the method.

'text file' loads raw text from a file and internally converts it to an object (see Saving and Loading) so the generator can use it.

'text url' loads text from a URL.

'text internal' loads text directly from the content property.

Most of these values can have different formatting and still work. For example, Save File means the same thing as file-save and saveFromFile.
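
How the package recognizes these variants is not documented here. As a rough illustration only, the following hypothetical helper (normalizeType is an invented name, not part of the package) shows one way such loosely formatted values could be reduced to a canonical form:

```javascript
// Hypothetical helper: NOT part of the package's documented API.
// Reduces loosely formatted type strings to a canonical 'kind location' form.
function normalizeType(type) {
	const words = type
		.replace(/([a-z])([A-Z])/g, '$1 $2') // split camelCase: saveFromFile -> save From File
		.toLowerCase()
		.split(/[^a-z]+/)                    // split on spaces, hyphens, underscores, etc.
		.filter(Boolean);

	if (words.includes('preset')) return 'preset';

	const kind = ['save', 'text'].find(k => words.includes(k));
	const location = ['file', 'url', 'internal'].find(l => words.includes(l));
	if (!kind || !location) {
		throw new Error(`Unrecognized type: ${type}`);
	}
	return `${kind} ${location}`;
}

console.log(normalizeType('Save File'));    // save file
console.log(normalizeType('file-save'));    // save file
console.log(normalizeType('saveFromFile')); // save file
```

Matching on keywords rather than exact strings is what lets word order and filler words like "From" be ignored in this sketch.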

content

When type is set to 'preset', content is the name of the preset (e.g., 'Bee Movie').

When type is set to '[] file', content specifies the path to the file.

When type is set to '[] url', content specifies the full URL (e.g., https://example.com/).

When type is set to '[] internal', content is the entirety of the data (text, JSON, Object, etc.).

Content in 'save []' types can be either a JavaScript Object or a JSON string. 'text []' types should only be strings.

Example

An example use of the load() method with settings.

psg.load({
	type: 'preset',
	content: 'Bee Movie'
});

Setting Ignored Characters

Sometimes, your source text might have characters you want to ignore (double quotes, for example). The load() method has an optional property to do this: ignoredCharacters.

ignoredCharacters should be a RegExp, in the format of what characters to replace. For example, use /[^A-Z'.,;:?!]/gi (^ meaning 'match everything but the following') to remove anything that isn't a letter, apostrophe or punctuation, and /[\"]/g to remove all double quotes.

This property is only used if the source is text, not a save or a preset.

Note: all ignored characters are interpreted as whitespace, so ignoring the 1 in 'Ch1cken' will split it into two words, Ch and cken.
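
To preview what a given RegExp will do to your text before passing it to load(), the whitespace behavior described above can be imitated in plain JavaScript. This is a standalone sketch; stripIgnored is an illustrative name, and the package's actual word splitting may differ in detail:

```javascript
// Imitates the documented behavior: every ignored character becomes
// whitespace, and the text is then split into words.
function stripIgnored(text, ignoredCharacters) {
	return text
		.replace(ignoredCharacters, ' ')
		.split(/\s+/)
		.filter(Boolean);
}

// Keep only letters, apostrophes and punctuation.
const lettersAndPunctuation = /[^A-Z'.,;:?!]/gi;

console.log(stripIgnored('He said "Ch1cken!"', lettersAndPunctuation));
// [ 'He', 'said', 'Ch', 'cken!' ]

// Remove only double quotes.
console.log(stripIgnored('"Hi," she said.', /["]/g));
// [ 'Hi,', 'she', 'said.' ]
```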

Generate Settings

The generate() method also has optional settings. Any combination of these can be included; none, all, or anything in between.

sentenceCount specifies the number of sentences to generate. A similar effect can be achieved by running the generate() method multiple times. Defaults to 1.

sentenceSeparator specifies a character or string to separate sentences (default: space). Ignored if formatted is set to false.

wordSeparator works like sentenceSeparator, but specifies the character or string placed between the words within each sentence.

formatted specifies whether to format the sentence as a string. If set to false, generate() will return a 2D array: the first level contains each sentence and the next contains each word. Defaults to true.

Formatted output:

Pretty pony wants to live her honeymoon. Boss!

Unformatted output:

[
  [ 'Pretty', 'pony', 'wants', 'to', 'live', 'her', 'honeymoon.' ],
  [ 'Boss!' ]
]
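
If you request unformatted output and later want a string, the two forms are easy to convert between. The sketch below is not the package's code; formatSentences is an illustrative name, and the single-space default for the word separator is an assumption (only the sentence separator default is documented):

```javascript
// Joins a 2D array of sentences/words into one string.
// Both separators default to a single space here (the word separator
// default is an assumption).
function formatSentences(sentences, wordSeparator = ' ', sentenceSeparator = ' ') {
	return sentences
		.map(words => words.join(wordSeparator))
		.join(sentenceSeparator);
}

const unformatted = [
	['Pretty', 'pony', 'wants', 'to', 'live', 'her', 'honeymoon.'],
	['Boss!']
];

console.log(formatSentences(unformatted));
// Pretty pony wants to live her honeymoon. Boss!

console.log(formatSentences(unformatted, ' ', '\n'));
// Pretty pony wants to live her honeymoon.
// Boss!
```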

minLength sets a minimum word count for each sentence. Slow at high values, since the generator has to repeatedly run until it finds a sentence of the correct length. Default: 1.

maxLength sets the maximum word count. Slow at low values, though usually not quite as much as minLength, for the same reason as mentioned above. Default: Infinity.
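
The slowdown comes from retrying until a sentence fits. A minimal sketch of that retry loop is shown below; it is an illustration of the idea, not the package's implementation. generateOnce stands in for whatever produces one candidate sentence, and maxAttempts is a hypothetical safety limit that the package may not have:

```javascript
// Keeps generating candidates until one falls inside the length limits.
// generateOnce: stand-in function returning one sentence as an array of words.
// maxAttempts: hypothetical safety limit, not a documented package option.
function generateWithinLength(generateOnce, { minLength = 1, maxLength = Infinity, maxAttempts = 10000 } = {}) {
	for (let attempt = 1; attempt <= maxAttempts; attempt++) {
		const words = generateOnce();
		if (words.length >= minLength && words.length <= maxLength) {
			return { words, attempts: attempt };
		}
	}
	throw new Error(`No sentence between ${minLength} and ${maxLength} words after ${maxAttempts} attempts`);
}

// Deterministic stand-in generator that cycles through fixed candidates.
const candidates = [
	['Boss!'],
	['No.'],
	['Pretty', 'pony', 'wants', 'to', 'live', 'her', 'honeymoon.']
];
let calls = 0;
const fakeGenerate = () => candidates[calls++ % candidates.length];

const result = generateWithinLength(fakeGenerate, { minLength: 5 });
console.log(result.attempts);        // 3
console.log(result.words.join(' ')); // Pretty pony wants to live her honeymoon.
```

The stricter the limits relative to the sentences the source tends to produce, the more candidates are thrown away before one is accepted.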

seed defines a starting word or words for the generator. If set to '(random)' (the default value), it will choose any word starting with a capital letter. Note: if you omit the parentheses, it will look for the word 'random' as a seed and probably throw an error. This value can also be an array (instead of a string), in which case a seed will be selected at random for each sentence.

seedCaseSensitive specifies whether the seed word should be searched for case-sensitively. Default: true.
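
To make the interaction between seed and seedCaseSensitive concrete, here is a rough sketch of how a seed could be resolved against the word relationships. It is an assumption-laden illustration, not the package's code: resolveSeed, the rng parameter, and the 'Seed not found' message are all invented here, and only single-word seeds are handled.

```javascript
// Hypothetical seed lookup over a word-relationship object.
// rng is injectable so the example is reproducible.
function resolveSeed(map, seed = '(random)', caseSensitive = true, rng = Math.random) {
	const pick = list => list[Math.floor(rng() * list.length)];
	const keys = Object.keys(map);

	// An array means "choose one of these seeds at random".
	const choice = Array.isArray(seed) ? pick(seed) : seed;

	if (choice === '(random)') {
		// Capitalized words are treated as possible sentence starts.
		const starts = keys.filter(key => /^[A-Z]/.test(key));
		if (starts.length === 0) throw new Error('Could not find a starting point');
		return pick(starts);
	}

	const match = caseSensitive
		? keys.find(key => key === choice)
		: keys.find(key => key.toLowerCase() === choice.toLowerCase());
	if (match === undefined) throw new Error(`Seed not found: ${choice}`);
	return match;
}

const exampleMap = { This: ['is'], is: ['an'], an: ['example.'] };

console.log(resolveSeed(exampleMap, 'this', false));            // This
console.log(resolveSeed(exampleMap, ['is', 'an'], true, () => 0)); // is
```

With the case-sensitive default, the seed 'this' would not match 'This' in this sketch, which is the kind of situation seedCaseSensitive: false is meant for.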

Example

An example use of the generate() method with settings.

psg.generate({
	minLength: 5,
	sentenceCount: 3,
	formatted: false
});

Note: the generate() method returns the sentence(s), so if you want to print it to the console you'll have to wrap the method in console.log().

Presets

The package comes with multiple source texts for you to try out. I update these sometimes, but this page might not always be up to date, so go to the source to know for sure.

The current list includes:

  • The Bee Movie (Script)
  • Jane Eyre
  • Romeo and Juliet
  • Shrek 2 (Script)
  • The Bible

Saving and Loading

Realistically, you probably won't need this feature. Usually you can just load your source as text, but if your source text is very long, you might want to create a save to cut down on loading time.

Saving works by loading an Object with word relationships, rather than a string of text (which is converted to word relationships internally). To save, first run the getSave() method, which returns a JSON string or JavaScript Object with word relationships. Store this data somewhere, then load it using the load() method with the type set to save [location type] (see Load Settings).

getSave() has an optional parameter, asString, which specifies whether the data should be returned as a JSON string (true) or as a JavaScript Object (false, as in the example below).
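
Storing the data "somewhere" can be as simple as writing it to a JSON file. The sketch below uses only Node's built-in modules; the savedData literal stands in for a value returned by getSave(false), and the file location is an arbitrary choice for illustration:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Stand-in for the object returned by getSave(false).
const savedData = { This: ['is'], is: ['an'], an: ['example.'] };

// Arbitrary location chosen for this example; any writable path works.
const savePath = path.join(os.tmpdir(), 'psg-save.json');
fs.writeFileSync(savePath, JSON.stringify(savedData));

// Reading it back gives an equivalent object.
const restored = JSON.parse(fs.readFileSync(savePath, 'utf8'));
console.log(restored.This); // [ 'is' ]

// With the package installed, the same file could then be loaded directly:
// psg.load({ type: 'save file', content: savePath });
```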

Example Usage

Saving:

const psg = require('predictive-sentence-generator');
psg.load({
	type: 'text internal',
	content: "This is an example. I don't actually know what to write here, so I'll just write some random example text."
});

const savedData = psg.getSave(false);

console.log(savedData);

savedData is now set to:

{
  This: [ 'is' ],
  is: [ 'an' ],
  an: [ 'example.' ],
  'example.': [ 'I' ],
  I: [ "don't" ],
  "don't": [ 'actually' ],
  actually: [ 'know' ],
  know: [ 'what' ],
  what: [ 'to' ],
  to: [ 'write' ],
  write: [ 'here,', 'some' ],
  'here,': [ 'so' ],
  so: [ "I'll" ],
  "I'll": [ 'just' ],
  just: [ 'write' ],
  some: [ 'random' ],
  random: [ 'example' ],
  example: [ 'text.' ]
}
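
The object above maps each word to the words that follow it in the source. The following sketch reproduces that structure for the same example text. It is a reconstruction based on the output shown, not the package's source; buildRelationships is an illustrative name, and keeping repeated followers is an assumption the example cannot confirm:

```javascript
// Builds a word -> [following words] object from raw text.
// The last word of the text has no follower, so it never becomes a key.
function buildRelationships(text) {
	const words = text.split(/\s+/).filter(Boolean);
	const map = {};
	for (let i = 0; i < words.length - 1; i++) {
		const word = words[i];
		if (!Object.prototype.hasOwnProperty.call(map, word)) {
			map[word] = [];
		}
		// Assumption: repeated followers are kept, which would weight
		// common word pairs more heavily when picking at random.
		map[word].push(words[i + 1]);
	}
	return map;
}

const text = "This is an example. I don't actually know what to write here, so I'll just write some random example text.";
const relationships = buildRelationships(text);

console.log(relationships.write);       // [ 'here,', 'some' ]
console.log(relationships['example.']); // [ 'I' ]
```

Note that 'example.' maps to 'I', so the relationships run across sentence boundaries, and punctuation stays attached to the word it follows.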

Now, loading this object in as a save:

const psg = require('predictive-sentence-generator');
psg.load({
	type: 'save internal',
	content:
	{
	  This: [ 'is' ],
	  is: [ 'an' ],
	  an: [ 'example.' ],
	  'example.': [ 'I' ],
	  I: [ "don't" ],
	  "don't": [ 'actually' ],
	  actually: [ 'know' ],
	  know: [ 'what' ],
	  what: [ 'to' ],
	  to: [ 'write' ],
	  write: [ 'here,', 'some' ],
	  'here,': [ 'so' ],
	  so: [ "I'll" ],
	  "I'll": [ 'just' ],
	  just: [ 'write' ],
	  some: [ 'random' ],
	  random: [ 'example' ],
	  example: [ 'text.' ]
	}
});

console.log(psg.generate());

...produces this result:

I don't actually know what to write some random example text.

It's not very random, but that's related to how short the source is, not saving (see Common Issues below).
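
To see why a short source leaves so little room for variation, here is a rough sketch of how a sentence can be produced by walking the saved relationships. It is an illustration of the general technique (a random walk over word pairs), not the package's actual code; generateSentence, the injectable rng, and the maxWords guard are assumptions made for this example:

```javascript
const relationships = {
	This: ['is'],
	is: ['an'],
	an: ['example.'],
	'example.': ['I'],
	I: ["don't"],
	"don't": ['actually'],
	actually: ['know'],
	know: ['what'],
	what: ['to'],
	to: ['write'],
	write: ['here,', 'some'],
	'here,': ['so'],
	so: ["I'll"],
	"I'll": ['just'],
	just: ['write'],
	some: ['random'],
	random: ['example'],
	example: ['text.']
};

// Starts at a capitalized word and follows random links until a word
// ends with . ? or !. maxWords is a hypothetical guard against loops.
function generateSentence(map, rng = Math.random, maxWords = 100) {
	const pick = list => list[Math.floor(rng() * list.length)];
	const starts = Object.keys(map).filter(word => /^[A-Z]/.test(word));
	if (starts.length === 0) throw new Error('Could not find a starting point');

	const words = [pick(starts)];
	while (!/[.?!]$/.test(words[words.length - 1])) {
		const current = words[words.length - 1];
		if (!Object.prototype.hasOwnProperty.call(map, current)) {
			throw new Error(`Incomplete sentence at ${current}`);
		}
		if (words.length >= maxWords) {
			throw new Error(`Gave up after ${maxWords} words`);
		}
		words.push(pick(map[current]));
	}
	return words.join(' ');
}

console.log(generateSentence(relationships));          // varies per run
console.log(generateSentence(relationships, () => 0)); // This is an example.
```

In this map only the word 'write' has more than one follower, so almost every step of the walk is forced. That is why the results stay so close to the source text.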

Misc. Methods

getSourceText([formatted = true]): returns the text used as the source, if it was loaded as text; throws an error otherwise. When formatted is false, the method returns the words as an array; otherwise it returns a space-separated string.

Common Issues

| Issue/Error | Possible Cause | Solution |
| --- | --- | --- |
| Unoriginal sentences (generated sentences are in the source) | The source text is too short | Get a longer source text, or write a script to check for unoriginal sentences |
| Sentences are too short | If the source has a lot of 1-2 word sentences, these can get picked frequently | See the minLength property in Generate Settings |
| Error: Could not find a starting point | The source text has no capitalized words, which are interpreted as starting points | Use proper grammar (capitalize words), or get a different source |
| Error: Could not find an ending point | The source text has no end-of-sentence punctuation (. ? or !) | Again, use proper punctuation or get a different source |
| Error: Source not loaded | A method has been called before load() | See Load Settings |
| Error: Could not find source preset | The content property of load() has an invalid preset name | See Presets |
| Error: Incomplete sentence at (word) | The source text ended before the sentence was closed with punctuation | Check the end of the source text, add punctuation if needed |
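
For the first issue, a script to check for unoriginal sentences can be quite small. The sketch below is one possible approach, assuming a simple substring comparison is good enough for your source; isOriginal and normalize are illustrative names, not package methods:

```javascript
// Collapses runs of whitespace so line breaks in the source
// do not hide a match.
function normalize(text) {
	return text.replace(/\s+/g, ' ').trim();
}

// Returns true when the sentence does not appear verbatim in the source.
function isOriginal(sentence, sourceText) {
	return !normalize(sourceText).includes(normalize(sentence));
}

const source = "This is an example. I don't actually know what to write here, so I'll just write some random example text.";

console.log(isOriginal('This is an example.', source));
// false
console.log(isOriginal("I don't actually know what to write some random example text.", source));
// true
```

When the source was loaded as text, the sourceText argument could come from getSourceText(), and generate() could be called again whenever isOriginal returns false.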