0.0.1 • Published 10 years ago

simple_english v0.0.1

simple_english reduces the complexity of written English.

demo

#Justification

A working NLP library can be satisfactory and still be breathtakingly light. By Zipf's law:

The top 10 words account for 25% of our language.

The top 100 words account for 50% of our language.

The top 50,000 words account for 95% of our language.
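To check figures like these against a text of your own, the share of tokens covered by the most frequent words can be counted directly. The following is a minimal sketch, not part of simple_english: the function name `topWordCoverage`, the crude tokenizer, and the sample sentence are all assumptions made for illustration.

```javascript
// Share of all tokens covered by the n most frequent words in a text.
// Hypothetical helper, not part of the simple_english package.
function topWordCoverage(text, n) {
  // Crude tokenizer (assumption): lowercase runs of letters and apostrophes.
  const tokens = text.toLowerCase().match(/[a-z']+/g) || [];
  if (tokens.length === 0) return 0;

  // Count how often each distinct word occurs.
  const counts = new Map();
  for (const token of tokens) {
    counts.set(token, (counts.get(token) || 0) + 1);
  }

  // Sort the counts from most to least frequent and sum the top n.
  const sorted = [...counts.values()].sort((a, b) => b - a);
  const covered = sorted.slice(0, n).reduce((sum, c) => sum + c, 0);
  return covered / tokens.length;
}

const sample = "the cat and the dog and the bird saw the fish";
console.log(topWordCoverage(sample, 1)); // "the" is 4 of 11 tokens
console.log(topWordCoverage(sample, 2)); // "the" + "and" are 6 of 11 tokens
```

On a toy sentence the numbers mean little; the percentages quoted above only show up on a large corpus.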

The trade-offs for processing English are far more lopsided than the 80/20 rule suggests. On the Penn Treebank, for example, the following part-of-speech tagging results are possible:

  • choosing noun for every word: 33% correct
  • using a 1,000-word lexicon: 45% correct
  • using a 1,000-word lexicon, and falling back to nouns: 70% correct
  • using a 1,000-word lexicon, common suffix regexes, and falling back to nouns: 74% correct
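The last strategy in the list (lexicon first, then suffix patterns, then noun as the default) might look roughly like this. It is a sketch under stated assumptions, not the library's code: the tiny `LEXICON`, the `SUFFIX_RULES`, the plain-English tag names, and the functions `tagWord` and `tagSentence` are all hypothetical, and none of it reproduces the percentages above.

```javascript
// Hypothetical miniature lexicon; a real one would hold about 1,000 curated words.
const LEXICON = {
  the: "determiner",
  a: "determiner",
  they: "pronoun",
  went: "verb",
  is: "verb",
  at: "preposition",
  of: "preposition",
};

// Hypothetical suffix rules, checked in order.
const SUFFIX_RULES = [
  [/ly$/, "adverb"],
  [/ing$/, "verb"],
  [/ed$/, "verb"],
  [/(ous|ful|able)$/, "adjective"],
];

function tagWord(word) {
  const w = word.toLowerCase();
  // 1. Exact lexicon lookup.
  if (Object.prototype.hasOwnProperty.call(LEXICON, w)) return LEXICON[w];
  // 2. Common suffix patterns.
  for (const [pattern, tag] of SUFFIX_RULES) {
    if (pattern.test(w)) return tag;
  }
  // 3. Nothing matched: fall back to noun.
  return "noun";
}

function tagSentence(sentence) {
  const words = sentence.match(/[A-Za-z']+/g) || [];
  return words.map((word) => [word, tagWord(word)]);
}

console.log(tagSentence("They went quickly at the station"));
```

Each layer only handles what the cheaper layer before it missed, which is why a short word list plus a handful of regexes goes a long way.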

The process is to get curated data, find the patterns, and list the exceptions. Bada-bing, bada-BOOM.

#Usage

Server-side

npm install simple_english

var simple = require("simple_english") // assumption: the module exports the function directly
simple("well as a matter of fact, they went at full blast.")
//"well actually, they went at top speed"

Client-side

<script src="https://s3.amazonaws.com/spencermounta.in/simple_english/client_side/simple.min.js"></script>
<script>
  simple("well as a matter of fact, they went at full blast.")
  //"well actually, they went at top speed"
</script>
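
The examples above do not show how the substitution happens. One way to approximate that kind of behaviour is a curated phrase table applied with regular expressions. This is a rough sketch only, not how simple_english is implemented: `PHRASES`, `escapeRegExp`, and `simplifySketch` are hypothetical names, and the two table entries are taken from the example sentence.

```javascript
// Hypothetical phrase table: wordy phrase -> simpler replacement.
const PHRASES = {
  "as a matter of fact": "actually",
  "at full blast": "at top speed",
};

// Escape regex metacharacters so phrases are matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function simplifySketch(text) {
  // Try longer phrases first so they win over shorter overlapping ones.
  const keys = Object.keys(PHRASES).sort((a, b) => b.length - a.length);
  let out = text;
  for (const phrase of keys) {
    // Whole-phrase, case-insensitive match.
    const pattern = new RegExp("\\b" + escapeRegExp(phrase) + "\\b", "gi");
    out = out.replace(pattern, PHRASES[phrase]);
  }
  return out;
}

console.log(simplifySketch("well as a matter of fact, they went at full blast."));
// "well actually, they went at top speed."
```

Unlike the output shown in the usage examples, this sketch leaves the trailing period in place.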

Licence

go-fer-it.