jschardet v3.1.2

License
LGPL-2.1+

JsChardet

A port of Python's chardet (https://github.com/chardet/chardet).

License

LGPL

How To Use It

Node

npm install jschardet

var jschardet = require("jschardet")

// "àíàçã" in UTF-8
jschardet.detect("\xc3\xa0\xc3\xad\xc3\xa0\xc3\xa7\xc3\xa3")
// { encoding: "UTF-8", confidence: 0.9690625 }

// "次常用國字標準字體表" in Big5
jschardet.detect("\xa6\xb8\xb1\x60\xa5\xce\xb0\xea\xa6\x72\xbc\xd0\xb7\xc7\xa6\x72\xc5\xe9\xaa\xed")
// { encoding: "Big5", confidence: 0.99 }

// "<string>Martin Kühl</string>" in windows-1252 (the "ü" is the single byte 0xfc)
jschardet.detectAll("\x3c\x73\x74\x72\x69\x6e\x67\x3e\x4d\x61\x72\x74\x69\x6e\x20\x4b\xfc\x68\x6c\x3c\x2f\x73\x74\x72\x69\x6e\x67\x3e")
// [
//   {encoding: "windows-1252", confidence: 0.95},
//   {encoding: "ISO-8859-2", confidence: 0.8796300205763055},
//   {encoding: "SHIFT_JIS", confidence: 0.01}
// ]
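
The inputs in these examples are "binary strings": one character per byte, each with a code point between 0 and 255. As a minimal sketch, assuming only Node's built-in Buffer, such a string can be produced from ordinary text as follows. The helper name toBinaryString is hypothetical and is not part of jschardet.

```javascript
// Hypothetical helper (not part of jschardet): encode text into bytes and
// return a "binary string" with one character per byte (code points 0-255).
function toBinaryString(text, encoding) {
  // "latin1" maps each byte 0x00-0xFF to the character with the same code point.
  return Buffer.from(text, encoding).toString("latin1");
}

// "àíàçã", written with escapes so the source file encoding does not matter
var utf8Sample = toBinaryString("\u00e0\u00ed\u00e0\u00e7\u00e3", "utf8");
console.log(utf8Sample === "\xc3\xa0\xc3\xad\xc3\xa0\xc3\xa7\xc3\xa3"); // true
```

With jschardet installed, utf8Sample could then be passed to jschardet.detect in the same way as the literal strings in the examples.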

Browser

Copy and include jschardet.min.js in your web page.

This library is also available on cdnjs, for example at https://cdnjs.cloudflare.com/ajax/libs/jschardet/1.4.1/jschardet.min.js (this URL pins version 1.4.1).
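
Browsers have no Buffer, so raw bytes usually arrive as a Uint8Array. The sketch below shows one way to turn such bytes into the kind of binary string used in the Node examples. The helper name bytesToBinaryString is hypothetical and not part of jschardet.

```javascript
// Hypothetical helper (not part of jschardet): turn raw bytes into a
// "binary string" (one character per byte) without relying on Node's Buffer.
function bytesToBinaryString(bytes) {
  var chunkSize = 8192; // keep each String.fromCharCode call reasonably small
  var parts = [];
  for (var i = 0; i < bytes.length; i += chunkSize) {
    parts.push(String.fromCharCode.apply(null, bytes.subarray(i, i + chunkSize)));
  }
  return parts.join("");
}

// "Kühl" as windows-1252 bytes, taken from the detectAll example
var sample = bytesToBinaryString(new Uint8Array([0x4b, 0xfc, 0x68, 0x6c]));
console.log(sample === "\x4b\xfc\x68\x6c"); // true
```

In a page, the bytes could come from something like new Uint8Array(await file.arrayBuffer()). Passing the result to jschardet.detect assumes that the included script exposes a global named jschardet.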

Options

// See all information related to the confidence levels of each encoding.
// This is useful to see why you're not getting the expected encoding.
jschardet.enableDebug();

// The default minimum accepted confidence level is 0.20, but sometimes this is
// too strict, especially when dealing with files that consist mostly of numbers.
// Set it to 0 to always get a result, or to any other value that works for you.
jschardet.detect(str, { minimumThreshold: 0 });

// Restrict which encodings can be detected. This is useful when jschardet
// gives a higher probability to encodings that you never use.
jschardet.detect(str, { detectEncodings: ["UTF-8", "windows-1252"] });
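
One way to picture these two options is as filters over a list of candidates. The sketch below applies that idea by hand to an array shaped like the detectAll output shown earlier. It is an illustration only, not jschardet's internal implementation, and pickEncoding is a hypothetical name.

```javascript
// Hypothetical helper (not part of jschardet): choose the most confident
// candidate from an array of { encoding, confidence } objects.
function pickEncoding(candidates, allowedEncodings, minimumThreshold) {
  var best = null;
  candidates.forEach(function (candidate) {
    // A missing allow-list means every encoding is acceptable.
    var allowed = !allowedEncodings ||
      allowedEncodings.indexOf(candidate.encoding) !== -1;
    if (allowed && candidate.confidence >= minimumThreshold &&
        (best === null || candidate.confidence > best.confidence)) {
      best = candidate;
    }
  });
  return best; // null when nothing passes both filters
}

// Candidates copied from the detectAll example output
var candidates = [
  { encoding: "windows-1252", confidence: 0.95 },
  { encoding: "ISO-8859-2", confidence: 0.8796300205763055 },
  { encoding: "SHIFT_JIS", confidence: 0.01 }
];

console.log(pickEncoding(candidates, ["ISO-8859-2", "SHIFT_JIS"], 0.2));
// { encoding: 'ISO-8859-2', confidence: 0.8796300205763055 }
```

Here the most confident candidate, windows-1252, is excluded by the allow-list, and SHIFT_JIS falls below the threshold, so ISO-8859-2 is returned.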

Supported Charsets

  • Big5, GB2312/GB18030, EUC-TW, HZ-GB-2312, and ISO-2022-CN (Traditional and Simplified Chinese)
  • EUC-JP, SHIFT_JIS, and ISO-2022-JP (Japanese)
  • EUC-KR and ISO-2022-KR (Korean)
  • KOI8-R, MacCyrillic, IBM855, IBM866, ISO-8859-5, and windows-1251 (Russian)
  • ISO-8859-2 and windows-1250 (Hungarian)
  • ISO-8859-5 and windows-1251 (Bulgarian)
  • windows-1252
  • ISO-8859-7 and windows-1253 (Greek)
  • ISO-8859-8 and windows-1255 (Visual and Logical Hebrew)
  • TIS-620 (Thai)
  • UTF-32 BE, LE, 3412-ordered, or 2143-ordered (with a BOM)
  • UTF-16 BE or LE (with a BOM)
  • UTF-8 (with or without a BOM)
  • ASCII
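
The UTF-16 and UTF-32 entries depend on a byte order mark (BOM) at the start of the data. As a rough sketch of what a BOM check involves, the helper below compares the leading bytes against the standard signatures. The name sniffBom and the labels it returns are hypothetical; this is not jschardet's own code.

```javascript
// Hypothetical helper (not part of jschardet): identify a byte order mark.
// The four-byte signatures come first because FF FE 00 00 (UTF-32 LE) starts
// with FF FE (UTF-16 LE), and FE FF 00 00 (3412 order) starts with FE FF
// (UTF-16 BE).
var BOMS = [
  { label: "UTF-32 BE", bytes: [0x00, 0x00, 0xfe, 0xff] },
  { label: "UTF-32 LE", bytes: [0xff, 0xfe, 0x00, 0x00] },
  { label: "UTF-32 (3412-ordered)", bytes: [0xfe, 0xff, 0x00, 0x00] },
  { label: "UTF-32 (2143-ordered)", bytes: [0x00, 0x00, 0xff, 0xfe] },
  { label: "UTF-8", bytes: [0xef, 0xbb, 0xbf] },
  { label: "UTF-16 BE", bytes: [0xfe, 0xff] },
  { label: "UTF-16 LE", bytes: [0xff, 0xfe] }
];

function sniffBom(bytes) {
  for (var i = 0; i < BOMS.length; i++) {
    var signature = BOMS[i].bytes;
    var matches = bytes.length >= signature.length &&
      signature.every(function (value, index) { return bytes[index] === value; });
    if (matches) return BOMS[i].label;
  }
  return null; // no BOM: the content itself has to be analysed
}

console.log(sniffBom([0xef, 0xbb, 0xbf, 0x68, 0x69])); // UTF-8
console.log(sniffBom([0xff, 0xfe, 0x00, 0x00]));       // UTF-32 LE
console.log(sniffBom([0xff, 0xfe, 0x68, 0x00]));       // UTF-16 LE
```

Without a BOM, this check returns null, which is why UTF-8 is the only Unicode encoding in the list marked as detectable "with or without a BOM".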

Technical Information

I haven't been able to create tests to correctly detect:

  • ISO-2022-CN
  • windows-1250 in Hungarian
  • windows-1251 in Bulgarian
  • windows-1253 in Greek
  • EUC-CN

Development

Use npm run dist to update the distribution files. They're available at https://github.com/aadsm/jschardet/tree/master/dist.

Authors

Ported from Python to JavaScript by António Afonso (https://github.com/aadsm/jschardet)

Transformed into an npm package by Markus Ast (https://github.com/brainafk)
