2.0.0 • Published 1 year ago

chardet v2.0.0

License
MIT

chardet

Chardet is a character encoding detection module written in pure TypeScript. It uses occurrence analysis to determine the most probable encoding.

  • Packed size is only 22 KB
  • Works in all environments: Node / Browser / Native
  • Works on all platforms: Linux / Mac / Windows
  • No dependencies
  • No native code / bindings
  • 100% written in TypeScript
  • Extensive code coverage

Installation

npm i chardet

Usage

To return the encoding with the highest confidence:

import chardet from 'chardet';

// From an in-memory buffer
const encoding = chardet.detect(Buffer.from('hello there!'));
// or asynchronously from a file
const fileEncoding = await chardet.detectFile('/path/to/file');
// or synchronously from a file
const fileEncodingSync = chardet.detectFileSync('/path/to/file');

To return the full list of possible encodings, use the analyse method:

import chardet from 'chardet';
chardet.analyse(Buffer.from('hello there!'));

The returned value is an array of objects sorted by confidence in descending order:

[
  { confidence: 90, name: 'UTF-8' },
  { confidence: 20, name: 'windows-1252', lang: 'fr' },
];
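
Because the list is ranked, a caller may want to accept the top match only when its confidence is high enough. The following is a minimal sketch of that idea. The `Match` interface mirrors the sample output above, while `pickEncoding`, its `minConfidence` threshold of 50, and the `'UTF-8'` fallback are illustrative assumptions, not part of chardet. It also assumes the 0 to 100 confidence scale suggested by the sample.

```typescript
// Shape of one entry, as shown in the sample output above.
interface Match {
  confidence: number;
  name: string;
  lang?: string;
}

// Hypothetical helper (not part of chardet): return the best match at or
// above `minConfidence`, or `fallback` when nothing qualifies.
function pickEncoding(
  matches: Match[],
  minConfidence = 50,
  fallback = 'UTF-8',
): string {
  // Sort a copy so the helper does not depend on the input order.
  const sorted = [...matches].sort((a, b) => b.confidence - a.confidence);
  const best = sorted[0];
  return best !== undefined && best.confidence >= minConfidence
    ? best.name
    : fallback;
}

const sample: Match[] = [
  { confidence: 90, name: 'UTF-8' },
  { confidence: 20, name: 'windows-1252', lang: 'fr' },
];

console.log(pickEncoding(sample)); // prints "UTF-8"
```

In real use, the array passed to `pickEncoding` would be the result of `chardet.analyse(...)`.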

In the browser, you can use a Uint8Array instead of a Buffer:

import chardet from 'chardet';
chardet.analyse(new Uint8Array([0x68, 0x65, 0x6c, 0x6c, 0x6f]));
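
Once an encoding name is known, a typical next step is to turn the bytes into a string. The sketch below uses the standard `TextDecoder` API, which is available in browsers and in Node. `decodeBytes` is a hypothetical helper, not part of chardet. It assumes the detected name is a label the runtime's `TextDecoder` accepts, which may not hold for every name in the supported list, so it returns `null` rather than guessing when the label is rejected.

```typescript
// Hypothetical helper (not part of chardet): decode bytes using a detected
// encoding name. Returns null when the runtime's TextDecoder rejects the label.
function decodeBytes(bytes: Uint8Array, encodingName: string): string | null {
  try {
    return new TextDecoder(encodingName).decode(bytes);
  } catch {
    // TextDecoder throws for labels it does not support.
    return null;
  }
}

const hello = new Uint8Array([0x68, 0x65, 0x6c, 0x6c, 0x6f]);
console.log(decodeBytes(hello, 'UTF-8')); // prints "hello"
```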

Working with large data sets

When the data set is large and you want to trade some accuracy for performance, you can sample only the first N bytes:

chardet
  .detectFile('/path/to/file', { sampleSize: 32 })
  .then((encoding) => console.log(encoding));

You can also specify the offset at which reading begins:

chardet
  .detectFile('/path/to/file', { sampleSize: 32, offset: 128 })
  .then((encoding) => console.log(encoding));
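
The same windowing idea can be applied by hand to data that is already in memory. The sketch below shows what "at most `sampleSize` bytes starting at `offset`" means for a `Uint8Array`. `sampleWindow` is a hypothetical helper used only for illustration. It is not chardet's internal implementation, and the clamping behavior is an assumption of this sketch.

```typescript
// Hypothetical helper (not part of chardet): take at most `sampleSize` bytes
// starting at `offset`, clamped to the bounds of the input.
function sampleWindow(
  bytes: Uint8Array,
  sampleSize: number,
  offset = 0,
): Uint8Array {
  const start = Math.min(Math.max(offset, 0), bytes.length);
  const end = Math.min(start + Math.max(sampleSize, 0), bytes.length);
  // subarray returns a view on the same memory, so no bytes are copied.
  return bytes.subarray(start, end);
}

// 256 bytes holding the values 0..255.
const data = new Uint8Array(256).map((_, i) => i);
const sampled = sampleWindow(data, 32, 128);

console.log(sampled.length); // prints 32
console.log(sampled[0]); // prints 128
```

The resulting view could then be passed to `chardet.detect` or `chardet.analyse` in place of the full buffer.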

Supported Encodings:

  • UTF-8
  • UTF-16 LE
  • UTF-16 BE
  • UTF-32 LE
  • UTF-32 BE
  • ISO-2022-JP
  • ISO-2022-KR
  • ISO-2022-CN
  • Shift_JIS
  • Big5
  • EUC-JP
  • EUC-KR
  • GB18030
  • ISO-8859-1
  • ISO-8859-2
  • ISO-8859-5
  • ISO-8859-6
  • ISO-8859-7
  • ISO-8859-8
  • ISO-8859-9
  • windows-1250
  • windows-1251
  • windows-1252
  • windows-1253
  • windows-1254
  • windows-1255
  • windows-1256
  • KOI8-R

Currently only these encodings are supported.
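
When encoding names come from configuration or user input, it can help to check them against this list before comparing them with detection results. The sketch below copies the names listed above into a constant and derives a union type from it. `SUPPORTED_ENCODINGS`, `SupportedEncoding`, and `isSupportedEncoding` are illustrative names, not exports of chardet, and the check is an exact, case-sensitive match.

```typescript
// Names copied from the list above. Not exported by chardet.
const SUPPORTED_ENCODINGS = [
  'UTF-8', 'UTF-16 LE', 'UTF-16 BE', 'UTF-32 LE', 'UTF-32 BE',
  'ISO-2022-JP', 'ISO-2022-KR', 'ISO-2022-CN',
  'Shift_JIS', 'Big5', 'EUC-JP', 'EUC-KR', 'GB18030',
  'ISO-8859-1', 'ISO-8859-2', 'ISO-8859-5', 'ISO-8859-6',
  'ISO-8859-7', 'ISO-8859-8', 'ISO-8859-9',
  'windows-1250', 'windows-1251', 'windows-1252', 'windows-1253',
  'windows-1254', 'windows-1255', 'windows-1256',
  'KOI8-R',
] as const;

// Union of the literal names above.
type SupportedEncoding = (typeof SUPPORTED_ENCODINGS)[number];

// Hypothetical type guard: true only for an exact, case-sensitive match.
function isSupportedEncoding(name: string): name is SupportedEncoding {
  return (SUPPORTED_ENCODINGS as readonly string[]).indexOf(name) !== -1;
}

console.log(isSupportedEncoding('KOI8-R')); // prints true
console.log(isSupportedEncoding('UTF-7')); // prints false
```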

TypeScript?

Yes. Type definitions are included.
