drunicode v1.1.0
DrUnicode
DrUnicode is a heuristic utility for detecting and diagnosing common string corruption and encoding issues. It helps developers verify that strings are correctly encoded and displayed without unintended alterations. It works through a set of integrity checkers that detect common encoding errors and anomalies across various languages.
It is intended for situations where user engagement is declining because of string-related issues whose cause is hard to pinpoint. It is meant to run in production and provide real-time diagnostics when a problem is suspected, so that the application can log the issue or respond to it.
Features
- Detects double UTF-8 encoding anomalies across multiple languages including Spanish, French, Russian, Hebrew, Arabic, Japanese, Korean, and Chinese.
- Detects unexpected invisible characters that should not be present in the text.
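To give a feel for what an invisible-character check involves, here is a minimal sketch. It is not DrUnicode's implementation: the `INVISIBLE` pattern and the `findInvisible` helper are hypothetical names, and the code point list is an assumption chosen for illustration.

```javascript
// Hypothetical list of invisible code points (an assumption, not DrUnicode's list):
// zero-width space, zero-width non-joiner, zero-width joiner, word joiner,
// byte order mark / zero-width no-break space, and soft hyphen.
const INVISIBLE = /[\u200B-\u200D\u2060\uFEFF\u00AD]/gu;

// Return the position and code point of every invisible character found.
function findInvisible(text) {
  return Array.from(text.matchAll(INVISIBLE), (m) => ({
    index: m.index,
    codePoint:
      'U+' + m[0].codePointAt(0).toString(16).toUpperCase().padStart(4, '0'),
  }));
}

console.log(findInvisible('pay\u200Bpal')); // [ { index: 3, codePoint: 'U+200B' } ]
console.log(findInvisible('paypal'));       // []
```

Whether a given character is "unexpected" depends on context. The zero-width joiner and non-joiner, for example, are legitimate in emoji sequences and in scripts such as Persian, so a real checker would need to be more selective than this pattern.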
Future Features
- Detection of invalid use of bidirectional (Bidi) control characters.
- Identification of common confusable characters.
Installation
You can install DrUnicode via npm or yarn.
npm install drunicode
or
yarn add drunicode
Usage
Once installed, you can use DrUnicode to analyze strings for encoding issues.
Basic Example
import { DrUnicode } from 'drunicode';
const drUnicode = new DrUnicode();
const result = drUnicode.analyze("Let's go now!");
console.log(result); // Outputs: 'valid'
Detect Double UTF-8 Anomalies Example
const drUnicode = new DrUnicode();
const result = drUnicode.analyze("Â¡VÃ¡monos ahora mismo!");
console.log(result); // Outputs: 'invalid'
drUnicode.analyze("Ðавай пÑÑмо ÑейÑаÑ!", (invalidString, message) => {
  console.log('Message:', message); // Outputs: 'Double UTF-8 encoding corruption detected of Russian'
});
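For readers unfamiliar with how double UTF-8 encoding arises, the sketch below reproduces it and shows one simple way to recognize it. It is an illustration only and not how DrUnicode works internally: `doubleEncode`, `tryRepair`, and `looksDoubleEncoded` are hypothetical helpers, and the sketch assumes the bytes were misread as Latin-1 (in practice Windows-1252 is also a common culprit).

```javascript
// Simulate the bug: encode to UTF-8, then misread every byte as a Latin-1 character.
function doubleEncode(text) {
  const bytes = new TextEncoder().encode(text);
  return Array.from(bytes, (b) => String.fromCharCode(b)).join('');
}

// Try to undo it: turn each character back into a byte and decode strictly as UTF-8.
// Returns null when the string cannot be the result of such a misreading.
function tryRepair(text) {
  const codes = Array.from(text, (ch) => ch.codePointAt(0));
  if (codes.some((c) => c > 0xff)) return null; // contains non-Latin-1 characters
  try {
    return new TextDecoder('utf-8', { fatal: true }).decode(Uint8Array.from(codes));
  } catch {
    return null; // byte sequence is not valid UTF-8
  }
}

// Heuristic: suspicious if the repair succeeds and actually changes the string.
function looksDoubleEncoded(text) {
  const repaired = tryRepair(text);
  return repaired !== null && repaired !== text;
}

const broken = doubleEncode('¡Vámonos ahora mismo!');
console.log(broken);                                      // 'Â¡VÃ¡monos ahora mismo!'
console.log(looksDoubleEncoded(broken));                  // true
console.log(looksDoubleEncoded('¡Vámonos ahora mismo!')); // false
console.log(looksDoubleEncoded("Let's go now!"));         // false
```

Plain ASCII passes through unchanged, so it is never flagged. A correctly encoded accented string fails the strict UTF-8 decode and is not flagged either.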
Analyzing DOM for Invalid Strings
DrUnicode can also analyze the full content of a webpage by checking for invalid strings within the DOM.
const drUnicode = new DrUnicode();
drUnicode.analyzeDom((invalidString, nodeLocation, message) => {
  console.log('Invalid String:', invalidString);
  console.log('Node Location:', nodeLocation);
  console.log('Message:', message);
});
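How analyzeDom traverses the page is not described here. As a rough mental model, a DOM scan of this kind can be sketched as a recursive walk over text nodes. Everything below is an assumption for illustration: `scanTextNodes`, its `isSuspicious` predicate, and the location format are hypothetical and are not part of DrUnicode's API.

```javascript
const TEXT_NODE = 3; // same numeric value as Node.TEXT_NODE in browsers

// Walk a DOM-like tree and report every text node whose content the
// predicate flags, along with a simple path describing where it was found.
function scanTextNodes(root, isSuspicious, onInvalid, path = []) {
  if (root.nodeType === TEXT_NODE) {
    if (isSuspicious(root.nodeValue)) {
      onInvalid(root.nodeValue, path.join(' > '));
    }
    return;
  }
  const children = Array.from(root.childNodes || []);
  children.forEach((child, i) => {
    const label =
      child.nodeType === TEXT_NODE
        ? `#text[${i}]`
        : `${child.nodeName.toLowerCase()}[${i}]`;
    scanTextNodes(child, isSuspicious, onInvalid, [...path, label]);
  });
}
```

In a browser this could be called as `scanTextNodes(document.body, (s) => s.includes('Ã'), console.log)`, where the predicate is a deliberately crude stand-in for a real integrity check. Because the function only relies on `nodeType`, `nodeName`, `nodeValue`, and `childNodes`, it can also be exercised with plain objects outside a browser.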
Tests
The project includes a test suite. You can run it with:
npm run test