1.1.0 • Published 5 months ago

bitaboom v1.1.0

License
MIT
Repository
github

Bitaboom - A String Utilities Library

Bitaboom is a Node.js string utility library written in TypeScript. It provides a collection of string manipulation functions, many of them aimed at Arabic text and transliterated English. It supports the latest ESNext features and is tested with Vitest.

Demo

Installation

To install Bitaboom, use npm, yarn, or pnpm:

npm install bitaboom
# or
yarn add bitaboom
# or
pnpm i bitaboom

Usage

Import the library into your project:

import { functionName } from 'bitaboom';

Use any function from the API in your code:

const result = functionName('inputString');
console.log(result);

Available Functions

addSpaceBeforeAndAfterPunctuation

Adds spaces before and after punctuation marks except in specific cases like quoted text.

Example:

addSpaceBeforeAndAfterPunctuation('Text,word');
// Output: 'Text, word'

addSpaceBetweenArabicTextAndNumbers

Inserts spaces between Arabic text and numbers.

Example:

addSpaceBetweenArabicTextAndNumbers('الآية37');
// Output: 'الآية 37'
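
To illustrate the idea, here is a minimal sketch of how such spacing could be done with two regular expressions. The name addSpaceBetweenArabicAndDigitsSketch is hypothetical and this is not Bitaboom's source; it assumes "Arabic text" means letters in U+0621 to U+064A and "numbers" means ASCII digits, and that a space is wanted in both directions.

```typescript
// Hypothetical stand-in for illustration only, not the library's implementation.
function addSpaceBetweenArabicAndDigitsSketch(text: string): string {
  return text
    // Arabic letter immediately followed by an ASCII digit
    .replace(/([\u0621-\u064A])(\d)/g, '$1 $2')
    // ASCII digit immediately followed by an Arabic letter (assumed to be wanted too)
    .replace(/(\d)([\u0621-\u064A])/g, '$1 $2');
}

console.log(addSpaceBetweenArabicAndDigitsSketch('الآية37'));
```

Latin text such as 'abc123' is left untouched because neither pattern matches it.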

applySmartQuotes

Turns regular double quotes into smart quotes and fixes any incorrect starting quotes.

Example:

applySmartQuotes('The "quick brown" fox');
// Output: 'The “quick brown” fox'
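
The pairing step can be sketched as follows. This is an assumption about one way to do it, not Bitaboom's code: applySmartQuotesSketch is a hypothetical name, and the sketch only pairs up straight double quotes; it does not attempt the "incorrect starting quotes" repair described above.

```typescript
// Hypothetical sketch: pair straight double quotes and replace each pair.
function applySmartQuotesSketch(text: string): string {
  // The first quote of each pair becomes U+201C, the second U+201D.
  return text.replace(/"([^"]*)"/g, '\u201C$1\u201D');
}

console.log(applySmartQuotesSketch('The "quick brown" fox'));
```

An unpaired quote is left as it is, because the pattern needs both an opening and a closing mark.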

cleanExtremeArabicUnderscores

Removes Arabic underscores (tatweel, ـ) from the beginning or end of lines. It does not affect Hijri dates or certain Arabic terms.

Example:

cleanExtremeArabicUnderscores('ـThis is a textـ');
// Output: "This is a text"

cleanJunkFromText

Cleans unnecessary spaces and punctuation from text.

Example:

cleanJunkFromText('Some text !@#\nAnother line.');
// Output: 'Some text\nAnother line.'

cleanLiteralNewLines

Replaces literal \n sequences (a backslash followed by the letter n) with actual line breaks.

Example:

cleanLiteralNewLines('A\\nB');
// Output: 'A\nB'
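
The difference between a literal \n and a real line break is easy to miss in JavaScript source: a backslash followed by n has to be written as '\\n', while '\n' is already a line break. A minimal sketch, using the hypothetical name cleanLiteralNewLinesSketch (not the library's implementation):

```typescript
// Hypothetical sketch: turn the two-character sequence backslash + n into a line break.
function cleanLiteralNewLinesSketch(text: string): string {
  // In a regex literal, \\n matches a backslash followed by the letter n.
  return text.replace(/\\n/g, '\n');
}

const literalNewLineInput = 'A\\nB'; // four characters: A, backslash, n, B
console.log(literalNewLineInput.length); // 4
console.log(cleanLiteralNewLinesSketch(literalNewLineInput).length); // 3
```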

cleanMultilines

Removes leading and trailing spaces from each line in a multiline string.

Example:

cleanMultilines(' This is a line   \nAnother line   ');
// Output: 'This is a line\nAnother line'

cleanSymbolsAndPartReferences

Removes various symbols, part references, and numerical markers from the text.

Example:

cleanSymbolsAndPartReferences('(1) (2/3)');
// Output: ''

cleanTrailingPageNumbers

Removes trailing page numbers formatted as -[46]- from the text.

Example:

cleanTrailingPageNumbers('This is some -[46]- text');
// Output: 'This is some text'
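
A rough sketch of the marker removal, assuming the marker is always a hyphen, a bracketed run of ASCII digits, and another hyphen, optionally preceded by one space. The function name is hypothetical and the pattern is a guess, not the library's own.

```typescript
// Hypothetical sketch: drop -[digits]- markers together with one preceding space.
function cleanPageNumberMarkersSketch(text: string): string {
  return text.replace(/ ?-\[\d+\]-/g, '');
}

console.log(cleanPageNumberMarkersSketch('This is some -[46]- text'));
// prints: This is some text
```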

convertUrduSymbolsToArabic

Converts Urdu symbols like 'ھ' and 'ی' to their Arabic equivalents 'ه' and 'ي'.

Example:

convertUrduSymbolsToArabic('ھذا');
// Output: 'هذا'

extractInitials

Extracts initials from the input string, typically for names or titles.

Example:

extractInitials('Nayl al-Awtar');
// Output: 'NA'
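
One plausible way to get 'NA' from 'Nayl al-Awtar' is to drop a leading "al-" from each word before taking its first letter. The sketch below assumes exactly that rule; extractInitialsSketch is a hypothetical name and the real function may treat other prefixes or diacritics differently.

```typescript
// Hypothetical sketch: first letter of each word, ignoring a leading "al-".
function extractInitialsSketch(title: string): string {
  return title
    .split(/\s+/)
    .map((word) => word.replace(/^al-/i, '')) // assumed prefix rule
    .filter((word) => word.length > 0)
    .map((word) => word.charAt(0).toUpperCase())
    .join('');
}

console.log(extractInitialsSketch('Nayl al-Awtar')); // NA
```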

fixTrailingWow

Joins a detached "و" (waw) to the word that follows it, as in greetings and similar phrases.

Example:

fixTrailingWow('السلام عليكم و رحمة');
// Output: 'السلام عليكم ورحمة'

hasWordInSingleLine

Checks whether any line of the text consists of a single word by itself.

Example:

hasWordInSingleLine('Abc efg\nhij\nklmn opq');
// Output: true (since "hij" is by itself)
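
The check can be sketched as "split on line breaks, then look for a line that contains no inner whitespace". The name hasWordInSingleLineSketch is hypothetical, and treating any run of non-space characters as a "word" is an assumption.

```typescript
// Hypothetical sketch: true if some line is exactly one run of non-space characters.
function hasWordInSingleLineSketch(text: string): boolean {
  return text.split('\n').some((line) => /^\S+$/.test(line.trim()));
}

console.log(hasWordInSingleLineSketch('Abc efg\nhij\nklmn opq')); // true
console.log(hasWordInSingleLineSketch('Abc efg\nklmn opq')); // false
```

Empty lines do not count, since the pattern requires at least one character.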

insertLineBreaksAfterPunctuation

Adds line breaks after punctuation marks such as periods, exclamation points, and question marks.

Example:

insertLineBreaksAfterPunctuation('Text.');
// Output: 'Text.\n'
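
A simple version of this behaviour, assuming only ".", "!" and "?" count and that any whitespace after the mark is replaced by the line break. The function name is hypothetical, and the sketch makes no attempt to protect decimals such as "1.5" or ellipses.

```typescript
// Hypothetical sketch: put a line break after each sentence-ending mark.
function insertLineBreaksAfterPunctuationSketch(text: string): string {
  // Whitespace that follows the mark is swallowed so lines do not start with a space.
  return text.replace(/([.!?])\s*/g, '$1\n');
}

console.log(JSON.stringify(insertLineBreaksAfterPunctuationSketch('Text.')));
// prints: "Text.\n"
```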

isJsonStructureValid

Checks if a given string resembles a JSON object with numeric or quoted keys and values that are single or double quoted. Useful for detecting malformed JSON-like structures that can be fixed.

Example:

isJsonStructureValid("{10: 'abc', 'key': 'value'}");
// Output: true

isOnlyPunctuation

Checks if the input string consists only of punctuation characters.

Example:

isOnlyPunctuation('!?');
// Output: true
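
What counts as "punctuation" is not spelled out here, so the sketch below makes an explicit assumption: the ASCII punctuation and symbol ranges plus the Arabic comma, semicolon, and question mark. Both names are hypothetical.

```typescript
// Hypothetical sketch. Assumed character set: ASCII 0x21-0x2F, 0x3A-0x40,
// 0x5B-0x60, 0x7B-0x7E, plus U+060C, U+061B and U+061F.
const PUNCTUATION_ONLY_SKETCH = /^[!-\/:-@\[-`{-~\u060C\u061B\u061F]+$/;

function isOnlyPunctuationSketch(text: string): boolean {
  return PUNCTUATION_ONLY_SKETCH.test(text);
}

console.log(isOnlyPunctuationSketch('!?')); // true
console.log(isOnlyPunctuationSketch('a!')); // false
```

Under this assumption an empty string and a string containing spaces both return false.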

normalizeAlifVariants

Simplifies all forms of 'alif' (أ, إ, and آ) to the basic 'ا'.

Example:

normalizeAlifVariants('أنا إلى الآفاق');
// Output: 'انا الى الافاق'

normalizeApostrophes

Replaces various apostrophe characters like ‛, ’, and ‘ with the standard apostrophe (').

Example:

normalizeApostrophes('‛ulama’ al-su‘');
// Output: "'ulama' al-su'"

normalizeArabicPrefixesToAl

Replaces common Arabic prefixes like 'Al-', 'Ar-', 'Ash-', etc., with 'al-' in the text. It handles different variations of prefixes but does not modify cases where the second word does not start with 'S'.

Example:

normalizeArabicPrefixesToAl('Ash-Shafiee');
// Output: 'al-Shafiee'

normalizeDoubleApostrophes

Collapses doubled Arabic transliteration apostrophes such as ʿʿ or ʾʾ into a single one.

Example:

normalizeDoubleApostrophes('ʿulamāʾʾ');
// Output: 'ʿulamāʾ'
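
A backreference is enough to express "the same mark repeated". The sketch assumes the two marks are U+02BF (ʿ) and U+02BE (ʾ); the function name is hypothetical and this is not the library's code.

```typescript
// Hypothetical sketch: collapse a repeated ʿ or ʾ into one occurrence.
function normalizeDoubleApostrophesSketch(text: string): string {
  // \1 refers back to whichever of the two marks was captured.
  return text.replace(/([\u02BE\u02BF])\1+/g, '$1');
}

console.log(normalizeDoubleApostrophesSketch('\u02BFulam\u0101\u02BE\u02BE'));
```

A mixed pair such as ʿʾ is left alone, since the backreference only matches a repeat of the same character.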

normalizeJsonSyntax

Converts a string that resembles JSON but has numeric keys and single-quoted values into valid JSON format. The function replaces numeric keys with quoted numeric keys and ensures all values are double-quoted, as required by JSON.

Example:

normalizeJsonSyntax("{10: 'abc', 20: 'def'}");
// Output: '{"10": "abc", "20": "def"}'
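
The two rewrites described above (quote numeric keys, then turn single quotes into double quotes) can be sketched with two regular expressions. This is an illustration under simplifying assumptions, not Bitaboom's implementation: normalizeJsonSyntaxSketch is a hypothetical name, and values that themselves contain quotes, commas followed by digits, or colons are not handled.

```typescript
// Hypothetical sketch of the two-step rewrite.
function normalizeJsonSyntaxSketch(input: string): string {
  return input
    // Step 1: a bare number after "{" or "," and before ":" becomes a quoted key.
    .replace(/([{,]\s*)(\d+)(\s*:)/g, '$1"$2"$3')
    // Step 2: single-quoted keys and values become double-quoted.
    .replace(/'([^']*)'/g, '"$1"');
}

const normalizedJsonText = normalizeJsonSyntaxSketch("{10: 'abc', 20: 'def'}");
console.log(normalizedJsonText); // {"10": "abc", "20": "def"}
console.log(JSON.parse(normalizedJsonText)['10']); // abc
```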

normalizeTransliteratedEnglish

Simplifies English transliterations by removing diacritics, apostrophes, and common prefixes.

Example:

normalizeTransliteratedEnglish('Al-Jadwāl');
// Output: 'Jadwal'

normalize

Normalizes the text by removing diacritics, apostrophes, and dashes.

Example:

normalize('Al-Jadwāl');
// Output: 'AlJadwal'
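
As an illustration of the kind of mapping involved, the sketch below uses a small hand-written table of precomposed transliteration letters and then strips apostrophes and dashes. The table, the names, and the set of apostrophe characters are all assumptions made for the example; input that uses combining marks would need Unicode decomposition (String.prototype.normalize('NFD')) first, which is left out here.

```typescript
// Hypothetical lookup table, deliberately small; not the library's data.
const LATIN_DIACRITIC_TABLE_SKETCH: { [ch: string]: string } = {
  '\u0100': 'A', '\u0101': 'a', // A and a with macron
  '\u012A': 'I', '\u012B': 'i', // I and i with macron
  '\u016A': 'U', '\u016B': 'u', // U and u with macron
  '\u1E24': 'H', '\u1E25': 'h', // H and h with dot below
  '\u1E62': 'S', '\u1E63': 's', // S and s with dot below
};

function normalizeSketch(text: string): string {
  return text
    // Replace known accented letters; unknown ones are kept as they are.
    .replace(/[\u0100-\u017F\u1E00-\u1EFF]/g, (ch) => LATIN_DIACRITIC_TABLE_SKETCH[ch] || ch)
    // Drop straight and curly apostrophes, the ʿ and ʾ marks, and hyphens.
    .replace(/['\u2018\u2019\u02BE\u02BF-]/g, '');
}

console.log(normalizeSketch('Al-Jadw\u0101l')); // AlJadwal
```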

removeArabicPrefixes

Strips common Arabic prefixes like 'al-', 'bi-', 'fī', 'wa-', etc., from the beginning of words.

Example:

removeArabicPrefixes('al-Bukhari');
// Output: 'Bukhari'

removeDeathYear

Removes death year references like "(d. 390H)" and "d. 100h" from the text.

Example:

removeDeathYear('Sufyān ibn ‘Uyaynah (d. 198h)');
// Output: 'Sufyān ibn ‘Uyaynah'
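
Both forms mentioned above, with and without parentheses, fit one pattern. The sketch assumes the reference is always "d." followed by ASCII digits and an "h" in either case; removeDeathYearSketch is a hypothetical name.

```typescript
// Hypothetical sketch: remove "(d. 198h)" or "d. 100h", plus the whitespace before it.
function removeDeathYearSketch(text: string): string {
  // \b keeps words that merely end in "d." (for example "said.") from matching.
  return text.replace(/\s*\(?\bd\.\s*\d+\s*h\)?/gi, '');
}

console.log(removeDeathYearSketch('Sufyān ibn ‘Uyaynah (d. 198h)'));
console.log(removeDeathYearSketch('Malik d. 100h'));
// prints: Malik
```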

removeNonIndexSignatures

Removes single-digit numbers and dashes from Arabic text but preserves numbers used as indexes.

Example:

removeNonIndexSignatures('الورقه 3 المصدر');
// Output: 'الورقه المصدر'

removeNumbersAndDashes

Removes numeric digits and dashes from the text.

Example:

removeNumbersAndDashes('ABC 123-Xyz');
// Output: 'ABC Xyz'

removeSingleDigitReferences

Removes single-digit references like (1), «2», and [3] from the text.

Example:

removeSingleDigitReferences('Ref (1), Ref «2», Ref [3]');
// Output: 'Ref , Ref , Ref '
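
The output above keeps the surrounding spaces and commas, which matches a pattern that removes only the bracketed digit itself. A sketch under that assumption (hypothetical name, not the library's code); note that it does not check that the opening and closing brackets are of the same kind.

```typescript
// Hypothetical sketch: opening bracket, exactly one ASCII digit, closing bracket.
function removeSingleDigitReferencesSketch(text: string): string {
  // \u00AB and \u00BB are the guillemets « and ».
  return text.replace(/[(\u00AB\[]\d[)\u00BB\]]/g, '');
}

console.log(JSON.stringify(removeSingleDigitReferencesSketch('Ref (1), Ref \u00AB2\u00BB, Ref [3]')));
// prints: "Ref , Ref , Ref "
```

Multi-digit references such as (12) are left in place because the pattern allows exactly one digit.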

removeSingularCodes

Removes Arabic letters or Arabic-Indic numerals enclosed in square brackets or parentheses.

Example:

removeSingularCodes('[س]');
// Output: ''

removeSolitaryArabicLetters

Removes solitary Arabic letters unless they are 'ha' used in Hijri years.

Example:

removeSolitaryArabicLetters('ب ا الكلمات ت');
// Output: 'ا الكلمات'

removeTatwil

Removes tatweel characters from Arabic text while preserving the Hijri years.

Example:

removeTatwil('أبـــتِـــكَةُ');
// Output: 'أبتِكَةُ'

removeUrls

Removes URLs from the text.

Example:

removeUrls('Visit https://example.com');
// Output: 'Visit '

replaceAlifMaqsurah

Replaces 'alif maqsurah' (ى) with 'ya' (ي).

Example:

replaceAlifMaqsurah('رؤيى');
// Output: 'رؤيي'

replaceEnglishPunctuationWithArabic

Replaces English punctuation marks (e.g., ? and ;) with their Arabic equivalents.

Example:

replaceEnglishPunctuationWithArabic('This; and, that?');
// Output: 'This؛and، that؟'

replaceLineBreaksWithSpaces

Replaces consecutive line breaks and whitespace characters with a single space.

Example:

replaceLineBreaksWithSpaces('a\nb');
// Output: 'a b'

replaceSalutationsWithSymbol

Replaces common salutations like "sallahu alayhi wasallam" with "ﷺ". Handles variations like 'peace and blessings be upon him'.

Example:

replaceSalutationsWithSymbol('Then Muḥammad (sallahu alayhi wasallam)');
// Output: 'Then Muḥammad ﷺ'
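
A minimal sketch of the replacement, covering only the two phrasings quoted above and optional surrounding parentheses. The pattern and names are assumptions for illustration; the library presumably recognises more spellings.

```typescript
// Hypothetical sketch: only two phrasings are recognised here.
const SALUTATION_PATTERN_SKETCH =
  /\(?(?:sallahu alayhi wasallam|peace and blessings be upon him)\)?/gi;

function replaceSalutationsWithSymbolSketch(text: string): string {
  // U+FDFA is the single-character ligature ﷺ.
  return text.replace(SALUTATION_PATTERN_SKETCH, '\uFDFA');
}

console.log(replaceSalutationsWithSymbolSketch('Then Muḥammad (sallahu alayhi wasallam)'));
```

Whitespace around the phrase is left untouched, so the symbol stays separated from the neighbouring words.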

replaceTaMarbutahWithHa

Replaces 'ta marbutah' (ة) with 'ha' (ه).

Example:

replaceTaMarbutahWithHa('مدرسة');
// Output: 'مدرسه'

splitByQuotes

Splits a string by spaces but keeps quoted substrings intact. Substrings enclosed in double quotes are treated as a single part.

Example:

splitByQuotes('"This is" "a part of the" "string and"');
// Output: ["This is", "a part of the", "string and"]
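
One common way to implement this kind of split is to match tokens instead of splitting on separators: either a double-quoted run or a run of non-space characters. The sketch below takes that approach; splitByQuotesSketch is a hypothetical name and escaped quotes inside a quoted part are not supported.

```typescript
// Hypothetical sketch: match quoted runs first, then bare words.
function splitByQuotesSketch(text: string): string[] {
  const tokens: string[] = text.match(/"[^"]*"|\S+/g) || [];
  // Strip the surrounding quotes from each token.
  return tokens.map((token) => token.replace(/^"|"$/g, ''));
}

console.log(splitByQuotesSketch('find "exact phrase" here'));
// prints: [ 'find', 'exact phrase', 'here' ]
```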

stripAllDigits

Removes all numeric digits from the text.

Example:

stripAllDigits('abc123');
// Output: 'abc'

stripDiacritics

Removes Arabic diacritics (tashkeel) and the elongation character (ـ).

Example:

stripDiacritics('مُحَمَّدٌ');
// Output: 'محمد'
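
In Unicode terms, the common tashkeel marks sit in the range U+064B to U+0652 and the elongation character is U+0640, so a single character class covers the description above. This is a sketch under that assumption with a hypothetical name; the library may strip additional marks.

```typescript
// Hypothetical sketch: remove tanwin, short vowels, shadda, sukun and tatweel.
function stripDiacriticsSketch(text: string): string {
  return text.replace(/[\u064B-\u0652\u0640]/g, '');
}

console.log(stripDiacriticsSketch('مُحَمَّدٌ'));
```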

stripEnglishCharactersAndSymbols

Removes English letters and symbols from the text.

Example:

stripEnglishCharactersAndSymbols('أحب & لنفسي');
// Output: 'أحب   لنفسي'

stripZeroWidthCharacters

Removes zero-width characters like ZWJ and other invisible characters.

Example:

stripZeroWidthCharacters('يَخْلُوَ ‏.');
// Output: 'يَخْلُوَ .'

1.1.0

5 months ago

1.0.0

10 months ago