1.1.5 • Published 7 months ago

custom-search-lib v1.1.5


Custom Search Library

Custom Search Library is a JavaScript/TypeScript library for implementing advanced search functionality. It supports fuzzy search, ranked results, prefix/suffix matching, wildcard queries, advanced filtering, sorting, and faceted search. It also normalizes special characters across multiple languages. All features are configurable to suit different use cases.

Features

  • Fuzzy Search: Finds results based on Levenshtein distance with configurable thresholds.
  • Ranked Fuzzy Search: Sorts results by relevance using a scoring mechanism.
  • Wildcard Search: Allows pattern matching using wildcards (*).
  • Prefix and Suffix Search: Matches strings that start or end with the query.
  • Advanced Filtering: Apply exact, range, or multi-value filters with support for AND/OR logic.
  • Sorting: Multi-level sorting with locale-aware string comparison.
  • Faceted Search: Generates facets (categories) with counts for better insights into datasets.
  • Highly Configurable: Supports case sensitivity, custom thresholds, and multi-field operations.
  • Language-Specific Normalization: Handles special characters for languages like Swedish, Danish, Norwegian, Turkish, French, Spanish, Polish, Czech, Slovak, Hungarian, Greek, and more.
  • Fuzzy Search for Persian: A dedicated fuzzy search function tailored for Persian text.
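
To make the fuzzy and ranked search features concrete, here is a minimal sketch of how Levenshtein-based matching with a threshold could work. The names `levenshtein`, `sketchFuzzySearch`, and `sketchRankedFuzzySearch` are hypothetical, and case-insensitive comparison with a default threshold of 2 is an assumption. This is not the library's actual implementation or scoring mechanism.

```typescript
// Hypothetical helpers for illustration; not the library's internals.

// Classic dynamic-programming Levenshtein distance (insert, delete, substitute).
function levenshtein(a: string, b: string): number {
  const rows = a.length + 1;
  const cols = b.length + 1;
  // dp[i][j] = edits needed to turn the first i chars of a into the first j chars of b
  const dp: number[][] = Array.from({ length: rows }, () => new Array<number>(cols).fill(0));
  for (let i = 0; i < rows; i++) dp[i][0] = i;
  for (let j = 0; j < cols; j++) dp[0][j] = j;
  for (let i = 1; i < rows; i++) {
    for (let j = 1; j < cols; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      dp[i][j] = Math.min(dp[i - 1][j] + 1, dp[i][j - 1] + 1, dp[i - 1][j - 1] + cost);
    }
  }
  return dp[a.length][b.length];
}

// Keep every item whose distance to the query is within the threshold.
function sketchFuzzySearch(query: string, items: string[], threshold = 2): string[] {
  const q = query.toLowerCase();
  return items.filter(item => levenshtein(q, item.toLowerCase()) <= threshold);
}

// Same filter, but closest matches first (assumed ranking: smaller distance = more relevant).
function sketchRankedFuzzySearch(query: string, items: string[], threshold = 2): string[] {
  const q = query.toLowerCase();
  return items
    .map(item => ({ item, distance: levenshtein(q, item.toLowerCase()) }))
    .filter(entry => entry.distance <= threshold)
    .sort((x, y) => x.distance - y.distance)
    .map(entry => entry.item);
}

const productNames = ['Bicycle', 'Bike', 'Bicycles', 'Tricycle', 'Motorcycle'];
console.log(sketchFuzzySearch('bicycle', productNames));        // ['Bicycle', 'Bicycles', 'Tricycle']
console.log(sketchRankedFuzzySearch('bicycles', productNames)); // ['Bicycles', 'Bicycle']
```

Raising the threshold widens the match set at the cost of more false positives; 'Bike' is four edits away from 'bicycle', so it only appears with a threshold of 4 or more.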

Installation

Install the library via npm:

npm install custom-search-lib

New Language Support Feature

The library now includes support for special characters from multiple languages, including:

  • German: Handles characters like ß, ä, ö, and ü.
  • Swedish, Danish, Norwegian: Normalizes ä, å, ö, ø, and æ.
  • Turkish: Converts ç, ğ, ı, ş, and ü.
  • French: Handles œ, é, è, ê, ë, à, â, ù, û, î, ï, and ç.
  • Spanish: Processes ñ, á, í, ó, and ú.
  • Polish: Normalizes ą, ć, ę, ł, ń, ó, ś, ź, and ż.
  • Czech and Slovak: Handles č, ď, ě, ň, ř, š, ť, ů, and ž.
  • Hungarian: Converts á, é, í, ó, ö, ő, ú, ü, and ű.
  • Greek: Provides transliterations for Greek characters, including α to ω.
  • Arabic and Persian:
    • Arabic: Normalizes أ, إ, آ, ؤ, ئ, ة, and ى to their standard forms.
    • Persian: Converts ي, ك, ۀ, پ, چ, ژ, and گ to their standardized forms.
  • Others: Includes mappings for characters like ý, đ, and ħ.

Example Usage of Language Normalization:

import { normalizeText } from 'custom-search-lib';

const normalized = normalizeText('Göteborg, Ærø, Crème brûlée, Ελληνικά');
console.log(normalized); // Output: 'goteborg, aero, creme brulee, ellinika'
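
For readers curious how this kind of normalization can be done, the sketch below combines Unicode decomposition with a small lookup table. The function name `sketchNormalizeText` and the `SPECIAL_CHARS` table are hypothetical and cover only a handful of Latin-script characters; Greek, Arabic, and Persian handling is omitted. The library's real mapping table is not shown in this README and may differ.

```typescript
// Illustrative subset only (assumption); characters here have no Unicode decomposition,
// so they need an explicit replacement.
const SPECIAL_CHARS: Record<string, string> = {
  'ß': 'ss', 'æ': 'ae', 'œ': 'oe', 'ø': 'o', 'ł': 'l', 'đ': 'd', 'ħ': 'h', 'ı': 'i',
};

function sketchNormalizeText(text: string): string {
  return text
    .toLowerCase()
    .normalize('NFD')                   // split letters such as 'ö' into 'o' + combining mark
    .replace(/[\u0300-\u036f]/g, '')    // drop the combining diacritical marks
    .replace(/[ßæœøłđħı]/g, ch => SPECIAL_CHARS[ch] ?? ch);
}

console.log(sketchNormalizeText('Göteborg, Ærø, Crème brûlée, Łódź'));
// 'goteborg, aero, creme brulee, lodz'
```

Normalizing both the query and the indexed strings with the same function lets 'goteborg' match 'Göteborg' in any of the search modes.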

React TypeScript Example Implementation

  1. SearchDemo Component

The main component demonstrates the use of various search functionalities, including filtering, sorting, and faceted search.

import { useState } from 'react';
import { 
  fuzzySearch, 
  rankedFuzzySearch, 
  fuzzySearchPersian, 
  rankedFuzzySearchPersian, 
  prefixSearch, 
  suffixSearch, 
  wildcardSearch,
  applyFilters,
  sortData,
  generateFacets,
} from 'custom-search-lib'; 
import { SearchResults } from './types/SearchResults'; 
import { mockData } from './mockData/mockData'; 

const SearchDemo = () => {
  const [query, setQuery] = useState(''); 
  const [filters] = useState({ category: 'Books' }); // Example filter
  const [sortConfig] = useState<{ field: string; order: 'asc' | 'desc' }[]>([{ field: 'price', order: 'asc' }]);
  const [results, setResults] = useState<SearchResults>({
    fuzzy: [],
    ranked: [],
    persianFuzzy: [],
    persianRanked: [],
    prefix: [],
    suffix: [],
    wildcard: [],
    filtered: [],
    sorted: [],
    facets: {},
  });

  const handleSearch = () => {
    // Extract the names once instead of re-mapping mockData for every search call.
    const names = mockData.map(item => item.name);

    const fuzzyResults = fuzzySearch(query, names);
    const rankedResults = rankedFuzzySearch(query, names);
    // The Persian variants run on the same names here; pass Persian strings for meaningful matches.
    const persianFuzzyResults = fuzzySearchPersian(query, names, { threshold: 2 });
    const persianRankedResults = rankedFuzzySearchPersian(query, names, { threshold: 2 });
    const prefixResults = prefixSearch(query, names);
    const suffixResults = suffixSearch(query, names);
    const wildcardResults = wildcardSearch(query, names);
    const filteredResults = applyFilters(mockData, filters);
    const sortedResults = sortData(mockData, sortConfig);
    const facets = generateFacets(mockData, ['category']);

    setResults({
      fuzzy: fuzzyResults,
      ranked: rankedResults,
      persianFuzzy: persianFuzzyResults,
      persianRanked: persianRankedResults,
      prefix: prefixResults,
      suffix: suffixResults,
      wildcard: wildcardResults,
      filtered: filteredResults,
      sorted: sortedResults,
      facets: facets,
    });
  };

  return (
    <div style={{ padding: '20px' }}>
      <h1>Search Demo</h1>
      <input
        type="text"
        value={query}
        onChange={(e) => setQuery(e.target.value)}
        placeholder="Enter search query"
      />
      <button onClick={handleSearch}>Search</button>

      {/* Filtering Example */}
      <div>
        <h3>Filtered Results:</h3>
        <ul>
          {results.filtered.map((item, index) => (
            <li key={index}>{item.name} - ${item.price}</li>
          ))}
        </ul>
      </div>

      {/* Sorting Example */}
      <div>
        <h3>Sorted Results:</h3>
        <ul>
          {results.sorted.map((item, index) => (
            <li key={index}>{item.name} - ${item.price}</li>
          ))}
        </ul>
      </div>

      {/* Faceted Search Example */}
      <div>
        <h3>Facets:</h3>
        <ul>
          {Object.entries(results.facets.category || {}).map(([key, count]) => (
            <li key={key}>{key}: {count}</li>
          ))}
        </ul>
      </div>

      {/* Other Search Results */}
      <div>
        <h3>Fuzzy Search:</h3>
        <ul>
          {results.fuzzy.map((item, index) => <li key={index}>{item}</li>)}
        </ul>

        <h3>Ranked Fuzzy Search:</h3>
        <ul>
          {results.ranked.map((item, index) => <li key={index}>{item}</li>)}
        </ul>

        <h3>Persian Fuzzy Search:</h3>
        <ul>
          {results.persianFuzzy.map((item, index) => <li key={index}>{item}</li>)}
        </ul>

        <h3>Persian Ranked Fuzzy Search:</h3>
        <ul>
          {results.persianRanked.map((item, index) => <li key={index}>{item}</li>)}
        </ul>

        <h3>Prefix Search:</h3>
        <ul>
          {results.prefix.map((item, index) => <li key={index}>{item}</li>)}
        </ul>

        <h3>Suffix Search:</h3>
        <ul>
          {results.suffix.map((item, index) => <li key={index}>{item}</li>)}
        </ul>

        <h3>Wildcard Search:</h3>
        <ul>
          {results.wildcard.map((item, index) => <li key={index}>{item}</li>)}
        </ul>
      </div>
    </div>
  );
};

export default SearchDemo;
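
The component above passes the raw query to wildcardSearch. As a rough idea of how a `*` pattern can be matched, the sketch below translates the pattern into an anchored regular expression. The helper names `wildcardToRegExp` and `sketchWildcardSearch` are hypothetical, and case-insensitive, whole-string matching is an assumption rather than documented library behavior.

```typescript
// Hypothetical helpers; the library's wildcardSearch may behave differently.

// Turn a pattern such as '*cycle' into /^.*cycle$/i, escaping all other regex characters.
function wildcardToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split('*')
    .map(part => part.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp(`^${escaped}$`, 'i');
}

function sketchWildcardSearch(pattern: string, items: string[]): string[] {
  const matcher = wildcardToRegExp(pattern);
  return items.filter(item => matcher.test(item));
}

const catalog = ['Bicycle', 'Bike', 'Bicycles', 'Tricycle', 'Motorcycle', 'Hello@World'];
console.log(sketchWildcardSearch('*cycle', catalog)); // ['Bicycle', 'Tricycle', 'Motorcycle']
console.log(sketchWildcardSearch('bi*', catalog));    // ['Bicycle', 'Bike', 'Bicycles']
```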

  2. SearchResults Interface

Defines the structure for managing search results.

export interface SearchResults {
  fuzzy: string[];
  ranked: string[];
  persianFuzzy: string[];
  persianRanked: string[];
  prefix: string[];
  suffix: string[];
  wildcard: string[];
  filtered: any[];
  sorted: any[];
  facets: { [key: string]: { [value: string]: number } };
}
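
The `facets` field maps each faceted field to a table of value counts. The sketch below shows one way such a structure could be computed. `sketchGenerateFacets` is a hypothetical stand-in, not the library's generateFacets, and skipping missing values is an assumption.

```typescript
type Facets = { [key: string]: { [value: string]: number } };

// Hypothetical counterpart to generateFacets: count how often each value occurs per field.
function sketchGenerateFacets(data: Array<Record<string, unknown>>, fields: string[]): Facets {
  const facets: Facets = {};
  for (const field of fields) {
    const counts: { [value: string]: number } = {};
    for (const row of data) {
      const raw = row[field];
      if (raw === undefined || raw === null) continue; // assumption: missing values are not counted
      const value = String(raw);
      counts[value] = (counts[value] ?? 0) + 1;
    }
    facets[field] = counts;
  }
  return facets;
}

const sampleRows = [
  { name: 'Bicycle', category: 'Sports', price: 150 },
  { name: 'Bike', category: 'Sports', price: 200 },
  { name: 'Laptop', category: 'Electronics', price: 1000 },
];
console.log(sketchGenerateFacets(sampleRows, ['category']));
// { category: { Sports: 2, Electronics: 1 } }
```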

  3. Mock Data

Provides a large dataset for testing the search functionalities.

const generateRandomString = (length: number): string => {
  const characters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
  let result = '';
  for (let i = 0; i < length; i++) {
    result += characters.charAt(Math.floor(Math.random() * characters.length));
  }
  return result;
};

const generateLargeDataset = (size: number): Array<{ name: string; category: string; price: number }> => {
  const categories = ['Books', 'Electronics', 'Clothing', 'Home Appliances', 'Toys', 'Miscellaneous'];
  const dataset: Array<{ name: string; category: string; price: number }> = [];
  for (let i = 0; i < size; i++) {
    dataset.push({
      name: generateRandomString(10),
      category: categories[Math.floor(Math.random() * categories.length)],
      price: Math.floor(Math.random() * 500),
    });
  }
  return dataset;
};

const predefinedDataset = [
  { name: 'Bicycle', category: 'Sports', price: 150 },
  { name: 'Bike', category: 'Sports', price: 200 },
  { name: 'Bicycles', category: 'Sports', price: 180 },
  { name: 'Tricycle', category: 'Sports', price: 120 },
  { name: 'Motorcycle', category: 'Vehicles', price: 1500 },
  { name: 'Hello@World', category: 'Miscellaneous', price: 50 },
  { name: 'Laptop', category: 'Electronics', price: 1000 },
  { name: 'Smartphone', category: 'Electronics', price: 800 },
  { name: 'Headphones', category: 'Electronics', price: 150 },
];

export const mockData = [...predefinedDataset, ...generateLargeDataset(5000)];
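
With data of this shape, multi-level sorting can be illustrated as follows. The sketch reuses the `{ field, order }` rule shape from the demo's `sortConfig`, while `sketchSortData` and the `Row` type are hypothetical. Numbers are compared numerically and everything else with `localeCompare`, which is an assumption about how locale-aware comparison might be applied.

```typescript
type SortRule = { field: string; order: 'asc' | 'desc' };
type Row = Record<string, string | number>;

// Hypothetical counterpart to sortData: apply each rule in turn until one breaks the tie.
function sketchSortData(data: Row[], rules: SortRule[]): Row[] {
  return [...data].sort((a, b) => {           // copy first so the input is not mutated
    for (const { field, order } of rules) {
      const x = a[field];
      const y = b[field];
      const cmp = typeof x === 'number' && typeof y === 'number'
        ? x - y
        : String(x).localeCompare(String(y)); // locale-aware string comparison
      if (cmp !== 0) return order === 'asc' ? cmp : -cmp;
    }
    return 0;
  });
}

const rows: Row[] = [
  { name: 'Bicycle', category: 'Sports', price: 150 },
  { name: 'Laptop', category: 'Electronics', price: 1000 },
  { name: 'Headphones', category: 'Electronics', price: 150 },
  { name: 'Bike', category: 'Sports', price: 200 },
];
const byCategoryThenPrice = sketchSortData(rows, [
  { field: 'category', order: 'asc' },
  { field: 'price', order: 'desc' },
]);
console.log(byCategoryThenPrice.map(row => row.name));
// ['Laptop', 'Headphones', 'Bike', 'Bicycle']
```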

  4. Persian Mock Data

Provides a Persian dataset for use with fuzzySearchPersian and rankedFuzzySearchPersian in a demo app.

const generateRandomPersianString = (length: number): string => {
  const characters = 'ابپتثجچحخدذرزژسشصضطظعغفقکگلمنوهی';
  let result = '';
  for (let i = 0; i < length; i++) {
    result += characters.charAt(Math.floor(Math.random() * characters.length));
  }
  return result;
};

const generateLargePersianDataset = (size: number): Array<{ name: string; category: string; price: number }> => {
  const categories = ['کتاب‌ها', 'الکترونیک', 'پوشاک', 'لوازم خانگی', 'اسباب‌بازی', 'متفرقه'];
  const dataset: Array<{ name: string; category: string; price: number }> = [];
  for (let i = 0; i < size; i++) {
    dataset.push({
      name: generateRandomPersianString(10), // Generate a random Persian name
      category: categories[Math.floor(Math.random() * categories.length)],
      price: Math.floor(Math.random() * 500),
    });
  }
  return dataset;
};

const predefinedPersianDataset = [
  { name: 'دوچرخه', category: 'ورزش', price: 150 },
  { name: 'موتورسیکلت', category: 'وسایل نقلیه', price: 1500 },
  { name: 'کتاب ریاضی', category: 'کتاب‌ها', price: 200 },
  { name: 'لپ‌تاپ', category: 'الکترونیک', price: 1000 },
  { name: 'هدفون', category: 'الکترونیک', price: 150 },
  { name: 'اسباب‌بازی چوبی', category: 'اسباب‌بازی', price: 300 },
  { name: 'یخچال', category: 'لوازم خانگی', price: 500 },
];

export const persianMockData = [...predefinedPersianDataset, ...generateLargePersianDataset(5000)];
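
Persian text typed on Arabic keyboard layouts often contains the Arabic forms of yeh and kaf, which look alike but compare as different characters. The sketch below shows one possible pre-processing step before fuzzy matching. `sketchNormalizePersian` is a hypothetical helper, it covers only two of the characters listed earlier, and removing the zero-width non-joiner is an assumption; the library's own Persian handling is not documented here.

```typescript
// Hypothetical subset of a Persian normalization table (assumption, not the library's table).
const ARABIC_TO_PERSIAN: Record<string, string> = {
  '\u064A': '\u06CC', // Arabic yeh -> Persian yeh
  '\u0643': '\u06A9', // Arabic kaf -> Persian kaf (keheh)
};

function sketchNormalizePersian(text: string): string {
  return text
    .replace(/[\u064A\u0643]/g, ch => ARABIC_TO_PERSIAN[ch] ?? ch)
    .replace(/\u200C/g, ''); // assumption: drop zero-width non-joiners so spelling variants align
}

// 'ketab' typed with an Arabic kaf becomes the Persian spelling.
const arabicKafWord = '\u0643\u062A\u0627\u0628';
const persianKafWord = '\u06A9\u062A\u0627\u0628';
console.log(sketchNormalizePersian(arabicKafWord) === persianKafWord); // true
```

Applying the same function to both the query and the dataset names before calling a fuzzy search keeps keyboard-layout differences from counting as edits.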

Performance

The library is optimized for performance and handles large datasets efficiently:

  • Uses an optimized Levenshtein algorithm.
  • Benchmarked for datasets of up to 100,000 entries.
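
The README does not say which optimizations the library applies. As an illustration of two common ones, the sketch below keeps only two rows of the distance table and stops early once the threshold can no longer be met. `boundedLevenshtein` is a hypothetical name, and nothing here is confirmed to match the library's algorithm.

```typescript
// Hypothetical optimized distance: returns the exact distance when it is <= maxDistance,
// otherwise maxDistance + 1.
function boundedLevenshtein(a: string, b: string, maxDistance: number): number {
  // The length difference is a lower bound on the distance, so this check is free.
  if (Math.abs(a.length - b.length) > maxDistance) return maxDistance + 1;

  // Only the previous and current rows are needed, giving O(b.length) memory.
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  let curr: number[] = new Array<number>(b.length + 1).fill(0);

  for (let i = 1; i <= a.length; i++) {
    curr[0] = i;
    let rowMin = curr[0];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
      if (curr[j] < rowMin) rowMin = curr[j];
    }
    // Row minima never decrease, so the final distance cannot drop below rowMin.
    if (rowMin > maxDistance) return maxDistance + 1;
    [prev, curr] = [curr, prev];
  }
  const distance = prev[b.length];
  return distance > maxDistance ? maxDistance + 1 : distance;
}

console.log(boundedLevenshtein('bicycle', 'bicycles', 2));   // 1
console.log(boundedLevenshtein('bicycle', 'motorcycle', 2)); // 3 (rejected by the length check)
```

On a dataset of random 10-character names like the mock data above, most candidates differ from the query in nearly every position, so the early exit tends to trigger within the first few rows.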

License

This project is licensed under the MIT License.

