0.1.1 • Published 2 years ago
duplicate-document-image-finder v0.1.1
duplicate-document-image-finder
A JavaScript library to find duplicate document images.
How does it work?
It extracts the text from each image using OCR, then uses the Levenshtein distance to calculate the similarity between the texts of two images.
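The comparison step described above can be sketched as follows. This is a minimal illustration of Levenshtein-based text similarity, not the library's actual code: the function names `levenshtein` and `similarity`, and the normalization by the longer string's length, are assumptions made for this example.

```typescript
// Sketch only: illustrates Levenshtein-based similarity between two OCR texts.
// Names and the normalization scheme are assumptions, not the library's internals.

// Classic dynamic-programming Levenshtein distance, keeping only two rows.
function levenshtein(a: string, b: string): number {
  if (a === b) return 0;
  if (a.length === 0) return b.length;
  if (b.length === 0) return a.length;

  // prev[j] = distance between the first i-1 chars of a and the first j chars of b.
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);

  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Normalized similarity in [0, 1]; 1 means the two texts are identical.
function similarity(a: string, b: string): number {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - levenshtein(a, b) / longest;
}
```

Two scans of the same document usually produce OCR texts that differ by only a few characters, so their similarity is close to 1. Which threshold counts as a duplicate is not stated here and would need to be checked against the library's source.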
Methods and Interfaces
find
Find the duplicated images. You can pass your own OCR results.
async find(images:HTMLImageElement[],textLinesOfImages?:TextLine[][],progressCallback?:any):Promise<HTMLImageElement[]>
TextLine
export interface TextLine{
  x:number;
  y:number;
  width:number;
  height:number;
  text:string;
}
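As a rough sketch of how pre-computed OCR results might be passed to `find`, the example below wraps the call in a helper. Several things here are assumptions rather than documented behavior: the helper names `toTextLines` and `findDuplicatesWithOwnOCR` are hypothetical, the `DuplicateFinder` type is only a structural stand-in for the documented signature (made generic so the sketch also runs outside a browser), and it is assumed that `textLinesOfImages[i]` holds the lines for `images[i]`. How the finder instance is created is not documented here, so it is taken as a parameter.

```typescript
// Shape of one OCR text line, as documented for this library.
interface TextLine {
  x: number;
  y: number;
  width: number;
  height: number;
  text: string;
}

// Structural stand-in for an object exposing the documented `find` method.
// In a browser, Img would be HTMLImageElement.
interface DuplicateFinder<Img> {
  find(
    images: Img[],
    textLinesOfImages?: TextLine[][],
    progressCallback?: any
  ): Promise<Img[]>;
}

// Hypothetical helper: wrap plain strings from your own OCR engine as TextLine
// objects. The bounding boxes are placeholders; a real OCR engine would supply them.
function toTextLines(lines: string[]): TextLine[] {
  return lines.map((text, i) => ({ x: 0, y: i * 20, width: 0, height: 20, text }));
}

// Hypothetical wrapper: call `find` with OCR results computed elsewhere.
// Assumes the OCR results are index-aligned with the images.
async function findDuplicatesWithOwnOCR<Img>(
  finder: DuplicateFinder<Img>,
  images: Img[],
  ocrLinesPerImage: string[][]
): Promise<Img[]> {
  if (images.length !== ocrLinesPerImage.length) {
    throw new Error("Expected one list of OCR lines per image");
  }
  const textLinesOfImages = ocrLinesPerImage.map(toTextLines);
  // The callback's arguments are not documented, so this one ignores them.
  const onProgress = () => {};
  return finder.find(images, textLinesOfImages, onProgress);
}
```

Passing your own OCR results is useful when the text has already been extracted elsewhere, since recognition is typically the slowest part of the pipeline.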
Install
Via NPM:
npm install duplicate-document-image-finder
Via CDN:
<script type="module">
import { DuplicateDocumentImageFinder } from 'https://cdn.jsdelivr.net/npm/duplicate-document-image-finder/dist/duplicate-document-image-finder.js';
</script>
License
MIT