0.0.6 • Published 4 months ago

@sumsub/capture-sdk v0.0.6

License
CC-BY-NC-SA-4.0

Sumsub Capture SDK

This library helps to evaluate the quality of document images before transferring them to the server.

Install

npm install @sumsub/capture-sdk

Usage

import initCaptureSdk from '@sumsub/capture-sdk'

const captureSdk = await initCaptureSdk()

// returns a numeric score between 0 and 1
// a higher score means a lower-quality image
const score = await captureSdk.predictImageDataScore(imageData)

// returns true when the image score is less than maxAllowedScore (default value is 0.83), false otherwise
const result = await captureSdk.predictImageDataResult(imageData, maxAllowedScore)
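
The two calls above can be combined into a single check that runs before an upload. The following is a minimal sketch, not part of the SDK: `checkDocumentImage` is a hypothetical helper, `sdk` is assumed to be the object returned by `initCaptureSdk()`, and the comparison simply mirrors the behaviour described for `predictImageDataResult`.

```javascript
// Hypothetical helper, not part of @sumsub/capture-sdk.
// `sdk` is assumed to be the object returned by initCaptureSdk();
// `imageData` is whatever predictImageDataScore accepts.
async function checkDocumentImage(sdk, imageData, maxAllowedScore = 0.83) {
  // Score the image once and reuse the value for both logging and the decision
  const score = await sdk.predictImageDataScore(imageData)
  // Same rule as described for predictImageDataResult:
  // the image passes when its score is less than maxAllowedScore
  return { score, acceptable: score < maxAllowedScore }
}
```

A caller could then skip the upload and ask the user to retake the photo whenever `acceptable` is false.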

Model

The trained model is a lightweight version of SqueezeNet, weighing only 1 MB.

The classes are defined as follows:

  • Class 1 consists of low-quality document photos and photos that are not from the document domain
  • Class 0 comprises high-quality document photos

Train dataset:

  • Class 1 contains 500k poor-quality document photos collected by Sumsub, plus 200k ImageNet photos that are not from the document domain
  • Class 0 includes 500k high-quality document photos collected and generated by Sumsub

Test dataset:

  • Class 1 has 100k poor-quality document photos collected by Sumsub that were rejected during the fastfail stage
  • Class 0 has 100k high-quality document photos from Sumsub

Metrics

  1. roc_auc_score = 0.85
  2. frtt_score(quantile=0.985) = 0.30 (threshold of 0.89)
  3. frtt_score(quantile=0.97) = 0.40 (threshold of 0.83)

frtt_score

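In the frtt_score metric, the quantile parameter sets the fraction of Class 0 (high-quality) objects that the model must classify correctly. The remainder, 1 − quantile, is the fraction of false positives we are willing to accept.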

For example, when the quantile is 0.985, we expect the model to classify 98.5% of Class 0 objects correctly. The remaining 1.5% of Class 0 objects may be misclassified as poor quality (false positives).
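
As a rough worked example, assuming the rate is measured on the 100k Class 0 test photos listed under "Test dataset", the two quantiles in the Metrics list correspond to the following numbers of good photos being flagged:

```latex
\begin{aligned}
(1 - 0.985) \times 100{,}000 &= 1{,}500 \\
(1 - 0.97) \times 100{,}000 &= 3{,}000
\end{aligned}
```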

A metric such as recall is then measured at the resulting threshold to determine the share of poor-quality photos (Class 1) that the classifier catches.
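
The procedure described in this section can be sketched as follows. This is an illustration only, not Sumsub's implementation: the function names are hypothetical, the nearest-rank quantile method is an assumption, and the sketch assumes that frtt_score reports recall on Class 1 at the threshold derived from Class 0 scores.

```javascript
// Hypothetical sketch; not the code used to produce the metrics above.

// Nearest-rank quantile of an array of numbers (the method is an assumption).
function quantile(values, q) {
  const sorted = [...values].sort((a, b) => a - b)
  const rank = Math.ceil(q * sorted.length)
  // Clamp the rank to a valid 1-based position
  const clamped = Math.min(Math.max(rank, 1), sorted.length)
  return sorted[clamped - 1]
}

// class0Scores: model scores for high-quality photos
// class1Scores: model scores for poor-quality photos
function frttScore(class0Scores, class1Scores, q) {
  // Threshold chosen so that roughly a fraction q of Class 0 scores fall below it
  const threshold = quantile(class0Scores, q)
  // A photo is flagged when its score is not less than the threshold,
  // matching the rule described for predictImageDataResult
  const caught = class1Scores.filter((s) => s >= threshold).length
  return { threshold, recall: caught / class1Scores.length }
}
```

Under this reading, the entry frtt_score(quantile=0.97) = 0.40 (threshold of 0.83) would mean that a threshold of 0.83 passes about 97% of good photos while catching about 40% of poor ones.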

0.0.6 • 4 months ago
0.0.5 • 4 months ago
0.0.4 • 6 months ago
0.0.3 • 8 months ago
0.0.2 • 1 year ago
0.0.1 • 1 year ago