@languageconfidence/lang-conf-sdk-js

Language Confidence

A JavaScript SDK to simplify connecting with the Language Confidence online service via the REST API.

Project page: https://bitbucket.org/languageconfidence/client-sdk-js

How to start

Install with NPM:

npm install @languageconfidence/lang-conf-sdk-js --save

The main bundle file is available at dist/bundle.js within the distribution package.

You can then import the library in your JS:

import { LanguageConfidence } from '@languageconfidence/lang-conf-sdk-js'

Usage

Getting an API Key

In order to access the Language Confidence API you must first register for a Language Confidence account, and then request an API key.

Access Tokens

If you are using the Trial Plan then this step is not necessary.

If you have a Custom Plan for your production application, the API requires OAuth2 access tokens for authorization in addition to your API key.

In this case you will be given a set of client credentials. Store the credentials securely on your application backend and use them to request new access tokens from our auth server.

Your client credentials should never be checked into source control or be visible on the client side.

Here is an example of the request your backend should make to get access tokens:

curl -X POST \
  https://languageconfidence.cloud.tyk.io/pronunciation/oauth/token \
  -H 'Authorization: Basic <Your base64 encoded client credentials>' \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  -d 'grant_type=client_credentials'

Base64 encode your client credentials in the format clientID:clientSecret and use the output, after the Basic prefix, as the value of the Authorization header.
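For a Node.js backend, the request above could be assembled roughly as follows. This is a minimal sketch: the helper name buildTokenRequest is hypothetical, the credentials are placeholders, and the code only builds the request (URL, headers, body) without sending it.

```javascript
// Hypothetical helper: builds the pieces of the token request shown in the
// curl example. Run this on your backend only, never in client-side code.
const TOKEN_URL =
  "https://languageconfidence.cloud.tyk.io/pronunciation/oauth/token";

function buildTokenRequest(clientId, clientSecret) {
  // base64 encode "clientID:clientSecret" for the Basic Authorization header
  const encoded = Buffer.from(clientId + ":" + clientSecret).toString("base64");

  return {
    url: TOKEN_URL,
    options: {
      method: "POST",
      headers: {
        Authorization: "Basic " + encoded,
        "Content-Type": "application/x-www-form-urlencoded"
      },
      body: "grant_type=client_credentials"
    }
  };
}

// placeholder credentials for illustration only
const tokenRequest = buildTokenRequest("my-client-id", "my-client-secret");
console.log(tokenRequest.options.headers.Authorization);
```

The returned url and options can be handed to whichever HTTP client your backend already uses.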

Initialising the LanguageConfidence SDK

Create a new instance of the LanguageConfidence class, passing in your API key. You can then use this instance to make calls to the server.

let langConf = new LanguageConfidence({
  apiKey: "<your-api-key>"
});

You can also set other default values when initialising your LanguageConfidence object. For a full list see the API documentation.

The default server URL is for the trial plan: "https://api.languageconfidence.com/pronunciation-trial". If you are on a custom plan you will need to change it to "https://api.languageconfidence.com/pronunciation".

Gathering your request data

Each request to the server includes a text phrase and the matching audio to be checked for pronunciation quality.

Audio data should be base64 encoded and included as part of the JSON request to the server. How you capture this audio will depend on your environment. For an example of capturing audio in HTML5 see this example.

let request = {
  format: "mp3",
  content: "After the show we went to a restaurant.",
  audioBase64: "<base-64-encoded-audio-data>"
};

At a minimum your request should include the format ('mp3' or 'wav'), the content (i.e. the phrase being read), and the base64 audio data. You can also include options to change how the audio is processed. For a full list see the API documentation.
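As a rough illustration for Node.js, the minimal request could be built from raw audio bytes as below. The helper name buildRequest and the stand-in bytes are hypothetical; the field names and the two formats are the ones described above.

```javascript
// Hypothetical helper: wraps raw audio bytes in the minimal request shape.
const SUPPORTED_FORMATS = ["mp3", "wav"];

function buildRequest(audioBytes, format, content) {
  if (!SUPPORTED_FORMATS.includes(format)) {
    throw new Error("Unsupported audio format: " + format);
  }
  return {
    format: format,
    content: content,
    // Buffer.from accepts a Buffer, a Uint8Array or an array of bytes
    audioBase64: Buffer.from(audioBytes).toString("base64")
  };
}

// In a real application the bytes would come from a recording or a file,
// e.g. require("fs").readFileSync("recording.mp3") (hypothetical file name).
const exampleRequest = buildRequest(
  Buffer.from([0x49, 0x44, 0x33]), // stand-in bytes, not real audio
  "mp3",
  "After the show we went to a restaurant."
);
console.log(exampleRequest.audioBase64);
```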

Sending data to the server

// send the request data to the server and get back score values
langConf
  .getPronunciationScore(request)
  .then(result => {
    // success! you can now use the response
    console.log("Response from server", result);
  })
  .catch(error => {
    // something went wrong, handle the error
    console.log("Error getting scores from server", error);
  });

Getting the score for the whole sentence

Once you have your processed data you can extract scores for different parts of the phrase. The 'sentence score' is the overall score for the whole phrase.

Scores are available as a raw percentage value, or as a 'graded' score, in which case the percent is mapped to a grade of 'poor', 'average' or 'good', based on the difficulty level set when the request was made. You can also set custom difficulty thresholds.

langConf.getPronunciationScore(request).then(result => {
  let sentenceScore = result.getSentenceScore();

  // get the overall score as a raw percent (as a decimal fraction)
  console.log("Overall score: " + sentenceScore.rawScore);

  // get the overall score graded ('poor', 'average' or 'good'). This will depend
  // on the grading difficulty used.
  console.log("Overall score (graded): " + sentenceScore.gradedScore);
});

Getting the score for specific words

You can also get the score for individual words. As the same word may appear multiple times in a phrase, word scores are returned as an array. Each entry in the array is the score for one occurrence of the word, in the order it occurred in the phrase.

langConf.getPronunciationScore(request).then(result => {
  // we get the first element as we know the word only occurred once
  let wordScore = result.getWordScores("restaurant")[0];

  // get the score for that word as a raw percent (as a decimal fraction)
  console.log('Score for "restaurant": ' + wordScore.rawScore);

  // get the score graded ('poor', 'average' or 'good') for that word
  console.log('Score for "restaurant" (graded): ' + wordScore.gradedScore);
});
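When a word occurs more than once, every entry of the array can be inspected rather than just the first. The sketch below assumes a hypothetical helper describeWordScores and uses a stand-in result object with made-up scores so that it runs without the server; only getWordScores, rawScore and gradedScore come from the SDK usage shown above.

```javascript
// Hypothetical helper: summarises every occurrence of a word in the phrase.
// `result` is the response object passed to the .then() callback.
function describeWordScores(result, word) {
  return result.getWordScores(word).map((wordScore, index) => {
    return (
      "Occurrence " + (index + 1) + ' of "' + word + '": ' +
      wordScore.rawScore + " (" + wordScore.gradedScore + ")"
    );
  });
}

// Stand-in result with made-up scores, so the sketch runs on its own.
const fakeResult = {
  getWordScores: word => [
    { rawScore: 0.82, gradedScore: "good" },
    { rawScore: 0.41, gradedScore: "average" }
  ]
};
describeWordScores(fakeResult, "the").forEach(line => console.log(line));
```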

Getting the score for specific phonemes within a word

You can also extract the scores for specific phonemes within each word.

Phonemes also include extra information in the 'soundedLike' field. This is a two-dimensional array in which each entry gives details of an alternate phoneme that the audio sounded like. Each entry is itself an array of just two values: the first is the phoneme, and the second is a value for how strongly the audio matched it.

langConf.getPronunciationScore(request).then(result => {
  // we get the first element as we know the word only occurred once
  let phonemes = result.getPhonemeScoresList("restaurant")[0];

  // the result is a list of phonemes within that word
  console.log("Found " + phonemes.length + ' phonemes in "restaurant"');
  phonemes.forEach(score => {
    console.log('Score for phoneme "' + score.label + '" is: ');
    console.log(" - raw score: " + score.rawScore);
    console.log(" - graded score: " + score.gradedScore);

    score.soundedLike.forEach(soundLike => {
      console.log(
        " - sounded like: " +
          soundLike[0] +
          ", (confidence " +
          soundLike[1] +
          ")"
      );
    });
  });
});
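One possible use of the 'soundedLike' data is to pick the alternate phoneme with the strongest match. The helper name strongestSoundedLike and the example values below are hypothetical; the sketch only relies on the [phoneme, strength] pair layout described above.

```javascript
// Hypothetical helper: returns the [phoneme, strength] pair with the highest
// strength from a phoneme score's soundedLike array, or null if it is empty.
function strongestSoundedLike(phonemeScore) {
  let best = null;
  phonemeScore.soundedLike.forEach(candidate => {
    if (best === null || candidate[1] > best[1]) {
      best = candidate;
    }
  });
  return best;
}

// Stand-in phoneme score with made-up values.
const examplePhoneme = {
  label: "r",
  rawScore: 0.35,
  gradedScore: "poor",
  soundedLike: [["r", 0.35], ["l", 0.6], ["w", 0.05]]
};
const closest = strongestSoundedLike(examplePhoneme);
console.log(
  '"' + examplePhoneme.label + '" sounded most like "' + closest[0] + '"'
);
```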

Changing the grading level

The grade ('poor', 'average', 'good') a particular score gets depends on the difficulty ('beginner', 'intermediate', 'advanced') that was set at the time of requesting the score. You can however regrade a response for a new difficulty by using the adjustPronunciationScoreGrades method. This creates a new graded score, leaving the original score intact.

langConf.getPronunciationScore(request).then(result => {
  langConf
    .adjustPronunciationScoreGrades(result, "beginner")
    .then(beginnerScore => {
      console.log(
        "Grade when a beginner: " + beginnerScore.getSentenceScore().gradedScore
      );
    });

  langConf
    .adjustPronunciationScoreGrades(result, "advanced")
    .then(advancedScore => {
      console.log(
        "Grade when advanced: " + advancedScore.getSentenceScore().gradedScore
      );
    });
});
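If grades for several difficulty levels are needed at once, the regrade calls can be combined with Promise.all. This is only a sketch: gradesByDifficulty is a hypothetical helper, and it assumes adjustPronunciationScoreGrades returns a promise, as the example above suggests.

```javascript
// Hypothetical helper: regrades one response for several difficulty levels
// and resolves to an object mapping each difficulty to its sentence grade.
function gradesByDifficulty(langConf, result, difficulties) {
  const regrades = difficulties.map(difficulty =>
    langConf
      .adjustPronunciationScoreGrades(result, difficulty)
      .then(regraded => [difficulty, regraded.getSentenceScore().gradedScore])
  );
  return Promise.all(regrades).then(pairs => {
    const grades = {};
    pairs.forEach(pair => {
      grades[pair[0]] = pair[1];
    });
    return grades;
  });
}

// Possible usage with a real LanguageConfidence instance:
//
// langConf
//   .getPronunciationScore(request)
//   .then(result =>
//     gradesByDifficulty(langConf, result, ["beginner", "advanced"])
//   )
//   .then(grades => console.log(grades));
```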

You can also change the difficulty level used by default by setting the value on the LanguageConfidence instance:

// all requests made will use the intermediate grading system unless
// specifically overridden in the request
langConf.difficulty = "intermediate";

Or on a per-request level by passing the difficulty in as an extra request option:

  let request = {
    "format": "mp3",
    "content": "After the show we went to a restaurant.",
    "audioBase64": "<base-64-encoded-audio-data>",
    "difficulty": "intermediate"
  };

  langConf.getPronunciationScore(request).then(...);

Changing the grading level with custom difficulty thresholds

If you want more fine-grained control over the grading level you can set custom thresholds for the ('poor', 'average', 'good') grades. You will need to specify two float values in an array. Each value represents a percentage, e.g. 0.3 = 30%.

So if you set the thresholds to 0.1 and 0.2,

the grades will be assigned as follows:

poor < 0.1 < average < 0.2 < good
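The mapping can be pictured with a small function. This is an illustration of the rule above, not the SDK's own implementation; the name gradeFor is hypothetical, and how a score that falls exactly on a threshold is graded is an assumption.

```javascript
// Illustrative only: mirrors "poor < 0.1 < average < 0.2 < good".
// Assumption: a score exactly on a threshold falls into the higher grade.
function gradeFor(rawScore, thresholds) {
  const [averageFrom, goodFrom] = thresholds;
  if (rawScore < averageFrom) return "poor";
  if (rawScore < goodFrom) return "average";
  return "good";
}

console.log(gradeFor(0.05, [0.1, 0.2])); // poor
console.log(gradeFor(0.15, [0.1, 0.2])); // average
console.log(gradeFor(0.6, [0.1, 0.2])); // good
```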

As with the default difficulty levels, you can apply a custom difficulty in three ways.

Using the adjustPronunciationScoreGrades method:

langConf.getPronunciationScore(request).then(result => {
  langConf
    .adjustPronunciationScoreGrades(result, null, [0.1, 0.2])
    .then(customScore => {
      console.log(
        "Grade with custom difficulty: " +
          customScore.getSentenceScore().gradedScore
      );
    });
});

Setting the value on the LanguageConfidence instance:

langConf.customDifficulty = [0.1, 0.2];

Or on a per-request level by passing the custom difficulty in as an extra request option:

  let request = {
    "format": "mp3",
    "content": "After the show we went to a restaurant.",
    "audioBase64": "<base-64-encoded-audio-data>",
    "customDifficulty": [0.1, 0.2]
  };

  langConf.getPronunciationScore(request).then(...);

Handling expired Access Tokens

Your API Access Token will eventually expire for security reasons and you will need to request a new one. Typically tokens are valid for 10 hours from time of issue, but you should check the expires_in field when you get your Access Token to be sure.

To request a new token, follow the same steps you used to get a token in the first place. Normally this involves calling your server, which then uses the client ID and client secret to request a new access token and returns it to your client to use. For testing purposes, you can use curl as shown above to request a new token.

You can choose between two strategies for handling access token expiry. The first is to track the expiry time and request a new token before your current one expires. The second is to keep using your token until you receive a 401 Unauthorised error response, and only then request a new token.

In the first case, your client code is a little more complex, as you have to store the expiry time when you first get the token, monitor it, and fetch a new token before the old one expires.

In the second case, your code is simpler, but a user may make a request that takes a while (especially when uploading media data) only to have it rejected, and the request then has to be resubmitted once a new access token is obtained.
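The first strategy (tracking the expiry time) might look something like the sketch below. The helper createTokenTracker and the five-minute safety margin are hypothetical choices; the only input taken from the API is the expires_in value, assumed here to be in seconds.

```javascript
// Hypothetical helper for the first strategy: remember when the token expires
// and report when it is time to fetch a new one.
function createTokenTracker(expiresInSeconds, issuedAtMs) {
  const expiresAtMs = issuedAtMs + expiresInSeconds * 1000;
  return {
    expiresAtMs: expiresAtMs,
    // true once we are within marginSeconds of expiry (or past it)
    needsRefresh: function(nowMs, marginSeconds) {
      return nowMs >= expiresAtMs - marginSeconds * 1000;
    }
  };
}

// Example: a token valid for 10 hours (expires_in = 36000), to be refreshed
// 5 minutes early.
const tracker = createTokenTracker(36000, Date.now());
if (tracker.needsRefresh(Date.now(), 300)) {
  // call your server to get a new access token here
}
```

Checking needsRefresh just before each request keeps the logic simple, at the cost of one comparison per call.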

The code for handling auth errors is:

langConf.getPronunciationScore(request)
    .then( result => {
        ...
    })
    .catch( error => {
        if (error.message === 'keymanagement.service.access_token_expired') {

          // call your server to get the new Access Token
          langConf.accessToken = ... your method to get access token ...;

          // you can now retry your failed request
        }
    });
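The retry step could be wrapped in a small helper so that callers do not have to repeat it. This is a sketch under assumptions: scoreWithRetry and fetchAccessToken are hypothetical names, fetchAccessToken stands for your own function that asks your server for a new token and returns a promise, and the token is assumed to be set through an accessToken property on the instance.

```javascript
// Hypothetical helper for the second strategy: retry once after refreshing
// the access token. The error message is the one checked in the code above.
const TOKEN_EXPIRED = "keymanagement.service.access_token_expired";

function scoreWithRetry(langConf, request, fetchAccessToken) {
  return langConf.getPronunciationScore(request).catch(error => {
    if (error.message !== TOKEN_EXPIRED) {
      throw error; // not an expiry problem, let the caller handle it
    }
    return fetchAccessToken().then(newToken => {
      // assumption: the instance reads its token from this property
      langConf.accessToken = newToken;
      // retry once; a second failure is passed on to the caller
      return langConf.getPronunciationScore(request);
    });
  });
}
```

Retrying only once avoids looping forever if the new token is rejected as well.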

Report a bug / Ask for a feature

We welcome your feedback. If you find a bug or have a feature request, please log an issue in the Issues list for the Bitbucket project repository.

Contributing

Contributions are welcome, just create a fork, make your change and issue a pull request.

The local setup is simple. Clone the repo locally and run npm install. Then run npm run watch to build the library as you make changes. Code is built into dist/bundle.js and the simple example (under examples/simple/simple.html) is an effective way to test your changes.

Additionally all code should have unit tests created (in test/index.test.js) and you can use npm test to run all unit tests before issuing a pull request.

Available commands

test code - `npm test`

Runs all the tests in /test (tests are written using jest)

generate documentation - `npm run docs`

Generates the API documentation (using jsdoc) into the /docs directory.

build the source code - `npm run build`

Builds a bundled, minified and uglified release of the code (using Rollup.js) into dist/bundle.js

build and watch the source code (for local dev) - `npm run watch`

Builds the source code into dist/bundle.js and then continues watching for changes, and auto-builds when any are detected. Use this for local development only.

Versioning

We use SemVer as a guideline for our versioning here.

What does that mean?

Releases will be numbered with the following format:

<major>.<minor>.<patch>

And constructed with the following guidelines:

Breaking backward compatibility bumps the <major> (and resets the <minor> and <patch>)
New additions without breaking backward compatibility bump the <minor> (and reset the <patch>)
Bug fixes and misc. changes bump the <patch>
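
To make the rules concrete, here is a small sketch that applies them to a version string. The function bumpVersion is purely illustrative and is not part of the SDK or its build scripts.

```javascript
// Illustrative helper: applies the bump rules above to a version string.
function bumpVersion(version, change) {
  const [major, minor, patch] = version.split(".").map(Number);
  switch (change) {
    case "major": // breaking change: reset minor and patch
      return (major + 1) + ".0.0";
    case "minor": // backward compatible addition: reset patch
      return major + "." + (minor + 1) + ".0";
    case "patch": // bug fix or misc. change
      return major + "." + minor + "." + (patch + 1);
    default:
      throw new Error("Unknown change type: " + change);
  }
}

console.log(bumpVersion("1.4.2", "major")); // 2.0.0
console.log(bumpVersion("1.4.2", "minor")); // 1.5.0
console.log(bumpVersion("1.4.2", "patch")); // 1.4.3
```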