1.0.0 • Published 5 years ago
@pietrop/serialize-stt-words v1.0.0
serialize-stt-words
A module to serialize words from speech-to-text (STT) transcripts in dpe format into one array per attribute, and to deserialize those arrays back into the original structure.
For example, as a rough heuristic: if mock8hours.json holds an 8-hour transcript and is 9.6MB, this is the breakdown of file size for each attribute saved separately:
58K paragraphEndTimes.json
59K paragraphStartTimes.json
93K speakersLit.json
637K textList.json
637K wordEndTimes.json
653K wordStartTimes.json
Each file is well within the 1MB Firebase document limit.
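A quick way to sanity-check this for your own transcripts is to measure each array once it is JSON-encoded. The sketch below is only an approximation: jsonByteSize and checkSizes are hypothetical helpers (not part of this package), the sample data is made up, and the JSON size is a proxy for what a database actually stores. The limit constant assumes Firestore's documented maximum document size of 1 MiB.

```javascript
// Firestore's documented maximum document size is 1 MiB (1,048,576 bytes).
const FIRESTORE_DOC_LIMIT_BYTES = 1048576;

// Hypothetical helper: size in bytes of a value once JSON-encoded as UTF-8.
function jsonByteSize(value) {
  return Buffer.byteLength(JSON.stringify(value), 'utf8');
}

// Hypothetical helper: for each attribute, report the encoded size
// and whether it fits under the given limit.
function checkSizes(serialized, limit = FIRESTORE_DOC_LIMIT_BYTES) {
  const report = {};
  for (const [name, list] of Object.entries(serialized)) {
    const bytes = jsonByteSize(list);
    report[name] = { bytes, fits: bytes <= limit };
  }
  return report;
}

// Tiny made-up sample standing in for real serialized output.
const sample = {
  wordStartTimes: [0, 0.9, 1.13],
  textList: ['Media', 'will', 'be'],
};

console.log(checkSizes(sample));
```

In practice you would pass the object returned by serializeTranscript instead of the made-up sample.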
Setup
git clone git@github.com:pietrop/serialize-stt-words.git
cd serialize-stt-words
npm install
Usage
Given a transcript in dpe format:
{
  "words": [
    {
      "text": "Hello",
      "start": 0,
      "end": 0.88
    },
    ...
  ],
  "paragraphs": [
    {
      "speaker": "SPEAKER_B",
      "start": 0,
      "end": 1.24
    },
    ...
  ]
}
Install the package and pass a transcript in the format above to serializeTranscript:
npm install @pietrop/serialize-stt-words
const { serializeTranscript } = require('@pietrop/serialize-stt-words');
const { wordStartTimes, wordEndTimes, textList, paragraphStartTimes, paragraphEndTimes, speakersLit } = serializeTranscript(transcript);
Returns arrays of each attribute:
{
  "wordStartTimes": [
    0,
    0.9,
    1.13,
    ...
  ],
  "wordEndTimes": [
    0.88,
    1.12,
    ...
  ],
  "textList": [
    "Media",
    "will",
    ...
  ],
  "paragraphStartTimes": [
    0,
    1.25,
    ...
  ],
  "paragraphEndTimes": [
    1.24,
    4,
    ...
  ],
  "speakersLit": [
    "SPEAKER_B",
    "SPEAKER_A",
    ...
  ]
}
The idea is that you can save each array separately in a database and recombine them later.
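To make the save-and-recombine round trip concrete, here is a minimal sketch. It is an assumption about how such a split could work, not the package's actual implementation: sketchSerialize and sketchDeserialize are hypothetical stand-ins, the transcript is made up, and a Map stands in for a database with one entry per attribute.

```javascript
// Hypothetical stand-in for serializeTranscript: split a transcript into
// one flat array per attribute. Not the package's actual implementation.
function sketchSerialize({ words, paragraphs }) {
  return {
    wordStartTimes: words.map((w) => w.start),
    wordEndTimes: words.map((w) => w.end),
    textList: words.map((w) => w.text),
    paragraphStartTimes: paragraphs.map((p) => p.start),
    paragraphEndTimes: paragraphs.map((p) => p.end),
    speakersLit: paragraphs.map((p) => p.speaker),
  };
}

// Hypothetical stand-in for deserializeTranscript: rebuild the objects by
// reading the same index from each parallel array.
function sketchDeserialize(s) {
  const words = s.textList.map((text, i) => ({
    text,
    start: s.wordStartTimes[i],
    end: s.wordEndTimes[i],
  }));
  const paragraphs = s.speakersLit.map((speaker, i) => ({
    speaker,
    start: s.paragraphStartTimes[i],
    end: s.paragraphEndTimes[i],
  }));
  return { words, paragraphs };
}

// Made-up transcript in the input shape shown earlier.
const sampleTranscript = {
  words: [
    { text: 'Hello', start: 0, end: 0.88 },
    { text: 'world', start: 0.9, end: 1.24 },
  ],
  paragraphs: [{ speaker: 'SPEAKER_B', start: 0, end: 1.24 }],
};

// A Map stands in for a database: each attribute is stored under its own key.
const fakeDb = new Map(Object.entries(sketchSerialize(sampleTranscript)));

// Later: read every attribute back and recombine.
const restored = sketchDeserialize(Object.fromEntries(fakeDb));
console.log(JSON.stringify(restored) === JSON.stringify(sampleTranscript));
```

The sketch relies on the arrays staying index-aligned, so each attribute has to be stored and loaded without reordering or dropping entries.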
const { deserializeTranscript } = require('@pietrop/serialize-stt-words');
const desRes = deserializeTranscript({ wordStartTimes, wordEndTimes, textList, paragraphStartTimes, paragraphEndTimes, speakersLit });
Documentation
There's a docs folder in this repository.
docs/notes contains draft development notes on various aspects of the project. These would generally be converted into either ADRs or guides when ready.
Development env
- npm > 6.1.0
- Node 12
The Node version is set in the Node Version Manager file .nvmrc.
nvm use
Tests
npm test
Deployment
npm run publish:public