@ltftf/srt-parser-2

Licence: MIT
Version: 2.1.1
Dependencies: 0
Size: 10 kB
Vulnerabilities: 0
Weekly downloads: 0

srt-parser-2

An SRT parser for JavaScript.

It parses the contents of an .srt file into an array of subtitle objects.

This is a fork of srt-parser-2.

What's changed

  • Removed the unnecessary class instantiation
  • Text is parsed as an array of strings (by line) instead of a single string with \n characters
  • Handles the obsolete \r line breaks
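The last two points can be illustrated with a small sketch. The `splitLines` helper below is hypothetical and not part of the package's API; it only shows one way to split cue text into lines while accepting `\r\n`, `\n`, and bare `\r` line breaks.

```javascript
// Hypothetical helper, not taken from the package's source.
// Splits on \r\n (Windows), \n (Unix), or a lone \r (classic Mac OS).
function splitLines(text) {
  return text.split(/\r\n|\n|\r/);
}

// A cue whose two lines are separated by a bare \r character.
const cueText = "Hello\rWorld";
console.log(splitLines(cueText)); // [ 'Hello', 'World' ]
```

Listing `\r\n` first in the alternation keeps a Windows line break from being counted as two separate breaks.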

Install

npm

npm install @ltftf/srt-parser-2

or yarn

yarn add @ltftf/srt-parser-2

Example

Take this SRT file:

1
00:00:11,544 --> 00:00:12,682
Hello
World

It is parsed into:

[{
    id: "1",
    startTime: "00:00:11,544",
    startSeconds: 11.544,
    endTime: "00:00:12,682",
    endSeconds: 12.682,
    text: [ "Hello", "World" ]
}]
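
The `startSeconds` and `endSeconds` fields are the timestamps expressed in seconds. As a rough sketch of that relationship, the hypothetical `timestampToSeconds` helper below (an illustration, not the package's own code) converts an already well-formed `HH:MM:SS,mmm` string.

```javascript
// Hypothetical helper: converts a well-formed "HH:MM:SS,mmm" string to seconds.
function timestampToSeconds(timestamp) {
  const [clock, millis] = timestamp.split(",");
  const [hours, minutes, seconds] = clock.split(":").map(Number);
  // Work in whole milliseconds first so only one floating-point division happens.
  const totalMillis = ((hours * 60 + minutes) * 60 + seconds) * 1000 + Number(millis);
  return totalMillis / 1000;
}

console.log(timestampToSeconds("00:00:11,544")); // 11.544
console.log(timestampToSeconds("00:00:12,682")); // 12.682
```
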

Environment support

Since it only processes text, it should work in both browser and Node.js environments.

Usage

import { fromSrt, toSrt } from "@ltftf/srt-parser-2";

const srt = `
1
00:00:11,544 --> 00:00:12,682
Hello
`;

// Parse the SRT string into an array of subtitle objects.
const srtArray = fromSrt(srt);
console.log(srtArray);

// Turn the array back into an SRT string.
const srtString = toSrt(srtArray);
console.log(srtString);

You can run this example using node example/1.Comma.js
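
Once parsed, the array can be used like any other JavaScript array. As a minimal sketch, the hypothetical `findCueAt` helper below looks up the subtitle that should be on screen at a given playback time. The sample data is written by hand in the shape shown in the Example section, so the snippet runs without the package installed.

```javascript
// Sample data in the shape shown in the Example section.
const cues = [
  {
    id: "1",
    startTime: "00:00:11,544",
    startSeconds: 11.544,
    endTime: "00:00:12,682",
    endSeconds: 12.682,
    text: ["Hello", "World"],
  },
];

// Hypothetical helper: returns the cue on screen at `seconds`, or undefined.
function findCueAt(list, seconds) {
  return list.find((cue) => seconds >= cue.startSeconds && seconds < cue.endSeconds);
}

const active = findCueAt(cues, 12);
// Join the text lines back together for display.
console.log(active ? active.text.join("\n") : "(no subtitle)");
```

Treating the end time as exclusive is a design choice of this sketch; it avoids two back-to-back cues both matching at the same instant.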

CLI

npx srt-parser-2 -i input.srt -o output.json --minify

Options:

| Option         | Required | Default     |
| -------------- | -------- | ----------- |
| --input or -i  | Yes      |             |
| --output or -o | No       | output.json |
| --minify       | No       | false       |
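
The README does not spell out what --minify changes. A plausible reading is that it writes compact JSON instead of indented JSON. The sketch below illustrates that assumption only: `serializeCues` is a hypothetical helper, and the 2-space indentation is a guess rather than the CLI's documented behaviour.

```javascript
// Hypothetical helper: serializes parsed cues the way a --minify switch might.
// The 2-space indentation for the non-minified case is an assumption.
function serializeCues(cues, minify = false) {
  return minify ? JSON.stringify(cues) : JSON.stringify(cues, null, 2);
}

const cues = [{ id: "1", text: ["Hello", "World"] }];
console.log(serializeCues(cues, true)); // [{"id":"1","text":["Hello","World"]}]
console.log(serializeCues(cues)); // same data, spread over indented lines
```
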

License

MIT

Why?

What makes this one special? There are plenty of SRT parsers on npm.

What's wrong with them?

Nothing.
All of them can handle this format:

1
00:00:11,544 --> 00:00:12,682
Hello

But I also want to handle timestamps like this:

00:00:11.544

This format is wrong: it uses a period as the separator.

Or this:

00:00:11,5440

This is also wrong: the millisecond part has 4 digits (it should have 3).

Or this:

1:00:11,5

Similarly wrong: the hour and the millisecond parts have only 1 digit each.

Or this:

00:00:00.05

and so on.

Format Support

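| Format        | Other parsers        | srt-parser-2 | srt-parser-2 would turn this into |
| ------------- | -------------------- | ------------ | --------------------------------- |
| 00:00:01,544  | Yes                  | Yes          | 00:00:01,544                      |
| 00:00:01.544  | Yes for some of them | Yes          | 00:00:01,544                      |
| 00:00:01.54   | Yes for some of them | Yes          | 00:00:01,544                      |
| 00:00:00.3333 | No                   | Yes          | 00:00:00,333                      |
| 00:00:00.3    | No                   | Yes          | 00:00:00,300                      |
| 1:2:3.4       | No                   | Yes          | 01:02:03,400                      |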

Basic principle:

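  1. If the hour, minute, or second part is shorter than 2 digits, it is padded at the start with "0"; if it is longer than 2 digits, only the first 2 digits are kept.
  2. Milliseconds follow the same idea with 3 digits: longer values keep only the first 3 digits, and shorter values are padded (in the table above, .3 becomes ,300).
  3. The separator can be a period (.) or a comma (,); a period (incorrect) is replaced with a comma (correct).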
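
The three principles above can be sketched as a small function. This is an illustration under stated assumptions, not the package's actual implementation: `normalizeTimestamp` is a hypothetical name, the input is assumed to contain all three of hours, minutes, and seconds, and short millisecond values are padded at the end, which is inferred from the .3 to ,300 row in the table.

```javascript
// Hypothetical helper illustrating the principles; not the package's source.
function normalizeTimestamp(raw) {
  // Principle 3: accept either "." or "," before the millisecond part.
  const [clock, fraction = ""] = raw.trim().split(/[.,]/);

  // Principle 1: pad each part to 2 digits at the start, keep only the first 2.
  // Assumes hours, minutes, and seconds are all present.
  const [hours, minutes, seconds] = clock
    .split(":")
    .map((part) => part.padStart(2, "0").slice(0, 2));

  // Principle 2: 3 digits for milliseconds. Padding at the end is an assumption
  // based on the table row where .3 becomes ,300.
  const millis = fraction.padEnd(3, "0").slice(0, 3);

  return `${hours}:${minutes}:${seconds},${millis}`;
}

console.log(normalizeTimestamp("00:00:11.544"));  // 00:00:11,544
console.log(normalizeTimestamp("00:00:11,5440")); // 00:00:11,544
console.log(normalizeTimestamp("1:00:11,5"));     // 01:00:11,500
console.log(normalizeTimestamp("00:00:00.05"));   // 00:00:00,050
```

Working on strings rather than numbers keeps the truncation rule simple: a 4-digit millisecond value is cut to its first 3 digits instead of being rounded.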

Conclusion

  1. Supports more time formats (even malformed ones)
  2. Has extensive tests