Subtitle-parser

Subtitle Parser

Description

A parser for subtitle text in the LRC and SRT formats

Read in other languages

简体中文

Features

Returns each timestamp in seconds and milliseconds
Supports two formats: LRC and SRT (more formats are planned)

TODO

Support more subtitle formats
Normalize zeros in SRT times
Post-process the results so they are more convenient to use

Usage

Download this library

npm install subtitle-parser

or use yarn

yarn add subtitle-parser

Example

const subtitleParser = require("subtitle-parser");
const contentSRT = 
`
1
00:00:01,410 --> 00:00:04,220
我们已经走得太远，以至于忘记了为什么出发。
We already walked too far, down to we had forgotten why embarked.
2
00:00:04,251 --> 00:00:06,234
天下没有不散的筵席
There is no never-ending feast
`;
// The five ID tags below (ar, ti, al, by, offset) are optional.
// The timestamps are test values, not the song's real timings.
/**
 * ar: artist name
 * ti: title
 * al: album
 * by: editor (the person who made the LRC lyrics)
 * offset: time offset, in milliseconds
 */
const contentLRC = 
`
[ar:A lot]
[ti:Something just like this]
[al:Memories... Do Not Open]
[by:RotCool]
[offset:0]
[00:05:50]I've been reading books of old
[00:10:50][00:30:51]I want to something just like this
`;
// The second parameter is optional
console.log(subtitleParser.parse(contentSRT, "SRT"));
console.log(JSON.stringify(subtitleParser.parse(contentLRC, "LRC")));

It will output:

[
  {
    id: '1',
    startTime: { baseForm: '00:00:01,410', htmlCurrentTime: 1.41 },
    endTime: { baseForm: '00:00:04,220', htmlCurrentTime: 4.22 },
    content: '我们已经走得太远，以至于忘记了为什么出发。\n' +
      'We already walked too far, down to we had forgotten why embarked.'
  },
  {
    id: '2',
    startTime: { baseForm: '00:00:04,251', htmlCurrentTime: 4.251 },
    endTime: { baseForm: '00:00:06,234', htmlCurrentTime: 6.234 },
    content: '天下没有不散的筵席\nThere is no never-ending feast'
  }
]
{
	"content": [{
		"time": {
			"baseForm": "00:05:50",
			"minute": 0,
			"second": 5,
			"millisecond": 50,
			"htmlCurrentTime": 5.05
		},
		"content": "I've been reading books of old"
	}, {
		"time": {
			"baseForm": "00:10:50",
			"minute": 0,
			"second": 10,
			"millisecond": 50,
			"htmlCurrentTime": 10.05
		},
		"content": "I want to something just like this"
	}, {
		"time": {
			"baseForm": "00:30:51",
			"minute": 0,
			"second": 30,
			"millisecond": 51,
			"htmlCurrentTime": 30.051
		},
		"content": "I want to something just like this"
	}],
	"ar": "A lot",
	"ti": "Something just like this",
	"al": "Memories... Do Not Open",
	"by": "RotCool",
	"offset": "0"
}
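
The htmlCurrentTime values are expressed in seconds, so they can be compared with a playback position such as the currentTime of an HTML media element. The sketch below is one way to look up the cue to display at a given moment. It works on data shaped like the SRT output above; findActiveCue is a hypothetical helper written for this example and is not part of subtitle-parser.

```javascript
// Cues copied from the SRT output shown above.
const cues = [
  {
    id: "1",
    startTime: { baseForm: "00:00:01,410", htmlCurrentTime: 1.41 },
    endTime: { baseForm: "00:00:04,220", htmlCurrentTime: 4.22 },
    content:
      "我们已经走得太远，以至于忘记了为什么出发。\n" +
      "We already walked too far, down to we had forgotten why embarked."
  },
  {
    id: "2",
    startTime: { baseForm: "00:00:04,251", htmlCurrentTime: 4.251 },
    endTime: { baseForm: "00:00:06,234", htmlCurrentTime: 6.234 },
    content: "天下没有不散的筵席\nThere is no never-ending feast"
  }
];

// Hypothetical helper (not part of subtitle-parser): return the cue whose
// time range contains `currentTime` (in seconds), or null if none does.
function findActiveCue(cueList, currentTime) {
  const active = cueList.find(
    (cue) =>
      currentTime >= cue.startTime.htmlCurrentTime &&
      currentTime < cue.endTime.htmlCurrentTime
  );
  return active || null;
}

console.log(findActiveCue(cues, 2.0).id); // logs 1
console.log(findActiveCue(cues, 4.23)); // logs null (gap between the two cues)
```

In a browser, the same function could be called from a timeupdate handler with the media element's currentTime as the second argument.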

Format

LRC

A timestamp can be written as minute:second:millisecond, minute:second.millisecond, hour:minute:second:millisecond, or hour:minute:second.millisecond
You can also add your own tags. For example, [copyright:RotCool] is parsed as { ... copyright: "RotCool" ... }
Every line must contain a "[xxx]" tag; otherwise it is ignored
The content cannot contain "[" or "]"
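
To make the accepted timestamp forms concrete, here is a minimal sketch of how such a timestamp could be converted to seconds. It is not the library's actual implementation; lrcTimeToSeconds and LRC_TIME are hypothetical names. It follows the sample output above, where the last field is read as a literal millisecond count (50 becomes 0.05 seconds).

```javascript
// Optional "hour:", then "minute:second", then ":" or "." and the milliseconds.
const LRC_TIME = /^(?:(\d+):)?(\d+):(\d+)[:.](\d+)$/;

// Hypothetical helper (not part of subtitle-parser): convert the text inside
// an LRC time tag to seconds, or return null if it is not a timestamp.
function lrcTimeToSeconds(tag) {
  const match = LRC_TIME.exec(tag);
  if (!match) return null;
  // The hour group is undefined when absent, so default it to "0".
  const [, hour = "0", minute, second, millisecond] = match;
  return (
    Number(hour) * 3600 +
    Number(minute) * 60 +
    Number(second) +
    Number(millisecond) / 1000
  );
}

console.log(lrcTimeToSeconds("00:05:50").toFixed(3)); // logs 5.050
console.log(lrcTimeToSeconds("01:00:30.051").toFixed(3)); // logs 3630.051
console.log(lrcTimeToSeconds("ar:A lot")); // logs null (an ID tag, not a time)
```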

SRT

You must use "-->" to separate the start time from the end time
Any number of spaces may appear before and after "-->"
The start time and the end time must use "hour:minute:second,millisecond"; "hour:minute:second:millisecond" and "hour:minute:second.millisecond" cannot be parsed
Zeros in a time are not padded or normalized. For example, 0:0:1,4 is parsed as { baseForm: "0:0:1,4", htmlCurrentTime: 1.004 }
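
The SRT rules above can be sketched as follows. This is an illustration under the stated rules, not the library's own code; srtTimeToSeconds, parseSrtTimeLine, and SRT_TIME are hypothetical names.

```javascript
// Only "hour:minute:second,millisecond" is accepted; digits are not padded.
const SRT_TIME = /^(\d+):(\d+):(\d+),(\d+)$/;

// Hypothetical helper: convert one SRT timestamp to seconds, or return null.
function srtTimeToSeconds(text) {
  const match = SRT_TIME.exec(text);
  if (!match) return null;
  // Skip match[0] (the whole string) and convert the four groups to numbers.
  const [hour, minute, second, millisecond] = match.slice(1).map(Number);
  return hour * 3600 + minute * 60 + second + millisecond / 1000;
}

// Hypothetical helper: split a "start --> end" line, ignoring any spaces
// around the arrow, and return both times in seconds (or null on failure).
function parseSrtTimeLine(line) {
  const parts = line.split("-->");
  if (parts.length !== 2) return null;
  const [start, end] = parts.map((part) => srtTimeToSeconds(part.trim()));
  if (start === null || end === null) return null;
  return { start, end };
}

console.log(srtTimeToSeconds("0:0:1,4").toFixed(3)); // logs 1.004
console.log(parseSrtTimeLine("00:00:01.410 --> 00:00:04.220")); // logs null
```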
