1.1.1 • Published 2 years ago

@vtfk/pdf-splitter v1.1.1

License
MIT

pdf-splitter

Node.js package for splitting PDFs by page ranges or keywords. It uses PDFtk (through node-pdftk) for splitting and PDF.js for reading the PDF text.

Requirements

Make sure you have PDFtk installed, and save the path to its executable in the environment variable "PDFTK_EXT".

For example, in .env:

PDFTK_EXT="<installationPath>/PDFtk/bin/pdftk"
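Because the package depends on this variable, it can help to fail early when it is missing. A minimal sketch, assuming a hypothetical helper named resolvePdftkPath (not part of @vtfk/pdf-splitter) that you call yourself before splitting:

```javascript
const fs = require('fs')

// Hypothetical helper, not part of @vtfk/pdf-splitter:
// returns the configured PDFtk path or throws a descriptive error.
function resolvePdftkPath (env = process.env) {
    const pdftkPath = env.PDFTK_EXT
    if (!pdftkPath) {
        throw new Error('PDFTK_EXT is not set. Point it at the PDFtk executable.')
    }
    // On Windows the executable may carry an .exe suffix, so check both spellings.
    const candidates = [pdftkPath, `${pdftkPath}.exe`]
    if (!candidates.some(candidate => fs.existsSync(candidate))) {
        throw new Error(`No PDFtk executable found at ${pdftkPath}`)
    }
    return pdftkPath
}
```

Calling resolvePdftkPath() once at startup surfaces a missing or misspelled path before any PDF is processed.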

Installing

$ npm install @vtfk/pdf-splitter

Usage

With array of page-ranges

Specify which pages go into each new document. Each entry in the ranges array produces one output document.

Description | Value
Page one and page three as separate documents | '1', '3'
Pages one to four (inclusive) as one document, and pages three, six, and eight to ten (inclusive) as another | '1-4', '3 6 8-10'

const splitPdf = require('@vtfk/pdf-splitter')

const pdfToSplit = {
    pdfPath: 'a pdf.pdf',
    ranges: ['1-4', '3 6 8-10', '4 2'],
    outputDir: 'path/to/outputDirectory', // Optional, defaults to directory of the input pdf
    outputName: 'nameForResultingPdfs' // Optional, defaults to the <NameOfPdf>-<index>.pdf
}

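// await is only valid inside an async function in a CommonJS module
const main = async () => {
    const result = await splitPdf(pdfToSplit)
    console.log(result)
}

main().catch(console.error)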
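
To make the range syntax concrete, here is a minimal sketch of how one range string expands to page numbers. The function expandRange is hypothetical and only illustrates the syntax described above; it is not how the package or PDFtk parses ranges internally, and it does not handle descending ranges.

```javascript
// Hypothetical illustration of the range syntax; not part of @vtfk/pdf-splitter.
// Expands a string such as '3 6 8-10' into the page numbers it selects, in order.
function expandRange (range) {
    const pages = []
    for (const part of range.trim().split(/\s+/)) {
        const match = /^(\d+)(?:-(\d+))?$/.exec(part)
        if (!match) throw new Error(`Invalid range part: ${part}`)
        const start = Number(match[1])
        const end = match[2] === undefined ? start : Number(match[2])
        if (end < start) throw new Error(`Descending range not handled in this sketch: ${part}`)
        // Both ends of a range are inclusive
        for (let page = start; page <= end; page++) pages.push(page)
    }
    return pages
}

const ranges = ['1-4', '3 6 8-10', '4 2']
console.log(JSON.stringify(ranges.map(expandRange)))
// [[1,2,3,4],[3,6,8,9,10],[4,2]]
```

Each inner array corresponds to one output document, so the three entries in ranges yield three PDFs.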

With array of keywords/sentences

Specify the keywords/sentences you want to split the document on. EVERY keyword/sentence must be present on a page for the document to split on that page; see the option "orKeywords" to require only SOME of them instead.

NOTE: At least one keyword or sentence must be unique for the document

const splitPdf = require('@vtfk/pdf-splitter')

const pdfToSplit = {
    pdfPath: 'a pdf.pdf',
    keywords: ['a unique sentence for the page you want to split on', 'word', 'another'],
    outputDir: 'path/to/outputDirectory', // Optional, defaults to directory of the input pdf
    outputName: 'nameForResultingPdfs' // Optional, defaults to the <NameOfPdf>-<index>.pdf
}

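// await is only valid inside an async function in a CommonJS module
const main = async () => {
    const result = await splitPdf(pdfToSplit)
    console.log(result)
}

main().catch(console.error)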

Options

options.onlyPagesWithKeywords

Return only the pages where the keywords are present, each as a separate document

const splitPdf = require('@vtfk/pdf-splitter')

const pdfToSplit = {
    pdfPath: 'a pdf.pdf',
    keywords: ['a unique sentence for the page you want to split on', 'word', 'another'],
    outputDir: 'path/to/outputDirectory', // Optional, defaults to directory of the input pdf
    outputName: 'nameForResultingPdfs', // Optional, defaults to the <NameOfPdf>-<index>.pdf
    onlyPagesWithKeywords: true
}

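// await is only valid inside an async function in a CommonJS module
const main = async () => {
    const result = await splitPdf(pdfToSplit)
    console.log(result)
}

main().catch(console.error)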

options.orKeywords

Only require ONE of the keywords to be present on a page for the document to split on that page

const splitPdf = require('@vtfk/pdf-splitter')

const pdfToSplit = {
    pdfPath: 'a pdf.pdf',
    keywords: ['a unique sentence for the page you want to split on', 'word', 'another'], // will split if one of these is present on the page
    outputDir: 'path/to/outputDirectory', // Optional, defaults to directory of the input pdf
    outputName: 'nameForResultingPdfs', // Optional, defaults to the <NameOfPdf>-<index>.pdf
    orKeywords: true // Optional, defaults to false
}

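// await is only valid inside an async function in a CommonJS module
const main = async () => {
    const result = await splitPdf(pdfToSplit)
    console.log(result)
}

main().catch(console.error)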
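
The difference between the default EVERY behaviour and orKeywords can be sketched with plain array methods. The function findMatchingPages below is hypothetical. It assumes the text of each page has already been extracted into a string, and it uses case-sensitive substring matching, which may differ from what the package does internally.

```javascript
// Hypothetical sketch; not the package's actual implementation.
// pageTexts: array of strings, one per page (index 0 is page 1).
// Returns the 1-based numbers of the pages that match the keywords.
function findMatchingPages (pageTexts, keywords, orKeywords = false) {
    const matchingPages = []
    pageTexts.forEach((text, index) => {
        const isPresent = keyword => text.includes(keyword)
        // orKeywords: SOME keyword is enough; default: EVERY keyword is required
        const matches = orKeywords ? keywords.some(isPresent) : keywords.every(isPresent)
        if (matches) matchingPages.push(index + 1)
    })
    return matchingPages
}

// Made-up page texts, for illustration only
const pageTexts = [
    'Invoice number 1001 for customer A',
    'Terms and conditions',
    'Invoice number 1002 for customer B'
]

console.log(findMatchingPages(pageTexts, ['Invoice number', 'customer B']))       // [ 3 ]
console.log(findMatchingPages(pageTexts, ['Invoice number', 'customer B'], true)) // [ 1, 3 ]
```

With the default, only page 3 contains both keywords. With orKeywords set, page 1 also matches because it contains one of them.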