1.1.2 • Published 12 months ago

text-from-pdf v1.1.2

License: Apache-2.0

PDF-TO-TEXT

A PDF-to-text wrapper that extracts text from PDF files. It works with both searchable PDFs and non-searchable (image-based) PDFs.


Installation

npm install text-from-pdf

Mac Users

brew install poppler

Linux Users

sudo apt-get update && sudo apt-get install poppler-utils

Windows Users

No additional installation required
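
On macOS or Linux, it can help to confirm that the poppler command-line tools are actually on the PATH before running a conversion. The following is a rough sketch, not part of text-from-pdf: the helper name `hasPopplerTool` is hypothetical, and it probes `pdftotext` (one of the binaries shipped with poppler) only as a representative, since this README does not say which poppler tool the package invokes.

```js
const { spawnSync } = require('child_process');

// Hypothetical helper: returns true if the given command can be launched.
// spawnSync sets `error` (for example ENOENT) when the binary is not found.
function hasPopplerTool(command = 'pdftotext') {
  const result = spawnSync(command, ['-v'], { stdio: 'ignore' });
  return !result.error;
}

if (!hasPopplerTool()) {
  console.warn('pdftotext not found on PATH; poppler may not be installed.');
}
```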

Usage

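1) Standard input PDF with horizontally aligned text:

   ```js
    // Assumes pdfToText is a named export of text-from-pdf; call it inside an async function.
    const { pdfToText } = require('text-from-pdf');
    const text = await pdfToText('<PATH_TO_PDF_FILE/fileName.pdf>');
    console.log(text);
   ```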
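
Because `pdfToText` is awaited, the call has to sit inside an async function, and extraction can fail (for example, if the file is missing or poppler is not installed). A minimal sketch of that wrapping follows. The helper `extractOrNull` is hypothetical and takes the extraction function as a parameter, so it makes no assumption about how the package exports `pdfToText`; it assumes only that the function returns a promise, as the `await` in the examples suggests.

```js
// Hypothetical wrapper: runs an async extractor and returns null on failure
// instead of letting the rejection propagate.
async function extractOrNull(extract, pdfPath, options) {
  try {
    return await extract(pdfPath, options);
  } catch (err) {
    console.error(`Could not extract text from ${pdfPath}: ${err.message}`);
    return null;
  }
}

// Usage, with pdfToText obtained from text-from-pdf:
// extractOrNull(pdfToText, '<PATH_TO_PDF_FILE/fileName.pdf>').then(console.log);
```

Returning `null` keeps a batch job running when one file fails; rethrow inside the `catch` instead if a failure should stop the whole run.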

2) Input PDFs with vertically aligned text:

   ```js
    const options = {
      rotationDegree: -90,
    };
    const text = await pdfToText('<PATH_TO_PDF_FILE/fileName.pdf>', options);
    console.log(text);
   ```

3) Text from first and second page:

   ```js  
    const options = {
       firstPageToConvert: 1,
       lastPageToConvert: 2,
    };
    const text = await pdfToText('<PATH_TO_PDF_FILE/fileName.pdf>', options);
    console.log(text);
   ```

4) Text from third to fifth page:

   ```js  
    const options = {
       firstPageToConvert: 3,
       lastPageToConvert: 5,
    };
    const text = await pdfToText('<PATH_TO_PDF_FILE/fileName.pdf>', options);
    console.log(text);
   ```
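
The two page-range examples generalize to walking a long document in fixed-size windows. The sketch below only builds the option objects. `pageRanges` is a hypothetical helper, and the total page count is assumed to be known to the caller, since this README does not show a way to query it.

```js
// Hypothetical helper: split pages 1..totalPages into windows of `size` pages,
// each expressed with the option names used in the page-range examples.
function pageRanges(totalPages, size) {
  if (!Number.isInteger(totalPages) || !Number.isInteger(size) || totalPages < 1 || size < 1) {
    throw new RangeError('totalPages and size must be positive integers');
  }
  const ranges = [];
  for (let first = 1; first <= totalPages; first += size) {
    ranges.push({
      firstPageToConvert: first,
      lastPageToConvert: Math.min(first + size - 1, totalPages),
    });
  }
  return ranges;
}

console.log(pageRanges(5, 2));
```

For a five-page file and a window of two pages, this yields the ranges 1 to 2, 3 to 4, and 5 to 5. Each object can then be passed as the `options` argument of `pdfToText`.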

5) Enable progress bar logging:

   ```js  
    const options = {
       firstPageToConvert: 1,
       lastPageToConvert: 1,
       enableProgressBarLogging: true
    };
    const text = await pdfToText('<PATH_TO_PDF_FILE/fileName.pdf>', options);
    console.log(text);
   ```    
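
The usage examples rely on four option names: `rotationDegree`, `firstPageToConvert`, `lastPageToConvert`, and `enableProgressBarLogging`. A small pre-flight check can catch a misspelled key before a long conversion starts. This is a sketch under assumptions: `checkOptions` is hypothetical, the type rules are inferred from the example values, and the package may accept options that this README does not show.

```js
// Option names and value types as they appear in this README's examples.
const KNOWN_OPTIONS = new Map([
  ['rotationDegree', 'number'],
  ['firstPageToConvert', 'number'],
  ['lastPageToConvert', 'number'],
  ['enableProgressBarLogging', 'boolean'],
]);

// Hypothetical helper: returns a list of human-readable problems (empty if none).
function checkOptions(options) {
  const problems = [];
  for (const [key, value] of Object.entries(options)) {
    const expected = KNOWN_OPTIONS.get(key);
    if (expected === undefined) {
      problems.push(`unknown option "${key}"`);
    } else if (typeof value !== expected) {
      problems.push(`"${key}" should be a ${expected}`);
    }
  }
  const { firstPageToConvert: first, lastPageToConvert: last } = options;
  if (typeof first === 'number' && typeof last === 'number' && first > last) {
    problems.push('firstPageToConvert is greater than lastPageToConvert');
  }
  return problems;
}

// The misspelled key and the inverted page range are both reported.
console.log(checkOptions({ firstPageToConvert: 3, lastPageToConvert: 1, rotateDegree: -90 }));
```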

Feature requests

Fork the repository, add your changes, and create a pull request.

1.1.2 • 12 months ago
1.1.1 • 2 years ago
1.1.0 • 2 years ago
1.0.10 • 2 years ago
1.0.9 • 2 years ago
1.0.8 • 2 years ago
1.0.7 • 2 years ago
1.0.6 • 2 years ago
1.0.5 • 2 years ago
1.0.4 • 2 years ago
1.0.3 • 2 years ago
1.0.2 • 2 years ago
1.0.1 • 2 years ago
1.0.0 • 2 years ago