1.1.1 • Published 1 year ago

pdf-parser-client-side v1.1.1

License
MIT

PDF Parser Client Side

A lightweight, easy-to-use package for extracting text from PDF files on the client side, with no server dependency.

How to Install

Install the package with npm or yarn:

npm i pdf-parser-client-side

or

yarn add pdf-parser-client-side

Include the package

import extractTextFromPDF from "pdf-parser-client-side";

variant Parameter

The variant parameter selects which cleanup is applied to the extracted text. Depending on its value, different classes of characters are removed or retained.

variant values:

  clean
    Description: Removes all non-ASCII characters and any spaces that follow them.
    Regular expression: /[^\x00-\x7F]+\ *(?:[^\x00-\x7F] | )*/g
    Retained characters: ASCII characters only

  alphanumeric
    Description: Retains only alphanumeric characters (letters and numbers).
    Regular expression: /[^a-zA-Z0-9]+/g
    Retained characters: A-Z, a-z, 0-9

  alphanumericwithspace
    Description: Retains alphanumeric characters and spaces.
    Regular expression: /[^a-zA-Z0-9 ]+/g
    Retained characters: A-Z, a-z, 0-9, space

  alphanumericwithspaceandpunctuation
    Description: Retains alphanumeric characters, spaces, and basic punctuation marks (.,!?).
    Regular expression: /[^a-zA-Z0-9 .,!?]+/g
    Retained characters: A-Z, a-z, 0-9, space, .,!?

  alphanumericwithspaceandpunctuationandnewline
    Description: Retains alphanumeric characters, spaces, basic punctuation marks (.,!?), and newlines.
    Regular expression: /[^a-zA-Z0-9 .,!?]+/g
    Retained characters: A-Z, a-z, 0-9, space, .,!? (the description also lists newlines, but the pattern shown here is the same as the previous variant's and does not exempt \n)
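
To make the effect of these patterns concrete, here is a minimal sketch that applies three of the listed regular expressions to a sample string. It is not the package's internal code: the helper name applyVariant, the VARIANT_PATTERNS lookup, and the choice to replace matches with an empty string are assumptions made for illustration only.

```javascript
// Hypothetical illustration: NOT the implementation of pdf-parser-client-side.
// The patterns are copied from the variant list above; replacing matches with ""
// is an assumption about how "removed" characters are handled.
const VARIANT_PATTERNS = {
  alphanumeric: /[^a-zA-Z0-9]+/g,
  alphanumericwithspace: /[^a-zA-Z0-9 ]+/g,
  alphanumericwithspaceandpunctuation: /[^a-zA-Z0-9 .,!?]+/g,
};

// Strip every run of characters that the chosen variant does not retain.
// Unknown variant names leave the text unchanged.
function applyVariant(text, variant) {
  const pattern = VARIANT_PATTERNS[variant];
  if (!pattern) return text;
  return text.replace(pattern, "");
}

const sample = "Hello, World! (v2)";
console.log(applyVariant(sample, "alphanumeric")); // HelloWorldv2
console.log(applyVariant(sample, "alphanumericwithspace")); // Hello World v2
console.log(applyVariant(sample, "alphanumericwithspaceandpunctuation")); // Hello, World! v2
```

Note that the alphanumeric variant also drops spaces, so words run together; pick one of the "withspace" variants if word boundaries matter.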

Example Usage

JavaScript

import React from "react";
import extractTextFromPDF from "pdf-parser-client-side";

export default function Test() {
  const handleFileChange = async (e, variant) => {
    const file = e.target.files?.[0];
    if (file) {
      try {
        const text = await extractTextFromPDF(file, variant);
        console.log("Extracted Text:", text);
      } catch (error) {
        console.error("Error extracting text from PDF:", error);
      }
    }
  };

  return (
    <div>
      <input
        type="file"
        name=""
        id="file-selector"
        accept=".pdf"
        onChange={(e) => handleFileChange(e, "clean")}
      />
    </div>
  );
}

TypeScript

import React from "react";
import extractTextFromPDF, { Variant } from "pdf-parser-client-side";

export default function Test() {
  const handleFileChange = async (
    e: React.ChangeEvent<HTMLInputElement>,
    variant: Variant
  ) => {
    const file = e.target.files?.[0];
    if (file) {
      try {
        const text = await extractTextFromPDF(file, variant);
        console.log("Extracted Text:", text);
      } catch (error) {
        console.error("Error extracting text from PDF:", error);
      }
    }
  };

  return (
    <div>
      <input
        type="file"
        name=""
        id="file-selector"
        accept=".pdf"
        onChange={(e) => handleFileChange(e, "clean")}
      />
    </div>
  );
}
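
In both examples, accept=".pdf" only filters what the browser's file picker offers; a user can still select another file type. A small pre-check before calling extractTextFromPDF can avoid handing it an obviously wrong file. The helper below is a hypothetical sketch, not part of the package, and it inspects only the file's reported name and MIME type, not its contents.

```javascript
// Hypothetical pre-check, not part of pdf-parser-client-side.
// Accepts anything shaped like a browser File object ({ name, type }).
function looksLikePdf(file) {
  if (!file) return false;
  // Browsers usually report this MIME type for PDFs, but it can be empty.
  const byType = file.type === "application/pdf";
  // Fall back to the extension, ignoring case.
  const byName =
    typeof file.name === "string" && file.name.toLowerCase().endsWith(".pdf");
  return byType || byName;
}

// Possible use inside handleFileChange, before extraction:
//   if (!looksLikePdf(file)) return;
console.log(looksLikePdf({ name: "report.PDF", type: "" })); // true
console.log(looksLikePdf({ name: "notes.txt", type: "text/plain" })); // false
```

Because the check relies on metadata alone, keeping the try/catch around extractTextFromPDF is still worthwhile for files that are mislabeled or corrupt.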

Contributing

Feel free to contribute!

  1. Fork the repository
  2. Make changes
  3. Submit a pull request

</> with 💛 by Vishwa Gaurav
