paper2web WIP

A simple Node.js package that converts PDF documents into structured HTML while preserving basic text formatting.

Features

Extracts text content from PDFs.
Preserves basic text formatting (bold, italic).
Extracts unordered lists into HTML lists.
Converts each page into structured HTML.
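To make the features above concrete, here is a minimal sketch of how styled text runs could be turned into HTML. It is not paper2web's actual implementation. It assumes input shaped like pdf2json's text runs, where T holds the already-decoded text and TS is [fontFaceId, fontSize, bold, italic]; the names escapeHtml, runToHtml and linesToHtml are hypothetical.

```javascript
// Sketch only: function names are hypothetical, not paper2web's API.
// Assumed input: each line is an array of runs shaped like pdf2json's,
// with run.T = decoded text and run.TS = [fontFaceId, fontSize, bold, italic].

// Escape characters that have special meaning in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Wrap one text run in <b>/<i> according to its style flags.
function runToHtml(run) {
  const [, , bold, italic] = run.TS;
  let html = escapeHtml(run.T);
  if (italic === 1) html = `<i>${html}</i>`;
  if (bold === 1) html = `<b>${html}</b>`;
  return html;
}

// Turn lines into paragraphs, grouping consecutive lines whose first run
// starts with a bullet marker followed by whitespace into a single <ul>.
function linesToHtml(lines) {
  const bullet = /^\s*[•\-*]\s+/;
  const out = [];
  let items = [];
  const flush = () => {
    if (items.length > 0) {
      out.push(`<ul>${items.map((item) => `<li>${item}</li>`).join("")}</ul>`);
      items = [];
    }
  };
  for (const runs of lines) {
    if (runs.length === 0) continue; // skip empty lines
    const [first, ...rest] = runs;
    if (bullet.test(first.T)) {
      // Strip the bullet marker from the first run only.
      const stripped = { ...first, T: first.T.replace(bullet, "") };
      items.push([stripped, ...rest].map(runToHtml).join(""));
    } else {
      flush();
      out.push(`<p>${runs.map(runToHtml).join("")}</p>`);
    }
  }
  flush();
  return out.join("\n");
}
```

A real converter would also need to group runs into lines by their position on the page, which this sketch leaves out.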

Installation

Install via npm:

npm install paper2web

Dependencies

This package relies on the following:

pdf2json
Node.js >= 22.14.0
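If you want to fail early on an older runtime, a small guard like the following could be placed at the start of your own script. This is an illustrative sketch, not part of paper2web; the helper name isAtLeast is hypothetical.

```javascript
// Sketch only: isAtLeast is a hypothetical helper, not part of paper2web.
// Compares two dotted numeric version strings such as "22.14.0".
function isAtLeast(current, minimum) {
  const a = current.split(".").map(Number);
  const b = minimum.split(".").map(Number);
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const x = a[i] ?? 0; // missing parts count as 0
    const y = b[i] ?? 0;
    if (x !== y) return x > y;
  }
  return true; // versions are equal
}

// process.versions.node has no leading "v", e.g. "22.14.0".
if (!isAtLeast(process.versions.node, "22.14.0")) {
  console.warn(
    `paper2web expects Node.js >= 22.14.0, found ${process.versions.node}`
  );
}
```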

Acknowledgments

This project uses pdf2json by modesty to extract data from PDF files for further processing.

Usage

import { convertPdfToHtml } from "paper2web";

const pdfPath = "path/to/input.pdf";

convertPdfToHtml(pdfPath)
  .then(() => console.log("Conversion successful!"))
  .catch((error) => console.error("Error:", error));
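When converting several files, it can help to collect failures instead of stopping at the first one. The sketch below assumes only what the example above shows, namely that convertPdfToHtml takes a path and returns a promise. The helper convertAll is hypothetical and takes the converter as an argument, so it does not depend on paper2web directly.

```javascript
// Sketch only: convertAll is a hypothetical helper, not part of paper2web.
// `convert` is any function that takes a path and returns a promise,
// for example convertPdfToHtml.
async function convertAll(paths, convert) {
  // The async wrapper turns synchronous throws into rejections too.
  const results = await Promise.allSettled(
    paths.map(async (path) => convert(path))
  );
  const failed = [];
  results.forEach((result, i) => {
    if (result.status === "rejected") {
      failed.push({ path: paths[i], reason: result.reason });
    }
  });
  return { converted: paths.length - failed.length, failed };
}
```

It could be called as convertAll(["a.pdf", "b.pdf"], convertPdfToHtml), after which the failed array lists each path that could not be converted together with its error.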

License

This project is licensed under the MIT License.

Contributing

Contributions are welcome! Feel free to open an issue or submit a pull request!

Future of this package

Add support for images
Add CLI support

Author

Developed by Malte Harms

