2.0.7 • Published 4 months ago
paper2web WIP
A simple Node.js package for converting PDF documents into structured HTML while preserving basic text formatting.
Features
- Extracts text content from PDFs.
- Preserves basic text formatting (bold, italic).
- Extracts unordered lists into HTML lists.
- Converts each page into structured HTML.
Installation
Install via npm:
npm install paper2web
Dependencies
This package relies on the following libraries:
- pdf2json
Acknowledgments
This project uses pdf2json by modesty to extract data from PDF files for further processing.
Usage
import { convertPdfToHtml } from "paper2web";

const pdfPath = "path/to/input.pdf";

convertPdfToHtml(pdfPath)
  .then(() => console.log("Conversion successful!"))
  .catch((error) => console.error("Error:", error));
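When several PDFs need converting, the promise returned by convertPdfToHtml can be awaited in a loop. The following is a minimal sketch, not part of the package: convertAll is a hypothetical helper, and it assumes only what the usage above shows, namely that the converter takes a file path and returns a promise that rejects on failure. The converter is passed in as an argument, so the helper does not depend on how paper2web is imported.

```javascript
// Hypothetical helper (not provided by paper2web): convert several PDFs in sequence.
// `convert` is any function that takes a path and returns a Promise,
// for example convertPdfToHtml imported from "paper2web".
async function convertAll(pdfPaths, convert) {
  const succeeded = [];
  const failed = [];

  for (const pdfPath of pdfPaths) {
    try {
      // Wait for each file to finish before starting the next one.
      await convert(pdfPath);
      succeeded.push(pdfPath);
    } catch (error) {
      // Record the failure and continue with the remaining files.
      failed.push({ pdfPath, error });
    }
  }

  return { succeeded, failed };
}

// Example call (inside an ES module, after importing convertPdfToHtml):
// const { succeeded, failed } = await convertAll(["a.pdf", "b.pdf"], convertPdfToHtml);
// console.log(`${succeeded.length} converted, ${failed.length} failed`);
```

Processing files one at a time keeps memory use predictable and makes it easy to see which input failed; if throughput matters more, Promise.allSettled over the mapped paths would be a reasonable alternative.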
License
This project is licensed under the MIT License.
Contributing
Contributions are welcome! Feel free to open an issue or submit a pull request!
Future of this package
- Add support for images
- Add CLI support
Author
Developed by Malte Harms