2.0.2 • Published 4 years ago

docx-images v2.0.2

License
ISC

docx-images extracts images from a docx file and crops them where the document crops them.

For example, if a docx file embeds image1.jpg at 100x100 pixels but the document crops it to 50x50 pixels, docx-images extracts both the original image and a cropped version.

Installation

npm install -g docx-images

Usage

To see the list of available commands, run docx-images with no arguments:

docx-images

Usage is docx-images command ...args
  Where command is one of ["extract", "list"]
  And ...args is zero or more arguments for that command

To view help for a command, run:

docx-images COMMAND_NAME help

list

Lists the images found in a docx file:

Usage is: docx-images list [options...]
  --docx: {string} The path to the docx file.
  --documentXml: {string} The path to an already-extracted document.xml file.
  --documentXmlRels: {string} The path to an already-extracted document.xml.rels file.

Sample output:

docx-images list --docx D:/sample.docx

Loading docx from D:/sample.docx

406 imageRels (image lookups - map an id to a target):
rId117 media/image61.jpeg
rId299 media/image164.jpeg
rId671 media/image358.jpeg
...

411 images (the order the images appear in the document):
embed= rId8 count= 1 target= media/image1.jpeg srcRect= { l: '', r: '', t: '', b: '' }
embed= rId9 count= 1 target= media/image2.jpeg srcRect= { l: '', r: '', t: '', b: '' } extent= { cx: '1396800', cy: '2664000' }
embed= rId16 count= 1 target= media/image3.jpeg srcRect= { l: '3006', r: '2248', t: '26301', b: '2472' }
embed= rId22 count= 1 target= media/image4.jpeg srcRect= { l: '', r: '', t: '19615', b: '5124' }
embed= rId23 count= 1 target= media/image5.jpeg srcRect= { l: '6529', r: '', t: '17028', b: '10010' }
embed= rId24 count= 1 target= media/image6.jpeg srcRect= { l: '', r: '', t: '12526', b: '8962' }
embed= rId25 count= 1 target= media/image7.jpeg srcRect= { l: '', r: '', t: '7139', b: '' } extent= { cx: '3600000', cy: '2494800' }
embed= rId26 count= 1 target= media/image8.jpeg srcRect= { l: '', r: '', t: '', b: '13243' }
...

The srcRect describes how the image is cropped. See comments in src/docx/document.xml.js and the getCropRect() function for more information.
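
As a rough illustration of how srcRect values like the ones above map to a pixel crop box, here is a minimal sketch. It is not the package's getCropRect() implementation; the function names cropRectFromSrcRect, toFraction and emuToCm are hypothetical. It assumes the l, r, t and b values are passed through unchanged from the document XML, where DrawingML expresses them in thousandths of a percent (so '50000' means 50% cropped from that edge), and that an empty string means no crop on that edge. The extent values (cx, cy) are in EMUs, of which there are 360000 per centimetre.

```javascript
// Hypothetical helpers for illustration only; not part of docx-images.

// Convert a srcRect string in 1/1000ths of a percent to a fraction (0..1).
// An empty or non-numeric value is treated as "no crop on this edge".
function toFraction(value) {
  const n = Number.parseInt(value, 10);
  return Number.isNaN(n) ? 0 : n / 100000;
}

// Compute the crop box in pixels for an image of the given pixel size.
// Each srcRect edge is an inset measured from that edge of the image.
function cropRectFromSrcRect(srcRect, width, height) {
  const left = width * toFraction(srcRect.l);
  const top = height * toFraction(srcRect.t);
  const right = width * toFraction(srcRect.r);
  const bottom = height * toFraction(srcRect.b);
  return {
    left,
    top,
    width: width - left - right,
    height: height - top - bottom,
  };
}

// Convert an extent value in EMUs to centimetres (360000 EMU per cm).
function emuToCm(emu) {
  return Number(emu) / 360000;
}

// A 100x100 image with the right and bottom halves cropped away.
console.log(cropRectFromSrcRect({ l: '', r: '50000', t: '', b: '50000' }, 100, 100));
// → { left: 0, top: 0, width: 50, height: 50 }

console.log(emuToCm('3600000')); // → 10
```

The sketch ignores negative srcRect values, which DrawingML permits when the displayed area extends beyond the image, and it does not round to whole pixels.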

extract

docx-images extract help
Usage is: docx-images extract [options...]
  --docx: {string} The path to the docx file.
  --output-dir: {string} Output folder for the extracted file(s). [Default "."]
  --image-prefix: {string} A prefix to use for the extracted image name(s). [Default ""]
  --opts-module: {string} A CommonJS module path providing custom options code.

Sample output:

docx-images extract --docx D:/sample.docx --output-dir D:/test-folder --image-prefix MY_PREFIX_

...
rId608 word/media/image316.jpeg D:\test-folder\word\media\MY_PREFIX_image316.jpeg image was not cropped
rId697 word/media/image384.jpeg D:\test-folder\word\media\MY_PREFIX_image384.jpeg image was not cropped
rId696 word/media/image383.jpeg D:\test-folder\word\media\MY_PREFIX_image383.jpeg image was not cropped
rId147 word/media/image79.jpeg image was cropped to:
  [0] D:\test-folder\word\media\MY_PREFIX_image79.crop.1.jpeg,
  old-size: {"left":0,"top":0,"width":997,"height":744},
  new-size: {"left":0,"top":112.17287999999999,"width":997,"height":631.82712}
...

Custom Options

Custom options, such as callbacks, can be passed to docx-images using the --opts-module argument. For example:

docx-images extract --docx src/test/100x100.mixed.docx --output-dir .temp --opts-module PATH_TO/optsModule.sample.js

This module should export an object of the following shape:

module.exports = {
  ignoreFallbackImages,
  makeEntryImagePath,
  makeCroppedImagePath,
};

See src/test/optsModule.sample.js for an example.

Report

After extraction, docx-images generates a docx-images.report.html report that shows the extracted images and the crop box for each cropped image.

Development

Building

npm run build or npm run build:watch

Linting

npm run lint or npm run lint:watch

Testing

npm run test or npm run test:watch

Notes