ms-single-file-cli v0.1.0
SingleFile CLI (Command Line Interface)
Introduction
SingleFile can be launched from the command line by running it in a (headless) browser. It runs through Node.js as a standalone script injected into the web page instead of being embedded in a WebExtension. To connect to the browser, it can use Puppeteer or Selenium WebDriver. Alternatively, it can emulate a browser with JavaScript disabled by using jsdom.
Installation with Docker
- Installation from Docker Hub

    docker pull capsulecode/singlefile
    docker tag capsulecode/singlefile singlefile

- Manual installation

    git clone --depth 1 --recursive https://github.com/gildas-lormeau/single-file-cli.git
    cd single-file-cli
    docker build --no-cache -t singlefile .

- Run

    docker run singlefile "https://www.wikipedia.org"

- Run and redirect the result into a file

    docker run singlefile "https://www.wikipedia.org" > wikipedia.html

- Run and mount a volume to get the saved file in the current directory

  - Save one page

    Windows:

        docker run -v %cd%:/usr/src/app/out singlefile "https://www.wikipedia.org" wikipedia.html

    Linux/UNIX:

        docker run -v $(pwd):/usr/src/app/out singlefile "https://www.wikipedia.org" wikipedia.html

  - Save one or multiple pages by using the filename template (see the --filename-template option)

    Windows:

        docker run -v %cd%:/usr/src/app/out singlefile "https://www.wikipedia.org" --dump-content=false

    Linux/UNIX:

        docker run -v $(pwd):/usr/src/app/out singlefile "https://www.wikipedia.org" --dump-content=false

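The Linux/UNIX volume-mount command above can be wrapped in a small helper. This is only a sketch: it assumes a POSIX shell and the locally tagged singlefile image, and the function name `singlefile_docker_cmd` is hypothetical. It prints the command line instead of running it, so the result can be checked before use.

```shell
#!/bin/sh
# Hypothetical helper: prints the docker command that saves one page into
# the current directory. It only prints the command; nothing is executed.
singlefile_docker_cmd() {
    url=$1        # page to save
    output=$2     # file name written to the mounted /usr/src/app/out folder
    printf 'docker run -v "%s:/usr/src/app/out" singlefile "%s" %s\n' \
        "$(pwd)" "$url" "$output"
}

singlefile_docker_cmd "https://www.wikipedia.org" wikipedia.html
```

The printed line can then be pasted into the terminal once the mount path looks right.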
An alternative Dockerfile is available at https://github.com/screenbreak/SingleFile-dockerized. It lets you save pages from the command line or through an HTTP server.
Manual installation
- Make sure Chrome or Firefox is installed and the executable can be found through the PATH environment variable. Otherwise you will need to set the --browser-executable-path option to help SingleFile locate it. As an alternative to Chrome and Firefox, you can use jsdom by setting the --back-end option to jsdom.

- Install Node.js

- There are 3 ways to download the code of SingleFile; choose the one you prefer (npm is installed with Node.js):

  - Download and install globally with npm

        npm install -g "single-file-cli"

  - Download and manually unzip the master archive provided by GitHub

        unzip master.zip
        cd single-file-cli-master
        npm install

  - Download with git

        git clone --depth 1 --recursive https://github.com/gildas-lormeau/single-file-cli.git
        cd single-file-cli
        npm install

- Make single-file executable (Linux/Unix/BSD etc.) if SingleFile is not installed globally.

    chmod +x single-file

- To use Firefox instead of Chrome, you must download the Selenium WebDriver component (i.e. geckodriver for Firefox). Make sure it can be found through the PATH environment variable or the cli folder. Otherwise you will need to set the --web-driver-executable-path option to help WebDriver locate the executable.

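Before running SingleFile, it can help to check what is actually reachable through PATH. The sketch below does that with the standard `command -v`. The function name `find_first_on_path` is hypothetical, and the browser executable names are assumptions that vary by platform and distribution.

```shell
#!/bin/sh
# Print the path of the first candidate command found through PATH.
# Returns a non-zero status when none of the candidates is found.
find_first_on_path() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            command -v "$candidate"
            return 0
        fi
    done
    return 1
}

# Assumed executable names; adjust them to your system.
if browser=$(find_first_on_path google-chrome chromium chromium-browser firefox); then
    echo "Browser found: $browser"
else
    echo "No browser found through PATH; set --browser-executable-path"
fi

# geckodriver is only needed when using Firefox.
if driver=$(find_first_on_path geckodriver); then
    echo "geckodriver found: $driver"
else
    echo "geckodriver not found through PATH; set --web-driver-executable-path to use Firefox"
fi
```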
Run
Syntax

    single-file <url> [output] [options ...]

Display help

    single-file --help

Examples

- Dump the processed content of https://www.wikipedia.org into the console

    single-file https://www.wikipedia.org --dump-content

- Save https://www.wikipedia.org into wikipedia.html in the current folder

    single-file https://www.wikipedia.org wikipedia.html

- Save https://www.wikipedia.org into wikipedia.html in the current folder with Firefox instead of Chrome

    single-file https://www.wikipedia.org wikipedia.html --back-end=webdriver-gecko

- Save a list of URLs stored in list-urls.txt in the current folder

    single-file --urls-file=list-urls.txt

- Save https://www.wikipedia.org and crawl its internal links with the query parameters removed from the URL

    single-file https://www.wikipedia.org --crawl-links=true --crawl-inner-links-only=true --crawl-max-depth=1 --crawl-rewrite-rule="^(.*)\\?.*$ $1"

- Save https://www.wikipedia.org and external links only

    single-file https://www.wikipedia.org --crawl-links=true --crawl-inner-links-only=false --crawl-external-links-max-depth=1 --crawl-rewrite-rule="^.*wikipedia.*$"

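The rewrite rule in the first crawl example consists of a pattern and a replacement separated by a space. To see what that pattern does to a URL, the sketch below applies an equivalent substitution with `sed -E`. It assumes that sed's extended regular expressions treat this simple pattern the same way SingleFile does; the function name `strip_query` and the sample URLs are made up for illustration.

```shell
#!/bin/sh
# Emulate the rule "^(.*)\?.*$ $1": keep only what precedes the query string.
strip_query() {
    printf '%s\n' "$1" | sed -E 's/^(.*)\?.*$/\1/'
}

# The query string is dropped.
strip_query "https://www.wikipedia.org/wiki/Example?action=history"
# A URL without a query string does not match and is left unchanged.
strip_query "https://www.wikipedia.org/wiki/Example"
```

Both calls print https://www.wikipedia.org/wiki/Example, which is why a crawl using such a rule treats URLs that differ only in their query parameters as the same page.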
Troubleshooting
- If the error message

    UnhandledPromiseRejectionWarning: Error: Browser is not downloaded. Run "npm install" or "yarn install" at ChromeLauncher.launch

  is displayed, it probably means that single-file was not able to find the executable of the browser. Passing the complete path of the executable to single-file with the --browser-executable-path option fixes this issue.

- If saving a page takes an unusually long time, this may be due to a timeout error that was automatically recovered. Setting --browser-wait-until to a less strict value (e.g. networkidle2 or load instead of networkidle0) fixes this issue.

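Both remedies can be combined on one command line. The sketch below prints such a command without running it. The function name `singlefile_retry_cmd` and the path /usr/bin/chromium are placeholders, and the `--option=value` spelling is assumed from the examples in the Run section.

```shell
#!/bin/sh
# Hypothetical helper: prints a single-file command line that passes the
# browser path explicitly and waits only for the load event.
# It prints the command instead of running it.
singlefile_retry_cmd() {
    url=$1
    output=$2
    browser_path=$3   # complete path of the browser executable
    printf 'single-file "%s" %s --browser-executable-path="%s" --browser-wait-until=load\n' \
        "$url" "$output" "$browser_path"
}

# /usr/bin/chromium is a placeholder; substitute the real path on your system.
singlefile_retry_cmd "https://www.wikipedia.org" wikipedia.html /usr/bin/chromium
```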
License
SingleFile is licensed under AGPL. Code derived from third-party projects is licensed under MIT. Please contact me at gildas.lormeau <at> gmail.com if you are interested in licensing the SingleFile code for a commercial service or product.