Dextractor

Dextractor is a tool that extracts the links found at a given URL, saves them to a text file, and downloads the files behind those links that point directly to a file.

Installation

This assumes that Node.js is already installed on your machine.

1. Simply cd to the root of your NodeJS project and run:

npm install dextractor

2. Done!

Usage

1. Simply require the module inside your project:

const dextractor = require("dextractor");

Or use ES module import syntax, if your setup supports it:

import dextractor from "dextractor";

2. There are four methods available:

> saveLinks(url, path?, callback?)

This method saves only the links found at the given URL to a text file at `[path]/[URL with forward slashes replaced by underscores]/links/links.txt`, where `[path]` is the path you specify, or `./export` if you specify none.

Parameters:

Parameter	Type	Description
url	String	The URL that you wish to perform the dextraction on.
path	String	Optional. The directory in which the downloaded files and/or saved links are stored. If no path is specified, `./export/` is used.
callback	Function	Optional. The function to execute once the dextraction is done. If no function is specified, `Done!` is printed to the console when the job finishes.

Example:

const dextractor = require("dextractor");

dextractor.saveLinks("https://example.com", "./example", () => {
  console.log("Alright!");
});

> downloadFiles(url, path?, callback?)

This method downloads only the files behind the links found at the given URL and saves them in a zip file at `[path]/[URL with forward slashes replaced by underscores]/files.zip`, where `[path]` is the path you specify, or `./export` if you specify none.

Parameters:

Parameter	Type	Description
url	String	The URL that you wish to perform the dextraction on.
path	String	Optional. The directory in which the downloaded files and/or saved links are stored. If no path is specified, `./export/` is used.
callback	Function	Optional. The function to execute once the dextraction is done. If no function is specified, `Done!` is printed to the console when the job finishes.

Example:

const dextractor = require("dextractor");

dextractor.downloadFiles("https://example.com", "./example", () => {
  console.log("Alright!");
});

> saveLinksAndDownloadFiles(url, path?, callback?)

This method does the whole job of dextraction. It first saves the links found at the given URL to a text file at `[path]/[URL with forward slashes replaced by underscores]/links/links.txt`, then downloads the files behind those links and saves them in a zip file at `[path]/[URL with forward slashes replaced by underscores]/files.zip`. In both cases `[path]` is the path you specify, or `./export` if you specify none.

Parameters:

Parameter	Type	Description
url	String	The URL that you wish to perform the dextraction on.
path	String	Optional. The directory in which the downloaded files and/or saved links are stored. If no path is specified, `./export/` is used.
callback	Function	Optional. The function to execute once the dextraction is done. If no function is specified, `Done!` is printed to the console when the job finishes.

Example:

const dextractor = require("dextractor");

dextractor.saveLinksAndDownloadFiles("https://example.com", "./example", () => {
  console.log("Alright!");
});
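To find the output afterwards, the layout described above can be sketched as a small helper. `expectedOutputPaths` is a hypothetical function, not part of dextractor, and it assumes that every forward slash in the URL is replaced by an underscore and nothing else is changed. The folder name the library actually produces may differ, so treat the result as a guess to check against your export directory.

```javascript
const path = require("path");

// Hypothetical helper (not part of dextractor): guesses where the output
// lands, based only on the layout described in this README.
// Assumption: every "/" in the URL becomes "_" and nothing else changes.
function expectedOutputPaths(url, basePath = "./export") {
  const folder = url.replace(/\//g, "_");
  // POSIX-style joins keep the result the same on every platform.
  const root = path.posix.join(basePath, folder);
  return {
    links: path.posix.join(root, "links", "links.txt"),
    zip: path.posix.join(root, "files.zip"),
  };
}

const paths = expectedOutputPaths("https://example.com", "./example");
console.log(paths.links);
console.log(paths.zip);
```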

> getLinks(url, callback?)

This method only extracts the links found at the given URL and hands them to your callback as an array. Declare a parameter on your callback and it will receive the array of extracted links.

Parameters:

Parameter	Type	Description
url	String	The URL that you wish to perform the dextraction on.
callback	Function	Optional. The function to execute once the dextraction is done. Note: declare a parameter on your callback in order to access the array of extracted links inside it.

Example:

const dextractor = require("dextractor");

dextractor.getLinks("https://example.com", links => {
  links.forEach(each => {
    console.log(each);
  });
});
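If you prefer async/await over callbacks, any of the four methods can be wrapped in a Promise. The sketch below is one way to do that. `runAsPromise` is a hypothetical helper, not part of dextractor, and it assumes the method takes its callback as the last argument, as in the signatures above. This README does not document how errors are reported, so the Promise only rejects when the method throws synchronously.

```javascript
// Hypothetical helper (not part of dextractor): turns a callback-style
// method into a Promise that resolves with whatever the callback receives.
function runAsPromise(method, ...args) {
  return new Promise((resolve, reject) => {
    try {
      // The callback goes last, matching method(url, path?, callback?).
      method(...args, resolve);
    } catch (err) {
      // Only synchronous throws are caught here (see the caveat above).
      reject(err);
    }
  });
}

// Intended usage, assuming dextractor is installed:
//
//   const dextractor = require("dextractor");
//   const links = await runAsPromise(dextractor.getLinks, "https://example.com");
//   await runAsPromise(dextractor.saveLinks, "https://example.com", "./example");
```

For the methods that accept a path, pass it explicitly when using this wrapper, since the wrapper always appends the callback as the final argument.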

Note 1: If you want code to run only after the dextraction has finished, put it inside a function and pass that function as the callback to any of the available methods.

Note 2: The download feature (the downloadFiles and saveLinksAndDownloadFiles methods) works only with direct, static download links such as https://example.com/image.png. Dynamic download links will not work.

Note 3: The link extraction feature (the saveLinks and saveLinksAndDownloadFiles methods) works properly only on static websites or static file servers. On a dynamic website you might get links to the website's own files instead.
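To preview which links are likely to be downloadable under Note 2, you can filter the array from getLinks yourself. The sketch below uses a simple heuristic: a link counts as direct if the last segment of its URL path has a file extension. `looksLikeDirectFileLink` is a hypothetical helper, and this is not necessarily the check dextractor performs internally.

```javascript
const path = require("path");

// Hypothetical helper (not part of dextractor): returns true when the last
// path segment of the URL has a file extension, e.g. ".png" or ".zip".
function looksLikeDirectFileLink(link) {
  let parsed;
  try {
    parsed = new URL(link);
  } catch {
    return false; // relative or malformed links are treated as non-direct
  }
  const lastSegment = parsed.pathname.split("/").pop();
  return path.posix.extname(lastSegment) !== "";
}

// Sample input standing in for the array that getLinks passes to its callback.
const sampleLinks = [
  "https://example.com/image.png",
  "https://example.com/download?id=3",
  "https://example.com/docs/",
];
console.log(sampleLinks.filter(looksLikeDirectFileLink));
```

The heuristic looks only at the URL path, so a dynamic link whose path happens to end in an extension would still pass the filter.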

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License

MIT

