2.0.2 • Published 4 years ago

dextractor v2.0.2

Dextractor

Dextractor is a tool that extracts the links found at a given URL, saves them to a text file, and downloads the files from those links that actually point to a file.

Installation

This assumes that Node.js is already installed on your machine.

1. cd to the root of your Node.js project and run:

npm install dextractor

2. Done!


Usage

1. Simply require the module inside your project:

const dextractor = require("dextractor");

Or use ES module syntax (experimental in older Node.js versions) if you wish:

import dextractor from "dextractor";

2. Four methods are available:

> saveLinks(url, path?, callback?)

This method only saves the links found at the given URL to a text file at [path that you specify, or ./export if you don't specify one]/[the given URL with forward slashes replaced by underscores]/links/links.txt.

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| url | String | The URL that you wish to perform the dextraction on. |
| path | String | Optional. The path in which to save the downloaded files and/or saved links. If no path is specified, ./export/ is used. |
| callback | Function | Optional. The function to execute after the dextraction is done. If no function is specified, Done! is printed to the console when the job finishes. |

Example:

const dextractor = require("dextractor");

dextractor.saveLinks("https://example.com", "./example", () => {
  console.log("Alright!");
});

> downloadFiles(url, path?, callback?)

This method only downloads the files that the links at the given URL point to and saves them in a zip file at [path that you specify, or ./export if you don't specify one]/[the given URL with forward slashes replaced by underscores]/files.zip.

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| url | String | The URL that you wish to perform the dextraction on. |
| path | String | Optional. The path in which to save the downloaded files and/or saved links. If no path is specified, ./export/ is used. |
| callback | Function | Optional. The function to execute after the dextraction is done. If no function is specified, Done! is printed to the console when the job finishes. |

Example:

const dextractor = require("dextractor");

dextractor.downloadFiles("https://example.com", "./example", () => {
  console.log("Alright!");
});

> saveLinksAndDownloadFiles(url, path?, callback?)

This method does the whole job of dextraction. It first saves the links found at the given URL to a text file at [path that you specify, or ./export if you don't specify one]/[the given URL with forward slashes replaced by underscores]/links/links.txt, then downloads the files those links point to and saves them in a zip file at [the same path]/[the same URL folder]/files.zip.

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| url | String | The URL that you wish to perform the dextraction on. |
| path | String | Optional. The path in which to save the downloaded files and/or saved links. If no path is specified, ./export/ is used. |
| callback | Function | Optional. The function to execute after the dextraction is done. If no function is specified, Done! is printed to the console when the job finishes. |

Example:

const dextractor = require("dextractor");

dextractor.saveLinksAndDownloadFiles("https://example.com", "./example", () => {
  console.log("Alright!");
});
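The output locations described for the three methods above can be sketched as follows. This is only an illustration: sketchOutputPaths is a hypothetical helper, not part of dextractor, and it assumes the URL folder name is produced by replacing every forward slash with an underscore and nothing else. The library may sanitise the name differently.

```javascript
const path = require("path");

// Hypothetical helper, NOT part of dextractor's API.
// Assumption: the per-URL folder is the URL with every "/" replaced by "_".
function sketchOutputPaths(url, basePath = "./export") {
  const folderName = url.replace(/\//g, "_");
  const root = path.join(basePath, folderName);
  return {
    // Where saveLinks is documented to write the link list.
    linksFile: path.join(root, "links", "links.txt"),
    // Where downloadFiles is documented to write the archive.
    zipFile: path.join(root, "files.zip"),
  };
}

const paths = sketchOutputPaths("https://example.com/files", "./example");
console.log(paths.linksFile);
console.log(paths.zipFile);
```

A helper like this could be handy for locating the results inside a callback, provided the assumption about the folder name holds for your URL.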

> getLinks(url, callback?)

This method only gives you an array of the links extracted from the given URL. Declare a parameter on your callback and it will receive that array.

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| url | String | The URL that you wish to perform the dextraction on. |
| callback | Function | Optional. The function to execute after the dextraction is done. Note: declare a parameter on your callback in order to access the array of extracted links inside it. |

Example:

const dextractor = require("dextractor");

dextractor.getLinks("https://example.com", links => {
  links.forEach(each => {
    console.log(each);
  });
});
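Since every method reports completion through a callback, one possible way to sequence several steps is to wrap a method in a Promise and use async/await. The sketch below is not part of dextractor: toPromise and fakeGetLinks are hypothetical names, and fakeGetLinks is a stand-in that only mimics the getLinks(url, callback) shape so the example runs without network access. It assumes the callback is invoked exactly once with at most one argument and no error argument.

```javascript
// Hypothetical helper, NOT part of dextractor: wraps a callback-last
// function so that it returns a Promise instead.
// Assumption: the callback fires once, with at most one result argument.
function toPromise(method) {
  return (...args) =>
    new Promise(resolve => {
      method(...args, result => resolve(result));
    });
}

// Stand-in with the same shape as getLinks(url, callback).
// It fabricates two links after a short delay instead of fetching anything.
function fakeGetLinks(url, callback) {
  setTimeout(() => callback([`${url}/a.png`, `${url}/b.pdf`]), 10);
}

async function main() {
  const getLinksAsync = toPromise(fakeGetLinks);
  // Execution resumes here only after the callback has fired.
  const links = await getLinksAsync("https://example.com");
  console.log(links.length, "links extracted");
  return links;
}

const done = main();
```

With the real module, something like toPromise((url, cb) => dextractor.getLinks(url, cb)) might be used in place of the stand-in. Because this sketch has no error channel, a failed dextraction would leave the Promise pending.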

Note 1: If you want code to run only after the dextraction has finished (that is, in sequence), put that code inside a function and pass it as the callback to any of the available methods.

Note 2: The downloading feature (the downloadFiles and saveLinksAndDownloadFiles methods) only works on direct, static download links, e.g. https://example.com/image.png. Dynamic download links will not work.

Note 3: The link extraction feature (the saveLinks and saveLinksAndDownloadFiles methods) works properly only on static websites or static file servers. On a dynamic website you might get links to the website's own internal files instead.
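To illustrate the distinction drawn in Note 2, here is a rough heuristic for telling a direct file link from a dynamic one by checking whether the URL path ends in a file extension. looksLikeDirectFileLink is a hypothetical name, and this is not necessarily how dextractor decides what to download. It could be used to pre-filter the array from getLinks.

```javascript
const path = require("path");

// Hypothetical heuristic, NOT dextractor's actual check.
// Treats a link as "direct" when its URL path ends in a file extension.
function looksLikeDirectFileLink(link) {
  let pathname;
  try {
    // The query string and fragment are excluded from pathname.
    pathname = new URL(link).pathname;
  } catch {
    return false; // not a parseable absolute URL
  }
  return path.posix.extname(pathname) !== "";
}

const candidates = [
  "https://example.com/image.png",
  "https://example.com/download?id=42",
  "https://example.com/docs/",
];

console.log(candidates.filter(looksLikeDirectFileLink));
```

The heuristic is deliberately simple: a link such as https://example.com/download.php would pass even though it is served dynamically, so treat the result as a hint rather than a guarantee.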


Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.


License

MIT
