1.0.11 • Published 2 years ago

@disane-dev/private-ci-test v1.0.11

License: MIT
Repository: GitHub

Document scraper for getting invoices automagically as PDF (useful for taxes or a DMS)

🏠 Homepage

Prerequisites

  • npm >=9.1.2
  • node >=18.12.1

Configuration

All settings can be changed via CLI flags or environment variables (the latter also work when using Docker).

| Setting | Description | Default value |
| --- | --- | --- |
| AMAZON_USERNAME | Your Amazon username | null |
| AMAZON_PASSWORD | Your Amazon password | null |
| AMAZON_TLD | Amazon top level domain | de |
| AMAZON_YEAR_FILTER | Only extracts invoices from this year (e.g. 2023) | 2023 |
| AMAZON_PAGE_FILTER | Only extracts invoices from this page (e.g. 2) | null |
| AMAZON_ONLY_NEW | Tracks already scraped documents and starts a new run at the last scraped one | true |
| FILE_DESTINATION_FOLDER | Destination path for all scraped documents | ./documents/ |
| FILE_FALLBACK_EXTENSION | Fallback extension when no extension can be determined | .pdf |
| DEBUG | Debug flag (sets the log level to DEBUG) | false |
| SUBFOLDER_FOR_PAGES | Creates subfolders for every scraped page/plugin | false |
| LOG_PATH | Sets the log path | ./logs/ |
| LOG_LEVEL | Log level (see https://github.com/winstonjs/winston#logging-levels) | info |
| RECURRING | Flag for executing the script periodically. Requires RECURRING_PATTERN to be set. Defaults to true when using the Docker container | false |
| RECURRING_PATTERN | Cron pattern for periodic execution. Requires RECURRING to be true | */30 * * * * |
| TZ | Timezone used for Docker environments | Europe/Berlin |
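
The documented defaults can be mirrored in a small wrapper script. The following is a minimal sketch, assuming the tool picks these variables up from the environment as described above; it only assigns a default where no value is set yet and prints a few of the resolved values. It does not call docudigger itself.

```shell
#!/bin/sh
# Assign the documented default only where the environment has no value yet.
: "${AMAZON_TLD:=de}"
: "${AMAZON_YEAR_FILTER:=2023}"
: "${AMAZON_ONLY_NEW:=true}"
: "${FILE_DESTINATION_FOLDER:=./documents/}"
: "${FILE_FALLBACK_EXTENSION:=.pdf}"
: "${LOG_PATH:=./logs/}"
: "${RECURRING:=false}"
: "${RECURRING_PATTERN:=*/30 * * * *}"

# Export the values so that a child process can read them.
export AMAZON_TLD AMAZON_YEAR_FILTER AMAZON_ONLY_NEW \
  FILE_DESTINATION_FOLDER FILE_FALLBACK_EXTENSION \
  LOG_PATH RECURRING RECURRING_PATTERN

# Show a few resolved settings; quoting keeps the '*' characters literal.
printf 'AMAZON_TLD=%s\n' "$AMAZON_TLD"
printf 'FILE_DESTINATION_FOLDER=%s\n' "$FILE_DESTINATION_FOLDER"
printf 'RECURRING_PATTERN=%s\n' "$RECURRING_PATTERN"
```

A value exported before the script runs, for example `AMAZON_TLD=com`, takes precedence over the default assigned here.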

Install

npm install

Usage

$ npm install -g @disane-dev/private-ci-test
$ docudigger COMMAND
running command...
$ docudigger (--version)
@disane-dev/private-ci-test/1.0.11 linux-x64 node-v18.16.1
$ docudigger --help [COMMAND]
USAGE
  $ docudigger COMMAND
...

docudigger scrape all

Scrapes all websites periodically (default for docker environment)

USAGE
  $ docudigger scrape all [--json] [--logLevel trace|debug|info|warn|error] [-d] [-l <value>] [-c <value> -r]

FLAGS
  -c, --recurringCron=<value>  [default: * * * * *] Cron pattern to execute periodically
  -d, --debug
  -l, --logPath=<value>        [default: ./logs/] Log path
  -r, --recurring
  --logLevel=<option>          [default: info] Specify level for logging.
                               <options: trace|debug|info|warn|error>

GLOBAL FLAGS
  --json  Format output as json.

DESCRIPTION
  Scrapes all websites periodically

EXAMPLES
  $ docudigger scrape all
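
The usage line groups `-c` with `-r`, which suggests the cron pattern is only accepted together with the recurring flag. Below is a minimal sketch of a recurring invocation. The `is_five_field_cron` helper is hypothetical and only checks for the five-field form used by the documented defaults; whether the tool accepts other forms is not stated here. The command is printed rather than executed so the sketch runs without docudigger installed.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the pattern has exactly five fields.
is_five_field_cron() {
  set -f        # disable globbing so '*' is not expanded into file names
  set -- $1     # split the pattern on whitespace into positional parameters
  set +f
  [ "$#" -eq 5 ]
}

pattern='*/30 * * * *'   # every 30 minutes

if is_five_field_cron "$pattern"; then
  # The pattern must be quoted so the shell passes it as a single argument.
  printf 'docudigger scrape all -r -c "%s"\n' "$pattern"
else
  printf 'unexpected cron pattern: %s\n' "$pattern" >&2
fi
```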

docudigger scrape amazon

Used to get invoices from amazon

USAGE
  $ docudigger scrape amazon -u <value> -p <value> [--json] [--logLevel trace|debug|info|warn|error] [-d] [-l
    <value>] [-c <value> -r] [--fileDestinationFolder <value>] [--fileFallbackExentension <value>] [-t <value>]
    [--yearFilter <value>] [--pageFilter <value>] [--onlyNew]

FLAGS
  -c, --recurringCron=<value>        [default: * * * * *] Cron pattern to execute periodically
  -d, --debug
  -l, --logPath=<value>              [default: ./logs/] Log path
  -p, --password=<value>             (required) Password
  -r, --recurring
  -t, --tld=<value>                  [default: de] Amazon top level domain
  -u, --username=<value>             (required) Username
  --fileDestinationFolder=<value>    [default: ./data/] Destination path for all scraped documents
  --fileFallbackExentension=<value>  [default: .pdf] Fallback extension when no extension can be determined
  --logLevel=<option>                [default: info] Specify level for logging.
                                     <options: trace|debug|info|warn|error>
  --onlyNew                          Gets only new invoices
  --pageFilter=<value>               Filters a page
  --yearFilter=<value>               Filters a year

GLOBAL FLAGS
  --json  Format output as json.

DESCRIPTION
  Used to get invoices from amazon

  Scrapes amazon invoices

EXAMPLES
  $ docudigger scrape amazon
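
Since `-u` and `-p` are marked as required, a fuller invocation may help. The following is a minimal sketch using only flags listed above. The `print_amazon_command` helper and the credentials are hypothetical, and the command is printed with the password masked rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: assemble a 'scrape amazon' call and print it
# with the value after -p masked, so it is safe to log.
print_amazon_command() {
  user=$1
  password=$2
  # Collect the documented flags as positional parameters.
  set -- scrape amazon -u "$user" -p "$password" \
    -t de --yearFilter 2023 --onlyNew

  printf 'docudigger'
  prev=''
  for arg in "$@"; do
    if [ "$prev" = "-p" ]; then
      printf ' %s' '***'      # never echo the password itself
    else
      printf ' %s' "$arg"
    fi
    prev=$arg
  done
  printf '\n'
}

# Placeholder credentials for illustration only.
print_amazon_command 'user@example.com' 'not-a-real-password'
```

To actually run the scraper, the printing loop would be replaced by `docudigger "$@"`.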

Docker

docker run \
  -e AMAZON_USERNAME='[YOUR MAIL]' \
  -e AMAZON_PASSWORD='[YOUR PW]' \
  -e AMAZON_TLD='de' \
  -e AMAZON_YEAR_FILTER='2020' \
  -e AMAZON_PAGE_FILTER='1' \
  -e LOG_LEVEL='info' \
  -v "C:/temp/docudigger/:/home/node/docudigger" \
  ghcr.io/disane87/docudigger
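
As an alternative to repeating `-e`, the settings can be kept in a file and passed with docker's `--env-file` option. This is a minimal sketch: the file name `docudigger.env` and the host path `$PWD/documents` are assumptions, and the docker command is printed rather than executed.

```shell
#!/bin/sh
# Write the container settings to an env file (hypothetical file name).
# docker reads each line literally as KEY=value, so values are not quoted.
env_file=docudigger.env
cat > "$env_file" <<'EOF'
AMAZON_USERNAME=[YOUR MAIL]
AMAZON_PASSWORD=[YOUR PW]
AMAZON_TLD=de
AMAZON_YEAR_FILTER=2020
AMAZON_PAGE_FILTER=1
LOG_LEVEL=info
RECURRING_PATTERN=*/30 * * * *
TZ=Europe/Berlin
EOF

# The file holds credentials, so restrict it to the current user.
chmod 600 "$env_file"

# Print the equivalent docker invocation instead of running it.
printf 'docker run --env-file %s -v "%s:/home/node/docudigger" ghcr.io/disane87/docudigger\n' \
  "$env_file" "$PWD/documents"
```

Keeping the credentials in a file with restricted permissions also keeps them out of the shell history.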

Dev-Time 🪲

NPM

npm install
[Adjust the created .env file to your needs]
npm run start

Author

👤 Marco Franke

🤝 Contributing

Contributions, issues and feature requests are welcome! Feel free to check the issues page. You can also take a look at the contributing guide.

Show your support

Give a ⭐️ if this project helped you!


This README was generated with ❤️ by readme-md-generator
