dap-cli v1.0.7-6
Data Access Platform CLI
CLI for the Data Access Platform service
Installing
git clone gerrit:data-access-platform-cli
cd data-access-platform-cli
npm link
Setup
Fill in your details in config.json, following the example provided by
config.example.json. By default the CLI reads from config.json, but you can
specify a different path to the config with the --config or -c flag.
Usage
To use:
dap <command> [options]
Shared Options
version
dap --version
help
dap --help
or dap -h
logging
There are different levels of logging:
dap <command> [options] # 'info' logging level by default
dap <command> [options] -v # 'verbose' logging level
dap <command> [options] -vv # 'debug' logging level
Available Commands
Snapshot
This command gets the most recent version of all the requested tables. For
example, to grab users, accounts, and courses:
dap snapshot users accounts courses
The default output location is the snapshots/ directory, but that can be
overridden with the --output option.
The number of tables fetched concurrently can be controlled with the
--concurrency flag. Example:
dap snapshot users accounts courses submissions --concurrency 2
The downloaded file has the following format:
<table name>_<current date>.<file format>
The current date is the local time when the user runs the snapshot command.
# The time now is 11:49 PM. Today is the 24th of March, 2020
dap snapshot users
# The downloaded file will be:
# snapshots/users_2020-03-24-23-49-03.csv
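The naming scheme above can be sketched in a few lines of JavaScript. This is only an illustration, not the CLI's actual code: the helper names formatLocalTimestamp and snapshotFileName are hypothetical, and the timestamp layout is inferred from the example file name.

```javascript
// Hypothetical helpers, not taken from the dap-cli source.
function pad(n) {
  return String(n).padStart(2, '0');
}

// Formats a Date as YYYY-MM-DD-HH-mm-ss using local time,
// matching the layout seen in the example file name above.
function formatLocalTimestamp(date) {
  return [
    date.getFullYear(),
    pad(date.getMonth() + 1), // getMonth() is zero-based
    pad(date.getDate()),
    pad(date.getHours()),
    pad(date.getMinutes()),
    pad(date.getSeconds()),
  ].join('-');
}

// Builds <table name>_<current date>.<file format>
function snapshotFileName(table, date, format) {
  return `${table}_${formatLocalTimestamp(date)}.${format}`;
}

// new Date(year, monthIndex, day, h, m, s) is interpreted in local time.
console.log(snapshotFileName('users', new Date(2020, 2, 24, 23, 49, 3), 'csv'));
// users_2020-03-24-23-49-03.csv
```

Because both the constructor and the getters use local time, the result does not depend on the machine's time zone.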
Started snapshot queries can be resumed later by running the command again.
A small file is created for each table in the <user-home>/.dap directory
to keep track of started queries. Queries can be interrupted with Ctrl-C,
either while a query is executing or while its results are downloading, and
resumed later.
A started query keeps running in the cloud even if the client is stopped, so
its result can be downloaded at a later time. When the command is run again,
it asks whether to resume the interrupted query. A default answer can be
provided using the --continue option with the value yes or no.
To speed up the download of large snapshot files, multiple parts of a file
can be downloaded in parallel. By default each file is downloaded in a single
part, but two command line options change this: any file larger than the
value (in MBytes) given by --largefilesize is downloaded in the number of
parts given by --largefiledownloadparts. Example command:
# Snapshot files larger than 1GB will download in 4 parts concurrently
dap snapshot users assignments --largefilesize 1000 --largefiledownloadparts 4
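As a rough illustration of the idea, the sketch below decides whether a file counts as "large" and splits it into byte ranges that could then be fetched in parallel. It is not the CLI's implementation: planDownload is a hypothetical name, and treating one MByte as 1,000,000 bytes is an assumption based on the "1000 means 1GB" comment above.

```javascript
// Assumption: decimal megabytes, so --largefilesize 1000 is about 1 GB.
const BYTES_PER_MB = 1000 * 1000;

// Hypothetical planner: returns inclusive byte ranges for a download.
// Files at or below the threshold get a single range covering the whole file.
function planDownload(fileSizeBytes, largeFileSizeMb, parts) {
  const isLarge = fileSizeBytes > largeFileSizeMb * BYTES_PER_MB;
  const count = isLarge ? parts : 1;
  const chunk = Math.ceil(fileSizeBytes / count);
  const ranges = [];
  for (let start = 0; start < fileSizeBytes; start += chunk) {
    // The last range is clamped to the end of the file.
    ranges.push({ start, end: Math.min(start + chunk, fileSizeBytes) - 1 });
  }
  return ranges;
}

// A 2 GB file with a 1000 MB threshold and 4 parts yields 4 ranges.
console.log(planDownload(2000000000, 1000, 4));
// A small file stays in one part.
console.log(planDownload(500, 1000, 4));
```

Each range could map to an HTTP Range request, assuming the server hosting the snapshot supports them.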
Updates
This command gets changes to the specified tables since the provided time. The time can be given in two ways: 1) as an amount of time relative to now, or 2) as an absolute time.
# This grabs changes from the last 4h
dap updates users --last 4h
# This grabs changes since a specific time
dap updates users --since '2020-02-21T09:03:00Z'
The relative time accepts many formats, including a number followed by:
m for minutes
h for hours
d for days
By default, it will grab updates from the last 24 hours.
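A minimal sketch of how a relative value such as 4h or 20d could be turned into a start time is shown below. It covers only the three units listed above, while the CLI accepts more formats. The names parseRelative and sinceDate are hypothetical and not part of dap-cli.

```javascript
// Milliseconds per supported unit (only the units listed above).
const UNIT_MS = {
  m: 60 * 1000,
  h: 60 * 60 * 1000,
  d: 24 * 60 * 60 * 1000,
};

// Hypothetical parser: '4h' -> 14400000 (milliseconds).
function parseRelative(value) {
  const match = /^(\d+)([mhd])$/.exec(value);
  if (!match) {
    throw new Error(`Unsupported relative time: ${value}`);
  }
  return Number(match[1]) * UNIT_MS[match[2]];
}

// Subtracts the relative amount from "now" to get the start of the window.
function sinceDate(value, now = new Date()) {
  return new Date(now.getTime() - parseRelative(value));
}

console.log(parseRelative('4h')); // 14400000
console.log(sinceDate('20d', new Date('2020-05-14T00:34:39Z')).toISOString());
// 2020-04-24T00:34:39.000Z
```

Passing "now" as a parameter keeps the calculation easy to test with a fixed clock.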
The default output location is the updates/ directory, but that can be
overridden with the --output option.
The number of diffs fetched concurrently can be controlled with the
--concurrency flag.
The downloaded file has the following format:
<table name>_<since date>_<current date>.<file format>
Both since and current dates are local times.
# The time now is 12:34 AM. Today is the 14th of May, 2020
dap updates accounts --last 20d
# The downloaded file will be:
# updates/accounts_2020-04-24-00-34-28_2020-05-14-00-34-39.csv
Schema
This command will get information about the schema.
# List the available tables
dap schema --list
# Get the schema for some tables
dap schema users courses accounts
# Get the schema for all the tables
dap schema
Developing
Adding a new command
The CLI uses yargs commandDir to make it easy to add new commands. Add a new
command by creating a new module in lib/commands/. This module should contain:
exports.command: string or array of strings that contains the command
exports.describe: string with the description of the command
exports.builder: object containing the command options, or a function accepting and returning a yargs instance
exports.handler: function using the parsed argv
This structure assumes all modules in the commands directory are command
modules. Any supporting files need to be in a different directory. See
snapshot.js for an example command and snapshot.test.js for example tests.
See the Providing a Command Module docs for more details on these exports and the .commandDir(directory, [opts]) docs for more details about using a command module directory and more advanced options.
Running tests
To run the tests in a Docker container, use:
./build.sh
To run them without Docker, use:
npm run test
npm run lint
npm run lint:md