0.0.14 • Published 11 years ago

sqd v0.0.14

License
ISC

SQD

Executes Unix commands in parallel across multiple processes.

installation

$ npm install -g sqd

usage

$ sqd -c command [--debug] [--exit] [-p nProcess] [-s separator_command] <input file> [output file]

grep with 8 processes

sqd -c "grep -e something" -p 8 input.txt

Results are written to STDOUT.

sed with 4 processes (default), results to output.txt

sqd -c "sed -e y/ATCG/atcg/" input.txt output.txt

with the separator option (-s), sqd can also handle binary files

sqd -c "samtools view -" -s bam input.bam

reducing

sqd -c "node foobar.js" sample.txt --reduce

in foobar.js

if (process.env.sqd_map) {
  // spawned as one of the parallel (map) processes
  process.stdin.on("data", function(data) {
    // do something with this chunk; the work is split across multiple processes
  });
} else if (process.env.sqd_reduce) {
  // spawned once more to reduce the results
  process.stdin.on("data", function(data) {
    // do something which reduces the results
  });
}
process.stdin.resume();
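
As an illustration of the map/reduce pattern, here is a minimal sketch of a hypothetical foobar.js that counts lines. It assumes the reducing process receives the combined output of the map processes on its stdin, which the description of --reduce suggests but does not state explicitly. The function names countLines, sumCounts and readAll are made up for this example.

```javascript
// Hypothetical foobar.js: counts the lines of the input file with `sqd --reduce`.
// Assumption: the reduce process reads the map processes' outputs from stdin.

// Map step: count newline characters in one chunk of text.
function countLines(text) {
  let n = 0;
  for (const ch of text) {
    if (ch === "\n") n++;
  }
  return n;
}

// Reduce step: sum the per-chunk counts (one number per line).
function sumCounts(text) {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .reduce((total, line) => total + Number(line), 0);
}

// Collect all of stdin, then pass the full text to the callback.
function readAll(callback) {
  const parts = [];
  process.stdin.on("data", (data) => parts.push(data));
  process.stdin.on("end", () => callback(Buffer.concat(parts).toString()));
}

// Only touch stdin when sqd has set one of its mode variables.
if (process.env.sqd_map) {
  readAll((text) => console.log(countLines(text)));
} else if (process.env.sqd_reduce) {
  readAll((text) => console.log(sumCounts(text)));
}
```

Each map process prints a single count, so the reduce step only has to add up a handful of numbers.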

options

  • -p: the number of processes
  • --debug: debug mode (showing time, temporary files)
  • --exit: exit when a child process emits an error or writes to stderr
  • -s: separator command (see the separator section)
  • --reduce: reduce the results by running the same command once more, this time with the environment variable sqd_reduce set to "1"

additional environment variables in child processes

Note that all values are passed as strings.

  • sqd_n: process number named by sqd, differs among child processes
  • sqd_start: start position of the file passed to the child process
  • sqd_end: end position of the file passed to the child process
  • sqd_command: command string (common)
  • sqd_input: input file name (common)
  • sqd_tmpfile: path to the tmpfile used in the child process
  • sqd_debug: debug mode or not. '1' or '0' (common)
  • sqd_hStart: start position of the header (common)
  • sqd_map: "1" when the process is spawned for mapping (not for reducing); undefined otherwise
  • sqd_reduce: "1" when the process is spawned for reducing; undefined otherwise
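
Because every value arrives as a string, a child process usually needs to convert them before use. The following is a minimal sketch of one way to do that in Node.js. The helper readSqdEnv and the field names it returns are hypothetical; only the sqd_* variable names come from the list above, and treating sqd_n as an integer is an assumption.

```javascript
// Convert the string-valued sqd_* variables into typed values.
// readSqdEnv and its field names are illustrative, not part of sqd.
function readSqdEnv(env) {
  // Missing variables stay undefined instead of becoming NaN.
  const toInt = (value) => (value === undefined ? undefined : parseInt(value, 10));
  return {
    n: toInt(env.sqd_n), // assumed to be an integer
    start: toInt(env.sqd_start),
    end: toInt(env.sqd_end),
    hStart: toInt(env.sqd_hStart),
    command: env.sqd_command,
    input: env.sqd_input,
    tmpfile: env.sqd_tmpfile,
    debug: env.sqd_debug === "1", // '1' or '0' as strings
    isMap: env.sqd_map === "1",
    isReduce: env.sqd_reduce === "1",
  };
}

const info = readSqdEnv(process.env);
if (info.debug) {
  console.error("process", info.n, "handles positions", info.start, "to", info.end);
}
```

Comparing against "1" explicitly avoids the pitfall that the string "0" is truthy in JavaScript.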

separator

sqd requires a separator, which splits a given input file into multiple chunks. The separator tells sqd how to split the file by providing the information in JSON format.

the JSON keys are

  • positions: start position of each chunk in the file
    "positions": [133, 271, 461, 631]
  • header: range of the header section of the file, null when there is no header section
    "header": [0, 133]
  • size: file size (optional)
    "size": 34503
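
To make the JSON format concrete, here is a minimal sketch that parses a separator result and lists the chunk ranges it describes. It assumes that each chunk ends where the next one starts and that the last chunk ends at size; the text above does not spell this out. The function chunkRanges is hypothetical and not part of sqd.

```javascript
// Derive [start, end) ranges from a separator JSON string.
// Assumption: a chunk ends at the next start position; the last one ends at "size".
function chunkRanges(jsonText) {
  const { positions, header = null, size } = JSON.parse(jsonText);
  if (!Array.isArray(positions) || positions.length === 0) {
    throw new Error("positions must be a non-empty array");
  }
  for (let i = 1; i < positions.length; i++) {
    if (positions[i] <= positions[i - 1]) {
      throw new Error("positions must be strictly increasing");
    }
  }
  if (header !== null && !(Array.isArray(header) && header.length === 2)) {
    throw new Error("header must be null or [start, end]");
  }
  // "size" is optional, so the last end is undefined when it is omitted.
  return positions.map((start, i) => ({
    start,
    end: i + 1 < positions.length ? positions[i + 1] : size,
  }));
}

const example = '{"positions": [133, 271, 461, 631], "header": [0, 133], "size": 34503}';
console.log(chunkRanges(example));
```

With the sample values from the list, this yields four ranges, the first starting right after the header at position 133.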

available separators

sqdm --much more memory

$ sqdm [memory=4000MB] -c command [--debug] [--exit] [-p nProcess] [-s separator_command] <input file> [output file]

sqd with 8000 MB (about 8 GB) of memory

sqdm 8000 -c "cat" sample.txt