0.1.6-3 • Published 5 years ago

ngsjs v0.1.6-3

ngsjs is a set of command line tools, NGS data analysis workflows [WDL, Nextflow, snakemake, and bpipe], and R shiny plugins/R markdown document for exploring next-generation sequencing data.

ngsjs

Several difficulties in next-generation sequencing (NGS) data analysis projects still need to be solved:

  • Standardized project management: directory structure, recording and checking of raw data and analysis results, and standardized logging of inputs, outputs and commands.
  • Construction and redeployment of the computing environment, including all required tools, databases and other files.
  • Lack of integration and unification of the large number of data analysis workflows.
  • Lack of a unified distribution platform for the various data analysis workflows (e.g. snakemake, nextflow, Galaxy, etc.).
  • Reusing workflow language code (e.g. commands, input and output information) on other programming platforms is still complicated.
  • Readability and reusability also decrease when large amounts of Python and R code are mixed with the workflow language code.

This is an experimental project to provide a set of tools for exploring next-generation sequencing (NGS) data. We aim to integrate and develop command line tools, NGS data analysis workflows [WDL, Nextflow, snakemake, and bpipe], and R shiny plugins/R markdown documents.

We propose using node to distribute the workflows required for bioinformatics data analysis (e.g. Common Workflow Language (CWL)) and the command line scripts that users create during data analysis. Creating, updating and uploading a node package is very simple. Well-tested, high-performance distribution tools for node packages, such as npm and yarn, serve more than 831,195 node packages.

Command line scripts supported now:

| tool | function |
| --- | --- |
| rdeps | Getting all ngsjs required R packages |
| rsession | sessionInfo() and sessioninfo::session_info() |
| rinstall | Install R packages and BioInstaller resources using install.packages() and R packages devtools, BiocManager and BioInstaller |
| rbashful | Using the Go program bashful, yaml and toml files, and R scripts to stitch together commands and bash snippets and run them with a bit of style |
| rconfig | Parsing and generating json, ini, yaml, and toml format configuration files |
| rclrs | Generating colors for visualization using a theme key |
| rmv | Formatting the file names |
| ranystr | Generating any number of random strings of any length (e.g. Ies1y7fpgMVjsAyBAtTT) |
| rtime_stamp | Generating time stamps (e.g. 201811_15_22_43_25, 2018/11/15/, 2018/11/15/22/) |
| rdownload | Parallel download of URLs with logs |
| rbin | Collecting R packages inst/bin files to a directory, e.g. PATH |

We are collecting workflows written in CWL and publishing them on npm.

Besides, we are developing a framework to integrate various data analysis workflows and command line scripts:

  • rbashful: An ngsjs command line tool that dynamically renders env.toml and cli.yaml to provide a unified downstream analysis environment shared between all integrated tools, workflows and scripts.
  • cli.yaml: Process controller in the bashful style.
  • env.toml: Stores the fields and values of input and output parameters, and the core command line commands indexed by unique keys.
  • others

Requirements

R packages

  • optparse
  • configr
  • stringi
  • futile.logger
  • glue
  • ngstk
  • BioInstaller
  • devtools
  • pacman
  • BiocManager
  • sessioninfo
  • future

Installation

You need to install node, R and Go to run all ngsjs executables.

# Use conda to manage the env
conda install go nodejs \
&& echo 'export NODE_PATH="/path/miniconda2/lib/node_modules/"' >> ~/.bashrc \
&& npm install -g npm \
&& npm install -g yarn \
&& echo 'export PATH=$PATH:~/.yarn/bin/' >> ~/.bashrc

# For other systems, see https://nodejs.org/en/download/package-manager/
# For Ubuntu
apt update
apt install -y npm golang

# For macOS
brew install node go

# Install ngsjs globally
npm install -g ngsjs
# or
yarn global add ngsjs

# If you do not install ngsjs globally,
# you need to set the PATH
# (single quotes keep $PATH and ${NGSJS_ROOT} unexpanded until ~/.bashrc is sourced)
echo 'export NGSJS_ROOT=/path/node_modules/ngsjs' >> ~/.bashrc
echo 'export PATH=$PATH:${NGSJS_ROOT}/bin' >> ~/.bashrc

# Current dir is /path
npm install ngsjs
# or
yarn add ngsjs
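
Because ngsjs depends on three separate runtimes, it can help to confirm that each one is reachable before going further. The following is a minimal sketch and not part of ngsjs: the function name `check_tools` is hypothetical, and it assumes the executables are called `node`, `Rscript` and `go`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with ngsjs): report which of the given
# commands can be found on the current PATH.
check_tools() {
  local missing=0
  local tool
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  # Non-zero exit status when at least one command is missing.
  return "$missing"
}

# Assumed executable names for node, R and Go.
check_tools node Rscript go || echo "Install the missing tools before running ngsjs."
```

The function returns a non-zero status when something is missing, so it can also guard a setup script.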

Usage

Before trying the ngsjs command line tools, run rdeps to install all the extra R packages required by ngsjs.

# Install the extra R packages used in `ngsjs` scripts
rdeps

Then you can use ngsjs to run all sub-commands.

ngsjs -h

rbashful

bashful is a Go program used by rbashful, so you need to install it before using rbashful.

Ubuntu/Debian

wget https://github.com/wagoodman/bashful/releases/download/v0.0.10/bashful_0.0.10_linux_amd64.deb
sudo apt install ./bashful_0.0.10_linux_amd64.deb

RHEL/Centos

wget https://github.com/wagoodman/bashful/releases/download/v0.0.10/bashful_0.0.10_linux_amd64.rpm
rpm -i bashful_0.0.10_linux_amd64.rpm

Mac

brew tap wagoodman/bashful
brew install bashful

or download a Darwin build from the releases page.

Go tools

go get github.com/wagoodman/bashful


View a rbashful demo here.

source_dir <- "/Users/ljf/Documents/repositories/ljf/github/ngsjs/examples/rbashful/rnaseq_splicing/02_leafcutter_majiq"

# View the cli.yaml
cat(paste0(readLines(sprintf("%s/cli.yaml", source_dir)), 
           collapse = "\n"), sep = "\n")

# View the env.toml
cat(paste0(readLines(sprintf("%s/env.toml", source_dir)), 
           collapse = "\n"), sep = "\n")

# View the submit.sh
cat(paste0(readLines(sprintf("%s/submit", source_dir)), 
           collapse = "\n"), sep = "\n")

# Print command line help of rbashful
rbashful -h

rsession

# Print commandline help of rsession
rsession -h

# Print rsession R document (just like ?sessionInfo in the R client)
rsession -d

# Print R sessionInfo()
# The following three lines are equivalent.
rsession
rsession -f 1
rsession -f sessionInfo

# The following two lines are equivalent.
rsession -f 2 -e 'include_base=TRUE'
rsession -f sessioninfo::session_info -e 'include_base=TRUE'

rsession -h

rinstall

# Print commandline help of rinstall
rinstall -h

# Print rinstall R document (just like ?install.packages in the R client)
rinstall -d

# Install CRAN R package yaml (install.packages)
rinstall yaml
rinstall -f 1 yaml

# Install R package ngstk from GitHub ngsjs/ngstk (devtools::install_github)
rinstall -f 2 ngsjs/ngstk

# Install the develop branch of ngstk from GitHub ngsjs/ngstk
# devtools::install_github with force = TRUE, ref = 'develop'
rinstall -f 2 -e "force = TRUE, ref = 'develop'" ngsjs/ngstk

# Install Bioconductor package ggtree (BiocManager)
# BiocManager::install('ggtree')
rinstall -f 3 ggtree

# Install R packages (pacman)
# pacman::p_load(ggtree)
rinstall -f 4 ggtree

# Install and download BioInstaller resources

# Show all BioInstaller default resources name
rinstall -f BioInstaller::install.bioinfo -e "show.all.names=T"
rinstall -f 5 -e "show.all.names=T"

# Show ANNOVAR refgene and avsnp versions
rinstall -f BioInstaller::install.bioinfo -e "show.all.versions=T" db_annovar_refgene
rinstall -f 5 -e "show.all.versions=T" db_annovar_avsnp

# Download ANNOVAR hg19 refgene and avsnp
rinstall -f 5 -e "download.dir='/tmp/refgene', extra.list=list(buildver='hg19')" db_annovar_refgene
rinstall -f 5 -e "download.dir='/tmp/avsnp', extra.list=list(buildver='hg19')" db_annovar_avsnp

rinstall -h

rdownload

rdownload "https://img.shields.io/npm/dm/ngsjs.svg,https://img.shields.io/npm/v/ngsjs.svg,https://img.shields.io/npm/l/ngsjs.svg"

rdownload "https://img.shields.io/npm/dm/ngsjs.svg,https://img.shields.io/npm/v/ngsjs.svg,https://img.shields.io/npm/l/ngsjs.svg" --destfiles "/tmp/ngsjs1.svg,ngsjs2.svg,ngsjs3.svg"

rdownload --urls "https://img.shields.io/npm/dm/ngsjs.svg , https://img.shields.io/npm/v/ngsjs.svg,https://img.shields.io/npm/l/ngsjs.svg" \
          --destfiles "ngsjs1.svg,ngsjs2.svg,ngsjs3.svg" --max-cores 1
rdownload -h
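
The examples pass URLs and destination files as two comma-separated lists, one of them with stray spaces around a comma. As an illustration of that pairing only, here is a small sketch built on awk, curl and xargs. It is not how rdownload is implemented; the names `pair_urls` and `download_pairs` and the log file `download.log` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not the rdownload implementation.

# Print "URL<TAB>DESTFILE" pairs from two comma-separated lists,
# ignoring spaces and tabs around the commas and at both ends.
pair_urls() {
  awk -v urls="$1" -v dests="$2" 'BEGIN {
    gsub(/^[ \t]+|[ \t]+$/, "", urls)
    gsub(/^[ \t]+|[ \t]+$/, "", dests)
    n = split(urls, u, /[ \t]*,[ \t]*/)
    split(dests, d, /[ \t]*,[ \t]*/)
    for (i = 1; i <= n; i++) printf "%s\t%s\n", u[i], d[i]
  }'
}

# Fetch each pair with curl, running at most $3 downloads at once and
# appending error messages to download.log. Needs curl and xargs -P.
download_pairs() {
  pair_urls "$1" "$2" |
    xargs -P "${3:-1}" -n 2 sh -c 'curl -fsSL -o "$2" "$1"' _ 2>>download.log
}

# Show the pairing only; call download_pairs with the same arguments to fetch.
pair_urls "https://img.shields.io/npm/dm/ngsjs.svg , https://img.shields.io/npm/v/ngsjs.svg" \
          "ngsjs1.svg,ngsjs2.svg"
```

Explicit destination names matter here because all three example URLs end in `ngsjs.svg` and would otherwise overwrite each other.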

rconfig

# Use configr::read.config to parse a json format file
# Return the list object output
rconfig package.json
rconfig -c package.json

# Use configr::read.config parsing json format file with the custom R function
rconfig -c test.json -r 'function(x){x[["a"]] + x[["b"]]}'
rconfig -c test.json -r 'function(x){x[["a"]]}'
rconfig -c test.json -r 'function(x){x[["b"]]}'
rconfig -c test.json -r 'x[["b"]]'

# Use configr::write.config to write a json format file
rconfig -f "configr::write.config" test.json -e "config.dat=list(a=1, b=2), write.type='json'"
rconfig -f 2 test.json -e "config.dat=list(a=1, b=2), write.type='json'"
rconfig -f "configr::fetch.config" "https://raw.githubusercontent.com/Miachol/configr/master/inst/extdata/config.global.toml"

rconfig -h

rclrs

# Show default and red/blue theme colors
rclrs default
rclrs -t default
rclrs -t red_blue

# Show default theme colors (extract the first element)
rclrs -t default -r 'x[1]'

# Show all supported themes
rclrs --show-all-themes

rclrs -h

rmv

# Set do.rename = F to preview the new file names without renaming anything
rmv "`ls`" -e "do.rename = F, prefix = 'prefix', suffix = 'suffix'"
rmv "`ls`" -e "do.rename = F, replace = list(old =c('-', '__'), new = c('_', '_'))"
rmv "`ls`" -e "do.rename = F, toupper = TRUE"
rmv "`ls`" -e "do.rename = F, tolower = TRUE"

rmv "`ls`" -e "do.rename=T, replace=list(old='new', new='old')"
rmv -h
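
To make the kind of transformation concrete, the sketch below previews new names in plain shell without touching any file. It only mirrors the replace and tolower options from the examples and is not the rmv implementation; the function name `preview_names` and the sample file names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "old -> new" for each name without renaming.
# Replaces '-' and then '__' with '_' and lower-cases the result.
preview_names() {
  local name new
  for name in "$@"; do
    new="$(printf '%s' "$name" | sed -e 's/-/_/g' -e 's/__/_/g' | tr '[:upper:]' '[:lower:]')"
    printf '%s -> %s\n' "$name" "$new"
  done
}

# Made-up file names for illustration.
preview_names "Sample-A__R1.fastq" "Sample-B__R2.fastq"
```

Printing the mapping first, and renaming only in a second pass, makes it easy to spot two inputs that would collapse to the same name.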

rtime_stamp

rtime_stamp

rtime_stamp -r 'x[[1]]'

rtime_stamp -r 'x[[1]][1]'

rtime_stamp -t '%Y_%d'

rtime_stamp -e "extra_flag=c('*')"

rtime_stamp -h
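
The `-t '%Y_%d'` template looks like a strftime-style pattern, so the time stamp shapes given as examples for rtime_stamp (201811_15_22_43_25, 2018/11/15/, 2018/11/15/22/) can be approximated with the standard `date` utility. This is a sketch of the output shapes only; it does not call rtime_stamp, the patterns are inferred from those examples, and the function name `time_stamps` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print three time stamps whose shapes mirror the
# rtime_stamp examples, using strftime patterns understood by `date`.
time_stamps() {
  date '+%Y%m_%d_%H_%M_%S'   # shaped like 201811_15_22_43_25
  date '+%Y/%m/%d/'          # shaped like 2018/11/15/
  date '+%Y/%m/%d/%H/'       # shaped like 2018/11/15/22/
}

time_stamps
```

The slash-separated forms are convenient as dated directory prefixes, for example with `mkdir -p`.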

ranystr

./bin/ranystr

./bin/ranystr -l 30

./bin/ranystr -l 20 -n 3
ranystr -h
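
For comparison, strings of the same shape as the ranystr example (Ies1y7fpgMVjsAyBAtTT) can be produced with standard tools. The sketch below is an assumption-laden stand-in, not the ranystr implementation: the function name `random_strings` is hypothetical, and it assumes `/dev/urandom` is available.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print COUNT random alphanumeric strings of LENGTH
# characters each. Usage: random_strings LENGTH COUNT
random_strings() {
  local length="${1:-20}"
  local count="${2:-1}"
  local i=0
  while [ "$i" -lt "$count" ]; do
    # Read a bounded chunk of random bytes (about a quarter survive the
    # filter), keep only letters and digits, then cut to the wanted length.
    head -c "$((length * 8 + 512))" /dev/urandom |
      LC_ALL=C tr -dc 'A-Za-z0-9' |
      head -c "$length"
    echo
    i=$((i + 1))
  done
}

# Three strings of 20 characters, as in `ranystr -l 20 -n 3`.
random_strings 20 3
```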

rbin

# Collect system.file("extdata", "bin", package = "ngstk")
# Multiple packages can be given (e.g. ngstk,configr)
rbin ngstk

rbin --destdir /tmp/path ngstk

rbin -h

How to contribute?

Please fork the GitHub ngsjs repository, modify it, and submit a pull request to us.

Maintainer

Jianfeng Li

License

MIT
