1.1.0 • Published 9 years ago

proteins v1.1.0

License: GPLv3
Repository: GitHub

Proteins

A project to aggressively refactor protein identification and data storage

Protein identification codes are currently spread across many disparate systems. The best-known system as of this writing is the GenBank accession code. However, a GenBank code corresponds to a particular observation of a protein, so a single protein in a single organism usually has more than 10,000 accession codes that share the same sequence. This is a serious problem for programs that operate on an idealized protein, since a "standard" GenBank accession has to be cherry-picked for each protein they need.

This project sets out to refactor both the identification codes and the data storage formats. We propose a JSON-based system for storing protein metadata, together with a dual identifier system in which each protein has:

  • a global, long hexadecimal identifier
  • and a short, human-readable identifier
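
As a rough illustration of how such a pair of identifiers could be derived from a JSON description, here is a minimal sketch. Everything in it is an assumption made for the example: the SHA-256 hashing, the key-sorted serialization, the field names (name, organism, sequence), and the helper names canonicalize, globalId, and shortId. It is not the scheme this package implements.

```javascript
// Illustrative sketch only: the hashing scheme, field names, and ID lengths
// below are assumptions, not the scheme used by the proteins package.
const crypto = require("crypto");

// Serialize with sorted keys so that equivalent descriptions hash identically.
function canonicalize(value) {
  if (Array.isArray(value)) {
    return "[" + value.map(canonicalize).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const keys = Object.keys(value).sort();
    return "{" + keys.map((k) => JSON.stringify(k) + ":" + canonicalize(value[k])).join(",") + "}";
  }
  return JSON.stringify(value);
}

// Global identifier: a long hexadecimal digest of the canonical description.
function globalId(description) {
  return crypto.createHash("sha256").update(canonicalize(description)).digest("hex");
}

// Short identifier: a readable slug of the name plus a few leading hex digits.
function shortId(description) {
  const slug = String(description.name || "protein")
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return slug + "-" + globalId(description).slice(0, 6);
}

// Hypothetical description; the real JSON schema is not specified here.
const example = {
  name: "Hemoglobin subunit alpha",
  organism: "Homo sapiens",
  sequence: "MVLSPADKTN",
};
console.log(globalId(example));
console.log(shortId(example));
```

Hashing a key-sorted serialization means two files that describe the same protein with fields in a different order produce the same global identifier, which is the property the GenBank scheme lacks. The short identifier in this sketch is only a convenience label and is not guaranteed to be unique.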

This repository implements the system. After installing with npm install -g proteins, generate an ID from a JSON protein description file with:

mkproteinid < ./test.json