data-loss-signatures v1.0.5

Identify confidential and sensitive information in source code repositories with data-loss "signatures".
data-loss-signatures is a Node.js module for storing and accessing data-leakage detection definitions. We call the data structure that represents a data-leakage detection definition a "signature." We store a community-tested list of signatures in a file called signatures.json.
Table of Contents
- 1. Security
- 2. Install
- 3. Usage
- 4. API
- 5. Accessing signatures with other tools and programming languages
- 6. Maintainers
- 7. Contributions
- 8. License
- 9. References and Attributions
1. Security
Data leakage is the unauthorized transmission of data from within an organization to an external destination or recipient.^1
One of the most common forms of data loss (aka "data leakage") happens when developers inadvertently commit and push passwords, access tokens, and other sensitive data to a source-control management system (like Git). Consequently, confidential information "leaks" into search results and commit history.
The signatures.json file contains a growing list of definitions to help you detect secrets in your source code repositories.
| # | Secret | Detected in |
|---|---|---|
| 1 | .pem file extension: Potential cryptographic private key | extension |
| 2 | Log file: Log files can contain secret HTTP endpoints, session IDs, API keys and other goodies | extension |
| 3 | .pkcs12 file extension: Potential cryptographic key bundle | extension |
| 4 | .p12 file extension: Potential cryptographic key bundle | extension |
| 5 | .pfx file extension: Potential cryptographic key bundle | extension |
| 6 | .asc file extension: Potential cryptographic key bundle | extension |
| 7 | Pidgin OTR private key | filename |
| 8 | OpenVPN client configuration file | extension |
| 9 | Azure service configuration schema file | extension |
| 10 | Remote Desktop connection file | extension |
| 11 | Microsoft SQL database file | extension |
| 12 | Microsoft SQL server compact database file | extension |
| 13 | SQLite database file | extension |
| 14 | Microsoft BitLocker recovery key file | extension |
| 15 | Microsoft BitLocker Trusted Platform Module password file | extension |
| 16 | Windows BitLocker full volume encrypted data file | extension |
| 17 | Java keystore file | extension |
| 18 | Password Safe database file | extension |
| 19 | Ruby On Rails secret token configuration file: If the Rails secret token is known, it can allow for remote code execution (http://www.exploit-db.com/exploits/27527/) | filename |
| 20 | Carrierwave configuration file: Can contain credentials for cloud storage systems such as Amazon S3 and Google Storage | filename |
| 21 | Potential Ruby On Rails database configuration file: Can contain database credentials | filename |
| 22 | OmniAuth configuration file: The OmniAuth configuration file can contain client application secrets | filename |
| 23 | Django configuration file: Can contain database credentials, cloud storage system credentials, and other secrets | filename |
| 24 | 1Password password manager database file: Feed it to Hashcat and see if you're lucky | extension |
| 25 | Apple Keychain database file | extension |
| 26 | Network traffic capture file | extension |
| 27 | GnuCash database file | extension |
| 28 | Jenkins publish over SSH plugin file | filename |
| 29 | Potential Jenkins credentials file | filename |
| 30 | KDE Wallet Manager database file | extension |
| 31 | Potential MediaWiki configuration file | filename |
| 32 | Tunnelblick VPN configuration file | extension |
| 33 | Sequel Pro MySQL database manager bookmark file | filename |
| 34 | Little Snitch firewall configuration file: Contains traffic rules for applications | filename |
| 35 | Day One journal file: Now it's getting creepy... | extension |
| 36 | Potential jrnl journal file: Now it's getting creepy... | filename |
| 37 | Chef Knife configuration file: Can contain references to Chef servers | filename |
| 38 | cPanel backup ProFTPd credentials file: Contains usernames and password hashes for FTP accounts | filename |
| 39 | Robomongo MongoDB manager configuration file: Can contain credentials for MongoDB databases | filename |
| 40 | FileZilla FTP configuration file: Can contain credentials for FTP servers | filename |
| 41 | FileZilla FTP recent servers file: Can contain credentials for FTP servers | filename |
| 42 | Ventrilo server configuration file: Can contain passwords | filename |
| 43 | Terraform variable config file: Can contain credentials for Terraform providers | filename |
| 44 | Shell configuration file: Shell configuration files can contain passwords, API keys, hostnames and other goodies | filename |
| 45 | Shell configuration file: Shell configuration files can contain passwords, API keys, hostnames and other goodies | filename |
| 46 | Shell configuration file: Shell configuration files can contain passwords, API keys, hostnames and other goodies | filename |
| 47 | Private SSH key | filename |
| 48 | Private SSH key | filename |
| 49 | Private SSH key | filename |
| 50 | Private SSH key | filename |
| 51 | SSH configuration file | path |
| 52 | Potential cryptographic private key | extension |
| 53 | Shell command history file | filename |
| 54 | MySQL client command history file | filename |
| 55 | PostgreSQL client command history file | filename |
| 56 | PostgreSQL password file | filename |
| 57 | Ruby IRB console history file | filename |
| 58 | Pidgin chat client account configuration file | path |
| 59 | Hexchat/XChat IRC client server list configuration file | path |
| 60 | Irssi IRC client configuration file | path |
| 61 | Recon-ng web reconnaissance framework API key database | path |
| 62 | DBeaver SQL database manager configuration file | filename |
| 63 | Mutt e-mail client configuration file | filename |
| 64 | S3cmd configuration file | filename |
| 65 | AWS CLI credentials file | path |
| 66 | SFTP connection configuration file | filename |
| 67 | T command-line Twitter client configuration file | filename |
| 68 | gitrob configuration file | filename |
| 69 | Shell configuration file: Shell configuration files can contain passwords, API keys, hostnames and other goodies | filename |
| 70 | Shell profile configuration file: Shell configuration files can contain passwords, API keys, hostnames and other goodies | filename |
| 71 | Shell command alias configuration file: Shell configuration files can contain passwords, API keys, hostnames and other goodies | filename |
| 72 | PHP configuration file | filename |
| 73 | GNOME Keyring database file | extension |
| 74 | KeePass password manager database file: Feed it to Hashcat and see if you're lucky | extension |
| 75 | SQL dump file | extension |
| 76 | Apache htpasswd file | filename |
| 77 | Configuration file for auto-login process: Can contain username and password | filename |
| 78 | Rubygems credentials file: Can contain API key for a rubygems.org account | path |
| 79 | Tugboat DigitalOcean management tool configuration | filename |
| 80 | DigitalOcean doctl command-line client configuration file: Contains DigitalOcean API key and other information | path |
| 81 | git-credential-store helper credentials file | filename |
| 82 | GitHub Hub command-line client configuration file: Can contain GitHub API access token | path |
| 83 | Git configuration file | filename |
| 84 | Chef private key: Can be used to authenticate against Chef servers | path |
| 85 | Potential Linux shadow file: Contains hashed passwords for system users | path |
| 86 | Potential Linux passwd file: Contains system user information | path |
| 87 | Docker configuration file: Can contain credentials for public or private Docker registries | filename |
| 88 | NPM configuration file: Can contain credentials for NPM registries | filename |
| 89 | Environment configuration file | filename |
| 90 | Contains word: credential | path |
| 91 | Contains word: password | path |
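The "Detected in" column names the part of a file path that a signature inspects. As a rough illustration only, the sketch below splits a path into those three parts with Node's built-in path module. The helper name partsOf is hypothetical and is not part of data-loss-signatures; keeping the leading dot on the extension is an assumption based on the '.pem' pattern used in the API examples.

```javascript
const path = require('path')

// Hypothetical helper (not part of data-loss-signatures): split a file
// path into the parts named in the "Detected in" column.
function partsOf (filePath) {
  const parsed = path.posix.parse(filePath)
  return {
    // Assumption: the extension keeps its leading dot, like the '.pem' pattern.
    extension: parsed.ext,
    filename: parsed.base,
    path: parsed.dir
  }
}

const parts = partsOf('config/deploy/server.pem')
console.log(parts.extension) // .pem
console.log(parts.filename) // server.pem
console.log(parts.path) // config/deploy
```

A scanner could then compare each part against only the signatures whose "Detected in" value names that part.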
2. Install
Before you begin, you'll need the following:
Programming languages: Node.js (with npm).
Skills: You'll need to know how to access the command line (aka, "Terminal") on your machine.
Open a Terminal and enter the following command:
# As a dependency in your Node.js app
npm i data-loss-signatures --save-prod
3. Usage
Use data-loss-signatures.signatures to find file extensions, names, and paths
that commonly leak secrets.
const { signatures } = require('data-loss-signatures')
// ⚠️ Note: the 'recursive-readdir' module is not bundled with
// data-loss-signatures. 'recursive-readdir' is referenced
// only as an example.
const recursiveReaddir = require('recursive-readdir')
const potentialLeaks = recursiveReaddir('/path/to/local/repo')
  .then(files => files
    .map(file => signatures
      .map(signature => signature.match(file)))
  )
  .catch(err => err)
4. API
The data-loss-signatures module provides a Signature class, which validates data-loss signatures and converts regular expression strings to RE2 (whenever possible).
The data-loss-signatures module's public API provides:
- factorymethod: a convenience function that creates a signature object.
- nullSignature: implements a default object literal with all signature properties set to null.
- Signature: a class that constructs a signature object.
- signatures: an array of Signature instances.
- toArray(data: {String|Array.<Object>}): generates an Array.<Signature> from a JSON string or object literal array.
- validParts: a constants enum of valid Signature.prototype.part values.
- validTypes: a constants enum of valid Signature.prototype.type values.
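This README does not show how the conversion to RE2 "whenever possible" is carried out. One plausible approach, sketched below as an assumption, is to try the optional re2 npm package first and fall back to the built-in RegExp when re2 is not installed or rejects the pattern (RE2 does not support lookaheads or backreferences). The function name compilePattern is hypothetical.

```javascript
// Hypothetical helper, not the module's actual implementation.
function compilePattern (pattern) {
  let RE2 = null
  try {
    // 're2' is an optional dependency in this sketch.
    RE2 = require('re2')
  } catch (err) {
    RE2 = null
  }
  if (RE2 !== null) {
    try {
      return new RE2(pattern)
    } catch (err) {
      // The pattern uses syntax RE2 rejects; fall through to RegExp.
    }
  }
  return new RegExp(pattern)
}

const privateKeyName = compilePattern('^.*_rsa$')
console.log(privateKeyName.test('id_rsa')) // true
console.log(privateKeyName.test('id_rsa.pub')) // false
```

Either engine exposes a test method, so calling code does not need to know which one compiled the pattern.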
4.1. data-loss-signatures.Signature
A class that constructs Signature objects.
const { Signature, validParts, validTypes } = require('data-loss-signatures')
const signature = new Signature({
  caption: 'Potential cryptographic private key',
  description: '',
  part: validParts.EXTENSION,
  pattern: '.pem',
  type: validTypes.MATCH
})
4.2. data-loss-signatures.Signature.prototype.match
Discover possible data leaks by matching a Signature pattern
against file extensions, names, and paths.
const { Signature } = require('data-loss-signatures')
const rsaTokenSignature = new Signature({
  'caption': 'Private SSH key',
  'description': '',
  'part': 'filename',
  'pattern': '^.*_rsa$',
  'type': 'regex'
})
const suspiciousFilePath = '/hmm/what/might/this/be/id_rsa'
rsaTokenSignature.match(suspiciousFilePath)
// => ['/hmm/what/might/this/be/id_rsa']
const fileThatIsJustBeingCoolBruh = 'file/that/is/just/being/cool/bruh'
rsaTokenSignature.match(fileThatIsJustBeingCoolBruh)
// => null
Review the source code for signature.
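To make the difference between the two signature types concrete, here is a minimal stand-in for the matching step that uses plain objects instead of the Signature class. It is a sketch under assumptions: the function matchSignature is hypothetical, and the module's real match method may behave differently (for example, in which part of the path it inspects).

```javascript
// Hypothetical stand-in for Signature.prototype.match.
function matchSignature (signature, value) {
  if (signature.type === 'regex') {
    // String.prototype.match returns an array of matches, or null.
    return value.match(new RegExp(signature.pattern))
  }
  // 'match' type: strict string equality.
  return value === signature.pattern ? [value] : null
}

const sshKey = { part: 'filename', pattern: '^.*_rsa$', type: 'regex' }
const pemFile = { part: 'extension', pattern: '.pem', type: 'match' }

console.log(matchSignature(sshKey, 'id_rsa') !== null) // true
console.log(matchSignature(sshKey, 'README.md')) // null
console.log(matchSignature(pemFile, '.pem') !== null) // true
```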
5. Accessing signatures with other tools and programming languages
You can access signatures.json without the data-loss-signatures
Node module. Select a tool or programming language below to view examples.
You can access data-loss rules using HTTPS. You can GET all signatures directly from GitLab with cURL.
cURL
curl -X GET \
  'https://gitlab.com/gregswindle/data-loss-signatures/raw/master/signatures.json'
Go
package main
import (
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	url := "https://gitlab.com/gregswindle/data-loss-signatures/raw/master/signatures.json"
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Add("Private-Token", "<your-personal-token>")
	req.Header.Add("cache-control", "no-cache")
	res, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer res.Body.Close()
	// Read the whole response body (the signatures JSON).
	body, err := io.ReadAll(res.Body)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(res.Status)
	fmt.Println(string(body))
}
Node.js
const http = require('https')
const options = {
  method: 'GET',
  hostname: 'gitlab.com',
  path: '/gregswindle/data-loss-signatures/raw/master/signatures.json',
  headers: {
    'Private-Token': '<your-access-token>',
    'cache-control': 'no-cache'
  }
}
const req = http.request(options, res => {
  const chunks = []
  res.on('data', chunk => {
    chunks.push(chunk)
  })
  res.on('end', () => {
    const body = Buffer.concat(chunks)
    console.log(body.toString())
  })
})
// Report network errors instead of failing silently.
req.on('error', err => console.error(err))
req.end()
Python3
import http.client
# The signatures file is served over HTTPS.
conn = http.client.HTTPSConnection("gitlab.com")
headers = {
    'Accept': "application/json",
    'cache-control': "no-cache"
}
conn.request("GET", "/gregswindle/data-loss-signatures/raw/master/signatures.json", headers=headers)
res = conn.getresponse()
data = res.read()
print(data.decode("utf-8"))
Python2
import requests
url = "https://gitlab.com/gregswindle/data-loss-signatures/raw/master/signatures.json"
headers = {
    'Accept': "application/json",
    'cache-control': "no-cache"
}
response = requests.get(url, headers=headers)
print(response.text)
Ruby
require 'uri'
require 'net/http'
url = URI("https://gitlab.com/gregswindle/data-loss-signatures/raw/master/signatures.json")
http = Net::HTTP.new(url.host, url.port)
http.use_ssl = true # the URL is HTTPS
request = Net::HTTP::Get.new(url)
request["Private-Token"] = '<your-personal-token>'
request["cache-control"] = 'no-cache'
response = http.request(request)
puts response.read_body
6. Maintainers
The Maintainer Guide has useful information for Maintainers and Trusted Committers.
7. Contributions
We gratefully accept Merge Requests! Here's what you need to know to get started.
Before submitting a Merge Request, please read our:
Thanks goes to our awesome contributors (emoji key):
This project follows the all-contributors specification. Contributions of any kind welcome!
7.1. Adding a Signature
Before adding a new Signature, please review all current definitions: the Signature might already exist.
If the Signature does not exist, please be sure to add your Signature with the following properties:
- caption: A succinct summary for the Signature. Think of caption as a well-written email subject.
- description: Provide more details about the Signature if necessary. description is especially useful for differentiating similar Signatures.
- part: An enumeration that defines what the Signature is evaluating. Valid values are:
  - contents: The string(s) within a file.
  - extension: A file extension (which defines the Content-Type or MIME type).
  - filename: The unique name of the file.
  - path: The directory path relative to the repo and without the filename.
- pattern: The string or regular expression to look for.
- type: An enumeration that defines how to evaluate for secrets. Valid values are:
  - match: A strict string equivalency evaluation.
  - regex: A regular expression "search" or "test".
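Before opening a Merge Request, it can help to check a candidate definition against the property rules listed above. The sketch below is one possible pre-submission check, not the validation the module itself performs. The function name checkSignature is hypothetical, and treating description as optional is an assumption based on the empty descriptions in the API examples.

```javascript
const VALID_PARTS = ['contents', 'extension', 'filename', 'path']
const VALID_TYPES = ['match', 'regex']

// Hypothetical pre-submission check; returns a list of problems found.
function checkSignature (candidate) {
  const problems = []
  if (typeof candidate.caption !== 'string' || candidate.caption === '') {
    problems.push('caption is required')
  }
  if (typeof candidate.pattern !== 'string' || candidate.pattern === '') {
    problems.push('pattern is required')
  }
  if (!VALID_PARTS.includes(candidate.part)) {
    problems.push(`part must be one of: ${VALID_PARTS.join(', ')}`)
  }
  if (!VALID_TYPES.includes(candidate.type)) {
    problems.push(`type must be one of: ${VALID_TYPES.join(', ')}`)
  }
  if (candidate.type === 'regex' && typeof candidate.pattern === 'string') {
    try {
      // Compiling the pattern surfaces syntax errors early.
      new RegExp(candidate.pattern)
    } catch (err) {
      problems.push(`pattern is not a valid regular expression: ${err.message}`)
    }
  }
  return problems
}

const candidate = {
  caption: 'Private SSH key',
  description: '',
  part: 'filename',
  pattern: '^.*_rsa$',
  type: 'regex'
}
console.log(checkSignature(candidate).length) // 0
```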
7.2. Editing a Signature
Edits are welcome! Just be sure to unit test.
7.3. Removing a Signature
Please provide a testable justification for any Signature removal.
8. License
Apache-2.0 © 2019 Greg Swindle
9. References and Attributions
^1: What is Data Leakage? Defined, Explained, and Explored | Forcepoint. (2019) Retrieved January 27, 2019, from https://www.forcepoint.com/cyber-edu/data-leakage