data-loss-signatures v1.0.5
Identify confidential and sensitive info in source code repositories by data-loss "signatures".
data-loss-signatures is a Node.js module for storing and accessing data-leakage detection definitions.
We call the data structure that represents a data-leakage detection definition a "signature." We store a community-tested list of signatures in a file called signatures.json.
Table of Contents
- 1. Security
- 2. Install
- 3. Usage
- 4. API
- 5. Accessing signatures with other tools and programming languages
- 6. Maintainers
- 7. Contributions
- 8. License
- 9. References and Attributions
1. Security
Data leakage is the unauthorized transmission of data from within an organization to an external destination or recipient.^1
One of the most common forms of data loss (aka "data leakage") happens when developers inadvertently commit and push passwords, access tokens, and other sensitive data to a source-control management system (like Git). Consequently, confidential information "leaks" into search results and commit history.
The signatures.json file contains a growing list of definitions to help you detect secrets in your source code repositories.
# | Secret | Detected in |
---|---|---|
1 | .pem file extension: Potential cryptographic private key | extension |
2 | Log file: Log files can contain secret HTTP endpoints, session IDs, API keys and other goodies | extension |
3 | .pkcs12 file extension: Potential cryptographic key bundle | extension |
4 | .p12 file extension: Potential cryptographic key bundle | extension |
5 | .pfx file extension: Potential cryptographic key bundle | extension |
6 | .asc file extension: Potential cryptographic key bundle | extension |
7 | Pidgin OTR private key | filename |
8 | OpenVPN client configuration file | extension |
9 | Azure service configuration schema file | extension |
10 | Remote Desktop connection file | extension |
11 | Microsoft SQL database file | extension |
12 | Microsoft SQL server compact database file | extension |
13 | SQLite database file | extension |
14 | Microsoft BitLocker recovery key file | extension |
15 | Microsoft BitLocker Trusted Platform Module password file | extension |
16 | Windows BitLocker full volume encrypted data file | extension |
17 | Java keystore file | extension |
18 | Password Safe database file | extension |
19 | Ruby On Rails secret token configuration file: If the Rails secret token is known, it can allow for remote code execution (http://www.exploit-db.com/exploits/27527/) | filename |
20 | Carrierwave configuration file: Can contain credentials for cloud storage systems such as Amazon S3 and Google Storage | filename |
21 | Potential Ruby On Rails database configuration file: Can contain database credentials | filename |
22 | OmniAuth configuration file: The OmniAuth configuration file can contain client application secrets | filename |
23 | Django configuration file: Can contain database credentials, cloud storage system credentials, and other secrets | filename |
24 | 1Password password manager database file: Feed it to Hashcat and see if you're lucky | extension |
25 | Apple Keychain database file | extension |
26 | Network traffic capture file | extension |
27 | GnuCash database file | extension |
28 | Jenkins publish over SSH plugin file | filename |
29 | Potential Jenkins credentials file | filename |
30 | KDE Wallet Manager database file | extension |
31 | Potential MediaWiki configuration file | filename |
32 | Tunnelblick VPN configuration file | extension |
33 | Sequel Pro MySQL database manager bookmark file | filename |
34 | Little Snitch firewall configuration file: Contains traffic rules for applications | filename |
35 | Day One journal file: Now it's getting creepy... | extension |
36 | Potential jrnl journal file: Now it's getting creepy... | filename |
37 | Chef Knife configuration file: Can contain references to Chef servers | filename |
38 | cPanel backup ProFTPd credentials file: Contains usernames and password hashes for FTP accounts | filename |
39 | Robomongo MongoDB manager configuration file: Can contain credentials for MongoDB databases | filename |
40 | FileZilla FTP configuration file: Can contain credentials for FTP servers | filename |
41 | FileZilla FTP recent servers file: Can contain credentials for FTP servers | filename |
42 | Ventrilo server configuration file: Can contain passwords | filename |
43 | Terraform variable config file: Can contain credentials for terraform providers | filename |
44 | Shell configuration file: Shell configuration files can contain passwords, API keys, hostnames and other goodies | filename |
45 | Shell configuration file: Shell configuration files can contain passwords, API keys, hostnames and other goodies | filename |
46 | Shell configuration file: Shell configuration files can contain passwords, API keys, hostnames and other goodies | filename |
47 | Private SSH key | filename |
48 | Private SSH key | filename |
49 | Private SSH key | filename |
50 | Private SSH key | filename |
51 | SSH configuration file | path |
52 | Potential cryptographic private key | extension |
53 | Shell command history file | filename |
54 | MySQL client command history file | filename |
55 | PostgreSQL client command history file | filename |
56 | PostgreSQL password file | filename |
57 | Ruby IRB console history file | filename |
58 | Pidgin chat client account configuration file | path |
59 | Hexchat/XChat IRC client server list configuration file | path |
60 | Irssi IRC client configuration file | path |
61 | Recon-ng web reconnaissance framework API key database | path |
62 | DBeaver SQL database manager configuration file | filename |
63 | Mutt e-mail client configuration file | filename |
64 | S3cmd configuration file | filename |
65 | AWS CLI credentials file | path |
66 | SFTP connection configuration file | filename |
67 | T command-line Twitter client configuration file | filename |
68 | gitrob configuration file | filename |
69 | Shell configuration file: Shell configuration files can contain passwords, API keys, hostnames and other goodies | filename |
70 | Shell profile configuration file: Shell configuration files can contain passwords, API keys, hostnames and other goodies | filename |
71 | Shell command alias configuration file: Shell configuration files can contain passwords, API keys, hostnames and other goodies | filename |
72 | PHP configuration file | filename |
73 | GNOME Keyring database file | extension |
74 | KeePass password manager database file: Feed it to Hashcat and see if you're lucky | extension |
75 | SQL dump file | extension |
76 | Apache htpasswd file | filename |
77 | Configuration file for auto-login process: Can contain username and password | filename |
78 | Rubygems credentials file: Can contain API key for a rubygems.org account | path |
79 | Tugboat DigitalOcean management tool configuration | filename |
80 | DigitalOcean doctl command-line client configuration file: Contains DigitalOcean API key and other information | path |
81 | git-credential-store helper credentials file | filename |
82 | GitHub Hub command-line client configuration file: Can contain GitHub API access token | path |
83 | Git configuration file | filename |
84 | Chef private key: Can be used to authenticate against Chef servers | path |
85 | Potential Linux shadow file: Contains hashed passwords for system users | path |
86 | Potential Linux passwd file: Contains system user information | path |
87 | Docker configuration file: Can contain credentials for public or private Docker registries | filename |
88 | NPM configuration file: Can contain credentials for NPM registries | filename |
89 | Environment configuration file | filename |
90 | Contains word: credential | path |
91 | Contains word: password | path |
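The "Detected in" column names the part of a file path that each signature inspects. As a rough illustration (not the module's own code), the sketch below splits a repo-relative path into those parts with Node's built-in path module. The helper name splitParts is hypothetical, and keeping the leading dot on the extension is an assumption based on the '.pem' pattern used in the API section.

```javascript
const path = require('path')

// Hypothetical helper: break a repo-relative file path into the parts
// that the "Detected in" column refers to.
function splitParts (filePath) {
  return {
    // Leading dot kept, e.g. '.pem' (an assumption, see the lead-in).
    extension: path.posix.extname(filePath),
    // File name without directories, e.g. 'server.pem'.
    filename: path.posix.basename(filePath),
    // Directory portion without the file name, e.g. 'config/certs'.
    path: path.posix.dirname(filePath)
  }
}

const parts = splitParts('config/certs/server.pem')
console.log(parts)
```

Note that dotfiles such as .bashrc have an empty extension under this scheme, which is consistent with the table listing shell configuration files under filename rather than extension.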
2. Install
Before you begin, you'll need the following.
Programming languages: Node.js, with npm (used by the install command below).
Skills: You'll need to know how to access the command line (aka, "Terminal") on your machine.
Open a Terminal and enter the following command:
# As a dependency in your Node.js app
npm i data-loss-signatures --save-prod
3. Usage
Use data-loss-signatures.signatures to find file extensions, names, and paths that commonly leak secrets.
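const { signatures } = require('data-loss-signatures')
// ⚠️ Note: the 'recursive-readdir' module is not bundled with
// data-loss-signatures. 'recursive-readdir' is referenced
// only as an example.
const recursiveReaddir = require('recursive-readdir')

// Resolves to one { file, matches } entry per file that matched
// at least one signature.
const potentialLeaks = recursiveReaddir('/path/to/local/repo')
  .then(files => files
    .map(file => ({
      file,
      matches: signatures.filter(signature => signature.match(file))
    }))
    .filter(result => result.matches.length > 0)
  )
  .catch(err => {
    // Report the failure instead of resolving to the error object.
    console.error(err)
    return []
  })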
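If you would rather not add a dependency for directory traversal, a small walker built on Node's fs module can supply the file list instead. This is a sketch only: listFiles is a hypothetical helper, it does not follow symbolic links, and it skips nothing (in practice you would probably want to ignore directories such as .git and node_modules).

```javascript
const fs = require('fs')
const path = require('path')

// Hypothetical helper: recursively collect every file path under `dir`.
function listFiles (dir) {
  return fs.readdirSync(dir, { withFileTypes: true }).flatMap(entry => {
    const fullPath = path.join(dir, entry.name)
    // Descend into real directories; everything else is reported as a file.
    return entry.isDirectory() ? listFiles(fullPath) : [fullPath]
  })
}
```

The array returned by listFiles('/path/to/local/repo') can be passed to signature.match in place of the recursive-readdir result.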
4. API
The data-loss-signatures module provides a Signature class, which validates data-loss signatures and converts regular expression strings to RE2 (whenever possible).
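The "whenever possible" qualifier matters because RE2 rejects some syntax that JavaScript's RegExp accepts, such as lookaheads and backreferences. One way such a conversion could work is sketched below. This is an assumption about the approach, not the module's actual source: compilePattern is a hypothetical name, and the re2 package is loaded optionally so the sketch still runs when it is not installed.

```javascript
// Load the optional 're2' package if it is installed; otherwise fall back.
let RE2 = null
try {
  RE2 = require('re2')
} catch (err) {
  RE2 = null
}

// Hypothetical helper: prefer an RE2 expression, and fall back to the
// built-in RegExp when RE2 is missing or rejects the pattern.
function compilePattern (pattern) {
  if (RE2) {
    try {
      return new RE2(pattern)
    } catch (err) {
      // RE2 does not support this syntax (e.g. lookahead); use RegExp below.
    }
  }
  return new RegExp(pattern)
}

const sshKey = compilePattern('^.*_rsa$')
console.log(sshKey.test('id_rsa')) // true
console.log(sshKey.test('id_rsa.pub')) // false
```

Either branch returns an object with a test method, so calling code does not need to know which engine compiled the pattern.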
The data-loss-signatures module's public API provides:
- factory method: a convenience function that creates a signature object.
- nullSignature: implements a default object literal with all signature properties set to null.
- Signature: a class that constructs a signature object.
- signatures: an array of Signature instances.
- toArray(data: {String|Array.<Object>}): generates an Array.<Signature> from a JSON string or object literal array.
- validParts: a constants enum of valid Signature.prototype.part values.
- validTypes: a constants enum of valid Signature.prototype.type values.
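To make the relationship between nullSignature and toArray more concrete, here is a rough stand-in written without the module. It is not the library's implementation: toPlainSignatures and NULL_SIGNATURE are hypothetical names, and where the real toArray is documented to return Signature instances, this sketch returns plain objects.

```javascript
// Hypothetical default: every signature property set to null.
const NULL_SIGNATURE = Object.freeze({
  caption: null,
  description: null,
  part: null,
  pattern: null,
  type: null
})

// Hypothetical stand-in for toArray: accept a JSON string or an array of
// object literals and return an array of normalized plain objects.
function toPlainSignatures (data) {
  const list = typeof data === 'string' ? JSON.parse(data) : data
  if (!Array.isArray(list)) {
    throw new TypeError('Expected a JSON array string or an array of objects')
  }
  // Missing properties fall back to null; supplied ones win.
  return list.map(item => ({ ...NULL_SIGNATURE, ...item }))
}

const fromJson = toPlainSignatures('[{"part":"extension","pattern":".pem","type":"match"}]')
console.log(fromJson[0])
```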
4.1. data-loss-signatures.Signature
A class that constructs Signature objects.
const { Signature, validParts, validTypes } = require('data-loss-signatures')

const signature = new Signature({
  caption: 'Potential cryptographic private key',
  description: '',
  part: validParts.EXTENSION,
  pattern: '.pem',
  type: validTypes.MATCH
})
4.2. data-loss-signatures.Signature.prototype.match
Discover possible data leaks by matching a Signature pattern against file extensions, names, and paths.
const { Signature } = require('data-loss-signatures')

const rsaTokenSignature = new Signature({
  caption: 'Private SSH key',
  description: '',
  part: 'filename',
  pattern: '^.*_rsa$',
  type: 'regex'
})

const suspiciousFilePath = '/hmm/what/might/this/be/id_rsa'
rsaTokenSignature.match(suspiciousFilePath)
// => ['/hmm/what/might/this/be/id_rsa']

const fileThatIsJustBeingCoolBruh = 'file/that/is/just/being/cool/bruh'
rsaTokenSignature.match(fileThatIsJustBeingCoolBruh)
// => null
Review the source code for signature.
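The difference between the two type values can be illustrated with a small stand-in for match. This is a sketch under assumptions, not the module's source: evaluate is a hypothetical function that receives the already-extracted part (for example, the filename) rather than a full path, and it returns a boolean for simplicity where the documented method returns an array or null.

```javascript
// Hypothetical stand-in for Signature.prototype.match.
// `value` is the part of the path the signature targets (e.g. the filename).
function evaluate (signature, value) {
  if (signature.type === 'match') {
    // Strict string equality.
    return value === signature.pattern
  }
  if (signature.type === 'regex') {
    // Regular expression test.
    return new RegExp(signature.pattern).test(value)
  }
  throw new RangeError(`Unknown signature type: ${signature.type}`)
}

const sshKey = { part: 'filename', pattern: '^.*_rsa$', type: 'regex' }
const pemFile = { part: 'extension', pattern: '.pem', type: 'match' }

console.log(evaluate(sshKey, 'id_rsa')) // true
console.log(evaluate(sshKey, 'bruh')) // false
console.log(evaluate(pemFile, '.pem')) // true
```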
5. Accessing signatures with other tools and programming languages
You can access signatures.json without the data-loss-signatures Node module. The examples below cover several tools and programming languages.
You can access data-loss rules using HTTPS. You can GET all signatures directly from GitLab with cURL.
curl -X GET \
'https://gitlab.com/gregswindle/data-loss-signatures/raw/master/signatures.json'
Go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	url := "https://gitlab.com/gregswindle/data-loss-signatures/raw/master/signatures.json"
	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Add("Private-Token", "<your-personal-token>")
	req.Header.Add("cache-control", "no-cache")
	res, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer res.Body.Close()
	// io.ReadAll replaces the deprecated ioutil.ReadAll (Go 1.16+).
	body, err := io.ReadAll(res.Body)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(res)
	fmt.Println(string(body))
}
Node.js
const https = require('https')

const options = {
  method: 'GET',
  hostname: 'gitlab.com',
  path: '/gregswindle/data-loss-signatures/raw/master/signatures.json',
  headers: {
    'Private-Token': '<your-access-token>',
    'cache-control': 'no-cache'
  }
}

const req = https.request(options, res => {
  const chunks = []
  res.on('data', chunk => {
    chunks.push(chunk)
  })
  res.on('end', () => {
    const body = Buffer.concat(chunks)
    console.log(body.toString())
  })
})

// Report network errors instead of failing silently.
req.on('error', err => console.error(err))
req.end()
Python3
import http.client

# The file is served over HTTPS, so use HTTPSConnection.
conn = http.client.HTTPSConnection("gitlab.com")
headers = {
    'Accept': "application/json",
    'cache-control': "no-cache"
}
conn.request("GET", "/gregswindle/data-loss-signatures/raw/master/signatures.json", headers=headers)
res = conn.getresponse()
data = res.read()
print(data.decode("utf-8"))
Python2
import requests

url = "https://gitlab.com/gregswindle/data-loss-signatures/raw/master/signatures.json"
payload = ""
headers = {
    'Accept': "application/json",
    'cache-control': "no-cache"
}
response = requests.request("GET", url, data=payload, headers=headers)
print(response.text)
Ruby
require 'uri'
require 'net/http'

url = URI("https://gitlab.com/gregswindle/data-loss-signatures/raw/master/signatures.json")
http = Net::HTTP.new(url.host, url.port)
# Enable TLS; without this, Net::HTTP sends plain HTTP to port 443.
http.use_ssl = true
request = Net::HTTP::Get.new(url)
request["Private-Token"] = '<your-personal-token>'
request["cache-control"] = 'no-cache'
response = http.request(request)
puts response.read_body
6. Maintainers
The Maintainer Guide has useful information for Maintainers and Trusted Committers.
7. Contributions
We gratefully accept Merge Requests! Here's what you need to know to get started.
Before submitting a Merge Request, please read our:
Thanks goes to our awesome contributors (emoji key):
This project follows the all-contributors specification. Contributions of any kind welcome!
7.1. Adding a Signature
Before adding a new Signature, please review all current definitions: the Signature might already exist.
If the Signature does not exist, please be sure to add your Signature with the following properties:
- caption: A succinct summary for the Signature. Think of caption as a well-written email subject.
- description: Provide more details about the Signature if necessary. description is especially useful for differentiating similar Signatures.
- part: An enumeration that defines what the Signature is evaluating. Valid values are:
  - contents: The string(s) within a file.
  - extension: A file extension (which usually indicates the Content-Type or MIME type).
  - filename: The unique name of the file.
  - path: The directory path relative to the repo and without the filename.
- pattern: The string or regular expression to look for.
- type: An enumeration that defines how to evaluate for secrets. Valid values are:
  - match: A strict string equivalency evaluation.
  - regex: A regular expression "search" or "test".
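Before opening a Merge Request, it can help to check a new definition against the property rules above. The sketch below is one possible checker, not the project's own validation: validateSignature, VALID_PARTS, and VALID_TYPES are hypothetical names, and treating caption and pattern as required non-empty strings (with description optional) is an assumption.

```javascript
const VALID_PARTS = ['contents', 'extension', 'filename', 'path']
const VALID_TYPES = ['match', 'regex']

// Hypothetical checker: returns a list of problems (empty when none found).
function validateSignature (signature) {
  const problems = []
  for (const key of ['caption', 'pattern']) {
    if (typeof signature[key] !== 'string' || signature[key].length === 0) {
      problems.push(`${key} must be a non-empty string`)
    }
  }
  if (!VALID_PARTS.includes(signature.part)) {
    problems.push(`part must be one of: ${VALID_PARTS.join(', ')}`)
  }
  if (!VALID_TYPES.includes(signature.type)) {
    problems.push(`type must be one of: ${VALID_TYPES.join(', ')}`)
  }
  if (signature.type === 'regex' && typeof signature.pattern === 'string') {
    try {
      // Compile only to confirm the pattern is syntactically valid.
      RegExp(signature.pattern)
    } catch (err) {
      problems.push(`pattern is not a valid regular expression: ${err.message}`)
    }
  }
  return problems
}

console.log(validateSignature({
  caption: 'Private SSH key',
  description: '',
  part: 'filename',
  pattern: '^.*_rsa$',
  type: 'regex'
})) // []
```

Returning a list of problems rather than throwing on the first one lets a contributor fix every issue in a single pass.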
7.2. Editing a Signature
Edits are welcome! Just be sure to unit test.
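A unit test for an edited pattern can be as small as a list of names that should and should not match. The sketch below uses Node's built-in assert module; the project's actual test framework and file layout are not described here, so treat the structure and the sample file names as assumptions.

```javascript
const { strictEqual } = require('assert')

// The edited definition under test (pattern taken from the example in section 4.2).
const signature = { part: 'filename', pattern: '^.*_rsa$', type: 'regex' }
const regex = new RegExp(signature.pattern)

// Hypothetical sample file names for illustration.
const shouldMatch = ['id_rsa', 'deploy_rsa']
const shouldNotMatch = ['id_rsa.pub', 'README.md']

for (const name of shouldMatch) {
  strictEqual(regex.test(name), true, `${name} should match`)
}
for (const name of shouldNotMatch) {
  strictEqual(regex.test(name), false, `${name} should not match`)
}
console.log('all pattern checks passed')
```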
7.3. Removing a Signature
Please provide a testable justification for any Signature removal.
8. License
Apache-2.0 © 2019 Greg Swindle
9. References and Attributions
^1: What is Data Leakage? Defined, Explained, and Explored | Forcepoint. (2019) Retrieved January 27, 2019, from https://www.forcepoint.com/cyber-edu/data-leakage