semgrep v0.0.1
sgrep
sgrep is a tool for easily detecting and preventing bugs and anti-patterns in
your codebase. It combines the convenience of grep with the correctness of
syntactical and semantic search. Quickly write rules so you can code with
confidence.
Try it now: https://sgrep.live
Overview
Language support:
| Python | Javascript | Go | Java | C | Typescript | PHP |
|---|---|---|---|---|---|---|
| ✅ | ✅ | ✅ | ✅ | ✅ | Coming... | Coming... |
Example patterns:
| Pattern | Matches |
|---|---|
| $X == $X | if (node.id == node.id): ... |
| requests.get(..., verify=False, ...) | requests.get(url, timeout=3, verify=False) |
| os.system(...) | from os import system; system('echo sgrep') |
| $ELEMENT.innerHTML | el.innerHTML = "<img src='x' onerror='alert(`XSS`)'>"; |
| $TOKEN.SignedString([]byte("...")) | ss, err := token.SignedString([]byte("HARDCODED KEY")) |
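To make the first row of the table concrete, here is a minimal sketch of what a pattern like `$X == $X` is asking for: an equality comparison whose two sides are the same expression. It uses Python's standard `ast` module and is only an illustration of the idea, not how sgrep is implemented. The helper name `find_self_comparisons` is hypothetical.

```python
import ast


def find_self_comparisons(source):
    """Return the line numbers of `==` comparisons whose two sides are identical.

    Illustration only: this is not sgrep's implementation.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Compare)
            and len(node.ops) == 1
            and isinstance(node.ops[0], ast.Eq)
            # ast.dump() omits line/column info by default, so two
            # structurally identical expressions produce the same string.
            and ast.dump(node.left) == ast.dump(node.comparators[0])
        ):
            hits.append(node.lineno)
    return sorted(hits)


sample = (
    "a = 1\n"
    "b = 2\n"
    "if a == a:  # oops, supposed to be a == b\n"
    "    print('sgrep test')\n"
)
print(find_self_comparisons(sample))  # prints [3]
```

Because the comparison is made on the parsed tree rather than on text, whitespace and comments do not affect the result, which is the main advantage over a plain grep.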
Installation
Install sgrep with Docker:
$ docker pull returntocorp/sgrep
And double check that it was installed correctly:
$ docker run --rm returntocorp/sgrep --help
Installation with Brew (Experimental)
brew tap returntocorp/sgrep https://github.com/returntocorp/sgrep.git
brew install semgrep
Usage
Start with a simple example:
$ cat << EOF > test.py
a = 1
b = 2
if a == a: # oops, supposed to be a == b
    print('sgrep test')
EOF
$ docker run --rm -v "${PWD}:/home/repo" returntocorp/sgrep --lang python --pattern '$X == $X' test.py
test.py
3:if a == a: # oops, supposed to be a == b
From here you can use our rules to search for issues in your codebase:
$ cd /path/to/code
$ docker run --rm -v "${PWD}:/home/repo" returntocorp/sgrep --config r2c
You can also create your own rules:
$ cd /path/to/code
$ docker run --rm -v "${PWD}:/home/repo" returntocorp/sgrep --generate-config
$ docker run --rm -v "${PWD}:/home/repo" returntocorp/sgrep
Configuration
For simple patterns use the --lang and --pattern flags. This mode of
operation is useful for quickly iterating on a pattern on a single file or
folder:
$ docker run --rm -v "${PWD}:/home/repo" returntocorp/sgrep --lang javascript --pattern 'eval(...)' path/to/file.js
To see the options available for fine-tuning a search, pass the --help flag:
$ docker run --rm returntocorp/sgrep --help
Configuration Files
For advanced configuration use the --config flag. This flag accepts several
kinds of input:
--config <file|folder|yaml_url|tarball_url|registry_name>
In the absence of this flag, a default configuration is loaded from .sgrep.yml
or from multiple files matching .sgrep/**/*.yml.
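The default lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions, not sgrep's actual loader: the function name `find_default_configs` is hypothetical, and since the text does not say which location takes precedence, the sketch simply collects both.

```python
from pathlib import Path


def find_default_configs(root):
    """Collect default config files under `root` (hypothetical helper).

    Assumption: both .sgrep.yml and .sgrep/**/*.yml are gathered; the
    precedence between the two is not specified in this README.
    """
    root = Path(root)
    found = []
    single_file = root / ".sgrep.yml"
    if single_file.is_file():
        found.append(single_file)
    rules_dir = root / ".sgrep"
    if rules_dir.is_dir():
        # rglob("*.yml") is equivalent to glob("**/*.yml"): it matches
        # files directly inside .sgrep/ as well as in nested folders.
        found.extend(sorted(rules_dir.rglob("*.yml")))
    return found


print(find_default_configs("."))
```

Run from the root of a repository, this lists the files that would be candidates for the default configuration under the assumptions above.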
Operators
Configuration files make use of two primary operators:
- Metavariables like $X, $WIDGET, or $USERS. Metavariable names can only contain uppercase characters; names like $x or $SOME_VALUE are invalid. Metavariables are used to track a variable across a specific code scope.
- The ... (ellipsis) operator. The ellipsis operator abstracts away sequences so you don't have to sweat the details of a particular code pattern.
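The naming rule for metavariables stated above can be expressed as a small check. This is a sketch of the rule as this README words it (a `$` followed by uppercase letters only), not sgrep's own parser; `is_valid_metavariable` is a hypothetical helper name.

```python
import re

# Assumption: a metavariable is '$' followed by one or more uppercase
# ASCII letters, as described in the list above.
METAVARIABLE = re.compile(r"\$[A-Z]+")


def is_valid_metavariable(name):
    """Return True if `name` follows the naming rule described above."""
    return METAVARIABLE.fullmatch(name) is not None


for name in ["$X", "$WIDGET", "$USERS", "$x", "$SOME_VALUE"]:
    print(name, is_valid_metavariable(name))
```

The first three names are accepted and the last two are rejected, matching the examples given in the list.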
Let's consider an example:
rules:
  - id: open-never-closed
    patterns:
      - pattern: $FILE = open(...)
      - pattern-not-inside: |
          $FILE = open(...)
          ...
          $FILE.close()
    message: "file object opened without corresponding close"
    languages: [python]
    severity: ERROR
This rule looks for files that are opened but never closed. It accomplishes
this by looking for the open(...) pattern and not a following close()
pattern. The $FILE metavariable ensures that the same variable name is used
in the open and close calls. The ellipsis operator allows for any arguments
to be passed to open and any sequence of code statements in-between the open
and close calls. We don't care how open is called or what happens up to
a close call, we just need to make sure close is called.
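As an illustration, here are two small Python functions of the kind this rule is meant to tell apart. The function names are hypothetical, and the comments describe what the rule would be expected to do as it is explained above; they are not captured sgrep output.

```python
import os
import tempfile


def read_without_close(path):
    handle = open(path)   # matches `$FILE = open(...)` with $FILE = handle
    return handle.read()  # no handle.close() follows, so the rule should report the open


def read_with_close(path):
    handle = open(path)   # also matches `$FILE = open(...)`
    data = handle.read()  # covered by the `...` between open and close
    handle.close()        # same $FILE closed, so `pattern-not-inside` excludes the match
    return data


# Both functions return the same data; only the first leaves the file unclosed.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    with open(path, "w") as out:
        out.write("hello")
    print(read_without_close(path) == read_with_close(path))  # prints True
```

The two functions behave identically at runtime, which is why a structural rule is useful here: the difference is visible only in the shape of the code.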
For a more complete introduction to the configuration format please see the advanced configuration documentation.
Equivalences
Equivalences are another key concept in sgrep. sgrep automatically searches
for code that is semantically equivalent. For example, the following patterns
are semantically equivalent:
subprocess.Popen(...)

from subprocess import Popen as sub_popen
result = sub_popen("ls")
For a full list of sgrep feature support by language see the
language matrix.
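The reason an aliased import can be treated as equivalent is that, in Python, both names refer to the same object. The short check below demonstrates only that language fact; it says nothing about how sgrep resolves names internally.

```python
import os
import subprocess
from os import system
from subprocess import Popen as sub_popen

# An aliased import and the fully qualified name are bound to the same object,
# so a call through either name invokes the same function or class.
print(sub_popen is subprocess.Popen)  # prints True
print(system is os.system)            # prints True
```

A purely textual search for `subprocess.Popen(` would miss the call to `sub_popen("ls")`, even though it runs exactly the same code.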
Registry
As mentioned above, you may also specify a registry name as configuration. r2c provides a registry of configuration files. These rules have been tuned on thousands of repositories using our analysis platform.
$ docker run --rm -v "${PWD}:/home/repo" returntocorp/sgrep --config r2c
Resources
- r2c sgrep meetup slides
- Simple configuration documentation
- Advanced configuration documentation
- Integrations
- Development
- Bug reports
Contribution
sgrep is LGPL-licensed, feel free to help out: CONTRIBUTING.