@sesamestrong/shift-refactor v2.0.0
Shift Refactor
shift-refactor is a suite of utility functions designed to analyze and modify JavaScript source files.
It originated as a tool to reverse engineer obfuscated JavaScript but is general-purpose enough for arbitrary transformations.
Who is this for?
Anyone who works with JavaScript ASTs (Abstract Syntax Trees). If you're not familiar with ASTs, here are a few use cases where they come in useful:
- Automatic refactoring, making sweeping changes to JavaScript source files (Developers, QA).
- Analyzing JavaScript for linting, complexity scoring, etc (Developers, QA).
- Extracting API details to auto-generate documentation or tests (Developers, QA).
- Scraping JavaScript for information or security vulnerabilities (Pen Testers, QA, Security Teams, Hacker types).
- Programmatically transforming malicious or obfuscated JavaScript (Reverse Engineers).
Status
Stable.
Installation
$ npm install shift-refactor
Usage
The script below finds and prints all literal strings in a script.
// Read 'example.js' as text
const fs = require('fs');
const src = fs.readFileSync('example.js', 'utf-8');
const { refactor } = require('.'); // require('shift-refactor');
// Create a refactor query object
const $script = refactor(src);
// Select all `LiteralStringExpression`s
const $stringNodes = $script('LiteralStringExpression')
// Turn the string AST nodes into real JS strings
const strings = $stringNodes.codegen();
// Output the strings to the console
strings.forEach(string => console.log(string));
Advanced Example
This script takes the obfuscated source and turns it into something much more readable.
const { refactor } = require('.'); // require('shift-refactor');
const Shift = require('shift-ast');
// Obfuscated source
const src = `var a=['\x74\x61\x72\x67\x65\x74','\x73\x65\x74\x54\x61\x72\x67\x65\x74','\x77\x6f\x72\x6c\x64','\x67\x72\x65\x65\x74','\x72\x65\x61\x64\x65\x72'];var b=function(c,d){c=c-0x0;var e=a[c];return e;};(function(){class c{constructor(d){this[b('0x0')]=d;}['\x67\x72\x65\x65\x74'](){console['\x6c\x6f\x67']('\x48\x65\x6c\x6c\x6f\x20'+this[b('0x0')]);}[b('0x1')](e){this['\x74\x61\x72\x67\x65\x74']=e;}}const f=new c(b('0x2'));f[b('0x3')]();f[b('0x1')](b('0x4'));f[b('0x3')]();}());`;
const $script = refactor(src);
const strings = $script(`Script > :first-child ArrayExpression > .elements`);
const destringifyDeclarator = $script(`VariableDeclarator[binding.name="b"][init.params.items.length=2]`);
destringifyDeclarator.rename('destringify');
const destringifyOffset = destringifyDeclarator.$(`BinaryExpression > LiteralNumericExpression`);
const findIndex = (c, d) => c - destringifyOffset.first().value;
$script(`CallExpression[callee.name="destringify"]`).replace(
  node => {
    return new Shift.LiteralStringExpression({
      value: strings.get(findIndex(node.arguments[0].value)).value
    });
  }
);
$script(`[binding.name="a"]`).delete();
$script(`[binding.name="destringify"]`).delete();
$script.convertComputedToStatic();
console.log($script.print());
Query Syntax
The query syntax comes from shift-query (a port of esquery) and closely resembles CSS selector syntax.
The following selectors are supported:
- AST node type: FunctionDeclaration
- wildcard: *
- attribute existence: [attr]
- attribute value: [attr="foo"] or [attr=123]
- attribute regex: [attr=/foo.*/]
- attribute conditions: [attr!="foo"], [attr>2], [attr<3], [attr>=2], or [attr<=3]
- nested attribute: [attr.level2="foo"]
- field: FunctionDeclaration > IdentifierExpression.name
- first or last child: :first-child or :last-child
- nth-child (no ax+b support): :nth-child(2)
- nth-last-child (no ax+b support): :nth-last-child(1)
- descendant: ancestor descendant
- child: parent > child
- following sibling: node ~ sibling
- adjacent sibling: node + adjacent
- negation: :not(ExpressionStatement)
- matches-any: :matches([attr] > :first-child, :last-child)
- subject indicator: !IfStatement > [name="foo"]
- class of AST node: :statement, :expression, :declaration, :function, or :target
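To make the attribute selectors above more concrete, here is a minimal sketch of how a condition such as [binding.name="b"] or [init.params.items.length=2] can be read: follow the dotted path on the node, then compare the result. The helpers resolvePath and matchesAttribute are hypothetical names used for illustration, and the node is a hand-written plain object. This is not shift-query's implementation.

```javascript
// Illustrative model only: this is NOT shift-query's implementation.
// The node is a hand-written plain object shaped loosely like a Shift AST node.
const node = {
  type: 'VariableDeclarator',
  binding: { type: 'BindingIdentifier', name: 'b' },
  init: { type: 'FunctionExpression', params: { items: [{}, {}] } },
};

// Resolve a dotted path such as "init.params.items.length" against a node.
// Returns undefined as soon as any step of the path is missing.
function resolvePath(target, path) {
  return path.split('.').reduce(
    (value, key) => (value == null ? undefined : value[key]),
    target
  );
}

// Hypothetical helper: test one [path=value] or [path=/regex/] condition.
function matchesAttribute(target, path, expected) {
  const actual = resolvePath(target, path);
  if (expected instanceof RegExp) {
    return actual !== undefined && expected.test(String(actual));
  }
  return actual === expected;
}

// Rough equivalent of: VariableDeclarator[binding.name="b"][init.params.items.length=2]
const isDestringify =
  node.type === 'VariableDeclarator' &&
  matchesAttribute(node, 'binding.name', 'b') &&
  matchesAttribute(node, 'init.params.items.length', 2);

console.log(isDestringify); // true
```

Chaining several bracketed conditions on one selector behaves like the && chain here: every condition has to hold for the node to match.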
Useful sites & tools
- Shift-query's online sandbox to test queries quickly.
- Shift-query CLI tool to query JavaScript on the command line.
- AST Explorer to explore JavaScript ASTs visually (make sure to select "shift" on the top menu bar).
- Shift-AST.org - home of the Shift JavaScript tool suite.
API
refactor(string | Shift AST)
Create a refactor query object.
Note:
This function assumes that it is being passed complete JavaScript source or a root AST node (Script or Module) so that it can create and maintain global state.
Example
const { refactor } = require('shift-refactor');
const $script = refactor(`/* JavaScript Source *\/`);
Refactor Query Object
The API is meant to look and feel like jQuery since – like jQuery – it works with CSS-style queries and regularly accesses nodes on a tree. Each query object is both a function and an instance of the internal RefactorSession class.
Calling the query object as a function with a selector produces a new query object containing the matching nodes; calling methods on the object acts on the nodes already selected. The examples prefix refactor query objects with a $ to indicate that they are refactor query objects and not naked Nodes or other objects.
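The "both a function and an object with methods" shape can be sketched in plain JavaScript, since functions are objects and can carry properties. The sketch below is only a toy model: createQuery and selectByType are hypothetical names, the tree is a hand-written plain object, and the real RefactorSession does considerably more.

```javascript
// Illustrative sketch only; the real RefactorSession is more involved.
// `createQuery` and `selectByType` are hypothetical names, not library API.
function createQuery(nodes, select) {
  // Calling the object runs a sub-query against the currently selected nodes.
  const $query = selector => createQuery(select(nodes, selector), select);
  // Methods act on the nodes that are already selected.
  $query.nodes = nodes;
  $query.map = fn => nodes.map(fn);
  $query.first = () => nodes[0];
  return $query;
}

// Toy "selector engine": collect every descendant whose type matches.
function selectByType(roots, type) {
  const found = [];
  const visit = node => {
    if (node.type === type) found.push(node);
    (node.children || []).forEach(visit);
  };
  roots.forEach(root => (root.children || []).forEach(visit));
  return found;
}

const tree = {
  type: 'Script',
  children: [
    { type: 'Declaration', children: [{ type: 'Identifier', name: 'a' }] },
    { type: 'Declaration', children: [{ type: 'Identifier', name: 'b' }] },
  ],
};

const $script = createQuery([tree], selectByType);
const $decls = $script('Declaration'); // call it like a function
const names = $decls('Identifier').map(node => node.name); // then use a method
console.log(names); // [ 'a', 'b' ]
```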
Example
const { refactor } = require('shift-refactor');
const $script = refactor(src);
const $variableDecls = $script('VariableDeclarationStatement')
const $bindingIdentifiers = $variableDecls('BindingIdentifier');
const names = $bindingIdentifiers.map(node => node.name);
Methods
- .$(queryOrNodes)
- .append(replacer)
- .closest(closestSelector)
- .codegen()
- .declarations()
- .delete()
- .filter(iterator)
- .find(iterator)
- .findMatchingExpression(sampleSrc)
- .findMatchingStatement(sampleSrc)
- .findOne(selectorOrNode)
- .first(selector)
- .forEach(iterator)
- .get(index)
- .logOut()
- .lookupVariable()
- .lookupVariableByName(name)
- .map(iterator)
- .nameString()
- .parents()
- .prepend(replacer)
- .print()
- .query(selector)
- .raw()
- .references()
- .rename(newName)
- .replace(replacer)
- .replaceAsync(replacer)
- .replaceChildren(query, replacer)
- .statements()
- .toJSON()
- .type()
.$(queryOrNodes)
Sub-query from selected nodes
Example
const { refactor } = require('shift-refactor');
const src = `
let a = 1;
function myFunction() {
let b = 2, c = 3;
}
`
const $script = refactor(src);
const funcDecl = $script('FunctionDeclaration[name.name="myFunction"]');
const innerIdentifiers = funcDecl.$('BindingIdentifier');
// innerIdentifiers.nodes: myFunction, b, c (note: does not include a)
.append(replacer)
Inserts the result of replacer after the selected statement.
Note:
Only works on Statement nodes.
Example
const { refactor } = require('shift-refactor');
const Shift = require('shift-ast');
const src = `
var message = "Hello";
console.log(message);
`
const $script = refactor(src);
$script('LiteralStringExpression[value="Hello"]').closest(':statement').append('debugger');
.closest(closestSelector)
Finds the closest parent node that matches the passed selector.
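As a rough mental model of the description above, "closest parent" means walking up the ancestor chain until a node matches. The sketch below uses hypothetical plain-object nodes with an explicit parent link and a predicate in place of a selector. It assumes the search starts at the parent rather than at the node itself, and it is not the library's implementation.

```javascript
// Illustrative sketch: parent links are stored directly on hypothetical
// plain-object nodes, and a predicate stands in for the selector.
function closest(node, predicate) {
  let current = node.parent; // assumption: the search starts at the parent
  while (current && !predicate(current)) {
    current = current.parent;
  }
  return current || null;
}

const fn = { type: 'FunctionDeclaration', name: 'someFunction', parent: null };
const statement = { type: 'ExpressionStatement', parent: fn };
const call = { type: 'CallExpression', parent: statement };

const enclosing = closest(call, n => n.type === 'FunctionDeclaration');
console.log(enclosing.name); // someFunction
```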
Example
const { refactor } = require('shift-refactor');
const src = `
function someFunction() {
interestingFunction();
}
function otherFunction() {
interestingFunction();
}
`
const $script = refactor(src);
// finds all functions that call `interestingFunction`
const fnDecls = $script('CallExpression[callee.name="interestingFunction"]').closest('FunctionDeclaration');
.codegen()
Generates JavaScript source for the first selected node.
Example
const { refactor } = require('shift-refactor');
const src = `
for (var i=1; i < 101; i++){
if (i % 15 == 0) console.log("FizzBuzz");
else if (i % 3 == 0) console.log("Fizz");
else if (i % 5 == 0) console.log("Buzz");
else console.log(i);
}
`
const $script = refactor(src);
const strings = $script("LiteralStringExpression");
console.log(strings.codegen());
.declarations()
Finds the declaration for the selected Identifier nodes.
Note:
Returns a list of Declaration objects for each selected node, not a shift-refactor query object.
Example
const { refactor } = require('shift-refactor');
const src = `
const myVariable = 2, otherVar = 3;
console.log(myVariable, otherVar);
`
const $script = refactor(src);
// selects the parameters to console.log() and finds their declarations
const decls = $script('CallExpression[callee.object.name="console"][callee.property="log"] > .arguments').declarations();
.delete()
Delete nodes
Example
const { refactor } = require('shift-refactor');
const $script = refactor('foo();bar();');
$script('ExpressionStatement[expression.callee.name="foo"]').delete();
.filter(iterator)
Filter selected nodes via passed iterator
Example
const { refactor } = require('shift-refactor');
const src = `
let doc = window.document;
function addListener(event, fn) {
doc.addEventListener(event, fn);
}
`
const $script = refactor(src);
const values = $script('BindingIdentifier').filter(node => node.name === 'doc');
.find(iterator)
Finds a node via the passed iterator
Example
const { refactor } = require('shift-refactor');
const src = `
const myMessage = "He" + "llo" + " " + "World";
`
const $script = refactor(src);
$script('LiteralStringExpression')
  .find(node => node.value === 'World')
  .replace('"Reader"');
.findMatchingExpression(sampleSrc)
Finds an expression that closely matches the passed source.
Note:
Used for selecting nodes by source pattern instead of query. The passed source is parsed as a Script and the first statement is expected to be an ExpressionStatement. Matching is done by matching the properties of the parsed statement, ignoring additional properties/nodes in the source tree.
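The "match the sample's properties, ignore extras" rule in the note above can be pictured as a recursive subset comparison. The sketch below works on hand-written plain objects; matchesSample is a hypothetical helper for illustration and is not taken from the library.

```javascript
// Illustrative sketch of "match the sample's properties, ignore extras".
// `matchesSample` is a hypothetical helper, not the library's code.
function matchesSample(sample, candidate) {
  if (sample === null || typeof sample !== 'object') {
    return sample === candidate; // primitives must be equal
  }
  if (candidate === null || typeof candidate !== 'object') {
    return false;
  }
  // Every property present on the sample must match; extras are ignored.
  return Object.keys(sample).every(key => matchesSample(sample[key], candidate[key]));
}

const sample = { type: 'CallExpression', callee: { name: 'targetFunction' } };
const candidate = {
  type: 'CallExpression',
  callee: { type: 'IdentifierExpression', name: 'targetFunction' },
  arguments: [{ name: 'param1' }, { name: 'param2' }],
};
const other = { type: 'CallExpression', callee: { name: 'someFunction' } };

console.log(matchesSample(sample, candidate)); // true
console.log(matchesSample(sample, other)); // false
```

Because only the sample's own properties are checked, a short pattern can match a much larger node in the source tree.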
Example
const { refactor } = require('shift-refactor');
const src = `
const a = someFunction(paramOther);
const b = targetFunction(param1, param2);
`
const $script = refactor(src);
const targetCallExpression = $script.findMatchingExpression('targetFunction(param1, param2)');
.findMatchingStatement(sampleSrc)
Finds a statement that matches the passed source.
Note:
Used for selecting nodes by source pattern vs query. The passed source is parsed as a Script and the first statement alone is used as the statement to match. Matching is done by matching the properties of the parsed statement, ignoring additional properties/nodes in the source tree.
Example
const { refactor } = require('shift-refactor');
const src = `
function someFunction(a,b) {
var innerVariable = "Lots of stuff in here";
foo(a);
bar(b);
}
`
const $script = refactor(src);
const targetDeclaration = $script.findMatchingStatement('function someFunction(a,b){}');
.findOne(selectorOrNode)
Finds and selects a single node, throwing an error if zero or more than one is found.
Note:
This is useful for when you want to target a single node but aren't sure how specific your query needs to be to target that node and only that node.
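The "exactly one match" contract can be summarized with a small guard. The sketch below is a hypothetical helper (expectOne) over plain arrays, shown only to illustrate the behavior described above; the error message is made up and does not reflect what shift-refactor actually reports.

```javascript
// Illustrative sketch of the "exactly one match" contract.
// `expectOne` and its error message are hypothetical, not part of shift-refactor.
function expectOne(nodes) {
  if (nodes.length !== 1) {
    throw new Error(`Expected exactly one node, found ${nodes.length}`);
  }
  return nodes[0];
}

const declarators = [{ name: 'outerVariable' }, { name: 'innerVariable' }];
const insideFunction = declarators.filter(d => d.name === 'innerVariable');

console.log(expectOne(insideFunction).name); // innerVariable

try {
  expectOne(declarators); // two matches: the query is too broad
} catch (error) {
  console.log(error.message); // Expected exactly one node, found 2
}
```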
Example
const { refactor } = require('shift-refactor');
const src = `
let outerVariable = 1;
function someFunction(a,b) {
let innerVariable = 2;
}
`
const $script = refactor(src);
// This would throw, because there are multiple VariableDeclarators
// $script.findOne('VariableDeclarator');
// This won't throw because there is only one within the only FunctionDeclaration.
const innerVariableDecl = $script('FunctionDeclaration').findOne('VariableDeclarator');
.first(selector)
Returns the first selected node. Optionally takes a selector and returns the first node that matches the selector.
Example
const { refactor } = require('shift-refactor');
const src = `
func1();
func2();
func3();
`
const $script = refactor(src);
const func1CallExpression = $script('CallExpression').first();
.forEach(iterator)
Iterate over selected nodes
Example
const { refactor } = require('shift-refactor');
const src = `
let a = [1,2,3,4];
`
const $script = refactor(src);
$script('LiteralNumericExpression').forEach(node => node.value *= 2);
.get(index)
Get selected node at index.
Example
const { refactor } = require('shift-refactor');
const src = `
someFunction('first string', 'second string', 'third string');
`
const $script = refactor(src);
const thirdString = $script('LiteralStringExpression').get(2);
.logOut()
console.log()s the selected nodes. Useful for inserting into a chain to see what nodes you are working with.
Example
const { refactor } = require('shift-refactor');
const src = `
let a = 1, b = 2;
`
const $script = refactor(src);
$script("VariableDeclarator").logOut().delete();
.lookupVariable()
Looks up the Variable from the passed identifier node
Note:
Returns Variable objects from shift-scope, which contain all the references and declarations for a program variable.
Example
const { refactor } = require('shift-refactor');
const src = `
const someVariable = 2, other = 3;
someVariable++;
function thisIsAVariabletoo(same, as, these) {}
`
const $script = refactor(src);
// Finds all variables declared within a program
const variables = $script('BindingIdentifier').lookupVariable();
.lookupVariableByName(name)
Looks up Variables by name.
Note:
There may be multiple across a program. Variable lookup operates on the global program state. This method ignores selected nodes.
Example
const { refactor } = require('shift-refactor');
const src = `
const someVariable = 2, other = 3;
`
const $script = refactor(src);
const variables = $script.lookupVariableByName('someVariable');
.map(iterator)
Transform selected nodes via passed iterator
Example
const { refactor } = require('shift-refactor');
const src = `
let doc = window.document;
function addListener(event, fn) {
doc.addEventListener(event, fn);
}
`
const $script = refactor(src);
const values = $script('BindingIdentifier').map(node => node.name);
.nameString()
Retrieve the name of the first selected node. Returns undefined for nodes without names.
Example
const { refactor } = require('shift-refactor');
const src = `
var first = 1, second = 2;
`
const $script = refactor(src);
const firstName = $script('BindingIdentifier[name="first"]').nameString();
.parents()
Retrieve parent node(s)
Example
const { refactor } = require('shift-refactor');
const src = `
var a = 1, b = 2;
`
const $script = refactor(src);
const declarators = $script('VariableDeclarator');
const declaration = declarators.parents();
.prepend(replacer)
Inserts the result of replacer before the selected statement.
Note:
Only works on Statement nodes.
Example
const { refactor } = require('shift-refactor');
const Shift = require('shift-ast');
const src = `
var message = "Hello";
console.log(message);
`
const $script = refactor(src);
$script('ExpressionStatement[expression.type="CallExpression"]').prepend(new Shift.DebuggerStatement());
.print()
Generates JavaScript source for the first selected node.
Example
const { refactor } = require('shift-refactor');
const Shift = require('shift-ast');
const src = `
window.addEventListener('load', () => {
lotsOfWork();
})
`
const $script = refactor(src);
$script("CallExpression[callee.property='addEventListener'] > ArrowExpression")
  .replace(new Shift.IdentifierExpression({name: 'myListener'}));
console.log($script.print());
.query(selector)
Sub-query from selected nodes
Note:
Synonym for .$()
.raw()
Returns the raw Shift node for the first selected node.
Example
const { refactor } = require('shift-refactor');
const src = `
const a = 2;
`
const $script = refactor(src);
const declStatement = $script('VariableDeclarationStatement').raw();
.references()
Finds the references for the selected Identifier nodes.
Note:
Returns a list of Reference objects for each selected node, not a shift-refactor query object.
Example
const { refactor } = require('shift-refactor');
const src = `
let myVar = 1;
function someFunction(a,b) {
myVar++;
return myVar;
}
`
const $script = refactor(src);
const refs = $script('BindingIdentifier[name="myVar"]').references();
.rename(newName)
Rename all references to the first selected node to the passed name.
Note:
Uses the selected node as the target, but affects the global state.
Example
const { refactor } = require('shift-refactor');
const src = `
const myVariable = 2;
myVariable++;
const other = myVariable;
function unrelated(myVariable) { return myVariable }
`
const $script = refactor(src);
$script('VariableDeclarator[binding.name="myVariable"]').rename('newName');
.replace(replacer)
Replace selected node with the result of the replacer parameter
Example
const { refactor } = require('shift-refactor');
const Shift = require('shift-ast');
const src = `
function sum(a,b) { return a + b }
function difference(a,b) {return a - b}
`
const $script = refactor(src);
$script('FunctionDeclaration').replace(node => new Shift.VariableDeclarationStatement({
  declaration: new Shift.VariableDeclaration({
    kind: 'const',
    declarators: [
      new Shift.VariableDeclarator({
        binding: node.name,
        init: new Shift.ArrowExpression({
          isAsync: false,
          params: node.params,
          body: node.body
        })
      })
    ]
  })
}));
.replaceAsync(replacer)
Async version of .replace() that supports asynchronous replacer functions
Example
const { refactor } = require('shift-refactor');
const $script = refactor('var a = "hello";');
async function work() {
  await $script('LiteralStringExpression').replaceAsync(
    (node) => Promise.resolve(`"goodbye"`)
  );
}
.replaceChildren(query, replacer)
Recursively replaces child nodes until no nodes have been replaced.
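"Until no nodes have been replaced" describes a fixed-point loop: run a replacement pass, and repeat while a pass still changes something. The sketch below shows that idea on hand-built plain objects that mimic the node shapes used in the example that follows. The helpers num, add, foldOnce and foldUntilStable are hypothetical, and this is not how shift-refactor implements the method.

```javascript
// Illustrative sketch of "replace until nothing changes" on plain objects.
// This models the idea only; it is not how shift-refactor is implemented.
const num = value => ({ type: 'LiteralNumericExpression', value });
const add = (left, right) => ({ type: 'BinaryExpression', operator: '+', left, right });

// One pass: fold every addition whose operands are both numeric literals.
// Returns the (possibly new) node and whether anything was replaced.
function foldOnce(node) {
  if (node.type !== 'BinaryExpression') return { node, changed: false };
  if (node.left.type === 'LiteralNumericExpression' &&
      node.right.type === 'LiteralNumericExpression') {
    return { node: num(node.left.value + node.right.value), changed: true };
  }
  const left = foldOnce(node.left);
  const right = foldOnce(node.right);
  return {
    node: add(left.node, right.node),
    changed: left.changed || right.changed,
  };
}

// Repeat passes until a pass replaces nothing.
function foldUntilStable(root) {
  let current = root;
  let passes = 0;
  for (;;) {
    const result = foldOnce(current);
    if (!result.changed) return { node: current, passes };
    current = result.node;
    passes += 1;
  }
}

// 1 + 2 + 3 parses as (1 + 2) + 3
const tree = add(add(num(1), num(2)), num(3));
const { node, passes } = foldUntilStable(tree);
console.log(node.value, passes); // 6 2
```

The first pass turns (1 + 2) + 3 into 3 + 3, which only then matches the "both sides are literals" condition; the second pass produces 6. A single non-repeating pass would stop at 3 + 3.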
Example
const { refactor } = require('shift-refactor');
const Shift = require('shift-ast');
const src = `
1 + 2 + 3
`
const $script = refactor(src);
$script.replaceChildren(
  'BinaryExpression[left.type=LiteralNumericExpression][right.type=LiteralNumericExpression]',
  (node) => new Shift.LiteralNumericExpression({value: node.left.value + node.right.value})
);
.statements()
Selects the statements for the selected nodes. Note: it will "uplevel" the inner statements of nodes with a .body property. Does nothing for nodes that have no statements property.
Example
const { refactor } = require('shift-refactor');
const src = `
console.log(1);
console.log(2);
`
const $script = refactor(src);
const rootStatements = $script.statements();
.toJSON()
JSON-ifies the current selected nodes.
Example
const { refactor } = require('shift-refactor');
const src = `
(function(){ console.log("Hey")}())
`
const $script = refactor(src);
const json = $script.toJSON();
.type()
Return the type of the first selected node
Example
const { refactor } = require('shift-refactor');
const Shift = require('shift-ast');
const src = `
myFunction();
`
const $script = refactor(src);
const type = $script('CallExpression').type();