js-virtualizer v1.0.2
js-virtualizer
virtualization-based obfuscation for javascript
js-virtualizer is a proof-of-concept project which brings virtualization-based obfuscation to javascript. In this implementation, bytecode is fed to a virtual machine, implemented in javascript, which runs on its own instruction set. A transpiler is included to convert individual functions to opcodes for the VM. It is important to note that js-virtualizer is not intended for use on entire programs, but rather for individual functions! There will be a significant performance hit if you try to run an entire program through the VM.
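To make the idea of "bytecode fed to a virtual machine" concrete, here is a minimal sketch of a register-based interpreter loop. It is a toy written for this README: the opcode names (LOAD_CONST, ADD, MUL, RETURN), the flat-array encoding, and the runToyVM helper are all assumptions for illustration and are not js-virtualizer's actual instruction set or API.

```javascript
// Toy register-based VM. Opcode names and encoding are illustrative only,
// not js-virtualizer's real instruction set.
const OP = { LOAD_CONST: 0, ADD: 1, MUL: 2, RETURN: 3 };

function runToyVM(bytecode, constants) {
  const registers = new Array(8).fill(undefined);
  let ip = 0; // instruction pointer into the flat bytecode array
  while (ip < bytecode.length) {
    const opcode = bytecode[ip++];
    switch (opcode) {
      case OP.LOAD_CONST: {
        // LOAD_CONST dest, constIndex
        const dest = bytecode[ip++];
        const constIndex = bytecode[ip++];
        registers[dest] = constants[constIndex];
        break;
      }
      case OP.ADD: {
        // ADD dest, a, b
        const dest = bytecode[ip++];
        const a = bytecode[ip++];
        const b = bytecode[ip++];
        registers[dest] = registers[a] + registers[b];
        break;
      }
      case OP.MUL: {
        // MUL dest, a, b
        const dest = bytecode[ip++];
        const a = bytecode[ip++];
        const b = bytecode[ip++];
        registers[dest] = registers[a] * registers[b];
        break;
      }
      case OP.RETURN:
        // RETURN src
        return registers[bytecode[ip++]];
      default:
        throw new Error(`unknown opcode ${opcode} at offset ${ip - 1}`);
    }
  }
  return undefined;
}

// Hand-assembled equivalent of: function f() { return (2 + 3) * 4; }
const constants = [2, 3, 4];
const program = [
  OP.LOAD_CONST, 0, 0, // r0 = 2
  OP.LOAD_CONST, 1, 1, // r1 = 3
  OP.ADD, 2, 0, 1,     // r2 = r0 + r1
  OP.LOAD_CONST, 3, 2, // r3 = 4
  OP.MUL, 2, 2, 3,     // r2 = r2 * r3
  OP.RETURN, 2,
];
const toyResult = runToyVM(program, constants);
console.log(toyResult); // 20
```

A reader of the output only sees the number array and the interpreter, not the original expression. Every operation also goes through the dispatch loop, which is why running whole programs this way is slow.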
Usage
!WARNING
You need to mark the functions you want to virtualize by putting a comment with the text // @virtualize above the function.
// @virtualize
function virtualize() {
  console.log("hello from the virtualized function");
}

function notVirtualized() {
  console.log("this function will not be virtualized");
}
!TIP See examples/basic.js for a full example and the samples folder for some sample code you can try virtualizing.
const {transpile} = require("js-virtualizer");

async function main() {
  const result = await transpile(`
    // @virtualize
    function virtualize() {
      console.log("hello world from the JSVM");
    }
    virtualize()
  `, {
    // the filename of the code; will be used as the default output filename
    fileName: 'example.js',
    // whether or not the transpiler should directly write the output to a file
    writeOutput: true,
    // the path to write the vm for the transpiled code to
    vmOutputPath: "./vm_output.js",
    // the path to write the transpiled code to
    transpiledOutputPath: "./output.js",
    // the passes to apply to the result before returning
    passes: [
      "RemoveUnused",
      "ObfuscateVM",
      "ObfuscateTranspiled"
    ]
  });
  console.log(`Virtualized code saved to: ${result.transpiledOutputPath}`);
}

main();
Transpiler Support
- variables
- proper scoping for let and const
- all primitive types
- object expressions
- array expressions
- object destructuring
- array destructuring
- assignment
- functions
- arrow functions
- function expressions
- function declarations
- function calls (both external and internal) with proper this context
- callbacks
- awaiting functions (running async functions concurrently is not supported)
- a function accessing its own "this" property
- other statements
- return statements
- if/else/else if statements
- for loops
- for of loops
- for in loops
- while loops
- switch cases
- try/catch/finally
- throw statements
- continue statements
- break statements
- misc
- sequence expressions
- template literals
- ternary operators
- logical operators
- new expressions
- unary operators (typeof, delete, etc.)
- binary operators
- update operators
- comparison operators
- bitwise operators
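As a rough guide to what fits inside the supported feature set, the function below sticks to constructs named in the list above: let/const, for of, object and array destructuring, try/catch, throw, continue, template literals, typeof, and a ternary. It is a hypothetical sample (summarizeOrders and its data are made up for this sketch) and has not been verified against the transpiler; it is shown here running as plain javascript.

```javascript
// Hypothetical candidate for virtualization; not verified against the transpiler.
// @virtualize
function summarizeOrders(orders) {
  let total = 0;
  const skipped = [];
  for (const order of orders) {
    const { id, price, quantity } = order; // object destructuring
    try {
      if (typeof price !== "number" || price < 0) {
        throw new Error(`bad price for order ${id}`);
      }
      total += price * quantity;
    } catch (err) {
      // remember the offending order and move on to the next one
      skipped.push(id);
      continue;
    }
  }
  const [firstSkipped] = skipped; // array destructuring
  return {
    total: total,
    skippedCount: skipped.length,
    firstSkipped: skipped.length > 0 ? firstSkipped : null,
  };
}

const summary = summarizeOrders([
  { id: "a", price: 10, quantity: 2 },
  { id: "b", price: -1, quantity: 1 },
  { id: "c", price: 5, quantity: 3 },
]);
console.log(summary); // { total: 35, skippedCount: 1, firstSkipped: 'b' }
```

Note that it avoids var, class declarations, and concurrently running async calls, which are listed as unsupported elsewhere in this README.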
Limitations
!WARNING
It is highly recommended that you modify and obfuscate the vm_dist.js file before using it in a production environment. For instance, leaving the opcode names in the VM makes it much easier to reverse engineer the workings of the virtualized code.
- this project is targeting server-side javascript runtimes such as node.js, and has not been tested in the browser. however, it should be trivial to get it working in the browser by removing the require statements and replacing them with the appropriate browser equivalents in vm_dist.js
- if you try to virtualize a program with async functions running concurrently, it will not work, as the transpiler & virtual machine were not designed with concurrency in mind (it is a proof-of-concept, after all). the JSVM currently does not support async functions in the context of the whole program. however, you can use async functions within a virtualized function, as they have their own context
- performance is not guaranteed. js-virtualizer is not intended for use in high-performance applications. it is intended for use in applications where you need to protect your code from reverse engineering
- given the virtual machine, the virtualized function is pretty trivial to reverse engineer. it is recommended that the virtual machine class is obfuscated before use
- declaring variables with var is not supported. it is not guaranteed that the variable will behave as expected. you should use let or const instead
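The concurrency limitation above can be pictured with a toy model. The sketch below assumes a single register file shared by every call, which is a simplification made up for this example and not a description of the actual VM internals. It shows why sequential awaits are safe while concurrently running calls can read each other's values.

```javascript
// Toy model only: one register file shared by every call, standing in for
// VM state that is not saved and restored across an await.
const sharedRegisters = [undefined];

// Resolves on a later turn of the event loop.
const tick = () => new Promise((resolve) => setTimeout(resolve, 0));

async function doubleWithSharedRegisters(value) {
  sharedRegisters[0] = value; // r0 = value
  await tick();               // another call may run here and overwrite r0
  return sharedRegisters[0] * 2;
}

async function doubleWithOwnRegisters(value) {
  const registers = [value];  // per-call register file
  await tick();
  return registers[0] * 2;
}

async function demo() {
  // Sequential awaits: each call finishes before the next one starts.
  const sequential = [
    await doubleWithSharedRegisters(1),
    await doubleWithSharedRegisters(2),
  ];
  // Concurrent calls: both write r0 before either one reads it back.
  const concurrent = await Promise.all([
    doubleWithSharedRegisters(1),
    doubleWithSharedRegisters(2),
  ]);
  // Per-call state keeps concurrent calls independent.
  const isolated = await Promise.all([
    doubleWithOwnRegisters(1),
    doubleWithOwnRegisters(2),
  ]);
  return { sequential, concurrent, isolated };
}

const demoPromise = demo();
demoPromise.then((result) => console.log(result));
// { sequential: [ 2, 4 ], concurrent: [ 4, 4 ], isolated: [ 2, 4 ] }
```

In practice this means awaiting one call at a time is the pattern to stick to in virtualized code, and Promise.all style fan-out should stay in non-virtualized functions.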
Todo
- transpiler
- provide a proper this property to functions
- template literals
- proper for and while loops
- sequence expressions
- object and array destructuring
- arrow functions
- object expressions
- callbacks
- try/catch/finally
- proper var support
- proper reference counting to manage variables captured by protos (functions declared within functions) and other data types which are passed by reference (objects, arrays, etc.)
- currently, any captured variables do not get dropped by the transpiler and persist in memory, even when going out of scope
- need to add a way to check for references to both variables which store protos as well as the variables which are captured by protos
- once no more references to the proto exist, all variables captured by the proto should be dropped (assuming they have no other references; there should be a counter for the number of references to captured variables)
- add support for async functions in the context of the whole function
- currently, you are only able to properly await functions, but not run them concurrently as you would in a normal program
- async would require complex register management. the registers need to be restored after calling the async function, but some registers may have been mutated by the program before the resolution. this can be mitigated as we can just never drop any variables and keep them for the entire lifetime of the function. however, this would still require async context switching
- allow for declaration of classes (i don't know why you would want to init a class in a function but this is still a limitation of the current implementation)
- obfuscation passes/optimization passes
- obfuscation techniques
- opcode shuffling and minification (remove unused opcodes, rename opcodes, etc.)
- argument scrambling (change the order of arguments in function calls)
- string encryption
- dead code injection
- VM memory protection (encrypt data in the registers and restore it just in time. this should probably be done mostly by the VM)
- bytecode integrity checks
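As an illustration of the opcode shuffling item above, the sketch below assigns random numeric values to a fixed set of opcodes and builds the handler table from that assignment, so the same source program produces different bytecode per build while behaving identically. It uses a small stack machine invented for this example; the opcode names and the helpers (shuffledOpcodeTable, assemble, buildHandlers, run) are hypothetical and not part of js-virtualizer.

```javascript
// Toy illustration of opcode shuffling; names and encoding are hypothetical.
const BASE_OPS = ["PUSH", "ADD", "MUL", "HALT"];

// Assign each opcode name a number from a Fisher-Yates shuffled range.
function shuffledOpcodeTable(names, random = Math.random) {
  const codes = names.map((_, i) => i);
  for (let i = codes.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [codes[i], codes[j]] = [codes[j], codes[i]];
  }
  const table = {};
  names.forEach((name, i) => {
    table[name] = codes[i];
  });
  return table;
}

// Turn symbolic instructions into a flat number array using the table.
function assemble(instructions, table) {
  const out = [];
  for (const [name, ...operands] of instructions) {
    out.push(table[name], ...operands);
  }
  return out;
}

// Handlers are indexed by the shuffled numbers, so the interpreter loop
// never needs the opcode names at run time.
function buildHandlers(table) {
  const handlers = [];
  handlers[table.PUSH] = (vm) => vm.stack.push(vm.code[vm.ip++]);
  handlers[table.ADD] = (vm) => vm.stack.push(vm.stack.pop() + vm.stack.pop());
  handlers[table.MUL] = (vm) => vm.stack.push(vm.stack.pop() * vm.stack.pop());
  handlers[table.HALT] = (vm) => {
    vm.halted = true;
  };
  return handlers;
}

function run(code, handlers) {
  const vm = { code, ip: 0, stack: [], halted: false };
  while (!vm.halted && vm.ip < code.length) {
    const handler = handlers[code[vm.ip++]];
    if (!handler) throw new Error("unknown opcode");
    handler(vm);
  }
  return vm.stack.pop();
}

// (2 + 3) * 4 expressed symbolically, then assembled under two shuffles.
const source = [["PUSH", 2], ["PUSH", 3], ["ADD"], ["PUSH", 4], ["MUL"], ["HALT"]];
const tableA = shuffledOpcodeTable(BASE_OPS);
const tableB = shuffledOpcodeTable(BASE_OPS);
const resultA = run(assemble(source, tableA), buildHandlers(tableA));
const resultB = run(assemble(source, tableB), buildHandlers(tableB));
console.log(resultA, resultB); // 20 20
```

The two bytecode arrays may differ between shuffles even though both evaluate to the same result, which is the property this technique relies on.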