xudpforwarder v1.18.5284
WindRibbon UDP Forwarder (XUDPForwarder)
Introduction
A UDP forwarder (relay) with datagram modification functionality.
Requirements
Package | Version |
---|---|
NodeJS | >= 8.7.0 |
Installation
To install this program, type the following command:
npm install xudpforwarder -g
After installation, you can run xudpforwarder in your terminal, for example (replace sample.json with the path to your own configuration file):
xudpforwarder --log-level=info --configuration sample.json
Components
NAT
The core of our system is a user-space NAT engine. In our system, a connection must be initiated by the client.
Imagine the following scenario: we have a client (10.8.0.1), a server (10.8.0.3:1194) and our relay (10.8.0.2:1194). The client wants to send a datagram to the server, so it creates a datagram socket on a local port (for example, 1000), writes the data to a datagram and sends it to our relay (10.8.0.2:1194). The relay assigns a new port (for example, 2000) to the client (10.8.0.1:1000) and forwards the datagram to the server with the source address 10.8.0.2:2000. When the reply datagram comes back, the relay changes the source address back to the relay address (10.8.0.2:1194) and sends it to the client (10.8.0.1:1000).
In our system, we maintain an internal NAT table, and each NAT entry has a timeout. If the related port stays idle for too long, we close it.
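Here is a conceptual sketch of such a NAT table in Node.JS, for illustration only; it is not the actual implementation, and the class and field names are made up:
//  Conceptual sketch of the NAT table described above (not the actual
//  xudpforwarder implementation; names are made up for illustration).
const dgram = require("dgram");

class NatTable {
    constructor(timeout) {
        this.timeout = timeout;   //  NAT entry timeout in milliseconds.
        this.entries = new Map(); //  "clientAddress:clientPort" -> entry.
    }

    //  Look up (or create) the outgoing socket used for one client.
    lookup(clientAddress, clientPort) {
        const key = clientAddress + ":" + clientPort;
        let entry = this.entries.get(key);
        if (!entry) {
            const socket = dgram.createSocket("udp4");
            socket.bind(); //  The OS picks a free port (e.g. 2000 above).
            entry = { socket: socket, lastSeen: Date.now() };
            this.entries.set(key, entry);
        }
        entry.lastSeen = Date.now();
        return entry;
    }

    //  Close entries whose port stayed idle for longer than the timeout.
    evictIdleEntries() {
        const now = Date.now();
        for (const [key, entry] of this.entries) {
            if (now - entry.lastSeen > this.timeout) {
                entry.socket.close();
                this.entries.delete(key);
            }
        }
    }
}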
To configure the NAT engine, use the following configuration:
{
"destination": {
"address": "127.0.0.1", // The destination address.
"port": 2501 // The destination port.
},
"buffer": {
"upstream": {
... // Upstream buffer queue configuration.
},
"downstream": {
... // Downstream buffer queue configuration.
}
},
"pipeline": {
"upstream": [], // Upstream processor pipeline.
"downstream": [] // Downstream processor pipeline.
},
"socket": {
"client": {
"bind": {
"address": "127.0.0.1" // Client interface (used to connect to remote servers).
},
"type": "udp4", // Type of the socket ("udp4" for IPv4, "udp6" for IPv6).
"reuse": true, // Reuse address switch (optional, default: false).
"ttl": 64, // Initial TTL of outgoing datagrams (optional, default: 64).
"buffer": {
"receive": 16777216, // SO_RCVBUF option of the socket (optional, default: by system).
"send": 16777216 // SO_SNDBUF option of the socket (optional, default: by system).
}
},
"server": {
"bind": {
"address": "127.0.0.1", // Server interface (used to receive from local clients).
"port": 2500 // Server port.
},
"type": "udp4", // Type of the socket ("udp4" for IPv4, "udp6" for IPv6).
"reuse": true, // Reuse address switch (optional, default: false).
"ttl": 64, // Initial TTL of outgoing datagrams (optional, default: 64).
"buffer": {
"receive": 16777216, // SO_RCVBUF option of the socket (optional, default: by system).
"send": 16777216 // SO_SNDBUF option of the socket (optional, default: by system).
}
}
},
"timeout": 60000, // NAT entry timeout (unit: milliseconds).
"syscall": {
"max": 8192 // Maximum unfinished system call count (optional, default: 8192).
},
"hooks": { // (Optional) The hooks.
"ready": { // (Optional) The "ready" hook.
"cwd": ".", // The working directory of the "ready" hook.
"command": "scripts/ready.sh" // The command of the "ready" hook.
// "command": ":[Base64-encoded Script]" // The inline version of the "ready" hook.
},
"nat/established": {
"cwd": ".",
"command": "scripts/nat-established.sh" // The command of the "nat/established" hook.
// "command": ":[Base64-encoded Script]" // The inline version of the "nat/established" hook.
},
"nat/shutdown": {
"cwd": ".",
"command": "scripts/nat-shutdown.sh" // The command of the "nat/shutdown" hook.
// "command": ":[Base64-encoded Script]" // The inline version of the "nat/shutdown" hook.
}
}
}
Buffer queue
Overview
Like most router systems, we maintain a buffer queue in each processor and in the NAT engine to absorb bursts of datagrams. The processor or NAT engine processes the datagrams in the queue one by one.
You can set the maximum size (the maximum total length of in-queue datagrams) and the maximum datagram count of a buffer queue. You can also set the maximum time each datagram may stay in the queue.
Use the following configuration to configure a buffer queue:
{
"max-size": 1048576, // Optional (unit: bytes).
"max-count": 1234, // Optional (unit: datagrams).
"max-time": 1000 // Optional (unit: milliseconds).
}
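To illustrate the semantics of these three limits, here is a minimal sketch of a buffer queue in Node.JS; it is for illustration only, not the actual implementation, and the names are made up:
//  Illustrative sketch of a buffer queue enforcing "max-size", "max-count"
//  and "max-time" (not the actual xudpforwarder implementation).
class BufferQueue {
    constructor(options) {
        this.maxSize = options.maxSize;   //  Maximum total bytes in the queue.
        this.maxCount = options.maxCount; //  Maximum number of queued datagrams.
        this.maxTime = options.maxTime;   //  Maximum queueing time (milliseconds).
        this.items = [];
        this.totalSize = 0;
    }

    push(datagram) {
        this.dropExpired();
        if ((this.maxCount && this.items.length >= this.maxCount) ||
            (this.maxSize && this.totalSize + datagram.length > this.maxSize)) {
            return false; //  Queue full: the datagram is dropped.
        }
        this.items.push({ datagram: datagram, enqueuedAt: Date.now() });
        this.totalSize += datagram.length;
        return true;
    }

    shift() {
        this.dropExpired();
        const item = this.items.shift();
        if (!item) return null;
        this.totalSize -= item.datagram.length;
        return item.datagram;
    }

    dropExpired() {
        const now = Date.now();
        while (this.maxTime && this.items.length &&
               now - this.items[0].enqueuedAt > this.maxTime) {
            this.totalSize -= this.items[0].datagram.length;
            this.items.shift(); //  Datagram stayed in the queue too long: drop it.
        }
    }
}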
In processor
As shown in Figure 2, incoming datagrams of a processor are first put into its buffer queue. In parallel, the processor takes datagrams from the queue, processes them and sends them out.
In some processors (like the "bypass" processor and the "filter" processors), the datagram won't be modified. These processors are so fast that the queue stays empty most of the time, so a small queue size is enough. But in some processors (like the cryptography-related processors), processing is slow. If datagrams arrive faster than they are processed, they accumulate in the buffer queue, so you have to set an appropriate queue size for these processors. Our recommendation is to set the "max-time" of the queue to 1 second.
To configure the buffer queue of a processor, see the sections below.
In NAT engine
In the NAT engine, we have two types of buffer queues - the upstream queue and the downstream queue. See Figure 3.
Each NAT entry has its own upstream queue, and all NAT entries share one downstream queue. These queues can be configured to a relatively small size because the outgoing system calls are generally very fast.
To configure the buffer queues of the NAT engine, see the previous sections.
Processors
Before sending or after receiving a datagram, the system delivers the datagram to a series of processors. The processors process the datagram one by one and then deliver it to the other side.
In our system, we use JSON to configure the processors. Before using any processor, you must configure it properly.
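For example, the "pipeline" section of the NAT engine configuration is an array of processor configurations for each direction. The fragment below chains a threshold filter and a bandwidth filter on the upstream side; the field values are placeholders taken from the sections that follow:
"pipeline": {
    "upstream": [
        {
            "type": "filter/threshold",
            "max": 1500,
            "buffer": { "max-time": 1000 }
        },
        {
            "type": "filter/bandwidth",
            "bandwidth": 524288,
            "buffer": { "max-time": 1000 }
        }
    ],
    "downstream": []
}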
Filter
This kind of processor filters datagrams that match specified rules (such as a datagram size limit).
Threshold
This processor is used to filter out datagrams that are too small or too large. Only datagrams within the specified size range can pass this filter processor.
To configure the processor, use the following configuration:
{
"type": "filter/threshold",
"min": 0 // Minimum datagram length (optional, default: not set).
"max": 1500, // Maximum datagram length (optional, default: not set).
"buffer": {
... // Processor buffer queue configuration.
}
}
Bandwidth
This processor shapes the UDP datagram traffic to stay under a specific bandwidth. You can use this filter to limit the speed of the UDP datagram traffic.
To configure the processor, use the following configuration:
{
"type": "filter/bandwidth",
"bandwidth": 524288, // Bandwidth limit (unit: byte/s).
"buffer": {
... // Processor buffer queue configuration.
}
}
Permutate
This processor can be used to change the positions of critical fields of a specific protocol, so that a firewall cannot identify these fields and the traffic can bypass the firewall.
The "permutate" processor generates a permutation from a seed (you can think of it as a key) and then uses the permutation to rearrange the order of the data.
For example, suppose we have a seed and receive the following datagram:
i | IN[i] |
---|---|
0 | 'H' |
1 | 'e' |
2 | 'l' |
3 | 'l' |
4 | 'o' |
We then use the seed to generate a permutation (with the same length as the datagram). Note that as long as the seed and the length are unchanged, the resulting permutation is also unchanged.
Here is our permutation:
i | P[i] |
---|---|
0 | 4 |
1 | 1 |
2 | 0 |
3 | 3 |
4 | 2 |
In non-inverse mode, we use the following rule to generate the output datagram (OUT[0..4]):
OUT[i] = IN[P[i]]
In inverse mode, we use the following rule instead:
OUT[P[i]] = IN[i]
Note that if you pass a datagram through a non-inverse mode permutate processor and then pass the output through an inverse mode permutate processor with the same key, you will get a datagram with the same content as the original datagram.
In the example above, passing the original datagram through a non-inverse mode permutate processor yields the output "oeHll". Passing "oeHll" through an inverse mode permutate processor then yields "Hello", the content of the original datagram.
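Here is a short sketch of the two rules in Node.JS, for illustration only; deriving the permutation P from the seed is omitted:
//  Apply a permutation P to a datagram (non-inverse mode): OUT[i] = IN[P[i]].
function permutate(input, p) {
    const out = Buffer.alloc(input.length);
    for (let i = 0; i < input.length; i++) out[i] = input[p[i]];
    return out;
}

//  Invert the permutation (inverse mode): OUT[P[i]] = IN[i].
function unpermutate(input, p) {
    const out = Buffer.alloc(input.length);
    for (let i = 0; i < input.length; i++) out[p[i]] = input[i];
    return out;
}

//  The example from the tables above: P = [4, 1, 0, 3, 2].
const p = [4, 1, 0, 3, 2];
const original = Buffer.from("Hello");
const scrambled = permutate(original, p);   //  "oeHll"
const restored = unpermutate(scrambled, p); //  "Hello"
console.log(scrambled.toString(), restored.toString());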
In the real world, generating a permutation is quite time-expensive, so it is better to use the cache to speed your application up.
To configure the processor, use the following configuration:
{
"type": "cryptography/basic/permutate",
"buffer": {
... // Processor buffer queue configuration.
},
"inverse": false, // Inverse mode switch.
"seed": "blahblah", // PRNG seed.
"rounds": 1, // Permutate rounds (optional, default: 1).
"cache": {
"enable": true, // Enable cache.
"timeout": 30000, // Cache timeout (unit: milliseconds).
"size": 100 // Cache size (unit: records).
}
}
Segmenter
In some situations, you may want to split a datagram into several datagrams by some defined rules on one side and reassemble them on the other side. For example, you may want to split a large datagram into smaller ones to avoid IP fragmentation. In some jurisdictions, this processor can also be used to bypass a firewall (by changing the size of some critical datagrams).
The segmenter contains two parts – the disassembler and the assembler.
Disassembler
Here is the data flow diagram of the disassembler processor:
The input datagram is delivered to both the disassembler and the fragment length generator. The fragment length generator generates a series of fragment lengths and passes them to the disassembler. The disassembler then splits the input datagram, adds the headers and sends the fragments out.
Here is the splitting algorithm (a simplified sketch in code follows the steps):
1. Pass the count of remaining bytes (X) to the fragment length generator.
2. The fragment length generator returns an integer Y.
3. If X is less than or equal to Y, append random padding (of length Y-X) to the tail of the remaining bytes, send them and stop.
4. If X is larger than Y, split the first Y bytes off from the remaining bytes, send them and repeat from step (1).
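Here is the simplified sketch of this loop in Node.JS, for illustration only; the real disassembler also adds a header to every fragment, which is omitted here, and the helper names are made up:
//  Simplified sketch of the splitting loop described above (not the actual
//  implementation; fragment headers are omitted).
const crypto = require("crypto");

function split(datagram, nextFragmentLength) {
    const fragments = [];
    let remaining = datagram;
    for (;;) {
        //  Steps 1 and 2: ask the fragment length generator for Y.
        const y = nextFragmentLength(remaining.length);
        if (remaining.length <= y) {
            //  Step 3: pad the tail up to length Y with random bytes and stop.
            const padding = crypto.randomBytes(y - remaining.length);
            fragments.push(Buffer.concat([remaining, padding]));
            return fragments;
        }
        //  Step 4: split the first Y bytes off and repeat.
        fragments.push(remaining.subarray(0, y));
        remaining = remaining.subarray(y);
    }
}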
The fragment length generator uses an algorithm chain to generate the fragment lengths. Currently we have the following algorithms.
Fixed
This algorithm has two modes – "set" and "increase".
In "set" mode, we will return a fixed integer. In "increase" mode, we will add a fixed incremental to the input and then return it.
For example, in "set" mode, if the input is 1000 and the fixed value is 500, you will get the result 500. In "increase" mode, you will get the result 1500.
Use the following JSON to configure this algorithm:
{
"type": "fixed",
"mode": "set", // Mode ("set" or "increase").
"value": 1000 // The value.
}
Random
Use a pseudo-random number generator (PRNG) to generate an integer as an increment to the input. For example, if the PRNG generates 1, 2, 3, 4 and the inputs are 10, 20, 30, 40, the outputs will be 11, 22, 33, 44.
The algorithm accepts four parameters: the minimum/maximum value of the generated integers, the seed and the probability distribution segment count. Generally, the segment count and the entropy are inversely related.
Use the following JSON to configure this algorithm:
{
"type": "random",
"seed": "blahblah", // PRNG seed.
"min": 0, // Minimum value.
"max": 100, // Maximum value.
"segments": 16 // Distribution segment count (1 <= x <= max - min + 1).
}
Round-robin
The algorithm adds an integer X to the input, where X is drawn cyclically from an array. For example, with values [1, 2, 3] and inputs 10, 20, 30, 40, the outputs are 11, 22, 33, 41.
Use the following JSON to configure this algorithm:
{
"type": "round-robin",
"values": [1, 2, 3, 4, 5] // Selections array.
}
Threshold
The algorithm applies thresholds to the input value. If the minimum threshold is set and the input value is lower than it, the output value is set to the minimum threshold. Similarly, if the maximum threshold is set and the input value is larger than it, the output value is set to the maximum threshold. Otherwise, the output is the same as the input value.
Use the following JSON to configure this algorithm:
{
"type": "threshold",
"min": 16, // Minimum threshold (optional, default: not set).
"max": 1500 // Maximum threshold (optional, default: not set).
}
To chain algorithms, wrap the selected algorithm configurations in an array; the algorithms are applied in order, each one taking the previous output as its input. For example:
[
{
"type": "fixed",
"mode": "increase",
"value": 64
},
{
"type": "random",
"seed": "Lovely Seed",
"min": -64,
"max": 64
},
{
"type": "round-robin",
"values": [0, 59, 7, 38, 62, 19]
},
{
"type": "threshold",
"max": 1400
}
]
In the example above, an input of 1000 remaining bytes is first increased by 64, then adjusted by a random value between -64 and 64, then increased by the next value drawn from the round-robin array, and finally clamped to at most 1400. After configuring the algorithm chain of the fragment length generator, you can configure the disassembler with the following configuration:
{
"type": "segmenter/disassembler",
"max-datagram": 1420, // Maximum output datagram length.
"pipeline": [
... // Fragment length generator algorithm chain.
],
"buffer": {
... // Processor buffer queue configuration.
}
}
Here is a full example:
{
"type": "segmenter/disassembler",
"max-datagram": 1420,
"pipeline": [
{
"type": "fixed",
"mode": "set",
"value": 512
}
],
"buffer": {
"max-time": 1000
}
}
Assembler
This processor reassembles the fragments sent by the disassembler processor.
To configure the processor, use the following configuration:
{
"type": "segmenter/assembler",
"timeout": 1000, // Assembler entry timeout (unit: milliseconds).
"max-datagram": 1500, // Maximum output datagram length (optional, default: by system).
"buffer": {
... // Processor buffer queue configuration.
}
}
Padding
Similar to "segmenter", but this processor won't separate original datagram into pieces. Use following configurations:
{
"type": "pad/padding",
"pipeline": [
... // Pipeline (the same as the pipeline configuration in "segmenter" processor).
],
"buffer": {
... // Processor buffer queue configuration.
}
}
Unpadding
This processor restores datagrams that were padded by the "padding" processor.
Use the following configuration:
{
"type": "pad/unpadding",
"buffer": {
... // Processor buffer queue configuration.
}
}
Cryptography
Cipher
This processor encrypts the input datagrams with the specified cryptography algorithm and password. Currently, we support the following algorithms:
- RC4
- AES-128-CBC
- AES-192-CBC
- AES-256-CBC
To configure the processor, use the following configuration (use lower-case algorithm names only, e.g. "rc4"):
{
"type": "cryptography/cipher",
"algorithm": "rc4", // Encryption algorithm.
"password": "[Password 1]", // Encryption password.
"buffer": {
... // Processor buffer queue configuration.
}
}
Decipher
This processor decrypts input datagrams that were encrypted by the cipher processor. To configure it, use the following configuration:
{
"type": "cryptography/decipher",
"algorithm": "rc4", // Decryption algorithm.
"password": "[Password 1]", // Decryption password.
"buffer": {
... // Processor buffer queue configuration.
}
}
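The two processors are meant to be used as a pair: whatever one relay encrypts, the peer relay must decrypt with the same algorithm and password. The fragment below is a sketch of one such pairing; it assumes that the upstream pipeline carries traffic towards the destination, and on the peer relay the two processors would be swapped:
"pipeline": {
    "upstream": [
        {
            "type": "cryptography/cipher",
            "algorithm": "rc4",
            "password": "[Password 1]",
            "buffer": { "max-time": 1000 }
        }
    ],
    "downstream": [
        {
            "type": "cryptography/decipher",
            "algorithm": "rc4",
            "password": "[Password 1]",
            "buffer": { "max-time": 1000 }
        }
    ]
}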
Bypass
This processor passes all input datagrams through unchanged.
Since it does nothing, it doesn't need much configuration. If you want to use this processor, just use the following configuration without changes:
{
"type": "basic/bypass",
"buffer": {
... // Processor buffer queue configuration.
}
}
Load balance
This processor balances incoming traffic across the specified endpoints (selections).
If you want to use this processor, modify and use the following configuration:
{
"type": "load-balance",
"selections": [
{
"address": "xxx", // Address of the first selection.
"port": 1000, // Port of the first selection.
},
// ... Other selections ...
],
"algorithm": {
"type": "round-robin" // Load balance algorithm ("round-robin" or "random").
},
"buffer": {
... // Processor buffer queue configuration.
}
}
Hooks
Hooks are used to notify other applications about certain events in this program. Generally, hooks are a set of scripts that handle certain parameters.
File-based hook and inline script hook
Currently we support two different kinds of hooks:
- File-based hook: An executable file that is executed under a specific condition, is passed specific parameters and exits with 0 (meaning success) or another exit code.
- Inline script hook: A Base64-encoded JavaScript script written directly in the configuration file. The script is executed in a Node.JS sandbox without creating a new process.
In file-based hooks, the parameters are passed through the command line directly. The file must be executable or the program will report an error. The file can be written in any programming language, so you can use your favorite one.
The inline script hook can only be written in JavaScript. The parameters are passed in "Hook.arguments"; read this variable to access them. When the hook ends, you must call "Hook.exit(code, error)" explicitly. You can also import Node.JS packages in your hook. For example:
//
// TODO: Add your own copyright header.
//
//
// Imports.
//
// You can import some libraries here.
//var Util = require("util");
// Main logic.
(function() {
// Display a "hello" message.
console.log("Hello! This is a hook!");
// Display arguments.
console.log(Hook.arguments);
// TODO: Add your own logic here.
// ...
// Exit with no error.
Hook.exit(0, null);
// Or... exit with some error.
//Hook.exit(128 /* The error code */, new Error("Some error."));
})();
After writing the script, you must encode it with Base64 and add a leading ":" character. For example:
:Ly8KLy8gIFRPRE86IEFkZCB5b3VyIG93biBjb3B5cmlnaHQgaGVhZGVyLgovLwoKLy8KLy8gIEltcG9ydHMuCi8vCgovLyAgWW91IGNhbiBpbXBvcnQgc29tZSBsaWJyYXJpZXMgaGVyZS4KLy92YXIgVXRpbCA9IHJlcXVpcmUoInV0aWwiKTsKCi8vICBNYWluIGxvZ2ljLgooZnVuY3Rpb24oKSB7CiAgICAvLyAgRGlzcGxheSBhICJoZWxsbyIgbWVzc2FnZS4KICAgIGNvbnNvbGUubG9nKCJIZWxsbyEgVGhpcyBpcyBhIGhvb2shIik7CgogICAgLy8gIERpc3BsYXkgYXJndW1lbnRzLgogICAgY29uc29sZS5sb2coSG9vay5hcmd1bWVudHMpOwoKICAgIC8vICBUT0RPOiBBZGQgeW91ciBvd24gbG9naWMgaGVyZS4KICAgIC8vICAuLi4KCiAgICAvLyAgRXhpdCB3aXRoIG5vIGVycm9yLgogICAgSG9vay5leGl0KDAsIG51bGwpOwogICAgCiAgICAvLyAgT3IuLi4gZXhpdCB3aXRoIHNvbWUgZXJyb3IuCiAgICAvL0hvb2suZXhpdCgxMjggLyogIFRoZSBlcnJvciBjb2RlICAqLywgbmV3IEVycm9yKCJTb21lIGVycm9yLiIpKTsKfSkoKTsK
Now you can write the encoded line to the "command" field, and the program will parse and execute it under the corresponding condition.
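For reference, here is one way to produce such an encoded line from a script file; this is only a Node.JS sketch, and "hook.js" is a placeholder file name:
//  Read a hook script, Base64-encode it and prepend the ":" marker so that
//  the result can be pasted into a "command" field.
const fs = require("fs");

const script = fs.readFileSync("hook.js");
console.log(":" + script.toString("base64"));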
Hook: ready
Once the program is ready to accept new connections, this hook is executed. We pass two parameters to this hook - the bind address and the bind port.
Here is the template of this hook:
#!/bin/sh
# Read bind information.
BIND_ADDR="$1"
BIND_PORT="$2"
# TODO: Run your own logic here.
# ...
exit 0
See "examples/08-external-hooks/src-inline/hook-ready.js" for the inline version of this hook.
Hook: nat/established
This hook is triggered when a new NAT session is created. The details of the NAT session are passed to the hook.
Here is the template of this hook:
#!/bin/sh
# Read session information.
SOURCE_ADDRESS="$1"
SOURCE_PORT="$2"
BIND_ADDRESS="$3"
BIND_PORT="$4"
SESSION_ID="$5"
# TODO: Write your own logic here.
# ...
exit 0
See "examples/08-external-hooks/src-inline/hook-nat-established.js" for the inline version of this hook.
Hook: nat/shutdown
This hook is triggered when a NAT session is closed. The details and statistics of the session are passed to the hook.
Here is the template of this hook:
#!/bin/sh
# Read session information and statistics.
SOURCE_ADDRESS="$1"
SOURCE_PORT="$2"
BIND_ADDRESS="$3"
BIND_PORT="$4"
SESSION_ID="$5"
SESSION_RCVBYTES="$6"
SESSION_RCVDATAGRAMS="$7"
SESSION_SNDBYTES="$8"
SESSION_SNDDATAGRAMS="$9"
# TODO: Write your own logic here.
# ...
exit 0
See "examples/08-external-hooks/src-inline/hook-nat-shutdown.js" for the inline version of this hook.
Configuration
Write the NAT engine configuration to a JSON file and use it as the input configuration file of this program. See the sample configuration files in the "examples" directory for more details.
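For reference, here is a minimal sketch of such a configuration file, assembled from the fields shown in the previous sections; the values are placeholders, and whether every field is strictly required is not stated here, so adjust it to your needs:
{
    "destination": { "address": "127.0.0.1", "port": 2501 },
    "buffer": {
        "upstream": { "max-time": 1000 },
        "downstream": { "max-time": 1000 }
    },
    "pipeline": {
        "upstream": [ { "type": "basic/bypass", "buffer": { "max-time": 1000 } } ],
        "downstream": [ { "type": "basic/bypass", "buffer": { "max-time": 1000 } } ]
    },
    "socket": {
        "client": { "bind": { "address": "127.0.0.1" }, "type": "udp4" },
        "server": { "bind": { "address": "127.0.0.1", "port": 2500 }, "type": "udp4" }
    },
    "timeout": 60000
}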
Signals
Currently, the program supports the following signals:
Signal | Description |
---|---|
SIGUSR2 | Activate the garbage collector. (Only available when the program was started with the "--enable-gc" parameter.) |
SIGINT | Close the program. |
SIGHUP | Restart the program (within the same process; the configuration file is reloaded). |