[sic]

Encode JS with 2 different characters: []

A tiny library to inject self-executing code using a sequence of square brackets: [ and ]. Created by Martin Kleppe aka @aemkei.

Explanation

The basic idea is to encode strings using the binary representation of all ASCII values:

code = 'alert';

characters = code.split('');
// ["a", "l", "e", "r", "t"]

ascii = characters.map(c => c.charCodeAt(0));
// [97, 108, 101, 114, 116]

// pad to seven bits so that codes below 64, such as "(", decode correctly
binary = ascii.map(c => c.toString(2).padStart(7, '0'));
// ["1100001", "1101100", "1100101", "1110010", "1110100"]

encoded = binary.join('');
// "11000011101100110010111100101110100"

To decode the string, we need to convert it back:

binary = encoded.match(/.{7}/g);
// ["1100001", "1101100", "1100101", "1110010", "1110100"]

ascii = binary.map(b => parseInt(b, 2));
// [97, 108, 101, 114, 116]

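// wrap the call: map() also passes index and array, which fromCharCode would treat as extra char codes
characters = ascii.map(c => String.fromCharCode(c));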
// ["a", "l", "e", "r", "t"]

code = characters.join('');
// "alert"
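The two halves can be wrapped into a pair of helpers. This is only a minimal sketch: the names `encode` and `decode` are hypothetical and not part of the library, and it assumes 7-bit ASCII input that is zero-padded to seven bits per character.

```javascript
// Hypothetical helpers, not part of the library.
// Encode every character as exactly seven bits (7-bit ASCII only).
function encode(code) {
  return code
    .split('')
    .map(c => c.charCodeAt(0).toString(2).padStart(7, '0'))
    .join('');
}

// Split the bit string into groups of seven and map them back to characters.
function decode(encoded) {
  const groups = encoded.match(/.{7}/g) || [];
  return groups.map(b => String.fromCharCode(parseInt(b, 2))).join('');
}

console.log(decode(encode('alert(1)')));
// "alert(1)"
```

Without the padding, a character such as "(" (code 40) would yield only six bits and shift every group that follows it.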

Unfortunately, the encoded string is still wrapped in two quotes. The question is: how can we get rid of the " symbol and use only two different characters to obfuscate our code? It turns out that there are (at least) three different solutions: dot notation, tagged template literals, and bracket notation on array literals.

The dot notation is quite simple because we can assign the two bits to different properties. In our example, the custom entry point xxx has two properties .x = 0 and .xx = 1. By chaining these properties we can encode binary values.

Let us use _ instead of x for our example because it looks more interesting:

___.__.__._._._._.__ // "1100001"

We use getter accessor functions instead of plain properties. This lets us remember the bits seen so far and convert them into a character once we have collected seven of them. Here is the code that will parse and eval the script:

let binary = ''; // binary representation for every char
let code = ''; // final code to execute

// main entry point
___ = {
  get _()   { return handle(0); },
  get __()  { return handle(1); },
  get ___() { eval(code); }
}

function handle(bit) {
  // compose an ASCII binary code
  binary += bit;
  if (binary.length == 7) {
    // add more chars to the final code
    const charCode = parseInt(binary, 2);
    code += String.fromCharCode(charCode);
    binary = '';
  }
  // return entry to allow chaining
  return ___;
}

Note that the last getter ___ will evaluate the code. It is an implicit function call that is triggered when we access the property. An obfuscated alert(1) looks like this:

___                   // entry
.__.__._._._._.__     // "a"
.__.__._.__.__._._    // "l"
.__.__._._.__._.__    // "e"
.__.__.__._._.__._    // "r"
.__.__.__._.__._._    // "t"
._.__._.__._._._      // "("
._.__.__._._._.__     // "1"
._.__._.__._._.__     // ")"
.___                  // eval
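The chain above does not have to be written by hand. The following sketch generates it from a source string; `toDotNotation` is a hypothetical helper that is not part of the library, and it assumes 7-bit ASCII input.

```javascript
// Hypothetical encoder for the dot notation (not part of the library).
// Assumes 7-bit ASCII input; "_" stands for 0, "__" for 1, "___" for eval.
function toDotNotation(source) {
  let out = '___'; // entry point
  for (const char of source) {
    const bits = char.charCodeAt(0).toString(2).padStart(7, '0');
    for (const bit of bits) {
      out += bit === '1' ? '.__' : '._';
    }
  }
  return out + '.___'; // the trailing getter triggers eval
}

console.log(toDotNotation('a'));
// ___.__.__._._._._.__.___
```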

A similar approach uses tagged template literals, where an empty template `` encodes 0 and `_` encodes 1. The obfuscated code looks super dense and our decoder is only 90 bytes:

// decoder
C=B='',_=([l])=>(B+=+!!l,B[6]&&(+B||eval(C),C
+=String.fromCharCode(parseInt(B,2)),B=[]),_)

// alert(1)
_
`_``_``````````_`
`_``_````_``_`````
`_``_``````_````_`
`_``_``_``````_```
`_``_``_````_`````
```_````_```````
```_``_````````_`
```_````_``````_`
``````````````
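For readers who find the 90-byte version hard to follow, here is a longer sketch of the same idea together with a matching encoder. The names `makeTag` and `toTemplateNotation` are hypothetical, the input is assumed to be 7-bit ASCII, and the decoded source is handed to a callback instead of `eval` so the behaviour can be inspected safely.

```javascript
// Hypothetical, readable variant of the tagged template decoder.
// `run` receives the decoded source instead of it being passed to eval.
function makeTag(run) {
  let binary = ''; // bits of the current character
  let code = '';   // decoded source so far
  const tag = ([first]) => {
    binary += first ? '1' : '0'; // `_` -> 1, empty template -> 0
    if (binary.length === 7) {
      if (binary === '0000000') {
        run(code); // seven zeros terminate the program
        code = '';
      } else {
        code += String.fromCharCode(parseInt(binary, 2));
      }
      binary = '';
    }
    return tag; // return the tag itself to allow chaining
  };
  return tag;
}

// Hypothetical encoder: turns 7-bit ASCII source into chained templates.
function toTemplateNotation(source) {
  let out = '_';
  for (const char of source) {
    const bits = char.charCodeAt(0).toString(2).padStart(7, '0');
    for (const bit of bits) out += bit === '1' ? '`_`' : '``';
  }
  return out + '``'.repeat(7);
}

// Decode a generated program and collect the result instead of running it.
const decoded = [];
const collect = makeTag(src => decoded.push(src));
new Function('_', toTemplateNotation('alert(1)'))(collect);
console.log(decoded[0] === 'alert(1)'); // true
```

Passing the tag in as the parameter `_` keeps the generated text identical to what the short decoder expects.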

In the final step, we use the array literal notation [] instead of our own _ entry point and access properties using the bracket notation [][propertyString].

With square brackets alone we can create exactly two different strings, "" and "undefined", which is enough to encode our zeros and ones:

String([])     === ''
String([][[]]) === 'undefined'
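These coercions can be checked step by step. The snippet below is only an illustration; the variable names are made up for this example.

```javascript
// Property keys are converted to strings, so an array used as a key
// goes through String(): [] becomes "" and undefined becomes "undefined".
const emptyKey = String([]);          // ""
const missing = [][[]];               // same as [][""], which is undefined
const undefinedKey = String(missing); // "undefined"

console.log(emptyKey === '');              // true
console.log(missing === undefined);        // true
console.log(undefinedKey === 'undefined'); // true

// Consequently x[[][[]]] reads the property named "undefined" of x.
const probe = [];
probe['undefined'] = 'found';
console.log(probe[[][[]]] === 'found');    // true
```

This is why `[][[][[]]]` can serve as the entry point once a property named "undefined" exists on every array.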

Here is the full script:

// define a getter named 'undefined' on all arrays
// (Object.assign would call the getter once and copy its result instead)
Object.defineProperty(Array.prototype, 'undefined', {
  get() {
    let binary = ''; // the binary representation for every char
    let code = ''; // the final code to execute

    // return a Proxy for every access to ['undefined'] or ['']
    const sic = new Proxy({}, {
      get: function(_, name) {
        // compose an ASCII binary code
        binary += Number(!name);
        if (binary.length == 7){
          if (binary !== '0000000') {
            // add one more char to the final code
            const charCode = parseInt(binary, 2);
            code += String.fromCharCode(charCode);
          } else {
            // seven zeros mark the end: run the code
            Function(code)();
            code = '';
          }
          // start collecting the next seven bits
          binary = '';
        }
        return sic;
      }
    });

    return sic;
  }
});
      

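This uses a Proxy instead of a simple getter, so every property access passes through one handler. If the property name is empty (""), it is converted to 1; if the name is "undefined", it is converted to 0. Seven zeros in a row run the collected code.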
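A bracket-only program can be generated in the same way as the other notations. The sketch below is not part of the library: `toBracketNotation` and the three constants are hypothetical names, the input is assumed to be 7-bit ASCII, and the output only does something once a decoder like the full script is installed.

```javascript
// Hypothetical encoder for the bracket notation (not part of the library).
const ENTRY = '[][[][[]]]'; // []["undefined"] -> the decoder proxy
const ONE = '[[]]';         // key ""          -> 1
const ZERO = '[[][[]]]';    // key "undefined" -> 0

function toBracketNotation(source) {
  let out = ENTRY;
  for (const char of source) {
    const bits = char.charCodeAt(0).toString(2).padStart(7, '0');
    for (const bit of bits) out += bit === '1' ? ONE : ZERO;
  }
  return out + ZERO.repeat(7); // seven zeros trigger execution
}

const program = toBracketNotation('alert(1)');
console.log(/^[\[\]]+$/.test(program)); // true: only [ and ] are used
```

Since a zero costs eight characters and a one costs four, the output grows quickly: a single "a" already takes 110 brackets including the entry point and terminator.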

Here is the minified version (< 128 bytes):

[C=B=[]].__proto__[C.B]=T=new Proxy(B,{get:(_,N)=>(B+=+!N,B[6]&&
(+B||eval(C),C+=String.fromCharCode(parseInt(B,2)),B=[]),T)});

20XX - Martin Kleppe