Cyber Defenders Discovery Camp 2019 LSCVM

Writeups for the LSCVM-related challenges

Here are write ups for the 2 LSCVM-related challenges presented during the qualifiers (Sorry, I forgot about these). The format of these challenges were very similar - a binary containing a interpreter called LSCVM (LightSpeed Corp VM) and some code to check for challenge completion was made available. A debugger (take a look here for a detailed list of the opcodes available) was built during the qualifiers, while a tool to transpile Python to LSCVM opcodes was built between the qualifiers and finals.

Since it was stated during the qualifiers that LSCVM will be reused during the finals, I decided to build a tool capable of making use of all the features of the LSCVM Opcodes, for example, function calls.

LSCVM: Immaculate Invasion

As mentioned above, LSCVM is a interpreter for a bunch of opcodes. Decompilation of the relevant function once cleaned up:

void op_exec(byte param_1)
{
  switch(param_1) {
  default:
    nop();
    break;
  case 0x20:
    nop();
    break;
  case 0x41:
    stack_add_ints();
    break;
  case 0x42:
    vm_exit();
    break;
  case 0x43:
    FUN_00100d84();
    break;
...snip...

This function processes operations provided as an ASCII string. For example, the following string pushes numbers to the stack representing ASCII characters then prints them.

Using the debugger to execute opcodes found in the binaries

What do we need to do for this challenge then? Taking a look at the decompilation for main:

exec_vm("eiMaAhiMcAeiMaAhjMfAijMbAPPPPP",debug);
fgets(ptr_input,0x1000,stdin);
exec_vm(ptr_input,debug);
iVar2 = strcmp(&heap_pointer,"lsc_user");
if (iVar2 == 0) {
  strcpy(&weird_global_thing,&heap_pointer);
  exec_vm("eiMaAhiMcAeiMaAhhcMMcAeehMMcAgjcMMdAeehMMhAeehMMdAeehMMdAdfgMMhAijMiAPPPPPPPPPPP",debug
          );
  fgets(ptr_input,0x1000,stdin);
  exec_vm(ptr_input,debug);
  iVar2 = strcmp(&heap_pointer,"hi_darkspeed-corp!");
  if (iVar2 == 0) {
                  /* prints flag here */
    printf("\nLogin Successful! %s\n\n");

The binary prompts for an ID and Password, expecting the two strings to be found at the start of the heap. As such, we can write a crude script to generate the instructions. Of course, this is an extremely naive implementation that only uses addition to build the numbers.

def stack_push(num):
    # calculate the shortest way to push num onto the stack
    out = []

    if num == 0:
        return [0x61]

    first = True
    while num > 0:
        if num < 9:
            out.append(0x61 + num) # + num
            num = 0
        else:
            out.append(0x6a) # + 9
            num -= 9
        if not first:
            out.append(0x41) # add
        
        first = False
    
    return out

def write_heap(inp):
    opcodes = []
    for i, char in enumerate(inp):
        opcodes += stack_push(ord(char))
        opcodes += stack_push(i) # push i
        opcodes += [0x4b]

    opcodes = [chr(a) for a in opcodes]

    return "".join(opcodes)

print("lsc_user: {}".format(write_heap("lsc_user")))
print("hi_darkspeed-corp!: {}".format(write_heap("hi_darkspeed-corp!")))
> nc lscvm-ii.cddc19q.ctf.sg 9001

=== Welcome to LSCVM(LightSpeed Corp Virtual Machine) ===

ID : jjAjAjAjAjAjAjAjAjAjAjAaKjjAjAjAjAjAjAjAjAjAjAjAhAbKjjAjAjAjAjAjAjAjAjAjAcKjjAjAjAjAjAjAjAjAjAfAdKjjAjAjAjAjAjAjAjAjAjAjAjAeKjjAjAjAjAjAjAjAjAjAjAjAhAfKjjAjAjAjAjAjAjAjAjAjAcAgKjjAjAjAjAjAjAjAjAjAjAjAgAhK
Password : jjAjAjAjAjAjAjAjAjAjAfAaKjjAjAjAjAjAjAjAjAjAjAgAbKjjAjAjAjAjAjAjAjAjAfAcKjjAjAjAjAjAjAjAjAjAjAbAdKjjAjAjAjAjAjAjAjAjAhAeKjjAjAjAjAjAjAjAjAjAjAjAgAfKjjAjAjAjAjAjAjAjAjAjAiAgKjjAjAjAjAjAjAjAjAjAjAjAhAhKjjAjAjAjAjAjAjAjAjAjAjAeAiKjjAjAjAjAjAjAjAjAjAjAcAjKjjAjAjAjAjAjAjAjAjAjAcAjbAKjjAjAjAjAjAjAjAjAjAjAbAjcAKjjAjAjAjAjdAKjjAjAjAjAjAjAjAjAjAjAjeAKjjAjAjAjAjAjAjAjAjAjAjAdAjfAKjjAjAjAjAjAjAjAjAjAjAjAgAjgAKjjAjAjAjAjAjAjAjAjAjAjAeAjhAKjjAjAgAjiAK

Login Successful! $CDDC19${IcY_GrE37ings_Fr0M_LigHT5pEeDC0Rp}

lsc_user, Good Bye!

LSCVM: Quintessential Harlequin

Decompilation of the main function:

  puts("/-------------------------------------------------");
  puts("|   Welcome to LSC\'s Quine Challenge for Noobs   |");
  puts("-------------------------------------------------/\n");
  printf("[+] Please input your code : ");
  fgets(user_input,0x2000,stdin);
  sVar3 = strlen(user_input);
  *(undefined *)((long)&local_201c + sVar3 + 3) = 0;
  if (user_input[0] == '\0') {
    puts("\n[-] Nice try, but this is not what we want :(\n");
    uVar4 = 0xffffffff;
  }
  else {
    exec_vm(user_input,(ulong)local_201c,(ulong)local_201c);
    iVar2 = strcmp(user_input,&print_buffer);
    if (iVar2 == 0) {
      printf("\n[+] Congrats! This is our gift to you : ");
      FUN_0010142e();
    }
    else {
      puts("\n[+] Quine Test --------------------------");
      printf("   - Your input : %s\n",user_input);
      printf("   - Result     : %s\n",&print_buffer);
      puts("\nAww, how unfortunate! Maybe you will perform better next time. Bye!!");
    }
    uVar4 = 0;
  }

Now this was the fun challenge that saw no solves during the qualifiers where it was presented (this was solved long after when I finished writing the transpiler). This binary had the same interpreter as before, except that printing was redirected into a buffer instead of being printed to console.

As the name implies, the challenge here is to write a quine that when executed, printed out the same source code. After quite a bit of head scratching and searching, I came across this article which summarised the concept clearly enough for me to implement:

So to make this possible anyway, we write the build the program from two parts, one which call the code and one which we call the data. The data represents (the textual form of) the code, and it is derived in an algorithmic way from it (mostly, by putting quotation marks around it, but sometimes in a slightly more complicated way). The code uses the data to print the code (which is easy because the data represents the code); then it uses the data to print the data (which is possible because the data is obtained by an algorithmic transformation from the code).

My interpretation what the above suggests that a quine should be:

  1. Data representation of part 2 and 3
  2. Code that prints out the data above
  3. Code that interprets the data above and prints out what runs as step 2 and this step

With this in mind, adjusted for the limitations we face here, I worked towards building a program with the following structure:

  1. Opcodes that push the ASCII representation of everything below onto the stack (data). To "compress" this a bit, I opted to subtract 0x40 from all the data here
  2. Opcodes that read the data above (opcodes) character by character, then prints out opcodes to create the character (So say the data above was 0x12 0x13, this chunk of code would read 0x12, then output aaA which is 99+ resulting in 0x12. This chunk of code essentially prints the data into opcodes that when executed, recreates the data block.
  3. Opcodes that read the same chunk of data above, but this time, prints the data into opcodes that when executed, becomes opcodes that perform step 2 and 3. To reverse the "compression" of the data, this step adds 0x40 to each opcode before printing it.

This would have been painful to write by hand and debug - so no idea how was this meant to be done during the 2 days the qualifiers ran for. Having written the transpiler, I could just transpile Python into code that LSCVM could interpret:

from translator import Translator

# the length of source compiled to lscvm
# lucky guess to find 232, should be possible to bruteforce
COMPILED_CODE_LENGTH = 232

source = """data_len = {}
i = 0
while i < data_len:
    num = stack_find(1 + data_len - i)
    first = 1
    while num > 0:
        if num > 9:
            putchar(0x6A)
            num -= 9
        else:
            putchar(0x61 + num)
            num = 0

        if first == 0:
            putchar(0x41)
        
        first = 0
    i += 1

i = 0
while i < data_len:
    num = stack_find(1 + data_len - i)
    putchar(0x40 + num)
    i += 1"""

def naive_num(num):
    opcodes = ""

    first = 1
    while num > 0:
        if num > 9:
            opcodes += "j"
            num -= 9
        else:
            opcodes += chr(0x61 + num)
            num = 0
        
        if first == 0:
            opcodes += "A"
        
        first = 0

    return opcodes

t = Translator()
code = t.translate(source.format(COMPILED_CODE_LENGTH))
print("Compiled length: {}".format(len(code)))
assert COMPILED_CODE_LENGTH == len(code)

data_block = ""
for c in code:
    opcodes = naive_num(ord(c) - 0x40)
    data_block += opcodes

print(data_block + code)

There was an interesting chicken and egg problem here: My code block needs to know how much data is stored on the stack. However, changing the number in the code block changes the amount stored on the stack. Luckily, I managed to identify 232 as the length of the data block through trial and error. If that did not work, using the nop instruction would probably have worked to pad the values.

As such, when ran locally:

$ ./lscvm-qh < qh
/-------------------------------------------------
|   Welcome to LSC's Quine Challenge for Noobs   |
-------------------------------------------------/

[+] Please input your code : jjAjAjAfAjjAjAhAjjAjAjAbAjjAjAjAeAjeAbjeAjjAjAgAjcAjjAjAgAjjAjAhAjcAjjAjAgAjjAjAjAfAjjAjAhAjjAjAjAgAjjAjAjAgAbbjeAjjAbAjjAjAgAgjjAjAhAfjjAjAgAfjbAjjAjAhAbjjAjAjAjjAiAjjAjAgAjjAjAhAhjjAjAhAjjAjAhAjjAjAhAjjAjAjAdAjjAjAjAjeAjjAjAjAeAjeAbjeAjjAiAjjAjAhAjjAjAgAfbjjAjAhAfjjAbAgjjAjAiAjcAjjAjAhAjjAjAjAjcAjjAjAgAjjAjAiAjjAjAjAeAjeAjjAjAjAeAjeAjjAbAjjAjAgAgjjAjAiAfjjAjAgAjbAjjAjAhAjjAbAjjAjAjAjjAiAjjAjAgAjjAjAhAhjjAjAhAjjAjAjAbAjjAjAhAjjAjAjAgAjjAjAjAgAbbjeAjjAiAjjAjAiAfjjAjAjAgAjbAjjAjAhAjjAbAjjAjAjAjjAiAjjAjAgAjjAjAhAhjjAjAhAjjAjAjAbAjjAjAjAcAjeAjjAiAjjAjAiAjjAjAhAjjAjAjAbAjjAjAjAgAjjAjAjAbAbjeAbjeAjhAjjAjAiAfjjAjAjAgAjjAbAjjAjAiAjcAjjAjAjAgAjjAjAjAeAbhjjAjAhAjjAjAhAjjAjAjAfAjjAjAjAbAjeAjjAjAjAjeAbjeAjjAjAiAfbjhAjjAjAgAjjAjAiAjcAjjAjAjAfjjAjAgAjbAjjAjAjAjjAiAjjAjAgAjjAjAhAhjjAjAhAjjAjAjAfAjjAiAjjAjAjAcAjjAjAjAgAjjAjAjAbAbjeAjhAjjAjAgAhjjAjAgAjjAjAjAjcAheejjAjAhAfjjAjAhAbjjAjAhAjcAheejjAjAgAjjAjAhAjcAjjAjAgAjjAjAhAjjAjAhAjjAjAjAdAjjAjAjAeAjeAbjeAjjAbAjjAjAgAgjjAjAhAfjjAjAgAfjbAjjAjAhAbjjAjAjAjjAiAjjAjAgAjjAjAhAhjjAjAhAjjAjAjAfAjjAjAjAjeAjjAiAjjAjAhAjjAjAgAfbjjAjAhAfjjAbAgjjAjAiAjcAjjAjAjAfAjjAjAjAfAjeAjjAjAiAfbjhAjjAjAhAfjjAjAhAbjjAjAhAjcAheeibehMAMaKabKaibjjAAMSaFbEaEJbAdZabGbbbgdMhMAMZbaEAbESFcKbdKachMhMSaFcEaJbSdZabGbebjjAAMZcEjJbSdZabGbefMZcbejeAMAMPcEjScKjhAGbbieMdMAMcEAPacKdEaJdZabGbiZfjeAMPaGadKGDDbEbAbKGDDabKabbghMAMSaFbEaEJbAdZabGbidMZbaEAbESFcKiiMcEAPbEbAbKGDD
[+] Congrats! This is our gift to you : Flag is FAKE_FLAG_RUN_AGAINST_SERVER