DiceCTF 2021 - Lambda
Post
Cancel

# DiceCTF 2021 - Lambda

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 # this is apparently "too nested" for the native python parser, so we need to use a custom parser prog_string = open('./prog', 'r').read() prog, _ = load(prog_string) flag = input('flag plz: ').encode('ascii') print('checking...') # --- takes 1-2 minutes to check flag o = _______________________(flag, prog) # --- output = bytes(o[:o.index(0)]).decode('ascii') print(output) if output == b'Correct!': print('Flag: %s' % flag) 

load parses a string containing nested tuples generated by the following grammar

1 S -> '(' S ')' | S ',' S | ϵ 

The author didn’t use eval() or exec() because these functions can’t parse big nested strings!

So, the nested tuple is parsed and the input is validated using that. Let’s clean up the code, making the lambdas meaningful

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 # xapply: apply function over argument list xapply=lambda x:lambda b,**a:x(*b,**a) # sfirst(l): returns l[0], where l is a sequence sfirst=xapply(lambda x,*b:x) # stail(l): returns l[1:], where l is a sequence stail=xapply(lambda x,*b:b) # first(t): returns t[0], where t is a sequence of length 2 first=xapply(lambda x,b:x) # last(t): returns t[1], where t is a sequence of length 2 last=xapply(lambda x,b:b) # make_tuple(...): returns the list of arguments as a tuple make_tuple=lambda *x:x # iif(cond, true_fn, false_fn): cond? true_fn : false_fn iif=lambda x,b,a:sfirst(stail(make_tuple(*(((),)*(x==())),a,b)))() # make1_tuple(x): increases the nesting level of x by 1 make1_tuple=lambda x:(x,) # make_nested_paren(n): # n = 0 => return (), 1 => return ((),), 2=> return (((),),) and so on make_nested_paren=xapply(lambda *x,a=():iif(x,lambda:make_nested_paren(stail(x),a=make1_tuple(a)),lambda:a)) # rol1: rotate left by 1 rol1=xapply(lambda *x:make_tuple(*stail(x),sfirst(x))) # ii: apply func (a) n times, where n is the nesting level of x to b ii = lambda x,b,a: iif(x,lambda:ii(sfirst(x),a(b),a),lambda:b) # hh: apply a, len(x) times on b, tail recur hh=T(lambda x,b,a:iif(x,lambda:hh(sfirst(x),a(b),a),lambda:b)) # rol1 n-1 times => rotate right 1 ror1=xapply(lambda *x:hh(sfirst(make_nested_paren(x)),x,rol1)) # one(l): put l[0] into n'th nested level where n = len(l) # one([1,2,3]) = (((1,),),) one=xapply(lambda *x:iif(stail(x),lambda:one(make_tuple(make1_tuple(sfirst(x)),*stail(stail(x)))),lambda:make1_tuple(sfirst(x)))) list_len=lambda x:hh(x,0,lambda b:b+1) 

Let’s look at the implementation of hh.

1 hh=T(lambda x,b,a:iif(x,lambda:hh(sfirst(x),a(b),a),lambda:b)) 

it says - while x is not () (false), compute a(b) and set x = x[0], ie., apply a on the b n times, where n is the nesting level of x. For example, if x = ((((),),),) then it returns a(a(a(b))). The T function forces the lambda to do tail recursion

So, we can rewrite hh as

1 2 3 4 5 def hh(n, val, fn): while n: val = fn(val) n -= 1 return val 

## Representation of Integers

Integers are represented as nested tuples. 0 is represented as (), 1 as ((),), 2 as (((),),) and so on..

## Validating the Input

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 def loop(tape,code,str_input): # while rip doesn't point to () => loop # else halt the Turing Machine while sfirst(code): tape, code, str_input = run_turing_machine(tape, code, str_input) return tape checker = lambda x,b: xmap( loop( ((),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),(),()), b, make_tuple( *xmap(xmap(x,lambda x:((),)*x),one), *(((),(),(),(),(),(),(),(),(),(),(),(),(),(),(),())) ) ), list_len ) o = checker(flag, prog) # --- 

run_turing_machine is a very big function. You can find the source here. It’s a big if else ladder. The instructions are represented in huffman encoding (no two encoding have the same prefix). Why did i say huffman? Check the conditions of the if else ladders

## Machine Model

The author has implemented a two tape turing machine using lambda calculus

A tape is represented as a circular list. Two tapes are used:

1. I - used for storing the input and doing alu arithmetic
2. D - used for storing the registers

Each of the tape (T) has a tape pointer (tp) to it. It supports three operations:

1. tpush - write the data to the cell at tp
2. tpop - read the data at tp
3. tp += N - move the tape pointer N cells to right
4. tp -= N - move the tape pointer N cells to left

The letter t in the above instructions is a placeholder, ie., for tape D, tpush becomes dpush and tpop becomes dpop. Similarly for tape I, tpush becomes ipush and tpop becomes ipop. Read and write always happen at the current tape pointer

Instruction List [and Encoding]:

1. ipush; dpop
• Encoding - 00
• Read from tape I and write to tape D
2. ipush D[0]
• Encoding - 01
• Read from tape D and push to tape I
3. dp += I[0]
• Encoding - 100
• Move the tape pointer of tape D right by I[0] cells
4. dp -= I[0]
• Encoding - 101
• Move the tape pointer of tape D left by I[0] cells
5. jmp N
• Jump to Nth instruction
• Encoding 1100 refers to jump back relative to current ip
• The displacement is given by val(curr[1][1][0][0][0]) where curr points to the current instruction, val returns the integer represented using nested tuples
• Encoding 1101 refers to jump forward relative to current ip
• The displacement is given by val(curr[1][1][0][0][0]) where curr points to the current instruction, val returns the integer represented using nested tuples
6. ipush IMM
• Encoding - 11100
• Push immediate value (IMM) to the tape I
• The immediate value is given by val(curr[-1][-1][-1][0][0].len()-1)
7. ipush I[index]
• Encoding - 11101
• Push value at index from tape I into tape I. index is val(curr[-1][-1][-1][0][-1].len()-1})
8. if (I[0] == 0) then jmp R
• Encoding - 111100
• Conditional Jump
• Jumps to instruction at offset R when the value at the tape pointer of I is 0
9. jmp \$+I[0]
• Encoding - 111101
• Relative Jump
• Jumps to instruction at offset rip+1+I[0], where rip is pointing to the instruction being executed
10. add I[0], I[1]
• Encoding - 1111100
• Pops two words from the tape I, adds them and pushes back to tape I
11. sub I[0], I[1]
• Encoding - 1111101
• Subtraction
• Pops two words from the tape I, computes I[1]-I[0] and pushes back to tape I

Let’s convert the turing machine to readable asm, since it’s hard to read the TM

Source Code

## Assembly

Executing the above script, we get the asm. Cleaning up by basic blocks, we get the following code

ASM

### Understanding the code

Let’s represent the flag bytes as f[0], f[1], ..., f[-2], f[-1] since we don’t know the length yet.

1. The setup routine (at #0234) initializes the I tape, at 0279 jmp 138 the I tape looks like this

f[-1], f[-2], …, f[1], f[0], 43,
I[0], 54, I[1], 35, I[2], 31,
I[3], 78, I[4], 116, I[5], 32,
I[6], 28, I[7], 1, I[8], 47,
I[9], 84, I[10], 25, I[11], 26,
I[12], 123, I[13], 69, I[14], 102,
I[15], 50, I[16], 65, I[17], 21,
I[18], 16 (top of stack)

Resolving the I[n] references, we get

1 2 3 4 5 6 7 8 f[-1], f[-2], ..., f[1], f[0], 43, f[0], 54, f[1], 35, f[2], 31, f[3], 78, f[4], 116, f[5], 32, f[6], 28, f[7], 1, f[8], 47, f[9], 84, f[10], 25, f[11], 26, f[12], 123, f[13], 69, f[14], 102, f[15], 50, f[16], 65, f[17], 21, f[18], 16 (top of stack) 

So, we can see the following:

1. The flag length is 19 bytes
2. Every flag byte is paired with an integer

at #0138 the D tape is setup as:

1 2 3 4 5 6 D[1] = 280 D[2] = I[0] // 16 D[3] = 42 D[4] = 24 D[5] = 13 D[6] = 0 

at #0212, we have a check

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 D[2] -= 1 if D[2] == 0 { goto #0225 } else { goto #0222 } #0225: goto D[1] #0222: goto #0156 #0156: D[7] = 175 goto #0095 #0095: D[9] = 128 D[8] = D[3]+D[4]+D[5] goto #0110 

at #0110, we have

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 D[10] = 123 goto #0048 #0048: D[12] = D[8] D[11] = D[9] goto #0054 #0054: D[11] -= 1 D[12] -= 1 if D[12] == 0 { goto #0086 } if D[11] == 0 { goto #0088 } goto #0054 

The loop at #0054 runs min(D[8], D[9]) iterations, and does nothing!
Cases:

1. D[9] is less than D[8]
• then D[8] is set to D[8]-D[9] and #0110 is executed
2. D[8] is less than or equal to D[9]
• goto D[7]

Now since D[2] != 0, and D[8] = D[3]+D[4]+D[5] = 79, we jump to #0175. At #0175, the values are updated using a fibonacci sequence

1 2 3 4 5 #0175: D[3] = D[4] D[4] = D[5] D[5] = D[8] // D[3]+D[4]+D[5] D[7] = 202 

now at #0095, we have a different code

1 2 3 #0095: D[9] = 128 D[8] = D[3] + I[0] // flag check bytes 

So, now we are ready to write a script!

## Solution

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 import string al = set(string.printable) al -= set(string.whitespace) al = list(map(ord, al)) # check bytes - bytes paired with every flag byte x = [ 21, 65, 50, 102, 69, 123, 26, 25, 84, 47, 1, 28, 32, 116, 78, 31, 35, 54, 43 ] l = [42, 24, 13] # a[n] = a[n-1]+a[n-2]+a[n-3] for _ in range(19): l.append(sum(l[-3:])) fk = '' xl = [] for i in range(19): a1 = 128+x[i]-l[i+1] & 0xff a2 = x[i]-l[i+1] & 0xff a3 = l[i+1]-x[i]+128 & 0xff a4 = l[i+1]-x[i] & 0xff a = list(filter(lambda j: j in al, [a1, a2, a3, a4])) if len(a) == 1: fk = chr(a[0]) + fk else: fk = '[{}]'.format(''.join(map(chr, a))) + fk xl.append(a) xl = xl[::-1] out = [] def rec_print(l): if not l: return [['']] tails = rec_print(l[1:]) xtails = [] cl = l[0] for tail in tails: for fuck in cl: xtail = [i for i in tail] for i in range(len(tail)): xtail[i] = chr(fuck) + tail[i] xtails.append(xtail) return xtails ans = rec_print(xl) ans = [i[0] for i in ans] print('\n'.join(ans)) 

Executing this, we get