Finally, note that all of this only happens when __INCLUDE_LEVEL__ > 12. Before that, it recursively includes itself. Note the definition of the pre-defined __INCLUDE_LEVEL__ macro:
This macro expands to a decimal integer constant that represents the depth of nesting in include files. The value of this macro is incremented on every ‘#include’ directive and decremented at the end of every included file. It starts out at 0, its value within the base file specified on the command line.
The following else statement corresponds to the previous if __INCLUDE_LEVEL__ > 12 which, if passed, checks the flag.
Before the __INCLUDE_LEVEL__ goes to 13, it recursively includes itself.
Converting to Python Code
In an attempt to make the code more readable and to analyse the checking of the flag, I wrote a script to convert the preprocessor directives to Python code.
withopen("cpp.c", 'r')as infile,open("converted.py", 'w')as outfile: line = infile.readline()[:-1] curr_indent =0while line !='#include <stdio.h>':if line.startswith('//'): line = infile.readline()[:-1]continueif line.startswith('#define'): line_data = line.split()iflen(line_data)==3: outfile.write(f"{curr_indent *' '}{line_data[1]} = {line_data[2]}\n")else: outfile.write(f"{curr_indent *' '}{line_data[1]} = None\n")elif line.startswith('#ifdef'): outfile.write(f"{curr_indent *' '}if '{line.split()[1]}' in locals() or '{line.split()[1]}' in globals():\n") curr_indent +=4elif line.startswith('#ifndef'): outfile.write(f"{curr_indent *' '}if '{line.split()[1]}' not in locals() and '{line.split()[1]}' not in globals():\n") curr_indent +=4elif line.startswith('#undef'): outfile.write(f"{curr_indent *' '}if '{line.split()[1]}' in locals() or '{line.split()[1]}' in globals():\n") outfile.write(f"{(curr_indent +4) *' '}del {line.split()[1]}\n")elif line.startswith('#if'): outfile.write(f"{curr_indent *' '}{line[1:]}:\n") curr_indent +=4elif line.startswith('#else'): outfile.write(f"{(curr_indent -4) *' '}else:\n")elif line.startswith('#endif'): curr_indent -=4elif line.startswith('#error'): outfile.write(f"{curr_indent *' '}print({' '.join(line.split()[1:])})\n")else:print(line) line = infile.readline()[:-1]
The result is something like this:
S =0if INCLUDE_LEVEL >12:if S ==0:if'S'inlocals()or'S'inglobals():del S S =1if'S'inlocals()or'S'inglobals():del S S =24if S ==1:if'S'inlocals()or'S'inglobals():del S S =2if'R0'inlocals()or'R0'inglobals():if'R0'inlocals()or'R0'inglobals():del R0else: R0 =Noneif'R1'inlocals()or'R1'inglobals():if'R1'inlocals()or'R1'inglobals():del R1...else:if S !=-1: INCLUDE_LEVEL +=1exec(open('include.py').read()) INCLUDE_LEVEL -=1if S !=-1: INCLUDE_LEVEL+=1exec(open('include.py').read()) INCLUDE_LEVEL -=1
This is Python code that performs the same checking functionality as the preprocessor directives. This works by replacing ifdef and ifndef with checking whether the variable exists, which is basically the same thing!
For LD(x, y), we can define the following function which accepts l as an array of [l7, l6, ..., l0]:
defLD(l,n):returneval(f"ROM_{''.join([str(x) for x in l])}_{n}")
Analysis (Part 2)
After some dynamic analysis, I found that the code essentially checks each character of the flag one by one, starting from index 0. The value of S follows a predictable sequence for the checking of the first few characters, then goes to S = 56 before the program says INVALID_FLAG.
Let's inspect this part of the code then.
if S ==56:print(S)if'S'inlocals()or'S'inglobals():del S S =57if'Q0'notinlocals()and'Q0'notinglobals():if'Q1'notinlocals()and'Q1'notinglobals():if'Q2'notinlocals()and'Q2'notinglobals():if'Q3'notinlocals()and'Q3'notinglobals():if'Q4'notinlocals()and'Q4'notinglobals():if'Q5'notinlocals()and'Q5'notinglobals():if'Q6'notinlocals()and'Q6'notinglobals():if'Q7'notinlocals()and'Q7'notinglobals():if'S'inlocals()or'S'inglobals():del S S =58if S ==57:if'S'inlocals()or'S'inglobals():del S S =58print("INVALID_FLAG")if S ==58:if'S'inlocals()or'S'inglobals():del S S =59if'S'inlocals()or'S'inglobals():del S S =-1
At S = 56, if any of Q0 to Q7 is defined, then S is set to 57, and this results in INVALID_FLAG. However, if all of Q0 to Q7 is not defined, then it skips this part of the code and jumps to S = 58.
Solving
This knowledge greatly reduces the time complexity of a bruteforce solution. The idea is that when any of Q0 to Q7 is set, we can conclude that the last-checked character is wrong. Previously, we would have no way of knowing whether each individual character is correct, only that the flag as a whole is wrong.
The following solver script implements this, albeit in a very hacked-together kind of way. Since LD(l, n) is called for each bit in each flag character, we know that the i-th character is wrong if by the i+1-th character, any of Q0 to Q7 is set. This is then handled by the driver code by moving on to the next possible i-th character.
import stringfrom io import StringIOfrom contextlib import redirect_stdoutCHAR_a =97...CHAR_UNDERSCORE =95FLAG_0 = CHAR_C...FLAG_26 = CHAR_RBRACEresult =''for i inrange(27):defLD(l,n):global iiff"ROM_{''.join([str(x) for x in l])}_{n}"==f"ROM_{bin(i +128+1)[2:]}_0":if'Q0'inlocals()or'Q0'inglobals()or\'Q1'inlocals()or'Q1'inglobals()or\'Q2'inlocals()or'Q2'inglobals()or\'Q3'inlocals()or'Q3'inglobals()or\'Q4'inlocals()or'Q4'inglobals()or\'Q5'inlocals()or'Q5'inglobals()or\'Q6'inlocals()or'Q6'inglobals()or\'Q7'inlocals()or'Q7'inglobals():raiseExceptionprint(f"ROM_{''.join([str(x) for x in l])}_{n}", eval(f"ROM_{''.join([str(x) for x in l])}_{n}"))returneval(f"ROM_{''.join([str(x) for x in l])}_{n}") f =StringIO()for char in string.ascii_letters + string.digits +"_{}":print(f"Trying FLAG_{i} = {char}...")exec(f"FLAG_{i} = ord(char)")try:withredirect_stdout(f):exec(open('converted.py').read())except:continueelse:breakprint(char, "works!") result += charprint('FLAG:', result)
This gives us the flag relatively quickly: CTF{pr3pr0cess0r_pr0fe5sor}