Human-aided Bot Operation
Published on Sep 23, 2019
We’d like to think that we can outsmart computers. If we’re trying to defend a website against bots, we’d like to be able to include a CAPTCHA and call it a day. Sure, we might need to contort some letters or sprinkle some lines and dots in the background, but ultimately, we feel pretty confident that we can come up with a roadblock that stops bots while letting humans through. Unless (and there’s a faint worry in the back of our minds) machine learning has progressed far enough that bots can actually defeat our twisted-text CAPTCHAs.
It turns out that it’s actually not that difficult to come up with alternate CAPTCHAs that don’t involve users deciphering mangled text. Unless those particular CAPTCHAs are very popular, it’s very unlikely that there is an automated way of solving them. At the same time, it’s easy to become overconfident as illustrated by this XKCD comic: the vulnerability is not so much that the bot can overcome the CAPTCHA, but rather that the bot can simply outsource CAPTCHA work to humans. There are a number of CAPTCHA-breaking networks that pay humans very small amounts of money to solve a CAPTCHA challenge. Consider the varying levels of cooperation between bots and humans:
Level 1: Naive bot, no human interaction
Some scrapers do not account for CAPTCHAs.
Countermeasure: Any CAPTCHA will be effective in this trivial case.
Level 2: Bot with CAPTCHA-breaking capabilities
Some CAPTCHAs (e.g. distorted text or math problems) have been around long enough that libraries have been developed to circumvent those CAPTCHAs. Bots can use those libraries and be programmed to fill in specific fields to answer CAPTCHA challenges.
Countermeasure: A sufficiently non-standard CAPTCHA should be effective until the CAPTCHA-breaking software improves.
Level 3: Bots requiring human configuration but not ongoing human interaction
Countermeasure: Websites might be able to counter these bots by adding sufficient variety. For example, they might use randomly generated values for the id of the CAPTCHA field. Or, they could change the special word on each page load. These strategies are still suspect because a sufficiently configurable bot could be programmed to read the CAPTCHA instructions (which are meant to be easy for humans to understand) and act accordingly. More effective would be stronger CAPTCHAs that require more user interaction.
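As a rough illustration of the variety described above, the following Python sketch generates a random field id and a fresh special word on every page load and keeps the expected answer on the server. The names new_challenge, check_answer, and WORDS, and the plain dict standing in for a server-side session, are assumptions made for this example; they do not describe how any particular product works.

```python
import secrets

# Hypothetical word pool for illustration; a real site would want a much larger one.
WORDS = ["maple", "harbor", "violet", "copper", "lantern", "meadow"]


def new_challenge(session: dict) -> dict:
    """Create a fresh challenge for one page load and remember the answer server-side."""
    field_id = "f_" + secrets.token_hex(8)   # random id for the answer field
    keyword = secrets.choice(WORDS)          # special word changes on each load
    session["captcha"] = {"field_id": field_id, "keyword": keyword}
    return {
        "field_id": field_id,
        "instructions": f"Type the word '{keyword}' in the box below.",
    }


def check_answer(session: dict, form: dict) -> bool:
    """Validate a submitted form against the challenge stored in the session."""
    expected = session.pop("captcha", None)  # each challenge can be used only once
    if expected is None:
        return False
    submitted = form.get(expected["field_id"], "").strip().lower()
    return secrets.compare_digest(submitted.encode(), expected["keyword"].encode())
```

A bot that was configured once to fill in a fixed field with a fixed word fails here, because neither the field id nor the word survives a reload. As noted above, a bot that parses the instructions themselves would still get through.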
Level 4: Bots that use remote humans on an ongoing basis
Suppose you come up with a CAPTCHA that is impossible for computers to automatically crack. Perhaps you have discovered some novel way to transform characters. Or perhaps you render the instructions to type a specific word graphically so that computers cannot easily read the text; even if they can, they still need to figure out which word is the magic keyword, and you have taken care to vary the instructions so that the keyword is not always the last word. Or perhaps you have a very large collection of questions whose answers can be found on the internet, but which bots cannot easily answer. Here, bots do not need to become better; they can simply send the CAPTCHA challenge to humans, who solve it relatively cheaply. Humans can load the image or text and send back the answer.
Level 5: The bot and the human are one: full browser emulation
The bot might be a full-fledged browser and when it detects a CAPTCHA element, it waits for a human to interactively control it to solve the CAPTCHA (similar to how screensharing software allows clients to take control of a desktop or an application). All human input is captured and emulated.
Countermeasure: It seems unlikely that there is an effective obstacle to this behavior for public sites relying only on CAPTCHA, since this would be virtually indistinguishable from humans browsing the site. However, this type of arrangement would require fairly sophisticated software and would likely cost much more in human labor. One consolation is that, as illustrated by the bear joke, you don’t need to make your website completely impervious to bots, only difficult enough that it would be less expensive to just contact you and work with you directly.
To that end, one possible mitigation is to limit the time allowed to solve the CAPTCHA, making it more expensive to scrape (because a human needs to be nearby). This method is likely to be defeated by humans simply refreshing the page when they are ready to solve the CAPTCHA. At that point, you can track the number of unsolved CAPTCHAs… and the arms race continues.
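The time limit and the unsolved-CAPTCHA count can be combined in a small tracker. The sketch below is only one way to do it, under stated assumptions: the class name ChallengeTracker, the 30-second limit, the threshold of five unsolved challenges, and the idea of keying clients by something like an IP address or session id are all hypothetical choices for illustration.

```python
import time

TIME_LIMIT_SECONDS = 30   # assumed limit; real users may need more time
MAX_UNSOLVED = 5          # assumed threshold for flagging a client


class ChallengeTracker:
    """Tracks issued challenges, enforces a time limit, and counts unsolved ones."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._issued = {}     # challenge_id -> (client_key, issued_at)
        self._unsolved = {}   # client_key -> challenges issued but never solved

    def issue(self, challenge_id: str, client_key: str) -> None:
        """Record a newly served challenge; it counts as unsolved until solved."""
        self._issued[challenge_id] = (client_key, self._clock())
        self._unsolved[client_key] = self._unsolved.get(client_key, 0) + 1

    def solve(self, challenge_id: str, answer_correct: bool) -> bool:
        """Accept a solution only if it is correct and arrives within the limit."""
        entry = self._issued.pop(challenge_id, None)
        if entry is None:
            return False      # unknown or already used challenge
        client_key, issued_at = entry
        in_time = self._clock() - issued_at <= TIME_LIMIT_SECONDS
        if answer_correct and in_time:
            self._unsolved[client_key] -= 1
            return True
        return False          # late or wrong answers stay counted as unsolved

    def is_suspicious(self, client_key: str) -> bool:
        """Flag clients that keep requesting challenges without solving them."""
        return self._unsolved.get(client_key, 0) >= MAX_UNSOLVED
```

A client that refreshes the page repeatedly until a human is ready accumulates unsolved challenges and eventually trips is_suspicious. What to do with a flagged client (a harder CAPTCHA, a delay, a block) is a separate policy decision that this sketch leaves open.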
At NetToolKit, we have spent time thinking through these vulnerabilities and have tried to engineer protections against various threats. We believe Shibboleth can handle any level of bot-human integration except for full-browser emulation (we’re working on another service for that). If you have any notes to add or ideas for new and fun CAPTCHAs, please let us know.