
Insightful comment about CAPTCHA

2009-03-28 10:30 by Thomas Damgaard Nielsen

QuoteMstr wrote this insightful comment about CAPTCHAs:


A few common CAPTCHA fallacies

Everyone has a great idea for a CAPTCHA, but very few people know what the hell is really going on. Remember that the machine doesn’t need to solve the CAPTCHA every time, that machines are infinitely patient and have huge memories, and that another machine needs to make sure the human gave the right answer!

Ideas that won’t work:

  1. Make clients identify an object from a picture. Machines can’t describe objects in pictures: if machines can’t describe the picture, how the hell is the CAPTCHA server supposed to verify that the client gave the correct answer? If a human being manually inputs the pictures and acceptable descriptions for each, then another human can program his attacking machine to do the same thing! Having a large, but finite set of pictures doesn’t help either since a machine doesn’t need to solve the CAPTCHA every time. It can just learn the correct responses without actually understanding the image. ANY APPROACH BASED ON IDENTIFYING A MEMBER OF A FINITE SET DOES NOT WORK AS A CAPTCHA.
  2. As a special case of #1, QUIZZES DO NOT WORK: either the questions are finite and subject to attacker memorization, or the number of patterns for the question is finite, and these patterns can be detected by a machine. (Consider “A train is coming from Denver at X miles per hour…”: same problem, different coefficients.)
  3. Send the client a special program that verifies he’s real: if it doesn’t work for DRM, it won’t work for CAPTCHAs. An attacker can just program his machine to simulate slow typing, slow thinking, or a cross-eyed human being. YOU CANNOT CONTROL THE EXECUTION ENVIRONMENT. No amount of JavaScript obfuscation, encryption, or header-checking will make the slightest bit of difference to a determined hacker.
  4. As a special case of #3, TIMING ANALYSIS DOES NOT WORK. Machines can simulate arbitrary delays.
  5. Limiting CAPTCHA-solving attempts by cookie, IP address, etc.: that doesn’t work. Attackers don’t obey web standards, and they have botnets, so each attempt can arrive without cookies and from a different address.
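
The memorization argument from item 1 can be sketched as a small simulation. Everything named here is hypothetical and only for illustration: `POOL` stands in for a finite set of pictures with accepted answers, `MemorizingBot` is the attacker, and `ask_human` stands in for a person solving an unseen picture once (in this sketch it simply reads the answer from `POOL`).

```python
import hashlib
import random

# Hypothetical finite challenge pool: image bytes -> accepted answer.
POOL = {f"image-{i}".encode(): f"label-{i}" for i in range(50)}


def serve_challenge(rng):
    """Server side: pick a random image from the finite pool."""
    return rng.choice(list(POOL))


def check_answer(image, answer):
    """Server side: compare the response with the stored answer."""
    return POOL[image] == answer


class MemorizingBot:
    """Attacker that never understands an image, only remembers it."""

    def __init__(self):
        self.known = {}        # image hash -> answer learned earlier
        self.human_solves = 0  # how often a person had to help

    def solve(self, image, ask_human):
        key = hashlib.sha256(image).hexdigest()
        if key not in self.known:
            # First sighting: pay for one human solve, then cache it.
            self.known[key] = ask_human(image)
            self.human_solves += 1
        return self.known[key]


def simulate(rounds=2000, seed=0):
    rng = random.Random(seed)
    bot = MemorizingBot()
    passed = 0
    for _ in range(rounds):
        image = serve_challenge(rng)
        answer = bot.solve(image, ask_human=lambda img: POOL[img])
        passed += check_answer(image, answer)
    return passed, bot.human_solves
```

Under these assumptions the number of human solves is bounded by the pool size, no matter how many challenges the server issues; every later challenge is answered from the cache.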

Really, it’s very easy to think you’ve come up with a very clever CAPTCHA. When you think that, all you’ve done is stoked your ego and screwed yourself over. It’s the same reason why we don’t roll our own cryptography: CAPTCHA-making is a very hard problem, mainly because your problem space must be infinite (to avoid an attacking machine simply memorizing answers), the answers verifiable by a machine, but the problems not solvable by a machine.
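
An unbounded number of question instances is not enough if they all follow one pattern, as the train quiz above suggests. A minimal sketch, assuming a made-up question template (`TEMPLATE`, `make_question` and `bot_answer` are illustrative names, not taken from any real CAPTCHA): the generator can emit many distinct questions, yet a single regular expression answers all of them.

```python
import random
import re

# Hypothetical quiz template: same problem, different coefficients.
TEMPLATE = (
    "A train leaves Denver at {speed} miles per hour. "
    "How many miles does it cover in {hours} hours?"
)


def make_question(rng):
    """Server side: fill the template and remember the expected answer."""
    speed = rng.randint(20, 120)
    hours = rng.randint(2, 12)
    return TEMPLATE.format(speed=speed, hours=hours), speed * hours


# Attacker side: one pattern covers every instance of the template.
PATTERN = re.compile(r"at (\d+) miles per hour.*?in (\d+) hours")


def bot_answer(question):
    """Extract the two coefficients and multiply them."""
    match = PATTERN.search(question)
    if match is None:
        return None  # unknown template; the bot would need a new pattern
    return int(match.group(1)) * int(match.group(2))
```

The attacker's cost here scales with the number of templates, not with the number of questions, which is why a large coefficient range adds no real difficulty.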

How many questions can be checked by machines but not answered by them?

Not many; fewer every day. There are no questions that can’t be answered by a computer (and which can be answered by a human mind). The Church-Turing thesis has some validity: the human mind is no more powerful than a Turing machine, and ultimately, computers and our brains are computationally equivalent. There’s nothing a computer can’t solve: there are just things we haven’t figured out yet.
