This article explains how intelligent applications from Carnegie Mellon University and Berkeley researchers counter automated registration spam programs, and how to build your own using ASP.NET and XML Web services.
"We can only see a short distance ahead, but we can see plenty there that needs to be done."
- A. M. Turing, Father of AI. British Mathematician (1912-1954)
Scientific research in academia is tightly coupled with today's technological revolution. In this article we will discuss the design, development, and use of CAPTCHA (Completely Automated Public Turing Test to Tell Computers and Humans Apart). We encounter various forms of CAPTCHA in our everyday lives, for instance when signing up for an email account or performing a domain lookup (whois), where an image is used to differentiate between a person and a software program. All major vendors, Web portals, and email providers use CAPTCHA to improve their quality of service. Search engines and Web directories use CAPTCHA to avoid skews in their listings, possibly caused by autonomous rogue submission programs. Online polls use this technique to prevent multiple voting, because proxy addressing and IP spoofing make it difficult to maintain the integrity of online polls. Protection from brute force or dictionary-based password attacks is also provided by this simple but effective practice.
First I'll give a short history of CAPTCHA and define Turing's test and machine vision. Then I'll describe how Yahoo!, AltaVista, PayPal, and other portals use CAPTCHA approaches in various ways to protect their digital assets. Finally I'll explain how to write a program in ASP.NET to protect a Web application from autonomous bots. Apart from the theoretical discussion, I'll explain code snippets for manipulating images in ASP.NET and C#. Three in-depth examples will cover dynamic image generation, dictionary-based CAPTCHA-style imaging, and Web services that return such images. Besides CAPTCHAs, this article covers the .NET imaging libraries, on-the-fly image generation, and serving binary data using XML Web services.
CAPTCHA is an acronym for "Completely Automated Public Turing Test to Tell Computers and Humans Apart". As the name suggests, it is a test that tells humans and computer programs apart. As defined on the CAPTCHA home page at the Carnegie Mellon University School of Computer Science's Web site:
CAPTCHA is a program that can generate and grade tests that
- Most humans can pass.
- Current computer programs can't pass.
For instance, the following image, which is generated by the Web service we'll see later in this article, is difficult for a computer program to read. However, a seven-year-old can easily figure it out.
Figure 1.1: A visual CAPTCHA generated by captchaWebservice (listing at end of article).
With the exponential growth of services and businesses over the Internet, online security has become a real concern for software developers, architects, managers, and vendors. Software programs are written to impersonate human beings, mimic their surfing patterns, and imitate their online activities. These "pretending to be human" programs are referred to as robots or virtual agents. Imagine a software program performing a brute force attack (exhaustive search) on your e-mail account, an attack that tries every possible password value until the right one is found. Digital assets are at risk from spam bots on various fronts: Web polls, Web registrations, automated services, and search engine submissions, to name a few.
What is the Turing Test?
The Turing test was introduced by Alan M. Turing (1912-1954) as "the imitation game" in his 1950 paper "Computing Machinery and Intelligence". The test is a foundation for determining whether a computer program has intelligence or, more precisely, whether it can make an interrogator believe it is a human being when it is actually a machine.
Turing describes the game as follows: "The new form of the problem can be described in terms of a game which we call the 'imitation game.' It is played with three people, a man (A), a woman (B), and an interrogator (C) who may be of either sex. The interrogator stays in a room apart from the other two. The object of the game for the interrogator is to determine which of the other two is the man and which is the woman. He knows them by labels X and Y, and at the end of the game he says either 'X is A and Y is B' or 'X is B and Y is A.' The interrogator is allowed to put questions to A and B."
In spite of much criticism (the Chinese Room Argument (Searle, 1980), variant definitions of 'thinking' and 'intelligence'), the Turing test remains a foundational concept in AI, philosophy, and cognitive science.
Courtesy CAPTCHA.NET. The CAPTCHA Project is a project of the School of Computer Science at Carnegie Mellon University. It is funded by the NSF Aladdin Center.
This is where scientists felt that an automated test to differentiate between humans and machines was necessary. Udi Manber, Yahoo!'s chief scientist, together with Manuel Blum and his graduate students at the School of Computer Science at Carnegie Mellon University, developed CAPTCHA support for Yahoo! so that a rogue routine using HTTP POST to resubmit a form over and over again would not get through.
It was a challenging problem for this group of intelligent people. Udi Manber said, "If you're in academia, you're always looking for interesting problems. If you're in industry, like me, you've got too many interesting problems." Yahoo! protected several of its services, including Yahoo! Briefcase, Yahoo! Mail, and Yahoo! Groups, from automated registration abuse by introducing CAPTCHAs in them.
An example of dictionary-based attack protection is shown below.
Figure 1.2: Yahoo! Countering the brute force attack.
As shown in the figure above, Yahoo! does not let a software program continue unless it fills in the value shown in the image. This image is dynamic and is different every time. Even if the user ID and password are correct, a legitimate user can't log in without providing the text in the image. This is how the site differentiates between a human being and a computer.
An average service exploiter or brute-force attack bot (short for robot, the autonomous agent) can't read this image. To read it, the bot needs OCR (optical character recognition). Even with an OCR engine that has excellent image recognition capabilities, it would be difficult to read even this image's filename in the HTML!
http://reg.yimg.com/i/retcQ.dZFemtHS_cf_8Qk12i.XyVGZ2Ej2qW7dKNiIqt0C1AF6mlqmWnUuLe.jpg
The filename is a long, random-looking string that contains a hash encoding of the challenge, so that the submitted text can later be matched against it. We will discuss these details later.
Also the HTML form contains the following hidden value regarding the challenge.
<input type=hidden name=".challenge" value="c9gLhuwLilq7KGFDsNBjac2ZSvWL">
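One way a hidden challenge value like this could work is as a keyed hash of the expected answer, so the server can check the typed text without storing anything per form. The following is only a minimal sketch under that assumption; it is not Yahoo!'s actual scheme, and the class name `ChallengeToken`, its `Issue` and `Verify` methods, the secret key, and the "nonce.mac" token layout are all hypothetical.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: these names do not come from Yahoo! or from the article's listings.
public static class ChallengeToken
{
    // Assumption: a server-side secret that never reaches the browser.
    private static readonly byte[] Key =
        Encoding.UTF8.GetBytes("replace-with-a-random-server-secret");

    // Returns "nonce.mac", a value that could be placed in a hidden form field.
    public static string Issue(string answer)
    {
        string nonce = Guid.NewGuid().ToString("N");
        return nonce + "." + Sign(nonce, answer);
    }

    // Recomputes the keyed hash from the typed text and compares it with the token.
    public static bool Verify(string token, string typedAnswer)
    {
        if (token == null || typedAnswer == null) return false;
        int dot = token.IndexOf('.');
        if (dot <= 0) return false;
        string nonce = token.Substring(0, dot);
        string mac = token.Substring(dot + 1);
        return SlowEquals(mac, Sign(nonce, typedAnswer));
    }

    // HMAC-SHA256 over the nonce and the case-insensitive, trimmed answer.
    private static string Sign(string nonce, string answer)
    {
        string message = nonce + ":" + answer.Trim().ToUpperInvariant();
        using (HMACSHA256 hmac = new HMACSHA256(Key))
        {
            return Convert.ToBase64String(hmac.ComputeHash(Encoding.UTF8.GetBytes(message)));
        }
    }

    // Compares every character instead of stopping at the first mismatch.
    private static bool SlowEquals(string a, string b)
    {
        int diff = a.Length ^ b.Length;
        for (int i = 0; i < a.Length && i < b.Length; i++) diff |= a[i] ^ b[i];
        return diff == 0;
    }
}
```

Because the token alone says nothing about when it was issued, a real deployment would also need an expiry time and a record of used nonces so that a solved challenge cannot be replayed.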
Figure 1.3: Accessing Yahoo! CAPTCHA image.
This image is stored on disk and can be accessed as shown in the figure above, but it is still difficult to read with an optical character recognizer. The affine transformations (skew, stretch, scale) have made this text difficult to read for an OCR, which weighs its neural output on the basis of pattern matching. That is entirely different from how humans read. Humans don't just read text; they also bring contextual background that lets them pick it out clearly. This is not the case with machine vision. With the letters "t" and "i" mingling together against a fuzzy background, a blending gradient, and noise, the text will be very difficult for an OCR to read.
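To make the term concrete, an affine transformation maps every point of a glyph through the same six numbers. The sketch below is a plain-arithmetic illustration, not code from this article's listings; the class `AffineSketch` and its methods are hypothetical names. The parameter order is chosen to mirror the six-element constructor of `System.Drawing.Drawing2D.Matrix` (m11, m12, m21, m22, dx, dy), on the assumption that you would eventually hand the same values to the .NET imaging classes.

```csharp
using System;

// Hypothetical helper for illustration; not part of the article's listings.
public static class AffineSketch
{
    // Maps (x, y) to (x*m11 + y*m21 + dx, x*m12 + y*m22 + dy).
    public static double[] Apply(double m11, double m12, double m21, double m22,
                                 double dx, double dy, double x, double y)
    {
        return new double[] { x * m11 + y * m21 + dx, x * m12 + y * m22 + dy };
    }

    // Shear along x: points slide sideways in proportion to their y value.
    public static double[] ShearX(double factor, double x, double y)
    {
        return Apply(1, 0, factor, 1, 0, 0, x, y);
    }

    // Stretch: an independent scale factor for each axis.
    public static double[] Stretch(double sx, double sy, double x, double y)
    {
        return Apply(sx, 0, 0, sy, 0, 0, x, y);
    }
}
```

For example, `AffineSketch.ShearX(0.5, 10, 20)` moves the point (10, 20) to (20, 20), and `AffineSketch.Stretch(2, 0.5, 10, 20)` moves it to (20, 10). Applying a different transform to each character is what breaks the regular grid that simple pattern matching relies on.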
Introducing images did not keep the bots from automated registration for long. Bot writers took it as a challenge; using optical character recognition techniques, they read such images, and the automation continued. But reading a simple text-based CAPTCHA image laid out on a predictable grid is much easier than reading a skewed, twisted, and distorted image built to baffle the bots.
Machines have difficulty reading text that has been subjected to noise, affine transformations (especially mirror effects and xy-shearing), segmentation, gradients, occlusion, and other degradations. Machines don't think in a social context. The way they look at an image is like the blind men and the elephant: each arrives at a separate interpretation.
To counter increasingly intelligent bots, CMU created various kinds of tests, namely Gimpy, Bongo, Pix, Sounds and Byan.
Gimpy
CMU describes Gimpy as its most reliable system. Furthermore, "It was originally built for (and in collaboration with) Yahoo! to keep bots out of their chat rooms, to prevent scripts from obtaining an excessive number of their e-mail addresses, and to prevent computer programs from publishing classified ads."
Gimpy chooses a certain number of words from a dictionary, as the Web service and application at the end of this article do, and displays them corrupted and distorted in an image. It challenges users to type the text in the image, which humans can do and a bot can't.
The pictures below show CAPTCHAs generated by Gimpy.
Figure 1.4: Images generated by Gimpy. (Courtesy CMU)
Figure 1.5: A multiple word test by Gimpy. (Courtesy CMU)
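The word-selection and grading steps of a dictionary-based test can be sketched in a few lines. This is an illustrative assumption of how such a test might be structured, not CMU's code; the class `WordChallenge`, its methods, and the `required` threshold are hypothetical. The distortion step is left out here, and the sketch assumes the dictionary contains no duplicate entries.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; class and method names are illustrative, not CMU's code.
public static class WordChallenge
{
    // Picks `count` distinct words by partially shuffling a copy of the dictionary.
    public static string[] Pick(string[] dictionary, int count, Random rng)
    {
        if (count > dictionary.Length) throw new ArgumentException("dictionary too small");
        string[] pool = (string[])dictionary.Clone();
        for (int i = 0; i < count; i++)
        {
            int j = rng.Next(i, pool.Length);   // i <= j < pool.Length
            string tmp = pool[i]; pool[i] = pool[j]; pool[j] = tmp;
        }
        string[] chosen = new string[count];
        Array.Copy(pool, chosen, count);
        return chosen;
    }

    // Passes when the typed text contains at least `required` of the shown words.
    public static bool Grade(string[] shown, string typed, int required)
    {
        HashSet<string> typedWords = new HashSet<string>(
            typed.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries),
            StringComparer.OrdinalIgnoreCase);
        int hits = 0;
        foreach (string word in shown)
        {
            if (typedWords.Contains(word)) hits++;
        }
        return hits >= required;
    }
}
```

`System.Random` keeps the sketch short and repeatable; since its output is predictable, a production test would draw its indexes from a cryptographic random number generator instead.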
Bongo
Bongo is a program that asks the user to solve a visual pattern recognition problem like the one below. Further details can be obtained from the CMU CAPTCHA home page.
Pix
This program selects random images with certain objects in common and asks users what is common among them. This novel and intelligent approach, I believe, will keep the bots baffled for quite some time.
The figure below demonstrates how Pix works.
Figure 1.6: A picture recognition test by Pix. (Courtesy CMU)
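The data structure behind a picture test of this kind can be as simple as a table of images grouped by the object they show. The sketch below is an assumption about how such a test might be organized, not CMU's implementation; the class `PictureChallenge`, its methods, and any labels or file names you feed it are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; not CMU's implementation.
public static class PictureChallenge
{
    // Chooses one label at random and returns `count` of the images filed under it.
    public static string[] Pick(Dictionary<string, string[]> imagesByLabel,
                                int count, Random rng, out string label)
    {
        List<string> labels = new List<string>(imagesByLabel.Keys);
        label = labels[rng.Next(labels.Count)];
        string[] pool = (string[])imagesByLabel[label].Clone();
        if (count > pool.Length) throw new ArgumentException("not enough images for label");
        for (int i = 0; i < count; i++)
        {
            int j = rng.Next(i, pool.Length);   // partial shuffle, as in a card deal
            string tmp = pool[i]; pool[i] = pool[j]; pool[j] = tmp;
        }
        string[] chosen = new string[count];
        Array.Copy(pool, chosen, count);
        return chosen;
    }

    // The user passes by naming the object the images have in common.
    public static bool Grade(string label, string typed)
    {
        return typed != null &&
               string.Equals(label, typed.Trim(), StringComparison.OrdinalIgnoreCase);
    }
}
```

The strength of this design depends on the size of the image collection: if the same few files keep reappearing, a bot could simply memorize which label goes with which file.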
Sounds
As its name suggests, this is an audio-based version of Gimpy. It randomly selects a word or a sequence of numbers and generates a sound clip with degraded quality. To overcome an application like this, bots would have to be equipped with not only OCR but voice recognition as well.