The Turing Test

Alan Turing and the Imitation Game

   Alan Turing, in a 1950 paper, proposed a test called "The Imitation Game" that might finally settle the issue of machine intelligence. The first version of the game he explained involved no computer intelligence whatsoever. Imagine three rooms, each connected via computer screen and keyboard to the others. In one room sits a man, in the second a woman, and in the third a person we will call the "judge". The judge's job is to decide which of the two people talking to him or her through the computer is the man and which is the woman. (The computer terminals are used so that physical clues cannot be used.) In Turing's description, the woman attempts to help the judge, offering whatever evidence she can to prove who she is. The man's job is to trick the judge, so he will attempt to deceive, and to counteract his opponent's claims, in hopes that the judge will erroneously identify him as the woman.

   What does any of this have to do with machine intelligence? Turing then proposed a modification of the game in which, instead of a man and a woman as contestants, there is a human, of either gender, at one terminal and a computer at the other. Now the judge's job is to decide which of the contestants is the human and which is the machine. Turing proposed that if, under these conditions, a judge is no better than 50% accurate, that is, if the judge is as likely to pick the computer as the human, then the computer must be a passable simulation of a human being and hence intelligent. The game has more recently been modified so that there is only one contestant, and the judge's job is not to choose between two contestants but simply to decide whether the single contestant is human or machine.
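
   As a rough illustration of the 50% criterion, the sketch below scores a judge's verdicts and asks how far the result is from coin-flipping. The function names and the use of an exact binomial test are assumptions made here for illustration; Turing's paper specifies no such statistical procedure.

```python
from math import comb


def judge_accuracy(verdicts):
    """Fraction of trials in which the judge picked the human correctly.

    verdicts is a list of booleans, one per trial (True = correct).
    """
    return sum(verdicts) / len(verdicts)


def p_value_vs_chance(correct, trials):
    """Two-sided exact binomial p-value against pure guessing (p = 0.5).

    A large value means the judge's score is consistent with guessing,
    which is the situation the 50% criterion describes.
    """
    observed = abs(correct - trials / 2)
    # Count every outcome at least as far from trials/2 as the observed one.
    extreme = sum(
        comb(trials, k)
        for k in range(trials + 1)
        if abs(k - trials / 2) >= observed
    )
    return extreme / 2 ** trials
```

   For example, a judge who is right in 5 of 10 trials has an accuracy of 0.5 and a p-value of 1.0 (indistinguishable from guessing), while a judge who is right in all 10 has a p-value of about 0.002.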

   The entry on the Turing Test is short but very clearly stated. A longer, point-form review of the imitation game and its modifications, written by Larry Hauser, is also available (with a local copy in case the link fails). Hauser's page may not contain enough detail to explain the test on its own, but it is an excellent reference or study guide and contains some helpful diagrams for understanding the interplay of contestant and judge. The page also makes reference to John Searle's Chinese Room, a thought experiment developed as an attack on the Turing test and similar "behavioural" intelligence tests. We will discuss the Chinese Room in the next section.

Natural Language Processing (NLP)

   Partly out of an attempt to pass Turing's test, and partly just for the fun of it, there arose, largely in the 1960s and 1970s, a group of programs that tried to cross the first human-computer barrier: language. These programs, often fairly simple in design, employed small databases of (usually English) language combined with a series of rules for forming intelligent sentences. While most were woefully inadequate, some grew to tremendous popularity. Perhaps the most famous such program was Joseph Weizenbaum's ELIZA. Written in 1966, it was one of the first and remained for quite a while one of the most convincing. ELIZA simulates a Rogerian psychotherapist (the Rogerian therapist is empathic but passive, asking leading questions but doing very little talking, e.g. "Tell me more about that," or "How does that make you feel?") and does so quite convincingly, for a while. There is no hint of intelligence in ELIZA's code: it simply scans for keywords like "Mother" or "Depressed" and then asks suitable questions from a large database. Failing that, it generates something generic in an attempt to elicit further conversation. Most programs since have relied on similar principles of keyword matching, paired with basic knowledge of sentence structure. There is, however, no better way to see what they are capable of than to try them yourself. We have compiled a set of links to some of the more famous attempts at NLP. Students are encouraged to interact with these programs in order to get a feeling for their strengths and weaknesses. Many of the pages provided here link to dozens of such programs, so don't get lost among the artificial people.
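
   The keyword-matching idea described above can be sketched in a few lines of Python. This is only a toy in the spirit of ELIZA, not Weizenbaum's program: the rules, the canned responses and the name `respond` are all invented here for illustration.

```python
import re

# Hypothetical keyword rules, checked in order: (pattern, response template).
# These are illustrative only and are not taken from ELIZA's actual script.
RULES = [
    (re.compile(r"\b(mother|father|family)\b", re.IGNORECASE),
     "Tell me more about your family."),
    (re.compile(r"\b(depressed|sad|unhappy)\b", re.IGNORECASE),
     "I am sorry to hear that. How does that make you feel?"),
    (re.compile(r"\bi need (.+)", re.IGNORECASE),
     "Why do you need {0}?"),
]

# Generic prompt used when no keyword is found.
FALLBACK = "Tell me more about that."


def respond(user_input: str) -> str:
    """Return a canned reply for the first rule whose keyword appears."""
    for pattern, template in RULES:
        match = pattern.search(user_input)
        if match:
            # Reuse any captured fragment, minus trailing punctuation.
            fragments = [g.rstrip(".!?") for g in match.groups()]
            return template.format(*fragments)
    return FALLBACK
```

   Calling `respond("I need a holiday.")` echoes part of the input back as a question, while an input with no known keyword falls through to the generic prompt. Nothing here models meaning, which is why such programs are convincing only for a while.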

Online Examples of NLP

   This is a series of online demos of some of the more famous NLP programs. Many are Java applets, so be sure you are using a Java-capable browser.

The Loebner Prize

   Although Turing proposed his test in 1950, it was not until some 40 years later, in 1991, that the test was first really implemented. Dr. Hugh Loebner, a professor very much interested in seeing AI succeed, pledged $100,000 to the first entrant that could pass the test. The 1991 contest had some serious problems, though (perhaps most notable was that the judges were all computer science specialists and knew exactly what kind of questions might trip up a computer), and it was not until 1995 that the contest was re-opened. Since then, there has been an annual competition, which has yet to find a winner. While small prizes are given out to the most "human-like" computer, no program has had the 50% success Turing aimed for.

Validity of the Turing Test

   Alan Turing's imitation game has fueled decades of controversy, with little sign of slowing. On one side of the argument, human-like interaction is seen as absolutely essential to human-like intelligence. A successful AI is worthless if its intelligence lies trapped in an unresponsive program. Some have even extended the Turing Test. Stevan Harnad (see below) has proposed the "Total Turing Test", where instead of language alone, the machine must interact in all areas of human endeavor, and instead of a five-minute conversation, the duration of the test is a lifetime. James Sennett has proposed a similar extension to the Turing Test (a local copy is available if the link fails) that challenges AI to mimic not only human thought but also personhood as a whole. To illustrate his points, Sennett uses the character 'Data' from Star Trek: The Next Generation.

   Opponents of Turing's behavioural criterion of intelligence argue that it is either not sufficient or perhaps not even relevant at all. What is important, they argue, is that the computer demonstrates cognitive ability, regardless of behaviour. It is not necessary that a program speak in order for it to be intelligent. There are humans who would fail the Turing test, and unintelligent computers that might pass, so the test is neither necessary nor sufficient for intelligence. In hopes of illuminating the debate, we have assigned two papers that deal with the Turing Test from very different points of view. The first is a criticism of the test; the second comes to its defense.

Additional Resources

   Students interested in more information on the Turing test and the surrounding controversy may find the links below helpful. Each is a compilation of material related to the Turing Test: the first deals with the more applied issues, the Loebner Prize and NLP programs in general; the second deals with the philosophical issues surrounding the test and its variations.

All contents copyright , 1999.