GA Tech researcher looking to improve on Turing Test

Chuck Bednar for redOrbit.com – @BednarChuck

For the second time in a year, a researcher from the Georgia Institute of Technology is challenging the Turing Test, the standard method currently used to evaluate whether a computer program or machine is capable of exhibiting human-level intelligence.

Like his colleague Mark Riedl before him, Georgia Tech professor Ashok Goel has proposed an alternative method of measuring artificial intelligence (AI) that addresses the shortcomings of the technique originally proposed by computing pioneer Alan Turing back in 1950.

Using visual analogy as the basis for a new method

Last November, Riedl devised an alternative method that judges a machine not on its ability to converse, but on its ability to create a convincing story, poem or painting. He and his colleagues came up with a tweaked version of the Lovelace Test, dubbed Lovelace 2.0, which established clear, measurable parameters for judging the artistic work created by an AI system.

Goel, a professor in the School of Interactive Computing, suggests a different technique designed to address the fact that the Turing Test only considers natural language when measuring artificial intelligence. He proposes using Raven’s Progressive Matrices Test, which relies exclusively on a set of visual analogy problems to measure general intelligence, as the basis for a new test.

By using visual analogy problems, Raven’s Progressive Matrices Test forces users to identify the relationship between items that are similar in appearance. Understanding such problems requires elements that are essential to general intelligence, Goel said, including visual thinking, common-sense reasoning, mental imagery and background knowledge – which would make it a good basis for a new, more thorough alternative to the venerable Turing Test.
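The relationship-finding at the heart of such problems can be illustrated with a toy "A is to B as C is to ?" solver. This is a minimal sketch for illustration only, not Goel's actual system: a real program would extract features from raw images via image processing, whereas here each figure is a hand-coded feature dictionary, and all function names are hypothetical.

```python
def transformation(a, b):
    """Describe how figure A changes into figure B, feature by feature."""
    return {k: (a[k], b[k]) for k in a}

def apply_transformation(c, delta):
    """Apply the A-to-B change to figure C to predict the missing figure."""
    result = dict(c)
    for feature, (before, after) in delta.items():
        if result.get(feature) == before:
            result[feature] = after
    return result

def solve(a, b, c, candidates):
    """Pick the answer candidate closest to the predicted figure."""
    predicted = apply_transformation(c, transformation(a, b))
    return max(candidates,
               key=lambda cand: sum(cand.get(k) == v
                                    for k, v in predicted.items()))

# A small square becomes a large square, so a small circle
# should become a large circle.
A = {"shape": "square", "size": "small"}
B = {"shape": "square", "size": "large"}
C = {"shape": "circle", "size": "small"}
options = [
    {"shape": "circle", "size": "small"},
    {"shape": "circle", "size": "large"},
    {"shape": "square", "size": "large"},
]
print(solve(A, B, C, options))  # -> {'shape': 'circle', 'size': 'large'}
```

Even this toy version shows why the task exercises more than pattern matching: the solver must infer an abstract rule from one pair of figures and transfer it to a new figure, a small-scale analogue of the background knowledge and common-sense reasoning Goel describes.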

The Georgia Tech professor proposes that computers be shown raw input images, then tasked with solving problems based on what they see. The AI will use image processing to make sense of the picture and visual “thinking” to address the problem, Goel explained in a presentation at the 29th AAAI Conference on Artificial Intelligence in Austin, Texas, earlier this month.

Addressing the limitations of the Turing Test

In a statement, Goel explained that the prevailing sense at the workshop was that the Turing Test had some issues and could be deceived relatively easily. To prove his point, he refers to “Eugene Goostman,” a chatbot made by a trio of Russian programmers, which managed to pass the Turing Test by convincing one-third of the judges that it was actually a real Ukrainian boy.

The results of that experiment have been disputed for a number of reasons, he said, including the fact that the chatbot was able to influence the test judges by posing as a non-native English speaker, thus gaining an unfair advantage. Goel noted that the transcript of that chat session “makes no sense,” showing that the method is not the best way to test the capabilities of artificial intelligence.

“The general sense among many AI researchers,” he told redOrbit via email, “is that the Turing Test can be deceived relatively easily, for example, by giving non sequitur answers but explaining the lack of logical coherence by pretending to be young, or a non-native speaker of English, or belonging to a different culture, or so on.”

As a replacement, he proposes a new series of tests that will fully account for all aspects of the complex nature of human intelligence – tests centered around Raven’s Progressive Matrices Test and the process of visual thinking as a whole. Ultimately, he hopes to design programs capable of passing human intelligence tests, but admits that such advances are still a long way off.

“Although wholly visual in nature, the Raven’s Progressive Test of intelligence measures general intelligence,” Goel told redOrbit. “Nevertheless, we can improve the test by including open-ended questions. For example, we can show computer a drawing and ask it questions about the drawing. Interpretation of drawings goes beyond just visual thinking and tests other aspects of general intelligence, and humans generally are good at interpreting drawings.”

“It is so good to see renewed interest in AI after a hiatus of several years,” he added. “With the invention of popular AI systems such as Apple’s Siri and IBM’s Watson, AI has recaptured the imagination of the society at large. I expect that AI systems likely will soon do as well on the Raven’s Progressive Test of intelligence as humans – in fact, some of our programs already are beginning to do so. However, manifesting human-level creativity, as in scientific discovery or engineering invention, likely will take longer.”
