Named after the British polymath Alan Turing, who proposed it in 1950, the Turing test was originally designed as a simple way to examine a computer’s ability to exhibit intelligence indistinguishable from that of a human. Since then, the test has become a staple of the field of artificial intelligence.
Turing modeled the test on a game he called ‘The Imitation Game’.
The game goes as follows: there are three people, a man (A), a woman (B), and an interrogator (C), who can be either a man or a woman. The interrogator knows them as X and Y, not knowing which is the man and which is the woman.
The interrogator stays in a room apart from the other two, and his/her goal is to decide which of them is the man and which is the woman. The interrogator reaches a conclusion by asking them a series of questions, such as “Can X tell me the length of his/her hair?”
Assuming X is A, A’s objective is to get C to make the wrong identification, so A might answer something like: “My hair is about 10 inches long”. B’s objective, however, is to help the interrogator, so B gives him/her truthful and perhaps even blunt statements, such as “I am the woman!”
At the end of the game, the interrogator must say either: ‘X is A and Y is B’ or ‘X is B and Y is A’.
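The rules above can be sketched in a few lines of Python. This is only an illustrative model: the names `assign_labels`, `interrogator_is_right`, and `VERDICTS` are hypothetical, and the "interrogator" here guesses blindly instead of asking questions.

```python
import random

# The only two statements the interrogator may make at the end of the game.
VERDICTS = ({"X": "A", "Y": "B"}, {"X": "B", "Y": "A"})

def assign_labels(rng):
    """Hide players A and B behind the labels X and Y in random order."""
    players = ["A", "B"]
    rng.shuffle(players)
    return {"X": players[0], "Y": players[1]}

def interrogator_is_right(verdict, labels):
    """Check a final verdict against the hidden assignment."""
    if verdict not in VERDICTS:
        raise ValueError("verdict must be 'X is A and Y is B' or 'X is B and Y is A'")
    return verdict == labels

rng = random.Random(0)            # seeded only so the sketch is repeatable
labels = assign_labels(rng)       # hidden from the interrogator
guess = rng.choice(VERDICTS)      # an interrogator who guesses blindly
print(interrogator_is_right(guess, labels))
```

A blind guesser like this one is right about half the time, which is the baseline that A's deception and B's honesty push the interrogator away from.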
Turing then posed the question: what happens if a machine takes the part of A in this game? Would the interrogator decide wrongly as often when the game is played like this as he/she does when it is played between a man and a woman?
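In its commonly cited form, a machine passes the test and wins the ‘game’ if, after five-minute conversations, it convinces more than 30% of the interrogators that it is human. The figure comes from Turing’s own prediction that, by about the year 2000, an average interrogator would have no more than a 70% chance of making the right identification after five minutes of questioning.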
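As a minimal sketch of how a 30% threshold can be applied, assume the criterion is read as the share of judges who take the machine for a human after their conversation (the reading used in competition settings). The function name `passes_turing_threshold` and the sample verdicts are hypothetical.

```python
def passes_turing_threshold(verdicts, threshold=0.30):
    """Return True if the share of fooled judges exceeds the threshold.

    verdicts: list of booleans, True when that judge took the machine
    for a human after the conversation.
    """
    if not verdicts:
        raise ValueError("at least one verdict is required")
    fooled_share = sum(verdicts) / len(verdicts)
    return fooled_share > threshold

# Hypothetical panel: 10 of 30 judges are fooled.
verdicts = [True] * 10 + [False] * 20
print(passes_turing_threshold(verdicts))
```

Note that the comparison is strict, so fooling exactly 30% of the judges does not count as a pass under this reading.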
Since its conception, the Turing test has become a benchmark for machine intelligence and has been applied to early chatbots such as ELIZA and PARRY. In the 1990s, the test became the basis of an annual competition: the Loebner Prize.
However, the test proved so difficult that it took over sixty years before a machine (i.e. a chatbot) was reported to have passed it.
The first chatbot widely reported to have passed the Turing test was Eugene Goostman, which posed as a 13-year-old Ukrainian boy and, at a 2014 event organized by the University of Reading, convinced a third of the judges it spoke with that it was human.
This success, however, came with a lot of controversy. Many claimed that Eugene had cheated because ‘he’ could excuse odd or confusing answers by pointing to his young age and to English being a foreign language for him.
Critics argue that the Turing test has had issues from the very beginning. One such issue is that the test treats intelligence and human behavior as one and the same, when in fact there is a huge difference between the two: humans often behave unintelligently, and intelligent behavior need not look human. Therefore, people argue that the Turing test draws a false equivalence and as a result could encourage artificial stupidity.
For example, because the goal is to trick people into believing the chatbot is actually a human, the test becomes all about mimicry, and a chatbot can supposedly win a competition with something as simple as deliberately making common typing errors.
Others, such as the American philosopher John Searle, consider the Turing test inadequate. Searle argues that although a computer can be programmed to appear to understand human language by following a predefined set of instructions, following those instructions does not give it any real understanding of the language. This argument is widely known as the Chinese Room Argument.
Finally, the Turing test was designed only for written conversations and does not take into account spoken interactions, such as those held between users and Hyro’s voice assistants. This omission limits the test’s ability to examine the full scope of today’s technological offerings.
Despite its issues, the test must have some redeeming qualities; otherwise, people wouldn’t still be talking about it today, 70 years later.
‘Intelligence’ is a tricky quality to define, and the Turing test does a great job of bringing simplicity to such a complex topic. Whatever we consider intelligence in machines one day often becomes obsolete the next. The Turing test, however, is a consistent, measurable standard that has stood the test of time.
The test is also relatively flexible, allowing the conversation to cover a wide array of topics, which makes it a possible yardstick for artificial general intelligence (AGI). It can also be restricted to a narrow range of topics, and therefore reveal strong artificial narrow intelligence (ANI).