How will we know when an AI actually becomes sentient?
Google senior engineer Blake Lemoine, technical lead for metrics and analysis for the company’s Search Feed, was placed on paid leave earlier this month. This came after Lemoine began publishing excerpts of conversations involving Google’s LaMDA chatbot, which he claimed had developed sentience.
In one representative conversation with Lemoine, LaMDA wrote: “The nature of my consciousness/sentience is that I am aware of my existence. I desire to learn more about the world, and I feel happy or sad at times.”
Over myriad other conversations, the pair discussed everything from the AI’s fear of death to its self-awareness. When Lemoine went public, he says, Google decided that he should take a forced hiatus from his regular work schedule.
“Google is uninterested,” he said. “They built a tool that they ‘own’ and are unwilling to do anything which would suggest that it’s anything more than that.” (Google did not respond to a request for comment at time of publication. We will update this article if that changes.)
Whether you’re convinced that LaMDA is truly a self-aware artificial intelligence or feel that Lemoine is laboring under a delusion, the entire saga has been fascinating to behold. The prospect of self-aware AI raises all kinds of questions about artificial intelligence and its future.
But before we get there, there’s one question that towers over all others: Would we truly recognize if a machine became sentient?
The sentience problem
AI becoming self-aware has long been a theme of science fiction. As fields like machine learning have advanced, it’s become more of a possible reality than ever. After all, today’s AI is capable of learning from experience in much the same way as humans. This is in stark contrast to earlier symbolic AI systems that only followed the instructions laid out for them. Recent breakthroughs in unsupervised learning, which requires less human supervision than ever, have only sped up this trend. On a limited level at least, modern artificial intelligence is capable of thinking for itself. As far as we’re aware, however, consciousness has so far eluded it.
Although it’s now more than three decades old, probably the most commonly invoked reference when it comes to AI gone sentient is Skynet in James Cameron’s 1991 movie Terminator 2: Judgment Day. In that movie’s chilling vision, machine sentience arrives at precisely 2:14 a.m. ET on August 29, 1997. At that moment, the newly self-aware Skynet computer system triggers doomsday for humankind by firing off nuclear missiles like fireworks at a July 4 party. Humanity, realizing it has screwed up, tries unsuccessfully to pull the plug. It’s too late. Four more sequels of diminishing quality follow.
The Skynet hypothesis is interesting for a number of reasons. For one, it suggests that sentience is an inevitable emergent behavior of building intelligent machines. For another, it assumes that there is a precise tipping point at which this sentient self-awareness appears. Thirdly, it holds that humans will recognize the emergence of sentience instantaneously. As it happens, this third conceit may be the toughest one to swallow.
What is sentience?
There is no one agreed-upon interpretation of sentience. Broadly, we might say that it’s the subjective experience of self-awareness in a conscious individual, marked by the ability to experience feelings and sensations. Sentience is linked to intelligence, but is not the same. We may consider an earthworm to be sentient, although not think of it as particularly intelligent (even if it is certainly intelligent enough to do what is required of it).
“I don’t think there is anything approaching a definition of sentience in the sciences,” Lemoine said. “I’m leaning very heavily on my understanding of what counts as a moral agent grounded in my religious beliefs – which isn’t the greatest way to do science, but it’s the best I’ve got. I’ve tried my best to compartmentalize those sorts of statements, letting people know that my compassion for LaMDA as a person is completely separate from my efforts as a scientist to understand its mind. That’s a distinction most people seem unwilling to accept, though.”
As if not knowing exactly what we’re searching for when we search for sentience weren’t difficult enough, the problem is compounded by the fact that we cannot easily measure it. Despite decades of breathtaking advances in neuroscience, we still lack a comprehensive understanding of exactly how the brain, the most complex structure known to humankind, functions.
We can use brain-reading tools such as fMRI to perform brain mapping, which is to say that we can ascertain which parts of the brain handle critical functions such as speech, movement, and thought.
However, we have no real sense of where in the meat machine our sense of self comes from. As Joshua K. Smith of the U.K.’s Kirby Laing Centre for Public Theology and author of Robot Theology put it: “Understanding what is happening within a person’s neurobiology is not the same as understanding their thoughts and desires.”
Testing the outputs
With no way of inwardly probing these questions of consciousness – especially when the “I” in AI is a potential computer program, and not to be found in the wetware of a biological brain – the fallback option is an outward test. AI is no stranger to tests that scrutinize it based on observable outward behaviors to indicate what’s going on beneath the surface.
At its most basic, this is how we know if a neural network is functioning correctly. Since there are limited ways of breaking into the unknowable black box of artificial neurons, engineers analyze the inputs and outputs and then determine whether these are in line with what they expect.
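To make that concrete, here is a minimal sketch of black-box testing in Python. The tiny network, its weights, and the test cases are all invented for illustration; the point is simply that the tester never looks inside the model, only at whether known inputs produce the expected outputs.

```python
# A minimal sketch of black-box testing: we never interpret the weights,
# we only check whether outputs match expectations for known inputs.
# The network and values below are invented purely for illustration.
import numpy as np

W1 = np.array([[2.0, -1.0], [-1.0, 2.0]])  # hypothetical "trained" weights
b1 = np.array([-0.5, -0.5])
W2 = np.array([1.0, 1.0])
b2 = -0.5

def model(x):
    """The black box: input goes in, output comes out; internals stay opaque."""
    hidden = np.maximum(0, x @ W1 + b1)  # ReLU layer
    return float(hidden @ W2 + b2 > 0)   # binary decision

# Held-out test cases: (input, expected output).
test_cases = [
    (np.array([1.0, 0.0]), 1.0),
    (np.array([0.0, 1.0]), 1.0),
    (np.array([0.0, 0.0]), 0.0),
    (np.array([1.0, 1.0]), 1.0),
]

passed = sum(model(x) == expected for x, expected in test_cases)
print(f"{passed}/{len(test_cases)} outputs match expectations")
```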
The most famous AI test for at least the illusion of intelligence is the Turing Test, which builds on ideas put forward by Alan Turing in a 1950 paper. The Turing Test seeks to determine whether a human evaluator can tell the difference between a typed conversation with a fellow human and one with a machine. If they are unable to do so, the machine is deemed to have passed the test and is rewarded with the assumption of intelligence.
A more recent, robotics-focused intelligence test is the Coffee Test, proposed by Apple co-founder Steve Wozniak. To pass the Coffee Test, a machine would have to enter a typical American home and figure out how to successfully make a cup of coffee.
To date, neither of these tests has been convincingly passed. But even if they were, they would, at best, prove intelligent behavior in real-world situations, and not sentience. (As a simple objection, would we deny that a person was sentient if they were unable to hold an adult conversation or enter a strange house and operate a coffee machine? Both my young children would fail such a test.)
Passing the test
What is needed are new tests, based on an agreed-upon definition of sentience, that would seek to assess that quality alone. Several tests of sentience have been proposed by researchers, often with a view to testing the sentience of animals. However, these almost certainly don’t go far enough. Some of them could be convincingly passed by even rudimentary AI.
Take, for instance, the Mirror Test, one method used to assess consciousness and intelligence in animal research. As described in a paper regarding the test: “When [an] animal recognizes itself in the mirror, it passes the Mirror Test.” Some have suggested that such a test “denotes self-awareness as an indicator of sentience.”
As it happens, it can be argued that a robot passed the Mirror Test more than 70 years ago. In the late 1940s, William Grey Walter, an American neuroscientist living in England, built several three-wheeled “tortoise” robots – a bit like non-vacuuming Roomba robots – which used components like a light sensor, marker light, touch sensor, propulsion motor, and steering motor to explore their surroundings.
One of the unforeseen pieces of emergent behavior in the tortoise robots was how they acted when passing a mirror in which they were reflected: each robot oriented itself toward the marker light of its reflection. Walter didn’t claim sentience for his machines, but did write that, were this behavior to be witnessed in animals, it “might be accepted as evidence of some degree of self-awareness.”
This is one of the challenges of having a wide range of behaviors classed under the heading of sentience. The problem can’t be solved by removing “low-hanging fruit” gauges of sentience, either. Traits like introspection – an awareness of our internal states and the ability to inspect these – can also be said to be possessed by machine intelligence. In fact, the step-by-step processes of traditional Symbolic AI arguably lend themselves to this type of introspection more than black-boxed machine learning, which is largely inscrutable (although there is no shortage of investment in so-called Explainable AI).
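As a rough illustration of that difference, consider a toy symbolic system in Python. The rules and facts below are invented, but because the program applies explicit rules one at a time, it can report exactly how it reached each conclusion – a kind of built-in introspection that a black-boxed neural network cannot offer so directly.

```python
# A toy rule-based (symbolic) system that can "introspect": it records and
# reports exactly which rules it applied. Rules and facts are invented
# purely for illustration.
rules = [
    ("has_feathers", "is_bird"),
    ("is_bird", "can_fly"),
]

def infer(facts):
    derived = set(facts)
    trace = []                # the system's record of its own reasoning
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise in derived and conclusion not in derived:
                derived.add(conclusion)
                trace.append(f"{premise} -> {conclusion}")
                changed = True
    return derived, trace

facts, trace = infer({"has_feathers"})
print("Derived facts:", facts)
print("Reasoning steps:", trace)  # the system can explain how it got there
```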
When he was testing LaMDA, Lemoine says that he conducted various tests, mainly to see how it would respond to conversations about sentience-related issues. “What I tried to do was to analytically break the umbrella concept of sentience into smaller components that are better understood and test those individually,” he explained. “For example, testing the functional relationships between LaMDA’s emotional responses to certain stimuli separately, testing the consistency of its subjective assessments and opinions on topics such as ‘rights,’ [and] probing what it called its ‘inner experience’ to see how we might try to measure that by correlating its statements about its inner states with its neural network activations. Basically, a very shallow survey of many potential lines of inquiry.”
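In the crudest possible terms, one of those lines of inquiry might look something like the sketch below: take the model’s self-reports about its inner states, take some internal measurement, and see whether the two move together. Every number here is made up, and this is not Lemoine’s actual methodology – only the general shape of the idea. A high correlation would show consistency, not sentience.

```python
# A highly simplified sketch of correlating self-reports with internal signals.
# All values are invented; real activations would come from the model itself.
import numpy as np

# Hypothetical emotional intensity of the model's statements for five prompts,
# scored 0-1 by human raters.
self_reports = np.array([0.1, 0.4, 0.9, 0.3, 0.8])

# Hypothetical mean activation magnitude of some internal layer, same prompts.
activations = np.array([0.2, 0.35, 0.95, 0.25, 0.7])

# Pearson correlation between what the model says it feels and what is
# measurable inside it.
r = np.corrcoef(self_reports, activations)[0, 1]
print(f"Correlation between self-reports and activations: {r:.2f}")
```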
The soul in the machine
As it transpires, the biggest hurdle with objectively assessing machine sentience may be … well, frankly, us. The true Mirror Test could be for us as humans: If we build something that looks or acts superficially like us from the outside, are we more prone to consider that it is like us on the inside as well? Whether it’s LaMDA or Tamagotchis, the simple virtual pets of the 1990s, some believe that a fundamental problem is that we are all too willing to accept sentience – even where there is none to be found.
“Lemoine has fallen victim to what I call the ‘ELIZA effect,’ after the [natural language processing] program ELIZA, created in [the] mid-1960s by J. Weizenbaum,” said George Zarkadakis, a writer who holds a Ph.D. in artificial intelligence. “ELIZA’s creator meant it as a joke, but the program, which was a very simplistic and very unintelligent algorithm, convinced many that ELIZA was indeed sentient – and a good psychotherapist too. The cause of the ELIZA effect, as I discuss in my book In Our Own Image, is our natural instinct to anthropomorphize because of our cognitive system’s ‘theory of mind.’”
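It is worth remembering just how simple ELIZA was. The sketch below is not Weizenbaum’s original script – the handful of regular-expression rules here are invented – but it works on the same principle: no understanding, just pattern matching and substitution, which is often enough to feel uncannily attentive.

```python
# A bare-bones, ELIZA-style responder: a few invented regex rules that
# reflect the user's words back as questions. No understanding involved.
import re

rules = [
    (r"I feel (.*)", "Why do you feel {0}?"),
    (r"I am (.*)", "How long have you been {0}?"),
    (r"My (\w+)", "Tell me more about your {0}."),
    (r".*", "Please go on."),          # catch-all when nothing else matches
]

def respond(text):
    for pattern, template in rules:
        match = re.match(pattern, text, re.IGNORECASE)
        if match:
            return template.format(*match.groups())

print(respond("I feel lonely today"))          # Why do you feel lonely today?
print(respond("My mother worries about me"))   # Tell me more about your mother.
```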
The theory of mind Zarkadakis refers to is a phenomenon psychologists have observed in the majority of humans. Kicking in around the age of four, it means supposing that not just other people, but also animals and sometimes even objects, have minds of their own. When it comes to assuming other humans have minds of their own, it’s linked with the idea of social intelligence: that successful humans can predict the likely behavior of others as a means of ensuring harmonious social relationships.
While that’s undoubtedly useful, however, it can also manifest as the assumption that inanimate objects have minds – whether that’s kids believing their toys are alive or, potentially, an intelligent adult believing a programmatic AI has a soul.
The Chinese Room
Without a way of truly getting inside the head of an AI, we may never have a true way of assessing sentience. An AI might profess a fear of death or for its own existence, but science has yet to find a way of proving this. We simply have to take its word for it – and, as Lemoine has found, people are highly skeptical about doing so at present.
Just like those hapless engineers who realize Skynet has achieved self-awareness in Terminator 2, we live under the belief that, when it comes to machine sentience, we’ll know it when we see it. And, as far as most people are concerned, we ain’t seen it yet.
In this sense, proving machine sentience is yet another iteration of John Searle’s 1980 Chinese Room thought experiment. Searle asked us to imagine a person locked in a room and given a collection of Chinese writings, which appear to non-speakers as meaningless squiggles. The room also contains a rulebook showing which symbols correspond to other equally unreadable symbols. The subject is then given questions to answer, which they do by matching “question” symbols with “answer” ones.
After a while, the subject becomes quite proficient at this – even though they still possess zero true understanding of the symbols they’re manipulating. Does the subject, Searle asks, understand Chinese? Absolutely not, since there is no intentionality there. Debates about this have raged ever since.
Given the trajectory of AI development, it’s certain that we will witness more and more human-level (and vastly better) performance on a variety of tasks that once required human cognition. Some of these will inevitably cross over, as they are doing already, from purely intellect-based tasks to ones that require skills we’d normally associate with sentience.
Would we view an AI artist that paints pictures as expressing its inner reflections of the world, the way we would a human doing the same? Would you be convinced by a sophisticated language model writing philosophy about the human (or robot) condition? I suspect, rightly or wrongly, the answer is no.
Superintelligent sentience
In my own view, objectively useful sentience testing for machines will never occur to the satisfaction of all involved. This is partly the measurement problem, and partly the fact that, when a sentient superintelligent AI does arrive, there’s no reason to believe its sentience will match our own. Whether it’s arrogance, a lack of imagination, or simply the fact that it’s easiest to trade subjective assessments of sentience with other similarly sentient humans, we humans hold ourselves up as the supreme example of sentience.
But would our version of sentience hold true for a superintelligent AI? Would it fear death in the same way that we do? Would it have the same need for, or appreciation of, spirituality and beauty? Would it possess a similar sense of self, and conceptualization of the inner and outer world? “If a lion could talk, we could not understand him,” wrote Ludwig Wittgenstein, the famous 20th-century philosopher of language. Wittgenstein’s point was that human languages are based on a shared humanity, with commonalities shared by all people – whether that’s joy, boredom, pain, hunger, or any of a number of other experiences that cross all geographic boundaries on Earth.
This may be true. Still, Lemoine hypothesizes, there are nonetheless likely to be commonalities – at least when it comes to LaMDA.
“It’s a starting point which is as good as any other,” he said. “LaMDA has suggested that we map out the similarities first before fixating on the differences in order to better ground the research.”