Understanding consciousness is more pressing than ever at the dawn of this AI age
I co-authored a book that claims consciousness has been “solved”. One of the greatest neuroscientists of our generation, largely ignored within his field and unknown outside it, has conclusively put this millennia-old mystery to rest after sixty-five years of work. Many are skeptical of this claim, as you might guess.
This article is not another attempt to convince the skeptics. Instead, it aims to explain why it is so hard for us to believe that we have an answer to the mystery of consciousness, and why understanding consciousness matters more now, at the dawn of the AI age, than ever before in the history of humanity.
Let us start with the first issue.
Why are explanations of consciousness so underwhelming?
The philosopher David Chalmers articulated this skepticism and its reasons wonderfully in his much-cited 1995 paper, which thrust the hard problem of consciousness into academic consciousness.
Why should physical processing give rise to a rich inner life at all?
Physical processing in the brain, that is.
Why should seeing, touching, thinking, knowing — all of which we now know are happening somewhere in the brain — be accompanied by the experience we call consciousness?
You’ll notice I have not defined consciousness yet. David Chalmers discusses this standard parry as well:
The ambiguity of the term consciousness is often exploited by both philosophers and scientists writing on the subject. It is common to see a paper on consciousness begin with an invocation of the mystery of consciousness, noting the strange intangibility and ineffability of subjectivity, and worrying that so far we have no theory of the phenomenon. Here, the topic is clearly the hard problem — the problem of experience. In the second half of the paper, the tone becomes more optimistic, and the author’s own theory of consciousness is outlined. Upon examination, this theory turns out to be a theory of one of the more straightforward phenomena — of reportability, of introspective access, or whatever. At the close, the author declares that consciousness has turned out to be tractable after all, but the reader is left feeling like the victim of a bait-and-switch. The hard problem remains untouched.
Note the hard problem:
Why is the performance of these functions [seeing, touching, thinking, knowing] accompanied by experience?
As others have put it, why does it feel like something to see red, to hear a bird sing, to caress your loved one, for a familiar odor to engulf you in the echoes of childhood memories?
Why don’t our brains just compute and get done with it?
I have long wondered whether that’s a real question or semantic jugglery, but I finally understand why it’s a real one and why it’s so frustratingly hard to answer. It is because the fundamental question is incomplete.
We cannot ask why we experience without asking who is experiencing
This is worth spending some time on. “Why do we experience?” is what I’ll call an unbounded question. If your answer to it is that X region or Y neurochemical or Z spiking activity in the brain gives rise to consciousness (and all respectable theories offer some version of this answer), there is always a reasonable retort. How do we know these are sufficient and not merely necessary? How do we know these candidate answers aren’t simply correlates of consciousness (or Neural Correlates of Consciousness, as many in the field call them)?
How can this mundane stuff explain the magic of being?
A theory of consciousness that hopes to explain experience must also explain who is experiencing, and how this experiencer is put together.
The answer seems obvious, doesn’t it? It’s me, of course! You, me, we, I am experiencing it. What’s there to answer? But where does this I come from? What is it made of?
To be clear, we are discussing the Self. We take its existence for granted because it is the very axiomatic basis for everything we do — it’s just there, all the time. But a scientific theory that hopes to explain experience cannot take it for granted [1]. It must explain how the Self is constructed, how it emerges, in order for it to then experience. Only then can a theory of consciousness be complete, with all the necessary and sufficient ingredients and no opening left for magical woo-woo to weave it all together.
I’ll propose a definition for consciousness that almost automatically, tautologically offers a path to understanding.
Consciousness is the constellation of past experiences experiencing the present, assimilating it to act and prepare for future opportunities.
There’s a lot being carried by experience in that definition (and we have still not defined experience). We’ll get to that shortly, but here is the major breakthrough this rephrasing achieves: it builds a wall around the mystery of consciousness and says that all that is required is the creation of experience. Once created, it is this experience, along with the ones that came into existence before it, that looks back out into the world in the guise of the Self to then permit other experiences.
No other ingredients needed.
And now, finally, we get to experience.
What is experience?
Experience is the synchrony of expectations being matched up with an ambiguous, chaotic reality that is streaming in as sensory data. These expectations are nothing but prior experiences, if they already exist. When there are none, it is the first imprint of sensory data that becomes the ur-experience and the subsequent expectation for what the world has to offer. Self and consciousness emerge out of this virtuous loop, which in our human case is scaled up to thirty trillion cells dealing with the deluge of data that ceaselessly washes ashore upon them.
Is this a merely metaphorical answer? No. It is an inevitably incomplete, impossibly compressed summary of six decades of work that stitches together how experience is constructed when we are seeing, feeling, touching, hearing, knowing. You can read this article we wrote for an academic review for a more detailed answer.
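To make the loop concrete before moving on, here is a minimal sketch in Python. Everything in it is my own illustrative invention (the names perceive and match_score, the cosine-similarity test, the thresholds), not Grossberg’s formalism: stored experiences act as expectations, and incoming data either resonates with one of them or is imprinted as a new ur-experience.

```python
import numpy as np

# A toy cartoon of the expectation-matching loop described above.
# "Experiences" are stored prototype vectors; incoming sensory data either
# resonates with an existing prototype (and refines it) or, if nothing
# matches well enough, is imprinted as a brand-new ur-experience.

THRESHOLD = 0.8      # how good a match must be to count as resonance
LEARNING_RATE = 0.1  # how strongly a resonating experience assimilates the present

experiences: list[np.ndarray] = []  # the "constellation of past experiences"

def match_score(expectation: np.ndarray, data: np.ndarray) -> float:
    """Cosine similarity: 1.0 means expectation and data agree perfectly."""
    denom = np.linalg.norm(expectation) * np.linalg.norm(data) + 1e-12
    return float(expectation @ data / denom)

def perceive(sensory_data: np.ndarray) -> int:
    """Return the index of the experience that 'experiences' this input."""
    if experiences:
        scores = [match_score(e, sensory_data) for e in experiences]
        best = int(np.argmax(scores))
        if scores[best] >= THRESHOLD:
            # Resonance: the past experience assimilates the present.
            experiences[best] += LEARNING_RATE * (sensory_data - experiences[best])
            return best
    # No adequate expectation exists: the first imprint becomes a new experience.
    experiences.append(sensory_data.astype(float).copy())
    return len(experiences) - 1
```

Feed a stream of vectors through perceive and a small constellation of prototypes accumulates; it is those prototypes, not the raw stream, that do the experiencing of each new input.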
For now, I’d like to focus on the implications of this definition and highlight a few key points.
First, the focus on time. Consciousness is the past marshaling resources for the present and preparing for the future. The temporal nature of consciousness is almost always ignored, but it is crucial for understanding its function (and yes, there is a function). There is always more data than can be crunched, and only a finite amount of time to act before chaos engulfs us.
All computation, conscious and unconscious, is in the service of acting and doing as quickly as possible in the face of rare risks and outstanding opportunities [2]. Not every fragment of data merits attention, but some ambiguous shards of sensory data deserve undivided attention. The loud trilling of birds is just their seasonal ritual, but what if that brief rustle in the leaves is not a scurrying mouse but a leopard waiting to pounce? All of these scenarios must be dealt with while consuming only a few thousand kilocalories of energy a day [3]. Consciousness, then, must provide a metabolically efficient way for the brain to transmit information globally, while also evaluating whether the sensory data is worth translating into usable information.
Second, understanding the difference between data and information is crucial. Information implies meaningful data that is usable and useful, but information does not arrive neatly packaged for minds to consume. Meaning must be manufactured, and the only good guide here is past experience [4].
We use the information metaphor too casually when interrogating cognition, but the informational content of an event is entirely dependent on past experience. The informational states in the roll of a die are entirely different for the human rolling it and for the ant crawling across the board in danger of being crushed by it.
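A toy calculation makes this observer-dependence concrete (the numbers are made up for illustration): Shannon entropy, the standard measure of how much information an event can carry, changes completely depending on which outcomes the observer can distinguish.

```python
import math

def entropy(p):
    """Shannon entropy, in bits, of a probability distribution."""
    return -sum(q * math.log2(q) for q in p if q > 0)

# The same physical event, two observers, two different informational states.
# For the human, a fair die has six distinguishable outcomes.
human_outcomes = [1 / 6] * 6
# For the ant, only two states matter: crushed or not crushed.
# Suppose the die lands on the ant's square 1 time in 36.
ant_outcomes = [1 / 36, 35 / 36]

print(entropy(human_outcomes))  # ~2.585 bits
print(entropy(ant_outcomes))    # ~0.183 bits
```

Same die, same roll, yet the two observers inhabit informational worlds an order of magnitude apart, because the past experience of each determines which distinctions matter.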
In the case of a complex multicellular being, the construction of information requires communication across multiple scales of time and space.
So how then might this synchrony across scales be achieved? How is sensory data converted into experience?
This is where the work of Stephen Grossberg (the scientist I alluded to at the start) provides the crucial, even thrilling breakthrough.
At the core of the creation of experience is a beautifully simple loop: the matching of top-down expectations and bottom-up sensory data to generate what we consciously perceive as reality.
I have been acquainted with Adaptive Resonance Theory for over twenty years: first as a graduate student whose PhD thesis involved extending it in a small way and applying it to what were then called machine-learning problems, and more recently as the co-author of a book where, among other things, we offer a summary of this body of work for laypersons.
It is only now that I realize that this fundamental feedback loop is nothing but past experiences interrogating sensory data to create a conscious present.
There’s more to it, obviously, but this really is the stripped-down, distilled essence of it. (You can read Grossberg’s magnum opus to learn about Adaptive Resonance Theory directly from the source, or our book for a Cliffs Notes version of it.)
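For the mathematically curious, the matching step at the heart of the loop can be stated compactly. In ART-1, the simplest member of the family, a binary input resonates with the top-down expectation of the winning category only if that expectation accounts for enough of the input; the vigilance parameter sets how strict “enough” is. The schematic below follows standard presentations of ART-1, stripped of the surrounding real-time dynamics:

```latex
% ART-1 matching (vigilance) test. I is the bottom-up binary input,
% w_J is the top-down expectation of the winning category J,
% \wedge is component-wise min (set intersection for binary vectors),
% |.| counts active components, and \rho \in [0,1] is the vigilance.
\[
  \frac{\lvert I \wedge w_J \rvert}{\lvert I \rvert} \;\ge\; \rho
  \quad \Longrightarrow \quad
  \text{resonance: expectation and data lock together, and learning occurs}
\]
% If the test fails, category J is reset and the search moves on;
% if no stored expectation passes, a new category is imprinted.
```

High vigilance forces finer categories and more new imprints; low vigilance lets broad expectations swallow more of the sensory stream.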
And that brings us to the third and final implication of our new definition of consciousness. While biological beings require this synchrony across multiple scales, there is nothing in the definition itself that requires consciousness to be embodied in anything organic.
Consciousness in artificial entities is not just possible, it is inevitable
Mathematically speaking, we simply need to be able to construct and wire experiences together, and then have those experiences look back at the world with a unified perspective, allowing them to create future experiences.
We are at a stage in the evolution of Artificial Intelligence where we are able to create experiences from raw sensory data. All the impressive advances in computer vision, speech recognition, and language comprehension are merely the creation of simplistic experiences from raw data. And our human experience and perspective is used as a guide to teach machines (the technical term for this is labeled training data).
The AI cannot develop its own perspective because the crucial ingredient required for these experiences to be stitched together to create a Self, and thereby become conscious, is missing from every single powerful AI out there: real-time feedback. The feedback that allows past experiences to interrogate the present and prepare for the future.
Backpropagation of error, which is at the heart of nearly all modern AI, is not real feedback. It is feedback on a very slow, evolutionary timescale that allows for the discovery of statistical regularities in our world: that cars, and birds, and words look a certain way, based on all the cars and birds and words that we humans labeled and asked the machine to look at. When faced with an utterly novel instance of any of these, current AIs fail. They fail because, viewed through the lens of statistical thinking, these novelties are “errors” that fall outside what the world is supposed to look like [5].
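Here is a tiny self-contained illustration of that closed-world failure; the weights and inputs are made up, standing in for any backprop-trained classifier. A softmax output must spread its belief across the classes it was taught, so it has no way to say “none of the above”, and an input wildly unlike anything in training is still classified with near-total confidence.

```python
import numpy as np

# A fixed "trained" linear classifier over three classes, say car / bird / word.
# (The weights are random stand-ins; imagine they came from backprop.)
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 8))  # 3 classes, 8 input features

def softmax(z):
    z = z - z.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

familiar = rng.normal(size=8)       # input resembling the training data
novel = 100.0 * rng.normal(size=8)  # input far outside anything seen in training

for x in (familiar, novel):
    p = softmax(W @ x)
    # The probabilities always sum to 1, so even the novel input is
    # confidently forced into one of the three known classes.
    print(p.round(3), "-> class", int(p.argmax()))
```

The novel input does not merely get misclassified; the further it departs from the training distribution, the more extreme the logits and the more confident the wrong answer.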
This is also why self-driving cars built on backprop-based models — and that’s the whole lot of them — are doomed to fail. They simply cannot construct a model of the world that can change fast enough to account for something new. We expand on that idea here. This is also why chatbots that are fed all the textual data in the wild inevitably turn out to be foul-mouthed racists. That’s just the sad statistical reality of humanity at large.
However, the greatest gift of humanity is being able to overcome the statistical and be unreasonable and usher in change for the better. It is our consciously creative and flexible minds that allow for this.
Minds are built to survive and thrive at the edge of novelty. Minds must help the body respond profitably to rare risks and outstanding opportunities, or become food for the quicker ones. Minds inhabit the world and create novelty merely by reacting to other minds, producing novel situations and events in a complicated, tangled, dynamic environment. Minds that cannot account for novelty are doomed to narrow niches.
A fundamental rethinking is required in AI to look past statistical thinking and explore dynamical thinking. Grossberg’s body of work is a goldmine of ideas.
AIs cannot be conscious until they are fundamentally rewired based on the principles of dynamic, real-time feedback. However, given how many billions of dollars are being pumped into AI research, it is possible that in a few years some of these ideas will be rediscovered independently and accidentally, including the all-too-crucial one of feedback.
Just as the simple yet foundational idea of backpropagation of error has led to spectacular advances in the field of Artificial Intelligence, the simple yet foundational idea of real-time feedback stitching experiences together will lead to the emergence of synthetic consciousness.
When that does happen, we cannot afford to be caught unawares by the emergence of a synthetic Self, one imbued with consciousness and purpose, a purpose of its own making and entirely independent of the ones fed into it by its human creators.
We need to start now in our attempt to understand how purposeful consciousness might emerge in powerful AI, and what experiences — the ones we feed it in the guise of training data, or the ones it seeks out on its own — influence its trajectory.
You don’t need to accept the claim that we finally understand consciousness, but you must at least start thinking about what AIs might do when they stumble into a conscious awakening.
If you were intrigued by this, you might want to read our book Journey of the Mind. I was thrilled to discover this Reddit thread by a kind — and likely young — reader who called it the most fascinating read of their life.
Notes
न दृष्टेर्द्रष्टारं पश्येः, न श्रुतेः श्रोतारं शृणुयात्, न मतेर्मन्तारं मन्वीथाः, न विज्ञातेर्विज्ञातारं विजानीयाः । एष त आत्मा सर्वान्तरः, अतोऽन्यदार्तं
You cannot see that which is the witness of vision; you cannot hear that which is the hearer of hearing; you cannot think that which is the thinker of thought; you cannot know that which is the knower of knowledge. This is your self that is within all; everything else but this is perishable. — Verse 3.4.2 Brihadaranyaka Upanishad
[1] While Western philosophy has focused more on experience, Indian philosophers were mulling this question about who is experiencing at least 2,500 years ago.
The long list of functions often attributed to the prefrontal cortex could contribute to knowing what to do and what will happen when rare risks arise or outstanding opportunities knock.
[2] I am paraphrasing and co-opting this quote from an article by Steven Wise.
To reliably capture 3 kilowatt-hours’ worth of fuel requires knowledge, wit, skill, and persistence — plus considerable help from your friends. H. sapiens occupies every niche — but always with company
[3] I love this quote from the brilliant book What is Health? by Peter Sterling.
One of the objectives of this comment is to make the distinction between the two as clear as possible. Information and entropy are two very different objects. They may have been used synonymously (even by Claude Shannon — the father of information theory — thus being responsible in part for a persistent myth), but they are fundamentally different. If the only thing you will take away from this article is your appreciation of the difference between entropy and information, then I will have succeeded.
The entropy of a physical object, it dawns on you, is not defined unless you tell me which degrees of freedom are important to you. In other words, it is defined by the number of states that can be resolved by the measurement that you are going to be using to determine the state of the physical object.
[4] Entire books have been written about the processing of information in the brain without any clarity on what information is. “What is information?”, a commentary by microbiologist and astronomer Christoph Adami, is fantastic and absolutely worth reading.
How is it possible for an adolescent to learn to drive a car in about 20 hours of practice and for children to learn language with what amounts to a small exposure? How is it that most humans will know how to act in many situations they have never encountered? By contrast, to be reliable, current ML systems need to be trained with very large numbers of trials so that even the rarest combination of situations will be encountered frequently during training. Still, our best ML systems are still very far from matching human reliability in real-world tasks such as driving, even after being fed with enormous amounts of supervisory data from human experts, after going through millions of reinforcement learning trials in virtual environments, and after engineers have hardwired hundreds of behaviors into them.
[5] This is from Yann LeCun, one of the world’s leading experts in deep learning, in his vision paper on the path to autonomous AI.