
Posted by Sukant Khurana in Specials
February 8, 2018

Neanderthals could not find an off-switch on humans; would humans be able to find one on artificial general intelligence (if it is ever invented)?

by Aadhar Sharma, Raamesh Gowri Raghavan, and Sukant Khurana

(Note: This article is a rehash of ideas Sukant and Aadhar have presented elsewhere before.)

Photo by Markus Spiske on Unsplash

Intelligent machines, no longer a mere work of fiction, aid us in efforts of all kinds, from the synthesis of art to automobile manufacturing. They are designed for expertise in only one or a small number of domains. At present, specialized intelligence is becoming common, but general machine intelligence remains a dream (or a nightmare).

Artificial General Intelligence (henceforth AGI; used here synonymously with ultra-intelligence or super-intelligence) has been characterized in fiction as intelligence that can perform any task at or above human level. General intelligence doesn’t necessarily have to be modeled on humans. It only has to accomplish the task at hand somehow.

While deciphering the Enigma machine, Alan Turing was assisted by a team of mathematicians, with Irving John Good as the chief statistician. A decade later, Turing speculated about AI; he believed early on that we should expect machines to take over. I.J. Good was the first to comment articulately on an AGI reproducing itself. An excerpt from his 1965 paper, “Speculations Concerning the First Ultraintelligent Machine”, is often quoted [1]:

“Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultra-intelligent machine could design even better machines; there would then unquestionably be an “intelligence explosion,” and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control. It is curious that this point is made so seldom outside of science fiction. It is sometimes worthwhile to take science fiction seriously.”

Isaac Asimov, Philip K. Dick, Arthur C. Clarke, and others have created great works of science fiction that closely knit together the ideas of AGI, human interaction, and traditional melodrama. Though exaggerated, these works do illuminate some of the key concerns mentioned by Good. Even the philosophical terrain is foggy: it is not clear whether progress toward AGI should be called ‘doom’ or a reprieve from it. It matters a great deal to establish whether this path is merely blurry or truly murky. That question motivates us to seek strategies for understanding AGI better and for enabling its ethical development and safe deployment.

Possible concerns:

Let’s assume that we succeed in building an AGI after years of training, and deploy it in the form of a robotic servant that is supposed to keep its owner happy and smiling. An AGI that wants nothing more than to keep its owner happy might implant electrodes to elicit a Duchenne smile, or stimulate the nucleus accumbens (the pleasure center of the brain) to provide kicks of pure, orgasmic gratification [2]. Eliciting a Duchenne smile is one way to accomplish the goal with little effort: stimulating the facial muscles makes the owner appear to be smiling. One might argue that the job was done flawlessly, but was it really? This simple thought experiment should be taken seriously. The algorithm did accomplish its task, yet it takes no deep introspection to see that its behavior was unethical. Evidently, great care needs to go into designing, analyzing, and deploying AGI.

Physicist Max Tegmark believes that AGI is theoretically achievable, since intelligence is fundamentally information processing by an arrangement of atoms acting under the laws of physics. “It is theoretically possible”, agrees Bill Gates. Many estimates hold that AGI is also practically achievable and could arrive anywhere between ten and fifty years from now. Given the current pace of AI, and assuming that we have a few decades to plan strategically for its arrival, is that enough? Views on this question are divided. Some, like Nick Bostrom, believe that it is high time we invested in ethics and philosophy as part of superintelligence research. Another camp believes that we have plenty of time to ponder; Andrew Ng compared the concern to worrying about overpopulation on Mars. A third camp believes that, though an AGI may arrive in a few decades, there is no need to worry, since the off-switch is our assurance of control. We would argue that AI calls for ethical norms and philosophical work in addition to technical development.

The Control Problem: off-switch

“If a machine can think, it might think more intelligently than we do, and then where should we be? Even if we could keep the machines in a subservient position, for instance by turning off the power at strategic moments, we should, as a species, feel greatly humbled. … [T]his new danger is certainly something which can give us anxiety.”

Turing described the control problem of an AGI in 1951, in a BBC radio lecture [5]. Even the idea of an off-switch serves as a way of dodging the uncomfortable prospect of a malevolent AI. “Where is the off-switch?” [6] asks Nick Bostrom; he argues that there may be no such thing, and if there were one, why couldn’t the Neanderthals find ours? We are intelligent adversaries who can anticipate actions and plan around them, but so can an AGI, perhaps much better than us. If we run an AGI on a virtual machine to quarantine it, can we be sure that it cannot exploit a bug in the code to its advantage? Ignoring the issue is not a good strategy; a better approach would be to create an AI that, even if it evades containment, stays aligned with human values (or whatever values are engineered into it). Defining those values is yet another ethical and technical challenge.

Norbert Wiener, a pioneer of cybernetics, noted that it could become immensely hard to control learning machines, as they may develop strategies too complicated for their programmers to anticipate. He wrote:

“Complete subservience and complete intelligence do not go together.”

“The future will be an ever more demanding struggle against the limitations of our intelligence, not a comfortable hammock in which we can lie down to be waited upon by our robot slaves.”

While Wiener’s comment is ominous, the appeal of AGI is too powerful to resist. However it is eventually deployed, its behavior must be ethically and morally justifiable. Efforts have been made to identify these obligations, and several paths and strategies have been proposed for approaching AGI safely. Stuart Russell, a professor of computer science at UC Berkeley, pitches a three-point action plan for creating a human-compatible AI. One, the AI should be altruistic towards humanity and human values; two, since it initially does not know what human values are, it should avoid any single-minded pursuit of its objective; three, it should learn values by observing humans. According to him, even if the AGI anticipates the off-switch, it should not form malevolent intentions against its owner. He suggests that the machine should reason as follows:

1. The human may switch me off.

2. …but only if I am doing something wrong.

3. I don’t know what “wrong” is, but I should refrain from doing anything in the category of “wrong.”

4. Therefore, I should let her switch me off.

A robot following such an action plan may respect human values and conform to its ethical obligations. However, it is unclear how efficient such an AGI would be in practice; this kind of planning may introduce interaction delays that render it virtually non-functional.
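The four-step reasoning above can be illustrated with a toy expected-utility comparison, loosely in the spirit of the "off-switch game" that Russell and his colleagues have studied. This is only a sketch under stated assumptions: the function names, the sampled beliefs, and the idea that the human always vetoes exactly the harmful actions are hypothetical simplifications introduced here, not something taken from the article.

```python
import random


def expected_values(utility_samples):
    """Compare the robot's three options when it is unsure about utility.

    utility_samples: draws from the robot's (hypothetical) belief about how
    much the human values the proposed action. Positive means the action is
    good; negative means it falls in the category of "wrong".
    """
    n = len(utility_samples)
    # Option 1: act immediately, ignoring the off-switch.
    act_now = sum(utility_samples) / n
    # Option 2: switch itself off and do nothing.
    switch_self_off = 0.0
    # Option 3: defer to the human. ASSUMPTION: the human knows the true
    # utility, lets good actions proceed, and switches the robot off before
    # bad ones, so each sample pays max(u, 0).
    defer = sum(max(u, 0.0) for u in utility_samples) / n
    return {
        "act_now": act_now,
        "switch_self_off": switch_self_off,
        "defer": defer,
    }


def best_option(utility_samples):
    """Return the option with the highest expected value (first wins ties)."""
    values = expected_values(utility_samples)
    return max(values, key=values.get)


if __name__ == "__main__":
    rng = random.Random(0)
    # A robot that thinks the action is probably good but is far from sure.
    uncertain_belief = [rng.gauss(0.5, 2.0) for _ in range(10_000)]
    print(expected_values(uncertain_belief))
    print(best_option(uncertain_belief))
```

Under these assumptions, deferring is never worse than acting or shutting down, and it is strictly better whenever the robot believes the action might be either good or bad. Once the robot is certain the action is good, deferring gains nothing, which is one way to read the point that uncertainty about human values is what makes the off-switch acceptable to the machine.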

Photo by Fabian Grohs on Unsplash

Value Alignment

The AI alignment problem is central to the construction of a safe and ethical AGI. Identifying values, providing them to the AGI, and keeping it aligned with them is the chief concern here. The idea is that an AI aligned with human values would be expected to behave like a human and would, therefore, be accommodated and accepted in society. However, it is not clear how one should go about quantifying or qualifying human values, since the notion itself is very fuzzy. Human values differ between people, cultures, and countries, and with such a broad distribution it is not clear how to select the appropriate values for alignment.

A world where AGI is not aligned with human values is not hard to envision; a simple analogy is that of humans and gorillas. These gentle beasts are our second-closest biological relatives; they are immensely muscular and could easily kill a strongman, yet their fate as a species now depends wholly upon us. We outsmart, dominate, poach, and habituate them for our own greedy motives. Harambe, a male silverback gorilla at the Cincinnati Zoo, was shot and killed because a human child climbed into his enclosure; he first dragged the child by the leg and later held the child close, which some animal behaviorists believe to be a sign of protection. While Harambe’s behavior was debated among primatologists, his killing drew mass criticism and raised him to the status of a cultural icon.

How would an AGI respond if a human were in Harambe’s place? It is crucial to embed human values to aid decision making in such situations, keeping in mind ethical and moral obligations not only towards humans but also towards AI. AI’s prospects in society would not just be tarnished but might vanish altogether if a single such incident happened. It is therefore a matter of grave seriousness and urgency to align AGI with human values, no matter how hard that is.

Nick Bostrom is a professor of philosophy at the University of Oxford and the author of the critically acclaimed book ‘Superintelligence: Paths, Dangers, Strategies’. He wrote ‘Requiem’ [7], a poem in Swedish, describing a medieval-era brigadier who oversleeps for some centuries. Waking from his slumber, he finds that his men have deserted the camp. Out of self-righteousness and dedication to his duties, he adjusts his saber and races his horse to the battlefield. While the horse gallops across streams and fields, he hears a roaring thunder, the boom of a fighter jet that sweeps past him, making him aware of the reality. The brigadier realizes that he has become obsolete; his self-esteem crashes down like a quail that has been shot.

While the poem doesn’t speak about AI, one can see the analogy with a possible future with AGI. Super-intelligence may lead to an “intelligence explosion”, as pointed out by I.J. Good: a system that recursively improves itself and produces more intelligent (sub-)systems that may in turn do the same. This exponential growth could reach a point of over-population. Even if these systems are aligned with human values, would we, as biological entities, be able to assimilate their presence, or would it perturb the equilibrium of physical and mental stability that is maintained by homeostasis? If this happens slowly enough to be accommodated by society and humanity, the future looks good; but if the rate is too great to be matched by our biological and cultural clocks, then dystopia may be at hand. Machines could degrade the quality of human life and replace us as the dominant beings, all as a repercussion of shallow human foresight and poor ethical orientation. Bostrom writes,

“Before the prospect of an intelligence explosion, we humans are like small children playing with a bomb […] We have little idea when the detonation will occur, though if we hold the device to our ear we can hear a faint ticking sound.”

AGI concerns all of humanity; if done terribly wrong, it poses a threat to our existence (or to that of all life forms). Even though the timing of its arrival is debatable, most expect it to take form eventually. Since we have no clear idea of how much time we have, it is wisest to advance with great caution, because poor judgment can lead to grim prospects. AI has always drawn inspiration from many other fields, such as neuroscience, linguistics, physics, and mathematics. The safest way forward is to build on fundamental models from these fields. We are also obliged to put more resources at their disposal, in order to develop cognitive systems grounded in human cognition itself. Neuroscientist Christof Koch writes:

“To constrain what could happen and ensure that humanity retains some modicum of control, we need to better understand the only known form of intelligence. That is, we need to develop a science of intelligence by studying people and their brains to try to deduce what might be the ultimate capabilities and goals of a machine intelligence. […] Will it make a difference to the AI’s action if it feels something, anything and if it, too, can experience the sights and sounds of the universe?”

For humanity, AGI can become the most significant tool yet. It may help us find mathematical proofs and make scientific discoveries, eradicate poverty and terrorism, heal ailments and contain epidemics, enhance the human condition, manage climate change, discover exoplanets, or explore far corners of the universe. Or it may establish a universal authority. The journey to AGI has many routes, and the one we take will define our future. The choices are our own and require profound contemplation. As Sam Harris puts it, “Now would be a good time to make sure it’s a god we can live with”.

— —


Aadhar Sharma was a researcher working in Dr. Sukant Khurana’s group, focusing on the ethics of artificial intelligence.

Raamesh Gowri Raghavan is collaborating with Dr. Sukant Khurana on various projects, ranging from popular writing on AI to the influence of technology on art and mental health awareness.

Mr. Raamesh Gowri Raghavan is an award-winning poet, a well-known advertising professional, a historian, and a researcher exploring the interface of science and art. He is also championing a massive anti-depression and suicide prevention effort with Dr. Khurana and Farooq Ali Khan.


Dr. Sukant Khurana runs an academic research lab and several tech companies. He is also a known artist, author, and speaker. If you wish to work on biomedical research, neuroscience, sustainable development, artificial intelligence, or data science projects for the public good, you can reach out to him on LinkedIn.