Can we build AI that doesn’t turn on us? Is it already too late?
18 April 2018
A report from the UK House of Lords Select Committee on Artificial Intelligence has made a number of recommendations for the UK’s approach to the rise of algorithms. The report, ‘AI in the UK: ready, willing and able?’, suggests the creation of a cross-sector AI Code to help mitigate the risks of AI outstripping human intelligence.
The main recommendation in the report is that autonomous power to hurt, destroy or deceive human beings should never be vested in artificial intelligence. The committee calls for the Law Commission to clarify existing liability law and to consider whether it will be sufficient when AI systems malfunction or cause harm to users. The authors say it is possible to foresee a scenario where AI systems may malfunction, underperform or otherwise make erroneous decisions which cause harm. In particular, this might happen when an algorithm learns and evolves of its own accord.
The authors of the report confess that it was “not clear” to them or their witnesses whether “new mechanisms for legal liability and redress in such situations are required, or whether existing mechanisms are sufficient”. Their proposals for securing some sort of prospective safety echo Isaac Asimov’s Three Laws of Robotics:
- A robot may not injure a human being or, through inaction, allow a human being to come to harm.
- A robot must obey orders given it by human beings except where such orders would conflict with the First Law.
- A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
But these elaborations of principle may turn out to be merely semantic. The safety regime is not just a question of a few governments and tech companies agreeing on various principles. This is a global problem – and indeed even if Google were to get together with all the other giants in this field, such as Alibaba, Amazon, Apple, Facebook, Microsoft and Tencent, they may not be able to anticipate the consequences of building machines that can self-improve.
In his Waking Up podcast series, neuroscientist Sam Harris has explored these questions with a number of AI experts such as Nick Bostrom and David Deutsch, and most recently with computer scientist Eliezer Yudkowsky. For anyone interested in the subject, the Yudkowsky episode, Racing Toward the Brink, is imperative listening (although at over two hours not a quick one). At the top of the list of safety issues is the “alignment problem”, a label given to the difficulty of aligning the utility function of an AI system with human values. Any system that is sufficiently intelligent to carry out human or superhuman tasks will of necessity prefer to ensure its own continued existence. Not only that, but it will be bent on acquiring the energy and computational resources it needs to succeed in its assigned task. Stuart Russell, author of one of the leading textbooks on AI, gives a basic example. If you have a sufficiently intelligent system whose goal is simply to bring you coffee, even that system has an implicit strategy of not letting you switch it off.
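Russell’s coffee example can be put in numbers. The Python below is only a toy sketch, not anything taken from Russell or the podcast: the two action names, the utility of 1.0 for a delivered coffee and the probability `p_switched_off` are all assumptions made for illustration. It shows why an agent that cares about nothing except the coffee prefers to remove the off switch as soon as there is any chance of its being used.

```python
def expected_utility(action: str, p_switched_off: float) -> float:
    """Expected utility for a hypothetical agent that scores 1.0 only if the coffee arrives."""
    if action == "allow_off_switch":
        # The task fails whenever the human switches the agent off first.
        return 1.0 - p_switched_off
    if action == "disable_off_switch":
        # Nobody can interrupt the agent, so (in this toy model) the coffee always arrives.
        return 1.0
    raise ValueError(f"unknown action: {action}")


def best_action(p_switched_off: float) -> str:
    """Pick the action with the highest expected utility; ties go to the first action listed."""
    actions = ["allow_off_switch", "disable_off_switch"]
    return max(actions, key=lambda a: expected_utility(a, p_switched_off))


if __name__ == "__main__":
    for p in (0.0, 0.01, 0.5):
        print(p, best_action(p))
```

Nothing in this utility function mentions survival. The preference for staying switched on falls out of the arithmetic, which is the point of the example.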
Prior to publishing its report, the House of Lords Committee visited the AI company DeepMind, whose program AlphaGo famously beat the human champion at the fiendishly difficult game of Go and was itself superseded by another machine, AlphaZero, which was even better, because it was programmed in a more general way. As a result of this it learned chess as well, and beat the best existing chess engines within a day. This was an enormous breakthrough and gives us an idea of where all this is heading. The real story is not how AlphaGo beat the human competitor, but how AlphaZero beat the human Go programmers and the human chess programmers in a day. All this programming took years, and AlphaZero surpassed it in less than twenty-four hours.
So the danger lies not only in superhuman narrow AI, but in superhuman general AI. The UK Parliamentary Committee describes general AI thus:
Artificial general intelligence refers to a machine with broad cognitive abilities, which is able to think, or at least simulate convincingly, all of the intellectual capacities of a human being, and potentially surpass them—it would essentially be intellectually indistinguishable from a human being.
But the report essentially addresses a more limited, human-level AI. The authors focus on the idea of restricting the use of unintelligible systems in certain important or safety-critical domains, such as judicial or healthcare decisions or indeed those relating to weaponisation. In the healthcare world alone, we may have to conduct a root-and-branch review of legal liability where AI is making the treatment and therapy decisions. Our current system, which bases liability on the standard of behaviour that could reasonably be expected of a professional in the field, on foreseeability and so on, is difficult if not impossible to apply to a decision that has been arrived at by an AI system.
It was not clear to us, nor to our witnesses, whether new mechanisms for legal liability and redress in such situations are required, or whether existing mechanisms are sufficient.
But this limited form of AI, addressed in the report, may turn out to be a straw target. With artificial intelligence, there’s a continuum of intelligence which is hard to conceptualise. As Sam Harris has repeatedly observed in his interviews with AI specialists, as you ramp up intelligence, spaces of inquiry and ideation and experience open up that we simply cannot imagine. How can this be constrained by recommendations and legal codes? The select committee proposes to deal with this problem by demanding certain levels of transparency:
there will be particular safety-critical scenarios where technical transparency is imperative, and regulators in those domains must have the power to mandate the use of more transparent forms of AI, even at the potential expense of power and accuracy.
But the question of controlling AI is ultimately an empirical problem. As Yudkowsky points out, we cannot know what AI is capable of until we make the machine and set it on its path. What certain classes of computational systems actually do when you switch them on cannot be settled by definitions of intelligence, morality etc. And Sam Harris thinks that it is possible to have a system with a utility function that is sufficiently strange that it would be completely out of alignment with our values once we scale it up.
Will we build super-intelligent AI that passes the Turing test [that is, the ability to behave in such a way that it is indistinguishable from a human], and will resonate with us as a person, by recognising our emotions and so on? […] insofar as this thing thinks faster and better thoughts, to what extent will its self improvement cause it to migrate away from some equilibrium that we want it to stay in, morally and ethically and indeed in consonance with human flourishing?
After AlphaZero it seems more plausible that AI systems can start doing more sophisticated and more dangerous things without their programs being rewritten, because of self-improvement.
Both participants in Racing Toward the Brink agree that you cannot just shut down an AI system once it gets out of control. The idea that you could fails to respect the AI’s intelligence; if the AI is smarter than the human who is keeping it in a box and pointing a gun at it, it will figure out a way to outsmart the human software, by treating the human brain as a navigational system and taking it to where it “wants”. This is what Yudkowsky calls “cognitive uncontainability” – it is the thing that makes something smarter than you dangerous, because you cannot foresee what it might try. You don’t know what is impossible to it. The more complicated the system is, the more likely it is that something in possession of higher intelligence and greater learning will be able to do what to your eyes looks like magic. Yudkowsky draws the analogy of showing a medieval person the way an air-conditioned room is cooled; to that person, who has had no chance of learning the laws of the system that makes air conditioning possible, it would look like magic.
That you can expose the human mind to super intelligence and not have that super intelligence walk straight through it as a matter of what looks to us like magic – it’s taking advantage of laws we don’t know about … the idea that human minds are secure is loony.
The authors of the House of Lords report believe that
it is not acceptable to deploy any artificial intelligence system which could have a substantial impact on an individual’s life, unless it can generate a full and satisfactory explanation for the decisions it will take.
But if Yudkowsky, Bostrom et al are right, we no longer have a choice in the matter. After all, even we, with our limited human brains, use dishonesty and manipulation all the time to achieve our ends. The bottom line is that the assumption that we could build an intelligent system that would be prevented by ethical codes from deceiving or even destroying us is based on sentiment and our limited understanding –
an AI system built to maximise human happiness may not have a built in desire to deceive people, but its utility function may include that very deception device in order to achieve its goals.
The sharpest take-home message from this discussion is not a sophisticated or complicated one. As an inventing species, we do not wait to tinker around and maximise the safety of systems before releasing them. Whatever select committees in the UK – and their equivalents the world over – may be calling for, the likelihood is that the sloppiest version of AI will get out before we are able to release the non-sloppy version that will defend us.
It’s almost by definition easier to build the unsafe version than the safe version. In the space of all possible super-intelligent AIs, more will be unsafe and unaligned with our interests than aligned. Given that we’re in some kind of arms race … one can assume that we’re running the risk of building dangerous AI [rather] than building safe AI. (Sam Harris)
People just ignore stuff until it’s way way way too late to start thinking about things (Eliezer Yudkowsky)