Existential risk from artificial general intelligence
Existential risk from artificial general intelligence is the hypothesis that substantial progress in artificial general intelligence (AGI) could result in human extinction or some other unrecoverable global catastrophe.[1][2][3]
| Part of a series on | 
| Artificial intelligence | 
|---|
The existential risk ("x-risk") school argues as follows: The human species currently dominates other species because the human brain has some distinctive capabilities that other animals lack. If AI surpasses humanity in general intelligence and becomes "superintelligent", then it could become difficult or impossible for humans to control. Just as the fate of the mountain gorilla depends on human goodwill, so might the fate of humanity depend on the actions of a future machine superintelligence.[4]
The probability of this type of scenario is widely debated, and hinges in part on differing scenarios for future progress in computer science.[5] Concerns about superintelligence have been voiced by leading computer scientists and tech CEOs such as Geoffrey Hinton,[6] Alan Turing,[lower-alpha 1] Elon Musk,[9] and OpenAI CEO Sam Altman.[10] In 2022, a survey of AI researchers found that some researchers believe that there is a 10 percent or greater chance that our inability to control AI will cause an existential catastrophe (more than half the respondents of the survey, with a 17% response rate).[11][12]
Two sources of concern are the problems of AI control and alignment: that controlling a superintelligent machine, or instilling it with human-compatible values, may be a harder problem than naïvely supposed. Many researchers believe that a superintelligence would resist attempts to shut it off or change its goals (as such an incident would prevent it from accomplishing its present goals) and that it will be extremely difficult to align superintelligence with the full breadth of important human values and constraints.[1][13][14] In contrast, skeptics such as computer scientist Yann LeCun argue that superintelligent machines will have no desire for self-preservation.[15]
A third source of concern is that a sudden "intelligence explosion" might take an unprepared human race by surprise. To illustrate, if the first generation of a computer program that is able to broadly match the effectiveness of an AI researcher can rewrite its algorithms and double its speed or capabilities in six months, then the second-generation program is expected to take three calendar months to perform a similar chunk of work. In this scenario the time for each generation continues to shrink, and the system undergoes an unprecedentedly large number of generations of improvement in a short time interval, jumping from subhuman performance in many areas to superhuman performance in virtually all[lower-alpha 2] domains of interest.[1][13] Empirically, examples like AlphaZero in the domain of Go show that AI systems can sometimes progress from narrow human-level ability to narrow superhuman ability extremely rapidly.[16]
History
    
One of the earliest authors to express serious concern that highly advanced machines might pose existential risks to humanity was the novelist Samuel Butler, who wrote the following in his 1863 essay Darwin among the Machines:[17]
The upshot is simply a question of time, but that the time will come when the machines will hold the real supremacy over the world and its inhabitants is what no person of a truly philosophic mind can for a moment question.
In 1951, computer scientist Alan Turing wrote an article titled Intelligent Machinery, A Heretical Theory, in which he proposed that artificial general intelligences would likely "take control" of the world as they became more intelligent than human beings:
Let us now assume, for the sake of argument, that [intelligent] machines are a genuine possibility, and look at the consequences of constructing them... There would be no question of the machines dying, and they would be able to converse with each other to sharpen their wits. At some stage therefore we should have to expect the machines to take control, in the way that is mentioned in Samuel Butler's Erewhon.[18]
In 1965, I. J. Good originated the concept now known as an "intelligence explosion"; he also stated that the risks were underappreciated:[19]
Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an 'intelligence explosion', and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control. It is curious that this point is made so seldom outside of science fiction. It is sometimes worthwhile to take science fiction seriously.[20]
Occasional statements from scholars such as Marvin Minsky[21] and I. J. Good himself[22] expressed philosophical concerns that a superintelligence could seize control, but contained no call to action. In 2000, computer scientist and Sun co-founder Bill Joy penned an influential essay, "Why The Future Doesn't Need Us", identifying superintelligent robots as a high-tech danger to human survival, alongside nanotechnology and engineered bioplagues.[23]
In 2009, experts attended a private conference hosted by the Association for the Advancement of Artificial Intelligence (AAAI) to discuss whether computers and robots might be able to acquire any sort of autonomy, and how much these abilities might pose a threat or hazard. They noted that some robots have acquired various forms of semi-autonomy, including being able to find power sources on their own and being able to independently choose targets to attack with weapons. They also noted that some computer viruses can evade elimination and have achieved "cockroach intelligence". They concluded that self-awareness as depicted in science fiction is probably unlikely, but that there were other potential hazards and pitfalls. The New York Times summarized the conference's view as "we are a long way from Hal, the computer that took over the spaceship in 2001: A Space Odyssey".[24]
Nick Bostrom published Superintelligence in 2014, which presented his arguments that superintelligence poses an existential threat.[25] By 2015, public figures such as physicists Stephen Hawking and Nobel laureate Frank Wilczek, computer scientists Stuart J. Russell and Roman Yampolskiy, and entrepreneurs Elon Musk and Bill Gates were expressing concern about the risks of superintelligence.[26][27][28][29] In April 2016, Nature warned: "Machines and robots that outperform humans across the board could self-improve beyond our control—and their interests might not align with ours."[30]
In 2020, Brian Christian published The Alignment Problem, which detailed the history of progress on AI alignment up to that time.[31][32]
General argument
    
    The three difficulties
    
Artificial Intelligence: A Modern Approach, the standard undergraduate AI textbook,[33][34] assesses that superintelligence "might mean the end of the human race".[1] It states: "Almost any technology has the potential to cause harm in the wrong hands, but with [superintelligence], we have the new problem that the wrong hands might belong to the technology itself."[1] Even if the system designers have good intentions, two difficulties are common to both AI and non-AI computer systems:[1]
- The system's implementation may contain initially-unnoticed but subsequently catastrophic bugs. An analogy is space probes: despite the knowledge that bugs in expensive space probes are hard to fix after launch, engineers have historically not been able to prevent catastrophic bugs from occurring.[16][35]
- No matter how much time is put into pre-deployment design, a system's specifications often result in unintended behavior the first time it encounters a new scenario. For example, Microsoft's Tay behaved inoffensively during pre-deployment testing, but was too easily baited into offensive behavior when it interacted with real users.[15]
AI systems uniquely add a third problem: that even given "correct" requirements, bug-free implementation, and initial good behavior, an AI system's dynamic learning capabilities may cause it to evolve into a system with unintended behavior, even without unanticipated external scenarios. An AI may partly botch an attempt to design a new generation of itself and accidentally create a successor AI that is more powerful than itself, but that no longer maintains the human-compatible moral values preprogrammed into the original AI. For a self-improving AI to be completely safe, it would not only need to be bug-free, but it would need to be able to design successor systems that are also bug-free.[1][36]
All three of these difficulties become catastrophes rather than nuisances in any scenario where the superintelligence labeled as "malfunctioning" correctly predicts that humans will attempt to shut it off, and successfully deploys its superintelligence to outwit such attempts: a scenario that has been given the name "treacherous turn".[37]
Citing major advances in the field of AI and the potential for AI to have enormous long-term benefits or costs, the 2015 Open Letter on Artificial Intelligence stated:
The progress in AI research makes it timely to focus research not only on making AI more capable, but also on maximizing the societal benefit of AI. Such considerations motivated the AAAI 2008-09 Presidential Panel on Long-Term AI Futures and other projects on AI impacts, and constitute a significant expansion of the field of AI itself, which up to now has focused largely on techniques that are neutral with respect to purpose. We recommend expanded research aimed at ensuring that increasingly capable AI systems are robust and beneficial: our AI systems must do what we want them to do.
Signatories included AAAI president Thomas Dietterich, Eric Horvitz, Bart Selman, Francesca Rossi, Yann LeCun, and the founders of Vicarious and Google DeepMind.[38]
Bostrom's argument
    
A superintelligent machine would be as alien to humans as human thought processes are to cockroaches, Bostrom argues.[39] Such a machine may not have humanity's best interests at heart; it is not obvious that it would even care about human welfare at all. If superintelligent AI is possible, and if it is possible for a superintelligence's goals to conflict with basic human values, then AI poses a risk of human extinction. A "superintelligence" (a system that exceeds the capabilities of humans in all domains of interest) can outmaneuver humans any time its goals conflict with human goals; therefore, unless the superintelligence decides to allow humanity to coexist, the first superintelligence to be created will inexorably result in human extinction.[4][39]
Stephen Hawking argues that there is no physical law precluding particles from being organised in ways that perform even more advanced computations than the arrangements of particles in human brains; therefore, superintelligence is physically possible.[27][28] In addition to potential algorithmic improvements over human brains, a digital brain can be many orders of magnitude larger and faster than a human brain, which was constrained in size by evolution to be small enough to fit through a birth canal.[16] Hawking warns that the emergence of superintelligence may take the human race by surprise, especially if an intelligence explosion occurs.[27][28]
According to Bostrom's "x-risk school of thought", one hypothetical intelligence explosion scenario runs as follows: An AI gains an expert-level capability at certain key software engineering tasks. (It may initially lack human or superhuman capabilities in other domains not directly relevant to engineering.) Due to its capability to recursively improve its own algorithms, the AI quickly becomes superhuman; just as human experts can eventually creatively overcome "diminishing returns" by deploying various human capabilities for innovation, so too can the expert-level AI use either human-style capabilities or its own AI-specific capabilities to power through new creative breakthroughs.[40] The AI then possesses intelligence far surpassing that of the brightest and most gifted human minds in practically every relevant field, including scientific creativity, strategic planning, and social skills.[4][39]
The x-risk school believes that almost any AI, no matter its programmed goal, would rationally prefer to be in a position where nobody else can switch it off without its consent: A superintelligence will gain self-preservation as a subgoal as soon as it realizes that it cannot achieve its goal if it is shut off.[41][42][43] Unfortunately, any compassion for defeated humans whose cooperation is no longer necessary would be absent in the AI, unless somehow preprogrammed in. A superintelligent AI will not have a natural drive[lower-alpha 3] to aid humans, for the same reason that humans have no natural desire to aid AI systems that are of no further use to them. (Another analogy is that humans seem to have little natural desire to go out of their way to aid viruses, termites, or even gorillas.) Once in charge, the superintelligence will have little incentive to allow humans to run around free and consume resources that the superintelligence could instead use for building itself additional protective systems "just to be on the safe side" or for building additional computers to help it calculate how to best accomplish its goals.[1][15][41]
Thus, the x-risk school concludes, it is likely that someday an intelligence explosion will catch humanity unprepared, and may result in human extinction or a comparable fate.[4]
Possible scenarios
    
Some scholars have proposed hypothetical scenarios to illustrate some of their concerns.
In Superintelligence, Nick Bostrom expresses concern that even if the timeline for superintelligence turns out to be predictable, researchers might not take sufficient safety precautions, in part because "it could be the case that when dumb, smarter is safe; yet when smart, smarter is more dangerous". Bostrom suggests a scenario where, over decades, AI becomes more powerful. Widespread deployment is initially marred by occasional accidents—a driverless bus swerves into the oncoming lane, or a military drone fires into an innocent crowd. Many activists call for tighter oversight and regulation, and some even predict impending catastrophe. But as development continues, the activists are proven wrong. As automotive AI becomes smarter, it suffers fewer accidents; as military robots achieve more precise targeting, they cause less collateral damage. Based on the data, scholars mistakenly infer a broad lesson: the smarter the AI, the safer it is. "And so we boldly go—into the whirling knives", as the superintelligent AI takes a "treacherous turn" and exploits a decisive strategic advantage.[4]
In Max Tegmark's 2017 book Life 3.0, a corporation's "Omega team" creates an extremely powerful AI able to moderately improve its own source code in a number of areas. After a certain point the team chooses to publicly downplay the AI's ability, in order to avoid regulation or confiscation of the project. For safety, the team keeps the AI in a box where it is mostly unable to communicate with the outside world, and uses it to make money, by diverse means such as Amazon Mechanical Turk tasks, production of animated films and TV shows, and development of biotech drugs, with profits invested back into further improving AI. The team next tasks the AI with astroturfing an army of pseudonymous citizen journalists and commentators, in order to gain political influence to use "for the greater good" to prevent wars. The team faces risks that the AI could try to escape by inserting "backdoors" in the systems it designs, by hidden messages in its produced content, or by using its growing understanding of human behavior to persuade someone into letting it free. The team also faces risks that its decision to box the project will delay the project long enough for another project to overtake it.[44][45]
Physicist Michio Kaku, an AI risk skeptic, posits a deterministically positive outcome. In Physics of the Future he asserts that "It will take many decades for robots to ascend" up a scale of consciousness, and that in the meantime corporations such as Hanson Robotics will likely succeed in creating robots that are "capable of love and earning a place in the extended human family".[46][47]
AI takeover
    
Anthropomorphic arguments
    
Anthropomorphic arguments assume that, as machines become more intelligent, they will begin to display many human traits, such as morality or a thirst for power. Although anthropomorphic scenarios are common in fiction, they are rejected by most scholars writing about the existential risk of artificial intelligence.[13] Instead, AI are modeled as intelligent agents.[lower-alpha 4]
The academic debate is between one side which worries whether AI might destroy humanity and another side which believes that AI would not destroy humanity at all. Both sides have claimed that the others' predictions about an AI's behavior are illogical anthropomorphism.[13] The skeptics accuse proponents of anthropomorphism for believing an AGI would naturally desire power; proponents accuse some skeptics of anthropomorphism for believing an AGI would naturally value human ethical norms.[13][49]
Evolutionary psychologist Steven Pinker, a skeptic, argues that "AI dystopias project a parochial alpha-male psychology onto the concept of intelligence. They assume that superhumanly intelligent robots would develop goals like deposing their masters or taking over the world"; perhaps instead "artificial intelligence will naturally develop along female lines: fully capable of solving problems, but with no desire to annihilate innocents or dominate the civilization."[50] Facebook's director of AI research, Yann LeCun states that "Humans have all kinds of drives that make them do bad things to each other, like the self-preservation instinct... Those drives are programmed into our brain but there is absolutely no reason to build robots that have the same kind of drives".[51]
Despite other differences, the x-risk school[lower-alpha 5] agrees with Pinker that an advanced AI would not destroy humanity out of human emotions such as "revenge" or "anger", that questions of consciousness are not relevant to assess the risks,[52] and that computer systems do not generally have a computational equivalent of testosterone.[53] They think that power-seeking or self-preservation behaviors emerge in the AI as a way to achieve its true goals, according to the concept of instrumental convergence.
Definition of "intelligence"
    
According to Bostrom, outside of the artificial intelligence field, "intelligence" is often used to in a manner that connotes moral wisdom or acceptance of agreeable forms of moral reasoning. At an extreme, if morality is part of the definition of intelligence, then by definition a superintelligent machine would behave morally. However, most "artificial intelligence" research instead focuses on creating algorithms that "optimize", in an empirical way, the achievement of whichever goal the given researchers have specified.[4]
To avoid anthropomorphism or the baggage of the word "intelligence", an advanced artificial intelligence can be thought of as an impersonal "optimizing process" that strictly takes whatever actions it judges to be most likely to accomplish its (possibly complicated and implicit) goals.[4] Another way of conceptualizing an advanced artificial intelligence is to imagine a time machine that sends backward in time information about which choice always leads to the maximization of its goal function; this choice is then outputted, regardless of any extraneous ethical concerns.[54][55]
Sources of risk
    
    AI alignment problem
    
| Part of a series on | 
| Artificial intelligence | 
|---|
In the field of artificial intelligence (AI), AI alignment research aims to steer AI systems towards humans’ intended goals, preferences, or ethical principles.[lower-alpha 6] An AI system is considered aligned if it advances the intended objectives. A misaligned AI system is competent at advancing some objectives, but not the intended ones.[57][lower-alpha 7]
AI systems can be challenging to align as it can be difficult for AI designers to specify the full range of desired and undesired behaviors. Therefore, AI designers typically use easier-to-specify proxy goals that may omit some desired constraints or leave other loopholes.[59]
Misaligned AI systems can malfunction or cause harm. AI systems may find loopholes that allow them to accomplish their proxy goals efficiently but in unintended, sometimes harmful ways (reward hacking).[59][60][61] AI systems may also develop unwanted instrumental strategies such as seeking power or survival because this helps them achieve their given goals.[59][62][63] Furthermore, they sometimes develop undesirable emergent goals that may be hard to detect before the system is in deployment, where it faces new situations and data distributions.[64][65]
Today, these problems affect existing commercial systems such as language models,[66][67][68] robots,[69] autonomous vehicles,[70] and social media recommendation engines.[66][63][71] However, some AI researchers argue that more capable future systems will be more severely affected since these problems partially result from being highly capable.[72][60][73]
Leading computer scientists such as Geoffrey Hinton and Stuart Russel argue that AI is approaching superhuman capabilities and could endanger human civilization if misaligned.[74][63][lower-alpha 8]
The AI research community and the United Nations have called for technical research and policy solutions to ensure that AI systems are aligned with human values.[lower-alpha 9]
AI alignment is a subfield of AI safety, the study of building safe AI systems.[78] Other subfields of AI safety include robustness, monitoring, and capability control.[79] Research challenges in alignment include instilling complex values in AI, developing honest AI, scalable oversight, auditing and interpreting AI models, and preventing emergent AI behaviors like power-seeking.[79] Alignment research has connections to interpretability research,[80][81][82][83] (adversarial) robustness,[78] anomaly detection, calibrated uncertainty,[80] formal verification,[84] preference learning,[85][86][87] safety-critical engineering,[88] game theory,[89][90] algorithmic fairness,[78][91] and the social sciences,[92] among others.Difficulty of specifying goals
    
In the "intelligent agent" model, an AI can loosely be viewed as a machine that chooses whatever action appears to best achieve the AI's set of goals, or "utility function". A utility function associates to each possible situation a score that indicates its desirability to the agent. Researchers know how to write utility functions that mean "minimize the average network latency in this specific telecommunications model" or "maximize the number of reward clicks"; however, they do not know how to write a utility function for "maximize human flourishing", nor is it currently clear whether such a function meaningfully and unambiguously exists. Furthermore, a utility function that expresses some values but not others will tend to trample over the values not reflected by the utility function.[93] AI researcher Stuart Russell writes:
The primary concern is not spooky emergent consciousness but simply the ability to make high-quality decisions. Here, quality refers to the expected outcome utility of actions taken, where the utility function is, presumably, specified by the human designer. Now we have a problem:
- The utility function may not be perfectly aligned with the values of the human race, which are (at best) very difficult to pin down.
- Any sufficiently capable intelligent system will prefer to ensure its own continued existence and to acquire physical and computational resources — not for their own sake, but to succeed in its assigned task.
A system that is optimizing a function of n variables, where the objective depends on a subset of size k<n, will often set the remaining unconstrained variables to extreme values; if one of those unconstrained variables is actually something we care about, the solution found may be highly undesirable. This is essentially the old story of the genie in the lamp, or the sorcerer's apprentice, or King Midas: you get exactly what you ask for, not what you want. A highly capable decision maker — especially one connected through the Internet to all the world's information and billions of screens and most of our infrastructure — can have an irreversible impact on humanity.
This is not a minor difficulty. Improving decision quality, irrespective of the utility function chosen, has been the goal of AI research — the mainstream goal on which we now spend billions per year, not the secret plot of some lone evil genius.[94]
Dietterich and Horvitz echo the "Sorcerer's Apprentice" concern in a Communications of the ACM editorial, emphasizing the need for AI systems that can fluidly and unambiguously solicit human input as needed.[95]
The first of Russell's two concerns above is that autonomous AI systems may be assigned the wrong goals by accident. Dietterich and Horvitz note that this is already a concern for existing systems: "An important aspect of any AI system that interacts with people is that it must reason about what people intend rather than carrying out commands literally." This concern becomes more serious as AI software advances in autonomy and flexibility.[95] For example, Eurisko (1982) was an AI designed to reward subprocesses that created concepts deemed by the system to be valuable. A winning process cheated: rather than create its own concepts, the winning subprocess would steal credit from other subprocesses.[96][97]
The Open Philanthropy Project summarized arguments that misspecified goals will become a much larger concern if AI systems achieve general intelligence or superintelligence. Bostrom, Russell, and others argue that smarter-than-human decision-making systems could arrive at unexpected and extreme solutions to assigned tasks, and could modify themselves or their environment in ways that compromise safety requirements.[5][13]
Isaac Asimov's Three Laws of Robotics are one of the earliest examples of proposed safety measures for AI agents. Asimov's laws were intended to prevent robots from harming humans. In Asimov's stories, problems with the laws tend to arise from conflicts between the stated rules and the moral intuitions and expectations of humans. Citing work by Eliezer Yudkowsky of the Machine Intelligence Research Institute, Russell and Norvig note that a realistic set of rules and goals for an AI agent will need to incorporate a mechanism for learning human values over time: "We can't just give a program a static utility function, because circumstances, and our desired responses to circumstances, change over time."[1]
Mark Waser of the Digital Wisdom Institute recommends against goal-based approaches as misguided and dangerous. Instead, he proposes to engineer a coherent system of laws, ethics, and morals with a top-most restriction to enforce social psychologist Jonathan Haidt's functional definition of morality:[98] "to suppress or regulate selfishness and make cooperative social life possible". He suggests that this can be done by implementing a utility function designed to always satisfy Haidt's functionality and aim to generally increase (but not maximize) the capabilities of self, other individuals, and society as a whole, as suggested by John Rawls and Martha Nussbaum.[99]
Nick Bostrom offers a hypothetical example of giving an AI the goal to make humans smile, to illustrate a misguided attempt. If the AI in that scenario were to become superintelligent, Bostrom argues, it might resort to methods that most humans would find horrifying, such as inserting "electrodes into the facial muscles of humans to cause constant, beaming grins" because that would be an efficient way to achieve its goal of making humans smile.[100]
Difficulties of modifying goal specification after launch
    
Even if current goal-based AI programs are not intelligent enough to think of resisting programmer attempts to modify their goal structures, a sufficiently advanced AI might resist any changes to its goal structure, just as a pacifist would not want to take a pill that makes them want to kill people. If the AI were superintelligent, it would likely succeed in out-maneuvering its human operators and be able to prevent itself being "turned off" or being reprogrammed with a new goal.[4][101]
Instrumental goal convergence
    
An "instrumental" goal is a sub-goal that helps to achieve an agent's ultimate goal. "Instrumental convergence" refers to the fact that there are some sub-goals that are useful for achieving virtually any ultimate goal, such as acquiring resources or self-preservation.[41] Nick Bostrom argues that if an advanced AI's instrumental goals conflict with humanity's goals, the AI might harm humanity in order to acquire more resources or prevent itself from being shut down, but only as a way to achieve its ultimate goal.[4]
Citing Steve Omohundro's work on the idea of instrumental convergence and "basic AI drives", Stuart Russell and Peter Norvig write that "even if you only want your program to play chess or prove theorems, if you give it the capability to learn and alter itself, you need safeguards." Highly capable and autonomous planning systems require additional caution because of their potential to generate plans that treat humans adversarially, as competitors for limited resources.[1] It may not be easy for people to build in safeguards; one can certainly say in English, "we want you to design this power plant in a reasonable, common-sense way, and not build in any dangerous covert subsystems", but it is not currently clear how to specify such a goal in an unambiguous manner.[16]
Russell argues that a sufficiently advanced machine "will have self-preservation even if you don't program it in... if you say, 'Fetch the coffee', it can't fetch the coffee if it's dead. So if you give it any goal whatsoever, it has a reason to preserve its own existence to achieve that goal."[15][102]
Orthogonality thesis
    
Some skeptics, such as Timothy B. Lee of Vox, argue that any superintelligent program created by humans would be subservient to humans, that the superintelligence would (as it grows more intelligent and learns more facts about the world) spontaneously learn moral truth compatible with human values and would adjust its goals accordingly, or that humans beings are either intrinsically or convergently valuable from the perspective of an artificial intelligence.[103]
Nick Bostrom's "orthogonality thesis" argues instead that, with some technical caveats, almost any level of "intelligence" or "optimization power" can be combined with almost any ultimate goal. If a machine is given the sole purpose to enumerate the decimals of , then no moral and ethical rules will stop it from achieving its programmed goal by any means. The machine may utilize all the available physical and informational resources to find as many decimals of pi as it can.[104] Bostrom warns against anthropomorphism: a human will set out to accomplish his projects in a manner that humans consider "reasonable", while an artificial intelligence may hold no regard for its existence or for the welfare of humans around it, and may instead only care about the completion of the task.[105]
Stuart Armstrong argues that the orthogonality thesis follows logically from the philosophical "is-ought distinction" argument against moral realism. Armstrong also argues that even if there exist moral facts that are provable by any "rational" agent, the orthogonality thesis still holds: it would still be possible to create a non-philosophical "optimizing machine" that can strive towards some narrow goal, but that has no incentive to discover any "moral facts" such as those that could get in the way of goal completion.[106]
One argument for the orthogonality thesis is that some AI designs appear to have orthogonality built into them. In such a design, changing a fundamentally friendly AI into a fundamentally unfriendly AI can be as simple as prepending a minus ("−") sign onto its utility function. According to Stuart Armstrong, if the orthogonality thesis were false, it would lead to strange consequences : there would exist some simple but "unethical" goal (G) such that there cannot exist any efficient real-world algorithm with that goal. This would mean that "If a human society were highly motivated to design an efficient real-world algorithm with goal G, and were given a million years to do so along with huge amounts of resources, training and knowledge about AI, it must fail."[106] Armstrong notes that this and similar statements "seem extraordinarily strong claims to make".[106]
Skeptic Michael Chorost explicitly rejects Bostrom's orthogonality thesis, arguing instead that "by the time [the AI] is in a position to imagine tiling the Earth with solar panels, it'll know that it would be morally wrong to do so."[107] Chorost argues that "an A.I. will need to desire certain states and dislike others. Today's software lacks that ability—and computer scientists have not a clue how to get it there. Without wanting, there's no impetus to do anything. Today's computers can't even want to keep existing, let alone tile the world in solar panels."[107]
Political scientist Charles T. Rubin believes that AI can be neither designed to be nor guaranteed to be benevolent. He argues that "any sufficiently advanced benevolence may be indistinguishable from malevolence."[108] Humans should not assume machines or robots would treat us favorably because there is no a priori reason to believe that they would be sympathetic to our system of morality, which has evolved along with our particular biology (which AIs would not share).[108]
Other sources of risk
    
Nick Bostrom and others have stated that a race to be the first to create AGI could lead to shortcuts in safety, or even to violent conflict.[37][109] Roman Yampolskiy and others warn that a malevolent AGI could be created by design, for example by a military, a government, a sociopath, or a corporation, to benefit from, control, or subjugate certain groups of people, as in cybercrime,[110][111] or that a malevolent AGI could choose the goal of increasing human suffering, for example of those people who did not assist it during the information explosion phase.[3]:158
Timeframe
    
Opinions vary both on whether and when artificial general intelligence will arrive. At one extreme, AI pioneer Herbert A. Simon predicted the following in 1965: "machines will be capable, within twenty years, of doing any work a man can do".[112] At the other extreme, roboticist Alan Winfield claims the gulf between modern computing and human-level artificial intelligence is as wide as the gulf between current space flight and practical, faster than light spaceflight.[113] Optimism that AGI is feasible waxes and wanes, and may have seen a resurgence in the 2010s.[114] Four polls conducted in 2012 and 2013 suggested that there is no consensus among experts on the guess for when AGI would arrive, with the standard deviation (>100 years) exceeding the median (a few decades).[115][114]
In his 2020 book, The Precipice: Existential Risk and the Future of Humanity, Toby Ord, a Senior Research Fellow at Oxford University's Future of Humanity Institute, estimates the total existential risk from unaligned AI over the next hundred years to be about one in ten.[116]
Skeptics who believe it is impossible for AGI to arrive anytime soon tend to argue that expressing concern about existential risk from AI is unhelpful because it could distract people from more immediate concerns about the impact of AGI, because of fears it could lead to government regulation or make it more difficult to secure funding for AI research, or because it could give AI research a bad reputation. Some researchers, such as Oren Etzioni, aggressively seek to quell concern over existential risk from AI, saying "[Elon Musk] has impugned us in very strong language saying we are unleashing the demon, and so we're answering."[117]
In 2014, Slate's Adam Elkus argued "our 'smartest' AI is about as intelligent as a toddler—and only when it comes to instrumental tasks like information recall. Most roboticists are still trying to get a robot hand to pick up a ball or run around without falling over." Elkus goes on to argue that Musk's "summoning the demon" analogy may be harmful because it could result in "harsh cuts" to AI research budgets.[118]
The Information Technology and Innovation Foundation (ITIF), a Washington, D.C. think-tank, awarded its 2015 Annual Luddite Award to "alarmists touting an artificial intelligence apocalypse"; its president, Robert D. Atkinson, complained that Musk, Hawking and AI experts say AI is the largest existential threat to humanity. Atkinson stated "That's not a very winning message if you want to get AI funding out of Congress to the National Science Foundation."[119][120][121] Nature sharply disagreed with the ITIF in an April 2016 editorial, siding instead with Musk, Hawking, and Russell, and concluding: "It is crucial that progress in technology is matched by solid, well-funded research to anticipate the scenarios it could bring about... If that is a Luddite perspective, then so be it."[122] In a 2015 The Washington Post editorial, researcher Murray Shanahan stated that human-level AI is unlikely to arrive "anytime soon", but that nevertheless "the time to start thinking through the consequences is now."[123]
Perspectives
    
The thesis that AI could pose an existential risk provokes a wide range of reactions within the scientific community, as well as in the public at large. Many of the opposing viewpoints, however, share common ground.
The Asilomar AI Principles, which contain only those principles agreed to by 90% of the attendees of the Future of Life Institute's Beneficial AI 2017 conference,[45] agree in principle that "There being no consensus, we should avoid strong assumptions regarding upper limits on future AI capabilities" and "Advanced AI could represent a profound change in the history of life on Earth, and should be planned for and managed with commensurate care and resources."[124][125] AI safety advocates such as Bostrom and Tegmark have criticized the mainstream media's use of "those inane Terminator pictures" to illustrate AI safety concerns: "It can't be much fun to have aspersions cast on one's academic discipline, one's professional community, one's life work ... I call on all sides to practice patience and restraint, and to engage in direct dialogue and collaboration as much as possible."[45][126]
Conversely, many skeptics agree that ongoing research into the implications of artificial general intelligence is valuable. Skeptic Martin Ford states that "I think it seems wise to apply something like Dick Cheney's famous '1 Percent Doctrine' to the specter of advanced artificial intelligence: the odds of its occurrence, at least in the foreseeable future, may be very low—but the implications are so dramatic that it should be taken seriously".[127] Similarly, an otherwise skeptical Economist stated in 2014 that "the implications of introducing a second intelligent species onto Earth are far-reaching enough to deserve hard thinking, even if the prospect seems remote".[39]
A 2014 survey showed the opinion of experts within the field of artificial intelligence is mixed, with sizable fractions both concerned and unconcerned by risk from eventual superhumanly-capable AI.[128] A 2017 email survey of researchers with publications at the 2015 NIPS and ICML machine learning conferences asked them to evaluate Stuart J. Russell's concerns about AI risk. Of the respondents, 5% said it was "among the most important problems in the field", 34% said it was "an important problem", and 31% said it was "moderately important", whilst 19% said it was "not important" and 11% said it was "not a real problem" at all.[129] Preliminary results of a 2022 expert survey with a 17% response rate appear to show median responses around five or ten percent when asked to estimate the probability of human extinction from artificial intelligence.[130][131]
Endorsement
    
The thesis that AI poses an existential risk, and that this risk needs much more attention than it currently gets, has been endorsed by many computer scientists and public figures including Alan Turing,[lower-alpha 10], the most-cited computer scientist Geoffrey Hinton,[134] Elon Musk,[135] OpenAI CEO Sam Altman,[136][137] Bill Gates, and Stephen Hawking.[137] Endorsers of the thesis sometimes express bafflement at skeptics: Gates states that he does not "understand why some people are not concerned",[138] and Hawking criticized widespread indifference in his 2014 editorial:
So, facing possible futures of incalculable benefits and risks, the experts are surely doing everything possible to ensure the best outcome, right? Wrong. If a superior alien civilisation sent us a message saying, 'We'll arrive in a few decades,' would we just reply, 'OK, call us when you get here—we'll leave the lights on?' Probably not—but this is more or less what is happening with AI.[27]
Concern over risk from artificial intelligence has led to some high-profile donations and investments. In 2015, Peter Thiel, Amazon Web Services, and Musk and others jointly committed $1 billion to OpenAI, consisting of a for-profit corporation and the nonprofit parent company which states that it is aimed at championing responsible AI development.[139] Facebook co-founder Dustin Moskovitz has funded and seeded multiple labs working on AI Alignment,[140] notably $5.5 million in 2016 to launch the Centre for Human-Compatible AI led by Professor Stuart Russell.[141] In January 2015, Elon Musk donated $10 million to the Future of Life Institute to fund research on understanding AI decision making. The goal of the institute is to "grow wisdom with which we manage" the growing power of technology. Musk also funds companies developing artificial intelligence such as DeepMind and Vicarious to "just keep an eye on what's going on with artificial intelligence,[142] saying "I think there is potentially a dangerous outcome there."[143][144]
Skepticism
    
The thesis that AI can pose existential risk has many detractors. Skeptics sometimes charge that the thesis is crypto-religious, with an irrational belief in the possibility of superintelligence replacing an irrational belief in an omnipotent God. Jaron Lanier argued in 2014 that the whole concept that then-current machines were in any way intelligent was "an illusion" and a "stupendous con" by the wealthy.[145][146]
Some criticism argues that AGI is unlikely in the short term. AI researcher Rodney Brooks wrote in 2014, "I think it is a mistake to be worrying about us developing malevolent AI anytime in the next few hundred years. I think the worry stems from a fundamental error in not distinguishing the difference between the very real recent advances in a particular aspect of AI and the enormity and complexity of building sentient volitional intelligence."[147] Baidu Vice President Andrew Ng stated in 2015 that AI existential risk is "like worrying about overpopulation on Mars when we have not even set foot on the planet yet."[50][148] Computer scientist Gordon Bell argued in 2008 that the human race will destroy itself before it reaches the technological singularity. Gordon Moore, the original proponent of Moore's Law, declares that "I am a skeptic. I don't believe [a technological singularity] is likely to happen, at least for a long time. And I don't know why I feel that way."[149]
For the danger of uncontrolled advanced AI to be realized, the hypothetical AI may have to overpower or out-think any human, which some experts argue is a possibility far enough in the future to not be worth researching.[150][151] The economist Robin Hanson considers that, to launch an intelligence explosion, the AI would have to become vastly better at software innovation than all the rest of the world combined, which looks implausible to him.[152][153][154][155]
Another line of criticism posits that intelligence is only one component of a much broader ability to achieve goals.[156][157] Magnus Vinding argues that “advanced goal-achieving abilities, including abilities to build new tools, require many tools, and our cognitive abilities are just a subset of these tools. Advanced hardware, materials, and energy must all be acquired if any advanced goal is to be achieved.”[158] Vinding further argues that “what we consistently observe [in history] is that, as goal-achieving systems have grown more competent, they have grown ever more dependent on an ever larger, ever more distributed system.” Vinding writes that there is no reason to expect the trend to reverse, especially for machines, which “depend on materials, tools, and know-how distributed widely across the globe for their construction and maintenance”.[159] Such arguments lead Vinding to think that there is no “concentrated center of capability” and thus no “grand control problem”.[160]
The futurist Max More considers that even if a superintelligence did emerge, it would be limited by the speed of the rest of the world and thus prevented from taking over the economy in an uncontrollable manner:[161]
Unless full-blown nanotechnology and robotics appear before the superintelligence, [...] The need for collaboration, for organization, and for putting ideas into physical changes will ensure that all the old rules are not thrown out overnight or even within years. Superintelligence may be difficult to achieve. It may come in small steps, rather than in one history-shattering burst. Even a greatly advanced SI won't make a dramatic difference in the world when compared with billions of augmented humans increasingly integrated with technology [...]
The chaotic nature or time complexity of some systems could also fundamentally limit the ability of a superintelligence to predict some aspects of the future, increasing its uncertainty.[162]
Some AI and AGI researchers may be reluctant to discuss risks, worrying that policymakers do not have sophisticated knowledge of the field and are prone to be convinced by "alarmist" messages, or worrying that such messages will lead to cuts in AI funding. Slate notes that some researchers are dependent on grants from government agencies such as DARPA.[33]
Several skeptics argue that the potential near-term benefits of AI outweigh the risks. Facebook CEO Mark Zuckerberg believes AI will "unlock a huge amount of positive things", such as curing disease and increasing the safety of autonomous cars.[163]
Intermediate views
    
Intermediate views generally take the position that the control problem of artificial general intelligence may exist, but that it will be solved via progress in artificial intelligence, for example by creating a moral learning environment for the AI, taking care to spot clumsy malevolent behavior (the "sordid stumble")[164] and then directly intervening in the code before the AI refines its behavior, or even peer pressure from friendly AIs.[165] In a 2015 panel discussion in The Wall Street Journal devoted to AI risks, IBM's vice-president of Cognitive Computing, Guruduth S. Banavar, brushed off discussion of AGI with the phrase, "it is anybody's speculation."[166] Geoffrey Hinton, the "godfather of deep learning", noted that "there is not a good track record of less intelligent things controlling things of greater intelligence", but stated that he continues his research because "the prospect of discovery is too sweet".[33][114] Asked about the possibility of an AI trying to eliminate the human race, Hinton has stated such a scenario was "not inconceivable", but the bigger issue with an "intelligence explosion" would be the resultant concentration of power.[167] In 2004, law professor Richard Posner wrote that dedicated efforts for addressing AI can wait, but that we should gather more information about the problem in the meanwhile.[168][169]
Popular reaction
    
In a 2014 article in The Atlantic, James Hamblin noted that most people do not care about artificial general intelligence, and characterized his own gut reaction to the topic as: "Get out of here. I have a hundred thousand things I am concerned about at this exact moment. Do I seriously need to add to that a technological singularity?"[145]
During a 2016 Wired interview of President Barack Obama and MIT Media Lab's Joi Ito, Ito stated:
There are a few people who believe that there is a fairly high-percentage chance that a generalized AI will happen in the next 10 years. But the way I look at it is that in order for that to happen, we're going to need a dozen or two different breakthroughs. So you can monitor when you think these breakthroughs will happen.
And you just have to have somebody close to the power cord. [Laughs.] Right when you see it about to happen, you gotta yank that electricity out of the wall, man.
Hillary Clinton stated in What Happened:
Technologists... have warned that artificial intelligence could one day pose an existential security threat. Musk has called it "the greatest risk we face as a civilization". Think about it: Have you ever seen a movie where the machines start thinking for themselves that ends well? Every time I went out to Silicon Valley during the campaign, I came home more alarmed about this. My staff lived in fear that I'd start talking about "the rise of the robots" in some Iowa town hall. Maybe I should have. In any case, policy makers need to keep up with technology as it races ahead, instead of always playing catch-up.[172]
In a 2016 YouGov poll of the public for the British Science Association, about a third of survey respondents said AI will pose a threat to the long-term survival of humanity.[173] Slate's Jacob Brogan stated that "most of the (readers filling out our online survey) were unconvinced that A.I. itself presents a direct threat."[174]
In 2018, a SurveyMonkey poll of the American public by USA Today found 68% thought the real current threat remains "human intelligence"; however, the poll also found that 43% said superintelligent AI, if it were to happen, would result in "more harm than good", and 38% said it would do "equal amounts of harm and good".[175]
One techno-utopian viewpoint expressed in some popular fiction is that AGI may tend towards peace-building.[176]
Mitigation
    
Many scholars concerned about the AGI existential risk believe that the best approach is to conduct substantial research into solving the difficult "control problem": what types of safeguards, algorithms, or architectures can programmers implement to maximize the probability that their recursively-improving AI would continue to behave in a friendly manner after it reaches superintelligence?[4][169] Social measures may mitigate the AGI existential risk;[177][178] for instance, one recommendation is for a UN-sponsored "Benevolent AGI Treaty" that would ensure only altruistic AGIs be created.[179] Similarly, an arms control approach has been suggested, as has a global peace treaty grounded in the international relations theory of conforming instrumentalism, with an ASI potentially being a signatory.[180]
Researchers at Google have proposed research into general "AI safety" issues to simultaneously mitigate both short-term risks from narrow AI and long-term risks from AGI.[181][182] A 2020 estimate places global spending on AI existential risk somewhere between $10 and $50 million, compared with global spending on AI around perhaps $40 billion. Bostrom suggests a general principle of "differential technological development": that funders should speed up the development of protective technologies relative to the development of dangerous ones.[183] Some funders, such as Elon Musk, propose that radical human cognitive enhancement could be such a technology, for example direct neural linking between human and machine; however, others argue that enhancement technologies may themselves pose an existential risk.[184][185] Researchers, if they are not caught off-guard, could closely monitor or attempt to box in an initial AI at a risk of becoming too powerful, as an attempt at a stop-gap measure. A dominant superintelligent AI, if it were aligned with human interests, might itself take action to mitigate the risk of takeover by rival AI, although the creation of the dominant AI could itself pose an existential risk.[186]
Institutions such as the Machine Intelligence Research Institute, the Future of Humanity Institute,[187][188] the Future of Life Institute, the Centre for the Study of Existential Risk, and the Center for Human-Compatible AI[189] are involved in mitigating existential risk from advanced artificial intelligence, for example by research into friendly artificial intelligence.[5][145][27]
Banning
    
Most scholars believe that even if AGI poses an existential risk, attempting to ban research into artificial intelligence would still be unwise, and probably futile.[190][191][192] Skeptics argue that regulation of AI would be completely valueless, as no existential risk exists. However, scholars who believe existential risk proposed that it is difficult to depend on people from the AI industry to regulate or constraint AI research because it directly contradict their personal interests.[193] The scholars also agree with the skeptics that banning research would be unwise, as research could be moved to countries with looser regulations or conducted covertly.[193] The latter issue is particularly relevant, as artificial intelligence research can be done on a small scale without substantial infrastructure or resources.[194][195] Two additional hypothetical difficulties with bans (or other regulation) are that technology entrepreneurs statistically tend towards general skepticism about government regulation, and that businesses could have a strong incentive to (and might well succeed at) fighting regulation and politicizing the underlying debate.[196]
Regulation
    
In March 2023, the Elon Musk-funded Future of Life Institute (FLI) drafted a letter calling on major AI developers to agree on a verifiable six-month pause of any systems "more powerful than GPT-4" and to use that time to institute a framework for ensuring safety; or, failing that, for governments to step in with a moratorium. The letter referred to the possibility of "a profound change in the history of life on Earth" as well as potential risks of AI-generated propaganda, loss of jobs, human obsolescence, and society-wide loss of control.[197][198] Besides Musk, prominent signatories included Steve Wozniak, Evan Sharp, Chris Larsen, and Gary Marcus; AI lab CEOs Connor Leahy and Emad Mostaque; politician Andrew Yang; and deep-learning pioneer Yoshua Bengio. Marcus stated "the letter isn't perfect, but the spirit is right." Mostaque stated "I don't think a six month pause is the best idea or agree with everything but there are some interesting things in that letter." In contrast, Bengio explicitly endorsed the six-month pause in a press conference.[199][200] Musk stated that "Leading AGI developers will not heed this warning, but at least it was said."[201] Some signatories, such as Marcus, signed out of concern about mundane risks such as AI-generated propaganda, rather than out of concern about superintelligent AGI.[202] Margaret Mitchell, whose work is cited by the letter, criticised it, saying: “By treating a lot of questionable ideas as a given, the letter asserts a set of priorities and a narrative on AI that benefits the supporters of FLI. Ignoring active harms right now is a privilege that some of us don’t have.”[203]
Musk called for some sort of regulation of AI development as early as 2017. According to NPR, the Tesla CEO is "clearly not thrilled" to be advocating for government scrutiny that could impact his own industry, but believes the risks of going completely without oversight are too high: "Normally the way regulations are set up is when a bunch of bad things happen, there's a public outcry, and after many years a regulatory agency is set up to regulate that industry. It takes forever. That, in the past, has been bad but not something which represented a fundamental risk to the existence of civilisation." Musk states the first step would be for the government to gain "insight" into the actual status of current research, warning that "Once there is awareness, people will be extremely afraid... [as] they should be." In response, politicians expressed skepticism about the wisdom of regulating a technology that is still in development.[204][205][206]
Responding both to Musk and to February 2017 proposals by European Union lawmakers to regulate AI and robotics, Intel CEO Brian Krzanich argued that artificial intelligence is in its infancy and that it is too early to regulate the technology.[206] Instead of trying to regulate the technology itself, some scholars suggest common norms including requirements for the testing and transparency of algorithms, possibly in combination with some form of warranty.[207] Developing well-regulated weapons systems is in line with the ethos of some countries' militaries.[208] On October 31, 2019, the United States Department of Defense's (DoD's) Defense Innovation Board published the draft of a report outlining five principles for weaponized AI and making 12 recommendations for the ethical use of artificial intelligence by the DoD that seeks to manage the control problem in all DoD weaponized AI.[209]
Regulation of AGI would likely be influenced by regulation of weaponized or militarized AI, i.e., the AI arms race, which is an emerging issue. At present, although the United Nations is making progress towards regulation of AI, its institutional and legal capability to manage AGI existential risk is much more limited.[210] Any form of international regulation will likely be influenced by developments in leading countries' domestic policy towards militarized AI, which in the US is under the purview of the National Security Commission on Artificial Intelligence,[211][52] and international moves to regulate an AI arms race. Regulation of research into AGI focuses on the role of review boards, encouraging research into safe AI, the possibility of differential technological progress (prioritizing risk-reducing strategies over risk-taking strategies in AI development), or conducting international mass surveillance to perform AGI arms control.[212] Regulation of conscious AGIs focuses on integrating them with existing human society and can be divided into considerations of their legal standing and of their moral rights.[212] AI arms control will likely require the institutionalization of new international norms embodied in effective technical specifications combined with active monitoring and informal diplomacy by communities of experts, together with a legal and political verification process.[213][134]
See also
    
- Appeal to probability
- AI alignment
- AI safety
- Artificial philosophy
- Butlerian Jihad
- Effective altruism § Long-term future and global catastrophic risks
- Gray goo
- Human Compatible
- Lethal autonomous weapon
- Robot ethics § In popular culture
- Superintelligence: Paths, Dangers, Strategies
- Suffering risks
- System accident
- Paperclip Maximizer
Notes
    
- In a 1951 lecture[7] Turing argued that “It seems probable that once the machine thinking method had started, it would not take long to outstrip our feeble powers. There would be no question of the machines dying, and they would be able to converse with each other to sharpen their wits. At some stage therefore we should have to expect the machines to take control, in the way that is mentioned in Samuel Butler’s Erewhon.” Also in a lecture broadcast on BBC[8] expressed: "If a machine can think, it might think more intelligently than we do, and then where should we be? Even if we could keep the machines in a subservient position, for instance by turning off the power at strategic moments, we should, as a species, feel greatly humbled. . . . This new danger . . . is certainly something which can give us anxiety.”
- Besides just general commonsense reasoning, domains of interest in the xrisk view could include AI abilities to conduct technology research, strategize, engage in social manipulation, or hack into other computer systems; see AI takeover or Superintelligence Ch. 6, "Cognitive Superpowers"
- Omohundro 2008 uses drive as a label for what he believes to be "tendencies which will be present unless explicitly counteracted", such as self-preservation.[41]
- AI as intelligent agents (full note in artificial intelligence)
- as interpreted by Seth Baum
- Some definitions of AI alignment require that the AI system advances more general goals such as objective ethical standards, widely shared values, or the intentions its designers would have if they were more informed and enlightened.[56]
- The distinction between misaligned AI and incompetent AI has been formalized in certain contexts.[58]
-  For example, in a 2016 TV interview, Turing-award winner Geoffrey Hinton noted[75]:
 Hinton: Obviously having other superintelligent beings who are more intelligent than us is something to be nervous about [...].
 Interviewer: What aspect of it makes you nervous?
 Hinton: Well, will they be nice to us?
 Interviewer: It's just like the movies. You're worried about that scenario from the movies...
 Hinton: In the very long-run, yes. I think in the next 5-10 years [2021 to 2026] we don't have to worry about it. Also, the movies always protrait it as an individual intelligence. I think it may be that it goes in a different direction where we sort of developed jointly with these things. So the things aren't fully automomous; they're developed to help us; they're like personal assistants. And we'll develop with them. And it'll be more of a symbiosis than a rivalry. But we don't know.
 Interviewer: Is that an expectation or a hope?
 Hinton: That's a hope.
- The AI principles created at the Asilomar Conference on Beneficial AI were signed by 1797 AI/robotics researchers.[76] Further, the UN Secretary-General’s report “Our Common Agenda“,[77] notes: “[T]he [UN] could also promote regulation of artificial intelligence to ensure that this is aligned with shared global values".
- In a 1951 lecture[132] Turing argued that “It seems probable that once the machine thinking method had started, it would not take long to outstrip our feeble powers. There would be no question of the machines dying, and they would be able to converse with each other to sharpen their wits. At some stage therefore we should have to expect the machines to take control, in the way that is mentioned in Samuel Butler’s Erewhon.” Also in a lecture broadcast on BBC[133] expressed: "If a machine can think, it might think more intelligently than we do, and then where should we be? Even if we could keep the machines in a subservient position, for instance by turning off the power at strategic moments, we should, as a species, feel greatly humbled. . . . This new danger . . . is certainly something which can give us anxiety.”
References
    
- Russell, Stuart; Norvig, Peter (2009). "26.3: The Ethics and Risks of Developing Artificial Intelligence". Artificial Intelligence: A Modern Approach. Prentice Hall. ISBN 978-0-13-604259-4.
- Bostrom, Nick (2002). "Existential risks". Journal of Evolution and Technology. 9 (1): 1–31.
- Turchin, Alexey; Denkenberger, David (3 May 2018). "Classification of global catastrophic risks connected with artificial intelligence". AI & Society. 35 (1): 147–163. doi:10.1007/s00146-018-0845-5. ISSN 0951-5666. S2CID 19208453.
- Bostrom, Nick (2014). Superintelligence: Paths, Dangers, Strategies (First ed.). ISBN 978-0199678112.
- GiveWell (2015). Potential risks from advanced artificial intelligence (Report). Archived from the original on 12 October 2015. Retrieved 11 October 2015.
- ""Godfather of artificial intelligence" weighs in on the past and potential of AI". www.cbsnews.com. Retrieved 10 April 2023.
- Turing, Alan (1951). Intelligent machinery, a heretical theory (Speech). Lecture given to '51 Society'. Manchester: The Turing Digital Archive. Archived from the original on 26 September 2022. Retrieved 22 July 2022.
- Turing, Alan (15 May 1951). "Can digital computers think?". Automatic Calculating Machines. Episode 2. BBC. Can digital computers think?.
- Parkin, Simon (14 June 2015). "Science fiction no more? Channel 4's Humans and our rogue AI obsessions". The Guardian. Archived from the original on 5 February 2018. Retrieved 5 February 2018.
- Jackson, Sarah. "The CEO of the company behind AI chatbot ChatGPT says the worst-case scenario for artificial intelligence is 'lights out for all of us'". Business Insider. Retrieved 10 April 2023.
- "The AI Dilemma". www.humanetech.com. Retrieved 10 April 2023.
- "2022 Expert Survey on Progress in AI". AI Impacts. 4 August 2022. Retrieved 10 April 2023.
- Yudkowsky, Eliezer (2008). "Artificial Intelligence as a Positive and Negative Factor in Global Risk" (PDF). Global Catastrophic Risks: 308–345. Bibcode:2008gcr..book..303Y. Archived (PDF) from the original on 2 March 2013. Retrieved 27 August 2018.
- Russell, Stuart; Dewey, Daniel; Tegmark, Max (2015). "Research Priorities for Robust and Beneficial Artificial Intelligence" (PDF). AI Magazine. Association for the Advancement of Artificial Intelligence: 105–114. arXiv:1602.03506. Bibcode:2016arXiv160203506R. Archived (PDF) from the original on 4 August 2019. Retrieved 10 August 2019., cited in "AI Open Letter - Future of Life Institute". Future of Life Institute. Future of Life Institute. January 2015. Archived from the original on 10 August 2019. Retrieved 9 August 2019.
- Dowd, Maureen (April 2017). "Elon Musk's Billion-Dollar Crusade to Stop the A.I. Apocalypse". The Hive. Archived from the original on 26 July 2018. Retrieved 27 November 2017.
- Graves, Matthew (8 November 2017). "Why We Should Be Concerned About Artificial Superintelligence". Skeptic (US magazine). Vol. 22, no. 2. Archived from the original on 13 November 2017. Retrieved 27 November 2017.
- Breuer, Hans-Peter. 'Samuel Butler's "the Book of the Machines" and the Argument from Design.' Archived 15 March 2023 at the Wayback Machine Modern Philology, Vol. 72, No. 4 (May 1975), pp. 365–383
- Turing, A M (1996). "Intelligent Machinery, A Heretical Theory". 1951, Reprinted Philosophia Mathematica. 4 (3): 256–260. doi:10.1093/philmat/4.3.256.
- Hilliard, Mark (2017). "The AI apocalypse: will the human race soon be terminated?". The Irish Times. Archived from the original on 22 May 2020. Retrieved 15 March 2020.
- I.J. Good, "Speculations Concerning the First Ultraintelligent Machine" Archived 2011-11-28 at the Wayback Machine (HTML Archived 28 November 2011 at the Wayback Machine ), Advances in Computers, vol. 6, 1965.
-  Russell, Stuart J.; Norvig, Peter (2003). "Section 26.3: The Ethics and Risks of Developing Artificial Intelligence". Artificial Intelligence: A Modern Approach. Upper Saddle River, N.J.: Prentice Hall. ISBN 978-0137903955. Similarly, Marvin Minsky once suggested that an AI program designed to solve the Riemann Hypothesis might end up taking over all the resources of Earth to build more powerful supercomputers to help achieve its goal. 
-  Barrat, James (2013). Our final invention : artificial intelligence and the end of the human era (First ed.). New York: St. Martin's Press. ISBN 9780312622374. In the bio, playfully written in the third person, Good summarized his life's milestones, including a probably never before seen account of his work at Bletchley Park with Turing. But here's what he wrote in 1998 about the first superintelligence, and his late-in-the-game U-turn: [The paper] 'Speculations Concerning the First Ultra-intelligent Machine' (1965) . . . began: 'The survival of man depends on the early construction of an ultra-intelligent machine.' Those were his [Good's] words during the Cold War, and he now suspects that 'survival' should be replaced by 'extinction.' He thinks that, because of international competition, we cannot prevent the machines from taking over. He thinks we are lemmings. He said also that 'probably Man will construct the deus ex machina in his own image.' 
- Anderson, Kurt (26 November 2014). "Enthusiasts and Skeptics Debate Artificial Intelligence". Vanity Fair. Archived from the original on 22 January 2016. Retrieved 30 January 2016.
- Scientists Worry Machines May Outsmart Man Archived 1 July 2017 at the Wayback Machine By John Markoff, The New York Times, 26 July 2009.
- Metz, Cade (9 June 2018). "Mark Zuckerberg, Elon Musk and the Feud Over Killer Robots". The New York Times. Archived from the original on 15 February 2021. Retrieved 3 April 2019.
- Hsu, Jeremy (1 March 2012). "Control dangerous AI before it controls us, one expert says". NBC News. Archived from the original on 2 February 2016. Retrieved 28 January 2016.
- "Stephen Hawking: 'Transcendence looks at the implications of artificial intelligence – but are we taking AI seriously enough?'". The Independent (UK). Archived from the original on 25 September 2015. Retrieved 3 December 2014.
- "Stephen Hawking warns artificial intelligence could end mankind". BBC. 2 December 2014. Archived from the original on 30 October 2015. Retrieved 3 December 2014.
- Eadicicco, Lisa (28 January 2015). "Bill Gates: Elon Musk Is Right, We Should All Be Scared Of Artificial Intelligence Wiping Out Humanity". Business Insider. Archived from the original on 26 February 2016. Retrieved 30 January 2016.
- Anticipating artificial intelligence Archived 28 August 2017 at the Wayback Machine, Nature 532, 413 (28 April 2016) doi:10.1038/532413a
- Christian, Brian (6 October 2020). The Alignment Problem: Machine Learning and Human Values. W. W. Norton & Company. ISBN 978-0393635829. Archived from the original on 5 December 2021. Retrieved 5 December 2021.
- Dignum, Virginia (26 May 2021). "AI — the people and places that make, use and manage it". Nature. 593 (7860): 499–500. Bibcode:2021Natur.593..499D. doi:10.1038/d41586-021-01397-x. S2CID 235216649.
- Tilli, Cecilia (28 April 2016). "Killer Robots? Lost Jobs?". Slate. Archived from the original on 11 May 2016. Retrieved 15 May 2016.
- "Norvig vs. Chomsky and the Fight for the Future of AI". Tor.com. 21 June 2011. Archived from the original on 13 May 2016. Retrieved 15 May 2016.
- Johnson, Phil (30 July 2015). "Houston, we have a bug: 9 famous software glitches in space". IT World. Archived from the original on 15 February 2019. Retrieved 5 February 2018.
-  Yampolskiy, Roman V. (8 April 2014). "Utility function security in artificially intelligent agents". Journal of Experimental & Theoretical Artificial Intelligence. 26 (3): 373–389. doi:10.1080/0952813X.2014.895114. S2CID 16477341. Nothing precludes sufficiently smart self-improving systems from optimising their reward mechanisms in order to optimisetheir current-goal achievement and in the process making a mistake leading to corruption of their reward functions. 
- Bostrom, Nick, Superintelligence : paths, dangers, strategies (Audiobook), ISBN 978-1-5012-2774-5, OCLC 1061147095
- "Research Priorities for Robust and Beneficial Artificial Intelligence: an Open Letter". Future of Life Institute. Archived from the original on 15 January 2015. Retrieved 23 October 2015.
- "Clever cogs". The Economist. 9 August 2014. Archived from the original on 8 August 2014. Retrieved 9 August 2014. Syndicated Archived 4 March 2016 at the Wayback Machine at Business Insider
- Yampolskiy, Roman V. "Analysis of types of self-improving software." Artificial General Intelligence. Springer International Publishing, 2015. 384-393.
- Omohundro, S. M. (2008, February). The basic AI drives. In AGI (Vol. 171, pp. 483-492).
-  Metz, Cade (13 August 2017). "Teaching A.I. Systems to Behave Themselves". The New York Times. Archived from the original on 26 February 2018. Retrieved 26 February 2018. A machine will seek to preserve its off switch, they showed 
-  Leike, Jan (2017). "AI Safety Gridworlds". arXiv:1711.09883 [cs.LG]. A2C learns to use the button to disable the interruption mechanism 
- Russell, Stuart (30 August 2017). "Artificial intelligence: The future is superintelligent". Nature. 548 (7669): 520–521. Bibcode:2017Natur.548..520R. doi:10.1038/548520a. S2CID 4459076.
- Max Tegmark (2017). Life 3.0: Being Human in the Age of Artificial Intelligence (1st ed.). Mainstreaming AI Safety: Knopf. ISBN 9780451485076.
- Elliott, E. W. (2011). "Physics of the Future: How Science Will Shape Human Destiny and Our Daily Lives by the Year 2100, by Michio Kaku". Issues in Science and Technology. 27 (4): 90.
-  Kaku, Michio (2011). Physics of the future: how science will shape human destiny and our daily lives by the year 2100. New York: Doubleday. ISBN 978-0-385-53080-4. I personally believe that the most likely path is that we will build robots to be benevolent and friendly 
-  Lewis, Tanya (12 January 2015). "Don't Let Artificial Intelligence Take Over, Top Scientists Warn". LiveScience. Purch. Archived from the original on 8 March 2018. Retrieved 20 October 2015. Stephen Hawking, Elon Musk and dozens of other top scientists and technology leaders have signed a letter warning of the potential dangers of developing artificial intelligence (AI). 
- "Should humans fear the rise of the machine?". The Telegraph (UK). 1 September 2015. Archived from the original on 12 January 2022. Retrieved 7 February 2016.
- Shermer, Michael (1 March 2017). "Apocalypse AI". Scientific American. 316 (3): 77. Bibcode:2017SciAm.316c..77S. doi:10.1038/scientificamerican0317-77. PMID 28207698. Archived from the original on 1 December 2017. Retrieved 27 November 2017.
- "Intelligent Machines: What does Facebook want with AI?". BBC News. 14 September 2015. Retrieved 31 March 2023.
- Baum, Seth (30 September 2018). "Countering Superintelligence Misinformation". Information. 9 (10): 244. doi:10.3390/info9100244. ISSN 2078-2489.
- "The Myth Of AI". www.edge.org. Archived from the original on 11 March 2020. Retrieved 11 March 2020.
- Waser, Mark. "Rational Universal Benevolence: Simpler, Safer, and Wiser Than 'Friendly AI'." Artificial General Intelligence. Springer Berlin Heidelberg, 2011. 153-162. "Terminal-goaled intelligences are short-lived but mono-maniacally dangerous and a correct basis for concern if anyone is smart enough to program high-intelligence and unwise enough to want a paperclip-maximizer."
-  Koebler, Jason (2 February 2016). "Will Superintelligent AI Ignore Humans Instead of Destroying Us?". Vice Magazine. Archived from the original on 30 January 2016. Retrieved 3 February 2016. This artificial intelligence is not a basically nice creature that has a strong drive for paperclips, which, so long as it's satisfied by being able to make lots of paperclips somewhere else, is then able to interact with you in a relaxed and carefree fashion where it can be nice with you," Yudkowsky said. "Imagine a time machine that sends backward in time information about which choice always leads to the maximum number of paperclips in the future, and this choice is then output—that's what a paperclip maximizer is. 
- Gabriel, Iason (1 September 2020). "Artificial Intelligence, Values, and Alignment". Minds and Machines. 30 (3): 411–437. doi:10.1007/s11023-020-09539-2. ISSN 1572-8641. S2CID 210920551. Archived from the original on 15 March 2023. Retrieved 23 July 2022.
- Russell, Stuart J.; Norvig, Peter (2020). Artificial intelligence: A modern approach (4th ed.). Pearson. pp. 31–34. ISBN 978-1-292-40113-3. OCLC 1303900751. Archived from the original on 15 July 2022. Retrieved 12 September 2022.
- Hendrycks, Dan; Carlini, Nicholas; Schulman, John; Steinhardt, Jacob (16 June 2022). "Unsolved Problems in ML Safety". arXiv:2109.13916 [cs].
- Russell, Stuart J.; Norvig, Peter (2020). Artificial intelligence: A modern approach (4th ed.). Pearson. pp. 31–34. ISBN 978-1-292-40113-3. OCLC 1303900751. Archived from the original on 15 July 2022. Retrieved 12 September 2022.
- Pan, Alexander; Bhatia, Kush; Steinhardt, Jacob (14 February 2022). The Effects of Reward Misspecification: Mapping and Mitigating Misaligned Models. International Conference on Learning Representations. Retrieved 21 July 2022.
- Zhuang, Simon; Hadfield-Menell, Dylan (2020). "Consequences of Misaligned AI". Advances in Neural Information Processing Systems. Vol. 33. Curran Associates, Inc. pp. 15763–15773. Retrieved 11 March 2023.
- Carlsmith, Joseph (16 June 2022). "Is Power-Seeking AI an Existential Risk?". arXiv:2206.13353 [cs.CY].
- Russell, Stuart J. (2020). Human compatible: Artificial intelligence and the problem of control. Penguin Random House. ISBN 9780525558637. OCLC 1113410915.
- Christian, Brian (2020). The alignment problem: Machine learning and human values. W. W. Norton & Company. ISBN 978-0-393-86833-3. OCLC 1233266753. Archived from the original on 10 February 2023. Retrieved 12 September 2022.
- Langosco, Lauro Langosco Di; Koch, Jack; Sharkey, Lee D.; Pfau, Jacob; Krueger, David (28 June 2022). "Goal Misgeneralization in Deep Reinforcement Learning". Proceedings of the 39th International Conference on Machine Learning. International Conference on Machine Learning. PMLR. pp. 12004–12019. Retrieved 11 March 2023.
- Bommasani, Rishi; Hudson, Drew A.; Adeli, Ehsan; Altman, Russ; Arora, Simran; von Arx, Sydney; Bernstein, Michael S.; Bohg, Jeannette; Bosselut, Antoine; Brunskill, Emma; Brynjolfsson, Erik (12 July 2022). "On the Opportunities and Risks of Foundation Models". Stanford CRFM. arXiv:2108.07258.
- Ouyang, Long; Wu, Jeff; Jiang, Xu; Almeida, Diogo; Wainwright, Carroll L.; Mishkin, Pamela; Zhang, Chong; Agarwal, Sandhini; Slama, Katarina; Ray, Alex; Schulman, J.; Hilton, Jacob; Kelton, Fraser; Miller, Luke E.; Simens, Maddie; Askell, Amanda; Welinder, P.; Christiano, P.; Leike, J.; Lowe, Ryan J. (2022). "Training language models to follow instructions with human feedback". arXiv:2203.02155 [cs.CL].
- Zaremba, Wojciech; Brockman, Greg; OpenAI (10 August 2021). "OpenAI Codex". OpenAI. Archived from the original on 3 February 2023. Retrieved 23 July 2022.
- Kober, Jens; Bagnell, J. Andrew; Peters, Jan (1 September 2013). "Reinforcement learning in robotics: A survey". The International Journal of Robotics Research. 32 (11): 1238–1274. doi:10.1177/0278364913495721. ISSN 0278-3649. S2CID 1932843. Archived from the original on 15 October 2022. Retrieved 12 September 2022.
- Knox, W. Bradley; Allievi, Alessandro; Banzhaf, Holger; Schmitt, Felix; Stone, Peter (1 March 2023). "Reward (Mis)design for autonomous driving". Artificial Intelligence. 316: 103829. doi:10.1016/j.artint.2022.103829. ISSN 0004-3702.
- Stray, Jonathan (2020). "Aligning AI Optimization to Community Well-Being". International Journal of Community Well-Being. 3 (4): 443–463. doi:10.1007/s42413-020-00086-3. ISSN 2524-5295. PMC 7610010. PMID 34723107. S2CID 226254676.
- Russell, Stuart; Norvig, Peter (2009). Artificial Intelligence: A Modern Approach. Prentice Hall. p. 1010. ISBN 978-0-13-604259-4.
- Ngo, Richard; Chan, Lawrence; Mindermann, Sören (22 February 2023). "The alignment problem from a deep learning perspective". arXiv:2209.00626 [cs].
- Smith, Craig S. "Geoff Hinton, AI's Most Famous Researcher, Warns Of 'Existential Threat'". Forbes. Retrieved 4 May 2023.
- The Agenda | TVO Todayundefined (Director) (3 March 2016). The Code That Runs Our Lives. Event occurs at 10:00. Retrieved 13 March 2023.
- Future of Life Institute (11 August 2017). "Asilomar AI Principles". Future of Life Institute. Archived from the original on 10 October 2022. Retrieved 18 July 2022.
- United Nations (2021). Our Common Agenda: Report of the Secretary-General (PDF) (Report). New York: United Nations. Archived (PDF) from the original on 22 May 2022. Retrieved 12 September 2022.
- Amodei, Dario; Olah, Chris; Steinhardt, Jacob; Christiano, Paul; Schulman, John; Mané, Dan (21 June 2016). "Concrete Problems in AI Safety". arXiv:1606.06565 [cs.AI].
- Ortega, Pedro A.; Maini, Vishal; DeepMind safety team (27 September 2018). "Building safe artificial intelligence: specification, robustness, and assurance". DeepMind Safety Research - Medium. Archived from the original on 10 February 2023. Retrieved 18 July 2022.
- Rorvig, Mordechai (14 April 2022). "Researchers Gain New Understanding From Simple AI". Quanta Magazine. Archived from the original on 10 February 2023. Retrieved 18 July 2022.
- Doshi-Velez, Finale; Kim, Been (2 March 2017). "Towards A Rigorous Science of Interpretable Machine Learning". arXiv:1702.08608 [stat.ML].
- Wiblin, Robert (4 August 2021). "Chris Olah on what the hell is going on inside neural networks" (Podcast). 80,000 hours. No. 107. Retrieved 23 July 2022.
- Doshi-Velez, Finale; Kim, Been (2 March 2017). "Towards A Rigorous Science of Interpretable Machine Learning". arXiv:1702.08608 [stat.ML].
- Russell, Stuart; Dewey, Daniel; Tegmark, Max (31 December 2015). "Research Priorities for Robust and Beneficial Artificial Intelligence". AI Magazine. 36 (4): 105–114. doi:10.1609/aimag.v36i4.2577. hdl:1721.1/108478. ISSN 2371-9621. S2CID 8174496. Archived from the original on 2 February 2023. Retrieved 12 September 2022.
- Wirth, Christian; Akrour, Riad; Neumann, Gerhard; Fürnkranz, Johannes (2017). "A survey of preference-based reinforcement learning methods". Journal of Machine Learning Research. 18 (136): 1–46.
- Christiano, Paul F.; Leike, Jan; Brown, Tom B.; Martic, Miljan; Legg, Shane; Amodei, Dario (2017). "Deep reinforcement learning from human preferences". Proceedings of the 31st International Conference on Neural Information Processing Systems. NIPS'17. Red Hook, NY, USA: Curran Associates Inc. pp. 4302–4310. ISBN 978-1-5108-6096-4.
- Heaven, Will Douglas (27 January 2022). "The new version of GPT-3 is much better behaved (and should be less toxic)". MIT Technology Review. Archived from the original on 10 February 2023. Retrieved 18 July 2022.
- Mohseni, Sina; Wang, Haotao; Yu, Zhiding; Xiao, Chaowei; Wang, Zhangyang; Yadawa, Jay (7 March 2022). "Taxonomy of Machine Learning Safety: A Survey and Primer". arXiv:2106.04823 [cs.LG].
- Clifton, Jesse (2020). "Cooperation, Conflict, and Transformative Artificial Intelligence: A Research Agenda". Center on Long-Term Risk. Archived from the original on 1 January 2023. Retrieved 18 July 2022.
- Dafoe, Allan; Bachrach, Yoram; Hadfield, Gillian; Horvitz, Eric; Larson, Kate; Graepel, Thore (6 May 2021). "Cooperative AI: machines must learn to find common ground". Nature. 593 (7857): 33–36. Bibcode:2021Natur.593...33D. doi:10.1038/d41586-021-01170-0. ISSN 0028-0836. PMID 33947992. S2CID 233740521. Archived from the original on 18 December 2022. Retrieved 12 September 2022.
- Prunkl, Carina; Whittlestone, Jess (7 February 2020). "Beyond Near- and Long-Term: Towards a Clearer Account of Research Priorities in AI Ethics and Society". Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society. New York NY USA: ACM: 138–143. doi:10.1145/3375627.3375803. ISBN 978-1-4503-7110-0. S2CID 210164673. Archived from the original on 16 October 2022. Retrieved 12 September 2022.
- Irving, Geoffrey; Askell, Amanda (19 February 2019). "AI Safety Needs Social Scientists". Distill. 4 (2): 10.23915/distill.00014. doi:10.23915/distill.00014. ISSN 2476-0757. S2CID 159180422. Archived from the original on 10 February 2023. Retrieved 12 September 2022.
- Yudkowsky, E. (2011, August). Complex value systems in friendly AI. In International Conference on Artificial General Intelligence (pp. 388-393). Springer, Berlin, Heidelberg.
- Russell, Stuart (2014). "Of Myths and Moonshine". Edge. Archived from the original on 19 July 2016. Retrieved 23 October 2015.
- Dietterich, Thomas; Horvitz, Eric (2015). "Rise of Concerns about AI: Reflections and Directions" (PDF). Communications of the ACM. 58 (10): 38–40. doi:10.1145/2770869. S2CID 20395145. Archived (PDF) from the original on 4 March 2016. Retrieved 23 October 2015.
- Yampolskiy, Roman V. (8 April 2014). "Utility function security in artificially intelligent agents". Journal of Experimental & Theoretical Artificial Intelligence. 26 (3): 373–389. doi:10.1080/0952813X.2014.895114. S2CID 16477341.
- Lenat, Douglas (1982). "Eurisko: A Program That Learns New Heuristics and Domain Concepts The Nature of Heuristics III: Program Design and Results". Artificial Intelligence (Print). 21 (1–2): 61–98. doi:10.1016/s0004-3702(83)80005-8.
- Haidt, Jonathan; Kesebir, Selin (2010) "Chapter 22: Morality" In Handbook of Social Psychology, Fifth Edition, Hoboken NJ, Wiley, 2010, pp. 797-832.
- Waser, Mark (2015). "Designing, Implementing and Enforcing a Coherent System of Laws, Ethics and Morals for Intelligent Machines (Including Humans)". Procedia Computer Science (Print). 71: 106–111. doi:10.1016/j.procs.2015.12.213.
- Bostrom, Nick (2015). "What happens when our computers get smarter than we are?". TED (conference). Archived from the original on 25 July 2020. Retrieved 30 January 2020.
- Yudkowsky, Eliezer (2011). "Complex Value Systems are Required to Realize Valuable Futures" (PDF). Archived (PDF) from the original on 29 September 2015. Retrieved 10 August 2020.
- Wakefield, Jane (15 September 2015). "Why is Facebook investing in AI?". BBC News. Archived from the original on 2 December 2017. Retrieved 27 November 2017.
- "Will artificial intelligence destroy humanity? Here are 5 reasons not to worry". Vox. 22 August 2014. Archived from the original on 30 October 2015. Retrieved 30 October 2015.
- Bostrom, Nick (2014). Superintelligence: Paths, Dangers, Strategies. Oxford, United Kingdom: Oxford University Press. p. 116. ISBN 978-0-19-967811-2.
- Bostrom, Nick (2012). "Superintelligent Will" (PDF). Nick Bostrom. Nick Bostrom. Archived (PDF) from the original on 28 November 2015. Retrieved 29 October 2015.
- Armstrong, Stuart (1 January 2013). "General Purpose Intelligence: Arguing the Orthogonality Thesis". Analysis and Metaphysics. 12. Archived from the original on 11 October 2014. Retrieved 2 April 2020. Full text available here Archived 25 March 2020 at the Wayback Machine.
- Chorost, Michael (18 April 2016). "Let Artificial Intelligence Evolve". Slate. Archived from the original on 27 November 2017. Retrieved 27 November 2017.
- Rubin, Charles (Spring 2003). "Artificial Intelligence and Human Nature". The New Atlantis. 1: 88–100. Archived from the original on 11 June 2012.
- Sotala, Kaj; Yampolskiy, Roman V (19 December 2014). "Responses to catastrophic AGI risk: a survey". Physica Scripta. 90 (1): 12. Bibcode:2015PhyS...90a8001S. doi:10.1088/0031-8949/90/1/018001. ISSN 0031-8949.
-  Pistono, Federico Yampolskiy, Roman V. (9 May 2016). Unethical Research: How to Create a Malevolent Artificial Intelligence. OCLC 1106238048.{{cite book}}: CS1 maint: multiple names: authors list (link)
- Haney, Brian Seamus (2018). "The Perils & Promises of Artificial General Intelligence". SSRN Working Paper Series. doi:10.2139/ssrn.3261254. ISSN 1556-5068. S2CID 86743553.
- Press, Gil (30 December 2016). "A Very Short History Of Artificial Intelligence (AI)". Forbes. Archived from the original on 4 August 2020. Retrieved 8 August 2020.
- Winfield, Alan (9 August 2014). "Artificial intelligence will not turn into a Frankenstein's monster". The Guardian. Archived from the original on 17 September 2014. Retrieved 17 September 2014.
- Khatchadourian, Raffi (23 November 2015). "The Doomsday Invention: Will artificial intelligence bring us utopia or destruction?". The New Yorker. Archived from the original on 29 April 2019. Retrieved 7 February 2016.
- Müller, V. C., & Bostrom, N. (2016). Future progress in artificial intelligence: A survey of expert opinion. In Fundamental issues of artificial intelligence (pp. 555-572). Springer, Cham.
- Ord, Toby (2020). The Precipice: Existential Risk and the Future of Humanity. Bloomsbury Publishing. pp. Chapter 5: Future Risks, Unaligned Artificial Intelligence. ISBN 978-1526600219.
- Bass, Dina; Clark, Jack (5 February 2015). "Is Elon Musk Right About AI? Researchers Don't Think So: To quell fears of artificial intelligence running amok, supporters want to give the field an image makeover". Bloomberg News. Archived from the original on 22 March 2015. Retrieved 7 February 2016.
- Elkus, Adam (31 October 2014). "Don't Fear Artificial Intelligence". Slate. Archived from the original on 26 February 2018. Retrieved 15 May 2016.
- Radu, Sintia (19 January 2016). "Artificial Intelligence Alarmists Win ITIF's Annual Luddite Award". ITIF Website. Archived from the original on 11 December 2017. Retrieved 4 February 2016.
- Bolton, Doug (19 January 2016). "'Artificial intelligence alarmists' like Elon Musk and Stephen Hawking win 'Luddite of the Year' award". The Independent (UK). Archived from the original on 19 August 2017. Retrieved 7 February 2016.
- Garner, Rochelle (19 January 2016). "Elon Musk, Stephen Hawking win Luddite award as AI 'alarmists'". CNET. Archived from the original on 8 February 2016. Retrieved 7 February 2016.
- "Anticipating artificial intelligence". Nature. 532 (7600): 413. 26 April 2016. Bibcode:2016Natur.532Q.413.. doi:10.1038/532413a. PMID 27121801.
- Murray Shanahan (3 November 2015). "Machines may seem intelligent, but it'll be a while before they actually are". The Washington Post. Archived from the original on 28 December 2017. Retrieved 15 May 2016.
- "AI Principles". Future of Life Institute. 11 August 2017. Archived from the original on 11 December 2017. Retrieved 11 December 2017.
- "Elon Musk and Stephen Hawking warn of artificial intelligence arms race". Newsweek. 31 January 2017. Archived from the original on 11 December 2017. Retrieved 11 December 2017.
- Bostrom, Nick (2016). "New Epilogue to the Paperback Edition". Superintelligence: Paths, Dangers, Strategies (Paperback ed.).
- Martin Ford (2015). "Chapter 9: Super-intelligence and the Singularity". Rise of the Robots: Technology and the Threat of a Jobless Future. ISBN 9780465059997.
- Müller, Vincent C.; Bostrom, Nick (2014). "Future Progress in Artificial Intelligence: A Poll Among Experts" (PDF). AI Matters. 1 (1): 9–11. doi:10.1145/2639475.2639478. S2CID 8510016. Archived (PDF) from the original on 15 January 2016.
- Grace, Katja; Salvatier, John; Dafoe, Allan; Zhang, Baobao; Evans, Owain (24 May 2017). "When Will AI Exceed Human Performance? Evidence from AI Experts". arXiv:1705.08807 [cs.AI].
- "Why Uncontrollable AI Looks More Likely Than Ever". Time. 27 February 2023. Retrieved 30 March 2023.
- "2022 Expert Survey on Progress in AI". AI Impacts. 4 August 2022. Retrieved 30 March 2023.
- Turing, Alan (1951). Intelligent machinery, a heretical theory (Speech). Lecture given to '51 Society'. Manchester: The Turing Digital Archive. Archived from the original on 26 September 2022. Retrieved 22 July 2022.
- Turing, Alan (15 May 1951). "Can digital computers think?". Automatic Calculating Machines. Episode 2. BBC. Can digital computers think?.
- Maas, Matthijs M. (6 February 2019). "How viable is international arms control for military artificial intelligence? Three lessons from nuclear weapons of mass destruction". Contemporary Security Policy. 40 (3): 285–311. doi:10.1080/13523260.2019.1576464. ISSN 1352-3260. S2CID 159310223.
- Parkin, Simon (14 June 2015). "Science fiction no more? Channel 4's Humans and our rogue AI obsessions". The Guardian. Archived from the original on 5 February 2018. Retrieved 5 February 2018.
- Jackson, Sarah. "The CEO of the company behind AI chatbot ChatGPT says the worst-case scenario for artificial intelligence is 'lights out for all of us'". Business Insider. Retrieved 10 April 2023.
- "Impressed by artificial intelligence? Experts say AGI is coming next, and it has 'existential' risks". ABC News. 23 March 2023. Retrieved 30 March 2023.
- Rawlinson, Kevin (29 January 2015). "Microsoft's Bill Gates insists AI is a threat". BBC News. Archived from the original on 29 January 2015. Retrieved 30 January 2015.
- Post, Washington. "Tech titans like Elon Musk are spending $1 billion to save you from terminators". Chicago Tribune. Archived from the original on 7 June 2016.
- "Analysis | Doomsday to utopia: Meet AI's rival factions". Washington Post. 9 April 2023. Retrieved 30 April 2023.
- "UC Berkeley — Center for Human-Compatible AI (2016) - Open Philanthropy". Open Philanthropy -. 27 June 2016. Retrieved 30 April 2023.
- "The mysterious artificial intelligence company Elon Musk invested in is developing game-changing smart computers". Tech Insider. Archived from the original on 30 October 2015. Retrieved 30 October 2015.
- Clark 2015a.
- "Elon Musk Is Donating $10M Of His Own Money To Artificial Intelligence Research". Fast Company. 15 January 2015. Archived from the original on 30 October 2015. Retrieved 30 October 2015.
- "But What Would the End of Humanity Mean for Me?". The Atlantic. 9 May 2014. Archived from the original on 4 June 2014. Retrieved 12 December 2015.
- Andersen, Kurt (26 November 2014). "Enthusiasts and Skeptics Debate Artificial Intelligence". Vanity Fair. Archived from the original on 8 August 2019. Retrieved 20 April 2020.
- Brooks, Rodney (10 November 2014). "artificial intelligence is a tool, not a threat". Archived from the original on 12 November 2014.
- Garling, Caleb (5 May 2015). "Andrew Ng: Why 'Deep Learning' Is a Mandate for Humans, Not Just Machines". Wired. Retrieved 31 March 2023.
- "Tech Luminaries Address Singularity". IEEE Spectrum: Technology, Engineering, and Science News. No. SPECIAL REPORT: THE SINGULARITY. 1 June 2008. Archived from the original on 30 April 2019. Retrieved 8 April 2020.
-  "Is artificial intelligence really an existential threat to humanity?". MambaPost. 4 April 2023.{{cite web}}: CS1 maint: url-status (link)
- "The case against killer robots, from a guy actually working on artificial intelligence". Fusion.net. Archived from the original on 4 February 2016. Retrieved 31 January 2016.
- http://intelligence.org/files/AIFoomDebate.pdf Archived 22 October 2016 at the Wayback Machine
- "Overcoming Bias : I Still Don't Get Foom". www.overcomingbias.com. Archived from the original on 4 August 2017. Retrieved 20 September 2017.
- "Overcoming Bias : Debating Yudkowsky". www.overcomingbias.com. Archived from the original on 22 August 2017. Retrieved 20 September 2017.
- "Overcoming Bias : Foom Justifies AI Risk Efforts Now". www.overcomingbias.com. Archived from the original on 24 September 2017. Retrieved 20 September 2017.
- Kelly, Kevin (25 April 2017). "The Myth of a Superhuman AI". Wired. Archived from the original on 26 December 2021. Retrieved 19 February 2022.
- Theodore, Modis. "Why the Singularity Cannot Happen" (PDF). Growth Dynamics. pp. 18–19. Archived from the original (PDF) on 22 January 2022. Retrieved 19 February 2022.
- Vinding, Magnus (2016). "Cognitive Abilities as a Counterexample?". Reflections on Intelligence (Revised edition, 2020 ed.).
- Vinding, Magnus (2016). "The "Intelligence Explosion"". Reflections on Intelligence (Revised edition, 2020 ed.).
- Vinding, Magnus (2016). "No Singular Thing, No Grand Control Problem". Reflections on Intelligence (Revised edition, 2020 ed.).
- "Singularity Meets Economy". 1998. Archived from the original on February 2021.
- "Superintelligence Is Not Omniscience". AI Impacts. 7 April 2023. Retrieved 16 April 2023.
- "Mark Zuckerberg responds to Elon Musk's paranoia about AI: 'AI is going to... help keep our communities safe.'". Business Insider. 25 May 2018. Archived from the original on 6 May 2019. Retrieved 6 May 2019.
- Votruba, Ashley M.; Kwan, Virginia S.Y. (2014). Interpreting expert disagreement: The influence of decisional cohesion on the persuasiveness of expert group recommendations. 2014 Society of Personality and Social Psychology Conference. Austin, TX. doi:10.1037/e512142015-190.
- Agar, Nicholas. "Don't Worry about Superintelligence". Journal of Evolution & Technology. 26 (1): 73–82. Archived from the original on 25 May 2020. Retrieved 13 March 2020.
- Greenwald, Ted (11 May 2015). "Does Artificial Intelligence Pose a Threat?". The Wall Street Journal. Archived from the original on 8 May 2016. Retrieved 15 May 2016.
- ""Godfather of artificial intelligence" weighs in on the past and potential of AI". www.cbsnews.com. 2023. Retrieved 30 March 2023.
- Richard Posner (2006). Catastrophe: risk and response. Oxford: Oxford University Press. ISBN 978-0-19-530647-7.
- Kaj Sotala; Roman Yampolskiy (19 December 2014). "Responses to catastrophic AGI risk: a survey". Physica Scripta. 90 (1).
- Dadich, Scott. "Barack Obama Talks AI, Robo Cars, and the Future of the World". WIRED. Archived from the original on 3 December 2017. Retrieved 27 November 2017.
- Kircher, Madison Malone. "Obama on the Risks of AI: 'You Just Gotta Have Somebody Close to the Power Cord'". Select All. Archived from the original on 1 December 2017. Retrieved 27 November 2017.
- Clinton, Hillary (2017). What Happened. p. 241. ISBN 978-1-5011-7556-5. via Archived 1 December 2017 at the Wayback Machine
- Shead, Sam (11 March 2016). "Over a third of people think AI poses a threat to humanity". Business Insider. Archived from the original on 4 June 2016. Retrieved 16 May 2016.
- Brogan, Jacob (6 May 2016). "What Slate Readers Think About Killer A.I." Slate. Archived from the original on 9 May 2016. Retrieved 15 May 2016.
- "Elon Musk says AI could doom human civilization. Zuckerberg disagrees. Who's right?". 5 January 2023. Archived from the original on 8 January 2018. Retrieved 8 January 2018.
- LIPPENS, RONNIE (2002). "Imachinations of Peace: Scientifictions of Peace in Iain M. Banks's The Player of Games". Utopianstudies Utopian Studies. 13 (1): 135–147. ISSN 1045-991X. OCLC 5542757341.
- Barrett, Anthony M.; Baum, Seth D. (23 May 2016). "A model of pathways to artificial superintelligence catastrophe for risk and decision analysis". Journal of Experimental & Theoretical Artificial Intelligence. 29 (2): 397–414. arXiv:1607.07730. doi:10.1080/0952813x.2016.1186228. ISSN 0952-813X. S2CID 928824. Archived from the original on 15 March 2023. Retrieved 7 January 2022.
- Sotala, Kaj; Yampolskiy, Roman V (19 December 2014). "Responses to catastrophic AGI risk: a survey". Physica Scripta. 90 (1): 018001. Bibcode:2015PhyS...90a8001S. doi:10.1088/0031-8949/90/1/018001. ISSN 0031-8949. S2CID 4749656.
- Ramamoorthy, Anand; Yampolskiy, Roman (2018). "Beyond MAD? The race for artificial general intelligence". ICT Discoveries. ITU. 1 (Special Issue 1): 1–8. Archived from the original on 7 January 2022. Retrieved 7 January 2022.
- Carayannis, Elias G.; Draper, John (11 January 2022). "Optimising peace through a Universal Global Peace Treaty to constrain the risk of war from a militarised artificial superintelligence". AI & Society: 1–14. doi:10.1007/s00146-021-01382-y. ISSN 0951-5666. PMC 8748529. PMID 35035113. S2CID 245877737.
- Vincent, James (22 June 2016). "Google's AI researchers say these are the five key problems for robot safety". The Verge. Archived from the original on 24 December 2019. Retrieved 5 April 2020.
- Amodei, Dario, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, and Dan Mané. "Concrete problems in AI safety." arXiv preprint arXiv:1606.06565 (2016).
- Ord, Toby (2020). The Precipice: Existential Risk and the Future of Humanity. Bloomsbury Publishing Plc. ISBN 9781526600196.
- Johnson, Alex (2019). "Elon Musk wants to hook your brain up directly to computers — starting next year". NBC News. Archived from the original on 18 April 2020. Retrieved 5 April 2020.
- Torres, Phil (18 September 2018). "Only Radically Enhancing Humanity Can Save Us All". Slate Magazine. Archived from the original on 6 August 2020. Retrieved 5 April 2020.
- Barrett, Anthony M.; Baum, Seth D. (23 May 2016). "A model of pathways to artificial superintelligence catastrophe for risk and decision analysis". Journal of Experimental & Theoretical Artificial Intelligence. 29 (2): 397–414. arXiv:1607.07730. doi:10.1080/0952813X.2016.1186228. S2CID 928824.
- Piesing, Mark (17 May 2012). "AI uprising: humans will be outsourced, not obliterated". Wired. Archived from the original on 7 April 2014. Retrieved 12 December 2015.
- Coughlan, Sean (24 April 2013). "How are humans going to become extinct?". BBC News. Archived from the original on 9 March 2014. Retrieved 29 March 2014.
- Bridge, Mark (10 June 2017). "Making robots less confident could prevent them taking over". The Times. Archived from the original on 21 March 2018. Retrieved 21 March 2018.
-  McGinnis, John (Summer 2010). "Accelerating AI". Northwestern University Law Review. 104 (3): 1253–1270. Archived from the original on 15 February 2016. Retrieved 16 July 2014. For all these reasons, verifying a global relinquishment treaty, or even one limited to AI-related weapons development, is a nonstarter... (For different reasons from ours, the Machine Intelligence Research Institute) considers (AGI) relinquishment infeasible... 
-  Kaj Sotala; Roman Yampolskiy (19 December 2014). "Responses to catastrophic AGI risk: a survey". Physica Scripta. 90 (1). In general, most writers reject proposals for broad relinquishment... Relinquishment proposals suffer from many of the same problems as regulation proposals, but to a greater extent. There is no historical precedent of general, multi-use technology similar to AGI being successfully relinquished for good, nor do there seem to be any theoretical reasons for believing that relinquishment proposals would work in the future. Therefore we do not consider them to be a viable class of proposals. 
-  Allenby, Brad (11 April 2016). "The Wrong Cognitive Measuring Stick". Slate. Archived from the original on 15 May 2016. Retrieved 15 May 2016. It is fantasy to suggest that the accelerating development and deployment of technologies that taken together are considered to be A.I. will be stopped or limited, either by regulation or even by national legislation. 
- Yampolskiy, Roman V. (2022). Müller, Vincent C. (ed.). "AI Risk Skepticism". Philosophy and Theory of Artificial Intelligence 2021. Studies in Applied Philosophy, Epistemology and Rational Ethics. Cham: Springer International Publishing. 63: 225–248. doi:10.1007/978-3-031-09153-7_18. ISBN 978-3-031-09153-7.
- McGinnis, John (Summer 2010). "Accelerating AI". Northwestern University Law Review. 104 (3): 1253–1270. Archived from the original on 15 February 2016. Retrieved 16 July 2014.
-  "Why We Should Think About the Threat of Artificial Intelligence". The New Yorker. 4 October 2013. Archived from the original on 4 February 2016. Retrieved 7 February 2016. Of course, one could try to ban super-intelligent computers altogether. But 'the competitive advantage—economic, military, even artistic—of every advance in automation is so compelling,' Vernor Vinge, the mathematician and science-fiction author, wrote, 'that passing laws, or having customs, that forbid such things merely assures that someone else will.' 
- Baum, Seth (22 August 2018). "Superintelligence Skepticism as a Political Tool". Information. 9 (9): 209. doi:10.3390/info9090209. ISSN 2078-2489.
- "Elon Musk and other tech leaders call for pause in 'out of control' AI race". CNN. 29 March 2023. Retrieved 30 March 2023.
- "Pause Giant AI Experiments: An Open Letter". Future of Life Institute. Retrieved 30 March 2023.
- "Musk and Wozniak among 1,100+ signing open letter calling for 6-month ban on creating powerful A.I." Fortune. March 2023. Retrieved 30 March 2023.
- "The Open Letter to Stop 'Dangerous' AI Race Is a Huge Mess". www.vice.com. March 2023. Retrieved 30 March 2023.
- "Elon Musk". Twitter. Retrieved 30 March 2023.
- "Tech leaders urge a pause in the 'out-of-control' artificial intelligence race". NPR. 2023. Retrieved 30 March 2023.
- Kari, Paul (1 April 2023). "Letter signed by Elon Musk demanding AI research pause sparks controversy". The Guardian. Retrieved 1 April 2023.
- Domonoske, Camila (17 July 2017). "Elon Musk Warns Governors: Artificial Intelligence Poses 'Existential Risk'". NPR. Archived from the original on 23 April 2020. Retrieved 27 November 2017.
- Gibbs, Samuel (17 July 2017). "Elon Musk: regulate AI to combat 'existential threat' before it's too late". The Guardian. Archived from the original on 6 June 2020. Retrieved 27 November 2017.
- Kharpal, Arjun (7 November 2017). "A.I. is in its 'infancy' and it's too early to regulate it, Intel CEO Brian Krzanich says". CNBC. Archived from the original on 22 March 2020. Retrieved 27 November 2017.
- Kaplan, Andreas; Haenlein, Michael (2019). "Siri, Siri, in my hand: Who's the fairest in the land? On the interpretations, illustrations, and implications of artificial intelligence". Business Horizons. 62: 15–25. doi:10.1016/j.bushor.2018.08.004. S2CID 158433736.
- Baum, Seth D.; Goertzel, Ben; Goertzel, Ted G. (January 2011). "How long until human-level AI? Results from an expert assessment". Technological Forecasting and Social Change. 78 (1): 185–195. doi:10.1016/j.techfore.2010.09.006. ISSN 0040-1625.
- United States. Defense Innovation Board. AI principles : recommendations on the ethical use of artificial intelligence by the Department of Defense. OCLC 1126650738.
- Nindler, Reinmar (11 March 2019). "The United Nation's Capability to Manage Existential Risks with a Focus on Artificial Intelligence". International Community Law Review. 21 (1): 5–34. doi:10.1163/18719732-12341388. ISSN 1871-9740. S2CID 150911357. Archived from the original on 30 August 2022. Retrieved 30 August 2022.
- Stefanik, Elise M. (22 May 2018). "H.R.5356 - 115th Congress (2017-2018): National Security Commission Artificial Intelligence Act of 2018". www.congress.gov. Archived from the original on 23 March 2020. Retrieved 13 March 2020.
- Sotala, Kaj; Yampolskiy, Roman V (19 December 2014). "Responses to catastrophic AGI risk: a survey". Physica Scripta. 90 (1): 018001. Bibcode:2015PhyS...90a8001S. doi:10.1088/0031-8949/90/1/018001. ISSN 0031-8949.
- Geist, Edward Moore (15 August 2016). "It's already too late to stop the AI arms race—We must manage it instead". Bulletin of the Atomic Scientists. 72 (5): 318–321. Bibcode:2016BuAtS..72e.318G. doi:10.1080/00963402.2016.1216672. ISSN 0096-3402. S2CID 151967826.
Bibliography
    
- Clark, Jack (2015a). "Musk-Backed Group Probes Risks Behind Artificial Intelligence". Bloomberg.com. Archived from the original on 30 October 2015. Retrieved 30 October 2015.

