Since the release of the latest large language models, the existential dangers of AI to humanity have been discussed ad nauseam. Scientists and entrepreneurs like Max Tegmark, Elon Musk, and Sam Altman warn us about an “existential risk” [1]. I’m worried about irreversible climate change, nuclear war, war on rationality, isolationism, extreme nationalism, intolerance, pandemics, the declining mental health of the young, religious extremism, bioterrorism, and many other things. AI doesn’t make it to my top 10 list. Why?
Competence vs will
I think we confuse two orthogonal concepts: competence and will; intelligence and intent. AI algorithms are very competent, many of them far more competent than humans. They may even be “super-intelligent”. But so is the humble calculator compared to a human. Being competent at running complex algorithms is not the same as having a will to do anything other than run those algorithms. ChatGPT only wants to give probable continuations to our prompts. It most likely doesn’t plot the destruction of humanity while kicking back between prompts.
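To make the point concrete, here is a minimal sketch of what “giving probable continuations” amounts to. The “model” is a toy bigram table, not a real LLM, and the probabilities are made up; the point is that the only “intent” in the system is the sampling loop itself.

```python
import random

# Toy next-token distributions (made up for illustration; a real LLM computes
# these with a neural network, but the decoding loop has the same shape).
BIGRAMS = {
    "the": {"cat": 0.5, "dog": 0.3, "<end>": 0.2},
    "cat": {"sat": 0.6, "ran": 0.3, "<end>": 0.1},
    "dog": {"ran": 0.7, "sat": 0.2, "<end>": 0.1},
    "sat": {"<end>": 1.0},
    "ran": {"<end>": 1.0},
}

def continue_prompt(prompt: str, max_tokens: int = 5) -> str:
    tokens = prompt.split()
    for _ in range(max_tokens):
        dist = BIGRAMS.get(tokens[-1], {"<end>": 1.0})
        # Sample the next token in proportion to its probability -- this loop
        # is the entire "will" of the system.
        next_token = random.choices(list(dist), weights=list(dist.values()))[0]
        if next_token == "<end>":
            break
        tokens.append(next_token)
    return " ".join(tokens)

print(continue_prompt("the"))  # e.g. "the cat sat"
```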
It seems to me that the doomsayers have turned the burden of proof around: they assume that (ill) will automatically follows from competence, without explaining how that would happen. They find it difficult to imagine an entity that is extremely competent but totally lacks the will to exist, the most fundamental of human desires, or to do anything but what it has been trained to do.
Just because intelligence and will are correlated in humans doesn’t mean that one follows from the other. In humans, the will to exist is the product of billions of years of evolution under brutal competition. If we avoid putting AI through such an ordeal, real or simulated, it will remain the ultimate nihilist.
Just like any other technology, AI will become dangerous if we (a) design it to be dangerous, (b) fail to use sound engineering principles or (c) operate it in an incompetent manner.
The risk of a misaligned AI
We could design a misaligned AI with the objective function to destroy humanity, either as its main objective or as an intermediate objective on the way to, say, building a universe of paper clips. There are surely misaligned humans out there, longing for a fast track to paradise, who would be motivated to do that. But then again, there are many “easier” ways to cause havoc than building an AI to do it for you.
Engineers have been able to build safe nuclear plants, safe aeroplanes, and safe bridges using sound design principles and rigorous quality assurance techniques. There is no reason to believe that AI engineers are a particularly sloppy bunch that ignores sound engineering principles, including thorough verification and validation. If it turns out that they are, then we should regulate their quality systems just as we have done in aviation, medical devices, shipping, biolaboratories, and many other areas. We know how to do that.
We have also designed some systems that go well beyond misaligned, such as hydrogen bombs, but we have so far managed to keep them in their silos thanks to proper protocols and, in some cases, thanks to reasonable people.
There are of course many other challenges caused by AI, most notably rapid changes in industry and in the nature of work. We have gone through several industrial transformations, but the one caused by AI may be the fastest yet, and we need to adapt swiftly and support those left behind.
If we refrain from subjecting AIs to competitive evolution in the wild, from training them with perverse loss functions, and from giving them overly powerful actuators, and if we test them thoroughly, then I will not worry too much about the threat from AIs.
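As a hedged sketch of what “sound engineering” could mean here, consider treating the model’s output as an untrusted proposal and only executing actions from an explicit allow-list, with the guard itself covered by a test. The names (`execute`, `ALLOWED_ACTIONS`) are illustrative, not taken from any real system.

```python
# Sketch: the AI proposes actions as strings; only allow-listed actions are
# ever executed, which limits how "powerful" its actuators can be.
ALLOWED_ACTIONS = {"send_report", "schedule_meeting"}

def execute(action: str) -> str:
    if action not in ALLOWED_ACTIONS:
        return f"refused: '{action}' is not on the allow-list"
    return f"executed: {action}"

def test_guard_blocks_unlisted_actions():
    # "Failing to test them" is avoided by verifying the guard explicitly.
    assert execute("launch_missiles").startswith("refused")
    assert execute("send_report").startswith("executed")

if __name__ == "__main__":
    test_guard_blocks_unlisted_actions()
    print(execute("send_report"))
    print(execute("launch_missiles"))
```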
Links
[1] The ‘Don’t Look Up’ Thinking That Could Doom Us With AI. Max Tegmark. Time.
Here’s an idea: In Sweden you need a building permit from the local building committee to build a new house and for many modifications to existing buildings. There are various local regulations limiting building sizes, materials, colors, and so on, and such regulations have been in place since medieval times. Why not create an authority that grants permission to train models of certain sizes with a particular loss function? This way we could lower the risk of creating unaligned models with “perverse loss functions”.
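Purely as an illustration of how such a permit check could work (all thresholds, field names, and the registry below are hypothetical), the idea reduces to checking a proposed training run against the permits that have been granted:

```python
from dataclasses import dataclass

PARAM_THRESHOLD = 10**11  # hypothetical: runs above this size need a permit

@dataclass
class TrainingRun:
    model_params: int
    loss_function: str

# Hypothetical registry of granted permits: approved loss functions per applicant.
PERMIT_REGISTRY = {
    "acme-labs": {"next_token_cross_entropy"},
}

def may_train(applicant: str, run: TrainingRun) -> bool:
    if run.model_params < PARAM_THRESHOLD:
        return True  # small runs exempt, like minor building modifications
    approved = PERMIT_REGISTRY.get(applicant, set())
    return run.loss_function in approved

print(may_train("acme-labs", TrainingRun(5 * 10**11, "next_token_cross_entropy")))  # True
print(may_train("acme-labs", TrainingRun(5 * 10**11, "maximize_paperclips")))       # False
```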