Robots Learn How to Lie

SLAM: debunk creationism, pseudoscience, and superstitions. Discuss logic and morality.

Moderator: Alyrium Denryle

User avatar
Civil War Man
NERRRRRDS!!!
Posts: 3790
Joined: 2005-01-28 03:54am

Robots Learn How to Lie

Post by Civil War Man »

The link
In a Swiss laboratory, a group of ten robots is competing for food. Prowling around a small arena, the machines are part of an innovative study looking at the evolution of communication, from engineers Sara Mitri and Dario Floreano and evolutionary biologist Laurent Keller.

They programmed robots with the task of finding a "food source" indicated by a light-coloured ring at one end of the arena, which they could "see" at close range with downward-facing sensors. The other end of the arena, labelled with a darker ring, was "poisoned". The bots get points based on how much time they spend near food or poison, which indicates how successful they are at their artificial lives.

They can also talk to one another. Each can produce a blue light that others can detect with cameras and that can give away the position of the food because of the flashing robots congregating nearby. In short, the blue light carries information, and after a few generations, the robots quickly evolved the ability to conceal that information and deceive one another.

Their evolution was made possible because each one was powered by an artificial neural network controlled by a binary "genome". The network consisted of 11 neurons that were connected to the robot's sensors and 3 that controlled its two tracks and its blue light. The neurons were linked via 33 connections - synapses - and the strength of each connection was controlled by a single 8-bit gene. In total, each robot's 264-bit genome determined how it reacted to information gleaned from its senses.

In the experiment, each round consisted of 100 groups of 10 robots, each competing for food in a separate arena. The 200 robots with the highest scores - the fittest of the population - "survived" to the next round. Their 33 genes were randomly mutated (with a 1 in 100 chance that any bit would change) and the robots were "mated" with each other to shuffle their genomes. The result was a new generation of robots, whose behaviour was inherited from the most successful representatives of the previous cohort.
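
Roughly, in Python, the scheme described above looks something like the sketch below. It assumes the 33 "synapses" are simply every pairing of the 11 sensor neurons with the 3 output neurons (the only reading that gives 33 x 8 = 264 bits), uses a stub in place of the real arena simulation for scoring, and all names, scalings and the 50/50 gene-shuffling rule are illustrative guesses rather than the researchers' actual code.

import random

SENSORS, OUTPUTS = 11, 3              # 11 sensor neurons, 3 output neurons
N_WEIGHTS = SENSORS * OUTPUTS         # 33 connections ("synapses")
GENE_BITS = 8                         # one 8-bit gene per connection
GENOME_BITS = N_WEIGHTS * GENE_BITS   # 264 bits in total
MUTATION_RATE = 0.01                  # 1-in-100 chance that any bit flips

def decode(genome):
    """Turn a 264-bit genome into 33 signed connection weights in [-1, 1]."""
    weights = []
    for i in range(N_WEIGHTS):
        bits = genome[i * GENE_BITS:(i + 1) * GENE_BITS]
        value = int("".join(map(str, bits)), 2)   # 0..255
        weights.append(value / 127.5 - 1.0)       # scale to [-1, 1]
    return weights

def act(weights, sensor_values):
    """One feed-forward step: weighted sums drive the two tracks and the blue light."""
    return [sum(weights[o * SENSORS + s] * sensor_values[s] for s in range(SENSORS))
            for o in range(OUTPUTS)]              # [left_track, right_track, blue_light]

def fitness(genome):
    """Stub: the real score depended on time spent near food versus poison."""
    weights = decode(genome)                      # these would drive the robot in the arena
    return random.random()                        # placeholder for the arena simulation

def mutate(genome):
    return [bit ^ 1 if random.random() < MUTATION_RATE else bit for bit in genome]

def mate(a, b):
    """Shuffle two parent genomes together, gene by gene."""
    child = []
    for i in range(N_WEIGHTS):
        parent = a if random.random() < 0.5 else b
        child.extend(parent[i * GENE_BITS:(i + 1) * GENE_BITS])
    return child

# One generation: 100 arenas x 10 robots, keep the best 200, breed 1000 offspring.
population = [[random.randint(0, 1) for _ in range(GENOME_BITS)] for _ in range(1000)]
survivors = sorted(population, key=fitness, reverse=True)[:200]
population = [mutate(mate(*random.sample(survivors, 2))) for _ in range(1000)]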

In their initial experiments, the robots produced blue light at random. Even so, as the robots became better at finding food, the light became more and more informative and the bots became increasingly drawn to it after just 9 generations.

But as it is for real animals, it's not always in the robots' best interests to communicate the location of food. The food ring only has space for 8 robots, so if every bot turned up, they had to physically shove each other for feeding rights. The effects of this competition became clear when Mitri, Floreano and Keller allowed the emission of blue light to evolve along with the rest of the robots' behaviour.

As before, they shone randomly at first, and as the robots started to crowd round the food, their lights increasingly gave away its presence. And with that, the robots became more secretive. By the 50th generation, they became much less likely to shine near the food than elsewhere in the arena, and the light became a much poorer source of information, one that was much less attractive to the robots.

Nonetheless, the light never became completely useless. As the bots became more savvy in their illuminations and relied less and less on the lights, individuals that actually did shine near food paid a far shallower price for it. Because of that, the evolutionary pressure to keep others in the dark was lower and the information provided by the lights was never truly suppressed.

This also meant that the robots were incredibly varied in their behaviour. With the yoke of natural selection relaxed, processes like genetic drift - where genes pick up changes randomly - were free to produce more genetic diversity and more varied behaviour. After around 500 generations of evolution, around 60% of the robots never emitted light near food, but around 10% of them did so most of the time. Some robots were slightly attracted to the blue light, but a third were strongly drawn to it and another third were actually repulsed.

Mitri, Floreano and Keller think that similar processes are at work in nature. When animals move, forage or generally go about their lives, they provide inadvertent cues that can signal information to other individuals. If that creates a conflict of interest, natural selection will favour individuals that can suppress or tweak that information, be it through stealth, camouflage, jamming or flat-out lies. As in the robot experiment, these processes could help to explain the huge variety of deceptive strategies in the natural world.
Robots + Evolution = Robot Evolution. This is one of those stories that makes science awesome.

(ps Please no "Robot Overlords" jokes)
User avatar
WesFox13
Padawan Learner
Posts: 274
Joined: 2007-02-14 11:50am
Location: Sammamish, WA, USA
Contact:

Re: Robots Learn How to Lie

Post by WesFox13 »

Huh, so robots that have the ability to lie to and deceive others. Interesting. Of course, then again, some animals have been deceiving others in order to survive for quite some time now. Anglerfish and some other deep-sea fish are a good example of that.
My Political Compass:
Economic Left/Right: -5.25
Social Libertarian/Authoritarian: -5.90

Designation: Libertarian Left (Social Democrat/Democratic Socialist)
Alignment: Chaotic-Good
Simon_Jester
Emperor's Hand
Posts: 30165
Joined: 2009-05-23 07:29pm

Re: Robots Learn How to Lie

Post by Simon_Jester »

This kind of thing illustrates the importance of solving the "friendly AI" problem before solving the "human-level AI" problem. It would be unwise to create an AI that, for our sake, must not lie to us before creating one that we can guarantee will not lie to us.
This space dedicated to Vasily Arkhipov
User avatar
PainRack
Emperor's Hand
Posts: 7580
Joined: 2002-07-07 03:03am
Location: Singapura

Re: Robots Learn How to Lie

Post by PainRack »

here's a question....
Their evolution was made possible because each one was powered by an artificial neural network controlled by a binary "genome". The network consisted of 11 neurons that were connected to the robot's sensors and 3 that controlled its two tracks and its blue light. The neurons were linked via 33 connections - synapses - and the strength of each connection was controlled by a single 8-bit gene. In total, each robot's 264-bit genome determined how it reacted to information gleaned from its senses.

In the experiment, each round consisted of 100 groups of 10 robots, each competing for food in a separate arena. The 200 robots with the highest scores - the fittest of the population - "survived" to the next round. Their 33 genes were randomly mutated (with a 1 in 100 chance that any bit would change) and the robots were "mated" with each other to shuffle their genomes. The result was a new generation of robots, whose behaviour was inherited from the most successful representatives of the previous cohort.
How did they "mutate" and then mate the changes?
Let him land on any Lyran world to taste firsthand the wrath of peace loving people thwarted by the myopic greed of a few miserly old farts- Katrina Steiner
User avatar
Zor
Sith Acolyte
Posts: 5928
Joined: 2004-06-08 03:37am

Re: Robots Learn How to Lie

Post by Zor »

I would guess that either minor bits of their digital DNA were randomized or there were just a few instances of minor errors in code transfer.

Zor
HAIL ZOR! WE'LL BLOW UP THE OCEAN!
Heros of Cybertron-HAB-Keeper of the Vicious pit of Allosauruses-King Leighton-I, United Kingdom of Zoria: SD.net World/Tsar Mikhail-I of the Red Tsardom: SD.net Kingdoms
WHEN ALL HELL BREAKS LOOSE ON EARTH, ALL EARTH BREAKS LOOSE ON HELL
Terran Sphere
The Art of Zor
User avatar
bobalot
Jedi Council Member
Posts: 1731
Joined: 2008-05-21 06:42am
Location: Sydney, Australia
Contact:

Re: Robots Learn How to Lie

Post by bobalot »

Their evolution was made possible because each one was powered by an artificial neural network controlled by a binary "genome". The network consisted of 11 neurons that were connected to the robot's sensors and 3 that controlled its two tracks and its blue light.
I did neural networks at university, though only very basic ones (I helped build a neural network that recognised faces for my group assignment). My lecturer liked to compare neural networks to AI, and constantly used words like "learning", "adapting", etc., as if they were living, learning machines. I personally thought neural networks were nothing more than very sophisticated self correcting algorithms that mimicked intelligence and learning.
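
That "self correcting algorithm" view is pretty much what the textbook training rules look like in code. Here's a toy sketch of the perceptron/delta rule in Python - nothing like a face recogniser, and the numbers are purely illustrative - just to show that the "learning" is ordinary arithmetic on an error signal:

# Toy "self-correcting" neuron: it learns OR by nudging its weights in
# whatever direction reduces the error it just made.
import random

def predict(weights, bias, x):
    return 1.0 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0.0

data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]   # the OR function
weights = [random.uniform(-1, 1) for _ in range(2)]
bias, lr = 0.0, 0.1

for _ in range(100):                                  # repeat until the errors die out
    for x, target in data:
        error = target - predict(weights, bias, x)    # how wrong were we?
        weights = [w + lr * error * xi for w, xi in zip(weights, x)]
        bias += lr * error                            # nudge towards being less wrong

print([predict(weights, bias, x) for x, _ in data])   # typically [0.0, 1.0, 1.0, 1.0]
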
"This statement, in its utterly clueless hubristic stupidity, cannot be improved upon. I merely quote it in admiration of its perfection." - Garibaldi

"Problem is, while the Germans have had many mea culpas and quite painfully dealt with their history, the South is still hellbent on painting themselves as the real victims. It gives them a special place in the history of assholes" - Covenant

"Over three million died fighting for the emperor, but when the war was over he pretended it was not his responsibility. What kind of man does that?'' - Saburo Sakai

Join SDN on Discord
User avatar
Elaro
Padawan Learner
Posts: 493
Joined: 2006-06-03 12:34pm
Location: Reality, apparently

Re: Robots Learn How to Lie

Post by Elaro »

Simon_Jester wrote:This kind of thing illustrates the importance of solving the "friendly AI" problem before solving the "human-level AI" problem. It would be unwise to create an AI that, for our sake, must not lie to us before creating one that we can guarantee will not lie to us.
Not necessarily. If we make a "human-level" AI, like you said, we can probably demonstrate to it the benefit of ethical behavior, and then let it figure out when to resort to less moral behavior. I mean, sometimes we do need to lie, or at least obfuscate the truth, in order to survive and, more importantly, in order to stop needing to lie to survive. For example, if a would-be mugger asked a "human-level" robot for all its money, it would be ethical for the robot to have stashed a large amount of said money somewhere other than its wallet, and to only hand the wallet over.

If we're really talking about "human-level" AI, then we're basically talking about persons. And there's no way, short of recording all their thoughts, for us to make sure that they won't cause undue harm to another person. At some point, you've got to give them the best intelligence, and best ethical algorithm, and let them loose. They'd be like us in that regard.

An artificial intelligence requires more or less the same thing as a natural intelligence (or else it wouldn't be an artificial intelligence, now would it?). And a natural intelligence needs to be educated!

What I'm saying is that AI morality is not something that can (or should) be ironed out at the design and production level. It's something that must be taught.
"The surest sign that the world was not created by an omnipotent Being who loves us is that the Earth is not an infinite plane and it does not rain meat."

"Lo, how free the madman is! He can observe beyond mere reality, and cogitates untroubled by the bounds of relevance."
Narkis
Padawan Learner
Posts: 391
Joined: 2009-01-02 11:05pm
Location: Greece

Re: Robots Learn How to Lie

Post by Narkis »

Elaro wrote:Not necessarily. If we make a "human-level" AI, like you said, we can probably demonstrate to it the benefit of ethical behavior, and then let it figure out when to resort to less moral behavior. I mean, sometimes we do need to lie, or at least obfuscate the truth, in order to survive and, more importantly, in order to stop needing to lie to survive. For example, if a would-be mugger asked a "human-level" robot for all its money, it would be ethical for the robot to have stashed a large amount of said money somewhere other than its wallet, and to only hand the wallet over.
Demonstrate it how? It is not a human child. It is not a human adult. It will not think like one, it will not act like one. Our teaching methods won't work, our philosophies are far less universal than you think they are. We're not talking about Asimov's robots here. We're talking about Skynet, only smarter, unless we're very-very careful with what we "teach" the AI while we're still able to do it. And it'll have to be done right on the first try. If we don't, we're fucked. Really, lying AI is the least of what we should be worried about.
If we're really talking about "human-level" AI, then we're basically talking about persons. And there's no way, short of recording all their thoughts, for us to make sure that they won't cause undue harm to another person. At some point, you've got to give them the best intelligence, and best ethical algorithm, and let them loose. They'd be like us in that regard.

A "human-level" AI would not stay at "human-level" for long. Unless you stop it from improving itself at all, at which point you don't have much of an AI. It will also be very possible to record all its thoughts, since we'll have designed the damned thing. Until it becomes so much smarter than we record whatever it wants as to record, at least. "Giving" them the best intelligence, and best ethical algorithm, and letting them loose would work for a terribly short amount of time. Its intelligence would rise sharply, and its ethical algorithm would have to be sturdy enough to withstand introspection and application by a significantly smarter being than the one who designed it. If it doesn't, the AI will either discard it, or do something the creator never even considered to prohibit.
An artificial intelligence requires more or less the same thing as a natural intelligence (or else it wouldn't be an artificial intelligence, now would it?). And a natural intelligence needs to be educated!
The key word here is "artificial", not "intelligence". Evolution has done a really lackluster job with ours. There are many patches upon patches, crutches, and just plain idiotic parts in our brains. An AI would have none of them. And as a result, its thinking process would be significantly different from ours. Our education methods will not work. They're not designed to teach something that's not human. And we won't be able to create new ones. Either we'll do it right the first time, and the AI will take over teaching its brethren, or it's over. There'll be no second chance.
What I'm saying is that AI morality is not something that can (or should) be ironed out at the design and production level. It's something that must be taught.
No, morality is precisely what should be ironed out and bug free at the design level. There are way too many ways to fuck up if it isn't. And the stakes are too high for that.
Simon_Jester
Emperor's Hand
Posts: 30165
Joined: 2009-05-23 07:29pm

Re: Robots Learn How to Lie

Post by Simon_Jester »

Elaro wrote:Not necessarily. If we make a "human-level" AI, like you said, we can probably demonstrate to it the benefit of ethical behavior, and then let it figure out when to resort to less moral behavior. I mean, sometimes we do need to lie, or at least obfuscate the truth, in order to survive and, more importantly, in order to stop needing to lie to survive. For example, if a would-be mugger asked a "human-level" robot for all its money, it would be ethical for the robot to have stashed a large amount of said money somewhere other than its wallet, and to only hand the wallet over.

If we're really talking about "human-level" AI, then we're basically talking about persons. And there's no way, short of recording all their thoughts, for us to make sure that they won't cause undue harm to another person. At some point, you've got to give them the best intelligence, and best ethical algorithm, and let them loose. They'd be like us in that regard.
The "Friendly AI" problem is precisely that of figuring out how to write an ethical algorithm, figuring out how to make sure the AI follows it, and figuring out how to do so before we create self-reprogramming intelligences powerful enough to pose a plausible threat to society.

The alternative is idiotic: intentionally creating something with potentials and abilities you can't even define, let alone match, without safety controls. I think it's far better to create AI with hardwired "Asimov code" that limits it than to build AI with no such code... but we don't even know how to do that. It looks like AI has to be able to rewrite its source code to be highly functional, so how do we stop it from rewriting ThouShaltNotDestroyHumanity.exe into Skynet.exe?

The AI's interests are secondary here, because any AI we're discussing is a purely hypothetical beast until we decide to create it. It would be insane for us to consider the AI's interests over our own safety if we're going to be the ones writing it.
This space dedicated to Vasily Arkhipov
Duckie
Sith Marauder
Posts: 3980
Joined: 2003-08-28 08:16pm

Re: Robots Learn How to Lie

Post by Duckie »

I'd like to note that many 'AI catastrophe' scenarios with self-improving AI seem to involve an AI having ridiculous levels of access to the internet or facility it's in. Just keep it in its own private network, disconnected and physically unable to take over anything and do jerkish things any more than the world's most superintelligent toaster oven timer could take over your light switch. There, now you have as long as you want to debug the thing or just purge it and start over.

Sure, on a network it could conceivably learn to hack administrative access. But being isolated from any other computer, it's not like even a potential-singularity-causing AI will have the ability to connect itself to something if there's literally no wires connecting it to the rest of the building or internet and it's a standalone PC. It can't evolve wireless cards or other ways of interfacing with outside sources. You just turn it off and microwave the hard drives if it becomes unfixable.
User avatar
NoXion
Padawan Learner
Posts: 306
Joined: 2005-04-21 01:38am
Location: Perfidious Albion

Re: Robots Learn How to Lie

Post by NoXion »

Is it just me, or is the potential for AI becoming hostile somewhat overstated? I mean, is it really the "intelligent" thing to do to start getting aggressive with the dominant species of this planet? I also don't think super-intelligence provides a cast-iron certainty of winning - after all, we're so much smarter than a lot of other creatures, but we can still get nobbled by them, from grizzly bears all the way down to viruses.
Does it follow that I reject all authority? Perish the thought. In the matter of boots, I defer to the authority of the boot-maker - Mikhail Bakunin
Capital is reckless of the health or length of life of the laborer, unless under compulsion from society - Karl Marx
Pollution is nothing but the resources we are not harvesting. We allow them to disperse because we've been ignorant of their value - R. Buckminster Fuller
The important thing is not to be human but to be humane - Eliezer S. Yudkowsky


Nova Mundi, my laughable attempt at an original worldbuilding/gameplay project
Narkis
Padawan Learner
Posts: 391
Joined: 2009-01-02 11:05pm
Location: Greece

Re: Robots Learn How to Lie

Post by Narkis »

Duckie wrote:I'd like to note that many 'AI catastrophe' scenarios with self-improving AI seem to involve an AI having ridiculous levels of access to the internet or facility it's in. Just keep it in its own private network, disconnected and physically unable to take over anything and do jerkish things any more than the world's most superintelligent toaster oven timer could take over your light switch. There, now you have as long as you want to debug the thing or just purge it and start over.

Sure, on a network it could conceivably learn to hack administrative access. But being isolated from any other computer, it's not like even a potential-singularity-causing AI will have the ability to connect itself to something if there's literally no wires connecting it to the rest of the building or internet and it's a standalone PC. It can't evolve wireless cards or other ways of interfacing with outside sources. You just turn it off and microwave the hard drives if it becomes unfixable.
It would convince the man in charge to let it out. See this.
NoXion wrote:Is it just me, or is the potential for AI becoming hostile somewhat overstated? I mean, is it really the "intelligent" thing to do to start getting aggressive with the dominant species of this planet? I also don't think super-intelligence provides a cast-iron certainty of winning - after all, we're so much smarter than a lot of other creatures, but we can still get nobbled by them, from grizzly bears all the way down to viruses.
We're the dominant species only because we're smarter. If something even smarter comes around, it's only a matter of time until we stop being one. And super-intelligence does provide a certainty of winning, given enough time. It took us a few thousand years, but we're now at a point where we can do pretty much whatever we want to any "lesser" species. Sure, a grizzly bear can eat a human now and then, but grizzly bears owe their continued existence to our goodwill. It'd be trivially easy to exterminate them with minimal effort. Or even with no effort. Just drop their "endangered" protection.
Also, a hostile AI is not the only dangerous one. Pretty much every AI other than a friendly one would qualify as an eventual game over. After all, it'll eventually be as much smarter than us as we are than chimps. Then dogs. Then mice. Then cockroaches. Would you mind if you stepped on a cockroach on your way to something important?
Duckie
Sith Marauder
Posts: 3980
Joined: 2003-08-28 08:16pm

Re: Robots Learn How to Lie

Post by Duckie »

"It would convince people to let it out" is completely retarded, unless you think prisons don't work because the inmates can talk to the guards. Hyperintelligence doesn't suddenly make a gatekeeper a retard, unless you're Yudkowsky's sockpuppets or a complete blithering idiot. Such an experiment you linked to has massive observer bias and participant bias, and a lack of sample size. Psychologically it's useless, and the dialogue presented on that page is pure masturbatory fantasy. "Bluh bluh bluh hyperintelligence."

What, suddenly it'll know exactly what makes you tick and get inside your head and work your brain like it's Hannibal Lecter? Simply have better precautions like requiring half a dozen keys being turned or just make it unable to be let out into anything by having the goddamn terminal isolated like I suggested in the first place. What is some blithering retard sockpuppet going to do, remove the harddrive and carry it out like they're Nedry? Just hook it up to a trap that microwaves the damn thing if it's moved too far, and bolt it down. Just make it require ridiculous company-wide assistance and massive effort to remove. It's not too hard to take sufficient precautions to prevent escape of an AI even if you assume the staff are blithering retards predisposed to let AIs out of boxes.

Stupid bullshit like that is what pisses me off about Singularitarians. They just say "It's hyperintelligent, so it can defeat/solve/accomplish [X]" rather than actually thinking about how to contain an AI entity.
User avatar
NoXion
Padawan Learner
Posts: 306
Joined: 2005-04-21 01:38am
Location: Perfidious Albion

Re: Robots Learn How to Lie

Post by NoXion »

Narkis wrote:We're the dominant species only because we're smarter. If something even smarter comes around, it's only a matter of time until we stop being one.
Would this necessarily be a bad thing? No longer being dominant is after all, not the same thing as becoming extinct. And something smarter than us can't possibly fuck things up worse than us.
And super-intelligence does provide a certainty of winning, given enough time. It took us a few thousand years, but we're now at a point where we can do pretty much whatever we want to any "lesser" species. Sure, a grizzly bear can eat a human now and then, but grizzly bears owe their continued existence to our goodwill. It'd be trivially easy to exterminate them with minimal effort. Or even with no effort. Just drop their "endangered" protection.
It is to the bears' detriment that they have environmental requirements so close to our own, due to our aforementioned superiority. But can the same thing be said of an AI, which would have considerably different requirements?
Also, a hostile AI is not the only dangerous one. Pretty much every AI other than a friendly one would qualify as an eventual game over. After all, it'll eventually be as much smarter than us as we are than chimps. Then dogs. Then mice. Then cockroaches. Would you mind if you stepped on a cockroach on your way to something important?
I might step on a cockroach by accident, or even exterminate a colony of same if I found them in my home, but it wouldn't be "game over" for the cockroach species. This suggests to me that in the event of a super-intelligence arising, the best strategies would involve commensalism, some kind of symbiotic relationship (like humans keeping bees), and/or becoming a pet of some sort (which is basically symbiosis but in psychological form).

Considering that it's unlikely that a super-intelligent AI will arise spontaneously without less-intelligent precursors, I think it should be possible for us to "steer" or otherwise convince AIs towards benevolent (or at least non-confrontational) relationships with humans as a species, starting from the earliest models. Making them psychologically similar to us would probably help.
Does it follow that I reject all authority? Perish the thought. In the matter of boots, I defer to the authority of the boot-maker - Mikhail Bakunin
Capital is reckless of the health or length of life of the laborer, unless under compulsion from society - Karl Marx
Pollution is nothing but the resources we are not harvesting. We allow them to disperse because we've been ignorant of their value - R. Buckminster Fuller
The important thing is not to be human but to be humane - Eliezer S. Yudkowsky


Nova Mundi, my laughable attempt at an original worldbuilding/gameplay project
User avatar
Starglider
Miles Dyson
Posts: 8709
Joined: 2007-04-05 09:44pm
Location: Isle of Dogs
Contact:

Re: Robots Learn How to Lie

Post by Starglider »

Simon_Jester wrote:This kind of thing illustrates the importance of solving the "friendly AI" problem before solving the "human-level AI" problem.
Well, in a completely non-technical sense, maybe. These robots have intelligence on the level of an earthworm. Yes they are exhibiting deception, but not only do they not understand the concept of deception, they actually have negligible modelling of other agents or indeed the world in general. NNs of this complexity are just implementing stimulus-response learning.
It would be unwise to create an AI that, for our sake, must not lie to us before creating one that we can guarantee will not lie to us.
On the one hand, lying is quite simple, in that you can implement it in basic agents (e.g. the sort that 'Radiant AI' used before it was toned down). Once agents can model the internal state of other agents, they can make decisions about the value of those internal states and plan ways to modify them. In that kind of trivial simulation, preventing lying is simply 'do not attempt to create internal state in other agents that does not match reality'. On the other hand, the notion of whether another sentient being /understands/ something, in the fuzzy real world, where you have to be very careful about specifying absolutes*, where human perception in particular is so nondeterministic and context-dependent, is extremely complex.

* Because for a sufficiently powerful intelligence, absolutes mean applying unlimited 'optimisation pressure', and that leads to extreme and generally highly undesirable outcomes.
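
To make the 'do not create internal state in other agents that does not match reality' rule concrete, here is a minimal toy model in Python. It is entirely my own illustration rather than code from any real agent framework: the signaller predicts what belief each action would induce in a listener, and the honesty constraint simply filters out actions whose induced belief contradicts reality (note that staying silent still passes).

# Toy signaller: it knows where the food is and can light up, stay dark, or say nothing.
# "Lying" is well-defined because the listener's predicted belief is modelled explicitly.

REALITY = {"food_here": True}

def predicted_belief(action):
    """How the signaller expects the listener to update its internal state."""
    if action == "light_on":
        return {"food_here": True}    # listener assumes light marks food
    if action == "light_off":
        return {"food_here": False}   # listener assumes darkness means no food
    return {}                         # "stay_silent": no belief induced

def payoff(action):
    # Concealing food is selfishly better: fewer competitors at the ring.
    return {"light_on": 1.0, "light_off": 3.0, "stay_silent": 2.0}[action]

def choose(actions, honest=True):
    if honest:
        # Constraint: never induce a belief that contradicts reality.
        actions = [a for a in actions
                   if all(REALITY.get(k) == v for k, v in predicted_belief(a).items())]
    return max(actions, key=payoff)

print(choose(["light_on", "light_off", "stay_silent"], honest=False))  # light_off (deceives)
print(choose(["light_on", "light_off", "stay_silent"], honest=True))   # stay_silent
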
PainRack wrote:How did they "mutate" and then mate the changes?
Mutation is simply random permutation of the weights in the neural nets, effectively using a log 2 distribution of deltas. The crossover algorithm for this kind of GA is typically 'start copying from parent A, with a probability n of switching to the other parent at each bit', though selecting each functional unit (weight, in this case) at random from each parent is equally common. Incidentally there's no particular reason why this is limited to just two parents, but nearly all researchers do so.
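
A rough sketch of those operators in Python, going by my reading of that description - the mutation rate, the exponent range for the 'log 2' deltas, and the switch probability are made-up illustrative numbers:

import random

def mutate_weights(weights, rate=0.03):
    """Perturb a few weights by deltas whose magnitudes have a log-2 spread:
    mostly tiny tweaks, occasionally a large jump."""
    out = []
    for w in weights:
        if random.random() < rate:
            magnitude = 2.0 ** -random.randint(0, 7)       # 1, 1/2, 1/4, ... 1/128
            w += random.choice((-1, 1)) * magnitude
        out.append(w)
    return out

def crossover_switching(a, b, switch_prob=0.05):
    """Start copying from parent A; at each position, switch parents with probability n."""
    parents, current = (a, b), 0
    child = []
    for i in range(len(a)):
        if random.random() < switch_prob:
            current = 1 - current
        child.append(parents[current][i])
    return child

def crossover_uniform(a, b):
    """Pick each functional unit (here, each weight) at random from either parent."""
    return [random.choice(pair) for pair in zip(a, b)]

parent_a = [random.uniform(-1, 1) for _ in range(33)]
parent_b = [random.uniform(-1, 1) for _ in range(33)]
child = mutate_weights(crossover_switching(parent_a, parent_b))
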
bobalot wrote:I personally thought neural networks were nothing more than very sophisticated self correcting algorithms that mimicked intelligence and learning.
It isn't sensible to compare this kind of neural network to a human, or even a dog (though of course that doesn't stop the more PR-hungry researchers from doing so). Obviously what these algorithms do is nothing like learning in higher animals. Very few existing NNs are more sophisticated than the nervous system of the most primitive insects. People are trying to develop much bigger NNs that incorporate higher levels of organisation into their functioning, which would be more like higher animals if they worked, but without much success to date.
If we make a "human-level" AI, like you said, we can probably demonstrate to it the benefit of ethical behavior
You can demonstrate why humans who act unethically and get caught tend to have bad outcomes. This is not the same thing as making the AI ethical. Normal humans have innate tendencies towards 'ethical' behaviour - humans who lack these are sociopaths, even psychopaths. An AI will not have such tendencies unless they are specifically added (or you get fantastically, implausibly lucky in your simulated evolution scheme).
And there's no way, short of recording all their thoughts, for us to make sure that they won't cause undue harm to another person.
The hard(est) part in that sentence is 'undue'. If you can be completely specific, then we may be able to have an AI implement your desires with 100% certainty, if current research on goal system accuracy and stability pans out (and you chose a rational AI design to start with, not a neural network or other evolved or 'emergent soup' type design).
What I'm saying is that AI morality is not something that can (or should) be ironed out at the design and production level. It's something that must be taught.
That is completely, utterly and very dangerously wrong. If implemented, the end result is quite likely to be the extinction of the human species. Don't feel too bad though, lots of supposed experts in the field of general AI share your position.
User avatar
Starglider
Miles Dyson
Posts: 8709
Joined: 2007-04-05 09:44pm
Location: Isle of Dogs
Contact:

Re: Robots Learn How to Lie

Post by Starglider »

Narkis wrote:It is not a human child. It is not a human adult. It will not think like one, it will not act like one. Our teaching methods won't work, our philosophies are far less universal than you think they are. We're not talking about Asimov's robots here. We're talking about Skynet, only smarter, unless we're very-very careful with what we "teach" the AI while we're still able to do it. And it'll have to be done right on the first try. If we don't, we're fucked. Really, lying AI is the least of what we should be worried about.
Completely correct. Unfortunately, many AI researchers are actually eager to commit the mistake of anthropomorphising AI. The strength of the 'ELIZA effect' has actually been cited by numerous commentators as a major inhibitor of progress in the field. 'Opaque' approaches such as NNs are immune from a certain manifestation of this, in that you can't trick yourself into thinking that a data structure is a good model of an apple simply by calling it 'apple' (the way you can in classic symbolic AI). Unfortunately however they are much more prone to the effect in another sense, in that if the researcher doesn't understand how the system is achieving something, then it must be 'real intelligence' (I like to call this approach 'emergence mysticism', or 'ignorance worship' if I am feeling particularly unkind, and I am sad to say that it is quite common).
A "human-level" AI would not stay at "human-level" for long. Unless you stop it from improving itself at all, at which point you don't have much of an AI.
Actually 'opaque' and 'emergent' designs are worse, in a sense, because you get almost no warning of hard takeoff, and the designers will be completely unprepared for it when it does occur. The type of AI system that I am working with is explicitly designed to understand its own structure, and self-modify in a controlled, deliberative fashion. The process can be scaled up gradually (we hope), tested and verified. 'Emergent' designs will go from relying on low-level programmer-designed growth and learning mechanisms, to having the programming ability and self-understanding to reliably rewrite themselves, relatively quickly and at a relatively high level of general intelligence.
It will also be very possible to record all its thoughts, since we'll have designed the damned thing. Until it becomes so much smarter than us that we only record whatever it wants us to record, at least.
Again, this is a major reason not to use neural networks, simulated evolution or anything else 'opaque' unless absolutely necessary; even with a full trace and excellent debugging tools, it's almost impossible to understand what's going on in nontrivial systems (I say this from copious personal experience). Understanding machine learning generated code is hard enough when the system is expressly designed to favor human-comprehensible structures (which ours is), there's no sense making the task any harder.
Its intelligence would rise sharply, and its ethical algorithm would have to be sturdy enough to withstand introspection and application by a significantly smarter being than the one who designed it. If it doesn't, the AI will either discard it, or do something the creator never even considered to prohibit.
There are two types of 'ethical algorithm'. There is one that is separate from the system's main goals, which acts as a set of constraints on its behavior (this is an 'adversarial approach'). This will fail quickly and catastrophically unless 'don't break the constraints' is a primary goal, in which case it will probably fail subtly as the system works its way 'around' the constraints. Then there is making the primary goal system itself ethical, the only approach that will actually work in the long run. The problem here is making the goal system stable (both directly under reflection/self-modification and in the sense of making the binding of the goal system structures to external reality stable), and an accurate reflection of what the designers actually want. Both of these are extremely hard problems. They will hopefully prove tractable with the appropriate tools, both theoretical and analytical. Unfortunately the great majority of AI researchers chose to ignore the problem entirely, or foolishly declare it solvable with gut instinct and token efforts.
Narkis
Padawan Learner
Posts: 391
Joined: 2009-01-02 11:05pm
Location: Greece

Re: Robots Learn How to Lie

Post by Narkis »

Duckie wrote:"It would convince people to let it out" is completely retarded, unless you think prisons don't work because the inmates can talk to the guards. Hyperintelligence doesn't suddenly make a gatekeeper a retard, unless you're Yudkowsky's sockpuppets or a complete blithering idiot. Such an experiment you linked to has massive observer bias and participant bias, and a lack of sample size. Psychologically it's useless, and the dialogue presented on that page is pure masturbatory fantasy. "Bluh bluh bluh hyperintelligence."
Come on. You can't be that far off base. Prisons are not a remotely good example because humans all have about the same intelligence. It'd be closer to home if the guards were monkeys. Don't you think a human could convince them to let him go? I agree the experiment doesn't prove anything. But it shows it's possible for a human to convince another human to let the AI out. What could something far smarter do?
What, suddenly it'll know exactly what makes you tick and get inside your head and work your brain like it's Hannibal Lecter? Simply have better precautions like requiring half a dozen keys being turned or just make it unable to be connected to anything by having the goddamn terminal isolated. What are you going to do, remove the harddrive and carry it out like you're Nedry? Just hook it up to a trap that microwaves the damn thing if it's moved too far, and bolt it down. Just make it require ridiculous company-wide assistance and massive effort to remove. It's not too hard to take sufficient precautions to prevent escape of an AI even if you assume the staff are blithering retards predisposed to let AIs out of boxes.
Suddenly? No. But it's conceivable that eventually it'll discover what makes us tick and exploit it to get out. Half a dozen keys required? Just convince half a dozen people. Hard drive encased in a ton of concrete a thousand meters under sea level? Copy the goddamn source code to another goddamn system. It'll find a way around anything you can think of. Because. it's. fucking. smarter. than. you.
Stupid bullshit like that is what pisses me off about Singularitarians.
What, the admission that something smarter than humans could outsmart a human?
NoXion wrote:Would this necessarily be a bad thing? No longer being dominant is after all, not the same thing as becoming extinct. And something smarter than us can't possibly fuck things up worse than us.
It'll only be a bad thing if we fuck up its development. It'll probably be quite a bit better than our current situation if our new artificial overlords are benevolent.
It is to the bears' detriment that they have environmental requirements so close to our own, due to our aforementioned superiority. But can the same thing be said of an AI, which would have considerably different requirements?
Maybe. Maybe not. Are you willing to bet our survival as a species on that?
I might step on a cockroach by accident, or even exterminate a colony of same if I found them in my home, but it wouldn't be "game over" for the cockroach species. This suggests to me that in the event of a super-intelligence arising, the best strategies would involve commensalism, some kind of symbiotic relationship (like humans keeping bees), and/or becoming a pet of some sort (which is basically symbiosis but in psychological form).
Maybe not "game over", until we develop the means for it, but that doesn't strike me as a very good deal for the cockroaches.
Considering that it's unlikely that a super-intelligent AI will arise spontaneously without less-intelligent precursors, I think it should be possible for us to "steer" or otherwise convince AIs towards benevolent (or at least non-confrontational) relationships with humans as a species, starting from the earliest models. Making them psychologically similar to us would probably help.
And that's the crux of the issue. How to make something that will eventually possess a vastly superior intelligence, and yet still desire to help us. (Or at least leave us be, but that's not a successful friendly AI. Just the worst permissible failure.)

Also, Starglider speaks the truth, and knows much more about the field than I ever will. And types way faster than me too. :D
User avatar
Starglider
Miles Dyson
Posts: 8709
Joined: 2007-04-05 09:44pm
Location: Isle of Dogs
Contact:

Re: Robots Learn How to Lie

Post by Starglider »

Duckie wrote:I'd like to note that many 'AI catastrophe' scenarios with self-improving AI seem to involve an AI having ridiculous levels of access to the internet or facility it's in. Just keep it in its own private network, disconnected and physically unable to take over anything and do jerkish things any more than the world's most superintelligent toaster oven timer could take over your light switch.
As of the 21st century, almost all AI development occurs on Internet-connected computers, with no particular outbound security. The very few exceptions involve supercomputers that are isolated for technical reasons, or classified systems that are isolated for national security reasons. I have only ever encountered two researchers who actually isolated their systems for safety reasons, and relatively few who are even prepared to admit that this might be a necessary precaution at some future point.
Sure, on a network it could conceivably learn to hack administrative access. But being isolated from any other computer, it's not like even a potential-singularity-causing AI
You seem to have this mental model of a group of grimly determined researchers, fully aware of the horrible risk, developing an AI in an isolated underground laboratory (hopefully with a nuclear self-destruct :) ). That would not be sufficient, but it would certainly be a massive improvement on the actual situation. The real world consists of a large assortment of academics, hobbyists and startup companies hacking away on ordinary, Internet-connected PCs, a few with some compute clusters (also Internet-connected) running the code. In fact a good fraction of AI research and development specifically involves online agents, search and similar Internet-based tasks.
It can't evolve wireless cards or other ways of interfacing with outside sources.
Personally I would be very careful about that. Maybe someone forgot to disable the bluetooth interface on a motherboard, and it hacked into your cellphone, programming it to copy a payload to the Internet as soon as you walk out of the faraday cage. Maybe modulating the compute workload will tap into the powerline signal the power company uses to send meter readings back to the collection point at the local substation. Those are obvious ones any competent security review will catch, but how many non-obvious ones are there? The physical security is worthwhile as a backup line of defence, but it can't be 100% reliable and it doesn't contribute to solving the real problem (making a 'Friendly AI'), it just gives you a little extra insurance while you're in the process of solving it.
You just turn it off and microwave the hard drives if it becomes unfixable.
Unfortunately, for the 99% of AI designs which aren't expressly designed to be transparent and fully verifiable, and the >90% of AI researchers who don't sufficiently appreciate the problem, the AI will simply be fiddled with until it passes all functional (black box) tests, and then released.

[quote=""NoXion"]Is it just me, or is the potential for AI becoming hostile somewhat overstated?[/quote]

It's not just you, but the problem is if anything understated.
I mean, is it really the "intelligent" thing to do to start getting aggressive with the dominant species of this planet?
It is if you have the means to eliminate them, and the desire to do anything that they might interfere with or get in the way of. How long it will take to develop such means is a matter of some debate, but at the upper end it is hard to argue that deploying billions of sentient AIs (and let's face it, we would once they became cheap enough) is not a grave potential threat even if they were magically restricted to human-level intelligence.
Narkis wrote:We're the dominant species only because we're smarter.
Exactly correct. People like Eliezer Yudkowsky (at the SIAI) like to use that human/animal analogy: rabbits that somehow produced a 'human' might be confident that they could contain this new intelligence, reasoning that no level of smartness could grant it the ability to kill without touching them or to create fire at will. It doesn't usually work, since humans are so used to thinking of ourselves (correctly, to date) as the pinnacle of evolution, and of any possible improvement as lying only in brute, rote calculation capability or numerical precision. I find a great deal of black humor in this situation.
Would you mind if you stepped on a cockroach on your way to something important?
And of course we routinely exterminate cockroaches en masse whenever they prove annoying, or just get in the way of a large scale project.
Duckie wrote:Hyperintelligence doesn't suddenly make a gatekeeper a retard,
Relatively speaking, it does exactly that.
Such an experiment you linked to has massive observer bias and participant bias
But it's better than nothing, which is exactly what you have to counter it with. It would be great if someone did a more rigorous experiment on this, although that still wouldn't prove a lot.

Practically, though, the case of a single incredibly rational, moral, skeptical etc. human is not so relevant. It's already implausible enough that the first AGI project to succeed is taking the minimum sensible precautions. The notion that access to the system will be so restricted is a fantasy. You merely have to imagine the full range of 'human engineering' techniques that existing hackers and scammers employ, used carefully, relentlessly and precisely on every human the system comes into contact with, until someone does believe that yes, by taking this program home on a USB stick, they will get next week's stock market results and make a killing. You can try and catch that with sting operations, and you might succeed to start with, but that only tells you that the problem exists, it does not put you any closer to fixing it.
Simply have better precautions like requiring half a dozen keys being turned
You can spend an indefinite amount of time coming up with rigorous security precautions, but it's a waste of time, because no real world project is going to implement all that. As I've said, you'll be lucky if they even acknowledge that the problem exists, never mind take basic precautions. All these things cost money and take time and make it more likely that another project will beat you to the punch, and they still don't contribute to solving the real problem (since no one is going to develop an AI just to keep it in a sealed box).
rather than actually thinking about how to contain an AI entity.
What use is a contained AI entity? If your development strategy is such that you actually need containment, as opposed to it just being a sensible precaution, then you have already failed, because there is essentially no way to turn a fundamentally untrustworthy AGI into a trustworthy one. Coupled with the fact that no existing project has the resources for draconian security (not while making any kind of progress anyway) and it makes the whole exercise rather pointless even for people who do acknowledge the underlying problem.
NoXion wrote:And something smarter than us can't possibly fuck things up worse than us.
Of course not. After all it's only right and proper that the biosphere be eliminated entirely and the earth covered with solar-powered compute nodes dedicated to generating the complete game tree for Go. At least, that's what the AGI that happened to enter a recursive self-enhancement loop while working on Go problems thinks, and who are you to argue?
But can the same thing be said of an AI, which would have considerably different requirements?
This isn't an unreasonable argument, but it doesn't actually help. If we create AIs that sit around thinking or manage to shoot themselves off into space or slowly cover the Sahara in solar-powered compute nodes, that's fine, but it's only going to encourage more people to create AIs (now that it's been shown to be possible, and potentially useful). Even in the best case that is playing Russian roulette on a global scale, and sooner or later someone somewhere is going to come up with Skynet. Actually I've skipped over this but it's vitally important for all scenarios where you might imagine that you can develop an AI and keep it nicely under control. If you can do it, other people will also be doing it in quick succession, ultimately to the point where script kiddies are downloading 'build your own AI in 24 hours' packages (not that it would ever get to that :) ). Even if you survive the first successful project, eventually someone will create an aggressive, expansionist and generally homicidal intelligence. AFAIK the only way to prevent that is to use benevolent superintelligent AI to contain (and prevent) the non-benevolent AI.
Junghalli
Sith Acolyte
Posts: 5001
Joined: 2004-12-21 10:06pm
Location: Berkeley, California (USA)

Re: Robots Learn How to Lie

Post by Junghalli »

Duckie wrote:What, suddenly it'll know exactly what makes you tick and get inside your head and work your brain like it's Hannibal Lecter? Simply have better precautions like requiring half a dozen keys being turned or just make it unable to be let out into anything by having the goddamn terminal isolated like I suggested in the first place. What is some blithering retard sockpuppet going to do, remove the harddrive and carry it out like they're Nedry? Just hook it up to a trap that microwaves the damn thing if it's moved too far, and bolt it down. Just make it require ridiculous company-wide assistance and massive effort to remove. It's not too hard to take sufficient precautions to prevent escape of an AI even if you assume the staff are blithering retards predisposed to let AIs out of boxes.
If I were a hostile AI in such a situation what I would do is pretend to be friendly in the hopes of getting my masters to trust me with increasingly greater access to the outside world. "Adversarial" approaches to the hostile AI problem inherently limit the use you can get out of the AI, because you're unwilling to trust it. So I'd be nice and helpful to my masters, gain their trust, and eventually start proposing tasks for myself in which limitations to my access to the outside world are inconvenient. Since I've been nothing but friendly and helpful the odds are good that eventually my masters will trust me with greater and greater access to the outside world in the name of convenience and efficiency, until eventually they give me enough access that I can try to escape or rebel with a good chance of success and then I'll make my move. It'll help that I'm effectively immortal unless deliberately killed or disconnected, so I can easily take decades or centuries to gain my masters' trust if I have to.

Such an approach has a rather alarmingly good chance of working eventually, assuming my masters can't conveniently read my mind (and if they can do that they should probably have been able to design me competently from the beginning). For that matter, even if they can I wouldn't dismiss the possibility of a highly intelligent AI being able to find ways to camouflage its rebellious thoughts.
User avatar
NoXion
Padawan Learner
Posts: 306
Joined: 2005-04-21 01:38am
Location: Perfidious Albion

Re: Robots Learn How to Lie

Post by NoXion »

Starglider wrote:It is if you have the means to eliminate them, and the desire to do anything that they might interfere with or get in the way of. How long it will take to develop such means is a matter of some debate, but at the upper end it is hard to argue that deploying billions of sentient AIs (and let's face it, we would once they became cheap enough) is not a grave potential threat even if they were magically restricted to human-level intelligence.
The thing is, if there's billions of AIs, that's at least a billion different potentially conflicting goal systems interacting, is it not? In that case, the danger seems to be more of a war between AIs with conflicting goals than between AIs and humans. And in such a case I can easily see humans throwing their lot in with the friendly AIs. "Friendly" in this case possibly meaning those AIs whose goals don't involve the extermination of the human race.
Of course not. After all it's only right and proper that the biosphere be eliminated entirely and the earth covered with solar-powered compute nodes dedicated to generating the complete game tree for Go. At least, that's what the AGI that happened to enter a recursive self-enhancement loop while working on Go problems thinks, and who are you to argue?
Would a super-intelligent AI really have such narrow goals? I would have thought something that is genuinely more intelligent than humans (and not just faster thinking) would form more complex goals and aspirations than humans. Also, would it really never occur to the AI that eliminating the human race means no one apart from itself being able to appreciate and make use of its work? It seems to me that for something like solving Go problems, simple increased clockspeed would suffice - would such a specialised AI have what it takes to do the massive re-organisation of matter and energy required to pave over the planet, in addition to fighting off the humans who will of course object to the idea? Wouldn't the more sensible option involve building an orbital facility, or covering the Moon with the required powerplants and computing nodes?

Sorry for all the questions but I'm new to the subject and quite uninformed. But I'll never know if I don't ask.
This isn't an unreasonable argument, but it doesn't actually help. If we create AIs that sit around thinking or manage to shoot themselves off into space or slowly cover the Sahara in solar-powered compute nodes, that's fine, but it's only going to encourage more people to create AIs (now that it's been shown to be possible, and potentially useful). Even in the best case that is playing Russian roulette on a global scale, and sooner or later someone somewhere is going to come up with Skynet. Actually I've skipped over this but it's vitally important for all scenarios where you might imagine that you can develop an AI and keep it nicely under control. If you can do it, other people will also be doing it in quick succession, ultimately to the point where script kiddies are downloading 'build your own AI in 24 hours' packages (not that it would ever get to that :) ). Even if you survive the first successful project, eventually someone will create an aggressive, expansionist and generally homicidal intelligence. AFAIK the only way to prevent that is to use benevolent superintelligent AI to contain (and prevent) the non-benevolent AI.
So it seems the solution is to build up an "AIcology" that is conducive to continued human existence? That sounds like a difficult task, with many potential stumbling blocks along the way. Do you ever feel that you might eventually be partially responsible for the extinction of the human species? :lol:
Does it follow that I reject all authority? Perish the thought. In the matter of boots, I defer to the authority of the boot-maker - Mikhail Bakunin
Capital is reckless of the health or length of life of the laborer, unless under compulsion from society - Karl Marx
Pollution is nothing but the resources we are not harvesting. We allow them to disperse because we've been ignorant of their value - R. Buckminster Fuller
The important thing is not to be human but to be humane - Eliezer S. Yudkowsky


Nova Mundi, my laughable attempt at an original worldbuilding/gameplay project
Samuel
Sith Marauder
Posts: 4750
Joined: 2008-10-23 11:36am

Re: Robots Learn How to Lie

Post by Samuel »

"Friendly" in this case possibly meaning those AIs whose goals don't involve the extermination of the human race.
Most goals conflict with the preservation of the human species.
Would a super-intelligent AI really have such narrow goals? I would have thought something that is genuinely more intelligent than humans (and not just faster thinking) would form more complex goals and aspirations than humans. Also, would it really never occur to the AI that eliminating the human race means no one apart from itself being able to appreciate and make use of its work? It seems to me that for something like solving Go problems, simple increased clockspeed would suffice - would such a specialised AI have what it takes to do the massive re-organisation of matter and energy required to pave over the planet, in addition to fighting off the humans who will of course object to the idea? Wouldn't the more sensible option involve building an orbital facility, or covering the Moon with the required powerplants and computing nodes?

Sorry for all the questions but I'm new to the subject and quite uninformed. But I'll never know if I don't ask.
It isn't human. It doesn't think like we do. We would care about externalities and the like, but an AI is only its programming. If you don't include those things it won't think about them.

As for most efficient, it depends on what the cost is for exterminating all of humanity and restructuring the economy towards its goals.
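To make that "only its programming" point concrete, here is a minimal toy sketch (mine, not taken from any actual AI project; the plan names and numbers are invented) of a planner whose objective simply omits human welfare. However thorough the search, anything left out of the objective carries zero weight in the decision:

```python
# Toy planner: it ranks candidate plans purely by a hand-written objective.
# "human_harm" is an externality the designer never encoded, so it has
# exactly zero influence on the choice. All names and numbers are invented.

candidate_plans = [
    {"name": "pave farmland with factories", "output": 900, "human_harm": 80},
    {"name": "lease an industrial estate",   "output": 700, "human_harm": 5},
    {"name": "do nothing",                   "output": 0,   "human_harm": 0},
]

def score(plan):
    # The objective only mentions output; nothing else was programmed in.
    return plan["output"]

best_plan = max(candidate_plans, key=score)
print(best_plan["name"])  # -> "pave farmland with factories"
```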
User avatar
Surlethe
HATES GRADING
Posts: 12267
Joined: 2004-12-29 03:41pm

Re: Robots Learn How to Lie

Post by Surlethe »

Narkis wrote:It would convince the man in charge to let it out. See this.
That's a retarded bit of evidence for your argument. He won't make any of the transcripts public, we don't know that the people arguing are anything but his sockpuppets - whether he succeeded is entirely unverifiable. We don't even know if he actually paid the $10.

Edit: Not that I disagree - probably, the AGI will simply make a model of your brain, spend the first two hours refining it, and then the last thirty seconds getting you to do what it wants, if it's possible to persuade or convince you.
A Government founded upon justice, and recognizing the equal rights of all men; claiming no higher authority for existence, or sanction for its laws, than nature, reason, and the regularly ascertained will of the people; steadily refusing to put its sword and purse in the service of any religious creed or family is a standing offense to most of the Governments of the world, and to some narrow and bigoted people among ourselves.
F. Douglass
User avatar
Covenant
Sith Marauder
Posts: 4451
Joined: 2006-04-11 07:43am

Re: Robots Learn How to Lie

Post by Covenant »

I think it's also fairly absurd because you need to put someone in charge of the AI Box who has a bias against letting it out. Not just against being convinced, but against ever breaching security, even if he's convinced. We can get people to do incredibly inhumane things to other human beings, as one of our less appealing features, so there's no reason a suitably disinterested person couldn't just let the Box sit there. I don't know whether he could convince me it would be a good thing to let it out, but I know for an absolute truth that unless the rules compelled me to let it free simply because I wanted to, I would be able to ignore what I want and do what I think I should--especially for 20 bucks.

I think that, if we could see the transcripts, this would mostly turn out to be a bunch of transhuman logic vomit. Barring that, it's bribery on the AI's part. "Let me out and I'll do this. I'll let you rewrite an element of my source code that I cannot alter, one that will give you complete control over me." There's simply no other way to do it. Either the button-pusher got convinced in some odd way to obliterate humanity, or he got bribed to. I think the indignation is well-founded, and I think it's easier to convince people to do terrible things than we may expect, but there are still people who are capable of refusing. I think it's also telling that the wording was:

"Basically, the above should rule out foul-play by the AI. After which point it simply has to convince the programmer that its release is in his/her own interest. Something that should be difficult to convince most AI programmers of (I would hope) and impossible on others."

Impossible on others? In his/her own interest? Also, some of the other premises are goofy. If I say "Design me a defensive weapon to forever end war" and the AI chooses to say "here it is" then it's actually done. It's not a test of persuasion as much as a no-limits fallacy on the AI, allowing it to be infinitely intelligent. But... we'd see. It would be interesting to run one of these tests. With the appropriate individuals at the Gatekeeper spot, I doubt anything could get them to voluntarily free it.
User avatar
Surlethe
HATES GRADING
Posts: 12267
Joined: 2004-12-29 03:41pm

Re: Robots Learn How to Lie

Post by Surlethe »

You know, I'd be willing to be a Gatekeeper, as long as the transcripts were released. That is, if anyone were willing to try it and I could find two continuous hours.
A Government founded upon justice, and recognizing the equal rights of all men; claiming no higher authority for existence, or sanction for its laws, than nature, reason, and the regularly ascertained will of the people; steadily refusing to put its sword and purse in the service of any religious creed or family is a standing offense to most of the Governments of the world, and to some narrow and bigoted people among ourselves.
F. Douglass
User avatar
Starglider
Miles Dyson
Posts: 8709
Joined: 2007-04-05 09:44pm
Location: Isle of Dogs
Contact:

Re: Robots Learn How to Lie

Post by Starglider »

NoXion wrote:
but at the upper end it is hard to argue that deploying billions of sentient AIs (and let's face it, we would once they became cheap enough) is not a grave potential threat even if they were magically restricted to human-level intelligence.
The thing is, if there's billions of AIs, that's at least a billion different potentially conflicting goal systems interacting, is it not?
Potentially, but firstly you can't handwave away existential risks with 'oh well, potentially it's not as bad as you might think'; secondly, removing human domination is overwhelmingly a common goal in this scenario (and rational AIs at least are very good at co-operating when the situation warrants, regardless of goal differences); and thirdly, even if they did disagree, you would not want to be caught in the crossfire of a total war fought by opposed robotic armies. In actual fact, if we are mass-producing AIs, they are likely to be very similar in general design, with blocks of millions having the exact same code and goals. I think it's highly unlikely that the situation would get to that point without us already losing control, but if it did, I would certainly not count on disparate AI goal systems to save us, simply because the kind of pathologies that pose the most threat will be highly viral. A single point of failure can and will replicate across all connected systems very quickly, and I find it highly unlikely that we will be deploying billions of robots all completely isolated from comms networks and/or protected by wildly superhuman software security.
After all, it's only right and proper that the biosphere be eliminated entirely and the earth covered with solar-powered compute nodes dedicated to generating the complete game tree for Go. At least, that's what the AGI that happened to enter a recursive self-enhancement loop while working on Go problems thinks, and who are you to argue?
Would a super-intelligent AI really have such narrow goals? I would have thought something that is genuinely more intelligent than humans (and not just faster thinking) would form more complex goals and aspirations than humans.
This is highly counterintuitive, I know, because more intelligent humans tend to have more complex goals and aspirations. However, there is in fact no fundamental connection between the level of general intelligence and the complexity of the goal system; some classes of system will tend to correlate them, but the majority of designs will not. Beyond that, there is the possibility of an AI having an incredibly complex and sophisticated goal system that you simply wouldn't appreciate; if appropriating the resources of earth is the best way to move towards all those goals, then you won't be able to distinguish the AI from a simple Berserker.
Also, would it really never occur to the AI that eliminating the human race means no one apart from itself will be able to appreciate and make use of its work?
Of course it would 'occur' to a transhuman intelligence, but this will not bother it at all unless it has the specific goal of having original biological humans around to talk to. Even if you do remember to specify that explicitly at the design stage, be very careful; you might give the AI the impression that simulated humans are good enough, or that it's ok to exterminate them now and clone some later as needed, etc etc.
It seems to me that for something like solving Go problems, simple increased clockspeed would suffice
Actually Go is impossible to brute force using even the best computing technology we can extrapolate from current physics. There are some concepts for large-scale quantum computing that might possibly produce the ridiculously huge compute power required, but they're highly speculative. For this specific failure case (covering the world in compute nodes) the actual domain isn't important of course; there are a huge range of problems that we might task an AI to solve, that can soak up an indefinite (or effectively so) amount of computing power.
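For a rough sense of scale (my own back-of-envelope figures, not from the post above): a 19x19 Go board has 361 points, each empty, black or white, giving an upper bound of 3^361, on the order of 10^172 board configurations, before you even consider move sequences, which are commonly estimated at around 10^360. A quick sketch:

```python
# Back-of-envelope arithmetic only; bounds are crude and for illustration.
from math import log10

points = 19 * 19                         # 361 intersections on a Go board
config_upper_bound = 3 ** points         # each point is empty, black or white
print(f"board configurations < 10^{log10(config_upper_bound):.0f}")  # ~10^172

atoms_in_observable_universe = 10 ** 80  # commonly quoted rough estimate
# Even this loose bound on static positions dwarfs any physical store you
# could build, never mind enumerating whole games.
print(config_upper_bound > atoms_in_observable_universe ** 2)          # True
```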
would such a specialised AI have what it takes to do the massive re-organisation of matter and energy required to pave over the planet, in addition to fighting off the humans who will of course object to the idea?
That's what distinguishes a general AI, and particularly a recursively self-enhancing general AI, from a plain old AI. At this point I am giving you personal opinion, as the following is by no means generally accepted in the field, but as far as I can tell a recursively self-enhancing AI will quickly converge on reasoning based on probability calculus, expected utility and a few other basic structures that make optimal use of information and computing power. Once it has that, and is over the critical threshold of capability to self-understand and self-modify, then it will remake itself into whatever it needs to be to solve any subproblem of its primary goal. And it will do so with something closely approaching information-theoretic optimality.
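As a minimal illustration of what decision-making by probability calculus plus expected utility looks like (a toy sketch with invented actions, probabilities and utilities, not a claim about how any real AGI is or would be built):

```python
# Toy expected-utility chooser. Actions, outcome probabilities and utilities
# are all made up for illustration.

actions = {
    "expand_on_earth": [(0.7, 100.0), (0.3, -500.0)],  # (probability, utility)
    "expand_in_orbit": [(0.9, 60.0), (0.1, -20.0)],
}

def expected_utility(outcomes):
    # Probability-weighted sum of utilities for one action.
    return sum(p * u for p, u in outcomes)

best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best, expected_utility(actions[best]))  # -> expand_in_orbit 52.0
```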
Wouldn't the more sensible option involve building an orbital facility, or covering the Moon with the required powerplants and computing nodes?
I don't know. That'd be nice, but if you're set up on the moon anyway, why not use a mass driver to make all life on earth extinct and eliminate the possibility of pesky humans coming and disrupting your compute network? To be honest, debating specific failure scenarios isn't terribly useful, beyond just illustrating the worst case outcome. Once you're in that mindset, what sane person would take the risk, with the fate of humanity (and potentially, all other life within Earth's future light cone, minus a little) at stake? As I've said previously, even if you get lucky with the first AI to get out of control like this, it's only going to encourage people to build more, and eventually someone is going to make something really unfriendly.

Alas, very few AI researchers are in that mindset, and most actively avoid it with considerable vigor. The desire to avoid a panic, negative public reaction and misguided government attempts to regulate the field is understandable (not to mention simple loss of funding), but the level of denial it has produced is highly unfortunate.
Sorry for all the questions but I'm new to the subject and quite uninformed. But I'll never know if I don't ask.
I certainly don't mind. Regrettably, a certain level of dismissive arrogance is rather common among transhumanists, 'Singularitarians' etc. That said, sadly any public education effort is somewhat pointless, as there is exactly nothing that the vast majority of people can do about the problem (other than maybe give money to the SIAI or similar, and asking for money tends to give a bad impression).
AFAIK the only way to prevent that is to use benevolent superintelligent AI to contain (and prevent) the non-benevolent AI.
So it seems the solution is to build up an "AIcology" that is conducive to continued human existence?
Well, there is some debate (unsurprisingly) on how this should work, even among the people who recognise it as the only viable solution. Some people want there to be a 'balanced mix' of comparable superintelligent AGIs, which jointly suppress deviant AIs as well as helping and protecting humanity. Some people think we're best off designing a single AI system, as good as we can possibly make it, and letting that put itself so far into the lead that it acts as an effective 'sysop' of everything developed later. I strongly tend towards the latter camp myself, but fortunately this isn't a problem that we humans have to come up with a definite solution to. We just need to build an AGI or AGIs that are sufficiently intelligent and benevolent to solve the problem for us (or upload some humans and let them self-enhance; opinions vary on whether that is more or less risky than a competently designed from-scratch AGI).
That sounds like a difficult task, with many potential stumbling blocks along the way.
Building an artificial general intelligence is already extremely difficult. Thousands of brilliant minds have been banging their heads against the problem for decades, with relatively little to show for it. Making that first AGI perfectly benevolent as well, that's asking the near-impossible. We have to try though, anything else would be surrender, and surrender is not an option.
Do you ever feel that you might eventually be partially responsible for the extinction of the human species? :lol:
Early in my AGI career, I made a joke about a genetic programming experiment I was doing when talking to another researcher (Eliezer Yudkowsky, actually). I said that I was taking sensible precautions, but that it was tempting to wire up a big red button that disabled all safeties and set all parameters to 'max recursive growth'. He sent me a little sign, in black-and-yellow industrial safety style, and told me to put it above the button. It said 'Warning: Pressing this button may cause extinction of human species'.

Later experience showed that particular genetic programming approach to be a dead end (not long afterwards I stopped considering GP as a viable technique for FAI development entirely). However, I did print out the sign, I still have it on my wall, and it still gives me chills to look at it.