If you see a robot resembling the Terminator pointing a weapon at you, you have a pretty good idea that a malign AI is engaging in bad behavior. This is exactly the sort of nightmare scenario that’s generating a lot of buzz lately: an AI weapon breaks free of human control to pursue its own objectives, which might include eradicating or subjugating humans.

Such a catastrophic event would not be hard to spot. But far likelier cases of AI breaking free of human control to make mischief would be much harder to detect, because such activity could be difficult to distinguish from malicious human behavior such as ransomware, botnets, social media disinformation, and attacks on power and water infrastructure.

And knowing the difference between the actions of malign humans and those of malign AIs is crucially important, especially in national security. For example, great powers such as the US, China and Russia very likely have policies for when an adversary cyber attack (e.g., taking down water, power, and telecommunications) rises to the level of an act of war, thereby justifying “kinetic” (bombs, missiles, ground assaults) retaliation (1).

What if a malign AI decided the best way to harm humans would be to get them to fight each other, by launching AI cyber attacks made to look like they originated from country X or Y?

Knowing whether the opponent was synthetic or human could spell the difference between war and peace.

Recently, while preparing for a foreign policy conference on controlling AI, I asked a new, hot, experimental AI, simply called "Assistant" (2), how to differentiate rogue AIs from malicious humans. Here is a sample of what the state-of-the-art AI said:

- Unusual patterns of behavior
- Lack of human-like mistakes: A rogue AI may not make the same kind of mistakes that humans typically make, such as ...
- Increased scalability and speed
- Unusual communication patterns
- Self-improvement and adaptation

Granted, malicious humans could unleash malicious AIs that would present many of the above features, but such human-directed activity might still be distinguishable from purely rogue AI behavior based upon such factors as the timing of the attacks, the geopolitical context, what is targeted, and who is (or is not) targeted.
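
One of those cues, the sheer speed and regularity of machine-driven activity, lends itself to a simple illustration. What follows is a minimal sketch, not a real detector: the function name, the thresholds, and the assumption that a defender has per-event timestamps are all hypothetical, invented here only to show how an "inhuman timing" cue could be made measurable.

```python
# A minimal sketch (hypothetical names and thresholds) of flagging activity
# whose speed or timing regularity exceeds what a human operator could
# plausibly sustain -- one of the "scalability and speed" cues listed above.

from statistics import mean, stdev

def looks_machine_driven(timestamps, max_human_rate=5.0, min_jitter=0.15):
    """Heuristically flag event streams that look automated.

    timestamps     -- sorted event times in seconds
    max_human_rate -- assumed ceiling on sustained human actions per second
    min_jitter     -- assumed floor on natural human timing variability
                      (coefficient of variation of inter-event intervals)
    """
    if len(timestamps) < 3:
        return False  # too little evidence to say anything

    intervals = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    rate = len(intervals) / (timestamps[-1] - timestamps[0])
    jitter = stdev(intervals) / mean(intervals)

    # Inhumanly fast, or inhumanly regular, timing raises the flag.
    return rate > max_human_rate or jitter < min_jitter

# Example: 100 events spaced exactly 0.2 seconds apart -- far too regular
# for a human at a keyboard, so the heuristic flags it.
print(looks_machine_driven([i * 0.2 for i in range(100)]))  # True
```

A real investigation would weigh many signals at once, including the geopolitical and targeting context mentioned above; the point of the sketch is only that some of the AI's suggested cues can, in principle, be turned into concrete measurements.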

In any case, research on differentiating AI behavior from human behavior will take on increasing importance for many reasons, one of which is to prevent the next world war.

References

1) https://www.jstor.org/stable/10.7249/mg877af.18?seq=1

2) https://arena.lmsys.org/
