Button-pushing explorers: How to grasp that AI agents can do amazing things while knowing nothing

12.05.2026

The nonprofit ARC Prize Foundation on May 1, 2026, released the results of a new benchmark: a test of an AI system’s ability to solve a game. The results were striking – humans scored 100%, while the most advanced AI systems scored under 1%.

At first glance, this may surprise AI users who are impressed by the polished essays, codebases and multistep projects it generates in seconds. How can such capable systems struggle with simple puzzles built from Tetris-like shapes?

That confusion points to a risk: AI is becoming integrated into everyday life faster than people can make sense of it.

We are cognitive psychologists who study how to teach difficult concepts. To recognize the limits and risks of today’s AI agent systems, it’s important for people to grasp that the systems can both accomplish superhuman feats and make mistakes few humans would. To that end, we propose a new way to think about AIs: as button-pushing explorers.

We teach college students, a group rapidly incorporating AI tools into their daily routines. That gives us regular opportunities to ask what they think is going on with AI. The answers vary widely. One student said that someone at OpenAI or Anthropic is reading and approving every response the system generates. Another, more succinctly, said, “It’s magic.”

These responses illustrate two tempting ways of making sense of AI. At one extreme, AI is treated as an inscrutable black box – a powerful but ultimately mysterious force. At the other, people explain it using the same assumptions they use to understand other humans: that its outputs reflect reasoning or judgment.

The worry is that these misinterpretations don’t fade as users gain more experience interacting with AI; they may even be reinforced. When AI performs well, its output can feel like evidence of understanding, or like confirmation that it really is something like magic. That apparent success makes it harder to question what the system is actually doing. Biases can seem logical or inevitable; harmful behavior can........

© The Conversation