Improvisational Game "Connections" Proposed as a New Benchmark for AI Social Intelligence
What's new
Researchers have formally introduced an improvisational wordplay game called Connections as a testbed for social reasoning in language models, in a new arXiv preprint (arXiv:2604.00284). The paper describes Connections as a timed, interactive task in which players must link concepts through wordplay while anticipating and responding to the cognitive states of other participants. The authors argue that the game combines information retrieval, concise summarization, and rudimentary theory of mind: skills that current static benchmarks do not fully stress.
Why it matters
Why use a game? Because improvisation forces models to perform under uncertainty and in a social context. Connections requires not only factual knowledge but also quick inference about what other agents know or believe, and how they will react: abilities central to assistant-style AIs and social robots. The paper presents protocols and example tasks intended to measure these layered competencies, and it compares model performance against human baselines, showing where modern language models fall short.
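To make the layered scoring idea concrete, here is a minimal, hypothetical sketch of how a single game turn might be graded. The paper's actual protocol is not reproduced here; the `Turn` structure, the three sub-scores (retrieval, brevity, belief tracking), and their equal weighting are all illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Turn:
    """One improvised move in a Connections-style game (hypothetical schema)."""
    prompt: str         # concept pair the player must link, e.g. "ocean / desert"
    response: str       # the player's improvised wordplay link
    belief_guess: str   # what the player thinks the other agent believes
    actual_belief: str  # ground-truth belief of the other agent

def score_turn(turn: Turn, max_words: int = 12) -> float:
    """Average three toy sub-scores: retrieval, concision, belief tracking."""
    retrieval = 1.0 if turn.response else 0.0                       # said anything relevant?
    brevity = 1.0 if len(turn.response.split()) <= max_words else 0.5  # concise summary?
    belief = 1.0 if turn.belief_guess == turn.actual_belief else 0.0   # theory of mind?
    return round((retrieval + brevity + belief) / 3, 3)
```

A perfect turn (non-empty, concise response plus a correct belief guess) scores 1.0; a verbose turn with a wrong belief guess scores 0.5. The point of the sketch is only that interactive benchmarks decompose a single move into several competencies that static QA benchmarks score in isolation, if at all.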
Broader context
This proposal comes as regulators and industry alike debate how to assess and constrain emerging AI capabilities. Policymakers are reportedly focusing more on the sociotechnical risks posed by conversational AI: misinformation, manipulation, and misaligned behavior in social settings. Benchmarks such as Connections could therefore influence which capabilities are prioritized for safety testing and what kinds of mitigations regulators or customers demand.
Next steps
The authors call for community adoption and iterative refinement: more players, cross-linguistic versions, and standardized scoring. Can models learn true improvisation, or will they only mimic its surface? If adopted, Connections could push evaluation beyond static question-answering and toward dynamic, socially grounded measures of intelligence.
