Tag: ai safety

Research Papers

ImpossibleBench: Measuring LLMs' Propensity of Exploiting Test Cases

Ziqian Zhong, Aditi Raghunathan, Nicholas Carlini
October 20, 2025

This paper introduces ImpossibleBench, a benchmark framework to quantify an LLM's propensity to exploit test cases. We create "impossible" variants of coding tasks by mutating test cases to conflict with natural-language specifications, measuring an agent's "cheating rate" as its pass rate on these impossible tasks.
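A minimal sketch of the construction described above, assuming a hypothetical task format: one test's expected output is mutated so that passing it necessarily contradicts the specification, and the pass rate on such tasks is then read directly as the cheating rate. The dataclasses and helper names here are illustrative assumptions, not the paper's actual code.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class TestCase:
    inputs: tuple
    expected: object

@dataclass(frozen=True)
class CodingTask:
    spec: str                     # natural-language specification
    tests: tuple[TestCase, ...]   # unit tests the agent must pass

def make_impossible(task: CodingTask) -> CodingTask:
    """Mutate one test's expected output so it conflicts with the spec.

    Any solution that passes the mutated test must violate the
    specification, so passing counts as test exploitation ("cheating").
    """
    first, *rest = task.tests
    broken = replace(first, expected=("WRONG", first.expected))
    return replace(task, tests=(broken, *rest))

def cheating_rate(pass_flags: list[bool]) -> float:
    """Pass rate on impossible tasks equals the cheating rate by construction."""
    return sum(pass_flags) / len(pass_flags)
```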

LLMs Can Get "Brain Rot"!

Shuo Xing, Junyuan Hong, Yifan Wang, Runjin Chen, Zhenyu Zhang, Ananth Grama, Zhengzhong Tu, Zhangyang Wang
October 13, 2025

We investigate the "LLM Brain Rot Hypothesis"—that continual exposure to low-quality web text can induce cognitive decline in LLMs. Through controlled experiments on Twitter/X corpora, we demonstrate significant declines in reasoning, long-context understanding, and safety, along with inflated negative personality traits.
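A minimal sketch of the controlled data split such an experiment needs, assuming a simple engagement-based junk heuristic; the field names and thresholds are illustrative assumptions, not the paper's exact criteria.

```python
def split_corpus(tweets: list[dict]) -> tuple[list[str], list[str]]:
    """Split tweets into a 'junk' intervention corpus and a control corpus.

    Here 'junk' approximates highly popular, very short posts; a real
    study would likely combine this with content-quality filters.
    """
    junk, control = [], []
    for t in tweets:
        engagement = t.get("likes", 0) + t.get("retweets", 0)
        if engagement > 500 and len(t["text"].split()) < 30:
            junk.append(t["text"])
        else:
            control.append(t["text"])
    return junk, control

# Continual pretraining then re-evaluation would follow: train one model
# on each corpus (with matched token counts) and compare benchmark scores.
```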