@OpenAI
Our results depend on reading models’ reasoning (“chain-of-thought”), and we believe the field isn't prepared for eval-aware models with opaque reasoning. Until better methods exist, we urge developers to preserve chain-of-thought transparency to study and mitigate scheming.