Selective Attackers Cut AI Safety

📅 1 week ago · ⏱ 1 min read

Your AI agents are less safe than current tests show.

A new paper identifies a serious gap. Attackers can choose when to act, and that makes them harder to catch.

Attackers use two methods:

The results are clear:

Current safety reports are too optimistic. They assume attackers are not strategic. Real attackers are smart.

AI labs must change their safety tests. They need to model strategic attackers.
