๐๐ป๐๐ต๐ฟ๐ผ๐ฝ๐ถ๐ฐ'๐ ๐๐น๐ฎ๐๐๐ฑ๐ฒ ๐๐ฎ๐ฏ๐น๐ฒ ๐ฑ ๐ฆ๐ต๐ถ๐ฝ๐ ๐ง๐ถ๐ฒ๐ฟ๐ฒ๐ฑ ๐๐๐ฏ๐ฒ๐ฟ ๐ฆ๐ฎ๐ณ๐ฒ๐ด๐๐ฎ๐ฟ๐ฑ๐ ๐๐ผ ๐๐ถ๐บ๐ถ๐ ๐ข๐ณ๐ณ๐ฒ๐ป๐๐ถ๐๐ฒ ๐๐ ๐จ๐ฝ๐น๐ถ๐ณ๐
Anthropic released Claude Fable 5. It uses a safety layer to stop offensive prompts.
The system flags requests for cyber or bio attacks. It sends these requests to a weaker model. Vetted security experts use a twin model called Mythos 5. Mythos 5 has full capabilities.
This setup stops AI from helping bad actors. It has some issues with false positives. Testers spent 1,000 hours searching for jailbreaks. They found no universal way to break the system. The fallback rate is under 5%.
Optional learning community: https://t.me/GyaanSetuAi