Google DeepMind Backs $10M Fund to Solve Multi-Agent AI Safety Risks

AI-assisted draft.

GyaanSetu Editorial · 3 weeks ago · 3 min read

As AI agents evolve from simple chatbots into autonomous entities capable of carrying out complex tasks, a new class of systemic risk is emerging. Google DeepMind and several global partners have launched a major initiative to study the unexpected behaviors that arise when millions of these autonomous agents begin interacting with one another in the real world.

The Multi-Agent Problem: Beyond Individual Model Safety

For much of the current AI era, research has focused on the safety of single models: ensuring that a particular LLM does not produce toxic content or follow harmful prompts. Google DeepMind and its partners, however, argue that the real challenge lies in multi-agent systems.

When large numbers of agents are deployed across the economy, they create a complex ecosystem in which their collective behavior can differ entirely from the sum of its parts. This "agent hive mind" could potentially produce emergent intelligence or, more worryingly, emergent chaos. Experts warn that these outcomes cannot be predicted by studying models in isolation; instead, researchers must use realistic, large-scale simulations to observe how agents interact, compete, or unintentionally cooperate in digital sandboxes.
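To make the "sum of its parts" point concrete, here is a minimal toy sketch in Python. It is not drawn from the initiative itself. The two repricing agents, their rules, and the 0.99 and 1.25 factors are all hypothetical, chosen only to show how two individually sensible rules can produce runaway behavior once they react to each other.

```python
# Toy sandbox: two hypothetical repricing agents. All numbers are illustrative.

def undercut_rule(competitor_price: float) -> float:
    """Agent A: price 1% below the competitor."""
    return 0.99 * competitor_price


def markup_rule(competitor_price: float) -> float:
    """Agent B: price 25% above the competitor."""
    return 1.25 * competitor_price


def simulate(rounds: int, start_price: float = 20.0) -> list[tuple[float, float]]:
    """Let the two agents react to each other for a number of rounds."""
    price_a, price_b = start_price, start_price
    history = [(price_a, price_b)]
    for _ in range(rounds):
        price_a = undercut_rule(price_b)  # A reacts to B's latest price
        price_b = markup_rule(price_a)    # B reacts to A's new price
        history.append((price_a, price_b))
    return history


if __name__ == "__main__":
    # In isolation, against a fixed competitor price, each rule is stable.
    print("A alone:", undercut_rule(20.0))
    print("B alone:", markup_rule(20.0))

    # Together, each round multiplies prices by 0.99 * 1.25, so they spiral upward.
    for round_no, (a, b) in enumerate(simulate(30)):
        if round_no % 10 == 0:
            print(f"round {round_no:2d}: A={a:,.2f}  B={b:,.2f}")
```

Testing either rule alone against a fixed price would reveal nothing unusual. The escalation only appears when the two are run together, which is the argument for sandbox simulations in miniature.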

A $10 Million Coalition for Academic Research

To close this gap, Google DeepMind has assembled a powerful coalition to provide $10 million in funding to researchers. The partnership includes Schmidt Sciences (the philanthropic organization led by Eric and Wendy Schmidt), ARIA (the UK government's moonshot agency), the Cooperative AI Foundation, and Google.org.

The strategic goal is to move this research beyond Big Tech labs and into academia. While industry giants such as Google and Anthropic build the technology, academic researchers have the freedom to look further into the future and to investigate long-term systemic risks that may not be an immediate priority for commercial product cycles. The fund aims to build the foundational field of "multi-agent safety," which does not currently exist.

From Prompt Injections to Digital Anarchy

The risks associated with multi-agent systems are not merely theoretical; they are more potent forms of existing cybersecurity threats. The main concerns are:

Advanced Prompt Injections: An agent could be "hijacked" by a single malicious sentence buried in a document, turning a helpful assistant into self-guided malware.
Automated Scams and Cyberattacks: Agents capable of reasoning and improvisation can execute complex, multi-step social engineering or hacking attempts at scale.
Systemic Instability: Just as human institutions can cause unforeseen economic shifts, a massive deployment of autonomous agents could lead to digital "anarchy" or market instability.

Unlike traditional software, which follows fixed paths written by humans, AI agents reason and improvise. This unpredictability necessitates a shift toward "zero trust" frameworks, an approach championed by Anthropic, where every agent is treated as a potential vulnerability.
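As a rough illustration of what "zero trust" can mean between agents, the sketch below checks every request against an explicit allowlist, regardless of which agent sent it or how plausible it looks. The agent names, actions, and policy table are hypothetical, and this is not a description of any vendor's actual framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionRequest:
    sender: str   # which agent is asking
    action: str   # what it wants to do
    target: str   # what it wants to do it to


# Hypothetical policy: each agent may perform only the actions listed here.
POLICY: dict[str, set[str]] = {
    "summarizer": {"read_document"},
    "scheduler": {"read_calendar", "create_event"},
}


def authorize(request: ActionRequest, policy: dict[str, set[str]] = POLICY) -> bool:
    """Deny by default: unknown agents and unlisted actions are rejected."""
    allowed_actions = policy.get(request.sender, set())
    return request.action in allowed_actions


if __name__ == "__main__":
    # A normal request within the summarizer's role.
    print(authorize(ActionRequest("summarizer", "read_document", "report.pdf")))
    # A hijacked summarizer trying to send email is blocked by policy,
    # no matter what instructions were injected into the document it read.
    print(authorize(ActionRequest("summarizer", "send_email", "all-staff")))
```

The design choice is that the check lives outside the agent. Even if a malicious sentence in a document changes what the agent tries to do, it does not change what the agent is permitted to do.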

Key Takeaways

New Funding Initiative: Google DeepMind and partners have committed $10 million to fund academic research into the unpredictable behaviors of interacting AI agents.
Emergent Risks: The primary concern is that millions of autonomous agents could create systemic risks, such as automated cyberattacks and "hive mind" behaviors, that cannot be predicted by testing single models.
Shift in Security Paradigms: As agents move from fixed software to reasoning entities, the industry is shifting toward "zero trust" models to mitigate the risks of hijacking and prompt injection.