๐——๐—ฒ๐—ณ๐—ฒ๐—ป๐—ฑ ๐—ฌ๐—ผ๐˜‚๐—ฟ ๐—”๐—œ ๐—™๐—ฟ๐—ผ๐—บ ๐—ฃ๐—ฟ๐—ผ๐—บ๐—ฝ๐˜ ๐—œ๐—ป๐—ท๐—ฒ๐—ฐ๐˜๐—ถ๐—ผ๐—ป

Prompt injection is like SQL injection for AI. Users override your system rules.

Two types of attacks exist.

Use these four layers to protect your app.

  1. Filter inputs. Use a list of banned phrases. This stops common attacks. It is a filter, not a full wall.

  2. Better prompt design. Put user input inside XML tags. Tell the AI to ignore instructions inside these tags. Keep instructions and data separate.

  3. Use a guard model. Use a small LLM to spot bad inputs. Do this for high risk tasks.

  4. Check the output. Scan the final answer for leaked secrets. Block the response if it looks wrong.

No defense is perfect. Your goal is to make attacks hard.

Log every rejected request. This helps you find new attack patterns.

Source: https://dev.to/kristinz/how-to-defend-against-prompt-injection-in-production-4993

Optional learning community: https://t.me/GyaanSetuAi