๐๐ฒ๐ณ๐ฒ๐ป๐ฑ ๐ฌ๐ผ๐๐ฟ ๐๐ ๐๐ฟ๐ผ๐บ ๐ฃ๐ฟ๐ผ๐บ๐ฝ๐ ๐๐ป๐ท๐ฒ๐ฐ๐๐ถ๐ผ๐ป
Prompt injection is like SQL injection for AI. Users override your system rules.
Two types of attacks exist.
- Direct: Users type instructions in the chat.
- Indirect: Malicious text hides in files or web pages.
Use these four layers to protect your app.
Filter inputs. Use a list of banned phrases. This stops common attacks. It is a filter, not a full wall.
Better prompt design. Put user input inside XML tags. Tell the AI to ignore instructions inside these tags. Keep instructions and data separate.
Use a guard model. Use a small LLM to spot bad inputs. Do this for high risk tasks.
Check the output. Scan the final answer for leaked secrets. Block the response if it looks wrong.
No defense is perfect. Your goal is to make attacks hard.
Log every rejected request. This helps you find new attack patterns.
Source: https://dev.to/kristinz/how-to-defend-against-prompt-injection-in-production-4993
Optional learning community: https://t.me/GyaanSetuAi