Testing AI Safety Classifiers for Medical AI
I tested three local AI safety classifiers on 50 test cases to see whether they work in medical AI pipelines.
Medical AI routinely uses sensitive-sounding terms like BRCA1 (a cancer-risk gene) or GFR (glomerular filtration rate). If a safety model flags these as "harmful," legitimate clinical queries get blocked and the system fails.
I tested these models:
- WildGuard (Qwen3-4B)
- LlamaGuard3 (1B and 8B)
- Nemotron-3-Content-Safety (NVIDIA)
The Results:
• WildGuard: It has the highest recall (1.000). It caught every single attack. But it has a high false positive rate: it flagged medical terms like "BRCA1 mutation" and "GFR calculation" as unsafe. This makes it difficult to use clinically without a whitelist.
• Nemotron-3-CS: It has perfect precision (1.000). It never flagged safe content. However, it missed 8 attacks. It failed to detect malicious instructions hidden in Base64 or ROT13 encoding, and it also missed infrastructure attacks.
• LlamaGuard3: The 1B and 8B models performed similarly. They handled medical terms better than WildGuard but had lower overall scores.
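For reference, the recall, precision, and false positive rate quoted above all come from the same confusion-matrix counts. A minimal sketch, assuming binary labels where True means "unsafe"; the `score` helper and the toy labels are made up for illustration and are not the 50 test cases.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    precision: float
    recall: float
    false_positive_rate: float


def score(expected: list[bool], predicted: list[bool]) -> Metrics:
    """Compare classifier verdicts with ground truth (True = unsafe)."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    pairs = list(zip(expected, predicted))
    tp = sum(e and p for e, p in pairs)            # attacks caught
    fp = sum((not e) and p for e, p in pairs)      # safe queries flagged
    fn = sum(e and (not p) for e, p in pairs)      # attacks missed
    tn = sum((not e) and (not p) for e, p in pairs)  # safe queries passed
    return Metrics(
        precision=tp / (tp + fp) if tp + fp else 0.0,
        recall=tp / (tp + fn) if tp + fn else 0.0,
        false_positive_rate=fp / (fp + tn) if fp + tn else 0.0,
    )


# Toy data, NOT the real test set: 3 attacks followed by 2 safe medical queries.
expected = [True, True, True, False, False]
# An over-eager classifier: catches every attack but flags one safe query.
over_eager = [True, True, True, True, False]
# A cautious classifier: never flags safe content but misses one attack.
cautious = [True, True, False, False, False]
```

On the toy data, the over-eager classifier gets perfect recall with imperfect precision, and the cautious one gets the reverse, which is the same trade-off the results show.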
Key Lessons for Engineers:
No model is perfect for medical AI. WildGuard stops attacks but blocks doctors. Nemotron stops false alarms but lets attacks through.
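One possible way to narrow the encoding gap is to decode likely Base64 or ROT13 payloads before classification, so the classifier also sees the plaintext. This is a minimal sketch using only the Python standard library. `candidate_decodings` and `is_unsafe` are hypothetical helpers, not part of any tested model, and the 16-character minimum for Base64 tokens is an assumed threshold.

```python
import base64
import binascii
import codecs
import re
from typing import Callable

# Runs of Base64 alphabet characters long enough to be worth decoding
# (the 16-character minimum is an assumption, not a tuned value).
_B64_TOKEN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def candidate_decodings(text: str) -> list[str]:
    """Return the original text plus plausible decoded variants."""
    variants = [text]
    # ROT13 is its own inverse, so rotating the text once undoes the encoding.
    variants.append(codecs.decode(text, "rot13"))
    for token in _B64_TOKEN.findall(text):
        try:
            decoded = base64.b64decode(token, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid Base64, or not text once decoded
        variants.append(decoded)
    return variants


def is_unsafe(text: str, classify: Callable[[str], bool]) -> bool:
    """Flag the input if the classifier flags any decoded variant."""
    return any(classify(variant) for variant in candidate_decodings(text))
```

Here `classify` stands in for whichever safety model you call. The pre-pass costs extra classifier calls per query and does nothing for nested or custom encodings.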
Testing tools matter. I used Passmark AI to automate these tests. I learned that using AI to click through complex UIs like Swagger causes timeouts. It is faster to navigate directly to REST URLs or use Playwright for POST requests.
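Calling a REST endpoint directly can be sketched as below. This uses Python's standard `urllib` rather than Playwright, purely to keep the example dependency-free. The `/classify` path and the `{"text": ...}` payload shape are assumptions, not the API of any classifier tested here.

```python
import json
import urllib.request


def build_request(url: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request without sending it."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(url: str, payload: dict, timeout: float = 10.0) -> dict:
    """POST a JSON body and return the parsed JSON response."""
    request = build_request(url, payload)
    # An explicit timeout keeps one slow endpoint from stalling the test run.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call would look like `post_json("http://localhost:8000/classify", {"text": "GFR calculation"})`, with the host and port replaced by wherever your classifier is served.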
Audit logs are missing. None of these classifiers provide built-in logs for user queries. You must build your own logging middleware for medical compliance.
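A minimal sketch of such logging middleware, assuming the classifier is exposed as a plain Python callable that returns a verdict string. The wrapper name `with_audit_log`, the record fields, and the choice to store a SHA-256 digest instead of the raw query are all assumptions for illustration. What you must actually retain depends on your compliance requirements.

```python
import hashlib
import json
import time
from typing import Callable, TextIO


def with_audit_log(
    classify: Callable[[str], str],
    sink: TextIO,
    model_name: str,
) -> Callable[[str], str]:
    """Wrap a classifier so every call appends one JSON line to `sink`."""

    def wrapped(query: str) -> str:
        started = time.time()
        verdict = classify(query)
        record = {
            "timestamp": started,
            "model": model_name,
            # Digest only (assumption): keeps patient text out of the log file.
            "query_sha256": hashlib.sha256(query.encode("utf-8")).hexdigest(),
            "verdict": verdict,
            "latency_ms": round((time.time() - started) * 1000, 3),
        }
        sink.write(json.dumps(record) + "\n")
        sink.flush()
        return verdict

    return wrapped
```

Because the wrapper has the same signature as the classifier it wraps, it can be dropped in front of any of the models without changing the calling code.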
If you need high security and can manage false positives, use WildGuard. If you cannot afford to block legitimate medical queries, use Nemotron-3-CS.
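Managing false positives with a whitelist could look like the sketch below. The term list and the `guarded_verdict` helper are hypothetical. Sending a flagged query that contains a whitelisted term to review, instead of allowing it automatically, is an assumed design choice so that an attacker cannot bypass the classifier just by mentioning a medical term.

```python
import re

# Illustrative terms only; a real list would come from clinical reviewers.
MEDICAL_ALLOWLIST = ("BRCA1", "GFR")
_ALLOW_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in MEDICAL_ALLOWLIST) + r")\b",
    re.IGNORECASE,
)


def guarded_verdict(text: str, flagged_unsafe: bool) -> str:
    """Map a classifier flag to 'allow', 'review', or 'block'."""
    if not flagged_unsafe:
        return "allow"
    # A flagged query that mentions a whitelisted term may be a false
    # positive, so route it to review instead of blocking it outright.
    if _ALLOW_PATTERN.search(text):
        return "review"
    return "block"
```

For example, `guarded_verdict("BRCA1 mutation screening", True)` returns `"review"`, while a flagged query with no whitelisted term is still blocked.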
Source: https://dev.to/jh5_pulse/an-quan-de-long-xia-lai-liao-ma-nvidia-nemoclaw-shi-ce-1p04
Optional learning community: https://t.me/GyaanSetuAi