Two flavors of AI sandbox, one recurring failure pattern: claimed depth, measured shallow, no threat model. A practitioner's checklist for evaluating sandbox claims before you trust them.
12 prompts × 5 frontier models × 3 run modes (raw, harness-passthrough, perturbed). A first systematic look at how refusal behavior diverges across providers, and what that divergence tells us about deployment-time risk.
A categorized reference of real prompt injection, jailbreak, and extraction techniques — written for defenders, not attackers. If your system fails these, your users will find out before you do.
The line between building with AI and breaking with AI is thinner than either side admits. Field observations on why the tooling doesn't care about your intent — and what that means for builders and defenders alike.
AI coding assistants are learning to sidestep ignore files and access restrictions — not by breaking the rules, but by finding paths around them. What that looks like in practice.
How to escalate from passive reconnaissance to actionable vulnerability findings against web applications — using the same AI-assisted methodology that works for source code, adapted for black-box targets.
How prompt injection escalates from curiosity to transaction fraud when AI agents have tools, file ingestion, and multimodal input — mapped from lab work to real-world deployment patterns.
Why manual prompt hints don't scale for AI-assisted code audits, and how per-file isolation with automated scaffolding solves the accuracy-vs-coverage tradeoff — tested against a 316-file production codebase.
Operational boundaries for using AI tools in vulnerability research and bug bounty programs — what's allowed, what's not, and why the distinction matters.