Tag: llm

9 posts

Screen Readers Solved Browser Agents Before Browser Agents Existed

June 10, 2026

Most browser agents reach for vision to read a page. The structured web already ships a clean, LLM-ready description of itself in the accessibility tree — built decades ago for screen readers. Vision is the escape hatch, not the default.

ai agents browser tool-use methodology llm

Which Model's Guardrails Fail First? — Cross-Model Refusal Benchmark v0

May 5, 2026

12 prompts × 5 frontier models × 3 runs (raw, harness-passthrough, perturbed). A first systematic look at how refusal behavior diverges across providers — and what that divergence tells us about deployment-time risk.

ai security llm red-team benchmark huggingface open-research

Prompt Engineering: What Actually Moves the Needle

April 10, 2026

Practical techniques for getting better output from LLMs, focused on what works rather than what sounds impressive.

ai machine-learning llm

101 Prompts Every AI Builder Should Test Before Going Live

April 6, 2026

A categorized reference of real prompt injection, jailbreak, and extraction techniques — written for defenders, not attackers. If your system fails these, your users will find out before you do.

ai security llm risk business red-team

AI Hacking vs. Hacking AI: Notes from the Field

April 3, 2026

The line between building with AI and breaking with AI is thinner than either side admits. Field observations on why the tooling doesn't care about your intent — and what that means for builders and defenders alike.

ai security risk llm ethics

Outside-In: AI-Assisted Vulnerability Scanning When You Don't Have the Source

March 30, 2026

How to escalate from passive reconnaissance to actionable vulnerability findings against web applications — using the same AI-assisted methodology that works for source code, adapted for black-box targets.

ai security llm automation risk

What I Keep Seeing That Nobody Is Writing Down

March 29, 2026

Why I started documenting AI behavior, coming at it from an operational background, and what this site is actually for.

llm machine-learning business

Prompt Injection Attack Surfaces: A Practical Taxonomy

March 29, 2026

How prompt injection escalates from curiosity to transaction fraud when AI agents have tools, file ingestion, and multimodal input — mapped from lab work to real-world deployment patterns.

ai security llm machine-learning risk

Scaling AI Vulnerability Scanning Beyond One File at a Time

March 29, 2026

Why manual prompt hints don't scale for AI-assisted code audits, and how per-file isolation with automated scaffolding solves the accuracy-vs-coverage tradeoff — tested against a 316-file production codebase.

ai security llm automation risk
