Show HN: We post-trained a model that pen tests instead of refusing
Expert Signals
dk189
author • 1 mention
Hacker News
source • 1 mention
AI-Generated Claims
Generated from linked receipts; click sources for full context.
Anthropic and OpenAI's publicly available models are explicitly guard-railed so that they refuse offensive tasks.
Supported by 1 story
And their cyber-focussed models are gated for enterprises.
Supported by 1 story
A worst-case outcome is if only the adversaries have access. Meanwhile, most existing AI cyber tools are just wrappers.
Supported by 1 story
The problem is that they still carry all the guardrails of the foundation model, so they inherit its refusals. For this project, we've post-trained a specific model on a decade of capture-the-flag contests.
Supported by 1 story
This won't be made available to anyone and everyone, but we do believe that responsible SMEs and midmarket companies, not just enterprises, also need access to these tools in order to identify key vulnerabilities in their systems. We have developed two modes that run over a CLI:
• Security scan: a read-only audit of your local codebase for...
Supported by 1 story
Related Events
Amazon CEO's Talks with U.S. Officials Triggered Crackdown on Anthropic Models
Policy & Regulation • 6/21/2026
Anthropic cuts access to AI models over US 'national security' order - Hürriyet Daily News
Policy & Regulation • 6/21/2026
US ban on Anthropic models sparks AI sovereignty concerns - The Times of India
LLMs • 6/21/2026
Scoop: Trump admin blocks foreign access to Anthropic's most powerful AI - Axios
LLMs • 6/21/2026
Anthropic and OpenAI expand London offices to tap into AI talent pool - Crypto Briefing
LLMs • 6/21/2026