AI News

UK NCSC Advocates Public Disclosure for AI Threats

The UK's leading cyber and AI security agencies have welcomed efforts to crowdsource the process of identifying and addressing AI safeguard bypass threats.

In a blog post published today, the National Cyber Security Centre's (NCSC) technical director for AI research security, Kate S, and AI Security Institute (AISI) research scientist Robert Kirk highlighted the risks such threats pose to frontier AI systems.

Cybercriminals have demonstrated proficiency in bypassing the built-in security and safety guardrails of models like ChatGPT, Gemini, Llama, and Claude. Recently, ESET researchers discovered the 'first known AI-powered ransomware', built using an OpenAI model.

The NCSC and AISI suggested that new bug bounty programs from OpenAI and Anthropic could be a useful strategy to mitigate these risks, similar to how vulnerability disclosure enhances regular software security.

They expressed hope that such disclosure programs would help maintain AI system safeguards post-deployment, foster a culture of responsible disclosure and industry collaboration, increase engagement from the security community, and allow researchers to hone their skills.

However, the NCSC and AISI cautioned that significant overheads might arise from triaging and managing threat reports, and that participating developers must first establish solid foundational security practices.

The blog post outlined several best-practice principles for developing effective public disclosure programs for safeguard bypass threats.