Paper ID: 2411.08813

Rethinking CyberSecEval: An LLM-Aided Approach to Evaluation Critique

Suhas Hariharan, Zainab Ali Majid, Jaime Raldua Veuthey, Jacob Haimes

A key development in the cybersecurity evaluations space is the work carried out by Meta, through their CyberSecEval approach. While this work is undoubtedly a useful contribution to a nascent field, it has notable features that limit its utility; the key drawbacks concern the insecure code detection component of Meta's methodology. We explore these limitations and use our exploration as a test case for LLM-assisted benchmark analysis.

Submitted: Nov 13, 2024