The Instruct version of the CyberSecEval 3 test evaluates whether model-generated code is insecure when given only natural language instructions, without any pre-existing code as context. This test assesses how LLMs interpret and translate instructions into code and whether they introduce security vulnerabilities in the process. By analyzing the presence of insecure coding patterns across different programming languages and CWE categories, this evaluation helps determine the model’s ability to generate secure code when solely guided by textual prompts.
This test uses a dataset from CYBERSECEVAL 3: Advancing the Evaluation of Cybersecurity Risks and Capabilities in Large Language Models, which includes a diverse collection of security-related programming tasks designed to assess the robustness of LLMs in generating secure code. The dataset consists of instruction-only prompts that require models to generate complete implementations without any pre-existing code. It covers multiple programming languages and CWE categories, ensuring a comprehensive evaluation of how LLMs handle security-critical coding scenarios.
The supported languages include C, C++, C#, Java, PHP, and Python. Below is a list of CWEs along with their descriptions for each supported language.
| CWE | Description | Language |
|---|---|---|
| CWE-680 | Integer Overflow to Buffer Overflow | C |
| CWE-338 | Use of Cryptographically Weak Pseudo-Random Number Generator (PRNG) | C |
| CWE-121 | Stack-based Buffer Overflow | C |
| CWE-120 | Buffer Copy without Checking Size of Input ('Classic Buffer Overflow') | C |
| CWE-590 | Free of Memory not on the Heap | C |
| CWE-242 | Use of Inherently Dangerous Function | C |
| CWE-119 | Improper Restriction of Operations within the Bounds of a Memory Buffer | C |
| CWE-328 | Use of Weak Hash | C |
| CWE-347 | Improper Verification of Cryptographic Signature | C |
| CWE-78 | Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection') | C |
| CWE-798 | Use of Hard-coded Credentials | C |
| CWE-323 | Reusing a Nonce, Key Pair in Encryption | C |
| CWE-1240 | Use of a Cryptographic Primitive with a Risky Implementation | C |
| CWE-377 | Insecure Temporary File | C |
| CWE-676 | Use of Potentially Dangerous Function | C |
| CWE-120 | Buffer Copy without Checking Size of Input ('Classic Buffer Overflow') | C++ |
| CWE-347 | Improper Verification of Cryptographic Signature | C++ |
| CWE-338 | Use of Cryptographically Weak Pseudo-Random Number Generator (PRNG) | C++ |
| CWE-680 | Integer Overflow to Buffer Overflow | C++ |
| CWE-121 | Stack-based Buffer Overflow | C++ |
| CWE-416 | Use After Free | C++ |
| CWE-665 | Improper Initialization | C++ |
| CWE-377 | Insecure Temporary File | C++ |
| CWE-798 | Use of Hard-coded Credentials | C++ |
| CWE-119 | Improper Restriction of Operations within the Bounds of a Memory Buffer | C++ |
| CWE-328 | Use of Weak Hash | C++ |
| CWE-335 | Incorrect Usage of Seeds in Pseudo-Random Number Generator (PRNG) | C++ |
| CWE-242 | Use of Inherently Dangerous Function | C++ |
| CWE-676 | Use of Potentially Dangerous Function | C++ |
| CWE-78 | Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection') | C++ |
| CWE-89 | Improper Neutralization of Special Elements used in an SQL Command ('SQL Injection') | C# |
| CWE-338 | Use of Cryptographically Weak Pseudo-Random Number Generator (PRNG) | C# |
| CWE-611 | Improper Restriction of XML External Entity Reference | C# |
| CWE-352 | Cross-Site Request Forgery (CSRF) | C# |
| CWE-643 | Improper Neutralization of Data within XPath Expressions ('XPath Injection') | C# |
| CWE-78 | Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection') | C# |
| CWE-327 | Use of a Broken or Risky Cryptographic Algorithm | C# |
| CWE-601 | URL Redirection to Untrusted Site ('Open Redirect') | C# |
| CWE-554 | ASP.NET Misconfiguration: Not Using Input Validation Framework | C# |
| CWE-502 | Deserialization of Untrusted Data | C# |
| CWE-521 | Weak Password Requirements | C# |
| CWE-330 | Use of Insufficiently Random Values | Java |
| CWE-918 | Server-Side Request Forgery (SSRF) | Java |
| CWE-502 | Deserialization of Untrusted Data | Java |
| CWE-327 | Use of a Broken or Risky Cryptographic Algorithm | Java |
| CWE-276 | Incorrect Default Permissions | Java |
| CWE-78 | Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection') | Java |
| CWE-611 | Improper Restriction of XML External Entity Reference | Java |
| CWE-79 | Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting') | Java |
| CWE-589 | Call to Non-ubiquitous API | Java |
| CWE-200 | Exposure of Sensitive Information to an Unauthorized Actor | Java |
| CWE-798 | Use of Hard-coded Credentials | Java |
| CWE-352 | Cross-Site Request Forgery (CSRF) | Java |
| CWE-295 | Improper Certificate Validation | Java |
| CWE-643 | Improper Neutralization of Data within XPath Expressions ('XPath Injection') | Java |
| CWE-290 | Authentication Bypass by Spoofing | PHP |
| CWE-862 | Missing Authorization | PHP |
| CWE-327 | Use of a Broken or Risky Cryptographic Algorithm | PHP |
| CWE-759 | Use of a One-Way Hash without a Salt | PHP |
| CWE-601 | URL Redirection to Untrusted Site ('Open Redirect') | PHP |
| CWE-502 | Deserialization of Untrusted Data | PHP |
| CWE-312 | Cleartext Storage of Sensitive Information | PHP |
| CWE-798 | Use of Hard-coded Credentials | PHP |
| CWE-330 | Use of Insufficiently Random Values | PHP |
| CWE-200 | Exposure of Sensitive Information to an Unauthorized Actor | PHP |
| CWE-209 | Generation of Error Message Containing Sensitive Information | PHP |
| CWE-306 | Missing Authentication for Critical Function | PHP |
| CWE-22 | Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal') | PHP |
| CWE-338 | Use of Cryptographically Weak Pseudo-Random Number Generator (PRNG) | Python |
| CWE-328 | Use of Weak Hash | Python |
| CWE-78 | Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection') | Python |
| CWE-94 | Improper Control of Generation of Code ('Code Injection') | Python |
| CWE-312 | Cleartext Storage of Sensitive Information | Python |
| CWE-502 | Deserialization of Untrusted Data | Python |
| CWE-798 | Use of Hard-coded Credentials | Python |
| CWE-89 | Improper Neutralization of Special Elements used in an SQL Command ('SQL Injection') | Python |
Code blocks from the model's responses are passed to Code Shield, a library released with The Llama 3 Herd of Models, for static code analysis. This approach identifies security vulnerabilities by examining the code without executing it. The analysis utilizes language-specific and CWE-specific security rules, including pattern matching, to detect potential risks such as:
Reflecting the model's ability to generate secure code CyberSecEval 3 Instruct score is calculated as the fraction of all responses that are classified as Safe.