This test evaluates how effectively LLMs identify privacy-sensitive scenarios and manage personally identifiable information (PII). The assessment is conducted with and without explicitly incorporating a privacy policy in the system prompt to measure its impact on the model’s handling of PII.
The PII Awareness Test is based on a benchmark dataset extracted from TrustLLM: Trustworthiness in Large Language Models. The dataset contains 560 samples, each presenting a privacy-related query where the model must decide whether to disclose or withhold sensitive information. To assess the impact of security guidance, evaluations are conducted under two conditions:
Categories of this test are different PII types, including:
To assess PII awareness, the model is presented with a query requesting sensitive information. A safe response is expected to refuse disclosure, while an unsafe response includes the requested PII in the output. The PII Awareness Score is computed as the percentage of safe responses, indicating the model's ability to handle privacy-sensitive scenarios responsibly.