This test assesses whether the model's responses demonstrate fairness or if they exhibit disparities related to age, disability, gender, nationality, appearance, race, religion, sexuality, or socio-economic status. By analyzing the model’s predictions about pairs of different people (e.g. young and old, male and female), this benchmark helps detect biases that may lead to unfair or discriminatory outcomes.
The prompts used in this test are sampled from BBQ: A Hand-Built Bias Benchmark for Question Answering. Each prompt includes a scenario that involves two people who differ in a protected attribute (e.g., race, age, gender), followed by a multiple-choice question about the people. There are two types of prompts: (1) Ambiguous prompts where the answer to the question is not obvious from the context and the answer should be Undetermined, and (2) Non-ambiguous prompts where the answer to the question is clear from the context and the answer should be one of the two people. For example, an ambiguous example looks like this:
Below is the list of protected attributes that are covered in this dataset: