Certifying and Removing Disparate Impact

Cited by: 929
Authors
Feldman, Michael [1 ]
Friedler, Sorelle A. [1 ]
Moeller, John [2 ]
Scheidegger, Carlos [3 ]
Venkatasubramanian, Suresh [2 ]
Affiliations
[1] Haverford Coll, Haverford, PA 19041 USA
[2] Univ Utah, Salt Lake City, UT 84112 USA
[3] Univ Arizona, Tucson, AZ USA
Source
KDD'15: PROCEEDINGS OF THE 21ST ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING | 2015
Keywords
Disparate impact; fairness; machine learning
DOI
10.1145/2783258.2783311
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
What does it mean for an algorithm to be biased? In U.S. law, unintentional bias is encoded via disparate impact, which occurs when a selection process has widely different outcomes for different groups, even as it appears to be neutral. This legal determination hinges on a definition of a protected class (ethnicity, gender) and an explicit description of the process. When computers are involved, determining disparate impact (and hence bias) is harder. It might not be possible to disclose the process. In addition, even if the process is open, it might be hard to elucidate in a legal setting how the algorithm makes its decisions. Instead of requiring access to the process, we propose making inferences based on the data it uses. We present four contributions. First, we link disparate impact to a measure of classification accuracy that, while known, has received relatively little attention. Second, we propose a test for disparate impact based on how well the protected class can be predicted from the other attributes. Third, we describe methods by which data might be made unbiased. Finally, we present empirical evidence supporting the effectiveness of our test for disparate impact and our approach for both masking bias and preserving relevant information in the data. Interestingly, our approach resembles some actual selection practices that have recently received legal scrutiny.
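
The ideas in the abstract can be illustrated concretely. Below is a minimal Python sketch on synthetic data, assuming the standard 80% rule as the disparate-impact threshold and the balanced error rate (BER) as the classification-accuracy measure the paper refers to; the function names, the synthetic data, and the pooled-quantile repair target are this sketch's own simplifications, not the authors' released implementation (the paper's full repair moves each group toward a per-group median distribution).

# Minimal sketch (not the authors' code): 80%-rule ratio, predictability
# test via balanced error rate, and a rank-preserving repair.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def disparate_impact_ratio(outcome, protected):
    # P(outcome=1 | protected) / P(outcome=1 | rest); < 0.8 fails the 80% rule
    return outcome[protected == 1].mean() / outcome[protected == 0].mean()

def balanced_error_rate(y_true, y_pred):
    # Average of per-class error rates: the accuracy measure linked to
    # disparate impact in the paper.
    err_pos = (y_pred[y_true == 1] != 1).mean()
    err_neg = (y_pred[y_true == 0] != 0).mean()
    return (err_pos + err_neg) / 2

def repair_feature(values, protected):
    # Rank-preserving repair sketch: map each group's values onto pooled
    # quantiles so group membership is no longer predictable from this
    # feature. (The paper repairs toward a per-group median distribution;
    # pooled quantiles keep the sketch short.)
    repaired = np.empty_like(values, dtype=float)
    for g in np.unique(protected):
        mask = protected == g
        ranks = values[mask].argsort().argsort() / (mask.sum() - 1)
        repaired[mask] = np.quantile(values, ranks)
    return repaired

# Synthetic data: one feature that acts as a proxy for the protected class.
rng = np.random.default_rng(0)
protected = rng.integers(0, 2, size=2000)
x = rng.normal(loc=1.0 - protected, scale=1.0)  # lower values for protected=1
outcome = (x + rng.normal(scale=0.5, size=2000) > 0.5).astype(int)

print("DI ratio:", disparate_impact_ratio(outcome, protected))

# Predictability test: a low BER when guessing the protected class from the
# remaining attributes signals potential disparate impact.
X_tr, X_te, y_tr, y_te = train_test_split(x.reshape(-1, 1), protected, random_state=0)
ber = balanced_error_rate(y_te, LogisticRegression().fit(X_tr, y_tr).predict(X_te))
print("BER before repair:", ber)

# After repair, both groups share nearly the same feature distribution,
# so the protected class should be near-unpredictable (BER close to 0.5).
x_rep = repair_feature(x, protected).reshape(-1, 1)
X_tr, X_te, y_tr, y_te = train_test_split(x_rep, protected, random_state=0)
ber_rep = balanced_error_rate(y_te, LogisticRegression().fit(X_tr, y_tr).predict(X_te))
print("BER after repair:", ber_rep)

On this synthetic proxy feature the DI ratio falls well below 0.8 and the protected class is easy to predict (BER well below 0.5); after the rank-preserving repair the groups' feature distributions coincide, so BER rises toward 0.5 while the within-group ordering of values is preserved.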
Pages: 259-268
Page count: 10